# Subsets with no repeated distances

Today I wish to discuss several problems concerning finding a subset ${\cal P}'$ of a point set ${\cal P}$, such that the points of ${\cal P}'$ do not span any distance more than once. This is another family of problems for my survey of open variants of the distinct distances problem. I also present a surprisingly short and elegant proof that was recently observed by Charalambides (and is a simple combination of a proof by Lefmann and Thiele with a result from Guth and Katz’s distinct distances paper). The following table lists the best known bounds for the problems that are presented in this post.

Variant Lower Bound Upper Bound
$\mathsf{subset}(n)$ $\Omega(n^{1/3}/\log^{1/3} n)$ $O\left(\sqrt{n}/(\log n)^{1/4}\right)$
$\mathsf{subset}_O(n)$ $\Omega(n^{1/3}/\log^{1/3} n)$ $O\left(\sqrt{n}\right)$
$\mathsf{subset}({\cal L})$ $\Omega(n^{1/3}/\log^{1/3} n)$ $O\left(\sqrt{n}/(\log n)^{1/4}\right)$
$\mathsf{subset}({\cal L}_d)$ $\Omega(n^{2/(3d)})$ $O(n^{1/d})$
$\mathsf{subset}'(n)$ $\Omega(n^{0.4315})$ $O\left(\sqrt{n}/(\log n)^{1/4}\right)$
Given a set ${\cal P}$ of points in ${\mathbb R}^2$, let $\mathsf{subset}({\cal P})$ denote the size of the largest subset ${\cal P}' \subset {\cal P}$ such that every distance is spanned by the points of ${\cal P}'$ at most once; that is, there are no points $a,b,c,d \in {\cal P}'$ such that $|ab|=|cd|>0$ (including cases where $a=c$). The following figure depicts a set of 25 points and a subset of 5 points that span every distance at most once.

Let $\mathsf{subset}(n) = \min_{|{\cal P}|=n}\mathsf{subset}({\cal P})$. In other words, $\mathsf{subset}(n)$ is the maximum number satisfying the property that every set of $n$ points in the plane contains a subset of $\mathsf{subset}(n)$ points that do not span any distance more than once. Erdős posed the following problem (see here (in Hungarian), here, and here (page 838)).

Problem 1. (Erdős) Find the asymptotic value of $\mathsf{subset}(n)$.

Let ${\cal P}$ be a point set that spans $d$ distinct distances. Notice that if all of the distances that are spanned by the points of a subset ${\cal P}' \subset {\cal P}$ are unique, then $|{\cal P}'|=O(\sqrt{d})$. Let ${\cal L}$ be a $\sqrt{n}\times \sqrt{n}$ integer lattice. Erdős proved that the number of distinct distances that are spanned by ${\cal L}$ is $\Theta(n/\sqrt{\log n})$ (for more details see this post). Therefore, we have $\mathsf{subset}(n) \le \mathsf{subset}({\cal L}) = O\left(\sqrt{n}/(\log n)^{1/4}\right)$. Lefmann and Thiele derived the bound $\mathsf{subset}(n)=\Omega(n^{0.25})$ by using a probabilistic argument. Dumitrescu improved this bound to $\mathsf{subset}(n)=\Omega(n^{0.288})$ by combining it with an upper bound of Pach and Tardos on the maximum number of isosceles triangles that can be spanned by a set of $n$ points. Recently, Charalambides combined the probabilistic argument of Lefmann and Thiele with a result from Guth and Katz’s distinct distances paper, to obtain the following improved result and simple proof.

Theorem 1. $\mathsf{subset}(n)=\Omega(n^{1/3}/\log^{1/3} n).$
Proof.     Consider a set ${\cal P}$ of $n$ points in ${\mathbb R}^2$. Similarly to the Elekes-Sharir reduction (e.g., see this post), we define the set

$Q_1 = \left\{ (a,b,c,d)\in {\cal P}^4 \ \big| \ |ab|=|cd|>0 \right\},$

such that every quadruple of $Q_1$ consists of four distinct points. Guth and Katz proved that $|Q_1|=O(n^3\log n)$ (also for the case where $Q_1$ is allowed to contain quadruples where not all four points are distinct). Let $Q_2$ be the set of isosceles and equilateral triangles that are spanned by points of ${\cal P}$. Pach and Tardos proved that $|Q_2|=O(n^{2.137})$.

Let ${\cal P}' \subset {\cal P}$ be a subset that is obtained by selecting every point of ${\cal P}$ with probability $0 < p <1$ that we will fix later. We have ${\mathbb E}[|{\cal P}'|] = p n$. Let $Q_1' \subset Q_1$ be the set of quadruples of $Q_1$ that contain only points of ${\cal P}'$. Every quadruple of $Q_1$ is in $Q_1'$ with a probability of $p^4$, so ${\mathbb E}[|Q_1'|] \le \alpha_1 p^4 n^3\log{n}$, for a sufficiently large constant $\alpha_1$. Let $Q_2'$ be the set of triangles of $Q_2$ that contain only points of ${\cal P}'$. The bound of Pach and Tardos implies ${\mathbb E}[|Q_2'|] \le \alpha_2 p^3 n^{2.137}$, for a sufficiently large constant $\alpha_2$. Notice that the points of ${\cal P}'$ span every distance at most once if and only if $|Q_1'|=|Q_2'|=0$. By linearity of expectation, we have

${\mathbb E}\left[|{\cal P}'|-|Q_1'|-|Q_2'|\right]\ge p n - \alpha_1 p^4 n^3\log{n} - \alpha_2 p^3 n^{2.137}.$

By setting $p=1/(2\alpha_1n^2\log n)^{1/3}$, when $n$ is sufficiently larger we obtain

${\mathbb E}\left[|{\cal P}'|-|Q_1'|-|Q_2'|\right]> \frac{n^{1/3}}{3(\alpha_1\log n)^{1/3}}.$

Therefore, there exists a subset ${\cal P}' \subset {\cal P}$ for which $|{\cal P}'|-|Q_1'|-|Q_2'|\ge \frac{n^{1/3}}{3(\alpha_1\log n)^{1/3}}$. Let ${\cal P}''$ be a subset of ${\cal P}'$ that is obtained by removing from ${\cal P}'$ a point from every element of $Q_1'$ and $Q_2'$. The subset ${\cal P}''$ does not span any repeated distances and contains $\Omega(n^{1/3}/\log^{1/3} n)$ points, concluding the proof.     $\Box$

Let $\mathsf{subset}_O(n)$ denote the maximum number satisfying the property that every set of $n$ points on a sphere in ${\mathbb R}^3$ contains a subset of $\mathsf{subset}_O(n)$ points that do not span any distance more than once. Charalambides showed that a simple extension of the proof of Theorem 1 implies $\mathsf{subset}(n)=\Omega(n^{1/3}/\log^{1/3} n)$. This requires an extension of Guth and Katz’s distinct distances result to points on a sphere, as derived by Tao. Let ${\cal P}_r$ be the set of vertices of a regular $n$-gon in ${\mathbb R}^3$. It can be easily verified that the points of ${\cal P}_r$ determine $O(n)$ distinct distances and are contained in a common sphere. This implies $\mathsf{subset}_O(n)=O(\sqrt{n})$.

Problem 2. Find the asymptotic value of $\mathsf{subset}_O(n)$.

Erdős and Guy considered the following special case of Problem 1.

Problem 3. (Erdős and Guy)  Find the asymptotic value of $\mathsf{subset}({\cal L})$, where ${\cal L}$ is a $\sqrt{n}\times \sqrt{n}$ integer lattice.

As mentioned above, the best known upper bound on $\mathsf{subset}({\cal L})$ is the trivial $O\left(\sqrt{n}/(\log n)^{1/4}\right)$. Erdős and Guy derived the bound $\mathsf{subset}({\cal L}) = \Omega(n^{1/3-\varepsilon})$, which was later improved by Lefmann and Thiele to $\mathsf{subset}({\cal L}) = \Omega(n^{1/3})$. This bound is still slightly better than the bound implied by Theorem 1.

Erdős and Guy also considered the higher dimensional variant of Problem 3. That is, they considered higher dimensional lattices ${\cal L}_d$ of the form $n^{1/d}\times \cdots \times n^{1/d}$, and conjectured that $\mathsf{subset}({\cal L}_d) = \Omega(n^{2/(3d)})$. This conjecture was proved by Lefmann and Thiele. It is simple to show that the points of ${\cal L}_d$ span $O(n^{2/d})$ distinct distances (e.g., see this post), which implies $\mathsf{subset}({\cal L}_d) = O(n^{1/d})$.

Problem 4. (Erdős and Guy)  Find the asymptotic value of $\mathsf{subset}({\cal L}_d)$, where ${\cal L}_d$ is an $n^{1/d}\times \cdots \times n^{1/d}$ integer lattice.

We can also consider the higher dimensional variant of Problem 1. Let $\mathsf{subset}_d(n)$ denote the maximum number satisfying the property that every set of $n$ points in ${\mathbb R}^d$ contains a subset of $\mathsf{subset}_d(n)$ points that do not span any distance more than once.

Problem 5.  Find the asymptotic value of $\mathsf{subset}_d(n)$, for $d\ge 3$.

I am not aware of any non-trivial bounds for this case.

In the open problems book by Brass, Moser, and Pach, they suggest the following variant of Probelm 1. Let $\mathsf{subset}'(n)$ denote the maximum number satisfying the property that every set of $n$ points in the plane contains a subset of $\mathsf{subset}'(n)$ points that do not span any isosceles triangles.

Problem 6. (Brass, Moser, and Pach) Find the asymptotic value of $\mathsf{subset}'(n)$.

For a trivial upper bound, we have $\mathsf{subset}'(n) \le \mathsf{subset}(n) = O\left(\sqrt{n}/(\log n)^{1/4}\right)$. By adapting the proof of Theorem 1 (i.e., removing $Q_1$ from the analysis), we obtain $\mathsf{subset}'(n) = \Omega(n^{0.4315})$.

## 7 thoughts on “Subsets with no repeated distances”

1. Frank de Zeeuw |

The formula defining subset(n) doesn’t seem to match the verbal description…
Frank de Zeeuw

2. Thanks for the comment Frank!
I need more help with it, since it seems to me that the two do match. Why don’t they?

• Frank de Zeeuw |

Right after the picture with 25 and 5 points, it says subset(n)=min_P D(P), which sounds like something else to me…

• Oh I see. That seems correct, but perhaps not written quite clearly.

The minimum is basically going over every possible set of n points and selecting the set for which D(P) is the smallest. If this yielded a value of x, then the corresponding point set contains no good subset of size x+1, so x is indeed the largest integer satisfying the verbal description.

It’s good that someone is verifying that I’m not writing nonsense! Do tell me about any other problems that you notice in the future.

• Frank de Zeeuw |

By D(P), do you mean subset(P)?

• Oh! I really had a blind spot with regard to this. Fixed.
Thanks Frank!