# Subsets with no repeated distances

Today I wish to discuss several problems concerning finding a subset ${\cal P}'$ of a point set ${\cal P}$, such that the points of ${\cal P}'$ do not span any distance more than once. This is another family of problems for my survey of open variants of the distinct distances problem. I also present a surprisingly short and elegant proof that was recently observed by Charalambides (and is a simple combination of a proof by Lefmann and Thiele with a result from Guth and Katz’s distinct distances paper). The following table lists the best known bounds for the problems that are presented in this post.

Variant Lower Bound Upper Bound
$\mathsf{subset}(n)$ $\Omega(n^{1/3}/\log^{1/3} n)$ $O\left(\sqrt{n}/(\log n)^{1/4}\right)$
$\mathsf{subset}_O(n)$ $\Omega(n^{1/3}/\log^{1/3} n)$ $O\left(\sqrt{n}\right)$
$\mathsf{subset}({\cal L})$ $\Omega(n^{1/3}/\log^{1/3} n)$ $O\left(\sqrt{n}/(\log n)^{1/4}\right)$
$\mathsf{subset}({\cal L}_d)$ $\Omega(n^{2/(3d)})$ $O(n^{1/d})$
$\mathsf{subset}'(n)$ $\Omega(n^{0.4315})$ $O\left(\sqrt{n}/(\log n)^{1/4}\right)$
Given a set ${\cal P}$ of points in ${\mathbb R}^2$, let $\mathsf{subset}({\cal P})$ denote the size of the largest subset ${\cal P}' \subset {\cal P}$ such that every distance is spanned by the points of ${\cal P}'$ at most once; that is, there are no points $a,b,c,d \in {\cal P}'$ such that $|ab|=|cd|>0$ (including cases where $a=c$). The following figure depicts a set of 25 points and a subset of 5 points that span every distance at most once.

Let $\mathsf{subset}(n) = \min_{|{\cal P}|=n}\mathsf{subset}({\cal P})$. In other words, $\mathsf{subset}(n)$ is the maximum number satisfying the property that every set of $n$ points in the plane contains a subset of $\mathsf{subset}(n)$ points that do not span any distance more than once. Erdős posed the following problem (see here (in Hungarian), here, and here (page 838)).

Problem 1. (Erdős) Find the asymptotic value of $\mathsf{subset}(n)$.

Let ${\cal P}$ be a point set that spans $d$ distinct distances. Notice that if all of the distances that are spanned by the points of a subset ${\cal P}' \subset {\cal P}$ are unique, then $|{\cal P}'|=O(\sqrt{d})$. Let ${\cal L}$ be a $\sqrt{n}\times \sqrt{n}$ integer lattice. Erdős proved that the number of distinct distances that are spanned by ${\cal L}$ is $\Theta(n/\sqrt{\log n})$ (for more details see this post). Therefore, we have $\mathsf{subset}(n) \le \mathsf{subset}({\cal L}) = O\left(\sqrt{n}/(\log n)^{1/4}\right)$. Lefmann and Thiele derived the bound $\mathsf{subset}(n)=\Omega(n^{0.25})$ by using a probabilistic argument. Dumitrescu improved this bound to $\mathsf{subset}(n)=\Omega(n^{0.288})$ by combining it with an upper bound of Pach and Tardos on the maximum number of isosceles triangles that can be spanned by a set of $n$ points. Recently, Charalambides combined the probabilistic argument of Lefmann and Thiele with a result from Guth and Katz’s distinct distances paper, to obtain the following improved result and simple proof.

Theorem 1. $\mathsf{subset}(n)=\Omega(n^{1/3}/\log^{1/3} n).$
Proof.     Consider a set ${\cal P}$ of $n$ points in ${\mathbb R}^2$. Similarly to the Elekes-Sharir reduction (e.g., see this post), we define the set

$Q_1 = \left\{ (a,b,c,d)\in {\cal P}^4 \ \big| \ |ab|=|cd|>0 \right\},$

such that every quadruple of $Q_1$ consists of four distinct points. Guth and Katz proved that $|Q_1|=O(n^3\log n)$ (also for the case where $Q_1$ is allowed to contain quadruples where not all four points are distinct). Let $Q_2$ be the set of isosceles and equilateral triangles that are spanned by points of ${\cal P}$. Pach and Tardos proved that $|Q_2|=O(n^{2.137})$.

Let ${\cal P}' \subset {\cal P}$ be a subset that is obtained by selecting every point of ${\cal P}$ with probability $0 < p <1$ that we will fix later. We have ${\mathbb E}[|{\cal P}'|] = p n$. Let $Q_1' \subset Q_1$ be the set of quadruples of $Q_1$ that contain only points of ${\cal P}'$. Every quadruple of $Q_1$ is in $Q_1'$ with a probability of $p^4$, so ${\mathbb E}[|Q_1'|] \le \alpha_1 p^4 n^3\log{n}$, for a sufficiently large constant $\alpha_1$. Let $Q_2'$ be the set of triangles of $Q_2$ that contain only points of ${\cal P}'$. The bound of Pach and Tardos implies ${\mathbb E}[|Q_2'|] \le \alpha_2 p^3 n^{2.137}$, for a sufficiently large constant $\alpha_2$. Notice that the points of ${\cal P}'$ span every distance at most once if and only if $|Q_1'|=|Q_2'|=0$. By linearity of expectation, we have

${\mathbb E}\left[|{\cal P}'|-|Q_1'|-|Q_2'|\right]\ge p n - \alpha_1 p^4 n^3\log{n} - \alpha_2 p^3 n^{2.137}.$

By setting $p=1/(2\alpha_1n^2\log n)^{1/3}$, when $n$ is sufficiently larger we obtain

${\mathbb E}\left[|{\cal P}'|-|Q_1'|-|Q_2'|\right]> \frac{n^{1/3}}{3(\alpha_1\log n)^{1/3}}.$

Therefore, there exists a subset ${\cal P}' \subset {\cal P}$ for which $|{\cal P}'|-|Q_1'|-|Q_2'|\ge \frac{n^{1/3}}{3(\alpha_1\log n)^{1/3}}$. Let ${\cal P}''$ be a subset of ${\cal P}'$ that is obtained by removing from ${\cal P}'$ a point from every element of $Q_1'$ and $Q_2'$. The subset ${\cal P}''$ does not span any repeated distances and contains $\Omega(n^{1/3}/\log^{1/3} n)$ points, concluding the proof.     $\Box$

# A short update

This weekend I will not have the time to add another long post, partly because I will be traveling to Rio to present in SoCG 2013 a result about point-circle incidences in ${\mathbb R}^3$ (see here).

I did add this new webpage to the blog, where I will place relevant pdf files. For now it contains my survey of open variants of the distinct distances problem, and the best known bounds for them. This document contains the problems discussed here and here, and I plan to extend it soon.

Finally, a paper quite useful to us all can be found here.

# The Elekes-Sharir Framework (part 3)

In the previous posts of this series (part 1 and part 2), we discussed the history of the Elekes-Sharir reduction, the basic technical details, and applied the reduction to reprove a result of Szemerédi. In this post, the third and final one of the series, we present another application of the reduction.

Let ${\cal P}_1$ (resp., ${\cal P}_2$) be a sets of $m$ (resp., $n$) points, all on a common line $\ell_1$ (resp., $\ell_2$). Let $D({\cal P}_1,{\cal P}_2)$ denote the number of distinct distances between the pairs of ${\cal P}_1\times{\cal P}_2$ (that is, only distinct distances between points on different lines are considered). There exist configurations where $\ell_1$ and $\ell_2$ are either parallel or orthogonal and $D_{\text{lines}}({\cal P}_1,{\cal P}_2) = \Theta(\max\{m,n\})$. For example, see the following figure.

The case where $\ell_1$ and $\ell_2$ are neither parallel nor orthogonal is more difficult and has prompted various works in the past 15 years (see Elekes and Rónyai, Elekes, Schwartz, Solymosi, and de Zeeuw, and Sharir, Sheffer, and Solymosi). Let $D_{\text{lines}}(m,n)$ denote the minimum number of distinct distances in such a scenario. Elekes observed the subquadratic upper bound $D_{\text{lines}}(m,n) = O(\max\{n^2/\log n, m^2/\log m\})$. The best known lower bound $D_{\text{lines}}(m,n)=\Omega\left( \min\left\{m^{2/3}n^{2/3},m^2,n^2 \right\}\right)$ was recently derived by Sharir, Sheffer, and Solymosi). Notice the large gap between the two bounds — one of the largest among the variants of the distinct distances problem.

We now explain how to derive the lower bound of Sharir, Sheffer, and Solymosi by applying the Elekes-Sharir framework. Recently, we learned that this proof can be slightly simplified by removing the use of the framework from it (see here for more details).

Theorem. (Sharir, Sheffer, and Solymosi)

$D_{\text{lines}}(m,n)=\Omega\left( \min\left\{m^{2/3}n^{2/3},m^2,n^2 \right\}\right).$

Proof.     Consider a set ${\cal P}_1$ of $m$ points (resp., a set ${\cal P}_2$ of $n$ points), all on a common line $\ell_1$ (resp., $\ell_2$). Additionally, the lines $\ell_1$ an $\ell_2$ are neither parallel nor orthogonal. We set $x=D({\cal P}_1,{\cal P}_2)$ and denote the $x$ distinct distances in ${\cal P}_1\times{\cal P}_2$ as $\delta_1,\ldots, \delta_x$. We apply the Elekes-Sharir framework with a small change. This time, we define

$Q =\left\{ (a,p,b,q) \mid a,b\in{\cal P}_1, \quad p,q\in{\cal P}_2, \quad \text{and}\quad |ap|=|bq|>0 \right\}$

# The Elekes-Sharir Framework (part 2)

In my first post about the Elekes-Sharir framework, we presented the history of the reduction and its basic technical details. In this post we present another application of the framework, reproving a result by Szemerédi on the minimum number of distinct distances in sets with no $k$ collinear points. As usual, thanks to Micha Sharir for helping with the preparation of the post.

One set of open problems that still seem to be beyond our reach concerns the characterization of point sets that determine a small number of distinct distances (e.g., see this post). Hardly anything is known about these problems, and as a first step Erdős suggested to determine whether every such point set (say, every point set that spans $O(n/\sqrt{\log n})$ distinct distances) contains $\Omega(\sqrt{n})$ points on a line, or even only $\Omega(n^\varepsilon)$ such points. The only non-trivial result comes from an unpublished lemma of Szemerédi (the proof can be found in the book Combinatorial Geometry by Pach and Agarwal).

We now show how a variant of Szemerédi bound can be proved by applying the Elekes-Sharir framework. The proof can be seen as equivalent to Szemerédi’s proof and thus does not seem to shed new light on the problem. However, it seems to me that the proof does provide more intuition for the Elekes-Sharir framework, and prepares us for more advanced uses, as the one that will be presented in the following post.

Lemma. (Szemerédi) Consider a set $\cal P$ of $n$ points in the plane and a positive constant $k$, such that no line contains $k$ points of $\cal P$. Then $\cal P$ spans $\Omega(n)$ distinct distances.