# Few Distinct Distances Implies Many Points on a Line

In this post I will show that if a planar $n$ point set spans a small number of distinct distances, there must exist a line containing $\Omega(\log n)$ points of the set. This result originated in a conversation between Shakhar Smorodinsky, Frank de Zeeuw, and myself, a few months ago in IPAM. I understand that Orit Raz and Micha Sharir independently obtained similar observations at about the same time.

Shakhar Smorodinsky and Frank de Zeeuw.

Guth and Katz proved that every set of $n$ points in ${\mathbb R}^2$ determines $\Omega(n/ \log n)$ distinct distances. This almost completely settled a conjecture of Erdős , who observed that the $\sqrt{n}\times \sqrt{n}$ integer lattice determines $\Theta(n/\sqrt{\log n})$ distinct distances, and conjectured that every set of $n$ points determines at least this number of distinct distances. Beyond the remaining $\sqrt{\log n}$ gap, a major open problem is characterizing the point sets that determine a small number of distinct distances.

Given a set of $n$ points in ${\mathbb R}^2$, we say that it is near-optimal if it determines $O(n/\sqrt{\log n})$ distinct distances. Erdős asked whether every near-optimal set “has lattice structure”. Since this seems to be a very challenging problem, Erdős wrote “The first step would be to decide if there always is a line which contains $cn^{1/2}$ of the $x_i$ [i.e., of the points of the set], (and in fact $n^{\varepsilon}$ would already be interesting).”

Embarrassingly, almost three decades later the $n^{\varepsilon}$ bound seems as distant as it ever was. Up to now, the best known bound was a variant of a proof of Szemerédi, which was communicated by Erdős.

Theorem 1 (Szemerédi). Let $\cal P$ be a near-optimal set of $n$ points. Then there exists a line containing $\Omega(\sqrt{\log n})$ points of $\cal P$.

Proof.     Let $x$ denote the maximum number of points of $\cal P$ that are on a common line. Let $T = \{(a,p,q)\in {\cal P}^3 \mid |ap|=|aq|\}$, where $a,p,q$ are three distinct points and where $(a,p,q)$ and $(a,q,p)$ are counted as the same triple. The proof is based on double counting $|T|$, and we begin by deriving an upper bound for it. Given a pair of points $p,q\in{\cal P}$, the triplet $(a,p,q)$ is in $T$ if and only if $a$ is on the perpendicular bisector of the segment $pq$. By the assumption, each such perpendicular bisector contains at most $x$ points of ${\cal P}$, which implies

$|T| \le x \binom{n}{2} = \frac{x(n^2-n)}{2} \qquad \qquad (1).$

For the lower bound, notice that for every point $p\in {\cal P}$, the points of ${\cal P}\setminus\{p\}$ are contained in at most $d = O(n/\sqrt{\log n})$ concentric circles around $p$. We denote these circles as $C_{p,1},\ldots,C_{p,d}$ and set $n_{p,i} = |C_{p,i} \cap {\cal P}|$. Notice that $\sum_{i=1}^dn_{p,i} = n-1$ for every $p\in{\cal P}$. By the Cauchy-Schwarz inequality, we have $\sum_{i=1}^d n_{p,i}^2 \ge \frac{1}{d}(n-1)^2$. This in turn implies

$|T| = \sum_{p\in{\cal P}}\sum_{i=1}^d \binom{n_{p,i}}{2} = \frac{1}{2}\sum_{p\in{\cal P}}\sum_{i=1}^d (n_{p,i}^2-n_{p,i})$
$\ge \frac{1}{2}\sum_{p\in{\cal P}} \left( \frac{1}{d}(n-1)^2 - (n-1)\right) = \frac{n(n-1)(n-1-d)}{2d}. \qquad (2)$

Combining $(1)$ and $(2)$ immediately implies

$x \ge \frac{n(n-1)(n-1-d)}{2d} \cdot \frac{2}{(n^2-n)} = \Omega\left(\sqrt{\log n}\right). \qquad \qquad \qquad \qquad \Box$

Szemerédi’s proof is almost identical to the proof of the Kővári–Sós–Turán theorem. We now prove a variant of this theorem.

Theorem 2 (Kővári–Sós–Turán variant). Let $\cal C$ be a set of $n$ circles and let $\cal P$ be a set of $m$ points, both in ${\mathbb R}^2$, such that the incidence graph of ${\cal P}\times{\cal C}$ contains no copy of $K_{2,s}$ (where $s$ may depend on $m,n$). Then

$I({\cal P},{\cal C}) = O(mn^{1/2}s^{1/2}+n).$

Proof.     The proof is a variant of the standard proof of the Kővári–Sós–Turán theorem. We double count the number $Q$ of triplets $(p_1,p_2,c)$, such that $p_1,p_2 \in {\cal P}$, $c \in {\cal C}$, and $c$ is incident to both $p_1$ and $p_2$. On one hand, there are ${m\choose 2}$ pairs of vertices of $\cal P$, and each can have at most $s$ circles of $\cal C$ that are incident to each point of the set. That is, we have

$|Q| \le {m\choose 2}s = \frac{sm(m-1)}{2}. \qquad (3)$

On the other hand, for each $c \in {\cal C}$, let $d_c$ denote the number of points of $\cal P$ that are contained in $c$, and notice that
$I({\cal P},{\cal C}) = \sum_{c\in {\cal C}} d_c$. Then we have

$Q = \sum_{c\in {\cal C}} {d_c \choose 2} = \frac{1}{2} \sum_{c\in {\cal C}} (d_c^2-d_c).$

By applying the Cauchy-Schwarz inequality, we obtain

$|Q| = \frac{1}{2} \sum_{c\in {\cal C}} (d_c^2-d_c) \ge \frac{1}{2} \left(\frac{ \left(\sum_{c\in {\cal C}}d_c\right)^2}{n} - \sum_{c\in {\cal C}}d_c \right) = \frac{I({\cal P},{\cal C})^2 - n\cdot I({\cal P},{\cal C})}{2n}. \qquad (4)$

By combining $(3)$ and $(4)$, we obtain

$\frac{I({\cal P},{\cal C})^2 - n\cdot I({\cal P},{\cal C})}{2n} \le \frac{sm(m-1)}{2}.$

Hence $I({\cal P},{\cal C}) = O(mn^{1/2}s^{1/2}+n)$, as asserted.     $\Box$

Theorem 1 can be easily derived from Theorem 2.

Alternative proof of Theorem 1.     Denote the set of distinct distances that are spanned by $\cal P$ as $D=\{\delta_1,\delta_2,\ldots\}$. Let $\cal C$ be a set of $O(n^2/\sqrt{\log n})$ circles: around every point of $\cal P$ we place $|D|$ circles whose radii are the values of $D$. It can be easily verified that $I({\cal P},{\cal C})=n(n-1)$.

Let $x$ denote the maximum number of points of $\cal P$ that are on a common line. Given two points $p,q\in{\cal P}$, a circle of $\cal C$ can contain both $p$ and $q$ only if the perpendicular bisector of $p,q$ is incident to the center of the circle. This implies that the incidence graph of ${\cal P}\times{\cal C}$ contains no copy of $K_{2,x}$. Thus, by Theorem 2 we have

$I({\cal P},{\cal C}) = O\left(n\left(n^2/\sqrt{\log n}\right)^{1/2}x^{1/2}+n^2/\sqrt{\log n}\right) = O\left(\frac{n^2x^{1/2}}{(\log n)^{1/4}}\right).$

Combining this bound with $I({\cal P},{\cal C})=n(n-1)$ immediately completes the proof.     $\Box$

Now that the proof is reformulated as an incidence problem, we can apply stronger incidence tools to obtain an improved bound. Specifically, we use polynomial partitioning (see my guide to polynomial partitioning here). If you are not interested in polynomial partitioning, you are welcome to skip the proof of the following theorem.

Theorem 3. Let $\cal C$ be a set of $n$ circles and let $\cal P$ be a set of $m$ points, both in ${\mathbb R}^2$, such that the incidence graph of ${\cal P}\times{\cal C}$ contains no copy of $K_{2,s}$ (where $s$ may depend on $m,n$). Then $I({\cal P},{\cal C}) = O(m^{2/3}n^{2/3}s^{1/3}+m+n)$.

Proof.     We consider an $m^{4/3}s^{2/3}/n^{2/3}$-partitioning polynomial $f$ for $\cal P$. According to polynomial partitioning theorem, the degree of $f$ is $O(m^{2/3}s^{1/3}/n^{1/3})$ and every cell in ${\mathbb R}^2\setminus Z(f)$ is an open set that contains at most $n^{2/3}/(m^{1/3}s^{2/3})$ points of $\cal P$. Let $t$ denote the number of cells in (i.e., connected components of) ${\mathbb R}\setminus Z(f)$.

We set ${\cal P}_0 = Z(f) \cap {\cal P}$, and similarly denote by ${\cal C}_0$ the set of circles that are fully contained in $Z(f)$. For $1\le i \le s$, let ${\cal P}_i$ denote the set of points that are contained in the $i$-th cell and let ${\cal C}_i$ denote the set of circles that intersect the $i$‘th cell. Notice that

$I({\cal P},{\cal C}) = I({\cal P}_0,{\cal C}_0) + I({\cal P}_0,{\cal C}\setminus{\cal C}_0) + \sum_{i=1}^t I({\cal P}_i,{\cal C}_i).$

We bound each of these three expressions separately. Consider a circle $c \in {\cal C} \setminus {\cal C}_0$, and notice that $|c \cap Z(f)| =O(m^{2/3}s^{1/3}/n^{1/3})$. By Bézout’s theorem, we have

$I({\cal P}_0,{\cal C} \setminus {\cal C}_0) = O(n\cdot m^{2/3}s^{1/3}/n^{1/3}) = O(m^{2/3}n^{2/3}s^{1/3}).$

Every regular point of $Z(f)$ is incident to at most one circle of ${\cal C}_0$. Thus, points of ${\cal P}_0$ that are on a regular point yield $O(m)$ incidences with ${\cal C}_0$. By Bézout’s theorem, every circle of ${\cal C}_0$ can intersect at most $2(\deg f-1) = O(m^{2/3}s^{1/3}/n^{1/3})$ singular point of $Z(f)$. Therefore, we have

$I({\cal P}_0,{\cal C}_0) = O(m + n\cdot m^{2/3}s^{1/3}/n^{1/3}) = O(m^{2/3}n^{2/3}s^{1/3}+m).$

It remains to bound $\sum_{i=1}^t I({\cal P}_i,{\cal C}_i)$. By Warren’s theorem, the number of cells in the partition is $t = O\left((m^{2/3}s^{1/3}/n^{1/3})^2\right) = O\left(m^{4/3}s^{2/3}/n^{2/3}\right)$. We set $m_i=|{\cal P}_i|\le n^{2/3}/(m^{1/3}s^{2/3})$ and $n_i = |{\cal C}_i|$. Since every circle of ${\cal C} \subset {\cal C}_0$ intersects $Z(f)$ in $O(m^{2/3}s^{1/3}/n^{1/3})$ points, we get $\sum_{i=1}^t n_i= O(m^{2/3}n^{2/3}s^{1/3})$. According to the Cauchy-Schwarz inequality, we have

$\sum_{i=1}^t n_i^{1/2} \le \sqrt{\left(\sum_{i=1}^t n_i\right)\left(\sum_{i=1}^t 1\right)} = O\left(\sqrt{m^{2/3}n^{2/3}s^{1/3}\cdot m^{4/3}s^{2/3}/n^{2/3}}\right) = O(ms^{1/2}).$

By applying Theorem 2 separately in each cell, we obtain

$\sum_{i=1}^t I({\cal P}_i,{\cal C}_i) = O\left(\sum_{i=1}^t (m_in_i^{1/2}s^{1/2}+n_i)\right)$
$= O\left(\frac{n^{2/3}}{m^{1/3}s^{2/3}}s^{1/2}\sum_{i=1}^t n_i^{1/2}+\sum_{i=1}^t n_i\right)$
$= O\left(m^{2/3}n^{2/3}s^{1/3}\right),$

which completes the proof.     $\Box$

By replacing Theorem 2 with Theorem 3 in the alternative proof of Theorem 1, we obtain the following improved bound.

Theorem 4. Let $\cal P$ be a near-optimal set of $n$ points. Then there exists a line containing $\Omega(\log n)$ points of $\cal P$.

Proof sketch.     As before, we have $I({\cal P},{\cal C})=n(n-1)$. By applying Theorem 3, we have

$I({\cal P},{\cal C}) = O\left(n^{2/3}\left(n^2/\sqrt{\log n}\right)^{2/3}x^{1/3}+n^2/\sqrt{\log n} + n\right) = O\left(n^2x^{1/3}/(\log n)^{1/3}\right).$

Combining these two bounds yields the assertion of the theorem.     $\Box$