# Few Distinct Distances Implies Many Points on a Line

In this post I will show that if a planar $n$ point set spans a small number of distinct distances, there must exist a line containing $\Omega(\log n)$ points of the set. This result originated in a conversation between Shakhar Smorodinsky, Frank de Zeeuw, and myself, a few months ago in IPAM. I understand that Orit Raz and Micha Sharir independently obtained similar observations at about the same time.

Guth and Katz proved that every set of $n$ points in ${\mathbb R}^2$ determines $\Omega(n/ \log n)$ distinct distances. This almost completely settled a conjecture of Erdős , who observed that the $\sqrt{n}\times \sqrt{n}$ integer lattice determines $\Theta(n/\sqrt{\log n})$ distinct distances, and conjectured that every set of $n$ points determines at least this number of distinct distances. Beyond the remaining $\sqrt{\log n}$ gap, a major open problem is characterizing the point sets that determine a small number of distinct distances.

Given a set of $n$ points in ${\mathbb R}^2$, we say that it is near-optimal if it determines $O(n/\sqrt{\log n})$ distinct distances. Erdős asked whether every near-optimal set “has lattice structure”. Since this seems to be a very challenging problem, Erdős wrote “The first step would be to decide if there always is a line which contains $cn^{1/2}$ of the $x_i$ [i.e., of the points of the set], (and in fact $n^{\varepsilon}$ would already be interesting).”

Embarrassingly, almost three decades later the $n^{\varepsilon}$ bound seems as distant as it ever was. Up to now, the best known bound was a variant of a proof of Szemerédi, which was communicated by Erdős.

Theorem 1 (Szemerédi). Let $\cal P$ be a near-optimal set of $n$ points. Then there exists a line containing $\Omega(\sqrt{\log n})$ points of $\cal P$.

Proof.     Let $x$ denote the maximum number of points of $\cal P$ that are on a common line. Let $T = \{(a,p,q)\in {\cal P}^3 \mid |ap|=|aq|\}$, where $a,p,q$ are three distinct points and where $(a,p,q)$ and $(a,q,p)$ are counted as the same triple. The proof is based on double counting $|T|$, and we begin by deriving an upper bound for it. Given a pair of points $p,q\in{\cal P}$, the triplet $(a,p,q)$ is in $T$ if and only if $a$ is on the perpendicular bisector of the segment $pq$. By the assumption, each such perpendicular bisector contains at most $x$ points of ${\cal P}$, which implies $|T| \le x \binom{n}{2} = \frac{x(n^2-n)}{2} \qquad \qquad (1).$

For the lower bound, notice that for every point $p\in {\cal P}$, the points of ${\cal P}\setminus\{p\}$ are contained in at most $d = O(n/\sqrt{\log n})$ concentric circles around $p$. We denote these circles as $C_{p,1},\ldots,C_{p,d}$ and set $n_{p,i} = |C_{p,i} \cap {\cal P}|$. Notice that $\sum_{i=1}^dn_{p,i} = n-1$ for every $p\in{\cal P}$. By the Cauchy-Schwarz inequality, we have $\sum_{i=1}^d n_{p,i}^2 \ge \frac{1}{d}(n-1)^2$. This in turn implies $|T| = \sum_{p\in{\cal P}}\sum_{i=1}^d \binom{n_{p,i}}{2} = \frac{1}{2}\sum_{p\in{\cal P}}\sum_{i=1}^d (n_{p,i}^2-n_{p,i})$ $\ge \frac{1}{2}\sum_{p\in{\cal P}} \left( \frac{1}{d}(n-1)^2 - (n-1)\right) = \frac{n(n-1)(n-1-d)}{2d}. \qquad (2)$

Combining $(1)$ and $(2)$ immediately implies $x \ge \frac{n(n-1)(n-1-d)}{2d} \cdot \frac{2}{(n^2-n)} = \Omega\left(\sqrt{\log n}\right). \qquad \qquad \qquad \qquad \Box$

Szemerédi’s proof is almost identical to the proof of the Kővári–Sós–Turán theorem. We now prove a variant of this theorem.

Theorem 2 (Kővári–Sós–Turán variant). Let $\cal C$ be a set of $n$ circles and let $\cal P$ be a set of $m$ points, both in ${\mathbb R}^2$, such that the incidence graph of ${\cal P}\times{\cal C}$ contains no copy of $K_{2,s}$ (where $s$ may depend on $m,n$). Then $I({\cal P},{\cal C}) = O(mn^{1/2}s^{1/2}+n).$