# The Two Formulations of the Szemerédi–Trotter Theorem

The Szemerédi–Trotter theorem is usually presented in one of the two following formulations.

Theorem 1. Given a set of $m$ points and a set of $n$ lines, both in ${\mathbb R}^2$, the number of incidences between the two sets is $O(m^{2/3}n^{2/3}+m+n)$.

Theorem 2. Given a set of $n$ lines in ${\mathbb R}^2$, the number of points in ${\mathbb R}^2$ that are incident to at least $k$ of the lines is $O\left(\frac{n^2}{k^3}+\frac{n}{k}\right)$ (for $2\le k \le n$).

A third formulation, symmetric to the one in Theorem 2, considers the number of lines that are incident to at least $k$ points of a given point set. We ignore this formulation here.

Many papers mention that the statements of Theorems 1 and 2 are equivalent, and some also show how Theorem 2 can be easily derived from Theorem 1 (including this Wikipedia page). On the other hand, I could not locate a single source that shows how to derive Theorem 1 from Theorem 2. At first it might seem as if a simple dyadic decomposition would do the trick. However, this approach leads to an extra logarithmic factor in the bound.

This post presents a simple elegant argument that leads to an extra logarithmic factor. It then present a longer proof that does not yield this extra factor. The latter proof was recently presented to me by Noam Solomon. If you have a more elegant proof, I would be very happy to hear about it.

Noam Solomon.

Proof with extra logarithmic factor.     Consider a set $\cal P$ of $m$ points and a set $\cal L$ of $n$ lines. Let $m_i$ denote the number of points of $\cal P$ that are incident to more than $2^{i-1}$ lines of $\cal L$ and to at most $2^i$ such lines. We set $r = \left\lceil\log \left(n^{2/3}/m^{1/3}\right) \right\rceil$. Since $m_i \le m$ obviously holds for every $i$ and by applying Theorem 2, we have

$I({\cal P},{\cal L}) \le \sum_{i\ge 0} m_i 2^i \le \sum_{i=0}^{r} m 2^i + \sum_{i= r+1}^{\log n} m_i 2^i$
$= O\left(m^{2/3}n^{2/3}+m + \sum_{i= r+1}^{\log n} \left(\frac{n^2}{2^{2i}}+n\right)\right)$
$= O\left(m^{2/3}n^{2/3}+m+n\log n\right). \qquad \qquad \Box$

We now move to the proof that does not yield an extra logarithmic factor.

Full proof.     Let $d$ denote the constant in the $O(\cdot)$-notation in the bound of Theorem 2, and assume that $d\ge1$. We prove by induction on $m$ that for every set $\cal P$ of $m$ points and set $\cal L$ of $n$ lines, $I({\cal P},{\cal L})\le c(m^{2/3}n^{2/3}+m+n)$ for a sufficiently large constant $c$. For the induction basis, the claim holds for small values of $m$ (say, $m\le 100$) by taking $c$ to be sufficiently large. For the induction step, let ${\cal P}_1$ denote the set of points of ${\cal P}$ that are incident to less than $10dn^{2/3}/m^{1/3}$ lines of $\cal L$. By taking $c$ to be larger than $100d$, we have

$I({\cal P}_1,{\cal L}) < \frac{10dn^{2/3}}{m^{1/3}} \cdot m = 10dn^{2/3}m^{2/3}< \frac{c}{10}n^{2/3}m^{2/3}. \qquad (1)$

Let ${\cal P}_2 = {\cal P} \setminus {\cal P}_1$ denote the points of $\cal P$ that are incident to at least $10dn^{2/3}/m^{1/3}$ lines of $\cal L$. If $|{\cal P}_2| \le m/2$, then by the induction hypothesis we have $I({\cal P}_2,{\cal L}) \le c((m/2)^{2/3}n^{2/3}+m/2+n)$. By combining this with $(1)$ we get $I({\cal P},{\cal L}) \le c(m^{2/3}n^{2/3}+m+n)$.

It remains to consider the case where $|{\cal P}_2| > m/2$. Since every point of ${\cal P}_2$ is incident to at least $10dn^{2/3}/m^{1/3}$ lines of $\cal L$, by Theorem 2 we have

$m/2 < |{\cal P}_2| \le \frac{dn^2}{(10dn^{2/3}/m^{1/3})^3}+\frac{dn}{10dn^{2/3}/m^{1/3}} = \frac{m}{1000d^2}+\frac{n^{1/3}m^{1/3}}{10}. \qquad (1)$

By recalling that $d\ge 1$, we have $0.499m \le n^{1/3}m^{1/3}/10$, which in turn leads to $m > \sqrt{n}$. By plugging this into the bound $I({\cal P}_2,{\cal L})= O(m\sqrt{n}+m)$ (obtained by a simple application of the Cauchy-Schwarz inequality) and taking $c$ to be sufficiently large with respect to the constant in the $O(\cdot)$-notation, we obtain $I({\cal P}_2,{\cal L}) \le \frac{c}{2}(m^{2/3}n^{2/3}+n)$. Combining this bound with $(1)$ completes the proof.     $\Box$

## 2 thoughts on “The Two Formulations of the Szemerédi–Trotter Theorem”

1. Very nice! This has bothered me for a long time too. A slight shorter proof, based on the same principles is as follows: let P_1 denote the set of points incident to at most |L|^1/2 lines, and let P_2 denote the set of points incident to more than |L|^1/2 lines. Then I(P,L) = I(P_1,L) + I(P_2,L). Apply the dyadic decomposition proof the first quantity (but summing only over $2^i<|L|^1/2$, where the first term $n^2/k^3$ dominates), and use the Cauchy-Schwarz bound on the second quantity, using the fact that |P_2|<<|L|^1/2 to obtain I(P_2,L) < |P_2||L|^1/2 + |L| << |L|.

2. Nice one Brendan. That does simplify the proof. I’m polishing my lecture notes now, and will add this proof.