The Two Formulations of the Szemerédi–Trotter Theorem

The Szemerédi–Trotter theorem is usually presented in one of the two following formulations.

Theorem 1. Given a set of m  points and a set of n  lines, both in {\mathbb R}^2  , the number of incidences between the two sets is O(m^{2/3}n^{2/3}+m+n)  .

Theorem 2. Given a set of n  lines in {\mathbb R}^2  , the number of points in {\mathbb R}^2  that are incident to at least k  of the lines is O\left(\frac{n^2}{k^3}+\frac{n}{k}\right)  (for 2\le k \le n  ).

A third formulation, symmetric to the one in Theorem 2, considers the number of lines that are incident to at least k  points of a given point set. We ignore this formulation here.

Many papers mention that the statements of Theorems 1 and 2 are equivalent, and some also show how Theorem 2 can be easily derived from Theorem 1 (including this Wikipedia page). On the other hand, I could not locate a single source that shows how to derive Theorem 1 from Theorem 2. At first it might seem as if a simple dyadic decomposition would do the trick. However, this approach leads to an extra logarithmic factor in the bound.

This post presents a simple elegant argument that leads to an extra logarithmic factor. It then present a longer proof that does not yield this extra factor. The latter proof was recently presented to me by Noam Solomon. If you have a more elegant proof, I would be very happy to hear about it.

noamSolomon
Noam Solomon.

Proof with extra logarithmic factor.     Consider a set \cal P  of m  points and a set \cal L  of n  lines. Let m_i  denote the number of points of \cal P  that are incident to more than 2^{i-1}  lines of \cal L  and to at most 2^i  such lines. We set r = \left\lceil\log \left(n^{2/3}/m^{1/3}\right) \right\rceil  . Since m_i \le m  obviously holds for every i  and by applying Theorem 2, we have

I({\cal P},{\cal L}) \le \sum_{i\ge 0} m_i 2^i \le \sum_{i=0}^{r} m 2^i + \sum_{i= r+1}^{\log n} m_i 2^i
= O\left(m^{2/3}n^{2/3}+m + \sum_{i= r+1}^{\log n} \left(\frac{n^2}{2^{2i}}+n\right)\right)
= O\left(m^{2/3}n^{2/3}+m+n\log n\right). \qquad \qquad \Box

We now move to the proof that does not yield an extra logarithmic factor.

Full proof.     Let d  denote the constant in the O(\cdot)  -notation in the bound of Theorem 2, and assume that d\ge1  . We prove by induction on m  that for every set \cal P  of m  points and set \cal L  of n  lines, I({\cal P},{\cal L})\le c(m^{2/3}n^{2/3}+m+n)  for a sufficiently large constant c  . For the induction basis, the claim holds for small values of m  (say, m\le 100  ) by taking c  to be sufficiently large. For the induction step, let {\cal P}_1  denote the set of points of {\cal P}  that are incident to less than 10dn^{2/3}/m^{1/3}  lines of \cal L  . By taking c  to be larger than 100d  , we have

I({\cal P}_1,{\cal L}) < \frac{10dn^{2/3}}{m^{1/3}} \cdot m = 10dn^{2/3}m^{2/3}< \frac{c}{10}n^{2/3}m^{2/3}. \qquad (1)

Let {\cal P}_2 = {\cal P} \setminus {\cal P}_1  denote the points of \cal P  that are incident to at least 10dn^{2/3}/m^{1/3}  lines of \cal L  . If |{\cal P}_2| \le m/2  , then by the induction hypothesis we have I({\cal P}_2,{\cal L}) \le c((m/2)^{2/3}n^{2/3}+m/2+n)  . By combining this with (1)  we get I({\cal P},{\cal L}) \le c(m^{2/3}n^{2/3}+m+n)  .

It remains to consider the case where |{\cal P}_2| > m/2  . Since every point of {\cal P}_2  is incident to at least 10dn^{2/3}/m^{1/3}  lines of \cal L  , by Theorem 2 we have

m/2 < |{\cal P}_2| \le \frac{dn^2}{(10dn^{2/3}/m^{1/3})^3}+\frac{dn}{10dn^{2/3}/m^{1/3}} = \frac{m}{1000d^2}+\frac{n^{1/3}m^{1/3}}{10}. \qquad (1)

By recalling that d\ge 1  , we have 0.499m \le n^{1/3}m^{1/3}/10  , which in turn leads to m > \sqrt{n}  . By plugging this into the bound I({\cal P}_2,{\cal L})= O(m\sqrt{n}+m)  (obtained by a simple application of the Cauchy-Schwarz inequality) and taking c  to be sufficiently large with respect to the constant in the O(\cdot)  -notation, we obtain I({\cal P}_2,{\cal L}) \le \frac{c}{2}(m^{2/3}n^{2/3}+n)  . Combining this bound with (1)  completes the proof.     \Box

Advertisements

2 thoughts on “The Two Formulations of the Szemerédi–Trotter Theorem

  1. Very nice! This has bothered me for a long time too. A slight shorter proof, based on the same principles is as follows: let P_1 denote the set of points incident to at most |L|^1/2 lines, and let P_2 denote the set of points incident to more than |L|^1/2 lines. Then I(P,L) = I(P_1,L) + I(P_2,L). Apply the dyadic decomposition proof the first quantity (but summing only over $2^i<|L|^1/2$, where the first term $n^2/k^3$ dominates), and use the Cauchy-Schwarz bound on the second quantity, using the fact that |P_2|<<|L|^1/2 to obtain I(P_2,L) < |P_2||L|^1/2 + |L| << |L|.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s