# The k-set problem

The $k$-set problem is a classic open problem in Combinatorial Geometry, which is considered similar in spirit to incidence problems and to distance problems. While in recent years algebraic techniques led to significant progress in incidence and distance problems, so far none of the new methods were applied to the $k$-set problem. This post briefly introduces the problem, and describes a way of reducing it to an incidence problem. I hope that it will get someone to think of the $k$-set problem in an algebraic context and maybe come up with some new observations.

Let ${\cal P}$ be a set of $n$ points in ${\mathbb R}^2$, and let $1\le k\le n/2$ be an integer. A $k$-set of ${\cal P}$ is a subset ${\cal P}'\subset {\cal P}$ of $k$ points such that there exists a line that separates ${\cal P}'$ from ${\cal P}\setminus{\cal P}'$. That is, ${\cal P}'$ is on one open half-plane defined by the line, while ${\cal P}\setminus{\cal P}'$ is on the other half-plane. The following figure depicts two 5-sets of a set of 12 points.

The $k$-set problem asks for the maximum number of $k$-sets that a set of $n$-points in ${\mathbb R}^2$ can have. As a first silly example, note that the maximum number of 1-sets is exactly $n$. It is not difficult to show that we may assume that no three points are collinear (specifically, perturbing the point set cannot significantly decrease the number of $k$-sets).

The number of subsets of $k$ points of ${\cal P}$ is $\Theta(n^k)$, so this is a straightforward upper bound for the maximum number of $k$ sets. However, we can easily obtain a significantly stronger upper bound. Consider a $k$-set ${\cal P}'\subset{\cal P}$ that is separated by a line $\ell$. We translate $\ell$ until it is incident to a point $p\in {\cal P}'$ and then rotate $\ell$ around $p$ until it is incident to a second point of ${\cal P}$ (not necessarily in ${\cal P}'$). For example, see the figure below. Thus, to find all of the $k$-sets of ${\cal P}$ it suffices to consider every line that is incident to two points of ${\cal P}$. Since there are $O(n^2)$ such lines, this is also an upper bound for the number of $k$-sets.

# Additive Energy of Real Point Sets

Over the years, more and more interactions between Discrete Geometry and Additive Combinatorics are being exposed. These include results such as the Green–Tao ordinary lines theorem and Solymosi’s sum-product bound. One reason for this connection is that both fields study the structure and symmetries of various objects (such as sets of points or subsets of additive groups). In this post I will discuss one of the simplest connections between the two fields — studying the additive energy of a set of points in a real space ${\mathbb R}^d$. The main goal of the post is to present two open problems that involve the additive energy of such sets. I heard one of these problems from Nets Katz and the other from Ciprian Demeter. In future posts we might discuss more involved interactions between the two fields.

Ciprian Demeter and Nets Katz.

# Incidences Outside of Discrete Geometry

Starting the 1980’s, more and more applications of geometric incidences have been found in discrete geometry. By now, one can arguably claim that there are countless such applications. More surprising are the applications of incidences outside of discrete geometry. In the past decade such applications have been discovered in harmonic analysis, theoretical computer science, number theory, and more. This post is my attempt to list these applications. If you are familiar with additional applications that I did not include, I’d be very interested to hear about them (in the comments below or by email).

Additive combinatorics: One of the main milestones in the study of the sum-product problem is a seminal paper by Elekes, in which he reduced the problem to a point-line incidence problem in ${\mathbb R}^2$. This led to a variety of results in additive combinatorics that rely on incidences. Tao and Vu’s additive combinatorics book contains a full chapter that is dedicated to incidences. Konyagin and Shkredov’s recent improvement of the sum-product bound also relies indirectly on incidences (to bound the number of collinear triples in a lattice). For more examples, see here and here. Also, several additive combinatorics results in finite fields rely on incidence bounds in finite fields (e.g., see this paper).

Fourier restriction problems: Bourgain and Demeter rely on incidences to derive bounds for the discrete Fourier restriction to the four and five dimensional spheres. Incidences also appear in several related papers by the same authors (e.g., see here and here).

Number theory: A recent paper of Bombieri and Bourgain studies a type of additive energy of the Gaussian integers by relying on incidences with unit circles in ${\mathbb R}^2$.

Error correcting codes: A family of papers by Dvir, Saraf, Wigderson, and others use incidences to study error correcting codes. For example, see this paper and this paper. These papers rely mainly on Sylvester-Gallai problems, which do not exactly fit the standard incidence problem formulation. Still, the authors of these paper and others consider these as proper incidence problems.

Extractors: Yet another novel use of incidences by Jean Bourgain. Bourgain relied on a finite field point-line incidence bound to build 2-source extractors. Recently this result has been superseded, as far as I understand without relying on incidences in any way.

Minors of totally positive matrices: Farber, Ray, and Smorodinsky used incidences to bound the number of times that a $d\times d$ minor can repeat in $d\times n$ a totally positive matrix.

It might also be interesting to state results that do not rely on incidences but do rely on the same polynomial partitioning technique (applied in the same way):

More restriction problems: Guth recently relied on polynomial partitioning to derive improved restriction estimates for smooth compact surface in ${\mathbb R}^3$ with strictly positive second fundamental form.

Range searching algorithms: Agarwal, Matoušek,and Sharir used polynomial partitioning to derive range searching algorithms for semialgebraic sets. A later simplification of this result, also based on polynomial partitioning, was derived by Matoušek and Patáková.

# Incidences: Lower Bounds (part 8)

We continue to survey the known lower bounds for incidence problems (the previous post in this series can be found here). After a couple of posts in ${\mathbb R}^3$ we now return to ${\mathbb R}^2$, to introduce a new technique that I recently learned about. This technique was discovered by Jozsef Solymosi, and is described in a work of Solymosi and Endre Szabó that is unpublished at this point. As seems to often be the case with results of Solymosi, it is quite simple and elegent. In ${\mathbb R}^2$ this technique yields bounds that can also be obtained from Elekes’ technique (discussed in a previous post), but in higher dimensions it leads to new bounds that are also tight (which would hopefully discuss in a future post). I would like to thank Jozsef Solymosi for allowing me to describe his technique in this blog, and to Frank de Zeeuw for discussions that led to this post.

Jozsef Solymosi.

# Incidences: Lower Bounds (part 6)

Last year I surveyed the known lower bounds for incidence problems in the plane (click here for the previous post in the series). I now return to this series of posts to survey the lower bounds that are known in higher dimensions.

From what is already known about incidences with curves in ${\mathbb R}^d$, it seems likely that the maximum number of incidences is usually obtained when the points and curves lie in a constant-degree two-dimensional surface. For example, Guth and Katz proved that the maximum number of point-line incidences in ${\mathbb R}^3$ is obtained when the points and lines are in a common plane. While incidences with general curves are not fully understood even in the plane, various recent results hint that this is also the case for general curves in ${\mathbb R}^d$ (for example, see here and here).

Due to the above, incidences with curves in ${\mathbb R}^d$ are usually studied with a restriction on the number of curves that can be contained in various types of constant-degree surfaces; another reason for studying such restrictions is that they arise from interesting problems. For example, Guth and Katz proved the following theorem.

Theorem 1. Let $\cal P$ be a set of $m$ points and let $\cal L$ be a set of $n$ lines, both in ${\mathbb R}^3$, such that every plane contains $O(\sqrt{n})$ lines of $\cal L$. Then

$I({\cal P},{\cal L}) = O\left(m^{1/2}n^{3/4}+n+m\right).$

Currently, all of the known lower bounds for incidences with curves in ${\mathbb R}^d$ are simple extensions of the planar bounds that we studied in the previous posts. In the current post we only present one such bound. Specifically, we extend Elekes’ planar construction to show that Theorem 1 is tight. The following posts will focus on the more challenging issues that arise for incidences with higher-dimensional objects.

Claim 1. Consider integers $m$ and $n$ that satisfy $m \ge 4\sqrt{n}$ and $m\le n^{3/2}$. Then there exist a set $\cal P$ of $m$ points and a set $\cal L$ of $n$ lines, both in ${\mathbb R}^3$, such that every plane contains $O(\sqrt{n})$ lines of $\cal L$ and

$I({\cal P},{\cal L})= \Theta\left(m^{1/2}n^{3/4} \right).$

Proof.     Set $k = m^{1/2}/(2n^{1/4})$ and $\ell = \sqrt{2}n^{3/8}/m^{1/4}$, and let

${\cal P} = \left\{(r,s,t)\in {\mathbb N}^3 : 1\le r \le k \ \text{ and } \ 1\le s,t \le 2k\ell \right\}.$

For our set of lines, we take

${\cal L} = \left\{y-ax-b,z-cx-d : 1 \le a,c \le \ell \ \text{ and } \ 1\le b,d \le k\ell \right\}.$

First, notice that we indeed have

$|{\cal P}| = k (2k\ell)^2 = 4k^3 \ell^2 = 4 \cdot \frac{m^{3/2}}{8n^{3/4}} \cdot \frac{2n^{3/4}}{m^{1/2}} = m,$
$|{\cal L}| = k^2\ell^4 = \frac{m}{4n^{1/2}} \cdot \frac{4n^{3/2}}{m} = n.$

For any line $\gamma \in {\cal L}$ and $1\le r \le k$, there exists a unique point in $\cal P$ that is incident to $\gamma$ and whose $x$-coordinate is $r$. That is, every line of $\cal L$ is incident to exactly $k$ points of $\cal P$. Thus, we have

$I({\cal P},{\cal L}) = k \cdot n = \frac{m^{1/2}}{2n^{1/4}} \cdot n= \frac{m^{1/2}n^{3/4}}{2}.$

It remains to verify that every plane contains $O(\sqrt{n})$ lines of $\cal L$. We first consider a plane $h$ that is defined by an equation of the form $y=sx+t$ (that is, a plane that contains lines that are parallel to the $z$-axis). Such a plane $h$ contains exactly the lines that have $a=s$ and $b=t$ in their definition in $\cal L$. There are $k\ell^2= \Theta(\sqrt{n})$ such lines.

Next, consider a plane $h$ that is not defined by an equation of the form $y=sx+t$. In this case, for every choice of $a,b$, the plane $h' =Z(y-ax-b)$ intersects $h$ in a unique line. That is, for every choice of $a,b$ as in the definition of $\cal L$, there is at most one line of $\cal L$ with these parameters that is contained in $h$. Thus, $h$ contains $k\ell^2= \Theta(\sqrt{n})$ lines of $\cal L$.

# The Two Formulations of the Szemerédi–Trotter Theorem

The Szemerédi–Trotter theorem is usually presented in one of the two following formulations.

Theorem 1. Given a set of $m$ points and a set of $n$ lines, both in ${\mathbb R}^2$, the number of incidences between the two sets is $O(m^{2/3}n^{2/3}+m+n)$.

Theorem 2. Given a set of $n$ lines in ${\mathbb R}^2$, the number of points in ${\mathbb R}^2$ that are incident to at least $k$ of the lines is $O\left(\frac{n^2}{k^3}+\frac{n}{k}\right)$ (for $2\le k \le n$).

A third formulation, symmetric to the one in Theorem 2, considers the number of lines that are incident to at least $k$ points of a given point set. We ignore this formulation here.

Many papers mention that the statements of Theorems 1 and 2 are equivalent, and some also show how Theorem 2 can be easily derived from Theorem 1 (including this Wikipedia page). On the other hand, I could not locate a single source that shows how to derive Theorem 1 from Theorem 2. At first it might seem as if a simple dyadic decomposition would do the trick. However, this approach leads to an extra logarithmic factor in the bound.

This post presents a simple elegant argument that leads to an extra logarithmic factor. It then present a longer proof that does not yield this extra factor. The latter proof was recently presented to me by Noam Solomon. If you have a more elegant proof, I would be very happy to hear about it.

Noam Solomon.

Proof with extra logarithmic factor.     Consider a set $\cal P$ of $m$ points and a set $\cal L$ of $n$ lines. Let $m_i$ denote the number of points of $\cal P$ that are incident to more than $2^{i-1}$ lines of $\cal L$ and to at most $2^i$ such lines. We set $r = \left\lceil\log \left(n^{2/3}/m^{1/3}\right) \right\rceil$. Since $m_i \le m$ obviously holds for every $i$ and by applying Theorem 2, we have

$I({\cal P},{\cal L}) \le \sum_{i\ge 0} m_i 2^i \le \sum_{i=0}^{r} m 2^i + \sum_{i= r+1}^{\log n} m_i 2^i$
$= O\left(m^{2/3}n^{2/3}+m + \sum_{i= r+1}^{\log n} \left(\frac{n^2}{2^{2i}}+n\right)\right)$
$= O\left(m^{2/3}n^{2/3}+m+n\log n\right). \qquad \qquad \Box$

We now move to the proof that does not yield an extra logarithmic factor.

Full proof.     Let $d$ denote the constant in the $O(\cdot)$-notation in the bound of Theorem 2, and assume that $d\ge1$. We prove by induction on $m$ that for every set $\cal P$ of $m$ points and set $\cal L$ of $n$ lines, $I({\cal P},{\cal L})\le c(m^{2/3}n^{2/3}+m+n)$ for a sufficiently large constant $c$. For the induction basis, the claim holds for small values of $m$ (say, $m\le 100$) by taking $c$ to be sufficiently large. For the induction step, let ${\cal P}_1$ denote the set of points of ${\cal P}$ that are incident to less than $10dn^{2/3}/m^{1/3}$ lines of $\cal L$. By taking $c$ to be larger than $100d$, we have

$I({\cal P}_1,{\cal L}) < \frac{10dn^{2/3}}{m^{1/3}} \cdot m = 10dn^{2/3}m^{2/3}< \frac{c}{10}n^{2/3}m^{2/3}. \qquad (1)$

Let ${\cal P}_2 = {\cal P} \setminus {\cal P}_1$ denote the points of $\cal P$ that are incident to at least $10dn^{2/3}/m^{1/3}$ lines of $\cal L$. If $|{\cal P}_2| \le m/2$, then by the induction hypothesis we have $I({\cal P}_2,{\cal L}) \le c((m/2)^{2/3}n^{2/3}+m/2+n)$. By combining this with $(1)$ we get $I({\cal P},{\cal L}) \le c(m^{2/3}n^{2/3}+m+n)$.

It remains to consider the case where $|{\cal P}_2| > m/2$. Since every point of ${\cal P}_2$ is incident to at least $10dn^{2/3}/m^{1/3}$ lines of $\cal L$, by Theorem 2 we have

$m/2 < |{\cal P}_2| \le \frac{dn^2}{(10dn^{2/3}/m^{1/3})^3}+\frac{dn}{10dn^{2/3}/m^{1/3}} = \frac{m}{1000d^2}+\frac{n^{1/3}m^{1/3}}{10}. \qquad (1)$

By recalling that $d\ge 1$, we have $0.499m \le n^{1/3}m^{1/3}/10$, which in turn leads to $m > \sqrt{n}$. By plugging this into the bound $I({\cal P}_2,{\cal L})= O(m\sqrt{n}+m)$ (obtained by a simple application of the Cauchy-Schwarz inequality) and taking $c$ to be sufficiently large with respect to the constant in the $O(\cdot)$-notation, we obtain $I({\cal P}_2,{\cal L}) \le \frac{c}{2}(m^{2/3}n^{2/3}+n)$. Combining this bound with $(1)$ completes the proof.     $\Box$

# Incidences in the complex plane

Recently Joshua Zahl and I have uploaded to arXiv a paper about incidence in ${\mathbb C}^2$. In this post I briefly survey what was previously known concerning such incidences, and then describe what the new paper is about.

Joshua Zahl.

Let $\Gamma$ be a set of curves and let ${\cal P}$ be a set of points, both in ${\mathbb F}^2$ for some field $\mathbb F$. We say that the arrangement $({\cal P},\Gamma)$ has $k$ degrees of freedom and multiplicity-type $s$ if

• For any $k$ points from $\cal P$, there are at most $s$ curves from $\Gamma$ that contain all of them.
• Any pair of curves from $\Gamma$ intersect in at most $s$ points.
The following theorem is currently the best known general incidence bound in ${\mathbb R}^2$ (better bounds are known for some special cases, such as circles and parabolas). For a proof, see for example this paper by Pach and Sharir.

Theorem 1. Let $\cal P$ be a set of $m$ points and let $\Gamma$ be a set of $n$ algebraic curves of degree at most $D$, both in ${\mathbb R}^2$. Suppose that $({\cal P},\Gamma)$ has $k$ degrees of freedom and multiplicity type $s$. Then

$I({\cal P},\Gamma) = O_{k,s,D}\big(m^{\frac{k}{2k-1}}n^{\frac{2k-2}{2k-1}}+m+n\big).$

In the complex plane, things progressed rather slowly. Csaba Tóth derived a matching incidence bound for the special case of complex lines in ${\mathbb C}^2$ (i.e., the Szemerédi–Trotter bound). This is the longest case of refereeing that I know of, and if I recall correctly it took 13 years (!!!) for the journal to accept this paper (and it is still not published). In the meantime, a similar result was independently obtained by Solymosi and Tao, by using the polynomial method.