# Difference Sets with Local Properties

I recently attended a wonderful workshop about Algebraic methods in combinatorics, which took place in Harvard’s CMSA. There were many interesting participants from a variety of combinatorial fields, and a very friendly/productive atmosphere. My talk focused on a recent work with Cosmin Pohoata, and I also mentioned some distinct distances result that we derived. During the talk Zeev Dvir asked about an additive variant of the problem. After thinking about this variant for a bit, I think that it is a natural interesting problem. Surprisingly, so far I did not manage to find any hint of previous work on it (this might say more about my search capabilities than about the problem…)

Zeev Dvir and Cosmin Pohoata.

Let $\phi(n,k,\ell)$ denote the minimum size $A-A$ can have, when $A$ is a set of $n$ real numbers with the property that for any $A' \subset A$ with $|A'|=k$ we have $|A'-A'|\ge \ell$. That is, by having a local additive property of every small subset, we wish to obtain a global additive property of the entire set. For simplicity, we will ignore zero in the difference set. Similarly, we will ignore negative differences. These assumptions do not change the problem, but make it easier to discuss.

As a first example, note that $\phi(n,3,3)$ is the minimum number of differences determined by a set of $n$ reals with no 3-term arithmetic progressions. Behrend’s construction is a set $A$ of positive integers $a_1< a_2 < \cdots < a_n$ with no 3-term arithmetic progression and $a_n < n2^{O(\sqrt{\log n})}$. Thus, $\phi(n,3,3) < n2^{O(\sqrt{\log n})}$.

For another simple example, Consider a constant $k\ge 4$. Since we consider only positive differences, any set of $k$ reals determines at most $\binom{k}{2}$ differences. If a specific difference $d$ repeats $\lfloor k/2 \rfloor$ times, then by taking the numbers that span $d$ we obtain $A'\subset A$ such that $|A'|\le k$ and $|A'-A'| \le \binom{k}{2}- \lfloor k/2 \rfloor+1$. Thus, by asking every subset of size $k$ to span at least $\binom{k}{2}- \lfloor k/2 \rfloor+2$ differences, we obtain that no difference repeats $\lfloor k/2 \rfloor$ times in $A$. In other words

$\phi\left(n,k,\binom{k}{2}-\lfloor k/2 \rfloor +2 \right) = \Omega\left(n^2\right).$

Repeating a simple argument of Erdős and Gyárfás gives

$\phi\left(n,k,\binom{k}{2}-\lfloor k/2 \rfloor +1\right) = \Omega\left(n^{4/3}\right).$

That is, when moving from $\ell = \binom{k}{2}-\lfloor k/2 \rfloor +2$ to $\ell = \binom{k}{2}-\lfloor k/2 \rfloor +1$, we move from a trivial problem to a wide open one. My work with Cosmin Pohoata leads to the following result.

Theorem 1. For any $d\ge 2$ there exists $c$ such that

$\phi\left(n,k,\binom{k}{2}-k\frac{d}{d+1}+c\right) =\Omega\left(n^{1+1/d} \right).$

For example, when $d=2$ we get the bound

$\phi\left(n,k,\binom{k}{2}-\frac{2k}{3}+c\right) =\Omega\left(n^{3/2} \right).$

When $d=3$ we get a significant improvement for the range of the Erdős-Gyárfás bound:

$\phi\left(n,k,\binom{k}{2}-\frac{2k}{3}+c\right) =\Omega\left(n^{3/2} \right). \qquad \qquad \qquad (1)$

Since not much is known for this problem, it seems plausible that additional bounds could be obtained using current tools. Our technique does not rely on any additive properties, and holds for a more abstract scenario of graphs with colored edges. Hopefully in the case of difference sets one would be able to use additive properties to improve the bounds. Moreover, so far I know nothing about much smaller values of $\ell$, such as $\phi(n,k,100k)$.

Proof sketch for Theorem 1. For simplicity, let us consider the case of $d=3$, as stated in $(1)$. Other values of $d$ are handled in a similar manner. Let $A$ be a set of $n$ reals, such that any $A'\subset A$ of size $k$ satisfies $|A'-A'|\ge \binom{k}{2}-\frac{3k}{4}+13$. We define the third distance energy of $A$ as

$E_3(A) = \left|\left\{(a_1,a_2,a_3,b_1,b_2,b_3) \in A^6 :\, a_1-b_1=a_2-b_2=a_3-b_3 >0 \right\}\right|.$

The proof is based on double counting $E_3(A)$. For $\delta\in {\mathbb R}$, let $m_\delta = \left|\left\{(a,b)\in A^2 : a-b = \delta\right\}\right|$. That is, $m_\delta$ is the number of representations of $\delta$ as a difference of two elements of $A$. Note that the number of 6-tuples that satisfy $a_1-b_1=a_2-b_2=a_3-b_3$ is $m_\delta^3$. A simple application of Hölder ‘s inequality implies

$E_3(A) = \sum_{\delta>0} m_\delta^3 \ge \frac{n^6}{|A-A|^2}.$

To obtain a lower bound for $|A-A|$, it remains to derive an upper bound for $E_3(A)$.

For $j\in {\mathbb N}$ let $k_j$ denote the number of differences $\delta \in {\mathbb R}^+$ such that $m_\delta \ge j$. A dyadic decomposition gives

$E_3(A) = \sum_{\delta>0} m_\delta^3 = \sum_{j=1}^{\log n} \sum_{\substack{\delta>0 \\ 2^j \le m_\delta < 2^{j+1}}} m_\delta^3< \sum_{j=1}^{\log n} k_{2^j} 2^{3(j+1)}. \qquad \qquad \qquad (2)$

For $j\in {\mathbb N}$ let $\Delta_j$ denote the set of $\delta>0$ with $m_\delta\ge j$ (so $|\Delta_j| = k_j$). For $\delta >0$, let $A_\delta$ be the set of points that participate in at least one of the representations of $\delta$. If there exist $\delta_1,\delta_2, \delta_3$ such that $|A_{\delta_1} \cap A_{\delta_2} \cap A_{\delta_3}| \ge k/4$, then there exist a subset $A'\subset A$ with $|A'|=k$ and $|A'-A'|< \binom{k}{2}-\frac{3k}{4}+13$ (see the paper for a full explanation). Thus, for every $\delta_1,\delta_2, \delta_3$ we have that $|A_{\delta_1} \cap A_{\delta_2} \cap A_{\delta_3}| < k/4$.

We have $k_j$ sets $A_\delta$ with $|A_\delta| \ge j$. These are all subsets of the same set $A$ of size $n$, and every three intersect in fewer than $k/4$ elements. We now have a set theoretic problem: How many large subsets can $A$ have with no three having a large intersection. We can use the following counting lemma (for example, see Lemma 2.3 of Jukna’s Extremal Combinatorics) to obtain an upper bound on $k_j$.

Lemma 2. Let $A$ be a set of $n$ elements and let $d\ge 2$ be an integer. Let $A_1,\ldots,A_k$ be subsets of $A$, each of size at least $m$. If $k \ge 2d n^d/m^d$ then there exist $1\le j_1 < \ldots < j_d \le d$ such that $|A_{j_1}\cap \ldots \cap A_{j_d}| \ge \frac{m^d}{2n^{d-1}}$.

Lemma 2 implies the bound $k_j = O(n^3/j^3)$ for large values of $j$. Combining this with $(2)$ and with a couple of standard arguments leads to $E_3(A) = O(n^{10/3})$. Combining this with $E_3(A) \ge \frac{n^6}{|A-A|^2}$ implies $|A-A|=\Omega(n^{4/3})$. $\Box$

# A New Combinatorics REU

I am excited to announce the beginning of the CUNY Combinatorics REU, which I am organizing together with Radoš Radoičić. For the past three years I have been mentoring Caltech undergraduates in research projects, and before that students in Tel-Aviv University. These often led to papers and almost all of the students continued to grad school or are applying now. This REU is our way of continuing this work in CUNY. Many more details can be found here.

Please send us strong students! Also, if you are a mathematician with some interest in combinatorics, might be around NYC at some point during the summer, and might be willing to give a talk or just come to chat with the participants, let me know!

I’m happy to hear any comments and questions. Now let’s work hard and get some impressive research done in this program!

# Incidences Outside of Discrete Geometry (part 2)

You may have noticed that I have a bit of an obsession with geometric incidences. I do believe that incidences are a natural mathematical object that is connected to many different parts of math. This belief seems at least partially justified by the developments of the recent years. Less than two years ago I posted a list of some applications of incidences outside of Discrete Geometry. The purpose of that list was to show how incidences are becoming useful in a variety of fields, such as Harmonic Analysis, Theoretical Computer Science, and Number Theory. It seems that this process did not slow down in the past two years – incidences have continued to demonstrate their usefulness in the aforementioned fields, and there is even a new interest in the subject by model theorists.

This post is part 2 of the list of incidence uses outside of Discrete Geometry. It is just a list of references, and does not include many details. In future posts I might focus on specific applications and provide actual explanations. Hopefully new applications will continue to appear and I’ll have to keep adding more and more parts to this list!

The Kakeya conjecture. Katz and Zahl derived an improved bound for the Kakeya conjecture. Specifically, they improved Wolff’s longstanding bound for the Hausdorff dimension problem in ${\mathbb R}^3$. This is the latest in a sequence of Harmonic Analysis works that have strong connections to incidences (see the first part of this list and also below).

Model theory. In Logic, a group of model theorists generalized incidence results to Distal structures. Another similar work extended incidence results to o-minimal structures. In general, there seems to be some interest in generalizing incidence-related problems to various models.

Number theory. A very recent number theoretic result is relying on incidence bounds. Admittedly, I still do not understand what this paper is about, and am hoping to learn that soon.

Algorithms. Moving to Theoretical Computer Science, a recent work analyzes point covering algorithms using incidence results.

Quantum Information. A few years ago, incidences in spaces over finite fields were used to study a problem in Quantum Information. This result is not from the past two years, but I was not aware of it before (thanks Ben Lund).

More Harmonic Analysis. An older survey of Łaba contains a nice review of previous appearances of incidences in Harmonic Analysis. I write “older” although not even a decade have passed. I just mean that this survey appeared before the new era of polynomial methods in Discrete Geometry.

There are obviously more results that are still missing from this list. If you noticed anything that I missed, I would be happy to hear about it.

# A Prisoners Riddle

I just heard a nice riddle and wanted to share.

Prisoner A and prisoner B are given an integer between 1 and 100. Each prisoner only knows his own number, but also that the two numbers are consecutive. For example, if prisoner A got the number 60, he knows that prisoner B got either 59 or 61. At the end of every hour each prisoner can choose to guess the number of the other. If either prisoner guesses correctly, they both go free. However, if either prisoner guesses wrong, they both get executed (even if at the same time the other guessed correctly). Both prisoners can choose not to guess for as many hours as they like.

The two prisoners are in different cells and cannot communicate in any way. Also, they did not have time to coordinate a strategy in advance. Each can only assume that the other prisoner is intelligent. Can they always guess correctly? After how many hours?

I think it would be nice not to post solutions in the comments. Although you can just write the number of hours you got.

# Incidences in Higher Dimensions: A Conjecture

I recently added to the incidence theory book two chapters about incidences in ${\mathbb R}^d$ (Chapters 8 and 9). At the moment incidence bounds in ${\mathbb R}^d$ are known for several different cases, and it is rather unclear how a general incidence bound in ${\mathbb R}^d$ should look like. In this post I would like to state what I believe this bound should look like. This conjecture is in part a result of conversations with Joshua Zahl over several years (I do not know whether Josh agrees with everything in this post). The post assumes some familiarity with incidence problems.

Joshua Zahl. A common name in this blog.

Let us start by stating some known bounds for incidences in ${\mathbb R}^d$. The first general incidence bound that relied on polynomial techniques was by Solymosi and Tao.

Theorem 1. Let ${\cal P}$ be a set of $m$ points and let ${\cal V}$ be a set of $n$ varieties, both in ${\mathbb R}^d$. The varieties of ${\cal V}$ are of degree at most $k$ and dimension at most $d/2$. The incidence graph of ${\cal P}\times {\cal V}$ contains no copy of $K_{s,t}$. Also, whenever two varieties $U_1,U_2 \in {\cal V}$ are incident to a point $p\in {\cal P}$, the tangent spaces of $U_1$ and $U_2$ at $p$ intersect at a single point. Then for any $\varepsilon>0$

$I({\cal P},{\cal V}) = O_{k,s,t,d,\varepsilon}\left(m^{\frac{s}{2s-1}+\varepsilon}n^{\frac{2s-2}{2s-1}}+m+n\right).$

While Theorem 1 provides a nice incidence bound, the varieties in it are somewhat restricted. A result of Fox et al. holds for varieties of any dimension and without the restriction on the tangent spaces, at the cost of having a weaker bound.

Theorem 2. Let $\cal P$ be a set of $m$ points and let $\cal V$ be a set of $n$ varieties of degree at most $k$, both in ${\mathbb R}^d$. Assume that the incidence graph of ${\cal P}\times {\cal V}$ contains no copy of $K_{s,t}$. Then for any $\varepsilon>0$

$I({\cal P},{\cal V}) = O_{k,s,t,d,\varepsilon}\left(m^{\frac{(d-1)s}{ds-1}+\varepsilon}n^{\frac{d(s-1)}{ds-1}}+m+n\right).$

Finally, a result of Sharir et al. provides a stronger bound for the special case of varieties of dimension one that do not cluster in a low degree surface (for brevity, the following is a simplified weaker variant).

Theorem 3. For every $\varepsilon>0$ there exists a constant $c_\varepsilon$ that satisfies the following. Let $\cal P$ be a set of $m$ points and let $\cal V$ be a set of $n$ irreducible varieties of dimension one and degree at most $k$, both in ${\mathbb R}^d$. Assume that the incidence graph of ${\cal P}\times {\cal V}$ contains no copy of $K_{s,t}$, and that every variety of dimension two and degree at most $c_\varepsilon$ contains at most $q$ elements of $\cal V$. Then

$I({\cal P},{\cal V}) = O_{k,s,t,d,\varepsilon}\left(m^{\frac{s}{ds-d+1} +\varepsilon}n^{\frac{ds-d}{ds-d+1}} + m^{\frac{s}{2s-1}+\varepsilon}n^{\frac{ds-d}{(d-1)(2s-1)}}q^{\frac{(s-1)(d-2)}{(d-1)(2s-1)}}+m+n\right).$

The incidence theory book describes in detail how to prove the three above bounds. With these result in mind, we are now ready to make a somewhat vague general conjecture.

Conjecture 4. Let $\cal P$ be a set of $m$ points and let $\cal V$ be a set of $n$ varieties of degree at most $k$ and dimension $d'$, both in ${\mathbb R}^d$. Assume that the incidence graph contains no copy of $K_{s,t}$ and that the varieties of $\cal V$ satisfy some “reasonable conditions”. Then for any $\varepsilon>0$

$I({\cal P},{\cal V}) = O_{k,s,t,d,\varepsilon}\left(m^{\frac{sd'}{ds-d+d'}+\varepsilon}n^{\frac{ds-d}{ds-d+d'}} + m + n\right).$

Note that we already have three cases of conjecture 4:

• Theorem 1 obtains the conjecture for varieties of dimension $d/2$.
• Theorem 2 obtains the conjecture for varieties of dimension $d-1$.
• Theorem 3 obtains the conjecture for varieties of dimension $1$.
For varieties of dimension smaller than $d/2$, the proof of Theorem 1 projects the configuration to a space of dimension that is twice the dimension of the varieties (forcing the dimension of the varieties to be exactly half the dimension of the space). This seems to be an inefficient step, which makes it likely that the theorem is not tight for varieties of dimension smaller than $d/2$. Indeed, Theorem 3 already obtains a stronger bound for the case of varieties of dimension one. For similar reasons, the proof of Theorem 2 seems not to be tight for varieties of dimension smaller than $d-1$.

A recent work reduces the distinct distances problem in ${\mathbb R}^d$ to an incidence problem between points and $(d-1)$-dimensional planes in ${\mathbb R}^{2d-1}$. Combining this reduction with the bound stated in Conjecture 4 would lead to an asymptotically tight bound for the distinct distances problem in dimension $d\ge3$. Other problems also reduce to incidence problems with the same dimensions. Thus, the case of $d' = (d-1)/2$ (for odd $d$) seems to be a main open case.

While conjecture 4 is known for the cases of varieties of dimension $1,d/2$, and $d-1$, so far all of the other cases are open. The bound in the conjecture was not obtained by interpolating the three known cases — there is a better approach that leads to it. Recall that in the polynomial partitioning technique we partition ${\mathbb R}^d$ into cells, bound the number of incidences in each cell, and then bound the number of incidences that are on the partition (not in any cell). The bound stated in Conjecture 4 is obtained by applying this technique while ignoring the incidences on the partition (since bounding the number of incidences in the cells is much easier).

Recently discovered configurations of points and varieties achieve the bound of the conjecture up to extra $\varepsilon$‘s in the exponents. Thus, in its most general form the bound of the conjecture is tight. However, these constructions give a tight bound only for certain families of varieties of dimension $d-1$, for certain ranges of $m$ and $n$, and when $s=2$. It seems plausible that the bound of the conjecture is not tight in some other interesting cases. In particular, in ${\mathbb R}^2$ a stronger bound is already known when $s\ge 3$, and it seems reasonable that the same technique would also work for varieties of dimension one in ${\mathbb R}^d$. At the moment, it is difficult to guess what should happen with varieties of dimension larger than one. Either way, Conjecture 4 seems to be a reasonable bound for the limits of the current polynomial partitioning technique. It seems that this specific technique could not yield better bounds without major changes.

Let us briefly discuss what the expression “reasonable conditions” in the statement of Conjecture 4 means. We already have two examples of conditions that are considered reasonable:

• A non-clustering condition: Not too many varieties of $\cal V$ are contained in a common higher-dimensional variety. For example, this restriction was used in Theorem 3, and more famously in the Guth–Katz proof of the planar distinct distances problem.
• A transversality condition: When two varieties meet at a point $p$, their tangent spaces at $p$ intersect only at one point. This condition was used in Theorem 1.
It is not clear what the “natural” restrictions are. For example, when studying incidences with two-dimensional planes in ${\mathbb R}^4$ most works rely on the transversality condition. In this case, the condition is equivalent to asking every two planes to intersect in at most one point. I recently noticed that the same incidence bound holds with a much weaker restriction of not too many planes in a hyperplane.

There is more to say about Conjecture 4, but this post is already getting longer than I intended.

# The k-set problem

The $k$-set problem is a classic open problem in Combinatorial Geometry, which is considered similar in spirit to incidence problems and to distance problems. While in recent years algebraic techniques led to significant progress in incidence and distance problems, so far none of the new methods were applied to the $k$-set problem. This post briefly introduces the problem, and describes a way of reducing it to an incidence problem. I hope that it will get someone to think of the $k$-set problem in an algebraic context and maybe come up with some new observations.

Let ${\cal P}$ be a set of $n$ points in ${\mathbb R}^2$, and let $1\le k\le n/2$ be an integer. A $k$-set of ${\cal P}$ is a subset ${\cal P}'\subset {\cal P}$ of $k$ points such that there exists a line that separates ${\cal P}'$ from ${\cal P}\setminus{\cal P}'$. That is, ${\cal P}'$ is on one open half-plane defined by the line, while ${\cal P}\setminus{\cal P}'$ is on the other half-plane. The following figure depicts two 5-sets of a set of 12 points.

The $k$-set problem asks for the maximum number of $k$-sets that a set of $n$-points in ${\mathbb R}^2$ can have. As a first silly example, note that the maximum number of 1-sets is exactly $n$. It is not difficult to show that we may assume that no three points are collinear (specifically, perturbing the point set cannot significantly decrease the number of $k$-sets).

The number of subsets of $k$ points of ${\cal P}$ is $\Theta(n^k)$, so this is a straightforward upper bound for the maximum number of $k$ sets. However, we can easily obtain a significantly stronger upper bound. Consider a $k$-set ${\cal P}'\subset{\cal P}$ that is separated by a line $\ell$. We translate $\ell$ until it is incident to a point $p\in {\cal P}'$ and then rotate $\ell$ around $p$ until it is incident to a second point of ${\cal P}$ (not necessarily in ${\cal P}'$). For example, see the figure below. Thus, to find all of the $k$-sets of ${\cal P}$ it suffices to consider every line that is incident to two points of ${\cal P}$. Since there are $O(n^2)$ such lines, this is also an upper bound for the number of $k$-sets.

# A First Draft of the Book “Incidence Theory”

You might have noticed that I did not post anything new for quite a while. The past months were unusually busy for me, due to personal reasons such as having my first child born(!), being on the job market, and several other things. I hope to return to my regular posting frequency around July.

The purpose of this post is to announce that I just uploaded the first draft of my book “Incidence Theory”. This book is about our current understanding of incidences (with a focus on the polynomial method), and their applications in other fields. I am trying to achieve two goals in this book: To have a clear and basic introduction of this subfield, while also creating a repository of results and techniques which may be used as a reference to experts. The current draft already contains several folklore results that I have not seen written before. It contains only the first seven chapters. I predict that the final version would contain about 15 chapters, and plan to gradually release the remaining ones.

Comments would be very appreciated, preferably by email. These can point out mistakes, typos, unclear formulations, suggestions for style changes, additional topics, simpler arguments, exercises, or anything else that might help improve the draft. The acknowledgements section is way too short. Please help me to extend it!