# The Elekes-Sharir Framework (part 1)

After surveying some of the variants of the distinct distances problem (part 1, part 2), I would now like to discuss a technique that is used to study such problems. As usual, thanks to Micha Sharir for helping in the preparation of this post.

Guth and Katz’s seminal work provided an almost tight bound for Erdős’s planar distinct distances problem. One can regard the proof of this bound as consisting of four main tools:

• A reduction from the distinct distances problem to a problem about line intersections in ${\mathbb R}^3$. This part is referred to as the Elekes-Sharir reduction or as the Elekes-Sharir framework.
• The introduction of polynomial partitioning.
• Applying 19th century analytic geometry tools that are related to ruled surfaces, such as flecnode polynomials.
• The polynomial technique presented in Guth and Katz’s previous paper. (That is, relying on the existence of a polynomial of a small degree that vanishes on a given point set to reduce incidence problems between points and lines to the interactions between the lines and an algebraic variety.)
This series of posts discusses the Elekes-Sharir framework from the first item. This framework already has several applications beyond its original use by Guth and Katz, and more applications for it are constantly being discovered. Recent bounds for other variants of the distinct distances problem rely on this framework (this will be discussed in the following posts), and a recent work by Bennett, Iosevich, and Pakianathanalso applies it in finite fields. The framework is also used to obtain bounds for classes of congruent and similar triangles and distinct area triangles. Currently, the main challenge might be to extend the reduction to distinct distances problems in higher dimensions. While initial work in progress indicates that this can be done, the resulting problems still seem to be hard to tackle. Still, one might hope that applying the reduction in some clever manner would lead to a more elegant problem that can be solved more easily.

Before getting to the technical details, we begin with some of the history of the Elekes-Sharir framework. Then we move describe the basics of the Elekes-Sharir framework. In the following posts we discuss other applications of the framework. We will present a variant of the reduction, reproving Szemerédi’s bound on the minimum number of distinct distances where no $k$ points are collinear. While this alternative proof can be seen as equivalent to Szemerédi’s original proof, I think that it gives more intuition about the reduction and prepares us for more advanced applications. Afterwards, we will discuss such an advanced application — namely, the number of distinct distances between two planar sets of points, each on a different line. This result is taken from this paper.

## Some history

Elekes and Sharir used to think about the distinct distances problem, and around the turn of the millennium Elekes communicated to Sharir the basics of a reduction from this problem to a problem of bounding the number of intersections in a set of helices in ${\mathbb R}^3$ (this problem can also be formulated as a point-helix incidence problem in ${\mathbb R}^3$). Although this reduction seemed elegant and somewhat surprising, the resulting problem appeared hopeless with the existing tools for solving incidence-related problems. Thus, the reduction was abandoned for nearly a decade.

György Elekes passed away in September 2008. A few years earlier, he has made some progress on a very simplified variant of the 3-dimensional incidence problems that his reduction leads to (in this variant, one asks for the number of incidences between points and equally inclined lines in ${\mathbb R}^3$; lines that form a fixed angle, say $\pi/4$, with the $z$-axis). Elekes communicated this result too to Sharir, concluding the email by writing

By the way, in case of something unexpected happens to me (car accident, plane crash, a brick on the top of my skull) I definitely ask you to publish anything we have, at your will.”

This progress too was left untouched until Elekes’s death. His son, Márton Elekes, found this note while going over his father’s files, and asked Sharir to try and publish it. In a curious turn of events, while Sharir was extending and polishing Elekes’s note and planning to publish it, Guth and Katz published their work on the joints problem which provided new tools for dealing with incidences in ${\mathbb R}^3$ (and in a way initiated the recent use of algebraic techniques for problems in combinatorial geometry. In fact, the initiation is due to Dvir, who has used similar ideas for related problems on finite fields. Nevertheless, the application of these tools in the real domain has originated in Guth and Katz’s work.). Sharir simplified the reduction so that it resulted in a problem concerning incidences between points and parabolas in ${\mathbb R}^3$, applied the tools from Guth and Katz’s joints paper to obtain some initial (weak) bounds for the point-parabola problem, and published the result. Publishing the reduction, thereby exposing it for the first time to the general community, proved to be a good idea, because hardly any time had passed before Guth and Katz managed to apply it to get their almost tight bound for the distinct distances problem. Guth and Katz further simplified the reduction so that it has now resulted in a problem concerning intersections between lines in ${\mathbb R}^3$.

The Elekes-Sharir-Guth-Katz framework?

This turn of events makes it unclear whether the appropriate name for the reduction is “the Elekes framework”, “the Elekes-Sharir framework”, or “the Elekes-Sharir-Guth-Katz framework”. For now, it seems that the second option has caught on. More historical details, can be found here.

## The basics of the reduction

Consider a set ${\cal P}$ of $n$ points in the plane, and let $x$ denote the number of distinct distances that are determined by pairs of points from ${\cal P}$. The reduction revolves around the set

$Q = \left\{ (a,p,b,q)\in {\cal P}^4 \mid |ap|=|bq|>0 \right\}. \ \ \ \$

The quadruples in $Q$ are ordered in the sense that $(a,p,b,q)$, $(b,p,a,q)$, $(p,a,q,b)$, and the other possible permutations are all considered as distinct elements of $Q$. In a quadruple $(a,p,b,q)\in Q$, the segments $ap$ and $bq$ are allowed to share vertices, though we do not allow a quadruple where both $a=b$ and $p=q$ (the case where $a=q$ and $b=p$ is allowed). Basically, the reduction is just double counting $|Q|$, and we begin by deriving a lower bound. We denote the set of (nonzero) distinct distances that are determined by ${\cal P}\times{\cal P}$ as $\delta_1,\ldots,\delta_x$. Also, for $1 \le i \le x$, we set

$\displaystyle E_i = \left\{(p,q)\in {\cal P}^2 \mid |pq| = \delta_i \right\}. \ \ \ \$

As before, we consider $(p,q)$ and $(q,p)$ as two distinct pairs in $E_i$. Notice that $\sum_{i=1}^x|E_i|=n^2-n$ since every ordered pair of ${\cal P}\times{\cal P}$ is contained in a unique set $E_i$, except for the $n$ pairs of the form $(p,p)$. By applying the Cauchy-Schwarz inequality, we have

$|Q| = \sum_{i=1}^x 2\binom{|E_i|}{2} \ge \sum_{i=1}^x \left(|E_i| - 1\right)^2$
$\ge \frac{1}{x} \left(\sum_{i=1}^x |E_i| - 1\right)^2 = \frac{(n^2-n-x)^2}{x}. \ \ \ \ (1)$

It remains to upper bound $|Q|$. Specifically, if we manage to derive the bound $|Q|=O(n^3\log n)$, combining it with $(1)$ would immediately imply $x=\Omega(n/\log n)$.

A transformation of the plane is said to be a rigid motion if it preserves distances between points. Any combination of rotations, translations, and reflections is a rigid motion. A proper rigid motion is a rigid motion that also preserves orientation; that is, an ordered triple of points $abc$ forms a left turn after applying the transformation if and only if it originally formed a left turn. In the following figure, the second rigid motion is not proper since it does not preserve orientation.

The proper rigid motions are exactly the transformations that are obtained by combining rotations and translations. In fact, every (planar) proper rigid motion is either a single rotation or a single translation. (That is, any combination of rotations and translations results in a single translation or in a sinle rotation; more details can be found in the book “Mathematic Form and Function by Saunders Mac Lane.)

For a pair of points $a,b \in {\cal P}$, consider the rotations that take $a$ to $b$. The origin of such a rotation must be equidistant from $a$ and $b$. In other words, the centers of these rotations must all be on the perpendicular bisector of the segment $ab$. Conversely, every point on the perpendicular bisector of $ab$ is the origin of a rotation that takes $a$ to $b$. An example is depicted in the following figure.

Consider a quadruple $(a,p,b,q) \in Q$ and recall that by definition $|ap|=|bq|$. We can always apply a rotation that takes $a$ to $b$ and then rotate around the new position of $a$ until $p$ is taken to $q$. This translation followed by a rotation is a proper rigid motion taking $ap$ to $bq$. To see that there is a unique proper rigid motion that takes $ap$ to $bq$, we denote by $\ell_1$ and $\ell_2$ the perpendicular bisectors of the segments $ab$ and $pq$, respectively. If $\ell_1$ and $\ell_2$ are parallel, then there is a unique translation taking $ap$ to $bq$, and no rotations. Similarly, if $\ell_1$ and $\ell_2$ intersect, then there is a unique rotation taking $ap$ to $bq$, and no translations. Specifically, the origin of this rotation is the point $\ell_1\cap \ell_2$, and the angles of rotation from $a$ to $b$ and from $p$ to $q$ are equal because $|ap|=|bq|$. The two cases are depicted in the following figure.

By the above, we have the following equivalent definition for $Q$: a quadruple $(a,p,b,q)$ is in $Q$ if and only if there exists a proper rigid motion $\tau$ that takes $ap$ to $bq$ (the implication which is not considered above is trivial: If there exists a proper rigid motion that takes $ap$ to $bq$, obviously $|ap|=|bq|$). We say that the quadruple $(a,p,b,q)$ corresponds to $\tau$. That is, our goal is to show that the number of quadruples from ${\cal P}^4$ that correspond to a proper rigid motion is $O(n^3\log n)$. As already noted, such a bound, combined with $(1)$, would lead to the Guth-Katz bound on the number of distinct distances.

We first bound the number of quadruples in $Q$ that correspond to a translation. Given the first three points of a quadruple $(a,p,b,?)$, there is at most one point in ${\cal P}$ that can complete it to a quadruple that corresponds to a translation. Thus, $O(n^3)$ quadruples in $Q$ correspond to a translation.

Bounding the number of quadruples in $Q$ that correspond to a rotation is more difficult. A rotation can be parameterized using three parameters — two parameters for the origin and another one for the angle of rotation. Given a rotation with origin $(o_x,o_y)$ and an angle of $\alpha$, Guth and Katz parameterize it as $(o_x,o_y,\cot(\alpha/2))\in{\mathbb R}^3$. The advantage of this paramterization is that, given a pair of points $a,b\in{\mathbb R}^2$, the set of parametrizations of the rotations that take $a$ to $b$ is exactly the following line in ${\mathbb R}^3$:

$\ell_{ab} = \left(\frac{a_x+b_x}{2}, \frac{a_y+b_y}{2},0\right) + t\left(\frac{b_y-a_y}{2}, \frac{a_x-b_x}{2},1\right), \quad \text{ for } t\in{\mathbb R}. \ \ \ \$

The proof of this property is a rather standard calculation which is not relevant to the rest of our discussion, so we do not present it here. Notice that the projection of $\ell_{ab}$ on the $xy$-plane is the perpendicular bisector of $ab$, as expected. That is, $\ell_{ab}$ is obtained by “lifting” the perpendicular bisector of $ab$ to a line in ${\mathbb R}^3$ whose slope in the $z$-direction is $2/|ab|$, as is easily checked.

Consider a quadruple $(a,p,b,q)\in {\cal P}^4$ and let $\ell_{ab}$ and $\ell_{pq}$ be the pair of lines in ${\mathbb R}^3$ corresponding to $(a,b)$ and $(p,q)$, as defined in $(1)$. If the intersection point $p = \ell_{ab} \cap \ell_{pq}$ exists, then it is the parametrization of a rotation taking both $a$ to $b$ and $p$ to $q$. That is, the quadruple $(a,p,b,q)$ corresponds to a rotation (and is thus in $Q$) if and only if $\ell_{ab}$ and $\ell_{pq}$ intersect. The following figure depicts such a quadruple of points, the two perpendicular bisectors, and their “liftings” to ${\mathbb R}^3$.

Pairs of points from ${\cal P}$ yield $\Theta(n^2)$ lines in the parametric space ${\mathbb R}^3$, and there is a bijection between quadruples that correspond to rotations and pairs of intersecting lines. Thus, an upper bound of $O(n^3\log n)$ on the number of pairs of intersecting lines would imply $|Q| = O(n^3\log n)$, as required.

By placing $\Theta(n^2)$ lines on a common plane or regulus we can easily obtain $\Theta(n^4)$ pairs of intersecting lines. However, the lines that are obtained by the above reduction are restricted and cannot all be on a common plane or regulus. Specifically, Guth and Katz show that no plane or regulus can contain more than $O(n)$ of these lines. With this additional restriction, Guth and Katz prove that no more than $O(n^3\log n)$ pairs of intersecting lines are possible. The proof is somewhat complicated, and I might write about it in the future. Instead, in the following posts we will discuss additional applications of the Elekes-Sharir framework.