# Discrete Geometry Classic: Two-distance Sets

This post presents another simple and elegant argument in Discrete Geometry. This classic is based on the so-called “Linear Algebra method”.

Warm-up problem. What is the maximum size of a set ${\cal P} \subset {\mathbb R}^d$ such that the distance between every two points of the set is 1?

By taking the vertices of a $d$-dimensional simplex with side length 1 in ${\mathbb R}^d$, we obtain $d+1$ points that span only the distance 1. It is not difficult to show that a set of $d+2$ points in ${\mathbb R}^d$ cannot span a single distance.

The two-distance sets problem. A point set $\cal P$ is a two-distance set if there exist $r,s\in {\mathbb R}$ such that the distance between every pair of points of $\cal P$ is either $r$ or $s$. What is the maximum size of a two-distance set in ${\mathbb R}^d$?

There is a simple trick for constructing a two-distance set of size $\binom{d}{2}$ in ${\mathbb R}^d$. You might like to spend a minute or two thinking about it. Here is a picture of Euler to prevent you from seeing the answer before you wish to.

Consider the set of all points in ${\mathbb R}^d$ that have two coordinates with value 1 and the other $d-2$ coordinates with value 0. There are $\binom{d}{2}$ such points, and the distance between every pair of those is either 1 or 2.

Larman, Rogers, and Seidel showed that the above example is not far from being tight.

Theorem. Every two-distance set in ${\mathbb R}^d$ has size at most $\binom{d}{2}+3d+2$.

Proof. Let ${\cal P} = \{p_1,p_2,\ldots, p_m\}$ be a two-distance set in ${\mathbb R}^d$, and denote the two distances as $r,s\in {\mathbb R}$. Given points $a,b\in {\mathbb R}^d$, we denote the distance between them as $D(a,b)$. We refer to a point in ${\mathbb R}^d$ as $x=(x_1,\ldots,x_d)$. For $1\le j \le m$, define the polynomial $f_j\in {\mathbb R}[x_1,\ldots,x_d]$ as

$f_j(x) = \left(D(x,p_j)^2 - r^2\right) \cdot \left(D(x,p_j)^2 - s^2\right).$

That is, $f_j$ is a polynomial of degree four in $x_1,\ldots,x_d$ that vanishes on points at distance $r$ or $s$ from $p_j$. Every such polynomial is a linear combination of the polynomials

$\left(\sum_{j=1}^d x_j^2\right)^2,\qquad x_k \sum_{j=1}^d x_j^2,\qquad x_k,\qquad x_\ell x_k,\qquad 1,$

for every $1\le \ell, k\le d$. Since every $f_j$ is a linear combination of $t = \binom{d}{2}+3d+2$ polynomials, we can represent $f_j$ as a vector in ${\mathbb R}^t$. In particular, the vector $V = (v_1,\ldots,v_t)$ corresponds to the polynomial

$v_1 \cdot \left(\sum_{j=1}^d x_j^2\right)^2 + v_2\cdot x_1 \sum_{j=1}^d x_j^2 + v_3\cdot x_2 \sum_{j=1}^d x_j^2 + \cdots + v_{t} \cdot 1.$

For $1\le j \le m$, let $V_j$ denote the vector corresponding to $f_j$. Consider $\alpha_1,\ldots,\alpha_m\in {\mathbb R}$ such that $\sum_{j=1}^m \alpha_j V_j = 0$. Let $g(x) = \sum_{j=1}^m \alpha_j f_j(x)$. By definition, the polynomial $g(x)$ is identically zero. For some fixed $p_k\in {\cal P}$, note that $f_j(p_k)=0$ for every $j\neq k$ and that $f_j(p_j) = r^2s^2$. Combining the above implies

$g(p_k) = \sum_{j=1}^m \alpha_j f_j(p_k) = \alpha_kr^2s^2.$

Since $g(x)=0$, we have that $\alpha_k=0$. That is, the only solution $\sum_{j=1}^m \alpha_j V_j = 0$ is $\alpha_1=\cdots = \alpha_m = 0$. This implies that the vectors $V_1,\ldots,V_m$ are linearly independent. Since these vectors are in ${\mathbb R}^t$, we conclude that $m \le t = \binom{d}{2}+3d+2$. $\Box$

One can improve the above bound by showing that the polynomials $f_j$ are contained in a smaller vector space. This is left as an exercise of your googling capabilities.

# Points in General Position

Many Discrete Geometry problems study planar point sets in general position. Most commonly, this definition refers to point sets with no three collinear points and no four cocircular points (that is, no circle contains four points of the set). A few examples of such problems:

• What is the minimum number of distinct distances that are spanned by a set of $n$ points in general position?
• What is the minimum number of convex $k$-gons determined by $n$ points in general position?
• For even $n$, what is the maximum number of distinct ways in which one can cut a set of $n$ points in general position by a line into two halves, each of cardinality $n/2$?
• What is the maximum number of points that can be placed on an $n \times n$ grid so that no three points are collinear. What happens when we also forbid four cocircular points?
Randomly choosing $n$ points from the plane will lead to a set in general position, but this set is unlikely to have any interesting extremal structure. For example, we expect such a set to span many distinct distances, and it is not clear why such a set should span few convex $k$-gons. Problems such as the ones above ask for point sets that are in general position while also having some kind of structure. I often find it useful to have examples of such point sets as part of my “Discrete Geometry toolkit”. I am familiar with two very different constructions of such sets. This post describes these two constructions. If you are familiar with additional constructions, do let me know!

Construction 1: Subsets of the integer lattice. Our first construction involves finding a large subset of an $n \times n$ section of the integer lattice that is in general position. The basic idea for this construction seems to originate from Erdős (for example, see this paper of Roth), although we will follow a later variant of Thiele.

For a prime $p$, we consider the point set

${\cal P} = \left\{\left(t,t^2 \mod p\right)\ :\ 0\le t

Note that ${\cal P}$ is a set of $\lfloor p/4\rfloor$ points from a $p\times p$ section of the integer lattice. We will now show that ${\cal P}$ does not contain three collinear points or four cocircular points.

An example of the first construction, created by Gábor Damásdi.

Three points $(x_1,y_1),(x_2,y_2),$ and $(x_3,y_3)$ are collinear if and only if

$\begin{vmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{vmatrix} = 0.$

It is not difficult to check that for three points $(t_1,t_1^2),(t_2,t_2^2),(t_3,t_3^2)$, we get

$\begin{vmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ 1 & t_3 & t_3^2 \end{vmatrix} = (t_2-t_1)(t_3-t_2)(t_3-t_1).$

For distinct integers $t_1,t_2,t_3$, this determinant is clearly nonzero. When we also have that $0\le t_1,t_2,t_3 < p$, this determinant is also not a multiple of the prime $p$. Thus,

$\begin{vmatrix} 1 & t_1 & (t_1^2 \mod p) \\ 1 & t_2 & (t_2^2 \mod p) \\ 1 & t_3 & (t_3^2 \mod p) \end{vmatrix} \neq 0.$

We conclude that no three points of ${\cal P}$ are collinear. We can use a similar argument for the case of cocircular points. Four points $(x_1,y_1),(x_2,y_2),(x_3,y_3),$ and $(x_4,y_4)$ are cocircular if and only if

$\begin{vmatrix} 1 & x_1 & y_1 & x_1^2 +y_1^2\\ 1 & x_2 & y_2 & x_2^2 +y_2^2\\ 1 & x_3 & y_3 & x_3^2 +y_3^2\\ 1 & x_4 & y_4 & x_4^2 +y_4^2 \end{vmatrix} = 0.$

(One can think of this test is as lifting the four points to the paraboloid defined by $z=x^2+y^2$ in ${\mathbb R}^3$, and checking if the four lifted points are coplanar. It is known that the intersections of the paraboloid with planes are exactly the lifting of circles to the paraboloid.)

Taking the points $(t_1,t_1^2),(t_2,t_2^2),(t_3,t_3^2),(t_4,t_4^2)$, we get

$\begin{vmatrix} 1 & t_1 & t_1^2 & t_1^2 +t_1^4\\ 1 & t_2 & t_2^2 & t_2^2 +t_2^4\\ 1 & t_3 & t_3^2 & t_3^2 +t_3^4\\ 1 & t_4 & t_4^2 & t_4^2 +t_4^4 \end{vmatrix} = (t_1+t_2+t_3+t_4)\prod_{1\le j

Since $0\le t_1,t_2,t_3,t_4 < p/4$, we have that $t_1+t_2+t_3+t_4$ is not a multiple of $p$. Assuming that the four integers are distinct, none of the elements of the product is a multiple of $p$. Repeating the above argument, we get that no four points of ${\cal P}$ are cocircular.

Now that we know that the above construction is in general position, let us mention a few of its applications:

• Erdős and Purdy asked for the size of the largest subset of the $n\times n$ integer lattice with no three points on a line and no four on a circle. The above construction provides the current best lower bound for this problem.
• Erdős, Hickerson, and Pach asked whether there exists a planar set of $n$ points in general position that spans no parallelograms and still determines $o(n^2)$ distinct distances. Dumitrescu provided a positive answer to this question by using the above construction.
• The Heilbronn triangle problem asks for the smallest $\Delta(n)$ such that every set of $n$ points in the unit square determine a triangle of area at most $\Delta(n)$. Erdős used a variant of the above construction to produce one of the first non-trivial lower bounds for the problem.
Construction 2: Projection from a hypersphere. In our first construction, the point set has some structure since it is a subset of a $p\times p$ section of the integer lattice. The structure in our second construction also originates from a lattice, although in a very different way. This construction comes from a paper of Erdős, Füredi, Pach, and Ruzsa, and seems to be influenced by earlier ideas of Behrend.

For a positive integer $d$, consider the section of the $d$-dimensional integer lattice

${\cal L} = \left\{(x_1,\ldots,x_d)\in {\mathbb Z}^d :\ 0\le x_1,\ldots,x_d <2^d \right\}.$

We claim that there exists a hypersphere around the origin that contains many points of ${\cal L}$. First, note that $|{\cal L}| = 2^{d^2}$. We also note that the distance between a point $x\in {\cal L}$ and the origin is $\sqrt{x_1^2+\cdots+x_d^2}$. Every such distance is the square root of an integer between zero and $d\cdot 2^{2d}$. That is, the points of ${\cal L}$ determine at most $d\cdot 2^{2d}$ distinct distances from the origin. By the pigeonhole principle, there exists a distance $\delta$ such that at least $2^{d^2}/(d\cdot 2^{2d})=2^{d(d-2)}/d$ points are at distance $\delta$ from the origin. In other words, the hypersphere $S_\delta$ centered at the origin and of radius $\delta$ contains at least $2^{d(d-2)}/d$ points of ${\cal L}$. Let ${\cal P}'$ be a set of exactly $n=2^{d(d-2)}/d$ of these points. To recap, ${\cal P}'$ is a set of $n$ points with integer coordinates on the hypersphere $S_\delta$.

Let $h$ be a generic two-dimensional plane in ${\mathbb R}^d$ (or, if you prefer, a random plane). Let ${\cal P}$ be the set of points that are projections of the points of ${\cal P}'$ on $h$. It is not difficult to verify that, on a generic plane, no two points of ${\cal P}'$ are projected to the same point. Similarly, when projecting four points on a generic plane, we do not expect the projections to be cocircular. That is, we may assume that ${\cal P}$ consists of $n$ distinct points, no four which are cocircular.

It is not true that the projections of any three points on a generic plane are not collinear. If three points are collinear in ${\mathbb R}^d$, then their projections on every plane would remain collinear. This is why we take the points of ${\cal P}'$ to be on the hypersphere $S_\delta$: Since every line intersects the hypersphere in at most two points, no three points of ${\cal P}'$ are collinear.

Now that we constructed a set ${\cal P}$ of $n$ points in general position, it remains to see what structure ${\cal P}$ has. For this purpose, we return to the lattice ${\cal L}$ in ${\mathbb R}^d$. It is not difficult to verify that the set of vectors determined by pairs of points of ${\cal L}$ is

$\left\{(x_1,\ldots,x_d)\in {\mathbb Z}^d :\ -2^d\le x_1,\ldots,x_d <2^d \right\}.$

That is, the number of distinct vectors determined by pairs of points of ${\cal L}$ is $O\left(2^{d(d+1)}\right)$. Since ${\cal P}' \subset {\cal L}$, the number of distinct vectors determined by pairs of points of ${\cal P}'$ is also $O(2^{d(d+1)})$. Let $p_1,p_2,p_3,p_4\in {\mathbb R}^d$ satisfy $p_1-p_2=p_3-p_4$, and let $\pi(\cdot)$ be a projection from ${\mathbb R}^d$ to a two-dimensional plane. Then $\pi(p_1)-\pi(p_2)=\pi(p_3)-\pi(p_4)$. This implies that the number of distinct vectors determined by pairs of points of ${\cal P}$ is also $O(2^{d(d+1)})$.

From the definition $n=2^{d(d-2)}/d$ we obtain $\log n = d(d-2) - \log d$, which in turn implies $2^{d(d+1)} = O(n2^{\sqrt{\log n}})$. Thus, the number of distinct vectors determined by pairs of points of ${\cal P}$ is $O\left(n2^{\sqrt{\log n}}\right)$. (If you are not used to such expressions, note that for every $\varepsilon>0$ we have $n2^{\sqrt{\log n}} = O\left(n^{1+\varepsilon}\right)$.) This property shows that ${\cal P}$ has a strong structure. For example, it immediately implies that the number of distinct distances determined by ${\cal P}$ is $O\left(n2^{\sqrt{\log n}}\right)$ (and similarly for the number of distinct directions). That is, a set in general position can still determine a relatively small number of distinct distances.