## Equidistribution for random rotations

Two very different types of dynamical behaviour are illustrated by a pair of very well-known examples on the circle: the doubling map and an irrational rotation. On the unit circle in ${{\mathbb C}}$, the doubling map is given by ${z\mapsto z^2}$, while an irrational rotation is given by ${z\mapsto e^{2\pi i\theta}z}$ for some irrational ${\theta}$.

Lebesgue measure (arc length) is invariant for both transformations. For the doubling map, it is just one of many invariant measures; for an irrational rotation, it turns out to be the only invariant measure. We say that the doubling map exhibits hyperbolic behaviour, while the irrational rotation exhibits elliptic behaviour.

Systems with hyperbolicity have many invariant measures, as we saw in a series of previous posts. The goal of this post is to recall a proof that the opposite situation is true for an irrational rotation, and that in particular every orbit equidistributes with respect to Lebesgue measure. We then consider orbits generated by random rotations: instead of rotating by a fixed angle ${\theta}$, we rotate by either ${\theta_1}$ or ${\theta_2}$, with the choice made at each time step by flipping a coin.

1. Invariant measures via ${C(X)}$

First we recall some basic facts from ergodic theory for topological dynamical systems. Given a compact metric space ${X}$ and a continuous map ${f\colon X\rightarrow X}$, let ${\mathcal{M}}$ denote the space of Borel probability measures on ${X}$. Writing ${C(X)}$ for the space of all continuous functions ${X\rightarrow {\mathbb C}}$, recall that ${C(X)^*}$ is the space of all continuous linear functionals ${C(X)\rightarrow {\mathbb C}}$. Then ${C(X)^*}$ is (isomorphic to) the space of finite complex Borel measures on ${X}$. (This last assertion uses the fact that ${X}$ is a compact metric space and combines various results from “Linear Operators” by Dunford and Schwartz, but a more precise reference will have to wait until I have the book available to look at.)

Using this fact together with the polar decomposition for finite complex Borel measures, we have the following: for every ${L\in C(X)^*}$, there is ${\mu\in \mathcal{M}}$ and a measurable function ${\theta\colon X\rightarrow {\mathbb R}}$ such that

$\displaystyle L(\varphi) = \|L\| \int \varphi e^{i\theta} \,d\mu \text{ for all } \varphi\in C(X). \ \ \ \ \ (1)$

Note that although ${C(X)^*}$ is endowed with the operator norm, we will usually think of it as a topological vector space with the weak* topology. Thus ${\mathcal{M}}$ embeds naturally into ${C(X)^*}$, and (1) shows that every element of ${C(X)^*}$ can be described in a canonical way in terms of ${\mathcal{M}}$.

Let ${P\subset C(X)}$ be a countable set whose span is dense in ${C(X)}$. Then to every ${L\in C(X)^*}$ we can associate the sequence ${\Phi_P(L) = ( L(p) )_{p\in P} \in {\mathbb C}^P}$, where depending on the context we may index using either ${{\mathbb N}}$ or ${{\mathbb Z}}$. If ${P}$ is bounded then ${\Phi_P(L)\in \ell^\infty}$ for every ${L\in C(X)^*}$, and so such a ${P}$ defines a linear map ${\Phi_P\colon C(X)^*\rightarrow \ell^\infty}$.

Because ${L}$ is determined by the values ${L(p)}$ (by linearity and continuity of ${L}$, and density of the span of ${P}$), the map ${\Phi_P}$ is 1-1. In particular, it is an isomorphism onto its image, which we denote by

$\displaystyle V_P := \Phi_P(C(X)^*) \subset \ell^\infty$

Note that ${V_P \neq \ell^\infty}$ because ${C(X)^*}$ is separable and ${\ell^\infty}$ is not.

It is straightforward to see that ${\Phi_P}$ is continuous, and its inverse is also continuous on ${V_P}$. Thus we can translate questions about ${C(X)^*}$, and in particular about ${\mathcal{M}}$, into questions about ${\ell^\infty}$.

Remark 1 It is a nontrivial problem to determine which elements of ${\ell^\infty}$ correspond to elements of ${C(X)^*}$, and also to determine which of those sequences correspond to actual measures (elements of ${\mathcal{M}}$). We will not need to address either of these problems here.

The action ${f\colon X\rightarrow X}$ induces an action on ${C(X)}$ by ${\varphi\mapsto \varphi\circ f}$, and hence on ${C(X)^*}$ by duality. This action ${f_*\colon C(X)^*\rightarrow C(X)^*}$ is given by

$\displaystyle (f_*L)(\varphi) = L(\varphi\circ f). \ \ \ \ \ (2)$

In particular, ${f}$ also induces an action ${f_*\colon \mathcal{M}\rightarrow\mathcal{M}}$ by

$\displaystyle \int\varphi\,d(f_*\mu) = \int (\varphi \circ f)\,d\mu. \ \ \ \ \ (3)$

A measure ${\mu}$ is ${f}$-invariant iff it is a fixed point of ${f_*}$. Let ${f_P}$ be the action induced by ${f}$ on ${V_P\subset \ell^\infty}$; that is,

$\displaystyle f_P(\Phi_P(\mu)) = \Phi_P(f_*\mu). \ \ \ \ \ (4)$

If ${P}$ can be chosen so that ${f_P}$ takes a particularly nice form, then this can be used to understand what invariant measures ${f}$ has, and how empirical measures converge.

Let us say more clearly what is meant by convergence of empirical measures. Given ${x\in X}$ and ${n\in {\mathbb N}}$, let ${\mathcal{E}_n(x) = \frac 1n \sum_{k=0}^{n-1} \delta_{f^kx}}$ be the empirical measure along the orbit segment ${x,f(x),\dots,f^{n-1}(x)}$. Let ${V(x)\subset \mathcal{M}}$ be the set of weak* accumulation points of the sequence ${\mathcal{E}_n(x)}$. By compactness of ${\mathcal{M}}$, the set ${V(x)}$ is non-empty, and it is a standard exercise to show that every measure in ${V(x)}$ is ${f}$-invariant.

In particular, if ${(X,f)}$ is uniquely ergodic, then it only has one invariant measure ${\mu}$, and so ${V(x) = \{\mu\}}$ for every ${x\in X}$. In this case we have ${\mathcal{E}_n(x) \rightarrow \mu}$ for every ${x}$, and it is reasonable to ask how quickly this convergence occurs.
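To make the empirical measures concrete, here is a minimal Python sketch (the function names are illustrative, not from any library). It integrates ${\varphi(z)=z}$ against ${\mathcal{E}_n(x)}$ for the doubling map, written in angle coordinates ${z=e^{2\pi i x}}$ as ${x\mapsto 2x \bmod 1}$, along the period-2 orbit of ${x=\frac 13}$. Exact rational arithmetic is used for the orbit so that it stays periodic. The average settles at ${-\frac 12}$ rather than the Lebesgue value ${0}$, which is one way to see that the doubling map is not uniquely ergodic.

```python
from fractions import Fraction
import cmath
import math

def empirical_average(f, x, n, phi):
    """Integral of phi against E_n(x) = (1/n) * sum_{k<n} delta_{f^k x}."""
    total = 0
    for _ in range(n):
        total += phi(x)
        x = f(x)
    return total / n

def doubling(x):
    """Doubling map in angle coordinates: x -> 2x mod 1."""
    return (2 * x) % 1

def character(x):
    """phi(z) = z evaluated at z = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * float(x))

# The orbit of 1/3 is 1/3 -> 2/3 -> 1/3 -> ..., so E_n converges to the
# measure putting mass 1/2 on each of these two points.
avg = empirical_average(doubling, Fraction(1, 3), 1000, character)
print(avg)
```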

2. Irrational rotations

Now we specialise to the case of an irrational rotation. Let ${X=S^1\subset {\mathbb C}}$ be the unit circle, fix ${\theta\in {\mathbb R}}$ irrational, and let ${f\colon X\rightarrow X}$ be given by ${f(z) = e^{2\pi i\theta}z}$. We will show that ${\mu\in \mathcal{M}}$ is ${f}$-invariant iff it is Lebesgue measure, and then examine what happens in a broader setting.

Given ${n\in {\mathbb Z}}$, let ${p_n(z) = z^n}$, and let ${P = \{p_n \mid n\in {\mathbb Z}\}}$. Then the span of ${P}$ contains all functions ${S^1\rightarrow {\mathbb C}}$ that are polynomials in ${z}$ and ${\bar{z}}$, thus it is a subalgebra of ${C(S^1)}$ that contains the constant functions, separates points, and is closed under complex conjugation. By the Stone–Weierstrass theorem, this span is dense in ${C(S^1)}$, and since ${P}$ is bounded the discussion from above gives an isomorphism ${\Phi\colon C(S^1)^* \rightarrow V\subset \ell^\infty}$, where we suppress ${P}$ in the notation. This isomorphism is given by

$\displaystyle \Phi(L)_n = L(p_n) \ \ \ \ \ (5)$

for a general ${L\in C(S^1)^*}$, and for ${\mu\in \mathcal{M}}$ we write

$\displaystyle \Phi(\mu)_n = \int_{S^1} z^n \,d\mu(z). \ \ \ \ \ (6)$

The sequence ${\Phi(\mu)}$ is the sequence of Fourier coefficients associated to the measure ${\mu}$. The choice of ${p_n}$ means that the action ${f}$ induces on ${V\subset \ell^\infty}$ takes a simple form: the Fourier coefficients of ${L}$ and ${f_*L}$ are related by

$\displaystyle \Phi(f_*L)_n = (f_*L)(z^n) = L((f(z))^n) = L\left(e^{2\pi i \theta n} z^n\right) = e^{2\pi i \theta n} \Phi(L)_n.$

Thus if ${\mu}$ is invariant, we have ${\Phi(\mu)_n = e^{2\pi i \theta n} \Phi(\mu)_n}$ for all ${n\in {\mathbb Z}}$. Because ${\theta}$ is irrational, we have ${e^{2\pi i\theta n} \neq 1}$ for all ${n\neq 0}$, and so ${\Phi(\mu)_n=0}$ for all ${n\neq 0}$. Thus the only non-zero Fourier coefficient is ${\Phi(\mu)_0 = \int {\mathbf{1}} \,d\mu(z) = 1}$. Lebesgue measure is invariant, so it has exactly these Fourier coefficients, and because ${\Phi}$ is an isomorphism between ${C(S^1)^*}$ and ${V\subset \ell^\infty}$, this shows that the only ${\mu\in \mathcal{M}}$ with ${f_*\mu=\mu}$ is Lebesgue measure. (The same computation shows that every ${L\in C(S^1)^*}$ with ${f_*L=L}$ is a scalar multiple of Lebesgue measure.) In particular, ${f}$ is uniquely ergodic, with Lebesgue as the only invariant measure, and thus for any ${z\in S^1}$, the empirical measures ${\mathcal{E}_n(z)}$ converge to Lebesgue.
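The convergence can be watched numerically. The sketch below computes the Fourier coefficients of ${\mathcal{E}_n(z)}$ for a rotation and compares them with the bound ${\frac{1}{n|\sin \pi k\theta|}}$, which comes from summing the geometric series ${\sum_{j=0}^{n-1} e^{2\pi i k\theta j}}$ and using ${|1-e^{2\pi i k\theta}| = 2|\sin \pi k\theta|}$. This bound is not stated in the text above and is included only as a sanity check. The golden-ratio angle and the function names are choices made for this illustration.

```python
import cmath
import math

def empirical_fourier_coefficient(theta, k, n, y0=0.0):
    """k-th Fourier coefficient of E_n(z) for the rotation by theta,
    started at z = exp(2*pi*i*y0)."""
    total = sum(cmath.exp(2j * math.pi * k * (y0 + j * theta)) for j in range(n))
    return total / n

def geometric_bound(theta, k, n):
    """Upper bound 1 / (n * |sin(pi*k*theta)|) from the geometric series."""
    return 1.0 / (n * abs(math.sin(math.pi * k * theta)))

theta = (math.sqrt(5) - 1) / 2  # an irrational angle, chosen for illustration
for n in (10, 100, 1000, 10000):
    for k in (1, 2, 3):
        coeff = abs(empirical_fourier_coefficient(theta, k, n))
        print(n, k, coeff, geometric_bound(theta, k, n))
```

For each fixed ${k\neq 0}$ the coefficient decays like ${1/n}$, but the constant depends on how close ${k\theta}$ is to an integer.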

3. Random rotations

Consider a sequence of points ${z_1,z_2,z_3,\dots\in S^1}$, and let ${m_n\in \mathcal{M}}$ be the average of the point masses on the first ${n}$ points of the sequence:

$\displaystyle m_n = \frac 1n \sum_{k=1}^n \delta_{z_k}, \qquad\qquad m_n(\varphi) = \frac 1n \sum_{k=1}^n \varphi(z_k). \ \ \ \ \ (7)$

We say that the sequence ${z_n}$ equidistributes if ${m_n}$ converges to Lebesgue on ${S^1}$ in the weak* topology.

The previous section showed that if the points of the sequence are related by ${z_{n+1} = e^{2\pi i \theta} z_n}$, where ${\theta}$ is irrational, then the sequence equidistributes. A natural generalisation is to ask what happens when the points ${z_n}$ are related not by a fixed rotation, but by a randomly chosen rotation.

Here is one way of making this precise. Let ${\Omega = \{1,2\}^{\mathbb N}}$ be the set of infinite sequences of 1s and 2s, and let ${\mu}$ be the ${\left(\frac 12, \frac 12\right)}$-Bernoulli measure on ${\Omega}$, so that all sequences of length ${n}$ are equally likely. Fix real numbers ${\theta_1}$ and ${\theta_2}$, and fix ${z_1\in S^1}$. Given ${\omega\in \Omega}$, consider the sequence ${z_n(\omega)}$ given by

$\displaystyle z_{n+1}(\omega) = e^{2\pi i \theta_{\omega_n}} z_n(\omega). \ \ \ \ \ (8)$

Then one may ask whether or not ${z_n(\omega)}$ equidistributes almost surely (that is, with probability 1 w.r.t. ${\mu}$). The remainder of this post will be dedicated to proving the following result.

Theorem 1 If either of ${\theta_1}$ or ${\theta_2}$ is irrational, then ${z_n(\omega)}$ equidistributes almost surely.

Remark 2 The proof given here follows a paper by Lagarias and Soundararajan, to which I was referred by Lucia on MathOverflow.

Using Fourier coefficients as in the previous section, we have that ${z_n(\omega)}$ equidistributes iff all the non-constant Fourier coefficients of ${m_n(\omega)}$ converge to zero; that is, iff ${\Phi(m_n(\omega))_k \rightarrow 0}$ as ${n\rightarrow\infty}$ for all ${k\neq 0}$. This is Weyl’s criterion for equidistribution.

Fix a value of ${k\neq 0}$, which will be suppressed in the notation from now on. Write ${a_n}$ for the absolute value of the ${k}$th Fourier coefficient of ${m_n(\omega)}$, and note that

$\displaystyle a_n := |\Phi(m_n(\omega))_k| = |m_n(\omega)(z^k)| = \frac 1n \left|\sum_{j=1}^n z_j(\omega)^k\right|. \ \ \ \ \ (9)$
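Before starting the proof, it may help to see ${a_n}$ along one sample path. The following is a rough simulation of (8) and (9), not part of the argument. The angles ${\theta_1=\sqrt 2}$, ${\theta_2=\frac 14}$, the seed, and the function names are all choices made for this sketch.

```python
import cmath
import math
import random

def random_rotation_angles(theta1, theta2, n, y1=0.0, rng=None):
    """Return y_1, ..., y_n with y_{j+1} = y_j + theta_{omega_j}, where each
    omega_j is 1 or 2 with probability 1/2, so that z_j = exp(2*pi*i*y_j)."""
    rng = rng or random.Random()
    ys = [y1]
    for _ in range(n - 1):
        ys.append(ys[-1] + (theta1 if rng.random() < 0.5 else theta2))
    return ys

def a_values(ys, k):
    """Return [a_1, ..., a_n], where a_m = (1/m) * |sum_{j<=m} z_j^k|."""
    partial = 0j
    out = []
    for m, y in enumerate(ys, start=1):
        partial += cmath.exp(2j * math.pi * k * y)
        out.append(abs(partial) / m)
    return out

rng = random.Random(0)  # fixed seed so the run is reproducible
ys = random_rotation_angles(math.sqrt(2), 0.25, 20000, rng=rng)
a = a_values(ys, k=1)
for n in (10, 100, 1000, 10000, 20000):
    print(n, a[n - 1])
```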

The outline of the proof is as follows.

1. Show that there is a constant ${C}$ such that the expected value of ${a_n^2}$ is at most ${C/n}$.
2. Given ${\delta>0}$, show that there is a constant ${C'}$ such that the probability that ${a_n}$ exceeds ${\delta}$ is at most ${C'/n}$.
3. Find an exponentially increasing sequence ${n_j\rightarrow\infty}$ such that if ${a_{n_j}\leq \delta}$, then ${a_n\leq 2\delta}$ for every ${n\in [n_j,n_{j+1}]}$.
4. Use the Borel–Cantelli lemma to deduce that with probability 1, ${a_{n_j}}$ exceeds ${\delta}$ only finitely often, hence ${a_n}$ exceeds ${2\delta}$ only finitely often. Since ${\delta>0}$ was arbitrary this shows that ${a_n\rightarrow 0}$.

Step 1

Given ${\xi\in {\mathbb R}}$, let ${\|\xi\|}$ denote the distance between ${\xi}$ and the nearest integer.

Lemma 2

$\displaystyle \mathop{\mathbb E}_\mu[a_n^2] \leq \left( 1 + \frac 1{\|k\theta_1\|^2 + \|k\theta_2\|^2}\right) \frac 1n \ \ \ \ \ (10)$

Proof: Let ${y_1\in {\mathbb R}}$ be such that ${z_1 = e^{2\pi i y_1}}$, and define ${y_n}$ recursively by ${y_{n+1} = y_n + \theta_{\omega_n}}$, so that ${z_n = e^{2\pi i y_n}}$.

$\displaystyle \begin{aligned} a_n^2 &= \frac 1{n^2} \left\lvert \sum_{j=1}^n z_j^k \right\rvert^2 = \frac 1{n^2} \left(\sum_{\ell=1}^n z_\ell^k\right) \left(\sum_{j=1}^n \overline{z_j^k}\right) \\ &= \frac 1{n^2} \left( n + \sum_{\ell\neq j} z_\ell^k z_j^{-k} \right) = \frac 1{n^2} \left( n + \sum_{\ell\neq j} e^{2\pi i k(y_\ell - y_j)} \right). \end{aligned}$

Using the fact that ${z+\bar{z} = 2\Re(z)}$, and that swapping ${\ell}$ and ${j}$ replaces each term by its complex conjugate, we have

$\displaystyle \sum_{\ell \neq j} e^{2\pi i k(y_\ell - y_j)} = 2\Re \sum_{1\leq \ell < j \leq n} e^{2\pi i k(y_j - y_\ell)}.$

If ${j - \ell = r}$, then ${y_j - y_\ell = \sum_{m=\ell}^{j-1} \theta_{\omega_m}}$, and so

$\displaystyle \begin{aligned} \mathop{\mathbb E}_\mu[e^{2\pi i k (y_j - y_\ell)}] &= \frac 1{2^r} \sum_{\omega_\ell,\dots, \omega_{j-1}} e^{2\pi i k \sum_{m=\ell}^{j-1} \theta_{\omega_{m}}} = \frac 1{2^r} \sum \prod_{m=\ell}^{j-1} e^{2\pi i k \theta_{\omega_m}} \\ &= \left(\frac{e^{2\pi i k\theta_1} + e^{2\pi i k\theta_2}}{2}\right)^r, \end{aligned}$

where the sums are over all ${\omega_m\in \{1,2\}}$ for ${\ell\leq m < j}$. Since there are ${n-r}$ pairs ${(\ell,j)}$ with ${1\leq \ell < j\leq n}$ and ${j - \ell = r}$, we have

$\displaystyle \mathop{\mathbb E}_\mu[a_n^2] = \frac 1{n^2}\left(n + 2\Re \sum_{r=1}^n (n-r)z^r\right), \ \ \ \ \ (11)$

where ${z=\frac 12(e^{2\pi i k\theta_1} + e^{2\pi i k\theta_2})}$. Now

$\displaystyle \begin{aligned} \sum_{r=1}^n (n-r) z^r &= \sum_{r=1}^{n-1} z^r + \sum_{r=1}^{n-2} z^r + \cdots + \sum_{r=1}^1 z^r \\ &= \frac z{1-z}\sum_{s=1}^{n-1} (1-z^s) = \frac z{1-z} \left( n-\sum_{s=0}^{n-1} z^s \right), \end{aligned}$

and since ${|z|\leq 1}$ we have

$\displaystyle \left\lvert \sum_{r=1}^n (n-r) z^r \right\rvert \leq \frac{2n}{|1-z|}.$

Together with (11), this gives

$\displaystyle \mathop{\mathbb E}_\mu[a_n^2] \leq \frac 1n \left( 1 + \frac 4{|1-z|}\right) = \frac 1n\left( 1 + \frac 8{|2 - e^{2\pi i k \theta_1} - e^{2\pi i k \theta_2}|} \right).$

Using the fact that ${|w|\geq |\Re w|}$ and that ${\Re(1-e^{2i\xi}) = 1-\cos(2\xi) = 2\sin^2\xi}$, we have

$\displaystyle |2 - e^{2\pi ik\theta_1} - e^{2\pi ik\theta_2}| \geq 2(\sin^2 \pi k\theta_1 + \sin^2 \pi k\theta_2),$

and so

$\displaystyle \mathop{\mathbb E}_\mu[a_n^2] \leq \frac 1n\left(1 + \frac 4{\sin^2\pi k\theta_1 + \sin^2\pi k\theta_2}\right)$

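Finally, for ${|\xi|\leq \frac 12}$ we have ${|\sin \pi\xi| \geq |2\xi|}$. Applying this with ${|\xi| = \|k\theta_i\|}$ gives ${\sin^2 \pi k\theta_i \geq 4\|k\theta_i\|^2}$ for ${i=1,2}$, which proves the bound in (10). $\Box$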

Because one of ${\theta_1,\theta_2}$ is irrational, the denominator in (10) is positive, and so ${C := 1 + (\|k\theta_1\|^2 + \|k\theta_2\|^2)^{-1} < \infty}$, which completes Step 1.
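Lemma 2 can be checked by brute force for small ${n}$. The sketch below averages ${a_n^2}$ over all ${2^{n-1}}$ equally likely coin sequences, then compares the result with the right-hand side of (11) and with the bound (10). It takes ${z_1=1}$, which loses nothing since ${a_n}$ does not depend on ${z_1}$. The parameter values and function names are illustrative.

```python
import cmath
import itertools
import math

def dist_to_nearest_integer(x):
    """The quantity ||x|| used in Lemma 2."""
    return abs(x - round(x))

def expected_a_squared_bruteforce(theta1, theta2, k, n):
    """Average of a_n^2 over all 2^(n-1) coin sequences omega_1..omega_{n-1}."""
    thetas = (theta1, theta2)
    total = 0.0
    count = 0
    for omega in itertools.product((0, 1), repeat=n - 1):
        y = 0.0
        s = 1 + 0j  # contribution of z_1^k with z_1 = 1
        for w in omega:
            y += thetas[w]
            s += cmath.exp(2j * math.pi * k * y)
        total += abs(s) ** 2 / n ** 2
        count += 1
    return total / count

def expected_a_squared_formula(theta1, theta2, k, n):
    """Right-hand side of (11)."""
    z = (cmath.exp(2j * math.pi * k * theta1) + cmath.exp(2j * math.pi * k * theta2)) / 2
    tail = sum((n - r) * z ** r for r in range(1, n + 1))
    return (n + 2 * tail.real) / n ** 2

def lemma_bound(theta1, theta2, k, n):
    """Right-hand side of (10)."""
    d = dist_to_nearest_integer(k * theta1) ** 2 + dist_to_nearest_integer(k * theta2) ** 2
    return (1 + 1 / d) / n

theta1, theta2, k = math.sqrt(2), 0.25, 1  # illustrative choices
for n in range(2, 11):
    print(n,
          expected_a_squared_bruteforce(theta1, theta2, k, n),
          expected_a_squared_formula(theta1, theta2, k, n),
          lemma_bound(theta1, theta2, k, n))
```

The first two columns should agree up to rounding, and both should sit below the third.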

Step 2

Given ${\delta>0}$, we have

$\displaystyle \mathop{\mathbb E}[a_n^2] \geq \delta^2 \mathop{\mathbb P}(a_n\geq \delta),$

and so by Lemma 2 we have

$\displaystyle \mathop{\mathbb P}(a_n\geq \delta) \leq \frac C{\delta^2 n}.$

Putting ${C' = C\delta^{-2}}$ completes Step 2.
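For small ${n}$ the probability in Step 2 can be computed exactly by enumerating all coin sequences, which gives a direct check of the chain ${\mathop{\mathbb P}(a_n\geq\delta) \leq \delta^{-2}\mathop{\mathbb E}[a_n^2] \leq C'/n}$. This is only a sketch: the values ${n=14}$, ${\delta=0.9}$ and the angles are chosen so that the bound is below 1, and the function names are made up for the illustration.

```python
import cmath
import itertools
import math

def exact_tail_and_moment(theta1, theta2, k, n, delta):
    """Return (P(a_n >= delta), E[a_n^2]) by enumerating all 2^(n-1) paths."""
    thetas = (theta1, theta2)
    hits = 0
    second_moment = 0.0
    paths = 0
    for omega in itertools.product((0, 1), repeat=n - 1):
        y = 0.0
        s = 1 + 0j  # z_1 = 1; a_n does not depend on z_1
        for w in omega:
            y += thetas[w]
            s += cmath.exp(2j * math.pi * k * y)
        a_n = abs(s) / n
        hits += a_n >= delta
        second_moment += a_n ** 2
        paths += 1
    return hits / paths, second_moment / paths

def lemma_constant(theta1, theta2, k):
    """The constant C = 1 + 1 / (||k theta_1||^2 + ||k theta_2||^2)."""
    d1 = abs(k * theta1 - round(k * theta1))
    d2 = abs(k * theta2 - round(k * theta2))
    return 1 + 1 / (d1 ** 2 + d2 ** 2)

theta1, theta2, k, n, delta = math.sqrt(2), 0.25, 1, 14, 0.9
prob, moment = exact_tail_and_moment(theta1, theta2, k, n, delta)
C = lemma_constant(theta1, theta2, k)
# exact probability, bound via the exact second moment, bound via Lemma 2
print(prob, moment / delta ** 2, C / (delta ** 2 * n))
```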

Step 3

Given ${m\leq n\in {\mathbb N}}$, we have

$\displaystyle na_n \leq ma_m + \left|\sum_{\ell=m+1}^{n} z_\ell^k\right| \leq ma_m + n-m,$

and so

$\displaystyle a_n \leq \frac{n-m}n + \frac mn a_m.$

In particular, if ${n\leq (1+\delta)m}$ for some ${\delta>0}$, we have

$\displaystyle a_n \leq \delta + a_m. \ \ \ \ \ (12)$

Let ${n_j}$ be an increasing sequence of positive integers such that

$\displaystyle 1+\frac \delta 2 \leq \frac{n_{j+1}}{n_j} \leq 1+\delta \ \ \ \ \ (13)$

for all ${j}$. If ${a_{n_j}\leq \delta}$, then (12) implies that ${a_n\leq 2\delta}$ for all ${n\in [n_j, n_{j+1}]}$.
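One possible way to produce such a sequence (a choice made here, not taken from the argument above) is to start at ${n_1 = \lceil 2/\delta\rceil}$ and set ${n_{j+1} = \lfloor (1+\delta)n_j \rfloor}$. The starting value guarantees that rounding down loses at most a factor that keeps the ratio above ${1+\frac\delta 2}$. The sketch below builds this sequence with exact rational arithmetic, checks (13), and then checks the deterministic inequality (12) between consecutive checkpoints along one simulated path. Angles, seed, and names are illustrative.

```python
import cmath
import math
import random
from fractions import Fraction

def checkpoint_sequence(delta, count):
    """Integers n_1 < n_2 < ... with 1 + delta/2 <= n_{j+1}/n_j <= 1 + delta."""
    n = math.ceil(2 / delta)
    seq = [n]
    while len(seq) < count:
        n = math.floor((1 + delta) * n)
        seq.append(n)
    return seq

def running_a(theta1, theta2, k, n, rng):
    """Return [a_1, ..., a_n] along one random path started at z_1 = 1."""
    y, partial, out = 0.0, 0j, []
    for m in range(1, n + 1):
        partial += cmath.exp(2j * math.pi * k * y)
        out.append(abs(partial) / m)
        y += theta1 if rng.random() < 0.5 else theta2
    return out

delta = Fraction(1, 10)
checkpoints = checkpoint_sequence(delta, 30)
a = running_a(math.sqrt(2), 0.25, 1, checkpoints[-1], random.Random(1))

# Condition (13) for consecutive checkpoints, checked exactly.
ratios_ok = all(1 + delta / 2 <= Fraction(nxt, cur) <= 1 + delta
                for cur, nxt in zip(checkpoints, checkpoints[1:]))
# Inequality (12): a_n <= delta + a_{n_j} for every n in [n_j, n_{j+1}].
step3_ok = all(a[n - 1] <= float(delta) + a[m - 1] + 1e-12
               for m, m_next in zip(checkpoints, checkpoints[1:])
               for n in range(m, m_next + 1))
print(checkpoints[:8], ratios_ok, step3_ok)
```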

Step 4

Let ${E_j}$ be the event that ${a_{n_j} \geq \delta}$. By Step 2, we have ${\mathop{\mathbb P}(E_j) \leq C'/n_j}$. By (13) we have ${n_j \geq (1+\frac\delta 2)^{j-1} n_1}$, so ${n_j}$ increases exponentially in ${j}$ and ${\sum_j C'/n_j}$ is dominated by a convergent geometric series; thus ${\sum_{j=1}^\infty \mathop{\mathbb P}(E_j) < \infty}$. By the Borel–Cantelli lemma, this implies that with probability 1, there are only finitely many values of ${j}$ for which ${a_{n_j}\geq \delta}$.

By Step 3, this in turn implies that there are only finitely many values of ${n}$ for which ${a_n\geq 2\delta}$. In particular, ${\varlimsup a_n \leq 2\delta}$, and since ${\delta>0}$ was arbitrary, we have ${a_n\rightarrow 0}$. Thus the sequence ${z_n}$ satisfies Weyl’s criterion almost surely, which completes the proof of Theorem 1.