## Slowly mixing sets

There are two equivalent definitions of mixing for a measure-preserving dynamical system ${(X,f,\mu)}$. One is in terms of sets:

$\displaystyle \lim_{n\rightarrow\infty} |\mu(A \cap f^{-n}B) - \mu(A)\mu(B)| = 0 \ \ \ \ \ (1)$

for all measurable ${A,B\subset X}$. The other is in terms of functions:

$\displaystyle \lim_{n\rightarrow\infty} \left\lvert \int \varphi \cdot(\psi\circ f^n)\,d\mu - \int\varphi\,d\mu \int\psi\,d\mu\right\rvert = 0 \ \ \ \ \ (2)$

for all ${\varphi,\psi\in L^1(X,\mu)}$. In both cases one may refer to the quantity inside the absolute value as the correlation function, and write it as ${C_n(A,B)}$ or ${C_n(\varphi,\psi)}$, so that (1) and (2) become ${C_n(A,B)\rightarrow 0}$ and ${C_n(\varphi,\psi)\rightarrow 0}$, respectively.

Then it is natural to ask how quickly mixing occurs; that is, how quickly the correlation functions ${C_n}$ decay to 0. Is the convergence exponential in ${n}$? Polynomial? Something else?

In a previous post, we discussed one method for establishing exponential decay of correlations using a spectral gap for a certain operator associated to the system. The result there stated that as long as ${\varphi}$ and ${\psi}$ are chosen from a suitable class of test functions (often Hölder continuous functions), then ${C_n(\varphi,\psi)}$ decays exponentially.

A comment was made in that post that it is important to work with sufficiently regular test functions, because for arbitrary measurable functions, or for the setwise correlations ${C_n(A,B)}$, the decay may happen arbitrarily slowly even when the system is “as mixing as possible”. But no examples were given there — so I’d like to explain this a little more completely here.

Let ${X=\{0,1\}^{\mathbb N}}$ be the space of infinite sequences of 0s and 1s, and let ${f=\sigma}$ be the shift map. Let ${\mu}$ be ${(\frac 12,\frac 12)}$-Bernoulli measure, so that writing ${[x_1\cdots x_n]}$ for the ${n}$-cylinder containing all sequences in ${X}$ that begin with the symbols ${x_1,\dots,x_n}$, we have ${\mu[x_1\cdots x_n] = 2^{-n}}$. This is isomorphic to the doubling map on the circle with Lebesgue measure, which was discussed in the previous post and which has exponential decay of correlations for Hölder observables ${\varphi,\psi}$. Nevertheless, we will show that there are sets ${A,B\subset X}$ such that ${C_n(A,B)}$ decays quite slowly.

We can think of ${(X,f,\mu)}$ as modeling sequences of coin flips, with tails corresponding to the symbol 0, and heads to 1.

Let ${B = [1]}$ be the set of all sequences in ${X}$ that begin with 1. Then ${B}$ corresponds to the event that the first flip is heads, and ${\sigma^{-n}B}$ corresponds to the event that the ${(n+1)}$st flip is heads.

The description of ${A}$ is a little more involved. Given ${j\leq k \in {\mathbb N}}$, let ${A_{j,k} = \bigcup_{i=j}^{k-1} \sigma^{-i}[1]}$, so that ${x\in A_{j,k}}$ if and only if (at least) one of the symbols ${x_j,\dots,x_{k-1}}$ is 1. This corresponds to the event that heads appears at least once on or after the ${j}$th flip and before the ${k}$th flip. Note that ${\mu(X \setminus A_{j,k}) = 2^{-(k-j)}}$, and so ${\mu(X \setminus A_{j,k}) = 1-2^{-(k-j)}}$.

Fix an increasing sequence of integers ${k_1,k_2,\dots}$ with ${k_1 = 1}$, and let ${A = \bigcap_{i=1}^\infty A_{k_i, k_{i+1}}}$ be the set of all sequences ${x\in X}$ such that every subword ${x_{k_i} \cdots x_{k_{i+1}-1}}$ contains at least one occurrence of the symbol 1.

Using the interpretation as coin flips, we see that the events ${A_{k_i,k_{i+1}-1}}$ are independent, and so

$\displaystyle \mu(A) = \mathop{\mathbb P}(A) = \prod_{i=1}^\infty \mathop{\mathbb P}(A_{k_i,k_{i+1}}) = \prod_{i=1}^\infty (1-2^{-(k_{i+1} - k_i)}). \ \ \ \ \ (3)$

We recall the following elementary lemma.

Lemma 1 Let ${t_i\in {\mathbb R}}$ be a sequence of real numbers such that ${0\leq t_i < 1}$ for all ${i\in {\mathbb N}}$. Then ${\prod_{i=1}^\infty (1-t_i) > 0}$ if and only if ${\sum_{i=1}^\infty t_i < \infty}$.

It follows from Lemma 1 that ${\mu(A)>0}$ if and only if

$\displaystyle \sum_{i=1}^\infty 2^{-(k_{i+1} - k_i)} < \infty. \ \ \ \ \ (4)$

Thus from now on we assume that the sequence ${k_i}$ is chosen such that (4) holds. Note that the correlation function ${C_n(A,B)}$ can be computed via conditional probabilities: using

$\displaystyle \mathop{\mathbb P}(A \mid \sigma^{-n}B) = \frac{\mu(A \cap \sigma^{-n}B)}{\mu(B)}$

(notice that this uses shift-invariance of ${\mu}$), we have

\displaystyle \begin{aligned} C_n(A,B) &= \mu(A\cap \sigma^{-n}B) - \mu(A)\mu(B) \\ &= \mu(B) (\mathop{\mathbb P}(A \mid \sigma^{-n}B) - \mu(A)). \end{aligned} \ \ \ \ \ (5)

Recalling that ${A}$ is the intersection of the independent events ${A_{k_i,k_{i+1}}}$, we let ${j=j(n)}$ be such that ${n\in [k_j, k_{j+1})}$, and observe that ${\sigma^{-n}(B) \subset A_{k_j,k_{j+1}}}$. Thus

$\displaystyle \mathop{\mathbb P}(A \mid \sigma^{-n}B) = \prod_{i\neq j} (1-2^{-(k_{i+1}-k_i)}) = \frac{\mu(A)}{1-2^{-(k_{j+1}-k_j)}},$

where the last equality uses the expression for ${\mu(A)}$ from (3). Together with the expression (5) for ${C_n(A,B)}$ in terms of the conditional probability ${\mathop{\mathbb P}(A\mid \sigma^{-n}B)}$, this gives

\displaystyle \begin{aligned} C_n(A,B) &= \mu(A) \mu(B) \left( \frac 1{1-2^{-(k_{j+1}-k_j)}} - 1 \right) \\ &= \mu(A)\mu(B) \frac 1{2^{k_{j+1} - k_j} - 1}, \end{aligned} \ \ \ \ \ (6)

where we emphasise that ${j}$ is a function of ${n}$.

Example 1 Take ${k_i = \lfloor 2 i\log_2 i \rfloor}$, so that ${k_{i+1} - k_i \approx 2 i \log_2 (1+\frac 1i) + 2\log_2(i+1)}$, and ${2^{k_{i+1} - k_i} \approx (i+1)^2 (1+\frac 1i)^{2i}}$. Since ${(1+\frac 1i)^{2i} \rightarrow e^2}$, we see that (4) is satisfied, so that ${\mu(A)>0}$.

Moreover, ${j=j(n)}$ satisfies ${2j\log_2 j \leq n}$, and so in particular ${j\leq n}$. This implies that ${k_{j+1} - k_j \leq k_{n+1} - k_n}$, and so we can estimate the correlation function using (6) by

$\displaystyle C_n(A,B) \geq \mu(A)\mu(B)\frac 1{2^{k_{n+1} - k_n} - 1} \approx \mu(A)\mu(B) \frac 1{n^2}.$

The example shows that the correlation function of sets may only decay polynomially, even if the system has exponential decay of correlations for regular observable functions. In fact, we can use the construction above to produce sets with correlations that decay more or less as slowly as we like along some subsequence of times ${n}$.

Theorem 2 Let ${\gamma_n>0}$ be any sequence of positive numbers converging to 0. Then there is a sequence ${k_i}$ such that the sets ${A,B}$ from the construction above have ${\limsup_{n\rightarrow\infty} C_n(A,B)/\gamma_n = \infty}$.

The result says that no matter how slowly ${\gamma_n}$ converges to 0, we can choose ${A,B}$ such that ${C_n(A,B)}$ is not bounded above by ${K\gamma_n}$ for any constant ${K}$.

To prove Theorem 2, we will choose ${k_i}$ such that ${C_n(A,B)/\gamma_n}$ is large whenever ${n\in [k_j, k_{j+1})}$ with ${j}$ even. Let ${c_i,d_i}$ be any increasing sequences of integers (conditions on these will be imposed soon) and define ${k_i}$ recursively by

$\displaystyle k_1 = 1, \quad k_{2i} = k_{2i-1} + c_i, \qquad k_{2i+1} = k_{2i} + d_i.$

Because ${c_i,d_i}$ are increasing, we have

$\displaystyle \sum_{i=1}^\infty 2^{-(k_{i+1} - k_i)} = \sum_{i=1}^\infty 2^{-c_i} + 2^{-d_i} < \infty,$

so (4) holds and ${\mu(A)>0}$. The idea is that we will take ${c_i \gg d_i}$, so that when ${n\in [k_{2j},k_{2j+1})}$, we have ${k_{2j+1} - k_{2j} = d_j}$ small relative to ${k_{2j}}$, and hence to ${n}$.

Let’s flesh this out a little. Given ${n\in [k_{2j},k_{2j+1})}$, it follows from (6) that

$\displaystyle C_n(A,B) = \mu(A)\mu(B) \frac 1{2^{k_{j+1} - k_j} - 1} \geq \mu(A)\mu(B) 2^{-d_j}. \ \ \ \ \ (7)$

On the other hand, ${n\geq k_{2j} \geq \sum_{i=1}^j c_i}$. Now we may take ${d_j}$ to be any increasing sequence (${d_j=j}$ will do just fine) and define ${c_j}$ recursively in terms of ${c_1,\dots, c_{j-1}}$ and ${d_j}$. We do this by observing that since ${\gamma_n \rightarrow 0}$, given any ${c_1,\dots, c_{j-1}}$ and ${d_j}$, we can take ${c_j}$ large enough so that for every ${n\geq \sum_{i=1}^j c_i}$, we have

$\displaystyle j\gamma_n \leq \mu(A) \mu(B) 2^{-d_j}.$

In particular, (7) gives ${C_n(A,B) \geq j\gamma_n}$ for every ${n\in [k_{2j},k_{2j+1})}$. Sending ${j\rightarrow\infty}$ completes the proof of Theorem 2.