Entropy of S-gap shifts

1. S-gap shifts

S-gap shifts are a useful example for studying dynamics of shift spaces that are not subshifts of finite type but still exhibit some strong mixing properties. They are defined as follows: given ${S\subset \{0,1,2,\dots\}}$, let ${G\subset \{0,1\}^*}$ be the set of all words of the form ${0^n1}$ with ${n\in S}$ — that is, ${n}$ 0s followed by a single 1. (Edit 8/2/15: As Steve Kass pointed out in his comment, we need to specify here that ${0^n1\in G \Leftrightarrow n\in S}$.) Then let ${G^{\mathbb Z}}$ be the set of all bi-infinite sequences of 0s and 1s that can be written as a bi-infinite concatenation of words in ${G}$, and let ${X\subset \{0,1\}^{\mathbb Z}}$ be the smallest closed shift-invariant set containing ${G^{\mathbb Z}}$.

Equivalently, ${X}$ is the collection of bi-infinite sequences ${x\in \{0,1\}^{\mathbb Z}}$ in which every subword of the form ${10^n1}$ has ${n\in S}$. If ${S}$ is finite then ${X}$ is a shift of finite type. We are usually most interested in the case where ${S}$ is infinite — for example, in this paper (arXiv), where Dan Thompson and I considered questions of uniqueness of the measure of maximal entropy. For the purposes of this post, ${S}$ may be finite or infinite; it will not matter.

Recall that if ${\mathcal{L}_n}$ denotes the set of words of length ${n}$ that appear somewhere in the shift ${X}$, then the topological entropy of ${X}$ is ${h(X) = \lim \frac 1n \log \#\mathcal{L}_n}$. The following result is well-known and often quoted.

Theorem 1 Given ${S\subset \{0,1,2,\dots\}}$, the topological entropy of the corresponding ${S}$-gap shift is ${h(X) = \log\lambda}$, where ${\lambda>1}$ is the unique solution to ${\sum_{n\in S} \lambda^{-(n+1)} = 1}$.

Note that when ${S=\{0,1,2,\dots\}}$, the ${S}$-gap shift is the full shift on two symbols, and the equation has solution ${\lambda=2}$.
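Since ${\lambda\mapsto \sum_{n\in S}\lambda^{-(n+1)}}$ is strictly decreasing on ${(1,\infty)}$, the value of ${\lambda}$ in Theorem 1 is easy to approximate numerically. Here is a short Python sketch (the function name `sgap_lambda` and the bisection scheme are illustrative choices, not from the literature): for ${S=\{0,1\}}$ the ${S}$-gap shift is the golden mean shift and ${\lambda = (1+\sqrt5)/2}$, while for ${S=\{1,3,5,\dots\}}$ (truncated here) the equation becomes ${\sum_{j\geq 1} \lambda^{-2j} = 1}$, giving ${\lambda = \sqrt2}$.

```python
# Solve sum_{n in S} lam**-(n+1) == 1 for lam by bisection.
# S must be a finite iterable here; an infinite S has to be
# truncated, which perturbs lambda only negligibly once the
# truncation point is large.
def sgap_lambda(S, tol=1e-12):
    S = list(S)
    f = lambda lam: sum(lam ** -(n + 1) for n in S)
    lo, hi = 1.0, 2.0
    while f(hi) > 1:        # enlarge the bracket until f(hi) <= 1
        hi *= 2
    while hi - lo > tol:    # f is strictly decreasing on (1, oo)
        mid = (lo + hi) / 2
        if f(mid) > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(sgap_lambda({0, 1}))            # golden mean: ~1.6180339887
print(sgap_lambda(range(1, 200, 2)))  # odd gaps: ~1.4142135624
```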

Despite the fact that Theorem 1 is well-known, I am not aware of a complete proof written anywhere in the literature. In a slightly different language, this result already appears in B. Weiss’ 1970 paper “Intrinsically ergodic systems” [Bull. AMS 76, 1266–1269] as example 3.(3), but no details of the proof are given. It is exercise 4.3.7 in Lind and Marcus’ book “Symbolic dynamics and coding”, and the section preceding it gives ideas as to how the proof may proceed. Finally, a more detailed proof appears in Spandl’s “Computing the topological entropy of subshifts” [Math. Log. Quart. 53 (2007), 493–510], but there is a gap in the proof. The goal of this post is to explain where the gap in Spandl’s proof is, and then to give two other proofs of Theorem 1, one more combinatorial, the other more ergodic theoretic.

2. An incomplete proof

The idea of Spandl’s proof is this. Given an ${S}$-gap shift, let ${\mathcal{R}_n\subset\mathcal{L}_n}$ be the set of words of length ${n}$ in the language that end in the symbol ${1}$. Suppose for the moment that ${S}$ is infinite, so that ${0^{n-1}1\in\mathcal{L}_n}$ for every ${n}$ (for finite ${S}$ the constant term below needs a minor modification, which does not affect the argument). Then every word in ${\mathcal{R}_n}$ is either ${0^{n-1}1}$ or is of the form ${w0^k1}$, where ${w\in \mathcal{R}_{n-(k+1)}}$ and ${k\in S}$. Thus we have

$\displaystyle \#\mathcal{R}_n = 1 + \sum_{k=0}^{n-1} {\mathbf{1}}_S(k) \#\mathcal{R}_{n-(k+1)}. \ \ \ \ \ (1)$

Moreover, every ${w\in \mathcal{L}_n}$ is either ${0^n}$ or is of the form ${v0^{n-k}}$ for some ${v\in \mathcal{R}_k}$, so ${\#\mathcal{L}_n = 1 + \sum_{k=1}^n \#\mathcal{R}_k}$. With a little more work, one can use this together with (1) to get

$\displaystyle \#\mathcal{L}_n = c_n + \sum_{k=0}^{n-1} {\mathbf{1}}_S(k) \#\mathcal{L}_{n-(k+1)},$

where ${c_n\in [0,n+2]}$ for all ${n}$. Dividing through by ${\#\mathcal{L}_n}$ gives

$\displaystyle 1 = \frac{c_n}{\#\mathcal{L}_n} + \sum_{k=0}^{n-1} {\mathbf{1}}_S(k)\frac{\#\mathcal{L}_{n-(k+1)}}{\#\mathcal{L}_n} \ \ \ \ \ (2)$

Writing ${\lambda = e^{h(X)}}$, Spandl now says that ${\#\mathcal{L}_n}$ is asymptotically proportional to ${\lambda^n}$, and so for each ${k}$ the ratio inside the sum converges to ${\lambda^{-(k+1)}}$ as ${n\rightarrow\infty}$. Since ${c_n}$ is subexponential, this would prove Theorem 1.
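As a quick sanity check on these recursions (a numerical aside: the helper `in_language`, the class `Odds`, and all names below are illustrative choices, and the membership test is valid only for unbounded ${S}$, where a word lies in the language if and only if all of its internal gaps lie in ${S}$), we can compute ${\#\mathcal{R}_n}$ from (1) and ${\#\mathcal{L}_n = 1 + \sum_{k=1}^n \#\mathcal{R}_k}$ for ${S=\{1,3,5,\dots\}}$ and compare with brute-force enumeration.

```python
from itertools import product

# Membership test for UNBOUNDED S: a binary word lies in the language
# iff every internal gap 1 0^k 1 has k in S (boundary runs of 0s can
# always be extended to an admissible gap when S is unbounded).
def in_language(w, S):
    ones = [i for i, c in enumerate(w) if c == 1]
    return all(ones[t + 1] - ones[t] - 1 in S
               for t in range(len(ones) - 1))

class Odds:                        # stand-in for S = {1, 3, 5, ...}
    def __contains__(self, k):
        return k % 2 == 1

S, N = Odds(), 12

# Recursion (1): #R_n = 1 + sum_{k=0}^{n-1} 1_S(k) #R_{n-(k+1)}
R = [0] * (N + 1)
for n in range(1, N + 1):
    R[n] = 1 + sum(R[n - (k + 1)] for k in range(n) if k in S)

# #L_n = 1 + sum_{k=1}^{n} #R_k
L = [1 + sum(R[1:n + 1]) for n in range(N + 1)]

# Compare R_n against brute-force enumeration of the language
for n in range(1, N + 1):
    brute = sum(1 for w in product((0, 1), repeat=n)
                if w[-1] == 1 and in_language(w, S))
    assert brute == R[n], (n, brute, R[n])

print(R[1:])  # [1, 1, 2, 2, 4, 4, 8, 8, 16, 16, 32, 32]
print(L[1:])  # [2, 3, 5, 7, 11, 15, 23, 31, 47, 63, 95, 127]
```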

The problem is that the ratio ${\frac{\#\mathcal{L}_{n-(k+1)}}{\#\mathcal{L}_n}}$ may not converge as ${n\rightarrow\infty}$. Indeed, taking ${S = \{1,3,5,7,\dots\}}$ it is not hard to show that when ${k}$ is even, the limit taken along odd values of ${n}$ differs from the limit taken along even values of ${n}$.
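This oscillation is easy to see numerically: for ${S=\{1,3,5,\dots\}}$ we have ${\lambda=\sqrt2}$, and computing ${\#\mathcal{L}_n}$ from the recursions above (an illustration only, not part of any proof) shows the ratio ${\#\mathcal{L}_{n-1}/\#\mathcal{L}_n}$ — the case ${k=0}$ — settling onto two different subsequence limits, approximately ${2/3}$ along odd ${n}$ and ${3/4}$ along even ${n}$.

```python
# Counts #L_n for S = {1, 3, 5, ...} via recursion (1) and
# #L_n = 1 + sum_k #R_k (numerical illustration only).
N = 30
R = [0] * (N + 1)
for n in range(1, N + 1):
    R[n] = 1 + sum(R[n - (k + 1)] for k in range(1, n, 2))
L = [1 + sum(R[1:n + 1]) for n in range(N + 1)]

# The ratio #L_{n-1}/#L_n along odd and even n (the case k = 0)
odd_ratios = [L[n - 1] / L[n] for n in range(3, N + 1, 2)]
even_ratios = [L[n - 1] / L[n] for n in range(2, N + 1, 2)]
print(odd_ratios[-1])   # ~0.6667 (tends to 2/3)
print(even_ratios[-1])  # ~0.7500 (tends to 3/4)
```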

One might observe that in this specific example, the terms where ${k}$ is even do not contribute to the sum in (2), since ${S}$ contains no even numbers. Thus it is plausible to make the following conjecture.

Conjecture. For any ${S}$, let ${d = \gcd(S+1) = \gcd\{n+1 : n\in S\}}$ and let ${h = \log\lambda}$ be the topological entropy of the corresponding ${S}$-gap shift. Then for every ${j=0,1,\dots,d-1}$ the limit

$\displaystyle a_j := \lim_{n\rightarrow\infty} \frac{\#\mathcal{L}_{j+dn}}{\lambda^{j+dn}}$

exists. In particular, if ${k\in S}$, then ${d}$ divides ${k+1}$, so ${n-(k+1)}$ and ${n}$ lie in the same residue class modulo ${d}$, and hence

$\displaystyle \frac{\#\mathcal{L}_{n-(k+1)}}{\#\mathcal{L}_n} = \lambda^{-(k+1)}\frac{\#\mathcal{L}_{n-(k+1)}}{\lambda^{n-(k+1)}} \frac{\lambda^n}{\#\mathcal{L}_n} \rightarrow \lambda^{-(k+1)} \text{ as }n\rightarrow\infty.$

If the conjecture is true, this would complete Spandl’s proof of Theorem 1. I expect that the conjecture is true but do not know how to prove it. In general, any shift space with entropy ${h=\log \lambda}$ has the property that ${\#\mathcal{L}_n\geq \lambda^n}$. There are examples of shift spaces where ${\#\mathcal{L}_n /\lambda^n}$ is not bounded above; however, it can be shown that every ${S}$-gap shift admits an upper bound, so that ${\#\mathcal{L}_n/\lambda^n}$ is bounded away from 0 and ${\infty}$ (this is done in my paper with Dan Thompson). I don’t see how those techniques can extend to a proof of the conjecture.
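For what it is worth, the conjecture is easy to support numerically in the example above: for ${S=\{1,3,5,\dots\}}$ we have ${\lambda=\sqrt2}$ and ${d=2}$, and the ratios ${\#\mathcal{L}_n/\lambda^n}$ appear to converge along each residue class mod 2 (to roughly ${2}$ along even ${n}$ and ${3/\sqrt2\approx 2.1213}$ along odd ${n}$). This is only a sanity check, of course, not a proof.

```python
# Ratios #L_n / lambda^n for S = {1, 3, 5, ...}, lambda = sqrt(2).
# The conjecture (with d = 2) predicts convergence along each
# residue class mod 2; this is a numerical check, not a proof.
N = 40
R = [0] * (N + 1)
for n in range(1, N + 1):
    R[n] = 1 + sum(R[n - (k + 1)] for k in range(1, n, 2))
L = [1 + sum(R[1:n + 1]) for n in range(N + 1)]

lam = 2 ** 0.5
even = [L[n] / lam ** n for n in range(2, N + 1, 2)]
odd = [L[n] / lam ** n for n in range(1, N + 1, 2)]
print(even[-3:])  # stabilizes near 2
print(odd[-3:])   # stabilizes near 3/sqrt(2) ~ 2.1213
```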

So instead of patching the hole in this proof, we investigate two others.

3. A combinatorial proof

Given a shift space ${X}$ with language ${\mathcal{L}}$, let ${a_n = \#\mathcal{L}_n}$ be the number of words of length ${n}$. Consider the following generating function:

$\displaystyle H(z) = \sum_{n=1}^\infty a_n z^n.$

(This is similar to the dynamical zeta function but is somewhat different since we consider all words of length ${n}$ and not just ones that represent periodic orbits of period ${n}$.) Observe that the radius of convergence of ${H}$ is ${\xi := e^{-h(X)}}$. Indeed, for ${|z|<\xi}$ we can put ${\beta = |z|/\xi < 1}$ and observe that the quantity ${|a_n z^n| = \frac{\#\mathcal{L}_n}{e^{nh(X)}} \beta^n}$ decays exponentially; similarly, for ${|z|>\xi}$ the terms in the sum grow exponentially.
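To see this concretely, take the ${S}$-gap shift with ${S=\{1,3,5,\dots\}}$, where Theorem 1 gives ${\lambda=\sqrt2}$ and hence ${\xi = 1/\sqrt2 \approx 0.707}$. Computing the counts ${a_n=\#\mathcal{L}_n}$ from the recursions in Section 2, one can watch the terms ${a_nz^n}$ decay or grow on either side of ${\xi}$ (a numerical illustration only):

```python
# Terms a_n z^n of H(z) for S = {1, 3, 5, ...}, where the radius of
# convergence is xi = 1/sqrt(2) ~ 0.707: the terms decay for z < xi
# and grow for z > xi.
N = 60
R = [0] * (N + 1)
for n in range(1, N + 1):   # recursion (1) from Section 2
    R[n] = 1 + sum(R[n - (k + 1)] for k in range(1, n, 2))
L = [1 + sum(R[1:n + 1]) for n in range(N + 1)]  # a_n = #L_n

def terms(z):
    return [L[n] * z ** n for n in (20, 40, 60)]

print(terms(0.65))  # decreasing: 0.65 < xi
print(terms(0.75))  # increasing: 0.75 > xi
```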

Now fix an ${S}$-gap shift and consider the function

$\displaystyle F_1(z) = \sum_{n\in S} z^{n+1}.$

Our goal is to find a relationship between ${H}$ and ${F_1}$ allowing us to show that the radius of convergence of ${H}$ is given by the positive solution to ${F_1(z)=1}$.

First recall the set ${G = \{0^n1\mid n\in S\}}$. Given integers ${k,n}$, let ${A_n^k}$ be the number of words in ${\{0,1\}^n}$ that can be written as a concatenation of exactly ${k}$ words from ${G}$. Note that ${A_n^1 = {\mathbf{1}}_S(n-1)}$, so that

$\displaystyle F_1(z) = \sum_{n=1}^\infty A_n^1 z^n.$

More generally, we consider the power series

$\displaystyle F_k(z) = \sum_{n=1}^\infty A_n^k z^n.$

A little thought reveals that ${F_k(z)F_\ell(z) = F_{k+\ell}(z)}$, since the coefficient of ${z^n}$ in ${F_k(z)F_\ell(z)}$ is given by

$\displaystyle \sum_{m=1}^n A_{n-m}^k A_m^\ell,$

which is equal to ${A_n^{k+\ell}}$. (Here ${m}$ is the combined length of the last ${\ell}$ words from ${G}$, so that ${n-m}$ is the position where the ${k}$th word ends; note that the decomposition of a word into words from ${G}$ is unique, since each word in ${G}$ contains exactly one 1.) In particular, we get

$\displaystyle F_k(z) = (F_1(z))^k.$
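The convolution identity behind this can be checked directly in code. The sketch below (an illustrative verification with an arbitrary finite sample ${S}$) builds the counts ${A_n^k}$ by stripping the final word ${0^s1}$, which gives the recursion ${A_n^k = \sum_{s\in S} A_{n-(s+1)}^{k-1}}$, and then confirms that ${A_n^{k+\ell} = \sum_m A_{n-m}^k A_m^\ell}$:

```python
# Check the convolution identity A_n^{k+l} = sum_m A_{n-m}^k A_m^l
# for a sample choice of S (any finite sample of gap sizes works).
S = {1, 3, 5, 7}
N, K = 20, 5

# A[k][n] = number of length-n concatenations of exactly k words
# from G = {0^s 1 : s in S}.  Base case A[1][n] = 1_S(n-1); for
# k >= 2, strip the final word 0^s 1.
A = [[0] * (N + 1) for _ in range(K + 1)]
for n in range(1, N + 1):
    A[1][n] = 1 if (n - 1) in S else 0
for k in range(2, K + 1):
    for n in range(1, N + 1):
        A[k][n] = sum(A[k - 1][n - (s + 1)] for s in S if s + 1 <= n)

# Verify the coefficient identity for all k + l <= K
for k in range(1, K):
    for l in range(1, K + 1 - k):
        for n in range(1, N + 1):
            conv = sum(A[k][n - m] * A[l][m] for m in range(1, n))
            assert conv == A[k + l][n]
print("convolution identity verified up to n =", N)
```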

At this point, the natural thing to do is to say that ${\#\mathcal{L}_n = \sum_{k\geq 1} A_n^k}$ and hence ${H(z) = \sum_{k\geq 1} F_k(z)}$. However, this is not quite correct because ${\mathcal{L}_n}$ includes words that are not complete concatenations of words from ${G}$ and so are not counted by any ${A_n^k}$. We return to this in a moment, but first point out that if this were true then we would have ${H(z) = \sum_{k\geq 1} F_1(z)^k}$, and so ${H(z)}$ converges if ${F_1(z)<1}$ and diverges if ${F_1(z)>1}$, which was our goal.

To make this precise, we observe that every word in ${\mathcal{L}_n}$ either contains no 1s, in which case it is ${0^n}$, or can be written as ${0^i1w0^j}$, where ${w}$ begins immediately after the first 1 and ends with the last 1 (so ${w}$ is empty if there is only one 1); since every gap between consecutive 1s lies in ${S}$, the word ${w}$ is a concatenation of ${k\geq 0}$ words from ${G}$. The words with ${k=0}$ are those of the form ${0^i10^j}$, and there are at most ${n}$ of these. Thus we have the bounds

$\displaystyle \sum_{k\geq 1} A_n^k \leq \#\mathcal{L}_n \leq 1 + n + \sum_{k\geq 1} \sum_{i\geq 0} \sum_{j\geq 0} A_{n-1-(i+j)}^k.$

In particular, for all ${z\geq 0}$ we get

$\displaystyle \sum_{k\geq 1} F_k (z) \leq H(z) \leq \sum_{n\geq 1} (n+1) z^n + z\left(\sum_{k\geq 1} F_k (z)\right) \left(\sum_{i\geq 0} z^i \right) \left(\sum_{j\geq 0} z^j\right) \ \ \ \ \ (3)$

Writing ${\hat H(z) = \sum_k F_k(z) = \sum_k F_1(z)^k}$, we note that for every ${z\in [0,1)}$ the remaining factors on the right-hand side of (3) are finite, and so ${\hat H(z)}$ converges if and only if ${H(z)}$ converges. Thus ${\hat H}$ and ${H}$ have the same radius of convergence. But ${\hat H(z)}$ is a geometric series in ${F_1(z)}$, so it converges exactly when ${F_1(z)<1}$; since ${F_1}$ is continuous and strictly increasing on ${[0,1)}$, the radius of convergence of ${H}$ is the unique ${z}$ such that ${F_1(z) = \sum_{n\in S} z^{n+1} = 1}$. By the earlier discussion this radius is ${e^{-h(X)}}$, so setting ${\lambda = e^{h(X)} = 1/z}$ we obtain ${\sum_{n\in S}\lambda^{-(n+1)} = 1}$, proving Theorem 1.

4. An ergodic theory proof

Now we sketch a proof that goes via the variational principle, relating an ${S}$-gap shift (on the finite alphabet ${\{0,1\}}$) to a full shift on a (possibly countable) alphabet. The combinatorial proof above is elementary and requires little advanced machinery; this proof, on the other hand, requires a number of (rather deep) results from ergodic theory and thermodynamic formalism, but has the advantage of illuminating various aspects of the structure of ${S}$-gap shifts.

Let ${X}$ be an ${S}$-gap shift and let ${Y}$ be the set of sequences ${x\in X}$ which contain the symbol 1 infinitely often both forwards and backwards. By the Poincaré recurrence theorem, ${\mu(Y)=1}$ for every ergodic measure other than the delta-measure on the fixed point ${0^\infty}$. Note that ${Y}$ is ${\sigma}$-invariant but not compact (unless ${S}$ is finite).

Let ${Z = Y \cap [1]}$, and let ${F\colon Z\rightarrow Z}$ be the first return map; thus ${F(x) = \sigma^{n+1}(x)}$ for all ${x\in [10^n 1]}$. Note that ${(Z,F)}$ is topologically conjugate to the full shift ${(S^{\mathbb Z},\sigma)}$ on the alphabet ${S}$, which we allow to be finite or countable. The conjugacy is given by the map ${\pi\colon S^{\mathbb Z}\rightarrow Z}$ that takes ${\vec n = \cdots n_{-1}.n_0n_1\cdots}$ to ${\cdots 10^{n_{-1}}.10^{n_0}10^{n_1}\cdots}$.

Given an ergodic ${\sigma}$-invariant probability measure ${\mu}$ on ${X}$ with ${\mu(Y)=1}$, let ${\mu_Z}$ be the induced ${F}$-invariant measure on ${Z}$. Then by Abramov’s formula, we have

$\displaystyle h_\mu(X,\sigma) = \mu(Z) h_{\mu_Z}(Z,F).$

Associate to each such ${\mu}$ the shift-invariant measure ${\nu := (\pi^{-1})_*\mu_Z}$ on ${S^{\mathbb Z}}$. Then we have

$\displaystyle h_\mu(X,\sigma) = \mu(Z) h_{\nu}(S^{\mathbb Z},\sigma).$

Our goal is to relate the topological entropy of ${(X,\sigma)}$ to the topological pressure of a suitable function on ${(S^{\mathbb Z},\sigma)}$. Let ${\varphi\colon S^{\mathbb Z}\rightarrow {\mathbb R}}$ be the function taking ${\vec n}$ to ${n_0 + 1}$; under the conjugacy ${\pi}$, this is exactly the first return time to ${Z}$. If ${\mu}$ and ${\nu}$ are identified as above, then Kac’s lemma gives ${\mu(Z) = \mu([1]) = 1/\int \varphi\,d\nu}$, so that

$\displaystyle h_\nu = \left(\int \varphi\,d\nu \right) h_\mu.$

At this point we elide some technical details regarding Gurevich pressure, etc., and simply remark that for ${t>0}$ we have

$\displaystyle \begin{aligned} P(S^{\mathbb Z},-t\varphi) &= \lim_{k\rightarrow\infty} \frac 1k \log \sum_{(n_0,\dots,n_{k-1})\in S^k} \sup_{\vec m\in [n_0\cdots n_{k-1}]} e^{-t \sum_{i=0}^{k-1} \varphi(\sigma^i\vec m)} \\ &= \lim_{k\rightarrow\infty} \frac 1k \log \left(\sum_{n_0,\dots,n_{k-1}} e^{-t\sum_{i=0}^{k-1} (n_i + 1)} \right) \\ &= \lim_{k\rightarrow\infty} \frac 1k \log \left( \sum_{n\in S} e^{-t(n+1)} \right)^k = \log \sum_{n\in S} e^{-t(n+1)}, \end{aligned}$

while by the variational principle

$\displaystyle P(S^{\mathbb Z},-t\varphi) = \sup_\nu \left(h_\nu - t \int\varphi\,d\nu\right) = \sup_\nu \left(\int\varphi\,d\nu\right) (h_\mu - t).$

We conclude that

$\displaystyle \sup_\mu (h_\mu - t) \mu[1]^{-1} = \log \sum_{n\in S} e^{-t(n+1)}. \ \ \ \ \ (4)$

Let ${t}$ be such that the right-hand side is equal to 0; to prove Theorem 1 we need to prove that ${h(X,\sigma) = t}$. First observe that for every ${\mu}$ we have ${h_\mu - t \leq 0}$, thus ${h(X,\sigma) \leq t}$. For the other inequality, let ${\nu}$ be the Bernoulli measure on ${S^{\mathbb Z}}$ that assigns weight ${e^{-t(n+1)}}$ to the symbol ${n}$. Then ${\nu}$ is an equilibrium state for ${-t\varphi}$ on ${S^{\mathbb Z}}$, and by our choice of ${t}$, we have

$\displaystyle h_\nu - t\int \varphi\,d\nu = 0,$

so that the left-hand side of (4) also vanishes: ${(h_\mu - t)\mu[1]^{-1} = h_\nu - t\int\varphi\,d\nu = 0}$ for the measure ${\mu}$ that corresponds to ${\nu}$, and since ${\mu[1]>0}$ this gives ${h_\mu = t}$. This shows that ${h(X,\sigma) = t}$ and completes the proof of the theorem.
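For ${S=\{1,3,5,\dots\}}$ this can all be made concrete: the right-hand side of (4) vanishes at ${t=\log\sqrt2}$, the Bernoulli weights are ${e^{-t(n+1)} = 2^{-(n+1)/2}}$, and one can verify numerically (truncating the series; an illustration, not part of the proof) that ${h_\nu = t\int\varphi\,d\nu}$ and that ${h_\mu = h_\nu/\int\varphi\,d\nu = t}$:

```python
from math import exp, log

t = log(2) / 2                 # t = log(sqrt(2)): RHS of (4) is 0
S = range(1, 400, 2)           # truncation of S = {1, 3, 5, ...}

p = {n: exp(-t * (n + 1)) for n in S}   # Bernoulli weights on S
total = sum(p.values())                  # should be ~1
h_nu = -sum(q * log(q) for q in p.values())       # entropy of nu
phi_int = sum((n + 1) * q for n, q in p.items())  # int phi d(nu)

print(total)                   # ~1.0
print(h_nu, t * phi_int)       # equal: nu is an equilibrium state
print(h_nu / phi_int)          # ~t = 0.3466: h_mu = t by Abramov
```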

We note that this proof has the advantage of giving an explicit description of the measure of maximal entropy (MME). With a little more work it can be used to show that the MME is unique and has good statistical properties (central limit theorem, etc.).

I'm an assistant professor of mathematics at the University of Houston. I'm interested in dynamical systems, ergodic theory, thermodynamic formalism, dimension theory, multifractal analysis, non-uniform hyperbolicity, and things along those lines.
This entry was posted in ergodic theory, examples.

2 Responses to Entropy of S-gap shifts

1. Steve Kass says:

Hi Vaughan,

In the first paragraph of this post, you define {G} as “the set of all words on two symbols of the form {0^n1} — that is, {n} 0s followed by a single 1,” but you don’t say what {n} can be. You forgot to add “where {n\in S}.” in that definition.

As defined in your first paragraph, the {S}-gap shift doesn’t depend on {S}, but it’s clear from everything that follows (and from definitions elsewhere) that it should.

• Right you are! I’ve corrected it in the post, thanks for pointing that out.