Entropy bounds for equilibrium states

[Update 6/15/17: The original version of this post had a small error in it, which has been corrected in the present version; the definition of {\mathcal{I}_n} in the proof of the main theorem needed to be modified so that each {k_i} is a multiple of {2(\tau+1)}.  Thanks to Leonard Carapezza for pointing this out to me.]

Let {X} be a compact metric space and {f\colon X\rightarrow X} a homeomorphism. Recall that an equilibrium state for a continuous potential function {\varphi\colon X\rightarrow {\mathbb R}} is an {f}-invariant Borel probability measure on {X} maximizing the quantity {h_\mu(f) + \int\varphi\,d\mu} over all invariant probabilities; the topological pressure {P(\varphi)} is the value of this maximum.

A classical result on existence and uniqueness of equilibrium states is due to Bowen, who proved that if {f} is expansive and has specification, and {\varphi} has a bounded distortion property (the “Bowen property”), then there is a unique equilibrium state {\mu_\varphi}. In particular, this applies when {f} is Anosov and {\varphi} is Hölder.

It seems to be well-known among experts that under Bowen’s hypotheses, {\mu_\varphi} must have positive entropy (equivalently, {P(\varphi) > \sup_\mu \int\varphi\,d\mu}), but I do not know of an explicit reference. In this post I provide a proof of this fact, which also gives reasonably concrete bounds on the entropy of {\mu_\varphi}; equivalently, a bound on the size of the gap {P(\varphi) - \sup_\mu \int\varphi\,d\mu}.

1. Definitions and result

First, let’s recall the definitions in the form in which I will need them. Given {x\in X}, {n\in {\mathbb N}}, and {\varepsilon>0}, the Bowen ball around {x} of order {n} and radius {\varepsilon} is the set

\displaystyle B_n(x,\varepsilon) := \{y\in X : d(f^kx, f^ky) < \varepsilon \text{ for all } 0\leq k < n\}.
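
To make the definition concrete, here is a minimal Python sketch that tests membership in a Bowen ball. The choice of Arnold's cat map on the 2-torus with the flat metric is an assumption made purely for illustration (it is a standard Anosov example, not something used elsewhere in this post), and the function names are hypothetical.

```python
import math

def cat_map(p):
    """Arnold's cat map (x, y) -> (2x + y, x + y) mod 1 on the 2-torus (illustrative choice)."""
    x, y = p
    return ((2 * x + y) % 1.0, (x + y) % 1.0)

def torus_dist(p, q):
    """Flat metric on the torus R^2 / Z^2: wrap each coordinate difference, then take the norm."""
    return math.hypot(*(min(abs(a - b), 1.0 - abs(a - b)) for a, b in zip(p, q)))

def in_bowen_ball(f, d, x, y, n, eps):
    """Return True if d(f^k x, f^k y) < eps for all 0 <= k < n, i.e. y lies in B_n(x, eps)."""
    for _ in range(n):
        if d(x, y) >= eps:
            return False
        x, y = f(x), f(y)
    return True

x = (0.1, 0.2)
y = (0.101, 0.2)
print(in_bowen_ball(cat_map, torus_dist, x, y, 5, 0.05))  # True
print(in_bowen_ball(cat_map, torus_dist, x, y, 6, 0.05))  # False
```

Here the distance between the two orbits grows by roughly the unstable eigenvalue of the cat map at each step, so {y} stays in {B_5(x,0.05)} but has left {B_6(x,0.05)}.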

The map {f} has specification if for every {\varepsilon>0} there is {\tau=\tau(\varepsilon)\in {\mathbb N}} such that for every {x_1,\dots, x_k\in X} and {n_1,\dots, n_k\in {\mathbb N}}, there is {x\in X} such that

\displaystyle x\in B_{n_1}(x_1,\varepsilon),\qquad f^{n_1 + \tau}(x)\in B_{n_2}(x_2,\varepsilon),

and in general

\displaystyle f^{n_1 + \tau + \cdots + n_{i-1} + \tau}(x) \in B_{n_i}(x_i,\varepsilon)

for every {1\leq i\leq k}. We refer to {\tau} as the “gluing time”. One could also consider a weaker property in which the gluing times are allowed to vary but must be bounded above by {\tau}; this makes the estimates below more complicated, so for simplicity we will stick with the stronger version.

A function {\varphi\colon X\rightarrow {\mathbb R}} has the Bowen property at scale {\varepsilon} with distortion constant {V} if {V\in {\mathbb R}} is such that

\displaystyle |S_n\varphi(x) - S_n\varphi(y)| \leq V \text{ for all } x\in X\text{ and } y\in B_n(x,\varepsilon),

where {S_n\varphi(x) := \sum_{k=0}^{n-1} \varphi(f^k x)}. We write

\displaystyle \Lambda_n(\varphi,\varepsilon) := \sup_{E\in \mathcal{E}_{n,\varepsilon}} \sum_{x\in E} e^{S_n\varphi(x)},

where {\mathcal{E}_{n,\varepsilon}} is the collection of {(n,\varepsilon)}-separated subsets of {X} (those sets {E\subset X} for which {y\notin B_n(x,\varepsilon)} whenever {x,y\in E}, {x\neq y}). The topological pressure is {P(\varphi) = \lim_{\varepsilon\rightarrow 0} P(\varphi,\varepsilon)}, where

\displaystyle P(\varphi,\varepsilon) = \limsup_{n\rightarrow\infty} \frac 1n \log \Lambda_n(\varphi,\varepsilon).
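
As a sanity check on these definitions, here is a small Python sketch for a toy example that is not part of this post: the full shift on two symbols, with a potential {\varphi} that depends only on the zeroth coordinate (value {a} on symbol 0 and {b} on symbol 1). I am assuming the standard metric on the shift and a suitable scale, so that choosing one point from each cylinder of length {n} gives a maximal {(n,\varepsilon)}-separated set on which {S_n\varphi} is determined by the word. The names `partition_sum`, `a`, `b` are hypothetical.

```python
import itertools
import math

def partition_sum(n, a, b):
    """Sum of exp(S_n phi) over one point per n-cylinder of the full 2-shift,
    where phi(x) = a if x_0 = 0 and phi(x) = b if x_0 = 1 (assumed toy potential)."""
    total = 0.0
    for word in itertools.product((0, 1), repeat=n):
        birkhoff_sum = sum(a if s == 0 else b for s in word)
        total += math.exp(birkhoff_sum)
    return total

a, b = 0.3, -0.5
n = 12
estimate = math.log(partition_sum(n, a, b)) / n   # (1/n) log Lambda_n
exact = math.log(math.exp(a) + math.exp(b))       # closed form log(e^a + e^b)
print(estimate, exact, max(a, b))
```

In this toy case {\sup_\mu \int\varphi\,d\mu = \max(a,b)}, attained at a fixed point, and the gap between pressure and this supremum is {\log(1+e^{-|a-b|})}, which has the same shape as the bound in the theorem below.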

Theorem 1 Let {X} be a compact metric space with diameter {>6\varepsilon}, {f\colon X\rightarrow X} a homeomorphism with specification at scale {\varepsilon} with gluing time {\tau}, and {\varphi\colon X\rightarrow {\mathbb R}} a potential with the Bowen property at scale {\varepsilon} with distortion constant {V}. Let

\displaystyle \Delta = \frac{\log(1+e^{-(V+2(2\tau+1)\|\varphi\|)})}{2(\tau+1)}

where {\|\varphi\| = \sup_{x\in X} |\varphi(x)|}. Then we have

\displaystyle P(\varphi) \geq P(\varphi,\varepsilon) \geq \Big( \sup_\mu \int\varphi\,d\mu\Big) + \Delta. \ \ \ \ \ (1)


In particular, if {\mu} is an equilibrium state for {\varphi}, then we have {h_\mu(f) \geq \Delta > 0}.
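
To get a feel for the size of {\Delta}, the formula can be evaluated directly. The following is a minimal sketch; the function name and the sample values of {V}, {\tau}, {\|\varphi\|} are hypothetical and chosen only for illustration.

```python
import math

def entropy_gap(V, tau, phi_norm):
    """Delta from Theorem 1: log(1 + exp(-(V + 2(2 tau + 1) ||phi||))) / (2 (tau + 1))."""
    Q = V + 2 * (2 * tau + 1) * phi_norm
    # log1p keeps precision when exp(-Q) is tiny
    return math.log1p(math.exp(-Q)) / (2 * (tau + 1))

# Hypothetical sample values, not taken from any particular system
print(entropy_gap(V=1.0, tau=3, phi_norm=0.5))
print(entropy_gap(V=0.0, tau=0, phi_norm=0.0))  # equals log(2) / 2
```

The bound decays exponentially in {V} and {\tau\|\varphi\|}, and only like {1/\tau} when the potential is zero.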

2. Consequence for Anosov diffeomorphisms

Before proving the theorem we point out a useful corollary. If {M} is a compact manifold and {f\colon M\rightarrow M} is a topologically mixing {C^1} Anosov diffeomorphism, then {f} has specification at every scale (similar results apply in the Axiom A case). Moreover, every Hölder continuous potential has the Bowen property, and thus Theorem 1 applies.

For an Anosov diffeo, the constants {V} and {\tau} in (1) can be controlled by the following factors (here we fix a small {\varepsilon>0}):

  1. the rates of contraction and expansion along the stable and unstable directions, given in terms of {C,\lambda>0} such that {\|Df^n_x(v^s)\| \leq C e^{-\lambda n}\|v^s\|} for all {n\geq 0} and {v^s\in E^s}, and similarly for {v^u\in E^u} and {n\leq 0};
  2. how quickly unstable manifolds become dense, in other words, the value of {R>0} such that {W_R^u(x)} is {\varepsilon}-dense for every choice of {x};
  3. the angle between stable and unstable directions, which controls the local product structure, in particular via a constant {K>0} such that {d(x,y) < \varepsilon} implies that {W_{K\varepsilon}^s(x)} intersects {W_{K\varepsilon}^u(y)} in a unique point {z}, and the leafwise distances from {x,y} to {z} are at most {K d(x,y)};
  4. the Hölder exponent ({\beta}) and constant ({|\varphi|_\beta}) for the potential {\varphi}.

For the specification property for an Anosov diffeo, {\tau =\tau(\varepsilon)} is determined by the condition that {C^{-1}e^{\lambda\tau}(\varepsilon/K) > R}, so that small pieces of unstable manifold expand to become {\varepsilon}-dense within {\tau} iterates; thus we have

\displaystyle \tau(\varepsilon) \approx \lambda^{-1} \log(R(\varepsilon) KC\varepsilon^{-1}).

For the Bowen property, one compares {S_n\varphi(x)} and {S_n\varphi(y)} by comparing each to {S_n\varphi(z)}, where {z} is the (Smale bracket) intersection point coming from the local product structure. Standard estimates give {d(f^j x, f^jz) \leq CK\varepsilon e^{-\lambda j}}, so the Hölder property gives

\displaystyle \begin{aligned} |S_n\varphi(x) - S_n\varphi(z)| &\leq \sum_{j=0}^{n-1} |\varphi(f^j x) - \varphi(f^j z)| \leq \sum_{j=0}^{n-1} |\varphi|_\beta d(f^jx,f^jz)^\beta \\ &\leq |\varphi|_\beta \sum_{j=0}^\infty (CK\varepsilon)^\beta e^{-\lambda\beta j} = |\varphi|_\beta (CK\varepsilon)^\beta (1-e^{-\lambda\beta})^{-1}. \end{aligned}

A similar estimate for {|S_n\varphi(y) - S_n\varphi(z)|} gives

\displaystyle V = 2(CK\varepsilon)^\beta(1- e^{-\lambda\beta})^{-1} |\varphi|_\beta.
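
The geometric series estimate above is easy to check numerically. The sketch below compares twice the finite sum {\sum_{j<n} |\varphi|_\beta (CK\varepsilon e^{-\lambda j})^\beta} with the closed form for {V}; all parameter values and function names are hypothetical.

```python
import math

def bowen_constant(C, K, eps, lam, beta, holder_seminorm):
    """V = 2 (C K eps)^beta (1 - e^{-lam beta})^{-1} |phi|_beta."""
    return 2 * (C * K * eps) ** beta * holder_seminorm / (1 - math.exp(-lam * beta))

def partial_holder_sum(C, K, eps, lam, beta, holder_seminorm, n):
    """Sum over 0 <= j < n of |phi|_beta (C K eps e^{-lam j})^beta."""
    return sum(holder_seminorm * (C * K * eps * math.exp(-lam * j)) ** beta
               for j in range(n))

# Hypothetical constants for illustration only
params = dict(C=2.0, K=3.0, eps=0.01, lam=0.7, beta=0.5, holder_seminorm=4.0)
V = bowen_constant(**params)
for n in (1, 5, 20, 100):
    # the factor 2 accounts for comparing both x and y to the bracket point z
    print(n, 2 * partial_holder_sum(n=n, **params), V)
```

The partial sums increase with {n} and converge to {V}, so {V} is a distortion bound that is uniform in {n}.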

Thus Theorem 1 has the following consequence for Anosov diffeomorphisms.

Corollary 2 Let {f} be a topologically mixing Anosov diffeomorphism on {M} and {C,\lambda,\varepsilon,R,K} the quantities above. Let

\displaystyle \delta = \frac{\lambda}{2\log(RKC\varepsilon^{-1})}.

Given a {\beta}-Hölder potential {\varphi\colon M\rightarrow {\mathbb R}}, consider the quantity

\displaystyle Q(\varphi) := 2(CK\varepsilon)^\beta (1-e^{-\lambda\beta})^{-1} |\varphi|_\beta + 5\lambda^{-1} \log(RKC\varepsilon^{-1})\|\varphi\|.

Then we have

\displaystyle P(\varphi) \geq P(\varphi,\varepsilon) \geq \Big(\sup_\mu \int\varphi\,d\mu\Big) + \delta \log(1+e^{-Q(\varphi)})

so that in particular, if {\mu} is an equilibrium state for {\varphi}, then

\displaystyle h_\mu(f) \geq \delta \log(1+e^{-Q(\varphi)}) > 0.
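
For readers who want numbers, here is a minimal sketch that evaluates {\delta}, {Q(\varphi)} and the resulting lower bound from Corollary 2. The function name and the sample constants are hypothetical, and the sketch assumes {RKC\varepsilon^{-1} > 1} so that the logarithm is positive.

```python
import math

def corollary2_bound(C, lam, eps, R, K, beta, holder_seminorm, sup_norm):
    """Return (delta, Q, delta * log(1 + e^{-Q})) using the formulas of Corollary 2."""
    L = math.log(R * K * C / eps)          # log(R K C / eps), assumed positive
    delta = lam / (2 * L)
    Q = (2 * (C * K * eps) ** beta * holder_seminorm / (1 - math.exp(-lam * beta))
         + 5 * L * sup_norm / lam)
    return delta, Q, delta * math.log1p(math.exp(-Q))

# Hypothetical constants for illustration only
consts = dict(C=2.0, lam=0.9, eps=0.01, R=50.0, K=3.0, beta=0.5)
print(corollary2_bound(holder_seminorm=1.0, sup_norm=0.2, **consts))
print(corollary2_bound(holder_seminorm=0.0, sup_norm=0.0, **consts))  # bound is delta * log 2
```

For the zero potential the bound reduces to {\delta\log 2}, a lower bound on the topological entropy in terms of the hyperbolicity constants alone.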

Finally, note that since shifting the value of {\varphi} by a constant does not change its equilibrium states, we can assume without loss of generality that {\|\varphi\| \leq (\mathrm{diam}\, M)^\beta |\varphi|_\beta} and write the following consequence of the above, which is somewhat simpler in appearance.

Corollary 3 Let {M} be a compact manifold and {f\colon M\rightarrow M} a topologically mixing Anosov diffeomorphism. For every {\beta>0} there are constants {\delta = \delta(M,f)>0} and {R = R(M,f,\beta)} such that for every {\beta}-Hölder potential {\varphi}, we have

\displaystyle P(\varphi) \geq \Big(\sup_\mu \int\varphi\,d\mu\Big) + \delta e^{-R|\varphi|_\beta}

so that as before, if {\mu} is an equilibrium state for {\varphi}, we have

\displaystyle h_\mu(f) \geq \delta e^{-R|\varphi|_\beta} > 0.

This corollary gives a precise bound on how the entropy of a family of equilibrium states can decay as the Hölder semi-norms {|\varphi|_\beta} of the corresponding potentials become large. To put it another way, given any threshold {h_0>0}, this gives an estimate on how large {|\varphi|_\beta} must be before {\varphi} can have an equilibrium state with entropy below {h_0}.
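
The threshold statement can be made explicit by solving {\delta e^{-R|\varphi|_\beta} = h_0} for {|\varphi|_\beta}. The sketch below does this; the function name and the sample values of {\delta}, {R}, {h_0} are hypothetical.

```python
import math

def seminorm_threshold(delta, R, h0):
    """Value of |phi|_beta at which the bound delta * exp(-R |phi|_beta) equals h0.
    Below this value, Corollary 3 forces the entropy of an equilibrium state to be at least h0."""
    if h0 >= delta:
        return 0.0  # the bound never exceeds delta, so it gives no information here
    return math.log(delta / h0) / R

s = seminorm_threshold(delta=0.1, R=2.0, h0=0.01)
print(s)
```

In other words, an equilibrium state can have entropy below {h_0 < \delta} only if {|\varphi|_\beta > R^{-1}\log(\delta/h_0)}.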

3. Proof of the theorem

We spend the rest of the post proving Theorem 1. Fix {x\in X} and consider for each {n\in {\mathbb N}} the orbit segment {x, f(x), \dots, f^{n-1}(x)}. Fix {\alpha\in (0,\frac 12]}. Let {m_n = \lceil \frac{\alpha n}{2(\tau+1)} \rceil}, and let

\displaystyle \mathcal{I}_n = \{ 0 < k_1 < k_2 < \cdots < k_{m_n} < n : k_i \in 2(\tau+1){\mathbb N} \ \forall i\}.

Write {k_0 = 0} and {k_{m_n + 1} = n}. The idea is that for each {\vec k\in \mathcal{I}_n}, we will use the specification property to construct a point {\pi(\vec k) \in X} whose orbit shadows the orbit of {x} from time {0} to time {n}, except for the times {k_i}, at which it deviates briefly; thus the points {\pi(\vec k)} will be {(n,\varepsilon)}-separated on the one hand, and on the other hand will have ergodic averages close to that of {x}.
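
For small parameters the index set {\mathcal{I}_n} can be enumerated directly, which may help in visualizing the construction. This is a minimal sketch with hypothetical names and sample values of {n}, {\tau}, {\alpha}.

```python
import itertools
import math

def index_sets(n, tau, alpha):
    """Return (m_n, list of all increasing tuples (k_1, ..., k_{m_n}) with
    0 < k_i < n and each k_i a multiple of 2 (tau + 1))."""
    step = 2 * (tau + 1)
    m_n = math.ceil(alpha * n / step)
    candidates = range(step, n, step)      # multiples of step strictly between 0 and n
    return m_n, list(itertools.combinations(candidates, m_n))

m_n, I_n = index_sets(n=22, tau=1, alpha=0.5)
print(m_n, len(I_n))   # 3 10
print(I_n[:3])
```

With {n=22}, {\tau=1}, {\alpha=\frac12}, the admissible times are {4,8,12,16,20}, we have {m_n=3}, and there are {{5\choose 3} = 10} index vectors.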

First we estimate {\#\mathcal{I}_n} from below; this requires a lower bound on {{n\choose \ell}}. Integrating {\log t} over {[1,k]} and {[1,k+1]} gives

\displaystyle k\log k - k + 1 \leq \log(k!) \leq k\log k - k + 1 + \log(k+1),

and thus we have

\displaystyle \begin{aligned} \log{n\choose \ell} &= \log(n!) - \log(\ell!) - \log((n-\ell)!) \\ &\geq n\log n - 1 - \ell\log\ell - (n-\ell)\log(n-\ell) - \log((\ell+1)(n-\ell+1)) \\ &\geq h\big( \tfrac\ell n\big) n - 2\log n, \end{aligned}

where {h(\delta) = -\delta\log\delta - (1-\delta)\log(1-\delta)}, and the last inequality holds once {n\geq 10}, which is all we need since we will eventually send {n\rightarrow\infty}. The function {h} is increasing on {(0,\frac12)}, so

\displaystyle \begin{aligned} \log\#\mathcal{I}_n &\geq \log{\lfloor \frac{n}{2(\tau+1)}\rfloor \choose m_n} \geq h(\tfrac {2(\tau+1) m_n} n) \frac{n}{2(\tau+1)} - 2\log \frac{n}{2(\tau+1)} \\ &\geq \frac{h(\alpha)}{2(\tau+1)} n - 2\log n. \end{aligned} \ \ \ \ \ (2)
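
The binomial estimate {\log{n\choose\ell} \geq h(\frac\ell n)n - 2\log n} used here can be checked numerically on a finite range. The sketch below is only a sanity check, not a proof, and the function names are hypothetical.

```python
import math

def h(delta):
    """Binary entropy function with natural logarithms."""
    if delta in (0.0, 1.0):
        return 0.0
    return -delta * math.log(delta) - (1 - delta) * math.log(1 - delta)

def bound_holds(n, l):
    """Check log C(n, l) >= h(l / n) * n - 2 log n for a single pair (n, l)."""
    return math.log(math.comb(n, l)) >= h(l / n) * n - 2 * math.log(n)

all_ok = all(bound_holds(n, l) for n in range(2, 301) for l in range(1, n))
print(all_ok)
```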


Given {k\in \{0, \dots, n-1\}}, let {y_k \in X} be any point with {d(f^k(x),y_k) > 3\varepsilon} (using the assumption on the diameter of {X}). Now for every {\vec{k}\in \mathcal{I}_n}, the specification property guarantees the existence of a point {\pi(\vec{k})\in X} with the property that

\displaystyle \begin{aligned} \pi(\vec{k}) &\in B_{k_1-\tau}(x,\varepsilon), \\ f^{k_1}(\pi(\vec{k})) &\in B(y_{k_1},\varepsilon), \\ f^{k_1+\tau+1}(\pi(\vec{k})) &\in B_{k_2 - k_1 - 2\tau - 1}(f^{k_1 + \tau+1}(x),\varepsilon), \end{aligned}

and so on, so that in general for any {0\leq i \leq m_n} we have

\displaystyle \begin{aligned} f^{k_i + \tau + 1}(\pi(\vec{k})) &\in B_{k_{i+1} - k_i - 2\tau - 1}(f^{k_i + \tau + 1}(x),\varepsilon), \\ f^{k_{i+1}}(\pi(\vec{k})) &\in B(y_{k_{i+1}},\varepsilon). \end{aligned} \ \ \ \ \ (3)


Write {j_i = k_{i+1} - k_i - 2\tau - 1}; then the first inclusion in (3), together with the Bowen property, gives

\displaystyle |S_{j_i} \varphi(f^{k_i + \tau + 1}(x)) - S_{j_i} \varphi(f^{k_i + \tau + 1} (\pi (\vec{k})))| \leq V.

Now observe that for any {y\in X} we have

\displaystyle \bigg| S_n \varphi(y) - \sum_{i=0}^{m_n} S_{k_{i+1} - k_i - 2\tau-1}\varphi(f^{k_i + \tau + 1} y) \bigg| \leq (2\tau + 1)m_n \|\varphi\|.

We conclude that

\displaystyle |S_n\varphi(\pi(\vec{k})) - S_n\varphi(x)| \leq m_n(V + 2(2\tau+1)\|\varphi\|). \ \ \ \ \ (4)


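Consider the set {\pi(\mathcal{I}_n) \subset X}. The second inclusion in (3) guarantees that this set is {(n,\varepsilon)}-separated. Indeed, given any {\vec{k} \neq \vec{k}' \in \mathcal{I}_n}, we can take {i} to be minimal such that {k_i \neq k_i'}, assume without loss of generality that {k_i' < k_i}, and let {j=k_i'}. Then {f^j(\pi(\vec{k}')) \in B(y_j, \varepsilon)}, while {f^j(\pi(\vec{k})) \in B(f^j(x),\varepsilon)} because {j} lies in a stretch of time during which {\pi(\vec{k})} shadows {x} (this is where we use the fact that each {k_i} is a multiple of {2(\tau+1)}). Since {d(y_j,f^j(x)) > 3\varepsilon}, this gives {d(f^j(\pi(\vec{k})), f^j(\pi(\vec{k}'))) > \varepsilon}, and hence {\pi(\vec{k}') \notin B_n(\pi(\vec{k}),\varepsilon)}.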

Using this fact and the bounds in (4) and (2), we conclude that

\displaystyle \begin{aligned} \Lambda_n(\varphi,\varepsilon) &\geq \sum_{\vec{k} \in \mathcal{I}_n} e^{S_n\varphi(\pi(\vec{k}))} \\ &\geq (\#\mathcal{I}_n) \exp\big(S_n\varphi(x) - m_n(V+2(2\tau+1)\|\varphi\|)\big) \\ &\geq n^{-2} \exp\big(S_n \varphi(x) + \tfrac {h(\alpha)}{2(\tau+1)} n - (\tfrac{\alpha}{2(\tau+1)} n + 1)(V+2(2\tau+1)\|\varphi\|)\big). \end{aligned}

Taking logs, dividing by {n}, and sending {n\rightarrow\infty} gives

\displaystyle P(\varphi,\varepsilon) \geq \Big(\limsup_{n\rightarrow\infty} \frac 1n S_n\varphi(x) \Big) + \frac 1{2(\tau+1)} \Big(h(\alpha) - \alpha(V+2(2\tau+1)\|\varphi\|) \Big).

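Given any ergodic {\mu}, we can take a generic point {x} for {\mu} and conclude that the lim sup in the above expression is equal to {\int\varphi\,d\mu}. Since the supremum of {\int\varphi\,d\mu} over all invariant measures agrees with the supremum over ergodic measures (by the ergodic decomposition), it suffices to bound the difference {P(\varphi,\varepsilon) - \int\varphi\,d\mu} for ergodic {\mu}. To this end we want to choose the value of {\alpha \in (0,\frac 12]} that maximizes {h(\alpha) - \alpha Q}, where {Q=V+2(2\tau+1)\|\varphi\|}.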

A straightforward differentiation and some routine algebra show that {\frac d{d\alpha} (h(\alpha) - \alpha Q) = 0} occurs when {\alpha = (1+e^Q)^{-1}}, at which point we have {h(\alpha) - \alpha Q = \log(1+e^{-Q})}. Since {Q\geq 0}, this value of {\alpha} lies in {(0,\frac 12]}, and dividing by {2(\tau+1)} gives the bound in (1), proving Theorem 1.
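
The optimization over {\alpha} is easy to double-check numerically. The following sketch compares the closed-form optimum with a brute-force search over a grid in {(0,\frac12]}; the function names and the sample values of {Q} are hypothetical.

```python
import math

def h(alpha):
    """Binary entropy with natural logarithms, for 0 < alpha < 1."""
    return -alpha * math.log(alpha) - (1 - alpha) * math.log(1 - alpha)

def objective(alpha, Q):
    return h(alpha) - alpha * Q

def optimum(Q):
    """Critical point alpha* = 1 / (1 + e^Q) and the claimed maximum log(1 + e^{-Q})."""
    return 1 / (1 + math.exp(Q)), math.log1p(math.exp(-Q))

for Q in (0.0, 0.5, 2.0, 8.0):
    alpha_star, value = optimum(Q)
    # brute-force maximum over the grid alpha = k / 10000, 0 < alpha <= 1/2
    grid_max = max(objective(k / 10000, Q) for k in range(1, 5001))
    print(Q, alpha_star, value, grid_max)
```

Since {h} is concave, the critical point is a global maximum, and the grid maximum should never exceed {\log(1+e^{-Q})}.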

About Vaughn Climenhaga

I'm an assistant professor of mathematics at the University of Houston. I'm interested in dynamical systems, ergodic theory, thermodynamic formalism, dimension theory, multifractal analysis, non-uniform hyperbolicity, and things along those lines.