This post is based on notes from Matt Nicol’s talk at the UH summer school in dynamical systems. The goal is to present the ideas behind a proof of the central limit theorem for dynamical systems using martingale approximations.
1. Conditional expectation
Before we can define and use martingales, we must recall the definition of conditional expectation. Let $(X, \mathcal{B}, \mu)$ be a probability space, with $\mu$ defined on a $\sigma$-algebra $\mathcal{B}$. Let $\mathcal{F}$ be a sub-$\sigma$-algebra of $\mathcal{B}$.
Example 1 Consider the doubling map $T \colon [0,1] \to [0,1]$ given by $Tx = 2x \pmod 1$. Let $\mu$ be Lebesgue measure, $\mathcal{B}$ the Borel $\sigma$-algebra, and $\mathcal{F} = T^{-1}\mathcal{B} = \{T^{-1}B : B \in \mathcal{B}\}$. Then $\mathcal{F}$ is a sub-$\sigma$-algebra of $\mathcal{B}$, consisting of precisely those sets in $\mathcal{B}$ which are unions of preimage sets; that is, those sets $E$ for which a point $x$ is in $E$ if and only if $x + \frac{1}{2} \pmod 1$ is in $E$.
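This membership criterion is easy to test numerically. The following sketch (my own illustration, not from the talk; the interval $[\frac{1}{5}, \frac{7}{10})$ is an arbitrary choice) verifies that a preimage set under the doubling map is invariant under $x \mapsto x + \frac{1}{2} \pmod 1$, using exact rational arithmetic to avoid floating-point boundary issues:

```python
# Sanity check: membership in a preimage set E = T^{-1}(B) is invariant under
# x -> x + 1/2 (mod 1).  Exact rationals avoid boundary rounding issues.
from fractions import Fraction

def T(x):
    """Doubling map T(x) = 2x mod 1 (exact on Fraction inputs)."""
    return (2 * x) % 1

def in_E(x):
    """Membership in E = T^{-1}([1/5, 7/10)); the interval is arbitrary."""
    return Fraction(1, 5) <= T(x) < Fraction(7, 10)

for k in range(1000):
    x = Fraction(k, 1000)
    assert in_E(x) == in_E((x + Fraction(1, 2)) % 1)
```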
This example extends naturally to yield a decreasing sequence of $\sigma$-algebras
$$\mathcal{B} \supset T^{-1}\mathcal{B} \supset T^{-2}\mathcal{B} \supset \cdots.$$
Given a sub-$\sigma$-algebra $\mathcal{F} \subset \mathcal{B}$ and a random variable $\varphi \colon X \to \mathbb{R}$ that is measurable with respect to $\mathcal{B}$, the conditional expectation of $\varphi$ given $\mathcal{F}$ is any random variable $\psi$ such that
- $\psi$ is $\mathcal{F}$-measurable (that is, $\psi^{-1}(I) \in \mathcal{F}$ for every interval $I \subset \mathbb{R}$), and
- $\int_F \psi \, d\mu = \int_F \varphi \, d\mu$ for every $F \in \mathcal{F}$.
It is not hard to show that these conditions determine $\psi$ up to a set of $\mu$-measure zero, and so the conditional expectation is uniquely defined as a random variable (modulo almost-sure equivalence). We write it as $E(\varphi \mid \mathcal{F})$.
A key property is that conditional expectation is linear: for every $\varphi_1, \varphi_2 \in L^1(X, \mathcal{B}, \mu)$, every $a, b \in \mathbb{R}$, and every sub-$\sigma$-algebra $\mathcal{F}$, we have
$$E(a\varphi_1 + b\varphi_2 \mid \mathcal{F}) = a\,E(\varphi_1 \mid \mathcal{F}) + b\,E(\varphi_2 \mid \mathcal{F}).$$
Example 2 If $\varphi$ is already $\mathcal{F}$-measurable, then $E(\varphi \mid \mathcal{F}) = \varphi$.
Example 3 At the other extreme, if $\varphi$ and $\mathcal{F}$ are independent (that is, if $\mu(\varphi^{-1}(I) \cap F) = \mu(\varphi^{-1}(I))\,\mu(F)$ for every interval $I$ and every $F \in \mathcal{F}$), then $E(\varphi \mid \mathcal{F})$ is the constant function $\int \varphi \, d\mu$.
Example 4 Suppose $\{F_n\}$ is a countable partition of $X$ such that $\mu(F_n) > 0$ for every $n$. Let $\mathcal{F}$ be the $\sigma$-algebra generated by the sets $F_n$. Then
$$E(\varphi \mid \mathcal{F})(x) = \frac{1}{\mu(F_n)} \int_{F_n} \varphi \, d\mu \quad \text{for } x \in F_n.$$
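For a finite toy version of this formula, take $X = \{0, \dots, 11\}$ with uniform measure, $\varphi(x) = x^2$, and a three-cell partition (all of these choices are mine, purely for illustration); the conditional expectation is then the cell-wise average, and the defining integral property can be checked directly:

```python
# Toy version of Example 4: X = {0,...,11} with uniform measure, phi(x) = x^2,
# and a three-cell partition (all choices here are illustrative).
X = range(12)
phi = [x * x for x in X]
cells = [range(0, 4), range(4, 6), range(6, 12)]

# E(phi | F)(x) is the average of phi over the cell containing x.
cond = [0.0] * 12
for F in cells:
    avg = sum(phi[x] for x in F) / len(F)
    for x in F:
        cond[x] = avg

# Defining property: the integrals of cond and phi agree over every cell of the
# partition (and hence over every set in the sigma-algebra they generate).
for F in cells:
    assert abs(sum(cond[x] for x in F) - sum(phi[x] for x in F)) < 1e-9
```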
2. Martingales
Now we can define martingales, which are a particular sort of stochastic process (a sequence of random variables) with "enough independence" to generalise results from the IID case.
Definition 1 A sequence of random variables $S_1, S_2, \dots$ is a martingale if
- $E|S_n| < \infty$ for all $n$;
- there is an increasing sequence of $\sigma$-algebras $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \cdots$ (a filtration) such that $S_n$ is measurable with respect to $\mathcal{F}_n$;
- the conditional expectations satisfy $E(S_{n+1} \mid \mathcal{F}_n) = S_n$.
The first condition guarantees that everything is in $L^1$. If $\mathcal{F}_n$ is taken to be the $\sigma$-algebra of events that are determined by the first $n$ outcomes of a sequence of experiments, then the second condition states that $S_n$ only depends on those first $n$ outcomes, while the third condition requires that if the first $n$ outcomes are known, then the expected value of the increment $S_{n+1} - S_n$ is 0.
Example 5 Let $X_1, X_2, \dots$ be a sequence of fair coin flips, that is, IID random variables taking the values $\pm 1$ with equal probability. Let $S_n = X_1 + \cdots + X_n$. As suggested in the previous paragraph, let $\mathcal{F}_n$ be the smallest $\sigma$-algebra with respect to which $X_1, \dots, X_n$ are all measurable. (The sets in $\mathcal{F}_n$ are precisely those events which are determined by knowing the values of $X_1, \dots, X_n$.)
It is easy to see that $S_n$ satisfies the first two properties of a martingale, and for the third, we use linearity of conditional expectation and the definition of $\mathcal{F}_n$ to get
$$E(S_{n+1} \mid \mathcal{F}_n) = E(S_n \mid \mathcal{F}_n) + E(X_{n+1} \mid \mathcal{F}_n) = S_n + E(X_{n+1}) = S_n,$$
using that $S_n$ is $\mathcal{F}_n$-measurable (Example 2) and that $X_{n+1}$ is independent of $\mathcal{F}_n$ (Example 3).
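One can also check this computation empirically: fix a prefix of outcomes (the one below is an arbitrary choice of mine) and average $S_{n+1}$ over many independent draws of the next flip.

```python
# Empirical check: fix the first n outcomes and average S_{n+1} over many
# independent draws of the next coin flip X_{n+1}.
import random

random.seed(0)
prefix = [1, -1, 1, 1, -1]       # an arbitrary known prefix of n = 5 flips
S_n = sum(prefix)

trials = 100_000
avg = sum(S_n + random.choice((-1, 1)) for _ in range(trials)) / trials

# E(S_{n+1} | F_n) = S_n + E(X_{n+1}) = S_n, since E(X_{n+1}) = 0.
assert abs(avg - S_n) < 0.02
```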
When $X_1, X_2, \dots$ is a sequence of random variables for which $S_n = X_1 + \cdots + X_n$ is a martingale, we say that the sequence $X_n$ is a martingale difference sequence.
In the previous example the martingale property (the third condition) was a direct consequence of the fact that the random variables $X_n$ were IID. However, there are examples where the martingale differences are not IID.
Example 6 Polya’s urn is a stochastic process defined as follows. Consider an urn containing some number of red and blue balls. At each step, a single ball is drawn at random from the urn, and then returned to the urn, along with a new ball that matches the colour of the one drawn. Let $S_n$ be the fraction of the balls that are red after the $n$th iteration of this process.
Clearly the sequence of random variables $S_n$ is neither independent nor identically distributed. However, it is a martingale, as the following computation shows: suppose that at time $n$ there are $r$ red balls and $b$ blue balls in the urn. (This knowledge represents knowing which element of the partition generating $\mathcal{F}_n$ we are in.) Then at time $n+1$, there will be $r+1$ red balls with probability $\frac{r}{r+b}$, and $r$ red balls with probability $\frac{b}{r+b}$. Either way, there will be $r+b+1$ total balls, and so the expected fraction of red balls is
$$E(S_{n+1} \mid \mathcal{F}_n) = \frac{r}{r+b} \cdot \frac{r+1}{r+b+1} + \frac{b}{r+b} \cdot \frac{r}{r+b+1} = \frac{r(r+b+1)}{(r+b)(r+b+1)} = \frac{r}{r+b} = S_n.$$
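Since the computation above is exact, it can be verified for many urn configurations with rational arithmetic (a small sketch of mine, not part of the talk):

```python
# Exact verification of the urn computation for small urns.
from fractions import Fraction

def expected_next_fraction(r, b):
    """E(S_{n+1} | F_n) given r red and b blue balls at time n."""
    r, b = Fraction(r), Fraction(b)
    p_red = r / (r + b)                       # probability of drawing red
    # Draw red: (r+1)/(r+b+1) red afterwards; draw blue: r/(r+b+1).
    return p_red * (r + 1) / (r + b + 1) + (1 - p_red) * r / (r + b + 1)

for r in range(1, 10):
    for b in range(1, 10):
        assert expected_next_fraction(r, b) == Fraction(r, r + b)   # = S_n
```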
If we assume that the martingale differences $X_n = S_n - S_{n-1}$ are stationary (that is, identically distributed) and ergodic, then we have the following central limit theorem for martingales, from a 1974 paper of McLeish (we follow some notes by S. Sethuraman for the statement).
Theorem 2 (Martingale CLT) Let $X_n$ be a stationary ergodic sequence of martingale differences with $\sigma^2 = E(X_n^2) < \infty$. Then $\frac{1}{\sqrt{n}}(X_1 + \cdots + X_n)$ converges in distribution to the normal distribution $\mathcal{N}(0, \sigma^2)$.
More sophisticated versions of this result are available, but this simple version will suffice for our needs.
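To see the theorem in action in the simplest case, one can simulate the coin-flip martingale of Example 5, where $\sigma^2 = E(X_n^2) = 1$ (the sample sizes below are arbitrary choices of mine):

```python
# Simulation: for the coin-flip martingale of Example 5, sigma^2 = 1, so
# S_n / sqrt(n) should be approximately N(0, 1) for large n.
import random
import statistics

random.seed(1)
n, trials = 400, 2000
samples = [sum(random.choice((-1, 1)) for _ in range(n)) / n ** 0.5
           for _ in range(trials)]

assert abs(statistics.mean(samples)) < 0.1          # mean close to 0
assert abs(statistics.stdev(samples) - 1.0) < 0.1   # standard deviation close to 1
```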
3. Koopman operator and transfer operator
Now we want to apply Theorem 2 to a dynamical system $T \colon X \to X$ with an ergodic invariant measure $\mu$ by taking $X_n = \varphi \circ T^n$ for some observable $\varphi \colon X \to \mathbb{R}$.
To this end, recall the Koopman operator $U$ given by $U\varphi = \varphi \circ T$, and the transfer operator $\mathcal{L}$, which is its dual: $\mathcal{L}$ is defined by the relation
$$\int (\mathcal{L}\varphi) \cdot \psi \, d\mu = \int \varphi \cdot (U\psi) \, d\mu$$
for all $\varphi \in L^1$ and $\psi \in L^\infty$. The key result for our purposes is that the operators $U$ and $\mathcal{L}$ are one-sided inverses of each other.
Proposition 3 The Koopman and transfer operators satisfy
- $\mathcal{L} U \varphi = \varphi$ for every $\varphi \in L^1$;
- $U \mathcal{L} \varphi = E(\varphi \mid T^{-1}\mathcal{B})$, where $\mathcal{B}$ is the $\sigma$-algebra on which $\mu$ is defined.
Proof: For the first claim, we see that for all $\psi \in L^\infty$ we have
$$\int (\mathcal{L} U \varphi) \cdot \psi \, d\mu = \int (U\varphi) \cdot (U\psi) \, d\mu = \int \varphi \, \psi \, d\mu,$$
where the first equality uses the definition of $\mathcal{L}$ and the second uses the fact that $\mu$ is $T$-invariant (note that $(U\varphi)(U\psi) = (\varphi\psi) \circ T$). To prove the second claim, we first observe that given an interval $I \subset \mathbb{R}$, we have
$$(U \mathcal{L} \varphi)^{-1}(I) = T^{-1}\big((\mathcal{L}\varphi)^{-1}(I)\big) \in T^{-1}\mathcal{B},$$
so $U\mathcal{L}\varphi$ is $T^{-1}\mathcal{B}$-measurable, and it remains only to show that
$$\int_{T^{-1}B} U \mathcal{L} \varphi \, d\mu = \int_{T^{-1}B} \varphi \, d\mu \quad \text{for all } B \in \mathcal{B}. \qquad (2)$$
This follows from a similar computation to the one above: given $B \in \mathcal{B}$ we have
$$\int_{T^{-1}B} U \mathcal{L} \varphi \, d\mu = \int (U \mathbf{1}_B) \cdot (U \mathcal{L} \varphi) \, d\mu = \int \mathbf{1}_B \cdot \mathcal{L}\varphi \, d\mu = \int (U \mathbf{1}_B) \cdot \varphi \, d\mu = \int_{T^{-1}B} \varphi \, d\mu,$$
which establishes (2) and completes the proof. $\square$
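For the doubling map the transfer operator has the explicit form $(\mathcal{L}\varphi)(x) = \frac{1}{2}\big(\varphi(\frac{x}{2}) + \varphi(\frac{x+1}{2})\big)$, an average over the two preimages of $x$, and both claims of Proposition 3 can then be checked pointwise (the test observable below is an arbitrary choice of mine):

```python
# Pointwise check of Proposition 3 for the doubling map.
import math

def T(x): return (2 * x) % 1.0
def U(phi): return lambda x: phi(T(x))                              # Koopman operator
def L(phi): return lambda x: 0.5 * (phi(x / 2) + phi((x + 1) / 2))  # transfer operator

phi = lambda x: math.sin(2 * math.pi * x) + x * x   # an arbitrary observable

for k in range(1, 200):
    x = k / 200.0
    # First claim: L U phi = phi (L is a left inverse of U).
    assert abs(L(U(phi))(x) - phi(x)) < 1e-9
    # Second claim: U L phi averages phi over the two preimages of T(x),
    # i.e. over {x, x + 1/2 mod 1}, which is E(phi | T^{-1}B) at x.
    y = (x + 0.5) % 1.0
    assert abs(U(L(phi))(x) - 0.5 * (phi(x) + phi(y))) < 1e-9
```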
We see from Proposition 3 that a function has zero conditional expectation with respect to $T^{-1}\mathcal{B}$ if and only if it is in the kernel of $\mathcal{L}$: indeed, $E(\varphi \mid T^{-1}\mathcal{B}) = U\mathcal{L}\varphi$, and $U$ is an isometry, hence injective. In particular, if $\mathcal{L}\varphi = 0$ then $S_n = \sum_{j=0}^{n-1} \varphi \circ T^j$ is a martingale; this will be a key tool in the next section.
Example 7 Let $X = [0,1]$ and let $T$ be the doubling map $Tx = 2x \pmod 1$. Let $\mu$ be Lebesgue measure. For convenience of notation we consider the space of complex-valued functions on $X$; the functions $e_n(x) = e^{2\pi i n x}$, $n \in \mathbb{Z}$, form an orthonormal basis for this space. A simple calculation shows that
$$(U e_n)(x) = e_n(2x) = e_{2n}(x),$$
so $U e_n = e_{2n}$. For the transfer operator, using the explicit formula $(\mathcal{L}\varphi)(x) = \frac{1}{2}\big(\varphi(\frac{x}{2}) + \varphi(\frac{x+1}{2})\big)$, we obtain $\mathcal{L} e_n = e_{n/2}$ for even values of $n$, while for odd values of $n$ we have
$$\mathcal{L} e_n = 0.$$
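These formulas are easy to confirm numerically with the explicit form of $\mathcal{L}$ for the doubling map (the sample points and tolerances below are arbitrary choices of mine):

```python
# Numerical confirmation of the Fourier computations in Example 7.
import cmath

def e(n):
    return lambda x: cmath.exp(2j * cmath.pi * n * x)

def U(phi):   # Koopman operator for the doubling map
    return lambda x: phi((2 * x) % 1.0)

def L(phi):   # transfer operator for the doubling map w.r.t. Lebesgue measure
    return lambda x: 0.5 * (phi(x / 2) + phi((x + 1) / 2))

xs = [k / 97 for k in range(97)]
for n in (-3, -2, -1, 1, 2, 3):
    assert all(abs(U(e(n))(x) - e(2 * n)(x)) < 1e-9 for x in xs)   # U e_n = e_{2n}
    assert all(abs(L(e(2 * n))(x) - e(n)(x)) < 1e-9 for x in xs)   # L e_{2n} = e_n
for n in (-3, -1, 1, 3):
    assert all(abs(L(e(n))(x)) < 1e-9 for x in xs)                 # L e_n = 0, n odd
```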
4. Martingale approximation and CLT
The machinery of the Koopman and transfer operators from the previous section can be used to apply the martingale central limit theorem to observations of dynamical systems via the technique of martingale approximation, which was introduced by M. Gordin in 1969.
The idea is that if $\mathcal{L}^n \varphi \to 0$ quickly enough for functions $\varphi$ with $\int \varphi \, d\mu = 0$, then we can approximate the sequence $S_n = \sum_{j=0}^{n-1} \varphi \circ T^j$ with a martingale sequence $\hat S_n$.
Let $g = \sum_{k=1}^\infty \mathcal{L}^k \varphi$ (the decay assumption on $\mathcal{L}^n \varphi$ guarantees that this series converges) and let $\hat\varphi = \varphi + g - g \circ T$. We claim that $\hat S_n = \sum_{j=0}^{n-1} \hat\varphi \circ T^j$ is a martingale. Indeed,
$$\mathcal{L}\hat\varphi = \mathcal{L}\varphi + \mathcal{L} g - \mathcal{L}(g \circ T),$$
and since $\mathcal{L} U$ is the identity we see that the last term is just $g$, so that
$$\mathcal{L}\hat\varphi = \mathcal{L}\varphi + \mathcal{L} g - g = \mathcal{L}\varphi + (g - \mathcal{L}\varphi) - g = 0,$$
where we use the observation that $\mathcal{L} g = \sum_{k=2}^\infty \mathcal{L}^k \varphi = g - \mathcal{L}\varphi$.
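Here is a concrete worked instance of the construction (my own choice of observable, not from the talk): for the doubling map take $\varphi(x) = \cos(4\pi x)$. By Example 7, $\mathcal{L}\varphi = \cos(2\pi x)$ and $\mathcal{L}^2 \varphi = 0$, so the series defining $g$ is a finite sum and $\mathcal{L}\hat\varphi = 0$ can be verified directly:

```python
# Worked instance of the martingale approximation for the doubling map, with
# the illustrative observable phi(x) = cos(4 pi x).  Here L phi = cos(2 pi x)
# and L^2 phi = 0, so g = sum_{k>=1} L^k phi = cos(2 pi x) is a finite sum.
import math

def T(x): return (2 * x) % 1.0
def L(phi): return lambda x: 0.5 * (phi(x / 2) + phi((x + 1) / 2))

phi = lambda x: math.cos(4 * math.pi * x)
g = lambda x: math.cos(2 * math.pi * x)            # g = L(phi), and L(g) = 0
phi_hat = lambda x: phi(x) + g(x) - g(T(x))        # phi_hat = phi + g - g o T

for k in range(200):
    x = k / 200.0
    assert abs(L(g)(x)) < 1e-9          # the series for g really does terminate
    assert abs(L(phi_hat)(x)) < 1e-9    # L phi_hat = 0, so the Birkhoff sums of
                                        # phi_hat form a martingale
```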
Proposition 3 now implies that $E(\hat\varphi \mid T^{-1}\mathcal{B}) = U \mathcal{L} \hat\varphi = 0$, and we conclude that $\hat S_n$ is a martingale, so by the martingale CLT $\frac{1}{\sqrt{n}} \hat S_n$ converges in distribution to $\mathcal{N}(0, \hat\sigma^2)$, where $\hat\sigma^2 = \int \hat\varphi^2 \, d\mu$.
Now we want to apply this result to obtain information about $\varphi$ itself, and in particular about $S_n = \sum_{j=0}^{n-1} \varphi \circ T^j$. We have $\varphi = \hat\varphi - g + g \circ T$, and so the sum telescopes:
$$\frac{S_n}{\sqrt{n}} = \frac{\hat S_n}{\sqrt{n}} + \frac{g \circ T^n - g}{\sqrt{n}},$$
and the last term goes to 0 in probability, which yields the central limit theorem for $\frac{S_n}{\sqrt{n}}$.
Remark 1 There is a technical problem we have glossed over, which is that the sequence of $\sigma$-algebras $\mathcal{F}_n = T^{-n}\mathcal{B}$ is decreasing, not increasing as is required by the definition of a martingale. One solution to this is to pass to the natural extension, where $T$ becomes invertible, and to consider the functions $\hat\varphi \circ T^{-n}$ and the increasing $\sigma$-algebras $T^n \mathcal{B}$. Another solution is to use reverse martingales, but we do not discuss this here.
Example 8 Let $X = [0,1]$ and let $T$ be an intermittent type (Manneville–Pomeau) map given by
$$T(x) = \begin{cases} x(1 + 2^\alpha x^\alpha), & 0 \le x < \frac12, \\ 2x - 1, & \frac12 \le x \le 1, \end{cases}$$
where $\alpha \in (0,1)$ is a fixed parameter. It can be shown that $T$ has a unique absolutely continuous invariant probability measure $\mu$, and that the transfer operator has the following contraction property: for every Hölder continuous $\varphi$ with $\int \varphi \, d\mu = 0$, there is $C > 0$ such that $\|\mathcal{L}^n \varphi\| \le C n^{-\gamma}$, where $\gamma = \frac{1}{\alpha} - 1$.
For small values of $\alpha$ (namely $\alpha < \frac12$, so that $\gamma > 1$), this shows that $\sum_n \|\mathcal{L}^n \varphi\|$ is summable, and consequently $\varphi$ satisfies the CLT by the above discussion.
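A short sketch of the map itself (with the illustrative parameter choice $\alpha = 0.4$, which is mine) shows the two branches and the slow escape of orbits from the neutral fixed point at $0$, the mechanism behind the merely polynomial contraction of $\mathcal{L}^n$:

```python
# Sketch of the Manneville-Pomeau map of Example 8 with alpha = 0.4 (an
# illustrative parameter choice).
def mp_map(x, alpha=0.4):
    if x < 0.5:
        return x * (1 + (2 * x) ** alpha)   # neutral fixed point: T(x) ~ x near 0
    return 2 * x - 1

# The left branch maps [0, 1/2) onto [0, 1); the right branch maps [1/2, 1]
# onto [0, 1].
assert mp_map(0.0) == 0.0
assert abs(mp_map(0.5 - 1e-12) - 1.0) < 1e-9
assert mp_map(0.5) == 0.0 and mp_map(1.0) == 1.0

# Orbits escape from the neutral fixed point only polynomially fast: starting
# at 10^-6, hundreds of iterates are needed just to climb past 0.1.
x = 1e-6
for _ in range(300):
    x = mp_map(x)
assert x < 0.1
```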