One of the fundamental results in probability theory is the strong law of large numbers, which was discussed in an earlier post under the guise of the Birkhoff ergodic theorem.
Suppose we have a sequence of random variables which take values in . If the sequence is independent and identically distributed with , then the strong law of large numbers shows that almost surely.
It is often the case, however, that one or both of these conditions fail; there may be some dependence between the variables , or the distribution may not be identical for all values of . For example, the following situation arose recently in a project I have been working on: there is a sequence of random variables as above, which has the property that no matter what values take, the random variable has expected value at most . In the language of conditional expectation, we have
This is true despite the fact that the distribution of may vary according to the values of . In what follows we will write for the -algebra generated by and write the above condition as
It is plausible to conjecture that in this setting one still has almost surely. And indeed, it turns out that a statement very close to this can be proved.
The following result is proved by modifying one of the standard proofs of the strong law of large numbers, which can be found, for example, in Billingsley’s book “Probability and Measure”, where it is Theorem 22.1. There is also a discussion on Terry Tao’s blog, which emphasises the role of the main tools in this proof: the moment method and truncation.
The most important consequence of this is that the sequence is uncorrelated — that is, for all , which in turn implies that . This means that the variance is additive for the sequence :
Recall that is the sequence bounding , and let be such that . Let be a random variable taking the value with probability for . Note that by the definition of and , we have , thus
for some fixed constant . Note that the final expression is non-decreasing in , and so together with (4) we have
where again is a fixed constant. Fixing and using (5) in Chebyshev’s inequality yields
Now let be a sequence of integers such that . We see that (6) yields
where depends on and on . Observe that
where is the minimal value of for which . Because grows like , the sum is bounded above by a constant times , so that together with (7) we have
where finiteness of is the crucial place where we use (2).
Let . Because we have for all , and in particular
Taking the limit and using (8) gives
Finally, we recall from the definition of and that whenever . In particular, we may observe that
and apply Borel-Cantelli again to deduce that with probability one, for at most finitely many values of . In particular, (9) implies that
which completes the proof of Proposition 1.