Lebesgue probability spaces, part I

In various areas of mathematics, classification theorems give a more or less complete understanding of what kinds of behaviour are possible. For example, in linear algebra we learn that up to isomorphism, {{\mathbb R}^n} is the only real vector space with dimension {n}, and every linear operator on a finite-dimensional complex vector space can be put into Jordan normal form via a change of coordinates; this means that many questions in linear algebra can be answered by understanding properties of Jordan normal form. A similar classification result is available in measure theory, but the preliminaries are a little more involved. In this and the next post I will describe the classification result for complete probability spaces, which gives conditions under which such a space is equivalent to the unit interval with Lebesgue measure.

The main original references for these results are a 1942 paper by Halmos and von Neumann [“Operator methods in classical mechanics. II”, Ann. of Math. (2), 43 (1942), 332–350], and a 1949 paper by Rokhlin [“On the fundamental ideas of measure theory”, Mat. Sbornik N.S. 25(67) (1949), 107–150; English translation in Amer. Math. Soc. Translation 1952 (1952), no. 71, 55 pp.]. I will refer to these as [HvN] and [Ro], respectively.

1. Equivalence of measure spaces

First we must establish exactly which class of measure spaces we work with, and under what conditions two measure spaces will be thought of as equivalent. Let {I} be the unit interval and {\mathop{\mathcal B},\mathop{\mathcal L}} the Borel and Lebesgue {\sigma}-algebras, respectively; let {m} be Lebesgue measure (on either of these). To avoid having to distinguish between {(I,\mathop{\mathcal B},m)} and {(I,\mathop{\mathcal L},m)}, let us agree to only work with complete measure spaces; this is no great loss, since given an arbitrary measure space {(X,\mathcal{A},\mu)} we can pass to its completion {(X,\overline{\mathcal{A}},\overline\mu)}.

The most obvious notion of isomorphism is that two complete measure spaces {(X,\mathcal{A},\mu)} and {(X',\mathcal{A}',\mu')} are isomorphic if there is a bijection {f\colon X\rightarrow X'} such that {f,f^{-1}} are measurable and {f_*\mu = \mu'}; that is, given {A\subset X} we have {A\in \mathcal{A}} if and only if {f(A) \in \mathcal{A}'}, and in this case {\mu(A) = \mu'(f(A))}.

In the end we want to loosen this definition a little bit. For example, consider the space {X = \{0,1\}^{\mathbb N}} of all infinite binary sequences, equipped with the Borel {\sigma}-algebra {\mathop{\mathcal B}} associated to the product topology (or if you prefer, the metric {d(x,y) = e^{-\min\{n \mid x_n \neq y_n\}}}). Let {\mu} be the {(\frac 12,\frac 12)}-Bernoulli measure on {(X,\mathop{\mathcal B})}; that is, for each {w\in \{0,1\}^n} the cylinder {[w] = \{x\in X \mid x_1 \cdots x_n = w\}} gets weight {\mu[w] = 2^{-n}}. Then there is a natural correspondence between the completion {(X,\overline{\mathop{\mathcal B}},\overline\mu)} and {(I,\mathop{\mathcal L},m)} given by

\displaystyle  \begin{aligned} f\colon X&\rightarrow I \\ x &\mapsto \sum_{n=1}^\infty x_n 2^{-n}. \end{aligned}

By looking at dyadic intervals {[\frac k{2^n}, \frac{k+1}{2^n}] \subset I} one can readily verify that {f_* \overline\mu = m}; however, {f} is not a bijection because for every {w\in \{0,1\}^n} we have {f(w01^\infty) = f(w10^\infty)}.
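As a quick sanity check of the two claims just made, the following sketch evaluates {f} exactly on eventually constant sequences using rational arithmetic. The helper name `f_eventually_constant`, and the representation of a sequence as a finite prefix plus a repeated tail digit, are assumptions made for illustration only.

```python
from fractions import Fraction

def f_eventually_constant(prefix, tail_bit):
    """Exact value of f(x) = sum_n x_n 2^{-n} for x = prefix followed by tail_bit forever.

    `prefix` lists the digits x_1..x_k; `tail_bit` is the digit repeated from
    position k+1 onward (a hypothetical encoding chosen for this sketch).
    """
    k = len(prefix)
    head = sum(Fraction(bit, 2 ** (n + 1)) for n, bit in enumerate(prefix))
    # Geometric tail: sum_{n>k} 2^{-n} = 2^{-k} when the tail is all ones, 0 otherwise.
    tail = Fraction(tail_bit, 2 ** k)
    return head + tail

w = [1, 0, 1]

# Non-injectivity: w 0 1 1 1 ... and w 1 0 0 0 ... have the same image.
left = f_eventually_constant(w + [0], 1)
right = f_eventually_constant(w + [1], 0)

# The cylinder [w] is mapped onto a dyadic interval of length 2^{-len(w)}.
low = f_eventually_constant(w, 0)
high = f_eventually_constant(w, 1)
```

Here `left` and `right` coincide, and `high - low` equals {2^{-3}}, matching the weight {\mu[w] = 2^{-n}} of a cylinder of length {n=3}.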

The points at which {f} is non-injective form a {\mu}-null set (since there are only countably many of them), so from the point of view of measure theory, it is natural to disregard them. This motivates the following definition.

Definition 1 Two measure spaces {(X,\mathcal{A},\mu)} and {(X',\mathcal{A}',\mu')} are isomorphic mod 0 if there are measurable sets {E \subset X} and {E'\subset X'} such that {\mu(X\setminus E) = \mu'(X'\setminus E') = 0}, together with a bijection {f\colon E\rightarrow E'} such that {f,f^{-1}} are measurable and {f_*(\mu|_E) = \mu'|_{E'}}.

From now on we will be interested in the question of classifying complete measure spaces up to isomorphism mod 0. The example above suggests that {(I,\mathop{\mathcal L},m)} is a reasonable candidate for a “canonical” complete measure space that many others are equivalent to, and we will see that this is indeed the case.

Notice that the total measure {\mu(X)} is clearly an invariant of isomorphism mod 0, and hence we restrict our attention to probability spaces, for which {\mu(X)=1}.

2. Separability, etc.

Let {(X,\mathcal{A},\mu)} be a probability space. We describe several related conditions that all give senses in which {\mathcal{A}} can be understood via countable objects.

The {\sigma}-algebra {\mathcal{A}} carries a natural pseudo-metric given by {\rho(A,B) = \mu(A\Delta B)}. Write {A\sim B} if {\rho(A,B)=0}; this is an equivalence relation on {\mathcal{A}}, and we write {\hat{\mathcal{A}}} for the space of equivalence classes. The function {\rho} induces a metric {\hat\rho} on {\hat{\mathcal{A}}} in the natural way, and we say that {(X,\mathcal{A},\mu)} is separable if the metric space {(\hat{\mathcal{A}},\hat\rho)} is separable; that is, if it has a countable dense subset.
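To see the difference between the pseudo-metric {\rho} and the induced metric {\hat\rho} concretely, here is a minimal sketch on a four-point probability space. The space, the weights, and the names `mu`, `rho`, `classes` are all assumptions chosen for illustration; one point is given zero mass so that distinct sets can be equivalent.

```python
from fractions import Fraction
from itertools import combinations

# Toy probability space (hypothetical): X = {0,1,2,3}, with point 3 carrying zero mass.
weights = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(0)}
X = frozenset(weights)

def mu(A):
    return sum((weights[x] for x in A), Fraction(0))

def rho(A, B):
    # rho(A, B) = mu(A symmetric-difference B)
    return mu(A ^ B)

subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]

# The triangle inequality holds for every triple of subsets.
triangle_ok = all(rho(A, C) <= rho(A, B) + rho(B, C)
                  for A in subsets for B in subsets for C in subsets)

# rho is only a pseudo-metric: {0} and {0, 3} are different sets at distance 0.
A, B = frozenset({0}), frozenset({0, 3})

# Two sets are equivalent exactly when they differ only at the null point 3,
# so removing 3 picks one representative per equivalence class.
classes = {S - {3} for S in subsets}
```

In this toy model there are 16 measurable sets but only 8 equivalence classes, and {\hat\rho} separates those 8 classes.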

Another countability condition is this: call {\mathcal{A}} “countably generated” if there is a countable subset {\Gamma \subset \mathcal{A}} such that {\mathcal{A} = \sigma(\Gamma)} is the smallest {\sigma}-algebra containing {\Gamma}. We write (CG) for this property; for example, the Borel {\sigma}-algebra in {[0,1]} satisfies (CG) because we can take {\Gamma} to be the set of all intervals with rational endpoints. (In [HvN], such an {\mathcal{A}} is called “strictly separable”, but we avoid the word “separable” as we have already used it in connection with the metric space {(\hat{\mathcal{A}},\hat\rho)}.)

In and of itself, (CG) is not quite the right sort of property for our current discussion, because it does not hold when we pass to the completion; the Lebesgue {\sigma}-algebra {\mathop{\mathcal L}} is not countably generated (one can prove this using cardinality estimates). Let us say that {\mathcal{A}} satisfies property (CG0) (for “countably generated mod 0”) if there is a countably generated {\sigma}-algebra {\Sigma \subset \mathcal{A}} with the property that for every {E\in \mathcal{A}}, there is {F\in \Sigma} with {\mu(E\Delta F) = 0}. In other words, we have {\hat{\mathcal{A}} = \hat\Sigma}. Note that {\mathop{\mathcal L}} is countably generated mod 0 by taking {\Sigma = \mathop{\mathcal B}}. (In [HvN], such an {\mathcal{A}} is called “separable”; the same property is used in §2.1 of [Ro] with the label {(L')}, rendered in a font that I will not attempt to duplicate here.)

In fact, the approximation of {\mathop{\mathcal L}} by {\mathop{\mathcal B}} satisfies an extra condition. Let us write (CG0+) for the following condition on {\mathcal{A}}: there is a countably generated {\Sigma \subset \mathcal{A}} such that for every {E\in \mathcal{A}}, there is {F\in \Sigma} with {E\subset F} and {\mu(F\setminus E)=0}. This is satisfied for {\mathop{\mathcal L}} and {\mathop{\mathcal B}}. (In [HvN], such an {\mathcal{A}} is called “properly separable”; the same property is used in §2.1 of [Ro] with the label {(L)}.)

The four properties introduced above are related as follows.

\displaystyle  \textbf{(CG)} \Rightarrow \textbf{(CG0+)} \Rightarrow \textbf{(CG0)} \Leftrightarrow \text{separable}

The first two implications are immediate, and their converses fail in general:

  • The Lebesgue {\sigma}-algebra {\mathop{\mathcal L}} satisfies (CG0+) but not (CG).
  • Let {\mathcal{A} = \{ A \subset [0,1] \mid A\in \mathop{\mathcal L}, m(A)=0 \text{ or } 1\}}, where {m} is Lebesgue measure. Then {\mathcal{A}} satisfies (CG0) but not (CG0+).
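
For the second example, (CG0) holds with the trivial {\sigma}-algebra {\Sigma = \{\emptyset, [0,1]\}}. One way to see that (CG0+) fails is sketched below; this is an added argument, not taken from [HvN] or [Ro], with {m} denoting Lebesgue measure and the names {G_n}, {G} introduced only for this sketch.

```latex
% Suppose \Sigma = \sigma(\{G_1, G_2, \dots\}) \subset \mathcal{A} is countably generated.
% Replacing G_n by its complement where needed, assume m(G_n) = 1 for every n.
\begin{aligned}
G &:= \bigcap_{n} G_n, \qquad m(G) = 1, \\
F \in \Sigma &\implies G \subset F \ \text{ or } \ G \cap F = \emptyset
  \quad (\text{the sets with this property form a } \sigma\text{-algebra containing every } G_n), \\
x \in G,\ E = \{x\} \in \mathcal{A},\ E \subset F \in \Sigma
  &\implies G \subset F \implies m(F \setminus E) = 1 \neq 0.
\end{aligned}
```

So no set in {\Sigma} can cover the singleton {\{x\}} by a null set, which is what (CG0+) would require.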

Now we prove that (CG0) and separability are equivalent. First note that if {\Gamma \subset \mathcal{A}} is a countable subset, then the algebra {\mathop{\mathcal F}} generated by {\Gamma} is also countable; in particular, {(X,\mathcal{A},\mu)} is separable if and only if there is a countable algebra {\mathop{\mathcal F}\subset \mathcal{A}} that is dense with respect to {\rho}, and similarly in the definition of (CG0) the generating set can be taken to be an algebra. To show equivalence of (CG0) and separability it suffices to show that given an algebra {\mathop{\mathcal F} \subset \mathcal{A}} and {E\in \mathcal{A}}, we have

\displaystyle  (E\in \overline{\mathop{\mathcal F}}) \Leftrightarrow (\text{there is } A\in \sigma(\mathop{\mathcal F}) \text{ with } \rho(E,A) = \mu(E\Delta A) = 0). \ \ \ \ \ (1)

First we prove {(\Leftarrow)} by proving that {\overline{\mathop{\mathcal F}}} is a {\sigma}-algebra, and hence contains {\sigma(\mathop{\mathcal F})}; this will show that (CG0) implies separability.

  • Closure under {{}^c}: if {E\in \overline{\mathop{\mathcal F}}} then there are {A_n\in \mathop{\mathcal F}} such that {\rho(A_n,E) \rightarrow 0}. Since {\rho(A_n^c, E^c) = \rho(A_n, E)} and {A_n^c\in \mathop{\mathcal F}} (since it is an algebra), this gives {E^c\in \overline{\mathop{\mathcal F}}}.
  • Closure under {\cup}: given {E_1,E_2,\dots \in \overline{\mathop{\mathcal F}}}, let {E = \bigcup_n E_n}. To show that {E \in \overline{\mathop{\mathcal F}}}, note that given any {\epsilon>0}, there are {A_n\in \mathop{\mathcal F}} such that {\rho(A_n,E_n) < \epsilon 2^{-n}}. Let {F_N = \bigcup_{n=1}^N E_n} and {B_N = \bigcup_{n=1}^N A_n}; note that

    \displaystyle  \rho(F_N,B_N) \leq \sum_{n=1}^N \rho(E_n,A_n) < \epsilon.

    Moreover by continuity from below we have {\lim \mu(F_N) = \mu(E)}, so {\lim \rho(E,F_N)=0}, and thus for sufficiently large {N} the triangle inequality gives {\rho(E,B_N) \leq \rho(E,F_N) + \rho(F_N,B_N) < 2\epsilon}. This holds for all {\epsilon>0}, so {E\in \overline{\mathop{\mathcal F}}}.

Now we prove {(\Rightarrow)}, thus proving that {\overline{\mathop{\mathcal F}}} is “large enough” that separability implies (CG0). Given any {E\in \overline{\mathop{\mathcal F}}}, there are {A_n\in \mathop{\mathcal F}} such that {\rho(A_n,E)\leq 2^{-n}}. Let { A = \bigcap_{N\in {\mathbb N}} \bigcup_{n\geq N} A_n \in \sigma(\mathop{\mathcal F}). } We get

\displaystyle  \begin{aligned} \mu(A \cap E^c) &= \mu\big(\bigcap_N \bigcup_{n\geq N} (A_n \cap E^c)\big) = \lim_{N\rightarrow\infty} \mu\big( \bigcup_{n\geq N} (A_n \cap E^c) \big) \\ &\leq \lim_{N\rightarrow\infty} \sum_{n\geq N} \mu(A_n \cap E^c) \leq \lim_{N\rightarrow\infty} 2^{1-N} = 0, \end{aligned}

and similarly, {A^c \cap E = \bigcup_N \bigcap_{n\geq N} (A_n^c \cap E)}, which gives

\displaystyle  \mu(A^c \cap E) \leq \sum_N \mu\big( \bigcap_{n\geq N} A_n^c \cap E\big) \leq \sum_N \limsup_{n\rightarrow\infty} \mu(A_n^c \cap E) = 0.

Then {\rho(A,E) = 0}, which completes the proof of {(\Rightarrow)}.
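
The intermediate steps in the two estimates above can be spelled out as follows; this is a sketch that uses only the bound {\rho(A_n,E)\leq 2^{-n}} assumed in the proof, with {m} a dummy index introduced here.

```latex
\begin{aligned}
\sum_{n\geq N} \mu(A_n \cap E^c) &\leq \sum_{n\geq N} \rho(A_n,E) \leq \sum_{n\geq N} 2^{-n} = 2^{1-N}, \\
\mu\Big(\bigcap_{n\geq N} (A_n^c \cap E)\Big) &\leq \mu(A_m^c \cap E) \leq \rho(A_m,E) \leq 2^{-m}
  \quad \text{for every } m \geq N.
\end{aligned}
```

Letting {m\rightarrow\infty} in the second line shows that every term in the sum over {N} vanishes.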

The first half of the argument above (the {\Leftarrow} direction) appears in this MathOverflow answer to a question discussing the relationship between different notions of separability, which ultimately inspired this post. That answer (by Joel David Hamkins) also suggests one further notion of “countably generated”, distinct from all of the above; say that {(X,\mathcal{A},\mu)} satisfies (CCG) (for “completion of countably generated”) if there is a countably generated {\sigma}-algebra {\Sigma \subset \mathcal{A}} such that {\mathcal{A} \subset \overline{\Sigma}}, where {\overline{\Sigma}} is the completion of {\Sigma} with respect to the measure {\mu}. One quickly sees that

\displaystyle  \textbf{(CG)} \Rightarrow \textbf{(CCG)} \Rightarrow \textbf{(CG0)}.

Both reverse implications fail; the Lebesgue {\sigma}-algebra satisfies (CCG) but not {\textbf{(CG)}}, and an example satisfying separability (and hence (CG0)) but not (CCG) was given in that same MathOverflow answer (the example involves ordinal numbers and something called the “club filter”, which I will not go into here).

3. Abstract {\sigma}-algebras

It is worth looking at some of the previous arguments through a different lens, one that will also appear next time when we discuss the classification problem.

Recall the space of equivalence classes {\hat{\mathcal{A}}} from earlier, where {A\sim B} means that {\mu(A\Delta B) = 0}. Although elements of {\hat{\mathcal{A}}} are not subsets of {X}, we can still speak of the “union” of two such elements by choosing representatives from the respective equivalence classes; that is, given {\hat A, \hat B\in \hat{\mathcal{A}}}, we choose representatives {A\in \hat A} and {B\in \hat B} (so {A,B\in \mathcal{A}}), and consider the “union” of {\hat A} and {\hat B } to be the equivalence class of {A\cup B}; write this as {\hat A\vee \hat B}. One can easily check that this is well-defined; if {A_1\sim A_2} and {B_1\sim B_2}, then {(A_1\cup B_1) \sim (A_2 \cup B_2)}.

This shows that {\cup} induces a binary operation {\vee} on the space {\hat{\mathcal{A}}}; similarly, {\cap} induces a binary operation {\wedge}, complementation {A\mapsto A^c} induces an operation {\hat A \mapsto \hat A'}, and set inclusion {A\subset B} induces a partial order {\hat A \leq \hat B}. These give {\hat{\mathcal{A}}} the structure of a Boolean algebra; say that {\Sigma} is an abstract Boolean algebra if it has a partial order {\leq}, binary operations {\vee}, {\wedge}, and a unary operation {'}, satisfying the same rules as inclusion, union, intersection, and complementation:

  1. {A \vee B} is the join of {A} and {B} (the minimal element such that {A,B \leq A\vee B}), and {A\wedge B} is the meet of {A} and {B} (the maximal element such that {A,B \geq A\wedge B});
  2. the distributive laws {A\vee (B\wedge C) = (A\vee B) \wedge (A\vee C)} and {A\wedge (B\vee C) = (A\wedge B) \vee (A\wedge C)} hold;
  3. there is a maximal element {X} whose complement {X'} is the minimal element;
  4. {A\wedge A' = X'} and {A\vee A' = X}.

For the form of this list I have followed this blog post by Terry Tao, which gives a good in-depth discussion of some other issues relating to concrete and abstract Boolean algebras and {\sigma}-algebras.

Exercise 1 Using the four axioms above, prove the following properties:

  • {A'} is the unique element satisfying (4) — that is, if {A\vee B =X} and {A\wedge B = X'}, then {B=A'};
  • {(A')' = A};
  • de Morgan’s laws: {(A\vee B)' = A' \wedge B'} and {(A\wedge B)' = A' \vee B'}.

If you get stuck, see Chapter IV, Lemma 1.2 in A Course in Universal Algebra by Burris and Sankappanavar.
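
Before attempting the proof, it may help to confirm the three properties by brute force in a single concrete model. The sketch below uses the power set of a three-point set, which is an assumption chosen for illustration; checking one model is of course not a proof from the axioms.

```python
from itertools import combinations

# Concrete Boolean algebra (hypothetical model): all subsets of {0, 1, 2},
# with join = union, meet = intersection, and ' = complement in X.
X = frozenset({0, 1, 2})
BOTTOM = frozenset()  # plays the role of X'
elements = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]

def comp(A):
    return X - A

# Uniqueness of complements: any B with A v B = X and A ^ B = X' equals A'.
unique_complement = all(
    B == comp(A)
    for A in elements for B in elements
    if (A | B) == X and (A & B) == BOTTOM
)

# Involution: (A')' = A.
involution = all(comp(comp(A)) == A for A in elements)

# de Morgan's laws for every pair of elements.
de_morgan = all(
    comp(A | B) == comp(A) & comp(B) and comp(A & B) == comp(A) | comp(B)
    for A in elements for B in elements
)
```

All three flags come out true in this model, as the exercise predicts for any Boolean algebra.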

In fact {\hat{\mathcal{A}}} inherits just a little bit more, since {\vee} (and hence {\wedge}) can be iterated countably many times. We add this as a fifth axiom, and say that an abstract Boolean algebra {\Sigma} is an abstract {\sigma}-algebra if in addition to (1)–(4) it satisfies


  5. any countable family {A_1,A_2,\dots \in \Sigma} has a least upper bound {\bigvee_n A_n} and a greatest lower bound {\bigwedge_n A_n}.

A measured abstract {\sigma}-algebra is a pair {(\Sigma,\mu)}, where {\Sigma} is an abstract {\sigma}-algebra and {\mu\colon \Sigma\rightarrow [0,\infty]} is a function satisfying the usual properties: {\mu(X')=0} and {\mu(\bigvee_n A_n) = \sum_n \mu(A_n)} whenever {A_i \wedge A_j = X'} for all {i\neq j}. (Note that {X'} is playing the role of {\emptyset}, but we avoid the latter notation to remind ourselves that elements of {\Sigma} do not need to be represented as subsets of some ambient space.)

The operations {\vee,\wedge,'} induce a binary operator {\Delta} on {\Sigma} by

\displaystyle  A \Delta B = (A\wedge B') \vee (B \wedge A'),

which is the abstract analogue of symmetric difference, and so a measured abstract {\sigma}-algebra carries a pseudo-metric {\rho} defined by

\displaystyle  \rho(A,B) = \mu(A \Delta B).

If {(\Sigma,\mu)} has the property that {\mu(A)>0} for all {A\neq X'}, then this becomes a genuine metric.

In particular, if {(X,\mathcal{A},\mu)} is a measure space and {\Sigma = \hat{\mathcal{A}}} is the space of equivalence classes modulo {\sim} (equivalence mod 0), then {\mu} induces a function {\hat{\mathcal{A}} \rightarrow [0,\infty]}, which we continue to denote by {\mu}, such that {(\hat{\mathcal{A}},\mu)} is a measured abstract {\sigma}-algebra; this has the property that {\mu(A)>0} for all non-trivial {A\in \hat{\mathcal{A}}}, and so it defines a metric {\rho} as above.

Given an abstract {\sigma}-algebra {\Sigma} and a subset {\Gamma \subset \Sigma}, the algebra ({\sigma}-algebra) generated by {\Gamma\subset \Sigma} is the smallest algebra ({\sigma}-algebra) in {\Sigma} that contains {\Gamma}. Now we can interpret the equivalence (1) from the previous section (which drove the correspondence between (CG0) and separability) in terms of the measured abstract {\sigma}-algebra {\hat{\mathcal{A}}}.

Proposition 2 Let {(\Sigma,\mu)} be a measured abstract {\sigma}-algebra with no non-trivial null sets. Then for any algebra {\mathop{\mathcal F} \subset \Sigma}, we have {\overline{\mathop{\mathcal F}} = \sigma(\mathop{\mathcal F})}; that is, the {\rho}-closure of {\mathop{\mathcal F}} is equal to the {\sigma}-algebra generated by {\mathop{\mathcal F}}.

Next time we will see how separability (or equivalently, (CG0)) can be used to give a classification result for abstract measured {\sigma}-algebras, which at first requires us to take the abstract point of view introduced in this section. Finally, we will see what is needed to go from there to a similar result for probability spaces.

About Vaughn Climenhaga

I'm an assistant professor of mathematics at the University of Houston. I'm interested in dynamical systems, ergodic theory, thermodynamic formalism, dimension theory, multifractal analysis, non-uniform hyperbolicity, and things along those lines.

5 Responses to Lebesgue probability spaces, part I

  1. Alexander Gouberman says:

    This is a nice blog post on connections between several different definitions for “separability” of a measure space. There is also another interesting characterization for such a separability: [Bogachev “Measure Theory” 1.12(iii) and 7.14(iv)] also defines a measure space as separable if the underlying metric space is separable (i.e. (S) holds). Then he states that

    (S) $\Leftrightarrow$ $L^p$ is separable for all $0 < p < \infty$ $\Leftrightarrow$ $L^p$ is separable for some $0 < p < \infty$.

    He also mentions that separability could be defined for infinite measures by allowing the distance to take the value $\infty$.

    Do the implications and equivalences in your blog post generalize to sigma-finite measures?

  2. Pingback: Lebesgue probability spaces, part II | Vaughn Climenhaga's Math Blog

  3. I expect that these results on separability would go through in that setting; if the measure is sigma-finite then one ought to be able to decompose it into its countably many finite parts and just apply the results from this post to those pieces. I confess I haven’t worked this through carefully, but I don’t see any dangers. Things may be a little more subtle when we try to do a classification result, since total measure is an isomorphism invariant and we usually just consider probability spaces, but I wouldn’t be surprised if those results go through too. Again, though, I haven’t thought through the details carefully…

  4. Pingback: Vaughn Climenhaga's Math Blog

  5. rongchang says:

    I think the example of sigma algebra that satisfies (CG0) but not (CG0+) is not right due to the regularity of the Lebesgue measure.
