## Notes to The Ergodic Hierarchy

1. The use of the term ‘space’ in physics might cause confusion. On the one hand, the term is used in its ordinary meaning to refer to the three-dimensional space of our everyday experience. On the other hand, an entire class of mathematical structures is referred to as ‘spaces’ even though they have nothing in common with the space of everyday experience (except some abstract algebraic properties, which is why these structures earned the title ‘spaces’ in the first place). Phase spaces are abstract mathematical spaces.

2. Sometimes a fourth component is mentioned in the definition: a sigma algebra \(\sigma\). Although in certain circumstances it is convenient to add \(\sigma\), it is not strictly necessary since the main purpose of \(\sigma\) is to provide a basis to define the measure \(\mu\), and so \(\sigma\) is always present in the background when there is a measure \(\mu\) and it is not necessary to mention it explicitly. For a discussion of sigma algebras and measures see Appendix B.

3. By using \(\mathbb{R}\) and \(\mathbb{Z}\) we assume that time extends to the past as well as to the future, and we also assume that the time evolution is reversible. This need not be the case and these assumptions can be relaxed in different ways. Nothing in what follows depends on this.

4. Strictly speaking \(A\) has to be measurable. In what follows we always assume that the sets we consider are measurable. This is a technical assumption that has no bearing on the issues that follow since the relevant sets are always measurable.

5. First appearances notwithstanding, this is not a substantial restriction. Systems in statistical mechanics are all measure preserving. Some systems in chaos theory are not measure preserving, but if such a system is chaotic on a certain part of the phase space (which can be an attractor, for instance), then there is an invariant measure on this part and EH is applicable with respect to that measure. For a discussion of this point see Werndl (2009b).

6. The basic idea of an integral is the following: slice up the space into \(m\) small cells \(c_1,\ldots ,c_m\) (e.g. by putting a grid on it), then choose a point in each cell and take the value of \(f\) for that point. Then multiply that value by the size of the cell (its measure) and add up the products: \(f(x_1)\mu(c_1) + \ldots + f(x_m)\mu(c_m)\), where \(x_1\) is a point in \(c_1\) etc. Now we start making the cells smaller (and as a result we need more of them to cover \(X\)) until they become infinitely small (in technical terms, we take the limit). That is the integral. Put simply, the integral is just \(f(x_1)\mu(c_1) + \ldots + f(x_m)\mu(c_m)\) for infinitely small cells.
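
The limiting procedure described in note 6 can be illustrated numerically. What follows is only a minimal sketch under stated assumptions: the space is taken to be \(X = [0, 1]\) with the Lebesgue measure, all cells have equal size, and the point chosen in each cell is its midpoint. The function name `riemann_sum` and the example \(f(x) = x^2\) are illustrative choices, not part of the original text.

```python
def riemann_sum(f, m):
    """Approximate the integral of f over X = [0, 1] using m equal cells."""
    cell_measure = 1.0 / m                 # mu(c_i): length of each cell (assumed equal)
    total = 0.0
    for i in range(m):
        x_i = (i + 0.5) * cell_measure     # the point chosen in cell c_i (here: its midpoint)
        total += f(x_i) * cell_measure     # contributes f(x_i) * mu(c_i)
    return total


def f(x):
    # Illustrative integrand; its exact integral over [0, 1] is 1/3.
    return x ** 2


# Making the cells smaller (larger m) brings the sum closer to the integral.
for m in (10, 100, 1000):
    print(m, riemann_sum(f, m))
```

As \(m\) grows, the printed sums approach \(1/3\), the value of the integral of \(x^2\) over \([0, 1]\).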

7. The concept of ergodicity has a long and complex history. For a sketch of this history see Sklar (1993, Ch. 2) and the Appendix, Section A.

8. Sometimes EH is presented as having another level, namely C-systems (also referred to as Anosov systems or completely hyperbolic systems). Although interesting in their own right, C-systems are beyond the scope of this review. They do not have a unique place in EH and their relation to other levels of EH depends on details, which we cannot discuss here. Paradigm examples of C-systems are located between K- and B-systems; that is, they are K-systems but not necessarily B-systems. The cat map, for instance, is a C-system that is also a K-system (Lichtenberg & Liebermann, 1992, p. 307). But there are K-systems such as the so-called stadium billiard that are not C-systems (Ott, 1993, p. 262). Some C-systems preserve a smooth measure (where ‘smooth’ in this context means absolutely continuous with respect to the Lebesgue measure), in which case they are Bernoulli systems. But not all C-systems have smooth measures. It is always possible to find other measures such as SRB (Sinai, Ruelle, Bowen) measures. However, matters are more complicated in such cases, as such C-systems need not be mixing and a fortiori they need not be K- or B-systems (Ornstein & Weiss, 1991, pp. 75–82).

9. The mechanism described here is also known as *stretching and folding*. It is typical of mixing in fluids. As Christov, Lueptow and Ottino (2011) point out, the mechanism for mixing in granular media is *cutting and shuffling*.

10. To be precise, a second condition has to be satisfied: \(\alpha\) must be \(T\)-generating (Mañé 1983, 87). However, what matters for our considerations is the independence condition.

11. A formal proof can be found in Cornfeld *et al.* (1982, 9–10).

12. A similar situation exists for quantum field theory, which has a number of inequivalent formulations including (to name just a few) the canonical, algebraic, axiomatic, and path integral frameworks.

13. For detailed discussions of SM see Frigg (2008), Hemmo and Shenker (2012), Shenker (2017), Sklar (1993) and Uffink (2007); for a discussion from a mathematical point of view see Chibbaro, Rondoni and Vulpiani (2014). Those interested in the long and intricate history of SM are referred to Brush (1976) and von Plato (1994).

14. It is a common assumption in the literature on Boltzmannian SM that there is a finite number of macrostates that a system can possess. We should point out, however, that this assumption is based on an idealisation if the relevant macro variables are continuous. In fact, we obtain a finite number of macrostates only if we coarse-grain the values of the continuous variables.

15. Darrigol (2018) and Uffink (2004, 2007) provide discussions of the tangled development of Boltzmann’s constantly changing views. Frigg (2009a) and Myrvold (2016) discuss probabilities in Boltzmann’s account.

16. For discussion of the physical basis of this entropy see Maroney (2008).

17. For details see, for instance, Tolman (1938, Chs. 3 and 4). For a discussion of Brownian motion in this framework see Luczak (2016).

18. An energy hypersurface is a hypersurface in the system’s phase space on which the energy is constant. For further discussion of Khinchin’s results see Badino (2006) and Batterman (1998).

19. To be more precise: what we are after is a proof for cases where there are nontrivial interaction terms, meaning those for which there does not exist a canonical transformation that effectively eliminates such terms.

20. See for instance Lichtenberg & Liebermann (1992), Ott (1993), and Tabor (1989).

21. Provided that \(\mu\) is normalised, which is the case in most systems studied in ergodic theory. Due to their connection to \(\mu\), some may be inclined not to interpret the \(p(A^t)\) as epistemic probabilities; indeed, particularly in the literature on ergodic theory, \(\mu\) is often interpreted as a time average and so one could insist that \(p(A^t)\) be a time average as well. While this could be done, it is not conducive to our analysis. Our goal is to explicate randomness in terms of degrees of unpredictability and to this end one needs to assume that the \(p(A^t)\) are epistemic probabilities. However, contra some radical Bayesians, we posit that the values of these probabilities be constrained by objective facts about the system (here the measure \(\mu\)). But this does not make these probabilities objective. For discussion of various interpretations of probability in deterministic theories see Frigg (2016), Frigg and Hoefer (2013), Hemmo and Shenker (2014), Hoefer (2011), List and Pivato (2019), Lavis (2011), Maudlin (2011), Myrvold (2011), Strevens (2011), and Wüthrich (2011). Werndl (2009c) discusses the question whether deterministic and indeterministic descriptions of systems are observationally equivalent (in the sense that they give the same predictions); see Belanger (2013) for a discussion of her claims. For a discussion of the relation between randomness and probabilities see Eagle (2016).

22. We would also like to mention that the analysis of randomness in Bernoulli and K-systems is based on *implications* of the definitions of these systems, but they do not exhaust these definitions (or provide verbal restatements of them) because there are parts of the definition that have not been used (in the case of Bernoulli systems the condition that there be a generating partition, and in the case of K-systems sets in \(\sigma(n,r)\) that are different from \(T_k A_{j_0} \cap T_{k +1}A_{j_1} \cap T_{k +2}A_{j_2} \cap \ldots\)). By contrast, the analyses of SM, WM, and E in the following paragraphs exhaust the respective definitions. In the case of Bernoulli this has the consequence that the characterisation given here also applies to some systems that are not ergodic. For a discussion of such cases see Werndl (2009a, section 4.2.2).

23. This bit of conventional wisdom is backed up by a theorem by Markus and Meyer (1974), which is based on KAM theory and basically says that generic Hamiltonian dynamical systems are not ergodic. For a discussion of this theorem see Frigg and Werndl (2011).

24. We would like to point out that an analysis of chaos in terms of positive KS-entropy needs further qualifications. A system whose dynamics is, intuitively speaking, chaotic only on a part of the phase space can still have positive KS-entropy. A case in point is a system with \(X = [-1, 1]\) where the dynamics is the identity function on \([-1, 0)\) and the tent map on \([0, 1]\). This system has positive KS-entropy, but the dynamics of the *entire* system is not chaotic (only the part on \([0, 1]\) is). This problem can be circumvented by adding extra conditions, for instance that the system is ergodic (which the above system clearly is not).
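
The example system in note 24 can be explored with a short simulation. The following is a hedged sketch rather than a definitive implementation: it assumes the standard tent map with slope 2 on \([0, 1]\) (the note does not specify which tent map is meant), and the names `T` and `orbit` as well as the chosen initial points are purely illustrative. Exact rational arithmetic is used because floating-point iteration of a slope-2 map quickly loses all precision.

```python
from fractions import Fraction


def T(x):
    """Identity on [-1, 0); tent map (assumed slope 2) on [0, 1]."""
    if x < 0:
        return x                                  # points in [-1, 0) are fixed
    return 2 * x if x <= Fraction(1, 2) else 2 * (1 - x)


def orbit(x, n):
    """Return the point reached after applying T to x a total of n times."""
    for _ in range(n):
        x = T(x)
    return x


left = Fraction(-1, 2)                 # a point in the non-chaotic part
x = Fraction(3, 10)                    # a point in the chaotic part
y = x + Fraction(1, 10**6)             # a nearby point in the chaotic part

print(orbit(left, 10) == left)         # does the point in [-1, 0) stay put?
print(0 <= orbit(y, 10) <= 1)          # does the orbit remain inside [0, 1]?
print(abs(orbit(x, 10) - orbit(y, 10)) / abs(x - y))  # growth factor of the initial gap
```

The sketch shows the two features the note relies on: nearby points in \([0, 1]\) separate rapidly, while \([-1, 0)\) and \([0, 1]\) are each mapped into themselves. Since both sets have positive measure under the (assumed) normalised Lebesgue measure on \(X\), the system is not ergodic.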

### Notes to Appendix

25. Not all subsets of phase space points are measurable; see Royden (1968, pp. 52–65) for an explanation.