#### Supplement to Causal Models

## Supplement 3. Further Topics in Causal Inference

This supplement briefly surveys some more advanced topics in causal inference and points to some references. Two recent review articles, Eberhardt (2017) and Spirtes & Zhang (2016), survey many of these topics.

*Portability*: We are often interested in exporting a causal
inference made in one context to a novel context. For example, we may
conduct an experiment establishing that a certain development program,
such as micro-lending, is successful in one country. We may then be
interested in whether the same program will be successful in a
different country, with different social institutions. Bareinboim and
Pearl (2013, 2014) explore conditions in which this is possible.

*Inferences from sample data*: We have largely focused on what
can be inferred about causal structure if one knows the true
probability distribution. In practice, of course, we must make
inferences from finite sample data. This raises particular problems
for causal inference. Standard statistical methods allow us to
*reject* the hypothesis that two variables are
probabilistically independent, but they never allow us to
*accept* such a hypothesis (or reject probabilistic
dependence). This raises the issue of how we might confirm a causal
hypothesis that entails a relation of probabilistic independence
between variables. Strategies include assigning Bayesian prior
probabilities and updating (Claassen & Heskes 2012; Cooper &
Herskovits 1992; Geiger & Heckerman 1994), using cost functions
that weigh goodness of fit with data against complexity (Hyttinen et
al. 2014; Triantafillou & Tsamardinos 2015), and
learning-theoretic approaches (Schulte et al. 2010).
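The asymmetry between rejecting and accepting independence can be illustrated with a small calculation. The sketch below is not drawn from any of the works cited. It assumes, purely for illustration, a Fisher z-test for zero correlation between two jointly normal variables; the function name `fisher_z_p_value` and the values of `r` and `n` are hypothetical. It shows that one and the same weak sample correlation fails to reach significance in a small sample but is clearly significant in a large one, so a failure to reject independence cannot by itself be read as evidence for independence.

```python
import math

def fisher_z_p_value(r: float, n: int) -> float:
    """Two-sided p-value for the null hypothesis of zero correlation.

    Uses the Fisher z-transformation, under which atanh(r) * sqrt(n - 3)
    is approximately standard normal when the true correlation is zero.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers: the same weak sample correlation at two sample sizes.
r = 0.05
p_small = fisher_z_p_value(r, n=50)
p_large = fisher_z_p_value(r, n=10_000)

print(f"n = 50:     p = {p_small:.3f}")
print(f"n = 10000:  p = {p_large:.2e}")
```

At a conventional 0.05 threshold, the small sample does not license rejecting independence while the large sample does, although the sample correlation is identical in both cases.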

*Relational causal models*: As mentioned in the previous
paragraph, causal inference frequently requires that we make
inferences about a probability distribution from sample data. The most
commonly employed methods assume that sampling is independent. This
independence assumption can be violated when individuals in the sample
population stand in certain kinds of causal or non-causal
relationships. For instance, suppose that we are conducting an
epidemiological study about risk factors for a particular disease. If
some of the subjects in our study have come in contact with one
another, they may spread the disease among themselves. This is a case
of inter-unit causation. Or suppose that we wish to study the factors
that influence which academic authors are widely cited. If two of the
subjects in our study are co-authors, their citation rates will not be
independent. In this case, the relationship between the subjects in
the study population is non-causal. Representative work on this topic
includes Maier et al. 2010; Shalizi & Thomas 2011; Schulte &
Khosravi 2012; and Maier et al. 2013.

*Combining evidence from different studies*: Sometimes we have
data available from different studies. These studies may employ
different methods; e.g., some may involve experimental interventions
while others are observational. They may investigate overlapping but
non-identical sets of variables. And they may reach incompatible
conclusions. Strategies for drawing inferences about causal structure
from diverse studies involve deriving constraints from individual
studies and searching for causal models that optimally satisfy the
constraints. See, e.g., Hyttinen et al. 2014; Tillman & Eberhardt
2014; and Bareinboim & Pearl 2015.

*Time series*: Often we are interested in tracking the state of
a system over a period of time. In econometrics, we may be interested
in changes in inflation, interest rates, unemployment, and government
spending from month to month or year to year. The unemployment rate in
one month may affect inflation in the next month, and it may also
affect unemployment in the following month. To represent such a
system, we can have multiple copies of the same set of variables:
e.g., we might have different variables for the rate of unemployment
in May, and for the rate in June. Then we could look for causal
relationships among these time-indexed variables. But complications
can arise if we do not have separate observations of each time period.
For instance, perhaps we only have quarterly data on unemployment,
inflation, and so on, while unemployment has an impact on inflation
rates within one month. Or perhaps we only have aggregate data that
combines observations from multiple time periods. Another problem is
that there may be latent common causes that are changing over the
course of the time scales that we are observing. See Eichler 2012 for
an overview of these issues, and Danks & Plis 2014; Gong et al.
2015; Hyttinen et al. 2016; and Gong et al. 2017 for some recent
approaches.
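The problem of coarse time scales can be made concrete with a toy linear system. The sketch below is an illustration under stated assumptions, not the method of any of the papers cited: the three variables, the coefficient matrix `A`, and the helper `mat_mul` are all hypothetical. It assumes a noise-free linear system in which each variable at one time step depends only on variables at the previous step, with a causal chain from X to Y to Z. If the system is observed only at every second step, the effective coefficients are given by the product of `A` with itself, and a nonzero dependence of Z on X appears even though X has no direct one-step effect on Z.

```python
# Rows index the effect variable and columns the cause variable, in the
# order X, Y, Z. A[i][j] is the one-step influence of variable j on
# variable i. All coefficients are hypothetical.
A = [
    [0.5, 0.0, 0.0],  # X(t+1) depends on X(t)
    [0.4, 0.5, 0.0],  # Y(t+1) depends on X(t) and Y(t)
    [0.0, 0.3, 0.5],  # Z(t+1) depends on Y(t) and Z(t)
]

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

# Observing only every second step: the state at t+2 is A applied twice
# to the state at t, so the effective coefficient matrix is A times A.
A2 = mat_mul(A, A)

one_step_x_to_z = A[2][0]   # zero: no direct edge from X to Z
two_step_x_to_z = A2[2][0]  # nonzero: X reaches Z through Y

print("one-step X -> Z coefficient:", one_step_x_to_z)
print("two-step X -> Z coefficient:", two_step_x_to_z)
```

A search procedure applied to the coarser data would therefore be liable to posit a direct connection from X to Z that is absent at the true causal time scale.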

*Dynamic systems*: In physics and other sciences, we are often
interested in modeling the evolution of a system over time. The state
of the system at a time is represented by a set of variables, and the
way in which these variables change over time is represented by a set
of differential equations. These systems can also be represented using
causal models with discrete temporal stages. One question we may ask
about these systems is whether they will evolve toward a stable
equilibrium. Another question is how the causal relations governing
the evolution of the system relate to the causal relations governing
the equilibrium state. For a simple example, increasing the rate at
which water pours into a cup at time \(t\) may affect the amount of
water present in the cup at time \(t + 1\); but increasing the rate at
which water pours into the cup over a period of time will not affect
the volume of water that is present at equilibrium (when the flow of
water in equals the flow of water out), which is determined solely by
the capacity of the cup. Modeling such dynamic systems raises a number
of conceptual and technical challenges. In particular, it requires
greater flexibility in the way we represent interventions. For
instance, we must distinguish between interventions that change the
value of a variable at one time, and interventions that fix the value
of the variable at all times. Also, the variables characterizing
dynamic systems typically include time derivatives, or discrete
differences, of other variables. For instance, to model the trajectory
of a body in classical physics, we must represent both its position
and its velocity at each time. This means that it will not be possible
to intervene on all of the variables independently: an intervention
that fixes the position of the body over time will also set the
velocity of the body to zero. So our representation will need to
encode information about which variables are mathematically related to
others as derivatives or integrals, and we will need a mechanism to
capture the effects of interventions on such variables. Work on these
problems was pioneered by Denver Dash; see, for example, Dash &
Druzdzel 2001. See also Mooij et al. 2013 for a more recent
discussion.
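The cup example can be sketched as a discrete-time simulation. The model below is a deliberately crude illustration and is not taken from the works cited: it assumes that the cup starts empty, that water flows in at a constant rate per time step, and that any water beyond the cup's capacity spills over the rim. The function `simulate_cup` and all numerical values are hypothetical. Setting `inflow_rate` for the whole run corresponds to an intervention that fixes a variable at all times, as opposed to one that changes its value at a single time.

```python
def simulate_cup(inflow_rate: float, capacity: float, steps: int) -> list[float]:
    """Return the volume of water in the cup at times 0, 1, ..., steps.

    Assumed dynamics: volume(t + 1) = min(capacity, volume(t) + inflow_rate).
    Water in excess of the capacity spills over the rim.
    """
    volumes = [0.0]
    for _ in range(steps):
        volumes.append(min(capacity, volumes[-1] + inflow_rate))
    return volumes

capacity = 10.0
slow = simulate_cup(inflow_rate=1.0, capacity=capacity, steps=50)
fast = simulate_cup(inflow_rate=2.0, capacity=capacity, steps=50)

# Short run: the inflow rate makes a difference to the next volume.
print("volume at t = 1:", slow[1], "versus", fast[1])
# Long run: both trajectories settle at the capacity of the cup.
print("volume at t = 50:", slow[-1], "versus", fast[-1])
```

In this toy model the inflow rate is a cause of the volume at the next time step, yet it makes no difference to the equilibrium volume, which depends only on `capacity`.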

*Cycles*: Actual causation is usually assumed to be asymmetric:
if \(C\) causes \(E\) then \(E\) does not cause \(C\). But at the
general level, there can be cycles. For example, in familiar supply
and demand models of economics, the price of a good (such as an
iPhone) affects the level of demand for that good; and the level of
demand for the good affects its price. If we add time indices to the
variables, it will be possible to eliminate the cycle. For instance,
if Apple drops the price of an iPhone at noon on a Monday, that will
cause demand to increase starting on Monday afternoon. If demand for
iPhones suddenly increases, that may cause Apple to increase the price
shortly afterward. However, if we collect data on prices and demand
levels over a period of time, it may not be possible to separate out
the variables in this way. In this case, we would represent the causal
relationships under investigation with a graph that includes cycles.
If we assume that the underlying causal relationships are linear, we
can still infer a good deal about models with cycles. But the general
case poses serious challenges. See Hyttinen et al. 2013b; Neal 2000;
Pearl & Dechter 1996; and Spirtes 1995.
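The linear case can be illustrated with a two-variable sketch. The equations, coefficients, and names below are hypothetical and are not taken from the works cited. The sketch assumes that price responds linearly to demand and demand responds linearly to price, with no noise terms, and that the product of the two slope coefficients is less than one in absolute value, so that repeated adjustment converges. Iterating the two equations in alternation corresponds to adding time indices, which removes the cycle; the values on which the iteration settles solve the cyclic system of simultaneous equations.

```python
# Hypothetical linear structural equations with a cycle:
#   price  = A_PRICE  + B * demand   (higher demand raises the price)
#   demand = A_DEMAND + C * price    (a higher price lowers demand)
A_PRICE, B = 10.0, 0.5
A_DEMAND, C = 100.0, -0.8

def equilibrium_closed_form() -> tuple[float, float]:
    """Solve the two simultaneous equations directly (requires B * C != 1)."""
    price = (A_PRICE + B * A_DEMAND) / (1.0 - B * C)
    demand = A_DEMAND + C * price
    return price, demand

def equilibrium_by_iteration(steps: int = 200) -> tuple[float, float]:
    """Unroll the cycle in time.

    At each step, demand responds to the previous price, and then the
    price responds to the new level of demand.
    """
    price, demand = 0.0, 0.0
    for _ in range(steps):
        demand = A_DEMAND + C * price
        price = A_PRICE + B * demand
    return price, demand

price_exact, demand_exact = equilibrium_closed_form()
price_iter, demand_iter = equilibrium_by_iteration()

print("closed form:", price_exact, demand_exact)
print("iteration:  ", price_iter, demand_iter)
```

The agreement between the two computations depends on the convergence assumption; when the product of the slopes is one or greater in absolute value, the iteration need not settle.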

*Macro-variables*: A sample of a gas consists of a huge number
of molecules, each of which has a position and momentum (if we ignore
quantum mechanics). But we can predict many features of the behavior
of the gas using macro-variables like pressure and temperature.
Methods exist for determining when it is possible to construct
macro-variables for use in causal inference. See Chalupka et al. 2017
for methods and applications.

In addition to these topics in causal inference, there is much important work on the use of Bayes nets for computations. Pearl (1988) is a technical work on Bayes nets and other graphical representations of probability, although it is not focused on causation. Neapolitan (2004) is a textbook that treats Bayes nets in causal and noncausal contexts. Neapolitan & Jiang (2016) is a short overview of this topic.