Book Review: Causality: Models, Reasoning, and Inference

Abstract

Judea Pearl's Causality: Models, Reasoning, and Inference is a triumph of clarity, insight, and impact. Causality presents a succinct synthesis of more than 25 years of contributions to causal inference. The author is a leading computer scientist and a founder of the field of Bayesian networks (he invented the term), on which Causality draws extensively. Causality is now available in an updated and corrected 2013 printing of the much-revised 2009 second edition. Since its original publication in 2000, the book’s influence has extended far beyond computer science into statistics and philosophy, and it increasingly informs causal inference in the social sciences.

Causality deals with causal reasoning. It is not a conventional statistics textbook with mechanical recipes for data analysis. The book is low on estimation advice and makes no reference to common software packages. Instead, the book teaches social scientists to differentiate between causal and statistical concepts, to link substantive theories to their observational implications, and to spot the conditions under which statistical data can substantiate causal claims.

The distinction between causal and statistical concepts is central to the book. Pearl argues, correctly, that conventional mathematical notation is ill-suited to convey causal statements. Social scientists conventionally read structural equations from right to left, such that $Y = \alpha + X\beta + \epsilon$ means that X causes Y. Suppose that Y is wearing rain boots and X is rain. Algebra permits one to rewrite the equation as $X = \frac{Y - \alpha - \epsilon}{\beta}$. How is the reader of these two algebraically equivalent equations to know that wearing rain boots does not cause rain? Clearly, conventional algebra needs to be vested with additional, and often implicit, provisos to convey causal meaning.
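The asymmetry that algebra hides can be made concrete with a small simulation. The sketch below is my own illustration, not code from the book: the coefficients, the noise distributions, and the `do_x` / `do_y` arguments are all assumptions chosen for the rain and rain-boots example. It treats the equation as an assignment (rain determines boot-wearing) and shows that overriding Y leaves X untouched, whereas overriding X shifts Y by β.

```python
import random

ALPHA, BETA = 0.1, 0.8  # hypothetical coefficients, chosen only for illustration


def simulate(n=1000, seed=0, do_x=None, do_y=None):
    """Draw n units from the structural model X := U_x, Y := ALPHA + BETA*X + eps.

    Passing do_x or do_y replaces that variable's assignment with a constant.
    This is an intervention, which is different from rearranging the equation.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u_x = rng.random()          # background cause of rain (assumed uniform)
        eps = rng.gauss(0.0, 0.05)  # background cause of boot-wearing (assumed normal)
        x = u_x if do_x is None else do_x
        y = (ALPHA + BETA * x + eps) if do_y is None else do_y
        rows.append((x, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


observed = simulate()
forced_boots = simulate(do_y=1.0)  # everyone is made to wear boots
rain_on = simulate(do_x=1.0)       # rain is switched on for everyone
rain_off = simulate(do_x=0.0)      # rain is switched off for everyone

# Forcing boots does not move rain; forcing rain moves boots by BETA.
effect_of_boots_on_rain = mean(x for x, _ in forced_boots) - mean(x for x, _ in observed)
effect_of_rain_on_boots = mean(y for _, y in rain_on) - mean(y for _, y in rain_off)
print(effect_of_boots_on_rain, effect_of_rain_on_boots)
```

Because every run reuses the same seed, the background noise is identical across runs, so the comparison isolates the intervention itself. The rewritten equation for X is still true as algebra, but nothing in the simulation treats it as a mechanism.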

Pearl avoids the ambiguities of conventional notation by introducing two new and equivalent notations specifically designed for causal information: the so-called do-operator and directed acyclic graphs (DAGs). Of the two, DAGs are certainly more important for social scientists. DAGs are a non-parametric generalization of conventional social-scientific path models. Like path models, they are essentially pictures of dots and arrows that succinctly represent the analyst’s causal knowledge (or assumptions) about the data-generating process. However, unlike path models, DAGs are completely non-parametric, that is, they make no assumptions about the distributions of the variables or the functional form of the direct causal effects between them. This makes DAGs much more general, and hence more powerful, tools.

Causality justifies the interpretation and uses of DAGs from first principles. Sociologists may find the presentation of the technical foundations in Chapters 1 and 2 challenging. However, the results and examples of Chapters 3 and 4, which present the main uses of DAGs for causal inference, can largely be enjoyed without the preceding chapters if readers are already familiar with the basics of DAGs. This is even more true for Chapter 5, which details Pearl’s engagement with “causality and structural models in social science and economics,” an insightful, often pithy, and all-around important examination of current social science practice.

The heart of Causality explains the identification of causal effects of interventions. “Identification” means the ability to extract causal information from data, that is, to strip observed statistical associations of all spurious components and so isolate a causal effect. “Causal effects of interventions” basically means causal effects in the usual manipulationist or counterfactual sense. Pearl is thus referring to the same concept of causality that has so fruitfully united statistics, philosophy, and social science methodologies over the past two decades.

Pearl is primarily interested in non-parametric identification of causal effects, that is, identification that derives entirely from qualitative causal assumptions about the data-generating process. (Other chapters, especially Chapters 5 and 8, also deal with parametric identification, including linear models and instrumental variables.) Pearl’s non-parametric focus is healthy for social science, because parametric assumptions are hardly ever defended, and often not defensible, in sociology.

The central contribution of Causality lies in elaborating on a complete solution for all non-parametric identification problems (first published in Pearl’s path-breaking 1995 paper in Biometrika). This solution, dubbed the “do-calculus,” is a set of three rules by which analysts can transform causal into observational (statistical) statements. The do-calculus determines unambiguously whether or not a causal effect of interest is identifiable from observed data given the analyst’s model of data generation. Despite its compactness, however, the do-calculus is difficult. Therefore, Chapters 3 and 4 present several remarkable and intuitive shortcuts for determining identification, which together cover more scenarios than most sociologists will ever encounter. The most famous, and most practically important, of these shortcuts is Pearl’s “backdoor criterion,” which determines which variables an analyst must, or must not, adjust for (for example, in regression analysis) to identify a causal effect. Another shortcut, dubbed the “front-door criterion,” demonstrates how to piece together total causal effects from the bits and pieces of information along the causal pathways between treatment and outcome.
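To give a flavor of what the two criteria deliver, the following sketch computes both adjustment formulas exactly on two small binary models. The models and all their probabilities are hypothetical, invented for this illustration. In the first, an observed Z confounds X and Y, and the backdoor formula $\sum_z P(y \mid x, z)\,P(z)$ applies. In the second, the confounder U is unobserved but a mediator M carries the whole effect, and the front-door formula $\sum_m P(m \mid x) \sum_{x'} P(y \mid m, x')\,P(x')$ applies.

```python
from itertools import product

BITS = (0, 1)


def bern(p_one, value):
    """Probability that a Bernoulli(p_one) variable equals `value`."""
    return p_one if value == 1 else 1.0 - p_one


def backdoor_joint():
    """Observed P(z, x, y) for the hypothetical model Z -> X, Z -> Y, X -> Y."""
    return {
        (z, x, y): bern(0.5, z) * bern(0.2 + 0.6 * z, x) * bern(0.1 + 0.3 * x + 0.5 * z, y)
        for z, x, y in product(BITS, repeat=3)
    }


def naive(joint, x):
    """P(Y=1 | X=x) by plain conditioning, which Z confounds."""
    num = sum(joint[(z, x, 1)] for z in BITS)
    den = sum(joint[(z, x, y)] for z in BITS for y in BITS)
    return num / den


def backdoor_adjust(joint, x):
    """sum_z P(Y=1 | x, z) P(z), adjusting for the backdoor set {Z}."""
    result = 0.0
    for z in BITS:
        p_z = sum(joint[(z, xv, y)] for xv in BITS for y in BITS)
        p_y_given_xz = joint[(z, x, 1)] / (joint[(z, x, 0)] + joint[(z, x, 1)])
        result += p_y_given_xz * p_z
    return result


def frontdoor_joint():
    """Observed P(x, m, y) for U -> X, U -> Y, X -> M -> Y, with U summed out."""
    joint = {}
    for u, x, m, y in product(BITS, repeat=4):
        p = (bern(0.5, u) * bern(0.2 + 0.6 * u, x)
             * bern(0.1 + 0.7 * x, m) * bern(0.1 + 0.4 * m + 0.4 * u, y))
        joint[(x, m, y)] = joint.get((x, m, y), 0.0) + p
    return joint


def frontdoor_adjust(joint, x):
    """sum_m P(m | x) sum_x' P(Y=1 | m, x') P(x'), using observed variables only."""
    p_x = {v: sum(joint[(v, m, y)] for m in BITS for y in BITS) for v in BITS}
    result = 0.0
    for m in BITS:
        p_m_given_x = (joint[(x, m, 0)] + joint[(x, m, 1)]) / p_x[x]
        inner = sum(
            joint[(xp, m, 1)] / (joint[(xp, m, 0)] + joint[(xp, m, 1)]) * p_x[xp]
            for xp in BITS
        )
        result += p_m_given_x * inner
    return result


bd = backdoor_joint()
fd = frontdoor_joint()
naive_contrast = naive(bd, 1) - naive(bd, 0)
backdoor_effect = backdoor_adjust(bd, 1) - backdoor_adjust(bd, 0)
frontdoor_effect = frontdoor_adjust(fd, 1) - frontdoor_adjust(fd, 0)
print(naive_contrast, backdoor_effect, frontdoor_effect)
```

In the first model the naive contrast is about 0.6, while the backdoor adjustment recovers the 0.3 built into the structural equation for Y. In the second, the front-door formula recovers the built-in effect of 0.4 × 0.7 = 0.28 even though U never appears in the observed table. Neither function decides whether a criterion holds for a given graph; that judgment rests on the causal assumptions, which is the point of the book.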

Applying the backdoor and front-door criteria to various examples, Pearl startles the reader with surprising insights. For example, Pearl shows that controlling for a certain type of pretreatment covariate can increase rather than decrease bias; that it is not generally necessary (although it is sufficient) to control for all causes of the treatment (i.e., the “treatment assignment mechanism”); and that it is similarly not generally necessary to exhaustively model the outcome.
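The first of these insights is easy to verify numerically. The sketch below uses a hypothetical binary model of my own construction (none of the numbers come from the book) in which a pretreatment covariate Z is a common effect of two unobserved variables: U1, which also causes the treatment X, and U2, which also causes the outcome Y. No backdoor path between X and Y is open, so the unadjusted contrast is already unbiased. Conditioning on Z opens the path through U1 and U2 and introduces bias.

```python
from itertools import product

BITS = (0, 1)
TRUE_EFFECT = 0.3  # coefficient on X in the hypothetical equation for Y below


def bern(p_one, value):
    """Probability that a Bernoulli(p_one) variable equals `value`."""
    return p_one if value == 1 else 1.0 - p_one


def observed_joint():
    """P(z, x, y) for U1 -> Z <- U2, U1 -> X, U2 -> Y, X -> Y (U1, U2 unobserved)."""
    joint = {}
    for u1, u2, z, x, y in product(BITS, repeat=5):
        p = (bern(0.5, u1) * bern(0.5, u2)
             * bern(0.1 + 0.4 * u1 + 0.4 * u2, z)       # Z is a pretreatment collider
             * bern(0.2 + 0.6 * u1, x)
             * bern(0.1 + TRUE_EFFECT * x + 0.5 * u2, y))
        joint[(z, x, y)] = joint.get((z, x, y), 0.0) + p
    return joint


def p_y_given_x(joint, x):
    """P(Y=1 | X=x) with no adjustment."""
    num = sum(joint[(z, x, 1)] for z in BITS)
    den = sum(joint[(z, x, y)] for z in BITS for y in BITS)
    return num / den


def p_y_adjusted_for_z(joint, x):
    """sum_z P(Y=1 | x, z) P(z): the standard adjustment, wrongly applied to Z."""
    result = 0.0
    for z in BITS:
        p_z = sum(joint[(z, xv, y)] for xv in BITS for y in BITS)
        p_y_given_xz = joint[(z, x, 1)] / (joint[(z, x, 0)] + joint[(z, x, 1)])
        result += p_y_given_xz * p_z
    return result


joint = observed_joint()
unadjusted = p_y_given_x(joint, 1) - p_y_given_x(joint, 0)
adjusted = p_y_adjusted_for_z(joint, 1) - p_y_adjusted_for_z(joint, 0)
print(unadjusted, adjusted)
```

Under these assumed numbers the unadjusted contrast equals the true effect of 0.3, while the Z-adjusted contrast falls to roughly 0.25. The size and sign of the bias depend entirely on the invented parameters; the qualitative lesson is that "pretreatment" does not by itself make a covariate safe to control for.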

Much of Causality is presented rather abstractly. Nonetheless, the book frequently shines in the presentation of incisive social science examples. For example, Pearl distinguishes between different types of direct effects and shows that employment discrimination (and, by extension, race and sex discrimination) turns on direct effects with interactions between the treatment (e.g., race) and a mediator (e.g., qualification). Conventional mediation analysis, of course, struggles to accommodate interactions. Modern causal mediation analysis, as presented in Section 4.5, by contrast, generalizes conventional concepts and naturally incorporates interactions.

Methodologists and applied social scientists will find much in Causality to inspire new applications. Several of Pearl’s ideas have, to my knowledge, never been tried in sociology. For example, Pearl discusses “stochastic policies,” which involve treatments that are to be applied with a given probability rather than with certainty. As another example, a section on “surrogate experiments” gives a solution for dealing with treatments that are not observationally identified and cannot be experimentally manipulated. The trick is to manipulate experimentally some antecedent of the treatment so as to render the treatment observationally identifiable.
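A stochastic policy is simple to evaluate once the stratum-specific effects of the treatment are identified. The sketch below is a hedged illustration under assumptions of my own: a binary covariate Z that is not affected by the treatment, a made-up function for $P(Y = 1 \mid do(x), z)$ standing in for quantities that would have to be identified first, and a policy that treats units in stratum z with probability `policy[z]`. The expected outcome is then $\sum_z \sum_x P(y \mid do(x), z)\,P^*(x \mid z)\,P(z)$, where $P^*$ denotes the policy.

```python
BITS = (0, 1)

# Hypothetical ingredients, assumed to have been identified already.
P_Z = {0: 0.5, 1: 0.5}  # P(Z = z)


def p_y_do_x_given_z(x, z):
    """Assumed P(Y=1 | do(X=x), Z=z); the coefficients are invented."""
    return 0.1 + 0.3 * x + 0.5 * z


def outcome_under_policy(policy):
    """P(Y=1) when X is set to 1 with probability policy[z] in stratum z."""
    total = 0.0
    for z in BITS:
        for x in BITS:
            p_x = policy[z] if x == 1 else 1.0 - policy[z]  # P*(x | z)
            total += p_y_do_x_given_z(x, z) * p_x * P_Z[z]
    return total


treat_all = outcome_under_policy({0: 1.0, 1: 1.0})   # deterministic do(X=1)
treat_none = outcome_under_policy({0: 0.0, 1: 0.0})  # deterministic do(X=0)
targeted = outcome_under_policy({0: 0.9, 1: 0.1})    # mostly treat the Z=0 stratum
print(treat_all, treat_none, targeted)
```

Deterministic interventions appear here as the special case in which the policy probabilities are 0 or 1, which is one way to see stochastic policies as a generalization rather than a separate technique.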

Pearl's book fills an important space by providing an accessible formal language for reasoning about causality. While the technical details of the book require a background in basic probability theory, the tools provided can be used by sociologists who are not also methodologists. Since causality is the common currency of most science, Pearl’s language empowers social scientists to communicate causal models with each other across sub-disciplines. By carefully distinguishing causal from statistical concepts, Pearl’s book also eliminates important ambiguities and enables social scientists to communicate more effectively with statistical methodologists. It is the intersection between causal reasoning and statistical data that produces new causal knowledge.