Where there is no single Theory of Change: The uses of Decision Tree models

Eliciting tacit and multiple Theories of Change

Rick Davies, November 2012. Available as pdf and a 4-page summary version

This paper begins by identifying situations where a theory-of-change-led approach to evaluation can be difficult, if not impossible. It then introduces the idea of systematic rather than ad hoc data mining and the types of data mining approaches that exist. The rest of the paper focuses on one data mining method known as Decision Trees (also known as Classification Trees). The merits of Decision Tree models are spelled out, and the processes of constructing Decision Trees are then explained. These include the use of computerised algorithms and of ethnographic methods, using expert inquiry and more participatory processes. The relationships of Decision Tree analyses to related methods are then explored, specifically Qualitative Comparative Analysis (QCA) and Network Analysis. The final section of the paper identifies potential applications of Decision Tree analyses, covering the elicitation of tacit and multiple Theories of Change, the analysis of project-generated data, and the meta-analysis of data from multiple evaluations. Readers are encouraged to explore these usages.
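To make the algorithmic side concrete, here is a minimal Python sketch of the greedy splitting step at the heart of decision-tree algorithms such as CART: pick the attribute whose split most reduces impurity (here, Gini impurity). The case data and attribute names are invented for illustration only.

```python
from collections import Counter

def gini(outcomes):
    # Gini impurity of a list of binary outcomes (0.0 = perfectly pure).
    n = len(outcomes)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(outcomes).values())

def best_split(cases, attributes):
    # Choose the attribute whose split yields the lowest weighted impurity.
    def split_impurity(attr):
        left = [c["outcome"] for c in cases if c[attr]]
        right = [c["outcome"] for c in cases if not c[attr]]
        n = len(cases)
        return (len(left) * gini(left) + len(right) * gini(right)) / n
    return min(attributes, key=split_impurity)

# Hypothetical project cases: binary conditions plus an observed outcome.
cases = [
    {"trained": True,  "funded": True,  "outcome": True},
    {"trained": True,  "funded": False, "outcome": True},
    {"trained": False, "funded": True,  "outcome": False},
    {"trained": False, "funded": False, "outcome": False},
]
print(best_split(cases, ["trained", "funded"]))  # "trained": it perfectly separates outcomes
```

A full algorithm would apply this step recursively to each branch until the leaves are pure (or some stopping rule is met); the sketch only shows the choice of the first split.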

Included in the list of merits of Decision Tree models is the possibility of differentiating necessary and/or sufficient causal conditions, and of assessing the extent to which a cause is a contributory cause (à la Mayne).
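As a rough illustration of the kind of necessary/sufficient distinction such an analysis can surface, here is a minimal sketch of screening case data for the two properties. The condition names and cases are invented, not taken from any real dataset.

```python
# Hypothetical project cases: each maps condition names to True/False,
# plus the observed outcome.
cases = [
    {"training": True,  "funding": True,  "local_support": True,  "outcome": True},
    {"training": True,  "funding": False, "local_support": True,  "outcome": False},
    {"training": False, "funding": True,  "local_support": True,  "outcome": False},
    {"training": True,  "funding": True,  "local_support": False, "outcome": True},
]

def necessary(condition, cases):
    # A necessary condition is present in every case where the outcome occurred.
    return all(c[condition] for c in cases if c["outcome"])

def sufficient(condition, cases):
    # A sufficient condition is always accompanied by the outcome when present.
    return all(c["outcome"] for c in cases if c[condition])

print(necessary("training", cases))   # True: every success had training
print(sufficient("training", cases))  # False: training alone did not guarantee success
```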

Comments on this paper are being sought. Please post them below or email Rick Davies at rick@mande.co.uk

Separate but related:

See also: An example application of Decision Tree (predictive) models (10th April 2013)

Postscript 2013 03 20: Probably the best book on Decision Tree algorithms is:

Rokach, Lior, and Oded Z. Maimon. Data Mining with Decision Trees: Theory and Applications. World Scientific, 2008. A pdf copy is available


A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences

Gary Goertz & James Mahoney, 2012
Princeton University Press. Available on Amazon

Review of the book by Dan Hirschman

Excerpts from his review:

“Goertz, a political scientist, and Mahoney, a sociologist, attempt to make sense of the different cultures of research in these two camps without attempting to apply the criteria of one to the other. In other words, the goal is to illuminate difference and similarity rather than judge either approach (or, really, affiliated collection of approaches) as deficient by a universal standard.

G&M are interested in quantitative and qualitative approaches to causal explanation.

Onto the meat of the argument. G&M argue that the two cultures of quantitative and (causal) qualitative research differ in how they understand causality, how they use mathematics, how they privilege within-case vs. between-case variation, how they generate counterfactuals, and more. G&M argue, perhaps counter to our expectations, that both cultures have answers to each of these questions, and that the answers are reasonably coherent across cultures, but create tensions when researchers attempt to evaluate each others’ research: we mean different things, we emphasize different sorts of variation, and so on. Each of these differences is captured in a succinct chapter that lays out in incredible clarity the basic choices made by each culture, and how these choices aggregate up to very different models of research.

Perhaps the most counterintuitive, but arguably most rhetorically important, is the assertion that both quant and qual research are tightly linked to mathematics. For quant research, the connection is obvious: quantitative research relies heavily on probability and statistics. Causal explanation consists of statistically identifying the average effect of a treatment. For qual research, the claim is much more controversial. Rather than relying on statistics, G&M assert that qualitative research relies on logic and set theory, even if this reliance is often implicit rather than formal. G&M argue that at the core of explanation in the qualitative culture are the set theoretic/logical criteria of necessary and sufficient causes. Combinations of necessary and sufficient explanations constitute causal explanations. This search for non-trivial necessary and sufficient conditions for the appearance of an outcome shapes the choices made in the qualitative culture, just as the search for significant statistical variation shapes quantitative research. G&M include a brief review of basic logic, and a quick overview of the fuzzy-set analysis championed by Charles Ragin. I had little prior experience with fuzzy sets (although plenty with formal logic), and I found this chapter extremely compelling and provocative. Qualitative social science works much more often with the notion of partial membership – some countries are not quite democracies, while others are completely democracies, and others are completely not democracies. This fuzzy-set approach highlights the non-linearities inherent in partial membership, as contrasted with quantitative approaches that would tend to treat “degree of democracy” as a smooth variable.”
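The non-linearity point can be sketched in a few lines of Python. The calibration below, loosely in the spirit of Ragin's fuzzy-set approach, maps a raw "democracy score" into fuzzy membership in the set of democracies; the thresholds and scores are invented for illustration.

```python
def democracy_membership(score):
    # Calibrate a raw 0-10 score into fuzzy membership in the set of
    # democracies: 0.0 = fully out, 1.0 = fully in, crossover point at 5.
    full_out, crossover, full_in = 2.0, 5.0, 8.0
    if score <= full_out:
        return 0.0
    if score >= full_in:
        return 1.0
    if score < crossover:
        return 0.5 * (score - full_out) / (crossover - full_out)
    return 0.5 + 0.5 * (score - crossover) / (full_in - crossover)

print(democracy_membership(9.0))  # 1.0  - fully in the set
print(democracy_membership(6.5))  # 0.75 - more in than out
print(democracy_membership(1.0))  # 0.0  - fully out, not merely a "low" democracy
```

Unlike a smooth variable, membership is flat outside the calibrated thresholds: a score of 9 and a score of 10 are equally "completely democracies", which is exactly the non-linearity the review describes.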

Earlier paper by same authors available as pdf: A Tale of Two Cultures: Contrasting Quantitative and Qualitative Research
by James Mahoney, Gary Goertz. Political Analysis (2006) 14:227–249 doi:10.1093/pan/mpj017

See also: The Logic of Process Tracing Tests in the Social Sciences by James Mahoney, Sociological Methods & Research, XX(X), 1–28, published online 2 March 2012

RD comment: This book is recommended reading!

PS 15 February 2013: See Howard White’s new blog posting “Using the causal chain to make sense of the numbers”, where he provides examples of the usefulness of simple set-theoretic analyses of the kind described by Mahoney and Goertz (e.g. in an analysis of arguments about why Gore lost to Bush in Florida)


Making causal claims

by John Mayne. ILAC Brief, October 2012 Available as pdf

“An ongoing challenge in evaluation is the need to make credible causal claims linking observed results to the actions of interventions. In the very common situation where the intervention is only one of a number of causal factors at play, the problem is compounded – no one factor ’caused’ the result. The intervention on its own is neither necessary nor sufficient to bring about the result. The Brief argues the need for a different perspective on causality. One can still speak of the intervention making a difference in the sense that the intervention was a necessary element of a package of causal factors that together were sufficient to bring about the results. It was a contributory cause. The Brief further argues that theories of change are models showing how an intervention operates as a contributory cause. Using theories of change, approaches such as contribution analysis can be used to demonstrate that the intervention made a difference – that it was a contributory cause – and to explain how and why.”

See also Making Causal Claims by John Mayne at IPDET 2012, Ottawa

RD Comments:

What I like in this paper: the definition of a contributory cause as something neither necessary nor sufficient on its own, but a necessary part of a package of causes that is sufficient for an outcome to occur.
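That definition can be made concrete with a small sketch. The cases below are invented: they show an intervention that is neither necessary nor sufficient by itself, yet is a necessary part of a package of conditions that together are sufficient, i.e. a contributory cause in Mayne's sense.

```python
# Hypothetical cases: (intervention, favourable_context, other_programme, outcome)
cases = [
    (True,  True,  False, True),   # intervention + favourable context -> outcome
    (True,  False, False, False),  # intervention alone: no outcome (not sufficient)
    (False, True,  True,  True),   # outcome also occurs without it (not necessary)
    (False, True,  False, False),
]

def sufficient_package(members, cases):
    # The package is sufficient if the outcome occurs in every case
    # where all member conditions (given by index) hold.
    return all(o for (i, c, p, o) in cases
               if all([i, c, p][m] for m in members))

print(sufficient_package([0], cases))     # False: intervention alone is not sufficient
print(sufficient_package([0, 1], cases))  # True: the package {intervention, context} is
print(sufficient_package([1], cases))     # False: drop the intervention and sufficiency is lost
```

The last two lines are the crux: the package is sufficient only while the intervention is in it, which is what licenses the claim that the intervention "made a difference".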

I also like the view that “theories of change are models of causal sufficiency”

But I query the usefulness of distinguishing between contributory causes that are triggering causes, sustaining causes and enabling causes, mainly on the grounds of the difficulty of reliably identifying them.

I am more concerned with the introduction of probabilistic statements about “likely” necessity and “likely” sufficiency, because this increases the ease with which claims of causal contribution can be made, perhaps far too much. Michael Patton recently expressed a related anxiety: “There is a danger that as stakeholders learn about the non-linear dynamics of complex systems and come to value contribution analysis, they will be inclined to always find some kind of linkage between implemented activities and desired outcomes… In essence, the concern is that treating contribution as the criterion (rather than direct attribution) is so weak that a finding of no contribution is extremely unlikely.”

John Mayne’s paper distinguishes between four approaches to demonstrating causality (adapted from Stern et al., 2012:16-17):

  • “Regularity frameworks that depend on the frequency of association between cause and effect – the basis for statistical approaches making causal claims
  • Counterfactual frameworks that depend on the difference between two otherwise identical cases – the basis for experimental and quasi-experimental approaches to making causal claims
  • Comparative frameworks that depend on combinations of causes that lead to an effect – the basis for ‘configurational’ approaches to making causal claims, such as qualitative comparative analysis
  • Generative frameworks that depend on identifying the causal links and mechanisms that explain effects – the basis for theory-based and realist approaches to making causal claims.”
I would simplify these into two broad categories, with sub-categories:
  • Claims can be made about the co-variance of events
    • Counterfactual approaches: describing the effects of the absence and presence of an intervention on an outcome of interest, with all other conditions kept the same
    • Configurational approaches, describing the effects of the presence and absence of multiple conditions (relating to both context and intervention)
    • Statistical approaches, describing the effects of more complex mixes of variables
  • Claims can be made about causal mechanisms underlying each co-variance that is found

Good causal claims contain both: evidence of co-variance and plausible or testable explanations of why each co-variance exists. One without the other is insufficient. You can start with theory (a proposed mechanism) and look for supporting co-variance, or start with a co-variance and look for a supporting mechanism. Currently, theory-led approaches are in vogue.

For more on causal mechanisms, see Causal Mechanisms in the Social Sciences by Peter Hedstrom and Petri Ylikoski
See also my blog posting on Representing different combinations of causal conditions, for a means of distinguishing different configurations of necessary and sufficient conditions.

What Causes What & Hypothesis testing: Truth and Evidence

Two very useful chapters in Denise Cummins (2012) “Good Thinking”, Cambridge University Press

Cummins is a professor of psychology and philosophy, both of which she brings to bear in this great book. Read an interview with the author here

Contents include:

1. Introduction
2. Rational choice: choosing what is most likely to give you what you want
3. Game theory: when you’re not the only one choosing
4. Moral decision-making: how we tell right from wrong
5. The game of logic
6. What causes what?
7. Hypothesis testing: truth and evidence
8. Problem solving: another way of getting what you want
9. Analogy: this is like that.

Models of Causality and Causal Inference

by Barbara Befani. An annex to BROADENING THE RANGE OF DESIGNS AND METHODS FOR IMPACT EVALUATIONS, the report of a study commissioned by the Department for International Development, April 2012, by Elliot Stern (Team Leader), Nicoletta Stame, John Mayne, Kim Forss, Rick Davies and Barbara Befani


The notion of causality has given rise to disputes among philosophers which still continue today. At the same time, attributing causation is an everyday activity of the utmost importance for humans and other species, one that most of us carry out successfully outside the corridors of academic departments. How do we do that? And what are the philosophers arguing about? This chapter will attempt to provide some answers, by reviewing some of the notions of causality in the philosophy of science and “embedding” them in everyday activity. It will also attempt to connect these with impact evaluation practices, without embracing one causation approach in particular, but stressing the strengths and weaknesses of each and outlining how they relate to one another. It will be stressed how everyday life, social science and in particular impact evaluation all have something to learn from these approaches, each illuminating single, separate, specific aspects of the relationship between cause and effect. The paper is divided into three parts: the first addresses notions of causality that focus on the simultaneous presence of a single cause and the effect; alternative causes are rejected depending on whether they are observed together with the effect. The basic causal unit is the single cause, and alternatives are rejected in the form of single causes. This model includes multiple causality in the form of single independent contributions to the effect. In the second part, notions of causality are addressed that focus on the simultaneous presence of multiple causes that are linked to the effect as a “block” or whole: the block can be either necessary or sufficient (or neither) for the effect, and single causes within the block can be necessary for the block to be sufficient (INUS causes).
The third group discusses models of causality where simultaneous presence is not enough: in order to be defined as such, causes need to be shown to actively manipulate or generate the effect; the focus is on how the effect is produced, how the change comes about. The basic unit here – rather than a single cause or a package – is the causal chain: fine-grained information is required on the process leading from an initial condition to the final effect.

The second type of causality is something in between the first and third: it is used when there is no fine-grained knowledge of how the effect is manipulated by the cause, yet the presence or absence of a number of conditions can still be spotted along the causal process, which is thus more detailed than the bare “beginning-end” linear representation characteristic of the successionist model.


RD Comment: I strongly recommend this paper

For more on necessary and/or sufficient conditions, see this blog posting, which shows how different combinations of causal conditions can be visually represented and recognised using Decision Trees.