Papers and discussion on the evaluation of climate change mitigation

Some recent papers

See also the website: Climate-eval: Sharing best practices on climate change and development evaluation

Where there is no single Theory of Change: The uses of Decision Tree models

Eliciting tacit and multiple Theories of Change

Rick Davies, November 2012. Unpublished paper. Available as a pdf version here, and as a 4-page summary version.

This paper begins by identifying situations where a theory-of-change led approach to evaluation can be difficult, if not impossible. It then introduces the idea of systematic rather than ad hoc data mining, and the types of data mining approaches that exist. The rest of the paper focuses on one data mining method known as Decision Trees, also known as Classification Trees. The merits of Decision Tree models are spelled out, and the processes of constructing Decision Trees are then explained. These include the use of computerised algorithms as well as ethnographic methods, using expert inquiry and more participatory processes. The relationships of Decision Tree analyses to related methods are then explored, specifically Qualitative Comparative Analysis (QCA) and Network Analysis. The final section of the paper identifies potential applications of Decision Tree analyses, covering the elicitation of tacit and multiple Theories of Change, the analysis of project-generated data, and the meta-analysis of data from multiple evaluations. Readers are encouraged to explore these usages.

Included in the list of merits of Decision Tree models is the possibility of differentiating between necessary and/or sufficient causal conditions, and of identifying the extent to which a cause is a contributory cause (à la Mayne).
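
As a purely illustrative sketch (not taken from the paper), this is roughly what fitting such a Decision Tree with a standard algorithm looks like in Python using scikit-learn; the condition names and case data below are hypothetical.

```python
# Illustrative sketch only: fitting a small classification tree (CART) to
# hypothetical project case data of the kind discussed in the paper.
# Requires pandas and scikit-learn.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical cases: presence/absence of conditions and of the outcome
data = pd.DataFrame({
    "local_partner":    [1, 1, 0, 0, 1, 0, 1, 0],
    "funding_high":     [1, 0, 1, 0, 1, 1, 0, 0],
    "staff_trained":    [1, 1, 1, 0, 0, 1, 1, 0],
    "outcome_achieved": [1, 1, 1, 0, 0, 1, 1, 0],
})

X = data[["local_partner", "funding_high", "staff_trained"]]
y = data["outcome_achieved"]

# A shallow tree keeps the model readable, as with manually constructed trees
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned rules as nested if/else conditions
print(export_text(tree, feature_names=list(X.columns)))
```

Each root-to-leaf path can then be read as a candidate configuration of conditions associated with the outcome, which is one way the paper relates Decision Tree analyses to QCA.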

Comments on this paper are being sought. Please post them below or email Rick Davies at rick@mande.co.uk

Separate but related:

See also: An example application of Decision Tree (predictive) models (10th April 2013)

Postscript 2013 03 20: Probably the best book on Decision Tree algorithms is:

Rokach, Lior, and Oded Z. Maimon. Data Mining with Decision Trees: Theory and Applications. World Scientific, 2008. A pdf copy is available

A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences

Gary Goertz & James Mahoney, 2012
Princeton University Press. Available on Amazon

Review of the book by Dan Hirschman

Excerpts from his review:

“Goertz, a political scientist, and Mahoney, a sociologist, attempt to make sense of the different cultures of research in these two camps without attempting to apply the criteria of one to the other. In other words, the goal is to illuminate difference and similarity rather than judge either approach (or, really, affiliated collection of approaches) as deficient by a universal standard.

G&M are interested in quantitative and qualitative approaches to causal explanation.

Onto the meat of the argument. G&M argue that the two cultures of quantitative and (causal) qualitative research differ in how they understand causality, how they use mathematics, how they privilege within-case vs. between-case variation, how they generate counterfactuals, and more. G&M argue, perhaps counter to our expectations, that both cultures have answers to each of these questions, and that the answers are reasonably coherent across cultures, but create tensions when researchers attempt to evaluate each other’s research: we mean different things, we emphasize different sorts of variation, and so on. Each of these differences is captured in a succinct chapter that lays out in incredible clarity the basic choices made by each culture, and how these choices aggregate up to very different models of research.

Perhaps the most counterintuitive, but arguably most rhetorically important, is the assertion that both quant and qual research are tightly linked to mathematics. For quant research, the connection is obvious: quantitative research relies heavily on probability and statistics. Causal explanation consists of statistically identifying the average effect of a treatment. For qual research, the claim is much more controversial. Rather than relying on statistics, G&M assert that qualitative research relies on logic and set theory, even if this reliance is often implicit rather than formal. G&M argue that at the core of explanation in the qualitative culture are the set theoretic/logical criteria of necessary and sufficient causes. Combinations of necessary and sufficient explanations constitute causal explanations. This search for non-trivial necessary and sufficient conditions for the appearance of an outcome shapes the choices made in the qualitative culture, just as the search for significant statistical variation shapes quantitative research. G&M include a brief review of basic logic, and a quick overview of the fuzzy-set analysis championed by Charles Ragin. I had little prior experience with fuzzy sets (although plenty with formal logic), and I found this chapter extremely compelling and provocative. Qualitative social science works much more often with the notion of partial membership – some countries are not quite democracies, while others are completely democracies, and others are completely not democracies. This fuzzy-set approach highlights the non-linearities inherent in partial membership, as contrasted with quantitative approaches that would tend to treat “degree of democracy” as a smooth variable.”
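
For readers new to the fuzzy-set idea described in the excerpt, here is a minimal, hedged sketch (with invented membership scores, not taken from the book): cases are given degrees of set membership between 0 and 1, and a simple consistency measure asks how far one fuzzy set is a subset of another, which is the fuzzy-set analogue of testing a sufficient condition.

```python
# Minimal sketch of fuzzy-set membership and a subset/consistency check,
# in the spirit of the Ragin-style fuzzy-set analysis described in the review.
# All country scores are invented for illustration only.

# Degree of membership in the set "democracy" (0 = fully out, 1 = fully in)
democracy = {"A": 1.0, "B": 0.8, "C": 0.4, "D": 0.1}
# Degree of membership in the set "economic development"
development = {"A": 0.9, "B": 0.9, "C": 0.6, "D": 0.3}

def consistency(condition, outcome):
    """Fuzzy subset consistency of condition -> outcome:
    sum of min(x, y) divided by sum of x. Values close to 1 suggest
    the condition is (close to) sufficient for the outcome."""
    numerator = sum(min(condition[c], outcome[c]) for c in condition)
    denominator = sum(condition.values())
    return numerator / denominator

# How consistent is the claim "development is sufficient for democracy"?
print(round(consistency(development, democracy), 2))
```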

Earlier paper by same authors available as pdf: A Tale of Two Cultures: Contrasting Quantitative and Qualitative Research
by James Mahoney, Gary Goertz. Political Analysis (2006) 14:227–249 doi:10.1093/pan/mpj017

See also The Logic of Process Tracing Tests in the Social Sciences by James Mahoney, Sociological Methods & Research, XX(X), 1-28. Published online 2 March 2012.

RD comment: This book is recommended reading!

PS 15 February 2013: See Howard White’s new blog posting “Using the causal chain to make sense of the numbers”, where he provides examples of the usefulness of simple set-theoretic analyses of the kind described by Mahoney and Goertz (e.g. in an analysis of arguments about why Gore lost to Bush in Florida).

 

A Bibliography on Evaluability Assessment

PS: This posting and bibliography was first published in November 2012, but has been updated since then, most recently in March 2018. The bibliography now contains 150 items.

An online (Zotero) bibliography was generated in November 2012 by Rick Davies, as part of the process of developing a “Synthesis of literature on evaluability assessments” contracted by the DFID Evaluation Department

[In 2012] There are currently 133 items in this bibliography, listed by year of publication, oldest first. They include books, journal articles, government and non-government agency documents and webpages, produced between 1979 and 2012. Of these, 59% described actual examples of Evaluability Assessments, 13% reviewed experiences of multiple kinds of Evaluability Assessments, 28% were expositions on Evaluability Assessments with some references to examples, 10% were official guidance documents on how to do Evaluability Assessments, and 12% were Terms of Reference for Evaluability Assessments. Almost half (44%) of the documents were produced by international development agencies.

The list is a result of a search using Google Scholar and Google Search to find documents with “evaluability” in the title. The first 100 items in the search result listing were examined. Searches were also made via PubMed, JSTOR and Sciverse. A small number of documents were also identified as a result of a request posted on the MandE NEWS, Xceval and Theory Based Evaluation email lists.

This list is open to further editing and inclusions. Suggestions should be sent to rick.davies@gmail.com

 

“Evaluation and Inequality: Moving Beyond the Discussion of Poverty” International Development Evaluation Association Global Assembly

Bridgetown, Barbados (May 6-9, 2013)

IDEAS website

Introduction:

The Board of the International Development Evaluation Association (IDEAS) is pleased to announce its next Global Assembly on May 7-9, 2013 in Bridgetown, Barbados, preceded by professional development training sessions on May 6. The theme of the Assembly will be the relation between evaluation and inequality and their influence on development. This theme underscores the role that evaluative knowledge can play in development in general, and more particularly in understanding the sustaining factors that generate and perpetuate poverty.

Assembly Agenda and Call for Paper/Panel Proposals:

The Assembly will be organized around a number of substantive strands, listed below. Potential presenters are invited to propose a paper or panel in one or more of these strands. General paper proposals on evaluation topics outside the theme of the strands are also invited. We especially invite papers that are grounded in development experiences.

Strand One: Understanding Inequality and its relation to the causes and consequences of poverty

Strand Two: Effective program strategies to address inequality—findings from evaluation

Strand Three: Regional responses/regional strategies to address inequality

Strand Four: The measurement and assessment of inequality

Strand Five: General Paper Sessions—all other papers/panels being proposed on any evaluation topic

All paper/panel proposals should be sent by January 10, 2013 to: Ray C. Rist, President of IDEAS, at the following e-mail address: rayrist11@gmail.com

Proposal Guidelines:

1) Each paper or panel proposal can be no more than 250 words in total. The proposal should include the title, the name(s) and affiliation of the participants, and a brief description of the subject of the paper/panel.

2) The deadline for submission of all proposals is January 10, 2013.

3) Consideration of any proposal after January 10 is at the full discretion of the chair.

4) Decisions on all proposals will be made within two weeks and presenters will be informed immediately.

 

Scholarships: A small number of scholarships will be available to ensure global representation of development evaluators at this Assembly. First priority for scholarships will be given to current IDEAS Members who present a paper/panel or are actively involved in the Assembly as a panel chair or discussant.

NOTE: Anyone who wishes to present at this Assembly must be a current member of IDEAS.

Evaluating Peacebuilding Activities in Settings of Conflict and Fragility: Improving Learning for Results

DAC Guidelines and Reference Series

Publication Date: 08 Nov 2012
Pages: 88
ISBN: 9789264106802 (PDF); 9789264106796 (print)
DOI: 10.1787/9789264106802-en

Abstract

Recognising a need for better, tailored approaches to learning and accountability in conflict settings, the Development Assistance Committee (DAC) launched an initiative to develop guidance on evaluating conflict prevention and peacebuilding activities.  The objective of this process has been to help improve evaluation practice and thereby support the broader community of experts and implementing organisations to enhance the quality of conflict prevention and peacebuilding interventions. It also seeks to guide policy makers, field and desk officers, and country partners towards a better understanding of the role and utility of evaluations. The guidance  presented in this book provides background on key policy issues affecting donor engagement in settings of conflict and fragility and introduces some of the challenges to evaluation particular to these settings. It then provides step-by-step guidance on the core steps in planning, carrying out and learning from evaluation, as well as some basic principles on programme design and management.

Table of Contents

Foreword
Acknowledgements

Executive summary

Glossary

Introduction: Why guidance on evaluating donor engagement in situations of conflict and fragility?

Chapter 1. Conceptual background and the need for improved approaches in situations of conflict and fragility

Chapter 2. Addressing challenges of evaluation in situations of conflict and fragility

Chapter 3. Preparing an evaluation in situations of conflict and fragility

Chapter 4. Conducting an evaluation in situations of conflict and fragility

Annex A. Conflict analysis and its use in evaluation

Annex B. Understanding and evaluating theories of change

Annex C. Sample terms of reference for a conflict evaluation

Bibliography

 

On prediction, Nate Silver’s “The Signal and the Noise”

Title The Signal and the Noise: The Art and Science of Prediction
Author Nate Silver
Publisher Penguin UK, 2012
ISBN 1846147530, 9781846147531
Length 544 pages

Available on Amazon. Use Google Books to read the first chapter.

RD Comment: Highly recommended reading. Reading this book reminded me of M&E data I had to examine on a large maternal and child health project in Indonesia. Rates on key indicators were presented for each of the focus districts for the year before the project started, and then for each year during the four-year project period. I remember thinking how variable these numbers were; there was nothing like a trend over time in any of the districts. Of course, what I was looking at was probably largely noise: variations arising from changes in who collected the underlying data and how it was collected and reported. This sort of situation is by no means uncommon. Most projects, if they have a baseline at all, have baseline data from one year prior to when the project started. Subsequent measures of change are then, ideally, compared to that baseline. This arrangement assumes minimal noise, which is a tad optimistic. The alternative, which should not be so difficult in large bilateral projects dealing with health and education systems for example, would be to have a baseline data series covering the preceding x years, where x is at least as long as the expected duration of the proposed project.
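
A rough illustrative sketch of this point (with simulated numbers, not data from the project mentioned above): when an indicator is dominated by year-to-year noise, a comparison against a single baseline year can suggest a change where none exists, while an average over a multi-year baseline series is more stable.

```python
# Illustrative sketch: why a single-year baseline can mislead when an
# indicator is noisy. All numbers are simulated, not real project data.
import random

random.seed(1)

TRUE_RATE = 60.0     # underlying rate on the indicator (e.g. % coverage), unchanged throughout
NOISE_SD = 8.0       # year-to-year measurement/reporting noise
BASELINE_YEARS = 4   # length of the pre-project data series
PROJECT_YEARS = 4

def observe(rate):
    """One year's reported figure: the true rate plus random noise."""
    return rate + random.gauss(0, NOISE_SD)

# Pre-project and project-period series; no real change occurs in either
baseline_series = [observe(TRUE_RATE) for _ in range(BASELINE_YEARS)]
project_series = [observe(TRUE_RATE) for _ in range(PROJECT_YEARS)]

single_year_baseline = baseline_series[-1]
multi_year_baseline = sum(baseline_series) / len(baseline_series)

print("Apparent change vs single pre-project year:",
      round(project_series[-1] - single_year_baseline, 1))
print("Apparent change vs multi-year baseline average:",
      round(sum(project_series) / len(project_series) - multi_year_baseline, 1))
```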

See also Malkiel’s review in the Wall Street Journal (Telling Lies From Statistics). Malkiel is the author of “A Random Walk Down Wall Street.” While his review is positive overall, he charges Silver with ignoring false positives when claiming that some recent financial crises were predictable. Reviews are also available in The Guardian and the LA Times. Nate Silver also writes a well-known blog for the New York Times.

DRAFT DFID Evaluation Policy – Learning What Works to Improve Lives

RD Comment: The policy document is a draft for consultation at this stage. The document will be revised to accommodate the comments received. The aim is to have a finished product by the end of this calendar year. People who wish to comment should send their comments directly to Liz Ramage by 16th November.

DRAFT FOR DISCUSSION 24 AUGUST 2012 (Pdf available here)

“This Evaluation Policy sets out the UK Government’s approach to, and standards for, independent evaluation of its Official Development Assistance (ODA).

PREFACE

We are publishing this evaluation policy for Official Development Assistance (ODA) at a time when the UK Government’s (the Government) approach to evaluation of international development programmes is being completely transformed.

This policy covers evaluation of all UK ODA, around 87% of which is managed by the Department for International Development (DFID). Major elements of ODA are also delivered through other Government Departments, including the Foreign and Commonwealth Office and the Department of Energy and Climate Change.

The Government is rapidly scaling up its programmes to deliver on international commitments and the Millennium Development Goals.   In doing so, the Government has made a pact with the taxpayer that this will be accompanied by greater transparency and commitment to results and measurable impact.   Evaluation plays a central part in this undertaking.

In 2011, the Independent Commission for Aid Impact (ICAI) was established: a radical change in the UK’s architecture, adopting a model which sets new standards for independence with a focus on value for money and results. Reporting directly to Parliament, ICAI sets a new benchmark for independence in the scrutiny of development programmes which applies across all UK ODA.

In parallel with ICAI’s work, UK Government Departments are placing much greater emphasis on evidence and learning within programmes.

I am excited by the changes we are seeing within DFID on this initiative.  We are rapidly moving towards commissioning rigorous impact evaluations within the programmes, with much stronger links into decision making and to our major investments in policy-relevant research.

Not only has the number of specialist staff working on evaluation more than doubled, but these experts are now located within the operational teams where they can make a real improvement to programme design and delivery.

Finally, I want to note that DFID is working closely with Whitehall partners in building approaches to evaluation. This fits well with wider changes across government, including the excellent work by the Cross-Government Evaluation Group, including the update of the Guidance for Evaluation (the Magenta Book).”

Mark Lowcock, Permanent Secretary, Department for International Development

Contents

KEY MESSAGES

1 INTRODUCTION

1.1 Purpose of the Policy and its Audience

1.2 Why we need independent and high quality evaluation

2 A TRANSFORMED APPROACH TO EVALUATION

2.1 The Government’s commitment to independent evaluation

2.2 The Independent Commission for Aid Impact

2.3 The international context for development evaluation

3 WHAT IS EVALUATION?

3.1 Definition of evaluation

3.2 Distinctions with other aspects of results management

3.3 Evaluation Types

4 ENSURING EVALUATIONS ARE HIGH QUALITY

4.1 Quality

4.2 Principles

4.3 Standards

4.4 Criteria

4.5 Methods

4.6 How to decide what to evaluate

4.7 Resources

5 IMPACT EVALUATION

5.1 Definitions and quality standards for impact evaluation

6 USING EVALUATION FINDINGS

6.1 The importance of communicating and using evaluation findings

6.2 Timeliness

6.3 Learning and using evidence

7 PARTNERSHIPS FOR EVALUATIONS

7.1 A more inclusive approach to partnership working

7.2 A stronger role for developing countries

7.3 Partnerships with multilaterals, global and regional funds and civil society organisations

8 DFID’s STRATEGY FOR EMBEDDING EVALUATION

8.1 A transformed approach to evaluation

8.2 DFID’s co-ordinated approach to results: where evaluation fits in

8.3 Mandatory quality processes

8.4 Ensuring there are no evidence gaps in DFID’s portfolio

8.5 Building capacity internally: evaluation professional skills and accreditation programme

8.6 Roles and responsibilities for evaluation

PS: For comparison, see the previous policy document: Building the evidence to reduce poverty: The UK’s policy on evaluation for international development, Department for International Development (DFID), June 2009, and the March 2009 draft version (for consultation).

 

 

Making causal claims

by John Mayne. ILAC Brief, October 2012. Available as pdf.

“An ongoing challenge in evaluation is the need to make credible causal claims linking observed results to the actions of interventions. In the very common situation where the intervention is only one of a number of causal factors at play, the problem is compounded – no one factor ’caused’ the result. The intervention on its own is neither necessary nor sufficient to bring about the result. The Brief argues the need for a different perspective on causality. One can still speak of the intervention making a difference in the sense that the intervention was a necessary element of a package of causal factors that together were sufficient to bring about the results. It was a contributory cause. The Brief further argues that theories of change are models showing how an intervention operates as a contributory cause. Using theories of change, approaches such as contribution analysis can be used to demonstrate that the intervention made a difference – that it was a contributory cause – and to explain how and why.”

See also Making Causal Claims by John Mayne at IPDET 2012, Ottawa

RD Comments:

What I like in this paper: the definition of a contributory cause as something neither necessary nor sufficient in itself, but a necessary part of a package of causes that is sufficient for an outcome to occur.

I also like the view that “theories of change are models of causal sufficiency”.

But I query the usefulness of distinguishing between contributory causes that are triggering causes, sustaining causes and enabling causes, mainly on the grounds of the difficulty of reliably identifying them.

I am more concerned with the introduction of probabilistic statements about “likely” necessity and “likely” sufficiency, because it increases the ease with which claims of causal contribution can be made, perhaps far too much. Michael Patton recently expressed a related anxiety: “There is a danger that as stakeholders learn about the non-linear dynamics of complex systems and come to value contribution analysis, they will be inclined to always find some kind of linkage between implemented activities and desired outcomes… In essence, the concern is that treating contribution as the criterion (rather than direct attribution) is so weak that a finding of no contribution is extremely unlikely.”

John Mayne’s paper distinguishes between four approaches to demonstrating causality (adapted from Stern et al., 2012:16-17):

  • “Regularity frameworks that depend on the frequency of association between cause and effect – the basis for statistical approaches to making causal claims
  • Counterfactual frameworks that depend on the difference between two otherwise identical cases – the basis for experimental and quasi-experimental approaches to making causal claims
  • Comparative frameworks that depend on combinations of causes that lead to an effect – the basis for ‘configurational’ approaches to making causal claims, such as qualitative comparative analysis
  • Generative frameworks that depend on identifying the causal links and mechanisms that explain effects – the basis for theory-based and realist approaches to making causal claims.”
I would simplify these into two broad categories, with sub-categories:
  • Claims can be made about the co-variance of events
    • Counterfactual approaches: describing the effects of the absence and presence of an intervention on an outcome of interest, when all other conditions are kept the same
    • Configurational approaches, describing the effects of the presence and absence of multiple conditions (relating to both context and intervention)
    • Statistical approaches, describing the effects of more complex mixes of variables
  • Claims can be made about causal mechanisms underlying each co-variance that is found

Good causal claims contain both: evidence of co-variance, and plausible or testable explanations of why each co-variance exists. One without the other is insufficient. You can start with theory (a proposed mechanism) and look for supporting co-variance, or start with a co-variance and look for a supporting mechanism. Currently, theory-led approaches are in vogue.
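
As a hedged sketch of what a simple co-variance check of this kind might look like (the cases and condition names are invented, not drawn from Mayne’s Brief), the following tests whether a single condition is necessary and/or sufficient for an outcome across a small set of cases, in the spirit of the configurational approaches listed above.

```python
# Illustrative sketch: crisp-set checks of necessity and sufficiency across
# cases, of the kind underlying configurational (QCA-style) causal claims.
# The case data below are invented for illustration only.

cases = [
    # (condition_present, outcome_present)
    (True, True),
    (True, True),
    (False, False),
    (True, False),
    (False, False),
]

def is_sufficient(cases):
    """The condition is sufficient if the outcome occurs whenever the condition does."""
    return all(outcome for condition, outcome in cases if condition)

def is_necessary(cases):
    """The condition is necessary if the outcome never occurs without the condition."""
    return all(condition for condition, outcome in cases if outcome)

print("sufficient:", is_sufficient(cases))  # False: one case has the condition but not the outcome
print("necessary:", is_necessary(cases))    # True: the outcome never occurs without the condition
```

A contributory cause in Mayne’s sense would fail both tests when examined on its own, but pass the sufficiency test when combined with the other conditions in its causal package.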

For more on causal mechanisms, see Causal Mechanisms in the Social Sciences by Peter Hedstrom and Petri Ylikoski
See also my blog posting on Representing different combinations of causal conditions, for means of distinguishing different configurations of necessary and sufficient conditions.

Free, relevant and well-organised online courses: Statistics, Model Thinking and others

Provided FREE by Coursera in cooperation with Princeton, Stanford and other Universities

Each opening page gives this information: About the Course, About the Instructor, The Course Syllabus, Introductory Video, Recommended Background, Suggested Readings, Course Format, and FAQs.

Example class format: “Each week of class consists of multiple 8-15 minute long lecture videos, integrated weekly quizzes, readings, an optional assignment and a discussion. Most weeks will also have a peer reviewed assignment, and there will be the opportunity to participate in a community wiki-project. There will be a comprehensive exam at the end of the course.”

The contents of past courses remain accessible.

RD Comment: Highly recommended! [I am doing the stats course this week]
