Qualitative Comparative Analysis: A Valuable Approach to Add to the Evaluator’s ‘Toolbox’? Lessons from Recent Applications

Posted on 8 February, 2016 – 12:09 PM
Schatz, F. and Welle, K. CDI Practice Paper 13, published by IDS.
Available as pdf.

[From IDS website] “A heightened focus on demonstrating development results has increased the stakes for evaluating impact (Stern 2015), while the more complex objectives and designs of international aid programmes make it ever more challenging to attribute effects to a particular intervention (Befani, Barnett and Stern 2014).

Qualitative Comparative Analysis (QCA) is part of a new generation of approaches that go beyond the standard counterfactual logic in assessing causality and impact. Based on the lessons from three diverse applications of QCA, this CDI Practice Paper by Florian Schatz and Katharina Welle reflects on the potential of this approach for the impact evaluation toolbox.”

Rick Davies comment: QCA is one part of a wider family of methods that can be labelled as “configurational”. See my video on “Evaluating ‘loose’ Theories of Change” for an outline of the other methods of analysis that fall into the same category. I think they are an important set of alternative methods for three reasons:

(a) they can be applied “after the fact”, if the relevant data is available. They do not require the careful setting up and monitoring that is characteristic of methods such as randomised control trials.

(b) they can use categorical (i.e. nominal) data, not just variable data.

(c) configurational methods are especially suitable for dealing with “complexity” because of the view of causality that underpins them, one that has some correspondence with the complexity of the world we see around us. Configurational methods:

  • see causes as involving both single and multiple (i.e. conjunctural) causal conditions
  • see outcomes as potentially the result of more than one type of conjuncture (/configuration) of conditions  at work. This feature is also known as equifinality
  • see causes being of different types: Sufficient, Necessary, both and neither
  • see causes as being asymmetric: causes of an outcome not occurring may be different from simply the absence of the causes of the outcome
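These ideas can be illustrated with a minimal sketch in code (the cases and conditions are invented, and real QCA software does much more, including handling contradictory configurations):

```python
# Sketch: testing Sufficiency and Necessity of conditions against case data.
# Cases are tuples of binary conditions (A, B) plus an observed outcome.
# All data here are invented for illustration only.

cases = [
    # (A, B, outcome)
    (1, 1, 1),
    (1, 0, 1),
    (0, 1, 0),
    (0, 0, 0),
]

def sufficient(condition):
    """A condition is sufficient if every case where it holds shows the outcome."""
    relevant = [c for c in cases if condition(c)]
    return all(c[-1] == 1 for c in relevant)

def necessary(condition):
    """A condition is necessary if the outcome never occurs without it."""
    with_outcome = [c for c in cases if c[-1] == 1]
    return all(condition(c) for c in with_outcome)

A = lambda c: c[0] == 1
B = lambda c: c[1] == 1
A_and_B = lambda c: c[0] == 1 and c[1] == 1   # a conjunctural configuration

print(sufficient(A), necessary(A))   # True True: A alone is both, in this data
print(sufficient(B), necessary(B))   # False False: B is neither
print(sufficient(A_and_B))           # True: the conjunction is also sufficient
```

The same two checks, run over configurations rather than single conditions, are the core of a QCA truth-table analysis.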





IFAD Evaluation manual (2nd ed.)

Posted on 24 December, 2015 – 7:00 PM

“The [Dec 2015] Evaluation Manual contains the core methodology that the Independent Office of Evaluation of IFAD (IOE) uses to conduct its evaluations. It has been developed based on the principles set out in the IFAD Evaluation Policy, building on international good evaluation standards and practice.

This second edition incorporates new international evaluative trends and draws from IOE’s experience in implementing the first edition. The Manual also takes into account IFAD’s new strategic priorities and operating model – which have clear implications for evaluation methods and processes – and adopts more rigorous methodological approaches, for example by promoting better impact assessment techniques and by designing and using theories of change.

The Evaluation Manual’s primary function is to ensure consistency, rigour and transparency across independent evaluations, and enhance IOE’s effectiveness and quality of work. It serves to guide staff and consultants engaged in evaluation work at IOE and it is a reference document for other IFAD staff and development partners (such as project management staff and executing agencies of IFAD-supported operations), especially in recipient countries, on how evaluation of development programmes in the agriculture and rural development sector is conducted in IFAD.

The revision of this Manual was undertaken in recognition of the dynamic environment in which IFAD operates, and in response to the evolution in the approaches and methodologies of international development evaluation. It will help ensure that IFAD’s methodological practice remains at the cutting edge.

The Manual has been prepared through a process of engagement with multiple internal and external feedback opportunities from various stakeholders, including peer institutions (African Development Bank, Asian Development Bank, Food and Agriculture Organization of the United Nations, Institute of Development Studies [University of Sussex], Swiss Agency for Development and Cooperation and the World Bank). It was also reviewed by a high-level panel of experts.

Additionally, this second edition contains the core methodology for evaluations that were not contemplated in the first edition, such as corporate-level evaluations, impact evaluations and evaluation synthesis reports.

The manual is available in Arabic, English, French and Spanish to facilitate its use in all regions where IFAD has operations.”


A visual introduction to machine learning

Posted on 24 December, 2015 – 6:38 PM

A Visual Introduction to Machine Learning

This website explains very clearly, using good visualisations, how a Decision Tree algorithm can make useful predictions about how different attributes of a case, such as a project, relate to the presence or absence of an outcome of interest. Decision tree models are a good alternative to the use of QCA, in that the results are easily communicable and the learning curve is not so steep. See my blog “Rick on the Road” for a number of posts I have made on the use of Decision Trees.
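To give a flavour of what such an algorithm does, here is a minimal sketch of the core step in growing a decision tree: choosing the attribute whose split best separates cases with and without the outcome. The project attributes and cases are invented for illustration:

```python
# Sketch: the core step of a decision tree - picking the attribute whose
# split best separates cases with and without an outcome. Data invented.

# Each case: dict of binary attributes plus an observed outcome (1 = present).
cases = [
    {"local_partner": 1, "large_budget": 0, "outcome": 1},
    {"local_partner": 1, "large_budget": 1, "outcome": 1},
    {"local_partner": 0, "large_budget": 1, "outcome": 0},
    {"local_partner": 0, "large_budget": 0, "outcome": 0},
    {"local_partner": 1, "large_budget": 0, "outcome": 1},
    {"local_partner": 0, "large_budget": 1, "outcome": 1},
]

def gini(group):
    """Gini impurity: 0 when a group is all-outcome or all no-outcome."""
    if not group:
        return 0.0
    p = sum(c["outcome"] for c in group) / len(group)
    return 2 * p * (1 - p)

def split_quality(attribute):
    """Weighted impurity after splitting on the attribute (lower is better)."""
    yes = [c for c in cases if c[attribute] == 1]
    no = [c for c in cases if c[attribute] == 0]
    n = len(cases)
    return len(yes) / n * gini(yes) + len(no) / n * gini(no)

best = min(["local_partner", "large_budget"], key=split_quality)
print(best)  # "local_partner" separates the outcomes best in this data
```

A full algorithm simply repeats this step within each branch until the groups are pure enough, which is what produces the easily communicable tree diagrams.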


Hivos ToC Guidelines: THEORY OF CHANGE THINKING IN PRACTICE – A stepwise approach

Posted on 4 December, 2015 – 10:35 AM

Marjan van Es (Hivos), Irene Guijt, Isabel Vogel, 2015.  Available as pdf
1 Introduction
1.1 Hivos and Theory of Change
1.2 Origin of the guidelines
1.3 Use of the guidelines
2 Theory of Change
2.1 What are Theories of Change? What is a ToC approach?
2.2 Why a Theory of Change approach?
2.3 Core components of a ToC process and product
2.4 Theories of Change at different levels
2.5 Using ToC thinking for different purposes
3 Key features of a ToC process
3.1 From complexity to focus and back
3.2 Making assumptions explicit
3.3 The importance of visualisation
4 Quality of ToC practice
4.1 Principles of ToC practice
4.2 Power at play
4.3 Gender (in)equality
5 Developing Theories of Change – eight steps
Introduction
• Step 1 – Clarify the purpose of the ToC process
• Step 2 – Describe the desired change
• Step 3 – Analyse the current situation
• Step 4 – Identify domains of change
• Step 5 – Identify strategic priorities
• Step 6 – Map pathways of change
• Step 7 – Define monitoring, evaluation and learning priorities and process
• Step 8 – Use and adaptation of a ToC
6 ToC as a product
7 Quality Audit of a ToC process and product
8 Key tools, resources and materials
8.1 Tools referred to in these guidelines
• Rich Picture
• Four Dimensions of Change
• Celebrating success
• Stakeholder and Actor Analysis
• Power Analysis
• Gender Analysis
• Framings
• Behaviour change
• Ritual dissent
• Three Spheres: Control, Influence, Interest
• Necessary & Sufficient
• Indicator selection
• Visualisations of a ToC process and product
8.2 Other resources
8.3 Facilitation

Rick Davies comment: I have not had a chance to read the whole document, but I would suggest changes to the section on page 109 titled “Necessary & Sufficient”.

A branch of a Theory of Change (in a tree shaped version) or a pathway (in a network version) can represent a sequence of events that is either:
  • Necessary and Sufficient to achieve the outcome. This is unlikely in most cases: if it were so, there would be no need for any other branches/pathways.
  • Necessary but Insufficient. In other words, events in the other branches are also necessary. In this case the ToC is quite demanding in its requirements before outcomes can be achieved. An evaluation would only have to find one of these branches not working to find that the ToC as a whole is not working.
  • Sufficient but Unnecessary. In other words, the outcome can be achieved via this branch or via the other branches. This is a less demanding ToC and more difficult to disprove: each of the branches which was expected to be Sufficient would need to be tested.

Because of these different interpretations and their consequences, we should expect a ToC to state clearly the status of each branch in terms of its Necessity and/or Sufficiency.
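The evaluative consequences of the last two interpretations can be stated very compactly (a sketch with invented branch results):

```python
# Sketch: what branch-level evidence implies about a ToC's overall claim,
# under the two interpretations above. Branch results are invented.

branch_worked = {"branch_A": True, "branch_B": False, "branch_C": True}

# Each branch Necessary but Insufficient: ALL branches must work.
toc_if_all_necessary = all(branch_worked.values())

# Each branch Sufficient but Unnecessary: ANY working branch is enough.
toc_if_each_sufficient = any(branch_worked.values())

print(toc_if_all_necessary)    # False: one failing branch disconfirms the ToC
print(toc_if_each_sufficient)  # True: the ToC survives via branches A or C
```

The same evidence supports opposite verdicts depending on the stated status of each branch, which is why that status needs to be explicit.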


Evaluation in the Extreme: Research, Impact and Politics in Violently Divided Societies

Posted on 25 November, 2015 – 6:11 PM

Kenneth Bush – University of York Heslington, York, England
Colleen Duggan – International Development Research Centre, Ottawa
published October 2015, by Sage

“Over the past two decades, there has been an increase in the funding of research in and on violently divided societies. But how do we know whether research makes any difference to these societies—is the impact constructive or destructive? This book is the first to systematically explore this question through a series of case studies written by those on the front line of applied research. It offers clear and logical ways to understand the positive or negative role that research, or any other aid intervention, might have in developing societies affected by armed conflict, political unrest and/or social violence.”

Kenneth Bush is Altajir Lecturer and Executive Director of the Post-war Reconstruction and Development Unit, University of York (UK).  From 2016: School of Government & International Affairs, Durham University

Colleen Duggan is a Senior Programme Specialist in the Policy and Evaluation Division of the International Development Research Centre, Ottawa.

Download EBook: http://www.idrc.ca/EN/Resources/Publications/openebooks/584-7/index.html


The sustainable development goals as a network of targets

Posted on 13 November, 2015 – 12:01 AM

DESA Working Paper No. 141, ST/ESA/2015/DWP/141, March 2015
Towards integration at last? The sustainable development goals as a network of targets
David Le Blanc, Department of Economic & Social Affairs

ABSTRACT “In 2014, UN Member States proposed a set of Sustainable Development Goals (SDGs), which will succeed the Millennium Development Goals (MDGs) as reference goals for the international development community for the period 2015-2030. The proposed goals and targets can be seen as a network, in which links among goals exist through targets that refer to multiple goals. Using network analysis techniques, we show that some thematic areas covered by the SDGs are well connected among one another. Other parts of the network have weaker connections with the rest of the system. The SDGs as a whole are a more integrated system than the MDGs were, which may facilitate policy integration across sectors. However, many of the links among goals that have been documented in biophysical, economic and social dimensions are not explicitly reflected in the SDGs. Beyond the added visibility that the SDGs provide to links among thematic areas, attempts at policy integration across various areas will have to be based on studies of the biophysical, social and economic systems.”

Rick Davies Comment: This is an example of something I would like to see many more examples of: what are, in effect, network Theories of Change, in place of overly simplified hierarchical models which typically have few if any feedback loops (i.e. cycles in the graph). Request: could the author make the underlying data set publicly available, so other people can do their own network analyses? I know the data set could be reconstructed from existing sources on the SDGs, but this could save a lot of unnecessary work. Also, the paper should provide a footnote explaining the layout algorithm used to generate the network diagrams.

Some simple improvements that could be made to the existing network diagrams:

  • Vary node size by centrality (the number of immediate connections a node has with other nodes)
  • Represent target nodes as squares and goal nodes as circles, rather than all as circles

What is now needed is a two-mode network diagram showing which agencies (perhaps UN agencies for a start) are prioritising which SDGs. This will help focus minds on where coordination needs are greatest, i.e. between which specific agencies re which specific goals. Here is an example of this kind of network diagram from Ghana, showing which government agencies prioritised which Governance objectives in the Ghana Poverty Reduction Strategy, more than a decade ago (blue nodes = government agencies, red nodes = GPRS governance objectives, thicker lines = higher priority). The existence of SDG targets as well as goals could make an updated version of this kind of exercise even more useful.
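For illustration, here is a minimal sketch of the kind of two-mode analysis suggested above, with invented agencies and ties (a real version would use actual agency priorities):

```python
# Sketch of the suggested two-mode (agency x goal) network, with node size
# driven by degree centrality. Agencies and ties are invented examples.
from collections import Counter

ties = [  # (agency, SDG goal) pairs: agency prioritises goal
    ("UNDP", "SDG1"), ("UNDP", "SDG16"), ("UNICEF", "SDG1"),
    ("UNICEF", "SDG4"), ("FAO", "SDG2"), ("FAO", "SDG1"),
]

# Degree = number of immediate connections; use it to scale node size.
degree = Counter()
for agency, goal in ties:
    degree[agency] += 1
    degree[goal] += 1

# Goals prioritised by several agencies flag where coordination is needed.
shared = [g for g, d in degree.items() if g.startswith("SDG") and d > 1]
print(degree["SDG1"], shared)  # SDG1 has degree 3 and is the only shared goal
```

In a drawing tool such as Gephi or NetworkX, these degree counts would set the node sizes, and the agency/goal distinction would set the node shapes.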


Postscript: 16 April 2016. See also this online network diagram of the SDGs and targets, unfortunately lacking any text commentary. Basically eye candy only.


Crowdsourced research: Many hands make tight work

Posted on 11 October, 2015 – 10:44 AM

Crowdsourced research: Many hands make tight work, Raphael Silberzahn & Eric L. Uhlmann, Nature, 7 October 2015

Selected quotes:

“Crowdsourcing research can balance discussions, validate findings and better inform policy”

Crowdsourcing research can reveal how conclusions are contingent on analytical choices. Furthermore, the crowdsourcing framework also provides researchers with a safe space in which they can vet analytical approaches, explore doubts and get a second, third or fourth opinion. Discussions about analytical approaches happen before committing to a particular strategy. In our project, the teams were essentially peer reviewing each other’s work before even settling on their own analyses. And we found that researchers did change their minds through the course of analysis.

Crowdsourcing also reduces the incentive for flashy results. A single-team project may be published only if it finds significant effects; participants in crowdsourced projects can contribute even with null findings. A range of scientific possibilities are revealed, the results are more credible and analytical choices that seem to sway conclusions can point research in fruitful directions. What is more, analysts learn from each other, and the creativity required to construct analytical methodologies can be better appreciated by the research community and the public.

The transparency resulting from a crowdsourced approach should be particularly beneficial when important policy issues are at stake. The uncertainty of scientific conclusions about, for example, the effects of the minimum wage on unemployment, and the consequences of economic austerity policies should be investigated by crowds of researchers rather than left to single teams of analysts.

Under the current system, strong storylines win out over messy results. Worse, once a finding has been published in a journal, it becomes difficult to challenge. Ideas become entrenched too quickly, and uprooting them is more disruptive than it ought to be. The crowdsourcing approach gives space to dissenting opinions.

Researchers who are interested in starting or participating in collaborative crowdsourcing projects can access resources available online. We have publicly shared all our materials and survey templates, and the Center for Open Science has just launched ManyLab, a web space where researchers can join crowdsourced projects.

A summary of this Nature article appeared in this week’s Economist (“Honest disagreement about methods may explain irreproducible results”, The Economist, p. 82, 10 October 2015):

“IT SOUNDS like an easy question for any half-competent scientist to answer. Do dark-skinned footballers get given red cards more often than light-skinned ones? But, as Raphael Silberzahn of IESE, a Spanish business school, and Eric Uhlmann of INSEAD, an international one (he works in the branch in Singapore), illustrate in this week’s Nature, it is not. The answer depends on whom you ask, and the methods they use.

Dr Silberzahn and Dr Uhlmann sought their answers from 29 research teams. They gave their volunteers the same wodge of data (covering 2,000 male footballers for a single season in the top divisions of the leagues of England, France, Germany and Spain) and waited to see what would come back.

The consensus was that dark-skinned players were about 1.3 times more likely to be sent off than were their light-skinned confrères. But there was a lot of variation. Nine of the research teams found no significant relationship between a player’s skin colour and the likelihood of his receiving a red card. Of the 20 that did find a difference, two groups reported that dark-skinned players were less, rather than more, likely to receive red cards than their paler counterparts (only 89% as likely, to be precise). At the other extreme, another group claimed that dark-skinned players were nearly three times as likely to be sent off.

Dr Uhlmann and Dr Silberzahn are less interested in football than in the way science works. Their study may shed light on a problem that has quite a few scientists worried: the difficulty of reproducing many results published in journals.

Fraud, unconscious bias and the cherry-picking of data have all been blamed at one time or another—and all, no doubt, contribute. But Dr Uhlmann’s and Dr Silberzahn’s work offers another explanation: that even scrupulously honest scientists may disagree about how best to attack a data set. Their 29 volunteer teams used a variety of statistical models (“everything from Bayesian clustering to logistic regression and linear modelling”, since you ask) and made different decisions about which variables within the data set were deemed relevant. (Should a player’s playing position on the field be taken into account? Or the country he was playing in?) It was these decisions, the authors reckon, that explain why different teams came up with different results.

How to get around this is a puzzle. But when important questions are being considered—when science is informing government decisions, for instance—asking several different researchers to do the analysis, and then comparing their results, is probably a good idea.”

See also another summary of the Nature article in: A Fix for Social Science, Francis Diep, Pacific Standard, 7th October.



How Useful Are RCTs in Evaluating Transparency and Accountability Projects?

Posted on 24 September, 2015 – 4:32 PM
by LEAVY, J., IDS Research, Evidence and Learning Working Paper, Issue 1, September 2014. 37 pages. Available as pdf
List of abbreviations iv
1 Introduction 1
1.1 Objectives of this review 2
2 Impact evaluation and RCTs 4
2.1 Impact evaluation definitions 4
2.1.1 Causality and the counterfactual 4
2.2 Strengths and conditions of RCTs 5
3 T&A initiatives 6
3.1 What are ‘transparency’ and ‘accountability’? 6
3.2 Characteristics of T&A initiatives 7
3.2.1 Technology for T&A 8
3.3 Measuring (the impact of) T&A 9
4 RCT evaluation of T&A initiatives 10
4.1 What do we already know? 10
4.2 Implications for evaluation design 12
5 How effective are RCTs in measuring the impact of T&A programmes? 14
5.1 Analytical framework for assessing RCTs in IE of T&A initiatives 14
5.2 Search methods 15
5.3 The studies 16
5.4 Analysis 18
5.4.1 Design 18
5.4.2 Contribution 20
5.4.3 Explanation 20
5.4.4 Effects 22
5.5 Summary 25
6 Conclusion 26
References 28
Rick Davies comment: I liked the systematic way in which the author reviewed the different aspects of 15 relevant RCTs, as documented in section 5. The Conclusions section was balanced and pragmatic.

Social Network Analysis for [M&E of] Program Implementation

Posted on 1 September, 2015 – 2:25 PM
Valente, T.W., Palinkas, L.A., Czaja, S., Chu, K.-H., Brown, C.H., 2015. Social Network Analysis for Program Implementation. PLoS ONE 10, e0131712. doi:10.1371/journal.pone.0131712. Available as pdf

“Abstract: This paper introduces the use of social network analysis theory and tools for implementation research. The social network perspective is useful for understanding, monitoring, influencing, or evaluating the implementation process when programs, policies, practices, or principles are designed and scaled up or adapted to different settings. We briefly describe common barriers to implementation success and relate them to the social networks of implementation stakeholders. We introduce a few simple measures commonly used in social network analysis and discuss how these measures can be used in program implementation. Using the four stage model of program implementation (exploration, adoption, implementation, and sustainment) proposed by Aarons and colleagues [1] and our experience in developing multi-sector partnerships involving community leaders, organizations, practitioners, and researchers, we show how network measures can be used at each stage to monitor, intervene, and improve the implementation process. Examples are provided to illustrate these concepts. We conclude with expected benefits and challenges associated with this approach”.

Selected quotes:

“Getting evidence-based programs into practice has increasingly been recognized as a concern in many domains of public health and medicine [4, 5]. Research has shown that there is a considerable lag between an invention or innovation and its routine use in a clinical or applied setting [6]. There are many challenges in scaling up proven programs so that they reach the many people in need [7–9].”

“Partnerships are vital to the successful adoption, implementation and sustainability of successful programs. Indeed, evidence-based programs that have progressed to implementation and translation stages report that effective partnerships with community-based, school, or implementing agencies are critical to their success [11, 17, 18]. Understanding which partnerships can be created and maintained can be accomplished via social network analysis.”


The median impact narrative

Posted on 29 August, 2015 – 4:06 PM

Rick Davies comment: The text below is an excerpt from a longer blog posting found here: Impact as narrative, by Bruce Wydick

“I want to suggest one particular tool that I will call the “median impact narrative,” which (though not precisely the average – because the average typically does not factually exist) recounts the narrative of the one or a few of the middle-impact subjects in a study. So instead of highlighting the outlier, Juana, who has built a small textile empire from a few microloans, we conclude with a paragraph describing Eduardo, who after two years of microfinance borrowing, has dedicated more hours to growing his carpentry business and used microloans to weather two modest-size economic shocks to his household, an illness to his wife and the theft of some tools. If one were to choose the subject for the median impact narrative rigorously it could involve choosing the treated subject whose realized impacts represent the closest Euclidean distance (through a weighting of impact variables via the inverse of the variance-covariance matrix) to the estimated ATTs.

Consider, for example, the “median impact narrative” of the outstanding 2013 Haushofer and Shapiro study of GiveDirectly, a study finding an array of substantial impacts from unconditional cash transfers in Kenya. The median impact narrative might recount the experience of Joseph, a goat herder with a family of six who received $1100 in five electronic cash transfers. Joseph and his wife both have only two years of formal schooling and have always struggled to make ends meet with their four children. At baseline, Joseph’s children went to bed hungry an average of three days a week. Eighteen months after receiving the transfers, his goat herd increased by 51%, bringing added economic stability to his household. He also reported a 30% reduction in his children going to bed hungry in the period before the follow-up survey, and a 42% reduction in number of days his children went completely without food. Tests of his cortisol indicated that Joseph experienced a reduction in stress, about 0.14 standard deviations relative to same difference in the control group. This kind of narrative on the median subject from this particular study cements a truthful image of impact into the mind of a reader.

A false dichotomy has emerged between the use of narrative and data analysis; either can be equally misleading or helpful in conveying truth about causal effects. As researchers begin to incorporate narrative into their scientific work, it will begin to create a standard for the appropriate use of narrative by non-profits, making it easier to insist that narratives present an unbiased picture that represents a truthful image of average impacts.”
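One rough reading of Wydick’s selection rule is sketched below. The subjects, impact numbers and weights are all invented, and the weighting here uses inverse variances only, a simplification of the full inverse variance-covariance weighting he describes:

```python
# Sketch: choosing a "median impact" case as the treated subject whose
# impact vector lies closest to the estimated average treatment effects
# (ATTs). All numbers invented; weights are inverse variances only, a
# simplification of full inverse variance-covariance (Mahalanobis) weighting.

subjects = {                  # per-subject impacts on two outcome measures
    "Juana":   [5.0, 4.0],    # the outlier success story
    "Eduardo": [1.1, 0.9],
    "Marta":   [0.2, 0.1],
}
att = [1.0, 1.0]              # estimated average treatment effects
variances = [4.0, 1.0]        # outcome variances, used as weights

def weighted_distance(impacts):
    """Variance-weighted Euclidean distance from a subject's impacts to the ATTs."""
    return sum((x - a) ** 2 / v for x, a, v in zip(impacts, att, variances)) ** 0.5

median_case = min(subjects, key=lambda s: weighted_distance(subjects[s]))
print(median_case)  # "Eduardo": his impacts sit closest to the estimated ATTs
```

The weighting simply stops noisy outcome measures from dominating the choice of the "most typical" subject.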

Some of the attached readers’ Comments are also of interest e.g.

The basic point is a solid and important one: sampling strategy matters to qualitative work and for understanding what really happened for a range of people.

One consideration for sampling is that the same observables (independent vars) that drive sub-group analyses can also be used to help determine a qualitative sub-sample (capturing medians, outliers in both directions, etc).

A second consideration, in the spirit of Lieberman’s call for nested analyses (or other forms of linked and sequential qual-quant work), the results of quantitative work can be used to inform sampling of later qualitative work, targeting those representing the range of outcomes values.”

Read more on this topic from this reader here http://blogs.worldbank.org/publicsphere/1-2014

Rick Davies comment: If the argument for using median impact narratives is accepted, the interesting question for me is then how do we identify median cases? Bruce Wydick seems to suggest above that this would be done by looking at impact measures and finding a median case among those (Confession: I don’t fully understand his reference to Euclidean distance and ATTs). I would argue that we need to look at median-ness not only in impacts, but also in other attributes of the cases, including the context and interventions experienced by each case. One way of doing this is to use Hamming distance as a measure of similarity between cases, an idea I have discussed elsewhere. This can be done with very basic categorical data, as well as variable data.
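A minimal sketch of the Hamming distance idea, with invented categorical case data (the attribute names are illustrative only):

```python
# Sketch: Hamming distance as a similarity measure over categorical case
# attributes (context, intervention, impact). Case data invented.

cases = {
    "case1": ["urban", "microloan", "high_impact"],
    "case2": ["urban", "microloan", "low_impact"],
    "case3": ["rural", "training",  "high_impact"],
}

def hamming(a, b):
    """Number of attributes on which two cases differ."""
    return sum(x != y for x, y in zip(a, b))

# The most "median" case minimises total distance to all the other cases.
def total_distance(name):
    return sum(hamming(cases[name], other) for other in cases.values())

median_case = min(cases, key=total_distance)
print(median_case, total_distance(median_case))
```

Because the distance only counts mismatches, the same calculation works on nominal categories where no numeric average exists.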

Postscript: Some readers might ask “Why not simply choose sources of impact narratives from a randomised sample of cases, as you might do with quantitative data?” Well, with a random sample of quantitative data you can average the responses, but you cannot do that with a random sample of narrative data: there is no way of “averaging” the content of a set of texts. You would end up with a set of stories that readers might then themselves “average out” into one overall impression in their own minds, but that would not be a very transparent or consistent process.
