How Useful Are RCTs in Evaluating Transparency and Accountability Projects?

Posted on 24 September, 2015 – 4:32 PM
by LEAVY, J., IDS Research, Evidence and Learning Working Paper, Issue 1, September 2014. 37 pages. Available as pdf
List of abbreviations
1 Introduction
1.1 Objectives of this review
2 Impact evaluation and RCTs
2.1 Impact evaluation definitions
2.1.1 Causality and the counterfactual
2.2 Strengths and conditions of RCTs
3 T&A initiatives
3.1 What are ‘transparency’ and ‘accountability’?
3.2 Characteristics of T&A initiatives
3.2.1 Technology for T&A
3.3 Measuring (the impact of) T&A
4 RCT evaluation of T&A initiatives
4.1 What do we already know?
4.2 Implications for evaluation design
5 How effective are RCTs in measuring the impact of T&A programmes?
5.1 Analytical framework for assessing RCTs in IE of T&A initiatives
5.2 Search methods
5.3 The studies
5.4 Analysis
5.4.1 Design
5.4.2 Contribution
5.4.3 Explanation
5.4.4 Effects
5.5 Summary
6 Conclusion
References
Rick Davies comment: I liked the systematic way in which the author reviewed the different aspects of 15 relevant RCTs, as documented in section 5. The Conclusions section was balanced and pragmatic.

Social Network Analysis for [M&E of] Program Implementation

Posted on 1 September, 2015 – 2:25 PM
Valente, T.W., Palinkas, L.A., Czaja, S., Chu, K.-H., Brown, C.H., 2015. Social Network Analysis for Program Implementation. PLoS ONE 10, e0131712. doi:10.1371/journal.pone.0131712 Available as pdf

“Abstract: This paper introduces the use of social network analysis theory and tools for implementation research. The social network perspective is useful for understanding, monitoring, influencing, or evaluating the implementation process when programs, policies, practices, or principles are designed and scaled up or adapted to different settings. We briefly describe common barriers to implementation success and relate them to the social networks of implementation stakeholders. We introduce a few simple measures commonly used in social network analysis and discuss how these measures can be used in program implementation. Using the four stage model of program implementation (exploration, adoption, implementation, and sustainment) proposed by Aarons and colleagues [1] and our experience in developing multi-sector partnerships involving community leaders, organizations, practitioners, and researchers, we show how network measures can be used at each stage to monitor, intervene, and improve the implementation process. Examples are provided to illustrate these concepts. We conclude with expected benefits and challenges associated with this approach”.

Selected quotes:

“Getting evidence-based programs into practice has increasingly been recognized as a concern in many domains of public health and medicine [4, 5]. Research has shown that there is a considerable lag between an invention or innovation and its routine use in a clinical or applied setting [6]. There are many challenges in scaling up proven programs so that they reach the many people in need [7–9].”

“Partnerships are vital to the successful adoption, implementation and sustainability of successful programs. Indeed, evidence-based programs that have progressed to implementation and translation stages report that effective partnerships with community-based, school, or implementing agencies are critical to their success [11, 17, 18]. Understanding which partnerships can be created and maintained can be accomplished via social network analysis.”
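As a minimal sketch of the kind of "simple measures" the abstract refers to, the following computes network density and degree centrality for a small partnership network. The organisation names and ties are invented for illustration, and picking these two measures is an assumption on my part rather than the paper's specific recommendation.

```python
# Hypothetical undirected partnership ties between implementation stakeholders.
ties = [
    ("Health Dept", "Clinic A"),
    ("Health Dept", "School District"),
    ("Health Dept", "University"),
    ("Clinic A", "Community Group"),
    ("University", "Community Group"),
]

def build_adjacency(edges):
    """Map each organisation to the set of organisations it is tied to."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def density(adj):
    """Share of all possible ties that actually exist (0 to 1)."""
    n = len(adj)
    possible = n * (n - 1) / 2
    actual = sum(len(neighbours) for neighbours in adj.values()) / 2
    return actual / possible if possible else 0.0

def degree_centrality(adj):
    """Number of ties per organisation, divided by the maximum possible (n - 1)."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}

adj = build_adjacency(ties)
centrality = degree_centrality(adj)
print(f"Density: {density(adj):.2f}")
for org, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{org}: {score:.2f}")
```

Recomputing the same measures at each implementation stage would give a rough way of monitoring whether a partnership network is becoming more connected or more dependent on a single organisation.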


The median impact narrative

Posted on 29 August, 2015 – 4:06 PM

Rick Davies comment: The text below is an excerpt from a longer blog posting found here: Impact as narrative, by Bruce Wydick

“I want to suggest one particular tool that I will call the “median impact narrative,” which (though not precisely the average–because the average typically does not factually exist) recounts the narrative of the one or a few of the middle-impact subjects in a study. So instead of highlighting the outlier, Juana, who has built a small textile empire from a few microloans, we conclude with a paragraph describing Eduardo, who after two years of microfinance borrowing, has dedicated more hours to growing his carpentry business and used microloans to weather two modest-size economic shocks to his household, an illness to his wife and the theft of some tools. If one were to choose the subject for the median impact narrative rigorously it could involve choosing the treated subject whose realized impacts represent the closest Euclidean distance (through a weighting of impact variables via the inverse of the variance-covariance matrix) to the estimated ATTs.

Consider, for example, the “median impact narrative” of the outstanding 2013 Haushofer and Shapiro study of GiveDirectly, a study finding an array of substantial impacts from unconditional cash transfers in Kenya. The median impact narrative might recount the experience of Joseph, a goat herder with a family of six who received $1100 in five electronic cash transfers. Joseph and his wife both have only two years of formal schooling and have always struggled to make ends meet with their four children. At baseline, Joseph’s children went to bed hungry an average of three days a week. Eighteen months after receiving the transfers, his goat herd increased by 51%, bringing added economic stability to his household. He also reported a 30% reduction in his children going to bed hungry in the period before the follow-up survey, and a 42% reduction in number of days his children went completely without food. Tests of his cortisol indicated that Joseph experienced a reduction in stress, about 0.14 standard deviations relative to same difference in the control group. This kind of narrative on the median subject from this particular study cements a truthful image of impact into the mind of a reader.

A false dichotomy has emerged between the use of narrative and data analysis; either can be equally misleading or helpful in conveying truth about causal effects. As researchers begin to incorporate narrative into their scientific work, it will begin to create a standard for the appropriate use of narrative by non-profits, making it easier to insist that narratives present an unbiased picture that represents a truthful image of average impacts.”
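One possible reading of the selection rule in the excerpt (distance to the estimated ATTs, weighted by the inverse of the variance-covariance matrix) is a Mahalanobis distance. The sketch below is a toy version for two impact variables only. The subjects' impact values and the ATT vector are invented, and treating per-subject "realized impacts" as directly observable is a simplifying assumption.

```python
from statistics import mean

# Hypothetical realized impacts per treated subject:
# (change in monthly income, change in hungry days per week).
impacts = {
    "Juana": (120.0, -3.0),
    "Eduardo": (22.0, -1.0),
    "Maria": (15.0, -0.5),
    "Pedro": (30.0, -1.5),
    "Luis": (5.0, 0.0),
}
att = (25.0, -1.0)  # assumed estimated ATTs for the two outcomes

def covariance_2x2(points):
    """Sample variances and covariance of two variables: (sxx, sxy, syy)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_sq(point, centre, cov):
    """Squared distance weighted by the inverse of the 2x2 covariance matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # must be non-zero (variables not collinear)
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is (1/det) * [[syy, -sxy], [-sxy, sxx]].
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

cov = covariance_2x2(list(impacts.values()))
distances = {name: mahalanobis_sq(p, att, cov) for name, p in impacts.items()}
median_case = min(distances, key=distances.get)
print(median_case)
```

With these made-up numbers the outlier (Juana) ends up furthest from the ATTs and the modest-impact borrower (Eduardo) closest, which is the behaviour the excerpt describes.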

Some of the attached readers’ comments are also of interest, e.g.:

“The basic point is a solid and important one: sampling strategy matters to qualitative work and for understanding what really happened for a range of people.

One consideration for sampling is that the same observables (independent vars) that drive sub-group analyses can also be used to help determine a qualitative sub-sample (capturing medians, outliers in both directions, etc).

A second consideration, in the spirit of Lieberman’s call for nested analyses (or other forms of linked and sequential qual-quant work), the results of quantitative work can be used to inform sampling of later qualitative work, targeting those representing the range of outcomes values.”

Read more on this topic from this reader here

Rick Davies comment: If the argument for using median impact narratives is accepted, the interesting question for me is then how we identify median cases. Bruce Wydick seems to suggest above that this would be done by looking at impact measures and finding a median case among those (Confession: I don’t fully understand his reference to Euclidean distance and ATTs). I would argue that we need to look at median-ness not only in impacts, but also in other attributes of the cases, including the context and interventions experienced by each case. One way of doing this is to use Hamming distance as a measure of similarity between cases, an idea I have discussed elsewhere. This can be done with very basic categorical data, as well as variable data.
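A minimal sketch of the Hamming distance idea, assuming each case is coded as a row of binary attributes covering context, intervention and impact. The cases and codings are invented, and choosing the case with the smallest total distance to all others (a "medoid") is one possible way of operationalising median-ness, not the only one.

```python
# Hypothetical cases coded on five binary attributes, e.g.
# (rural, received training, received loan, income rose, school attendance rose).
cases = {
    "Case A": (1, 0, 1, 1, 0),
    "Case B": (1, 1, 1, 0, 0),
    "Case C": (1, 0, 1, 0, 0),
    "Case D": (0, 0, 1, 0, 1),
    "Case E": (1, 0, 0, 0, 0),
}

def hamming(a, b):
    """Number of attributes on which two cases differ."""
    return sum(x != y for x, y in zip(a, b))

def total_distance(name):
    """Sum of Hamming distances from one case to every other case."""
    return sum(hamming(cases[name], other)
               for other_name, other in cases.items() if other_name != name)

totals = {name: total_distance(name) for name in cases}
median_case = min(totals, key=totals.get)
print(median_case, totals[median_case])
```

The same function works on any categorical codes, since it only checks whether two values are equal.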

Postscript: Some readers might ask, “Why not simply choose sources of impact narratives from a randomised sample of cases, as you might do with quantitative data?” Well, with a random sample of quantitative data you can average the responses. You cannot do that with a random sample of narrative data, because there is no way of “averaging” the content of a set of texts. Instead, you would end up with a set of stories that readers might then themselves “average out” into one overall impression in their own minds, and that will not be a very transparent or consistent process.


What methods may be used in impact evaluations of humanitarian assistance?

Posted on 19 August, 2015 – 6:18 PM

Jyotsna Puri, Anastasia Aladysheva, Vegard Iversen, Yashodhan Ghorpade, Tilman Brück, International Initiative for Impact Evaluation (3ie) Working Paper 22, December 2014. Available as pdf

“Humanitarian crises are complex situations where the demand for aid has traditionally far exceeded its supply. The humanitarian assistance community has long asked for better evidence on how each dollar should be effectively spent. Impact evaluations of humanitarian assistance can help answer these questions and also respond to the increasing call to estimate the impact of humanitarian assistance and supplement the rich tradition for undertaking real-time and process evaluations in the sector. This working paper gives an overview of the methodological techniques that can be used to address some of the important questions in this area, while simultaneously considering the special circumstances and constraints associated with humanitarian assistance.”

Executive summary
1. Introduction
2. Defining and categorising humanitarian emergencies and humanitarian action
3. Defining and discussing high-quality, theory-based impact evaluations 
3.1 Various forms of evaluations
3.2 Impact evaluations in non-emergency settings
3.3 Impact evaluations in emergency settings
3.4 Objectives of impact evaluations
3.5 Methods for impact evaluations
4. A conceptual framework for using impact evaluations in humanitarian emergencies
5. Impact evaluations of humanitarian assistance: a review of the literature
5.1 Emergency relief
5.2 Recovery and resilience
5.3 General discussion on methods used by studies
6. Using appropriate methods to overcome ethical concerns
7. Case studies
Case study 1: Multiple interventions or a multi-agency intervention
Case study 2: Unanticipated emergencies
Case study 3: A complex emergency involving flooding and conflict
Case study 4: A protracted emergency – internally displaced peoples in DRC
Case study 5: Using impact evaluations to estimate the effect of assistance after typhoons in the Philippines
Case study 6: Using impact evaluations to estimate the effect of assistance in the recovery phase in the absence of ex ante planning
8. Conclusions 
Appendix A: Table on impact evaluations of humanitarian relief


Case-Selection [for case studies]: A Diversity of Methods and Criteria

Posted on 19 August, 2015 – 12:12 PM
Gerring, J., Cojocaru, L., 2015. Case-Selection: A Diversity of Methods and Criteria. January 2015. Available as pdf

Excerpt: “Case-selection plays a pivotal role in case study research. This is widely acknowledged, and is implicit in the practice of describing case studies by their method of selection – typical, deviant, crucial, and so forth. It is also evident in the centrality of case-selection in methodological work on the case study, as witnessed by this symposium. By contrast, in large-N cross-case research one would never describe a study solely by its method of sampling. Likewise, sampling occupies a specialized methodological niche within the literature and is not front-and-center in current methodological debates. The reasons for this contrast are revealing and provide a fitting entrée to our subject.

First, there is relatively little variation in methods of sample construction for cross-case research. Most samples are randomly sampled from a known population or are convenience samples, employing all the data on the subject that is available. By contrast, there are myriad approaches to case-selection in case study research, and they are quite disparate, offering many opportunities for researcher bias in the selection of cases (“cherry-picking”).

Second, there is little methodological debate about the proper way to construct a sample in cross-case research. Random sampling is the gold standard and departures from this standard are recognized as inferior. By contrast, in case study research there is no consensus about how best to choose a case, or a small set of cases, for intensive study.

Third, the construction of a sample and the analysis of that sample are clearly delineated, sequential tasks in cross-case research. By contrast, in case study research they blend into one another. Choosing a case often implies a method of analysis, and the method of analysis may drive the selection of cases.

Fourth, because cross-case research encompasses a large sample – drawn randomly or incorporating as much evidence as is available – its findings are less likely to be driven by the composition of the sample. By contrast, in case study research the choice of a case will very likely determine the substantive findings of the case study.

Fifth, because cross-case research encompasses a large sample claims to external validity are fairly easy to evaluate, even if the sample is not drawn randomly from a well-defined population. By contrast, in case study research it is often difficult to say what a chosen case is a case of – referred to as a problem of “casing.”

Finally, taking its cue from experimental research, methodological discussion of cross-case research tends to focus on issues of internal validity, rendering the problem of case-selection less relevant. Researchers want to know whether a study is true for the studied sample. By contrast, methodological discussion of case study research tends to focus on issues of external validity. This could be a product of the difficulty of assessing case study evidence, which tends to demand a great deal of highly specialized subject expertise and usually does not draw on formal methods of analysis that would be easy for an outsider to assess. In any case, the effect is to further accentuate the role of case-selection. Rather than asking whether the case is correctly analyzed readers want to know whether the results are generalizable, and this leads back to the question of case-selection.”

Other recent papers on case selection methods:

Herron, M.C., Quinn, K.M., 2014. A Careful Look at Modern Case Selection Methods. Sociological Methods & Research.
Nielsen, R.A., 2014. Case Selection via Matching.

Participatory Approaches (to impact evaluation – a pluralist view)

Posted on 12 August, 2015 – 6:15 PM

Methodological Briefs. Impact Evaluation No. 5 by Irene Guijt (and found via the Better Evaluation website). Available as pdf.

“This guide, written by Irene Guijt for UNICEF, looks at the use of participatory approaches in impact evaluation… By asking the question, ‘Who should be involved, why and how?’ for each step of an impact evaluation, an appropriate and context-specific participatory approach can be developed.”


  • Participatory approaches: a brief description
  • When is it appropriate to use this method?
  • How to make the most of participatory approaches
  • Ethical concerns
  • Which other methods work well with this one?
  • Participation in analysis and feedback of results
  • Examples of good practices and challenges

Rick Davies comment: I like the pluralist approach this paper takes towards the use of participatory approaches. It is practically oriented rather than driven by an ideological type of belief that people’s participation must always be maximised. That said, I did find Table 1 “Types of participation by programme participants in impact evaluation” out of place, because it was a typology with a very simple linear scale, with fairly obvious indications of not only what kinds of participation are possible, but which ones are more desirable. On the other hand, I thought Box 3 was really useful, because it spelled out a number of useful questions to ask about possible forms of participation at each stage of the evaluation design, implementation and review process. It is worth noting that given the 22 questions, and assuming for argument’s sake they each had binary answers, this means there are at least 2 to the power of 22 different ways of building participation into an evaluation, i.e. 4,194,304 ways! That seems a bit closer to reality to me, relative to the earlier classification of four types in Table 1.
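The arithmetic behind that figure can be checked in a few lines. This is only a sketch of the counting argument, under the same assumption that each of the 22 questions has exactly two possible answers; the function name is my own.

```python
from itertools import product

def count_designs(num_questions, answers_per_question=2):
    """Number of distinct answer combinations across independent questions."""
    return answers_per_question ** num_questions

# For three yes/no questions we can list every combination explicitly.
small_example = list(product([False, True], repeat=3))
print(len(small_example), count_designs(3))

# For 22 questions we just compute the count.
total = count_designs(22)
print(f"{total:,}")  # 4,194,304
```

If some questions allowed more than two answers, the count would be larger still, which is why the comment says "at least".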

I think the one area here where I would like more detail and examples is on participatory approaches to the analysis of data. Not the collection of data, but its analysis. There is some discussion on page 11 about causality, which would be great to see further developed. I often feel that this is an area of participatory practice where a yellow post-it note might as well be placed, saying “here a miracle occurs”.


The use of Data Envelopment Analysis to calculate priority scores in needs assessments

Posted on 10 August, 2015 – 6:40 PM

by Aldo Benini, July 2015

Priority indices have grown popular for identifying communities most affected by disasters. Responders have produced a number of formats and formulas. Most of these combine indicators using weights and aggregations decided by analysts. Often the rationales for these are weak. In such situations, a data-driven methodology may be preferable. This note discusses the suitability of different approaches. It offers a basic tutorial of a DEA freeware application that works closely with MS Excel. The demo data are from the response to Typhoon Haiyan in the Philippines in 2013. Mirrored from the Assessment Capacities Project (ACAPS) Web site with permission.
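To illustrate what "data-driven" weighting can mean here, the sketch below implements a toy "benefit of the doubt" variant of DEA for two indicators: each community is scored with the weights most favourable to it, subject to no community scoring above 1 under those same weights. This is not the freeware application the note describes, and I am assuming (not asserting) that this variant resembles the note's approach. The community names and indicator values are invented.

```python
from itertools import combinations

# Hypothetical communities with two need indicators scaled to (0, 1];
# higher values mean greater need.
communities = {
    "Community A": (0.9, 0.2),
    "Community B": (0.3, 0.8),
    "Community C": (0.5, 0.5),
    "Community D": (0.2, 0.1),
}

def bod_score(target, units, eps=1e-9):
    """Maximise w1*x1 + w2*x2 for `target`, subject to every unit scoring
    at most 1 under the same weights, with w1, w2 >= 0.

    With only two weights the optimum lies at a vertex of the feasible
    region, so we brute-force the intersections of constraint pairs.
    Assumes all indicator values are positive, so the region is bounded.
    """
    # Each constraint is (a, b, c), meaning a*w1 + b*w2 <= c.
    constraints = [(x1, x2, 1.0) for x1, x2 in units]
    constraints += [(-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]  # w1 >= 0, w2 >= 0
    best = 0.0
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel lines: no single intersection point
        w1 = (c1 * b2 - c2 * b1) / det
        w2 = (a1 * c2 - a2 * c1) / det
        if all(a * w1 + b * w2 <= c + eps for a, b, c in constraints):
            best = max(best, target[0] * w1 + target[1] * w2)
    return best

units = list(communities.values())
scores = {name: bod_score(x, units) for name, x in communities.items()}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

Communities that are extreme on any one indicator reach a score of 1, while those dominated on both indicators score lower. With more than two indicators a proper linear programming solver would be needed instead of this brute-force search.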

Rick Davies comment: I have dipped into this paper and resolved to learn more about Data Envelopment Analysis. It looks like it could be quite useful.


Qualitative Comparative Analysis – A Rigorous Qualitative Method for Assessing Impact

Posted on 6 June, 2015 – 2:50 PM

A Coffey How-To note, June 2015, by Carrie Baptist and Barbara Befani. Available as pdf


  • QCA is a case-based method which allows evaluators to identify different combinations of factors that are critical to a given outcome, in a given context. This allows for a more nuanced understanding of how different combinations of factors can lead to success, and the influence context can have on success.
  • QCA allows evaluators to test theories of change and answer the question ‘what works best, why and under what circumstances’ in a way that emerges directly from the empirical analysis, that can be replicated by other researchers, and is generalizable to other contexts.
  • While it isn’t appropriate for use in all circumstances and has limitations, QCA also has certain unique strengths – including qualitatively assessing impact and identifying multiple pathways to achieving change – which make it a valuable addition to the evaluation toolkit.
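To make "combinations of factors" concrete, here is a minimal sketch of the first step of a crisp-set QCA: building a truth table that groups cases by their configuration of conditions and reports how consistently each configuration is associated with the outcome. The condition names and case data are invented, and the sketch stops short of the Boolean minimisation step that dedicated QCA software performs.

```python
from collections import defaultdict

# Hypothetical cases, each coded 1 (present) or 0 (absent) on three
# conditions and one outcome.
cases = [
    {"funding": 1, "local_partner": 1, "stable_context": 0, "success": 1},
    {"funding": 1, "local_partner": 1, "stable_context": 0, "success": 1},
    {"funding": 1, "local_partner": 0, "stable_context": 1, "success": 1},
    {"funding": 1, "local_partner": 0, "stable_context": 0, "success": 0},
    {"funding": 0, "local_partner": 1, "stable_context": 1, "success": 0},
    {"funding": 0, "local_partner": 1, "stable_context": 1, "success": 1},
    {"funding": 0, "local_partner": 0, "stable_context": 0, "success": 0},
]
conditions = ("funding", "local_partner", "stable_context")

def truth_table(cases, conditions, outcome):
    """Group cases by configuration; report case count and consistency,
    i.e. the share of cases in that configuration showing the outcome."""
    rows = defaultdict(lambda: [0, 0])  # config -> [n cases, n with outcome]
    for case in cases:
        config = tuple(case[c] for c in conditions)
        rows[config][0] += 1
        rows[config][1] += case[outcome]
    return {config: {"n": n, "consistency": k / n}
            for config, (n, k) in rows.items()}

table = truth_table(cases, conditions, "success")
# Configurations where every observed case shows the outcome.
consistent = sorted(cfg for cfg, row in table.items()
                    if row["consistency"] == 1.0)
for config, row in sorted(table.items(), reverse=True):
    print(dict(zip(conditions, config)), row)
```

In this toy data two different configurations are fully consistent with success, which is the "multiple pathways" idea in the third bullet. The configuration with mixed outcomes would normally prompt a closer look at those cases.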

Rick Davies comment: The availability of this sort of explanatory and introductory note is very timely, given the increased use of QCA for evaluation purposes. My only quibble with this how-to note is that the heart of the QCA process seems to have been left undescribed (see step 10, page 6), like the proverbial black box. For those looking for a more detailed exposition, keep an eye out for the extensive guide now being prepared by Barbara Befani, with support from the Expert Group for Aid Studies in Sweden (more details here). There is also an introductory posting on QCA on the Better Evaluation website.

See also: this new listing of uses of QCA for evaluation purposes


Category refinement in humanitarian needs assessments

Posted on 27 May, 2015 – 9:45 AM

MODERATE NEED, ACUTE NEED – Valid categories for humanitarian needs assessments? Evidence from a recent needs assessment in Syria
26 March 2015, by Aldo Benini

“Needs assessments in crises seek to establish, among other elements, the number of persons in need. “Persons in need” is a broad and fuzzy concept; the estimates that local key informants provide are highly uncertain. Would refining the categories of persons in need lead to more reliable estimates?

The Syria Multi-Sectoral Needs Assessment (MSNA), in autumn 2014, provided PiN estimates for 126 of the 270 sub-districts of the country. It differentiated between persons in moderate and those in acute need. “Moderate Need, Acute Need – Valid Categories for Humanitarian Needs Assessments?” tests the information value of this distinction. The results affirm that refined PiN categories can improve the measurement of unmet needs under conditions that rarely permit exact classification. The note ends with some technical recommendations for future assessments.”


Impact Evaluation: A Guide for Commissioners and Managers

Posted on 18 May, 2015 – 3:48 PM

Prepared by Elliot Stern for the Big Lottery Fund, Bond, Comic Relief and the Department for International Development, May 2015. Available as pdf

1. Introduction and scope
2. What is impact evaluation?
Defining impact and impact evaluation
Linking cause and effect
Explanation and the role of ‘theory’
Who defines impact?
Impact evaluation and other evaluation approaches
Main messages

3. Frameworks for designing impact evaluation
Designs that support causal claims
The design triangle
Evaluation questions
Evaluation designs
Programme attributes
Main messages

4. What different designs and methods can do
Causal inference: linking cause and effect
Main types of impact evaluation design
The contemporary importance of the ‘contributory’ cause
Revisiting the ‘design triangle’
Main messages

5. Using this guide
Drawing up terms of reference and assessing proposals for impact evaluations
Assessing proposals
Quality of reports and findings
Strengths of conclusions and recommendations
Using findings from impact evaluations
Main messages
Annex

