Research on the use and influence of evaluations: The beginnings of a list

This is intended to be the start of an accumulating list of references on the subject of evaluation use, particularly papers that review specific sets or examples of evaluations, rather than ones that discuss the issues in a less grounded way.

2016

2015

2014

2012

2009

2000

1997

1986

Related docs

  • Improving the use of monitoring & evaluation processes and findings. Conference Report, Centre for Development Innovation, Wageningen, June 2014  
    • “An existing framework of four areas of factors influencing use …:
      1. Quality factors, relating to the quality of the evaluation. These factors include the evaluation design, planning, approach, timing, dissemination and the quality and credibility of the evidence.
      2. Relational factors: personal and interpersonal; role and influence of evaluation unit; networks, communities of practice.
      3. Organisational factors: culture, structure and knowledge management
      4. External factors, that affect utilisation in ways beyond the influence of the primary stakeholders and the evaluation process.
  • Bibliography provided by ODI, in response to this post Jan 2015. Includes all ODI publications found using keyword “evaluation” – a bit too broad, but still useful
  • ITIG – Utilization of Evaluations – Bibliography. International Development Evaluation Association. Produced circa 2011/12

Livelihoods Monitoring and Evaluation: A Rapid Desk Based Study

by Kath Pasteur, 2014, 24 pages. Found here: http://www.evidenceondemand.info/livelihoods-monitoring-and-evaluation-a-rapid-desk-based-study

Abstract: “This report is the outcome of a rapid desk study to identify and collate the current state of evidence and best practice for monitoring and evaluating programmes that aim to have a livelihoods impact. The study identifies tried and tested approaches and indicators that can be applied across a range of livelihoods programming. The main focus of the report is an annotated bibliography of literature sources relevant to the theme. The narrative report highlights key themes and examples from the literature relating to methods and indicators. This collection of resources is intended to form the starting point for a more thorough organisation and analysis of material for the final formation of a Topic Guide on Livelihoods Indicators. This report has been produced by Practical Action Consulting for Evidence on Demand with the assistance of the UK Department for International Development (DFID) contracted through the Climate, Environment, Infrastructure and Livelihoods Professional Evidence and Applied Knowledge Services (CEIL PEAKS) programme, jointly managed by HTSPE Limited and IMC Worldwide Limited”

Full reference: Pasteur, K. Livelihoods monitoring and evaluation: A rapid desk based study. Evidence on Demand, UK (2014) 24 pp. [DOI: http://dx.doi.org/10.12774/eod_hd.feb2014.pasteur]

Process tracing: A list

  • Understanding Process Tracing, David Collier, University of California, Berkeley. PS: Political Science and Politics 44, No.4 (2011):823-30. 7 pages.
    • Abstract: “Process tracing is a fundamental tool of qualitative analysis. This method is often invoked by scholars who carry out within-case analysis based on qualitative data, yet frequently it is neither adequately understood nor rigorously applied. This deficit motivates this article, which offers a new framework for carrying out process tracing. The reformulation integrates discussions of process tracing and causal-process observations, gives greater attention to description as a key contribution, and emphasizes the causal sequence in which process-tracing observations can be situated. In the current period of major innovation in quantitative tools for causal inference, this reformulation is part of a wider, parallel effort to achieve greater systematization of qualitative methods. A key point here is that these methods can add inferential leverage that is often lacking in quantitative analysis. This article is accompanied by online teaching exercises, focused on four examples from American politics, two from comparative politics, three from international relations, and one from public health/epidemiology”
      • Great explanation of the difference between straw-in-the-wind tests, hoop tests, smoking-gun tests and doubly-decisive tests, using Sherlock Holmes story “Silver Blaze”
  • Case selection techniques in Process-tracing and the implications of taking the study of causal mechanisms seriously, Derek Beach and Rasmus Brun Pedersen, 2012, 33 pages
    • Abstract: “This paper develops guidelines for each of the three variants of Process-tracing (PT): explaining outcome PT, theory-testing, and theory-building PT. Case selection strategies are not relevant when we are engaging in explaining outcome PT due to the broader conceptualization of outcomes that is a product of the different understandings of case study research (and science itself) underlying this variant of PT. Here we simply select historically important cases because they are for instance the First World War, not a ‘case of’ failed deterrence or crisis decision-making. Within the two theorycentric variants of PT, typical case selection strategies are most applicable. A typical case is one that is a member of the set of X, Y and the relevant scope conditions for the mechanism. We put forward that pathway cases, where scores on other causes are controlled for, are less relevant when we take the study of mechanisms seriously in PT, given that we are focusing our attention on how a mechanism contributes to produce Y, not on the causal effects of an X upon values of Y. We also discuss the role that deviant cases play in theory-building PT, suggesting that PT cannot stand alone, but needs to be complemented with comparative analysis of the deviant case with typical cases”
  • Process-Tracing Methods: Foundations and Guidelines, Derek Beach, Rasmus Brun Pedersen,  The University of Michigan Press (15 Dec 2012), 248 pages.
    • Description: “Process-tracing in social science is a method for studying causal mechanisms linking causes with outcomes. This enables the researcher to make strong inferences about how a cause (or set of causes) contributes to producing an outcome. Derek Beach and Rasmus Brun Pedersen introduce a refined definition of process-tracing, differentiating it into three distinct variants and explaining the applications and limitations of each. The authors develop the underlying logic of process-tracing, including how one should understand causal mechanisms and how Bayesian logic enables strong within-case inferences. They provide instructions for identifying the variant of process-tracing most appropriate for the research question at hand and a set of guidelines for each stage of the research process.” View the Table of Contents here:
  • Mahoney, James. 2012. “The Logic of Process Tracing Tests in the Social Sciences.” Sociological Methods & Research (March): 1–28. doi:10.1177/0049124112437709.
    • Abstract: This article discusses process tracing as a methodology for testing hypotheses in the social sciences. With process tracing tests, the analyst combines preexisting generalizations with specific observations from within a single case to make causal inferences about that case. Process tracing tests can be used to help establish that (1) an initial event or process took place, (2) a subsequent outcome also occurred, and (3) the former was a cause of the latter. The article focuses on the logic of different process tracing tests, including hoop tests, smoking gun tests, and straw in the wind tests. New criteria for judging the strength of these tests are developed using ideas concerning the relative importance of necessary and sufficient conditions. Similarities and differences between process tracing and the deductive-nomological model of explanation are explored.
  • Goertz, Gary, and James Mahoney. 2012. A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences. Princeton University Press. See chapter 8 on causal mechanisms and process tracing, and the surrounding chapters 7 and 9 which make up a section on within-case analysis
  • Hutchings, Claire. ‘Process Tracing: Draft Protocol’. Oxfam, 2013. Plus an associated blog posting and an Effectiveness Review which made use of the protocol
  • Schneider, C.Q., Rohlfing, I., 2013. Combining QCA and Process Tracing in Set-Theoretic Multi-Method Research. Sociological Methods & Research 42, 559–597. doi:10.1177/0049124113481341
    • Abstract:  Set-theoretic methods and Qualitative Comparative Analysis (QCA) in particular are case-based methods. There are, however, only few guidelines on how to combine them with qualitative case studies. Contributing to the literature on multi-method research (MMR), we offer the first comprehensive elaboration of principles for the integration of QCA and case studies with a special focus on case selection. We show that QCA’s reliance on set-relational causation in terms of necessity and sufficiency has important consequences for the choice of cases. Using real world data for both crisp-set and fuzzy-set QCA, we show what typical and deviant cases are in QCA-based MMR. In addition, we demonstrate how to select cases for comparative case studies aiming to discern causal mechanisms and address the puzzles behind deviant cases. Finally, we detail the implications of modifying the set-theoretic cross-case model in the light of case-study evidence. Following the principles developed in this article should increase the inferential leverage of set-theoretic MMR.”
  • Rohlfing, Ingo. “Comparative Hypothesis Testing Via Process Tracing.” Sociological Methods & Research 43, no. 4 (November 1, 2014): 606–42. doi:10.1177/0049124113503142.
    • Abstract: Causal inference via process tracing has received increasing attention during recent years. A 2 × 2 typology of hypothesis tests takes a central place in this debate. A discussion of the typology demonstrates that its role for causal inference can be improved further in three respects. First, the aim of this article is to formulate case selection principles for each of the four tests. Second, in focusing on the dimension of uniqueness of the 2 × 2 typology, I show that it is important to distinguish between theoretical and empirical uniqueness when choosing cases and generating inferences via process tracing. Third, I demonstrate that the standard reading of the so-called doubly decisive test is misleading. It conflates unique implications of a hypothesis with contradictory implications between one hypothesis and another. In order to remedy the current ambiguity of the dimension of uniqueness, I propose an expanded typology of hypothesis tests that is constituted by three dimensions.
  • Bennett, A., Checkel, J. (Eds.), 2014. Process Tracing: From Metaphor to Analytic Tool. Cambridge University Press
  • Befani, Barbara, and John Mayne. “Process Tracing and Contribution Analysis: A Combined Approach to Generative Causal Inference for Impact Evaluation.” IDS Bulletin 45, no. 6 (2014): 17–36. doi:10.1111/1759-5436.12110.
    • Abstract: This article proposes a combination of a popular evaluation approach, contribution analysis (CA), with an emerging method for causal inference, process tracing (PT). Both are grounded in generative causality and take a probabilistic approach to the interpretation of evidence. The combined approach is tested on the evaluation of the contribution of a teaching programme to the improvement of school performance of girls, and is shown to be preferable to either CA or PT alone. The proposed procedure shows that established Bayesian principles and PT tests, based on both science and common sense, can be applied to assess the strength of qualitative and quali-quantitative observations and evidence, collected within an overarching CA framework; thus shifting the focus of impact evaluation from ‘assessing impact’ to ‘assessing confidence’ (about impact).

  • Punton, M., Welle, K., 2015. Straws-in-the-wind, Hoops and Smoking Guns: What can Process Tracing Offer to Impact Evaluation?
    • Abstract:  “This CDI Practice Paper by Melanie Punton and Katharina Welle explains the methodological and theoretical foundations of process tracing, and discusses its potential application in international development impact evaluations. It draws on two early applications of process tracing for assessing impact in international development interventions: Oxfam Great Britain (GB)’s contribution to advancing universal health care in Ghana, and the impact of the Hunger and Nutrition Commitment Index (HANCI) on policy change in Tanzania. In a companion to this paper, Practice Paper 10 Annex describes the main steps in applying process tracing and provides some examples of how these steps might be applied in practice.”
  • Weller, N., & Barnes, J. (2016). Pathway Analysis and the search for causal mechanisms. Sociological Methods & Research, 45(3), 424–457.
    • Abstract: The study of causal mechanisms interests scholars across the social sciences. Case studies can be a valuable tool in developing knowledge and hypotheses about how causal mechanisms function. The usefulness of case studies in the search for causal mechanisms depends on effective case selection, and there are few existing guidelines for selecting cases to study causal mechanisms. We outline a general approach for selecting cases for pathway analysis: a mode of qualitative research that is part of a mixed-method research agenda, which seeks to (1) understand the mechanisms or links underlying an association between some explanatory variable, X1, and an outcome, Y, in particular cases and (2) generate insights from these cases about mechanisms in the unstudied population of cases featuring the X1/Y relationship. The gist of our approach is that researchers should choose cases for comparison in light of two criteria. The first criterion is the expected relationship between X1/Y, which is the degree to which cases are expected to feature the relationship of interest between X1 and Y. The second criterion is variation in case characteristics or the extent to which the cases are likely to feature differences in characteristics that can facilitate hypothesis generation. We demonstrate how to apply our approach and compare it to a leading example of pathway analysis in the so-called resource curse literature, a prominent example of a correlation featuring a nonlinear relationship and multiple causal mechanisms.
  • Befani, Barbara, and Gavin Stedman-Bryce. “Process Tracing and Bayesian Updating for Impact Evaluation.” Evaluation, June 24, 2016, 1356389016654584. doi:10.1177/1356389016654584.
    • Abstract: Commissioners of impact evaluation often place great emphasis on assessing the contribution made by a particular intervention in achieving one or more outcomes, commonly referred to as a ‘contribution claim’. Current theory-based approaches fail to provide evaluators with guidance on how to collect data and assess how strongly or weakly such data support contribution claims. This article presents a rigorous quali-quantitative approach to establish the validity of contribution claims in impact evaluation, with explicit criteria to guide evaluators in data collection and in measuring confidence in their findings. Coined ‘Contribution Tracing’, the approach is inspired by the principles of Process Tracing and Bayesian Updating, and attempts to make these accessible, relevant and applicable by evaluators. The Contribution Tracing approach, aided by a symbolic ‘contribution trial’, adds value to impact evaluation theory-based approaches by: reducing confirmation bias; improving the conceptual clarity and precision of theories of change; providing more transparency and predictability to data-collection efforts; and ultimately increasing the internal validity and credibility of evaluation findings, namely of qualitative statements. The approach is demonstrated in the impact evaluation of the Universal Health Care campaign, an advocacy campaign aimed at influencing health policy in Ghana.
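The Bayesian updating that the two papers above draw on can be illustrated in a few lines of arithmetic. A minimal sketch, with purely illustrative probabilities (not taken from either paper):

```python
# A minimal sketch of Bayesian updating as used in process tracing /
# contribution tracing. All probabilities below are illustrative assumptions.
def update(prior, sensitivity, false_positive_rate):
    """Posterior confidence in a claim, given that the evidence was observed.

    sensitivity         = P(observing the evidence | claim is true)
    false_positive_rate = P(observing the evidence | claim is false)
    """
    numerator = sensitivity * prior
    return numerator / (numerator + false_positive_rate * (1 - prior))

prior = 0.5  # initial confidence in the contribution claim

# Evidence that is likely if the claim is true but hard to find otherwise
# raises confidence substantially.
strong = update(prior, sensitivity=0.8, false_positive_rate=0.1)
print(f"Confidence after a strong observation: {strong:.2f}")  # ~0.89

# Evidence almost as likely to be found when the claim is false barely
# moves the prior (low 'uniqueness').
weak = update(prior, sensitivity=0.6, false_positive_rate=0.5)
print(f"Confidence after a weak observation: {weak:.2f}")  # ~0.55
```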

A review of evaluations of interventions related to violence against women and girls – using QCA and process tracing

In this posting I am drawing attention to a blog by Michaela Raab and Wolf Stuppert, which is exceptional (or at least unusual) in a number of respects. The blog can be found at http://www.evawreview.de/

Firstly the blog is not just about the results of a review, but more importantly, about the review process, written as the review process proceeds. (I have not seen many of these kinds of blogs around, but if you know about any others please let me know)

Secondly, the blog is about the use of QCA and process tracing. There have been a number of articles about QCA in the journal Evaluation, but generally speaking relatively few evaluators working with development projects know much about QCA or process tracing.

Thirdly, the blog is about the use of QCA and process tracing as a means of doing a review of findings of past evaluations of interventions related to violence against women and girls. In other words, it is another approach to undertaking a kind of systematic review, notably one which does not require throwing out 95% of the available studies because their contents don’t fit the methodology being used to do the systematic review.

Fourthly, it is about combining the use of QCA and process tracing, i.e. combining cross-case comparisons with within-case analyses. QCA can help identify causal configurations of conditions associated with specific outcomes. But once found, these associations need to be examined in depth to ensure there are plausible causal mechanisms at work. That is where process tracing comes into play.

I have two hopes for the EVAWG Review blog. One is that it will provide a sufficiently transparent account of the use of QCA to enable new potential users to understand how it works, along with an appreciation of its potentials and difficulties. The other is that the dataset used in the QCA analysis will be made publicly available, ideally via the blog itself. One of the merits of QCA analyses, as published so far, is that the datasets are often included in the published articles, which means others can then re-analyse the same data, perhaps from a different perspective. For example, I would like to test the results of the QCA analyses by using another method for generating results which have a comparable structure (i.e. descriptions of one or more configurations of conditions associated with the presence and absence of expected outcomes). I have described this method elsewhere (Decision Tree algorithms, as used in data mining).
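As an illustration of the Decision Tree alternative mentioned above, here is a minimal sketch using scikit-learn and pandas. The cases, condition names and outcome are entirely hypothetical; the point is only that each root-to-leaf path in the induced tree reads as a configuration of conditions associated with the presence or absence of the outcome, loosely comparable to a QCA solution term:

```python
# A minimal sketch (hypothetical data): using a Decision Tree to induce
# configurations of binary conditions associated with an outcome, as a
# cross-check on a QCA-style truth-table analysis.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Each row is one case (e.g. one evaluation); columns are binary conditions
# and a binary outcome (e.g. "evaluation judged useful").
cases = pd.DataFrame({
    "participatory_design": [1, 1, 0, 0, 1, 0, 1, 0],
    "timely_reporting":     [1, 0, 1, 0, 1, 1, 0, 0],
    "management_buy_in":    [1, 1, 1, 0, 0, 1, 0, 0],
    "useful":               [1, 1, 1, 0, 1, 1, 0, 0],  # outcome
})

X = cases.drop(columns="useful")
y = cases["useful"]

# A shallow tree keeps the induced configurations simple and readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Each path from root to leaf is one candidate configuration of conditions
# associated with the presence or absence of the outcome.
print(export_text(tree, feature_names=list(X.columns)))
```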

There are also some challenges facing this use of QCA, and I would like to see how the blog’s authors try to deal with them. In RCTs there need to be both comparable interventions and comparable outcomes, e.g. cash transfers provided to many people in some standardised manner, and a common measure of household poverty status. With QCA (and Decision Tree) analyses comparable outcomes are still needed, but not comparable interventions. These can be many and varied, as can be the wider context in which they are provided. The challenge with Raab and Stuppert’s work on VAWG is that there will be many and varied outcome measures as well as interventions. They will probably need to do multiple QCA analyses, focusing on sub-sets of evaluations within which there are one or more comparable outcomes. But by focusing in this way, they may end up with too few cases (evaluations) to produce plausible results, given the diversity of (possibly) causal conditions they will be exploring.

There is a much bigger challenge still. On re-reading the blog I realised this is not simply a kind of systematic review of the available evidence, using a different method. Instead it is a kind of meta-evaluation, where the focus is on comparison of the evaluation methods used in the population of evaluations they manage to amass. The problem of finding comparable outcomes is much bigger here. For example, on what basis will they rate or categorise evaluations as successful (e.g. valid and/or useful)? There seems to be a chicken and egg problem lurking here. Help!

PS1: I should add that this work is being funded by DFID, but the types of evaluations being reviewed are not limited to evaluations of DFID projects.

PS2 2013 11 07: I now see from the team’s latest blog posting that the common outcome of interest will be the usefulness of the evaluation. I would be interested to see how they assess usefulness in some way that is reasonably reliable.

PS3 2014 01 07: I continue to be impressed by the team’s efforts to publicly document the progress of their work. Their Scoping Report is now available online, along with a blog commentary on progress to date (2013 01 06)

PS4 2014 03 27: The Inception Report is now available on the VAWG blog. It is well worth reading, especially the sections explaining the methodology and the evaluation team’s response to comments by the Specialised Evaluation and Quality Assurance Service (SEQUAS, 4 March 2014) on pages 56-62, some of which are quite tough.

Some related/relevant reading:


AEA resources on Social Network Analysis and Evaluation

American Evaluation Association (AEA) Social Network  Analysis (SNA) Topical Interest Group (TIG) resources

AEA365 | A Tip-a-Day by and for Evaluators

A Bibliography on Evaluability Assessment

PS: This posting and bibliography was first published in November 2012, but has been updated since then, most recently in March 2018. The bibliography now contains 150 items.

An online (Zotero) bibliography was generated in November 2012 by Rick Davies, as part of the process of developing a “Synthesis of literature on evaluability assessments” contracted by the DFID Evaluation Department

[In 2012] There are currently 133 items in this bibliography, listed by year of publication, starting with the oldest first. They include books, journal articles, government and non-government agency documents and webpages, produced between 1979 and 2012. Of these 59% described actual examples of Evaluability Assessments, 13% reviewed experiences of multiple kinds of Evaluability Assessments, 28% were expositions on Evaluability Assessments, with some references to examples, 10% were official guidance documents on how to do Evaluability Assessments and 12% were Terms of Reference for Evaluability Assessments. Almost half (44%) of the documents were produced by international development agencies.

The list is a result of a search using Google Scholar and Google Search to find documents with “evaluability” in the title. The first 100 items in the search result listing were examined. Searches were also made via PubMed, JSTOR and Sciverse. A small number of documents were also identified as a result of a request posted on the MandE NEWS, Xceval and Theory Based Evaluation email lists.

This list is open to further editing and inclusions. Suggestions should be sent to rick.davies@gmail.com

 

M&E Software: A List

Well, the beginnings of a list…

PLEASE NOTE: No guarantee can be given about the accuracy of information provided on the linked websites about the M&E software concerned, and its providers. Please proceed with due caution when downloading any executable programs.

Contents on this page: Stand alone systems | Online systems | Survey supporting software | Sector specific tools | Qualitative data analysis | Data mining / Predictive Modelling | Program Logic / Theory of Change modeling | Dynamic models | Mind-Mapping software | Collaboration software | Excel-based tools | Uncategorised and misc other

If you have any advice or opinions on any of the applications below, please tell us more via this survey.

Stand-alone systems

  • AidProject M+E for Donor-funded aid projects
  • Flamingo and Monitoring Organiser: “In order to implement FLAMINGO, it is crucial to first define the inputs (or resources available), activities, outputs and outcomes”
  • HIV/AIDS Data Capturing and Reporting Platform [Monitoring and Evaluation System]
  • PacPlan: “Results-Based Planning, Monitoring and Evaluation Software and Process Solution”
  • Prome Web: A project management, monitoring and evaluation software. Adapted for aid projects in developing countries
  • Sigmah: “humanitarian project management open source software”

Online systems

  • Activity Info: “an online humanitarian project monitoring tool, which helps humanitarian organizations to collect, manage, map and analyze indicators. ActivityInfo has been developed to simplify reporting and allow for real-time monitoring”
  • AKVO: “a paid-for platform that covers data collection, analysis, visualisation and reporting”
  • Canva Mind Maps: “Create a mind map with Canva and bring your thoughts to life. Easy to use, completely online and completely free mind mapping software”
  • DevResults: “web-based project management tool specially designed for the international development community.” Including M&E, mapping, budgeting, checklists, forms, and collaboration facilities.
  • Granity: “Management and reporting software for Not-for-profits Making transparency easy”
  • IndiKit: Guidance on SMART indicators for relief and development programmes
  • Kashana: An open sourced, web-based Monitoring, Evaluation & Learning (MEL) product for development projects and organisations
  • Kinaki: “Kinaki is a unique and intuitive project design, data collection, analysis, reporting and sharing tool”
  • KI-PROJECTS™ Monitoring and Evaluation Software
  • Kobo Toolbox: “a free, more user-friendly way to deploy Open Data Kit surveys. It was developed with humanitarian purposes in mind, but could be used in various contexts (and not just for surveys). There is an Android data collection app that works offline”
  • Logalto: “Collaborative Web-Based Software for Monitoring and Evaluation of International Development Projects”
  • M&E Online: “Web-based monitoring and evaluation software tool”
  • Monitoring and Evaluation Online: Online Monitoring and Evaluation Software Tool
  • SmartME: “SmartME is a tried and tested comprehensive Fund Management and M&E software platform to manage funds better”
  • Systmapp: “cloud-based software that uses a patent-pending methodology to connect monitoring, planning, and knowledge management for international development organisations”
  • TolaData “is a program management and M&E platform that helps organisations create data-driven impact through the adaptive and timely management of projects”
  • WebMo: Web-based project monitoring for development cooperation

Survey supporting software

  • CommCare: a mobile data collection platform.
  • EthnoCorder is mobile multimedia survey software for your iPhone
  • HarvestYourData: iPad & Android Survey App for Mobile Offline Data Collection
  • KoBoToolbox is a suite of tools for field data collection for use in challenging environments. Free and open source
  • Magpi (formerly EpiSurveyor) – provides tools for mobile data collection, messaging and visualisation; lets anyone create an account, design forms, download them to phones, and start collecting data in minutes, for free.
  • Open Data Kit (ODK) is a free and open-source set of tools which help organizations author, field, and manage mobile data collection solutions
  • REDCap, a secure web application for building and managing online surveys and databases… specifically geared to support online or offline data capture for research studies and operations
  • Sensemaker(c) “links micro-narratives with human sense-making to create advanced decision support, research and monitoring capability in both large and small organisations.”
  • Comparisons

Sector-specific tools

  • Mwater for WASH, which explicitly aims to make the data it collects (in this case water quality data) openly available. Free and open source
  • Miradi: Adaptive Management Software for Conservation projects. https://www.miradi.org/

Qualitative data analysis

  • Dedoose, a cross-platform app for analyzing qualitative and mixed methods research with text, photos, audio, videos, spreadsheet data and more
  • Nvivo, powerful software for qualitative data analysis.
  • HyperRESEARCH “…gives you complete access and control, with keyword coding, mind-mapping tools, theory building and much more”.
  • Impact Mapper: “A new online software tool to track trends in stories and data related to social change”

Data mining / predictive modeling

  • RapidMiner Studio. Free and paid for versions. Data Access (Connect to any data source, any format, at any scale), Data Exploration (Quickly discover patterns or data quality issues), Data Blending (Create the optimal data set for predictive analysis), Data Cleansing (Expertly cleanse data for advanced algorithms), Modeling (Efficiently build and deliver better models faster), Validation (Confidently & accurately estimate model performance)
  • BigML. Free and paid for versions. Online service. “Machine learning made easy”
  • EvalC3: Tools for exploring and evaluating complex causal configurations, developed by Rick Davies (Editor of MandE NEWS). Free and available with Skype video support

Program Logic / Theory of Change modeling / Diagramming

  • Changeroo: “Changeroo assists organisations, programs and projects with a social mission to develop and manage high-quality Theories of Change”
  • Coggle: The clear way to share complex information
  • DAGitty: “a browser-based environment for creating, editing, and analyzing causal models (also known as directed acyclic graphs or causal Bayesian networks)”
  • Decision Explorer: a tool for managing “soft” issues – the qualitative information that surrounds complex or uncertain situations.
  • DCED’s Evidence Framework – more a way of using a website than software as such, but definitely an approach that is replicable by others.
  • DoView – Visual outcomes and results planning
  • Draw.io: a free, browser-based diagram editor
  • Dylomo: “a free* web-based tool that you can use to build and present program logic models that you can interact with”
  • IdeaTree – Simultaneous Collaboration & Brainstorming Using Mind Maps
  • Insight Maker: “…express your thoughts using rich pictures and causal loop diagrams. … turn these diagrams into powerful simulation models.”
  • Kumu: a powerful data visualization platform that helps you organize complex information into interactive relationship maps.
  • Logframer 1.0 “a free project management application for projects based on the logical framework method”
  • LucidChart: Diagrams done right. Diagram and collaborate anytime on any device
  • Netway: a cyberinfrastructure designed to support collaboration on the development of program models and evaluation plans, provide connection to a virtual community of related programs, outcomes, measures and practitioners, and to provide quick access to resources on evaluation planning
  • Omnigraffle: for creating precise, beautiful graphics: website wireframes, electrical systems, family trees and maps of software classes
  • Theory maker: a free web app by Steve Powell for making any kind of causal diagram, i.e. a diagram which uses arrows to say what contributes to what.
  • TOCO – Theory of Change Online. A free version is available.
  • Visual Understanding Environment (VUE): open source ‘mind mapping’ freeware from Tufts Univ.
  • yEd – diagram editor that can be used to generate drawings of diagrams.  FREE. PS: There is now a web-based version of this excellent network drawing application

Dynamic models

  • CCTools: Map and steer complex systems, using Fuzzy Cognitive Maps and others [ This site is currently under reconstruction]
  • Loopy: A tool for thinking in systems
  • Mental Modeler: FCM modeling software that helps individuals and communities capture their knowledge in a standardized format that can be used for scenario analysis.
  • FCM Expert: Experimenting tools for Fuzzy Cognitive Maps
  • FCMapper: the first available FCM analysis tool based on MS Excel and FREE for non-commercial use.
  • FSDM: Fuzzy Systems Dynamics Model Implemented with a Graphical User Interface

Mind-Mapping software (tree diagrams)

  • MindView: “a professional mind mapping software that allows you to visually brainstorm, organize and present ideas.”
  • XMind: “mind mapping and brainstorming tool, designed to generate ideas, inspire creativity, brings you efficiency both in work and life.”
  • MindManager: commercial mind mapping and information management software
  • Plectica: “Diagram your thinking in real time, together”

Collaboration software

  • Miro: an online collaborative whiteboard, which can be used to make a collaborative ToC.

Excel-based tools

  • EvalC3: …tools for developing, exploring and evaluating predictive models of expected outcomes, developed by Rick Davies (Editor of MandE NEWS). Free and available with Skype video support

Uncategorised yet

  • OpenRefine (formerly Google Refine): a powerful tool for working with messy data: cleaning it, transforming it from one format into another, and extending it with web services and external data.
  • Overview is an open-source tool originally designed to help journalists find stories in large numbers of documents, by automatically sorting them according to topic and providing a fast visualization and reading interface. It’s also used for qualitative research, social media conversation analysis, legal document review, digital humanities, and more. Overview does at least three things really well.
Other lists
Other

On evaluation quality standards: A List

 

The beginnings of a list. Please suggest others by using the Comment facility below

Normative statements:

Standards for specific methods (and fields):

Meta-evaluations:

  • “Are Sida Evaluations Good Enough? An Assessment of 34 Evaluation Reports” by Kim Forss, Evert Vedung, Stein Erik Kruse, Agnes Mwaiselage, Anna Nilsdotter, Sida Studies in Evaluation 2008:1. See especially Section 6: Conclusion, 6.1 Revisiting the Quality Questions, 6.2 Why are there Quality Problems with Evaluations?, 6.3 How can the Quality of Evaluations be Improved?, 6.4 Direction of Future Studies. RD Comment: This study has annexes with empirical data on the quality attributes of 34 evaluation reports published in the Sida Evaluations series between 2003 and 2005. It BEGS a follow-up study to see if/how these various quality ratings correlate in any way with the subsequent use of the evaluation reports. Could Sida be persuaded to do something like this?

Ethics focused

  • Australasian Evaluation Society

Journal articles

Checklists:

  • Evaluation checklists prepared by Western Michigan University, covering Evaluation Management, Evaluation Models, Evaluation Values and Criteria, Metaevaluation, Evaluation Capacity Building / Institutionalization, and Checklist Creation

Other lists:

A list of M&E training providers

Update 2014 12 20: The contents of this page have become woefully out of date and it would be more than a full time job to keep it up to date.

My advice is now as follows:

If you are looking for M&E training opportunities visit the MandE NEWS Training Forum, which lists all upcoming training events. There are many training providers listed there, along with links to their websites

Please also consider taking part in the online survey of training needs.

If you are a training provider, please look at the cumulative results to date of that survey.

I have now deleted all the previous training providers that were shown below

Value for money: A list

Hopefully, the start of a short but useful bibliography, listed in chronological order.

Please suggest additional documents by using the Comment facility below. If you have ideas on how Value for Money can be clearly defined and usefully measured, please also use the Comment facility below.

For the Editor’s own suggestion, go to the bottom of this page

2015

2014

2013

2012

2011

  • ICAI’s Approach to Effectiveness and Value for Money, November 2011. See also Rick Davies comments on same
  • Value for Money and international development: Deconstructing some myths to promote more constructive discussion. OECD Consultation Draft. October 2011
  • What does ‘value for money’ really mean? CAFOD, October 2011
  • Value for Money: Guideline, NZAID, updated July 2011
  • DFID’s Approach to Value for Money (VfM), July 2011
  • DFID Briefing Note: Indicators and VFM in Governance Programming July 2011.  INTRODUCTION: This note provides advice to DFID staff on: i. governance indicator best practice, and ii. measuring the Value for Money of governance programmes. This note is for use primarily by DFID governance advisers, as well as other DFID staff designing programmes with governance elements. The note provides a framework for consideration in Business Case design that relates to governance activity.  On Value for Money (VFM) in particular, this guidance is only intended as ‘interim’ whilst further research is undertaken. During 2011-2012, DFID will work to determine best practice and establish agreed approaches and mechanisms. This guidance will therefore be updated accordingly subject to research findings as they are made available.  This note was drawn up by DFID staff. It builds on 2 research reports by ITAD, submitted in December 2010 and January 2011 respectively, as well as DFID’s internal Business Case guidance. There are 2 main sections: Section 1: Governance Indicators and Section 2: Value for Money in Governance Programming. The note ends with 10 Top Tips on Business Case preparation.
  • DFID is developing “Guidance for DFID country offices on maximising VfM in cash transfer programmes”, July 2011. Objective: “To provide guidance to DFID country offices on measuring value for money in cash transfer programmes through the rigorous analysis of costs and benefits, as far as possible, at the design stage and through programme implementation and completion. This project is driven by DFID’s expansion of support to cash transfer programmes, its strong emphasis on ensuring programmes are delivering value for money, and strong country office demand for specific advice and guidance” (ToRs)
  • Value for Money: Current Approaches and Evolving Debates. Antinoja Emmi, Eskiocak Ozlem, Kjennerud Maja, Rozenkopf Ilan, Schatz Florian, LSE, London, May 2011. 43 pages. “NGOs have increasingly been asked by donors to demonstrate their Value for Money (VfM). This report analyses this demand across a number of dimensions and intends to lay out the interpretation of different stakeholders. After contextualising the debate internationally and nationally, a conceptual discussion of possible ways of defining and measuring VfM is conducted, followed by a technical analysis of different approaches and measurement techniques adopted by stakeholders. Finally, opportunities and caveats of measuring VfM are discussed. The report draws heavily on information gained through a total of seventeen interviews with representatives of NGOs, consultancies, think tanks and academic institutions.”
  • Independent Commission for Aid Impact – Work Plan, May 2011: “We have not yet agreed our own definition of terms such as “value for money” and “aid effectiveness”. These are complex issues which are currently under much debate. In the case of value for money we believe that this should include long-term impact and effectiveness. We intend to commission our contractor to help us in our consideration of these matters.”
  • The Guardian, Madeleine Bunting, 11th April 2011 “Value for money is not compatible with increasing aid to ‘fragile states’. The two big ideas from the UK’s Department for International Development are destined for collision”
  • NAO report on DFID Financial Management, April 2011. See the concluding section of the Executive Summary, titled Conclusion on value for money:
    • “We recognise that the Department has been improving its core financial management and has also been strengthening its focus on value for money at all levels of the organisation, including through a step change in its approach to the strategic allocation of resources based on expected results. Important building blocks have been put in place, but key gaps in financial management maturity remain. The changes the Department has introduced to-date are positive, and provide a platform to address the challenges that will come with its increased spending.”
    • “At present, however, the Department’s financial management is not mature. The Department’s forecasting remains inaccurate and its risk management is not yet fully embedded. Weaknesses in the measurement of value for money at project level, variability in the quality and coverage of data, and lack of integration in core systems, mean that the Department cannot assess important aspects of value for money of the aid it has delivered, at an aggregated level. The Department now needs to develop a coherent single strategy to address the weaknesses identified and the key risks to meeting its objectives.”
  • DFID’s March 2011 Multilateral Aid Review “was commissioned to assess the value for money for UK aid of funding through multilateral organisations”. “All were assessed against the same set of criteria, interpreted flexibly to fit with their different circumstances, but always grounded in the best available evidence. Together the criteria capture the value for money for UK aid of the whole of each organisation. The methodology was independently validated and quality assured by two of the UK’s leading development experts. The assessment framework included criteria which relate directly to the focus and impact of an organisation on the UK’s development and humanitarian objectives – such as whether or not they are playing a critical role in line with their mandate, what this means in terms of results achieved on the ground, their focus on girls and women, their ability to work in fragile states, their attention to climate change and environmental sustainability, and their focus on poor countries. These criteria were grouped together into an index called “Contribution to UK development objectives”. The framework also included criteria which relate to the organisations’ behaviours and values that will drive the very best performance – such as transparency, whether or not cost and value consciousness and ambition for results are driving forces in the organisation, whether there are sound management and accountability systems, whether the organisations work well in partnership with others and whether or not financial resource management systems and instruments help to maximise impact. These were grouped together into an index called “Organisational strengths”. Value for money for UK aid was assessed on the basis of performance against both indices. So, for example, organisations with a strong overall performance against both indices were judged to offer very good value for money for UK aid, while those with a weak or unsatisfactory performance against both indices were deemed to offer poor value for money.”
    • [RD comment] In the methodology chapter the authors explain / claim that this approach is based on a 3E view that seeks to give attention to the whole “value for money chain” (née causal chain), from inputs to impacts (which is discussed below). Reading the rest of that chapter, I am not convinced: I think the connection is tenuous, and what exists here is a new interpretation of Value for Money that will not be widely used. That said, I don’t envy the task the authors of this report were faced with.
    • [RD comment] The Bilateral Aid Review makes copious references to Value for Money, but there is no substantive discussion of what it means anywhere in the review. Annex D includes a proposal format which includes a section for providing Value for Money information in 200 words. This includes the following fields, which are presumably explained elsewhere: Qualitative judgement of vfm, vfm metrics (including cost-benefit measures), Unit costs, Scalability, Comparators, Overall VfM RAG rating: red/amber/green.
  • Aid effectiveness and value for money aid: complementary or divergent agendas as we head towards HLF-4 (March 2011). This ODI, ActionAid and UK Aid Network public event was called “to reflect on approaches dominating the debate in advance of the OECD’s 4th High Level Forum on Aid Effectiveness (HLF-4); explore the degree to which they represent complimentary or divergent agendas; and discuss how they might combine to help ensure that HLF-4 is a turning point in the future impact of aid.” The presentations of three of the four speakers are available on this site. Unfortunately DFID’s presentation, by Liz Ditchburn – Director, Value for Money, DFID, is not available.
  • BOND Value for Money event (3 February 2011). “Bond hosted a half day workshop to explore this issue in more depth. This was an opportunity to take stock of the debates on Value for Money in the sector, to hear from organisations that have trialled approaches to Value for Money and to learn more about DFID’s interpretation of Value for Money from both technical and policy perspectives.” Presentations were made by (and are available): Oxfam, VSO, WaterAid, HIV/AIDS Alliance, and DFID (Jo Abbot, Deputy Head Civil Society Department). There was also a prior BOND event in January 2011 on Value for Money, and presentations are also available, including an undated National Audit Office Analytical framework for assessing Value for Money
    • [RD Comment] The DFID presentation on “Value for Money and Civil Society” is notable in the way that it seeks to discourage NGOs from over-investing in efforts to measure Value for Money, and its emphasis on the continuity of DFID’s approach to assessing CSO proposals. The explanation of Value for Money is brief, captured in two statements: “optimal use of resources to get desired outcomes” and “maximum benefit for the resources requested”. To me this reads as efficiency and cost-effectiveness.
  • The Independent Commission for Aid Impact (ICAI)’s January 2011 online consultation contrasts Value for Money reviews with Evaluations, Reviews and Investigations, as follows.
    • Value for money reviews: judgements on whether value for money has been secured in the area under examination. Value for money reviews will focus on the use of resources for development interventions.
    • Evaluations: the systematic and objective assessment of an on-going or complete development intervention, its design, implementation and results. Evaluations will focus on the outcome of development interventions.
    • Reviews: assessments of the performance of an intervention, periodically or on an ad hoc basis. Reviews tend to look at operational aspects and focus on the effectiveness of the processes used for development interventions.
    • Investigations: a formal inquiry focusing on issues around fraud and corruption.
      • [RD comment] The ICAI seems to take a narrower view than the National Audit Office, focusing on economy and efficiency and leaving out effectiveness – which within its perspective would be covered by evaluations.

 

2010

  • Measuring the Impact and Value for Money of Governance & Conflict Programmes Final Report December 2010 by Chris Barnett, Julian Barr, Angela Christie,  Belinda Duff, and Shaun Hext. “The specific objective stated for our work on value for money (VFM) in the Terms of Reference was: “To set out how value for money can best be measured in governance and conflict programming, and whether the suggested indicators have a role in this or not”. This objective was taken to involve three core tasks: first, developing a value for money approach that applies to both the full spectrum of governance programmes, and those programmes undertaken in conflict-affected and failed or failing states; second, that the role of a set of suggested indicators should be explored and examined for their utility in this approach, and, further, that existing value for money frameworks (such as the National Audit Office’s use of the 3Es of ‘economy, efficiency and effectiveness’) should be incorporated, as outlined in the Terms of Reference.”
  • Value for Money: How are other donors approaching ‘value for money’ in their aid programming? Question and answer on the Governance and Social Development Resource Centre Help Desk, 17 September 2010.
  • Value for Money (VfM) in International Development NEF Consulting Discussion Paper, September 2010. Some selective quotes: “While the HM Treasury Guidance provides principles for VfM assessments, there is currently limited guidance on how to operationalise these in the international development sector or public sector more generally. This has led to confusion about how VfM assessments should be carried out and seen the proliferation of a number of different approaches.” …”The HM Treasury guidance should inform the VfM framework of any publicly-funded NGO in the development sector. The dark blue arrow in Figure 1 shows the key relationship that needs to be assessed to determine VfM. In short, this defines VfM as: VfM = value of positive + negative outcomes / investment (or cost)”
  • [RD Comment:] Well now, having that formula makes it so much easier (not), all we have to do is find the top values, add them up, then divide by the bottom value :-(
  • What is Value for Money? (July 2010) by the Improvement Network (Audit Commission, Chartered Institute of Public Finance and Accountancy (CIPFA), Improvement and Development Agency (IDeA), Leadership Centre for Local Government, NHS Institute for Innovation and Improvement).  “VfM is about achieving the right local balance between economy, efficiency and effectiveness, the 3Es – spending less, spending well and spending wisely” These three attributes are each related to different stages of aid delivery, from inputs to outcomes, via this diagram.
  • [RD comment]: Reading this useful page raises two interesting questions. Firstly, how does this framework relate to the OECD/DAC evaluation criteria? Is it displacing them, as far as DFID is concerned? It appears so, given its appearance in the Terms of Reference for the contractors who will do the evaluation work for the new Independent Commission for Aid Impact. Ironically, the Improvement Network makes the following comments about the third E (effectiveness), which suggests that the DAC criteria may be re-emerging within this new framework: “Outcomes should be equitable across communities, so effectiveness measures should include aspects of equity, as well as quality. Sustainability is also an increasingly important aspect of effectiveness.” The second interesting question is how Value for Money is measured in aggregate, taking into account all three Es. Part of the challenge is with effectiveness, where it is noted that effectiveness “is a measure of the impact that has been achieved, which can be either quantitative or qualitative.” Then there is the notion that Value for Money is about a “balance” of the three Es. “VfM is high when there is an optimum balance between all three elements – when costs are relatively low, productivity is high and successful outcomes have been achieved.” On the route to that heaven there are multiple possible combinations of states of economy (+,-), efficiency (+,-) and effectiveness (+,-). There is no one desired route or ranking. Because of these difficulties Sod’s Law will probably apply and attention will focus on what is easiest to measure, i.e. economy or, at the most, efficiency. This approach seems to be evident in earlier government statements about DFID: “International Development Minister Gareth Thomas yesterday called for a push on value for money in the UN system with a target of 25% efficiency savings.” … “The UK is holding to its aid commitments of 0.7% of GNI. But for the past five years we have been expected to cut 5% from our administration or staffing costs across Government. 5% – year on year”

 

2007

 

2003

 

The Editor’s suggestion

1. Don’t seek to create an absolute measure of the Value for Money of a single activity/project/program/intervention.

2. Instead, create a relative measure of the VfM found within a portfolio of activities, by using a rank correlation. [This measure can then be used to compare VfM across different types of portfolios.]

  • 1. Rank the entities (activities/projects…) by cost of the inputs, and
    • Be transparent about which costs were included/excluded (e.g. partners’ own costs, other donor contributions, etc.)
  • 2. Rank the same set of entities by their perceived effectiveness or impact (depending on the time span of interest)
    • Ideally this ranking would be done through a participatory ranking process (see Refs below), and information would be available on the stakeholders who were involved
    • Where multiple stakeholder groups were consulted, any aggregation of their rankings would be done using transparent weighting values and information would also be available on the Standard Deviation of the rankings given to the different entities. There is likely to be more agreement across stakeholders on some rankings than others.
    • Supplementary information would be available detailing how stakeholders explained their ranking. This is best elicited through pair comparisons of adjacent sets of ranked entities.
      • That explanation is likely to include a mix of:
        • some kinds of impacts being more valued by the stakeholders than others, and
        • for a given type of impact there being evidence of more rather than less of that kind of impact, and
        • where a given impact is on the same scale, there being better evidence of that impact
  • 3. Calculate the rank correlation between the two sets of rankings (see the code sketch after this list). The results will range between these two extremities:
    • A high positive correlation (e.g. +0.90): here the highest impact is associated with the highest cost ranking, and the lowest impact is associated with the lowest cost ranking. Results are proportionate to investments. This would be the more preferred finding, compared to
    • A high negative correlation (e.g. -0.90): here the highest impact is associated with the lowest cost ranking, but the lowest impact is associated with the highest cost ranking. Here the more you increase your investment, the less you gain. This is the worst possible outcome.
    • In between will be correlations closer to zero, where there is no evident relationship between cost and impact ranking.
  • 4. Opportunities for improvement would be found by doing case studies of “outliers”, found when the two rankings are plotted against each other in a graph. Specifically:
    • Positive cases, whose rank position on cost is conspicuously lower than their rank position on impact.
    • Negative cases, whose rank position on impact is conspicuously lower than their rank position on cost.
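A minimal code sketch of steps 1 to 4 above. The portfolio, costs and impact rankings are entirely hypothetical, and SciPy is assumed to be available:

```python
# A minimal sketch of the rank-correlation approach, using hypothetical figures.
from scipy.stats import spearmanr, rankdata

projects = ["A", "B", "C", "D", "E", "F", "G"]
cost    = [120, 90, 210, 560, 340, 410, 150]   # step 1: cost of inputs ($000s)
impact  = [3, 6, 2, 7, 5, 1, 4]                # step 2: impact rank (7 = highest)

# Step 3: rank correlation between the cost and impact rankings.
rho, _ = spearmanr(cost, impact)
print(f"Spearman rank correlation between cost and impact: {rho:.2f}")

# Step 4: flag outliers for follow-up case studies - entities whose impact
# rank differs markedly from their cost rank.
cost_rank, impact_rank = rankdata(cost), rankdata(impact)
for name, c_r, i_r in zip(projects, cost_rank, impact_rank):
    gap = i_r - c_r   # positive = more impact than its cost rank would suggest
    if abs(gap) >= 3:
        kind = "positive" if gap > 0 else "negative"
        print(f"{name}: impact rank {i_r:.0f} vs cost rank {c_r:.0f} ({kind} outlier)")
```

The same coefficient can be obtained in Excel by ranking both columns (e.g. with RANK.AVG) and applying CORREL to the two rank columns, or via the UNISTAT plugin mentioned in the PS below.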

PS: It would be important to disclose the number of entities that have been ranked. The more entities there are being ranked, the more precise the rank correlation will be. However, the more entities there are to rank, the harder it will be for participants and the more likely they will be to use tied ranks. A minimum of seven rankable entities would seem desirable.

For more on participatory ranking methods see:

PS: There is a UNISTAT plugin for Excel that will produce rank correlations, plus much more.
