Order and Diversity: Representing and Assisting Organisational Learning in Non-Government Aid Organisations.

Posted on 23 July, 2017 – 10:38 AM

No, history did not begin three years ago ;-)

“It was twenty years ago today…” well, almost. Here is a link to my 1998 PhD thesis of the above title. It was based on field work I carried out in Bangladesh between 1992 and 1995. Chapter 8 describes the first implementation of what later became the Most Significant Change (MSC) impact monitoring technique. But there is a lot more of value in this thesis as well, including an analysis of the organisational learning literature up to that date, an analysis of the Bangladesh NGO sector in the early 1990s, and a summary of thinking about evolutionary epistemology. Unlike all too many PhDs, this one was useful, even for the immediate subjects of my field work. CCDB was still using the impact monitoring process I helped them set up (i.e. MSC) when I visited them again in the early 2000s, albeit with some modifications to suit its expanded use.

Abstract: The aim of this thesis is to develop a coherent theory of organisational learning which can generate practical means of assisting organisational learning. The thesis develops and applies this theory to one class of organisations known as non-government organisations (NGOs), and more specifically to those NGOs who receive funds from high income countries but who work for the benefit of the poor in low income countries. Of central concern are the processes whereby these NGOs learn from the rural and urban poor with whom they work.
The basis of the theory of organisational learning used in this thesis is modern evolutionary theory, and more particularly, evolutionary epistemology. It is argued that this theory provides a means of both representing and assisting organisational learning. Firstly, it provides a simple definition of learning that can be operationalised at multiple scales of analysis: that of individuals, organisations, and populations of organisations. Differences in the forms of organisational learning that do take place can be represented using a number of observable attributes of learning which are derived from an interpretation of evolutionary theory. The same evolutionary theory can also provide useful explanations of processes thus defined and represented. Secondly, an analysis of organisational learning using these observable attributes and background theory also suggests two ways in which organisational learning can be assisted. One is the use of specific methods within NGOs: a type of participatory monitoring. The second is the use of particular interventions by their donors: demands for particular types of information which are indicative of how and where the NGO is learning. In addition to these practical implications, it is argued that a specific concern with organisational learning can be related to a wider problematic which should be of concern to Development Studies: one which is described as “the management of diversity”. Individual theories, organisations, and larger social structures may not survive in the face of diversity and change. In surviving they may constrain and / or enable other agents, with feedback effects into the scale and forms of diversity possible. The management of diversity can be analysed descriptively and prescriptively, at multiple scales of aggregation.



Twitter posts tagged as #evaluation

Posted on 13 July, 2017 – 8:58 AM

This post should feature a continually updated feed of all Twitter tweets tagged as: #evaluation



Posted on 8 May, 2017 – 7:37 PM

Kenya Heard, Elisabeth O’Toole, Rohit Naimpally, Lindsey Bressler. J-PAL North America, April 2017. pdf copy here

Randomized evaluations, also called randomized controlled trials (RCTs), have received increasing attention from practitioners, policymakers, and researchers due to their high credibility in estimating the causal impacts of programs and policies. In a randomized evaluation, a random selection of individuals from a sample pool is offered a program or service, while the remainder of the pool does not receive an offer to participate in the program or service. Random assignment ensures that, with a large enough sample size, the two groups (treatment and control) are similar on average before the start of the program. Since members of the groups do not differ systematically at the outset of the experiment, any difference that subsequently arises between the groups can be attributed to the intervention rather than to other factors.
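The balancing property described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the sample pool, the `income` baseline variable and the `assign` helper are all made up for illustration and do not come from the J-PAL document.

```python
import random
from statistics import mean


def assign(units, treat_share=0.5, seed=0):
    """Randomly split units into (treatment, control) groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * treat_share)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample pool: 10,000 people with an invented baseline income.
pool_rng = random.Random(42)
pool = [{"id": i, "income": pool_rng.gauss(100, 20)} for i in range(10_000)]

treatment, control = assign(pool, seed=1)

# With a large enough sample, the two groups should look alike before the
# programme starts, so the gap in average baseline income should be close to 0.
baseline_gap = mean(p["income"] for p in treatment) - mean(p["income"] for p in control)
print(len(treatment), len(control), round(baseline_gap, 2))
```

Re-running with a much smaller pool (say 20 people) should show noticeably larger baseline gaps, which is why the "large enough sample size" condition matters.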

Researchers, practitioners, and policymakers face many real-world challenges while designing and implementing randomized evaluations. Fortunately, several of these challenges can be addressed by designing a randomized evaluation that accommodates existing programs and addresses implementation challenges.

Program design challenges: Certain features of a program may present challenges to using a randomized evaluation design. This document showcases four of these program features and demonstrates how to alter the design of an evaluation to accommodate them.
• Resources exist to extend the program to everyone in the study area
• Program has strict eligibility criteria
• Program is an entitlement
• Sample size is small

Implementation challenges: There are a few challenges that may threaten a randomized evaluation when a program or policy is being implemented. This document features two implementation challenges and demonstrates how to design a randomized evaluation that mitigates threats and eliminates difficulties in the implementation phase of an evaluation.
• It is difficult for service providers to adhere to random assignment due to logistical or political reasons
• The control group finds out about the treatment, benefits from the treatment, or is harmed by the treatment


INTRODUCTION
TABLE OF CONTENTS
PROGRAM DESIGN CHALLENGES
Challenge #1: Resources exist to extend the program to everyone in the study area
Challenge #2: Program has strict eligibility criteria
Challenge #3: Program is an entitlement
Challenge #4: Sample size is small
IMPLEMENTATION CHALLENGES
Challenge #5: It is difficult for service providers to adhere to random assignment due to logistical or political reasons
Challenge #6: Control group finds out about the treatment, benefits from the treatment, or is harmed by the treatment
SUMMARY TABLE
GLOSSARY
REFERENCES



Riddle me this: How many interviews (or focus groups) are enough?

Posted on 8 May, 2017 – 7:37 PM

Emily Namey, R&E Search for Evidence http://researchforevidence.fhi360.org/author/enamey

“The first two posts in this series describe commonly used research sampling strategies and provide some guidance on how to choose from this range of sampling methods. Here we delve further into the sampling world and address sample sizes for qualitative research and evaluation projects. Specifically, we address the often-asked question: How many in-depth interviews/focus groups do I need to conduct for my study?

Within the qualitative literature (and community of practice), the concept of “saturation” – the point when incoming data produce little or no new information – is the well-accepted standard by which sample sizes for qualitative inquiry are determined (Guest et al. 2006; Guest and MacQueen 2008). There’s just one small problem with this: saturation, by definition, can be determined only during or after data analysis. And most of us need to justify our sample sizes (to funders, ethics committees, etc.) before collecting data!

Until relatively recently, researchers and evaluators had to rely on rules of thumb or their personal experiences to estimate how many qualitative data collection events they needed for a study; empirical data to support these sample sizes were virtually non-existent. This began to change a little over a decade ago. Morgan and colleagues (2002) decided to plot (and publish!) the number of new concepts identified in successive interviews across four datasets. They found that nearly no new concepts were found after 20 interviews. Extrapolating from their data, we see that the first five to six in-depth interviews produced the majority of new data, and approximately 80% to 92% of concepts were identified within the first 10 interviews.”
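The kind of plot described above (new concepts identified in successive interviews) can be sketched in a few lines. This is a minimal illustration, not the method used by Morgan and colleagues: the interview codes below are invented, and the function names are hypothetical.

```python
def new_concepts_per_interview(interviews):
    """Count how many previously unseen concepts each successive interview adds."""
    seen = set()
    counts = []
    for codes in interviews:
        fresh = set(codes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def cumulative_share(counts):
    """Share of all concepts (found across the whole data set) identified so far."""
    total = sum(counts)
    running = 0
    shares = []
    for count in counts:
        running += count
        shares.append(running / total)
    return shares


# Hypothetical coded interviews: each set holds the concepts raised in one interview.
interviews = [
    {"cost", "distance", "trust"},
    {"cost", "stigma"},
    {"trust", "distance"},
    {"stigma", "childcare"},
    {"cost"},
]

counts = new_concepts_per_interview(interviews)
shares = cumulative_share(counts)
print(counts)  # [3, 1, 0, 1, 0]
print(shares)  # [0.6, 0.8, 0.8, 1.0, 1.0]
```

Note that the shares are relative to the concepts found in the whole data set, so they can only be computed during or after analysis, which is exactly the problem with saturation that the post points out.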

Emily’s blog continues here http://researchforevidence.fhi360.org/riddle-me-this-how-many-interviews-or-focus-groups-are-enough


How to find the right answer when the “wisdom of the crowd” fails?

Posted on 9 April, 2017 – 6:39 PM

Dizikes, P. (2017). Better wisdom from crowds. MIT News Office. Retrieved from http://news.mit.edu/2017/algorithm-better-wisdom-crowds-0125  PDF copy

Ross, E. (2017). How to find the right answer when the “wisdom of the crowd” fails. Nature News. https://doi.org/10.1038/nature.2017.21370

Prelec, D., Seung, H. S., & McCoy, J. (2017). A solution to the single-question crowd wisdom problem. Nature, 541(7638), 532–535. https://doi.org/10.1038/nature21054

Dizikes: “The wisdom of crowds is not always perfect. But two scholars at MIT’s Sloan Neuroeconomics Lab, along with a colleague at Princeton University, have found a way to make it better. Their method, explained in a newly published paper, uses a technique the researchers call the “surprisingly popular” algorithm to better extract correct answers from large groups of people. As such, it could refine “wisdom of crowds” surveys, which are used in political and economic forecasting, as well as many other collective activities, from pricing artworks to grading scientific research proposals.

The new method is simple. For a given question, people are asked two things: What they think the right answer is, and what they think popular opinion will be. The variation between the two aggregate responses indicates the correct answer. [Ross: In most cases, the answers that exceeded expectations were the correct ones. Example: If Answer A was given by 70% but 80% expected it to be given and Answer B was given by 30% but only 20% expected it to be given then Answer B would be the “surprisingly popular” answer].

“In situations where there is enough information in the crowd to determine the correct answer to a question, that answer will be the one [that] most outperforms expectations,” says paper co-author Drazen Prelec, a professor at the MIT Sloan School of Management as well as the Department of Economics and the Department of Brain and Cognitive Sciences.
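Ross's worked example above can be written out as a short sketch. This is a simplified reading of the "surprisingly popular" idea (pick the answer whose actual vote share most exceeds the share people predicted for it), not the full procedure in Prelec et al.; the function name and data layout are assumptions made for illustration.

```python
def surprisingly_popular(actual_share, predicted_share):
    """Return the answer whose actual vote share most exceeds its predicted share."""
    return max(actual_share, key=lambda answer: actual_share[answer] - predicted_share[answer])


# Shares taken from the example above: what people answered,
# and what they expected the crowd to answer.
actual = {"A": 0.70, "B": 0.30}
predicted = {"A": 0.80, "B": 0.20}

winner = surprisingly_popular(actual, predicted)
print(winner)  # B: it got 30% of votes when only 20% was expected
```

A simple majority vote would have chosen A here, which is the situation in which the two methods disagree.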

The paper is built on both theoretical and empirical work. The researchers first derived their result mathematically, then assessed how it works in practice, through surveys spanning a range of subjects, including U.S. state capitals, general knowledge, medical diagnoses by dermatologists, and art auction estimates.

Across all these areas, the researchers found that the “surprisingly popular” algorithm reduced errors by 21.3 percent compared to simple majority votes, and by 24.2 percent compared to basic confidence-weighted votes (where people express how confident they are in their answers). And it reduced errors by 22.2 percent compared to another kind of confidence-weighted votes, those taking the answers with the highest average confidence levels.”

But “… Prelec and Steyvers both caution that this algorithm won’t solve all of life’s hard problems. It only works on factual topics: people will have to figure out the answers to political and philosophical questions the old-fashioned way.”

Rick Davies comment: This method could be useful in an evaluation context, especially where participatory methods are needed or potentially useful.


Fact Checking websites serving as public evidence-monitoring services: Some sources

Posted on 2 March, 2017 – 7:42 AM

These services seem to be getting more attention lately, so I thought it would be worthwhile compiling a list of some of the kinds of fact checking websites that exist, and how they work.

Fact checkers have the potential to influence policies at all stages of the policy development and implementation process, not by promoting particular policy positions based on evidence, but by policing the boundaries of what should be considered as acceptable as factual evidence. They are responsive rather than pro-active.


American websites

  • PolitiFact: a fact-checking website that rates the accuracy of claims by elected officials and others who speak up in American politics.
  • Fact Check: monitors the factual accuracy of what is said by major U.S. political players in the form of TV ads, debates, speeches, interviews and news releases.
  • Media Bias / Fact Check: claims to be “the most comprehensive media bias resource on the internet”, but its content is mainly American.


United Kingdom

Discussions of the role of fact checkers

A related item, just seen…

  • This site is “taking the edge off rant mode” by making readers pass a factual knowledge quiz before commenting. “If everyone can agree that this is what the article says, then they have a much better basis for commenting on it.”

Update 20/03/2017: Read Tim Harford’s blog posting on The Problem With Facts (pdf copy here), and the communication value of eliciting curiosity.


Integrating Big Data into the Monitoring and Evaluation of Development Programmes

Posted on 24 January, 2017 – 7:39 PM
Bamberger, M. (2016). Integrating Big Data into the Monitoring and Evaluation of Development Programmes. United Nations Global Pulse. Retrieved from http://unglobalpulse.org/big-data-monitoring-and-evaluation-report  PDF copy available

Context: “This report represents a basis for integrating big data and data analytics in the monitoring and evaluation of development programmes. The report proposes a Call to Action, which hopes to inspire development agencies and particularly evaluators to collaborate with data scientists and analysts in the exploration and application of new data sources, methods, and technologies. Most of the applications of big data in international development do not currently focus directly on monitoring, and even less on evaluation. Instead they relate more to research, planning and operational use using big data. Many development agencies are still in the process of defining their policies on big data and it can be anticipated that applications to the monitoring and evaluation of development programmes will start to be incorporated more widely in the near future. This report includes examples and ways that big data, together with related information and communications technologies (ICTs) are already being used in programme monitoring, evaluation and learning. The data revolution has been underway for perhaps a decade now. One implication for international development is that new sources of real–time information about people are for the first time available and accessible. In 2015, in an unprecedented, inclusive and open process, 193 member states of the United Nations adopted, by consensus, the 2030 Agenda for sustainable development. The 17 Sustainable Development Goals (SDGs) contained in the 2030 Agenda constitute a transformative plan for people, planet, prosperity, partnerships and peace. All of these factors are creating a greater demand for new complexity–responsive evaluation designs that are flexible, cost effective and provide real–time information.
At the same time, the rapid and exciting developments in the areas of new information technology (big data, information and communications technologies) are creating the expectation, that the capacity to collect and analyse larger and more complex kinds of data, is increasing. The report reviews the opportunities and challenges for M&E in this new, increasingly digital international development context. The SDGs are used to illustrate the need to rethink current approaches to M&E practices, which are no longer able to address the complexities of evaluation and interaction among the 17 Goals. This endeavour hopes to provide a framework for monitoring and evaluation practitioners in taking advantage of the data revolution to improve the design of their programmes and projects to support the achievement of the Sustainable Development Goals and the 2030 Agenda.”

Rick Davies comment: As well as my general interest in this paper, I have two particular interests in its contents. One is what it says about small (rather than big) data and how big data analysis techniques may be relevant to the analysis of small data sets. In my experience many development agencies have rather small data sets, which are often riddled with missing data points. The other is what the paper has to say about predictive analytics, a field of analysis (within data mining defined more widely) that I think has a lot of relevance to M&E of development programmes.

Re the references to predictive analytics, I was disappointed to see this explanation on page 48: “Predictive analytics (PA) uses patterns of associations among variables to predict future trends. The predictive models are usually based on Bayesian statistics and identify the probability distributions for different outcomes”. In my understanding, Bayesian classification algorithms are only one of a number of predictive analytics tools which generate classifications (read: predictive models). Here are some classifications of the different algorithms that are available: (a) Example A, focused on classification algorithms, with some limitations; (b) Example B, looking at classification algorithms within the wider ambit of data mining methods, from Maimon and Rokach (2010; p.6). Bamberger’s narrow definition is unfortunate because there are simpler and more transparent methods available, such as Decision Trees, which would be easier for many evaluators to use and whose results could be more easily communicated to their clients.
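To give a feel for why such methods are transparent, here is a minimal sketch of the simplest possible case: a one-level decision tree that picks the single yes/no attribute that best predicts an outcome in a small data set. It is not a full Decision Tree algorithm, and the project data, attribute names and `best_single_split` function are all hypothetical.

```python
def best_single_split(cases, attributes, outcome):
    """Find the binary attribute (and direction) that best predicts the outcome.

    Returns (attribute, value_that_predicts_success, accuracy).
    """
    best = None
    for attr in attributes:
        for predict_when in (True, False):
            # The rule predicts success whenever the attribute equals predict_when.
            correct = sum((case[attr] == predict_when) == case[outcome] for case in cases)
            accuracy = correct / len(cases)
            if best is None or accuracy > best[2]:
                best = (attr, predict_when, accuracy)
    return best


# Hypothetical small data set: six projects, two attributes, one outcome.
projects = [
    {"local_staff": True, "donor_visits": True, "success": True},
    {"local_staff": True, "donor_visits": False, "success": True},
    {"local_staff": True, "donor_visits": True, "success": True},
    {"local_staff": False, "donor_visits": True, "success": False},
    {"local_staff": False, "donor_visits": False, "success": False},
    {"local_staff": True, "donor_visits": False, "success": False},
]

rule = best_single_split(projects, ["local_staff", "donor_visits"], "success")
print(rule)  # local_staff=True predicts success in 5 of the 6 cases
```

The resulting rule can be read out in plain language ("projects with local staff tend to succeed"), which is the communication advantage referred to above. A full decision tree would repeat this kind of split within each branch.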

Re my first interest, in small data, I was more pleased to see this statement: “While some data analytics are based on the mining of very large data sets with very large numbers of cases and variables, it is also possible to apply many of the techniques such as predictive modelling with smaller data sets”. This heightens the importance of clearly spelling out the different ways in which predictive analytics work can be done.

I also agreed with the follow-on paragraph: “While predictive analytics are well developed, much less progress has been made on causal (attribution) analysis. Commercial predictive analytics tends to focus on what happened, or is predicted to happen (e.g. click rates on web sites), with much less attention to why outcomes change in response to variations in inputs (e.g. the wording or visual presentation of an on–line message). From the evaluation perspective, a limitation of predictive analysis is that it is not normally based on a theoretical framework, such as a theory of change, which explains the process through which outcomes are likely to be achieved. This is an area where there is great potential for collaboration between big data analytics and current impact evaluation methodologies”. My approach to connecting these two types of analysis is explained on the EvalC3 website. This involves connecting cross-case analysis (using predictive analytics tools, for example) to within-case analysis (using process tracing or simpler tools, for example) through carefully thought-through case selection and comparison strategies.

My interest and argument for focusing more on small data was reinforced when I saw this plausible and likely situation: “The limited access of many agencies to big data is another major consideration” (p69) – not a minor issue in a paper on the use and uses of big data! Though the paper does highlight the many and varied sources that are becoming increasingly available, and the risks and opportunities associated with their use.


Monitoring and Evaluation in Health and Social Development: Interpretive and Social Development Perspectives

Posted on 17 January, 2017 – 4:47 PM

Edited by Stephen Bell and Peter Aggleton. Routledge 2016. View on Google Books

interpretive researchers thus attempt to understand phenomena through accessing the meanings participants assign to them

“...interpretive and ethnographic approaches are side-lined in much contemporary evaluation work and current monitoring and evaluation practice remains heavily influenced by more positivist approaches”

attribution is not the only purpose of impact evaluation

Lack of familiarity with qualitative approaches by programme staff and donor agencies also influences the preferences for quantitative methods in monitoring and evaluation work


1. Interpretive and Ethnographic Perspectives – Alternative Approaches to Monitoring and Evaluation Practice

2. The Political Economy of Evidence: Personal Reflections on the Value of the Interpretive Tradition and its Methods

3. Measurement, Modification and Transferability: Evidential Challenges in the Evaluation of Complex Interventions

4. What Really Works? Understanding the Role of ‘Local Knowledges’ in the Monitoring and Evaluation of a Maternal, Newborn and Child Health Project in Kenya

Part 2: Programme Design

5. Permissions, Vacations and Periods of Self-regulation: Using Consumer Insight to Improve HIV Treatment Adherence in Four Central American Countries

6. Generating Local Knowledge: A Role for Ethnography in Evidence-based Programme Design for Social Development

7. Interpretation, Context and Time: An Ethnographically Inspired Approach to Strategy Development for Tuberculosis Control in Odisha, India

8. Designing Health and Leadership Programmes for Young Vulnerable Women Using Participatory Ethnographic Research in Freetown, Sierra Leone

Part 3: Monitoring Processes

9. Using Social Mapping Techniques to Guide Programme Redesign in the Tingim Laip HIV Prevention and Care Project in Papua New Guinea

10. Pathways to Impact: New Approaches to Monitoring and Improving Volunteering for Sustainable Environmental Management

11. Ethnographic Process Evaluation: A Case Study of an HIV Prevention Programme with Injecting Drug Users in the USA

12. Using the Reality Check Approach to Shape Quantitative Findings: Experience from Mixed Method Evaluations in Ghana and Nepal

Part 4: Understanding Impact and Change

13. Innovation in Evaluation: Using SenseMaker to Assess the Inclusion of Smallholder Farmers in Modern Markets

14. The Use of the Rapid PEER Approach for the Evaluation of Sexual and Reproductive Health Programmes

15. Using Interpretive Research to Make Quantitative Evaluation More Effective: Oxfam’s Experience in Pakistan and Zimbabwe

16. Can Qualitative Research Rigorously Evaluate Programme Impact? Evidence from a Randomised Controlled Trial of an Adolescent Sexual Health Programme in Tanzania

Rick Davies comment: [Though this may reflect my reading biases…] It seems like this strand of thinking has not been at the forefront of M&E attention for a long time (i.e. maybe since the 1990s – early 2000s), so it is good to see this new collection of papers, by a large collection of both old and new faces (33 in all).


New books on the pros and cons of algorithms

Posted on 5 January, 2017 – 2:42 PM

Algorithms are means of processing data in ways that can aid our decision making. One of the weak areas of evaluation practice is guidance on data analysis, as distinct from data gathering. In the last year or so I have been searching for useful books on the subject of algorithms – what they are, how they work and the risks and opportunities associated with their use. Here are a couple of books I have found worth reading, plus some blog postings.


Christian, B., & Griffiths, T. (2016). Algorithms To Live By: The Computer Science of Human Decisions. William Collins. An excellent overview of a wide range of types of algorithms and how they work. I have read this book twice and found a number of ideas within it that have been practically useful for me in my work.

O’Neil, C. (2016). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York: Crown Publishing Group. A more depressing book, but a necessary read nevertheless. It highlights the risks posed to human welfare by poorly designed and/or poorly used algorithms. One of the examples cited is labor/staff scheduling algorithms, which very effectively minimize labor costs for employers, but at the cost of employees not being able to predictably schedule child care, second jobs or part-time further education, thus in effect locking those people into membership of a low-cost labor pool. Some algorithms are able to optimize multiple objectives, e.g. labor costs and labor turnover (representing longer-term costs), but both objectives are still employer focused. Another area of concern is customer segmentation, where algorithms fed on big data sets enable companies to differentially (and non-transparently) price products and services being sold to ever smaller segments of their consumer population. In the insurance market this can mean that instead of the whole population sharing the costs of health insurance risks, which may in real life fall more on some than others, those costs will now be imposed more specifically on those with the high risks (regardless of the origins of those risks: genetic, environmental or an unknown mix).

Ezrachi, A., & Stucke, M. E. (2016). Virtual Competition: The Promise and Perils of the Algorithm-Driven Economy. Cambridge, Massachusetts: Harvard University Press. This one is a more in-depth analysis than the one above, focusing on the implications for how our economies work, and can fail to work.

Blog postings

Kleinberg, J., Ludwig, J., & Mullainathan, S. (2016, December 8). A Guide to Solving Social Problems with Machine Learning. Retrieved January 5, 2017, from Harvard Business Review website. A blog posting, easy to read and informative.

Knight, Will, (2016, November 23) How to Fix Silicon Valley’s Sexist Algorithms, MIT Technology Review

Lipton, Zachary Chase (2016) The foundations of algorithmic bias. KDnuggets

Nicholas Diakopoulos and Sorelle Friedler (2016, November 17) How to Hold Algorithms Accountable,  MIT Technology Review. Algorithmic systems have a way of making mistakes or leading to undesired consequences. Here are five principles to help technologists deal with that.



Dealing with missing data: A list

Posted on 20 November, 2016 – 12:48 PM

In this post “missing data” does not mean absence of whole categories of data, which is a common enough problem, but missing data values within a given data set.

While this is a common problem in almost all spheres of research/evaluation, it seems particularly common in more qualitative and participatory inquiry, where the same questions may not be asked of all participants/respondents. It is also likely to be a problem when data is extracted from documentary sources produced by different parties, e.g. project completion reports.

Some types of strategies (from Analytics Vidhya):

  1. Deletion:
    1. Listwise deletion: All cases with any missing data are removed from the analysis.
    2. Pairwise deletion: An analysis is carried out with all cases in which the variable of interest is present. The sub-set of cases used will vary according to the sub-set of variables which are the focus of each analysis.
  2. Substitution
    1. Mean/ Mode/ Median Imputation: replacing the missing data for a given attribute by the mean or median (quantitative attribute) or mode (qualitative attribute) of all known values of that variable. Two variants:
      1. Generalized: Done for all cases
      2. Similar case: calculated separately for different sub-groups e.g. men versus women
    2. K Nearest Neighbour (KNN) imputation: The missing values of an attribute are imputed using those found in the other cases that are most similar on their other attributes (where k = the number of most similar cases used).
    3. Prediction model: Using a sub-set of cases with no missing values, a model is developed that best predicts the value of the attribute of interest. This is then applied to predict the missing values in the sub-set of cases with the missing values. Another variant, for continuous data:
      1. Regression Substitution: Using multiple-regression analysis to estimate a missing value.
  3. Error estimation (tbc)
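Some of the strategies listed above can be sketched with nothing more than the Python standard library. This is an illustrative sketch only: the data set, field names and function names are invented, missing values are represented as `None`, and a real analysis would more likely use a statistics package.

```python
from statistics import mean


def listwise_delete(rows):
    """Drop every case that has any missing value."""
    return [r for r in rows if None not in r.values()]


def mean_impute(rows, field):
    """Generalized mean imputation: fill gaps with the mean of all known values."""
    fill = mean(r[field] for r in rows if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def group_mean_impute(rows, field, group_field):
    """Similar-case imputation: fill gaps with the mean of the case's own sub-group.

    Assumes every sub-group has at least one known value.
    """
    known = {}
    for r in rows:
        if r[field] is not None:
            known.setdefault(r[group_field], []).append(r[field])
    return [
        {**r, field: mean(known[r[group_field]]) if r[field] is None else r[field]}
        for r in rows
    ]


def knn_impute(rows, field, features, k=2):
    """KNN imputation: fill gaps with the mean of the k most similar complete cases."""
    complete = [r for r in rows if r[field] is not None]

    def distance(a, b):
        # Euclidean distance over the chosen (numeric) features.
        return sum((a[f] - b[f]) ** 2 for f in features) ** 0.5

    result = []
    for r in rows:
        if r[field] is None:
            nearest = sorted(complete, key=lambda c: distance(r, c))[:k]
            r = {**r, field: mean(n[field] for n in nearest)}
        result.append(r)
    return result


# Hypothetical survey data; the last respondent's income is missing.
rows = [
    {"sex": "F", "age": 30, "years_school": 8, "income": 120},
    {"sex": "F", "age": 32, "years_school": 9, "income": 140},
    {"sex": "M", "age": 50, "years_school": 4, "income": 200},
    {"sex": "M", "age": 52, "years_school": 5, "income": 220},
    {"sex": "F", "age": 31, "years_school": 8, "income": None},
]

print(len(listwise_delete(rows)))                              # 4
print(mean_impute(rows, "income")[-1]["income"])               # 170
print(group_mean_impute(rows, "income", "sex")[-1]["income"])  # 130
print(knn_impute(rows, "income", ["age", "years_school"])[-1]["income"])  # 130
```

In this made-up example the generalized mean (170) is pulled up by the two male respondents, while the sub-group and KNN estimates (130) draw only on the cases most like the one with the missing value.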

References (please help me extend this list)

Note: I would like this list to focus on easily usable references i.e. those not requiring substantial knowledge of statistics and/or the subject of missing data

