Computational Modelling of Public Policy: Reflections on Practice

Gilbert G, Ahrweiler P, Barbrook-Johnson P, et al. (2018) Computational Modelling of Public Policy: Reflections on Practice. Journal of Artificial Societies and Social Simulation 21(1): 14. pdf copy available

Abstract: Computational models are increasingly being used to assist in developing, implementing and evaluating public policy. This paper reports on the experience of the authors in designing and using computational models of public policy (‘policy models’, for short). The paper considers the role of computational models in policy making, and some of the challenges that need to be overcome if policy models are to make an effective contribution. It suggests that policy models can have an important place in the policy process because they could allow policy makers to experiment in a virtual world, and have many advantages compared with randomised control trials and policy pilots. The paper then summarises some general lessons that can be extracted from the authors’ experience with policy modelling. These general lessons include the observation that often the main benefit of designing and using a model is that it provides an understanding of the policy domain, rather than the numbers it generates; that care needs to be taken that models are designed at an appropriate level of abstraction; that although appropriate data for calibration and validation may sometimes be in short supply, modelling is often still valuable; that modelling collaboratively and involving a range of stakeholders from the outset increases the likelihood that the model will be used and will be fit for purpose; that attention needs to be paid to effective communication between modellers and stakeholders; and that modelling for public policy involves ethical issues that need careful consideration. The paper concludes that policy modelling will continue to grow in importance as a component of public policy making processes, but if its potential is to be fully realised, there will need to be a melding of the cultures of computational modelling and policy making.

Selected quotes: For these reasons, the ability to make ‘point predictions’, i.e. forecasts of specific values at a specific time in the future, is rarely possible. More possible is a prediction that some event will or will not take place, or qualitative statements about the type or direction of change of values. Understanding what sort of unexpected outcomes
can emerge and something of the nature of how these arise also helps design policies that can be responsive to unexpected outcomes when they do arise. It can be particularly helpful in changing environments to use the model to explore what might happen under a range of possible, but different, potential futures – without any commitment about which of these may eventually transpire. Even more valuable is a finding that the model shows that certain outcomes could not be achieved given the assumptions of the model. An example of this is the use of a whole system energy model to develop scenarios that meet the decarbonisation goals set by the EU for 2050 (see, for example, RAENG 2015).

Rick Davies comment: A concise and very informative summary with many useful references. Definitely worth reading! I like the big emphasis on the need for ongoing collaboration and communication between model developers and their clients and other model stakeholders. However, I would have liked to see some discussion of the pros and cons of different approaches to modelling, e.g. agent-based models vs Fuzzy Cognitive Mapping and other approaches, not just examples of different modelling applications, useful as they were.

See also: Uprichard, E and Penn, A (2016) Dependency Models: A CECAN Evaluation and Policy Practice Note for policy analysts and evaluators. CECAN. Available at: https://www.cecan.ac.uk/sites/default/files/2018-01/EMMA%20PPN%20v1.0.pdf (accessed 6 June 2018).

Representing Theories of Change: Technical Challenges and Evaluation Consequences

 

CEDIL – Centre for Evaluation Lecture Series
The Centre of Excellence for Development Impact and Learning (CEDIL) and the Centre for Evaluation host a lecture series addressing methods and innovation in primary studies.

Watch the live-streamed lecture here

London School of Hygiene and Tropical Medicine. Lecture Two – Wednesday 30th May 2018 – Dr Rick Davies 12:45-14:00  Jerry Morris B, LSHTM 15-17 Tavistock Place, London, WC1H 9SH

“This lecture will summarise the main points of a paper of the same name. That paper looks at the technical issues associated with the representation of Theories of Change and the implications of design choices for the evaluability of those theories. The focus is on the description of connections between events, rather than the events themselves, because this is seen as a widespread design weakness. Using examples and evidence from a range of Internet sources, six structural problems are described, along with their consequences for evaluation. The paper then outlines six different ways of addressing these problems, which could be used by programme designers and by evaluators. These solutions range from simple-to-follow advice on designing more adequate diagrams, to the use of specialist software for the manipulation of much more complex static and dynamic network models. The paper concludes with some caution, speculating on why the design problems are so endemic but also pointing a way forward. Three strands of work are identified that CEDIL and DFID could invest in to develop solutions identified in the paper.”

The paper referred to in the lecture was commissioned by CEDIL and is now pending publication in a special issue of an evaluation journal.

The Book of Why: The New Science of Cause and Effect

by Judea Pearl, Allen Lane, May 2018

Publisher blurb: “‘Correlation does not imply causation.’ This mantra was invoked by scientists for decades in order to avoid taking positions as to whether one thing caused another, such as smoking and cancer and carbon dioxide and global warming. But today, that taboo is dead. The causal revolution, sparked by world-renowned computer scientist Judea Pearl and his colleagues, has cut through a century of confusion and placed cause and effect on a firm scientific basis. Now, Pearl and science journalist Dana Mackenzie explain causal thinking to general readers for the first time, showing how it allows us to explore the world that is and the worlds that could have been. It is the essence of human and artificial intelligence. And just as Pearl’s discoveries have enabled machines to think better, The Book of Why explains how we can think better.”

Introduction: Mind over data (pdf copy)

Chapter 1: The Ladder of Causation (pdf copy)

Reviews: None found yet, but they will be listed here when found

2020 05 20: New material from Judea Pearl – “On this page I plan to post viewgraphs and homeworks that I currently use in teaching CAUSALITY. So far, we have covered the following chapters”:

http://bayes.cs.ucla.edu/BOOK-2K/viewgraphs.html

The surprising usefulness of simple measures: HappyOrNot terminals

as described in this very readable article by David Owen:
Customer Satisfaction at the Push of a Button – HappyOrNot terminals look simple, but the information they gather is revelatory. New Yorker, 2 February 2018, pages 26-29

Read the full article here

Points of interest covered by the article include:

  1. What is so good about them
  2. Why they work so well
  3. Can people “game” the data that is collected
  4. The value of immediacy of data collection
  5. How value is added to data points by information about location and time
  6. Example of real life large scale applications
  7. What is the worst thing that could happen

Other articles on the same subject:

Rick Davies comment: I like the design of the simple experiment described in the first paragraph of this article. Because the locations of the petrol stations were different, and thus not comparable, the managers swapped the “treatment” given to each station, i.e. the staff they thought were making a difference to the performance of these stations.

Searching for Success: A Mixed Methods Approach to Identifying and Examining Positive Outliers in Development Outcomes

by Caryn Peiffer and Rosita Armytage, April 2018, Developmental Leadership Program Research Paper 52. Available as pdf

Summary: Increasingly, development scholars and practitioners are reaching for exceptional examples of positive change to better understand how developmental progress occurs. These are often referred to as ‘positive outliers’, but also ‘positive deviants’ and ‘pockets of effectiveness’.
Studies in this literature promise to identify and examine positive developmental change occurring in otherwise poorly governed states. However, to identify success stories, such research largely relies on cases’ reputations, and, by doing so, overlooks cases that have not yet garnered a reputation for their developmental progress.

This paper presents a novel three-stage methodology for identifying and examining positive outlier cases that does not rely solely on reputations. It therefore promises to uncover ‘hidden’ cases of developmental progress as well as those that have been recognised.

The utility of the methodology is demonstrated through its use in uncovering two country case studies in which surprising rates of bribery reduction occurred, though the methodology has much broader applicability. The advantage of the methodology is validated by the fact that, in both of the cases identified, the reductions in bribery that occurred were largely previously unrecognised.

Contents: 
Summary
Introduction 1
Literature review: How positive outliers are selected 2
Stage 1: Statistically identifying potential positive outliers in bribery reduction 3
Stage 2: Triangulating statistical data 6
Stage 3: In-country case study fieldwork 7
Promise realised: Uncovering hidden ‘positive outliers’ 8
Conclusion 9
References 11
Appendix: Excluded samples from pooled GCB dataset 13

Rick Davies comment: This is a paper that has been waiting to be published, one that unites a qual and quant approach to identifying AND understanding positive deviance / positive outliers [I do prefer the latter term, promoted by the authors of this paper].

The authors use regression analysis to identify statistical outliers, which is appropriate where numerical data is available. Where the data is binary/categorical, it is possible to use other methods to identify such outliers. See this page on the use of the EvalC3 Excel app to find positive outliers in binary data sets.
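To make the regression step concrete, here is a minimal sketch in Python of how residuals from a simple fitted line can flag candidate positive outliers. The data, variable names and the two-standard-deviation cut-off are all illustrative, not taken from the paper:

    # Illustrative only: flag cases whose bribery rates are much lower than a
    # simple linear model predicts from a single explanatory variable.
    import numpy as np

    rng = np.random.default_rng(0)
    gdp_per_capita = rng.uniform(1_000, 20_000, size=50)      # invented predictor
    bribery_rate = 0.4 - 0.00001 * gdp_per_capita + rng.normal(0, 0.05, size=50)

    # Fit an OLS line and compute residuals (observed minus predicted).
    slope, intercept = np.polyfit(gdp_per_capita, bribery_rate, deg=1)
    residuals = bribery_rate - (intercept + slope * gdp_per_capita)

    # "Positive outliers" here = cases doing much better (less bribery) than predicted.
    threshold = residuals.mean() - 2 * residuals.std()
    print("Candidate positive outliers:", np.where(residuals < threshold)[0])

The statistically flagged cases would then go through the paper's second and third stages (triangulation and fieldwork) before being treated as genuine success stories.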

“How to Publish Statistically Insignificant Results in Economics”

…being the title of a blog posting submitted by David Evans, 03/28/2018, on the World Bank’s Development Impact website here

…and reproduced in full here…


Sometimes, finding nothing at all can unlock the secrets of the universe. Consider this story from astronomy, recounted by Lily Zhao: “In 1823, Heinrich Wilhelm Olbers gazed up and wondered not about the stars, but about the darkness between them, asking why the sky is dark at night. If we assume a universe that is infinite, uniform and unchanging, then our line of sight should land on a star no matter where we look. For instance, imagine you are in a forest that stretches around you with no end. Then, in every direction you turn, you will eventually see a tree. Like trees in a never-ending forest, we should similarly be able to see stars in every direction, lighting up the night sky as bright as if it were day. The fact that we don’t indicates that the universe either is not infinite, is not uniform, or is somehow changing.”

What can “finding nothing” – statistically insignificant results – tell us in economics? In his breezy personal essay, MIT economist Alberto Abadie makes the case that statistically insignificant results are at least as interesting as significant ones. You can see excerpts of his piece below.

In case it’s not obvious from the above, one of Abadie’s key points (in a deeply reductive nutshell) is that results are interesting if they change what we believe (or “update our priors”). With most public policy interventions, there is no reason that the expected impact would be zero. So there is no reason that the only finding that should change our beliefs is a non-zero finding.
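A toy normal–normal updating calculation (with invented numbers, not drawn from any of the papers below) shows why a precisely estimated near-zero result can move a non-zero prior a long way:

    # Toy Bayesian update: the prior expects a positive effect; the study finds a
    # precise estimate near zero. All numbers are invented for illustration.
    prior_mean, prior_var = 0.30, 0.10 ** 2   # prior belief about the effect size
    estimate, se = 0.01, 0.03                 # study estimate and its standard error

    post_var = 1.0 / (1.0 / prior_var + 1.0 / se ** 2)
    post_mean = post_var * (prior_mean / prior_var + estimate / se ** 2)
    print(f"prior mean {prior_mean:.2f} -> posterior mean {post_mean:.2f}")
    # The insignificant but precise estimate pulls the belief from 0.30 to about 0.03,
    # which is exactly why such a result is informative.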

Indeed, a quick review of popular papers (crowdsourced from Twitter) with key results that are statistically insignificantly different from zero showed that the vast majority showed an insignificant result in a context where many readers would expect a positive result.
For example…

  • You think wealth improves health? Not so fast! (Cesarini et al., QJE, 2016)
  • Okay, if wealth doesn’t affect health, maybe you think that education reduces mortality? Nuh-uh! (Meghir, Palme, & Simeonova, AEJ: Applied, 2018)
  • You think going to an elite school improves your test scores? Not! (Abdulkadiroglu, Angrist, & Pathak, Econometrica, 2014)
  • Do you still think going to an elite school improves your test scores, but only in Kenya? No way! (Lucas & Mbiti, AEJ: Applied, 2014)
  • You think increasing teacher salaries will increase student learning? Nice try! (de Ree et al., QJE, 2017)
  • You believe all the hype about microcredit and poverty? Think again! (Banerjee et al., AEJ: Applied, 2015)

and even

  • You think people born on Friday the 13th are unlucky? Think again! (Cesarini et al., Kyklos, 2015)

It also doesn’t hurt if people’s expectations are fomented by active political debate.

  • Do you believe that cutting taxes on individual dividends will increase corporate investment? Better luck next time! (Yagan, AER, 2015)
  • Do you believe that Mexican migrant agricultural laborers drive down wages for U.S. workers? We think not! (Clemens, Lewis, & Postel, AER, forthcoming)
  • Okay, maybe not the Mexicans. But what about Cuban immigrants? Nope! (Card, Industrial and Labor Relations Review, 1990)

In cases where you wouldn’t expect readers to have a strong prior, papers sometimes play up a methodological angle.

  • Do you believe that funding community projects in Sierra Leone will improve community institutions? No strong feelings? It didn’t. But we had a pre-analysis plan which proves we aren’t cherry picking among a thousand outcomes, like some other paper on this topic might do. (Casey, Glennerster, & Miguel, QJE, 2012)
  • Do you think that putting flipcharts in schools in Kenya improves student learning? What, you don’t really have an opinion about that? Well, they don’t. And we provide a nice demonstration that a prospective randomized-controlled trial can totally flip the results of a retrospective analysis. (Glewwe et al., JDE, 2004)

Sometimes, when reporting a statistically insignificant result, authors take special care to highlight what they can rule out.

  • “We find no evidence that wealth impacts mortality or health care utilization… Our estimates allow us to rule out effects on 10-year mortality one sixth as large as the cross-sectional wealth-mortality gradient.” In other words, we can rule out even a pretty small effect. “The effects on most other child outcomes, including drug consumption, scholastic performance, and skills, can usually be bounded to a tight interval around zero.” (Cesarini et al., QJE, 2016)
  • “We estimate insignificant effects of the [Swedish education] reform [that increased years of compulsory schooling] on mortality in the affected cohort. From the confidence intervals, we can rule out effects larger than 1–1.4 months of increased life expectancy.” (Meghir, Palme, & Simeonova, AEJ: Applied, 2018)
  • “We can rule out even modest positive impacts on test scores.” (de Ree et al., QJE, 2017)
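The “ruling out” language in these quotes is essentially a confidence-interval check; here is a minimal sketch of the arithmetic, with made-up numbers:

    # If a benchmark effect size lies outside the 95% confidence interval of the
    # estimate, the data are inconsistent with an effect that large. Made-up numbers.
    estimate = 0.002    # estimated effect
    se = 0.004          # standard error of the estimate
    benchmark = 0.02    # effect size we want to know whether we can rule out

    lower, upper = estimate - 1.96 * se, estimate + 1.96 * se
    print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
    print("Benchmark ruled out:", not (lower <= benchmark <= upper))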

Of course, not all insignificant results are created equal. In the design of a research project, data that illuminates what kind of statistically insignificant result you have can help. Consider five (non-exhaustive) potential reasons for an insignificant result proposed by Glewwe and Muralidharan (and summarized in my blog post on their paper, which I adapt below).

  1. The intervention doesn’t work. (This is the easiest conclusion, but it’s often the wrong one.)
  2. The intervention was implemented poorly. Textbooks in Sierra Leone made it to schools but never got distributed to students (Sabarwal et al. 2014).
  3. The intervention led to substitution away from program inputs by other actors. School grants in India lost their impact in the second year when households lowered their education spending to compensate (Das et al. 2013).
  4. The intervention works for some participants, but it doesn’t alleviate a binding constraint for the average participant. English language textbooks in rural Kenya only benefitted the top students, who were the only ones who could read them (Glewwe et al. 2009).
  5. The intervention will only work with complementary interventions. School grants in Tanzania only worked when complemented with teacher performance pay (Mbiti et al. 2014).

Here are two papers that – just in the abstract – demonstrate detective work to understand what’s going on behind their insignificant results.

For example #1, in Atkin et al. (QJE, 2017), few soccer-ball-producing firms in Pakistan take up a technology that reduces waste. Why?

“We hypothesize that an important reason for the lack of adoption is a misalignment of incentives within firms: the key employees (cutters and printers) are typically paid piece rates, with no incentive to reduce waste, and the new technology slows them down, at least initially. Fearing reductions in their effective wage, employees resist adoption in various ways, including by misinforming owners about the value of the technology.”

And then, they implemented a second experiment to test the hypothesis.

“To investigate this hypothesis, we implemented a second experiment among the firms that originally received the technology: we offered one cutter and one printer per firm a lump-sum payment, approximately a month’s earnings, conditional on demonstrating competence in using the technology in the presence of the owner. This incentive payment, small from the point of view of the firm, had a significant positive effect on adoption.”

Wow! You thought we had a null result, but by the end of the abstract, we produced a statistically significant result!

For example #2, Michalopoulos and Papaioannou (QJE, 2014) can’t run a follow-up experiment because they’re looking at the partition of African ethnic groups by political boundaries imposed half a century ago. “We show that differences in countrywide institutional structures across the national border do not explain within-ethnicity differences in economic performance.” What? Do institutions not matter? Need we rethink everything we learned from Why Nations Fail? Oh ho, the “average noneffect…masks considerable heterogeneity.” This is a version of Reason 4 from Glewwe and Muralidharan above.

These papers remind us that economists need to be detectives as well as plumbers, especially in the context of insignificant results.

Towards the end of the paper that began this post, Abadie writes that “we advocate a visible reporting and discussion of non-significant results in empirical practice.” I agree. Non-significant results can change our minds. They can teach us. But authors have to do the work to show readers what they should learn. And editors and reviewers need to be open to it.

What else can you read about this topic?


 

The Politics of Evidence: From evidence-based policy to the good governance of evidence

by Justin Parkhurst © 2017 – Routledge.
Available as pdf and readable online, and hardback or paperback

“There has been an enormous increase in interest in the use of evidence for public policymaking, but the vast majority of work on the subject has failed to engage with the political nature of decision making and how this influences the ways in which evidence will be used (or misused) within political arenas. This book provides new insights into the nature of political bias with regards to evidence and critically considers what an ‘improved’ use of evidence would look like from a policymaking perspective”

“Part I describes the great potential for evidence to help achieve social goals, as well as the challenges raised by the political nature of policymaking. It explores the concern of evidence advocates that political interests drive the misuse or manipulation of evidence, as well as counter-concerns of critical policy scholars about how appeals to ‘evidence-based policy’ can depoliticise political debates. Both concerns reflect forms of bias – the first representing technical bias, whereby evidence use violates principles of scientific best practice, and the second representing issue bias in how appeals to evidence can shift political debates to particular questions or marginalise policy-relevant social concerns”

“Part II then draws on the fields of policy studies and cognitive psychology to understand the origins and mechanisms of both forms of bias in relation to political interests and values. It illustrates how such biases are not only common, but can be much more predictable once we recognise their origins and manifestations in policy arenas”

“Finally, Part III discusses ways to move forward for those seeking to improve the use of evidence in public policymaking. It explores what constitutes ‘good evidence for policy’, as well as the ‘good use of evidence’ within policy processes, and considers how to build evidence-advisory institutions that embed key principles of both scientific good practice and democratic representation. Taken as a whole, the approach promoted is termed the ‘good governance of evidence’ – a concept that represents the use of rigorous, systematic and technically valid pieces of evidence within decision-making processes that are representative of, and accountable to, populations served”

Contents
Part I: Evidence-based policymaking – opportunities and challenges
Chapter 1. Introduction
Chapter 2. Evidence-based policymaking – an important first step, and the need to take the next
Part II: The politics of evidence
Chapter 3. Bias and the politics of evidence
Chapter 4. The overt politics of evidence – bias and the pursuit of political interests
Chapter 5. The subtle politics of evidence – the cognitive-political origins of bias
Part III: Towards the good governance of evidence
Chapter 6. What is ‘good evidence for policy’? From hierarchies to appropriate evidence.
Chapter 7. What is the ‘good use of evidence’ for policy?
Chapter 8. From evidence-based policy to the good governance of evidence

Wiki Surveys: Open and Quantifiable Social Data Collection

by Matthew J. Salganik and Karen E. C. Levy, PLOS ONE
Published: May 20, 2015 https://doi.org/10.1371/journal.pone.0123483

Abstract: In the social sciences, there is a longstanding tension between data collection methods that facilitate quantification and those that are open to unanticipated information. Advances in technology now enable new, hybrid methods that combine some of the benefits of both approaches. Drawing inspiration from online information aggregation systems like Wikipedia and from traditional survey research, we propose a new class of research instruments called wiki surveys. Just as Wikipedia evolves over time based on contributions from participants, we envision an evolving survey driven by contributions from respondents. We develop three general principles that underlie wiki surveys: they should be greedy, collaborative, and adaptive. Building on these principles, we develop methods for data collection and data analysis for one type of wiki survey, a pairwise wiki survey. Using two proof-of-concept case studies involving our free and open-source website www.allourideas.org, we show that pairwise wiki surveys can yield insights that would be difficult to obtain with other methods.

Also explained in detail in this Vimeo video: https://vimeo.com/51369546
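To give a feel for the mechanics, here is a deliberately crude sketch of scoring ideas from pairwise votes. The paper develops a proper Bayesian estimator for this; the smoothed win rate below is only a simple stand-in, and the idea names and votes are invented:

    # Each vote records a (winner, loser) pair from a pairwise comparison screen.
    from collections import defaultdict

    votes = [("idea_A", "idea_B"), ("idea_A", "idea_C"),
             ("idea_B", "idea_C"), ("idea_A", "idea_B")]   # invented votes

    wins, appearances = defaultdict(int), defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1

    # Laplace-smoothed win rate, so ideas seen only a few times are not over-ranked.
    scores = {idea: (wins[idea] + 1) / (appearances[idea] + 2) for idea in appearances}
    for idea, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(idea, round(score, 2))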

OPM’s approach to assessing Value for Money

by Julian King, Oxford Policy Management. January 2018. Available as pdf

Excerpt from Foreword:

In 2016, Oxford Policy Management (OPM) teamed up with Julian King, an evaluation specialist, who worked with staff from across the company to develop the basis of a robust and distinct OPM approach to assessing VfM. The methodology was successfully piloted during the annual reviews of the Department for International Development’s (DFID) Sub-National Governance programme in Pakistan and MUVA, a women’s economic empowerment programme in Mozambique. The approach involves making transparent, evidence-based judgements about how well resources are being used, and whether the value derived is good enough to justify the investment.

To date, we have applied this approach on upwards of a dozen different development projects and programmes, spanning a range of clients, countries, sectors, and budgets. It has been well received by our clients (both funding agencies and partner governments) and project teams alike, who in particular appreciate the use of explicit evaluative reasoning. This involves developing definitions of what acceptable / good / excellent VfM looks like, in the context of each specific project. Critically, these definitions are co-developed and endorsed upfront, in advance of implementation and before the evidence is gathered, which provides an agreed, objective, and transparent basis for making judgements.
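One way to picture this rubric-style reasoning (an illustrative sketch only, not OPM’s actual framework; the ratings and the weakest-link aggregation rule are invented) is to agree standards per criterion up front and then map the evidence-based ratings against them:

    # Standards are agreed before evidence is gathered; each criterion is then rated.
    # Criteria follow DFID's "Four E's"; ratings and the aggregation rule are invented.
    standards = ["poor", "acceptable", "good", "excellent"]   # ordered worst to best

    ratings = {                      # would come from the evidence assessment
        "economy": "good",
        "efficiency": "acceptable",
        "effectiveness": "good",
        "equity": "excellent",
    }

    # Weakest-link rule: overall VfM is no better than the lowest-rated criterion.
    overall = min(ratings.values(), key=standards.index)
    print("Overall VfM rating:", overall)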

Table of contents
Foreword 1
Acknowledgements 3
Executive summary 4
1 Background 7
1.1 What is VfM? 7
1.2 Why evaluate VfM? 7
1.3 Context 8
1.4 Overview of this guide 8
2 Conceptual framework for VfM assessment 9
2.1 Explicit evaluative reasoning 9
2.2 VfM criteria 10
2.3 DFID’s VfM criteria: the Four E’s 10
2.4 Limitations of the Four E’s 12
2.5 Defining criteria and standards for the Four E’s 13
2.6 Mixed methods evidence 15
2.7 Economic analysis 15
2.8 Complexity and emergent strategy 17
2.9 Perspective matters in VfM 19
2.10 Integration with M&E frameworks 19
2.11 Timing of VfM assessments 20
2.12 Level of effort required 20
3 Designing and implementing a VfM framework 21
3.1 Step 1: theory of change 21
3.2 Steps 2 and 3: VfM criteria and standards 22
3.3 Step 4: identifying evidence required 26
3.4 Step 5: gathering evidence 26
3.5 Steps 6–7: analysis, synthesis, and judgements 27
3.6 Step 8: reporting 29
Bibliography 30
Contact us

Review by Dr E. Jane Davidson, author of Evaluation Methodology Basics (Sage, 2005) and Director of Real Evaluation LLC, Seattle

Finally, an approach to Value for Money that breaks free of the “here’s the formula” approach and instead emphasises the importance of thoughtful and well-evidenced evaluative reasoning. Combining an equity lens with insights and perspectives from diverse stakeholders helps us understand the value of different constellations of outcomes relative to the efforts and investments required to achieve them. This step-by-step guide helps decision makers figure out how to answer the VfM question in an intelligent way when some of the most valuable outcomes may be the hardest to measure – as they so often are.

 

Bit by Bit: Social Research in the Digital Age

by Matthew J. Salganik, Princeton University Press, 2017

Very positive reviews by…

Selected quotes:

“Overall, the book relies on a repeated narrative device, imagining how a social scientist and a data scientist might approach the same research opportunity. Salganik suggests that where data scientists are glass-half-full people and see opportunities, social scientists are quicker to highlight problems (the glass-half-empty camp). He is also upfront about how he has chosen to write the book, adopting the more optimistic view of the data scientist, while holding on to the caution expressed by social scientists”

“Salganik argues that data scientists most often work with “readymades”, social scientists with “custommades”, illustrating the point through art: data scientists are more like Marcel Duchamp, using existing objects to make art; meanwhile, social scientists operate in the custom-made style of Michelangelo, which offers a neat fit between research questions and data, but does not scale well. The book is thus a call to arms, to encourage more interdisciplinary research and for both sides to see the potential merits and drawbacks of each approach. It will be particularly welcome to researchers who have already started to think along similar lines, of which I suspect there are many”

  • Illustrates important ideas with examples of outstanding research
  • Combines ideas from social science and data science in an accessible style and without jargon
  • Goes beyond the analysis of “found” data to discuss the collection of “designed” data such as surveys, experiments, and mass collaboration
  • Features an entire chapter on ethics
  • Includes extensive suggestions for further reading and activities for the classroom or self-study

Matthew J. Salganik is professor of sociology at Princeton University, where he is also affiliated with the Center for Information Technology Policy and the Center for Statistics and Machine Learning. His research has been funded by Microsoft, Facebook, and Google, and has been featured on NPR and in such publications as the New Yorker, the New York Times, and the Wall Street Journal.

Contents

Preface
1 Introduction
2 Observing Behavior
3 Asking Questions
4 Running Experiments
5 Creating Mass Collaboration
6 Ethics
7 The Future
Acknowledgments
References
Index

More detailed contents page available via Amazon Look Inside

PS: See also this Vimeo video presentation by Salganik: Wiki Surveys – Open and Quantifiable Social Data Collection plus this PLOS paper on the same topic.

 
