PROCESS TRACING: Oxfam’s Draft Protocol

Undated, but possibly 2012. Available as pdf

Background: “Oxfam GB has adopted a Global Performance Framework. Among other things, this framework involves the random selection of samples of closing or sufficiently mature projects under six outcome areas each year and rigorously evaluating their performance. These are referred to as Effectiveness Reviews. Effectiveness Reviews carried out under the Citizen Voice and Policy Influencing thematic areas are to be informed by a research protocol based on process tracing, a qualitative research approach used by case study researchers to investigate causal inference.”

Oxfam is seeking feedback on this draft.    Please send your comments to

See also the related blog posting by Oxfam on the “AEA365 | A Tip-a-Day by and for Evaluators” website:

Rick Davies comment: While the draft protocol already includes six references on process tracing, I would recommend two more which I think are especially useful and recent:

  • Mahoney, James. “The Logic of Process Tracing Tests in the Social Sciences.” Sociological Methods & Research XX(X) (March 2, 2012): 1–28. doi:10.1177/0049124112437709.
    • Abstract: This article discusses process tracing as a methodology for testing hypotheses in the social sciences. With process tracing tests, the analyst combines preexisting generalizations with specific observations from within a single case to make causal inferences about that case. Process tracing tests can be used to help establish that (1) an initial event or process took place, (2) a subsequent outcome also occurred, and (3) the former was a cause of the latter. The article focuses on the logic of different process tracing tests, including hoop tests, smoking gun tests, and straw in the wind tests. New criteria for judging the strength of these tests are developed using ideas concerning the relative importance of necessary and sufficient conditions. Similarities and differences between process tracing and the deductive nomological model of explanation are explored.
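
The three tests named in the abstract can be told apart by whether passing them is necessary and/or sufficient for a hypothesis to hold. The sketch below is one way to encode this, and it rests on an assumption: it uses the conventional reading of these tests in the wider process tracing literature, which the abstract itself does not spell out. The class name, function name and outcome wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracingTest:
    name: str
    necessary: bool   # the hypothesis cannot survive failing this test
    sufficient: bool  # passing this test is enough to confirm the hypothesis


# Conventional classification (an assumption, not taken from the abstract).
STRAW_IN_THE_WIND = TracingTest("straw in the wind", necessary=False, sufficient=False)
HOOP = TracingTest("hoop", necessary=True, sufficient=False)
SMOKING_GUN = TracingTest("smoking gun", necessary=False, sufficient=True)


def implication(test: TracingTest, passed: bool) -> str:
    """Say what a pass or fail implies for the hypothesis under test."""
    if passed:
        return "confirmed" if test.sufficient else "still viable"
    return "eliminated" if test.necessary else "weakened but not eliminated"


for test in (STRAW_IN_THE_WIND, HOOP, SMOKING_GUN):
    print(f"{test.name}: pass -> {implication(test, True)}, "
          f"fail -> {implication(test, False)}")
```

Read this way, a hoop test is most informative when it is failed and a smoking gun test when it is passed, which is why the strength of a test depends on which kind of condition it probes.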


A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences

Gary Goertz & James Mahoney, 2012
Princeton University Press. Available on Amazon

Review of the book by Dan Hirschman

Excerpts from his review:

“Goertz, a political scientist, and Mahoney, a sociologist, attempt to make sense of the different cultures of research in these two camps without attempting to apply the criteria of one to the other. In other words, the goal is to illuminate difference and similarity rather than judge either approach (or, really, affiliated collection of approaches) as deficient by a universal standard.

G&M are interested in quantitative and qualitative approaches to causal explanation.

Onto the meat of the argument. G&M argue that the two cultures of quantitative and (causal) qualitative research differ in how they understand causality, how they use mathematics, how they privilege within-case vs. between-case variation, how they generate counterfactuals, and more. G&M argue, perhaps counter to our expectations, that both cultures have answers to each of these questions, and that the answers are reasonably coherent across cultures, but create tensions when researchers attempt to evaluate each others’ research: we mean different things, we emphasize different sorts of variation, and so on. Each of these differences is captured in a succinct chapter that lays out in incredible clarity the basic choices made by each culture, and how these choices aggregate up to very different models of research.

Perhaps the most counterintuitive, but arguably most rhetorically important, is the assertion that both quant and qual research are tightly linked to mathematics. For quant research, the connection is obvious: quantitative research relies heavily on probability and statistics. Causal explanation consists of statistically identifying the average effect of a treatment. For qual research, the claim is much more controversial. Rather than relying on statistics, G&M assert that qualitative research relies on logic and set theory, even if this reliance is often implicit rather than formal. G&M argue that at the core of explanation in the qualitative culture are the set theoretic/logical criteria of necessary and sufficient causes. Combinations of necessary and sufficient explanations constitute causal explanations. This search for non-trivial necessary and sufficient conditions for the appearance of an outcome shapes the choices made in the qualitative culture, just as the search for significant statistical variation shapes quantitative research. G&M include a brief review of basic logic, and a quick overview of the fuzzy-set analysis championed by Charles Ragin. I had little prior experience with fuzzy sets (although plenty with formal logic), and I found this chapter extremely compelling and provocative. Qualitative social science works much more often with the notion of partial membership – some countries are not quite democracies, while others are completely democracies, and others are completely not democracies. This fuzzy-set approach highlights the non-linearities inherent in partial membership, as contrasted with quantitative approaches that would tend to treat “degree of democracy” as a smooth variable.”
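
To make the idea of partial membership concrete, here is a minimal sketch of the basic fuzzy-set operations and of the consistency measures commonly used in fuzzy-set analysis to judge whether a condition looks sufficient or necessary for an outcome. The membership scores and variable names are hypothetical, invented purely for illustration, and are not taken from the book or the review.

```python
def fuzzy_and(a: float, b: float) -> float:
    """Membership in the intersection of two sets: the weaker score."""
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    """Membership in the union of two sets: the stronger score."""
    return max(a, b)


def fuzzy_not(a: float) -> float:
    """Membership in the complement of a set."""
    return 1.0 - a


def sufficiency_consistency(x: list[float], y: list[float]) -> float:
    """How far X is a subset of Y: sum(min(x, y)) / sum(x)."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(x)


def necessity_consistency(x: list[float], y: list[float]) -> float:
    """How far Y is a subset of X: sum(min(x, y)) / sum(y)."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(y)


# Hypothetical scores for five cases: 1.0 = fully in the set, 0.0 = fully out.
condition = [1.0, 0.8, 0.6, 0.2, 0.0]
outcome = [1.0, 0.9, 0.7, 0.6, 0.3]

print(sufficiency_consistency(condition, outcome))  # 1.0
print(round(necessity_consistency(condition, outcome), 3))  # 0.743
```

In this invented data every case's membership in the condition is at or below its membership in the outcome, so the condition is fully consistent with being sufficient, while it is only partly consistent with being necessary.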

Earlier paper by same authors available as pdf: A Tale of Two Cultures: Contrasting Quantitative and Qualitative Research
by James Mahoney, Gary Goertz. Political Analysis (2006) 14:227–249 doi:10.1093/pan/mpj017

See also these recent reviews:

See also The Logic of Process Tracing Tests in the Social Sciences by James Mahoney, Sociological Methods & Research, XX(X), 1-28 Published online 2 March 2012

RD comment: This book is recommended reading!

PS 15 February 2013: See Howard White’s new blog posting “Using the causal chain to make sense of the numbers” where he provides examples of the usefulness of simple set-theoretic analyses of the kind described by Mahoney and Goertz (e.g. in an analysis of arguments about why Gore lost to Bush in Florida).


A move to more systematic and transparent approaches in qualitative evidence synthesis

An update on a review of published papers.
By Karin Hannes and Kirsten Macaitis. Qualitative Research 2012 12: 402, originally published online 11 May 2012.


In 2007, the journal Qualitative Research published a review on qualitative evidence syntheses conducted between 1988 and 2004. It reported on the lack of explicit detail regarding methods for searching, appraisal and synthesis, and a lack of emerging consensus on these issues. We present an update of this review for the period 2005–8. Not only has the amount of published qualitative evidence syntheses doubled, but authors have also become more transparent about their searching and critical appraisal procedures. Nevertheless, for the synthesis component of the qualitative reviews, a black box remains between what people claim to use as a synthesis approach and what is actually done in practice. A detailed evaluation of how well authors master their chosen approach could provide important information for developers of particular methods, who seem to succeed in playing the game according to the rules. Clear methodological instructions need to be developed to assist others in applying these synthesis methods.

Connecting communities? A review of World Vision’s use of MSC

A report for World Vision, by Rick Davies and Tracey Delaney, Cambridge and Melbourne, March 2011. Available as pdf

Background to this review

“This review was undertaken by two monitoring and evaluation consultants, both with prior experience in the use of the Most Significant Change (MSC) technique. The review was commissioned by World Vision UK, with funding support from World Vision Canada. The consultants have been asked to “focus on what has and has not worked relating to the implementation and piloting of MSC and why; establish if the MSC tools were helpful to communities that used them; will suggest ideas for consideration on how MSC could be implemented in an integrated way given WV’s structure, systems and sponsorship approach; and what the structural, systems and staffing implications of those suggestions might be”. The review was undertaken in February-March 2011 using a mix of field visits (WV India and Cambodia), online surveys, Skype interviews, and document reviews.

MSC is now being used, in one form or another, in many WV National Offices (NOs). Fifteen countries using MSC were identified through document searches, interviews and an online survey, and other users may exist that did not come to our attention. Three of these countries have participated in a planned and systematic introduction of MSC as part of WV’s Transformational Development Communications (TDC) project; namely Cambodia, India and the Philippines. Almost all of this use has emerged in the last four years, which is a very brief period of time. The ways in which MSC has been used vary widely, and some of them we would call MSC in name only, most notably where the MSC question is being used but there is no subsequent selection process of MSC stories. Across almost all the users of MSC that we made contact with there was a positive view of the value of the MSC process and the stories it can produce. There is clearly a basis here for improving the way MSC is used within WV, and possibly widening the scale of its use. However, it is important to bear in mind that our views are based on a largely self-selected sample of respondents, from 18 of the 45 countries we sought to engage.”


Glossary
1. Executive Summary
  1.1 Background to this review
  1.2 Overview of how MSC is being used in WV
  1.3 The findings: perceptions and outcomes of using MSC
  1.4 Recommendations emerging from this review
  1.5 Concluding comment about the use of MSC within WV
2. Review purpose and methods
  2.1 World Vision expectations
  2.2 Review approach and methods
  2.3 The limitations of this review
3. A quick summary of the use of MSC by World Vision
4. How MSC has been used in World Vision
  4.1 Objectives: Why MSC was being used
  4.2 Processes: How MSC was being used
    Management
    Training
    Domains of change
    Story collection
    A review of some stories documented in WV reports
    Story selection
    Verification
    Feedback
    Quantification
    Secondary analysis
    Use of MSC stories
    Integration with other WV NO and SO functions
  4.3 Outcomes: Experiences and Impacts
    Evaluations of the use of MSC
    Experiences of MSC stories
    Who benefits
    Impacts on policies and practices
    Summary assessments of the strengths and weaknesses of using MSC
5. How MSC has been introduced and used in TDC countries
  5.1 Objectives: Why MSC was being used
  5.2 Process in TDC: a comparison across countries
    Management and coordination of MSC process
    Training and support
    Use of domains
    Story collection
    Story Selection
    Feedback on MSC stories
    Use of MSC stories
    Role out of TDC pilot – extending the use of MSC to all ADPs
    Integration and/or adoption of MSC into other sections of the NO
  5.3 The outcomes of using MSC in the TDC
    Experiences and reactions to MSC
    Who has benefited and how
  5.4 Conclusions about the TDC pilot



Quantitative and Qualitative Methods in Impact Evaluation and Measuring Results

Governance and Social Development Resource Centre. Issues Paper by Sabine Garbarino and Jeremy Holland, March 2009

1 Introduction
There has been a renewed interest in impact evaluation in recent years amongst development agencies and donors. Additional attention was drawn to the issue recently by a Center for Global Development (CGD) report calling for more rigorous impact evaluations, where ‘rigorous’ was taken to mean studies which tackle the selection bias aspect of the attribution problem (CGD, 2006). This argument was not universally well received in the development community; among other reasons there was the mistaken belief that supporters of rigorous impact evaluations were pushing for an approach solely based on randomised control trials (RCTs). While ‘randomisers’ have appeared to gain the upper hand in a lot of the debates, particularly in the United States, the CGD report in fact recognises a range of approaches and the entity set up as a result of its efforts, 3ie, is moving even more strongly towards mixed methods (White, nd). The Department for International Development (DFID) in its draft policy statements similarly stresses the opportunities arising from a synthesis of quantitative and qualitative approaches in impact evaluation. Other work underway on ‘measuring results’ and ‘using numbers’ recognises the need to find standard indicators which capture non-material impacts and which are sensitive to social difference. This work also stresses the importance of supplementing standard indicators with narrative that can capture those dimensions of poverty that are harder to measure. This paper contributes to the ongoing debate on ‘more and better’ impact evaluations by highlighting experience on combining qualitative and quantitative methods for impact evaluation to ensure that we:

1. measure the different impact of donor interventions on different groups of people and

2. measure the different dimensions of poverty, particularly those that are not readily quantified but which poor people themselves identify as important, such as dignity, respect, security and power.

A third framing question was added during the discussions with DFID staff on the use of the research process itself as a way of increasing accountability and empowerment of the poor.

This paper does not intend to provide a detailed account of different approaches to impact evaluation nor an overview of proposed solutions to specific impact evaluation challenges. Instead it defines and reviews the case for combining qualitative and quantitative approaches to impact evaluation. An important principle that emerges in this discussion is that of equity, or what McGee (2003, 135) calls ‘equality of difference’. By promoting various forms of mixing we are moving methodological discussion away from a norm in development research in which qualitative research plays ‘second fiddle’ to conventional empiricist investigation. This means, for example, that contextual studies should not be used simply to confirm or ‘window dress’ the findings of non-contextual surveys. Instead they should play a more rigorous role of observing and evaluating impacts, even replacing, when appropriate, large-scale and lengthy surveys that can ‘overgenerate’ information in an untimely fashion for policy audiences.

The remainder of the paper is structured as follows. Section 2 briefly sets the scene by summarising the policy context. Section 3 clarifies the terminology surrounding qualitative and quantitative approaches, including participatory research. Section 4 reviews options for combining and sequencing qualitative and quantitative methods and data and looks at recent methodological innovations in measuring and analysing qualitative impacts. Section 5 addresses the operational issues to consider when combining methods in impact evaluation. Section 6 briefly concludes.

Using stories to increase sales at Pfizer

by Nigel Edwards, Strategic Communications Management Vol. 15, Issue 2, Feb-March 2011. pages 30-33. Available from Cognitive Edge website, and found via a tweet by David Snowden

RD comment: This article is about the collation, analysis and use of a large volume of qualitative data, and as such has relevance to aid organisations as well as companies. It talks about the integrated use of two sets of methods: anecdote circles as used by a consultancy, Narrate, and SenseMaker software as used by CognitiveEdge. While there is no mention of other story based methods, such as Most Significant Change (MSC), there are some connections. There are also connections with issues I have raised in the PAQI page on this website, which is all about the visualisation of qualitative data. I will explain.

The core of the Pfizer process was the collection of stories from a salesforce in 11 cities in six countries within a two-week period, with a further two weeks to analyse and report back the results. Before then, the organisers identified a number of “signifiers” which could be applied to the stories. I would describe these as tags or categories, between one and four words long, that could be applied to the stories to signal what they were all about. These signifiers were developed as sets of choices offered in the form of polarities and triads. For example, one triad was “achieving the best vs respecting vs people, making a difference”. A polarity was “worried vs excited”. In previous work by Cognitive Edge and LearningbyDesign in Kenya the choice of which signifiers to apply to a story was in the hands of the story-teller, hence Cognitive Edge’s use of the phrase self-signifiers. What appeared to be new in the Pfizer application was that as each story was told by a member of an anecdote circle it was not only self-signified by the story teller, but also signified by the other members of the same group. So, for the 200 stories collected from 94 sales representatives they had 1,700 perspectives on those stories (so presumably about 8.5 people per group gave their choice of signifiers to each of the stories from that group).

I should backtrack at this stage. Self-signifiers are useful for two reasons. Firstly, because they are a way by which the respondent can provide extra information, in effect, meta-data, about what they have said in the story. Secondly, when stories can be given signifiers by multiple respondents from a commonly available set this allows clusters of stories to be self-created (i.e. being those which share the same sets of signifiers) and potentially identified. This is in contrast to external researchers reading the stories themselves, and doing their own tagging and sorting, using NVivo or other means. The risk with this second approach is that the researcher prematurely imposes their own views on the data, before the data can “speak for themselves”. The self-signifying approach is a more participatory and bottom-up process, notwithstanding the fact that the set of signifiers being used may have been identified by the researchers in the first instance. PS: The more self-signifiers there are to choose from, the more possible it will be that the participants can find a specific combination of signifiers which best fits their view of their story. From my reading there were at least 18 signifiers available to be used, possibly more.

The connection to MSC: MSC is about the participatory collection, discussion and selection of stories of significant change. Not only are people asked to describe what they think has been the most significant change, but they are also asked to explain why they think so. And when groups of MSC stories are pooled and discussed, with a view to participants selecting the most significant change from amongst all these, the participants are asked to explain and separately document why they selected the selected story. This is a process of self-signification. In some applications of MSC participants are also asked to place the stories they have discussed into one or another category (called domains), which have in most cases been pre-identified by the organisers. This is another form of self-signifying. These two methods have advantages and disadvantages compared to the Pfizer approach. One limitation I have noticed with the explanations of story choices is that while such discussions around reasons for choosing one story versus another can be very animated and in-depth, the subsequent documentation of the reasons is often very skimpy. Using a signifier tag or category description would be easier and might deliver more usable meta-data – even if participants themselves did not generate those signifiers. My concern, not substantiated, is that the task of assigning the signifiers might derail or diminish the discussion around story selection, which is so central to the MSC process.

Back to Pfizer. After the stories are collected along with their signifiers, the next step described in the Edwards paper is “looking at the overall patterns that emerged”. The text then goes on to describe the various findings and conclusions that were drawn, and how they were acted upon. This sequence reminds me of the cartoon, which has a long complex mathematical formula on a blackboard, with a bit of text in the middle of it all which says “then a miracle happens”. Remember, there were 200 stories with multiple signifiers applied to each story, by about 8 participants. That is 1700 different perspectives. That is a lot of data to look through and make sense of. Within this set I would expect to find many and varied clusters of stories that shared common sets of two or more signifiers. There are two ways of searching for these clusters. One is by intentional search, i.e. by searching for stories that were given both signifier x and signifier y, because they were of specific interest to Pfizer. This requires some prior theory, hypotheses or hunch to guide it, otherwise it would be random search. A random search could take a very long time to find major clusters of stories, because the possibility space is absolutely huge. It doubles with every additional signifier (2, 4, 8, 16…) and there are multiple combinations of these signifiers because 8 participants are applying the signifiers (256 combinations of any combination of signifiers) to any one story. Intentional search is fine, but we will only find what we are looking for.
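
The arithmetic behind the doubling can be checked in a few lines, alongside a sketch of what an intentional search looks like. This assumes each signifier is simply present or absent on a story; the story ids, signifier names and function names are hypothetical, not taken from the Edwards paper.

```python
def possible_combinations(n_signifiers: int) -> int:
    """Number of distinct sets of signifiers (each signifier either on or off)."""
    return 2 ** n_signifiers


def intentional_search(stories: dict[str, set[str]], wanted: set[str]) -> list[str]:
    """Return the ids of stories tagged with every signifier in `wanted`."""
    return sorted(sid for sid, tags in stories.items() if wanted <= tags)


print([possible_combinations(n) for n in (1, 2, 3, 4)])  # [2, 4, 8, 16]
print(possible_combinations(8))    # 256
print(possible_combinations(18))   # 262144

# Hypothetical stories and signifiers, for illustration only.
stories = {
    "story-1": {"worried", "targets"},
    "story-2": {"worried", "targets", "customers"},
    "story-3": {"excited", "customers"},
}
print(intentional_search(stories, {"worried", "targets"}))  # ['story-1', 'story-2']
```

With the 18 signifiers mentioned earlier there are already more than a quarter of a million possible combinations, which is why an unguided search is impractical and a guided one only ever returns what was asked for.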

The other approach is to use tools which automatically visualise the clusters of stories that exist. One of the tools CognitiveEdge use for this purpose (and it is also used during data collection) is a set of triangles, each of which features a different signifier in each of its three corners (the triads above). Each story will appear as a point within the triangle, representing the particular combination of three attributes the story teller felt applied to the story. When multiple stories are plotted within the triangle multiple clusters of stories commonly appear, and they can then be investigated. The limitation of this tool is that it only visualises clusters of three signifiers at a time, when in practice 18 or more were used in the Pfizer case. It is still going to be a slow way to search the space of all possible clusters of stories.
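
As an illustration of how a story might be placed inside such a triangle, the sketch below uses the standard barycentric mapping from three weights to a point in an equilateral triangle. This is an assumption about the geometry only; how SenseMaker itself records or plots positions is not described in the article. The last lines count how many different triangles 18 signifiers would allow if any three could be combined, which is also an assumption.

```python
from math import comb, sqrt


def triad_to_xy(a: float, b: float, c: float) -> tuple[float, float]:
    """Map three non-negative weights to a point in an equilateral triangle.

    Corners: A at (0, 0), B at (1, 0), C at (0.5, sqrt(3) / 2).
    The weights are normalised so that only their proportions matter.
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    b_share, c_share = b / total, c / total
    return b_share + 0.5 * c_share, (sqrt(3) / 2) * c_share


print(triad_to_xy(1, 0, 0))  # corner A: (0.0, 0.0)
print(triad_to_xy(1, 1, 1))  # centre of the triangle

# If any 3 of 18 signifiers could form a triangle, there would be this many.
print(comb(18, 3))  # 816
```

Even under these simplifying assumptions there would be hundreds of triangles to inspect one at a time.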

There is another approach, which I have discussed with David Snowden. This involves viewing stories as being connected to each other in a network, by virtue of sharing two or more signifiers. Data consisting of a list of stories with associated signifiers can be relatively easily imported from Excel into Social Network Analysis software, such as Ucinet/NetDraw, and then visualised as a network. Links can be size coded to show the relative number of signifiers any two connected stories share. More importantly, a filter can then be applied to automatically show only those stories connected by x or more shared signifiers. This is a much less labor intensive way of searching huge possibility spaces. My assumption is that clusters of stories sharing many signifiers are likely to be more meaningful than those sharing fewer, because they are less likely to occur simply by random chance. And perhaps… that smaller clusters sharing many signifiers may be more meaningful than larger clusters sharing many signifiers (where the signifier might be fuzzier and less specific in meaning). These assumptions could be tested.
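
The network idea can be sketched without specialist software. The example below is a plain Python stand-in, not Ucinet/NetDraw: it links any two stories that share at least a chosen number of signifiers, keeps the number shared as the link weight, and then lists the resulting clusters as connected groups. The story ids, signifiers and function names are hypothetical.

```python
from itertools import combinations


def shared_signifier_links(stories, min_shared=2):
    """Link each pair of stories sharing at least `min_shared` signifiers.

    Returns {(story_a, story_b): number_of_shared_signifiers}.
    """
    links = {}
    for a, b in combinations(sorted(stories), 2):
        shared = len(stories[a] & stories[b])
        if shared >= min_shared:
            links[(a, b)] = shared
    return links


def clusters(links):
    """Group linked stories into connected clusters."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, result = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Hypothetical data, for illustration only.
stories = {
    "s1": {"worried", "targets", "manager"},
    "s2": {"worried", "targets", "customer"},
    "s3": {"excited", "customer", "training"},
    "s4": {"excited", "customer", "training", "targets"},
    "s5": {"manager"},
}

print(clusters(shared_signifier_links(stories, min_shared=2)))  # [['s1', 's2', 's3', 's4']]
print(clusters(shared_signifier_links(stories, min_shared=3)))  # [['s3', 's4']]
```

Raising the threshold plays the role of the filter described above: the loosely connected chain of four stories falls away and only the pair sharing three signifiers remains.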

To recapitulate: Being able to efficiently explore large possibility spaces is important because they arise from giving participants more rather than less choice of signifiers. Giving more choice means we are more likely to hear the participants’ particular views, even though they are voiced through our constructs (the signifiers). And a larger number of signifiers means that any cluster of highly connected stories is more likely to be meaningful rather than random.

Social Network Analysis software has an additional relevance for the analysis of the Pfizer data set. Within the 1700 different perspectives on the stories there will not only be a network of stories connected by shared signifiers. There will also be a network of participants, connected by their similar uses of those signifiers. There will be clusters of participants as well as clusters of stories. This social dimension opened up by the participatory process used to apply the signifiers was not touched upon by the Edwards paper, probably because of limitations of time and space. But it could be of great significance for Pfizer when working out how to best respond to the issues raised by the stories. Stories have owners, and different groups of owners will have different interests.
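
A participant network of this kind could be derived from the same data. The sketch below assumes each perspective is recorded as a (participant, story, signifier) triple and measures similarity between two participants as the Jaccard overlap of the signifiers they used. Both the record layout and the choice of Jaccard overlap are assumptions made for illustration, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def signifier_profiles(perspectives):
    """Collect the set of signifiers each participant has used."""
    profiles = defaultdict(set)
    for participant, _story, signifier in perspectives:
        profiles[participant].add(signifier)
    return dict(profiles)


def jaccard(a, b):
    """Share of signifiers two participants have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def participant_links(perspectives, threshold=0.5):
    """Link participants whose signifier use overlaps by at least `threshold`."""
    profiles = signifier_profiles(perspectives)
    links = {}
    for p, q in combinations(sorted(profiles), 2):
        score = jaccard(profiles[p], profiles[q])
        if score >= threshold:
            links[(p, q)] = score
    return links


# Hypothetical records, for illustration only.
perspectives = [
    ("ann", "s1", "worried"), ("ann", "s1", "targets"), ("ann", "s2", "worried"),
    ("bob", "s1", "worried"), ("bob", "s2", "targets"),
    ("cat", "s1", "excited"), ("cat", "s2", "customer"),
]

print(participant_links(perspectives))  # {('ann', 'bob'): 1.0}
```

A fuller version would compare how participants tagged the same stories rather than their overall vocabulary, but that depends on details of the data set that the article does not give.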

The Katine Challenge: How to analyse 540+ stories about a rural development project

The Katine Community Partnerships Project in Soroti District, Uganda, funded by the Guardian and Barclays and implemented by AMREF, is exceptional in some respects and all too common in others.

It is exceptional in the degree to which its progress has been very publicly monitored since it began in October 2007. Not only have all project documents been made publicly available via the dedicated Guardian Katine website, but resident and visiting journalists have posted more than 540 stories about the people, the place and the project. These stories provide an invaluable in-depth and dynamic picture of what has been happening in Katine, unparalleled by anything else I have seen in any other development aid project.

On the flip side, the project is all too common in the kind of design and implementation problems that have been experienced, along with its fair share of unpredictable and very influential external events, including dramatic turn-arounds in various government policies. Plus the usual share of staffing and contracting problems.

The project has now completed its third year of operation and is heading into its fourth and final year, one more year than originally planned.

I have a major concern. It is during this final year that there will be more knowledge about the project available than ever before, but at the same time its donors, and perhaps various staff within AMREF, will be becoming more interested in other new events appearing over the horizon. For example, the Guardian will cease its intensive journalistic coverage of the project from this month, and attention is now focusing on their new international development website.

So, I would like to pose an important challenge to all the visitors to the Monitoring and Evaluation NEWS website, and the associated MandE NEWS email list:

How can the 540+ stories be put to good use? Is there some form of analysis that could be made of their contents, that would help AMREF, the Guardian, Barclays, the people of Katine, and all of us learn more from the Katine project?

In order to help I have uploaded an Excel file listing all the stories since December 2008, with working hypertext links. I will try to progressively extend this list back to the start of the project in late 2007. This list includes copies of all progress reports, review and planning documents that  AMREF has given the Guardian to be uploaded onto their website.

If you have any questions or comments please post them below, as Comments to this posting, in the first instance.

What would be most useful to begin with are ideas about plans or strategies for analysing the data, followed by volunteers to actually implement one or more of these plans.

PS: My understanding is that the data is by definition already in the public domain, and therefore anyone could make use of it. However, that use should be fair and not for profit. What we should be searching for here are lessons or truths in some form that could be seen as having wider applicability, which are based on sound argument and good evidence, as much as is possible.

Measuring Empowerment? Ask Them

Quantifying qualitative outcomes from people’s own analysis. Insights for results-based management from the experience of a social movement in Bangladesh. Dee Jupp and Sohel Ibn Ali, with contribution from Carlos Barahona. 2010: Sida Studies in Evaluation. Download pdf


Participation has been widely taken up as an essential element of development, but participation for what purpose? Many feel that its acceptance, which has extended to even the most conventional of institutions such as the international development banks, has resulted in it losing its teeth in terms of the original ideology of being able to empower those living in poverty and to challenge power relations.

The more recent emergence of the rights-based approach discourse has the potential to restore the ‘bite’ to participation and to re-politicise development. Enshrined in universal declarations and conventions, it offers a palatable route to accommodating radicalism and creating conditions for emancipatory and transformational change, particularly for people living in poverty. But an internet search on how to measure the impact of these approaches yields a disappointing harvest of experience. There is a proliferation of debate on the origins and processes, the motivations and pitfalls of rights-based programming but little on how to know when or if it works. The discourse is messy and confusing and leads many to hold up their hands in despair and declare that outcomes are intangible, contextual, individual, behavioural, relational and fundamentally un-quantifiable!

As a consequence, results-based management pundits are resorting to substantive measurement of products, services and goods which demonstrate outputs and rely on perception studies to measure outcomes.

However, there is another way. Quantitative analyses of qualitative assessments of outcomes and impacts can be undertaken with relative ease and at low cost. It is possible to measure what many regard as unmeasurable.

This publication suggests that steps in the process of attainment of rights and the process of empowerment are easy to identify and measure for those active in the struggle to achieve them. It is our etic perspectives that make the whole thing difficult. When we apply normative frames of reference, we inevitably impose our values and our notions of democracy and citizen engagement rather than embracing people’s own context-based experience of empowerment.

This paper presents the experience of one social movement in Bangladesh, which managed to find a way to measure empowerment by letting the members themselves explain what benefits they acquired from the Movement and by developing a means to measure change over time. These measures, which are primarily of use to the members, have then been subjected to numerical analysis outside of the village environment to provide convincing quantitative data, which satisfies the demands of results-based management.

The paper is aimed primarily at those who are excited by the possibilities of rights-based approaches but who are concerned about proving that their investment results in measurable and attributable change. The experience described here should build confidence that transparency, rigour and reliability can be assured in community led approaches to monitoring and evaluation without distorting the original purpose, which is a system of reflection for the community members themselves. Hopefully, the reader will feel empowered to challenge the sceptics.

Dee Jupp and Sohel Ibn Ali

Research Integration Using Dialogue Methods

David McDonald, Gabriele Bammer, Peter Deane, 2009 Download pdf

Ed: Although about “research integration”, the book is also very relevant to the planning and evaluation of development projects.

“Research on real-world problems—like restoration of wetlands, the needs of the elderly, effective disaster response and the future of the airline industry—requires expert knowledge from a range of disciplines, as well as from stakeholders affected by the problem and those in a position to do something about it. This book charts new territory in taking a systematic approach to research integration using dialogue methods to bring together multiple perspectives. It links specific dialogue methods to particular research integration tasks.

Fourteen dialogue methods for research integration are classified into two groups:

1. Dialogue methods for understanding a problem broadly: integrating judgements

2. Dialogue methods for understanding particular aspects of a problem: integrating visions, world views, interests and values.

The methods are illustrated by case studies from four research areas: the environment, public health, security and technological innovation.”

Stories vs. Statistics: The Impact of Anecdotal Data on Accounting Decision Making

James Wainberg, Thomas Kida, James F. Smith
March 12, 2010  Download pdf copy

Prior research in psychology and communications suggests that decision makers are biased by anecdotal data, even in the presence of more informative statistical data. A bias for anecdotal data can have significant implications for accounting decision making since judgments are often made when both statistical and anecdotal data are present. We conduct experiments in two different accounting contexts (i.e., managerial accounting and auditing) to investigate whether accounting decision makers are unduly influenced by anecdotal data in the presence of superior, and contradictory, statistical data. Our results suggest that accounting decision makers ignored or underweighted statistical data in favor of anecdotal data, leading to suboptimal decisions. In addition, we investigate whether two decision aids, judgment orientation and counterargument, help to mitigate the effects of this anecdotal bias. The results indicate that both decision aids can reduce the influence of anecdotal data in accounting decision contexts. The implications of these results for decision making in accounting and auditing are discussed.