Structured Analytic Techniques for Intelligence Analysis

This is the title of the book by Randolph H. Pherson and Richards J. Heuer Jr., now in its 3rd edition, published by Sage in 2019.

It is not a cheap book, so I am not encouraging its purchase, but I am encouraging a perusal of its contents via the contents list and via Amazon’s “Look inside” facility.

Why so? The challenges facing intelligence analysts are especially difficult, so any methods used to address them may be of wider interest. These challenges are spelled out in the book’s Foreword.


This book is of interest in a number of ways:

  1. To what extent are the challenges faced similar to, or different from, those faced in evaluations of publicly visible interventions?
  2. How different is the tool set, and the categorisation of the contents of that set?
  3. How much research has gone into the development and testing of this tool set?

The challenges

Some of these challenges are also faced by evaluation teams working in more overt and less antagonistic settings, albeit to a lesser degree. For example: judging what will work in future in slightly different settings (1); working with missing and ambiguous evidence (2); dealing with clients and other stakeholders who may, intentionally or unintentionally, fail to disclose information or actually mislead (3); and making recommendations that can affect people’s lives, positively and negatively (4).

The contents of the tool set

My first impression is that this book casts its net much wider than the average evaluation text (if there is such a thing). The families of methods include team working, organising, exploring, diagnosing, reframing, foresight, decision support, and more. Secondly, there are quite a few methods within these families that I had not heard of before, including Bowtie analysis, opportunities incubator, morphological analysis, premortem analysis, deception detection and inconsistencies finder. The last two are of particular interest. Hopefully they are more than just method brand names.

Research and testing

Worth looking at, alongside this publication, is this 17-page paper: Artner, S., Girven, R., & Bruce, J. (2016). Assessing the Value of Structured Analytic Techniques in the U.S. Intelligence Community. RAND Corporation. Its key findings are summarised as follows:

    • The U.S. Intelligence Community does not systematically evaluate the effectiveness of structured analytic techniques, despite their increased use.
    • One promising method of assessing these techniques would be to initiate qualitative reviews of their contribution in bodies of intelligence production on a variety of topics, in addition to interviews with authors, managers, and consumers.
    • A RAND pilot study found that intelligence publications using these techniques generally addressed a broader range of potential outcomes and implications than did other analyses.
    • Quantitative assessments correlating the use of structured techniques to measures of analytic quality, along with controlled experiments using these techniques, could provide a fuller picture of their contribution to intelligence analysis.

See also Chang, W., & Berdini, E. (2017). Restructuring Structured Analytic Techniques in Intelligence, for an interesting in-depth analysis of bias risks and how they are managed and possibly mismanaged. Here is the abstract:

Structured analytic techniques (SATs) are intended to improve intelligence analysis by checking the two canonical sources of error: systematic biases and random noise. Although both goals are achievable, no one knows how close the current generation of SATs comes to achieving either of them. We identify two root problems: (1) SATs treat bipolar biases as unipolar. As a result, we lack metrics for gauging possible over-shooting—and have no way of knowing when SATs that focus on suppressing one bias (e.g., over-confidence) are triggering the opposing bias (e.g., under-confidence); (2) SATs tacitly assume that problem decomposition (e.g., breaking reasoning into rows and columns of matrices corresponding to hypotheses and evidence) is a sound means of reducing noise in assessments. But no one has ever actually tested whether decomposition is adding or subtracting noise from the analytic process—and there are good reasons for suspecting that decomposition will, on balance, degrade the reliability of analytic judgment. The central shortcoming is that SATs have not been subject to sustained scientific analysis of the sort that could reveal when they are helping or harming the cause of delivering accurate assessments of the world to the policy community.
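To make the abstract’s point (2) concrete: the matrices it refers to are of the kind used in the Analysis of Competing Hypotheses technique, where each row is an item of evidence, each column a hypothesis, and each cell a judgement about how consistent that evidence is with that hypothesis. Here is a minimal sketch of that decomposition; the hypotheses, evidence items and scores are all invented for illustration.

```python
# Minimal sketch of a hypotheses-by-evidence matrix, the kind of decomposition
# the abstract above refers to. All names and scores here are hypothetical.

hypotheses = ["H1: deliberate deception", "H2: honest error", "H3: nothing happened"]

# Each evidence item is scored against each hypothesis:
# +1 = consistent, 0 = neutral or ambiguous, -1 = inconsistent
evidence = {
    "Source A report":    [+1, -1,  0],
    "Imagery analysis":   [ 0, +1, -1],
    "Absence of chatter": [-1,  0, +1],
}

# A common summary step: count how many evidence items are inconsistent with
# each hypothesis, and treat the least-disconfirmed hypothesis as strongest.
for j, hypothesis in enumerate(hypotheses):
    inconsistencies = sum(1 for scores in evidence.values() if scores[j] == -1)
    print(f"{hypothesis}: {inconsistencies} inconsistent item(s)")
```

Whether scoring and tallying like this actually reduces noise in analysts’ judgements, or adds to it, is exactly the question the authors say has never been properly tested.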

Both sound like serious critiques, but compared to what? There are probably plenty of evaluation methods where the same criticism could be applied – no one has subjected them to serious evaluation.

An Institutional View of Algorithmic Impact Assessments

Selbst, A. (2021). An Institutional View of Algorithmic Impact Assessments. Harvard Journal of Law and Technology, 35(10), 78. The author has indicated that the downloadable version of the paper has “draft” status.
First, some general points about its relevance:
  1. Rich people get personalised one-to-one attention and services. Poor people get processed by algorithms. That may be a bit of a caricature, but there is also some truth there. Consider loan applications, bail applications, recruitment decisions, welfare payments. And perhaps medical diagnoses and treatments, depending on the source of the service. There is therefore a good reason for any evaluators concerned with equity to pay close attention to how algorithms affect the lives of the poorest sections of societies.
  2. This paper reminded me of the importance of impact assessments, as distinct from impact evaluations. The former are concerned with the “effects-of-a-cause”, as distinct from the “causes-of-an-effect”, which is the focus of impact evaluations. In this paper impact assessment is specifically concerned with negative impacts, which is a narrower ambit than I have seen previously in my sphere of work, but complementary to the expectations of positive impact associated with impact evaluations. It may reflect the narrowness of the part of the evaluation world I inhabit, but my feeling is that impact evaluations get far more attention than impact assessments. Yet one could argue that the default situation should be the reverse. I can’t quite articulate my reasoning, but I think it is something to do with the perception that, most of the time, the world acts on us rather than us acting on the world.
Some selected quotes:
  1. The impact assessment approach has two principal aims. The first goal is to get the people who build systems to think methodically about the details and potential impacts of a complex project before its implementation, and therefore head off risks before they become too costly to correct. As proponents of values-in-design have argued for decades, the earlier in project development that social values are considered, the more likely that the end result will reflect those social values. The second goal is to create and provide documentation of the decisions made in development and their rationales, which in turn can lead to better accountability for those decisions and useful information for future policy  interventions (p.6)
    1. This Article will argue in part that once filtered through the institutional logics of the private sector, the first goal of improving systems through better design will only be effective in those organizations motivated by social obligation rather than mere compliance, but the second goal of producing information needed for better policy and public understanding is what really can make the AIA regime worthwhile (p.8)
  2. Among all possible regulatory approaches, impact assessments are most useful where projects have unknown and hard-to-measure impacts on society, where the people creating the project and the ones with the knowledge and expertise to estimate its impacts have inadequate incentives to generate the needed information, and where the public has no other means to create that information. What is attractive about the AIA (Algorithmic Impact Assessment) is that we are now in exactly such a situation with respect to algorithmic harms. (p.7)
  3. The Article proceeds in four parts. Part I introduces the AIA, and explains why it is likely a useful approach… Part II briefly surveys different models of AIA that have been proposed as well as two alternatives: self-regulation and audits… Part III examines how institutional forces shape regulation and compliance, seeking to apply those lessons to the case of AIAs… Ultimately, the Part concludes that AIAs may not be fully successful in their primary goal of getting individual firms to consider social problems early, but that the second goal of policy-learning may well be more successful because it does not require full substantive compliance. Finally, Part IV looks at what we can learn from the technical community. This part discusses many relevant developments within the technology industry and scholarship: empirical research into how firms understand AI fairness and ethics, proposals for documentation standards coming from academic and industrial labs, trade groups, standards organizations, and various self-regulatory framework proposals. (p.9)


“The Checklist Manifesto”, another perspective on managing the problem of extreme complexity

The Checklist Manifesto by Atul Gawande, 2009

Atul differentiates two types of problems that we face when dealing with extreme complexity. One is ignorance: there is a lot we simply don’t know. Unpredictability is a facet of complexity that many writers on the subject have given plenty of attention to, along with possible ways of managing that unpredictability. The other problem he identifies is ineptitude: our inability to make good use of knowledge that is already available. He gives many examples, notably from medicine, where complex bodies of knowledge already exist that can make a big difference to people’s lives. But because of the very scale of those bodies of knowledge, people are often not capable of making full use of them, and sometimes the consequences are disastrous. This facet of complexity is not something I have seen given much attention in the literature on complexity, at least that which I have come across. So I read this book with great interest, an interest magnified no doubt by my previous interest in, and experiments with, the use of weighted checklists, which are documented elsewhere on this website.

Another distinction he makes is between task checklists and communication checklists. The first are all about avoiding dumb mistakes: forgetting to do things we know have to be done. The second are about coping with unexpected events, by making sure that relevant information is communicated to the relevant people. He gives some interesting examples from the (big) building industry where, given the complexity of modern construction activities, and despite the extensive use of task checklists, there are still inevitably various unexpected hitches that have to be responded to effectively, without jeopardising the progress or safety of the construction process.

Some selected quotes:

  • Checklists helped ensure a higher standard of baseline performance.
  • Medicine has become the art of managing extreme complexity – and a test of whether such extreme complexity can, in fact, be humanely mastered.
  • Teamwork may just be hard in certain lines of work. Under conditions of extreme complexity, we inevitably rely on a division of tasks and expertise… But the evidence suggests that we need them to see their job not just as performing their isolated set of tasks well, but also as helping the group get the best possible results.
  • It is common to misconceive how checklists function in complex lines of work. They are not comprehensive how-to guides, whether for building a skyscraper or getting a plane out of trouble. They are quick and simple tools aimed to buttress the skills of expert professionals. And by remaining swift and usable and resolutely modest, they are saving thousands upon thousands of lives.
  • When you are making a checklist, you have a number of key decisions. You must define a clear pause point at which the checklist is supposed to be used (unless the moment is obvious, like when a warning light goes on or an engine fails). You must decide whether you want a do-confirm checklist or a read-do checklist. With a do-confirm checklist, team members perform their jobs from memory and experience, often separately. But then they stop. They pause to run the checklist and confirm that everything that was supposed to be done was done. With a read-do checklist, on the other hand, people carry out the tasks as they check them off; it’s more like a recipe. So for any new checklist created from scratch, you have to pick the type that makes the most sense for the situation. (A minimal sketch of these two checklist types follows this list.)
  • We are obsessed in medicine with having great components – the best drugs, the best devices, the best specialists – but pay little attention to how to make them fit together well. Berwick notes how wrongheaded this approach is: ‘Anyone who understands systems will know immediately that optimising parts is not a good route to system excellence,’ he says.
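As a way of fixing the do-confirm / read-do distinction in my own mind, here is a minimal sketch of the two checklist types as a small data structure. The example checklist content is invented for illustration, not taken from the book.

```python
# Minimal sketch of the two checklist types Gawande describes.
# The checklist content below is invented for illustration.
from dataclasses import dataclass
from enum import Enum

class ChecklistType(Enum):
    DO_CONFIRM = "do-confirm"  # work from memory first, then pause and confirm
    READ_DO = "read-do"        # read each item and do it, like a recipe

@dataclass
class Checklist:
    name: str
    pause_point: str           # the clearly defined moment at which the checklist is run
    type: ChecklistType
    items: list

    def run(self) -> None:
        verb = "Confirm" if self.type is ChecklistType.DO_CONFIRM else "Do"
        print(f"[{self.pause_point}] {self.name} ({self.type.value})")
        for item in self.items:
            print(f"  {verb}: {item}")

# A hypothetical example, loosely inspired by the surgical checklists discussed in the book
before_incision = Checklist(
    name="Before skin incision",
    pause_point="after anaesthesia, before the incision is made",
    type=ChecklistType.DO_CONFIRM,
    items=[
        "antibiotics given within the last 60 minutes",
        "all team members have introduced themselves by name and role",
    ],
)
before_incision.run()
```

Nothing here is clever; the point, as Gawande argues, is that the discipline of a defined pause point and an explicit choice of checklist type is what makes a checklist usable under pressure.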

I could go on, but I would rather keep reading the book… :-)


Calling Bullshit: THE ART OF SKEPTICISM IN A DATA-DRIVEN WORLD

Reviews

Wired review article

Guardian review article

Forbes review article

Kirkus Review article

Podcast Interview with the authors here

ABOUT CALLING BULLSHIT (=publisher blurb)
“Bullshit isn’t what it used to be. Now, two science professors give us the tools to dismantle misinformation and think clearly in a world of fake news and bad data.

Misinformation, disinformation, and fake news abound and it’s increasingly difficult to know what’s true. Our media environment has become hyperpartisan. Science is conducted by press release. Startup culture elevates bullshit to high art. We are fairly well equipped to spot the sort of old-school bullshit that is based in fancy rhetoric and weasel words, but most of us don’t feel qualified to challenge the avalanche of new-school bullshit presented in the language of math, science, or statistics. In Calling Bullshit, Professors Carl Bergstrom and Jevin West give us a set of powerful tools to cut through the most intimidating data.

You don’t need a lot of technical expertise to call out problems with data. Are the numbers or results too good or too dramatic to be true? Is the claim comparing like with like? Is it confirming your personal bias? Drawing on a deep well of expertise in statistics and computational biology, Bergstrom and West exuberantly unpack examples of selection bias and muddled data visualization, distinguish between correlation and causation, and examine the susceptibility of science to modern bullshit.

We have always needed people who call bullshit when necessary, whether within a circle of friends, a community of scholars, or the citizenry of a nation. Now that bullshit has evolved, we need to relearn the art of skepticism.”

Evaluation Failures: 22 Tales of Mistakes Made and Lessons Learned

Edited by: Kylie Hutchinson – Community Solutions, Vancouver, Canada. 2018 Published by Sage. https://us.sagepub.com/en-us/nam/evaluation-failures/book260109

But $30 for a 184-page paperback is going to limit its appeal! The electronic version is similarly expensive, more like the cost of a hardback. Fortunately, two example chapters (1 and 8) are available as free pdfs, see below. Reading those two chapters makes me think the rest of the book would also be well worth reading. It is not often that you see anything written at length about evaluation failures. Perhaps we should set up an online confessional, where we can line up to anonymously confess our un/professional sins. I will certainly be one of those needing to join such a queue! :)

PART I. MANAGE THE EVALUATION
Chapter 2. The Scope Creep Train Wreck: How Responsive Evaluation Can Go Off the Rails
Chapter 3. The Buffalo Jump: Lessons After the Fall
Chapter 4. Evaluator Self-Evaluation: When Self-Flagellation Is Not Enough
PART II. ENGAGE STAKEHOLDERS
Chapter 5. That Alien Feeling: Engaging All Stakeholders in the Universe
Chapter 6. Seeds of Failure: How the Evaluation of a West African
Chapter 7. I Didn’t Know I Would Be a Tightrope Walker Someday: Balancing Evaluator Responsiveness and Independence
PART III. BUILD EVALUATION CAPACITY
Chapter 9. Stars in Our Eyes: What Happens When Things Are Too Good to Be True
PART IV. DESCRIBE THE PROGRAM
Chapter 10. A “Failed” Logic Model: How I Learned to Connect With All Stakeholders
Chapter 11. Lost Without You: A Lesson in System Mapping and Engaging Stakeholders
PART V. FOCUS THE EVALUATION DESIGN
Chapter 12. You Got to Know When to Hold ’Em: An Evaluation That Went From Bad to Worse
Chapter 13. The Evaluation From Hell: When Evaluators and Clients Don’t Quite Fit
PART VI. GATHER CREDIBLE EVIDENCE
Chapter 14. The Best Laid Plans of Mice and Evaluators: Dealing With Data Collection Surprises in the Field
Chapter 15. Are You My Amigo, or My Chero? The Importance of Cultural Competence in Data Collection and Evaluation
Chapter 16. OMG, Why Can’t We Get the Data? A Lesson in Managing Evaluation Expectations
Chapter 17. No, Actually, This Project Has to Stop Now: Learning When to Pull the Plug
Chapter 18. Missing in Action: How Assumptions, Language, History, and Soft Skills Influenced a Cross-Cultural Participatory Evaluation
PART VII. JUSTIFY CONCLUSIONS
Chapter 19. “This Is Highly Illogical”: How a Spock Evaluator Learns That Context and Mixed Methods Are Everything
Chapter 20. The Ripple That Became a Splash: The Importance of Context and Why I Now Do Data Parties
Chapter 21. The Voldemort Evaluation: How I Learned to Survive Organizational Dysfunction, Confusion, and Distrust
PART VIII. REPORT AND ENSURE USE
Chapter 22. The Only Way Out Is Through
Conclusion


Free Coursera online course: Qualitative Comparative Analysis (QCA)

Highly recommended! A well organised and very clear and systematic exposition. Available at: https://www.coursera.org/learn/qualitative-comparative-analysis

About this Course

Welcome to this massive open online course (MOOC) about Qualitative Comparative Analysis (QCA). Please read the points below before you start the course. This will help you prepare well for the course and attend it properly. It will also help you determine if the course offers the knowledge and skills you are looking for.

What can you do with QCA?

  • QCA is a comparative method that is mainly used in the social sciences for the assessment of cause-effect relations (i.e. causation).
  • QCA is relevant for researchers who normally work with qualitative methods and are looking for a more systematic way of comparing and assessing cases.
  • QCA is also useful for quantitative researchers who would like to assess alternative (more complex) aspects of causation, such as how factors work together in producing an effect (see the sketch after this list).
  • QCA can be used for the analysis of cases on all levels: macro (e.g. countries), meso (e.g. organizations) and micro (e.g. individuals).
  • QCA is mostly used for research on small- and medium-sized samples and populations (10-100 cases), but it can also be used for larger groups. Ideally, the number of cases is at least 10.
  • QCA cannot be used if you are doing an in-depth study of one case.
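For anyone wondering what QCA actually does with a set of cases, the core move is to sort them into a “truth table”: each row is one observed combination of conditions, and you then ask how consistently that combination goes together with the outcome. The sketch below illustrates the idea in plain Python; the cases, conditions and outcomes are invented, and a real analysis would use dedicated QCA software rather than code like this.

```python
# A toy crisp-set truth table: group cases by their combination of conditions
# and check how consistently each combination is linked to the outcome.
# Cases, conditions (A, B) and outcomes are invented for illustration only.
from collections import defaultdict

cases = {
    # case name: ({condition: 0 or 1}, outcome 0 or 1)
    "Case 1": ({"A": 1, "B": 1}, 1),
    "Case 2": ({"A": 1, "B": 0}, 1),
    "Case 3": ({"A": 0, "B": 1}, 0),
    "Case 4": ({"A": 1, "B": 0}, 0),
    "Case 5": ({"A": 0, "B": 0}, 0),
}

rows = defaultdict(list)
for name, (conditions, outcome) in cases.items():
    configuration = tuple(sorted(conditions.items()))
    rows[configuration].append((name, outcome))

for configuration, members in rows.items():
    consistency = sum(outcome for _, outcome in members) / len(members)
    names = ", ".join(name for name, _ in members)
    print(dict(configuration), f"n={len(members)}", f"consistency={consistency:.2f}", f"({names})")
```

Configurations that are highly consistent with the outcome are the raw material for the “recipes” (combinations of conditions) that QCA then simplifies; that minimisation step is what the more technical weeks of the course deal with.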

What will you learn in this course?

  • The course is designed for people who have no or little experience with QCA.
  • After the course you will understand the methodological foundations of QCA.
  • After the course you will know how to conduct a basic QCA study by yourself.

How is this course organized?

  • The MOOC takes five weeks. The specific learning objectives and activities per week are mentioned in appendix A of the course guide. Please find the course guide under Resources in the main menu.
  • The learning objectives with regard to understanding the foundations of QCA and practically conducting a QCA study are pursued throughout the course. However, week 1 focuses more on the general analytic foundations, and weeks 2 to 5 are more about the practical aspects of a QCA study.
  • The activities of the course include watching the videos, consulting supplementary material where necessary, and doing assignments. The activities should be done in that order: first watch the videos; then consult supplementary material (if desired) for more details and examples; then do the assignments.
  • There are 10 assignments. Appendix A in the course guide states the estimated time needed to make the assignments and how the assignments are graded. Only assignments 1 to 6 and 8 are mandatory. These 7 mandatory assignments must be completed successfully to pass the course.
  • Making the assignments successfully is one condition for receiving a course certificate. Further information about receiving a course certificate can be found here: https://learner.coursera.help/hc/en-us/articles/209819053-Get-a-Course-Certificate

About the supplementary material

  • The course can be followed by watching the videos. It is not absolutely necessary, yet recommended, to study the supplementary reading material (as mentioned in the course guide) for further details and examples. Further, because some of the covered topics are quite technical (particularly topics in weeks 3 and 4 of the course), we provide several worked examples that supplement the videos by offering more specific illustrations and explanation. These worked examples can be found under Resources in the main menu.
  • Note that the supplementary readings are mostly not freely available. Books have to be bought or might be available in a university library; journal publications have to be ordered online or are accessible via a university license.
  • The textbook by Schneider and Wagemann (2012) functions as the primary reference for further information on the topics that are covered in the MOOC. Appendix A in the course guide mentions which chapters in that book can be consulted for which week of the course.
  • The publication by Schneider and Wagemann (2012) is comprehensive and detailed, and covers almost all topics discussed in the MOOC. However, for further study, appendix A in the course guide also mentions some additional supplementary literature.
  • Please find the full list of references for all citations (mentioned in this course guide, in the MOOC, and in the assignments) in appendix B of the course guide.


Five ways to ensure that models serve society: A manifesto

Saltelli, A., Bammer, G., Bruno, I., Charters, E., Fiore, M. D., Didier, E., Espeland, W. N., Kay, J., Piano, S. L., Mayo, D., Pielke Jr., R., Portaluri, T., Porter, T. M., Puy, A., Rafols, I., Ravetz, J. R., Reinert, E., Sarewitz, D., Stark, P. B., … Vineis, P. (2020). Five ways to ensure that models serve society: A manifesto. Nature, 582(7813), 482–484. https://doi.org/10.1038/d41586-020-01812-9

The five ways:

    1. Mind the assumptions
      • “One way to mitigate these issues is to perform global uncertainty and sensitivity analyses. In practice, that means allowing all that is uncertain — variables, mathematical relationships and boundary conditions — to vary simultaneously as runs of the model produce its range of predictions. This often reveals that the uncertainty in predictions is substantially larger than originally asserted” (see the sketch after this list)
    2. Mind the hubris
      • “Most modellers are aware that there is a tradeoff between the usefulness of a model and the breadth it tries to capture. But many are seduced by the idea of adding complexity in an attempt to capture reality more accurately. As modellers incorporate more phenomena, a model might fit better to the training data, but at a cost. Its predictions typically become less accurate”
    3. Mind the framing
      • “Match purpose and context. Results from models will at least partly reflect the interests, disciplinary orientations and biases of the developers. No one model can serve all purposes.”
    4. Mind the consequences
      • “Quantification can backfire. Excessive regard for producing numbers can push a discipline away from being roughly right towards being precisely wrong. Undiscriminating use of statistical tests can substitute for sound judgement. By helping to make risky financial products seem safe, models contributed to derailing the global economy in 2007–08 (ref. 5).”
    5. Mind the unknowns
      • “Acknowledge ignorance. For most of the history of Western philosophy, self-awareness of ignorance was considered a virtue, the worthy object of intellectual pursuit”
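The point of the “global” analyses mentioned under (1) above is that all of the uncertain inputs are varied at the same time, rather than one at a time, and the spread of the resulting predictions is then reported. A minimal Monte Carlo sketch of that idea follows; the toy model and the input ranges are invented purely for illustration.

```python
# Minimal sketch of a global uncertainty analysis: vary all uncertain inputs
# simultaneously and look at the resulting spread of model predictions.
# The toy model and parameter ranges are invented for illustration.
import random

def toy_model(growth_rate: float, baseline: float, shock: float) -> float:
    """A deliberately simple stand-in for a real simulation model."""
    return baseline * (1 + growth_rate) ** 10 - shock

random.seed(1)
predictions = []
for _ in range(10_000):
    # All uncertain inputs are sampled at the same time (not one at a time)
    growth_rate = random.uniform(0.00, 0.05)
    baseline = random.uniform(90, 110)
    shock = random.uniform(0, 30)
    predictions.append(toy_model(growth_rate, baseline, shock))

predictions.sort()
low, median, high = (predictions[int(len(predictions) * q)] for q in (0.05, 0.5, 0.95))
print(f"5th percentile: {low:.1f}  median: {median:.1f}  95th percentile: {high:.1f}")
```

The interval that comes out, rather than a single best estimate, is the “range of predictions” the quote refers to, and it is often much wider than a headline figure would suggest.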

“Ignore the five, and model predictions become Trojan horses for unstated interests and values”

“Models’ assumptions and limitations must be appraised openly and honestly. Process and ethics matter as much as intellectual prowess”

“Mathematical models are a great way to explore questions. They are also a dangerous way to assert answers. Asking models for certainty or consensus is more a sign of the difficulties in making controversial decisions than it is a solution, and can invite ritualistic use of quantification”

Evaluating the Future

A blog posting and (summarising) podcast, produced for the EU Evaluation Support Services, by Rick Davies, June 2020

The podcast is available here, on the Capacity4Dev website

The full text of the blog posting is here as a pdf

    • Limitations of common evaluative thinking
    • Scenario planning
    • Risk vs uncertainty
    • Additional evaluation criteria
    • Meaningful differences
    • Other information sources

Process Tracing as a Practical Evaluation Method: Comparative Learning from Six Evaluations

By Alix Wadeson, Bernardo Monzani and Tom Aston
March 2020. pdf available here

Rick Davies comment: This is the most interesting and useful paper I have yet seen on process tracing and its use for evaluation purposes. A good mix of methodology discussion, practical examples and useful recommendations.