Cekova, D., Corsetti, L., Ferretti, S. and Vaca, S. (2025). Considerations and Practical Applications for Using Artificial Intelligence (AI) in Evaluations. Technical Note. CGIAR Independent Advisory and Evaluation Service (IAES). Rome: IAES Evaluation Function. https://iaes.cgiar.org/evaluation
Executive Summary
The CGIAR 2030 Research and Innovation Strategy commits to organizational change through seven ways of working, including “Making the digital revolution central to our way of working”. In that context, Artificial Intelligence (AI) introduces both opportunities and risks to evaluation practice. Guided by the CGIAR-wide Evaluation Framework, the integration of AI tools requires a governance approach that balances innovation with ethical responsibility, ensuring transparency, fairness, accountability, and inclusivity. This Technical Note encourages and guides CGIAR evaluators to ethically explore, negotiate, and experiment with AI tools:
Explore: Evaluators are invited to discover how AI, especially generative AI (GenAI), can enhance evaluation efficiency, from scoping and data analysis to reporting. The Note provides practical guidance on AI applications, with examples to support creative yet responsible exploration.
Negotiate: The integration of AI should be openly discussed with commissioners, stakeholders, and teams. The Note prioritizes jointly defining boundaries, expectations, and ethical parameters (transparency, accountability, and data sensitivity) at each phase of the evaluation.
Use AI Responsibly: As AI tools continue to evolve, evaluators are encouraged to pilot and iterate their use. The document supports experimentation through practical tips, prompt examples, and tool selection criteria, while emphasizing documentation and learning from each use case.
Effective AI governance is grounded in core principles:
Transparency requires clear documentation of AI tool usage, data sources, model limitations, and decision-making processes.
Accountability involves assigning responsibility for AI decisions and outputs and establishing oversight and redress mechanisms.
Fairness and inclusion require proactive mitigation of bias and discrimination, with particular attention to underrepresented groups and data gaps.
Data privacy and security require alignment with applicable data protection regulations and secure data handling practices.
Human oversight ensures that evaluators retain control over processes and can intervene as needed.
Operationalizing ethical AI governance in CGIAR evaluations requires due diligence in assessing AI tools for ethical alignment before deployment: reviewing vendor transparency, model documentation, and intended use cases. Where relevant, evaluation components involving AI, especially those engaging human subjects or sensitive data, should undergo ethics review. AI applications must be adapted to the local and cultural contexts in which evaluations are conducted, as what is suitable in one setting may be inappropriate in another. Additionally, participants should be informed about the use of AI systems and the implications of data collection or processing, to ensure informed consent.
Ethical AI governance should be embedded across the entire evaluation lifecycle. During the design phase, evaluators should specify which AI tools will be used, justify their selection, and assess the associated risks. In data collection, AI tools should be used in ways that uphold data privacy and protection standards and avoid reinforcing harmful stereotypes or excluding groups. During the analysis phase, the role of AI in supporting interpretation should be documented, with an acknowledgment of its limitations and biases. In dissemination, documentation, and reporting, AI’s contribution, its limitations, and the human validation applied should be disclosed. By enabling rapid adaptation of content across formats, languages, and complexity levels, AI opens possibilities for broader, more inclusive communication of findings. Finally, the follow-up phase should include reflection on the ethical implications observed and on how these lessons can improve future evaluations.
By embedding methodological flexibility into evaluation processes, AI adoption can contribute to integrity, equity, and learning in an era of rapid technological advancement. This Technical Note is a conversation starter: as a “Beta” version, it will evolve based on responsible real-world experimentation and continuous reflection. Evaluators are encouraged to be responsive to stakeholder input throughout the evaluation process to ensure relevance, accuracy, and inclusivity.