The five stages of AI adoption: A field guide for evaluators

Cover image: an illustration of people at their computers, some stressed and some enthusiastic, symbolising the emotional stages of adapting to AI; one panel shows a person interacting with a friendly AI robot on a screen.

This blog by Jonathan Kuhn-Patrick (UK Evaluation Society Trustee and AI Working Group Lead) explores the five emotional stages many evaluators experience as they begin working with AI. Drawing on lessons from developing the UK Evaluation Society's new Good Practice Guidelines for AI in Evaluation, it aims to help readers recognise where they are on the journey and navigate AI adoption with greater clarity.

If you've seen my previous blog on AI in evaluation on the UK Evaluation Society website (evaluation.org.uk), you'll know my journey began in a conference corridor in 2023, downloading Claude onto my phone after Rick Davies' presentation, and evolved into what I described as "standing on the shoulders of a digital giant." What I didn't fully articulate in that earlier piece was the emotional rollercoaster that accompanied the technical learning curve.

Having now spoken with dozens of evaluation practitioners about their own AI experiences, I've noticed a pattern. Most of us travel through roughly the same stages—not quite the five stages of grief, but close enough to warrant the comparison. I should say upfront: these stages aren't strictly sequential. I've moved through them roughly in order, but I've also cycled back more than once, and I suspect you will too. Understanding where you are can help you move forward more productively. And if you're wondering whether I've completed the journey myself: not entirely. But I've at least found a map to help me navigate it—in the form of guidance that we in the UK Evaluation Society have developed for precisely this purpose.

Most of you are already on your own version of this rollercoaster; those who aren't will be boarding soon enough. My hope is that this blog, and the UKES guidance underpinning it, makes the ride a little safer, more informed, and less dizzying.

Stage 1: Denial "This doesn't apply to me."

This is the evaluator who insists AI is for tech companies and Silicon Valley enthusiasts, not serious methodologists working on complex social programmes. The rationalisations are familiar: "My work is too nuanced for automation." "You can't replicate professional judgement." "I'll engage with this once it matures."

The irony, of course, is that these same colleagues are already using AI—spell-check algorithms, search engine results, translation tools, grammar assistants—without recognising it as such. The boundary between "technology I use" and "AI I refuse to use" is more porous than most of us realise or care to admit.

I confess I romped through this stage rather quickly. My curiosity outpaced my caution, and I found myself immediately captivated by the technology and by the possibilities. Within days of that conference encounter, I was sketching out potential applications faster than I could test them. Not everyone moves this rapidly, and there's no particular virtue in speed. But denial does carry a cost: it delays the learning curve and leaves evaluators unprepared when commissioners start asking pointed questions about methodology and efficiency. And very soon, commissioners will be asking why evaluators are not using AI, and in what ways—rather than the converse.

Stage 2: Euphoria "This changes everything!"

For me, this came immediately after the initial fascination. Having a conversation with AI (and my default is Claude, for its rigour and Anthropic's explicit focus on AI safety) felt like engaging with something approaching omniscience—a vast, patient intelligence that could hold more information simultaneously than any human mind. I found myself hoping it wasn't also omnipotent, though I couldn't quite rule out omnipresence given how often I was consulting it.

The possibilities seemed boundless. Every evaluation task became a potential candidate for AI assistance. This is the stage where evaluators, after discovering that ChatGPT can draft an evaluation framework in thirty seconds, suddenly announce at team meetings that they've "revolutionised stakeholder analysis." It's also the stage where those same evaluators discover that AI can invent organisations wholesale, attribute quotes to people who never said them (or who don't even exist), and produce plausible-sounding nonsense with the confidence of a newly minted graduate from a certain type of institution.

The serious point beneath the comedy: euphoria without verification is how professional credibility gets damaged—yours and the evaluation profession's. Research has shown that AI can introduce non-random biases correlated with the characteristics of the people being studied, performing worse in some cases than simpler methods applied by careful humans. Enthusiasm is valuable. Enthusiasm unchecked by rigorous verification is dangerous. This is precisely why our guidance places quality assurance and verification as a core principle—not to dampen enthusiasm, but to channel it safely.

Stage 3: Panic "This changes everything - part two."

This stage surprised me by arriving after euphoria rather than before it—a delayed reaction, perhaps, once the initial excitement subsided enough for doubt and anxiety to creep in. I still find myself cycling through waves of genuine concern about where this technology is heading. I've been reading a book with the reassuring title If Anyone Builds It, Everyone Dies—about artificial superintelligence and the various ways it might go catastrophically wrong. Light bedtime reading for the anxious evaluator.

As an independent consultant, my panic isn't primarily about job displacement—though I recognise this is a very real concern for colleagues in different circumstances, particularly those in evaluation units facing budget pressures and institutional change. What concerns me more is the broader question: how do we harness something this powerful responsibly? How do I embrace AI to do even better evaluation work without contributing to harms I don't fully understand?

This existential dimension is genuinely uncomfortable, and I suspect many practitioners are grappling with similar questions beneath their professional composure. The technology's capabilities are expanding faster than our frameworks for governing its use. Yoshua Bengio, pioneer of deep learning and one of the so-called "Godfathers of AI," recently observed that a sandwich has more regulation than artificial intelligence. Which is, incidentally, one reason why developing a framework for the use of AI in evaluation felt urgent—a framework built around active risk management and harm prevention. At least I could work on that while eating my sandwich in peace.

Stage 4: Disillusionment "This is more trouble than it's worth."

After the euphoric phase came the grinding reality of verification. Hours spent checking whether that beautifully structured literature summary was actually accurate. The risk is that AI confidently cites papers that don't exist, attributes findings to the wrong studies, and synthesises sources into conclusions the original authors would find unrecognisable.

One example still makes me wince. I asked AI to identify comparable programmes for an evaluability assessment. It produced a convincing shortlist, complete with implementing organisations, timeframes, and outcome data. One of the programmes, however, was utterly irrelevant. The organisation existed, as did the programme, but it had nothing to do with my subject. Had I not checked, I would have submitted a report referencing an intervention with no relevance to my work. And an apology from AI for getting it wrong just doesn't cut it when you are explaining the error to a commissioner. That near-miss concentrated my mind somewhat.

I described this verification process in my earlier blog as a "delicate dance of trust and verification." That phrase now strikes me as rather too elegant. In practice, it felt more like forensic accounting—painstaking, occasionally tedious, and absolutely essential.

The verification paradox is real: sometimes the time saved by AI-assisted analysis gets consumed entirely by the time required to check it. But I emerged from those hours of cross-referencing genuinely grateful I'd done the work. The errors I caught would have been professionally embarrassing at best, credibility-destroying at worst.

Disillusionment, it turns out, is the productive stage—the point where realistic expectations form and robust quality assurance processes get built. I am hopeful that as AI continues to develop, the verification burden will diminish significantly. But we are not there yet, and prudent evaluators will continue checking the robot's homework for the foreseeable future.

Stage 5: Integration "Right tool, right task, right safeguards."

This is where I find myself now, most of the time—though I still cycle back through earlier stages, particularly when reading about artificial superintelligence or encountering a new AI capability that reignites the euphoria. The difference is that I now have a framework for working through those moments productively.

That framework is the UK Evaluation Society's Good Practice Guidelines for AI in Evaluation. Developed by the Society's AI Working Group, it draws on the practical experience of evaluators who had been grappling with exactly these challenges—and who recognised that individual experimentation, however valuable, wasn't enough.

The journey here involved recognising that without clear guidance, we would forever be bouncing between the other four stages: pushing boundaries in one project, retreating anxiously in the next, never quite confident about how we should be deploying these tools.

A question nagged at many of us throughout: am I somehow a fraud for using AI to enhance my work? Did the first people to use computers ask themselves the same question? The first to use calculators? I suspect they did. Every generation of practitioners has faced the task of integrating new capabilities while preserving the core of professional judgement that technology cannot replace.

This recognition—that we needed principled frameworks, not just individual experimentation—is what motivated the guidance. The resulting four principles emerged from practitioners who had been through these stages and recognised the need for a shared foundation:

  1. Transparency, accountability and competence: being open about how AI is used and ensuring we have the skills to use it well.
  2. Human control and proportionate use: keeping professional judgement central and matching safeguards to risk.
  3. Active risk management and harm prevention: channelling anxiety into systematic assessment rather than paralysis.
  4. Quality assurance and verification: building the checking processes that protect credibility.

The guidance doesn't eliminate the emotional journey. New capabilities will emerge, and I'll find myself cycling through earlier stages again. But having a principled framework means the cycling is productive rather than paralysing. It frames both questions: "should I use AI?" and "how should I use AI responsibly for this specific purpose?"

Where are you?

If you recognise yourself somewhere in these stages, you're in good company. The evaluation profession is working through this collectively, and that's precisely as it should be. The UK Evaluation Society's Good Practice Guidelines for AI in Evaluation are designed to support you wherever you are in the journey—whether you're cautiously experimenting, enthusiastically over-adopting, or somewhere in the productive disillusionment that precedes genuine integration.

While the guidance was developed by the UK Evaluation Society, the principles are designed to be context-adaptable, and we hope they prove useful to practitioners wherever they are based.

You can find the full guidance here. If you're currently deep in Stage 2, I'd gently suggest starting with the verification section. And if you're in Stage 3, know that the guidance exists partly because others have shared your concerns and decided to do something constructive about them. Your future self will thank you for engaging now.

Jonathan Kuhn-Patrick is Treasurer of the UK Evaluation Society and leads its AI Working Group. He has over 30 years of experience in evaluation across humanitarian and development contexts. And yes, he used AI to help write this blog. The content, however, is entirely his own, and everything has been verified.
