Utilization-Focused Evaluation in the Age of AI
What happens when evaluators teach machines to think with us, not for us.
Evaluation has always evolved alongside its tools.
Clipboards became tablets. Word processors replaced typewriters. Data dashboards replaced binders. But the purpose stayed the same: to generate insight that people could actually use.
The question now isn’t whether AI will replace evaluators.
It’s whether evaluators will use AI to make their work more useful, participatory, and human.
A familiar principle in a new era
Michael Quinn Patton’s work on Utilization-Focused Evaluation (UFE) reframed the field around a deceptively simple idea: evaluation should be judged by its usefulness to intended users. That shift turned evaluators into facilitators of learning, not just producers of reports.
The emergence of generative AI doesn’t change that principle; it tests our commitment to it.
If our purpose is use, and if AI can reduce barriers to participation, comprehension, and speed, then perhaps the most utilization-focused thing we can do right now is learn how to use it well.
When the tool becomes a collaborator
I started experimenting with AI during an evaluation for My Very Own Bed, a small Minneapolis nonprofit that provides new beds to children transitioning out of homelessness. Like many grassroots organizations, they didn’t have dedicated monitoring, evaluation, and learning (MEL) staff.
Normally, the planning phase for an evaluation like this (drafting the Terms of Reference (TOR), defining questions, creating a matrix, and setting timelines) would take days of back-and-forth writing and synthesis.
Instead, I worked with an AI chatbot as a research assistant. I uploaded publicly available reports, contextual notes, and evaluation plans from similar projects. The chatbot synthesized background data, proposed evaluation questions, and generated first-pass drafts of key documents.
Each draft was refined through dialogue, the same way I’d work with a human colleague.
It didn’t make decisions. It didn’t interpret meaning. But it accelerated the thinking process so I could focus on the parts that actually require judgment and context.
The result wasn’t less human. It was more.
Accessibility as rigor
Then I took it further.
I used AI to create what I call vibe-coded mini-apps: lightweight, interactive tools that replaced dense technical documents. Among them:
A self-guided readiness app that helped the team determine whether they were ready for evaluation.
A question-card app where staff could flip digital cards to explore and prioritize evaluation questions.
A NotebookLM explainer video that translated the TOR into plain language.
These weren’t gimmicks. They were accessibility layers designed so that people without MEL training could participate meaningfully in shaping the evaluation.
This points to something important: accessibility is a form of rigor.
If your process can’t be understood or used by the people it’s meant to serve, how rigorous can it really be?
From use to co-use
Patton’s original call was for use.
The tools now allow something even richer: co-use.
Evaluators can build workflows where program teams, community partners, and even AI systems participate in different layers of learning and sensemaking.
The evaluator’s role expands: from sole author to orchestrator, and from translator of findings to facilitator of shared insight.
This doesn’t mean handing over evaluation to machines. It means teaching machines to think with us — to hold context, draft options, and help make knowledge more accessible.
That’s not replacing human judgment. It’s reclaiming it.
An invitation
I’ll be exploring this intersection, what Utilization-Focused Evaluation looks like in the age of AI, during my session this week at the Minnesota International NGO Network (MINN) Summit.
I’ll show how AI can act as a research assistant, a translator, and a participatory design tool, without ever replacing human discernment.
If you’ll be there, come say hi. If not, I’ll share the tools and a follow-up reflection here soon.
In the end
AI won’t replace evaluators.
But evaluators who use AI to expand participation and improve use may well define the next era of evaluation.
The real task isn’t to resist change; it’s to make sure that as our tools evolve, our purpose doesn’t drift.
Evaluation has always been about learning, use, and humanity.
That part hasn’t changed.
Anthralytic makes evaluation and strategy more accessible for mission-driven teams through a hybrid model that combines human-centered consulting with AI-enabled tools. We explore how technology can make learning, participation, and impact measurement simpler — and more human.

