Measuring the Wrong Thing Faster
AI is accelerating nonprofit work. The dominant report mistakes the acceleration for impact.
The gap the report names is not the real one
Last week I asked who is actually better off when the workflow gets faster. This week the question is narrower and harder. When the field tries to assess whether AI is working at all, what exactly is it measuring?
The headline gap sits between adoption and impact. Ninety-two percent of nonprofits use AI. Only seven percent report a major gain in what their organization can actually do. The report names this the efficiency plateau and reads it as a problem of integration, a sector that picked up the tool without learning to use it well. The real gap is somewhere else. It sits between what the report measures and what would tell us the sector is working.
The numbers are real. The definition underneath them is a choice.
The 2026 Nonprofit AI Adoption Report surveyed 346 nonprofits. Ninety-two percent adoption. Seven percent reporting major improvement in organizational capability. The report calls the space between them the efficiency plateau, and the coverage settled into one explanation. Nonprofits are using AI but not capturing transformation. The fix is better strategy, better governance, better workflows.
That explanation quietly accepts a definition of impact. The sector should not accept it without naming it first.
The report counts velocity and calls it impact
Read the categories carefully. Major improvement in organizational capability is framed around fundraising efficiency, donor segmentation, personalized outreach, the capacity to manage a larger portfolio, the hours saved on a campaign. The report’s own exemplar is a development director who set aside six hours for an email campaign and finished it in twenty minutes.
That is the picture of impact the report offers.
It is a real efficiency gain. It is not a measure of whether the relationship between the organization and the donor grew stronger, whether the fundraising program became more sustainable, whether the staff member is less burned out, or whether the mission moved at all. The thing being measured is velocity in transactional fundraising. AI is very good at that.
When the measure becomes the target, it stops measuring
I have written before about Goodhart’s law, the observation that when a measure becomes a target, it stops being a good measure. The sector spent thirty years building its incentive architecture around fundraising velocity. AI is now arriving inside that architecture and being judged against the targets that architecture set.
So the seven percent are not the organizations that figured out AI. They are the organizations that bought the integrated stack and tuned it to the targets they already had. The report measures their success against the same metrics the sector was already optimizing for.
The efficiency plateau is not a plateau. It is the report reaching the edge of its own measurement. There is nothing left to find beyond the metric it chose to look for. This is Goodhart playing out in real time, in a report that is shaping how the whole sector talks about AI, and no one is naming it.
The questions that would matter are not in the frame
A real assessment would ask a different set of questions. Whether the depth of donor relationships changed. Whether gift officers still remember the names of donors’ children, or have quietly handed that to a CRM field. Whether the fundraising program is more sustainable, in the sense that it could survive a major funder walking away, or whether AI personalization has deepened single-source dependency by raising the cost of reaching any donor not already in the system.
It would ask what happened to staff sustainability. What happened to the relationship between the organization and the communities it serves, as those communities started receiving AI-generated outreach. What happened to the work that never appears in a dashboard at all, the unscheduled phone call, the showing up at the funeral, the slow accumulation of trust.
None of this is in the report. None of it is in the coverage. It is not in the frame.
The comparison the report makes should not pass quietly
The report puts one staffer occasionally using a free chatbot in the same category as organizations running paid platforms, cross-functional teams, and integrated workflows. It calls both of them adoption, and then asks why one shows impact and the other does not.
The gap between ninety-two percent and seven percent may be less an integration story than an artifact of that collapsed category. The seven percent are, by definition, the customers who bought the integrated stack. The ninety-three percent are the rest of the sector. Naming the distance between them the efficiency plateau obscures the simpler fact that the plateau is exactly what the report was built to measure.
I am not standing outside this
I have worked within systems that counted outputs as if they were outcomes, that treated touchpoints as engagement, that measured whatever the funder asked for and reported it back as impact. I did not always see the frame I was working inside. More often than I would like, I did not think to question it.
This is the same move at sector scale. The report measures what its publisher’s product accelerates and calls the acceleration impact. The complicity I am naming is not Virtuous’s. It is the field’s. It is mine.
Indigenous evaluation asks the question the report skips
Nicole Bowman’s work treats evaluation as a relational act, carried out with communities as relatives rather than performed on them as subjects. An honest assessment of what AI is doing to the nonprofit sector would start by asking the people the sector exists to serve whether anything changed for them. The report did not ask. Most of the coverage did not either.
That is the gap.
The report is failing the ninety-three percent
The ninety-three percent are not failing the report. The report is failing them.
Measuring the wrong thing faster does not turn into impact once the measurement clears a threshold. It turns into the thing the field begins to mistake for impact, because the measurement is the part of the work the field can actually see.
Anthralytic is a strategy and evaluation studio for mission-driven organizations. If you make decisions about resources in the social sector, whether or not you call yourself an evaluator, this newsletter is for you.
Forthcoming posts in this series:
Three Moves to Tell if Nonprofit AI Works (and for whom)


