Grading Their Own Homework

The nonprofit sector's evidence base on AI is being produced by the companies selling it.

Jun 03, 2026

The frame arrived ready-made

Last week I asked who is actually better off when the workflow gets faster. Earlier this week I looked at what the dominant report measures and what it cannot. Now the question moves back one more step. Who set the frame in the first place?

In a few months, one set of terms became the way the sector talks about AI: the ninety-two percent who use it, the seven percent who get real impact, the efficiency plateau in between, and the integration that is supposed to close it. All of it now reads as neutral description. It is worth asking who built that frame, because the answer shapes every question that gets asked next.

Before going further, a clarification. I am not against AI. I use it, I enjoy it, and I think it holds real promise for the social sector. That is not the argument. The argument is narrower and easier to miss. The conclusions the sector is absorbing about AI (what works, what success looks like, what to do next) are not independent conclusions. They come from the people selling it.

The report’s publisher sells the cure it prescribes

The 2026 Nonprofit AI Adoption Report is published by Virtuous and Fundraising.AI. Virtuous is a CRM company, and it sells an AI fundraising product, Virtuous Momentum, which the report’s launch material holds up as what integration looks like when it works.

So the report defines the sector’s problem as a failure to integrate, and the thing that fixes a failure to integrate is the integrated product the publisher sells. The seven percent who broke through are, by the report’s own logic, the organizations that did the integration. In many cases they are also the organizations that bought the stack.

This is the part worth slowing down on. The frame does not only describe the sector’s disappointment with AI. It diagnoses it, and the diagnosis points in a single direction. If your AI is underwhelming, the problem is that you have not integrated it deeply enough, and the answer is to integrate it more. The question the first two pieces were built around, whether the thing being accelerated is worth accelerating at all, never comes up. The frame has already converted it into a different question. Not whether this is worth doing, but whether you are doing it well enough.

Blackbaud is the cleanest version of this

Blackbaud runs a research arm, the Blackbaud Institute, which has spent years publishing reports that link digital maturity to fundraising revenue growth. Blackbaud, the company, sells the digital maturity. The research finds that organizations with more integrated technology raise more money, and the company stands ready to sell more integrated technology. The study and the sales sheet are the same document in different formatting.

It is not only Blackbaud. AWS runs free trainings built around case studies of nonprofits using its tools, and Salesforce has folded AI agents into its nonprofit CRM in partnership with OpenAI. The specifics vary. The structure does not. The company that benefits from the conclusion is the company producing the evidence for it.

There is no independent grader, because no one funds one

The obvious response is that someone else should produce the evidence. The trouble is that almost no one does. The infrastructure that would produce it (the academic centers, the intermediaries, the independent evaluation shops) has never been resourced at the scale the question needs, and recent funding cuts have made it thinner rather than stronger.

Even the sector’s most serious equity work runs on the same dependency. Candid’s AI Equity Project found that only 6.9 percent of surveyed nonprofits had internal policies for responsible AI use, and Candid’s own analysis names the reason. Funders keep paying for tools and experimentation rather than governance, policy, or the staff time to learn. They want responsible adoption and they fund the opposite. Mission money inverted, AI edition, the same mechanism I traced through the USAID closure. The dollars that would pay for an independent grader go to the products instead. So the only grader left is the one selling the test prep.

This is not a scandal, it is the default

A scandal is a violation, something that broke a rule and can be exposed and fixed. This broke no rules. No one had to corrupt anything. The only research that gets funded is the research a vendor has a reason to pay for, and the only frame that travels is the one that arrives already written. It is a different kind of problem than a scandal, one with no single bad actor to point to, nothing to expose, and no obvious fix to demand. It looks like information, and it pays for itself, which is why it stays.

The cost lands on people who are not in the room. Nonprofit leaders are making real decisions (what to buy, who to hire, what to cut) on the strength of research designed to sell them software. The communities those organizations serve inherit the decisions. When the evidence base is built by the sellers, the sector loses the ability to tell what works apart from what sells, and it spends scarce money as though the two were the same thing.

The sellers are grading their own homework

Across the three pieces, one mechanism keeps repeating. AI accelerates workflows that were already broken. The acceleration gets counted as impact. And the companies selling the acceleration are the ones producing the evidence that says it worked.

The next piece is about where an honest assessment could happen instead, and what it would take to hold it.

Previous pieces in this series:

Who is Better Off When AI Speeds up the Workflow?

Measuring the Wrong Thing Faster

Forthcoming post in this series:

Three Moves to Tell if Nonprofit AI Works (and for whom)

Anthralytic is a strategy and evaluation studio for mission-driven organizations. If you make decisions about resources in the social sector, whether or not you call yourself an evaluator, this newsletter is for you.