Usefulness as the Gold Standard in Social Impact
For decades, evaluators have heard the mantra: randomized controlled trials (RCTs) are the gold standard. The phrase is meant to signal rigor, credibility, and objectivity. In medicine, RCTs revolutionized how we test drugs and treatments. In social impact, they promised to do the same: offer certainty about what works.
But certainty is not the same as usefulness. Just recently, the Stanford Social Innovation Review published “The Problem with Randomized Controlled Trials,” which argues that RCTs often miss the point: they can be expensive, slow, and disconnected from the actual decisions social sector leaders need to make. Evaluator Julian King also recently shared a thoughtful piece on how we might think about causation differently, drawing on the Bradford Hill criteria. His argument: there’s more than one way to build credible causal claims, and randomization is only one tool in a larger kit. Reading King’s piece was especially timely as I’ve been wrestling with how we talk about rigor in evaluation.
In practice, RCTs can also be ethically fraught—especially when services are withheld in the name of randomization. More importantly, they can flatten context and reduce complex human experiences into narrow variables. None of that makes the findings more useful to the people running programs or making resource decisions.
This matters even more now. With federal budget cuts shrinking the nonprofit sector, organizations are being asked to do more with less. They don’t have the time, money, or bandwidth for evaluations that deliver elegant findings but no actionable insight. Usefulness isn’t a luxury—it’s survival.
I’ve seen the limits of the “gold standard” framing in my own evaluation work. With small grassroots organizations, what’s needed isn’t a controlled experiment; it’s a conversation. A nonprofit may not be asking, “Does this program work better than nothing?” Instead, it’s asking, “How can we serve families more effectively? What’s changing in the lives of the people we reach, and what could we do differently?”
In those cases, the most valuable method is one that meets the organization where it is. That might mean focus groups with parents and kids. It might mean co-creating a theory of change so staff can articulate the “why” behind their work. It might mean building space for reflection rather than pushing for comparison groups. These participatory approaches don’t sacrifice rigor—they redefine it. Rigor becomes about relevance, credibility, and ownership of findings. In other words, rigor becomes about usefulness.
RCTs have their place. But in social impact, usefulness should be our guiding star. The real standard is not whether an evaluation looks airtight to an econometrician—it’s whether the people making decisions and living with the outcomes find the evidence meaningful and actionable.
At Anthralytic, we believe the real gold standard isn’t a single method—it’s usefulness. That means meeting organizations where they are, choosing designs that fit the purpose, and ensuring evidence leads to better decisions.

