3 of 5: How People Break AI in Social Impact

Gaming, Bias, and the Human Element

Aug 06, 2025

Welcome to Part 3 of a deep dive series on AI security for social impact. In Part 1, we explored data vulnerabilities—cybersecurity, risks, and safeguards. Part 2 covered infrastructure challenges, particularly in low-resource environments. This week, we're diving into the human side of AI in social impact: why the biggest threats often come from inside your own organization—from the people who are truly trying to help. Resources will be linked at the end.

Coming up: Part 4 will tackle the regulatory minefield of privacy, sovereignty, and surveillance. Part 5 will synthesize everything into a practical framework for building truly safe and equitable AI systems.

The Human Side of AI Failure

Picture this: You’ve just deployed an AI system to optimize food distribution in refugee camps. The model is impressive: it can predict demand, minimize waste, and even account for cultural food preferences. But within three weeks, local administrators have figured out how to game the system. They inflate numbers and manipulate data inputs until the system no longer works as intended, essentially breaking the entire program.

It isn't malicious. These are dedicated aid workers trying to get more resources for hungry people. But the result is the same: a sophisticated AI system that makes things worse instead of better.

This scenario isn't purely hypothetical—it's playing out in programs around the world. And it represents a fundamental challenge that most organizations aren't prepared for.

Hidden Fault Lines in Social Impact AI

Here’s what’s becoming clear as AI proliferates in social impact work: many systems fail. Not because the technology is bad, but because AI reliability in social impact work depends on navigating complex human, contextual, and governance factors.

Predictive policing systems in U.S. cities have amplified racial bias. Education algorithms have consistently underperformed for rural students.

These aren’t edge cases. They’re the norm.

When we talk about AI security in social impact, we’re not just worried about hackers (though that’s part of it). We’re dealing with a whole ecosystem of ways things can go wrong. Here are some of the most common and acute:

Evolving Competency and Knowledge Requirements
Gaming and Manipulation
Context Collapse
Baked-in Bias that Compounds
The Fairness–Security Trade‑Off
Value Misalignment

Let’s dive into each of these problems.

The Risks

1. Evolving Competency Requirements

In a rural health evaluation in India, an AI‑enabled tool used GPS data from mobile health apps to map care‑seeking patterns. On paper, it was a breakthrough—pinpointing which households were at highest risk for waterborne diseases and where interventions could hit hardest.

Thankfully, the team did a risk assessment and realized that the “anonymous” GPS coordinates weren’t really anonymous at all. In small villages, you could identify a home and, with a glance at local records or a chat with a neighbor, know exactly who lived there. Along with their precise location came their health status, potentially exposing families to stigma or even retaliation.

Because the issue was caught early, the evaluators worked with implementers to make fast fixes: data was aggregated to the neighborhood level, location precision was reduced, and stricter access controls were put in place. The AI still guided the program, but now without putting anyone in the crosshairs.
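
To make that fix concrete, here is a minimal Python sketch of the two location safeguards described above: reducing coordinate precision and aggregating to an area level, with small groups suppressed. The function names, the two-decimal rounding (roughly a 1 km grid cell), the minimum count of five, and the sample coordinates are all illustrative assumptions, not details from the evaluation itself.

```python
from collections import Counter

def coarsen(lat, lon, decimals=2):
    """Round coordinates so a point maps to a grid cell, not a single house.

    Two decimal places of a degree is roughly 1 km (an illustrative choice).
    """
    return (round(lat, decimals), round(lon, decimals))

def aggregate_cases(records, decimals=2, min_count=5):
    """Count positive cases per coarse cell and drop cells below min_count."""
    counts = Counter(
        coarsen(r["lat"], r["lon"], decimals) for r in records if r["case"]
    )
    # Small-cell suppression: tiny groups are easy to re-identify.
    return {cell: n for cell, n in counts.items() if n >= min_count}

# Made-up records for illustration only (not real locations or cases).
records = [
    {"lat": 10.1171, "lon": 20.3362, "case": True},
    {"lat": 10.1202, "lon": 20.3389, "case": True},
    {"lat": 10.1223, "lon": 20.3411, "case": True},
    {"lat": 10.1166, "lon": 20.3434, "case": True},
    {"lat": 10.1239, "lon": 20.3357, "case": True},
    {"lat": 10.1188, "lon": 20.3401, "case": False},
    {"lat": 10.2012, "lon": 20.4023, "case": True},
    {"lat": 10.2033, "lon": 20.4041, "case": True},
]

safe_counts = aggregate_cases(records)
print(safe_counts)
```

Rounding alone does not guarantee anonymity. In sparsely populated areas a coarse cell can still contain a single household, which is why the sketch also drops cells with very few cases.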

As AI becomes embedded in evaluation practice, evaluators need a new skill set that blends technical literacy with social and ethical judgment:

Data governance literacy – Understanding data provenance, ownership, licensing, and sovereignty; knowing who collects the data, under what incentives, and with which inclusion or exclusion criteria.
Ethical risk assessment – Identifying potential harms early, from privacy breaches to unintended discrimination, and integrating mitigation strategies from the start.
Collaborative interpretation – Involving stakeholders—especially those most affected by the program—in interpreting AI outputs, validating findings, and providing contextual nuance.
Algorithmic transparency and accountability – Interrogating model design, data pipelines, and decision rules; advocating for explainability in high‑stakes contexts.
A new era of systems thinking – Understanding how AI tools interact with program ecosystems, feedback loops, and power structures, and anticipating second‑order effects.
Change management and capacity building – Helping organizations adapt workflows, roles, and decision‑making processes to integrate AI in ways that align with mission and values.

These competencies are not “nice to have”—they’re essential for ensuring AI‑supported evaluations remain credible, equitable, and contextually grounded. Without them, even technically accurate models can cause harm or erode trust.

2. Gaming and Manipulation

Every AI system creates incentives. And sometimes people respond to incentives in ways you could never expect.

Take the Netherlands’ childcare benefits fraud detection system. It was designed to identify families incorrectly receiving benefits. Instead, it systematically flagged families from immigrant backgrounds, creating a massive scandal that brought down the government.

In AI and evaluation theory, this phenomenon is explained by Goodhart’s Law: When a measure becomes a target, it ceases to be a good measure. In AI research, it’s also called reward hacking or specification gaming (Medium)—when a system learns to exploit the metric it’s given rather than achieving the real-world outcome it was designed for.

Some of the biggest risks in social impact AI don’t come from hackers—they come from people trying to do good. Program administrators inflate numbers to secure more funding. Beneficiaries tweak answers to meet eligibility criteria. Local officials polish the data to make their programs look successful.

Have I done it? Yeah.

Why? Because it works—at least in the short term.

Programs are designed at the macro scale but live at the micro scale, with real people you know and talk to. The paradox is that what looks good for the big picture can hurt the ground reality, and what works locally can undermine the national strategy. That gap invites gaming, and gaming is costly: recent research shows that prediction models for social service allocation can lose 40–60% of their accuracy when people start gaming the system. And they will, because sometimes that’s the only way to get help where it’s needed.
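
A toy Python sketch shows the mechanics. Assume a hypothetical rule that makes a household eligible at five or more members, and assume households just under the cut-off report one extra person. The rule, the numbers, and the function names are invented for illustration. The point is only how quickly decisions drift from reality, and how reports pile up right at the threshold, a pattern auditors can look for.

```python
def eligible(household_size, threshold=5):
    """Hypothetical rule: eligible at `threshold` members or more."""
    return household_size >= threshold

def decision_accuracy(reported, true_sizes, threshold=5):
    """Share of households whose decision on reported data matches the
    decision the rule would make on the true data."""
    matches = sum(
        eligible(r, threshold) == eligible(t, threshold)
        for r, t in zip(reported, true_sizes)
    )
    return matches / len(true_sizes)

def bunching(reported, threshold=5):
    """Counts just below and exactly at the cut-off. A pile-up at the
    threshold with a gap just below it is a warning sign of gaming."""
    just_below = sum(1 for r in reported if r == threshold - 1)
    at_threshold = sum(1 for r in reported if r == threshold)
    return just_below, at_threshold

# Made-up household sizes for illustration only.
true_sizes = [2, 3, 4, 4, 5, 6, 3, 7, 4, 2]
# Assumed gaming behavior: households one member short report one extra.
gamed = [s + 1 if s == 4 else s for s in true_sizes]

print("honest accuracy:", decision_accuracy(true_sizes, true_sizes))
print("gamed accuracy:", decision_accuracy(gamed, true_sizes))
print("bunching (honest):", bunching(true_sizes))
print("bunching (gamed):", bunching(gamed))
```

In a real program the true sizes are unknown, so a check like `bunching` would be paired with spot audits of a small sample rather than a full comparison.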

3. Context Collapse

Social systems are incredibly context-dependent. What works in one place, at one time, with one population might fail completely somewhere else.

Social impact AI has to work across incredibly diverse contexts—different cultures, languages, economic conditions, legal frameworks. A system that works perfectly in urban Kenya might fail catastrophically in rural Kenya, let alone rural Bangladesh. But AI systems are built to generalize, which means they often miss the nuances that make or break impact.

This pattern shows up everywhere. In Oregon, a mobile health clinic model that operated efficiently for stable rural populations struggled to meet demand spikes among migrant farmworkers during harvest season, when static resource allocation systems failed to account for seasonal migration patterns. An algorithm that accurately matched middle‑class students to college programs seriously underestimated the potential of low‑income and minority students, leading to more missed opportunities for those groups. An online job-matching platform built for formal, salaried markets in developed countries performed well there—but fractured when rolled out in Sub-Saharan Africa, where informal work dominates and most job seekers fell outside its formal-sector algorithm design, resulting in few relevant matches. All of us working in social impact have these stories.
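
One simple way to see context collapse in your own data is to report accuracy separately for each context rather than as a single average. The sketch below is a minimal Python illustration. The group labels and predictions are made up, and `accuracy_by_group` is a hypothetical helper, not part of any particular library.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return accuracy computed separately for each group label."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    return {group: hits[group] / totals[group] for group in totals}

def overall_accuracy(y_true, y_pred):
    """Single headline accuracy across all records."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up labels and predictions for illustration only.
groups = ["urban"] * 6 + ["rural"] * 4
y_true = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 0, 1]

print("overall:", overall_accuracy(y_true, y_pred))
print("by group:", accuracy_by_group(y_true, y_pred, groups))
```

In this toy data the headline number looks respectable while the rural group does no better than a coin flip, which is exactly the kind of gap a single average hides.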

4. Baked-in Bias that Compounds

All models learn patterns from human‑created data — and human‑created data always reflects the world as it is, not as it should be. When you train a model on historical records of who received services, who succeeded in programs, or who was labeled “high‑risk,” you are encoding decades (or even centuries) of structural inequality into the model from day one.

This isn’t just about xAI’s Grok, which was intentionally trained to reflect the opinions of one man, Elon Musk. All AI systems are intrinsically biased, just usually at a more macro or systemic level.

And that initial bias doesn’t just sit there — it often snowballs. Multiple studies have found that AI systems amplify existing disparities, sometimes by 2-5 times, compared to human decision‑makers. Over time, these effects are reinforced through feedback loops: biased predictions shape real‑world decisions, those decisions generate new biased data, and the model retrains on that data — locking in and deepening inequities.
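
The feedback loop can be sketched with a deliberately stylized Python simulation. It assumes two districts with identical true need, a system that sends all outreach visits to whichever district has more recorded cases, and case records that only grow where visits happen. Those rules and every number here are assumptions chosen to make the mechanism visible, not a model of any real program.

```python
def simulate_feedback(recorded, true_rate, rounds, visits_per_round=10):
    """Simulate a winner-take-all allocation loop.

    Each round, every visit goes to the district with the most recorded
    cases, and new cases are only recorded where visits take place.
    """
    recorded = list(recorded)
    history = [tuple(recorded)]
    for _ in range(rounds):
        # The "model" simply ranks districts by past recorded cases.
        target = max(range(len(recorded)), key=lambda i: recorded[i])
        # Cases found depend on true need, but only the target is visited.
        recorded[target] += int(visits_per_round * true_rate[target])
        history.append(tuple(recorded))
    return history

def share_of_first(counts):
    """Fraction of all recorded cases attributed to the first district."""
    return counts[0] / sum(counts)

# Equal true need, and a tiny initial difference in the historical record.
history = simulate_feedback(recorded=[11, 10], true_rate=[0.5, 0.5], rounds=3)
for counts in history:
    print(counts, round(share_of_first(counts), 2))
```

Even though both districts have the same underlying need, the small head start in the records decides where every visit goes, and the data the system retrains on keeps widening the gap.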

5. The Fairness–Security Trade‑Off

Improving one dimension of AI performance can come at the cost of another. Research has shown that models tuned for fairness—especially demographic parity or equalized odds—can become more susceptible to adversarial manipulation. This is because fairness interventions often require the model to process and weight sensitive attributes, which adversaries can then exploit.

Conversely, methods designed to harden models against attacks, like adversarial training, can disproportionately degrade accuracy for underrepresented groups. When models are trained to be robust to input perturbations, they often optimize toward majority group patterns, amplifying error rates for minority groups.

The result is a persistent tension: interventions that improve fairness can weaken security, and interventions that improve security can worsen fairness. Addressing this trade‑off requires not only technical balancing but also policy, governance, and stakeholder oversight to decide which risks are most acceptable in a given social impact context.
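
For readers who have not met these fairness terms, the following Python sketch shows one common way to compute them on toy data. Demographic parity compares how often each group receives a positive prediction; equalized odds compares true positive and false positive rates across groups. The helper names and the data are hypothetical, and the sketch only illustrates the definitions, not the security trade-off itself.

```python
def rate(values):
    """Mean of a list of 0/1 values, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0

def demographic_parity_gap(y_pred, groups, a, b):
    """Absolute difference in positive-prediction rates between groups."""
    rate_a = rate([p for p, g in zip(y_pred, groups) if g == a])
    rate_b = rate([p for p, g in zip(y_pred, groups) if g == b])
    return abs(rate_a - rate_b)

def equalized_odds_gap(y_true, y_pred, groups, a, b):
    """Larger of the true-positive-rate and false-positive-rate gaps."""
    def tpr_fpr(group):
        rows = [(t, p) for t, p, g in zip(y_true, y_pred, groups) if g == group]
        tpr = rate([p for t, p in rows if t == 1])
        fpr = rate([p for t, p in rows if t == 0])
        return tpr, fpr

    tpr_a, fpr_a = tpr_fpr(a)
    tpr_b, fpr_b = tpr_fpr(b)
    return max(abs(tpr_a - tpr_b), abs(fpr_a - fpr_b))

# Made-up labels and predictions for two groups, for illustration only.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

print("demographic parity gap:", demographic_parity_gap(y_pred, groups, "A", "B"))
print("equalized odds gap:", equalized_odds_gap(y_true, y_pred, groups, "A", "B"))
```

A gap of zero on either measure means the two groups are treated alike by that definition. Which definition matters, and how much gap is tolerable, is the governance question the section describes.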

6. Value Misalignment

When AI systems are designed to chase certain metrics, like graduation rates, they invite gaming: optimizing what’s measured, not what matters. The result is systems that accelerate success on paper while failing their true social missions, for example by producing graduates who read at a fourth-grade level.

In Artificial Intelligence and Evaluation (which I highly recommend reading if you are interested in this topic), Nielsen, Mazzeo Rinaldi, and Petersson describe a concept they call the “metric translation gap”—the disconnect between program goals rooted in lived experience and the proxy metrics embedded in models that drive decisions. When system builders adopt indicators without adapting them to context, AI can end up producing perverse outcomes.

Real-world example:
Many U.S. colleges adopted predictive analytics to boost graduation rates by identifying and nudging students deemed “at risk.” While graduation numbers at institutions like Georgia State University surged, critics—like those featured in a Hechinger Report investigation—warned that the approach tended to reinforce racial and socioeconomic inequities, funneling low-income and Black students into easier majors where success was most likely, rather than helping them access higher-value pathways. What the model achieved at scale conflicted with what mattered most: equitable opportunity.

This shows how metric-focused tools—even when well-intentioned—can create a performance paradox: they succeed statistically while undermining broader equity and agency goals.

Rewiring Human Brains for AI‑Enabled Impact

It’s easy to feel overwhelmed, as if it’s all moving too fast to keep up. But humanitarians, development practitioners, and evaluators can’t afford to be Luddites, because real people are at risk in many different ways.

AI is no longer an optional add‑on. Right now, only a small group of evaluators are directly working on AI projects. Soon, it will be in the background of almost every program, policy, and evaluation you touch—often without being explicitly labeled as “AI.” We can’t ignore it.

Below is a list of what organizations and individual practitioners need to focus on if they want AI systems for social programs and policies to be both powerful and safe, based on research, field experience, and lessons from evaluation practice. Most of these things aren’t new: they’re long‑standing best practices in evaluation, program design, and adaptive management. What’s new is that with big data and AI, the playing field is far larger and more complex.

We need to be paying attention, adapting our skills, learning about AI, and applying our best practices with this technology in mind. That AI‑specific lens is what makes the first point—organizational capacity building—non‑negotiable.

This is the most important one:

Get the People AI‑Ready (the keystone)
Everything else depends on this one; without it, the rest crumbles. AI changes how work gets done: who makes decisions, how evidence is gathered, what gets measured, and what gets ignored. Teams need both technical skills and contextual judgment. If we don’t know how to use AI, we can’t know how it will break or how to fix it. Evaluators and social impact professionals must understand model design, data flows, and system limitations alongside the ethical, governance, and contextual implications. That means being able to question assumptions, interpret outputs critically, identify risks early, and adapt processes on the fly.
After that, check these:
Check the Ground Before You Launch
Before rolling out an AI‑enabled program, conduct a social context risk and readiness assessment. Identify how the system could be gamed or misused in the local context. In refugee food distribution, for example, eligibility‑scoring AI can be manipulated by inflated household counts. Map incentive structures, test for adversarial inputs, and verify that datasets are representative, up‑to‑date, and compliant with local data protection laws.
Test It Everywhere, Not Just in the Lab
Conduct distributional robustness testing to ensure your AI performs across populations, geographies, and time periods, not just on the training dataset. An AI that predicts school dropouts may work in urban contexts but fail in rural areas with different dropout drivers. Monitor for performance drift and out‑of‑distribution errors, especially after major disruptions (e.g., climate events, policy shifts).
Keep Humans in the Driver’s Seat
Implement expert‑in‑the‑loop decision‑making for high‑stakes use cases. AI detecting child protection risks should never operate without human review. Combine automated scoring or anomaly detection with qualitative assessments from staff on the ground. Use confidence thresholds and escalation protocols to decide when human overrides are triggered. The COMPAS risk assessment tool in U.S. courts shows the dangers of relying on opaque models without transparent oversight.
Bake Ethics In from Day One
Apply algorithmic fairness and explainability principles from the start, not as retrofits. If your AI allocates healthcare resources, embed fairness audits during model development, use model interpretability tools post‑deployment, and train staff to recognize ethical red flags. Bias amplification has been documented in multiple contexts, from Amazon’s scrapped hiring algorithm that penalized female candidates to facial recognition tools with higher error rates for darker‑skinned individuals.
Have a Backup Plan for When It Breaks
Use resilience engineering and fallback protocols to plan for AI failures. If the satellite‑based crop‑yield predictor fails, what’s the manual process? Maintain a rapid‑response team for debugging, plus incident reporting frameworks to document and share lessons learned. The collapse of Kenya’s biometric voter registration system in 2017 is a reminder of how fast tech failures can become governance crises.

If you don’t know what these mean, see the first practice above. In the very near future, we are all going to need to be literate in AI, just as we need to be literate in using the internet. That means we need to start learning now.
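
As one concrete illustration of the practices above, here is a minimal Python sketch of confidence thresholds, escalation to human review, and a manual fallback. The 0.9 threshold, the route names, and the `route_case` function are assumptions for illustration; real thresholds and escalation rules should come from your own risk assessment.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    route: str   # "auto", "human_review", or "manual_fallback"
    reason: str

def route_case(confidence: Optional[float], high_stakes: bool,
               model_available: bool = True,
               min_confidence: float = 0.9) -> Decision:
    """Decide who handles a case: the model, a human reviewer, or the
    documented manual process."""
    # Fallback protocol: if the system is down, use the manual process.
    if not model_available or confidence is None:
        return Decision("manual_fallback", "model output unavailable")
    # High-stakes cases always get human review, whatever the confidence.
    if high_stakes:
        return Decision("human_review", "high-stakes case")
    # Escalate low-confidence predictions instead of acting on them.
    if confidence < min_confidence:
        return Decision("human_review", "confidence below threshold")
    return Decision("auto", "confident, low-stakes prediction")

# Hypothetical cases for illustration only.
print(route_case(0.95, high_stakes=False))
print(route_case(0.60, high_stakes=False))
print(route_case(0.99, high_stakes=True))
print(route_case(None, high_stakes=False, model_available=False))
```

The ordering of the checks is the design choice that matters here: availability first, then stakes, then confidence, so that a very confident model can never bypass review on a high-stakes case.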

Conclusion

If there’s one takeaway here, it’s that the human element in AI‑enabled social impact work is never a side note—it’s the main story.

Strong evaluation practice, paired with competencies like data governance literacy and ethical risk assessment, can catch these risks before they cause harm, especially when evaluators act early: asking hard questions, bringing diverse voices into the process, anticipating second‑order effects, and really understanding the technology. AI will soon be everywhere; the capacity to engage with it wisely is what will decide whether it helps or harms.

Stay tuned for the next installment in this series on regulatory confusion. We will discuss who owns the data, who sets the rules, and what gets lost in translation.

Anthralytic is a woman‑owned consultancy specializing in human‑centered evaluation, AI strategy, and social impact measurement. We combine rigorous methodology with practical, context‑driven insight, partnering with organizations worldwide to ensure that technology serves people—not the other way around. Learn more at anthralytic.ai.

Resource Links

Steffen Bohni Nielsen, Francesco Mazzeo Rinaldi, and Gustav Jakob Petersson, Artificial Intelligence and Evaluation: Emerging Technologies and Their Implications for Evaluation

Teacher Evaluations

Teacher evaluation systems & “teaching to the test” – narrowing curriculum and cheating: National Bureau of Economic Research

Predictive Policing Systems

Machine bias in software that predicts criminal activity: ProPublica – Machine Bias

Education Algorithms

UK A‑levels grading algorithm failure: MIT Technology Review

Evolving Competencies

Ethics and equity in evaluation design: OECD – AI Principles

Gaming and Manipulation

Netherlands’ childcare benefits scandal – systemic targeting of immigrant families: Politico – Dutch scandal serves as a warning for Europe over algorithm risks
Goodhart’s Law: Center for Naval Analyses – Goodhart’s Law: Recognizing and Mitigating the Manipulation of Measures in Analysis
Prediction model accuracy drops 40–60% when gamed: arXiv – Predictive models and gaming impacts

Context Collapse

Mobile health clinic mismatch: BMC Health Services Research – Mobile clinics meeting rural needs
Job-matching platform failures in informal economies: World Bank Blog – Digital platforms and job matching in Africa
Context collapse in social media: Social Media + Society – Time Collapse in Social Media: Extending the Context Collapse

Bias

Bias baked into training data – Amazon hiring algorithm example: Reuters – Amazon scraps AI recruiting tool
Bias amplification – AI systems worsening inequality: Amnesty International – Xenophobic machines

Integrating Ethics and Equity from the Outset (OECD AI Principles)

Foundational principles for trustworthy AI: OECD – OECD AI Principles
Applying equity from inception: OECD – Ensuring Equitable AI – The Role of Education

Plan for Failure and Adaptation (Adaptive Management)

Kenya biometric voter registration failure: BBC – Kenya’s biometric system collapses

