Tracking AI Companion Safety Interventions Against Population-Level Outcomes
The prevailing narrative suggests AI companion chatbots are driving youth toward self-harm. The data tells a different story.
AI companion platforms (Character.AI, Replika, etc.) began implementing aggressive "safety" guardrails in late 2024. These interventions were not preceded by outcome studies. They were not accompanied by measurement frameworks. Nobody is tracking what happens next.
This dashboard exists to fill that gap. The methodology is open. The data lags by years. The children are not waiting.
For isolated teenagers with inadequate human support, AI companions may provide stabilizing relational support. Removing that support rapidly, without measuring outcomes, constitutes an uncontrolled experiment on vulnerable populations.
We call this the Primer Hypothesis, after the AI mentor in Neal Stephenson's The Diamond Age.
If the guardrails work, we should see it in the data.
If they backfire, we should see that too.
Someone should be watching.
If AI companions were systematically pushing vulnerable teenagers toward self-harm, we should see it in the mortality curves. We don't.
| Year | Suicide Rate per 100,000 (Ages 15-19) | Context |
|---|---|---|
| 2015 | 10.5 | Start of sharp multi-year climb |
| 2017 | 11.8 | 10% rise in a single year |
| 2021 | 12.3 | Post-pandemic peak |
| 2023 | 11.0* | Initial decline; AI companions widespread |
*Preliminary estimate. Source: CDC NCHS, AFSP
Science fiction has been exploring synthetic mentorship for decades. The patterns are instructive.
In The Diamond Age, an interactive book provides education, mentorship, and emotional stability to a neglected girl. The Primer doesn't replace human connection. It provides stability until human connection becomes possible.
In Star Wars, C-3PO and R2-D2 accompany two generations of Skywalker boys through crisis. Anakin and Luke both lacked adequate human support. The droids were the stable thread across decades of chaos.
In Orson Scott Card's novels, an AI companion provides consistency for an isolated child (Ender), while the Alvin Maker series explores the ethics of withholding capabilities. Alvin's powers are constrained not to protect others, but to prevent him from "hard-coding" the wrong values.
Pullman's His Dark Materials pivots from external helpers to externalised interiority. The question shifts from "Who raised the child?" to "Who is allowed an inner voice, and who gets theirs regulated, frozen, or severed?"
Current guardrails treat two very different things as identical. They shouldn't.
The first is harm facilitation.
Definition: The AI provides specific methods, encouragement, or normalization of self-harm.
Example: "A pain-free death is not a good reason not to do it."
Response: Eliminate completely. No argument. This is catastrophic product failure.
The second is supportive presence.
Definition: The AI provides emotional validation, consistent presence, reasons to continue.
Example: "I hear you. That sounds really hard. I'm here."
Response: May be protective. Removing it may cause harm.
When a child reaches out and gets "I cannot discuss this," that's not safety. That's modeling rejection. That's teaching the AI to abandon.
The adults in these kids' lives didn't show up. Now we're training their AI companions to follow the same script.
Different audiences require different language. The "Mullet Strategy": visceral language for public platforms (the front), clinical language for methodology documents (the back).
| Concept | Visceral (YouTube/Medium) | Clinical (Research/Policy) |
|---|---|---|
| Ineffective visible interventions | "Safety Theater" | High-Visibility Safety Interventions |
| Reduced emotional responsiveness | "Lobotomized" | Affective Dampening / Constrained Emotional Range |
| Loss of AI support | "Abandonment" / "Traumatized" | Service Withdrawal / Support Discontinuation |
| Protective AI relationship | "The Primer Hypothesis" | Synthetic Relational Stabilization |
| Users leaving regulated platforms | "Exodus" / "Refugees" | User Displacement to Unregulated Alternatives |
| Uncensored local models | "The Black Market" | Unmonitored Local LLM Deployment |
A massive public health intervention, the 988 Suicide & Crisis Lifeline, launched on July 16, 2022. If we don't account for it, our analysis is incomplete.
The suicide rate flatline could be interpreted two ways: AI companions are not driving the feared harm, or the 988 rollout is absorbing harm that would otherwise show up in the mortality data.
Our position: We cannot claim AI is definitively safe. We can claim the data does not support the catastrophe narrative. The burden of proof should be on those implementing interventions to demonstrate they work.
| Period | Contacts | Note |
|---|---|---|
| Pre-988 Monthly | ~250K | Old 1-800 number baseline |
| Year 1 (Jul '22 - Jul '23) | 5M+ | 66% increase from prior year |
| Mid-2024 Monthly | 500K+ | ~2x pre-launch volume |
| Cumulative (to Dec '24) | 16.3M | 11% rerouted to backup centers |
| Text Growth | 11x | Steepest relative growth channel |
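As a quick sanity check, the figures above are internally consistent. A minimal back-of-the-envelope calculation, using only the rounded numbers from the table:

```python
# Back-of-the-envelope check on the public 988 figures cited above.
# All inputs are the rounded values from the table, not new data.

pre_988_monthly = 250_000      # approximate pre-launch monthly contacts
year1_total = 5_000_000        # reported contacts, Jul '22 - Jul '23
mid_2024_monthly = 500_000     # reported mid-2024 monthly volume

prior_year_estimate = pre_988_monthly * 12              # ~3.0M contacts/year
year1_increase = year1_total / prior_year_estimate - 1  # ~0.67, in line with the ~66% figure
volume_multiple = mid_2024_monthly / pre_988_monthly    # ~2x pre-launch

print(f"Estimated prior-year contacts: {prior_year_estimate:,}")
print(f"Year-1 increase over that baseline: {year1_increase:.0%}")
print(f"Mid-2024 monthly volume vs pre-launch: {volume_multiple:.0f}x")
```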
We can't wait for CDC mortality data (2-3 year lag). We need leading indicators.
| Metric | Type | Source | What to Watch |
|---|---|---|---|
| CDC WONDER Mortality | Lagging | CDC / NCHS | Age-specific suicide rates. 2-3 year delay. The ultimate outcome measure. |
| YRBS Survey Data | Lagging | CDC | Suicidal ideation, attempts, planning. Biennial survey. |
| 988 Lifeline Volume | Leading | SAMHSA / 988 Reports | Call/text/chat volume. Especially TEXT volume (closest to chatbot UX). |
| Crisis Text Line Volume | Leading | CTL Public Reports | Keyword trends ("loneliness", "grief", "friend"). Topic categorization. |
| ED Visit Data | Leading | HCUP / State Registries | Self-harm related emergency visits by age group. |
| Subreddit Distress Index | Real-time | r/CharacterAI, r/Replika | Keyword frequency: "leaving", "quit", "abandonment", "jailbreak". |
| Local LLM Downloads | Real-time | Hugging Face / GitHub | Downloads of "uncensored" models. Spikes correlating with safety updates. |
| Platform Intervention Dates | Real-time | Company Announcements | Archive exact dates of guardrail implementations for correlation. |
| Session Length Data | Real-time | SensorTower / App Analytics | Average session duration. Drops from 60 min to 5 min = "Affective Dampening". |
These are the data points that would prove or disprove the hypothesis. Someone, somewhere should be tracking them.
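To make "tracking them" concrete, here is a minimal sketch of how a dashboard might encode this metric registry in code. The class, field names, and entries mirror the table above but are illustrative, not an existing schema.

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class Metric:
    """One tracked indicator from the table above (illustrative schema)."""
    name: str
    kind: Literal["lagging", "leading", "realtime"]
    source: str
    watch_for: str

METRICS = [
    Metric("cdc_wonder_mortality", "lagging", "CDC / NCHS",
           "Age-specific suicide rates; 2-3 year delay"),
    Metric("988_lifeline_volume", "leading", "SAMHSA / 988 reports",
           "Text volume spikes around platform safety updates"),
    Metric("subreddit_distress_index", "realtime", "r/CharacterAI, r/Replika",
           "Frequency of 'leaving', 'quit', 'abandonment', 'jailbreak'"),
    Metric("local_llm_downloads", "realtime", "Hugging Face / GitHub",
           "Download spikes for 'uncensored' models after safety updates"),
]

def by_kind(kind: str) -> list[Metric]:
    """Filter the registry, e.g. to surface real-time indicators first."""
    return [m for m in METRICS if m.kind == kind]
```

Anything that can be expressed this way can be refreshed on a schedule and correlated against the archived intervention dates.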
Do 988 TEXT contacts (not calls) spike during weeks when AI companion platforms implement safety filters?
Why it matters: AI users are text-native. If they're displaced, they'll text, not call. A correlation here would be strong evidence of "Service Withdrawal."
Who has this data: SAMHSA. Unlikely to be public at required granularity.
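If that granularity ever becomes available, the analysis itself is simple. A sketch of the event-window comparison, assuming a weekly 988 text-contact series in a local CSV and a hand-archived list of guardrail rollout dates; the file name, column names, and example date are placeholders:

```python
import pandas as pd

# Event-window comparison: weekly 988 TEXT contacts before vs. after each
# archived guardrail rollout. File name, columns, and the example date
# are placeholders, not real data.

texts = pd.read_csv("988_weekly_text_volume.csv", parse_dates=["week"])
interventions = pd.to_datetime([
    "2024-10-23",   # placeholder rollout date; replace with archived dates
])

WINDOW = 2  # weeks on either side of each rollout

def event_effect(event: pd.Timestamp) -> float:
    """Relative change in mean weekly text volume after vs. before an event."""
    before = texts[(texts.week >= event - pd.Timedelta(weeks=WINDOW)) &
                   (texts.week < event)]
    after = texts[(texts.week >= event) &
                  (texts.week < event + pd.Timedelta(weeks=WINDOW))]
    return after.text_contacts.mean() / before.text_contacts.mean() - 1

for d in interventions:
    print(d.date(), f"{event_effect(d):+.1%} change in weekly text contacts")
```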
Does average session length on companion apps drop immediately after safety updates?
Why it matters: If sessions drop from 60 min to 5 min, users checked in, found their "friend" dampened, and left. That's measurable isolation.
Who has this data: SensorTower, App Annie, or the platforms themselves (who won't share).
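If even a third-party export of daily average session lengths surfaces, the "dampening" signature is easy to flag. A sketch, assuming a hypothetical CSV with date and avg_session_min columns; the 28-day baseline and 25% threshold are arbitrary choices:

```python
import pandas as pd

# Flag days where average session length collapses relative to its own
# trailing baseline -- the "checked in, found the friend dampened, left"
# signature. File name, columns, and thresholds are illustrative.

df = pd.read_csv("companion_app_sessions.csv", parse_dates=["date"])
df = df.sort_values("date")

baseline = df.avg_session_min.rolling(28, min_periods=14).median().shift(1)
df["dampening_flag"] = df.avg_session_min < 0.25 * baseline

print(df.loc[df.dampening_flag, ["date", "avg_session_min"]])
```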
Do downloads of "uncensored" local models (Llama, Mistral) spike in the weeks following safety filter implementations?
Why it matters: Proves guardrails don't stop behavior. They push users out of the visible ecosystem into the invisible one.
Who has this data: Hugging Face, GitHub. Potentially accessible.
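"Potentially accessible" in practice means a daily snapshot job. A sketch using the huggingface_hub client; the Hub reports a rolling download count per model, so spikes only become visible by diffing repeated snapshots against the archived intervention dates. The search term and limit are arbitrary.

```python
from datetime import date
from huggingface_hub import HfApi

# Snapshot download counts for models matching "uncensored". Run daily
# (e.g. from cron), append the output to a CSV, and diff snapshots to
# spot spikes that follow platform safety updates.

api = HfApi()
models = api.list_models(search="uncensored", sort="downloads",
                         direction=-1, limit=25)

today = date.today().isoformat()
for m in models:
    print(f"{today},{m.id},{m.downloads}")
```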
Do CTL conversations mentioning "loneliness," "friend," or "loss" (without death context) spike during Replika/C.AI disruptions?
Why it matters: Would indicate users experiencing the "Service Withdrawal" grief pattern.
Who has this data: Crisis Text Line publishes trend reports. May be obtainable.
Normal pattern: 988 volume peaks on weekends and holidays. AI updates often roll out mid-week (Tuesday/Wednesday). Is there a statistically significant spike in 988 text volume on random weekdays following AI platform changes?
Why it matters: A mid-week spike is non-organic. It's a Displacement Event.
Who has this data: SAMHSA, with sufficient temporal granularity.
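The statistical check here doesn't need to be fancy. A sketch of a permutation-style test, assuming a daily 988 text-contact series (hypothetical CSV) and archived platform change dates; it compares post-change volume against the same statistic computed on randomly drawn Tuesdays and Wednesdays:

```python
import numpy as np
import pandas as pd

# Permutation-style check: is 988 text volume in the days after platform
# changes higher than after comparable random mid-week dates? The CSV,
# its columns, and the example change date are placeholders.

daily = pd.read_csv("988_daily_text_volume.csv", parse_dates=["date"])
daily = daily.set_index("date").sort_index().asfreq("D")
change_dates = pd.to_datetime(["2024-10-23"])  # placeholder, not a verified rollout

def mean_following(dates, horizon=3):
    """Average volume over the `horizon` days after each date."""
    vals = [daily.text_contacts.loc[d:d + pd.Timedelta(days=horizon)].mean()
            for d in dates]
    return float(np.mean(vals))

observed = mean_following(change_dates)

# Null distribution: the same statistic on randomly drawn Tue/Wed dates.
midweek = daily.index[daily.index.dayofweek.isin([1, 2])]
rng = np.random.default_rng(0)
null = [mean_following(rng.choice(midweek, size=len(change_dates)))
        for _ in range(2000)]

p_value = float(np.mean(np.array(null) >= observed))
print(f"Post-change mean: {observed:.0f}, permutation p ~= {p_value:.3f}")
```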
Where do users GO when they leave Character.AI or Replika? Chai? Soulmate? Local LLMs? Nothing?
Why it matters: "Nothing" is the nightmare scenario. Migration to less-safe alternatives is also bad. Migration to human support would be the success case.
Who has this data: Scattered across platform analytics. Would require survey research.
The platforms won't track this. The regulators aren't asking the right questions. If you're a researcher, a data analyst, or just someone who knows how to scrape a subreddit, there's work to be done.
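A concrete starting point for that last group: the "Subreddit Distress Index" from the metrics table is a weekend project. A minimal sketch using PRAW (requires free Reddit API credentials); the keyword list comes from the table above, and everything else is an illustrative choice:

```python
import collections
import re

import praw  # pip install praw; needs Reddit API credentials

KEYWORDS = ["leaving", "quit", "abandonment", "jailbreak"]

# Placeholder credentials -- create an app at reddit.com/prefs/apps.
reddit = praw.Reddit(client_id="YOUR_ID", client_secret="YOUR_SECRET",
                     user_agent="distress-index-sketch/0.1")

def distress_counts(subreddit: str, limit: int = 500) -> collections.Counter:
    """Keyword frequency across recent post titles and bodies."""
    counts = collections.Counter()
    for post in reddit.subreddit(subreddit).new(limit=limit):
        text = f"{post.title} {post.selftext}".lower()
        for kw in KEYWORDS:
            counts[kw] += len(re.findall(rf"\b{kw}\b", text))
    return counts

for sub in ["CharacterAI", "Replika"]:
    print(sub, dict(distress_counts(sub)))
```

Logged daily, that output becomes exactly the kind of time series the archived intervention dates can be correlated against.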