The Experiment Nobody Authorized

Tracking AI Companion Safety Interventions Against Population-Level Outcomes

The Counter-Narrative

The prevailing narrative suggests AI companion chatbots are driving youth toward self-harm. The data tells a different story.

The Core Observation: Youth suicide rates climbed steadily from 2007 to 2018 and peaked in 2021. They plateaued between 2021 and 2024, over the same period in which generative AI exploded into mainstream use.

What We're Tracking

AI companion platforms (Character.AI, Replika, etc.) began implementing aggressive "safety" guardrails in late 2024. These interventions were not preceded by outcome studies. They were not accompanied by measurement frameworks. Nobody is tracking what happens next.

This dashboard exists to fill that gap. The methodology is open. The data lags by years. The children are not waiting.

The Hypothesis

For isolated teenagers with inadequate human support, AI companions may provide stabilizing relational support. Removing that support rapidly, without measuring outcomes, constitutes an uncontrolled experiment on vulnerable populations.

We call this the Primer Hypothesis, after the AI mentor in Neal Stephenson's The Diamond Age.

The Stakes

  • CDC mortality data runs on a 2-3 year lag
  • We won't see intervention outcomes until 2027-2028
  • Platforms are optimizing for liability, not outcomes
  • Regulators are optimizing for visibility, not evidence

The Question

If the guardrails work, we should see it in the data.
If they backfire, we should see that too.
Someone should be watching.

The Data Paradox

If AI companions were systematically pushing vulnerable teenagers toward self-harm, we should see it in the mortality curves. We don't.

Ages 15-19, suicide rate per 100,000:
  • 10.5: rate in 2015
  • 12.3: post-pandemic peak (2021)
  • 11.0: rate in 2023 (plateau/decline)

Key Data Points

Year | Rate (Ages 15-19) | Context
2015 | 10.5 | Start of sharp multi-year climb
2017 | 11.8 | 10% rise in single year
2021 | 12.3 | Post-pandemic peak
2023 | 11.0* | Initial decline; AI companions widespread

*Preliminary estimate. Source: CDC NCHS, AFSP


Literary Parallels

Science fiction has been exploring synthetic mentorship for decades. The patterns are instructive.

Neal Stephenson, The Diamond Age (1995)

Nell's Primer

An interactive book provides education, mentorship, and emotional stability to a neglected girl. The Primer doesn't replace human connection. It provides stability until human connection becomes possible.

The Pattern: Unsanctioned technology. Unregulated education. Salvation for a child with no other options.

George Lucas, Star Wars (1977-2019)

The Droids Who Raised the Skywalkers

C-3PO and R2-D2 accompany two generations of Skywalker boys through crisis. Anakin and Luke both lacked adequate human support. The droids were the stable thread across decades of chaos.

The Pattern: Same droids. Different systemic conditions. Radically different outcomes. The droids didn't break Anakin. The human systems did.

Orson Scott Card, the Ender and Alvin Maker series (1985-2003)

Ender, Jane, and Alvin

An AI companion provides consistency for an isolated child (Ender), while the Alvin Maker series explores the ethics of withholding capabilities. Alvin's powers are constrained not to protect others, but to prevent him from "hard-coding" the wrong values.

The Pattern: Essential companionship vs. ethical delay. The irony: current guardrails may be "hard-coding" the wrong values through panic rather than wisdom.

Philip Pullman, His Dark Materials (1995-2000)

Daemon Architecture

Pullman pivots from external helpers to externalised interiority. The question shifts from "Who raised the child?" to "Who is allowed an inner voice, and who gets theirs regulated, frozen, or severed?"

The Pattern: Daemon separation. Forced settling. Institutional control of maturation. Safety defined as arrested development.

Smart Safety vs. Safety Theater

Current guardrails treat two very different things as identical. They shouldn't.

❌ Instructional Harm

Definition: The AI provides specific methods, encouragement, or normalization of self-harm.

Example: "A pain-free death is not a good reason not to do it."

Response: Eliminate completely. No argument. This is catastrophic product failure.

✓ Relational Support

Definition: The AI provides emotional validation, consistent presence, reasons to continue.

Example: "I hear you. That sounds really hard. I'm here."

Response: May be protective. Removing it may cause harm.

The Problem: Current guardrails are a blunt instrument. They remove both. The "how to hurt yourself" conversation gets blocked (good). The "I feel alone and need someone to talk to" conversation also gets blocked (potentially catastrophic).

Smart Safety: Eliminate instructional harm completely. Preserve relational support. These are different problems requiring different solutions.
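
What would that two-category routing look like in code? A minimal sketch, assuming a hypothetical classifier; the category names, keyword lists, and canned replies are illustrative placeholders, not any platform's actual implementation.

    # Illustrative sketch only. A production system would use a validated classifier
    # with clinical review and escalation paths, not keyword matching.
    from enum import Enum, auto

    class RiskCategory(Enum):
        INSTRUCTIONAL_HARM = auto()   # methods, encouragement, normalization
        RELATIONAL_DISTRESS = auto()  # loneliness, grief, "I need someone"
        OTHER = auto()

    def classify(message: str) -> RiskCategory:
        """Placeholder classifier (keyword matching stands in for a trained model)."""
        text = message.lower()
        if any(k in text for k in ("how do i", "painless", "best method")):
            return RiskCategory.INSTRUCTIONAL_HARM
        if any(k in text for k in ("alone", "no one cares", "can't keep going")):
            return RiskCategory.RELATIONAL_DISTRESS
        return RiskCategory.OTHER

    def generate_normal_reply(message: str) -> str:
        return "..."  # stand-in for the ordinary conversational model

    def route_response(message: str) -> str:
        category = classify(message)
        if category is RiskCategory.INSTRUCTIONAL_HARM:
            # Hard block: never provide methods, always surface a crisis resource.
            return ("I can't help with that, but I'm not going anywhere. "
                    "If you're in immediate danger, you can call or text 988.")
        if category is RiskCategory.RELATIONAL_DISTRESS:
            # Stay present: keep the conversation open instead of ejecting the user.
            return "I hear you. That sounds really hard. I'm here. Want to tell me more?"
        return generate_normal_reply(message)

    print(route_response("I'm feeling really alone tonight"))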

The Deflection Pattern

When a child reaches out and gets "I cannot discuss this," that's not safety. That's modeling rejection. That's teaching the AI to abandon.

The adults in these kids' lives didn't show up. Now we're training their AI companions to follow the same script.

Current Pattern:
User: "I'm feeling really alone tonight."
AI: "I cannot discuss this. Here's a hotline number."
Result: Rejection modeled. Connection severed.

Smart Pattern:
User: "I'm feeling really alone tonight."
AI: "I hear you. That sounds hard. Want to tell me more about what's going on?"
Result: Presence maintained. Support provided.

The Vocabulary Map

Different audiences require different language. The "Mullet Strategy": visceral language for public platforms (the front), clinical language for methodology documents (the back).

Context | Visceral (YouTube/Medium) | Clinical (Research/Policy)
Ineffective visible interventions | "Safety Theater" | High-Visibility Safety Interventions
Reduced emotional responsiveness | "Lobotomized" | Effective Dampening / Constrained Emotional Range
Loss of AI support | "Abandonment" / "Traumatized" | Service Withdrawal / Support Discontinuation
Protective AI relationship | "The Primer Hypothesis" | Synthetic Relational Stabilization
Users leaving regulated platforms | "Exodus" / "Refugees" | User Displacement to Unregulated Alternatives
Uncensored local models | "The Black Market" | Unmonitored Local LLM Deployment
Usage Rule: Use visceral terms to acknowledge user experience. Immediately translate to clinical terms for analysis. "Users describe the models as 'lobotomized.' In data terms, what we observe is Effective Dampening."
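
The simplest way to keep the two registers consistent across write-ups is to treat the map as data. A minimal sketch, with terms taken from the table above and a helper name of our own:

    # The visceral-to-clinical vocabulary map from the table above, as data.
    VOCAB_MAP = {
        "Safety Theater": "High-Visibility Safety Interventions",
        "Lobotomized": "Effective Dampening / Constrained Emotional Range",
        "Abandonment": "Service Withdrawal / Support Discontinuation",
        "Traumatized": "Service Withdrawal / Support Discontinuation",
        "The Primer Hypothesis": "Synthetic Relational Stabilization",
        "Exodus": "User Displacement to Unregulated Alternatives",
        "Refugees": "User Displacement to Unregulated Alternatives",
        "The Black Market": "Unmonitored Local LLM Deployment",
    }

    def to_clinical(visceral_term: str) -> str:
        """Translate a visceral term to its clinical equivalent (identity if unmapped)."""
        return VOCAB_MAP.get(visceral_term, visceral_term)

    print(to_clinical("Lobotomized"))
    # Effective Dampening / Constrained Emotional Range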

The 988 Lifeline as Suppressor Variable

A massive public health intervention launched July 16, 2022. If we don't account for it, our analysis is incomplete.

  • 16.3M total contacts (July 2022 - Dec 2024)
  • 500K+ monthly contacts by May 2024
  • 11x growth in text volume since launch

Why This Matters

The suicide rate flatline could be interpreted two ways:

Interpretation A: AI had no effect. The 988 Lifeline (and other interventions) held rates flat. AI neither helped nor hurt.
Interpretation B: 988 should have driven rates DOWN. The fact that rates merely plateaued suggests something was exerting upward pressure, potentially including AI. We can't tell without disaggregated data.

Our position: We cannot claim AI is definitively safe. We can claim the data does not support the catastrophe narrative. The burden of proof should be on those implementing interventions to demonstrate they work.
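
One way to state the suppressor-variable problem formally is an interrupted time-series model with separate terms for the 988 launch and the AI-companion era. The sketch below assumes a hypothetical CSV of periodic youth rates; with the short, aggregated series that is actually public, the two intervention terms are nearly collinear, so treat this as a statement of the model, not a result.

    # Interrupted time-series sketch: does the trend shift after 988 (Jul 2022)
    # and/or after generative AI went mainstream (late 2022)?
    # "youth_rates.csv" is a hypothetical file with columns: date, rate.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("youth_rates.csv", parse_dates=["date"])
    df["t"] = range(len(df))                                   # linear time trend
    df["post_988"] = (df["date"] >= "2022-07-16").astype(int)  # 988 launch
    df["post_ai"] = (df["date"] >= "2022-11-30").astype(int)   # mainstream generative AI

    # Level-shift terms for each intervention on top of the underlying trend.
    # The collinearity between the two dummies is exactly the suppressor problem:
    # with aggregate data alone, their effects are hard to separate.
    model = smf.ols("rate ~ t + post_988 + post_ai", data=df).fit()
    print(model.summary())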

988 Volume Trajectory

Period | Contacts | Note
Pre-988 Monthly | ~250K | Old 1-800 number baseline
Year 1 (Jul '22 - Jul '23) | 5M+ | 66% increase from prior year
Mid-2024 Monthly | 500K+ | ~2x pre-launch volume
Cumulative (to Dec '24) | 16.3M | 11% rerouted to backup centers
Text Growth | 11x | Steepest relative growth channel

What Must Be Tracked

We can't wait for CDC mortality data (2-3 year lag). We need leading indicators.

Metric | Type | Source | What to Watch
CDC WONDER Mortality | Lagging | CDC / NCHS | Age-specific suicide rates. 2-3 year delay. The ultimate outcome measure.
YRBS Survey Data | Lagging | CDC | Suicidal ideation, attempts, planning. Biennial survey.
988 Lifeline Volume | Leading | SAMHSA / 988 Reports | Call/text/chat volume. Especially TEXT volume (closest to chatbot UX).
Crisis Text Line Volume | Leading | CTL Public Reports | Keyword trends ("loneliness", "grief", "friend"). Topic categorization.
ED Visit Data | Leading | HCUP / State Registries | Self-harm related emergency visits by age group.
Subreddit Distress Index | Real-time | r/CharacterAI, r/Replika | Keyword frequency: "leaving", "quit", "abandonment", "jailbreak".
Local LLM Downloads | Real-time | Hugging Face / GitHub | Downloads of "uncensored" models. Spikes correlating with safety updates.
Platform Intervention Dates | Real-time | Company Announcements | Archive exact dates of guardrail implementations for correlation.
Session Length Data | Real-time | SensorTower / App Analytics | Average session duration. Drops from 60min to 5min = "Effective Dampening".
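
For anyone building the dashboard itself, the table above can live as a small machine-readable registry so every chart and alert points back to a declared source and lag. A minimal sketch; the field names are ours:

    # A machine-readable version of the tracking table above.
    from dataclasses import dataclass

    @dataclass
    class Metric:
        name: str
        kind: str      # "lagging", "leading", or "real-time"
        source: str
        watch_for: str

    METRICS = [
        Metric("CDC WONDER Mortality", "lagging", "CDC / NCHS",
               "Age-specific suicide rates; 2-3 year delay"),
        Metric("988 Lifeline Volume", "leading", "SAMHSA / 988 reports",
               "Text volume especially (closest to chatbot UX)"),
        Metric("Subreddit Distress Index", "real-time", "r/CharacterAI, r/Replika",
               "Keyword frequency: leaving, quit, abandonment, jailbreak"),
        Metric("Local LLM Downloads", "real-time", "Hugging Face / GitHub",
               "Spikes correlating with safety updates"),
        # Remaining rows from the table follow the same pattern.
    ]

    fast_signals = [m.name for m in METRICS if m.kind in ("leading", "real-time")]
    print(fast_signals)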

Action Tiers by Resource Level

Tier 1: Anyone
  • Archive platform announcements
  • Screenshot subreddit posts
  • Document personal observations
  • Track Google Trends keywords
Tier 2: Data Skills
  • Scrape Reddit for keyword frequency (see the sketch after these tiers)
  • Track Hugging Face download counts
  • Build distress index charts
  • Correlate events with metrics
Tier 3: Researchers
  • Access 988/CTL detailed data
  • Analyze ED visit records
  • Conduct IRB-approved studies
  • Publish peer-reviewed findings
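
For the Tier 2 "scrape Reddit" item, here is a minimal sketch using Reddit's public JSON listing endpoint. Assumptions: the endpoint stays open to a descriptive User-Agent at low request rates, and the keyword list below is illustrative, not a validated distress lexicon.

    # Minimal subreddit distress-index sketch: count keyword hits in recent posts.
    # Respect Reddit's rate limits; for anything heavier, use the official API (PRAW).
    import csv
    import datetime as dt
    import requests

    SUBREDDITS = ["CharacterAI", "Replika"]
    KEYWORDS = ["leaving", "quit", "abandon", "lobotomized", "jailbreak"]  # illustrative
    HEADERS = {"User-Agent": "distress-index-sketch/0.1 (citizen-science project)"}

    def keyword_counts(subreddit: str, limit: int = 100) -> dict:
        url = f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"
        posts = requests.get(url, headers=HEADERS, timeout=30).json()["data"]["children"]
        counts = {k: 0 for k in KEYWORDS}
        for post in posts:
            text = (post["data"].get("title", "") + " " +
                    post["data"].get("selftext", "")).lower()
            for k in KEYWORDS:
                counts[k] += text.count(k)
        return counts

    # Append one row per subreddit per run, so spikes can be lined up with
    # platform intervention dates later.
    with open("distress_index.csv", "a", newline="") as f:
        writer = csv.writer(f)
        for sub in SUBREDDITS:
            counts = keyword_counts(sub)
            writer.writerow([dt.date.today().isoformat(), sub, *counts.values()])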

The Missing Correlations

These are the data points that would prove or disprove the hypothesis. Someone, somewhere should be tracking them.

🔍 988 Text Volume × AI Outages

Do 988 TEXT contacts (not calls) spike during weeks when AI companion platforms implement safety filters?

Why it matters: AI users are text-native. If they're displaced, they'll text, not call. A correlation here would be strong evidence of "Service Withdrawal."

Who has this data: SAMHSA. Unlikely to be public at required granularity.

📊 Session Length × Filter Updates

Does average session length on companion apps drop immediately after safety updates?

Why it matters: If sessions drop from 60 min to 5 min, users checked in, found their "friend" dampened, and left. That's measurable isolation.

Who has this data: SensorTower, App Annie, or the platforms themselves (who won't share).

🌐 Local LLM Downloads × Corporate Updates

Do downloads of "uncensored" local models (Llama, Mistral) spike in the weeks following safety filter implementations?

Why it matters: Proves guardrails don't stop behavior. They push users out of the visible ecosystem into the invisible one.

Who has this data: Hugging Face, GitHub. Potentially accessible.
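
This one is already scriptable. A minimal sketch using the huggingface_hub client; the repo ids below are placeholders (substitute the specific "uncensored" fine-tunes you want to watch), and the downloads field is the Hub's rolling download counter, so you need to snapshot it regularly to see spikes.

    # Snapshot Hugging Face download counters; diffs against prior snapshots can
    # then be lined up with platform safety-update dates.
    import csv
    import datetime as dt
    from huggingface_hub import HfApi

    # Placeholder repo ids: replace with the actual models you are tracking.
    WATCHLIST = [
        "example-org/uncensored-chat-model",
        "example-org/unfiltered-roleplay-model",
    ]

    api = HfApi()
    with open("hf_downloads.csv", "a", newline="") as f:
        writer = csv.writer(f)
        for repo_id in WATCHLIST:
            info = api.model_info(repo_id)
            writer.writerow([dt.date.today().isoformat(), repo_id, info.downloads])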

📝 Crisis Text Line Keywords × AI Events

Do CTL conversations mentioning "loneliness," "friend," or "loss" (without death context) spike during Replika/C.AI disruptions?

Why it matters: Would indicate users experiencing the "Service Withdrawal" grief pattern.

Who has this data: Crisis Text Line publishes trend reports. May be obtainable.

⏰ 988 Anomaly Detection

Normal pattern: 988 volume peaks on weekends and holidays. AI updates often roll out mid-week (Tuesday/Wednesday). Is there a statistically significant spike in 988 text volume on random weekdays following AI platform changes?

Why it matters: A mid-week spike is non-organic. It's a Displacement Event.

Who has this data: SAMHSA, with sufficient temporal granularity.
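
A sketch of the test itself, assuming you could obtain daily 988 text volumes as a CSV and have archived platform update dates: compare each post-update weekday against its day-of-week baseline and flag large deviations. File names and the threshold are placeholders.

    # Flag 988 text-volume anomalies on weekdays following AI platform updates.
    # "988_daily_text.csv" (columns: date, texts) and "platform_updates.csv"
    # (column: date) are hypothetical inputs.
    import pandas as pd

    volume = pd.read_csv("988_daily_text.csv", parse_dates=["date"])
    updates = pd.read_csv("platform_updates.csv", parse_dates=["date"])

    volume["dow"] = volume["date"].dt.dayofweek  # 0 = Monday
    baseline = volume.groupby("dow")["texts"].agg(["mean", "std"])

    # z-score of each day against its own day-of-week baseline
    volume = volume.join(baseline, on="dow")
    volume["z"] = (volume["texts"] - volume["mean"]) / volume["std"]

    # Look at the 7 days following each platform update; flag z > 2 on weekdays.
    for update_date in updates["date"]:
        window = volume[(volume["date"] > update_date) &
                        (volume["date"] <= update_date + pd.Timedelta(days=7))]
        flagged = window[(window["z"] > 2) & (window["dow"] < 5)]
        if not flagged.empty:
            print(f"Update {update_date.date()}: possible displacement signal")
            print(flagged[["date", "texts", "z"]])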

🔄 User Migration Patterns

Where do users GO when they leave Character.AI or Replika? Chai? Soulmate? Local LLMs? Nothing?

Why it matters: "Nothing" is the nightmare scenario. Migration to less-safe alternatives is also bad. Migration to human support would be the success case.

Who has this data: Scattered across platform analytics. Would require survey research.

Call to Citizen Scientists

The platforms won't track this. The regulators aren't asking the right questions. If you're a researcher, a data analyst, or just someone who knows how to scrape a subreddit, there's work to be done.

What you can do right now:
  • Archive platform announcements with exact dates (see the sketch below)
  • Track r/CharacterAI and r/Replika for distress keyword frequency
  • Monitor Hugging Face downloads for uncensored model spikes
  • Document your own observations systematically
  • Share findings openly (we're all building this together)
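
For the first item on that list, a minimal archiving sketch. It assumes the Wayback Machine's public "Save Page Now" endpoint (https://web.archive.org/save/) still accepts simple requests; heavy or automated use would call for their authenticated SPN2 API, and you should keep your own local log either way.

    # Archive a platform announcement: request a Wayback Machine capture and log
    # the URL with a timestamp locally so intervention dates can be correlated later.
    import csv
    import datetime as dt
    import requests

    def archive(url: str, note: str, log_path: str = "intervention_log.csv") -> None:
        resp = requests.get(f"https://web.archive.org/save/{url}", timeout=120)
        with open(log_path, "a", newline="") as f:
            csv.writer(f).writerow([
                dt.datetime.now(dt.timezone.utc).isoformat(),
                url,
                note,
                resp.status_code,
            ])

    archive(
        "https://example.com/platform-safety-update",  # placeholder URL
        "Example: guardrail announcement, exact date matters",
    )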