GrieVoice

WhatsApp Voice Message Integration Specification

Overview

WhatsApp voice message integration lets workers submit grievances by recording a voice message on their phone. It requires no typing, works on basic smartphones, and relies on a platform workers already use daily. Messages are automatically transcribed, translated if needed, and processed into structured grievance records.

Why WhatsApp Voice?

Accessibility: Workers with limited literacy can speak their concerns naturally.
Familiarity: No new app to learn - uses existing WhatsApp.
Async: Workers can record when convenient, not limited to call center hours.
Evidence: Original voice recording preserved for sensitive cases.

User Journey

📱
1. Initiate

Worker messages the GrieVoice WhatsApp number

🎙️
2. Record

Sends voice message describing their concern

⚙️
3. Process

System transcribes, translates, extracts fields

4. Confirm

Receives confirmation with case reference number

Technical Flow

1. Message Reception (Twilio WhatsApp API)

Worker sends voice message to WhatsApp Business number. Twilio webhook triggers with message metadata and audio URL.
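
A minimal sketch of how the webhook payload might be read. It assumes the handler receives Twilio's form-encoded fields as a dictionary; `NumMedia`, `MediaUrl0`, `MediaContentType0`, `From` and `MessageSid` are standard Twilio webhook parameters, while `InboundVoiceMessage` and `parse_twilio_webhook` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class InboundVoiceMessage:
    sender: str        # e.g. "whatsapp:+15550001111"
    message_sid: str
    media_url: str
    content_type: str  # e.g. "audio/ogg"


def parse_twilio_webhook(form: Mapping[str, str]) -> Optional[InboundVoiceMessage]:
    """Return the first audio attachment in a webhook payload, or None."""
    num_media = int(form.get("NumMedia", "0") or 0)
    for i in range(num_media):
        content_type = form.get(f"MediaContentType{i}", "")
        if content_type.startswith("audio/"):
            return InboundVoiceMessage(
                sender=form.get("From", ""),
                message_sid=form.get("MessageSid", ""),
                media_url=form[f"MediaUrl{i}"],
                content_type=content_type,
            )
    return None  # text-only message or non-audio media
```

Returning `None` for messages without audio lets the caller reply with a prompt to send a voice message instead of failing.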

2. Audio Retrieval (Twilio Media)

Server fetches audio file from Twilio's secure media storage. Supports .ogg format (WhatsApp default).
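
One way the retrieval step could look, using only the Python standard library. This sketch assumes the Twilio account requires HTTP basic authentication (Account SID and Auth Token) for media URLs; `build_media_request` and `download_audio` are illustrative names, not part of any Twilio library.

```python
import base64
import urllib.request


def build_media_request(media_url: str, account_sid: str, auth_token: str) -> urllib.request.Request:
    """Build a GET request for a media URL with a basic-auth header."""
    credentials = f"{account_sid}:{auth_token}".encode("utf-8")
    header = "Basic " + base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(media_url, headers={"Authorization": header})


def download_audio(media_url: str, account_sid: str, auth_token: str, dest_path: str) -> str:
    """Download the voice message to dest_path and return that path."""
    request = build_media_request(media_url, account_sid, auth_token)
    # Media requests may be redirected to another host; confirm how your
    # HTTP client treats credentials on redirect before relying on this.
    with urllib.request.urlopen(request, timeout=30) as response, open(dest_path, "wb") as out:
        out.write(response.read())
    return dest_path
```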

3. Transcription (OpenAI Whisper / Deepgram)

Audio converted to text with automatic language detection. Supports Portuguese, English, Swahili, and other languages.
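
As a hedged illustration of the Whisper option, the function below assumes `client` is an `openai.OpenAI()` instance and uses the `verbose_json` response format so the detected language is returned alongside the text. `transcribe_voice_note` is a hypothetical helper name; a Deepgram variant would look different.

```python
from typing import Any, Dict


def transcribe_voice_note(client: Any, audio_path: str) -> Dict[str, str]:
    """Transcribe an audio file and return its text and detected language.

    `client` is assumed to be an openai.OpenAI() instance.
    """
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            response_format="verbose_json",  # response includes the detected language
        )
    return {"text": result.text.strip(), "language": result.language}
```

Passing the client in as a parameter keeps the function easy to exercise with a stub during testing.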

4. Translation, if needed (GPT-3.5 / Gemini)

Non-English transcripts translated while preserving original. Both versions stored for reference.
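
The "translate only when needed, keep both versions" rule can be sketched as below. The `translate` callable stands in for whichever model is used (GPT-3.5 or Gemini) and is an assumption, as are the `Transcript` field names and the set of language labels treated as English.

```python
from dataclasses import dataclass
from typing import Callable

# Language labels treated as English (assumed; depends on the transcriber's output).
ENGLISH_LABELS = {"en", "english"}


@dataclass
class Transcript:
    original_text: str
    original_language: str
    english_text: str      # same as original_text when no translation ran
    was_translated: bool


def prepare_transcript(text: str, language: str,
                       translate: Callable[[str, str], str]) -> Transcript:
    """Translate non-English text to English while keeping the original."""
    if language.strip().lower() in ENGLISH_LABELS:
        return Transcript(text, language, text, False)
    return Transcript(text, language, translate(text, language), True)
```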

5. Field Extraction (Claude Sonnet)

AI extracts structured fields: name, contact, location, people involved, category, description, urgency.
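
A sketch of how the model's reply might be turned into a typed record, assuming the model is prompted to answer with a single JSON object whose keys match the fields listed above. The key names, defaults, and the `parse_extraction_reply` helper are illustrative assumptions, not the system's actual schema.

```python
import json
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class GrievanceFields:
    name: Optional[str] = None
    contact: Optional[str] = None
    location: Optional[str] = None
    people_involved: List[str] = field(default_factory=list)
    category: str = "Other"
    description: str = ""
    urgency: str = "unknown"


def parse_extraction_reply(reply: str) -> GrievanceFields:
    """Parse the first JSON object found in a model reply."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(reply[start:end + 1])
    known = {f.name for f in fields(GrievanceFields)}
    # Ignore unexpected keys and nulls so the dataclass defaults apply.
    cleaned = {k: v for k, v in data.items() if k in known and v is not None}
    return GrievanceFields(**cleaned)
```

Dropping unknown keys and nulls means a partially filled reply still yields a usable record that a reviewer can complete later.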

6. Storage (Supabase)

Structured data saved to grievances table. Original audio stored in secure bucket. Source marked as "whatsapp_voice".
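
The storage step could be approximated as follows. The `grievances` table name and the `"whatsapp_voice"` source value come from the description above; every other column name is an assumption, and `supabase` is assumed to be a supabase-py client created with `create_client(url, key)`.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def build_grievance_row(extracted: Dict[str, Any], transcript: str,
                        audio_path: str, reference: str) -> Dict[str, Any]:
    """Combine extracted fields with metadata into one table row."""
    row = dict(extracted)
    row.update({
        "reference": reference,        # assumed column name
        "transcript": transcript,      # assumed column name
        "audio_path": audio_path,      # location of the original audio in the bucket
        "source": "whatsapp_voice",
        "created_at": datetime.now(timezone.utc).isoformat(),
    })
    return row


def save_grievance(supabase: Any, row: Dict[str, Any]) -> Any:
    """Insert the row into the grievances table via a supabase-py client."""
    return supabase.table("grievances").insert(row).execute()
```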

7. Confirmation (Twilio WhatsApp API)

Auto-reply sent to worker: "Thank you. Your concern has been recorded. Reference: GV-2024-0847. We will review within 48 hours."
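
For illustration, the reply text can be generated from a reference number as below. The `GV-<year>-<4-digit sequence>` pattern is inferred from the single example above and is an assumption, as are the function names. Sending the text would typically go through `messages.create` in Twilio's Python helper library.

```python
def format_reference(sequence: int, year: int) -> str:
    """Format a case reference such as GV-2024-0847 (pattern assumed)."""
    return f"GV-{year}-{sequence:04d}"


def confirmation_message(reference: str, review_hours: int = 48) -> str:
    """Build the auto-reply sent back to the worker."""
    return (
        f"Thank you. Your concern has been recorded. Reference: {reference}. "
        f"We will review within {review_hours} hours."
    )


print(confirmation_message(format_reference(847, 2024)))
```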

Customizable Categories

Categories can be configured per deployment to match organizational structure and reporting requirements:

💰 Wages & Payment
⏰ Hours & Overtime
⚠️ Safety & Health
🚫 Harassment
⚖️ Discrimination
📋 Contracts
👔 Discipline
🤝 Union Relations
🏗️ Working Conditions
📚 Training Access
🏠 Accommodation
📦 Other

Categories are assigned automatically from the content of the transcribed message. The AI classifies each grievance based on keywords and context. New categories can be added by updating the system prompt, with no code changes required.
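
A sketch of prompt-driven categories, assuming the category list lives in configuration and is interpolated into the system prompt. The prompt wording and the helper names are hypothetical. The fallback to "Other" guards against the model returning a label outside the configured list.

```python
from typing import List

DEFAULT_CATEGORIES: List[str] = [
    "Wages & Payment", "Hours & Overtime", "Safety & Health", "Harassment",
    "Discrimination", "Contracts", "Discipline", "Union Relations",
    "Working Conditions", "Training Access", "Accommodation", "Other",
]


def build_category_prompt(categories: List[str]) -> str:
    """Render the category section of the system prompt (wording assumed)."""
    listing = "\n".join(f"- {name}" for name in categories)
    return (
        "Classify the grievance into exactly one of these categories:\n"
        f"{listing}\n"
        "If none fits, answer Other."
    )


def normalize_category(label: str, categories: List[str]) -> str:
    """Map a model-provided label onto a configured category, else Other."""
    by_lower = {name.lower(): name for name in categories}
    return by_lower.get(label.strip().lower(), "Other")
```

Adding a deployment-specific category then amounts to appending a string to the configured list, for example `DEFAULT_CATEGORIES + ["Transport"]`.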

Channel Comparison

🎙️ Real-time Voice (Hume)
  • Live conversation
  • Emotion detection
  • Immediate clarification
  • Best for complex cases
  • Requires stable connection
💬 WhatsApp Voice
  • Async recording
  • Familiar platform
  • Works offline (send later)
  • Best for routine reports
  • Low data usage
📞 USSD
  • Menu-driven input
  • Any phone (no smartphone)
  • Zero data required
  • Best for basic reports
  • Limited detail capture

Cost Estimates (per message)

Component | Service | Cost
WhatsApp Message (inbound) | Twilio | $0.005
WhatsApp Message (outbound) | Twilio | $0.005 - $0.08*
Audio Transcription (2 min avg) | Whisper API | $0.012
Translation (if needed) | GPT-3.5-turbo | ~$0.004
Field Extraction | Claude Sonnet | ~$0.01
Database Storage | Supabase | ~$0.001
Total per WhatsApp Voice Submission | | $0.04 - $0.12

*Outbound cost varies by country and message type. Business-initiated messages cost more than user-initiated replies.
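
The total can be cross-checked by summing the line items. The short script below uses only the figures from the table; the dictionary keys and the function name are illustrative. Summed directly, the components come to roughly $0.037 to $0.112 with translation and about $0.033 to $0.108 without it, which is in line with the rounded range quoted above.

```python
from typing import Dict, Tuple

# (low, high) cost in USD per submission, taken from the table above.
COMPONENT_COSTS: Dict[str, Tuple[float, float]] = {
    "whatsapp_inbound": (0.005, 0.005),
    "whatsapp_outbound": (0.005, 0.08),
    "transcription_2min": (0.012, 0.012),
    "translation": (0.004, 0.004),
    "field_extraction": (0.01, 0.01),
    "database_storage": (0.001, 0.001),
}


def total_cost_range(costs: Dict[str, Tuple[float, float]],
                     include_translation: bool = True) -> Tuple[float, float]:
    """Sum low and high estimates, optionally skipping translation."""
    items = [
        bounds for name, bounds in costs.items()
        if include_translation or name != "translation"
    ]
    low = round(sum(b[0] for b in items), 3)
    high = round(sum(b[1] for b in items), 3)
    return low, high


print(total_cost_range(COMPONENT_COSTS))
print(total_cost_range(COMPONENT_COSTS, include_translation=False))
```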

Implementation Timeline

With existing infrastructure in place, WhatsApp voice integration can be deployed in phases:

Phase | Scope | Duration
Phase 1 | Twilio WhatsApp setup, webhook configuration, basic audio reception | 1-2 days
Phase 2 | Transcription pipeline, translation integration | 2-3 days
Phase 3 | Field extraction, database integration, confirmation flow | 2-3 days
Phase 4 | Testing, category customization, dashboard updates | 2-3 days

Total estimated development time: 7-11 days
Prerequisites: Twilio account with WhatsApp Business API access, approved WhatsApp Business number