Voice to text — free voice-to-text converter, 99 languages

Voice to text. From any recording. In 99 languages.

Upload an iPhone voice memo, an Android recording, a WhatsApp voice note — or record live in the browser. Get your voice transcribed to searchable text in minutes.

Drop your audio or video

MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously

Paste a link, we’ll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more

Record straight from your browser

No card required

~90s per 60-min file

SRT · VTT · DOCX · TXT

Files auto-delete in 24h

Voice memo in. Searchable text out.

Drop an iPhone voice memo, a WhatsApp voice note, a Telegram voice message, or hit record live in your browser. Mobile codecs compress voice aggressively — we handle the format quirks server-side with ffmpeg.
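The page does not publish its pipeline, so the following is only a minimal sketch of what server-side normalization with ffmpeg could look like. It builds a command that decodes any input ffmpeg can read into mono 16-bit PCM WAV. The function names, the 16 kHz target sample rate, and the output location are assumptions made for illustration, not the service's actual settings.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16_000) -> list[str]:
    """Build an ffmpeg command that decodes `src` into mono PCM WAV at `dst`."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it already exists
        "-i", src,                 # input file: M4A, Opus, Ogg, AMR, MP4, ...
        "-vn",                     # ignore any video stream
        "-ac", "1",                # downmix to a single (mono) channel
        "-ar", str(sample_rate),   # resample; 16 kHz is an assumed target
        "-c:a", "pcm_s16le",       # encode as 16-bit little-endian PCM
        dst,
    ]


def normalize(src: str, out_dir: str = ".") -> Path:
    """Run ffmpeg (it must be installed and on PATH) and return the WAV path."""
    dst = Path(out_dir) / (Path(src).stem + ".wav")
    subprocess.run(build_ffmpeg_command(src, str(dst)), check=True, capture_output=True)
    return dst
```

Calling `normalize("whatsapp-voice.opus")` would write `whatsapp-voice.wav` next to the working directory; decoding to one uniform format is what lets every later stage ignore the original codec.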

WhatsApp voice · .opus · REC 01:42.06

en-US auto-detected · Opus 24 kbps mono

~90s

Transcript · streaming · 1 speaker · 01:42

Hey, quick voice note — wanted to capture this before I forget. The thing about voice memos is everyone's recording from a different platform.

iPhone uses M4A. WhatsApp uses Opus. Telegram uses Ogg. Android does AMR or M4A depending on the keyboard app. Same voice, four different codecs.

Don't worry about the format — drop it in, transcribe runs the same way. The accuracy ceiling is the mobile bitrate, not the

92%+ accuracy on mobile codecs · TXT · DOCX · SRT · per-message

Same view, regardless of recording source.

iPhone M4A, WhatsApp Opus, Telegram Ogg, Android M4A: all four land in the same dashboard with the same Summary / Transcript / Speakers / Exports tabs. Mobile recording quirks (compression, low bitrate, narrowband audio) are handled at the audio-extraction layer.

app.transcription.solutions / whatsapp-voice-2026-05-17.opus · Export

Summary 5 · Transcript 1,420 · Speakers 2 · Exports

whatsapp-voice-2026-05-17.opus · 01:42 · Opus 24 kbps mono · 1 speaker · en-US auto-detected

Voice memos from any platform — iPhone, WhatsApp, Telegram, Android — work identically once the file lands in the tool.

Sample preview from a WhatsApp voice note about cross-platform voice-memo workflows. Mirrors what loads in your account: summary, transcript, format-aware export, voice-specific accuracy notes.

Key points

Format-agnostic intake: M4A, Opus, Ogg, and AMR are all decoded server-side via ffmpeg. You never convert manually.

Mobile codec aware: WhatsApp Opus at 24 kbps is lossy by design; expect about 92% accuracy, versus 95% on clean studio audio.

Voice-specific punctuation: the model is tuned for monologue voice memos (no speaker turns) by default.

Multi-message threads: WhatsApp / Telegram conversation exports transcribe as separate jobs with timestamps preserved.

iPhone Voice Memos sync: drag from the macOS Voice Memos app directly into the upload card, no Share Sheet needed.
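As an illustration of format-agnostic intake, the sketch below guesses the container from the first bytes of a file instead of trusting the extension. The signatures are the standard ones for each container (`OggS` for Ogg pages, `OpusHead` for Opus inside Ogg, `ftyp` for MP4/M4A, `#!AMR` for AMR files); the function names and returned labels are hypothetical, and how the service itself detects formats is not stated on this page.

```python
def sniff_audio_container(header: bytes) -> str:
    """Guess the audio container from the first bytes of a file."""
    if header.startswith(b"OggS"):
        # Opus-in-Ogg streams carry an "OpusHead" packet in the first page.
        return "ogg/opus" if b"OpusHead" in header[:64] else "ogg"
    if header[4:8] == b"ftyp":
        # ISO base media files (MP4, M4A, MOV) start with a size + "ftyp" box.
        return "mp4/m4a"
    if header.startswith(b"#!AMR"):
        return "amr"
    if header.startswith(b"fLaC"):
        return "flac"
    if header.startswith(b"RIFF") and header[8:12] == b"WAVE":
        return "wav"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 64 bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return sniff_audio_container(f.read(64))
```

Checking content this way means a WhatsApp note saved as `.ogg` and one saved as `.opus` are treated the same.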

Action items

Speaker 1: Drop the voice note. Transcript back in 8 seconds for a 1:42 file.

Speaker 1: Copy 1-paragraph summary into the Slack / Notion thread it answers.

Speaker 1: Export DOCX if you'll review with someone else; TXT if it's for yourself.

Speaker 1: Re-run with custom vocabulary if it's a recurring contact whose name needs correcting.

Auto-tagged: voice memo · WhatsApp Opus · mobile workflow · monologue

Drop your voice memo — try it free

Option 01

Siri / Google live dictation

Press and hold the keyboard mic icon. Words appear as you speak. No history, no editing buffer, no file upload.

Accuracy · clear voice: ~88%

Works on uploaded files: No

Long-form (>2 min): Breaks mid-flow

Punctuation: Manual ('comma')

Languages: ~30

Cost: Free / built-in

Best for: Hands-free messaging while walking. One-sentence dictation into a notes app. Short voice search in a maps app.

Option 02

AI voice-to-text

Drop a voice memo, paste a recording link, or record live. ~30× realtime. Punctuation, paragraph breaks, AI summary, mobile codecs handled.

Accuracy · clear voice: 95%+

Works on uploaded files: Yes (10h max)

Long-form (>2 min): Native support

Punctuation: Automatic

Languages: 99, auto-detected

Cost · per min: $0.03

Best for: Voice memos from iPhone / WhatsApp / Telegram · interview voice recordings · journalist field recordings · podcast solo episodes · audio diary / journaling workflows.

Option 03

Manual typing

Listen, pause, type. Slowest. Highest accuracy if the audio is hard or the speaker code-switches frequently.

Accuracy · clear voice: 98–99%

Works on uploaded files: Yes

Long-form (>2 min): Yes, but slow

Punctuation: Manual

60-min file: 3–5 hours typing

Cost · per min: Your time

Best for: Hard audio where AI tools fall below 85% · code-switching speakers · audio with heavy redaction / verbatim transcription requirements.

Siri/Google figures from public iOS / Android speech API benchmarks. Manual typing speed from US/UK transcriber productivity surveys.

Format	Extension	Bitrate	Notes
iPhone Voice Memos	.m4a · AAC	64 kbps	Apple's default. Decent quality for monologue voice notes. Drag-drop from the macOS Voice Memos app works directly. Accuracy ceiling ~95% on clean recording.
WhatsApp voice notes	.opus · Ogg container	24 kbps	Aggressive compression for messaging-grade audio. Lossy by design — accuracy ceiling ~92% even on clean voice. Forward to email or download via WhatsApp web, then upload.
Telegram voice messages	.oga / .ogg · Opus	32 kbps	Slightly higher bitrate than WhatsApp. Accuracy ceiling ~93%. Save the message file from Telegram desktop, drop directly into the upload card.
Android voice recorder	.m4a / .amr	12–128 kbps	Varies by manufacturer: Samsung defaults to M4A at 128 kbps, while older Android devices use AMR at about 12 kbps. The Pixel Recorder app exports clean M4A and reaches 95%+ accuracy.
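The bitrates in the table also determine how much audio fits under the 100 MB anonymous upload limit. The sketch below does that arithmetic under stated assumptions: "100 MB" is taken to mean 100 × 10^6 bytes, the bitrate is treated as constant, and container overhead is ignored. The names are illustrative.

```python
UPLOAD_CAP_BYTES = 100 * 1_000_000  # assumption: 1 MB = 10^6 bytes


def max_minutes(bitrate_kbps: float, cap_bytes: int = UPLOAD_CAP_BYTES) -> float:
    """Approximate minutes of audio that fit under the cap at a constant bitrate."""
    total_bits = cap_bytes * 8
    seconds = total_bits / (bitrate_kbps * 1000)
    return seconds / 60


if __name__ == "__main__":
    formats = [
        ("WhatsApp Opus", 24),
        ("Telegram Opus", 32),
        ("iPhone M4A", 64),
        ("Samsung M4A", 128),
    ]
    for name, kbps in formats:
        print(f"{name} at {kbps} kbps: ~{max_minutes(kbps):.0f} min")
```

Under these assumptions a 24 kbps WhatsApp note could run for roughly 555 minutes before reaching the limit, while a 128 kbps M4A reaches it after roughly 104 minutes. Real files vary because most of these codecs use variable bitrates.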

8 things people ask about this.

01. What does voice-to-text mean?

Voice to text is the conversion of spoken voice into written text by an AI model. The terms voice-to-text, speech-to-text, voice recognition, and automatic speech recognition (ASR) are used interchangeably. Modern voice-to-text reaches 95%+ word accuracy on clear voice recordings in major languages.

02. How do I convert voice to text on iPhone?

iPhone Voice Memos saves recordings as M4A files. Open the Voice Memos app, tap the recording, tap the share button, and send the M4A to transcription.solutions (email-to-self, AirDrop to Mac, or open the URL in mobile Safari and upload directly). The first 90 minutes per month are free, no card required.

03. How do I convert voice to text on Android?

Android voice recorder apps typically save M4A or MP3 files (older devices may use AMR). Open transcription.solutions in Chrome or another browser and upload the file, or send it there directly from the recording app's share menu.

04. Is voice-to-text accurate?

On clear voice recordings — single speaker, decent microphone, quiet environment — voice-to-text reaches 95%+ word accuracy in major languages. On phone-quality voice (8 kHz), noisy environments, or strong accents, accuracy drops to 85–90%. For legal or medical dictation, a human review pass on top of AI output is the recommended standard.
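The page does not say how its accuracy figures are measured. A common convention, assumed here, is word accuracy = 1 − word error rate (WER), where WER is the word-level edit distance between a reference transcript and the machine transcript, divided by the number of reference words. A minimal sketch with illustrative function names:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edits needed to turn the reference words seen so far into hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - WER; can drop below zero when the hypothesis has many extra words."""
    return 1.0 - word_error_rate(reference, hypothesis)
```

By this measure, 95% accuracy means about one wrong, missing, or extra word in every twenty, and 85% means about three in every twenty.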

05. Can I record voice live in the browser?

Yes. Click the record button on transcription.solutions and your microphone captures directly in the page. When you stop, the audio uploads and transcribes server-side. Works on Chrome, Safari, Firefox, Edge — desktop and mobile.

06. Can voice-to-text handle WhatsApp voice messages?

Yes. Export the WhatsApp voice message (forward it to email or download it via WhatsApp Web), then upload the .opus or .ogg file. Voice-to-text handles WhatsApp's Opus format directly, with no conversion on your side.

07. Is voice-to-text free?

Transcription.solutions includes 90 minutes of free voice-to-text per month, no credit card required. Pro is $19/month (600 minutes); Business is $49/month (2,500 minutes). The free allowance renews every month, indefinitely.
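To compare the tiers, the sketch below works out the effective price per included minute and picks the cheapest plan that covers a given monthly volume. The prices and minute allowances come from the answer above; the assumption that a plan cannot be exceeded (no overage billing) and all names in the code are illustrative only.

```python
# (monthly price in USD, included minutes), as quoted in the FAQ answer
PLANS = {
    "Free": (0.0, 90),
    "Pro": (19.0, 600),
    "Business": (49.0, 2500),
}


def price_per_minute(monthly_usd: float, included_minutes: int) -> float:
    """Effective cost of one included minute if the allowance is fully used."""
    return monthly_usd / included_minutes


def cheapest_plan(minutes_needed: int):
    """Return the cheapest plan covering the volume, or None if none does."""
    covering = [
        (price, name)
        for name, (price, minutes) in PLANS.items()
        if minutes >= minutes_needed
    ]
    return min(covering)[1] if covering else None


if __name__ == "__main__":
    for name, (price, minutes) in PLANS.items():
        print(f"{name}: ${price_per_minute(price, minutes):.4f} per included minute")
```

Fully used, Pro works out to about $0.032 per minute and Business to about $0.020 per minute.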

08. What's the difference between voice-to-text and speech-to-text?

They are synonyms. "Voice to text" is more common in consumer contexts (phone voice notes, dictation apps); "speech to text" is the more technical term used for longer recordings and professional tools. The AI does the same thing in both cases.