Transcribe WAV files with nothing lost along the way. Keep the full signal.

Drop a WAV recording straight from the field rig, DAW bounce, or interview kit. We keep the 24-bit headroom intact, run diarization on the raw PCM, and return a timestamped transcript with SRT in minutes.

Drop a file, or pick one

MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously

Paste a link, we’ll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more

Record straight from your browser

Sign up takes 30 seconds — recording opens right after, in the dashboard.

No card required
~90s per 60-min file
SRT · VTT · DOCX · TXT
Files auto-deleted in 24h

↓ Watch what comes out

Raw PCM in. Clean transcript out.

Lossless WAV means every sibilant, plosive, and quiet word survives intact, with no MP3 smear on consonants. If the file is multi-track (one speaker per channel), we skip acoustic diarization entirely and split on the channel layout.

WAV · 48 kHz / 24-bit
REC 2 tracks · 1h 12m · 1.24 GB
auto-detected en-GB
stereo PCM · uncompressed
~90s
Transcript · streaming
97% accuracy
S1

Take me back to that morning in seventy-eight — what time did the call come in?

S2

Quarter to five, give or take. Kettle was on, I remember that much.

S1

And from there you drove straight down to the harbour?

S2

Straight to the boatyard. Lights were still on when I pulled in.

97% on per-track WAV
SRT · DOCX · TXT · JSON

↓ This is the dashboard

This is what loads when the job finishes.

Same layout as the real dashboard — Summary, full Transcript, Speakers tab, Exports. Key points and action items extracted automatically. Auto-tags on every job.

Try it on your own file — it's free

Three real options · honest comparison

Adobe Audition. Descript. Or us.

Audition's Speech to Text is bundled with Creative Cloud and stays inside the timeline. Descript imports the WAV into its own editor. We take the file as-is, return standard exports, and never ask you to move the project anywhere.

Option 01

Adobe Audition / Premiere

Transcript panel inside the Adobe timeline. Tied to Creative Cloud and the project file.

Requires: Creative Cloud subscription
Speaker diarization: Yes, mixed-down only
Multi-track WAV: Flattened before STT
Export: SRT · CSV · XML
Languages: 18, manual select
Cost: ~$23/mo (single app)
Best for: Editors already cutting in Premiere or Audition who want captions stitched to the timeline.
Option 02

Transcription.Solutions

Drop the WAV. Per-channel diarization if it's multi-track. Source deleted in 24h.

Requires: Nothing, just the file
Speaker diarization: Per-track or acoustic
Multi-track WAV: Up to 16 channels
Export: SRT · VTT · DOCX · TXT · JSON
Languages: 99, auto-detected
Cost per minute: $0.03
Best for: Anyone holding a raw WAV (field recordists, podcasters bouncing from a DAW, oral history archivists, researchers).
Option 03

Descript

Imports the WAV into Descript's editor. Powerful, but you work inside it.

Requires: Descript account + import
Speaker diarization: Acoustic, EN-tuned
Multi-track WAV: Import as separate clips
Export: TXT · SRT · DOCX
Languages: 23, accuracy varies
Cost: $16–24/user/mo
Best for: Podcast editors who want to edit the audio by editing the transcript, which is Descript's actual superpower.

Pricing accurate as of 2026. Adobe and Descript feature flags change frequently; check current docs before committing.

Specific to WAV

Three things that bite people on generic transcription tools.

Most uploaders silently downsample the WAV before sending it to a recognizer. We don't.

What goes wrong

  1. Multi-track WAV gets flattened. A 4-channel field recording from a Sound Devices MixPre gets mixed to mono before STT. The per-mic separation you paid for is thrown away.
  2. 32-bit float WAVs from Zoom F-series or MixPre get rejected outright, or converted to 16-bit and clipped, losing the headroom that makes recovery possible.
  3. 96 kHz / 24-bit interviews take forever to upload because the tool re-encodes to MP3 in the browser before sending.

What to flip here

  1. Upload the multi-track WAV as-is (up to 16 channels). We read the channel layout from the WAV header and assign one speaker per track, with no acoustic guessing.
  2. 32-bit float is accepted natively. We preserve the float headroom when normalising for the recognizer, so peaks above 0 dBFS don't clip.
  3. Direct binary upload, with no transcode in the browser. A 2 GB WAV moves at full bandwidth and starts processing the moment the last byte lands.
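For readers curious what "reading the channel layout from the WAV header" involves, here is a minimal sketch in Python. It is not this service's code; `read_wav_format` is a hypothetical helper that walks the RIFF chunks, finds the `fmt ` chunk, and reports channel count, sample rate, bit depth, and whether the samples are integer PCM or IEEE float. It assumes a plain RIFF/WAVE file (not RF64) already loaded into memory.

```python
import struct

# Format tags defined for the WAVE fmt chunk.
FORMAT_NAMES = {1: "pcm", 3: "ieee_float"}
WAVE_FORMAT_EXTENSIBLE = 0xFFFE


def read_wav_format(data: bytes) -> dict:
    """Return encoding, channels, sample rate and bit depth from a WAV header."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, body
            )
            # Extensible headers keep the real format tag in the first two
            # bytes of the SubFormat GUID, 24 bytes into the chunk body.
            if tag == WAVE_FORMAT_EXTENSIBLE and size >= 26:
                (tag,) = struct.unpack_from("<H", data, body + 24)
            return {
                "encoding": FORMAT_NAMES.get(tag, f"unknown({tag})"),
                "channels": channels,
                "sample_rate": rate,
                "bit_depth": bits,
            }
        # Chunks are padded to an even number of bytes.
        pos = body + size + (size & 1)
    raise ValueError("no fmt chunk found")
```

A 4-channel 32-bit float recording would come back as `encoding="ieee_float"`, `channels=4`, which is the information needed to decide on per-track diarization and float-safe normalisation.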

Recommended job settings for WAV

Drop a WAV and these flip on by default. Override per-job from the form.

Sample rate: Native (no downsample)
Bit depth: 24-bit / 32-bit float preserved
Diarization: Per-channel if multi-track
Speaker model: Interview · 2-8 speakers
Filler words: Kept (toggle off if needed)
Export: DOCX · SRT · timestamped TXT

Accuracy · real-world numbers

97%+ on per-track WAV. WAV gives the recognizer the cleanest possible signal.

Because WAV stores raw PCM with no perceptual compression, consonants and sibilants aren't smeared the way MP3 smears them. The recognizer hears what the microphone heard. Numbers below come from real customer WAV jobs in production.

98%
Studio WAV · single speaker

48 kHz / 24-bit, large-diaphragm condenser, treated room. Narration, audiobook, voice-over bookings land here.

96%
Multi-track interview WAV

One channel per speaker (lavs or boundary mics). Diarization is just channel routing, so the remaining errors are text-only.

92%
Handheld field recorder

Zoom H5, Tascam DR-40, similar. Stereo XY pickup, 2-3 speakers, some room reflection. Most podcast WAVs land here.

85%
Noisy environment field WAV

Outdoor, café, vehicle. Lossless capture helps, because the noise is real rather than a codec artefact, but accuracy still drops on overlapping speech.

Common questions

8 things people ask about WAV transcription.

01 What's the maximum WAV file size?
5 GB per file on the standard plan, which is roughly 4.8 hours of stereo 48 kHz / 24-bit, or 2.4 hours of stereo 96 kHz / 24-bit. Larger files are fine on the team plan; just contact us before the upload.
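To check how much audio fits under a size cap for your own recorder settings, the arithmetic is simple: uncompressed PCM uses sample rate × bytes per sample × channels bytes every second. The sketch below assumes 1 GB = 10^9 bytes and ignores the small header overhead; the function names are illustrative.

```python
def pcm_bytes_per_second(sample_rate: int, bit_depth: int, channels: int) -> int:
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate * (bit_depth // 8) * channels


def hours_that_fit(limit_bytes: int, sample_rate: int, bit_depth: int, channels: int) -> float:
    """Hours of audio that fit under a file-size limit, ignoring header overhead."""
    return limit_bytes / pcm_bytes_per_second(sample_rate, bit_depth, channels) / 3600


limit = 5 * 10**9  # 5 GB, taking 1 GB as 10^9 bytes
print(round(hours_that_fit(limit, 48_000, 24, 2), 1))  # 4.8
print(round(hours_that_fit(limit, 96_000, 24, 2), 1))  # 2.4
```

Doubling the sample rate or the channel count halves the duration that fits, so a 4-channel 48 kHz / 24-bit session reaches the cap in about the same time as a stereo 96 kHz one.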
02 Do you support 32-bit float WAV from Zoom F-series or MixPre?
Yes, natively. We read the float samples without clipping at 0 dBFS, so loud transients that you plan to pull down in post still get transcribed cleanly. Most generic uploaders silently down-cast to 16-bit first.
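The difference between clipping and preserving float headroom can be shown in a few lines. This is an illustrative sketch, not the service's pipeline: `normalize_float32` is a hypothetical helper that decodes little-endian 32-bit float samples and applies a single gain so the loudest peak lands at a target level (0.89, about -1 dBFS, is an arbitrary choice here). Because one gain is applied to every sample, a transient at 2.0 (roughly +6 dBFS) keeps its shape instead of being flattened at 1.0.

```python
import struct


def normalize_float32(raw: bytes, target_peak: float = 0.89) -> list[float]:
    """Peak-normalise little-endian float32 samples without clipping."""
    count = len(raw) // 4
    samples = struct.unpack(f"<{count}f", raw[: count * 4])
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence or empty input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


# Three samples, one of them well above full scale.
raw = struct.pack("<3f", 0.5, -2.0, 1.0)
print(normalize_float32(raw))
```

A 16-bit down-cast of the same input would pin the -2.0 sample at full scale and turn the transient into a flat top before the recognizer ever sees it.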
03 I have a 4-channel WAV from a field recorder, one mic per person. Will diarization use that?
It will. Upload the polyphonic WAV directly (no need to bounce to stereo first). We parse the channel layout from the WAV header and assign one speaker per track, which is much more reliable than acoustic diarization on similar voices.
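Per-track diarization relies on the fact that a polyphonic WAV stores its samples interleaved: each frame holds one sample per channel, in channel order. A minimal sketch of pulling the tracks apart follows; `split_channels` is a hypothetical name, and it works on the raw bytes of the data chunk for any fixed sample width.

```python
def split_channels(frames: bytes, channels: int, bytes_per_sample: int) -> list[bytes]:
    """De-interleave PCM frames into one byte string per channel."""
    frame_size = channels * bytes_per_sample
    if len(frames) % frame_size:
        raise ValueError("data does not contain a whole number of frames")
    tracks = [bytearray() for _ in range(channels)]
    for start in range(0, len(frames), frame_size):
        for ch in range(channels):
            offset = start + ch * bytes_per_sample
            tracks[ch] += frames[offset : offset + bytes_per_sample]
    return [bytes(track) for track in tracks]


# Two frames of a 4-channel, 16-bit file; letters stand in for sample bytes.
print(split_channels(b"A1B1C1D1A2B2C2D2", channels=4, bytes_per_sample=2))
# [b'A1A2', b'B1B2', b'C1C2', b'D1D2']
```

Each resulting track can then be transcribed on its own and labelled with a fixed speaker, which is why no acoustic clustering is needed.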
04 Will you downsample my 96 kHz WAV?
The recognizer itself runs at 16 kHz internally, which captures frequencies up to 8 kHz, the band that carries most of what makes speech intelligible. But we keep the original file untouched and use it for any post-processing like noise gating. The exports reference the original timeline.
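As background on what feeding a 16 kHz recognizer involves, here is a generic textbook sketch of integer-factor downsampling (for example 48 kHz to 16 kHz, a factor of 3): low-pass filter below the new Nyquist frequency of 8 kHz, then keep every third sample. It uses a windowed-sinc FIR with a Hamming window in pure Python. The tap count and the 7 kHz cutoff are assumptions chosen for illustration, and nothing here describes the resampler this service actually uses.

```python
import math


def lowpass_taps(cutoff_hz: float, sample_rate: float, num_taps: int = 63) -> list[float]:
    """Windowed-sinc low-pass FIR (Hamming window), normalised to unity gain at DC."""
    fc = cutoff_hz / sample_rate  # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]


def decimate(samples: list[float], factor: int, taps: list[float]) -> list[float]:
    """Low-pass filter, then keep every `factor`-th sample."""
    half = len(taps) // 2
    padded = [0.0] * half + list(samples) + [0.0] * half
    return [
        sum(t * padded[i + k] for k, t in enumerate(taps))
        for i in range(0, len(samples), factor)
    ]


taps = lowpass_taps(cutoff_hz=7_000, sample_rate=48_000)
tone = [math.sin(2 * math.pi * 1_000 * n / 48_000) for n in range(4_800)]
print(len(decimate(tone, 3, taps)))  # 1600 samples, i.e. 0.1 s at 16 kHz
```

The filter step matters: dropping samples without it would fold content above 8 kHz back into the speech band as aliasing.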
05 Is WAV actually more accurate than MP3 for transcription?
Marginally, yes: usually 1-2 points of WER on clean speech. The bigger gap shows up on sibilants and quiet passages, where MP3's psychoacoustic compression discards information the recognizer would have used. For archival or forensic work, WAV is the right call.
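WER (word error rate) is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. A minimal sketch of the standard edit-distance calculation is below; the lowercasing and whitespace tokenisation are simplifying assumptions, and real evaluations usually normalise punctuation and numerals as well.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)


ref = "straight to the boatyard lights were still on when i pulled in"
hyp = "straight to the boat yard lights were still on when i pulled in"
print(round(word_error_rate(ref, hyp), 3))  # 0.167
```

Here one word split in two costs a substitution plus an insertion, so 2 errors over 12 reference words. A "1-2 point" difference means one or two such errors per hundred words.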
06 Are BWF metadata and timecode preserved?
We read the BWF chunks (bext, iXML) and use the start timecode to align the transcript to the session timeline. The original WAV is never modified; we work on a copy that's deleted within 24h.
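For context, the start timecode in a Broadcast Wave file lives in the `bext` chunk as TimeReference: a 64-bit count of samples since midnight, stored as two little-endian 32-bit words after the description, originator, reference, date, and time fields. The sketch below shows one way to turn that into an offset for transcript timestamps. The helper names are hypothetical, and it assumes you have already extracted the `bext` chunk payload and know the sample rate from the `fmt ` chunk.

```python
import struct

# Fixed-width text fields that precede TimeReference in the bext chunk:
# Description(256) + Originator(32) + OriginatorReference(32)
# + OriginationDate(10) + OriginationTime(8)
TIME_REFERENCE_OFFSET = 256 + 32 + 32 + 10 + 8


def bext_start_seconds(bext_payload: bytes, sample_rate: int) -> float:
    """Seconds since midnight at which the recording starts."""
    low, high = struct.unpack_from("<II", bext_payload, TIME_REFERENCE_OFFSET)
    samples_since_midnight = (high << 32) | low
    return samples_since_midnight / sample_rate


def to_timecode(seconds: float) -> str:
    """Format seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


# Synthetic payload: a 48 kHz recording that started at 04:45:00.
payload = bytes(TIME_REFERENCE_OFFSET) + struct.pack("<II", 48_000 * 17_100, 0)
start = bext_start_seconds(payload, 48_000)
print(to_timecode(start + 12.5))  # 04:45:12.500
```

Adding the start offset to each segment's file-relative time places the transcript on the same clock as the other recorders in the session.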
07 Can I drop a folder of WAV files from a DAW session export?
Yes. Batch upload accepts up to 50 files at once. Each WAV gets its own job and transcript. If they're stems from one session, you can also merge them into a single multi-track WAV before upload and we diarize per channel.
08 How long does a 1-hour stereo WAV actually take?
Upload is the slowest part: a 1-hour 48 kHz / 24-bit stereo WAV is about 1 GB and takes 2-5 minutes on typical broadband. Once uploaded, transcription itself runs in roughly 4-6 minutes on the standard queue.

Drop the WAV. Keep the full signal. See what comes out.

30 free minutes every month. No card. Per-track diarization, 32-bit float supported, source audio deleted in 24h.

Start free