Transcribe WAV files with speaker labels. Lossless accuracy.

Drop the WAV recording from your field rig, DAW bounce, or interview kit. We keep the 24-bit headroom, run diarization on the raw PCM, and return a timestamped transcript and SRT in minutes.

Drop a file, or pick one

MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously

Paste a link, we’ll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more

Record straight from your browser

Sign up takes 30 seconds — recording opens right after, in the dashboard.

No card required · ~90s per 60-min file · SRT · VTT · DOCX · TXT · Files auto-deleted in 24h

↓ See what comes out

Raw PCM in. Clean transcript out.

Lossless WAV keeps consonants, sibilants, and quiet words intact, with no MP3 smear on the consonants. If the file is multi-track (one speaker per channel), we skip acoustic diarization and use the channel layout instead.
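As a rough illustration of that rule, the sketch below reads the channel count from a WAV header with Python's standard `wave` module and picks a diarization mode. The function name `pick_diarization`, the `one_speaker_per_channel` flag, and the mode strings are hypothetical and are not the service's API. It assumes integer PCM, since the `wave` module does not read 32-bit float files.

```python
import io
import wave

def pick_diarization(wav_file, one_speaker_per_channel):
    """Return (mode, channel_count) for a PCM WAV path or file object."""
    with wave.open(wav_file, "rb") as w:
        channels = w.getnchannels()
    # Channel routing only makes sense when each mic has its own track.
    if one_speaker_per_channel and channels > 1:
        return "per-channel", channels
    return "acoustic", channels

# Build a 2-channel, 24-bit, 48 kHz test file in memory (10 ms of silence).
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(3)
    w.setframerate(48000)
    w.writeframes(b"\x00" * (2 * 3 * 480))
buf.seek(0)
print(pick_diarization(buf, one_speaker_per_channel=True))  # ('per-channel', 2)
```

A plain stereo mix with both voices on both channels would still need acoustic diarization, which is why the sketch takes the flag rather than inferring it from the channel count alone.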

WAV · 48 kHz / 24-bit · REC 2 tracks · 1h 12m · ~1.24 GB
auto-detected en-GB · stereo PCM · uncompressed
~90s
Transcript · streaming · 97% accuracy
S1

Take me back to that morning in 1978. What time did the report come in?

S2

Just after one o'clock, give or take. The kettle is the thing I remember.

S1

And from there you went straight to the harbour?

S2

Straight to the boatyard. The fish was already in by the time I got there.

97% on per-track WAV · SRT · DOCX · TXT · JSON

↓ This is the dashboard

This is what loads when the job finishes.

Same layout as the real dashboard — Summary, full Transcript, Speakers tab, Exports. Key points and action items extracted automatically. Auto-tags on every job.

Try it on your own file — it's free

Three real options · honest comparison

Adobe Audition. Descript. Or us.

Audition Speech to Text is bundled with Creative Cloud and lives in the timeline. Descript pulls the WAV into its own editor. We take the file as-is, return standard exports, and don't ask you to move your workflow into ours.

Option 01

Adobe Audition / Premiere

Transcript panel in the Adobe timeline. Tied to Creative Cloud and the project file.

Requires: Creative Cloud subscription
Speaker diarization: Yes, mixed-down only
Multi-track WAV: Flattened before STT
Export: SRT · CSV · XML
Languages: 18, manual select
Price: ~$23/mo (single app)
Best for: Editors already working in Premiere or Audition who want captions tied to the timeline.
Option 02

Transcription.Solutions

Drop the WAV. Per-channel diarization when the file is multi-track. Source deleted in 24h.

Requires: Nothing, just the file
Speaker diarization: Per-track or acoustic
Multi-track WAV: Up to 16 channels
Export: SRT · VTT · DOCX · TXT · JSON
Languages: 99, auto-detected
Price · per min: $0.03
Best for: Anyone holding a raw WAV: field recordists, podcasters bouncing from a DAW, oral history archivists, researchers.
Option 03

Descript

Drop the WAV into the Descript editor. Powerful, but you have to work inside it.

Requires: Descript account + import
Speaker diarization: Acoustic, EN-tuned
Multi-track WAV: Imported as separate clips
Export: TXT · SRT · DOCX
Languages: 23, accuracy varies
Price: $16–24/user/mo
Best for: Podcast editors who want to edit audio by editing text, which is Descript's superpower.

Pricing as of 2026. Adobe and Descript features change over time; check their current docs before relying on this table.

Specific to WAV

Three things that break in generic transcription tools.

Most uploaders transcode the WAV before handing it to the recognizer. We don't.

What goes wrong

  1. Multi-track WAV gets flattened. A 4-channel field recording from a Sound Devices MixPre is summed to mono before STT, and the per-mic separation you recorded is lost.
  2. 32-bit float WAVs from a Zoom F-series or MixPre are rejected or truncated to 16-bit, and the headroom recovery is gone.
  3. 96 kHz / 24-bit interviews take a long time to upload because the tool re-encodes them to MP3 in the browser before sending.

What we do instead

  1. Drop the multi-track WAV as-is (up to 16 channels). We read the channel layout from the WAV header and assign one speaker per channel, with no acoustic guessing.
  2. 32-bit float is fully supported. We keep the float headroom while normalising for the recognizer, so peaks are not clipped at 0 dBFS.
  3. Direct binary upload, no transcode in the browser. A 2 GB WAV moves at your full bandwidth, and processing starts as soon as the last byte lands.
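To make the float-headroom point concrete, here is a minimal sketch contrasting clipping with gain scaling. It is an illustration only, not the service's pipeline: `normalise_float` and its `target_dbfs` parameter are hypothetical names, and samples are assumed to be floats where 1.0 equals 0 dBFS.

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS, where |1.0| is 0 dBFS. Float audio may exceed 0."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)

def clip_to_fixed_range(samples):
    """What a naive integer conversion does: anything over full scale is flattened."""
    return [max(-1.0, min(1.0, s)) for s in samples]

def normalise_float(samples, target_dbfs=-1.0):
    """Scale the whole signal so its peak sits at target_dbfs; nothing is clipped."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    gain = 10 ** (target_dbfs / 20) / peak
    return [s * gain for s in samples]

hot = [0.5, 2.0, -1.5, 0.25]        # peaks roughly 6 dB over full scale
clipped = clip_to_fixed_range(hot)   # [0.5, 1.0, -1.0, 0.25]: transient shape is lost
scaled = normalise_float(hot)        # ratios between samples are unchanged
print(round(peak_dbfs(hot), 2), round(peak_dbfs(scaled), 2))
```

Clipping changes the waveform itself, while gain scaling only changes its level, which is why a float recording that ran hot can still be recovered.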

Recommended job settings for WAV

Drop the WAV and these settings are applied by default. You can change them per job from the form.

Sample rate
Native (no downsample)
Bit depth
24-bit / 32-float preserved
Diarization
Per-channel when multi-track
Speaker model
Interview · 2-8 speakers
Filler words
Kept (toggle off if you prefer)
Export
DOCX · SRT · timestamped TXT

Accuracy · real-world numbers

97%+ on per-track WAV. WAV gives the recognizer the cleanest possible signal.

Because WAV holds raw PCM with no perceptual compression, consonants and sibilants are not degraded the way MP3 degrades them. The recognizer hears what the microphone heard. The numbers below come from real customer WAV jobs in production.

98%
Studio WAV · single speaker

48 kHz / 24-bit, large-diaphragm condenser, treated room. Narration, audiobook, and voice-over bookings land here.

96%
Multi-track interview WAV

One channel per speaker (lavs or boundary mics). Diarization is just channel routing, so the remaining errors are text-only.

92%
Handheld field recorder

Zoom H5, Tascam DR-40, similar. Stereo XY pickup, 2-3 speakers, room reflections. Most podcast WAVs land here.

85%
Noisy environment field WAV

Outdoors, café, vehicle. Lossless capture helps, because the noise is real rather than a codec artefact, but overlapping speech still cuts accuracy.

Frequently asked questions

8 questions people ask about WAV transcription.

01 What is the maximum WAV file size?
5 GB per file on the standard plan, which is roughly 4.8 hours of stereo 48 kHz / 24-bit, or about 2.4 hours of stereo 96 kHz / 24-bit. Larger files are supported on the team plan; ask us before you start the upload.
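The size of an uncompressed PCM file follows directly from sample rate, bit depth, and channel count, so the limit can be checked for any format. A minimal sketch, assuming "5 GB" means decimal gigabytes and ignoring the small WAV header; the function names are illustrative only:

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Data rate of uncompressed PCM: rate x bytes per sample x channels."""
    return sample_rate_hz * (bit_depth // 8) * channels

def hours_that_fit(limit_bytes, sample_rate_hz, bit_depth, channels):
    """Hours of audio that fit under a size limit at the given format."""
    rate = pcm_bytes_per_second(sample_rate_hz, bit_depth, channels)
    return limit_bytes / rate / 3600

LIMIT = 5 * 10**9  # assumption: the plan limit is 5 decimal gigabytes

for rate, depth, ch in [(48_000, 24, 2), (96_000, 24, 2), (48_000, 24, 1)]:
    hours = hours_that_fit(LIMIT, rate, depth, ch)
    print(f"{rate} Hz / {depth}-bit / {ch} ch: {hours:.1f} h")
```

Doubling the sample rate or the channel count halves the recording time that fits, which is worth checking before a long multi-track session.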
02 Do you support 32-bit float WAV from a Zoom F-series or MixPre?
Yes, fully. We read the float samples without clamping at 0 dBFS, which preserves the loud transients you were planning to pull down in post. Most generic uploaders truncate to 16-bit on the way in.
03 My 4-channel WAV from a field recorder has one mic per person. Does diarization work?
It does. Drop the polyphonic WAV (don't bounce it to stereo first). We read the channel layout from the WAV header and assign one speaker per channel, which is more reliable than acoustic diarization on similar-sounding voices.
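WAV stores multi-channel audio interleaved, one sample per channel in each frame, so per-channel routing starts with deinterleaving. The sketch below shows that step under simple assumptions (integer PCM already read into memory); `split_channels` and `label_speakers` are hypothetical helpers, and the S1, S2 labelling is only an illustration.

```python
def split_channels(frames, channels, sample_width):
    """Deinterleave raw PCM frames into one bytes object per channel."""
    frame_size = channels * sample_width
    if len(frames) % frame_size:
        raise ValueError("frame data is not a whole number of frames")
    out = [bytearray() for _ in range(channels)]
    for start in range(0, len(frames), frame_size):
        for ch in range(channels):
            offset = start + ch * sample_width
            out[ch] += frames[offset:offset + sample_width]
    return [bytes(b) for b in out]

def label_speakers(channels):
    """Hypothetical labelling: channel 0 -> S1, channel 1 -> S2, and so on."""
    return {ch: f"S{ch + 1}" for ch in range(channels)}

# Two frames of 4-channel, 24-bit audio; each sample is tagged with its channel number.
frames = b"".join(bytes([ch, 0, 0]) for _ in range(2) for ch in range(4))
tracks = split_channels(frames, channels=4, sample_width=3)
print(label_speakers(4))
print([t.hex() for t in tracks])
```

Each resulting track can then be transcribed on its own, so who spoke is decided by which microphone picked it up rather than by voice similarity.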
04 Do you downsample a 96 kHz WAV?
The recognizer runs at 16 kHz internally, which is enough to cover the frequency range of speech. We still keep your full file and use it for post-processing steps such as noise gating. Exports reference the original timeline.
05 Is WAV really better than MP3 for transcription?
Yes, modestly: about 1-2 points of WER on clean speech. The gap comes from sibilants and quiet passages, where MP3 psychoacoustic compression discards detail the recognizer would have used. For archival or forensic work, WAV is the right choice.
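For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. A minimal sketch with made-up sentences follows; accuracy is often quoted as 1 minus WER, though that convention is an assumption here and is not defined on this page.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)

ref = "the quiet passages carry the detail"
hyp = "the quite passages carry detail"
print(f"WER: {word_error_rate(ref, hyp):.1%}")  # one substitution + one deletion over 6 words
```

On this scale a 1-2 point difference means one or two extra wrong words per hundred.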
06 Are BWF metadata and timecode preserved?
We read the BWF chunks (bext, iXML) and use the start timecode to align the transcript with your session timeline. The original WAV is never modified: we work on a copy that is deleted in 24h.
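As an illustration of where the start timecode lives, the sketch below walks the RIFF chunks of a WAV file and reads the 64-bit TimeReference field from the `bext` chunk, which counts samples since midnight. It is a simplified reader written for this example, not the service's parser: it ignores iXML and RF64 files, and the helper names are hypothetical.

```python
import io
import struct

def read_bext_time_reference(stream):
    """Return the bext TimeReference (samples since midnight), or None if absent."""
    riff, _size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return None
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        if chunk_id == b"bext":
            body = stream.read(chunk_size)
            # TimeReference follows Description(256), Originator(32),
            # OriginatorReference(32), OriginationDate(10), OriginationTime(8).
            low, high = struct.unpack_from("<II", body, 338)
            return (high << 32) | low
        # Chunks are word-aligned: skip the body plus a pad byte if the size is odd.
        stream.seek(chunk_size + (chunk_size & 1), io.SEEK_CUR)

def to_timecode(sample_count, sample_rate):
    """Format a sample offset as HH:MM:SS (frames omitted for simplicity)."""
    seconds = sample_count // sample_rate
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"

# Synthetic file: a bext chunk whose TimeReference is 10:00:00 at 48 kHz.
bext = bytearray(602)
struct.pack_into("<II", bext, 338, 10 * 3600 * 48000, 0)
body = b"WAVE" + b"bext" + struct.pack("<I", len(bext)) + bytes(bext)
wav = io.BytesIO(b"RIFF" + struct.pack("<I", len(body)) + body)
print(to_timecode(read_bext_time_reference(wav), 48000))  # 10:00:00
```

Adding this offset to each transcript timestamp is what lets a caption line up with the same moment on the recorder's timeline.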
07 Can I drop a folder of WAV files from a DAW session export?
Yes. Batch upload supports up to 50 files at a time. Each WAV becomes its own job and transcript. If the files are stems from the same session, you can combine them into one multi-track WAV before upload and diarize per channel.
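Combining stems can be done in most DAWs, but as a rough sketch it can also be scripted with Python's standard `wave` module. The example assumes mono 16-bit or 24-bit integer PCM stems with matching sample rates, pads shorter stems with silence, and loads everything into memory, so it suits short files only. `merge_stems` and `make_stem` are hypothetical helpers.

```python
import io
import wave

def merge_stems(stem_files, out_file):
    """Interleave mono PCM WAV stems into one multi-channel WAV (one stem per channel)."""
    readers = [wave.open(f, "rb") for f in stem_files]
    try:
        width, rate = readers[0].getsampwidth(), readers[0].getframerate()
        for r in readers:
            if (r.getnchannels(), r.getsampwidth(), r.getframerate()) != (1, width, rate):
                raise ValueError("stems must be mono with matching bit depth and sample rate")
        data = [r.readframes(r.getnframes()) for r in readers]
    finally:
        for r in readers:
            r.close()
    longest = max(len(d) for d in data)
    data = [d + b"\x00" * (longest - len(d)) for d in data]  # pad short stems with silence
    frames = bytearray()
    for i in range(0, longest, width):
        for d in data:
            frames += d[i:i + width]  # one sample from each stem per frame
    with wave.open(out_file, "wb") as w:
        w.setnchannels(len(data))
        w.setsampwidth(width)
        w.setframerate(rate)
        w.writeframes(bytes(frames))

def make_stem(value, n_frames):
    """Build a tiny 16-bit mono test stem in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(48000)
        w.writeframes(bytes([value, 0]) * n_frames)
    buf.seek(0)
    return buf

out = io.BytesIO()
merge_stems([make_stem(1, 3), make_stem(2, 2)], out)
out.seek(0)
with wave.open(out, "rb") as w:
    print(w.getnchannels(), w.getnframes())  # 2 3
```

The order of `stem_files` becomes the channel order, so it is worth keeping a note of which speaker is on which stem.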
08 How long does a 1-hour stereo WAV actually take?
The upload is the slow part: a 1-hour 48 kHz / 24-bit stereo WAV is about 1 GB and takes 2-5 minutes on typical broadband. Once uploaded, transcription runs in 4-6 minutes in the standard queue.

Drop the WAV. Keep the lossless quality. See what comes out.

30 free minutes every month. No card. Per-track diarization, 32-bit float supported, source audio deleted in 24h.

Start free