Diarisa MP3 go ba letseno.Sefako sa boledi, lipuo tse 100+.

Laola faele ya MP3 ka bitrate efe kapa efe ho tloha 64 go 320 kbps. Fumana sengwalo sa boemo ba nako, se diarisiletsoeng ka lipuo tse 99 — ha ho phetolo ya formate, ha ho re-encoding, ha ho lebala ka pono.

Drop a file, or pick one

MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously

Paste a link, we’ll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more

Record straight from your browser

Sign up takes 30 seconds — recording opens right after, in the dashboard.

No card required~90s per 60-min fileSRT · VTT · DOCX · TXTFiles auto-deleted in 24h

↓ Sheba se ka tlago

MP3 e ka mo. Sengwalo se diarisiletsoeng se ka tlago.

Re bala di-header tsa MP3 frame ka kotsi — VBR, CBR, joint-stereo, sekopu se sele (LAME, Fraunhofer, FFmpeg). Haeba faele e le le le le le le le mpho wa nnete le le le boledi ba ka mo di-channel tse fapaneng, re e sebelisa go arola molumo. Mono mix-down e aba ka mabuto a acoustic diarization.

interview-tape-04.mp3REC 192 kbps · stereo · 38:42
a hlokomelwa ka ho ikhethela en-GB44.1 kHz · LAME 3.100
~90s
Sengwalo · streaming95% nepahalo
S1

Kahoo ke neng ha u qala ho ela hloko hore archive e ne e se e le botlano?

S2

Mohlomong ka 2019, ha re qala ho reka di-reel-to-reel.

S1

Le di-tape tse sa leng teng — na di ne di sa kataloge kae e le kae?

S2

Ho na le index ea pampiri ho tloha '78, empa halofo ea eona e suwe ka metsi.

95% ka 192 kbps stereoSRT · DOCX · TXT · JSON · VTT

↓ This is the dashboard

This is what loads when the job finishes.

Same layout as the real dashboard — Summary, full Transcript, Speakers tab, Exports. Key points and action items extracted automatically. Auto-tags on every job.

Try it on your own file — it's free

Dikgetsi tse tharo tse nnete · letotokelo leo le itseng

Whisper e mahala ea lapeng. Otter kapa Sonix. Kapa rona.

O ka sebetsa Whisper sehlopheng sa hao ka mahala haeba o na le boteknoloji. Otter le Sonix di amoha di-MP3 upload kahare ga di-dashboard tsa subscription. Re nka faele, ra aba sengwalo, mme ha re ka mo diphaphaswing tsa UI.

Option 01

Whisper ya lapeng / open source

E mahala haeba o na le GPU le motsheare wa hora. Ha ho diarization ea boledi ka mofini.

Ho lokisetswaPython + CUDA + dimôdele tse 10 GB
Diarization ea bolediHa e akaretsa (kgetsi ya pyannote)
Lebaka la nako · MP3 ea hora5–40 min ka GPU ea molemi
Lipuo99, hoh e le modele o moenyane o theoha tlase ho 80%
Ho ntshaTXT / SRT / VTT / JSON
ThekoMahala + kurent ya hao
Best forBatsebi ba tekinoloji ba nang le GPU, ba sa hloke sefako sa boledi, ba batla privacy ea lapeng ka botlano.
Option 02

Transcription.Solutions

Laola MP3. Fumana letseno le sefako sa boledi le nako e fokolane × 0.025.

Ho lokisetswaDrag-and-drop, ha ho molato o lokahalang go leka
Diarization ea bolediE akaretsa (Pro & Business planning)
Lebaka la nako · MP3 ea hora~metsotsoana e 90
Lipuo99, e hlokomelwa ka ho ikhethela
Ho ntshaSRT · VTT · DOCX · TXT · JSON
Theko · per min$0.03
Best forMotho e e nang le MP3 — ketsano ya motlaladi, podcast export, voice memo, archival dub — eo o se batlong feela letseno le nepahala.
Option 03

Otter / Sonix

Dashboard e nepahetseng, kgwedi ea minete e matelelo, English-tuned. File upload e utlwala joaloka feature e e ka direng.

Ho lokisetswaMolato + planned planning
Diarization ea bolediAcoustic, EN-leaning
Lebaka la nako · MP3 ea hora5–10 min ka pono
LipuoOtter EN-only; Sonix ~40
Ho ntshaE boreletswe ho pele ga di-tier tse paid
Theko$17+/kgwedi kapa $10+/hora (Sonix)
Best forDigrupo tse batla editor ea sengwalo le collaboration UI thapelo go na le flow ya file→letseno e e hloekileng.

Pricing le feature availability a nepahetse ha May 2026. Whisper performance e fetoha ka size ya model le hardware.

E ikhethileng ho MP3

Dilo tse tharo tse dikela batho ka ditholoele tsa diariso ea sebopeho sa kakaretso.

MP3 ke sebopeho, ha se mohuta wa kgetsi — eo e bolelang hore dimodo tsa ho hlola e tuka ho tloha ho encoder, ha se puo.

Se ka etsahallang

  1. 1Di-header tsa VBR di kopuoa hantle. Ditholoele tse itseng di bala variable-bitrate MP3s e le fixed-rate mme di bala nako e fosahetseng — timestamps di felela ka metsotsoana ngata lori-long hour-long file.
  2. 2Joint-stereo e thibelwa ho mono nakong ea upload preprocessing. O lahla per-speaker channel separation eo e ne e le file.
  3. 3ID3 album art e emetseng e hlokahala ho batho ba bang ba uploaders — ba hana faele joaloka 'ha se audio e hloekileng' kapa ba e ntsha mme ba re-encode, ba thopa nepahalo thapelo.

Se re e etsuang bakeng sa hora

  1. 1Re sebelisa Xing/LAME header ha e le teng mme frame-count fallback ha e se. VBR timestamps di ntse di nepahala go ±0.1 s lori-long multi-hour files.
  2. 2Joint-stereo le true-stereo MP3s di dekompiled go L/R PCM pele ho diarization. Haeba boledi ba hao ba ne ba lengile, re ba boloka ba arohane.
  3. 3ID3v1, ID3v2, APE tags, embedded art — ka ho kopisa kakaretso. Ha re re-encode hona hao ya MP3.

Ditlhamilodiswa tse itekanetseng bakeng sa di-upload tsa MP3

Diphakwane tseo di lekana ~80% ya difile tsa MP3. Override per-job ho tloha fomu.

Decoder
Frame-accurate, ha ho re-encode
Diarization
Channel split haeba stereo, e se acoustic
Speaker model
Auto · boledi ba 1-12
Puo
Auto-detect ho tloha first 30 s
Filler words
Ntshitswe (toggle go boloka)
Export bundle
DOCX + SRT + timestamped TXT

Accuracy · real-world numbers

95%+ ka 192 kbps stereo. E sebetsa go tlase ho 64 kbps mono.

MP3 nepahalo e fapuoa ke se encoder e se boloketseng, ha se ka rona. Perceptual compression e ka saesae ~96 kbps e boloka mohlala wa puo haholo; tlase ho 64 kbps, sibilants le consonants di qala ho senyeha. Dinomoro mo tlase ke ho tloha di-MP3 tse dipono tsa bareki ba lebopo.

96%
320 kbps stereo, studio source

E batla e le sa hlolehwa bakeng sa puo. Podcast masters, dictation app exports, professional interview rigs. Diarization e hloekileng haeba boledi ba hlapara ka li-channel tse fapaneng.

95%
192 kbps stereo, boledi ba 2-3

Bitrate e sebelisetsoang haholo bakeng sa spoken-word MP3s. Zoom exports, Riverside downloads, voice recorders default. Compression artifacts a sa utlete ka recognizer.

91%
128 kbps mono, puisano

Voice memo defaults ka diphone tse ngata. Acoustic diarization e berakanya boledi ba 2-4. Dinomoro le proper nouns ka mehla di hloka leele.

84%
64 kbps mono, archival / phone-dump

Lesaka-leso ole answering-machine rips, lecture archives, narrow-band sources. High-frequency consonants (f/s/sh) di fupana. E ntse e le e lokelang — ikarabella leqephelo.

Dipotso tse tloaelehileng

Dilo tse 8 tse batho ba di botsang tungkolo ea diariso ya MP3.

01Ke efe ya bitrate ya MP3 e e kang go amohela go ba letseno le molemo?+
64 kbps ke floor e rarahaneng. Mo tlase ho eona, sibilants (s, sh, f) di kopane-kopane ho ba setata mme word error rate e phahamela ba fetula 20%. Haeba o rekota hape, ikhethella 128 kbps mono kapa 192 kbps stereo — se sele se palo ya go fetela bakeng sa puo.
02Na nka lokisa MP3 ya ka go WAV pele?+
Aowa. Re-encoding MP3 → WAV ha e fapanye nepahalo hobane datha eo encoder e e hlobotse e se e sa buoa. Laola MP3 ntle. Re decode frames ka mo memory mme re laola PCM go recognizer.
03Na stereo MP3 e ka nthusa go ba sefako se setle go fapana le mono?+
Haeba feela boledi ba ne ba rekotiwe ka di-channel tse fapaneng — MP3 tse stereo tse ngata di na le sekopu se le sele ka mahlapelo a mabedi ('dual mono') mme di sa fumane efe. True channel-split (mohlala Riverside exports, two-mic field rigs) e re laea go leba acoustic diarization mme e laela boledi go botlana habonolo.
04Ke efe ya size ea faele ya MP3 e kgolwane eo o e amohetsoeng?+
5 GB per upload, eo e leng hoo tsela le hora e 60 ka 192 kbps kapa hora e 90 ka 128 kbps. Haeba faele ya hao e le kgolwane rona re bontsha chunked upload — jwa ho e banahanya hao.
05Hora e 60-minute MP3 e nka nako e kae ho diarisa?+
Ka thejana metsotsoana e 90 ho tloha upload-complete go transcript-ready, ha lebaka la bitrate le le kae. Decoding MP3 frames e lebaka la lekonyana; nako e le ho recognizer. Diarization e kenella 5-10 seconds ka difile tse boledi ba mang.
06MP3 ya ka e na le mmino wa tlhago — na sengwalo se tla hlologa?+
Quiet bed music tlase ga puo e lokile. Mmino o makatsang o atamana le moleme (intro stings, scoring under interviews) ka mehla o kopa misrecognitions ka syllables tse atamang. Toggle music suppression ka job form go pre-filter.
07Na o kgona ho berakanya MP3s tse rikiwa ho phone voicemail kapa answering machines?+
Eya, leha dilo tse dino dino dino dino dino 8 kHz narrow-band re-encoded e le MP3 — ceiling ea audio quality e itisitswe ke original PSTN capture, ha se MP3 wrapper. Lapela 78-85% nepahalo ka mofuta ona wa source, eo e leng tse tshwane re fumana ka call.
08Na le boloka MP3 ya ka ha transcript e felile?+
Difile di hlakatswa ka morago ga matsatsi a 30 ka mofini, kapa nako nngwe ntshotswe ka dashboard. Sengwalo se sa khone hodima molato wa hao ho fihlela o sele. Ha re sebelise sekopu sa customer ho hopa model e le kae — ha le ise.

Laola MP3 ya hao. Fumana letseno mo metsotsoana e 90.

Minete e 30 e mahala ka kgwedi. Ha ho kadi e hlokahalang. Sefako sa boledi, lipuo tse 99, formats ya export kakaretso e akaretsa.

Qala mahala