Transcription for podcasters. Show notes and SRT in one pass.

Drop your podcast episode master — MP3, WAV, or a YouTube link. Get a speaker-labeled transcript, AI show notes with key points and tags, plus an SRT for the video cut.

Drop a file, or pick one

MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously

Paste a link, we’ll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more

Record straight from your browser

Sign-up takes 30 seconds. Recording opens right after, in the dashboard.

No card required
~90s per 60-min file
SRT · VTT · DOCX · TXT
Files auto-deleted in 24h

↓ One file in, four artifacts out

Episode master in. Transcript, show notes, SRT, tags out.

Most podcasts come in as a post-production stereo MP3 with host and guest already mixed together. We split them acoustically, detect the music intro, and start the transcript at the first spoken word.

Episode 142 master
REC · 2 speakers · 48:21 · MP3 192 kbps
Auto-detected en-US · 44.1 kHz stereo · post-mix
~90s
Transcript · streaming · 95% accuracy
S1

Welcome back to the show. Today I'm talking with Priya Anand about her new book on supply chains.

S2

Thanks for having me, Jordan. It's been a wild three years since we last spoke.

S1

So the book opens with the Suez blockage — why start there?

S2

Because it was the moment everyone non-logistics suddenly cared about containers.

95% on stereo post-mix
SRT · DOCX · TXT · Show notes MD

↓ This is the dashboard

This is what loads when the job finishes.

Same layout as the real dashboard — Summary, full Transcript, Speakers tab, Exports. Key points and action items extracted automatically. Auto-tags on every job.

Try it on your own file — it's free

Three real options · honest comparison

Descript. Castmagic. Or us.

Descript is an editor first, transcription second. Castmagic is show notes first, transcript second. We focus on the file → transcript → show notes pipeline and stay out of your editor.

Option 01

Descript

Audio editor with built-in transcription. Great for editing-by-text workflows, heavier than you need if you just want a transcript.

Primary use: DAW + word-edit
Speaker diarization: Acoustic, EN-strong
Show notes: Underlord AI add-on
Export: SRT · TXT · project file
Free tier: 1 hr/mo transcription
Cost: $24/user/mo (Creator)
Best for: Solo podcasters who edit episodes by deleting words from a transcript and want one app for everything.
Option 02

Transcription.Solutions

Drop the episode master. Transcript, show notes, tags, SRT — all four in one pass. No editor, no lock-in.

Primary use: Transcript + show notes
Speaker diarization: Acoustic + per-track upload
Show notes: Free on every plan
Export: SRT · VTT · DOCX · MD · JSON
Free tier: 30 min/mo, no card
Cost (per min): $0.03
Best for: Shows that already have an editor (Logic, Hindenburg, Reaper) and just want clean text + notes after the episode is mixed.
Option 03

Castmagic

Show-notes-as-a-service. Drag in the file, get a slick content pack. Transcript is more of a byproduct.

Primary use: Content repurposing
Speaker diarization: Yes, EN-tuned
Show notes: Many templates, paid only
Export: SRT · TXT · template MD
Free tier: Trial only
Cost: ~$23+/mo (Starter)
Best for: Marketing-heavy shows that need 12 social posts, 4 newsletter drafts, and a LinkedIn carousel per episode.

Pricing is approximate as of 2026 and varies by vendor. Free tiers and add-on AI features change frequently.

Specific to podcasting

Three things that bite podcasters on generic transcription tools.

Tell us a few things about the episode on upload and the output stops needing a cleanup pass.

What goes wrong

  1. Music intro transcribed as gibberish. The recognizer tries to read lyrics or hum patterns and inserts nonsense like 'la la na' across the first 30 seconds.
  2. Guest name spelled phonetically. 'Priya Anand' comes out 'Pria Anan' or 'Prea Ahnand', and it's wrong every single time it appears.
  3. Laughter and crosstalk get rendered as filler words or attributed to the wrong speaker, especially during energetic exchanges.

What to flip here

  1. Toggle Skip music intro/outro on the job form. We detect non-speech segments and start the transcript at the first spoken word; timestamp offsets adjust automatically.
  2. Paste guest name and brand mentions into Custom vocabulary. We pass it as a recognizer hint, so spelling stays consistent across the whole episode.
  3. Turn on Show notes to get a 2-4 sentence summary, 3-7 key points, action items, and 3-8 topic tags rendered in markdown, ready to paste straight into your CMS.
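
To make the show notes shape concrete, here is a minimal sketch of how a summary, key points, action items, and tags could be rendered as markdown. The function name `render_show_notes`, the heading layout, and the sample content are assumptions for illustration only; the actual file layout may differ. The only details taken from this page are the ranges (3-7 key points, 3-8 tags) and the fact that action items are optional.

```python
def render_show_notes(summary, key_points, tags, action_items=()):
    """Render show notes as markdown (hypothetical layout, not the real export)."""
    # Ranges quoted on this page: 3-7 key points and 3-8 topic tags.
    if not 3 <= len(key_points) <= 7:
        raise ValueError("expected 3-7 key points")
    if not 3 <= len(tags) <= 8:
        raise ValueError("expected 3-8 tags")

    parts = ["## Summary", "", summary, "", "## Key points", ""]
    parts += [f"- {point}" for point in key_points]
    # Action items appear only if any were mentioned in the episode.
    if action_items:
        parts += ["", "## Action items", ""]
        parts += [f"- [ ] {item}" for item in action_items]
    parts += ["", "## Tags", "", ", ".join(tags)]
    return "\n".join(parts) + "\n"


# Placeholder content loosely based on the sample transcript above.
notes = render_show_notes(
    summary="Jordan talks with Priya Anand about her new book on supply chains.",
    key_points=[
        "Why the book opens with the Suez blockage",
        "Placeholder key point two",
        "Placeholder key point three",
    ],
    tags=["supply chains", "logistics", "books"],
    action_items=["Placeholder action item"],
)
print(notes)
```

Keeping the output plain markdown is what makes it paste cleanly into most CMS editors.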

Recommended job settings for podcasts

Drop an episode and these defaults flip on. Override per-job from the form.

Diarization: Stereo split if 2 speakers
Music detection: Skip intro/outro segments
Filler words: Removed by default
Show notes: Summary + key points + tags
Chapters: Generated from key points
Export: SRT · DOCX · show notes MD

Accuracy · real-world numbers

97% on studio-mic episodes. Holds up on remote-guest calls too.

Podcast accuracy depends mostly on how the guest was recorded, not the host. A studio-mic host paired with a Zoom-only guest scores like the weaker of the two recordings. Numbers below come from real customer episodes, not lab audio.

97%
Per-track upload (Riverside / SquadCast)

Each speaker on a separate WAV. We treat each track independently and skip diarization. Cleanest possible case.

95%
Stereo post-mix, 2 speakers

Host left, guest right, after mastering. The most common podcast shape. Diarization is essentially free from the stereo split.

91%
Mono mix, 3-4 speakers

Roundtable shows or panel format mixed to mono. Similar voices may merge once or twice per hour — a 2-min cleanup pass fixes it.

86%
Remote guest on phone / poor mic

Guest on AirPods through a hotel wifi call. Numbers and proper nouns suffer most. Custom vocabulary recovers most of it.
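
In the per-track case, each speaker's file is transcribed on its own, so the segments still have to be interleaved into one transcript. A minimal sketch of that merge, assuming every segment carries a start time on a shared episode clock and each track's segments are already in time order. `Segment` and `merge_tracks` are hypothetical names, and the timestamps are made up for illustration; this is not the product's API.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float   # seconds from the start of the episode (assumed shared clock)
    speaker: str   # name tagged on the uploaded track
    text: str


def merge_tracks(*tracks):
    """Interleave per-speaker segment lists by start time.

    Each input list must already be sorted by start time, which is the
    natural order a single-track transcript comes out in.
    """
    return list(heapq.merge(*tracks, key=lambda seg: seg.start))


host = [
    Segment(0.0, "Jordan", "Welcome back to the show."),
    Segment(9.5, "Jordan", "So the book opens with the Suez blockage."),
]
guest = [
    Segment(5.2, "Priya", "Thanks for having me, Jordan."),
]

merged = merge_tracks(host, guest)
for seg in merged:
    print(f"[{seg.start:6.1f}s] {seg.speaker}: {seg.text}")
```

Because the speaker label comes from the track itself, no diarization step is needed in this setup.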

Common questions

8 things people ask about podcast transcription.

01. Can I just paste my YouTube or SoundCloud link?
Yes. Paste a public YouTube URL or a hosted episode link (SoundCloud, Buzzsprout, Transistor, Libsyn direct MP3) and we pull the audio on our side. For private feeds, download the file and upload it.
02. Will the music intro be transcribed as 'la la la' nonsense?
Not if Skip music intro/outro is on (it is by default). We detect non-speech audio and start the transcript at the first spoken word. Timestamps in the SRT shift to match so YouTube captions still sync.
03. What's in the show notes file exactly?
A 2-4 sentence episode summary, 3-7 key points as a bulleted list, action items if any were mentioned, and 3-8 topic tags. Rendered as markdown so you can paste straight into WordPress, Ghost, Substack, or your podcast host's episode page.
04. Can you generate chapter markers for Apple Podcasts and Spotify?
Yes — chapters are generated from the key points with timestamps. Export as a separate chapters.txt or embed in the WAV/M4A. Note that Spotify only honors chapters on Anchor-hosted shows, so the txt file is your fallback.
05. I have per-track files from Riverside / SquadCast. Should I upload those?
Yes, please do. Upload each speaker's WAV separately and tag them with names. We transcribe each track independently and merge by timestamp. Accuracy lands around 97% on this setup — the cleanest case we see.
06. Can it flag sponsor reads or ad breaks?
Not automatically yet; that's on the roadmap. For now, drop a marker in your edit (a brief silence or chime) and we'll surface it as a timestamp in the transcript. You can also tag ad segments afterward by searching the transcript for the sponsor's brand name.
07. How long can the episode be?
Up to 6 hours per file in one upload. Most shows run 30-90 minutes, which finishes in 4-8 minutes wall-clock. For a 3-hour interview episode, expect roughly 12-15 minutes from upload to all four artifacts ready.
08. Will the SRT replace YouTube's auto-captions cleanly?
Yes. The SRT breaks lines at ~42 characters, with proper punctuation and optional speaker prefixes. Upload it in YouTube Studio → Subtitles → Add language → SRT. It overrides the auto-generated caption track entirely.
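
For readers who want to see what a ~42-character line break looks like in practice, here is a minimal sketch that formats one SRT cue with an optional speaker prefix. The helper names `srt_timestamp` and `srt_cue` and the cue timings are assumptions for illustration, not the exporter itself. It uses Python's standard `textwrap` module and only wraps lines; a real exporter would also split long passages into several cues.

```python
import textwrap


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index, start, end, text, speaker=None, width=42):
    """Build one SRT cue, wrapping the text at `width` characters per line."""
    if speaker:
        text = f"{speaker}: {text}"  # the speaker prefix is optional
    lines = textwrap.wrap(text, width=width)
    header = f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
    return header + "\n".join(lines) + "\n"


cue = srt_cue(
    1,
    12.0,
    15.25,
    "Because it was the moment everyone non-logistics "
    "suddenly cared about containers.",
    speaker="S2",
)
print(cue)
```

Wrapping happens on word boundaries, so lines come out at or under the limit and never split mid-word.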

Drop your episode. Get the transcript, notes, and SRT.

30 free minutes every month. No card. Speaker labels, show notes, chapters, and every export included.

Start free