Workflow: From episode master to publishable transcript and show notes, the seven-step path most weekly podcast producers settle into by their tenth episode.
Subject: Podcast Production Workflow
Step count: 7 from master to publish
Time per episode: ~25 min wall time
Cost per episode: ≈ $1.92 at 60 min on Pro

Podcast transcription. From master to show notes in one pass.

The producer's path: drop the post-production master, get a transcript with speaker labels, AI-generated chapter markers with timestamps, an editable show-notes draft, and the SRT for the YouTube version of the episode. The full sequence reads below.

Speaker labels on Pro · AI chapter markers · Same SRT works on YouTube · Source deleted in 24 hours
Wall time
~25
Minutes from upload to ready-to-publish show notes for a typical 60-minute interview episode.
Episode cost
$1.92
60 minutes at the Pro plan rate ($19 / 600 min = $0.032/min). Beats freelance transcribers ($60-180/episode) and matches API-only services without the integration cost.
Diarization
2-3
Speakers per episode is the sweet spot — solo monologues and two-host interviews get clean separation. Roundtables of 4+ need a manual review pass.
Definition: Reference passage

Podcast transcription turns recorded podcast audio into a publishable text artifact: a transcript with speaker labels, an AI-generated show-notes draft with timestamped chapter markers, key-quote extraction for social, and a searchable archive of every past episode. Transcription.Solutions handles podcasts three ways — upload the post-production master (MP3 / WAV / FLAC / M4A), paste a SoundCloud or Bandcamp URL, or paste the YouTube link if your show also publishes there. The transcript becomes the source of truth for the episode page, the blog post, the social clips, and the back-catalogue search.

Workflow: From master file to publish

The seven-step podcast publishing workflow

Most weekly producers settle into this exact sequence by episode 10. The tool fits between mastering and publish — the steps below describe what happens around it.

01
0 min

Master out of the DAW

Bounce the post-produced episode out of your DAW (Reaper, Logic, Hindenburg, Adobe Audition) as 16-bit WAV or 320 kbps MP3. Music beds in, intros and outros baked in.

02
+1 min

Drop the file in

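Drag the master onto the upload area. Up to 2 GB on Business, which covers a 4-hour interview as a 16-bit mono WAV (a stereo WAV of that length runs past 2 GB, so export mono or MP3 for very long episodes). Pro 500 MB; Free 100 MB.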
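
To see which plan a given master fits before uploading, the size of an uncompressed WAV can be estimated from its duration. The sketch below is only an illustration: it assumes plain PCM audio, ignores the small file header, and treats 1 GB as 1000 MB. The plan limits are the ones quoted above; the function names are made up for this example.

```python
from typing import Optional

# Upload limits per plan as quoted in the text (1 GB taken as 1000 MB, an assumption).
PLAN_LIMIT_MB = {"Free": 100, "Pro": 500, "Business": 2000}


def wav_size_mb(duration_min: float, sample_rate: int = 44_100,
                bit_depth: int = 16, channels: int = 2) -> float:
    """Approximate size of an uncompressed PCM WAV in MB (header ignored)."""
    bytes_per_second = sample_rate * (bit_depth // 8) * channels
    return duration_min * 60 * bytes_per_second / 1_000_000


def smallest_plan(size_mb: float) -> Optional[str]:
    """Return the first plan whose upload limit fits the file, or None if none does."""
    for plan, limit_mb in PLAN_LIMIT_MB.items():
        if size_mb <= limit_mb:
            return plan
    return None


one_hour_stereo = wav_size_mb(60)                 # about 635 MB
four_hours_mono = wav_size_mb(240, channels=1)    # about 1270 MB
four_hours_stereo = wav_size_mb(240)              # about 2540 MB

print(smallest_plan(one_hour_stereo))    # Business
print(smallest_plan(four_hours_mono))    # Business
print(smallest_plan(four_hours_stereo))  # None
```

A 60-minute stereo WAV is already past the Pro limit, which is why many producers upload the 320 kbps MP3 instead.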

03
+10 min

Diarization + AI summary run

Voices separated, transcript chunked and parallelised, chapter markers and key quotes extracted. A 60-minute episode lands in the dashboard around minute 10.

04
+13 min

Rename Speaker 2 to the guest

Click "Speaker 2" once, type the guest's name. The rest of the transcript updates. If you have a recurring co-host, save them in the speaker library; rename auto-applies next time.
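
For producers who post-process an exported transcript themselves, the rename amounts to a label substitution across every segment. This is a minimal sketch, not the product's implementation: the segment shape (a dict with `speaker` and `text` keys) and the guest name are assumptions for illustration.

```python
def rename_speaker(segments, old_label, new_name):
    """Return a copy of the segments with every old_label replaced by new_name."""
    return [
        {**seg, "speaker": new_name} if seg["speaker"] == old_label else dict(seg)
        for seg in segments
    ]


# Hypothetical transcript segments.
segments = [
    {"speaker": "Speaker 1", "text": "Welcome back to the show."},
    {"speaker": "Speaker 2", "text": "Thanks for having me."},
    {"speaker": "Speaker 2", "text": "Happy to be here."},
]

for seg in rename_speaker(segments, "Speaker 2", "Dana Lee"):
    print(f"{seg['speaker']}: {seg['text']}")
```

The function returns new dicts rather than editing in place, so the original labels stay available if the rename needs to be undone.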

05
+18 min

Edit chapter markers

AI summary suggests 5-9 chapters with timestamps. Drag to reorder, edit titles, drop the obvious ones. Most producers keep ~80% as-is.

06
+22 min

Export to your hosting

Copy chapter syntax into Captivate / Buzzsprout / Transistor / Anchor (each format is slightly different; we provide one for each). DOCX for the show-notes blog post. SRT for the YouTube video version.
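
As an illustration of what timestamped chapter syntax looks like, the sketch below turns a list of chapters into one `HH:MM:SS Title` line per chapter. That line format is a generic assumption, not any specific host's syntax, so check what your hosting service expects. The chapter data and function names are hypothetical.

```python
def format_timestamp(seconds: int) -> str:
    """Format whole seconds as HH:MM:SS."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def chapters_to_description(chapters) -> str:
    """Render chapters as one 'HH:MM:SS Title' line each, sorted by start time."""
    ordered = sorted(chapters, key=lambda chapter: chapter["start"])
    return "\n".join(
        f"{format_timestamp(chapter['start'])} {chapter['title']}" for chapter in ordered
    )


# Hypothetical chapters; start times are in seconds from the top of the episode.
chapters = [
    {"start": 750, "title": "Pricing the first product"},
    {"start": 0, "title": "Intro"},
    {"start": 3725, "title": "Listener questions"},
]

print(chapters_to_description(chapters))
# 00:00:00 Intro
# 00:12:30 Pricing the first product
# 01:02:05 Listener questions
```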

07
+25 min

Publish

Episode out, transcript live on the show page, blog draft saved for the next workday, archive searchable for the listener-emails-asking-which-episode that come back six months later.

Worked example: From the inbox of a working podcast producer

How a 47-episode back catalogue got transcribed in one weekend

A weekly indie business podcast — 47 episodes recorded over 11 months, 56 minutes average. The producer wanted transcripts on every show page for SEO, plus a searchable archive for the increasingly common "which episode was X mentioned in?" listener emails.
01

Pulled all 47 MP3 URLs from the hosting service

Captivate's RSS feed gives direct MP3 URLs per episode. Copy-pasted the 47 URLs into a CSV.
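
The copy-paste step can also be scripted. In a standard RSS 2.0 podcast feed, each `<item>` carries an `<enclosure>` element whose `url` attribute points at the audio file. The sketch below parses a small inline sample feed with the standard library and writes the URLs as CSV. The sample feed and function names are made up; a real run would read the feed XML your host publishes.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import List

# Tiny stand-in for a real podcast feed (hypothetical URLs).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/ep2.mp3" type="audio/mpeg" length="456"/>
    </item>
  </channel>
</rss>"""


def enclosure_urls(rss_xml: str) -> List[str]:
    """Collect the enclosure URL of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    urls = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            urls.append(enclosure.get("url"))
    return urls


def urls_to_csv(urls: List[str]) -> str:
    """Serialise the URLs as a one-column CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["url"])
    writer.writerows([url] for url in urls)
    return buffer.getvalue()


print(urls_to_csv(enclosure_urls(SAMPLE_FEED)))
```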

30 min
02

Bulk-submitted via the API

Hit /jobs/bulk with the 47 URLs and an Authorization header. One webhook URL to receive completion notices.
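
A bulk submission of this kind might be assembled as below. Only the `/jobs/bulk` path, the Authorization header, and the single webhook URL come from the text. The base URL, the Bearer scheme, and every JSON field name are assumptions for illustration, so consult the API reference for the real request shape. The sketch builds the request without sending it.

```python
import json
import urllib.request

API_BASE = "https://api.example.com"  # hypothetical base URL


def build_bulk_request(urls, api_token, webhook_url):
    """Build (but do not send) a POST to /jobs/bulk for a list of episode URLs."""
    payload = {
        # All field names below are assumed, not taken from the real API.
        "urls": list(urls),
        "webhook_url": webhook_url,
        "options": {"diarization": True, "summary": True, "exports": ["docx", "srt"]},
    }
    return urllib.request.Request(
        f"{API_BASE}/jobs/bulk",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",  # Bearer scheme is an assumption
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_bulk_request(
    ["https://example.com/ep1.mp3", "https://example.com/ep2.mp3"],
    api_token="YOUR_TOKEN",
    webhook_url="https://example.com/hooks/transcripts",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it. Keeping construction separate from sending makes the payload easy to inspect before 47 jobs are queued.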

8 min setup
03

Let the queue run overnight

Total audio: 43.5 hours. Diarization on, summary on, DOCX + SRT exports. Pipeline parallelised the queue across workers.

~6 h batch
04

Bulk-replaced episode pages with transcripts

Used Captivate's bulk-update API to add a "Transcript" tab to every old episode page. DOCX downloaded, converted to HTML, pasted as the tab content.

~3 h editorial
Published outcome
Every episode page now ships a real transcript. Three months later the show's organic search traffic had doubled as Google indexed the long-form content.
Total cost: $83.52
Total wall time: ~10 h
Per episode: $1.78
Equivalent at $1.50/min: $3,915
Output: 6 deliverable elements
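
The totals above follow from the 43.5 hours of audio and the Pro rate of $0.032 per minute. A short sketch of the arithmetic, with a hypothetical function name:

```python
def catalogue_totals(audio_hours, episodes, rate_per_min, freelance_rate_per_min):
    """Reproduce the back-catalogue cost figures from hours of audio and per-minute rates."""
    total_minutes = audio_hours * 60
    total_cost = round(total_minutes * rate_per_min, 2)
    return {
        "total_minutes": total_minutes,
        "total_cost": total_cost,
        "per_episode": round(total_cost / episodes, 2),
        "freelance_equivalent": round(total_minutes * freelance_rate_per_min, 2),
    }


totals = catalogue_totals(audio_hours=43.5, episodes=47,
                          rate_per_min=0.032, freelance_rate_per_min=1.50)
print(totals)
# {'total_minutes': 2610.0, 'total_cost': 83.52, 'per_episode': 1.78, 'freelance_equivalent': 3915.0}
```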

What podcasters actually do with the transcript

01

AI show notes with chapters

Auto-generated chapter markers with timestamps. Edit the topic names, drag the order, export as the description for your hosting service (Captivate, Buzzsprout, Transistor — they all accept timestamped chapters).

02

Blog post from the transcript

Use the AI summary outline as the post structure, paste in the most-quotable transcript blocks, ship in 30 minutes. Major source of inbound search traffic for many podcast networks.

03

Search across your back catalogue

Listener emails asking "which episode did you talk about X?" become 5-second answers. Full-text search across hundreds of episodes with click-to-jump-to-moment.
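
To show the idea behind click-to-jump search, here is a deliberately naive sketch: a case-insensitive substring scan over timestamped transcript segments that returns the episode and the moment of each hit. It is not how the platform's search is built, and the data shape (episodes with `title` and `segments`, segments with `start` and `text`) is assumed for the example.

```python
def search_catalogue(episodes, query):
    """Return (episode title, start seconds, text) for every segment containing the query."""
    needle = query.casefold()
    hits = []
    for episode in episodes:
        for segment in episode["segments"]:
            if needle in segment["text"].casefold():
                hits.append((episode["title"], segment["start"], segment["text"]))
    return hits


# Hypothetical two-episode archive.
episodes = [
    {"title": "Ep 12: Pricing", "segments": [
        {"start": 95, "text": "We raised prices twice last year."},
        {"start": 1410, "text": "Churn barely moved after the change."},
    ]},
    {"title": "Ep 31: Hiring", "segments": [
        {"start": 300, "text": "Churn on the team is a different problem."},
    ]},
]

for title, start, text in search_catalogue(episodes, "churn"):
    print(f"{title} @ {start}s: {text}")
```

Each hit carries the segment's start time, which is what makes jumping straight to the moment possible.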

04

Speaker labels with manual rename

Solo monologues, two-person interviews, three-host roundtables — all separated. Rename once per episode. Useful for guest credit and for transcripts that go on the website with attribution.

05

SRT for the YouTube version

Many shows now upload to YouTube as well as audio platforms. Same transcript produces SRT and VTT for video captions — re-upload to YouTube Studio to replace auto-captions.
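
For reference, an SRT file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line followed by the caption text and a blank line. The sketch below builds one from timestamped segments. The segment shape is an assumption, and real exports would also split long segments into caption-sized lines.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Render segments (start, end, text) as numbered SRT cues."""
    cues = []
    for index, segment in enumerate(segments, start=1):
        timing = f"{srt_timestamp(segment['start'])} --> {srt_timestamp(segment['end'])}"
        cues.append(f"{index}\n{timing}\n{segment['text']}\n")
    return "\n".join(cues)


# Hypothetical segments; times are in seconds.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Welcome back to the show."},
    {"start": 2.5, "end": 6.0, "text": "Today we are talking about pricing."},
]
print(to_srt(segments))
```

WebVTT is close to the same layout: the file starts with a `WEBVTT` header line and the timestamps use a period instead of a comma before the milliseconds.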

06

API for new-episode automation

Webhook into your post-production flow: upload master, get transcript + show notes back. Saves the manual click for shows that publish weekly. JWT auth, per-key rate limits.

Where the audio comes from: Three real paths into transcription

Three ways to feed us your podcast

Most podcasters either upload the post-production master (best quality) or paste a public link (fast, no waiting for the upload). Both work.

Show formats: Three common podcast formats and what each needs

Three podcast formats, three workflows

A solo podcast, an interview show, and a narrative-edited podcast have different post-production needs.

Quality: What to expect, honestly

Accuracy on real podcast masters

Studio-recorded podcasts are the easiest case for ASR — controlled environment, decent mics, low background noise. Field-recorded interviews and Zoom-recorded shows are harder. Honest expectations below.

97%+
On post-produced podcast masters with USB or shotgun mics, two speakers, music intros and outros. This is the typical interview show, solo show, or two-host roundtable on the platform.
What we deliver
97%+

Studio podcast masters.

Post-produced files with EQ, compression, and a music intro/outro. The conditions every successful weekly show creates by episode 10.

  • Solo podcast monologues
  • Two-host interview shows
  • Tech and business podcasts on USB mics
  • Editor-cut narrative podcasts
What's normal
90%+

Zoom-recorded podcasts.

Interview podcasts where the guest joins via Zoom or Squadcast. The host's local audio is studio-grade; the guest's is internet-quality. Diarization handles it cleanly; word accuracy on the guest side typically lands here.

  • Zoom-recorded interview shows
  • Squadcast / Riverside remote interviews
  • Phone-call style call-in segments
  • Multi-guest panels recorded over the network
What hurts podcast accuracy specifically

Heavy music beds under speech

Some narrative podcasts (the This American Life style) layer music quietly under the speech for emotional pacing. We handle this OK if the music is significantly quieter than the voice. If the mix is roughly equal, the transcript drifts.

Four-plus speakers in a roundtable

Diarization is excellent at two speakers, good at three, and may merge voices at four or more. For a panel show with five hosts, plan a manual speaker-correction pass.

Heavy regional accents

Tier-1 languages have high regional-accent coverage (Glaswegian, Texan, Australian, Brazilian Portuguese all work well). Heavy accents in tier-2 or tier-3 languages drop accuracy 5–10 points.

What we don't do

We don't pull from RSS feeds directly — paste an individual episode URL or upload the file. We don't transcribe DRM-protected content (Spotify-exclusive shows, Apple Podcasts subscriptions). For those, you'd need to record the playback and upload the audio.

Search across the catalogue (2026 producer survey)
3 sec

Time for a weekly podcast producer to find which episode a topic was mentioned in, down from three hours of re-listening. The single most-cited reason producers stay on the platform after a back-catalogue migration.

Reference: Common questions

Frequently asked questions

  1. What audio formats work for podcast masters?
    MP3, WAV, M4A, OGG, FLAC, OPUS, WEBM. Maximum 100 MB on Free, 500 MB on Pro, 2 GB on Business. Maximum duration: 30 min Free, 60 min Pro, 4 h Business. We accept whatever your post-production stack outputs without conversion.
  2. How do I get speaker labels?
    On Pro and Business plans, automatic. Two or more voices are separated and labelled "Speaker 1", "Speaker 2", etc. Manual rename: click the label, type the actual name, and the rest of the transcript updates. The Free plan returns the transcript without diarization.
  3. Can it generate show notes?
    Yes — AI summary on Pro and Business plans produces chapter markers with timestamps, key quotes, and decisions. Export as the description for your podcast hosting (Captivate, Buzzsprout, Transistor, Anchor — most accept timestamped chapter syntax).
  4. Does it work with my podcast hosting RSS feed?
    Not directly. We don't consume RSS feeds — paste an individual episode URL (the one your hosting uses to serve the MP3, e.g. yourhost.com/episode-42.mp3). Or upload the master file. Both produce the same result.
  5. Can I transcribe an Apple Podcasts or Spotify URL?
    Apple Podcasts: depends — the share URL points to a webpage, not to an MP3. If the show's MP3 is hosted publicly elsewhere (most are), copy that URL. Spotify: no, Spotify's audio is DRM-protected and only plays back inside the Spotify client. For Spotify-exclusive shows you'd need to record playback or upload the master if you have access.
  6. How accurate is it on a Zoom-recorded interview podcast?
    About 90% on the guest side (internet-quality audio), 97%+ on the host side (local mic). Speaker labels separate the two cleanly. Plan a single editorial pass for anything you'll publish without listening back.
  7. Can I transcribe the back catalogue in bulk?
    Yes. The REST API takes a list of URLs or files; transcripts come back via webhook. Many podcast networks have used this to back-fill 200+ episode archives in a weekend. Per-key rate limits apply; a 100-episode bulk job typically completes in 4–6 hours.
  8. What's the per-minute cost?
    On Pro ($19 / 600 min) it's $0.032 per minute — beats freelance transcribers (typically $1–3/min) and matches the cheapest API services. Business ($49 / 2,500 min) is $0.020 per minute. Free is $0.
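
The per-minute figures can be checked directly from the plan prices. The sketch below uses only the prices and minute allowances quoted above; the function names are made up. It also shows that the $1.92 episode cost comes from the rounded $0.032 rate, while the unrounded rate gives $1.90 for the same 60 minutes.

```python
# (monthly price in USD, included minutes) as quoted in the text.
PLANS = {"Pro": (19, 600), "Business": (49, 2500)}


def per_minute_rate(plan: str) -> float:
    """Exact per-minute rate: monthly price divided by included minutes."""
    price, minutes = PLANS[plan]
    return price / minutes


def quoted_rate(plan: str) -> float:
    """Per-minute rate rounded to three decimals, as the page quotes it."""
    return round(per_minute_rate(plan), 3)


def episode_cost(plan: str, minutes: float, use_quoted_rate: bool = True) -> float:
    """Cost of one episode at either the quoted (rounded) or the exact rate."""
    rate = quoted_rate(plan) if use_quoted_rate else per_minute_rate(plan)
    return round(rate * minutes, 2)


print(quoted_rate("Pro"), quoted_rate("Business"))      # 0.032 0.02
print(episode_cost("Pro", 60))                          # 1.92
print(episode_cost("Pro", 60, use_quoted_rate=False))   # 1.9
```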
Action: Start trial

Try it on one episode.

60 free minutes per month. Drop your latest master or paste a SoundCloud URL — first transcript and AI show notes in about 10 minutes.
