Free tool · no card required

Free SRT Generator: subtitles from audio in 100+ languages

Upload any audio or video. Get accurate .srt subtitles in 100+ languages, with speaker labels on paid plans. Free up to 5 minutes anonymously, or 30 minutes per month with a free account.

Drop audio or video — get an SRT

MP3 · WAV · M4A · OGG · OPUS · FLAC · MP4 · MOV · MKV · WEBM

Anonymous: up to 5 min / 100 MB. Sign up free for 30 min / month + bigger files.

  • 100+ languages, auto-detected
  • Speaker labels on paid plans
  • Files auto-deleted in 24 h

What you get

A standards-compliant .srt — ready to drop in.

Indexed cues. Hours:minutes:seconds,milliseconds timestamps. UTF-8 text. No proprietary wrappers. Open it in Notepad if you want — it’s just text.

Input AUDIO

product-meeting.mp3
─────────────────────────
duration   12:34
size       11.4 MB
language   auto-detect → en-US
sample     48 kHz · 16-bit · mono
codec      MP3 / 192 kbps
status     uploaded → queued

Output SRT

1
00:00:00,420 --> 00:00:03,180
Sarah: Alright, let’s lock the Q3 launch date.

2
00:00:03,310 --> 00:00:06,840
Marcus: September 14 works — engineering signed off Friday.

3
00:00:07,020 --> 00:00:09,560
Sarah: Great. I’ll get marketing on the launch brief today.

4
00:00:09,710 --> 00:00:11,430
Marcus: One concern — pricing page copy is still in review.
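
For readers who want to see how little is inside the file, here is a minimal Python sketch that writes cues in the layout described above. The function names and the `(start_ms, end_ms, text)` tuple shape are assumptions made for illustration, not part of the product. In an .srt file on disk, the separator between the two timestamps is the ASCII arrow `-->`.

```python
def format_timestamp(ms: int) -> str:
    """Render milliseconds as the SRT timestamp HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def build_srt(cues) -> str:
    """Build SRT text from (start_ms, end_ms, text) tuples (hypothetical shape)."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Cues are separated by one blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        (420, 3180, "Sarah: Alright, let's lock the Q3 launch date."),
        (3310, 6840, "Marcus: September 14 works."),
    ]
    # Write as UTF-8 so non-ASCII text survives.
    with open("demo.srt", "w", encoding="utf-8") as handle:
        handle.write(build_srt(demo))
```

Running the script writes a two-cue `demo.srt` that can be opened in any text editor.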

How it works

From an audio file to timed subtitles in 3 steps.

No setup. No installs. No format conversion. Drop a file, get an SRT back — same workflow whether you’re subtitling a podcast clip, a Zoom call, or a one-hour lecture.

  1. Upload your file

    Drag and drop, or click to choose a file. MP3, WAV, M4A, MP4, MOV and 6 more formats accepted. Up to 5 minutes anonymously, 30 minutes per month on the free plan, 600 minutes per month on Pro.

  2. AI transcribes + timestamps

    Auto-detects language from the first seconds, transcribes every word, and aligns each line to a millisecond-precise timestamp. A 60-minute file finishes in roughly 90 seconds.

  3. Download your SRT

    Get a standards-compliant .srt ready for YouTube, Premiere, DaVinci Resolve, VLC — anything that reads subtitles. VTT, DOCX, plain text, and PDF exports come with it.

What’s in the box

Subtitles that ship — not just a transcript dumped into a text file.

Each export is checked against the real format spec — players accept the file, no fix-up step needed.
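
To give a feel for what such a check can look like, here is a rough Python sketch of a structural SRT validator. It is an illustration under our own assumptions (sequential indices, a strict `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, start before end), not the check the service actually runs; `validate_srt` is a hypothetical name.

```python
import re

# Strict timing line: HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING_RE = re.compile(
    r"^(\d{2}):([0-5]\d):([0-5]\d),(\d{3}) --> (\d{2}):([0-5]\d):([0-5]\d),(\d{3})$"
)


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def validate_srt(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the file passed."""
    problems = []
    # Cues are separated by blank lines.
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b]
    for expected, block in enumerate(blocks, start=1):
        lines = block.splitlines()
        if len(lines) < 3:
            problems.append(f"cue {expected}: needs index, timing, and text lines")
            continue
        if lines[0].strip() != str(expected):
            problems.append(f"cue {expected}: index is {lines[0].strip()!r}")
        match = TIMING_RE.match(lines[1].strip())
        if not match:
            problems.append(f"cue {expected}: malformed timing line")
            continue
        groups = match.groups()
        if _to_ms(*groups[:4]) >= _to_ms(*groups[4:]):
            problems.append(f"cue {expected}: start is not before end")
    return problems
```

A file with a reversed timing line, for example, comes back with one "start is not before end" entry instead of an empty list.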

100+ languages

From Mandarin to Maltese. Auto-detected — you don't pick a language up front.

Frame-accurate timestamps

Millisecond-precision cues. Drop the SRT straight into Premiere or DaVinci Resolve — every line is already aligned.
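
If you are curious how a millisecond cue lands on a video frame, the arithmetic is simple: frame index = floor(milliseconds × fps / 1000). The Python sketch below shows that conversion; `ms_to_frame` is a hypothetical helper written for illustration, and editors may round differently than this floor-based version.

```python
from fractions import Fraction


def ms_to_frame(ms: int, fps: Fraction) -> int:
    """Index of the frame on screen at `ms` milliseconds (floor, ms >= 0)."""
    # Integer math avoids float drift with rates such as 24000/1001.
    return (ms * fps.numerator) // (1000 * fps.denominator)


if __name__ == "__main__":
    # At 25 fps a frame lasts 40 ms, so a cue starting at 3310 ms falls in frame 82.
    print(ms_to_frame(3310, Fraction(25)))
    # NTSC-style 23.976 fps is exactly 24000/1001.
    print(ms_to_frame(1000, Fraction(24000, 1001)))
```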

Speaker labels (on Pro)

Native diarization tags each line with the speaker. Rename them in the editor and exports update everywhere.

Every export format

SRT, VTT, plain TXT, DOCX with speakers, JSON with word timings, and a branded PDF — all from one transcription.

Fast — ~90s per hour

That ratio is measured in production. A one-hour podcast is done before you finish making coffee. No batch queues on paid plans.

Private by default

Files auto-deleted in 24h. No training on your content. TLS in transit, encryption at rest, EU/US infrastructure.

Common questions

10 questions people ask about SRT generation.

01. What is an SRT file?
SRT — SubRip Subtitle — is the most widely supported subtitle format on the web. Each cue carries an index number, a start/end timestamp (HH:MM:SS,mmm), and one or two lines of text. YouTube, Vimeo, Netflix, VLC, Premiere Pro, DaVinci Resolve, OBS, and every modern video player accept it.
02. How accurate are the subtitles?
On clean audio in widely spoken languages we hit 95%+ word accuracy. Heavy accents, overlapping speakers, background music, and obscure technical jargon push accuracy down — same as every AI subtitle tool. We never claim 100%: if perfect captions matter (broadcast TV, legal record), budget a human review pass on top.
03. Which languages are supported?
100+ languages with auto-detection — including English, Spanish, French, German, Portuguese, Italian, Japanese, Korean, Mandarin, Hindi, Arabic, Turkish, Russian, Polish, Dutch, Swedish, Hebrew, Thai, Vietnamese, and many more. See the full list at /languages. You don't need to pick a language up front — the model identifies it from the first few seconds.
04. What's the difference between SRT and VTT?
Both encode the same thing (timestamped text cues), but VTT (WebVTT) is the HTML5 standard and supports inline styling, positioning, and cue settings. SRT is older, simpler, and accepted by virtually every tool ever made. Use SRT for maximum compatibility (uploading to YouTube, importing into Premiere); use VTT when you need styling or you're hand-coding HTML5 <video> elements. We export both — pick whichever your downstream tool wants.
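
If you ever need to convert by hand, the plain-text difference is small: a WebVTT file starts with a `WEBVTT` header line and uses a dot instead of a comma before the milliseconds. The Python sketch below shows that minimal conversion. `srt_to_vtt` is a hypothetical helper for illustration, not our exporter; it adds no styling or positioning, and it keeps the numeric indices, which WebVTT treats as optional cue identifiers.

```python
import re

# Matches the HH:MM:SS,mmm timestamps used by SRT.
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to minimal WebVTT text."""
    lines = []
    for line in srt_text.strip().splitlines():
        if "-->" in line:
            # Only timing lines are rewritten; commas in dialogue stay as they are.
            line = _SRT_TIME.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"
```

Passing in a one-cue SRT string returns the same cue under a `WEBVTT` header with `00:00:00.420 --> 00:00:03.180` as its timing line.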
05. Is there a length limit?
Anonymous uploads (no account): up to 5 minutes and 100 MB. Free account: 30 minutes per month, files up to 2 GB. Pro ($19/mo): 600 minutes per month, files up to 5 GB. There is no per-file duration cap on paid plans beyond the 10-hour Whisper window per chunk (we split automatically).
06. Can I edit speaker labels after transcription?
Yes. After the job completes, open the transcript in the dashboard editor — every speaker block is editable. Rename `Speaker 1` to `Sarah`, merge mis-split speakers, fix individual words, and re-export the SRT (or VTT, TXT, DOCX, PDF). Edits propagate to all export formats. Diarization is included on Pro and above — Free plans get the transcript without per-speaker labels.
07. Can I use the generated subtitles commercially?
Yes. You own everything we produce from files you uploaded. There are no royalty claims, no per-export fees, no usage tier on the subtitle output. Standard caveats apply: you need rights to the source audio/video (don't subtitle other people's copyrighted broadcasts without permission), and you're responsible for what the subtitles say.
08. Do I need an account?
For files up to 5 minutes: no. Drop a short file, get the SRT back. For anything longer (or to keep a history, re-download, or share): yes. Free signup takes 30 seconds and bumps you to 30 minutes per month at no cost. Speaker labels require Pro.
09. What about privacy?
Uploaded files are deleted from our storage within 24 hours, and we never use customer content to train models. Audio is processed in EU/US infrastructure with TLS in transit and encryption at rest. Full details at /privacy. If you need a custom DPA or longer retention windows, talk to us about Business or Enterprise.
10. What audio formats do you accept?
Audio: MP3, WAV, M4A, OGG, OPUS, FLAC, WEBM. Video: MP4, MOV, MKV, WEBM. We extract the audio track from video files automatically — no need to pre-convert.

Need more than 30 minutes a month? Pro is $19/mo.

600 minutes a month, files up to 5 GB, native speaker diarization, AI summaries and action items, meeting bot for Zoom / Google Meet / Microsoft Teams.

See Pro plan

Already a customer? Open the dashboard