Free VTT to TXT Converter

Strip the `WEBVTT` header, timecodes, and cue settings from a WebVTT file. Get a clean plain-text transcript — runs entirely in your browser.

How it works

Conversion happens entirely in your browser using JavaScript. Your file never leaves your device — no upload to our servers, no signup, no rate limits.

Pick a .vtt file

Drag and drop, or click to choose. The file is parsed in your browser — no upload, no API, no analytics on the content itself.

Strip everything but the words

We drop the `WEBVTT` signature, `NOTE`/`STYLE`/`REGION` blocks, cue identifiers, timecodes, cue settings, and inline tags like `<v Speaker>` and `<i>`. The lines of a multi-line cue are joined into a single paragraph.
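As a rough illustration of the stripping step, here is a minimal sketch in JavaScript. It is not the tool's actual source: the function name `vttToText` is hypothetical, joining the lines of a cue with a single space is an assumption, and character references such as `&amp;` are left undecoded.

```javascript
// Hypothetical helper: a minimal sketch of the stripping step described above.
function vttToText(vtt) {
  // Drop a leading byte-order mark and normalize line endings to "\n".
  const normalized = vtt.replace(/^\uFEFF/, "").replace(/\r\n?/g, "\n");

  // Blocks (header, NOTE, STYLE, REGION, cues) are separated by blank lines.
  const blocks = normalized.split(/\n\s*\n/);
  const paragraphs = [];

  for (const block of blocks) {
    const lines = block.split("\n");
    const first = lines[0].trim();

    // Skip the signature block and NOTE / STYLE / REGION blocks.
    if (/^(WEBVTT|NOTE|STYLE|REGION)(\s|$)/.test(first)) continue;

    // A cue has a timing line containing "-->"; a line before it is the cue identifier.
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue;

    const text = lines
      .slice(timingIndex + 1)
      .map((line) => line.replace(/<[^>]*>/g, "").trim()) // remove inline tags
      .filter((line) => line.length > 0)
      .join(" "); // assumption: cue lines are joined with one space

    if (text) paragraphs.push(text);
  }

  // One paragraph per cue, with a blank line between cues.
  return paragraphs.join("\n\n");
}

const exampleVtt = [
  "WEBVTT",
  "",
  "1",
  "00:00:01.200 --> 00:00:03.500",
  "First caption line",
  "Second caption line",
  "",
  "2",
  "00:00:04.000 --> 00:00:06.000",
  "Another caption",
].join("\n");

console.log(vttToText(exampleVtt));
// First caption line Second caption line
//
// Another caption
```

Because the timing line is located by searching for `-->`, the sketch works whether or not a cue carries an identifier line, and any cue settings after the end timestamp are discarded along with it.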

Download or copy

Save a .txt file or copy to clipboard. Paste into Notion, Docs, your AI tool of choice, or anywhere else that prefers prose over caption format.
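For the curious, saving and copying can both be done with standard browser APIs and no server. The sketch below is one plausible way to do it, not a copy of this page's code; the function names and the default filename `transcript.txt` are hypothetical.

```javascript
// Build the .txt payload as a UTF-8 plain-text Blob.
function makeTextBlob(text) {
  return new Blob([text], { type: "text/plain;charset=utf-8" });
}

// Browser-only: trigger a download through a temporary object URL.
function downloadText(text, filename = "transcript.txt") {
  const url = URL.createObjectURL(makeTextBlob(text));
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  URL.revokeObjectURL(url); // release the object URL once the click has fired
}

// Browser-only: copy to the clipboard. Needs a secure context (HTTPS or localhost)
// and is normally called from a click handler. Returns a Promise.
function copyText(text) {
  return navigator.clipboard.writeText(text);
}
```

Both paths keep the transcript on your device: the object URL points at memory in the current tab, and the clipboard write never touches the network.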

WebVTT (.vtt) vs SRT (SubRip)

The two formats look almost identical — the differences are small but they matter for which players will accept your file.

WebVTT (.vtt)

WEBVTT

1
00:00:01.200 --> 00:00:03.500
First caption line
Second caption line

2
00:00:04.000 --> 00:00:06.000
Another caption
  • Header required: the file must start with `WEBVTT`.
  • Decimal separator: period (`.`) — `00:00:01.200`.
  • Extras: supports cue settings (position, alignment), `<v Speaker>` tags, CSS `STYLE` blocks, and `NOTE` comments.
  • Compatibility: every modern browser via the `<track>` element, the standard way to attach captions to HTML5 video.

SRT (SubRip)

1
00:00:01,200 --> 00:00:03,500
First caption line
Second caption line

2
00:00:04,000 --> 00:00:06,000
Another caption
  • Decimal separator: comma (`,`) — `00:00:01,200`.
  • Cue numbering: required, sequential, starts at 1.
  • Styling: none — plain text only. No positioning, no colors, no speaker tags.
  • Compatibility: VLC, MX Player, YouTube, Premiere, Final Cut, DaVinci Resolve, Plex.
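
To make the timestamp difference concrete, here is a small sketch that rewrites one WebVTT timing line in SRT form. The helper name `vttTimingToSrt` is hypothetical, and it covers only the timing line: a full VTT to SRT conversion would also need to drop the `WEBVTT` header and number the cues.

```javascript
// Hypothetical helper: rewrite a WebVTT timing line as an SRT timing line.
// Returns null when the input is not a timing line.
function vttTimingToSrt(line) {
  const match = line.match(
    /^((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})/
  );
  if (!match) return null;

  // WebVTT allows the hours field to be omitted; SRT uses hh:mm:ss,mmm.
  const withHours = (t) => (t.split(":").length === 2 ? "00:" + t : t);

  // Swap the period for a comma. Cue settings after the end time are dropped,
  // since SRT has no equivalent.
  return `${withHours(match[1])},${match[2]} --> ${withHours(match[3])},${match[4]}`;
}

console.log(vttTimingToSrt("00:00:01.200 --> 00:00:03.500"));
// 00:00:01,200 --> 00:00:03,500
```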

Common questions

8 questions people ask about this.

01. What format is the output text?
Plain text, UTF-8, one paragraph per cue with a blank line between cues. No timestamps, no headers, no XML/HTML-like tags. Just readable prose in the original cue order.
02. Do `<v Speaker>` voice tags get converted to readable labels?
No — we strip them entirely. WebVTT voice tags carry the speaker name as the tag value (e.g. `<v Alice>Hi</v>`), but converting those to inline labels would be opinionated (`Alice: Hi`? `[Alice] Hi`? newline-separated?). We keep the output clean and let you add labels back if needed. For automatic, opinionated speaker labels in the transcript, our [transcription service](/) handles diarization end-to-end.
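If you do want labels, one option is to rewrite the voice tags in your .vtt file before converting it. The sketch below picks the `Name: text` style purely as an example; the function name `labelVoiceTags` is hypothetical and this is not something the converter does for you.

```javascript
// Hypothetical pre-pass: turn <v Name>text</v> into "Name: text"
// so the speaker survives when the remaining tags are stripped.
function labelVoiceTags(cueText) {
  return cueText
    // Opening tag, optionally with classes, e.g. <v Alice> or <v.loud Bob>.
    .replace(/<v(?:\.[^\s>]+)?\s+([^>]+)>/g, (_, name) => `${name.trim()}: `)
    // Closing tag, which WebVTT allows to be omitted.
    .replace(/<\/v>/g, "");
}

console.log(labelVoiceTags("<v Alice>Hi</v>"));
// Alice: Hi
```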
03. What about styling and cue settings?
All discarded. `<i>`, `<b>`, `<c.colour>`, cue position settings, `STYLE` CSS blocks, and `REGION` blocks are not meaningful in plain text. The cue text is preserved verbatim once tags are removed.
04. Does this work with YouTube auto-caption .vtt files?
Yes — though YouTube auto-captions can have heavy repetition from streaming overlap (the same line appearing across multiple cues). The output will reflect that. If you need a cleaner transcript, run the audio through our [transcription service](/) instead — it produces a single non-duplicated transcript.
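For light cleanup of the converted text, a simple post-pass can remove lines that exactly repeat the one before them. This is a sketch under that narrow assumption: the function name `dropConsecutiveDuplicates` is hypothetical, and partial overlaps (where a cue repeats only part of the previous line) are left untouched.

```javascript
// Hypothetical post-pass: drop lines that exactly repeat the previously kept line.
function dropConsecutiveDuplicates(lines) {
  const kept = [];
  for (const line of lines) {
    const text = line.trim();
    if (text === "") continue; // ignore empty lines
    if (kept.length > 0 && kept[kept.length - 1] === text) continue; // exact repeat
    kept.push(text);
  }
  return kept;
}

console.log(dropConsecutiveDuplicates(["hello world", "hello world", "how are you"]));
// [ 'hello world', 'how are you' ]
```

Only consecutive repeats are removed, so a phrase that legitimately comes up again later in the recording is kept.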
05. Can the timing be added back later?
Not from the text alone. Once timestamps are stripped, re-aligning means matching the text against the original audio with a forced-alignment tool. For most uses, such as reading, summarisation, and search, that's not needed.
06. Does the file leave my browser?
No. JavaScript parses the file locally; there is no upload endpoint. You can verify this in your browser's network tab — no request goes out during conversion.
07. Is there a max length?
In practice, no. The only limit is your device's memory. A 4-hour lecture VTT (~3,000 cues) converts in a few hundred milliseconds.
08. Why use this instead of just opening the VTT in a text editor?
A text editor shows the raw file — header, cue numbers, timestamps, tags. This tool gives you just the spoken words, formatted as paragraphs ready to paste into a document, AI chat, or search index.

Transcribing from scratch?

Upload audio or video and get a clean transcript directly — no captions step needed. Speaker labels, 100+ languages, free 30 min/month.

Try free transcription