Drop the MP3 from a recorded Twitter Space — or a video, or a DM voice note. Get speaker labels, timestamps, and an SRT in 99 languages. No X Premium needed.
MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously
YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more
↓ Watch what comes out
X exports a Space recording as a single mixed MP3, with every speaker on one channel. We separate voices with acoustic diarization tuned for 6 to 12 rotating mic holders, the usual Spaces shape.
Welcome back everyone — we've got about 600 listeners now. Jess, you wanted to jump in on the Solana point?
Yeah, so the throughput numbers from last week are misleading without context on the validator set.
Can I push back on that? Because the mainnet beta data tells a different story.
Go ahead, Mike — keep it tight, we've got two more speakers in the queue.
↓ This is the dashboard
Same layout as the real dashboard — Summary, full Transcript, Speakers tab, Exports. Key points and action items extracted automatically. Auto-tags on every job.
Sample preview from a founder interview about post-call workflow. Real transcripts look exactly like this — same tabs, same summary block, same key-points / action-items split, same auto-tag chips.
Three real options · honest comparison
X added live closed captions to Spaces in 2023, but there's no transcript export. Otter requires you to mirror audio into a meeting. We take the MP3 you already downloaded from X and return a file.
X live captions
Real-time captions inside the Spaces UI. Nothing to download, nothing to search.
Drop the Space MP3 or paste the Space URL. Speaker labels, SRT, summary — every plan.
Calendar bots designed for Zoom. To capture a Space you have to route audio into a fake meeting.
Pricing and feature flags accurate as of May 2026. X Spaces caption rollout still varies by region and account type.
Specific to X / Twitter
Spaces have a shape: mono mix, rotating mic, crypto and tech jargon, lots of @handles. Tune for that.
Drop a Space MP3 and these flip on by default. Override per-job from the form.
Accuracy · real-world numbers
X exports every Space as a single mixed mono MP3, so the accuracy ceiling depends on how each speaker connected. A wired mic in a quiet room is the best case. Bluetooth earbuds in a car are the worst. Numbers below come from actual Spaces files in production.
Small Space, hosts on USB or XLR mics. Diarization separates voices cleanly even in a mono mix.
Typical Space. Some on iPhone, some on laptop. Diarization holds; expect a 2-minute cleanup pass on speaker chips.
Big Space with the mic passed around. The acoustic model can merge similar voices when speakers swap quickly.
AirPods in a coffee shop, AAC compression, wind. Text usable; numbers, names, and acronyms degrade first.
Common questions
30 free minutes every month. No card. Speaker labels, 99 languages, SRT and DOCX included.
Start free