Transcription for journalists. Source interviews, speaker labels, citation timestamps.

Drop a source interview — phone, lavalier, field recorder, or press conference. Get a speaker-labeled transcript with citation timestamps, then a DOCX your fact-checker can mark up.

Drop a file, or pick one

MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously

Paste a link, we’ll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more

Record straight from your browser

Sign-up takes 30 seconds. Recording opens right after, in the dashboard.

No card required
~90s per 60-min file
SRT · VTT · DOCX · TXT
Files auto-deleted in 24h

↓ Watch what comes out

Interview audio in. Quotable transcript out.

Every speaker turn gets a timestamp so you can jump back to the audio when your editor or fact-checker challenges a quote. Filler words stay in by default — quote integrity matters.

Source interview · WAV · REC · 2 speakers · 38:12
auto-detected en-US · 44.1 kHz · lavalier + room
~90s
Transcript · streaming · 94% accuracy
S1

Walk me through when you first noticed the cost overrun on the housing project.

S2

It was the March 14 finance committee. The number jumped from 22 to 31 million with no memo.

S1

And nobody on the council asked where the extra nine million went?

S2

One person did. It's on the recording. After that, the item moved to closed session.

94% on lavalier
DOCX · SRT · TXT · JSON

↓ This is the dashboard

This is what loads when the job finishes.

Same layout as the real dashboard — Summary, full Transcript, Speakers tab, Exports. Key points and action items extracted automatically. Auto-tags on every job.

Try it on your own file — it's free

Three real options · honest comparison

Rev human. Trint. Or us.

Rev's human-typed tier is the legacy newsroom default — accurate, expensive, slow. Trint built a newsroom-tuned AI editor. We do the AI transcript, kill the file in 24 hours, and stay out of the way.

Option 01

Rev (human-typed)

A person types your audio. Highest accuracy on hard audio — pay for it in dollars and hours.

Turnaround: 12–24h typical
Accuracy: 99%+ (human)
Cost per min: $1.50 (human)
Source privacy: Contractor listens
File deletion: Retained per policy
Export: DOCX · TXT · SRT
Best for: Court-quality records or court-bound reporting where a human ear on every word is worth the dollar-a-minute premium.

Option 02

Transcription.Solutions

AI transcript in minutes. Source audio deleted in 24 hours. Custom vocabulary for source names and place names.

Turnaround: ~1× realtime
Accuracy: 94% on lavalier
Cost per min: $0.03
Source privacy: No human listens
File deletion: 24h hard delete
Export: DOCX · SRT · TXT · JSON
Best for: Reporters working a beat: daily interviews, embargoed sources, field tape from court steps or city hall.

Option 03

Trint

Newsroom-tuned AI editor with collaborative workflows. Strong product, subscription pricing.

Turnaround: ~1× realtime
Accuracy: Comparable AI tier
Cost: $80+/user/mo seat
Source privacy: Cloud-retained
File deletion: Stored until you purge
Export: DOCX · SRT · EDL
Best for: Newsroom teams that want the editor, the collaboration features, and a paid seat per reporter.

Pricing approximate as of 2026. Rev's automated tier is separate from the human-typed tier compared here.
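To make the per-minute gap concrete, here is a rough sketch that applies the approximate prices listed above to a single recording. The function name and the 38-minute example length are illustrative assumptions, and how either service bills partial minutes is not stated here, so the sketch uses whole minutes. Trint is left out because its seat pricing does not reduce to a per-minute rate.

```python
from decimal import Decimal

# Approximate per-minute prices quoted in the comparison above (USD).
PER_MINUTE_USD = {
    "rev_human": Decimal("1.50"),
    "transcription_solutions": Decimal("0.03"),
}


def job_cost(minutes: int, option: str) -> Decimal:
    """Estimated cost in USD for one recording of the given length.

    Hypothetical helper: whole minutes only, no minimum charge assumed.
    """
    return PER_MINUTE_USD[option] * minutes


if __name__ == "__main__":
    for option in PER_MINUTE_USD:
        print(f"{option}: ${job_cost(38, option):.2f}")
```

At these rates a 38-minute interview comes to about $57.00 on the human tier and about $1.14 here.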

Specific to journalism

Three things that bite reporters on generic transcription tools.

Before you upload the file, flip the right settings — the transcript comes back closer to publishable.

What goes wrong

  1. Proper nouns get phonetically guessed. Source names, agency acronyms (HUD, DOT, FOIA), and bill numbers (SB-1421, HB-340) come back as plausible-sounding wrong words.
  2. Filler words stripped by default. Generic tools delete "um, well, I think", which is fine for meetings but wrecks quote integrity when an editor compares text to audio.
  3. Press-conference crosstalk collapses into one speaker. Six reporters become "Speaker 2" and you lose attribution on who asked what.

What to flip here

  1. Paste source names, agency abbreviations, and bill IDs into Custom vocabulary on the job form. We pass them as hints to the recognizer.
  2. Set Filler words: keep so "um" and "like" stay in. Strip them in your DOCX after you've matched the quote to the audio.
  3. For pressers, raise Max speakers to 8–10 and turn on per-turn timestamps. Easier to clean up labels manually than to recover lost attribution.

Recommended job settings for source interviews

Drop an interview and these flip on by default. Override per-job from the form.

Diarization: On · 2–10 speakers
Timestamps: Every speaker turn
Filler words: Kept (quote integrity)
Custom vocabulary: Source + place names
Language: Auto-detect · 99 supported
Export: DOCX with timestamps · SRT
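If you keep per-beat presets in your own notes or scripts, the defaults above could be written down roughly as follows. This is only a sketch: the page describes a job form, not an API, so every field name and value string below is hypothetical, and the 2–10 speaker bound is simply copied from the list above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class JobSettings:
    """Hypothetical mirror of the recommended source-interview defaults."""

    diarization: bool = True
    min_speakers: int = 2
    max_speakers: int = 10
    timestamps: str = "speaker_turn"          # one timestamp per speaker turn
    keep_filler_words: bool = True            # quote integrity
    custom_vocabulary: tuple[str, ...] = ()   # source names, place names, bill IDs
    language: str = "auto"                    # auto-detect
    exports: tuple[str, ...] = ("docx", "srt")

    def __post_init__(self) -> None:
        # Keep the speaker range inside the 2-10 window listed above.
        if not 2 <= self.min_speakers <= self.max_speakers <= 10:
            raise ValueError("speaker range must stay within 2-10")


# Per-job override: same defaults, plus vocabulary hints for this interview.
interview = replace(JobSettings(), custom_vocabulary=("HUD", "FOIA", "SB-1421"))
```

Keeping the preset immutable and overriding per job matches the "defaults on, override from the form" behaviour described above.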

Accuracy · real-world numbers

94% on a lavalier. Down to mid-80s on a noisy podium.

Reporters record in conditions transcribers don't always plan for. The microphone and the room set the ceiling — not the model. Numbers below are from actual journalist files in production, not synthetic benchmarks.

96%+
Studio or quiet 1-on-1, USB mic

Podcast-grade setup, one source across the table on a cardioid or lav. Proper nouns are the only failure mode.

94%
Phone or Zoom source interview

Cooperative source on broadband. Some loss on numbers, bill IDs, and unfamiliar last names. Custom vocabulary closes most of it.

89%
Field recorder, café or sidewalk

Tascam or phone in a coat pocket. HVAC, traffic, dishes. Words are usable for quote pulls — expect a verify pass on the audio.

84%
Press conference, room mic, crosstalk

Six reporters shouting questions, podium PA echo, no individual mics. Diarization will fuse some questioners. Worst case in our data.
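To translate these percentages into review effort, the sketch below estimates how many words to expect wrong in one transcript. It rests on two assumptions that are not stated above: that the accuracy figure is word-level accuracy, and that conversational speech runs at roughly 150 words per minute. Both names and the default rate are illustrative.

```python
def expected_word_errors(minutes: int, accuracy_pct: int, words_per_minute: int = 150) -> int:
    """Rough count of wrong words in a transcript.

    Assumes accuracy_pct is word-level accuracy (an assumption) and a
    speaking rate of about 150 words per minute (also an assumption).
    """
    if not 0 <= accuracy_pct <= 100:
        raise ValueError("accuracy_pct must be between 0 and 100")
    total_words = minutes * words_per_minute
    return round(total_words * (100 - accuracy_pct) / 100)


if __name__ == "__main__":
    # The four recording conditions listed above, for a 38-minute recording.
    for label, pct in [("studio", 96), ("phone/Zoom", 94), ("field", 89), ("presser", 84)]:
        print(f"{label}: ~{expected_word_errors(38, pct)} words to verify")
```

Under those assumptions, a 38-minute recording carries a few hundred words to check at 94% and roughly three times that at 84%, which is why the verify pass against the audio matters most on field and press-conference tape.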

Common questions

8 things reporters ask about transcription for journalists.

01 How long do you keep the source audio?
Source files are hard-deleted within 24 hours of the job completing. The transcript stays in your account; the audio doesn't. If you're meeting with a sensitive source, that's the answer to give them.
02 Does a human listen to my interview?
No. The pipeline is AI end-to-end — no contractor review, no human QA step on your file. If you need a human-typed transcript (court-quality), Rev's human tier is the right tool, not us.
03 Can I transcribe a phone recording from TapeACall or iPhone Voice Memos?
Yes. Drop the M4A or MP3 directly. Expect ~93–94% on broadband VoIP. If the source called from a landline or a weak cell connection, expect closer to 88%. That is still usable for quote pulls; just verify against the audio.
04 How do I mark a section as off-the-record?
We don't have an off-the-record toggle in the file itself. The workflow most reporters use: trim the off-the-record minutes out of the audio before upload, or upload the full file and delete those paragraphs from the DOCX. The 24-hour audio deletion limits exposure either way.
05 Will the timestamps line up well enough to fact-check against the recording?
Yes. We emit a timestamp at every speaker turn and, on the SRT, every 2–7 seconds. A fact-checker can click into the audio at any quote in under 10 seconds. The DOCX export keeps the timestamps inline by default.
06 My source is Spanish-speaking. Can you transcribe in Spanish and translate?
We transcribe in the source language across 99 languages, auto-detected. Translation to English is a separate optional step on the job form — it produces a second file so you keep the original Spanish transcript for accuracy.
07 How well does diarization handle a press conference with six reporters?
Acoustic diarization on a room mic will conflate similar-sounding voices — expect 2–3 reporters to merge into one speaker label. Raise max speakers to 8–10 on the form, then rename the speaker chips manually. Faster than retyping.
08 Is the transcript admissible as a legal record?
No AI transcript is a certified record. For court-bound reporting, the audio is the record and the transcript is your working document. If you need a certified transcript for a subpoena or libel defense, use a court reporter — not us, not Trint, not Rev's AI tier.
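On the timestamp question (05) above: if a fact-checker wants to jump an audio player to a quote from the SRT export, the cue timing has to be turned into seconds first. The sketch below parses the standard SRT timing format (`HH:MM:SS,mmm --> HH:MM:SS,mmm`). The function names are illustrative, and it assumes the export follows that standard layout.

```python
import re

# Standard SRT timestamp: hours, minutes, seconds, comma, milliseconds.
_SRT_STAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def srt_time_to_seconds(stamp: str) -> float:
    """Convert one SRT timestamp such as '00:14:03,250' to seconds."""
    match = _SRT_STAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def cue_start_seconds(timing_line: str) -> float:
    """Return the start time, in seconds, of an SRT cue timing line."""
    start, _, _end = timing_line.partition("-->")
    return srt_time_to_seconds(start)


if __name__ == "__main__":
    print(cue_start_seconds("00:14:03,250 --> 00:14:07,900"))
```

The returned value can be passed to whatever seek control the audio player offers.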

Drop a source interview. See it come back quotable.

30 free minutes every month. No card. Speaker labels, citation timestamps, 24-hour audio deletion on every plan.

Start free