Transcription for academic researchers.
IRB-aware, CAQDAS-ready, 100+ languages.

Drop a research interview or focus group recording. Get speaker-labelled, timestamped text ready for NVivo, Atlas.ti, or MaxQDA — with audio deleted within 24 hours.

Drop a file, or pick one

MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously

Paste a link, we’ll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more

Record straight from your browser

Sign-up takes 30 seconds; recording opens right after, in the dashboard.

No card required · ~90s per 60-min file · SRT · VTT · DOCX · TXT · Files auto-deleted in 24h

↓ Watch what comes out

Field recording in. Coding-ready transcript out.

We mark each participant turn with a timestamp at the start, keep filler words if you ask for verbatim, and export DOCX with speaker styles your CAQDAS tool already recognises.

Semi-structured interview · .wav · REC · 2 speakers · 1:08:24
Auto-detected en-GB · 44.1 kHz mono · lavalier mic
~90s
Transcript · streaming · 94% accuracy · verbatim mode
S1

Can you walk me through the first time you noticed the change in the neighbourhood?

S2

Um, it was probably 2019 — the bakery on the corner shut, and, yeah, that's when it hit me.

S1

And what did that feel like, watching that happen over those months?

S2

Honestly? Like the place I'd known for thirty years was vanishing, piece by piece.

94% on lavalier interview · DOCX (CAQDAS) · TXT · SRT · JSON

↓ This is the dashboard

This is what loads when the job finishes.

Same layout as the real dashboard — Summary, full Transcript, Speakers tab, Exports. Key points and action items extracted automatically. Auto-tags on every job.

Try it on your own file — it's free

Three real options · honest comparison

Rev human. NVivo Transcription. Or us.

Rev's human service is the historical default for dissertation-grade quotes. NVivo bundles AI transcription inside the CAQDAS tool itself. We sit between the two: faster than Rev, and more accurate and easier to square with an IRB plan than NVivo's built-in transcription.

Option 01

Rev (human transcription)

Humans type it. Slow, expensive, but the gold standard for publishable verbatim.

Turnaround: 12–24 hours (typical)
Cost · per min: $1.50 human / $0.25 AI
Speaker labels: Yes, manually placed
Audio retention: Stored on Rev servers
Languages: EN human · ~30 AI
CAQDAS export: DOCX, TXT (manual)
Best for: Single high-stakes interviews destined for direct quotation in a published paper, where budget is not the constraint.

Option 02

Transcription.Solutions

AI transcript in minutes, audio deleted in 24h, DOCX styled for NVivo and Atlas.ti import.

Turnaround: ~5 min for a 60-min file
Cost · per min: $0.03
Speaker labels: Diarized, rename in-app
Audio retention: Deleted within 24h
Languages: 100+, auto-detected
CAQDAS export: DOCX heading styles + TXT
Best for: Researchers running 20+ interviews who need fast first-pass transcripts, then hand-correct the 5% of quotes destined for publication.

Option 03

NVivo Transcription / Otter

AI transcription bundled inside your CAQDAS tool or note-taker. Convenient, EN-leaning, less control.

Turnaround: Comparable (AI)
Cost: Credit packs · ~$0.30/min
Speaker labels: Acoustic, EN-tuned
Audio retention: Tied to subscription
Languages: Non-EN accuracy drops
CAQDAS export: Native to NVivo only
Best for: Solo PhD students working entirely in English inside one CAQDAS ecosystem who want a single bill.

Pricing and feature flags accurate as of 2026. Rev's AI/human split and NVivo Transcription credit pricing vary by region and academic licensing.
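
To make the per-minute rates above concrete, here is a rough Python sketch of what a whole study might cost at each listed rate. The study size (20 interviews of 60 minutes each) is an illustrative assumption, and the function name `study_cost_usd` is hypothetical. Rates are held in integer cents so the totals are not distorted by floating-point rounding.

```python
# Per-minute rates taken from the comparison above, in US cents.
# The NVivo figure is approximate ("~$0.30/min") and varies by licence.
RATES_CENTS_PER_MIN = {
    "Rev (human)": 150,
    "Rev (AI)": 25,
    "Transcription.Solutions": 3,
    "NVivo Transcription (approx.)": 30,
}


def study_cost_usd(interviews: int, minutes_each: int) -> dict[str, float]:
    """Return the total transcription cost in dollars for each option."""
    total_minutes = interviews * minutes_each
    return {
        name: cents * total_minutes / 100
        for name, cents in RATES_CENTS_PER_MIN.items()
    }


if __name__ == "__main__":
    # Assumed study size: 20 interviews of 60 minutes each.
    for name, cost in study_cost_usd(20, 60).items():
        print(f"{name:<32}${cost:>9,.2f}")
```

Under those assumptions the totals come to $1,800 for Rev human, $300 for Rev AI, $36 at $0.03/min, and roughly $360 at ~$0.30/min.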

Specific to qualitative research

Three things that bite researchers on generic transcription tools.

Flip the right settings before you upload and the transcript imports straight into your CAQDAS project.

What goes wrong

  1. Filler words stripped silently. Generic AI removes "um", "like", and false starts: fine for meeting notes, fatal for conversation analysis or discourse work.
  2. Domain terminology (theoretical frameworks, drug names, place names, kinship terms) gets transcribed phonetically. Coding then needs a find-and-replace pass.
  3. Audio sits on the vendor's servers indefinitely. Most IRB data management plans require deletion or controlled retention, and vendors rarely document this clearly.

What to flip here

  1. Switch to Verbatim mode on the job form. We keep fillers, false starts, repetitions, and laugh markers; clean mode is opt-in, not the default for researchers.
  2. Paste your codebook terms and proper nouns into Custom vocabulary. We pass it as a recognizer hint, not a hard substitution, so context still wins.
  3. Audio is deleted within 24 hours of job completion. Transcript stays in your account. We can issue a deletion confirmation for your IRB file on request.

Recommended job settings for research interviews

Drop a field recording and these flip on by default. Override per-job from the form.

Mode: Verbatim (fillers + false starts on)
Speaker model: Interview · 2–8 speakers
Language: Auto-detect · accent-tolerant
Timestamps: Every speaker turn
Audio retention: Delete within 24h
Export: DOCX (CAQDAS styles) · TXT · SRT

Accuracy · real-world numbers

94% on a clean lavalier interview. Honest about what fieldwork breaks.

Field audio is the hard case in transcription — open rooms, accented English, overlapping speech in focus groups. Lavalier-mic dyadic interviews hit the ceiling; ambient field recordings and large focus groups degrade fastest. Numbers below come from actual researcher uploads, not synthetic benchmarks.

95%
1-on-1, lavalier or USB mic

Quiet room, single L2 or native speaker, recorder on the table. Best case for semi-structured interviews — most dyadic studies land here.

91%
Handheld recorder, 2–3 speakers

Zoom H4n or phone recorder mid-table. Speaker chairs identified by direction. Plan a 5-min relabel pass.

85%
Field interview, ambient noise

Café, market, walking interview. Background chatter and traffic affect short responses; main turns remain codable.

80%
Focus group, 5–8 participants

Overlapping speech and shared mic. Diarization will merge some quieter voices — expect to disambiguate at coding time.
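
The page does not say how these percentages are measured. A common convention is word accuracy, meaning 1 minus the word error rate (WER). Assuming that definition, the sketch below lets you check any transcript against a short passage you have corrected by hand. The function names are illustrative and not part of the service.

```python
import re


def _words(text: str) -> list[str]:
    """Lowercase and split into word tokens, ignoring punctuation."""
    return re.findall(r"[\w']+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref, hyp = _words(reference), _words(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    # At the start of each pass, prev[j] is the edit distance
    # between the first i-1 reference words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - WER, floored at zero (WER can exceed 1 with many insertions)."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


if __name__ == "__main__":
    corrected = "Um, it was probably 2019, the bakery on the corner shut"
    machine = "It was probably 2019, the baker on the corner shut"
    print(f"WER: {word_error_rate(corrected, machine):.3f}")
    print(f"Accuracy: {word_accuracy(corrected, machine):.1%}")
```

In the sample pair, one dropped filler and one misheard word against an 11-word reference give a WER of 2/11. Scoring a few hand-corrected minutes per recording condition is usually enough to see which of the tiers above your own audio falls into.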

Common questions

8 things researchers ask about academic transcription.

01 · Is this acceptable under a typical IRB data management plan?
Most boards we've seen approve the plan once they read two facts: audio is deleted within 24 hours of job completion, and transcripts stay only in the researcher's account. We're not an IRB ourselves, and your board makes the final call, but we'll issue a written processing description for your protocol on request.
02 · Do you keep my interview audio?
No. The audio file is deleted within 24 hours of the job finishing. Only the transcript remains in your account, and you can delete that any time. We don't use research audio to train models.
03 · Can you do true verbatim (with fillers, false starts, and overlaps) for conversation analysis?
Yes. Toggle Verbatim mode on the job form and we keep "um", "uh", repetitions, false starts, and laugh tokens. Overlap is marked with a brace symbol at the turn boundary. We don't do Jefferson notation automatically — that's still a human pass.
04 · Will the DOCX import cleanly into NVivo, Atlas.ti, or MaxQDA?
Yes. Our DOCX uses the heading and speaker styles each tool expects for auto-coding by speaker. In NVivo, use File → Import → Transcripts. In Atlas.ti and MaxQDA, the speaker-paragraph structure is preserved so autocoding by speaker works out of the box.
05 · How does it handle accented English or multilingual interviews?
We support 100+ languages with auto-detection, including code-switching within a single recording. Heavy L2 accents land around 85–90% on clean audio. For minority languages with sparse training data (e.g., some African and Indigenous languages), accuracy is lower and we say so on the language picker.
06 · Focus groups with 6–8 people: does diarization actually work?
Partly. Acoustic diarization reliably separates 4–5 distinct voices on a shared mic. Beyond that, expect the model to merge the quietest two participants. The fix is a rename pass in the transcript editor — most focus group transcripts need 10–15 minutes of cleanup.
07 · Can my co-PI and grad students access transcripts in the same project?
Yes. Workspaces support shared folders with per-user permissions — PI can see all interviews, RAs see only their assigned cohort. Useful for multi-site studies where you don't want one student exporting another's data.
08 · For publication-grade direct quotes, do you offer a human pass?
Not yet, and we won't pretend we do. For quotes going into a thesis or article, we recommend running the AI transcript first, coding in your CAQDAS tool, then hand-correcting the specific 30–60 seconds around each quote against the audio before it's deleted. That's the workflow most of our researchers follow.
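
The quote-checking step in the last answer can be planned ahead of time. As a minimal sketch, assuming you have noted the timestamp (in seconds) of each quote you intend to publish, the hypothetical helper below pads each one by 30 seconds on either side (a 60-second window, the upper end of the range above) and merges windows that overlap, so you know which stretches of audio to re-listen to.

```python
def relisten_windows(quote_times_s, pad_s=30, audio_length_s=None):
    """Return merged (start, end) windows, in seconds, around each quote."""
    windows = []
    for t in sorted(quote_times_s):
        start = max(0, t - pad_s)
        end = t + pad_s
        if audio_length_s is not None:
            end = min(end, audio_length_s)
        if windows and start <= windows[-1][1]:
            # Overlaps the previous window: extend it instead of adding one.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def as_clock(seconds):
    """Format seconds as H:MM:SS (for example 1:08:24)."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Illustrative quote positions in a 1:08:24 (4104-second) recording.
    for start, end in relisten_windows([20, 70, 600], audio_length_s=4104):
        print(f"{as_clock(start)} to {as_clock(end)}")
```

With quotes at 20, 70, and 600 seconds, the first two windows overlap and merge, leaving two stretches to check rather than three.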

Upload one interview. See if the transcript codes the way you'd code it.

30 free minutes every month. No card. Verbatim mode, 100+ languages, CAQDAS-ready DOCX, audio deleted in 24h.

Start free