Reference: Speech-to-text online. What runs in the browser, what runs on our servers, what's free and what's paid.
Subject: Speech-to-Text in the Browser
Inputs: File · URL · Microphone
Coverage: 99 languages, auto-detect
Free tier: 60 minutes / month, no card

Speech to text.
Online. No install.

Open a browser tab, drop a file or hit record, and watch text appear. No app, no extension, no Chrome-only restriction. Works on iPhone, iPad, Android, Mac, Windows, Linux. Free for the first 60 minutes each month.

No account required to try · Works on every modern browser · 60 free minutes / month · Source deleted in 24 hours
Free tier
60
Minutes per month, no card. Up to 30 minutes per file. Personal use, evaluation, and one-off transcripts fit comfortably.
Browsers
All
Chrome, Safari, Firefox, Edge, Brave, Arc — desktop and mobile. The recorder uses the standard MediaRecorder API.
Latency
~6×
Speed relative to realtime on a single chunk. A 10-minute recording typically completes in 90 seconds.
Headline: One number
60 min
Free tier

Free per month. No credit card. No "15-day trial then auto-charge". Sixty minutes is enough to evaluate the result on real audio of yours, or to handle the long tail of one-off transcripts.

Definition: Reference passage

Speech-to-text online means converting recorded speech to text directly in your web browser, with no software install. Transcription.Solutions does this three ways: upload a saved audio or video file, paste a URL from YouTube / TikTok / Instagram / 1,500 other sites, or click record and capture from your microphone right in the page. The recording is uploaded as it finishes, transcribed on our servers, and the text comes back within minutes — typically faster than realtime. The source audio is permanently deleted from our infrastructure within 24 hours.

Workflow: Three-step procedure

How it works in the browser

Three ways to start. Pick the one that matches what you have right now — a file, a link, or a microphone.

1

Drop a file or paste a URL

Drag any audio or video file into the page. Or paste a public link — YouTube, TikTok, Instagram, Vimeo, Twitter, Facebook, podcast feeds. We handle the upload, audio extraction, and transcription server-side. You don't need to convert formats.

2

Or record from your microphone

Click the record button. The browser asks for mic permission once. Speak, then click stop. The recording uploads automatically and transcription begins. Useful for quick voice memos, dictation, and meeting notes when you don't have a recording app.

3

Read, edit, export

The transcript appears in the browser. Edit speaker turns inline, copy paragraphs, run an AI summary, or export to TXT, SRT, VTT, or DOCX. Everything stays in your account; the source audio is wiped within 24 hours.

Output: 6 deliverable elements

What works in the browser

01

In-browser recorder

Captures from your microphone using the standard MediaRecorder API. Works on iOS Safari, Android Chrome, desktop everywhere. No app to install, no extension to permission-creep.

02

Drag-and-drop upload

Multi-file drop zone. Uploads of large files are resumable: if the connection drops, you don't lose progress. Free tier accepts files up to 100 MB; Pro 500 MB; Business 2 GB.
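
If you script your uploads, a pre-flight size check can save a failed transfer. A minimal sketch using the per-plan limits quoted above; the function names are hypothetical, and counting 1 MB as 1,000,000 bytes is an assumption (the service may count in MiB).

```shell
# Per-plan upload limits in bytes, taken from the figures above.
# ASSUMPTION: 1 MB = 1,000,000 bytes; the service may count differently.
plan_limit_bytes() {
  case "$1" in
    free)     echo $((100 * 1000 * 1000)) ;;
    pro)      echo $((500 * 1000 * 1000)) ;;
    business) echo $((2 * 1000 * 1000 * 1000)) ;;
    *)        echo "unknown plan: $1" >&2; return 2 ;;
  esac
}

# fits_plan FILE PLAN
# Exit 0 if FILE is within the plan's limit, 1 if it is too large,
# 2 if the file or the plan name is not usable.
fits_plan() {
  local file="$1" plan="$2" size limit
  [ -f "$file" ] || return 2
  limit=$(plan_limit_bytes "$plan") || return 2
  # tr strips the padding some wc implementations add around the number.
  size=$(wc -c < "$file" | tr -d '[:space:]')
  [ "$size" -le "$limit" ]
}
```

Usage: `fits_plan interview.mp3 free || echo "too large for the Free tier"`. The check covers file size only; the 30-minute per-file cap on the Free tier is a separate limit.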

03

URL paste

YouTube, TikTok, Instagram Reels, Vimeo, Twitter, Facebook, podcast feeds — paste any public URL. We resolve and extract the audio server-side.

04

Speaker labels

On Pro and Business plans we separate two or more voices automatically. Manual rename per speaker. Free tier has the transcript without diarization.

05

Live preview as it processes

Long files show partial results as chunks complete. You start reading before the file is finished — no blank loading screen.

06

Direct API for everything the browser does

If you'd rather not have a UI between you and the transcription engine: REST endpoints for upload, URL, recorder. JWT auth. Same backend, no UI overhead.

API: Same backend, no UI

If you'd rather skip the dashboard

The browser flow is good, but for batch jobs and automation you want the API. A few lines of bash to upload a file, a few more to submit a URL. JWT auth, webhook callbacks for completion.

# Upload a file → get a job ID
curl -X POST https://api.transcription.solutions/api/v1/jobs/upload \
  -H "Authorization: Bearer $TS_API_KEY" \
  -F "file=@interview.mp3" \
  -F "diarize=true"

# Or paste a URL
curl -X POST https://api.transcription.solutions/api/v1/jobs \
  -H "Authorization: Bearer $TS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"source_url": "https://youtu.be/...", "diarize": true}'

# Webhook fires on completion with the transcript URL.
# Or poll: GET /api/v1/jobs/{id} until status = "done".
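
The polling option mentioned above can be sketched as a small shell loop. This is a sketch under assumptions, not an official client: it assumes the job endpoint returns JSON with a top-level "status" string, and that "failed" is a terminal status. The function names and retry defaults are made up for illustration.

```shell
API_BASE="https://api.transcription.solutions/api/v1"

# Fetch the raw job body. Kept separate so it can be stubbed or replaced.
fetch_job() {
  curl -fsS -H "Authorization: Bearer $TS_API_KEY" "$API_BASE/jobs/$1"
}

# Read a response body on stdin and print its "status" string.
# ASSUMPTION: the response is JSON with a "status" field.
# A real JSON parser such as jq is more robust than sed if you have one.
job_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# wait_for_job JOB_ID [MAX_TRIES] [DELAY_SECONDS]
# Returns 0 when the job is done, 1 on the assumed failure status,
# 2 if the job is still not done after MAX_TRIES polls.
wait_for_job() {
  local id="$1" max_tries="${2:-60}" delay="${3:-5}" status="" i=1
  while [ "$i" -le "$max_tries" ]; do
    status=$(fetch_job "$id" | job_status)
    case "$status" in
      "done")   return 0 ;;
      "failed") return 1 ;;  # ASSUMPTION: not confirmed by the text above
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  return 2
}
```

Usage: `wait_for_job "$JOB_ID" 120 5 && echo "transcript ready"`. For anything beyond a one-off script, the webhook is the better fit, since polling spends requests against your per-key rate limit.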

Per-key rate limits apply. JWT auth. Webhook signatures use HMAC-SHA256. Available on every plan including Free for evaluation. Same pricing as the dashboard — no API surcharge.
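
Checking a webhook signature might look like the sketch below. The text above only says that signatures use HMAC-SHA256. Two things here are assumptions: that the signature is a hex digest of the raw request body, and that you have already extracted it from the request (the header name is not stated here). Check the endpoint reference before relying on either. The function names are hypothetical.

```shell
# Print the hex HMAC-SHA256 of stdin under the given secret.
# Uses the openssl CLI; the digest is the last field of its output.
hmac_sha256_hex() {
  openssl dgst -sha256 -hmac "$1" | awk '{print $NF}'
}

# verify_webhook SECRET SIGNATURE_HEX < raw_body
# Exit 0 if the signature matches the body, non-zero otherwise.
# ASSUMPTION: the signature is a hex digest computed over the raw body.
verify_webhook() {
  local expected given
  expected=$(hmac_sha256_hex "$1")
  # Normalise the supplied signature to lowercase before comparing.
  given=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  [ -n "$expected" ] && [ "$expected" = "$given" ]
}
```

Usage: `verify_webhook "$WEBHOOK_SECRET" "$signature" < body.json && echo "signature ok"`. Feed it the body bytes exactly as received; re-serialised JSON will not match. Shell string comparison is not constant-time, so a production receiver should use its framework's constant-time compare.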

Full endpoint reference →
Coverage: Top-tier languages (full list at the hub)

Tier-1 languages — studio quality without an editorial pass

Auto-detection picks the language; force a specific one in advanced settings if a multilingual file confuses detection. The 8 below are tier-1 — production-grade quality on real-world audio. We support 99 total.

Quality: What to expect, honestly

Accuracy: what to expect on browser-recorded audio

Browser-recorded audio is typically 16 kHz mono from the laptop or phone mic. That's good enough for 95%+ accuracy in a quiet room. Bus stops and coffee shops are harder.

95%+
In a quiet room, single speaker, decent mic — the conditions most browser-recorder users actually have. Whether you're using a MacBook built-in or a $20 USB headset, the result lands here.
What we deliver
95%+

Quiet room, single speaker.

MacBook or iPhone mic, indoor, no music in the background. The default condition for a voice memo, dictation, or one-person podcast take.

  • Voice memos in a home office
  • Dictation into the browser
  • Solo podcast recording
  • One-on-one Zoom calls (downloaded then re-uploaded)
What's normal
85%+

Outdoors, moving, distance.

Phone mic outdoors, conversation while walking, 1–2 metres from the speaker. Most words right; punctuation occasionally drifts on long pauses.

  • Walking conversation captured on phone
  • Lecture recorded from the back of the room
  • Public-space voice memos
  • Multi-speaker meeting from a single laptop mic
Browser recorder gotchas

Tab in background

On some browsers, audio recording pauses when you switch tabs. Leave the tab in the foreground for long recordings. Or use a dedicated recorder app and upload the file — same accuracy, no tab risk.

Mobile battery saver

iOS Low Power Mode and Android battery savers can cut recording short. Disable them for long sessions, or plug in.

Permission revocation

If you deny mic permission, the recorder won't work — no fallback. Check site permissions in your browser settings, or use the file-upload path instead.

Cookie-blocking extensions

Hardcore privacy extensions occasionally break our auth flow. If recording works but transcripts don't save, try a private window or a different browser.

— What changed for early users Note 003 / 2026

I had a Whisper.cpp install on my laptop, three Python venvs, and a habit of "setting it up properly" instead of using it. Closing all of that and pasting a URL into a browser tab sounded like a downgrade. Two weeks in, I haven't opened the laptop install once.

Reference: Common questions

Frequently asked questions

  1. Is it actually free?
    Yes — 60 minutes per month, no credit card. You can upload files up to 30 minutes each, or record from the browser mic. The free tier exists for evaluation and personal use; if you outgrow it, Pro is $19/month for 600 minutes.
  2. Do I need to install anything?
    No. It runs in any modern browser: Chrome, Safari, Firefox, Edge, Brave, Arc. Desktop and mobile. The recorder uses the standard MediaRecorder API, which Chrome and Firefox have shipped for years and Safari has supported since its 14.x releases.
  3. Does it work on iPhone?
    Yes. Safari on iOS has supported the MediaRecorder API since iOS 14. Recording from the in-page button works as long as you grant mic permission. File upload works the same as on desktop. Tip: allow microphone access for the site in Safari's settings once and you won't be asked again.
  4. Where does the speech-to-text actually run?
    On our servers, not in the browser. Browser-side speech-to-text exists (the Web Speech API), but its language and dialect coverage is narrow and quality drops on anything but US English. We use a server-side ASR pipeline that handles 99 languages with consistent quality.
  5. How long does it take?
    Roughly 6× realtime on a single chunk. A 10-minute recording completes in about 90 seconds; a 60-minute file in 9–11 minutes. Long files split into chunks and process in parallel.
  6. Is my speech kept private?
    The audio is uploaded over HTTPS, transcribed, and the source file is permanently deleted from our infrastructure within 24 hours of completion. Transcripts and summaries stay in your account until you delete them. We do not train models on your data.
  7. Can I use it without an account?
    You can try it once without signing up — paste a short YouTube URL or record 30 seconds from the mic. To save the result and access more than the trial, create a free account.
  8. Why use this instead of the Web Speech API in the browser?
    Web Speech is fast for short live captioning but has narrow language and dialect support, is not available in every browser (Firefox does not enable it by default), and in Chrome routes audio through Google's servers anyway. Our pipeline runs on every browser, supports 99 languages, returns timestamps, and gives you SRT / VTT / DOCX exports.
Action: Start trial

Try it in a browser tab.

60 free minutes per month. No card, no app install, no extension. Drop a file, paste a link, or click record.

Start free