REST API · v1
Transcription API
Drop a file or a URL into our pipeline and get a transcript back — with speaker labels, AI summary, multi-language translation, and seven download formats. Same engine that powers the dashboard, just wired up for your scripts and apps.
POST /api-keys returns 403 until you upgrade.
Introduction
All endpoints live under a single base URL. Every response is JSON unless you explicitly request another format (transcript exports return TXT, SRT, VTT, DOCX, MD, or PDF).
https://api.transcription.solutions/api/v1
Requests use standard HTTP verbs and status codes. Successful calls return 2xx with a JSON body; failures return 4xx or 5xx with { "detail": { "code": "...", "message": "..." } }. See Errors for the full code list.
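As a rough illustration of the failure envelope described above, a client might wrap it in a typed error. The names ApiError and toApiError are hypothetical and not part of the API; only the { detail: { code, message } } shape comes from this page.

```javascript
// Hypothetical helper names; only the envelope shape is taken from the docs.
class ApiError extends Error {
  constructor(status, code, message, extra) {
    super(message);
    this.name = "ApiError";
    this.status = status; // HTTP status, e.g. 402
    this.code = code;     // machine-readable code, e.g. "quota_exceeded"
    this.extra = extra;   // any additional fields found inside detail
  }
}

// Build an ApiError from an HTTP status and an already-parsed JSON body.
// Falls back to generic values when the body is not the documented envelope.
function toApiError(status, body) {
  const detail =
    body && typeof body.detail === "object" && body.detail !== null
      ? body.detail
      : {};
  const { code = "unknown_error", message = `HTTP ${status}`, ...extra } = detail;
  return new ApiError(status, code, message, extra);
}
```

Callers can then branch on err.code rather than matching on message text, which is written for humans and may change.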
Authentication
Sign in to app.transcription.solutions/settings → API keys, click Create, give the key a name, and copy the value. We store only its SHA-256 hash, so the raw ts_... token is shown exactly once. If you lose it, create a new one; by design there is no recovery flow.
Pass the key in an X-API-Key header on every call:
curl https://api.transcription.solutions/api/v1/jobs \
-H "X-API-Key: ts_••••••••••••••••••••••••••••••••"
Revoke a key at Settings → API keys; the next request with that hash returns 401.
Quickstart
Three calls, end to end: upload a file, poll until it's done, download the transcript.
# 1) Upload — returns { id, status }
JOB=$(curl -sS -X POST https://api.transcription.solutions/api/v1/jobs \
-H "X-API-Key: $TS_KEY" \
-F "file=@meeting.mp3" \
-F "diarize=true" | jq -r .id)
# 2) Poll — repeat every few seconds, or use the WebSocket
while true; do
STATUS=$(curl -sS https://api.transcription.solutions/api/v1/jobs/$JOB \
-H "X-API-Key: $TS_KEY" | jq -r .status)
[ "$STATUS" = "done" ] && break
[ "$STATUS" = "failed" ] && echo "failed" && exit 1
sleep 3
done
# 3) Download — TXT, SRT, VTT, DOCX, MD, JSON, PDF
curl -sS "https://api.transcription.solutions/api/v1/jobs/$JOB/export?format=md" \
-H "X-API-Key: $TS_KEY" -o meeting.md
Plans & limits
Limits are enforced per-user, not per-key. All keys on a Pro account share that account's monthly minute pool.
| Capability | Free | Pro | Business |
|---|---|---|---|
| API access | — | ✓ | ✓ |
| Minutes / month | 60 | 600 | 2,500 |
| Max file duration | 30 min | 60 min | 4 h |
| Max file size | 100 MB | 500 MB | 2 GB |
| Concurrent jobs | 1 | 3 | 5 |
| Speaker diarization | — | ✓ | ✓ |
| AI summary & polish | — | ✓ | ✓ |
| Translations / month | — | 10 | 50 |
| Queue priority | Low | Normal | High |
Need more? Email us for an Enterprise quote — custom minutes, SLAs, dedicated queue, and on-prem deployment options.
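The size and duration limits in the table above can be checked client-side before an upload is attempted. The sketch below is only an illustration: precheckUpload and PLAN_LIMITS are hypothetical names, and it assumes 1 MB means 1024 * 1024 bytes, which this page does not specify. The server remains the authority.

```javascript
// Limits copied from the plan table. ASSUMPTION: 1 MB = 1024 * 1024 bytes.
const PLAN_LIMITS = {
  free:     { apiAccess: false, maxMinutes: 30,  maxSizeMb: 100 },
  pro:      { apiAccess: true,  maxMinutes: 60,  maxSizeMb: 500 },
  business: { apiAccess: true,  maxMinutes: 240, maxSizeMb: 2048 },
};

// Returns a list of problems; an empty list means the upload looks acceptable.
function precheckUpload(plan, sizeBytes, durationSec) {
  const limits = PLAN_LIMITS[plan];
  if (!limits) return [`unknown plan: ${plan}`];
  const problems = [];
  if (!limits.apiAccess) problems.push("plan has no API access");
  if (sizeBytes > limits.maxSizeMb * 1024 * 1024) problems.push("file_too_large");
  if (durationSec > limits.maxMinutes * 60) problems.push("file_too_long");
  return problems;
}
```

For example, precheckUpload("pro", 10 * 1024 * 1024, 312.4) returns an empty list, while a 70-minute file on Pro yields ["file_too_long"].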
Upload a file
Multipart upload. The file streams straight to S3 — never buffered in RAM — and a magic-byte check on the first 16 KB rejects polyglots before they reach ffmpeg.
Form fields
| Field | Type | Description |
|---|---|---|
| file | file | Audio or video. mp3, m4a, wav, ogg, opus, flac, webm, mp4, mov, mkv, avi. |
| diarize | boolean | Speaker labels. Defaults to false. Pro+ only. |
curl -X POST https://api.transcription.solutions/api/v1/jobs \
-H "X-API-Key: $TS_KEY" \
-F "file=@meeting.mp3" \
-F "diarize=true"

{
"id": "5d3f2c80-9a4f-4f3a-9ab1-5d7c3a1e0b91",
"status": "queued"
}
Submit a URL
Hand us any link our universal extractor recognises — YouTube, TikTok, Instagram, Twitter/X, Facebook, Reddit, Vimeo, SoundCloud, Twitch, podcast feeds, plus ~1,500 more sites — or a direct CDN URL to an audio/video file. We download, transcode, and queue.
curl -X POST https://api.transcription.solutions/api/v1/jobs/from-url \
-H "X-API-Key: $TS_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
"language": "auto",
"diarize": false
}'

{
"id": "5d3f2c80-9a4f-4f3a-9ab1-5d7c3a1e0b91",
"status": "queued"
}
List jobs
Paginated list of the calling user's jobs, newest first. Optional status filter accepts any of queued, downloading, extracting, transcribing, diarizing, analyzing, done, failed.
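A small sketch of building the list URL and working out how many pages to fetch. The page, limit, and status parameters and the status values are taken from this section; the helper names listJobsUrl and pageCount are made up for illustration, and rejecting unknown statuses client-side is a design choice here, not something the API is documented to require.

```javascript
const BASE_URL = "https://api.transcription.solutions/api/v1";

// Status values listed in this section.
const JOB_STATUSES = [
  "queued", "downloading", "extracting", "transcribing",
  "diarizing", "analyzing", "done", "failed",
];

// Build the GET /jobs URL; throws early on a status the docs do not list.
function listJobsUrl({ page = 1, limit = 20, status } = {}) {
  if (status !== undefined && !JOB_STATUSES.includes(status)) {
    throw new RangeError(`unknown status filter: ${status}`);
  }
  const params = new URLSearchParams({ page: String(page), limit: String(limit) });
  if (status !== undefined) params.set("status", status);
  return `${BASE_URL}/jobs?${params}`;
}

// Number of pages implied by the "total" and "limit" fields of a response.
function pageCount(total, limit) {
  return Math.ceil(total / limit);
}
```

With total 47 and limit 20, pageCount gives 3, so a client would request pages 1 through 3.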
curl "https://api.transcription.solutions/api/v1/jobs?page=1&limit=20&status=done" \
-H "X-API-Key: $TS_KEY"

{
"jobs": [
{
"id": "5d3f2c80-...",
"status": "done",
"stage": 6,
"progress": 100,
"duration_sec": 312.4,
"language_detected": "en",
"created_at": "2026-05-04T08:12:31Z"
}
],
"total": 47,
"page": 1,
"limit": 20
}
Get a job
Full job record with transcription text, optional diarization segments, summary, and any cached translations. Poll this every 2–5 seconds, or — much better — subscribe to WebSocket updates.
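The polling advice above could be implemented roughly as follows. This is a sketch under assumptions: waitForJob, isTerminal, and makeGetJob are hypothetical names, the 30-minute timeout is an arbitrary default, and "done" and "failed" are treated as the only terminal statuses. It relies on the global fetch shipped with Node 18+.

```javascript
// ASSUMPTION: these are the only statuses after which a job stops changing.
const TERMINAL = new Set(["done", "failed"]);

function isTerminal(status) {
  return TERMINAL.has(status);
}

// Poll until the job reaches a terminal status or the timeout elapses.
// getJob is injected so the loop can be exercised without network access.
async function waitForJob(getJob, { intervalMs = 3000, timeoutMs = 30 * 60 * 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const job = await getJob();
    if (isTerminal(job.status)) return job;
    if (Date.now() + intervalMs > deadline) {
      throw new Error("timed out waiting for job");
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// One possible getJob built on fetch and the X-API-Key header.
function makeGetJob(jobId, apiKey) {
  return async () => {
    const res = await fetch(
      `https://api.transcription.solutions/api/v1/jobs/${jobId}`,
      { headers: { "X-API-Key": apiKey } }
    );
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return res.json();
  };
}
```

A caller would write something like const job = await waitForJob(makeGetJob(id, key)) and then check job.status, since a "failed" job is returned rather than thrown.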
curl https://api.transcription.solutions/api/v1/jobs/$JOB_ID \
-H "X-API-Key: $TS_KEY"

{
"id": "5d3f2c80-...",
"status": "done",
"stage": 6,
"progress": 100,
"duration_sec": 312.4,
"transcription": {
"id": "...",
"full_text": "Hello, welcome to the show...",
"language_detected": "en"
},
"diarization": {
"segments": [
{ "speaker": "speaker_0", "start": 0.0, "end": 4.2,
"text": "Hello, welcome to the show." }
],
"num_speakers": 2
},
"summary": null,
"translations": []
}
Export a transcript
Seven formats. JSON returns the versioned TranscriptionExportV1 payload (stable across point releases — breaking changes bump the version).
| Format | Use it for |
|---|---|
| txt | Plain transcript, no metadata. |
| md | Speaker-labelled paragraphs, summary, action items. |
| srt | Subtitles for video editing. |
| vtt | Web-native captions (HTML5 video). |
| docx | Word: meeting notes, speaker turns. |
| pdf | Print-ready report with summary block. |
| json | Structured payload to pipe into your own tooling. |
curl "https://api.transcription.solutions/api/v1/jobs/$JOB_ID/export?format=docx" \
-H "X-API-Key: $TS_KEY" \
-o meeting.docx
Content-Disposition uses RFC 5987 encoding so non-ASCII filenames (Cyrillic, CJK, emoji) survive a curl -OJ round-trip without mojibake.
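Clients that do not use curl need to decode that header themselves. The following is a minimal sketch of reading an RFC 5987 extended filename (the filename*=UTF-8''... form). The function name is hypothetical, and it assumes the server sends the UTF-8 charset; the exact header this API emits is not shown on this page.

```javascript
// Extract a filename from a Content-Disposition header value.
// Prefers the RFC 5987 extended form (filename*=UTF-8''<percent-encoded>)
// and falls back to the plain filename= parameter.
function filenameFromContentDisposition(header) {
  const extended = /filename\*\s*=\s*UTF-8'[^']*'([^;]+)/i.exec(header);
  if (extended) return decodeURIComponent(extended[1].trim());
  const plain = /filename\s*=\s*"?([^";]+)"?/i.exec(header);
  return plain ? plain[1].trim() : null;
}
```

With fetch, the header value would come from res.headers.get("content-disposition"); a null result means the caller should pick its own filename.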
Delete a job
Removes the source file from S3 and the row from Postgres. If the job is still in flight, we revoke the Celery task — Whisper stops mid-transcription. Any minutes already billed are refunded via a negative UsageRecord so the audit ledger stays append-only.
curl -X DELETE https://api.transcription.solutions/api/v1/jobs/$JOB_ID \
-H "X-API-Key: $TS_KEY"
Polish: paragraph breaks & punctuation
Whisper output is one wall of text. /polish re-flows it into readable paragraphs and fixes punctuation — without rewriting words. A drift guard rejects any LLM response whose character count moves more than ±15 % from the source, so you keep your original text.
Idempotent — if the transcript already has paragraph breaks the cached version is returned without spending another LLM call.
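To make the drift guard concrete, here is one way such a check could be expressed. This is an illustration, not the service's implementation: the function names are invented, and it assumes character count means JavaScript string length (UTF-16 code units) and that a change of exactly 15 % is accepted.

```javascript
// Relative change in character count between the source and a candidate.
function charDrift(source, candidate) {
  if (source.length === 0) return candidate.length === 0 ? 0 : Infinity;
  return Math.abs(candidate.length - source.length) / source.length;
}

// Accept the candidate only if its length stays within maxDrift of the source.
function passesDriftGuard(source, candidate, maxDrift = 0.15) {
  return charDrift(source, candidate) <= maxDrift;
}
```

A 100-character source accepts a 110-character rewrite (10 % drift) and rejects a 120-character one (20 % drift), in which case the original text would be kept.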
curl -X POST https://api.transcription.solutions/api/v1/jobs/$JOB_ID/polish \
-H "X-API-Key: $TS_KEY"

{ "polished": true, "cached": false, "paragraphs": 8, "chars": 2147 }
Summarize
Generates an executive summary, key points, and action items from a finished transcript. One LLM round-trip; idempotent — a repeat call returns the cached summary object without re-billing.
curl -X POST https://api.transcription.solutions/api/v1/jobs/$JOB_ID/summarize \
-H "X-API-Key: $TS_KEY"

{
"summarized": true,
"cached": false,
"summary": "Quarterly review covered Q3 launch, channel mix, and Q4 hiring plan...",
"key_points": [
"Q3 closed 18% above target",
"Paid social CAC dropped from $42 to $31"
],
"action_items": [
"Marketing — finalise Q4 budget by Friday",
"Eng — open backend headcount by next Monday"
]
}
Translate a transcript
Pass the transcription.id from GET /jobs/{id} and the target ISO 639-1 code. Translation is idempotent per (transcription, target_lang) — a repeat call returns the cached row and does not bump your translations counter.
curl -X POST https://api.transcription.solutions/api/v1/transcriptions/$TR_ID/translate \
-H "X-API-Key: $TS_KEY" \
-H "Content-Type: application/json" \
-d '{ "target_lang": "es" }'

{
"id": "...",
"transcription_id": "...",
"target_lang": "es",
"text": "Hola, bienvenido al programa...",
"llm_model": "gpt-5-mini",
"created_at": "2026-05-04T08:14:09Z"
}
List all cached translations for a transcription with GET /transcriptions/{id}/translations.
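An easy mistake is to pass the job ID where the transcription ID is expected. The sketch below derives the translate request from a job record so the right ID is used. buildTranslateRequest is a hypothetical helper, and the two-letter check is a local sanity test, not the API's own validation.

```javascript
const BASE_URL = "https://api.transcription.solutions/api/v1";

// Build the URL and fetch options for a translate call from a job record
// as returned by GET /jobs/{id}. Uses job.transcription.id, not job.id.
function buildTranslateRequest(job, targetLang, apiKey) {
  if (!job.transcription || !job.transcription.id) {
    throw new Error("job has no transcription yet");
  }
  if (!/^[a-z]{2}$/.test(targetLang)) {
    throw new RangeError("target_lang must be a two-letter ISO 639-1 code");
  }
  return {
    url: `${BASE_URL}/transcriptions/${job.transcription.id}/translate`,
    options: {
      method: "POST",
      headers: { "X-API-Key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ target_lang: targetLang }),
    },
  };
}
```

The result can be passed straight to fetch(req.url, req.options).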
WebSocket — realtime progress
Skip the polling loop. Connect, send an auth frame with your Supabase JWT (the dashboard's session token — API keys are HTTP-only by design), then subscribe to a job ID and receive stage updates as the worker bumps progress.
const ws = new WebSocket("wss://api.transcription.solutions/api/v1/ws");
ws.onopen = () => {
ws.send(JSON.stringify({ type: "auth", token: SUPABASE_JWT }));
ws.send(JSON.stringify({ subscribe: jobId }));
};
ws.onmessage = (e) => {
const msg = JSON.parse(e.data);
// { stage, progress, status }
if (msg.status === "done") fetchJob(jobId);
};
Webhooks
Have us POST to your endpoint when a job finishes: no polling, no sockets. Configure delivery URLs at Settings → Webhooks. Each event is signed: X-Webhook-Signature is an HMAC-SHA-256 of the raw request body, computed with the secret shown in the dashboard.
{
"event": "job.done",
"data": {
"job_id": "5d3f2c80-...",
"status": "done",
"duration_sec": 312.4
},
"timestamp": 1746345210
}
Verify in your handler before you trust anything in the body:
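import crypto from "node:crypto";

// Returns true only if X-Webhook-Signature matches the HMAC-SHA-256 of the
// raw request body. req.rawBody must hold the unparsed bytes; how you capture
// them depends on your framework.
function verify(req, secret) {
  const sig = req.headers["x-webhook-signature"];
  if (typeof sig !== "string") return false;
  const expected = crypto
    .createHmac("sha256", secret)
    .update(req.rawBody)
    .digest();
  const received = Buffer.from(sig, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return (
    received.length === expected.length &&
    crypto.timingSafeEqual(received, expected)
  );
}
Failed deliveries retry on a backoff schedule (30 s → 5 min → 30 min → 1 h → 4 h) and are then cancelled. Inspect attempts at GET /integrations/webhooks/deliveries.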
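For capacity planning, it can help to know how long the retry window lasts. The arithmetic below assumes each delay is measured from the previous attempt, which this page does not state explicitly; the names are illustrative.

```javascript
// Delays between attempts in seconds: 30 s, 5 min, 30 min, 1 h, 4 h.
const RETRY_DELAYS_SEC = [30, 5 * 60, 30 * 60, 60 * 60, 4 * 60 * 60];

// Seconds after the first failed delivery at which each retry would fire,
// ASSUMING every delay starts when the previous attempt fails.
function retryOffsetsSec(delays = RETRY_DELAYS_SEC) {
  let elapsed = 0;
  return delays.map((delay) => (elapsed += delay));
}
```

Under that assumption the last retry lands 20,130 seconds (about 5 h 35 min) after the first failure, so an endpoint that is down for longer would miss the event and need to reconcile through GET /jobs.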
Errors
Every error response is shaped the same way:
{
"detail": {
"code": "quota_exceeded",
"message": "You've used 600.0 of your 600 minutes this period. Resets May 14. Upgrade to keep transcribing.",
"minutes_used": 600.0,
"minutes_limit": 600,
"plan": "pro"
}
}

| HTTP | Code | Meaning |
|---|---|---|
| 401 | invalid_api_key | Missing or unknown X-API-Key header. |
| 403 | email_not_verified | Free signup hasn't confirmed their email yet. |
| 403 | polish_requires_paid_plan | /polish on a Free plan. |
| 403 | summarize_requires_paid_plan | /summarize on a Free plan. |
| 403 | translation_not_in_plan | Translations gated to Pro+. |
| 402 | quota_exceeded | Monthly minute pool is empty until reset_at. |
| 402 | translation_quota_exceeded | Monthly translation cap hit. |
| 413 | file_too_large | File exceeds max_file_size_mb for the plan. |
| 413 | file_too_long | Probed duration exceeds max_file_minutes. |
| 400 | unsupported_format | Extension or magic bytes don't match a known media type. |
| 400 | unsupported_language | Translation target outside our supported set. |
| 404 | not_found | Job or transcription doesn't exist (or isn't yours). |
| 409 | job_not_ready | Polish/summarize called before transcription finished. |
| 429 | rate_limited | Per-IP burst cap. Honour the Retry-After header. |
| 503 | asr_quota_exceeded | Whisper provider returned insufficient_quota. Transient. |
| 503 | translation_unavailable | LLM key is missing or rotating. Transient. |
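One way to act on the codes in the table above is to group them by the reaction they call for. The grouping below is a suggestion, not an official taxonomy: classifyError and the category names are hypothetical, and unknown codes are treated as fatal because new codes can appear without a version bump.

```javascript
// Codes the table marks as transient or rate-related.
const RETRYABLE = new Set([
  "rate_limited", "asr_quota_exceeded", "translation_unavailable",
]);

// Codes tied to plan features, quotas, or per-plan file limits.
const PLAN_RELATED = new Set([
  "polish_requires_paid_plan", "summarize_requires_paid_plan",
  "translation_not_in_plan", "quota_exceeded",
  "translation_quota_exceeded", "file_too_large", "file_too_long",
]);

// Map an error code to a coarse client-side reaction.
function classifyError(code) {
  if (RETRYABLE.has(code)) return "retry";
  if (code === "job_not_ready") return "wait_for_job";
  if (PLAN_RELATED.has(code)) return "plan_limit";
  return "fatal"; // includes codes this client does not know yet
}
```

A caller would retry on "retry", keep polling on "wait_for_job", surface an upgrade or quota message on "plan_limit", and log everything else.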
Rate limits
Per-key burst limits track your plan. Hitting the wall returns 429 with a Retry-After header (seconds). For high-throughput integrations, contact support for a custom extended tier.
| Endpoint | Free | Pro / Business |
|---|---|---|
| POST /jobs & POST /jobs/from-url | 3 / min | 10 / min |
| GET /jobs/{id} | 60 / min | 120 / min |
| POST /jobs/{id}/polish & /summarize | — | 20 / min |
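A client can stay under these limits by spacing its calls and by honouring Retry-After when a 429 does arrive. The helpers below are a sketch with invented names; the 1-second fallback is an arbitrary choice, and even spacing is a conservative strategy that does not exploit any burst allowance.

```javascript
// Convert a Retry-After header value (seconds) into milliseconds.
// Falls back to fallbackMs when the header is missing or not a number.
function retryAfterMs(headerValue, fallbackMs = 1000) {
  const seconds = Number.parseInt(headerValue ?? "", 10);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : fallbackMs;
}

// Minimum spacing between calls that keeps a client under a per-minute limit.
function minIntervalMs(requestsPerMinute) {
  return Math.ceil(60000 / requestsPerMinute);
}
```

For the Pro limit of 10 uploads per minute, minIntervalMs(10) gives 6000, so one upload every six seconds stays inside the limit.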
Versioning
The current API is v1. Backwards-incompatible changes will introduce v2 at a different path — existing integrations on v1 keep working. Additive changes (new fields on responses, new optional inputs, new error codes) ship without bumping the version. Pin to specific JSON shapes by parsing only the fields you need.
Deprecations are announced via the changelog with at least 90 days' notice and a sunset header on responses.
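Parsing only the fields you need might look like the sketch below, which copies a handful of fields from a job record and ignores the rest, so additive changes do not break the integration. The field names come from the Get a job response; readJobSummary and the choice of fields are illustrative.

```javascript
// Pick only the fields this integration relies on; ignore everything else.
// Missing optional fields become null instead of throwing.
function readJobSummary(raw) {
  return {
    id: raw.id,
    status: raw.status,
    progress: typeof raw.progress === "number" ? raw.progress : null,
    text: raw.transcription?.full_text ?? null,
  };
}
```

Because unknown keys are dropped at the boundary, a new response field never reaches code that was not written to expect it.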
Support
Bug reports, integration help, feature requests: support@transcription.solutions. We answer every email; if you're building something we haven't shipped yet, tell us — the obvious wins ship fast.