How accurate is the transcription?

On clear audio with one or two speakers, accuracy reaches 95%+ in most major languages. Quality drops with background noise, heavy accents, or overlapping speech.

What is the refund policy?

Full refund within 7 days if you have used less than 10% of your plan minutes. After that, pro-rated refunds for the unused portion. Email support@transcription.solutions.
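
The refund rule above can be sketched as a small calculation. This is an illustrative sketch only, not the billing implementation: the function name and parameters are hypothetical, day 7 is assumed to fall inside the full-refund window, and "after that" is read as "whenever the full-refund conditions are not met".

```python
def refund_amount(price: float, plan_minutes: float,
                  used_minutes: float, days_since_purchase: int) -> float:
    """Estimate a refund under the stated policy (hypothetical helper)."""
    if plan_minutes <= 0:
        raise ValueError("plan_minutes must be positive")
    used_fraction = used_minutes / plan_minutes
    # Full refund: within 7 days AND less than 10% of plan minutes used.
    if days_since_purchase <= 7 and used_fraction < 0.10:
        return round(price, 2)
    # Otherwise: pro-rated refund for the unused portion (assumed reading).
    unused_fraction = max(0.0, 1.0 - used_fraction)
    return round(price * unused_fraction, 2)


# Example with made-up numbers: a 20.00 plan with 600 minutes.
print(refund_amount(20.0, 600, 30, 3))    # 5% used on day 3
print(refund_amount(20.0, 600, 150, 10))  # 25% used on day 10
```

Support has the final say on any actual refund; the sketch only shows how the two cases differ.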

Is there an API?

Yes. The REST API is live, with webhooks. It uses API key authentication, with per-key rate limits by plan tier. Documentation is at /docs/api.

Start free

Transcribe
voice recordings, audio and video, YouTube videos, audio files, video files, MP4 videos, Zoom meetings, Microsoft Teams, Google Meet, interviews, podcasts, lectures, TikTok videos, WhatsApp voice, voice memos, MP3 files, phone calls, sermons
to text. In seconds

Speech-to-text & AI transcription software for audio and video. Convert MP3, MP4, or voice to text with speaker labels and AI summary, usually faster than realtime.

Drop your audio or video

MP3 · MP4 · WAV · M4A · MOV · up to 10 hours per file

Paste a link and we'll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · and 50+ more

Record directly in your browser

Sign-up takes 30 seconds. After that, start recording right from the dashboard.

Free 30 min/mo · No credit card required · 100+ languages · Speaker labels (Pro and above) · Files auto-deleted after 24 hours

Free plan: 30 minutes per month, up to 30 minutes per file. No credit card required.

100+

Languages auto-detected

Auto-detect with manual override.

95%+

Accuracy on clean audio

Most major languages, one or two speakers.

10h

Max file length on Business

Pro: 10 hours per month · Free: 30 minutes per month.

~30×

Faster than realtime

A 60-minute file usually comes back in 2–3 minutes.
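
As a rough check on the figures above, a 60-minute file at about 30x realtime works out to about 2 minutes, and the quoted 2–3 minute range corresponds to roughly 20x to 30x. A minimal Python sketch, assuming a constant speed-up factor (the function name is hypothetical and the 20x lower bound is inferred from the range, not a published figure):

```python
def estimated_wait_minutes(audio_minutes: float, speedup: float = 30.0) -> float:
    """Estimate processing time for a file, given a speed-up over realtime."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_minutes / speedup


# A 60-minute file at the ~30x headline figure and at an assumed 20x.
print(estimated_wait_minutes(60))              # 2.0
print(estimated_wait_minutes(60, speedup=20))  # 3.0
```

Real processing time will vary with file format, queue load, and audio quality.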

This is the dashboard

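Click around. This is the real thing

Tabs switch and action items expand. This is the same interface you'll see in your account once a job finishes: same layout, same controls.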

app.transcription.solutions / jobs / interview-ari-2026-04-26

Summary

auto-snapshot · saved

TL;DR

Founders need post-call content, not just transcripts. Current tools force them to stitch 5 apps together.

318 words · 2 speakers · 58 / 42 · 5 topics

Key points 3

01 Gap between raw recordings and publishable content
02 Show notes, social clips, blog drafts: needed by the time the call ends
03 Current tooling fragmented across 5+ apps

Action items 2

Investigate single-pipeline approach to replace 5-app stitch
Mock how show-note draft would look from this transcript

Topics: founder workflows · post-call content · tooling fragmentation · show notes · single pipeline

Speaker-labeled transcript

4 lines · 2 speakers · 30s clip

00:12 Speaker A: So what I keep hearing from founders is this gap between raw recordings and content you can actually ship.

00:27 Speaker B: Exactly. Nobody wants another transcript. They want a show note, a clip, a blog draft, by the time the call ends.

00:41 Speaker A: Right, and the tooling right now forces you to stitch five apps together to get there.

00:54 Speaker B: One pipeline, one place. That's what we're betting on.

Speaker analysis

Stereo channel split · mono speaker diarization

Speaker A

58% airtime

Turns

14s

Talk time

…this gap between raw recordings and content you can actually ship.

Speaker B

42% airtime

Turns

10s

Talk time

One pipeline, one place. That's what we're betting on.
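
The airtime percentages in this panel follow from the talk times shown: 14s and 10s out of 24s total. A minimal Python sketch of that arithmetic (the function name is hypothetical, and this is not necessarily how the dashboard computes it):

```python
def airtime_share(talk_seconds: dict[str, float]) -> dict[str, int]:
    """Convert per-speaker talk time into whole-number airtime percentages."""
    total = sum(talk_seconds.values())
    if total <= 0:
        raise ValueError("total talk time must be positive")
    # Each share is rounded independently, so in general the
    # percentages may not add up to exactly 100.
    return {
        speaker: round(100 * seconds / total)
        for speaker, seconds in talk_seconds.items()
    }


print(airtime_share({"Speaker A": 14, "Speaker B": 10}))
# {'Speaker A': 58, 'Speaker B': 42}
```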

Export formats

Every plan, every format · 7 outputs · no watermark · TXT · SRT · MD · JSON · VTT · DOCX · PDF

TXT

Plain text

Plain-text export · all plans

SRT

SubRip subtitle

Timestamped subtitle · all plans

Markdown

Speaker headers + summary · all plans

JSON

Structured JSON

Public schema · built for API workflows · all plans

VTT

WebVTT subtitle

HTML5 video player format · all plans

DOCX

Word document

Speaker headers + timestamps · all plans

PDF

Branded PDF

Print-ready · includes summary and speakers · all plans
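
For readers unfamiliar with the SRT format listed above, each cue is a running index, a `start --> end` line with comma-separated milliseconds, and the caption text, with cues separated by a blank line. The sketch below builds such a file in Python. It illustrates the SubRip format itself, not this product's exporter: the helper names are hypothetical, and the cue end times are assumptions, since the sample transcript only shows start times.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Render (start_seconds, end_seconds, text) cues as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"


# End times are assumed: each cue ends where the next one starts.
cues = [
    (12.0, 27.0, "So what I keep hearing from founders is this gap..."),
    (27.0, 41.0, "Exactly. Nobody wants another transcript."),
]
print(to_srt(cues))
```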

Sample output · 30 seconds of a podcast clip

One file. Get 8 things back

Hover or tap any output to see what it actually looks like. Same 30-second podcast clip in the center, eight artifacts derived from it.

Transcript

Punctuated · timestamped

00:12 Speaker A
So what I keep hearing from founders is this gap…

AI summary

TL;DR · key points

Founders need post-call content, not just transcripts. Tools force them to stitch 5 apps together.

Speakers

Speaker diarization · Pro+

Two-person calls use stereo channel separation. Everything else uses mono speaker diarization.

100+ languages

Auto-detect

Research-grade ASR. Force a specific language if auto-detect picks the wrong one.

interview-ari-2026-04-26.mp3

30-second clip · 2 speakers

100+ languages · auto-detect · 95%+ accuracy

Transcript · 30-second window