Interview transcription: turn interview audio into text with speaker names and timestamps

Interview transcription. Any kind of recording, same result.

Voice memo from a phone, Zoom call, lavalier rig, or field recorder: drop in the interview audio and get text with speaker names and timestamps that you can quote from.

Drop a file, or pick one

MP3 · WAV · M4A · MP4 · MOV · MKV · OGG · OPUS · FLAC · WEBM — up to 100 MB anonymously

Paste a link, we’ll fetch the audio

YouTube · TikTok · Vimeo · Twitter · SoundCloud · Spotify · 50+ more

Record straight from your browser

No card required · ~90s per 60-min file · SRT · VTT · DOCX · TXT · Files auto-deleted in 24h

Two voices in, two voices out, labeled

Most interviews have two people on one device: a phone on the table, or a recorder placed in the middle. We split the interview audio into interviewer and source, even from a single channel, with a timestamp on every turn for citation.

Field recorder · WAV · REC · 2 speakers · 38:42

auto-detected en-US · 48 kHz mono · 1411 kbps

~90s

Transcript · streaming · 94% accuracy

Can you walk me through what you saw the morning of the eighteenth?

I got there around six. The loading bay door was already open, which it shouldn't have been.

And you'd reported the door issue before — to whom?

To Diane Okafor in facilities, twice in March. I have the emails.

94% on field WAV · DOCX · TXT · SRT · JSON

This is what loads when the job finishes.

Same layout as the real dashboard — Summary, full Transcript, Speakers tab, Exports. Key points and action items extracted automatically. Auto-tags on every job.

app.transcription.solutions / interview-202.mp3 · Export

Summary 5 · Transcript 1,420 · Speakers 2 · Exports

interview-202.mp3 · 47:08 · 128 kbps CBR · 2 speakers · en-US auto-detected

Founders need post-call content, not just transcripts. Tools force them to stitch 5 apps together.

Sample preview from a founder interview about post-call workflow. Real transcripts look exactly like this — same tabs, same summary block, same key-points / action-items split, same auto-tag chips.

Key points

Gap exists between raw recordings and shippable content — tools stop at transcript.

Show notes, social clips, blog drafts all expected by call's end, not next-day.

Current tooling fragmented across 5 apps — no single pipeline.

Conversion-rate signal flipped a buyer-segment assumption at week 3.

40% of original hypothesis survived — the shape held, mechanics rebuilt.

Action items

Speaker 1: Investigate single-pipeline approach to replace 5-app stitch.

Speaker 2: Mock how show-notes draft could flow from the transcript.

Speaker 2: Pull conversion-rate by segment, Monday EOD.

Speaker 1: Map the 5-app stitch & list which steps actually need a human.

Auto-tagged: founder interview · post-call content · tooling fragmentation · single pipeline

Try it on your own file — it's free

Human Rev, Otter or Trint, or us

Rev sends your audio to human typists: slow and expensive, but high quality on difficult audio. Otter and Trint are AI-first like us, tuned for journalists and researchers. Here is what each option suits.

Option 01

Rev human transcription

Humans type up your interview. Best on difficult audio, but you have to wait and you have to pay.

Turnaround: 12–24 hours typical

Accuracy on clean audio: 99% (claimed)

Speaker labels: Manual, included

Languages: EN human · 30+ AI

Cost per min: $1.50 human · $0.25 AI

Privacy: Audio sent to contractors

Best for: Interviews headed for court or publication, where the stakes are high, the audio is difficult enough to need a human ear, and you have a day to wait.

Option 02

Transcription.Solutions

AI transcript with speakers separated, ready in minutes. The same engine for a voice memo, a Zoom call, or a field recorder.

Turnaround: ~3 min per hour of audio

Accuracy on clean audio: 94–96%

Speaker labels: Auto · rename in editor

Languages: 99, auto-detected

Cost per min: $0.03

Privacy: Audio deleted in 24h · no training

Best for: Journalists, researchers, and creators who run several interviews a week and need quotable text quickly without uploading to contractors.

Option 03

Otter / Trint

AI transcription with a research-focused editor. English-strong, locked into monthly plans.

Turnaround: Real-time to ~5 min

Accuracy on clean audio: ~90–93%

Speaker labels: Yes · EN-tuned

Languages: Otter EN-only · Trint 30+

Cost: $17–80/user/mo (subscription)

Privacy: Stored in account by default

Best for: Teams that want an archive of every recorded interview and don't mind paying a monthly seat fee per user.

Pricing and feature flags accurate as of 2026. Human Rev turnaround varies by queue depth and audio length.
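The per-minute and per-seat prices in the comparison above can be turned into a rough monthly estimate. This is a minimal sketch using only the rates listed there; the function names and the workload of ten 45-minute interviews are illustrative assumptions, not figures from any vendor.

```python
# Rates copied from the comparison above (USD per audio minute).
PER_MINUTE_RATES = {
    "Rev human": 1.50,
    "Rev AI": 0.25,
    "Transcription.Solutions": 0.03,
}
# Otter / Trint are billed per user per month, not per minute.
SEAT_FEE_RANGE = (17.0, 80.0)


def per_minute_cost(rate: float, minutes: float) -> float:
    """Cost of transcribing `minutes` of audio at a flat per-minute rate."""
    return round(rate * minutes, 2)


def monthly_costs(minutes: float) -> dict[str, float]:
    """Per-minute options priced for one month's worth of audio."""
    return {
        name: per_minute_cost(rate, minutes)
        for name, rate in PER_MINUTE_RATES.items()
    }


if __name__ == "__main__":
    minutes = 10 * 45  # hypothetical workload: ten 45-minute interviews
    for name, cost in monthly_costs(minutes).items():
        print(f"{name}: ${cost:,.2f}")
    low, high = SEAT_FEE_RANGE
    print(f"Otter / Trint: ${low:.0f} to ${high:.0f} per seat, independent of minutes")
```

For that assumed workload of 450 minutes, the per-minute options come to $675.00, $112.50, and $13.50 respectively, while the subscription cost depends only on the number of seats.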

96% on a good lav, readable on a cafe recording

Interview accuracy is limited by what the mic actually captured. A close mic on each speaker, recorded in stereo, is the ceiling; a phone on a noisy table is the floor. The numbers below come from real interview files, not synthetic benchmarks.

8 things people ask about interview transcription

01 Can I use these transcripts in a published article without checking them against the audio?

For direct quotes, no. Always check against the audio. An AI transcript at 94% accuracy still misreads one word in 17 on average, and a wrong word in a quote means a published correction. The transcript is for navigation and drafting; the audio is the source of truth.
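The "one word in 17" figure follows directly from the accuracy number: a 94% word accuracy leaves a 6% error rate, and 1 / 0.06 is about 16.7. A minimal sketch of that arithmetic, assuming errors are spread evenly across the transcript (real errors tend to cluster around names, jargon, and crosstalk); the function names are illustrative.

```python
def words_per_error(accuracy: float) -> float:
    """Average number of words per misread word at a given word accuracy (0 to 1)."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be at least 0 and below 1")
    return 1.0 / (1.0 - accuracy)


def expected_errors(word_count: int, accuracy: float) -> float:
    """Expected number of misread words in a transcript of `word_count` words."""
    return word_count * (1.0 - accuracy)


if __name__ == "__main__":
    print(round(words_per_error(0.94)))        # 17
    print(round(expected_errors(1000, 0.94)))  # 60
```

At 96% the same calculation gives one misread word in 25, which is still too many to publish a direct quote unchecked.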

02 My recorder captures stereo WAV with one mic per speaker. What should I do?

Upload that file directly; do not convert it to mono first. We detect the two channels and send each one to its own diarization track, which is the highest-accuracy path we have. Expect 96%+ in a quiet room.

03 What about interviews recorded over a phone line?

Phone audio is 8 kHz narrow-band, which caps accuracy at around 88% even on a clean line. We still split the two parties using channel separation if your recorder app captures them separately (most do). VoIP calls over WhatsApp or Signal sound slightly better than PSTN.
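Before uploading, it can help to know which of these cases a recording falls into. The sketch below reads the header of a PCM WAV file with Python's standard `wave` module and reports the channel count and sample rate. It is a local check only, not part of the service; the function name, the returned keys, and the 8 kHz threshold for flagging narrow-band audio are assumptions made for illustration.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read channel count, sample rate, and duration from a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        seconds = wav.getnframes() / rate
    return {
        "channels": channels,
        "sample_rate_hz": rate,
        "duration_s": round(seconds, 2),
        # Two channels may mean one mic per speaker; keep the file as stereo.
        "stereo": channels == 2,
        # 8 kHz or lower is typical of recordings made over a phone line.
        "narrow_band": rate <= 8000,
    }


if __name__ == "__main__":
    import sys

    for wav_path in sys.argv[1:]:
        print(wav_path, describe_wav(wav_path))
```

A file that reports `stereo: True` is a candidate for the per-channel path described above, and one that reports `narrow_band: True` should be expected to land at the lower end of the accuracy range.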

04 Can I redact off-the-record sections before sharing a transcript?

Yes. In the editor, select a timestamp range and mark it `[REDACTED]`. The export replaces the text with a redaction marker but keeps the timestamps, so the document still tracks the audio.
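The behaviour described here, where text is replaced but timing is kept, can be sketched on a simple list of transcript segments. The `Segment` class and `redact` function below are a hypothetical data model for illustration only; they are not the editor's actual export format or API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    speaker: str
    text: str


def redact(segments, start, end, marker="[REDACTED]"):
    """Replace the text of every segment overlapping [start, end).

    Timestamps and speaker labels are left untouched, so the redacted
    transcript still lines up with the audio.
    """
    return [
        replace(seg, text=marker) if seg.start < end and seg.end > start else seg
        for seg in segments
    ]


if __name__ == "__main__":
    transcript = [
        Segment(0.0, 5.0, "Speaker 1", "Can we go off the record for a moment?"),
        Segment(5.0, 12.0, "Speaker 2", "Sure. Between us, the door was never fixed."),
        Segment(12.0, 20.0, "Speaker 1", "Back on the record now."),
    ]
    for seg in redact(transcript, 5.0, 12.0):
        print(f"[{seg.start:06.2f} to {seg.end:06.2f}] {seg.speaker}: {seg.text}")
```

Redacting whole segments rather than individual words is a deliberately conservative choice in this sketch: any segment that overlaps the selected range is blanked in full.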

05 Do you train models on my interview recordings?

No. Source audio is deleted from our infrastructure within 24 hours of completion, and we do not use customer recordings for model training under any plan. Transcript text stays in your account until you delete it.

06 Three or four people in a panel-style interview: does diarization still work?

Up to about six distinct voices, yes. But speaker-assignment accuracy drops with each added person, and drops further when two speakers sound alike. Plan a 2–3 minute rename pass on the speaker labels after the transcript lands.

07 Can you transcribe interviews in languages other than English?

99 languages, auto-detected. Code-switching (an English-speaking source slipping into Spanish mid-sentence) is handled for 12 language pairs. Accuracy varies by language: European languages match English; low-resource African and Central Asian languages run 5–10 points lower.

08 I recorded on a Zoom call. Should I use your Zoom page instead?

Same engine, same result. The Zoom page covers cloud-recording specifics (per-participant audio, dial-in degradation). If you run one-on-one interviews over Zoom, either route works: drop in the MP4 and the speaker labels come out the same.