Speech to Text

Convert audio files into accurate text in seconds.

Use Whisper and QTstt for instant, accurate transcription of audio and video files in Persian and dozens of other languages.

A transcription sample
🎤 meeting.mp3 (1 hour 12 minutes)
📄 Full transcription ready: TXT, SRT, and JSON with timestamps
5-line summary: ...

Use cases

For whatever you have in mind.

Meeting transcription

Accurate transcripts of work meetings, interviews, and talks, with speaker detection.

Video subtitles

Create SRT and VTT subtitle files with accurate timestamps for educational and media videos.

Voice messages

Convert voice notes in messengers or your CRM into searchable text.
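
For the subtitle use case above, it can help to see how timestamped segments map onto the SRT format. The sketch below is illustrative only: it assumes each segment is a dictionary with `start` and `end` (in seconds) and `text`, which is a hypothetical shape and not the documented JSON schema of this service. The function names are also made up for the example.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments.

    Assumption: each segment looks like
    {"start": float, "end": float, "text": str}.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_timestamp(seg["start"])
        end = format_srt_timestamp(seg["end"])
        # An SRT cue is: sequence number, time range, text, blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello "},
        {"start": 2.5, "end": 4.0, "text": "World"},
    ]
    print(segments_to_srt(demo))
```

VTT uses a very similar cue layout, with a `WEBVTT` header line and a period instead of a comma before the milliseconds.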

Available models

Choose from the world's best models.

Whisper: OpenAI, 99 languages
QTstt: Optimized for Persian

Start with the free plan.

With the professional plan, access all models without limits.

FAQ

Answers at a glance.

Which formats are supported?
MP3, MP4, WAV, M4A, OGG, WebM and many more — up to 2 GB on the professional plan.
How accurate is Persian transcription?
Above 95% for medium-quality audio, and up to 99% for studio-quality recordings.
Are timestamps provided?
Yes. Output is available as plain text, JSON, or SRT, with word-by-word timing.
What's the maximum file length?
Up to 4 hours on the professional plan and unlimited on the enterprise plan.
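
As a rough illustration of the professional-plan limits in the answers above, a file could be checked locally before uploading. This is a minimal sketch, not part of the service: the function name is hypothetical, the extension set covers only the formats named above (the service accepts more), and reading "2 GB" as 2 × 1024³ bytes is an assumption.

```python
from pathlib import Path

# Only the formats named in the FAQ; the service states it supports more.
LISTED_EXTENSIONS = {".mp3", ".mp4", ".wav", ".m4a", ".ogg", ".webm"}

# Professional-plan limits from the FAQ.
PRO_MAX_BYTES = 2 * 1024**3      # assumption: "2 GB" read as binary gigabytes
PRO_MAX_SECONDS = 4 * 60 * 60    # 4 hours


def check_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file fits the limits."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in LISTED_EXTENSIONS:
        problems.append(f"{extension or '(no extension)'} is not among the listed formats")
    if size_bytes > PRO_MAX_BYTES:
        problems.append("file is larger than 2 GB")
    if duration_seconds > PRO_MAX_SECONDS:
        problems.append("audio is longer than 4 hours")
    return problems


if __name__ == "__main__":
    # The sample from this page: meeting.mp3, 1 hour 12 minutes (size is made up).
    print(check_upload("meeting.mp3", 100_000_000, 72 * 60))
```

A format missing from the listed set is not necessarily unsupported, so that check is best treated as a warning and not as a hard rejection.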