Speech to Text
Convert audio files into accurate text in seconds.
Use Whisper and QTstt for instant, accurate transcription of audio and video files in Persian and dozens of other languages.
A transcription sample
🎤 meeting.mp3 (1 hour 12 minutes)
📄 Full transcription ready: TXT, SRT, and JSON with timestamps
5-line summary: ...
Use cases
For whatever you have in mind.
Meeting transcription
Accurate transcripts of work meetings, interviews, and talks, with speaker detection.
Video subtitles
Create SRT and VTT subtitle files with accurate timestamps for educational and media videos.
Voice messages
Convert voice notes from messaging apps or your CRM into searchable text.
Available models
Choose from the world's best models.
Start with the free plan.
With the professional plan, access all models without limits.
FAQ
Answers at a glance.
Which formats are supported?
MP3, MP4, WAV, M4A, OGG, WebM, and many more, with files up to 2 GB on the professional plan.
How accurate is Persian transcription?
Above 95% for medium-quality files, and up to 99% for studio-quality recordings.
Are timestamps provided?
Yes. Output is available as plain text, JSON, or SRT, with word-level timing.
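If you want to build your own subtitles from per-word timing, the following is a minimal Python sketch of how that could look. It is an illustration under stated assumptions, not the service's actual schema: the field names `word`, `start`, and `end` (in seconds) and the helper names `srt_timestamp` and `words_to_srt` are hypothetical. Only the SRT cue layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) follows the usual SubRip convention.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group timed words into SRT cues of at most `max_words` words each.

    `words` is assumed to be a list of dicts with hypothetical keys
    "word", "start", and "end" (times in seconds).
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])   # first word opens the cue
        end = srt_timestamp(chunk[-1]["end"])      # last word closes the cue
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [
        {"word": "hello", "start": 0.0, "end": 0.4},
        {"word": "world", "start": 0.5, "end": 0.9},
        {"word": "again", "start": 1.0, "end": 1.4},
    ]
    print(words_to_srt(sample, max_words=2))
```

Grouping by a fixed word count keeps the sketch short; a real subtitle pipeline would more likely split cues on pauses, punctuation, or a maximum on-screen duration.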
What's the maximum file length?
Up to 4 hours on the professional plan and unlimited on the enterprise plan.