Speech to Text API
Accurate Transcription at Low Cost

Transcribe audio files to text with high accuracy. 28+ languages, timestamps, auto-detection. Starting at $2 for 100K credits. No subscription.

28+ languages
High accuracy
Automatic language detection
Word-level timestamps

Code Examples

Upload audio, receive transcribed text in JSON.

Python

import requests

with open("audio.mp3", "rb") as f:
    response = requests.post(
        "https://cheapaiapi.com/api/v1/speech-to-text",
        headers={"Authorization": "Bearer sk_your_api_key"},
        files={"audio": f},
        data={
            "language": "en",       # Optional — auto-detect if omitted
            "model": "whisper-1",
            "response_format": "json"
        }
    )

response.raise_for_status()  # Stop here if the API returned an HTTP error
result = response.json()
print(result["text"])
# Example output: "Hello, this is the transcribed content of the audio file."

JavaScript / Node.js

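// audioFile is a File or Blob, e.g. from an <input type="file"> element
const formData = new FormData();
formData.append("audio", audioFile);
formData.append("language", "en"); // Optional; auto-detected if omitted
formData.append("response_format", "verbose_json"); // Needed for timestamps

const response = await fetch(
  "https://cheapaiapi.com/api/v1/speech-to-text",
  {
    method: "POST",
    headers: { "Authorization": "Bearer sk_your_api_key" },
    body: formData,
  }
);
if (!response.ok) {
  throw new Error(`Transcription failed: HTTP ${response.status}`);
}

const { text, segments } = await response.json();
console.log(text); // Full transcription
// segments: timestamped word- or sentence-level data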

Features

28+ Languages

Transcribe audio in English, Spanish, French, Chinese, Arabic, Hindi, and 22+ more languages.

Word Timestamps

Get word-level or sentence-level timestamps by setting response_format to verbose_json.

Auto Language Detection

Omit the language parameter and the API will detect it automatically.

Multiple Formats

Receive output as plain text, JSON, SRT subtitles, or VTT captions.

Noise Tolerance

Works with real-world audio including mild background noise and accented speech.

Low Cost

Much cheaper than comparable transcription APIs with no accuracy trade-off.
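
As a rough sketch of how the Multiple Formats feature might be used, the helper below picks a file name for the saved transcript based on the requested output format. Only "json" and "verbose_json" appear on this page as response_format values; "text", "srt", and "vtt" are assumptions modeled on them, and the helper name is hypothetical. Check the API docs for the exact values.

```python
from pathlib import Path

# Assumed mapping from response_format value to output file extension.
# Only "json" and "verbose_json" are confirmed on this page; the rest are guesses.
FORMAT_EXTENSIONS = {
    "text": ".txt",
    "json": ".json",
    "verbose_json": ".json",
    "srt": ".srt",
    "vtt": ".vtt",
}


def output_path(audio_path: str, response_format: str) -> Path:
    """Return where to save the transcript for a given audio file and format."""
    if response_format not in FORMAT_EXTENSIONS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    # Swap the audio extension for the one matching the transcript format.
    return Path(audio_path).with_suffix(FORMAT_EXTENSIONS[response_format])
```

For the subtitle formats, the response body would presumably be written to that path as plain text rather than parsed as JSON.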

Supported Languages

28+ languages with automatic detection.

English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese, Arabic, Hindi, Dutch, Polish, Turkish, Swedish, Danish, Finnish, Romanian, Czech, Hungarian, Ukrainian, Greek, Hebrew, Thai, Vietnamese, Indonesian, Malay

Use Cases

Subtitles & Captions

Auto-generate SRT or VTT captions for videos with accurate timestamps.

Meeting Transcription

Transcribe calls and meetings to searchable text records automatically.

Podcast Transcripts

Convert podcast episodes into blog posts and SEO-friendly transcripts.

Voice Commands

Build voice-controlled interfaces with real-time audio transcription.

Content Repurposing

Transform audio interviews into written articles and social media copy.

Accessibility

Make audio and video content accessible for hearing-impaired audiences.

Frequently Asked Questions

How accurate is the transcription?

We use Whisper-based models that achieve near human-level accuracy on clear audio. Accuracy may vary with heavy accents, background noise, or low-quality recordings.

What audio formats are supported?

MP3, MP4, M4A, WAV, FLAC, OGG, and WebM are all supported. Maximum file size is 25 MB per request.
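
Checking a file locally before uploading can save a failed request. The sketch below validates the extension and size against the limits stated above. The function name is hypothetical, and treating 25 MB as 25 * 1024 * 1024 bytes is an assumption; the API may count the limit differently.

```python
import os

# Extensions taken from the supported-format list above.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".flac", ".ogg", ".webm"}
# Assumption: "25 MB" means 25 * 1024 * 1024 bytes.
MAX_BYTES = 25 * 1024 * 1024


def check_upload(path: str) -> None:
    """Raise ValueError if the file is likely to be rejected by the API."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        raise ValueError(f"file is {size} bytes; limit is {MAX_BYTES}")
```

Calling check_upload("audio.mp3") before the POST request returns silently for an acceptable file and raises otherwise.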

Does the API support timestamps?

Yes. Set response_format to "verbose_json" to receive word-level or segment-level timestamps alongside the transcription.
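
To illustrate working with timestamped output, here is a minimal sketch that turns a parsed verbose_json response into readable lines. The top-level segments key matches the JavaScript example on this page, but the per-segment fields start, end, and text are assumptions based on common Whisper-style output, and both function names are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, ms = divmod(total_ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def segment_lines(result: dict) -> list[str]:
    """Turn a verbose_json-style result into '[start - end] text' lines."""
    lines = []
    # Assumed shape: {"segments": [{"start": 0.0, "end": 2.5, "text": "..."}]}
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return lines
```

Passing the dictionary returned by response.json() to segment_lines yields one line per segment, or an empty list if the response carries no segments.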

Can the API auto-detect the spoken language?

Yes. Omit the language parameter and the model will automatically identify the language from the audio.
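
One way to make auto-detection the default in client code is to leave the language field out unless the caller supplies one. The helper below is a hypothetical sketch; its field names mirror the Python example on this page, and whether model is required is not stated here.

```python
from typing import Optional


def build_form_fields(language: Optional[str] = None,
                      response_format: str = "json") -> dict:
    """Build the form fields for a transcription request.

    The language field is left out entirely when no language is given,
    which, per the answer above, lets the API detect it automatically.
    """
    fields = {"model": "whisper-1", "response_format": response_format}
    if language:
        fields["language"] = language
    return fields
```

The returned dictionary can be passed as the data argument of requests.post in place of the inline dictionary.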

Is there speaker diarization (who said what)?

Basic speaker identification is available via additional parameters. Check the API docs for the latest diarization capabilities.

How much does transcription cost?

Credits are charged per minute of audio. A 1-minute audio file typically costs a few hundred credits, which makes it much cheaper than alternatives.
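
For budgeting, the arithmetic can be sketched as follows. The pack price and size ($2 for 100K credits) come from this page; the per-minute rate is not stated exactly, so any value passed in (such as 300 credits per minute) is a hypothetical stand-in for "a few hundred". The function names are illustrative.

```python
def minutes_per_pack(pack_credits: int, credits_per_minute: int) -> int:
    """Whole minutes of audio a credit pack covers at a given rate."""
    if credits_per_minute <= 0:
        raise ValueError("credits_per_minute must be positive")
    return pack_credits // credits_per_minute


def cost_per_minute_usd(pack_price_usd: float, pack_credits: int,
                        credits_per_minute: int) -> float:
    """Dollar cost of one minute of audio at a given rate."""
    return pack_price_usd * credits_per_minute / pack_credits
```

At a hypothetical 300 credits per minute, minutes_per_pack(100_000, 300) gives 333 minutes, and cost_per_minute_usd(2.0, 100_000, 300) comes to about $0.006 per minute.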

Start Transcribing Today

Accurate audio-to-text transcription at a fraction of the cost. Pay only for what you use.
