Quick Start

Choose your preferred programming language to get started with the API:

Python

import requests

def generate_speech(text, voice="alloy", format="mp3", instructions=None):
    url = "https://ttsapi.site/v1/audio/speech"
    headers = {
        "Content-Type": "application/json"
    }
    data = {
        "input": text,
        "voice": voice,
        "response_format": format
    }
    
    # Add instructions if provided
    if instructions:
        data["instructions"] = instructions
    
    # Set a timeout so a stalled or queued request cannot hang indefinitely
    response = requests.post(url, json=data, headers=headers, timeout=30)
    
    if response.status_code == 200:
        # Get the appropriate file extension based on format
        ext = format.lower()
        filename = f"output.{ext}"
        
        # Save the audio file
        with open(filename, "wb") as f:
            f.write(response.content)
        print(f"Audio saved as {filename}")
        return filename
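    else:
        # Error bodies are expected to be JSON; fall back to raw text otherwise
        try:
            error = response.json()
        except ValueError:
            error = response.text
        print(f"Error: {response.status_code}, {error}")
        return None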

# Example usage
text = "Hello, this is a test."
voice = "alloy"
format = "mp3"  # Supported formats: mp3, opus, aac, flac, wav, pcm
instructions = "Speak in a cheerful and upbeat tone."

# Generate speech with default format (MP3)
generate_speech(text, voice, instructions=instructions)

# Generate speech in WAV format
generate_speech(text, voice, format="wav", instructions=instructions)

JavaScript

async function generateSpeech(text, voice = 'alloy', format = 'mp3', instructions = null) {
    const response = await fetch('https://ttsapi.site/v1/audio/speech', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            input: text,
            voice: voice,
            response_format: format,
            ...(instructions && { instructions })
        })
    });

    if (response.ok) {
        const blob = await response.blob();
        
        // Create one object URL and reuse it for playback and download
        const url = URL.createObjectURL(blob);
        const audio = new Audio(url);
        // Release the object URL once playback finishes
        audio.addEventListener('ended', () => URL.revokeObjectURL(url));
        audio.play();
        
        // Optional: Download the file
        const a = document.createElement('a');
        a.href = url;
        a.download = `output.${format}`;
        document.body.appendChild(a);
        a.click();
        document.body.removeChild(a);
        
        return blob;
    } else {
        const error = await response.json();
        console.error('Error:', error);
        throw error;
    }
}

// Example usage
const text = 'Hello, this is a test.';
const voice = 'alloy';
const format = 'mp3';  // Supported formats: mp3, opus, aac, flac, wav, pcm
const instructions = 'Speak in a cheerful and upbeat tone.';

// Generate speech with default format (MP3)
generateSpeech(text, voice, undefined, instructions);

// Generate speech in WAV format
generateSpeech(text, voice, 'wav', instructions);

Available Voices

alloy ash ballad coral echo fable onyx nova sage shimmer verse

API Reference

Generate Speech (OpenAI Compatible)

POST /v1/audio/speech

Request Parameters

Parameter        Type    Required  Description
input            string  Yes       The text to convert to speech
voice            string  Yes       The voice to use (see Available Voices)
instructions     string  No        Mapped to "prompt" parameter when sent to the backend service. Can be used to guide voice emotion or style.
response_format  string  No        The format of the audio response. Supported formats: mp3, opus, aac, flac, wav, pcm. Defaults to mp3.
model            string  No        OpenAI compatibility only - completely ignored.
speed            number  No        OpenAI compatibility only - completely ignored.

Note: The model and speed parameters are accepted only for OpenAI compatibility and are completely ignored by the service, so setting them has no effect. Only input, voice, response_format, and instructions affect the actual TTS output.
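
As a rough illustration of the parameter rules above, the following sketch builds a request body that contains only the fields the service acts on. The helper name build_speech_payload and its client-side validation are assumptions made for this example; they are not part of the API.

```python
# Formats listed in the parameter table above
SUPPORTED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_payload(text, voice, response_format="mp3",
                         instructions=None, **ignored):
    """Build a JSON body for POST /v1/audio/speech (hypothetical helper).

    Extra keyword arguments such as model or speed are accepted and
    dropped, mirroring how the service is documented to ignore them.
    """
    if not text or not voice:
        # The service is documented to answer 400 in this case
        raise ValueError("input and voice are required")
    if response_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")

    payload = {
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    # instructions is optional; send it only when provided
    if instructions:
        payload["instructions"] = instructions
    return payload
```

The result can be passed as the json argument of requests.post, as in the Quick Start example.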

How the Instructions Parameter Works

The instructions parameter is mapped to a prompt parameter when sent to the backend service. It can be used to guide the voice emotion, tone, or style. Some examples of effective instructions:

  • Emotional guidance: "Speak in a happy and excited tone."
  • Character impersonation: "Speak like a wise old wizard."
  • Contextual hints: "This is being read to a child, speak gently."
  • Reading style: "Read this as a news broadcast."

Tip: Keep instructions clear and concise. Overly complex instructions may not be interpreted correctly.

Response Format

The API returns audio in the requested format with the following headers:

  • Content-Type: Based on the requested format (e.g., "audio/mpeg" for MP3)
  • Access-Control-Allow-Origin: "*" (CORS enabled)
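
Because a successful response carries an audio Content-Type, a client can check that header before writing bytes to disk. The helper below is a minimal sketch; the name looks_like_audio is hypothetical, and the only media type confirmed by this document is "audio/mpeg" for MP3.

```python
def looks_like_audio(content_type):
    """Return True when a Content-Type header value names an audio media type."""
    if not content_type:
        return False
    # Drop any parameters such as "; charset=utf-8" and normalize case
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("audio/")
```

With the requests library, the header value would typically come from response.headers.get("Content-Type").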

Error Responses

Status Code  Description
400          Missing required parameters (input or voice)
429          Rate limit exceeded or queue is full. Includes Retry-After header when rate limited.
500          Internal server error
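
One way a client might react to a 429 response is to wait for the number of seconds given in Retry-After and fall back to exponential backoff when the header is absent (for example, when the queue is full). The function below is a sketch under those assumptions; retry_delay, base, and cap are illustrative names and values, not part of the API, and whether to retry other status codes is left to the caller.

```python
def retry_delay(status_code, headers, attempt, base=1.0, cap=30.0):
    """Return seconds to wait before retrying, or None if no retry is suggested.

    headers is any mapping of response headers. A plain dict is
    case-sensitive, while requests' response.headers is not.
    attempt is the zero-based count of retries made so far.
    """
    if status_code != 429:
        return None

    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            # Only the delay-in-seconds form is handled here
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # e.g. an HTTP-date value; fall through to backoff

    # No usable header: exponential backoff, capped (assumed policy)
    return min(cap, base * (2 ** attempt))
```

A caller would sleep for the returned number of seconds and then resend the same request, giving up after a fixed number of attempts.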

Queue System

The API uses a queue system to handle multiple requests efficiently:

  • Maximum queue size: Configurable via MAX_QUEUE_SIZE environment variable (default: 100 requests)
  • Requests are processed in FIFO (First In, First Out) order
  • Rate limiting: Configurable via RATE_LIMIT_REQUESTS and RATE_LIMIT_WINDOW environment variables (default: 30 requests per 60 seconds per IP address)
  • Queue status can be monitored via the /api/queue-size endpoint
  • Queue status updates every 2 seconds in the web interface
  • Visual indicators show queue load (Low/Medium/High) based on utilization
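
To stay under the default limit of 30 requests per 60 seconds, a client can throttle itself before sending. The class below is a minimal client-side sketch using a sliding window. How the server counts requests internally (fixed or sliding window) is not specified here, so SlidingWindowLimiter is an assumption for illustration only.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds span (client side)."""

    def __init__(self, max_requests=30, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of allowed requests, oldest first

    def try_acquire(self, now):
        """Record a request at time `now` and return True, or return False if over the limit."""
        # Forget requests that have left the window
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False
```

In practice `now` would come from time.monotonic(), and a False result means the caller should wait before sending the next request.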

Queue Status Endpoint

GET /api/queue-size

Returns JSON with queue information:

{
    "queue_size": 0,        // Current number of requests in queue
    "max_queue_size": 100   // Maximum queue capacity
}

Response Status Codes

  • 200 - Success
  • 429 - Queue is full or rate limit exceeded
  • 500 - Server error
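
The JSON returned by /api/queue-size can be turned into the Low/Medium/High indicator mentioned in the Queue System section. The thresholds the web interface uses are not documented here, so the values in this sketch (50% and 80% utilization) and the function name queue_load are assumptions.

```python
def queue_load(status, medium_at=0.5, high_at=0.8):
    """Classify queue utilization as "Low", "Medium", or "High".

    status is the parsed JSON body from GET /api/queue-size.
    medium_at and high_at are assumed thresholds, not documented values.
    """
    size = status["queue_size"]
    capacity = status["max_queue_size"]
    # Treat a zero-capacity queue as fully utilized
    utilization = size / capacity if capacity > 0 else 1.0

    if utilization >= high_at:
        return "High"
    if utilization >= medium_at:
        return "Medium"
    return "Low"
```

The status dictionary could be obtained with requests.get(...).json() against the /api/queue-size endpoint, assuming it is served from the same host as /v1/audio/speech.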