Quick Start
Choose your preferred programming language to get started with the API:
Python:

import requests


def generate_speech(text, voice="alloy", response_format="mp3", instructions=None):
    url = "https://ttsapi.site/v1/audio/speech"
    headers = {
        "Content-Type": "application/json"
    }
    data = {
        "input": text,
        "voice": voice,
        "response_format": response_format
    }
    # Add instructions if provided
    if instructions:
        data["instructions"] = instructions

    # A timeout keeps the call from hanging if the server never answers
    response = requests.post(url, json=data, headers=headers, timeout=60)

    if response.status_code == 200:
        # Use the requested format as the file extension
        ext = response_format.lower()
        filename = f"output.{ext}"
        # Save the audio file
        with open(filename, "wb") as f:
            f.write(response.content)
        print(f"Audio saved as {filename}")
        return filename
    else:
        # The error body is not guaranteed to be JSON, so print it as text
        print(f"Error: {response.status_code}, {response.text}")
        return None


# Example usage
text = "Hello, this is a test."
voice = "alloy"
instructions = "Speak in a cheerful and upbeat tone."

# Generate speech with default format (MP3)
generate_speech(text, voice, instructions=instructions)

# Generate speech in WAV format
# Supported formats: mp3, opus, aac, flac, wav, pcm
generate_speech(text, voice, response_format="wav", instructions=instructions)
JavaScript (browser):

async function generateSpeech(text, voice = 'alloy', format = 'mp3', instructions = null) {
  const response = await fetch('https://ttsapi.site/v1/audio/speech', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      input: text,
      voice: voice,
      response_format: format,
      ...(instructions && { instructions })
    })
  });

  if (response.ok) {
    const blob = await response.blob();

    // Create audio element for playback
    const audio = new Audio(URL.createObjectURL(blob));
    // play() returns a promise that rejects if the browser blocks autoplay
    audio.play().catch((err) => console.warn('Playback blocked:', err));

    // Optional: Download the file
    const url = window.URL.createObjectURL(blob);
    const a = document.createElement('a');
    a.href = url;
    a.download = `output.${format}`;
    document.body.appendChild(a);
    a.click();
    window.URL.revokeObjectURL(url);
    document.body.removeChild(a);

    return blob;
  } else {
    // The error body is not guaranteed to be JSON, so read it as text
    const error = await response.text();
    console.error('Error:', response.status, error);
    throw new Error(`Request failed with status ${response.status}: ${error}`);
  }
}

// Example usage
const text = 'Hello, this is a test.';
const voice = 'alloy';
const instructions = 'Speak in a cheerful and upbeat tone.';

// Generate speech with default format (MP3)
generateSpeech(text, voice, undefined, instructions);

// Generate speech in WAV format
// Supported formats: mp3, opus, aac, flac, wav, pcm
generateSpeech(text, voice, 'wav', instructions);
Available Voices
API Reference
Generate Speech (OpenAI Compatible)
POST /v1/audio/speech
Request Parameters
Parameter | Type | Required | Description |
---|---|---|---|
input | string | Yes | The text to convert to speech |
voice | string | Yes | The voice to use (see Available Voices) |
instructions | string | No | Mapped to "prompt" parameter when sent to the backend service. Can be used to guide voice emotion or style. |
response_format | string | No | The format of the audio response. Supported formats: mp3, opus, aac, flac, wav, pcm. Defaults to mp3. |
model | string | No | OpenAI compatibility only - completely ignored. |
speed | number | No | OpenAI compatibility only - completely ignored. |
Note: The model and speed parameters are completely ignored by the service, and relying on them may cause misleading behavior. Only input, voice, response_format, and instructions affect the actual TTS output.
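As a rough illustration of the parameter table above, the following sketch builds a request body containing only the parameters the service acts on. The helper name `build_speech_payload` and its client-side checks are assumptions made for this example; the service performs the authoritative validation.

```python
SUPPORTED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_payload(text, voice, response_format="mp3", instructions=None):
    """Build a JSON-serializable body for POST /v1/audio/speech.

    Hypothetical helper: it mirrors the documented parameters but is not
    part of the API.
    """
    # input and voice are required; the API answers 400 if either is missing
    if not text or not voice:
        raise ValueError("both 'input' text and 'voice' are required")

    # Assumption: format names are sent in lowercase, as listed in the table
    fmt = response_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")

    payload = {"input": text, "voice": voice, "response_format": fmt}
    # instructions is optional, so include it only when provided
    if instructions:
        payload["instructions"] = instructions
    # model and speed are deliberately left out: the service ignores them
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post`.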
How the Instructions Parameter Works
The instructions parameter is mapped to a prompt parameter when sent to the backend service. It can be used to guide the voice emotion, tone, or style. Some examples of effective instructions:
- Emotional guidance: "Speak in a happy and excited tone."
- Character impersonation: "Speak like a wise old wizard."
- Contextual hints: "This is being read to a child, speak gently."
- Reading style: "Read this as a news broadcast."
Tip: Keep instructions clear and concise. Overly complex instructions may not be interpreted correctly.
Response Format
The API returns audio in the requested format with the following headers:
- Content-Type: Based on the requested format (e.g., "audio/mpeg" for MP3)
- Access-Control-Allow-Origin: "*" (CORS enabled)
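Since an error response does not carry audio, a client may want to check the Content-Type header before writing the body to disk. The sketch below assumes every successful response uses an `audio/*` media type; only "audio/mpeg" for MP3 is stated above, and the helper name `is_audio_response` is made up for this example.

```python
def is_audio_response(headers):
    """Return True if the Content-Type header names an audio media type."""
    # Header names are case-insensitive, so normalize the keys first
    normalized = {name.lower(): value for name, value in headers.items()}
    content_type = normalized.get("content-type", "")
    # Drop optional parameters such as "; charset=utf-8"
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Assumption: all audio formats are served under an audio/* media type
    return media_type.startswith("audio/")
```

With the requests library, `response.headers` can be passed in directly.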
Error Responses
Status Code | Description |
---|---|
400 | Missing required parameters (input or voice) |
429 | Rate limit exceeded or queue is full. Includes Retry-After header when rate limited. |
500 | Internal server error |
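One way a client might react to a 429 is to wait for the time given in the Retry-After header before trying again. The sketch below assumes the header value is a number of seconds and falls back to a default delay otherwise (for example when the queue is full and no header is present). The function name `retry_delay` and the 5-second default are illustrative choices, not part of the API.

```python
def retry_delay(status_code, headers, default_delay=5.0):
    """Return seconds to wait before retrying, or None if the status is not 429."""
    if status_code != 429:
        return None
    # Header names are case-insensitive, so normalize the keys first
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("retry-after")
    if value is None:
        # No Retry-After header was sent; use the fallback delay
        return default_delay
    try:
        # Assumption: the header holds a delay in seconds
        return max(0.0, float(value))
    except ValueError:
        # Any other value (such as an HTTP date) falls back to the default
        return default_delay
```

A caller would typically sleep for the returned number of seconds and then resend the same request, with a cap on the number of attempts.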
Queue System
The API uses a queue system to handle multiple requests efficiently:
- Maximum queue size: Configurable via the MAX_QUEUE_SIZE environment variable (default: 100 requests)
- Requests are processed in FIFO (First In, First Out) order
- Rate limiting: Configurable via the RATE_LIMIT_REQUESTS and RATE_LIMIT_WINDOW environment variables (default: 30 requests per 60 seconds per IP address)
- Queue status can be monitored via the /api/queue-size endpoint
- Queue status updates every 2 seconds in the web interface
- Visual indicators show queue load (Low/Medium/High) based on utilization
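To stay under the default limit of 30 requests per 60 seconds, a client can throttle itself before sending. The class below is a minimal client-side sketch using a sliding window. It is not the server's rate-limiting algorithm, which is not described here, and the name `SlidingWindowLimiter` is made up for this example.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: at most max_requests per window_seconds."""

    def __init__(self, max_requests=30, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of recent requests, oldest first

    def wait_time(self):
        """Return seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        # Forget requests that have left the window
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # Wait until the oldest recorded request leaves the window
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self.clock())
```

A caller would run `time.sleep(limiter.wait_time())` before each request and `limiter.record()` right after sending it. Because the limit applies per IP address, other clients behind the same address can still use up the budget.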
Queue Status Endpoint
GET /api/queue-size
Returns JSON with queue information:
{
  "queue_size": 0,       // Current number of requests in queue
  "max_queue_size": 100  // Maximum queue capacity
}
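The two fields in this response are enough to estimate how busy the service is. The sketch below computes utilization and maps it to a Low/Medium/High label. The 50% and 80% thresholds are hypothetical, since the cut-offs used by the web interface are not documented here, and the host in `QUEUE_URL` is assumed to be the same one used in the Quick Start.

```python
import json
from urllib.request import urlopen

# Assumption: the queue endpoint is served from the same host as the speech endpoint
QUEUE_URL = "https://ttsapi.site/api/queue-size"


def queue_utilization(status):
    """Return the queue fill level as a fraction between 0.0 and 1.0."""
    max_size = status["max_queue_size"]
    if max_size <= 0:
        return 1.0  # treat a zero-capacity queue as full
    return min(1.0, status["queue_size"] / max_size)


def load_label(utilization, medium_at=0.5, high_at=0.8):
    """Map utilization to Low/Medium/High using hypothetical thresholds."""
    if utilization >= high_at:
        return "High"
    if utilization >= medium_at:
        return "Medium"
    return "Low"


def fetch_queue_status(url=QUEUE_URL, timeout=10):
    """Fetch and decode the queue status JSON (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `load_label(queue_utilization(fetch_queue_status()))` could be checked before submitting a batch of requests.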
Response Status Codes
- 200: Success
- 429: Queue is full or rate limit exceeded
- 500: Server error