Kokoro TTS

Use Kokoro text-to-speech to convert text to MP3 files, with optional automatic upload to S3.

Stars: 67
Forks: 14
Releases: 0

Overview

The Kokoro Text to Speech MCP server generates MP3 audio files from text using the Kokoro TTS model. It relies on the Kokoro ONNX weights (kokoro-v1.0.onnx) and voices (voices-v1.0.bin) stored in the same repository, and it references the Kokoro-TTS source from HuggingFace Spaces. FFmpeg is required to convert the generated WAV audio to MP3.

Behavior is configured through environment variables and the MCP config. The project's configuration example shows how to set the command, its arguments, and environment variables such as TTS_VOICE, TTS_SPEED, TTS_LANGUAGE, the AWS credentials, AWS_REGION, AWS_S3_FOLDER, S3_ENABLED, and MP3_FOLDER. Endpoint and bucket details, as well as host and port bindings, are also set through environment variables.

Local MP3 files are stored in MP3_FOLDER, with optional automatic cleanup controlled by MP3_RETENTION_DAYS and DELETE_LOCAL_AFTER_S3_UPLOAD. S3 uploads are toggled with S3_ENABLED and can be disabled for a single request using the client's --no-s3 option.

The server runs with UV (uv run mcp-tts.py) and ships a client tool (mcp_client.py) that submits TTS requests with options for text, file input, voice, and speed.

Details

Owner: mberg
Language: Python
License: Apache License 2.0
Updated: 2025-12-07

Features

Kokoro TTS to MP3

Generates MP3 audio files from text using the Kokoro Text-to-Speech model.

Optional S3 Upload

Uploads MP3 files to S3 when enabled; uploads can be disabled for a single request with the --no-s3 option of mcp_client.py.

Environment-based Configuration

Configurable voice, speed, language and AWS/S3 settings via environment variables.
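
As an illustration of environment-based configuration, the sketch below reads a few of the variables named above into a settings object. It is not the server's actual code: the TTSSettings class, the load_settings helper, and every default value are assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TTSSettings:
    """Hypothetical container for a subset of the documented variables."""
    voice: str
    speed: float
    language: str
    s3_enabled: bool
    mp3_folder: str


def _as_bool(value: str) -> bool:
    # Accept a few common truthy spellings; anything else counts as False.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> TTSSettings:
    """Build settings from a mapping (defaults to os.environ).

    All fallback values here are assumptions, not the project's defaults.
    """
    env = os.environ if env is None else env
    return TTSSettings(
        voice=env.get("TTS_VOICE", "af_heart"),
        speed=float(env.get("TTS_SPEED", "1.0")),
        language=env.get("TTS_LANGUAGE", "en-us"),
        s3_enabled=_as_bool(env.get("S3_ENABLED", "false")),
        mp3_folder=env.get("MP3_FOLDER", "mp3"),
    )
```

Passing a plain dictionary instead of os.environ keeps the loader easy to exercise without touching the real environment.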

Local MP3 Management

Stores MP3s locally in a configurable MP3_FOLDER and supports automatic cleanup.
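
One plausible way to implement retention-based cleanup is to delete MP3 files whose modification time is older than MP3_RETENTION_DAYS. The sketch below shows that idea only; the cleanup_old_mp3s function and its use of file modification time are assumptions, and the server may decide differently.

```python
import time
from pathlib import Path


def cleanup_old_mp3s(folder, retention_days, now=None):
    """Delete *.mp3 files in `folder` older than `retention_days`.

    Age is judged by modification time (an assumption for this sketch).
    Returns the names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds per day
    removed = []
    for path in sorted(Path(folder).glob("*.mp3")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

Only files ending in .mp3 directly inside the folder are considered, so unrelated files and subdirectories are left alone.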

WAV to MP3 Conversion with FFmpeg

Requires FFmpeg to convert generated WAV audio to MP3 format.
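
The conversion step can be sketched as a call to the ffmpeg command-line tool from Python. The function names and the chosen quality setting are assumptions for illustration; the flags shown are standard FFmpeg options, and libmp3lame must be available in the local FFmpeg build.

```python
import shutil
import subprocess


def build_ffmpeg_command(wav_path, mp3_path, quality=2):
    """Return the argument list for a WAV to MP3 conversion.

    -y overwrites an existing output file, -codec:a selects the MP3
    encoder, and -q:a sets the VBR quality (lower is better).
    """
    return [
        "ffmpeg",
        "-y",
        "-i", str(wav_path),
        "-codec:a", "libmp3lame",
        "-q:a", str(quality),
        str(mp3_path),
    ]


def wav_to_mp3(wav_path, mp3_path):
    """Run ffmpeg, failing early with a clear message if it is missing."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(
        build_ffmpeg_command(wav_path, mp3_path),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # keep ffmpeg's log out of the console
    )
```

Separating command construction from execution makes the arguments easy to inspect without FFmpeg installed.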

UV-based Local Server

Run the MCP server locally using UV (uv run mcp-tts.py).
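
For clients that read a JSON MCP config, an entry for this server might look like the output of the sketch below. The mcpServers layout follows the convention used by several MCP clients; the server name kokoro-tts, the build_mcp_config helper, and all environment values are placeholders, and a real setup may need extra uv arguments such as a working directory.

```python
import json


def build_mcp_config(server_name, env):
    """Return a config dict that launches the server with `uv run mcp-tts.py`."""
    return {
        "mcpServers": {
            server_name: {
                "command": "uv",
                "args": ["run", "mcp-tts.py"],
                # Environment values are passed as strings.
                "env": dict(env),
            }
        }
    }


config = build_mcp_config(
    "kokoro-tts",  # placeholder name
    {
        "TTS_VOICE": "af_heart",  # illustrative values only
        "TTS_SPEED": "1.0",
        "TTS_LANGUAGE": "en-us",
        "S3_ENABLED": "false",
        "MP3_FOLDER": "mp3",
    },
)
print(json.dumps(config, indent=2))
```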

TTS Client Tool

Use mcp_client.py to send TTS requests, supplying the text directly or from a file and optionally setting the voice and speed.

S3 Endpoint Flexibility

Optionally configure a custom S3-compatible endpoint URL for storage.
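
A custom endpoint is typically passed to the S3 client at construction time. The sketch below assumes boto3 as the client library and uses S3_ENDPOINT_URL as a hypothetical variable name, since the listing does not name the endpoint or bucket variables; AWS_REGION and AWS_S3_FOLDER are the documented ones.

```python
import os


def s3_client_kwargs(env):
    """Build keyword arguments for boto3.client("s3", ...)."""
    kwargs = {}
    region = env.get("AWS_REGION", "").strip()
    if region:
        kwargs["region_name"] = region
    # S3_ENDPOINT_URL is a hypothetical name used only in this sketch.
    endpoint = env.get("S3_ENDPOINT_URL", "").strip()
    if endpoint:
        kwargs["endpoint_url"] = endpoint
    return kwargs


def object_key(filename, env):
    """Prefix the file name with AWS_S3_FOLDER when it is set."""
    folder = env.get("AWS_S3_FOLDER", "").strip("/")
    return f"{folder}/{filename}" if folder else filename


def upload_mp3(local_path, bucket, env):
    """Upload one file and return its object key (requires boto3)."""
    import boto3  # third-party; imported lazily so the helpers work without it

    client = boto3.client("s3", **s3_client_kwargs(env))
    key = object_key(os.path.basename(str(local_path)), env)
    client.upload_file(str(local_path), bucket, key)
    return key
```

Leaving endpoint_url unset falls back to the standard AWS endpoint, so the same code path serves both AWS and S3-compatible stores.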

Tags

Kokoro, TTS, Text-to-Speech, MCP Server, MP3, S3, AWS, Onnx, HuggingFace, ffmpeg, audio