Cartesia

Connect to the Cartesia voice platform to perform text-to-speech, voice cloning, and related audio tasks.

Stars: 11
Forks: 4
Releases: 0

Overview

The Cartesia MCP server is a bridge that lets clients such as Cursor, Claude Desktop, and OpenAI agents access Cartesia's API. It supports generating speech from text, localizing speech into different languages, and infilling audio between existing segments.

The server is installed via pip and runs as a command-line executable. Clients register it through their integration files: Claude Desktop uses claude_desktop_config.json, and Cursor uses .cursor/mcp.json or a global config. The README's Claude Desktop example shows how to set the environment variables CARTESIA_API_KEY and OUTPUT_DIRECTORY; the latter specifies where generated files are stored.

Using the server requires a Cartesia account and an API key, which you can create in the API Keys section of the Cartesia Playground (New). The README also mentions a free tier with 20,000 credits per month.

The MCP server exposes commands to list voices, convert text to audio, localize speech, infill audio, and switch voices.
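The client configuration described above can be sketched in Python. This is a minimal illustration, not the README's own example: it assumes the common "mcpServers" layout used by Claude Desktop and Cursor config files, and the server entry name "cartesia-mcp", the executable path, and the helper name build_mcp_config are all hypothetical placeholders. Only CARTESIA_API_KEY and OUTPUT_DIRECTORY come from the description above.

```python
import json


def build_mcp_config(executable_path, api_key, output_directory=None,
                     server_name="cartesia-mcp"):
    """Build a config dict in the "mcpServers" layout (assumed, see lead-in).

    server_name and executable_path are placeholders; use the path where
    pip actually installed the server executable on your machine.
    """
    env = {"CARTESIA_API_KEY": api_key}
    # OUTPUT_DIRECTORY is optional, so only include it when provided.
    if output_directory is not None:
        env["OUTPUT_DIRECTORY"] = output_directory
    return {
        "mcpServers": {
            server_name: {
                "command": executable_path,
                "env": env,
            }
        }
    }


if __name__ == "__main__":
    config = build_mcp_config(
        executable_path="/path/to/cartesia-mcp",  # placeholder path
        api_key="your-api-key",                   # placeholder key
        output_directory="/path/to/output",       # placeholder directory
    )
    # Print JSON that could be merged into claude_desktop_config.json
    # or .cursor/mcp.json by hand.
    print(json.dumps(config, indent=2))
```

Printing the JSON rather than writing it to disk keeps the sketch from overwriting an existing config that may already list other MCP servers.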

Details

Owner: cartesia-ai
Language: Python
License: (not listed)
Updated: 2025-12-07

Features

Text-to-speech generation

Convert text into audio using Cartesia's TTS capabilities, enabling natural-sounding speech.

Voice localization

Localize speech into different languages or locales to support multilingual outputs.

Infill audio between segments

Infill or bridge audio between two existing clips to create seamless transitions.

Voice cloning / voice switching

Change the voice in a clip or clone a voice across outputs.

Multi-client integration

Connect with clients like Cursor, Claude Desktop, and OpenAI agents via a common MCP interface.

Configurable CLI server

Run the MCP server as a CLI tool installed with pip; specify the executable path and environment variables in config.

API key authentication

Authenticate with Cartesia using the CARTESIA_API_KEY environment variable.

Output directory support

Optionally specify OUTPUT_DIRECTORY to store generated audio and related files.
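As a rough illustration of how the two environment variables above fit together, the following sketch checks them before launching the server. It is a hypothetical preflight helper, not the server's actual startup logic: the function name resolve_settings and the fallback to the current working directory when OUTPUT_DIRECTORY is unset are assumptions made for the example.

```python
import os
from pathlib import Path


def resolve_settings(environ=None):
    """Return (api_key, output_dir) from an environment mapping.

    Hypothetical preflight check; the real server may behave differently.
    """
    environ = os.environ if environ is None else environ

    # CARTESIA_API_KEY is required for authentication.
    api_key = environ.get("CARTESIA_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("CARTESIA_API_KEY is not set")

    # OUTPUT_DIRECTORY is optional; falling back to the current working
    # directory is an assumption of this sketch.
    raw_dir = environ.get("OUTPUT_DIRECTORY")
    output_dir = Path(raw_dir).expanduser() if raw_dir else Path.cwd()
    return api_key, output_dir
```

Passing the environment as a plain mapping makes the check easy to exercise without touching the real process environment.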

Audience

Developers: Integrate Cartesia TTS and voice features into apps or assistants using the MCP server and config-based setups.
AI/voice engineers and product teams: Build voice-enabled workflows with Cursor, Claude Desktop, or OpenAI agents by accessing Cartesia's API through MCP.

Tags

Cartesia, MCP server, text-to-speech, voice cloning, voice localization, infill, audio generation, CLI integration, Claude Desktop, Cursor, OpenAI agents, API key