Mandoline

Mandoline MCP Server enables AI assistants to reflect on and improve performance using Mandoline's evaluation framework

Stars
4
Forks
0
Releases
2

Overview

Mandoline MCP Server enables AI assistants to reflect on, critique, and improve their own performance using Mandoline's evaluation framework through the Model Context Protocol. It is offered as a hosted service (recommended for most users), with an optional local deployment for development or contribution, and integrates with popular assistants such as Claude Code, Codex, Claude Desktop, and Cursor via standard MCP configurations.

The server provides core tooling for evaluation workflows: a health check to verify connectivity, a metrics subsystem to create and manage evaluation criteria (create_metric, batch_create_metrics, get_metric, get_metrics, update_metric), and an evaluations subsystem to score prompts and responses against those metrics (create_evaluation, batch_create_evaluations, get_evaluation, get_evaluations, update_evaluation). Documentation resources (llms.txt and MCP guides) cover setup, best practices, and usage.

For local use, install Node.js 18+, start the server, and configure clients to point to http://localhost:8080/mcp. The README includes concrete client setup snippets for Claude Code, Codex, Claude Desktop, and Cursor that connect to either the hosted or the local MCP server.
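
As a concrete illustration, the snippet below is a minimal sketch of connecting to a locally running server using the official MCP TypeScript SDK (@modelcontextprotocol/sdk), which is not mentioned in the README and is an assumption here. It also assumes the server at http://localhost:8080/mcp speaks the Streamable HTTP transport; the client name and version are placeholders.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

async function main() {
  // Identify this client to the server; name and version are arbitrary placeholders.
  const client = new Client({ name: "mandoline-example", version: "0.1.0" });

  // Connect to a locally running Mandoline MCP server (transport type assumed).
  const transport = new StreamableHTTPClientTransport(new URL("http://localhost:8080/mcp"));
  await client.connect(transport);

  // List the tools the server exposes (health check, metrics, evaluations).
  const { tools } = await client.listTools();
  console.log(tools.map((tool) => tool.name));

  await client.close();
}

main().catch(console.error);
```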

Details

Owner
mandoline-ai
Language
TypeScript
License
Apache License 2.0
Updated
2025-12-07

Features

Hosted Mandoline MCP server

A hosted MCP server for quick integration of Mandoline's evaluation tools into AI assistants.

Cross-client MCP integration

Supports Claude Code, Codex, Claude Desktop, and Cursor clients connecting to Mandoline's MCP server.

Health endpoint

Includes a health check tool (get_server_health) to verify MCP server reachability and health status.
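
Reusing a connected client like the one in the sketch above, the health check is a single tool call; the result shape is not documented here, so the logging below is only indicative.

```typescript
// Call the health-check tool on an already connected MCP client.
const health = await client.callTool({ name: "get_server_health", arguments: {} });

// The result's content typically carries text blocks describing server status (shape assumed).
console.log(health.content);
```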

Metrics management

Define evaluation criteria with metrics: create_metric, batch_create_metrics, get_metric, get_metrics, update_metric.
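
For example, defining a metric is a callTool invocation against a connected client; the argument names (name, description) are assumptions for illustration, not taken from the README.

```typescript
// Hypothetical metric definition; field names are assumed, not confirmed by the README.
const metric = await client.callTool({
  name: "create_metric",
  arguments: {
    name: "concision",
    description: "Does the response stay focused and avoid unnecessary verbosity?",
  },
});
console.log(metric.content);
```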

Evaluations management

Score prompts and responses against metrics with create_evaluation, batch_create_evaluations, get_evaluation, get_evaluations, update_evaluation.
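
A scoring call follows the same pattern; the argument names (metricId, prompt, response) and values are assumptions for illustration.

```typescript
// Hypothetical evaluation of a prompt/response pair against a previously created metric.
const evaluation = await client.callTool({
  name: "create_evaluation",
  arguments: {
    metricId: "<metric-id-from-create_metric>", // placeholder, not a real ID
    prompt: "Summarize the design doc in three bullet points.",
    response: "<assistant response to score>",
  },
});
console.log(evaluation.content);
```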

Documentation and resources

Access Mandoline docs index (llms.txt) and MCP guides for setup and best practices.
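
Documentation resources can be discovered and read through standard MCP resource calls; the exact resource URIs are server-defined, so this sketch only lists them and reads the first one.

```typescript
// List documentation resources exposed by the server (e.g. the llms.txt docs index).
const { resources } = await client.listResources();
console.log(resources.map((resource) => resource.uri));

// Read the first listed resource; URIs are server-defined and not assumed here.
if (resources.length > 0) {
  const doc = await client.readResource({ uri: resources[0].uri });
  console.log(doc.contents[0]);
}
```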

Local server option

Option to run the MCP server locally (Node.js 18+), with client configurations pointing to http://localhost:8080/mcp.

Audience

Developers

Integrate Mandoline MCP into AI assistants to enable evaluation, critique, and continuous improvement.

Tags

MCP, Model Context Protocol, Mandoline, evaluation framework, AI evaluation, metrics, evaluations, health, tools, integration