Overview
Features
Initialize Patronus with API key and project settings
Set up the Patronus MCP server by providing an API key and project configuration to enable subsequent evaluation and experiment workflows.
Run single evaluations with configurable evaluators
Execute individual evaluations using configurable evaluators (e.g., RemoteEvaluatorConfig) to assess a model output against a task.
Run batch evaluations with multiple evaluators
Perform batch evaluations across multiple evaluators to compare results on a single task and gather aggregated insights.
Run experiments with datasets
Execute experiments over datasets, with support for asynchronous execution, custom evaluators, and adapters for flexible evaluation pipelines.
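To make the batch-evaluation idea above concrete, here is a minimal, self-contained sketch of running several evaluators against one task and aggregating the results. It does not use the Patronus SDK or the MCP server: `LocalEvaluator`, `EvaluationResult`, `run_batch`, and the two scoring functions are hypothetical names invented for illustration, and the real server's request and response shapes may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvaluationResult:
    evaluator: str
    passed: bool
    score: float


@dataclass
class LocalEvaluator:
    """Hypothetical stand-in for a configured evaluator (not a Patronus class)."""
    name: str
    score_fn: Callable[[str, str], float]  # (task_input, task_output) -> score in [0, 1]
    threshold: float = 0.5

    def evaluate(self, task_input: str, task_output: str) -> EvaluationResult:
        score = self.score_fn(task_input, task_output)
        return EvaluationResult(self.name, score >= self.threshold, score)


def run_batch(evaluators: List[LocalEvaluator], task_input: str, task_output: str) -> Dict:
    """Run every evaluator on the same task and aggregate the results."""
    if not evaluators:
        raise ValueError("at least one evaluator is required")
    results = [e.evaluate(task_input, task_output) for e in evaluators]
    return {
        "results": results,
        "all_passed": all(r.passed for r in results),
        "mean_score": sum(r.score for r in results) / len(results),
    }


def non_empty(task_input: str, task_output: str) -> float:
    """Score 1.0 when the output contains any non-whitespace text."""
    return 1.0 if task_output.strip() else 0.0


def mentions_input_terms(task_input: str, task_output: str) -> float:
    """Fraction of longer input words (4+ characters) that reappear in the output."""
    terms = {w.strip("?.,!").lower() for w in task_input.split() if len(w) > 3}
    if not terms:
        return 1.0
    output = task_output.lower()
    return sum(t in output for t in terms) / len(terms)


evaluators = [
    LocalEvaluator("non-empty", non_empty),
    LocalEvaluator("term-overlap", mentions_input_terms),
]
summary = run_batch(
    evaluators,
    task_input="What is the capital of France?",
    task_output="Paris is the capital of France.",
)
for result in summary["results"]:
    print(result.evaluator, result.passed, round(result.score, 3))
```

A single evaluation is the same call with a one-element evaluator list. An experiment over a dataset would, under the same assumptions, repeat `run_batch` for each row.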
Who Is This For?
- Developers: Integrate the MCP server to run LLM evaluations and experiments programmatically.
- Researchers: Design and assess evaluators and criteria for model optimization experiments.
- Engineers: Build automated evaluation pipelines and dashboards for iterating on prompts.




