MLflow

An MCP server enabling LLMs to interact with MLflow tracking for experiments, runs, metrics, and the model registry.

Stars
0
Forks
0
Releases
4

Overview

This MCP server provides a natural-language interface to MLflow tracking, so that large language models can query experiments, inspect runs, analyze metrics, and browse the model registry without the user writing code. Its tools list and search experiments, discover metrics and parameters, retrieve detailed run data, filter and sort runs, and compare runs side by side. Users can retrieve metric histories, compare parameters across runs, and identify the best-performing models. Run artifacts can be browsed and downloaded, and registered models, their versions, and their deployment stages are accessible through the MCP endpoints. Tag-based search and offset-based pagination help with large result sets.

The server can be installed with uvx (recommended) or pip, or run from source. It requires a running MLflow tracking server and the MLFLOW_TRACKING_URI setting, supplied either as an environment variable or in the Claude Desktop config. A health check is available to verify connectivity. Example tool names include get_experiments, get_runs, query_runs, get_best_run, compare_runs, get_run_artifacts, get_registered_models, and get_model_version.
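As an illustration of the Claude Desktop option, the sketch below builds an mcpServers entry that launches the server through uvx and passes MLFLOW_TRACKING_URI in the env block. The listing does not state the package name, so `<package-name>` is a placeholder to be replaced; the server key "mlflow" and the URI http://localhost:5000 are likewise assumptions chosen for the example.

```python
import json


def claude_desktop_entry(tracking_uri: str, package: str = "<package-name>") -> dict:
    """Build an mcpServers entry that starts the server via uvx.

    `package` is a placeholder: the real package name is not given in this
    listing and must be taken from the project's own documentation.
    """
    return {
        "mcpServers": {
            "mlflow": {  # arbitrary key naming this server inside Claude Desktop
                "command": "uvx",
                "args": [package],
                "env": {"MLFLOW_TRACKING_URI": tracking_uri},
            }
        }
    }


# Assumed URI of a locally running MLflow tracking server.
config = claude_desktop_entry("http://localhost:5000")
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the Claude Desktop configuration file once the placeholder has been replaced.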

Details

Owner
kkruglik
Language
Python
License
MIT License
Updated
2025-12-07

Features

Experiment Management

List and search experiments; discover available metrics and parameters.

Run Analysis

Retrieve run details, query runs with filters, and identify best-performing models.
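To make "best-performing" concrete, here is a minimal sketch of best-run selection over run records that have already been fetched. The record layout (run_id plus a metrics mapping) and the helper name best_run are assumptions made for the example, not the server's actual implementation of get_best_run.

```python
from typing import Optional


def best_run(runs: list, metric: str, higher_is_better: bool = True) -> Optional[dict]:
    """Return the run with the best value of `metric`, skipping runs that lack it."""
    scored = [run for run in runs if metric in run.get("metrics", {})]
    if not scored:
        return None  # no run logged this metric
    pick = max if higher_is_better else min
    return pick(scored, key=lambda run: run["metrics"][metric])


# Hypothetical run records, shaped like simplified tracking data.
runs = [
    {"run_id": "a1", "metrics": {"accuracy": 0.91, "loss": 0.31}},
    {"run_id": "b2", "metrics": {"accuracy": 0.94, "loss": 0.27}},
    {"run_id": "c3", "metrics": {"loss": 0.40}},
]

top_by_accuracy = best_run(runs, "accuracy")
top_by_loss = best_run(runs, "loss", higher_is_better=False)
```

The direction flag matters: accuracy is maximized, while loss and most error metrics are minimized.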

Metrics & Parameters

Get metric histories and compare parameters across runs.
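A parameter comparison across runs can be pictured as a table that keeps only the parameters whose values differ. The sketch below assumes the parameters have already been retrieved as one mapping per run (MLflow stores parameter values as strings); the function name differing_params and the input shape are hypothetical.

```python
def differing_params(params_by_run: dict) -> dict:
    """Map each parameter that differs between runs to its per-run values.

    A run that never logged the parameter is reported as None.
    """
    all_keys = sorted({key for params in params_by_run.values() for key in params})
    table = {}
    for key in all_keys:
        row = {run_id: params.get(key) for run_id, params in params_by_run.items()}
        if len(set(row.values())) > 1:  # keep only rows with at least two distinct values
            table[key] = row
    return table


# Hypothetical parameters for two runs.
params_by_run = {
    "a1": {"lr": "0.01", "depth": "6"},
    "b2": {"lr": "0.001", "depth": "6", "dropout": "0.1"},
}

diff = differing_params(params_by_run)
```

Here depth is omitted from the result because both runs share the same value.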

Artifacts

Browse and download run artifacts.

Model Registry

Access registered models, versions, and deployment stages.

Comparison Tools

Side-by-side run comparisons and best run selection.

Tag-based Search

Filter runs by custom tags.
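MLflow's own search syntax expresses tag conditions as clauses such as tags.`key` = 'value' joined with AND. The helper below is a sketch that builds such a filter string from a mapping of tags; whether this server's tools accept a raw filter string or separate tag arguments is not stated here, so treat tag_filter as an illustrative, hypothetical helper.

```python
def tag_filter(tags: dict) -> str:
    """Build an MLflow-style filter string that matches all of the given tags."""
    clauses = []
    for key, value in tags.items():
        # Quoting rules for embedded quote characters are not handled in this
        # sketch, so such inputs are rejected instead of guessed at.
        if "`" in key or "'" in value:
            raise ValueError(f"unsupported character in tag {key!r}")
        clauses.append(f"tags.`{key}` = '{value}'")
    return " AND ".join(clauses)


# Hypothetical custom tags attached to runs.
filter_string = tag_filter({"team": "vision", "stage": "baseline"})
```

With the MLflow Python client, a string of this form can be passed as the filter_string argument of mlflow.search_runs.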

Pagination

Offset-based pagination for browsing large result sets.
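Offset-based pagination means the caller asks for a window of results by start position and size, then advances the offset to fetch the next window. The following sketch shows the idea on an in-memory list; the parameter names offset and limit and the fields of the returned page are assumptions, not the server's documented response format.

```python
def paginate(items: list, offset: int = 0, limit: int = 10) -> dict:
    """Return one page of `items` plus the offset of the next page, if any."""
    if offset < 0 or limit <= 0:
        raise ValueError("offset must be >= 0 and limit must be > 0")
    end = offset + limit
    return {
        "items": items[offset:end],
        "offset": offset,
        "limit": limit,
        "total": len(items),
        # None signals that this window reached the end of the result set.
        "next_offset": end if end < len(items) else None,
    }


# Hypothetical result set of 25 run identifiers.
all_runs = [f"run-{i}" for i in range(25)]
first_page = paginate(all_runs, offset=0, limit=10)
last_page = paginate(all_runs, offset=20, limit=10)
```

A client keeps requesting pages with offset set to the previous next_offset until that field is None.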

Tags

MLflow, MCP, LLM, experiments, runs, metrics, parameters, artifacts, model registry, comparison, pagination, tag search, natural language, Claude