Cloudera Iceberg

Enabling AI on the Open Data Lakehouse.

Stars: 7
Forks: 8
Releases: 0

Overview

This MCP server provides read-only access to Iceberg tables through Apache Impala. It lets large language models (LLMs) inspect database schemas and run read-only SQL queries against Iceberg-backed data, so AI applications can discover schemas and fetch results without modifying data.

The server exposes two core operations: execute_query(query: str), which runs a SQL query on Impala and returns the results as JSON, and get_schema(), which lists all tables available in the current database.

The server connects to Impala using credentials and database settings read from environment variables (IMPALA_HOST, IMPALA_PORT, IMPALA_USER, IMPALA_PASSWORD, IMPALA_DATABASE). The transport layer is set through MCP_TRANSPORT and supports stdio (the default) for local tools, http for web deployments, and sse for existing web-based integrations.

The repository provides deployment instructions for Claude Desktop and local setups, covering both GitHub-based and local directory installations. Its examples show integration with AI frameworks such as LangChain/LangGraph and the OpenAI SDK.

Details

Owner: cloudera
Language: Python
License: Apache License 2.0
Updated: 2025-12-07

Features

Read-only Iceberg access via Impala

Provides read-only SQL access to Iceberg tables through Impala, preventing data modification.

execute_query

Runs any SQL query on Impala and returns the results as JSON.
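
The row-to-JSON step described above can be sketched as follows. This is a minimal illustration, not the server's actual implementation: the helper name run_query_as_json is hypothetical, it assumes a Python DB-API style cursor, and sqlite3 stands in for Impala only so the example runs without a cluster.

```python
import json
import sqlite3


def run_query_as_json(cursor, query: str) -> str:
    """Run a query on a DB-API cursor and return the rows as a JSON array of objects.

    Hypothetical helper; the real execute_query tool may shape its output differently.
    """
    cursor.execute(query)
    # cursor.description holds one tuple per result column; index 0 is the column name.
    columns = [col[0] for col in cursor.description or []]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    # default=str keeps values such as dates or decimals serializable.
    return json.dumps(rows, default=str)


# Stand-in demo: an in-memory sqlite3 database replaces Impala here.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE t (id INTEGER, name TEXT)")
demo.execute("INSERT INTO t VALUES (1, 'a'), (2, 'b')")
result = run_query_as_json(demo.cursor(), "SELECT id, name FROM t ORDER BY id")
print(result)
```

Against Impala, the cursor would come from whichever DB-API client the server uses; the listing does not name that client.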

get_schema

Lists all tables available in the current database.
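
One plausible way to list tables is Impala's SHOW TABLES statement, which returns one row per table in the current database. The sketch below assumes that approach; list_tables and FakeCursor are hypothetical names, and the fake cursor exists only so the example runs without an Impala connection.

```python
from typing import List


def list_tables(cursor) -> List[str]:
    """Return table names from the current database via SHOW TABLES.

    Hypothetical helper; the real get_schema tool may return more detail.
    """
    cursor.execute("SHOW TABLES")
    # Each result row carries the table name in its first column.
    return [row[0] for row in cursor.fetchall()]


class FakeCursor:
    """Stand-in for an Impala cursor, used only for this demo."""

    def __init__(self, rows):
        self._rows = rows
        self.executed = []

    def execute(self, statement):
        # Record the statement instead of sending it to a server.
        self.executed.append(statement)

    def fetchall(self):
        return list(self._rows)


cursor = FakeCursor([("orders",), ("customers",)])
tables = list_tables(cursor)
print(tables)
```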

Configurable transport

The MCP_TRANSPORT environment variable selects the transport: stdio (default), http, or sse.
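
Reading the transport setting might look like the sketch below. The variable name and the three values come from the description above; the function name resolve_transport, the case-insensitive matching, and the error on unknown values are assumptions made for illustration.

```python
import os
from typing import Mapping

SUPPORTED_TRANSPORTS = ("stdio", "http", "sse")


def resolve_transport(env: Mapping[str, str] = os.environ) -> str:
    """Return the transport named by MCP_TRANSPORT, falling back to stdio."""
    # An unset or empty variable falls back to the documented default.
    value = (env.get("MCP_TRANSPORT") or "stdio").strip().lower()
    if value not in SUPPORTED_TRANSPORTS:
        # Rejecting unknown values is an assumption, not documented behavior.
        raise ValueError(
            f"Unsupported MCP_TRANSPORT {value!r}; expected one of {SUPPORTED_TRANSPORTS}"
        )
    return value


print(resolve_transport({}))
print(resolve_transport({"MCP_TRANSPORT": "http"}))
```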

Claude Desktop integration

Includes deployment guidance to run the MCP server with Claude Desktop.
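
Claude Desktop reads MCP servers from the mcpServers section of its claude_desktop_config.json file, where each entry has a command, args, and env. The sketch below builds such an entry. The server name, launch command, script path, and all connection values are placeholders invented for illustration; the repository's own deployment guidance gives the real ones.

```python
import json
from typing import Dict, List


def claude_desktop_entry(
    name: str, command: str, args: List[str], env: Dict[str, str]
) -> dict:
    """Build an mcpServers fragment for claude_desktop_config.json."""
    return {
        "mcpServers": {
            name: {"command": command, "args": list(args), "env": dict(env)}
        }
    }


config = claude_desktop_entry(
    name="iceberg-mcp-server",      # placeholder server name
    command="python",               # placeholder launch command
    args=["/path/to/server.py"],    # placeholder path to the server entry point
    env={
        "IMPALA_HOST": "impala.example.com",
        "IMPALA_PORT": "21050",
        "IMPALA_USER": "analyst",
        "IMPALA_PASSWORD": "change-me",
        "IMPALA_DATABASE": "default",
        "MCP_TRANSPORT": "stdio",
    },
)
config_text = json.dumps(config, indent=2)
print(config_text)
```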

Deployment options

Supports GitHub-based direct installation or local directory installation.

Impala connection configuration

Connects to Impala using IMPALA_HOST, IMPALA_PORT, IMPALA_USER, IMPALA_PASSWORD, and IMPALA_DATABASE.
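
Loading these five variables could be sketched as follows. The variable names come from the description above; the ImpalaSettings class, the load_impala_settings function, and the choice to treat every variable as required are assumptions, and the sample values are invented.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = (
    "IMPALA_HOST",
    "IMPALA_PORT",
    "IMPALA_USER",
    "IMPALA_PASSWORD",
    "IMPALA_DATABASE",
)


@dataclass(frozen=True)
class ImpalaSettings:
    """Hypothetical container for the connection settings."""

    host: str
    port: int
    user: str
    password: str
    database: str


def load_impala_settings(env: Mapping[str, str] = os.environ) -> ImpalaSettings:
    """Read the IMPALA_* variables and fail early if any are missing."""
    # Treating all five as required is an assumption of this sketch.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return ImpalaSettings(
        host=env["IMPALA_HOST"],
        port=int(env["IMPALA_PORT"]),  # ports arrive as strings in the environment
        user=env["IMPALA_USER"],
        password=env["IMPALA_PASSWORD"],
        database=env["IMPALA_DATABASE"],
    )


sample_env = {
    "IMPALA_HOST": "impala.example.com",
    "IMPALA_PORT": "21050",
    "IMPALA_USER": "analyst",
    "IMPALA_PASSWORD": "change-me",
    "IMPALA_DATABASE": "default",
}
settings = load_impala_settings(sample_env)
print(settings.host, settings.port, settings.database)
```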

AI framework integration examples

Repository examples demonstrate integration with LangChain/LangGraph and the OpenAI SDK.

Audience

Data scientists: Inspect Iceberg schemas and run read-only SQL queries against Impala for AI analysis.
LLM developers: Build AI tools and integrations that query Iceberg data via Impala and process JSON results.
Data engineers: Set up and maintain Impala-based data access for AI-ready Iceberg datasets.

Tags

Iceberg, Impala, MCP Server, read-only, SQL, LLM, Claude Desktop, LangChain, LangGraph, OpenAI SDK, transport, http, sse, stdio