Sourcerer

MCP for semantic code search & navigation that reduces token waste.

Stars: 96
Forks: 11
Releases: 0

Overview

Sourcerer MCP is an MCP server for semantic code search and navigation that helps AI agents work efficiently without burning through costly tokens. Instead of reading entire files, agents can search conceptually and jump directly to the specific functions, classes, and code chunks they need.

The server builds a semantic index of your codebase by parsing sources with Tree-sitter into ASTs and extracting meaningful chunks (functions, classes, methods, types). Each chunk gets a stable ID of the form file.ext::Type::method and carries its source code, location data, and a contextual summary. The server watches for file changes with fsnotify, respects .gitignore files, and automatically re-indexes changed files, storing modification times as metadata.

Embeddings are generated via OpenAI and persisted in a chromem-go backed vector database, which enables semantic similarity search rather than exact text matching. The MCP tools are semantic_search, get_chunk_code, find_similar_chunks, index_workspace, and get_index_status; together they let AI agents retrieve relevant code with reduced token usage and cognitive load.

Details

Owner: st3v3nmw
Language: Go
License: MIT License
Updated: 2025-12-07

Features

Semantic Search

Find code by meaning: queries return contextually relevant chunks rather than exact text matches.

Get Chunk by ID

Retrieve specific chunks by their stable IDs (e.g., file.ext::Type::method), including code, location, and summaries.
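
To make the ID format concrete, here is a minimal sketch of splitting an ID such as file.ext::Type::method into its file and symbol path. The ChunkRef type and parseChunkID function are hypothetical names invented for illustration, and the rule that an ID needs at least one symbol segment is an assumption; Sourcerer's own handling may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// ChunkRef is a hypothetical breakdown of a chunk ID. The field names are
// illustrative and not taken from Sourcerer's source.
type ChunkRef struct {
	File string   // e.g. "store.go"
	Path []string // nested symbol names, e.g. ["Store", "Get"]
}

// parseChunkID splits an ID of the form file.ext::Type::method on "::".
// Assumption: a valid ID has a file part and at least one symbol segment.
func parseChunkID(id string) (ChunkRef, error) {
	parts := strings.Split(id, "::")
	if len(parts) < 2 || parts[0] == "" {
		return ChunkRef{}, fmt.Errorf("malformed chunk ID %q", id)
	}
	for _, p := range parts[1:] {
		if p == "" {
			return ChunkRef{}, fmt.Errorf("empty symbol segment in chunk ID %q", id)
		}
	}
	return ChunkRef{File: parts[0], Path: parts[1:]}, nil
}

func main() {
	ref, err := parseChunkID("store.go::Store::Get")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ref.File, ref.Path)
}
```

Because the ID encodes both the file and the symbol path, an agent can request one method without loading the rest of the file.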

Find Similar Chunks

Identify chunks similar to a given chunk using embeddings.

Index Workspace

Manually trigger re-indexing of the workspace to refresh the semantic index.

Get Index Status

Check indexing progress and status.

Code Parsing & Chunking

Uses Tree-sitter to parse sources into ASTs and extract meaningful chunks with stable IDs.
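
The chunking idea can be sketched as follows. Note the substitution: Sourcerer parses with Tree-sitter, while this sketch uses Go's standard go/parser so that it runs without external dependencies, and it handles only Go function and method declarations. The Chunk type, the extractChunks function, and the file.go::Func form for top-level functions are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// Chunk is a hypothetical record for one extracted declaration.
type Chunk struct {
	ID        string
	StartLine int
	EndLine   int
}

// sampleSrc is a small Go file used to demonstrate extraction.
const sampleSrc = `package demo

type Store struct{}

func (s *Store) Get(key string) string { return key }

func Open() *Store { return &Store{} }
`

// receiverName returns the type name of a method receiver, unwrapping pointers.
func receiverName(expr ast.Expr) string {
	switch t := expr.(type) {
	case *ast.StarExpr:
		return receiverName(t.X)
	case *ast.Ident:
		return t.Name
	}
	return ""
}

// extractChunks parses src and returns one chunk per function or method,
// with IDs shaped like file.go::Type::method (or file.go::Func).
func extractChunks(filename, src string) ([]Chunk, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var chunks []Chunk
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue // skip type, var, const, and import declarations
		}
		id := filename + "::" + fn.Name.Name
		if fn.Recv != nil && len(fn.Recv.List) > 0 {
			if recv := receiverName(fn.Recv.List[0].Type); recv != "" {
				id = filename + "::" + recv + "::" + fn.Name.Name
			}
		}
		chunks = append(chunks, Chunk{
			ID:        id,
			StartLine: fset.Position(fn.Pos()).Line,
			EndLine:   fset.Position(fn.End()).Line,
		})
	}
	return chunks, nil
}

func main() {
	chunks, err := extractChunks("demo.go", sampleSrc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, c := range chunks {
		fmt.Printf("%s (lines %d-%d)\n", c.ID, c.StartLine, c.EndLine)
	}
}
```

IDs derived from the file name and symbol path stay the same when code elsewhere in the file moves, which is what makes them usable as stable references.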

File System Integration

Watches for file changes with fsnotify, respects .gitignore, auto re-indexes, and tracks modification times.

Vector Database & Embeddings

Stores OpenAI-generated embeddings in a persistent chromem-go vector database for semantic similarity search.

Audience

AI agents: Use Sourcerer MCP to perform semantic code search and navigation, enabling concept-based access to code chunks and reducing token usage in AI workflows.
Development teams: Integrate semantic code search and indexing into developer tooling and workflows to reduce token usage when AI analyzes code.

Tags

semantic search, code search, code navigation, AI-assisted code exploration, vector database, embeddings, Tree-sitter, OpenAI API, fsnotify, .gitignore, chromem-go, sourcerer, Go, JavaScript, Markdown, Python, TypeScript