RAG Local

An MCP server for storing and retrieving text passages locally based on their semantic meaning.

Stars: 15
Forks: 2
Releases: 0

Overview

Memory Server (mcp-rag-local) provides a simple API to store and retrieve text passages by semantic meaning rather than by keyword. It uses Ollama to generate text embeddings and ChromaDB for vector storage and fast similarity search. You can memorize a single text or multiple texts at once and later query for the most relevant passages; results include the matching texts along with a human-readable description of their relevance.

The server also memorizes PDF content via the memorize_pdf_file tool: the reader processes up to 20 pages at a time, the LLM chunks the text into meaningful segments, and memorize_multiple_texts stores the chunks, repeating until the entire document is memorized. For very large texts it supports conversational chunking, where the LLM iteratively chunks and memorizes.

The server runs with Docker Compose, exposing ports for ChromaDB and Ollama, and provides a web-based admin GUI at http://localhost:8322. Setup involves pulling the embedding model (all-minilm:l6-v2) and configuring the MCP client to run main.py.
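
To make the flow concrete, here is a minimal sketch using the Python clients for Ollama and ChromaDB. It is illustrative only: the ChromaDB port, the collection name, and the sample texts are assumptions, not taken from the server's code.

```python
import chromadb
import ollama

# Connect to the ChromaDB instance started by Docker Compose (port assumed).
chroma = chromadb.HttpClient(host="localhost", port=8000)
collection = chroma.get_or_create_collection("memories")

# Embed a passage with the all-minilm:l6-v2 model pulled during setup.
text = "The warranty covers parts and labor for two years."
embedding = ollama.embeddings(model="all-minilm:l6-v2", prompt=text)["embedding"]

# Store the passage together with its embedding.
collection.add(ids=["memory-1"], embeddings=[embedding], documents=[text])

# Later: embed a query and retrieve the most similar stored passages.
query = "How long is the warranty?"
query_embedding = ollama.embeddings(model="all-minilm:l6-v2", prompt=query)["embedding"]
results = collection.query(query_embeddings=[query_embedding], n_results=3)
print(results["documents"][0])
```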

Details

Owner: renl
Language: Python
License: MIT License
Updated: 2025-12-07

Features

Semantic memory storage

Store and retrieve text passages based on semantic meaning using embeddings rather than keywords.

Memorize multiple texts

Memorize several texts in a single operation for later semantic retrieval.

PDF memorization

Memorize contents of a PDF by reading up to 20 pages at a time, chunking into meaningful segments, and storing.
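
A rough sketch of the page-batching loop is below, assuming pypdf as the reader; the library choice, function names, and the naive fallback chunker are assumptions, since the real server delegates chunking to the LLM.

```python
from pypdf import PdfReader  # reader library assumed; the server may use a different one

def naive_chunk(text: str, max_chars: int = 500) -> list[str]:
    """Stand-in for the LLM chunking step: split on lines, cap chunk size."""
    chunks, current = [], ""
    for line in text.splitlines():
        if current and len(current) + len(line) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += line + "\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks

def memorize_pdf_in_batches(path: str, batch_size: int = 20) -> list[str]:
    """Read up to 20 pages at a time, chunk each batch, repeat until the document is done."""
    pages = PdfReader(path).pages
    all_chunks: list[str] = []
    for start in range(0, len(pages), batch_size):
        batch_text = "\n".join(page.extract_text() or "" for page in pages[start:start + batch_size])
        all_chunks.extend(naive_chunk(batch_text))  # the real tool stores these via memorize_multiple_texts
    return all_chunks
```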

Conversational chunking

LLM-assisted splitting of long texts into short, meaningful chunks and memorizing them iteratively.
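
One way to picture the LLM-assisted step is a single chat call that returns one chunk per line; the model name and prompt below are assumptions, and the real server drives this iteratively through the MCP conversation rather than in one shot.

```python
import ollama

def llm_chunk(text: str, model: str = "llama3") -> list[str]:
    """Ask a local LLM to split a long text into short, self-contained chunks (illustrative)."""
    prompt = (
        "Split the following text into short, self-contained chunks. "
        "Return one chunk per line.\n\n" + text
    )
    reply = ollama.chat(model=model, messages=[{"role": "user", "content": prompt}])
    return [line.strip() for line in reply["message"]["content"].splitlines() if line.strip()]
```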

Semantic retrieval

Retrieve the most relevant stored texts for a query, with human-readable relevance descriptions.
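
The relevance description can be derived from the distances ChromaDB returns alongside each match; the thresholds and wording below are illustrative, not the server's own.

```python
def describe_relevance(distance: float) -> str:
    """Map a ChromaDB distance to a human-readable relevance label (thresholds assumed)."""
    if distance < 0.3:
        return "highly relevant"
    if distance < 0.6:
        return "somewhat relevant"
    return "loosely related"

# Usage with a query result (see the storage sketch in the overview):
# results = collection.query(query_embeddings=[query_embedding], n_results=3,
#                            include=["documents", "distances"])
# for doc, dist in zip(results["documents"][0], results["distances"][0]):
#     print(f"{describe_relevance(dist)}: {doc}")
```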

Local vector store (ChromaDB)

Uses ChromaDB for vector storage and fast similarity search.

Embedding generation with Ollama

Generates embeddings using Ollama for the vector store.

MCP tooling and deployment

Includes MCP configuration and Docker Compose-based deployment to run ChromaDB, Ollama, and the MCP server.
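
As a sketch of how such tools can be exposed, the snippet below registers one tool with the MCP Python SDK's FastMCP helper. The server name and tool body are placeholders; only the tool name memorize_multiple_texts comes from the documentation above, and the actual main.py may be structured differently.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("mcp-rag-local")

@mcp.tool()
def memorize_multiple_texts(texts: list[str]) -> str:
    """Store several texts for later semantic retrieval (placeholder body)."""
    # A real implementation would embed each text with Ollama and add it to ChromaDB.
    return f"Memorized {len(texts)} texts."

if __name__ == "__main__":
    mcp.run()  # stdio transport, as expected when an MCP client launches main.py
```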

Audience

Developers: Build local semantic memory systems by storing and retrieving text passages using embeddings and vector search.
Data scientists / ML engineers: Prototype and test retrieval workflows with PDFs and long documents in a local setup.
Researchers: Study content-based retrieval methods in privacy-conscious, local deployments with an admin GUI for inspection.

Tags

semantic-memory, text-embedding, vector-search, ChromaDB, Ollama, PDF, chunking, local-memory, retrieval, memory-server, MCP