PDF reader MCP

MCP server to read and search text in a local PDF file.

Stars: 26
Forks: 7
Releases: 0

Overview

An MCP server for processing local PDF files. It provides text extraction, text search, and metadata retrieval. Text extraction accepts a page range, can optionally include metadata in the output, and can normalize the extracted text. Search supports case-sensitive and whole-word matching as configurable options. Metadata retrieval returns properties such as author, title, creation date, and keywords. The server enforces a 50 MB file size limit to protect resources and uses asynchronous file operations so that requests do not block. It ships with three tools: read-pdf for reading and extraction, search-pdf for in-document search, and pdf-metadata for metadata-only retrieval. Files are validated and paths are sanitized to guard against unsafe inputs. The server is designed for MCP workflows, including Cursor integration, and suits automation and analysis tasks that need PDF handling.
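As a rough illustration of how a client might invoke one of the three tools, the sketch below builds a JSON-RPC 2.0 request for MCP's `tools/call` method. The tool name `search-pdf` comes from the overview; the helper `buildToolCall` and the argument names `file_path`, `query`, and `case_sensitive` are assumptions made for this example, not the server's documented schema.

```javascript
// Build a JSON-RPC 2.0 request for the MCP `tools/call` method.
// `buildToolCall` is a hypothetical helper written for this example.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "search-pdf", {
  file_path: "/path/to/report.pdf", // hypothetical argument name
  query: "invoice",                 // hypothetical argument name
  case_sensitive: false,            // hypothetical argument name
});

console.log(JSON.stringify(request, null, 2));
```

The real argument names should be read from the schema the server returns for `tools/list` rather than taken from this sketch.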

Details

Owner: gpetraroli
Language: JavaScript
License: (not listed)
Updated: 2025-12-07

Features

Text Extraction

Extract text from PDF files, with options for page ranges, metadata inclusion, and text normalization.

Text Search

Search for text within a PDF, with case-sensitive and whole-word matching options.
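To make the case-sensitive and whole-word options concrete, here is a minimal sketch of a search over already-extracted text. The function names `escapeRegExp` and `searchText` and the option names `caseSensitive` and `wholeWord` are assumptions for illustration; the server's own implementation may differ.

```javascript
// Escape regex metacharacters so the query is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the start index of every match of `query` in `text`.
// Option names are hypothetical, chosen for this example.
function searchText(text, query, { caseSensitive = false, wholeWord = false } = {}) {
  if (query === "") return [];
  let pattern = escapeRegExp(query);
  // \b marks a boundary between a word character and a non-word character.
  if (wholeWord) pattern = `\\b${pattern}\\b`;
  const flags = caseSensitive ? "g" : "gi";
  return Array.from(text.matchAll(new RegExp(pattern, flags)), (m) => m.index);
}

const sample = "The cat sat. Catalog of cats: CAT.";
console.log(searchText(sample, "cat"));                      // [ 4, 13, 24, 30 ]
console.log(searchText(sample, "cat", { wholeWord: true })); // [ 4, 30 ]
```

Whole-word matching via `\b` only behaves as expected when the query starts and ends with a word character; queries that begin or end with punctuation would need a different boundary rule.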

Metadata Extraction

Retrieve PDF metadata such as author, title, creation date, and keywords.

Page-specific Processing

Extract content from specific page ranges.
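One way page-range handling could work is sketched below. It assumes a range is given as a string such as "1-3,5" with 1-based page numbers; that input format, and the function name `parsePageRange`, are assumptions for this example rather than the server's documented interface.

```javascript
// Parse a spec such as "1-3,5" into sorted, unique, 1-based page numbers.
// Throws RangeError for malformed parts or pages outside 1..pageCount.
function parsePageRange(spec, pageCount) {
  const pages = new Set();
  for (const part of spec.split(",")) {
    const match = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(part);
    if (!match) throw new RangeError(`Invalid page range: "${part}"`);
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (start < 1 || end < start || end > pageCount) {
      throw new RangeError(`Page range out of bounds: "${part.trim()}"`);
    }
    for (let p = start; p <= end; p++) pages.add(p);
  }
  return [...pages].sort((a, b) => a - b);
}

console.log(parsePageRange("1-3, 5", 10)); // [ 1, 2, 3, 5 ]
```

Validating against the document's page count up front lets the caller report a bad range before any extraction work starts.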

Text Cleaning

Normalize and clean extracted text.
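The listing does not say which cleaning steps are applied, so the following is only a sketch of normalization steps commonly used on PDF-extracted text. The function name `cleanText` and the choice of steps are assumptions for this example.

```javascript
// Normalize text extracted from a PDF. Every step here is a heuristic.
function cleanText(raw) {
  return raw
    .normalize("NFKC")                      // fold ligatures such as U+FB01 into "fi"
    .replace(/\r\n?/g, "\n")                // unify line endings
    .replace(/(\p{L})-\n(\p{L})/gu, "$1$2") // rejoin words hyphenated across lines
    .replace(/[ \t]+/g, " ")                // collapse runs of spaces and tabs
    .replace(/ *\n */g, "\n")               // trim spaces around line breaks
    .replace(/\n{3,}/g, "\n\n")             // keep at most one blank line
    .trim();
}

console.log(JSON.stringify(cleanText("\uFB01le  name\r\n\r\n\r\n\r\nextrac-\ntion  ")));
// "file name\n\nextraction"
```

The hyphen-rejoining step also merges genuinely hyphenated words that happen to break at a line end (for example "well-" followed by "known"), so it is best treated as optional.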

File Size Limits

Enforces a 50 MB file size limit to protect against overly large files.

Async Processing

Non-blocking file operations.
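The size limit and the non-blocking file access can be combined in one step, sketched here with Node's promise-based `fs` API. The names `MAX_PDF_BYTES`, `isWithinLimit`, and `readPdfBytes` are hypothetical, and reading "50 MB" as 50 × 1024 × 1024 bytes is an assumption; the server may count in decimal megabytes.

```javascript
const { stat, readFile } = require("node:fs/promises");

// Assumption: "50 MB" is interpreted as 50 MiB.
const MAX_PDF_BYTES = 50 * 1024 * 1024;

// Pure size check, kept separate so it is easy to test.
function isWithinLimit(sizeBytes, limit = MAX_PDF_BYTES) {
  return sizeBytes <= limit;
}

// Check the size with a non-blocking stat() before reading the file,
// so an oversized file is rejected without loading it into memory.
async function readPdfBytes(filePath, limit = MAX_PDF_BYTES) {
  const { size } = await stat(filePath);
  if (!isWithinLimit(size, limit)) {
    throw new Error(`File is ${size} bytes; the limit is ${limit} bytes`);
  }
  return readFile(filePath);
}
```

Because both `stat` and `readFile` return promises, the event loop stays free to serve other requests while the disk operations are in flight.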

Security

File validation and path sanitization.
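A minimal sketch of what path sanitization and file validation might involve is shown below. It assumes the server restricts reads to a single base directory and accepts only `.pdf` files; both rules, and the function name `resolvePdfPath`, are assumptions for this example.

```javascript
const path = require("node:path");

// Resolve `requested` against `baseDir` and reject anything that is not
// a .pdf file located inside that directory.
function resolvePdfPath(baseDir, requested) {
  if (typeof requested !== "string" || requested.includes("\0")) {
    throw new Error("Invalid path");
  }
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, requested);
  const relative = path.relative(base, resolved);
  const escapes =
    relative === "" ||
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) throw new Error("Path escapes the allowed directory");
  if (path.extname(resolved).toLowerCase() !== ".pdf") {
    throw new Error("Only .pdf files are accepted");
  }
  return resolved;
}
```

This check works on path strings only. A symbolic link inside the base directory could still point elsewhere, so a stricter version would also compare the result of `fs.realpath` against the base directory.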

Tags

PDF, Text extraction, Text search, Metadata, Page ranges, Text cleaning, File size limit, Async processing, Security, MCP server, read-pdf, search-pdf, pdf-metadata, Cursor integration