TextIn

OCR and text extraction from documents, including recognition and Markdown conversion.

Stars: 27
Forks: 6
Releases: 0

Overview

TextIn OCR MCP Server is a modular MCP server for OCR-powered text extraction from documents. It covers document text recognition, ID recognition, and invoice recognition, and it can convert documents to Markdown.

The server exposes three tools. recognition_text performs text recognition on images, Word documents, and PDFs. doc_to_markdown converts PDFs, Word documents, and common image formats to Markdown. general_information_extration automatically identifies and extracts key information from a document, or extracts user-specified fields, and returns the extracted data as JSON.

Supported input formats include PDF, Word/Excel documents, and standard image types (JPEG, PNG, BMP). Inputs may be provided by URL, but URLs that point to protected resources are not supported.

Setup is based on NPX. It requires an APP_ID and APP_SECRET from TextIn and an MCP server configuration that sets environment variables such as APP_ID, APP_SECRET, and MCP_SERVER_REQUEST_TIMEOUT.

The MCP server is released under the MIT License, which permits use, modification, and distribution subject to the license terms.
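The NPX-based setup can be sketched as a small script that assembles the server entry for an MCP client. This is an illustration under stated assumptions, not the project's documented configuration: the `mcpServers` layout with `command`, `args`, and `env` follows the convention used by common MCP clients, the server key `textin` and the helper `buildTextInServerConfig` are hypothetical, and `PACKAGE_NAME` is a placeholder because this listing does not name the npm package. The unit of MCP_SERVER_REQUEST_TIMEOUT is not stated here either, so the value below is only an example.

```javascript
// Placeholder: the listing does not name the npm package to run with npx.
const PACKAGE_NAME = "<textin-mcp-package>";

// Hypothetical helper that builds one "mcpServers" entry for an MCP client.
function buildTextInServerConfig({ appId, appSecret, requestTimeout }) {
  if (!appId || !appSecret) {
    throw new Error("APP_ID and APP_SECRET are required");
  }
  const env = { APP_ID: appId, APP_SECRET: appSecret };
  // The timeout is optional; environment variable values are always strings.
  if (requestTimeout !== undefined) {
    env.MCP_SERVER_REQUEST_TIMEOUT = String(requestTimeout);
  }
  return {
    mcpServers: {
      // "textin" is an arbitrary key chosen for this sketch.
      textin: { command: "npx", args: ["-y", PACKAGE_NAME], env },
    },
  };
}

const config = buildTextInServerConfig({
  appId: "your-app-id",
  appSecret: "your-app-secret",
  requestTimeout: 600000, // example value; unit assumed, not stated in the listing
});
console.log(JSON.stringify(config, null, 2));
```

The printed JSON can be pasted into a client's configuration file once the placeholder package name and credentials are replaced with real values.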

Details

Owner: intsig-textin
Language: JavaScript
License: MIT License
Updated: 2025-12-07

Features

recognition_text

Text recognition from images, Word documents, and PDFs; returns extracted text.

doc_to_markdown

Converts images, PDFs, and Word documents to Markdown; returns Markdown text.

general_information_extration

Automatically identifies and extracts key information from documents or user-specified fields; returns a JSON with the extracted data; supports PDF, Word/Excel, and images.
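A client invokes any of the three tools above through the MCP `tools/call` method, which is a JSON-RPC 2.0 request. The sketch below only builds such a request so the message shape is visible. The helper `buildToolCall` is hypothetical, and the argument key `path` is an assumption: this listing does not document the tools' input schemas, so the real parameter names should be taken from the server's tool definitions.

```javascript
// Tool names exactly as the server exposes them (spelling kept as published).
const TOOLS = ["recognition_text", "doc_to_markdown", "general_information_extration"];

// Hypothetical helper that builds a JSON-RPC 2.0 "tools/call" request.
// The "path" argument key is an assumption, not a documented parameter.
function buildToolCall(id, toolName, source) {
  if (!TOOLS.includes(toolName)) {
    throw new Error(`Unknown tool: ${toolName}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: { path: source } },
  };
}

// Example: request Markdown conversion of a publicly reachable PDF.
const request = buildToolCall(1, "doc_to_markdown", "https://example.com/report.pdf");
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library sends this message for you; the point is that the tool is selected by `params.name`, so the identifier must match the exposed spelling exactly.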

Audience

Developers: Integrate OCR, text extraction, and Markdown conversion into applications via the MCP server.
Data engineers: Extract structured data from documents and generate Markdown representations within pipelines.

Tags

OCR, Text Extraction, Document Recognition, ID Recognition, Invoice Recognition, Markdown Conversion, PDF, Images, Word, Excel, TextIn, MCP, NPX, TextIn OCR