HTML to Markdown

Fetch HTML from URLs and convert it to clean Markdown with large-page handling.

Stars: 4
Forks: 0
Releases: 0

Overview

An MCP server that fetches HTML from a URL, or accepts raw HTML input, and converts it to Markdown using Turndown.js. It strips scripts, styles, and other non-content elements so the output is clean and readable, while preserving essential formatting such as headers, links, code blocks, lists, and tables. It also extracts the page title and metadata to give context about the source.

For very large pages, maxLength limits the returned content, and saveToFile writes the full Markdown to disk and returns only a summary, which keeps responses within token limits.

The server exposes a dedicated html_to_markdown tool and can be configured for Claude, Claude Desktop, Cursor, and Codex, including local development workflows. It speaks MCP (Model Context Protocol) over stdio transport and runs as an ES module Node.js server. This makes it suitable for automation, documentation generation, knowledge extraction, and AI-assisted processing of web content.

Details

Owner: levz0r
Language: JavaScript
License: not specified
Updated: 2025-12-07

Features

Fetch and convert web pages

Automatically fetch HTML from a URL and convert it to Markdown while preserving content structure.

HTML to Markdown conversion using Turndown.js

Transforms HTML into clean, readable Markdown while preserving headers, links, code blocks, and lists.

Preserve formatting

Keeps headers, links, code blocks, lists, and tables intact in the Markdown output.

Content cleaning

Removes scripts, styles, and other non-content elements for readability.
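To make the cleaning step concrete, here is a minimal sketch that drops script, style, and noscript elements plus HTML comments before conversion. The function name stripNonContent and the regex approach are assumptions made for illustration; the server's actual implementation is not shown in this listing, and with Turndown.js the same effect is more commonly achieved through its remove() rule or a proper HTML parser.

```javascript
// Hypothetical helper, not taken from the project: a simplified,
// regex-based stand-in for the content-cleaning step.
// It is meant for readability cleanup only, not for security sanitization.
function stripNonContent(html) {
  return html
    // Remove HTML comments.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Remove <script>, <style>, and <noscript> elements with their contents.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "");
}

const dirty =
  "<h1>Title</h1><script>track()</script><style>h1{color:red}</style><!-- ad slot --><p>Body</p>";
console.log(stripNonContent(dirty));
```

Regular expressions cannot handle every malformed page, so a parser-based approach is the safer choice for production use.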

Metadata extraction

Auto-extracts page titles and metadata to provide contextual references.
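As a rough illustration of what title and metadata extraction can look like, the sketch below pulls the title and the meta description out of an HTML string. The function name extractMetadata, the returned field names, and the choice of the description tag are assumptions; the listing does not say which metadata fields the server actually returns or how it parses them.

```javascript
// Hypothetical helper, not taken from the project: extracts <title> and
// <meta name="description"> with regular expressions.
// Limitation: the meta pattern only matches when `name` precedes `content`
// and the content value contains no quote characters.
function extractMetadata(html) {
  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const descMatch = html.match(
    /<meta[^>]+name=["']description["'][^>]*content=["']([^"']*)["']/i
  );
  return {
    title: titleMatch ? titleMatch[1].trim() : null,
    description: descMatch ? descMatch[1].trim() : null,
  };
}

const sample =
  "<html><head><title> Example Page </title>" +
  '<meta name="description" content="A short demo."></head><body></body></html>';
console.log(extractMetadata(sample));
```

Missing fields come back as null, so a caller can decide whether to omit them from the context it prepends to the Markdown.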

Large page handling and truncation

Supports maxLength to limit returned content and truncates long outputs with a message.
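The truncation behaviour described above can be sketched as follows. Only the maxLength name comes from the listing; the function name truncateMarkdown, the wording of the notice, and the assumption that the limit is counted in string characters are illustrative guesses.

```javascript
// Hypothetical helper, not taken from the project: cuts Markdown to at most
// `maxLength` characters and appends a notice when something was removed.
function truncateMarkdown(markdown, maxLength) {
  const noLimit = !Number.isInteger(maxLength) || maxLength <= 0;
  if (noLimit || markdown.length <= maxLength) {
    return { text: markdown, truncated: false };
  }
  // The notice text is an assumption; the real message is not documented here.
  const notice =
    `\n\n[Content truncated: showing ${maxLength} of ${markdown.length} characters]`;
  return { text: markdown.slice(0, maxLength) + notice, truncated: true };
}

const result = truncateMarkdown("# Heading\n\nA long body of text.", 9);
console.log(result.text);
```

Returning a truncated flag alongside the text lets a caller suggest saveToFile when the page did not fit.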

File saving for large outputs

The saveToFile option saves the full Markdown to disk and returns only a summary.
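One way to picture the save-and-summarize flow is the sketch below, which writes the full Markdown to a file and returns a small summary object. The function name saveMarkdown, the summary fields, and the 200-character preview are assumptions; the listing does not describe what the server's summary contains.

```javascript
// Hypothetical helper, not taken from the project: persists the full
// Markdown and returns a compact summary instead of the whole document.
async function saveMarkdown(markdown, filePath) {
  // Dynamic import keeps this sketch usable from ES modules and CommonJS alike.
  const { writeFile } = await import("node:fs/promises");
  await writeFile(filePath, markdown, "utf8");

  // Use the first level-1 heading, if any, as a title for the summary.
  const firstHeading = markdown
    .split("\n")
    .find((line) => line.startsWith("# "));

  return {
    savedTo: filePath,
    characters: markdown.length,
    title: firstHeading ? firstHeading.slice(2).trim() : null,
    preview: markdown.slice(0, 200),
  };
}

// Example use (the path is illustrative):
// saveMarkdown("# Title\n\nBody text", "./page.md").then(console.log);
```

The caller gets enough information to locate the file and judge its contents without spending tokens on the full text.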

MCP protocol compatibility and integrations

Operates over stdio transport and integrates with Claude, Cursor, Codex, and local development workflows.
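For readers unfamiliar with MCP over stdio, the sketch below builds the JSON-RPC request a client would send to invoke the tool and frames it as one newline-terminated line. The tool name html_to_markdown and the maxLength parameter come from the listing; the url argument name, the helper names, and the example URL are assumptions, and in practice an MCP client library handles this framing for you.

```javascript
// Hypothetical helpers, not taken from the project.
// Builds a JSON-RPC 2.0 "tools/call" request for the html_to_markdown tool.
function buildToolCall(id, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "html_to_markdown", arguments: args },
  };
}

// The stdio transport sends one JSON message per line, so the serialized
// message must not contain raw newlines; JSON.stringify without an indent
// argument guarantees that.
function frameForStdio(message) {
  return JSON.stringify(message) + "\n";
}

// `url` is an assumed argument name; `maxLength` is named in the listing.
const request = buildToolCall(1, { url: "https://example.com", maxLength: 5000 });
process.stdout.write(frameForStdio(request));
```

A client would write this line to the server's stdin and read the matching response, identified by the same id, from the server's stdout.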

Audience

Developers: Integrate HTML-to-Markdown conversion into automation workflows, tooling, and content processing pipelines.
Content Creators: Convert web content to Markdown for documentation, knowledge bases, or blog notes.
AI Assistants: Support Claude, Cursor, and Codex by fetching and formatting web content as Markdown.

Tags

HTML to Markdown, Web page fetching, Markdown conversion, Turndown.js, Content cleaning, Metadata extraction, Large page handling, MCP, Claude integration, Cursor integration