Overview
Features
HTML to Markdown conversion
Converts HTML content fetched from URLs into clean Markdown.
Content preservation
Preserves essential content such as tables, images, and links.
Content trimming
Removes unnecessary elements (scripts, styles, navigation, footers, headers).
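The trimming step described above can be sketched with the Python standard library alone. This is an illustration of the idea, not the project's code: the tool itself is built on trafilatura and BeautifulSoup4, and the names `SKIPPED_TAGS`, `TrimmingParser`, and `trim_html` are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose whole subtree is dropped; mirrors the elements listed above.
SKIPPED_TAGS = {"script", "style", "nav", "footer", "header"}


class TrimmingParser(HTMLParser):
    """Collects text that is not nested inside any skipped tag."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def trim_html(html: str) -> str:
    """Return the visible text of `html` with boilerplate subtrees removed."""
    parser = TrimmingParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

A depth counter rather than a boolean flag keeps nested cases (for example a `nav` inside a `header`) from re-enabling output too early.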
Size reduction
Reduces output size by roughly 90-95% compared with the source HTML while preserving the essential content.
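One way to check the reduction figure for a given page is to compare byte sizes before and after conversion. The sketch below assumes the percentage is measured on UTF-8 byte length; the source does not say how the figure is computed, and `size_reduction` is a hypothetical helper.

```python
def size_reduction(html: str, markdown: str) -> float:
    """Percentage by which the Markdown is smaller than the HTML (UTF-8 bytes)."""
    original = len(html.encode("utf-8"))
    converted = len(markdown.encode("utf-8"))
    if original == 0:
        raise ValueError("original HTML is empty")
    return 100.0 * (original - converted) / original
```

For example, a 1000-byte page converted to 80 bytes of Markdown gives a reduction of 92%.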
Configurable rendering options
Options control whether images, tables, and links are included in the output.
Extraction stack
Built with trafilatura and BeautifulSoup4 for robust extraction.
Streaming processing
Processes content as a stream so that large pages are handled efficiently.
Browser mode with Playwright
Browser mode enables JavaScript rendering and authenticated access. It supports Chromium, Firefox, and WebKit, reuses cookies from a user profile, and offers configurable wait strategies.
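The browser mode described above might look roughly like the following with Playwright's synchronous Python API. This is a sketch under assumptions, not the project's implementation: `validate_options`, `fetch_rendered_html`, and the `profile_dir` parameter are hypothetical names, and the wait strategies shown are simply the values Playwright accepts for `page.goto(wait_until=...)`.

```python
from typing import Optional

# Values accepted by Playwright's page.goto(wait_until=...).
WAIT_STRATEGIES = {"commit", "domcontentloaded", "load", "networkidle"}
BROWSERS = {"chromium", "firefox", "webkit"}


def validate_options(browser: str, wait_until: str) -> None:
    """Reject unsupported browser names or wait strategies early."""
    if browser not in BROWSERS:
        raise ValueError(f"unsupported browser: {browser!r}")
    if wait_until not in WAIT_STRATEGIES:
        raise ValueError(f"unsupported wait strategy: {wait_until!r}")


def fetch_rendered_html(
    url: str,
    browser: str = "chromium",
    wait_until: str = "networkidle",
    profile_dir: Optional[str] = None,  # hypothetical: path to a browser profile
) -> str:
    """Load `url` in a headless browser and return the rendered HTML."""
    validate_options(browser, wait_until)
    # Imported lazily so validate_options works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser_type = getattr(p, browser)
        if profile_dir:
            # A persistent context reuses cookies stored in the profile directory.
            context = browser_type.launch_persistent_context(profile_dir, headless=True)
            owner = context
        else:
            owner = browser_type.launch(headless=True)
            context = owner.new_context()
        try:
            page = context.new_page()
            page.goto(url, wait_until=wait_until)
            return page.content()
        finally:
            owner.close()
```

The returned HTML would then go through the same trimming and Markdown conversion as a page fetched without a browser.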
Who Is This For?
- AI developers: Convert web pages to compact Markdown to feed LLMs and AI agents with essential content.
- Claude Desktop users: Configure and run html2md MCP via Claude Desktop using Docker or uv, enabling fast URL-to-Markdown conversions.
- Web developers: Leverage Playwright browser mode to render JS-heavy sites and access authenticated content.
- Data scientists: Obtain compact Markdown of pages with preserved tables/images for data extraction and analysis.




