Overview
Features
Puppeteer scraping with stealth mode
Scrapes webpages using Puppeteer with stealth mode to avoid detection while extracting content.
AI-driven interaction for overlays and prompts
Uses vision-capable AI to automatically handle cookies, CAPTCHAs, newsletters, paywalls, login walls, age checks, interstitial ads, and other blocking elements.
Content extraction with Readability
Extracts the main page content using Mozilla's Readability.
HTML to Markdown conversion with Turndown
Converts the extracted HTML into well-formatted Markdown.
Structured content handling
Provides special handling for code blocks, tables, and other structured content.
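To make the idea of structured content handling concrete, here is a minimal, dependency-free sketch of how a table and a code block might be rendered as Markdown. It is not the project's implementation, which relies on Turndown. The helper names `escapeCell`, `toMarkdownTable`, and `toFencedCode` are hypothetical and exist only for this illustration.

```typescript
// Illustrative only: the project itself uses Turndown for conversion.
// All function names below are hypothetical.

/** Escape pipes and collapse newlines so a cell stays on one table row. */
function escapeCell(text: string): string {
  return text.replace(/\|/g, "\\|").replace(/\s*\n\s*/g, " ").trim();
}

/** Render a header row plus body rows as a pipe-delimited Markdown table. */
function toMarkdownTable(header: string[], rows: string[][]): string {
  const line = (cells: string[]): string =>
    "| " + cells.map(escapeCell).join(" | ") + " |";
  const separator = "| " + header.map(() => "---").join(" | ") + " |";
  return [line(header), separator, ...rows.map(line)].join("\n");
}

/** Wrap code in a fence that is longer than any backtick run inside it. */
function toFencedCode(code: string, language: string = ""): string {
  const runs: string[] = code.match(/`+/g) ?? [];
  const longest = runs.reduce((max, run) => Math.max(max, run.length), 0);
  const fence = "`".repeat(Math.max(3, longest + 1));
  return `${fence}${language}\n${code}\n${fence}`;
}
```

Choosing the fence length from the content keeps a snippet that itself contains backtick runs from terminating the block early, and escaping pipes keeps a cell from being split into extra columns.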
Model Context Protocol accessibility
Exposes the scraper as a Model Context Protocol (MCP) tool that MCP-compatible clients can call.
Real-time browser view option
Allows viewing the browser interaction in real-time by disabling headless mode.
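As a rough sketch of how a headless toggle could be wired up, the function below maps a setting to Puppeteer-style launch options. The `headless` and `slowMo` fields mirror real Puppeteer launch options, but the environment variable name `SHOW_BROWSER` and the helper `buildLaunchOptions` are assumptions made for illustration, not this project's documented interface.

```typescript
// Hypothetical: SHOW_BROWSER and buildLaunchOptions are illustrative names,
// not part of this project's documented configuration.

interface LaunchOptions {
  headless: boolean; // same meaning as Puppeteer's `headless` launch option
  slowMo?: number;   // Puppeteer's `slowMo`: delay in ms between operations
}

function buildLaunchOptions(
  env: Record<string, string | undefined>
): LaunchOptions {
  const visible = env["SHOW_BROWSER"] === "true";
  // A visible browser is easier to follow when actions are slowed slightly.
  return visible ? { headless: false, slowMo: 50 } : { headless: true };
}
```

In a Node.js program, the result could be passed to `puppeteer.launch(buildLaunchOptions(process.env))`.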
NPX-ready distribution
Runs directly through npx, so it can be used without cloning the repository.
Who Is This For?
- LLM developers: Integrate this MCP server as a tool that scrapes web content and returns it as Markdown within an MCP-compatible orchestrator.
- MCP orchestrator engineers: Configure and manage this server as an external MCP server launched via NPX, with support for stdio, SSE, and HTTP transports.
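For orchestrator engineers, the following sketch shows one way a stdio server entry launched through npx might be generated. It assumes the `mcpServers` configuration shape used by several MCP clients, which may differ in your orchestrator. The package name is not given here, so `<package-name>` is a placeholder, and `npxServerEntry` and the `scraper` key are hypothetical names. SSE and HTTP transports are not covered.

```typescript
// Assumption: the client reads an `mcpServers` map of stdio server entries.
// "<package-name>" is a placeholder, not the real package name.

interface StdioServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

function npxServerEntry(
  packageName: string,
  env: Record<string, string> = {}
): StdioServerEntry {
  // `-y` tells npx to skip the install confirmation prompt.
  const entry: StdioServerEntry = { command: "npx", args: ["-y", packageName] };
  if (Object.keys(env).length > 0) {
    entry.env = env;
  }
  return entry;
}

const config = { mcpServers: { scraper: npxServerEntry("<package-name>") } };
const configJson = JSON.stringify(config, null, 2);
```

The resulting `configJson` string can be written to whichever configuration file the orchestrator reads.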




