Puppeteer vision

Puppeteer vision MCP server scrapes webpages and converts them to Markdown with AI-driven interaction.

Stars: 45
Forks: 8
Releases: 0

Overview

An MCP server that combines Puppeteer-based web scraping with Mozilla Readability and Turndown to deliver clean Markdown output from web pages. It uses stealth browsing and AI-driven interaction to handle overlays and interactive elements automatically: cookie banners, CAPTCHAs, newsletter prompts, paywalls and login walls, age verification prompts, interstitial ads, and other blockers. After the interactions, the server uses Readability to extract the main content and converts the HTML into well-formatted Markdown, with special handling for code blocks, tables, and other structured content.

The tool is designed to be run via NPX, so no clone is needed, and it supports three communication modes (stdio, SSE, and HTTP) for integration with MCP-compatible LLM orchestrators in various deployment scenarios.

The exposed MCP tool is named scrape-webpage. It accepts parameters such as url, autoInteract, maxInteractionAttempts, and waitForNetworkIdle, and returns content (Markdown) and metadata (including success and contentSize). Configurable environment variables include OPENAI_API_KEY, VISION_MODEL, API_BASE_URL, TRANSPORT_TYPE, and DISABLE_HEADLESS; setting DISABLE_HEADLESS turns headless mode off so the browser can be watched in real time.
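As a rough illustration of how a client might invoke the tool, the sketch below builds the JSON-RPC 2.0 "tools/call" message that MCP uses for tool invocations. The tool name and parameter names come from the description above; which parameters are optional, their types, and the helper name buildScrapeRequest are assumptions made for this example, not the project's documented API.

```typescript
// Parameter names are taken from the overview; optionality and types are assumed.
interface ScrapeWebpageArgs {
  url: string;
  autoInteract?: boolean;
  maxInteractionAttempts?: number;
  waitForNetworkIdle?: boolean;
}

// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ScrapeWebpageArgs };
}

// Hypothetical helper: validates the URL scheme and wraps the arguments.
function buildScrapeRequest(id: number, args: ScrapeWebpageArgs): ToolCallRequest {
  if (!/^https?:\/\//i.test(args.url)) {
    throw new Error("url must start with http:// or https://");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "scrape-webpage", arguments: args },
  };
}

// Example: ask for a page with automatic overlay handling, capped at 3 attempts.
const exampleRequest = buildScrapeRequest(1, {
  url: "https://example.com/article",
  autoInteract: true,
  maxInteractionAttempts: 3,
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

In practice an MCP client library would normally build and send this message; the point here is only to show where the tool name and parameters end up.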

Details

Owner: djannot
Language: TypeScript
License: not listed
Updated: 2025-12-07

Features

Puppeteer scraping with stealth mode

Scrapes webpages using Puppeteer with stealth mode to avoid detection while extracting content.

AI-driven interaction for overlays and prompts

Uses vision-capable AI to automatically handle cookies, CAPTCHAs, newsletters, paywalls, login walls, age checks, interstitial ads, and other blocking elements.
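The interaction step can be pictured as a bounded loop: look at the page, act on any blocker found, and stop once the page is clear or the attempt budget runs out. The sketch below shows that control flow only. The BlockerReport shape and the inspect and perform callbacks are hypothetical stand-ins for a screenshot plus vision-model call and a Puppeteer action; they are not taken from the project's code.

```typescript
// Hypothetical result of asking a vision model about the current page.
type BlockerReport =
  | { blocked: false }
  | { blocked: true; kind: string; action: string };

// Injected dependencies keep the loop independent of any browser or model.
interface InteractionDeps {
  inspect: () => Promise<BlockerReport>;      // e.g. screenshot + vision model
  perform: (action: string) => Promise<void>; // e.g. click or type in the page
}

async function clearBlockers(
  deps: InteractionDeps,
  maxInteractionAttempts: number
): Promise<{ cleared: boolean; attempts: number }> {
  for (let attempts = 0; attempts < maxInteractionAttempts; attempts++) {
    const report = await deps.inspect();
    if (!report.blocked) {
      return { cleared: true, attempts };
    }
    await deps.perform(report.action);
  }
  // Budget exhausted: check once more so the last action is accounted for.
  const finalReport = await deps.inspect();
  return { cleared: !finalReport.blocked, attempts: maxInteractionAttempts };
}
```

Capping the loop matters because a blocker such as a hard paywall may never go away, and every iteration costs a model call.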

Content extraction with Readability

Extracts the main page content using Mozilla's Readability.

HTML to Markdown conversion with Turndown

Converts the extracted HTML into well-formatted Markdown.

Structured content handling

Provides special handling for code blocks, tables, and other structured content.
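To make "special handling" concrete, here is a minimal, dependency-free sketch of two things a converter has to get right: escaping cell content in tables, and choosing a code fence that the code itself cannot close. It is not the project's Turndown configuration; the function names are invented for illustration.

```typescript
// Pipes would end a cell early and newlines would break the row.
function escapeCell(text: string): string {
  return text.replace(/\|/g, "\\|").replace(/\s*\n\s*/g, " ").trim();
}

// Render a header and rows as a Markdown pipe table.
function tableToMarkdown(header: string[], rows: string[][]): string {
  const line = (cells: string[]): string =>
    `| ${cells.map(escapeCell).join(" | ")} |`;
  const separator = `| ${header.map(() => "---").join(" | ")} |`;
  // Pad or truncate each row to the header's column count.
  const normalized = rows.map((row) => header.map((_, i) => row[i] ?? ""));
  return [line(header), separator, ...normalized.map(line)].join("\n");
}

// Wrap code in a fence longer than any backtick run it contains.
function fencedCode(code: string, language: string = ""): string {
  const runs = code.match(/`+/g) ?? [];
  const longest = Math.max(0, ...runs.map((run) => run.length));
  const fence = "`".repeat(Math.max(3, longest + 1));
  return `${fence}${language}\n${code.replace(/\n$/, "")}\n${fence}`;
}

console.log(tableToMarkdown(["Name", "Value"], [["a|b", "1"], ["c"]]));
console.log(fencedCode("const x = 1;\n", "ts"));
```

Without this kind of care, a stray pipe or a backtick run inside the source page can corrupt the rest of the Markdown output.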

Model Context Protocol accessibility

Exposed as an MCP tool (scrape-webpage), so any MCP-compatible client can call it.

Real-time browser view option

Allows viewing the browser interaction in real-time by disabling headless mode.
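The environment variables named in the Overview could be read into a typed configuration along the following lines. This is a sketch under several assumptions that the listing does not confirm: that OPENAI_API_KEY is required, that TRANSPORT_TYPE takes the lowercase values "stdio", "sse", and "http" and falls back to "stdio", and that DISABLE_HEADLESS is enabled with the string "true". The environment is passed in as a plain object so the function is easy to test.

```typescript
// Assumed spellings of the three transport modes mentioned in the overview.
const TRANSPORTS = ["stdio", "sse", "http"] as const;
type Transport = (typeof TRANSPORTS)[number];

function isTransport(value: string): value is Transport {
  return (TRANSPORTS as readonly string[]).indexOf(value) !== -1;
}

interface ServerConfig {
  openaiApiKey: string;
  visionModel?: string;
  apiBaseUrl?: string;
  transport: Transport;
  headless: boolean;
}

// Hypothetical loader; defaults and accepted values are assumptions.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const openaiApiKey = env.OPENAI_API_KEY;
  if (!openaiApiKey) {
    throw new Error("OPENAI_API_KEY is required");
  }
  const transport = (env.TRANSPORT_TYPE ?? "stdio").toLowerCase();
  if (!isTransport(transport)) {
    throw new Error(`Unsupported TRANSPORT_TYPE: ${transport}`);
  }
  return {
    openaiApiKey,
    visionModel: env.VISION_MODEL,
    apiBaseUrl: env.API_BASE_URL,
    transport,
    // Headless stays on unless DISABLE_HEADLESS is explicitly "true".
    headless: env.DISABLE_HEADLESS !== "true",
  };
}

// Example: watch the browser while debugging a stubborn overlay.
const debugConfig = loadConfig({ OPENAI_API_KEY: "sk-example", DISABLE_HEADLESS: "true" });
console.log(debugConfig.headless); // false
```

In a real process the argument would typically be process.env.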

NPX-ready distribution

Runs directly via npx, with no need to clone the repository.

Audience

LLM developers: Integrate this MCP server as a tool to scrape web content and render it as Markdown within an MCP-compatible orchestrator.
MCP orchestrator engineers: Configure and manage this server as an external MCP server accessed via NPX, supporting stdio/SSE/HTTP transports.

Tags

puppeteer, vision, mcp, web-scraping, markdown, readability, turndown, cookies, captchas, interactive-elements, npx