scrapling-fetch

An MCP server enabling AI to fetch bot-protected web content via page fetching and pattern extraction.

Stars: 57
Forks: 9
Releases: 0

Overview

scrapling-fetch-mcp is an MCP server that lets AI assistants read text content from websites that use bot-detection measures. It targets low-volume retrieval of documentation and reference materials (text/HTML only) and is not intended for broad data harvesting.

The server exposes two Claude-friendly tools. Page fetching retrieves complete web pages, with pagination for content that spans multiple pages. Pattern extraction uses regular expressions to locate and extract specific content. The AI chooses the tool based on the user's natural-language request, so the user never has to deal with scraping details.

Three protection modes provide increasing levels of bot-detection bypass: basic, stealth, and max-stealth. Each level handles tougher defenses at the cost of higher latency.

Setup typically involves configuring Claude Desktop and installing browser binaries, which is a large initial download. The project is built on Scrapling, which provides the bot-detection bypass.

Limitations include text-only output, possible authentication barriers, and performance that varies with site complexity and protection level. The server aims to bridge the gap between content visible in a browser and text accessible to an AI.

Details

Owner: cyberchitta
Language: Python
License: Apache License 2.0
Updated: 2025-12-07

Features

Page fetching

Retrieves complete web pages with support for pagination to gather multi-page content.
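
One plausible reading of the pagination described above is that a long page is returned in fixed-size windows, with each response telling the caller where to resume. The following is a minimal sketch under that assumption, not the server's actual implementation; `Page`, `paginate`, `read_all`, `start_index`, and `max_length` are hypothetical names chosen for illustration.

```python
from typing import List, NamedTuple, Optional


class Page(NamedTuple):
    content: str               # the slice of text returned for this request
    next_index: Optional[int]  # offset to request next, or None when finished


def paginate(text: str, start_index: int = 0, max_length: int = 5000) -> Page:
    """Return one window of `text` plus the offset of the following window."""
    if start_index < 0 or max_length <= 0:
        raise ValueError("start_index must be >= 0 and max_length must be > 0")
    end = start_index + max_length
    chunk = text[start_index:end]
    # Only advertise a next offset when unread text remains.
    next_index = end if end < len(text) else None
    return Page(chunk, next_index)


def read_all(text: str, max_length: int = 5000) -> List[str]:
    """Walk every window in order, as a client would across repeated calls."""
    pages: List[str] = []
    index: Optional[int] = 0
    while index is not None:
        page = paginate(text, index, max_length)
        pages.append(page.content)
        index = page.next_index
    return pages
```

Returning the next offset alongside the content lets the caller decide whether to keep reading, which suits an assistant that may find its answer on the first window.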

Pattern extraction

Regex-based extraction to locate and retrieve specific content from pages.
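
The idea of regex-based extraction can be illustrated with Python's standard `re` module. This is a sketch only: `extract_matches` and its `context_chars` parameter are hypothetical names, and the real tool may return matches in a different shape.

```python
import re
from typing import List


def extract_matches(text: str, pattern: str, context_chars: int = 0) -> List[str]:
    """Return each regex match, padded with up to `context_chars` on both sides."""
    results: List[str] = []
    for match in re.finditer(pattern, text):
        # Clamp the padded window to the bounds of the text.
        start = max(0, match.start() - context_chars)
        end = min(len(text), match.end() + context_chars)
        results.append(text[start:end])
    return results
```

For example, calling `extract_matches(page_text, r"pip install \w+")` on a fetched installation guide would return only the install commands rather than the whole page.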

Automatic tool selection

AI decides which tool to use based on the user’s natural-language request.

Protection modes

Supports basic, stealth, and max-stealth bot-detection bypass with escalating latency.
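
Because latency rises with each mode, one reasonable calling pattern is to start with the fastest mode and escalate only when a request is blocked. The sketch below assumes that policy; it is not taken from the project. `fetch_with_escalation` is a hypothetical helper, and `fetch` stands in for whatever function actually performs the request.

```python
from typing import Callable, Optional, Tuple

# Mode names as listed in the description above, ordered from fastest to slowest.
MODES = ("basic", "stealth", "max-stealth")


def fetch_with_escalation(
    url: str,
    fetch: Callable[[str, str], Optional[str]],
) -> Tuple[str, str]:
    """Try each mode in order and return (mode, content) for the first success."""
    for mode in MODES:
        content = fetch(url, mode)
        if content:  # treat None or empty text as a blocked request
            return mode, content
    raise RuntimeError(f"all protection modes failed for {url}")
```

Passing the fetcher in as a callable keeps the escalation policy separate from the networking code, so the policy can be tested without any browser installed.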

Claude Desktop integration

Configurable via Claude Desktop MCP settings (example provided in setup).
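
Claude Desktop reads MCP servers from an `mcpServers` object in its JSON configuration file. The sketch below generates such an entry. The launch command (`uvx` with the package name `scrapling-fetch-mcp`) and the entry name `scrapling-fetch` are assumptions here, so check the project's setup instructions for the exact values.

```python
import json

# Assumption: the server is launched with `uvx scrapling-fetch-mcp`.
# The entry name "scrapling-fetch" is also an illustrative choice.
config = {
    "mcpServers": {
        "scrapling-fetch": {
            "command": "uvx",
            "args": ["scrapling-fetch-mcp"],
        }
    }
}

# Serialize the entry so it can be merged into the Claude Desktop config file.
config_json = json.dumps(config, indent=2)
print(config_json)
```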

Low-volume optimization

Optimized for low-volume retrieval of documentation and reference materials (text/HTML only).

Browser binaries prerequisite

Requires installing browser binaries before first use, which involves a large initial download.

Audience

AI assistants: Access bot-protected web content for documentation, tutorials, and reference materials.
Developers: Fetch bot-protected text for AI chat workflows.

Tags

bot-detection, web-content, MCP, documentation, regex, pagination, text, HTML, low-volume, browser-binaries, Scrapling, Claude Desktop