Supadata

Official MCP server for Supadata - YouTube, TikTok, X and Web data for makers.

Stars: 20
Forks: 6
Releases: 0

Overview

This MCP server exposes Supadata's video and web scraping capabilities through an MCP interface. It extracts transcripts from major video platforms (YouTube, TikTok, Instagram, Twitter) and from file URLs, and it supports web scraping, crawling, and URL discovery.

The server retries rate-limited requests automatically with exponential backoff, and it relies on Supadata's built-in rate limiting and batch processing for efficient parallel operations. Its MCP tools (transcript, transcript status, scrape, map, crawl, and crawl status) cover both synchronous and asynchronous data collection, discovery, and content extraction.

Configuration is driven by environment variables (e.g., SUPADATA_API_KEY) and works across multiple deployment contexts, including Cursor, Windsurf, Smithery, VS Code, Claude Desktop, and other integration points. The default configuration defines retry and backoff behavior, and guidance is provided for keeping large crawl responses within sensible limits. Outputs are returned in text, Markdown, or HTML format where applicable, and long-running tasks return job IDs that can be monitored through status endpoints.

The server is designed for developers and makers who need reliable video transcripts and site-wide data for analysis, automation, and content workflows.

Details

Owner: supadata-ai
Language: JavaScript
License: MIT License
Updated: 2025-12-07

Features

Transcript extraction across major platforms

Extract transcripts from YouTube, TikTok, Instagram, Twitter, and file URLs with language options and flexible formatting.

Web scraping, crawling, and discovery

Perform web scraping, site crawling, and URL discovery to collect content and map site structures.

Automatic retries and rate limiting

Automatically handle rate-limited requests with exponential backoff and retries for transient errors.
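
The retry behavior described here can be illustrated with a minimal sketch of exponential backoff. The setting names (`maxAttempts`, `initialDelayMs`, `maxDelayMs`, `backoffFactor`), their values, and the helper functions are hypothetical and chosen only for illustration; they are not taken from the server's actual defaults or source code.

```javascript
// Hypothetical retry settings; names and values are illustrative assumptions,
// not the server's documented defaults.
const retryConfig = {
  maxAttempts: 3,
  initialDelayMs: 1000,
  maxDelayMs: 10000,
  backoffFactor: 2,
};

// Delay to wait after failed attempt number `attempt` (1-based).
// The delay grows geometrically and is capped at maxDelayMs.
function backoffDelay(attempt, config = retryConfig) {
  const delay = config.initialDelayMs * Math.pow(config.backoffFactor, attempt - 1);
  return Math.min(delay, config.maxDelayMs);
}

// Runs `operation` and retries it while `isRetryable(error)` returns true,
// up to config.maxAttempts attempts in total. `sleep` is injectable so the
// wait can be replaced in tests.
async function withRetry(
  operation,
  isRetryable,
  config = retryConfig,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on the last attempt or on errors that are not transient.
      if (attempt >= config.maxAttempts || !isRetryable(error)) throw error;
      await sleep(backoffDelay(attempt, config));
    }
  }
}
```

With these assumed settings, the waits after the first and second failed attempts are 1000 ms and 2000 ms, and the third failure is rethrown to the caller.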

Built-in rate limiting and batch processing

Leverages Supadata's rate limiting and batch processing for efficient parallel operations and smart queuing.

MCP toolset (transcript, status, scrape, map, crawl, check crawl status)

Provides a suite of MCP tools to manage transcripts, page scraping, URL mapping, and multi-page crawling with status endpoints.

Asynchronous crawl with status checks

Crawl jobs run asynchronously and return an operation ID; monitor progress with status endpoints (e.g., supadata_check_crawl_status).
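
The asynchronous pattern described above (start a job, keep the returned ID, poll until it finishes) might look like the following sketch. The status strings, the `status` field on the result, and the `pollJob` and `isTerminal` helpers are assumptions made for illustration; the real response shape of `supadata_check_crawl_status` may differ. The caller supplies `checkStatus`, so the sketch does not depend on any particular client library.

```javascript
// Assumed terminal status values; the real tool's statuses may differ.
const TERMINAL_STATUSES = new Set(["completed", "failed", "cancelled"]);

function isTerminal(status) {
  return TERMINAL_STATUSES.has(status);
}

// Polls a job until it reaches a terminal status or the poll budget runs out.
// `checkStatus` is a caller-supplied async function that takes a job ID and
// resolves to an object with a `status` field (an assumption for this sketch).
async function pollJob(jobId, checkStatus, options = {}) {
  const {
    intervalMs = 2000,
    maxPolls = 30,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let poll = 1; poll <= maxPolls; poll++) {
    const result = await checkStatus(jobId);
    if (isTerminal(result.status)) return result;
    // Wait between checks, but not after the final one.
    if (poll < maxPolls) await sleep(intervalMs);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxPolls} status checks`);
}
```

Bounding the number of polls keeps a stalled job from blocking the caller indefinitely; the interval and limit shown are placeholders to tune per workload.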

Cross-platform deployment and configuration

Configurable via environment variables and deployable across Cursor, Windsurf, Smithery, VS Code, Claude Desktop, and other clients.
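
As a rough illustration of environment-driven configuration, the sketch below reads `SUPADATA_API_KEY` and fails early when it is missing. The `loadConfig` helper and the shape of the object it returns are hypothetical; only the variable name comes from the description above.

```javascript
// Reads the API key from an environment-like object (process.env by default).
// The helper name and return shape are assumptions for this sketch.
function loadConfig(env = process.env) {
  const apiKey = (env.SUPADATA_API_KEY || "").trim();
  if (!apiKey) {
    throw new Error("SUPADATA_API_KEY is not set");
  }
  return { apiKey };
}
```

Accepting the environment as a parameter makes the same function usable whether the key is set in a shell profile or passed in by an MCP client's configuration.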

Developer-friendly installation and integration options

Multiple installation and integration paths (npx, npm, Smithery) with configuration examples for common environments.

Audience

Makers

Leverage MCP to fetch video transcripts and web data for content creation and analysis.

Tags

video transcripts, transcripts, web scraping, web crawling, URL discovery, rate limiting, retry, MCP, Cursor, Windsurf, Smithery, VS Code, Claude Desktop