Unstructured


Set up and interact with your unstructured data processing workflows in Unstructured Platform

Stars: 37
Forks: 20
Releases: 0

Overview

An MCP server implementation for interacting with the Unstructured API. It exposes tools for managing connectors and workflows programmatically: list, inspect, create, update, and delete source and destination connectors; list, inspect, create, run, update, and delete workflows; and list, inspect, and cancel the jobs those workflows produce. Supported sources are S3, Azure, Google Drive, OneDrive, Salesforce, and Sharepoint. Supported destinations are S3, Weaviate, Pinecone, AstraDB, MongoDB, Neo4j, Databricks Volumes, and Databricks Volumes Delta Table. Once a workflow is run, its progress can be monitored through the associated jobs.

Connector credentials are supplied through environment variables defined in a .env file, and the project provides a list of the required keys. The Firecrawl integration provides HTML crawling and LLM-optimized text generation, with results uploaded to configured S3 locations.

The server can be run with standard Python tools and supports two transports, SSE and stdio, for client interaction. Debugging is supported through the MCP Inspector and configurable logging. This MCP server targets developers and data engineers who build, configure, and monitor Unstructured Platform pipelines.

Details

Owner
Unstructured-IO
Language
Jupyter Notebook
License
Updated
2025-12-07

Features

Source connectors management

List, inspect, create, update, and delete the source connectors that ingest data into the Unstructured Platform.

Destination connectors management

List, inspect, create, update, and delete the destination connectors that receive processed output.

Workflow management

List, inspect, create, run, update, and delete workflows.

Job management

List jobs, get job info, and cancel jobs associated with workflows.
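
The job tools above lend themselves to a simple polling loop when a client wants to wait for a workflow run to finish. The sketch below is a generic illustration, not the server's own code: `wait_for_job`, the injected `get_status` callable, and the status names in `TERMINAL` are all assumptions made for the example, and the real job statuses may differ.

```python
import time
from typing import Callable

# Assumed status names for illustration only; the real job statuses may differ.
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_job(
    get_status: Callable[[], str],
    interval: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until a terminal status appears or max_polls is reached.

    get_status is a hypothetical callable that wraps whatever "get job info"
    call the client has available and returns the job's status string.
    """
    status = "UNKNOWN"
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL:
            return status
        sleep(interval)  # wait before asking again
    raise TimeoutError(f"job still {status!r} after {max_polls} polls")
```

Injecting `get_status` and `sleep` keeps the loop independent of any particular client library and makes it easy to exercise with canned statuses.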

Advanced workflow visibility

List workflows with finished job details, including associated source and destination information.

Broad connector compatibility

Supported sources: S3, Azure, Google Drive, OneDrive, Salesforce, and Sharepoint. Supported destinations: S3, Weaviate, Pinecone, AstraDB, MongoDB, Neo4j, Databricks Volumes, and Databricks Volumes Delta Table.
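
As a quick reference, the connector names listed on this page can be arranged by role. The sketch below is only a lookup table built from those names; `SOURCES`, `DESTINATIONS`, and `roles` are hypothetical helpers for illustration, and the strings are display names from this listing, not identifiers accepted by the Unstructured API.

```python
# Connector names as listed in the overview of this page.
# These are display names, not API identifiers (an assumption of this sketch).
SOURCES = {"S3", "Azure", "Google Drive", "OneDrive", "Salesforce", "Sharepoint"}
DESTINATIONS = {
    "S3", "Weaviate", "Pinecone", "AstraDB", "MongoDB", "Neo4j",
    "Databricks Volumes", "Databricks Volumes Delta Table",
}


def roles(connector: str) -> list[str]:
    """Return the roles ("source", "destination") a connector name is listed under."""
    found = []
    if connector in SOURCES:
        found.append("source")
    if connector in DESTINATIONS:
        found.append("destination")
    return found


if __name__ == "__main__":
    for name in sorted(SOURCES | DESTINATIONS):
        print(f"{name}: {', '.join(roles(name))}")
```

In this listing, S3 is the only connector that appears as both a source and a destination.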

Environment-based credentials

Credentials required for connectors are defined in a .env file with documented mappings to each connector.
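
A pre-flight check of the .env file can catch missing credentials before the server starts. The sketch below is a minimal standard-library illustration, not part of the project: `parse_env` and `missing_keys` are hypothetical helpers, and the key names in `REQUIRED_KEYS` are placeholders, so consult the project's own list of required keys for the real names.

```python
from pathlib import Path

# Placeholder key names for illustration; the project's documented list is authoritative.
REQUIRED_KEYS = ["UNSTRUCTURED_API_KEY", "AWS_KEY", "AWS_SECRET"]


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    print("missing:", missing_keys(env))
```

Only the connectors actually in use need their credentials set, so in practice the required list would be narrowed to match the configured sources and destinations.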

Firecrawl integration

Firecrawl is available as a source, providing HTML crawling and LLM-optimized text generation with job status checks; results are uploaded to S3.

Audience

Developers: Build and manage source/destination connectors and workflows for the Unstructured Platform.

Tags

Unstructured API, MCP server, sources, destinations, workflows, connectors, S3, Azure, Google Drive, OneDrive, Salesforce, Sharepoint, Weaviate, Pinecone, AstraDB, MongoDB, Neo4j, Databricks, Firecrawl, SSE, Stdio, Env vars, Debugging, Minimal Client