Mobile MCP

MCP server enabling scalable mobile automation across iOS and Android via a unified interface.

Stars: 2,505
Forks: 230
Releases: 20

Overview

Mobile Next MCP server is a Model Context Protocol server that enables scalable mobile automation across iOS and Android through a single, platform-agnostic API. It runs on emulators, simulators, and real devices, allowing agents and LLMs to interact with native apps via structured accessibility snapshots or coordinate-based taps derived from screenshots.

The server uses native accessibility trees for most interactions and falls back to screenshot-driven analysis when necessary. It is LLM-friendly: the Accessibility (Snapshot) path needs no separate computer vision model, and its deterministic, data-driven actions reduce ambiguity.

Core capabilities cover device management (list devices, screen size, orientation), app management (install, launch, terminate, uninstall), screen interaction (take screenshots, list elements, click at coordinates, swipe, long-press, etc.), and navigation (type text, open URLs, press hardware buttons). The same API works across iOS and Android, with headless operation available for simulators/emulators when no real device is connected.

Prerequisites include Xcode tools, Android Platform Tools, Node.js, and MCP-compliant agents.
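An MCP-compliant agent reaches capabilities like those above by sending JSON-RPC 2.0 messages, with tool invocations going through the protocol's `tools/call` method. The sketch below only builds such a request envelope. The tool name `example_click_at_coordinates` and its `x`/`y` arguments are hypothetical placeholders rather than names confirmed for this server, and the helper `buildToolCall` is illustrative.

```typescript
// Minimal sketch of an MCP "tools/call" request envelope (JSON-RPC 2.0).
// The tool and argument names passed in below are hypothetical placeholders;
// the server's own tool listing is the source of truth for real names.

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Each request gets a fresh id so responses can be matched to requests.
let nextRequestId = 1;

function buildToolCall(
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical usage: ask the server to tap a point on the screen.
const tapRequest = buildToolCall("example_click_at_coordinates", { x: 120, y: 480 });
console.log(JSON.stringify(tapRequest, null, 2));
```

In practice an MCP client library would normally build and transport these messages; the envelope is shown only to make the shape of a tool call concrete.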

Details

Owner: mobile-next
Language: TypeScript
License: Apache License 2.0
Updated: 2025-12-07

Features

Fast and lightweight

Uses native accessibility trees for most interactions, or screenshot-based coordinates where accessibility labels are not available.

LLM-friendly

No computer vision model required in the Accessibility (Snapshot) path.

Visual Sense

Evaluates what’s rendered on screen to decide the next action; falls back to screenshot-based analysis if accessibility data or view-hierarchy coordinates are unavailable.
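The accessibility-first, screenshot-second decision can be sketched as a small planning function. This is a sketch under stated assumptions, not the server's actual logic: the `ScreenElement` shape, the `planTap` helper, and the `NextAction` variants are all hypothetical names introduced for illustration.

```typescript
// Sketch: prefer structured accessibility data, fall back to screenshots.
// ScreenElement, NextAction and planTap are assumed shapes for illustration.

interface Rect { x: number; y: number; width: number; height: number }
interface ScreenElement { label?: string; rect?: Rect }

type NextAction =
  | { kind: "tap"; x: number; y: number }
  | { kind: "screenshot-fallback"; reason: string };

function planTap(elements: ScreenElement[], targetLabel: string): NextAction {
  const match = elements.find((el) => el.label === targetLabel);
  if (!match) {
    // No accessibility label matches: the caller must analyse a screenshot.
    return { kind: "screenshot-fallback", reason: `no element labeled "${targetLabel}"` };
  }
  if (!match.rect) {
    // Label found, but the hierarchy gave no coordinates to tap.
    return { kind: "screenshot-fallback", reason: "element has no coordinates" };
  }
  // Deterministic tap target: the centre of the element's bounding box.
  return {
    kind: "tap",
    x: Math.round(match.rect.x + match.rect.width / 2),
    y: Math.round(match.rect.y + match.rect.height / 2),
  };
}

const snapshot: ScreenElement[] = [
  { label: "Sign in", rect: { x: 40, y: 600, width: 300, height: 48 } },
  { label: "Help" }, // labeled, but without coordinates
];

console.log(planTap(snapshot, "Sign in"));  // tap at x = 190, y = 624
console.log(planTap(snapshot, "Help"));     // fallback: no coordinates
console.log(planTap(snapshot, "Settings")); // fallback: no such label
```

Because the tap point is computed from structured data instead of being inferred from pixels, the same snapshot always yields the same action.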

Deterministic tool application

Reduces ambiguity found in purely screenshot-based approaches by relying on structured data whenever possible.

Extract structured data

Enables you to extract structured data from anything visible on screen.

Cross-platform Unified API

A single API works across both iOS and Android, simplifying automation and integration.
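To illustrate what a unified API buys the caller, here is a minimal sketch in which one automation routine runs unchanged against two platforms. The `MobileDriver` interface, the `RecordingDriver` class, and the `fillUsername` routine are hypothetical and only record what they are asked to do; they do not reflect this server's internals.

```typescript
// Sketch of a platform-agnostic facade. All names here are hypothetical;
// the drivers only log requested actions instead of driving a real device.

type Platform = "ios" | "android";

interface MobileDriver {
  readonly platform: Platform;
  tap(x: number, y: number): void;
  typeText(text: string): void;
}

class RecordingDriver implements MobileDriver {
  readonly platform: Platform;
  readonly log: string[] = [];

  constructor(platform: Platform) {
    this.platform = platform;
  }

  tap(x: number, y: number): void {
    this.log.push(`${this.platform}: tap ${x},${y}`);
  }

  typeText(text: string): void {
    this.log.push(`${this.platform}: type "${text}"`);
  }
}

// The routine depends only on the interface, never on the platform.
function fillUsername(driver: MobileDriver, username: string): void {
  driver.tap(100, 200); // focus the (assumed) username field
  driver.typeText(username);
}

const drivers = [new RecordingDriver("ios"), new RecordingDriver("android")];
for (const driver of drivers) {
  fillUsername(driver, "alice");
}
console.log(drivers.map((d) => d.log));
```

The calling code never branches on the platform, which is the property a single cross-platform API is meant to provide.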

Agent/LLM integration

Supports integration with AI agents and LLM-driven workflows for mobile automation and data extraction.

Headless/Background operation

Supports running in headless mode on simulators/emulators when no real device is connected.

Audience

Developers: Automate iOS/Android apps with a single, platform-agnostic API.
QA engineers: Orchestrate end-to-end mobile tests across devices via MCP tools and LLMs.
AI/LLM integrators: Integrate MCP with AI agents to automate mobile interactions and data extraction.

Tags

Mobile MCP, MCP server, Mobile automation, iOS, Android, Simulators, Emulators, Real devices, Accessibility snapshots, Coordinate-based taps, LLM-friendly, Screenshot-based analysis, Headless mode, WebDriverAgent, UIAutomator, ADB, Cross-platform, Platform-agnostic