Overview
Features
One-Click Out-of-the-box CLI
Runs either as a headful Web UI or as a headless server, and can be started quickly via npx or a global npm installation.
Hybrid Browser Agent
Controls browsers through a GUI Agent, the DOM, or a hybrid of the two for versatile automation.
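As a rough illustration of what a hybrid strategy can mean, the sketch below tries a DOM selector first and falls back to a vision-based click when no selector matches. Every name here (`Strategy`, `ClickTarget`, `BrowserBackends`, `click`) is hypothetical and chosen only for illustration. The project's actual interfaces may look quite different.

```typescript
// Hypothetical names throughout; this is not the project's actual API.
type Strategy = "dom" | "visual" | "hybrid";

interface ClickTarget {
  selector?: string;    // used by the DOM path, when a selector is known
  description: string;  // used by the visual (GUI agent) path
}

interface BrowserBackends {
  clickBySelector(selector: string): boolean;
  clickByVision(description: string): boolean;
}

type ClickOutcome = "dom" | "visual" | "failed";

function click(
  strategy: Strategy,
  target: ClickTarget,
  backends: BrowserBackends,
): ClickOutcome {
  const tryDom = (): boolean =>
    target.selector !== undefined && backends.clickBySelector(target.selector);
  const tryVisual = (): boolean => backends.clickByVision(target.description);

  if (strategy === "dom") return tryDom() ? "dom" : "failed";
  if (strategy === "visual") return tryVisual() ? "visual" : "failed";

  // Hybrid: prefer the DOM path, fall back to the visual path.
  if (tryDom()) return "dom";
  return tryVisual() ? "visual" : "failed";
}

// Stub backends standing in for a real browser, so the sketch runs on its own.
const stubBackends: BrowserBackends = {
  clickBySelector: (selector) => selector === "#submit",
  clickByVision: (description) => description.length > 0,
};

const viaDom = click(
  "hybrid",
  { selector: "#submit", description: "Submit button" },
  stubBackends,
);
const viaVisual = click("hybrid", { description: "Canvas play button" }, stubBackends);
const domOnly = click("dom", { description: "Canvas play button" }, stubBackends);
```

With these stubs, the first call resolves through the DOM path, the second falls back to the visual path because no selector is given, and the third fails because the DOM-only strategy has nothing to match.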
Event Stream
A protocol-driven Event Stream powers both Context Engineering and the Agent UI, giving each a structured data flow to consume.
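To make the idea concrete, here is a minimal sketch of a typed event stream with two consumers: a subscriber (standing in for a UI) and a function that builds model context from recent events. The event names, fields, and the `EventStream` class are assumptions made for illustration and do not reflect the project's actual protocol.

```typescript
// Hypothetical event shapes; the real protocol's event names and fields may differ.
type AgentEvent =
  | { type: "user_message"; id: number; content: string }
  | { type: "tool_call"; id: number; tool: string; args: Record<string, unknown> }
  | { type: "tool_result"; id: number; tool: string; output: string }
  | { type: "assistant_message"; id: number; content: string };

// Omit applied per union member, so each event keeps its own fields.
type DistributiveOmit<T, K extends PropertyKey> = T extends unknown ? Omit<T, K> : never;
type NewEvent = DistributiveOmit<AgentEvent, "id">;

type Listener = (event: AgentEvent) => void;

class EventStream {
  private events: AgentEvent[] = [];
  private listeners: Listener[] = [];
  private nextId = 1;

  // Register a consumer (for example a UI); returns an unsubscribe function.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Append an event, assign it an id, and notify all subscribers.
  append(event: NewEvent): AgentEvent {
    const stored = { ...event, id: this.nextId++ } as AgentEvent;
    this.events.push(stored);
    for (const listener of this.listeners) listener(stored);
    return stored;
  }

  // Render the most recent events as text lines for a model's context window.
  buildContext(maxEvents: number): string[] {
    const recent = maxEvents <= 0 ? [] : this.events.slice(-maxEvents);
    return recent.map((e) => {
      switch (e.type) {
        case "user_message":
          return `user: ${e.content}`;
        case "assistant_message":
          return `assistant: ${e.content}`;
        case "tool_call":
          return `tool_call ${e.tool} ${JSON.stringify(e.args)}`;
        case "tool_result":
          return `tool_result ${e.tool}: ${e.output}`;
      }
    });
  }
}

const stream = new EventStream();
const seenTypes: string[] = [];
stream.subscribe((e) => {
  seenTypes.push(e.type);
});

stream.append({ type: "user_message", content: "Open example.com" });
stream.append({ type: "tool_call", tool: "browser_navigate", args: { url: "https://example.com" } });
stream.append({ type: "tool_result", tool: "browser_navigate", output: "ok" });

const context = stream.buildContext(2);
```

Because both consumers read from the same append-only log, the UI and the context builder stay consistent with each other without sharing any other state.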
MCP Integration
The kernel is built on MCP and supports mounting MCP Servers to connect to real-world tools.
Who Is This For?
- Developers: to build and extend MCP-enabled multimodal agents and connect them to real-world tools.
- AI Engineers: to explore multimodal LLM workflows with GUI, Vision, and browser automation.
- Product Teams: to deploy automated tasks and workflows using the CLI or Web UI.




