Control any platform with Skills
Agent Skills are a format for extending AI coding agents with specialized capabilities. Midscene provides Agent Skills that let AI coding tools (like Claude Code, Cline, etc.) drive UI automation through CLI commands — no MCP server setup required.
Unlike MCP integration, Skills work by running CLI commands directly in the terminal. The AI agent acts as the brain: it takes screenshots, analyzes the UI, and decides which actions to perform next.
Supported platforms
Installation
Make sure Node.js is installed, then run:
Skills repository: github.com/web-infra-dev/midscene-skills
Model configuration
Midscene skills require a vision model with strong visual grounding capabilities. Configure the following environment variables — either as system environment variables or in a .env file in the current working directory (Midscene loads .env automatically):
For supported models and configuration details, see Model strategy and Common model configuration.
Use skills
In your AI chat assistant, you can use the following command to use skills:
More
Please refer to the Skills Repository for more details.

