---
name: agent-browser-automation
description: Headless browser automation CLI for AI agents using native Rust binary with Chrome DevTools Protocol
triggers:
  - automate browser with agent-browser
  - use agent-browser for web scraping
  - browser automation CLI for AI
  - take screenshot with agent-browser
  - click and fill forms with agent-browser
  - agent-browser snapshot accessibility tree
  - batch browser commands with agent-browser
  - install agent-browser for automation
---

# agent-browser

> Skill by [ara.so](https://ara.so), Daily 2026 Skills collection.

`agent-browser` is a headless browser automation CLI built in Rust, designed for AI agents. It wraps Chrome via the Chrome DevTools Protocol (CDP) and exposes a fast, ergonomic command-line interface for navigation, interaction, accessibility snapshots, screenshots, network interception, and more, with no Node.js or Playwright runtime required.

## Installation

### Recommended (npm global)

```bash
npm install -g agent-browser
agent-browser install  # Download Chrome for Testing (first time only)
```

### macOS (Homebrew)

```bash
brew install agent-browser
agent-browser install
```

### Rust / Cargo

```bash
cargo install agent-browser
agent-browser install
```

### Local project dependency

```bash
npm install agent-browser
# Add to package.json scripts or invoke via npx
```

### Linux (with system dependencies)

```bash
agent-browser install --with-deps
```

## Quick Start

```bash
agent-browser open https://example.com
agent-browser snapshot                      # Accessibility tree with @refs (best for AI)
agent-browser click @e2                     # Click by ref from snapshot
agent-browser fill @e3 "hello@example.com"  # Fill by ref
agent-browser get text @e1                  # Get text content
agent-browser screenshot page.png
agent-browser close
```

## Core Commands

### Navigation

```bash
agent-browser open <url>   # Navigate (aliases: goto, navigate)
agent-browser get url      # Get current URL
agent-browser get title    # Get page title
agent-browser close        # Close browser (aliases: quit, exit)
```

### Accessibility Snapshot (recommended for AI agents)

```bash
agent-browser snapshot     # Returns accessibility tree with @ref IDs
agent-browser snapshot -i  # Interactive / compact mode
```

Snapshot output includes `@eN` refs you can use directly:

```
@e1 [button] "Submit"
@e2 [textbox] "Email" value=""
@e3 [link] "Sign in"
```

Then act on them:

```bash
agent-browser fill @e2 "user@example.com"
agent-browser click @e1
```

### Interaction

```bash
agent-browser click <sel>                     # Click element
agent-browser dblclick <sel>                  # Double-click
agent-browser fill <sel> <text>               # Clear and fill input
agent-browser type <sel> <text>               # Type into element
agent-browser press <key>                     # Press key (Enter, Tab, Control+a)
agent-browser keyboard type <text>            # Type at current focus (real keystrokes)
agent-browser keyboard inserttext <text>      # Insert text without key events
agent-browser hover <sel>                     # Hover element
agent-browser select <sel> <value>            # Select dropdown option
agent-browser check <sel>                     # Check checkbox
agent-browser uncheck <sel>                   # Uncheck checkbox
agent-browser scroll down 500                 # Scroll (up/down/left/right, optional px)
agent-browser scroll down --selector "#feed"  # Scroll within element
agent-browser scrollintoview <sel>            # Scroll element into view
agent-browser drag <src> <target>             # Drag and drop
agent-browser upload <sel> /path/file.pdf     # Upload file
```

### Screenshots & PDF

```bash
agent-browser screenshot                           # Save to temp dir, print path
agent-browser screenshot page.png                  # Save to path
agent-browser screenshot --full page.png           # Full-page screenshot
agent-browser screenshot --annotate                # Numbered element labels overlay
agent-browser screenshot --screenshot-dir ./shots  # Custom output directory
agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80
agent-browser pdf output.pdf                       # Save page as PDF
```

### Getting Element Info

```bash
agent-browser get text <sel>         # Text content
agent-browser get html <sel>         # innerHTML
agent-browser get value <sel>        # Input value
agent-browser get attr <sel> <attr>  # Attribute value
agent-browser get count <sel>        # Count matching elements
agent-browser get box <sel>          # Bounding box
agent-browser get styles <sel>       # Computed styles
agent-browser get cdp-url            # CDP WebSocket URL
```

### State Checks

```bash
agent-browser is visible <sel>
agent-browser is enabled <sel>
agent-browser is checked <sel>
```

### Semantic Locators (find)

```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "test@example.com"
agent-browser find placeholder "Search..." fill "rust"
agent-browser find testid "login-btn" click
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
agent-browser find role textbox fill "hello" --name "Username"
```

**Actions:** `click`, `fill`, `type`, `hover`, `focus`, `check`, `uncheck`, `text`

### Waiting

```bash
agent-browser wait "#modal"                         # Wait for element visible
agent-browser wait 2000                             # Wait N milliseconds
agent-browser wait --text "Welcome back"            # Wait for text
agent-browser wait --url "**/dashboard"             # Wait for URL pattern
agent-browser wait --load networkidle               # Wait for load state
agent-browser wait --fn "window.appReady === true"  # Wait for JS condition
agent-browser wait "#spinner" --state hidden        # Wait for element to disappear
```

**Load states:** `load`, `domcontentloaded`, `networkidle`

### JavaScript Eval

```bash
agent-browser eval "document.title"
agent-browser eval "JSON.stringify(window.__STATE__)"
agent-browser eval -b "BASE64_ENCODED_JS"
echo "return document.body.innerHTML" | agent-browser eval --stdin
```

### Batch Execution (efficient multi-step)

```bash
echo '[
  ["open", "https://example.com"],
  ["snapshot", "-i"],
  ["fill", "@e2", "user@example.com"],
  ["click", "@e1"],
  ["screenshot", "result.png"]
]' | agent-browser batch --json

# Stop on first failure
agent-browser batch --bail < commands.json
```

### Tabs & Frames

```bash
agent-browser tab                 # List tabs
agent-browser tab new https://... # New tab with URL
agent-browser tab 2               # Switch to tab 2
agent-browser tab close           # Close current tab
agent-browser frame "#my-iframe"  # Switch into iframe
agent-browser frame main          # Return to main frame
```

### Cookies & Storage

```bash
agent-browser cookies
agent-browser cookies set session_id "abc123"
agent-browser cookies clear

agent-browser storage local
agent-browser storage local set theme dark
agent-browser storage local clear
agent-browser storage session set cart '{"items":[]}'
```

### Network

```bash
agent-browser network route "**/api/users" --body '{"users":[]}'  # Mock response
agent-browser network route "**/ads/**" --abort                   # Block requests
agent-browser network unroute                                     # Remove all routes
agent-browser network requests --filter api                       # View requests
agent-browser network har start
agent-browser network har stop recording.har
```

### Browser Settings

```bash
agent-browser set viewport 1280 800
agent-browser set viewport 375 812 2  # With device pixel ratio (retina)
agent-browser set device "iPhone 14"
agent-browser set geo 37.7749 -122.4194
agent-browser set offline on
agent-browser set headers '{"X-Custom":"value"}'
agent-browser set credentials admin secret
agent-browser set media dark
```

### Auth State

```bash
agent-browser state save ./auth.json  # Save cookies + localStorage
agent-browser state load ./auth.json  # Restore auth state
agent-browser state list              # List saved states
agent-browser state show auth.json    # Summary of saved state
```

### Dialogs

```bash
agent-browser dialog accept             # Accept alert/confirm/prompt
agent-browser dialog accept "My input"  # Accept prompt with text
agent-browser dialog dismiss
```

### Clipboard

```bash
agent-browser clipboard read
agent-browser clipboard write "Hello, World!"
agent-browser clipboard copy   # Ctrl+C current selection
agent-browser clipboard paste  # Ctrl+V
```

### Diff & Visual Testing

```bash
agent-browser diff snapshot                        # vs last snapshot
agent-browser diff snapshot --baseline before.txt  # vs saved file
agent-browser diff snapshot --selector "#main" --compact
agent-browser diff screenshot --baseline before.png
agent-browser diff screenshot --baseline b.png -o diff.png
agent-browser diff url https://v1.example.com https://v2.example.com
agent-browser diff url https://v1.example.com https://v2.example.com --screenshot
agent-browser diff url https://v1.example.com https://v2.example.com --selector "#content"
```

### Debug & Profiling

```bash
agent-browser trace start trace.zip
agent-browser trace stop
agent-browser profiler start
agent-browser profiler stop profile.json
agent-browser console              # View console messages
agent-browser errors               # View uncaught JS exceptions
agent-browser highlight "#button"  # Visually highlight element
agent-browser inspect              # Open Chrome DevTools
agent-browser connect 9222         # Connect to existing browser via CDP port
```

## Common Patterns

### Login flow and save session

```bash
#!/bin/bash
agent-browser open https://app.example.com/login
agent-browser fill "#email" "$LOGIN_EMAIL"
agent-browser fill "#password" "$LOGIN_PASSWORD"
agent-browser click "[type=submit]"
agent-browser wait --url "**/dashboard"
agent-browser state save ./session.json
```

### AI agent loop with snapshot-driven interaction

```bash
#!/bin/bash
agent-browser open https://app.example.com
agent-browser state load ./session.json

# Get snapshot, parse @refs, act
SNAPSHOT=$(agent-browser snapshot)
echo "$SNAPSHOT"

# Agent determines @e5 is the search box
agent-browser fill @e5 "quarterly report"
agent-browser press Enter
agent-browser wait --load networkidle
agent-browser snapshot
agent-browser screenshot results.png
```

### Batch commands from a script (JSON)

```bash
cat > commands.json << 'EOF'
[
  ["open", "https://news.ycombinator.com"],
  ["wait", "--load", "networkidle"],
  ["get", "title"],
  ["snapshot"],
  ["screenshot", "hn.png"]
]
EOF

agent-browser batch --json < commands.json
```

### Scrape with mocked network

```bash
agent-browser open https://api-heavy-app.example.com
agent-browser network route "**/api/slow-endpoint" --body '{"data":"mocked"}'
agent-browser snapshot
agent-browser network unroute
```

### Full-page screenshot with annotations

```bash
agent-browser open https://example.com
agent-browser wait --load networkidle
agent-browser screenshot --full --annotate annotated.png
```

### Connect to already-running Chrome

```bash
# Start Chrome with remote debugging
google-chrome --remote-debugging-port=9222 &

agent-browser connect 9222
agent-browser open https://example.com
agent-browser snapshot
```

### Emulate mobile device

```bash
agent-browser set device "iPhone 14"
agent-browser open https://example.com
agent-browser screenshot mobile.png
```

### HAR recording for network analysis

```bash
agent-browser open https://example.com
agent-browser network har start
agent-browser click "#load-data"
agent-browser wait --load networkidle
agent-browser network har stop session.har
```

## Selector Reference

| Format | Example | Notes |
|--------|---------|-------|
| `@ref` | `@e1`, `@e12` | From `snapshot` output; preferred for AI |
| CSS | `#id`, `.class`, `[attr=val]` | Standard CSS selectors |
| Text | `"Sign In"` | Exact text match |
| XPath | `//button[@type='submit']` | Full XPath |

## Troubleshooting

### Chrome not found

```bash
agent-browser install              # Downloads Chrome for Testing
agent-browser install --with-deps  # Linux: also installs system libs
```

### Element not found / timing issues

```bash
agent-browser wait "#my-element"       # Wait for visibility first
agent-browser wait --load networkidle  # Wait for page to settle
agent-browser wait --fn "!!document.querySelector('#app')"
```

### Selector issues: use snapshot refs instead

```bash
# Instead of fragile CSS:
agent-browser click ".btn.btn-primary.submit-form"

# Use snapshot refs:
agent-browser snapshot  # Find @e7 = [button] "Submit"
agent-browser click @e7
```

### Debug what's on the page

```bash
agent-browser screenshot debug.png  # Visual check
agent-browser snapshot              # Accessibility tree
agent-browser console               # JS console output
agent-browser errors                # Uncaught exceptions
agent-browser eval "document.readyState"
```

### Auth issues between sessions

```bash
agent-browser state save ./auth.json  # After successful login
agent-browser state load ./auth.json  # At start of next session
```

### Handling alerts/dialogs

```bash
# Set up handler BEFORE the action that triggers dialog
agent-browser dialog accept
agent-browser click "#delete-button"
```

### Performance: use batch for multi-step workflows

```bash
# Slow: one process per command
agent-browser open https://example.com
agent-browser fill "#q" "search"
agent-browser click "#submit"

# Fast: single process, multiple commands
echo '[["open","https://example.com"],["fill","#q","search"],["click","#submit"]]' \
  | agent-browser batch --json
```
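### Retrying flaky steps (sketch)

The timing issues covered under Troubleshooting can sometimes be handled by retrying a step a few times before giving up. The following is a minimal sketch of that idea, not a feature of `agent-browser`: `retry` is a hypothetical shell helper introduced here for illustration, and it assumes the wrapped command exits non-zero on failure (an assumption the guide above does not state explicitly).

```bash
#!/bin/bash
# Hypothetical helper (NOT part of agent-browser): run a command up to
# <attempts> times, sleeping <delay_seconds> between failed attempts.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts=$1 delay=$2
  shift 2
  local n=1
  while true; do
    # Run the wrapped command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "$n" -ge "$attempts" ]; then
      echo "retry: giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example usage with commands from this guide (requires agent-browser on PATH
# and assumes it returns a non-zero exit status when a wait or click fails):
#   retry 3 2 agent-browser wait "#my-element"
#   retry 3 2 agent-browser click "#my-element"
```

Keeping the helper separate from the browser commands means the same wrapper can be reused around any single step; for longer sequences, the batch mode described above is likely the better fit.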