agent-browser
by Keith-CY
my-crazy-skills is a curated collection of AI skills and automation modules organized as git submodules. It provides a centralized, extensible skill arsenal for various AI agents — including Claude, Gemini, and others.
⭐ 2🍴 0📅 Jan 25, 2026
SKILL.md
name: agent-browser
description: Use when a task needs headless browser automation (navigate, click, fill forms, extract text, take screenshots) and Playwright-based tooling should be avoided; prefer the agent-browser CLI.
agent-browser
Overview
Use agent-browser (CLI) for deterministic, scriptable headless browser automation via snapshots and stable element refs (@e1, @e2, ...).
Core Workflow
- Open a page: agent-browser open <url>
- Get a ref-based accessibility snapshot: agent-browser snapshot -i
- Interact using refs (preferred) or selectors:
  - Click: agent-browser click @e2
  - Fill: agent-browser fill @e3 "text"
  - Press: agent-browser press Enter
- Extract info:
  - Visible text: agent-browser get text @e1
  - URL/title: agent-browser get url / agent-browser get title
- Capture artifacts:
  - Screenshot: agent-browser screenshot --full path/to.png
  - Trace: agent-browser trace start / agent-browser trace stop path/to-trace.json.gz
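The workflow above can be chained into one small helper. This is a minimal sketch, not part of the skill itself: capture_page is a hypothetical function name, and it assumes each agent-browser command exits non-zero on failure.

```shell
# Sketch: open a page, print a ref-based snapshot, and save a full-page screenshot.
# capture_page is a hypothetical helper; only the agent-browser subcommands
# it calls are taken from the workflow above.
# Arguments: $1 = URL to open, $2 = screenshot output path.
capture_page() {
  # Assumption: agent-browser returns a non-zero exit status when a step fails.
  agent-browser open "$1" || return 1
  agent-browser snapshot -i || return 1
  agent-browser screenshot --full "$2"
}
```

For example, capture_page "https://example.com" path/to.png would run the three steps in order and stop at the first one that fails.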
Quick Reference
Common Flags
- Reuse a session across commands: --session <name> (or AGENT_BROWSER_SESSION=<name>)
- Machine-readable output: --json
- Debug output: --debug
- Run with a visible window (debugging): --headed
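When several flows run side by side, it can help to scope the session name to a single command instead of exporting it for the whole shell. A minimal sketch, assuming the CLI reads AGENT_BROWSER_SESSION from its environment as listed above; with_session is a hypothetical wrapper name.

```shell
# Sketch: run one agent-browser command inside a named session.
# with_session is a hypothetical wrapper; it only sets the
# AGENT_BROWSER_SESSION variable for the wrapped command.
# Arguments: $1 = session name, remaining arguments = agent-browser command.
with_session() {
  (
    # The subshell keeps the variable from leaking into the caller's environment.
    AGENT_BROWSER_SESSION="$1"
    export AGENT_BROWSER_SESSION
    shift
    agent-browser "$@"
  )
}
```

Usage might look like with_session acme-login open "https://example.com/login" followed by with_session acme-login get title --json, so both commands share the same session.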
Playwright → agent-browser Mapping
| Intent | Playwright concept | agent-browser command |
|---|---|---|
| Navigate | page.goto(url) | agent-browser open <url> |
| Snapshot | a11y snapshot / locator discovery | agent-browser snapshot -i |
| Click | locator.click() | agent-browser click <sel> or agent-browser click @eN |
| Fill | locator.fill() | agent-browser fill <sel> <text> or agent-browser fill @eN <text> |
| Type | locator.type() | agent-browser type <sel> <text> |
| Press key | page.keyboard.press() | agent-browser press <key> |
| Wait for element | page.waitForSelector() | agent-browser wait <sel> |
| Read text | locator.innerText() | agent-browser get text <sel> |
| Screenshot | page.screenshot() | agent-browser screenshot [--full] [path] |
| Evaluate JS | page.evaluate() | agent-browser eval <js> |
Examples
Login/Form Flow (ref-first)
export AGENT_BROWSER_SESSION="acme-login"
agent-browser open "https://example.com/login"
agent-browser snapshot -i
agent-browser fill @e12 "user@example.com"
agent-browser fill @e13 "correct horse battery staple"
agent-browser click @e14
agent-browser wait @e20
agent-browser get text @e20
Auth Headers (origin-scoped)
agent-browser open "https://example.com"
agent-browser --headers '{"Authorization":"Bearer YOUR_TOKEN"}' open "https://example.com/protected"
agent-browser snapshot -i --json
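To avoid pasting a token into the command line by hand, the header JSON can be built from a variable. This is a sketch under assumptions: auth_headers and open_with_token are hypothetical helper names, and the token is assumed to contain no characters that need JSON escaping (such as double quotes or backslashes).

```shell
# Sketch: build the Authorization header JSON from a token value.
# auth_headers and open_with_token are hypothetical helpers.
# Assumption: the token contains no characters that need JSON escaping.
auth_headers() {
  printf '{"Authorization":"Bearer %s"}' "$1"
}

# Arguments: $1 = URL to open, $2 = bearer token.
open_with_token() {
  agent-browser --headers "$(auth_headers "$2")" open "$1"
}
```

With a token stored in a variable of your choosing, the call would be open_with_token "https://example.com/protected" "$TOKEN".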
Common Mistakes
- Not using --session when a multi-step flow spans multiple commands.
- Clicking with brittle CSS selectors when a snapshot -i ref is available.
- Forgetting --json when downstream steps require structured parsing.
- In sandboxed environments, the agent-browser daemon may fail to start due to restricted writes to user cache/home; rerun with the necessary filesystem permissions.
- If the browser can’t launch, run agent-browser install (may require network access).
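The last point can be folded into a retry helper. This is only a sketch: open_with_install_retry is a hypothetical name, and it assumes that a failed open exits non-zero and may indicate a missing browser, which the CLI does not necessarily distinguish from other failures.

```shell
# Sketch: try to open a page; on failure, run the install step once and retry.
# open_with_install_retry is a hypothetical helper.
# Assumption: a non-zero exit from "open" may mean the browser could not launch.
# Argument: $1 = URL to open.
open_with_install_retry() {
  agent-browser open "$1" && return 0
  # The install step may require network access.
  agent-browser install || return 1
  agent-browser open "$1"
}
```

If the second attempt also fails, the helper returns that failure so the caller can check sandbox or filesystem permissions instead.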
Score
Total Score
75/100
Based on repository quality metrics
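✓SKILL.md
Includes a SKILL.md file
+20
✓LICENSE
A license is set
+10
✓Description
Description is at least 100 characters long
+10
○Popularity
100 or more GitHub stars
0/15
✓Recent activity
Updated within the last month
+10
○Forks
Forked 10 or more times
0/5
✓Issue management
Fewer than 50 open issues
+5
✓Language
A programming language is set
+5
✓Tags
At least one tag is set
+5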

