agent-browser

by Keith-CY

my-crazy-skills is a curated collection of AI skills and automation modules organized as git submodules. It provides a centralized, extensible skill arsenal for various AI agents — including Claude, Gemini, and others.

Stars: 2 · Forks: 0 · Date: January 25, 2026

SKILL.md


name: agent-browser
description: Use when a task needs headless browser automation (navigate, click, fill forms, extract text, take screenshots) and Playwright-based tooling should be avoided; prefer the agent-browser CLI.

agent-browser

Overview

Use agent-browser (CLI) for deterministic, scriptable headless browser automation via snapshots and stable element refs (@e1, @e2, ...).

Core Workflow

  1. Open a page: agent-browser open <url>
  2. Get a ref-based accessibility snapshot: agent-browser snapshot -i
  3. Interact using refs (preferred) or selectors:
    • Click: agent-browser click @e2
    • Fill: agent-browser fill @e3 "text"
    • Press: agent-browser press Enter
  4. Extract info:
    • Visible text: agent-browser get text @e1
    • URL/title: agent-browser get url / agent-browser get title
  5. Capture artifacts:
    • Screenshot: agent-browser screenshot --full path/to.png
    • Trace: agent-browser trace start / agent-browser trace stop path/to-trace.json.gz
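The steps above can be strung together in a small shell function. This is a minimal sketch, not part of agent-browser itself: page_summary is a hypothetical helper name, and AGENT_BROWSER_BIN is an assumed override variable (not an agent-browser feature) that only exists here so the binary can be swapped for a stub. The --session flag is placed before the subcommand, following the placement of --headers in the Auth Headers example.

```shell
# page_summary SESSION URL [SCREENSHOT_PATH]
# Hypothetical helper: opens URL in the named session, prints the ref-based
# snapshot, the page title and the current URL, and optionally saves a
# full-page screenshot. Stops at the first failing command.
page_summary() {
    # AGENT_BROWSER_BIN is an assumed override, not an agent-browser setting.
    ab="${AGENT_BROWSER_BIN:-agent-browser}"
    session="$1"
    url="$2"
    shot="${3:-}"

    "$ab" --session "$session" open "$url" || return 1
    "$ab" --session "$session" snapshot -i || return 1
    "$ab" --session "$session" get title || return 1
    "$ab" --session "$session" get url || return 1
    if [ -n "$shot" ]; then
        "$ab" --session "$session" screenshot --full "$shot" || return 1
    fi
}
```

A call such as page_summary demo "https://example.com" page.png would run all five steps against one session; the refs printed by the snapshot step are what later click and fill commands would target.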

Quick Reference

Common Flags

  • Reuse a session across commands: --session <name> (or AGENT_BROWSER_SESSION=<name>)
  • Machine-readable output: --json
  • Debug output: --debug
  • Run with a visible window (debugging): --headed
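As a rough illustration of the session flag and its environment-variable form, the sketch below sets AGENT_BROWSER_SESSION for a single invocation only, leaving the caller's environment untouched. with_session is a hypothetical helper name, and AGENT_BROWSER_BIN is an assumed override variable used only to make the binary replaceable; neither comes from agent-browser.

```shell
# with_session NAME ARGS...
# Hypothetical helper: runs one agent-browser command with
# AGENT_BROWSER_SESSION set for that invocation only.
with_session() {
    name="$1"
    shift
    # The assignment prefix scopes the variable to this one command.
    AGENT_BROWSER_SESSION="$name" "${AGENT_BROWSER_BIN:-agent-browser}" "$@"
}
```

For instance, with_session acme-login open "https://example.com/login" followed by with_session acme-login snapshot -i --json would keep both commands in the same named session while requesting machine-readable output from the second.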

Playwright → agent-browser Mapping

| Intent | Playwright concept | agent-browser command |
| --- | --- | --- |
| Navigate | page.goto(url) | agent-browser open <url> |
| Snapshot | a11y snapshot / locator discovery | agent-browser snapshot -i |
| Click | locator.click() | agent-browser click <sel> or agent-browser click @eN |
| Fill | locator.fill() | agent-browser fill <sel> <text> or agent-browser fill @eN <text> |
| Type | locator.type() | agent-browser type <sel> <text> |
| Press key | page.keyboard.press() | agent-browser press <key> |
| Wait for element | page.waitForSelector() | agent-browser wait <sel> |
| Read text | locator.innerText() | agent-browser get text <sel> |
| Screenshot | page.screenshot() | agent-browser screenshot [--full] [path] |
| Evaluate JS | page.evaluate() | agent-browser eval <js> |
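To make the mapping concrete, here is a small sketch of the common Playwright pairing of waitForSelector() followed by innerText(), written with the wait and get text commands from the table. read_when_ready is a hypothetical helper name and AGENT_BROWSER_BIN is an assumed override variable, not something agent-browser defines.

```shell
# read_when_ready SELECTOR_OR_REF
# Hypothetical helper: waits for the element, then prints its text.
# If the wait fails, the text is not read and the failure is returned.
read_when_ready() {
    ab="${AGENT_BROWSER_BIN:-agent-browser}"
    "$ab" wait "$1" && "$ab" get text "$1"
}
```

It accepts either a selector or a snapshot ref, for example read_when_ready "#result" (an illustrative selector) or read_when_ready @e20.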

Examples

Login/Form Flow (ref-first)

export AGENT_BROWSER_SESSION="acme-login"
agent-browser open "https://example.com/login"
agent-browser snapshot -i
agent-browser fill @e12 "user@example.com"
agent-browser fill @e13 "correct horse battery staple"
agent-browser click @e14
agent-browser wait @e20
agent-browser get text @e20

Auth Headers (origin-scoped)

agent-browser open "https://example.com"
agent-browser --headers '{"Authorization":"Bearer YOUR_TOKEN"}' open "https://example.com/protected"
agent-browser snapshot -i --json

Common Mistakes

  • Not using --session when a multi-step flow spans multiple commands.
  • Clicking with brittle CSS selectors when a snapshot -i ref is available.
  • Forgetting --json when downstream steps require structured parsing.
  • In sandboxed environments, the agent-browser daemon may fail to start due to restricted writes to user cache/home; rerun with the necessary filesystem permissions.
  • If the browser can’t launch, run agent-browser install (may require network access).
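The last two items can be caught early with a preflight check before a multi-step flow starts. The sketch below is an assumption-laden illustration: ab_preflight is a hypothetical helper, AGENT_BROWSER_BIN is an assumed override variable, and testing whether $HOME is writable is only a rough proxy for the cache/home write restriction described above, since the exact directories the daemon writes to are not stated here.

```shell
# ab_preflight
# Hypothetical helper. Returns 0 when the binary is found and $HOME is
# writable, 1 when the binary is missing, 2 when $HOME is not writable.
ab_preflight() {
    ab="${AGENT_BROWSER_BIN:-agent-browser}"
    if ! command -v "$ab" >/dev/null 2>&1; then
        echo "agent-browser not found: $ab" >&2
        return 1
    fi
    # Rough proxy only: the daemon's real write locations are an assumption.
    if [ ! -w "${HOME:-/}" ]; then
        echo "HOME is not writable; the agent-browser daemon may fail to start" >&2
        return 2
    fi
    return 0
}
```

A script could run ab_preflight || exit 1 first; if the check passes but the browser still cannot launch, running agent-browser install remains the next step.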

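Score

Overall score: 75/100 (based on repository quality metrics)

  • SKILL.md: a SKILL.md file is included (+20)
  • LICENSE: a license is set (+10)
  • Description: the description is at least 100 characters long (+10)
  • Popularity: 100 or more GitHub stars (0/15)
  • Recent activity: updated within the last month (+10)
  • Forks: forked 10 or more times (0/5)
  • Issue management: fewer than 50 open issues (+5)
  • Language: a programming language is set (+5)
  • Tags: at least one tag is set (+5)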
