
crawl4ai
by tao3k
WIP: designed to bridge the gap between human intent and machine execution. Built on a robust Tri-MCP architecture, it physically separates cognitive planning (The Brain), atomic execution (The Hands), and surgical coding (The Pen) to ensure context hygiene and safety.
SKILL.md
name: crawl4ai
version: 0.1.0
description: High-performance web crawler skill using Sidecar Execution Pattern
author: Omni Team
routing_keywords: [crawl, scrape, web, fetch]
execution_mode: subprocess
intents:
- "Crawl a webpage and extract its content"
- "Fetch website content as markdown"
- "Scrape web pages for information"
Crawl4ai Skill
High-performance web crawler using the Sidecar Execution Pattern for dependency isolation.
Architecture
This skill demonstrates Sidecar Execution Pattern:
- scripts/__init__.py: Lightweight interface (loaded by main agent via @skill_command)
- scripts/engine.py: Actual crawler implementation (runs in isolated uv environment)
- pyproject.toml: Skill-specific dependencies (crawl4ai, fire, pydantic)
The main agent never imports crawl4ai directly. Instead, it uses common.isolation.run_skill_command() to execute the crawler in a subprocess with its own isolated environment.
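For illustration, the lightweight interface might look like the sketch below. The names @skill_command and run_skill_command come from this README, but the import paths, argument names, and return handling are assumptions, not the skill's actual code.

```python
# Illustrative sketch of scripts/__init__.py; @skill_command and
# run_skill_command are named in the README, but the import paths and exact
# signatures shown here are assumptions.
from common.isolation import run_skill_command
from common.registry import skill_command  # assumed location of the decorator


@skill_command
def crawl_webpage(url: str, fit_markdown: bool = True) -> dict:
    """Thin wrapper: no crawl4ai import here; the work happens in the sidecar."""
    # Assumed to spawn `uv run` inside assets/skills/crawl4ai and return the
    # JSON payload produced by scripts/engine.py.
    return run_skill_command(
        skill="crawl4ai",
        command="crawl_webpage",
        url=url,
        fit_markdown=fit_markdown,
    )
```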
Commands
crawl_webpage
Crawl a webpage and extract its content as markdown.
@omni("skill.run crawl4ai.crawl_webpage url='https://example.com'")
Parameters:
- url (str, required): Target URL to crawl
- fit_markdown (bool, default: True): Clean and simplify the markdown
Returns:
{
"success": True,
"url": "https://example.com",
"markdown": "# Example\n\nContent...",
"metadata": {"title": "Example", "description": "..."},
"error": None
}
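For orientation, here is a minimal sketch of what scripts/engine.py could look like, assuming crawl4ai's AsyncWebCrawler quickstart API and the fire CLI dispatcher. The payload mirrors the return shape above; fit_markdown handling is intentionally omitted because it varies across crawl4ai versions.

```python
# Minimal sketch of a sidecar engine, assuming crawl4ai's AsyncWebCrawler
# quickstart API and the `fire` CLI; not the skill's actual implementation.
import asyncio

import fire
from crawl4ai import AsyncWebCrawler


def crawl_webpage(url: str, fit_markdown: bool = True) -> dict:
    """Crawl one page and return the payload shape documented above."""

    async def _run() -> dict:
        async with AsyncWebCrawler() as crawler:
            result = await crawler.arun(url=url)
        # fit_markdown cleanup is omitted here: crawl4ai exposes fit/pruned
        # markdown differently across versions, so this returns raw markdown.
        return {
            "success": True,
            "url": url,
            "markdown": str(result.markdown),
            "metadata": result.metadata,
            "error": None,
        }

    try:
        return asyncio.run(_run())
    except Exception as exc:  # surface failures in the same envelope
        return {"success": False, "url": url, "markdown": None,
                "metadata": None, "error": str(exc)}


if __name__ == "__main__":
    fire.Fire({"crawl_webpage": crawl_webpage})
```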
check_crawler_ready
Check if the crawler skill is properly configured.
@omni("skill.run crawl4ai.check_crawler_ready")
Setup
Dependencies are automatically managed by uv:
cd assets/skills/crawl4ai
uv sync # Installs crawl4ai and its dependencies
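For reference, a plausible shape for the skill's pyproject.toml: the package names (crawl4ai, fire, pydantic) come from this README, while the version pins and Python floor are assumptions.

```toml
# Illustrative pyproject.toml for the skill; dependency names come from the
# README, version constraints and requires-python are assumptions.
[project]
name = "crawl4ai-skill"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "crawl4ai",
    "fire",
    "pydantic",
]
```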
Usage Example
# In conversation with Claude
Please crawl https://github.com and give me the main content as markdown.
# Claude will invoke:
# @omni("skill.run crawl4ai.crawl_webpage url='https://github.com'")
Why Sidecar Pattern?
- Zero Pollution: Main agent doesn't need crawl4ai, playwright, or other heavy deps
- Version Isolation: Each skill can use different versions of the same library
- Hot Swappable: Skills can be added/removed without restarting the agent
- Security: Compromised skill code has limited blast radius
Score
Total Score
Based on repository quality metrics
Contains a SKILL.md file
License is set
Description is at least 100 characters
100+ GitHub Stars
Updated within the last month
Forked 10 or more times
Fewer than 50 open issues
Programming language is set
At least one tag is set