
exploring-data
by oaustegard
My collection of Claude skills
SKILL.md
name: exploring-data
description: Exploratory data analysis using ydata-profiling. Use when users upload .csv/.xlsx/.json/.parquet files or request "explore data", "analyze dataset", "EDA", "profile data". Generates interactive HTML or JSON reports with statistics, visualizations, correlations, and quality alerts.
metadata:
  version: 0.0.3
Exploring Data
Workflow
1. Check if installed (instant)
bash /mnt/skills/user/exploring-data/scripts/check_install.sh
Returns: installed or not_installed
2. Install if needed (one-time, ~19s)
if [ "$(bash /mnt/skills/user/exploring-data/scripts/check_install.sh)" = "not_installed" ]; then
  bash /mnt/skills/user/exploring-data/scripts/install_ydata.sh
fi
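The contents of check_install.sh are not shown here. As a rough Python sketch of the same idea, assuming the check only needs to know whether the `ydata_profiling` module is importable, the function below returns the same two strings. The function name `check_install` is hypothetical and not part of the skill.

```python
import importlib.util


def check_install(module_name: str = "ydata_profiling") -> str:
    """Return 'installed' if the module can be located, else 'not_installed'.

    Hypothetical helper: mirrors the two return values documented for
    check_install.sh, but is not the skill's actual implementation.
    """
    # find_spec locates the module without importing it, so the check stays fast.
    spec = importlib.util.find_spec(module_name)
    return "installed" if spec is not None else "not_installed"


if __name__ == "__main__":
    print(check_install())
```

Looking the module up without importing it keeps the check close to the "instant" behavior described in step 1, since importing a large profiling library can take noticeably longer.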
3. Run analysis (generates both HTML and JSON by default)
bash /mnt/skills/user/exploring-data/scripts/analyze.sh <filepath> [minimal|full] [html|json]
Defaults: minimal + html (also generates JSON)
Output:
eda_report.html - Interactive report for the user
eda_report.json - Machine-readable report for Claude's analysis
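For callers that drive the skill from Python, the argument order and defaults described in step 3 can be sketched as a small command builder. This is an illustration only: `build_analyze_command` is a hypothetical helper, and the validation it performs is an assumption about which values analyze.sh accepts, based on the usage line above.

```python
from typing import List

# Script path taken from the usage line in step 3.
ANALYZE_SCRIPT = "/mnt/skills/user/exploring-data/scripts/analyze.sh"
VALID_MODES = ("minimal", "full")
VALID_FORMATS = ("html", "json")


def build_analyze_command(filepath: str, mode: str = "minimal", fmt: str = "html") -> List[str]:
    """Build the argv list for analyze.sh (hypothetical helper, not part of the skill)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if fmt not in VALID_FORMATS:
        raise ValueError(f"format must be one of {VALID_FORMATS}, got {fmt!r}")
    # Positional order follows the documented usage: <filepath> [mode] [format].
    return ["bash", ANALYZE_SCRIPT, filepath, mode, fmt]


if __name__ == "__main__":
    print(build_analyze_command("/mnt/user-data/uploads/data.csv"))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems with file names that contain spaces.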
4. If Claude needs to analyze (user asks "what do you think?" etc.)
python /mnt/skills/user/exploring-data/scripts/summarize_insights.py /mnt/user-data/outputs/eda_report.json
Reads: eda_report.json (comprehensive ydata output)
Writes: eda_insights_summary.md (condensed for Claude)
Outputs to stdout: Formatted markdown summary
Claude should read the stdout markdown summary, NOT the full JSON report.
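The source of summarize_insights.py is not shown here. The sketch below illustrates the general idea of condensing a large profiling report into a short markdown summary. It assumes the JSON has a top-level `table` object (with `n`, `n_var`, and `n_cells_missing`) and an `alerts` list of strings. Those key names, the output layout, and the function name `summarize_report` are assumptions for illustration, so every lookup falls back to a default.

```python
from typing import Any, Dict, List


def summarize_report(report: Dict[str, Any], max_alerts: int = 10) -> str:
    """Condense a profiling report dict into markdown (illustrative sketch only)."""
    # Assumed key names; .get() keeps the sketch safe if the layout differs.
    table = report.get("table", {})
    alerts: List[Any] = report.get("alerts", [])

    lines = ["# EDA Insights Summary", ""]
    lines.append(f"- Rows: {table.get('n', 'unknown')}")
    lines.append(f"- Columns: {table.get('n_var', 'unknown')}")
    lines.append(f"- Missing cells: {table.get('n_cells_missing', 'unknown')}")
    lines.append("")
    lines.append("## Alerts")

    if alerts:
        # Cap the list so the summary stays short enough to read in full.
        for alert in alerts[:max_alerts]:
            lines.append(f"- {alert}")
        remaining = len(alerts) - max_alerts
        if remaining > 0:
            lines.append(f"- ... and {remaining} more")
    else:
        lines.append("- None reported")

    return "\n".join(lines)


if __name__ == "__main__":
    sample = {"table": {"n": 100, "n_var": 5, "n_cells_missing": 3},
              "alerts": ["age has 3 missing values"]}
    print(summarize_report(sample))
```

In practice the dict would come from `json.load` on eda_report.json. Capping the alert list reflects the reason for step 4: the full report can be large, while the summary is meant to be read in its entirety.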
Invocation Examples
# Standard workflow (user views HTML)
bash /mnt/skills/user/exploring-data/scripts/analyze.sh /mnt/user-data/uploads/data.csv
# Produces: eda_report.html + eda_report.json
# Link user to: computer:///mnt/user-data/outputs/eda_report.html

# User asks Claude to analyze
bash /mnt/skills/user/exploring-data/scripts/analyze.sh /mnt/user-data/uploads/data.csv
python /mnt/skills/user/exploring-data/scripts/summarize_insights.py /mnt/user-data/outputs/eda_report.json
# Claude reads the stdout markdown summary
# Claude can then provide analysis based on patterns/insights

# Full mode for comprehensive analysis
bash /mnt/skills/user/exploring-data/scripts/analyze.sh /mnt/user-data/uploads/data.csv full

# JSON-only output (skip HTML generation)
bash /mnt/skills/user/exploring-data/scripts/analyze.sh /mnt/user-data/uploads/data.csv minimal json
Modes
Minimal (default, 5-10s): Dataset overview, variable analysis, correlations, missing values, alerts
Full (10-20s): Everything in minimal + scatter matrices, sample data, character analysis, more visualizations
User Triggers for Full Mode
"comprehensive analysis", "detailed EDA", "full profiling", "deep analysis"
Otherwise use minimal.
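The trigger rule above amounts to a simple phrase match. A minimal sketch, assuming a case-insensitive substring match is enough (the skill itself leaves the matching to Claude's judgment, and `choose_mode` is a hypothetical name):

```python
# Trigger phrases copied from the list above, lowercased for matching.
FULL_MODE_TRIGGERS = (
    "comprehensive analysis",
    "detailed eda",
    "full profiling",
    "deep analysis",
)


def choose_mode(user_request: str) -> str:
    """Return 'full' if the request contains a full-mode trigger, else 'minimal'."""
    text = user_request.lower()
    if any(trigger in text for trigger in FULL_MODE_TRIGGERS):
        return "full"
    return "minimal"


if __name__ == "__main__":
    print(choose_mode("Can you do a detailed EDA on this file?"))
    print(choose_mode("Please explore data.csv"))
```

Defaulting to minimal keeps the common case fast; full mode runs only when the user explicitly asks for more depth.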
