survey-seed-harvest
by WILLOSCAR
Research pipelines as semantic execution units: each skill declares inputs/outputs, acceptance criteria, and guardrails. Evidence-first methodology prevents hollow writing through structured intermediate artifacts.
⭐ 83 🍴 10 📅 Jan 24, 2026
SKILL.md
name: survey-seed-harvest
description: |
Identify survey/review papers in a retrieved set and extract taxonomy seeds into outline/taxonomy.yml (topics/subtopics/terminology).
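Trigger: survey seed harvest, taxonomy seeds, extract taxonomy from surveys (从 survey 提 taxonomy), bootstrap taxonomy.
Use when: after retrieval/dedup, you want to quickly extract terminology and topic structure from existing survey/review papers to speed up taxonomy-builder.
Skip if: you already have a high-quality taxonomy (or you do not want to be constrained by the framing of existing surveys).
Network: none.
Guardrail: the output is a seed; it must be rewritten and scope-aligned by taxonomy-builder. Avoid generating generic placeholder nodes.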
Survey Seed Harvest
Bootstrap taxonomy seeds from existing survey/review papers inside your retrieved set.
This is an accelerator for the early structure stage: it should make taxonomy-builder easier, not replace it.
Inputs
papers/papers_dedup.jsonl (deduped paper metadata with titles/abstracts)
Outputs
outline/taxonomy.yml (seed taxonomy; expected to be refined)
Workflow (heuristic)
Uses: papers/papers_dedup.jsonl.
- Find likely survey/review papers:
  - title/abstract contains “survey”, “review”, “systematic”, “meta-analysis”
- Extract candidate topic terms and group them into:
  - ~4–10 top-level nodes (“chapters”)
  - 2–6 children per node (mappable leaves)
- Write short, actionable descriptions:
  - what belongs here / what does not
  - (optional) list 2–5 representative titles as seeds
- Treat the result as a starting point:
  - pass it to taxonomy-builder for domain-meaningful rewriting and scope alignment.
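The first workflow step can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the field names `title` and `abstract`, the function names, and the sample records are all assumptions made up for the example.

```python
import json
import re

# Keywords from the workflow above. The leading \b lets plurals such as
# "surveys" match while avoiding hits inside words like "preview".
SURVEY_PATTERN = re.compile(r"\b(survey|review|systematic|meta-analysis)", re.IGNORECASE)


def is_survey_candidate(paper: dict) -> bool:
    """Return True if the title or abstract mentions a survey-style keyword."""
    # "title" and "abstract" are assumed field names, not confirmed by the skill.
    text = f"{paper.get('title') or ''} {paper.get('abstract') or ''}"
    return bool(SURVEY_PATTERN.search(text))


def find_survey_candidates(jsonl_lines):
    """Parse JSONL lines and keep the records that look like surveys/reviews."""
    papers = (json.loads(line) for line in jsonl_lines if line.strip())
    return [p for p in papers if is_survey_candidate(p)]


# Made-up records standing in for papers/papers_dedup.jsonl.
sample_lines = [
    '{"title": "A Survey of Tool-Using Agents", "abstract": "We review recent work."}',
    '{"title": "Fast Sparse Attention", "abstract": "We propose a new kernel."}',
    '{"title": "Retrieval Methods", "abstract": "A systematic comparison of retrievers."}',
]
candidates = find_survey_candidates(sample_lines)
print([p["title"] for p in candidates])
# → ['A Survey of Tool-Using Agents', 'Retrieval Methods']
```

To run it on a real workspace, pass an open file handle for papers/papers_dedup.jsonl instead of `sample_lines`. Keyword matching over-selects (for example, papers that merely "review" related work), so the hits are candidates to skim, not a final list.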
Quality checklist
- outline/taxonomy.yml exists and is valid YAML.
- Taxonomy has at least 2 levels (children used) and every node has a description.
- Avoid generic placeholder nodes like “Overview/Benchmarks/Open Problems” unless they are truly content-based for your domain.
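The structural checks in this list could be automated along these lines. The sketch assumes the taxonomy is a list of nodes with `name`, `description`, and `children` keys; only `children` and the description requirement come from the checklist, so the rest of the schema, the function name, and the sample data are hypothetical.

```python
# Names the checklist calls out as generic; compared case-insensitively.
GENERIC_NAMES = {"overview", "benchmarks", "open problems"}


def check_taxonomy(nodes):
    """Return a list of human-readable problems; an empty list means the checks pass."""
    problems = []
    if not any(node.get("children") for node in nodes):
        problems.append("taxonomy has fewer than 2 levels (no 'children' used)")

    def walk(node, path):
        name = str(node.get("name", "")).strip()
        here = f"{path}/{name}" if path else name
        if not str(node.get("description") or "").strip():
            problems.append(f"missing description: {here}")
        if name.lower() in GENERIC_NAMES:
            # A warning rather than a hard error: the checklist allows such
            # names when they are truly content-based for the domain.
            problems.append(f"generic placeholder name: {here}")
        for child in node.get("children") or []:
            walk(child, here)

    for node in nodes:
        walk(node, "")
    return problems


# Made-up seed taxonomy in the assumed schema.
seed = [
    {
        "name": "Planning",
        "description": "Methods that decompose a task into steps; excludes pure retrieval.",
        "children": [
            {"name": "Tree search", "description": "Planners that expand and score branches."},
            {"name": "Overview", "description": ""},
        ],
    },
]
for problem in check_taxonomy(seed):
    print(problem)
```

In practice the node list would come from parsing outline/taxonomy.yml, for instance with PyYAML's `yaml.safe_load`, which also covers the "valid YAML" item because it raises on malformed input.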
Script (optional helper)
Quick Start
- python .codex/skills/survey-seed-harvest/scripts/run.py --help
- python .codex/skills/survey-seed-harvest/scripts/run.py --workspace <workspace_dir>
All Options
- --top-k <n>: number of candidate terms to consider
- --min-freq <n>: minimum frequency threshold
Examples
- More conservative term selection:
python .codex/skills/survey-seed-harvest/scripts/run.py --workspace <ws> --top-k 80 --min-freq 3
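To give an intuition for how a top-k cap and a minimum frequency might combine, here is a small sketch. It is an assumption about the general technique (frequency counting with a threshold and a cap), not a description of what run.py actually does; `select_terms` and the sample texts are hypothetical.

```python
import re
from collections import Counter


def select_terms(texts, *, top_k, min_freq):
    """Count lowercase word tokens and keep the frequent ones, most common first."""
    counts = Counter()
    for text in texts:
        # Tokens are runs of letters/hyphens, at least two characters long.
        counts.update(re.findall(r"[a-z][a-z-]+", text.lower()))
    # Drop rare terms first, then cap the list at top_k entries.
    frequent = [(term, n) for term, n in counts.most_common() if n >= min_freq]
    return frequent[:top_k]


# Made-up survey titles used only to exercise the function.
texts = [
    "retrieval augmented generation",
    "retrieval with dense encoders",
    "dense retrieval at scale",
]
print(select_terms(texts, top_k=2, min_freq=2))
# → [('retrieval', 3), ('dense', 2)]
```

Under this reading, a higher minimum frequency or a lower cap both shrink the candidate list, which is the sense in which such settings make term selection more conservative.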
Notes
- This helper is keyword-based; treat the output as seeds and refine with taxonomy-builder.
Troubleshooting
Issue: no survey/review papers are detected in the set
Fix:
- Broaden retrieval (add “survey”, “review”, “benchmark” variants) or manually seed a few known surveys, then rerun.
Issue: taxonomy seeds look like generic buckets
Fix:
- Keep seeds concrete (named methods/benchmarks/tasks) and rely on taxonomy-builder to rewrite under the actual scope.

