
speckit-test-review
by Obsidian-Owl
The Open Platform for building Data Platforms. Ship faster. Stay compliant. Scale to Data Mesh.
SKILL.md
name: speckit-test-review description: Review test quality before PR - semantic analysis of test design, not just linting
User Input
$ARGUMENTS
You MUST consider the user input before proceeding (if not empty).
Usage Modes
| Mode | Command | Scope | When to Use |
|---|---|---|---|
| Changed Files | /speckit.test-review | Tests changed vs main | Before PR (default) |
| Full Audit | /speckit.test-review --all | ALL test files | Quality gate, periodic audit |
| Specific Files | /speckit.test-review path/to/test.py | Named files | Targeted review |
Important: Use --all to catch pre-existing issues, not just changes in current branch.
Goal
Perform a comprehensive test quality review that answers: Are these tests actually good tests?
This is NOT linting. This is semantic analysis of test design:
- Do tests actually test what they claim?
- Could tests pass while the code is broken?
- Are tests at the right level?
- Are tests maintainable?
Plus floe-specific checks:
- Plugin testing completeness
- Contract stability
- Architecture compliance
Operating Constraints
STRICTLY READ-ONLY: Do not modify any files. Output analysis and recommendations.
SEMANTIC ANALYSIS: Read and understand tests, don't just grep for patterns.
TIERED OUTPUT: Full analysis for problems, brief summary for clean tests.
Memory Integration
After Completion
Save quality findings:
./scripts/memory-save --decisions "Test review for {feature}: {key findings}" --issues "{LinearIDs}"
Constitution Alignment
This skill validates test adherence to project principles:
- TDD: Tests should exist before implementation
- Traceability: Tests should have requirement markers
- No Skip: Tests should FAIL, not skip
Execution Steps
Phase 0: Identify Test Files
You handle this phase directly.
Parse user input to determine mode:
-
If
--allflag present: Full codebase audit# Get ALL test files in the codebase find packages/*/tests plugins/*/tests tests/ testing/tests -name "test_*.py" -type f 2>/dev/null -
If specific file path provided: Review that file
# Verify file exists ls -la <provided-path> -
Default (no args): Changed files only
# Get current branch git rev-parse --abbrev-ref HEAD # Get changed test files git diff --name-only main...HEAD | grep -E 'tests.*\.py$'
Report mode to user:
--allmode: "Running FULL CODEBASE audit on N test files"- Specific file: "Reviewing specified file: "
- Default: "Reviewing N test files changed vs main"
If no test files to review in default mode, suggest using --all for full audit.
Output: List of test files to analyze, classified by type:
- Unit:
*/tests/unit/*.pyor no marker - Integration:
*/tests/integration/*.pyor@pytest.mark.integration - Contract:
*/tests/contract/*.pyor@pytest.mark.contract - E2E:
*/tests/e2e/*.pyor@pytest.mark.e2e
Phase 1: Semantic Test Analysis
Invoke test-reviewer agent for each test file (or batch by type).
Task(test-reviewer, "Review the following test file for quality, correctness, and maintainability.
File: [path]
Apply your full analysis framework:
1. For each test, evaluate Purpose, Correctness, Isolation, Maintainability, Type Appropriateness
2. Apply type-specific checks based on test classification
3. Full analysis for tests with issues, brief summary for clean tests
Return your structured analysis.")
Wait for test-reviewer to return.
Phase 2: floe-Specific Analysis (Parallel)
Invoke floe-specific agents IN PARALLEL (single message, multiple Task calls):
Task(plugin-quality, "Analyze plugin testing completeness.
Changed files: [list]
Return your Plugin Quality Report.")
Task(contract-stability, "Analyze contract stability.
Changed files: [list]
Return your Contract Stability Report.")
Task(architecture-compliance, "Analyze architecture compliance in tests.
Changed files: [list]
Return your Architecture Compliance Report.")
Wait for all agents to return.
Phase 3: Strategic Synthesis
You handle this phase directly.
Synthesize all reports into a unified strategic assessment.
Output Format
## Test Quality Review
**Branch**: [branch]
**Files Reviewed**: [N]
**Tests Analyzed**: [N]
---
### Executive Summary
| Aspect | Status | Key Finding |
|--------|--------|-------------|
| Test Design Quality | status | [summary from test-reviewer] |
| Plugin Coverage | status | [summary from plugin-quality] |
| Contract Stability | status | [summary from contract-stability] |
| Architecture Compliance | status | [summary from architecture-compliance] |
**Overall**: [One sentence assessment]
---
### Test Design Analysis
[Include test-reviewer findings]
#### Tests Needing Attention
[Full analysis for each problematic test]
#### Clean Tests
[Summary table of tests that passed review]
---
### floe-Specific Findings
#### Plugin Coverage
[Key findings from plugin-quality agent]
#### Contract Stability
[Key findings from contract-stability agent]
#### Architecture Compliance
[Key findings from architecture-compliance agent]
---
### Priority Actions
| Priority | Issue | Impact | Effort |
|----------|-------|--------|--------|
| P0 | [Must fix] | High | [estimate] |
| P1 | [Should fix] | Medium | [estimate] |
| P2 | [Consider] | Low | [estimate] |
---
### Recommendations
1. **Immediate** (this PR):
- [Specific action with file:line]
2. **Follow-up** (next PR):
- [Action item]
---
### Next Steps
- [ ] Address P0 issues
- [ ] Re-run `/speckit.test-review` to verify
- [ ] Proceed to PR when clean
What This Review Checks
From test-reviewer (Semantic Analysis)
- Purpose: Is it clear what's being tested?
- Correctness: Could test pass while code is broken?
- Isolation: Deterministic? Independent?
- Maintainability: Brittle to implementation changes?
- Type Appropriateness: Right level of test?
From floe-specific agents
- Plugin Quality: All 11 types tested? Lifecycle coverage?
- Contract Stability: Schema stable? Backwards compatible?
- Architecture: K8s-native? Technology ownership respected?
What This Review Does NOT Check
- Linting/style: ruff handles this
- Type safety: mypy handles this
- Security: Aikido/SonarQube handle this
- Coverage %: pytest-cov handles this
When to Use
| Situation | Recommended Mode |
|---|---|
| Before creating a PR | /speckit.test-review (changed files) |
| After writing new tests | /speckit.test-review (changed files) |
| When investigating test failures | /speckit.test-review path/to/test.py |
| When asked "are my tests good?" | /speckit.test-review --all |
| Quality gate / periodic audit | /speckit.test-review --all |
| Fixing pre-existing issues | /speckit.test-review --all |
Key Insight: Default mode only reviews changed files. Use --all to catch issues that existed before your branch.
Handoff
After completing this skill:
- Fix issues: Address P0/P1 issues identified
- Check wiring: Run
/speckit.wiring-checkto verify code connectivity - Check merge readiness: Run
/speckit.merge-checkbefore PR - Create PR: Run
/speckit.prwhen tests pass
References
TESTING.md- Testing standards.claude/rules/testing-standards.md- Testing rules.claude/rules/test-organization.md- Test organization
Score
Total Score
Based on repository quality metrics
SKILL.mdファイルが含まれている
ライセンスが設定されている
100文字以上の説明がある
GitHub Stars 100以上
1ヶ月以内に更新
10回以上フォークされている
オープンIssueが50未満
プログラミング言語が設定されている
1つ以上のタグが設定されている
Reviews
Reviews coming soon
