---
name: test-strategy
description: Comprehensive test strategy guidance including test pyramid design, coverage goals, test categorization, flaky test diagnosis, test infrastructure architecture, and risk-based prioritization. Absorbs expertise from the retired senior-qa-engineer agent. Use when planning testing approaches, setting up test infrastructure, optimizing test suites, diagnosing flaky tests, or designing test architecture across domains (API, data pipelines, ML models, infrastructure). Trigger keywords: test strategy, test pyramid, test plan, what to test, how to test, test architecture, test infrastructure, coverage goals, test organization, CI/CD testing, test prioritization, testing approach, flaky test, test optimization, test parallelization, API testing strategy, data pipeline testing, ML model testing, infrastructure testing.
---

Test Strategy

Overview

Test strategy defines how to approach testing for a project, balancing thoroughness with efficiency. A well-designed strategy ensures critical functionality is covered while avoiding over-testing trivial code. This skill covers the test pyramid, coverage metrics, test categorization, and integration with CI/CD pipelines.

Instructions

1. Design the Test Pyramid

Structure tests in layers with appropriate ratios:

         /\
        /  \        E2E Tests (5-10%)
       /----\       - Critical user journeys
      /      \      - Cross-system integration
     /--------\     Integration Tests (15-25%)
    /          \    - API contracts
   /------------\   - Database interactions
  /              \  - Service boundaries
 /----------------\ Unit Tests (65-80%)
                    - Business logic
                    - Pure functions
                    - Edge cases

Recommended Ratios:

  • Unit tests: 65-80% of test suite
  • Integration tests: 15-25%
  • E2E tests: 5-10%
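
A quick way to audit an existing suite against these ratios, assuming tests are grouped by directory as in the structure shown in step 4 (file counts are a rough proxy for test counts):

# Count test files per layer (directory layout assumed)
for layer in unit integration e2e; do
  echo "$layer: $(find tests/$layer -name '*.test.ts' | wc -l) files"
done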

2. Set Coverage Goals

Coverage Targets by Component Type:

| Component Type | Line Coverage | Branch Coverage | Notes |
| --- | --- | --- | --- |
| Business Logic | 90%+ | 85%+ | Critical paths fully covered |
| API Handlers | 80%+ | 75%+ | All endpoints tested |
| Utilities | 95%+ | 90%+ | Pure functions easily testable |
| UI Components | 70%+ | 60%+ | Focus on behavior over markup |
| Infrastructure | 60%+ | 50%+ | Integration tests preferred |
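
These targets can be enforced mechanically. A minimal sketch using Jest's coverageThreshold (the directory globs are hypothetical; map them to wherever each component type lives in your tree):

// jest.config.js (excerpt): per-path thresholds mirroring the table above
module.exports = {
  coverageThreshold: {
    "./src/domain/": { lines: 90, branches: 85 }, // business logic
    "./src/api/": { lines: 80, branches: 75 },    // API handlers
    "./src/utils/": { lines: 95, branches: 90 },  // utilities
  },
};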

Coverage Anti-patterns to Avoid:

  • Chasing 100% coverage for coverage's sake
  • Testing getters/setters without logic
  • Testing framework or library code
  • Writing tests that don't verify behavior

3. Decide What to Test vs What Not to Test

Always Test:

  • Business logic and domain rules
  • Input validation and error handling
  • Security-sensitive operations
  • Data transformations
  • State transitions
  • Edge cases and boundary conditions
  • Regression scenarios from bug fixes

Consider Not Testing:

  • Simple pass-through functions
  • Framework-generated code
  • Third-party library internals
  • Trivial getters/setters
  • Configuration constants
  • Logging statements (unless critical)

Test Smell Detection:

// BAD: Testing trivial code
test("getter returns value", () => {
  const user = new User("John");
  expect(user.getName()).toBe("John");
});

// GOOD: Testing meaningful behavior
test("user cannot change name to empty string", () => {
  const user = new User("John");
  expect(() => user.setName("")).toThrow(ValidationError);
});

4. Categorize and Organize Tests

Directory Structure:

tests/
├── unit/
│   ├── services/
│   ├── models/
│   └── utils/
├── integration/
│   ├── api/
│   ├── database/
│   └── external-services/
├── e2e/
│   ├── flows/
│   └── pages/
├── fixtures/
│   ├── factories/
│   └── mocks/
└── helpers/
    ├── setup.ts
    └── assertions.ts

Test Tagging System:

// Jest example with tags embedded in test names
describe("[unit][fast] UserService", () => {});
describe("[integration][slow] DatabaseRepository", () => {});
describe("[e2e][critical] CheckoutFlow", () => {});

// Run specific categories (Jest filters by name pattern; --grep is Mocha's flag)
// npm test -- --testNamePattern="\[unit\]"
// npm test -- --testNamePattern="\[critical\]"

Naming Conventions:

[ComponentName].[scenario].[expected_result].test.ts

Examples:
UserService.createUser.returnsNewUser.test.ts
PaymentProcessor.invalidCard.throwsPaymentError.test.ts

5. Integrate with CI/CD

Pipeline Stage Configuration:

# .github/workflows/test.yml
name: Test Pipeline

on: [push, pull_request]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Unit Tests
        run: npm test -- --testNamePattern="\[unit\]" --coverage
      - name: Upload Coverage
        uses: codecov/codecov-action@v3

  integration-tests:
    runs-on: ubuntu-latest
    needs: unit-tests
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
    steps:
      - uses: actions/checkout@v4
      - name: Run Integration Tests
        run: npm test -- --testNamePattern="\[integration\]"

  e2e-tests:
    runs-on: ubuntu-latest
    needs: integration-tests
    steps:
      - uses: actions/checkout@v4
      - name: Run E2E Tests
        run: npm run test:e2e

CI Test Optimization:

  • Run unit tests first (fast feedback)
  • Parallelize test suites
  • Cache dependencies and build artifacts
  • Use test splitting for large suites
  • Fail fast on critical tests
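
Dependency caching in GitHub Actions can be a single extra step; a sketch using actions/setup-node's built-in npm cache:

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - name: Install Dependencies
        run: npm ci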

6. Risk-Based Test Prioritization

Risk Matrix for Prioritization:

| Impact ↓ / Likelihood → | Low | Medium | High |
| --- | --- | --- | --- |
| High | Medium Priority | High Priority | Critical |
| Medium | Low Priority | Medium Priority | High Priority |
| Low | Skip/Manual | Low Priority | Medium Priority |

Risk Factors to Consider:

  • Business Impact: Revenue, user trust, legal compliance
  • Complexity: Code complexity, integration points
  • Change Frequency: Actively developed areas
  • Historical Bugs: Components with bug history
  • Dependencies: Critical external services

Prioritized Test Categories:

  1. Critical (P0): Run on every commit

    • Authentication/authorization
    • Payment processing
    • Data integrity
  2. High (P1): Run on PR merge

    • Core business workflows
    • API contract tests
  3. Medium (P2): Run nightly

    • Edge cases
    • Performance tests
  4. Low (P3): Run weekly

    • Backward compatibility
    • Deprecated feature coverage
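
These tiers can be wired into pipelines by tagging test names and filtering per stage; a sketch (the [P0]-style tag scheme is an assumption, mirroring the tagging system from step 4):

// Tag the priority in the describe name
describe("[P0] PaymentProcessor", () => { /* ... */ });
describe("[P2] ReportExporter", () => { /* ... */ });

// Per-commit stage runs only P0:
// npm test -- --testNamePattern="\[P0\]"
// Nightly stage widens the net:
// npm test -- --testNamePattern="\[P0\]|\[P1\]|\[P2\]"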

7. Domain-Specific Testing Strategies

API Testing Strategy

Test Layers:

  1. Contract Tests (P0)

    • Request/response schema validation
    • HTTP status codes for all endpoints
    • Error response formats
    • Authentication/authorization rules
  2. Business Logic Tests (P0)

    • Valid input processing
    • Business rule enforcement
    • State transitions via API calls
  3. Integration Tests (P1)

    • Database operations via API
    • External service integration
    • Transaction rollback scenarios
  4. Performance Tests (P2)

    • Response time under load
    • Concurrent request handling
    • Rate limiting behavior

API Test Organization:

tests/api/
├── contracts/          # Schema validation tests
├── endpoints/          # Per-endpoint behavior tests
├── auth/               # Authentication flows
├── integration/        # Cross-service scenarios
└── performance/        # Load and stress tests
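
A contract test reduces to validating live responses against a declared schema. A minimal sketch using zod (an assumed dependency; the endpoint, URL, and schema are hypothetical):

import { z } from "zod";

const API_URL = process.env.API_URL ?? "http://localhost:3000";

// Declared response contract for GET /users/:id
const UserResponse = z.object({
  id: z.string(),
  email: z.string().email(),
  role: z.enum(["user", "admin"]),
});

test("[contract] GET /users/:id matches schema", async () => {
  const res = await fetch(`${API_URL}/users/123`);
  expect(res.status).toBe(200);
  // parse() throws, failing the test, if the body drifts from the contract
  UserResponse.parse(await res.json());
});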

Data Pipeline Testing Strategy

Test Focus Areas:

  1. Data Quality Tests (P0)

    • Schema validation at each stage
    • Data type correctness
    • Null/missing value handling
    • Duplicate detection
  2. Transformation Tests (P0)

    • Input → output correctness
    • Edge case handling
    • Data loss detection
    • Aggregation accuracy
  3. Integration Tests (P1)

    • Source extraction correctness
    • Sink loading verification
    • Idempotency checks
    • Failure recovery
  4. Performance Tests (P2)

    • Processing throughput
    • Memory usage with large datasets
    • Partition handling

Data Pipeline Test Pattern:

def test_user_data_transformation():
    # Arrange: Create test input data
    raw_input = create_test_dataset(
        rows=1000,
        include_nulls=True,
        include_duplicates=True
    )

    # Act: Run transformation
    result = transform_user_data(raw_input)

    # Assert: Verify output quality
    assert_no_nulls(result, required_fields=["user_id", "email"])
    assert_no_duplicates(result, key="user_id")
    assert_schema_matches(result, UserSchema)
    assert len(result) == expected_output_count(raw_input)

ML Model Testing Strategy

Test Layers:

  1. Data Validation Tests (P0)

    • Feature schema validation
    • Label distribution checks
    • Data leakage detection
    • Train/test split correctness
  2. Model Behavior Tests (P0)

    • Prediction on known examples
    • Invariance tests (e.g., case-insensitive text)
    • Directional expectation tests
    • Boundary condition handling
  3. Model Quality Tests (P1)

    • Accuracy/precision/recall thresholds
    • Fairness metrics across groups
    • Performance on edge cases
    • Regression detection (vs baseline)
  4. Integration Tests (P1)

    • Model loading and serving
    • Prediction API contract
    • Feature engineering pipeline
    • Model versioning

ML Test Example:

def test_sentiment_model_invariance():
    """Model should be case-insensitive"""
    model = load_sentiment_model()

    test_cases = [
        ("This is GREAT!", "This is great!"),
        ("TERRIBLE service", "terrible service"),
    ]

    for text1, text2 in test_cases:
        pred1 = model.predict(text1)
        pred2 = model.predict(text2)
        assert pred1 == pred2, f"Case sensitivity detected: {text1} vs {text2}"

Infrastructure Testing Strategy

Test Focus:

  1. Infrastructure-as-Code Tests (P0)

    • Syntax validation (terraform validate)
    • Security policy checks
    • Resource naming conventions
    • Cost estimation validation
  2. Deployment Tests (P1)

    • Smoke tests post-deployment
    • Health check endpoints
    • Configuration validation
    • Rollback procedures
  3. Resilience Tests (P2)

    • Service restart handling
    • Network partition recovery
    • Resource exhaustion scenarios
    • Chaos engineering tests
  4. Observability Tests (P1)

    • Metrics collection verification
    • Log aggregation correctness
    • Alert rule validation
    • Dashboard functionality

Infrastructure Test Pattern:

# terraform test example
run "verify_security_group_rules" {
  command = plan

  assert {
    condition     = length([for rule in aws_security_group.main.ingress : rule if rule.cidr_blocks[0] == "0.0.0.0/0"]) == 0
    error_message = "Security group should not allow ingress from 0.0.0.0/0"
  }
}
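
Post-deployment smoke tests can stay deliberately small; a sketch probing a health endpoint (the URL and payload shape are assumptions):

test("[smoke] service responds healthy after deploy", async () => {
  const res = await fetch(`${process.env.BASE_URL}/healthz`);
  expect(res.status).toBe(200);

  const body = await res.json();
  // Health payload shape is deployment-specific; adjust to match yours
  expect(body.status).toBe("ok");
});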

8. Flaky Test Diagnosis and Prevention

Common Causes of Flakiness:

| Cause | Symptoms | Solution |
| --- | --- | --- |
| Race conditions | Fails intermittently on timing | Add proper synchronization |
| Async operations | Fails with "element not found" | Use explicit waits, not sleeps |
| Shared state | Fails when run with other tests | Isolate test data, reset state |
| External dependencies | Fails when service unavailable | Mock external calls, use test doubles |
| Time-dependent logic | Fails at specific times/dates | Inject time, use fake clocks |
| Resource cleanup | Fails after certain test order | Ensure teardown always runs |
| Nondeterministic data | Fails with random data variations | Use fixed seeds, deterministic generators |
| Environment differences | Fails in CI but passes locally | Containerize test environment |
| Insufficient timeouts | Fails under load/slow machines | Make timeouts configurable |
| Parallel execution races | Fails only when parallelized | Use unique identifiers per test |

Flaky Test Diagnosis Workflow:

1. Reproduce Locally
   ├─ Run test 100 times: `for i in {1..100}; do npm test -- TestName || break; done`
   ├─ Run with different seeds: `npm test -- --seed=$RANDOM`
   └─ Run in parallel: `npm test -- --maxWorkers=4`

2. Identify Pattern
   ├─ Always fails at same point? → Logic bug, not flaky
   ├─ Fails under load? → Timing/resource issue
   ├─ Fails with other tests? → Shared state pollution
   └─ Fails on specific data? → Data-dependent bug

3. Instrument Test
   ├─ Add verbose logging
   ├─ Capture timing information
   ├─ Record test environment state
   └─ Save failure artifacts (screenshots, logs)

4. Fix Root Cause
   ├─ Eliminate race conditions
   ├─ Add proper synchronization
   ├─ Isolate test state
   └─ Mock external dependencies

5. Verify Fix
   ├─ Run fixed test 1000 times
   ├─ Run in CI 10 times
   └─ Monitor over 1 week

Flaky Test Prevention Checklist:

  • Tests use deterministic test data (fixed seeds, no random())
  • Async operations use explicit waits (not setTimeout/sleep)
  • Tests create unique resources (UUIDs in names/IDs)
  • Cleanup always runs (try/finally, afterEach hooks)
  • No hardcoded timing assumptions (sleep(100) is a code smell)
  • External services are mocked or use test doubles
  • Time-dependent logic uses injected/fake clocks
  • Tests do not depend on execution order
  • Shared state is reset between tests
  • Test environment is reproducible (containerized)

Example: Fixing a Flaky Test

// FLAKY: Race condition with async operation
test("user profile loads", async () => {
  renderUserProfile(userId);
  // Race: profile might not be loaded yet
  expect(screen.getByText("John Doe")).toBeInTheDocument();
});

// FIXED: Proper async handling
test("user profile loads", async () => {
  renderUserProfile(userId);
  // Wait for async operation to complete
  const userName = await screen.findByText("John Doe");
  expect(userName).toBeInTheDocument();
});

// FLAKY: Shared state pollution
test("creates user with default role", () => {
  const user = createUser({ name: "Alice" });
  expect(user.role).toBe("user"); // Fails if previous test modified default
});

// FIXED: Isolated state
test("creates user with default role", () => {
  resetDefaultRole(); // Ensure clean state
  const user = createUser({ name: "Alice" });
  expect(user.role).toBe("user");
});

// FLAKY: Time-dependent logic
test("expires session after 1 hour", () => {
  const session = createSession();
  // Flaky: Depends on current time
  expect(session.expiresAt).toBe(Date.now() + 3600000);
});

// FIXED: Inject time dependency
test("expires session after 1 hour", () => {
  const mockClock = installFakeClock();
  mockClock.setTime(new Date("2024-01-01T12:00:00Z"));

  const session = createSession();
  expect(session.expiresAt).toBe(new Date("2024-01-01T13:00:00Z").getTime());

  mockClock.uninstall();
});

9. Test Infrastructure Architecture

Test Environment Management:

# docker-compose.test.yml
version: '3.8'
services:
  test-db:
    image: postgres:15
    environment:
      POSTGRES_DB: test_db
      POSTGRES_USER: test_user
      POSTGRES_PASSWORD: test_pass
    ports:
      - "5433:5432"
    tmpfs:
      - /var/lib/postgresql/data  # In-memory for speed

  test-redis:
    image: redis:7-alpine
    ports:
      - "6380:6379"

  test-app:
    build: .
    environment:
      DATABASE_URL: postgres://test_user:test_pass@test-db:5432/test_db
      REDIS_URL: redis://test-redis:6379
    depends_on:
      - test-db
      - test-redis
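
To bring this environment up before a local test run (Compose v2 syntax):

docker compose -f docker-compose.test.yml up -d
docker compose -f docker-compose.test.yml down -v  # teardown, dropping volumes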

Test Data Management:

// Factory pattern for test data
class UserFactory {
  private sequence = 0;

  create(overrides?: Partial<User>): User {
    // Increment once so id, email, and name share the same sequence number
    const n = ++this.sequence;
    return {
      id: overrides?.id ?? `user-${n}`,
      email: overrides?.email ?? `user${n}@test.com`,
      name: overrides?.name ?? `Test User ${n}`,
      role: overrides?.role ?? "user",
      createdAt: overrides?.createdAt ?? new Date(),
    };
  }

  createBatch(count: number, overrides?: Partial<User>): User[] {
    return Array.from({ length: count }, () => this.create(overrides));
  }
}

// Usage ensures unique data per test
test("user search works", () => {
  const factory = new UserFactory();
  const users = factory.createBatch(10);
  // Each test gets unique users, no conflicts
});

Test Parallelization Strategy:

| Strategy | When to Use | Configuration |
| --- | --- | --- |
| File-level parallelism | Tests in different files are independent | Jest: --maxWorkers=4 |
| Database per worker | Tests need database isolation | Postgres: create a schema per worker |
| Test sharding | CI with multiple machines | Split tests by shard: --shard=1/4 |
| Test prioritization | Want fast feedback | Run fast tests first, slow tests in parallel |
| Smart test selection | Only run affected tests | Use the dependency graph to select changed tests |
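
Jest exposes JEST_WORKER_ID to every worker process, which makes schema-per-worker isolation straightforward; a sketch assuming the pg driver (the helper name is hypothetical):

// tests/helpers/db.ts
import { Client } from "pg";

// Each Jest worker gets its own schema, so parallel tests never collide
const workerId = process.env.JEST_WORKER_ID ?? "1";
const schema = `test_worker_${workerId}`;

export async function getTestDb(): Promise<Client> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  await client.query(`CREATE SCHEMA IF NOT EXISTS ${schema}`);
  await client.query(`SET search_path TO ${schema}`);
  return client;
}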

Example: Parallel Test Configuration

// jest.config.js with parallel optimization
module.exports = {
  maxWorkers: process.env.CI ? "50%" : "75%", // Conservative in CI
  testTimeout: 30000, // Longer timeout for CI

  // Run fast tests first
  testSequencer: "./custom-sequencer.js",

  // Database isolation per worker
  globalSetup: "./tests/setup/create-test-dbs.js",
  globalTeardown: "./tests/setup/drop-test-dbs.js",

};

// Note: sharding is a CLI option rather than a config key; in CI, split the
// suite with e.g. npx jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
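
The custom sequencer referenced above is a small user-supplied class; a sketch that runs smaller (usually faster) test files first:

// custom-sequencer.js
const Sequencer = require("@jest/test-sequencer").default;
const fs = require("fs");

class FastFirstSequencer extends Sequencer {
  sort(tests) {
    // File size is a cheap proxy for test duration
    return [...tests].sort(
      (a, b) => fs.statSync(a.path).size - fs.statSync(b.path).size,
    );
  }
}

module.exports = FastFirstSequencer;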

Test Optimization Techniques:

  1. Reduce Test Startup Time

    • Cache compiled code
    • Lazy-load test dependencies
    • Use in-memory databases for unit tests
  2. Optimize Test Execution

    • Batch database operations
    • Reuse expensive fixtures (connections, containers)
    • Skip unnecessary setup for focused tests
  3. Parallelize Safely

    • Unique identifiers per test (UUIDs)
    • Separate database schemas per worker
    • Avoid shared file system access
  4. Smart Test Selection

    • Run only affected tests during development
    • Use coverage mapping to determine affected tests
    • Cache test results for unchanged code

Example commands for smart selection (Jest):

# Run only tests affected by changes
npm test -- --changedSince=origin/main

# Run tests for specific module and dependents
npm test -- --selectProjects=user-service --testPathPattern=user

# Watch mode with smart re-running
npm test -- --watch --changedSince=HEAD

Best Practices

  1. Test Behavior, Not Implementation

    • Tests should verify outcomes, not internal mechanics
    • Refactoring should not break tests if behavior unchanged
  2. Keep Tests Independent

    • No shared mutable state between tests
    • Each test sets up its own context
    • Tests can run in any order
  3. Use Test Doubles Appropriately (see the sketch after this list)

    • Stubs for providing test data
    • Mocks for verifying interactions
    • Fakes for complex dependencies
    • Real implementations when feasible
  4. Maintain Test Quality

    • Apply same code quality standards to tests
    • Refactor test code for readability
    • Remove obsolete tests promptly
  5. Fast Feedback Loop

    • Optimize for quick local test runs
    • Use watch mode during development
    • Prioritize fast tests in CI
  6. Document Test Intent

    • Clear test names describe behavior
    • Add comments for non-obvious setup
    • Link tests to requirements/tickets
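
To make the stub/mock distinction concrete, a minimal Jest sketch (UserRegistration and its collaborators are hypothetical):

// Stub: supplies canned data; nothing is verified against it
const userRepoStub = {
  findByEmail: jest.fn().mockResolvedValue(null), // "email is not taken"
};

// Mock: exists so the interaction can be verified
const emailServiceMock = { sendWelcome: jest.fn() };

test("registration sends a welcome email", async () => {
  const registration = new UserRegistration(userRepoStub, emailServiceMock);
  await registration.register("alice@example.com");

  // The assertion targets the interaction, not returned data
  expect(emailServiceMock.sendWelcome).toHaveBeenCalledWith("alice@example.com");
});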

Examples

Example: Feature Test Strategy Document

# Feature: User Registration

## Risk Assessment

- Business Impact: HIGH (user acquisition)
- Complexity: MEDIUM (email validation, password rules)
- Change Frequency: LOW (stable feature)

## Test Coverage Plan

### Unit Tests (P0)

- [ ] Email format validation
- [ ] Password strength requirements
- [ ] Username uniqueness check logic
- [ ] Profile data sanitization

### Integration Tests (P1)

- [ ] Database user creation
- [ ] Email service integration
- [ ] Duplicate email handling

### E2E Tests (P0)

- [ ] Happy path: complete registration flow
- [ ] Error path: duplicate email shows error

## Coverage Targets

- Line coverage: 85%
- Branch coverage: 80%
- Critical paths: 100%

Example: Test Organization Configuration

// jest.config.js
module.exports = {
  projects: [
    {
      displayName: "unit",
      testMatch: ["<rootDir>/tests/unit/**/*.test.ts"],
      setupFilesAfterEnv: ["<rootDir>/tests/helpers/unit-setup.ts"],
    },
    {
      displayName: "integration",
      testMatch: ["<rootDir>/tests/integration/**/*.test.ts"],
      setupFilesAfterEnv: ["<rootDir>/tests/helpers/integration-setup.ts"],
      globalSetup: "<rootDir>/tests/helpers/db-setup.ts",
      globalTeardown: "<rootDir>/tests/helpers/db-teardown.ts",
    },
  ],
  coverageThreshold: {
    global: {
      branches: 75,
      functions: 80,
      lines: 80,
      statements: 80,
    },
    "./src/services/": {
      branches: 90,
      lines: 90,
    },
  },
};

Example: Risk-Based Test Selection Script

// scripts/select-tests.ts
interface TestFile {
  path: string;
  priority: "P0" | "P1" | "P2" | "P3";
  tags: string[];
}

function selectTestsForPipeline(
  context: "commit" | "pr" | "nightly" | "weekly",
): TestFile[] {
  // getTestManifest() is assumed to load the repo's full test inventory
  const allTests = getTestManifest();

  const priorityMap = {
    commit: ["P0"],
    pr: ["P0", "P1"],
    nightly: ["P0", "P1", "P2"],
    weekly: ["P0", "P1", "P2", "P3"],
  };

  return allTests.filter((test) =>
    priorityMap[context].includes(test.priority),
  );
}
