test-data-management

Name: test-data-management
Rating: 85
Author: proffesor-for-testing

Agentic QE Fleet is an open-source, AI-powered quality engineering platform designed for use with Claude Code. It provides specialized agents and skills that support testing activities for a product at any stage of the SDLC. It is free to use, fork, build on, and contribute to, and is based on the Agentic QE Framework created by Dragan Spiridonov.

⭐ 132 · 🍴 27 · 📅 Jan 23, 2026

agenticqe agenticsfoundation agents quality-engineering utility-development tool-building productivity-enhancement workflow-improvement

SKILL.md

Test Data Management

<default_to_action> When creating or managing test data:

NEVER use production PII directly
GENERATE synthetic data with faker libraries
ANONYMIZE production data if used (mask, hash)
ISOLATE test data (transactions, per-test cleanup)
SCALE with batch generation (10k+ records/sec)

Quick Data Strategy:

Unit tests: Minimal data (just enough)
Integration: Realistic data (full complexity)
Performance: Volume data (10k+ records)

Critical Success Factors:

40% of test failures stem from inadequate data
GDPR fines for PII violations reach up to €20M or 4% of annual worldwide turnover, whichever is higher
Never store production PII in test environments </default_to_action>

Quick Reference Card

When to Use

Creating test datasets
Handling sensitive data
Performance testing with volume
GDPR/CCPA compliance

Data Strategies

Type	When	Size
Minimal	Unit tests	1-10 records
Realistic	Integration	100-1000 records
Volume	Performance	10k+ records
Edge cases	Boundary testing	Targeted
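
For the "Edge cases" row, one common way to choose targeted records is boundary value analysis. The sketch below is a minimal illustration, assuming an inclusive integer range; `boundaryValues` is a hypothetical helper name and is not part of the skill.

```typescript
// Hypothetical helper: boundary values for an inclusive integer range.
// Returns the values just outside, on, and just inside each bound.
function boundaryValues(min: number, max: number): number[] {
  if (!Number.isInteger(min) || !Number.isInteger(max) || min > max) {
    throw new RangeError('expected integers with min <= max');
  }
  const candidates = [min - 1, min, min + 1, max - 1, max, max + 1];
  // A Set removes the duplicates that appear when the range is very narrow.
  return Array.from(new Set(candidates)).sort((a, b) => a - b);
}

// Example: an age field constrained to 18..90.
const ageCases = boundaryValues(18, 90);
// → [17, 18, 19, 89, 90, 91]
```

Six targeted values like these usually cover more boundary bugs than hundreds of random in-range records.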

Privacy Techniques

Technique	Use Case
Synthetic	Generate fake data (preferred)
Masking	j***@example.com
Hashing	Irreversible pseudonymization
Tokenization	Reversible with key
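
The difference between irreversible hashing and reversible tokenization in the table above is easiest to see in code. The following is a minimal in-memory sketch of vault-based tokenization; `TokenVault` is a hypothetical class, and the counter-based token format is an assumption chosen to keep the example deterministic.

```typescript
// Sketch of vault-based tokenization: tokens are opaque stand-ins, and only
// whoever holds the vault can map a token back to the original value.
class TokenVault {
  private readonly tokenToValue = new Map<string, string>();
  private readonly valueToToken = new Map<string, string>();
  private counter = 0;

  // Return the existing token for a value, or mint a new one.
  tokenize(value: string): string {
    const existing = this.valueToToken.get(value);
    if (existing !== undefined) return existing;
    // Assumption: a counter keeps the example deterministic. A real vault
    // would mint unguessable tokens from a cryptographic random source.
    this.counter += 1;
    const token = `tok_${this.counter}`;
    this.tokenToValue.set(token, value);
    this.valueToToken.set(value, token);
    return token;
  }

  // Reverse lookup; returns undefined for unknown tokens.
  detokenize(token: string): string | undefined {
    return this.tokenToValue.get(token);
  }
}

const vault = new TokenVault();
const token = vault.tokenize('john@example.com');     // 'tok_1'
const sameToken = vault.tokenize('john@example.com'); // same value, same token
const original = vault.detokenize(token);             // 'john@example.com'
```

Because the same input always yields the same token, joins across tables keep working, while the test environment never sees the real value unless it also has the vault.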

Synthetic Data Generation

import { faker } from '@faker-js/faker';

// Seed for reproducibility
faker.seed(123);

function generateUser() {
  return {
    id: faker.string.uuid(),
    email: faker.internet.email(),
    firstName: faker.person.firstName(),
    lastName: faker.person.lastName(),
    phone: faker.phone.number(),
    address: {
      street: faker.location.streetAddress(),
      city: faker.location.city(),
      zip: faker.location.zipCode()
    },
    createdAt: faker.date.past()
  };
}

// Generate 1000 users
const users = Array.from({ length: 1000 }, generateUser);
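
Seeding matters because a failing test should fail the same way on every run. The sketch below illustrates the idea with a small seeded pseudo-random generator (the public-domain mulberry32 algorithm) and no dependency on faker. `seededRandom`, `sampleNames`, and the name list are hypothetical, and this is not a description of faker's internals.

```typescript
// Small seeded PRNG (mulberry32): same seed, same sequence of numbers in [0, 1).
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const firstNames = ['Ada', 'Grace', 'Linus', 'Margaret'];

// Build n names from a fresh generator seeded with `seed`.
function sampleNames(seed: number, n: number): string[] {
  const rand = seededRandom(seed);
  const out: string[] = [];
  for (let i = 0; i < n; i++) {
    out.push(firstNames[Math.floor(rand() * firstNames.length)]);
  }
  return out;
}

const runA = sampleNames(123, 5);
const runB = sampleNames(123, 5);
// runA and runB contain the same names in the same order.
```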

Test Data Builder Pattern

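import { faker } from '@faker-js/faker';

// Assumed minimal shape of User for this example
interface User {
  id: string;
  email: string;
  role: 'admin' | 'customer';
  permissions: string[];
}

class UserBuilder {
  private user: Partial<User> = {};

  asAdmin() {
    this.user.role = 'admin';
    this.user.permissions = ['read', 'write', 'delete'];
    return this;
  }

  asCustomer() {
    this.user.role = 'customer';
    this.user.permissions = ['read'];
    return this;
  }

  withEmail(email: string) {
    this.user.email = email;
    return this;
  }

  build(): User {
    // Defaults first, then any values set explicitly on the builder
    return {
      id: faker.string.uuid(),
      email: faker.internet.email(),
      role: 'customer',
      permissions: ['read'], // matches the default 'customer' role
      ...this.user
    };
  }
}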

// Usage
const admin = new UserBuilder().asAdmin().withEmail('admin@test.com').build();
const customer = new UserBuilder().asCustomer().build();

Data Anonymization

// Masking
function maskEmail(email) {
  const [user, domain] = email.split('@');
  return `${user[0]}***@${domain}`;
}
// john@example.com → j***@example.com

function maskCreditCard(cc) {
  return `****-****-****-${cc.slice(-4)}`;
}
// 4242424242424242 → ****-****-****-4242

// Anonymize production data
const anonymizedUsers = prodUsers.map(user => ({
  id: user.id, // Keep ID for relationships
  email: `user-${user.id}@example.com`, // Fake email
  firstName: faker.person.firstName(), // Generated
  phone: null, // Remove PII
  createdAt: user.createdAt // Keep non-PII
}));
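
After anonymizing, it can be worth checking that no original PII value survived the transformation. A minimal sketch, assuming flat records that share an `id`; `findPiiLeaks` and the sample rows are hypothetical.

```typescript
type Row = { id: number; [field: string]: unknown };

// Report every PII field whose original value still appears unchanged
// in the anonymized copy of the same record (matched by id).
function findPiiLeaks(
  originals: Row[],
  anonymized: Row[],
  piiFields: string[]
): string[] {
  const byId = new Map<number, Row>();
  anonymized.forEach((row) => byId.set(row.id, row));

  const leaks: string[] = [];
  for (const source of originals) {
    const masked = byId.get(source.id);
    if (!masked) continue;
    for (const field of piiFields) {
      const value = source[field];
      // null/undefined originals carry no PII, so they cannot leak.
      if (value != null && masked[field] === value) {
        leaks.push(`${source.id}.${field}`);
      }
    }
  }
  return leaks;
}

const prodRows: Row[] = [
  { id: 1, email: 'john@corp.test', phone: '555-0100' },
  { id: 2, email: 'jane@corp.test', phone: '555-0101' },
];
const safeRows: Row[] = [
  { id: 1, email: 'user-1@example.com', phone: null },
  { id: 2, email: 'user-2@example.com', phone: '555-0101' }, // phone not scrubbed
];

const leaks = findPiiLeaks(prodRows, safeRows, ['email', 'phone']);
// → ['2.phone']
```

This only compares field to field, so it will not catch a value that was copied into a different column or embedded in free text.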

Database Transaction Isolation

// Best practice: use transactions for cleanup
beforeEach(async () => {
  await db.beginTransaction();
});

afterEach(async () => {
  await db.rollbackTransaction(); // Auto cleanup!
});

test('user registration', async () => {
  const user = await userService.register({
    email: 'test@example.com'
  });
  expect(user.id).toBeDefined();
  // Automatic rollback after test - no cleanup needed
});
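
To see why rollback gives each test a clean slate, here is a minimal in-memory sketch. `FakeDb` is a hypothetical stand-in that copies its rows on begin; real databases implement transactions very differently, so treat this only as a model of the observable behavior.

```typescript
// In-memory stand-in for a database that supports one open transaction.
class FakeDb {
  private rows: string[] = [];
  private snapshot: string[] | null = null;

  beginTransaction(): void {
    if (this.snapshot !== null) throw new Error('transaction already open');
    this.snapshot = this.rows.slice(); // remember the pre-test state
  }

  rollbackTransaction(): void {
    if (this.snapshot === null) throw new Error('no open transaction');
    this.rows = this.snapshot; // discard everything written since begin
    this.snapshot = null;
  }

  insert(email: string): void {
    this.rows.push(email);
  }

  count(): number {
    return this.rows.length;
  }
}

const fakeDb = new FakeDb();
fakeDb.insert('seed@example.com'); // baseline data that every test may rely on

fakeDb.beginTransaction();             // what beforeEach does
fakeDb.insert('test@example.com');     // the test writes data
const duringTest = fakeDb.count();     // 2
fakeDb.rollbackTransaction();          // what afterEach does
const afterTest = fakeDb.count();      // 1, the baseline is untouched
```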

Volume Data Generation

// Generate 10,000 users efficiently
async function generateLargeDataset(count = 10000) {
  const batchSize = 1000;
  const batches = Math.ceil(count / batchSize);

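  for (let i = 0; i < batches; i++) {
    // The last batch may be smaller when count is not a multiple of batchSize
    const size = Math.min(batchSize, count - i * batchSize);
    const users = Array.from({ length: size }, (_, index) => ({
      id: i * batchSize + index,
      email: `user${i * batchSize + index}@example.com`,
      firstName: faker.person.firstName()
    }));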

    await db.users.insertMany(users); // Batch insert
    console.log(`Batch ${i + 1}/${batches}`);
  }
}
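
One detail worth making explicit in batch generation is the size of the final batch when the requested count is not a multiple of the batch size. The sketch below separates that arithmetic into a pure function that is easy to test on its own; `planBatches` and `Batch` are hypothetical names.

```typescript
interface Batch {
  start: number; // index of the first record in this batch
  size: number;  // number of records in this batch
}

// Split `count` records into batches of at most `batchSize`,
// so the final batch is shorter when count is not a multiple.
function planBatches(count: number, batchSize: number): Batch[] {
  if (!Number.isInteger(count) || count < 0) {
    throw new RangeError('count must be a non-negative integer');
  }
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches: Batch[] = [];
  for (let start = 0; start < count; start += batchSize) {
    batches.push({ start, size: Math.min(batchSize, count - start) });
  }
  return batches;
}

const plan = planBatches(2500, 1000);
// → [{ start: 0, size: 1000 }, { start: 1000, size: 1000 }, { start: 2000, size: 500 }]
const total = plan.reduce((sum, b) => sum + b.size, 0); // 2500
```

Each planned batch can then drive one bulk insert, with `start` used as the ID offset.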

Agent-Driven Data Generation

// High-speed generation with constraints
await Task("Generate Test Data", {
  schema: 'ecommerce',
  count: { users: 10000, products: 500, orders: 5000 },
  preserveReferentialIntegrity: true,
  constraints: {
    age: { min: 18, max: 90 },
    roles: ['customer', 'admin']
  }
}, "qe-test-data-architect");

// GDPR-compliant anonymization
await Task("Anonymize Production Data", {
  source: 'production-snapshot',
  piiFields: ['email', 'phone', 'ssn'],
  method: 'pseudonymization',
  retainStructure: true
}, "qe-test-data-architect");
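
The `preserveReferentialIntegrity` option above describes a property that can be illustrated without any agent: create parent records first, then make child records reference only IDs that exist. The sketch below is an assumption-laden illustration, not the agent's implementation; all type and function names are hypothetical, and the age range mirrors the 18 to 90 constraint shown above.

```typescript
interface ShopUser { id: number; age: number }
interface ShopProduct { id: number; priceCents: number }
interface ShopOrder { id: number; userId: number; productId: number }

// Build parents first, then children that only reference existing parent ids.
function buildShopData(userCount: number, productCount: number, orderCount: number) {
  if (orderCount > 0 && (userCount < 1 || productCount < 1)) {
    throw new RangeError('orders need at least one user and one product');
  }
  const users: ShopUser[] = [];
  for (let i = 0; i < userCount; i++) {
    users.push({ id: i + 1, age: 18 + (i % 73) }); // ages cycle through 18..90
  }
  const products: ShopProduct[] = [];
  for (let i = 0; i < productCount; i++) {
    products.push({ id: i + 1, priceCents: 100 + i * 50 });
  }
  const orders: ShopOrder[] = [];
  for (let i = 0; i < orderCount; i++) {
    orders.push({
      id: i + 1,
      // Index into the parent arrays so every foreign key is guaranteed to exist.
      userId: users[i % users.length].id,
      productId: products[(i * 7) % products.length].id,
    });
  }
  return { users, products, orders };
}

// True when every order points at a known user and a known product.
function hasReferentialIntegrity(data: ReturnType<typeof buildShopData>): boolean {
  const userIds = new Set(data.users.map((u) => u.id));
  const productIds = new Set(data.products.map((p) => p.id));
  return data.orders.every((o) => userIds.has(o.userId) && productIds.has(o.productId));
}

const shop = buildShopData(100, 50, 500);
const ok = hasReferentialIntegrity(shop); // true
```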

Agent Coordination Hints

Memory Namespace

aqe/test-data-management/
├── schemas/*            - Data schemas
├── generators/*         - Generator configs
├── anonymization/*      - PII handling rules
└── fixtures/*           - Reusable fixtures

Fleet Coordination

const dataFleet = await FleetManager.coordinate({
  strategy: 'test-data-generation',
  agents: [
    'qe-test-data-architect',  // Generate data
    'qe-test-executor',        // Execute with data
    'qe-security-scanner'      // Validate no PII exposure
  ],
  topology: 'sequential'
});
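
The `sequential` topology means each agent runs only after the previous one finishes and receives its output. The sketch below models that ordering with plain functions. It is not the FleetManager API: `PipelineStep`, `runSequential`, and the stand-in step bodies are all hypothetical.

```typescript
// Hypothetical step shape: each agent receives the previous agent's output.
interface PipelineStep {
  agent: string;
  run: (records: string[]) => string[];
}

// Run steps strictly in order and record which agent ran when.
function runSequential(steps: PipelineStep[], initial: string[]) {
  const trace: string[] = [];
  let records = initial;
  for (const step of steps) {
    records = step.run(records);
    trace.push(step.agent);
  }
  return { records, trace };
}

const pipeline: PipelineStep[] = [
  {
    agent: 'qe-test-data-architect',
    // Stand-in generator: two synthetic email addresses.
    run: () => ['user-1@example.com', 'user-2@example.com'],
  },
  {
    agent: 'qe-test-executor',
    // Stand-in executor: passes the data through unchanged.
    run: (records) => records,
  },
  {
    agent: 'qe-security-scanner',
    // Stand-in scanner: reject anything outside the reserved example.com domain.
    run: (records) => {
      const leaked = records.filter((r) => !r.endsWith('@example.com'));
      if (leaked.length > 0) throw new Error(`possible PII: ${leaked.join(', ')}`);
      return records;
    },
  },
];

const result = runSequential(pipeline, []);
// result.trace → ['qe-test-data-architect', 'qe-test-executor', 'qe-security-scanner']
```

Placing the scanner last means a leak stops the run before any results are reported as valid.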

Related Skills

database-testing - Schema and integrity testing
compliance-testing - GDPR/CCPA compliance
performance-testing - Volume data for perf tests

Remember

Test data is infrastructure, not an afterthought. 40% of test failures are caused by inadequate test data. Poor data = poor tests.

Never use production PII directly. GDPR fines reach up to €20M or 4% of annual worldwide turnover, whichever is higher. Always use synthetic data or properly anonymized production snapshots.

With Agents: qe-test-data-architect generates 10k+ records/sec with realistic patterns, relationships, and constraints. Agents ensure GDPR/CCPA compliance automatically and eliminate test data bottlenecks.

Score

Total Score

85/100

Based on repository quality metrics

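Criterion	Requirement	Points
✓ SKILL.md	Includes a SKILL.md file	+20
✓ LICENSE	A license is set	+10
✓ Description	Description is at least 100 characters long	+10
✓ Popularity	100 or more GitHub stars	
✓ Recent activity	Updated within the last month	+10
✓ Forks	Forked 10 or more times	
✓ Issue management	Fewer than 50 open issues	
✓ Language	A programming language is set	
✓ Tags	At least one tag is set	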
