[G1-Sync] Manual knowledge update

Antigravity Agent
2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -0,0 +1,389 @@
---
id: testing-test-data-management
title: Test Data — fixture / factory / seed / clean
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [testing, data, vibe-coding]
tech_stack: { language: "TS / Python", applicable_to: ["Backend", "QA"] }
applied_in: []
aliases: [test data, fixture, factory, seed, faker, builder pattern, anonymization]
---
# Test Data Management
> The hardest part of testing is the data. **Factory + faker (random) + seed (deterministic) + cleanup (after each test)**. Never use real PII, and keep runs repeatable.
## 📖 Core concepts
- Fixture: static data (a file).
- Factory: a dynamic builder.
- Seed: the initial DB state.
- Anonymized: production data with PII removed.
## 💻 Code patterns
### Fixture (static)
```ts
// fixtures/users.json
[
  { "id": 1, "email": "alice@test.com", "role": "admin" },
  { "id": 2, "email": "bob@test.com", "role": "user" }
]

// Usage
import users from './fixtures/users.json';
beforeEach(() => db.users.insertAll(users));
```
→ Good for small, rarely changing datasets. Large fixtures become hard to maintain.
### Factory (dynamic)
```ts
import { faker } from '@faker-js/faker';

function userFactory(overrides: Partial<User> = {}): User {
  return {
    id: faker.string.uuid(),
    email: faker.internet.email(),
    name: faker.person.fullName(),
    age: faker.number.int({ min: 18, max: 80 }),
    role: 'user', // default role, overridden per test when needed
    createdAt: faker.date.past(),
    ...overrides,
  };
}

// Usage
const admin = userFactory({ role: 'admin' });
const users = Array.from({ length: 10 }, () => userFactory());
```
→ Every test gets fresh data.
### Builder pattern
```ts
class UserBuilder {
  private user: Partial<User> = {};

  withEmail(e: string) { this.user.email = e; return this; }
  withRole(r: Role) { this.user.role = r; return this; }
  asAdmin() { this.user.role = 'admin'; return this; }

  // Start from factory defaults, then apply the explicitly set fields.
  build(): User {
    return { ...userFactory(), ...this.user };
  }
}

const admin = new UserBuilder().asAdmin().withEmail('a@x').build();
```
### fishery (TS factory lib)
```ts
import { faker } from '@faker-js/faker';
import { Factory } from 'fishery';

const userFactory = Factory.define<User>(({ sequence }) => ({
  id: sequence,
  email: `user${sequence}@test.com`,
  name: faker.person.fullName(),
}));

const u = userFactory.build({ name: 'Alice' });
const list = userFactory.buildList(10);
```
### factory_boy (Python)
```python
import factory

# User is the application's own model class (not shown here).
class UserFactory(factory.Factory):
    class Meta:
        model = User

    id = factory.Sequence(lambda n: n)
    email = factory.Sequence(lambda n: f'user{n}@test.com')
    role = 'user'


class AdminFactory(UserFactory):
    role = 'admin'


admin = AdminFactory()
```
### Seed (DB)
```ts
// seed.ts
import { faker } from '@faker-js/faker';

async function seed() {
  // 100 users; ids are generated up front so orders can reference them
  const users = Array.from({ length: 100 }, () => ({
    id: faker.string.uuid(),
    email: faker.internet.email(),
    name: faker.person.fullName(),
  }));
  await db.users.insertAll(users);

  // 1000 orders, each attached to a random user
  const orders = Array.from({ length: 1000 }, () => ({
    userId: faker.helpers.arrayElement(users).id,
    amount: faker.number.int({ min: 10, max: 1000 }),
  }));
  await db.orders.insertAll(orders);
}

await seed();
```
→ Initializes dev / staging environments.
### Faker generators
```ts
faker.person.fullName(); // 'Alice Johnson'
faker.internet.email(); // 'alice@example.com'
faker.location.city(); // 'New York'
faker.commerce.productName(); // 'Laptop'
faker.number.int({ min: 1, max: 100 });
faker.date.recent();
faker.lorem.paragraphs(3);
faker.image.url();
faker.string.uuid();
```
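→ Locales are supported. In `@faker-js/faker` v8+ (the API used above), import a localized instance such as `import { fakerKO } from '@faker-js/faker'`; the older `faker.locale = 'ko'` setter was removed in v8.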
### Determinism
```ts
faker.seed(42);
const u1 = faker.person.fullName(); // same result on every run (for a given faker version)

// Sequence
let _id = 0;
function nextId() { return _id++; }
```
→ Tests become reproducible.
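To make the effect of seeding concrete without depending on faker, the following is a minimal sketch that swaps in a small linear congruential generator for the random source. `seededRandom`, `generateUsers`, and the `SeededUser` shape are hypothetical names introduced only for this illustration.

```typescript
// Hypothetical stand-in for faker.seed(): a tiny linear congruential generator.
// The same seed always yields the same sequence of numbers in [0, 1).
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

interface SeededUser {
  id: number;
  name: string;
  age: number;
}

const SAMPLE_NAMES = ['Alice', 'Bob', 'Carol', 'Dave'];

// Builds `count` users whose "random" fields depend only on the seed.
function generateUsers(seed: number, count: number): SeededUser[] {
  const rand = seededRandom(seed);
  const users: SeededUser[] = [];
  for (let i = 1; i <= count; i++) {
    const name = SAMPLE_NAMES[Math.floor(rand() * SAMPLE_NAMES.length)];
    const age = 18 + Math.floor(rand() * 63); // 18..80 inclusive
    users.push({ id: i, name, age });
  }
  return users;
}

const firstRun = generateUsers(42, 3);
const secondRun = generateUsers(42, 3);
console.log(JSON.stringify(firstRun) === JSON.stringify(secondRun)); // true
```

The sequence counter handles ids and the seeded generator handles everything else, so a failing test can be replayed with the exact same data.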
### Reset (test isolation)
```ts
afterEach(async () => {
  await db.query('TRUNCATE users, orders CASCADE');
});

// Or: wrap each test in a transaction and roll it back.
// BEGIN and ROLLBACK must run on the same connection as the test's queries.
beforeEach(async () => {
  await db.query('BEGIN');
});
afterEach(async () => {
  await db.query('ROLLBACK');
});
```
→ Every test starts clean.
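The begin/rollback idea can be sketched without a database. The in-memory `InMemoryTable` and the `runIsolated` helper below are hypothetical and only imitate what a real transaction does: changes made inside the test body disappear afterwards.

```typescript
// Hypothetical in-memory table that imitates BEGIN / ROLLBACK with a snapshot.
class InMemoryTable<T> {
  private rows: T[] = [];
  private snapshot: T[] | null = null;

  insert(row: T): void {
    this.rows.push(row);
  }

  count(): number {
    return this.rows.length;
  }

  begin(): void {
    this.snapshot = this.rows.slice();
  }

  rollback(): void {
    if (this.snapshot === null) throw new Error('rollback without begin');
    this.rows = this.snapshot;
    this.snapshot = null;
  }
}

// Runs a test body and always rolls back, even if the body throws.
function runIsolated<T>(table: InMemoryTable<T>, testBody: () => void): void {
  table.begin();
  try {
    testBody();
  } finally {
    table.rollback();
  }
}

const usersTable = new InMemoryTable<{ email: string }>();
usersTable.insert({ email: 'seed@test.com' });

let countInsideTest = 0;
runIsolated(usersTable, () => {
  usersTable.insert({ email: 'temp@test.com' });
  countInsideTest = usersTable.count();
});

console.log(countInsideTest, usersTable.count()); // 2 1
```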
### Snapshot DB
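```bash
# Fast reset: dump the seeded database once
pg_dump test_db > snapshot.sql
# Before each test: recreate the database from the snapshot
dropdb test_db && createdb test_db && psql test_db < snapshot.sql
```
→ Avoids re-running a large seed every time.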
### Testcontainers
```ts
import { PostgreSqlContainer, StartedPostgreSqlContainer } from '@testcontainers/postgresql';

let pg: StartedPostgreSqlContainer;

beforeAll(async () => {
  pg = await new PostgreSqlContainer().start();
  await runMigrations(pg.getConnectionUri());
  await seed();
});

afterAll(() => pg.stop());
```
→ A Docker container starts before the tests and stops after them.
### Production data anonymization
```sql
-- Prod → staging dump
UPDATE users SET
email = 'user' || id || '@test.com',
phone = '000-0000-0000',
ssn = NULL,
full_name = 'User ' || id;
DELETE FROM payment_methods;
DELETE FROM messages WHERE created_at < NOW() - INTERVAL '30 days';
```
→ Removes PII and payment data.
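The same rules can be applied in application code when rows are exported through a script rather than SQL. This is a minimal sketch that mirrors the `UPDATE` above; the `ProdUser` shape and the `anonymizeUser` function are assumptions made for illustration.

```typescript
// Hypothetical row shape mirroring the columns touched by the SQL above.
interface ProdUser {
  id: number;
  email: string;
  phone: string;
  ssn: string | null;
  fullName: string;
}

// Replaces every PII field with a value derived only from the row id.
function anonymizeUser(user: ProdUser): ProdUser {
  return {
    id: user.id,
    email: `user${user.id}@test.com`,
    phone: '000-0000-0000',
    ssn: null,
    fullName: `User ${user.id}`,
  };
}

const rawUser: ProdUser = {
  id: 7,
  email: 'real.person@example.com',
  phone: '010-1234-5678',
  ssn: '123-45-6789',
  fullName: 'Real Person',
};

const safeUser = anonymizeUser(rawUser);
console.log(safeUser.email); // user7@test.com
```

Deriving the replacement values from the id keeps emails unique, so unique constraints still hold after anonymization.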
### Synthetic prod data
```python
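# Learn the distribution of the prod data, then sample fake rows.
# Note: sdv.tabular is the pre-1.0 SDV API; SDV 1.x exposes
# GaussianCopulaSynthesizer in sdv.single_table instead.
from sdv.tabular import GaussianCopula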
model = GaussianCopula()
model.fit(prod_users_df)
fake_users = model.sample(10000)
```
→ See [[AI_Synthetic_Data]].
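### Masking vs. fake vs. synthetic
```
Masking:   obscure existing values (Alice → A*****)
Faker:     brand-new random values
Synthetic: new values that preserve the original distribution
→ Statistical analysis = synthetic.
  General tests = faker.
```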
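Masking is the simplest of the three to show in code. The sketch below keeps the first character and appends a fixed-width mask, so the original length is not revealed; `maskValue` and `maskEmail` are hypothetical helper names.

```typescript
// Keeps the first `visible` characters and appends a fixed-width mask,
// so the output does not reveal the original length.
function maskValue(value: string, visible: number = 1, maskWidth: number = 5): string {
  if (value.length === 0) return value;
  return value.slice(0, visible) + '*'.repeat(maskWidth);
}

// Masks only the local part of an email and keeps the domain readable.
function maskEmail(email: string): string {
  const at = email.indexOf('@');
  if (at < 0) return maskValue(email);
  return maskValue(email.slice(0, at)) + email.slice(at);
}

console.log(maskValue('Alice')); // A*****
console.log(maskEmail('alice@test.com')); // a*****@test.com
```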
### Time travel
```ts
// Time-dependent tests: freeze "now" at a known instant
import MockDate from 'mockdate';
MockDate.set('2026-05-09T00:00:00Z');
// ... test ...
afterEach(() => MockDate.reset());
```
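An alternative to patching the global `Date` is to pass the clock in as a parameter. This is a different technique from MockDate, shown here as a minimal sketch; `Clock`, `isTrialExpired`, and the 14-day trial are hypothetical.

```typescript
// A clock is just a function that returns the current time.
type Clock = () => Date;

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Production code uses the real clock by default; tests pass a fixed one.
function isTrialExpired(
  startedAt: Date,
  trialDays: number,
  now: Clock = () => new Date(),
): boolean {
  return now().getTime() - startedAt.getTime() > trialDays * MS_PER_DAY;
}

const fixedClock: Clock = () => new Date('2026-05-09T00:00:00Z');

const startedRecently = new Date('2026-05-01T00:00:00Z'); // 8 days before the fixed "now"
const startedLongAgo = new Date('2026-04-01T00:00:00Z'); // 38 days before the fixed "now"

console.log(isTrialExpired(startedRecently, 14, fixedClock)); // false
console.log(isTrialExpired(startedLongAgo, 14, fixedClock)); // true
```

With an injected clock there is no global state to reset after the test, at the cost of threading one extra parameter through the code under test.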
### The UUID pitfall
```ts
// ❌ A different UUID on every run breaks the snapshot
const random = { id: faker.string.uuid() };
expect(random).toMatchSnapshot();

// ✅ Fixed value
const fixed = { id: '00000000-0000-0000-0000-000000000001' };
```
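When a test needs many distinct but stable ids, a counter-based generator avoids hand-writing each one. The sketch below is an assumption-level illustration; `makeUuidSequence` is a hypothetical helper, and its output is UUID-shaped but does not carry valid version or variant bits.

```typescript
// Returns a generator of predictable, UUID-shaped ids: ...0001, ...0002, ...
// The values are stable across runs, so snapshots do not change.
function makeUuidSequence(): () => string {
  let counter = 0;
  return () => {
    counter += 1;
    const suffix = ('000000000000' + counter.toString(16)).slice(-12);
    return `00000000-0000-0000-0000-${suffix}`;
  };
}

const nextUuid = makeUuidSequence();
const firstId = nextUuid();
const secondId = nextUuid();

console.log(firstId); // 00000000-0000-0000-0000-000000000001
console.log(secondId); // 00000000-0000-0000-0000-000000000002
```

Creating a new sequence per test keeps ids independent of test ordering.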
### Sharing test data
```
beforeAll  = shared by all tests.
beforeEach = fresh for every test.
→ Read-only tests: beforeAll is fine.
  Write tests: beforeEach is required (isolation).
```
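One way to keep shared data safe is to freeze it and hand each writing test its own copy. A minimal sketch, assuming a hypothetical `FixtureUser` shape and `freshUsers` helper:

```typescript
interface FixtureUser {
  id: number;
  email: string;
  role: string;
}

const baseUsers: FixtureUser[] = [
  { id: 1, email: 'alice@test.com', role: 'admin' },
  { id: 2, email: 'bob@test.com', role: 'user' },
];

// Freeze the shared data so a read-only test cannot modify it by accident.
baseUsers.forEach((u) => Object.freeze(u));
Object.freeze(baseUsers);
const SHARED_USERS: ReadonlyArray<Readonly<FixtureUser>> = baseUsers;

// Write tests get their own mutable copy instead of touching the shared one.
function freshUsers(): FixtureUser[] {
  return SHARED_USERS.map((u) => ({ ...u }));
}

const copyForWriteTest = freshUsers();
copyForWriteTest[0].role = 'banned';

console.log(SHARED_USERS[0].role, copyForWriteTest[0].role); // admin banned
```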
### Weaknesses of the builder
```
Large entities make for long builder chains:
new UserBuilder()
  .withName(...)
  .withEmail(...)
  .withRole(...)
  .withAddress(...)
  .build();
→ Good defaults plus a few overrides are often the better choice.
```
→ Factory + override (`{...}`) is more ergonomic.
### Data dependencies in tests
```
"X requires this user to exist"      → setup.
"Y requires this order to exist"     → setup + relation.
"This user's orders must sum to Z"   → complex setup.
→ Builders / factories help.
```
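The third case (a user whose orders add up to a known total) can be sketched with two small factories and one scenario helper. All names here (`buildUser`, `buildOrder`, `buildUserWithOrders`) are hypothetical.

```typescript
interface DepUser {
  id: number;
  email: string;
}

interface DepOrder {
  id: number;
  userId: number;
  amount: number;
}

let userSeq = 0;
let orderSeq = 0;

function buildUser(overrides: Partial<DepUser> = {}): DepUser {
  userSeq += 1;
  return { id: userSeq, email: `user${userSeq}@test.com`, ...overrides };
}

// The order factory takes its owner, so the relation is always consistent.
function buildOrder(user: DepUser, overrides: Partial<DepOrder> = {}): DepOrder {
  orderSeq += 1;
  return { id: orderSeq, userId: user.id, amount: 100, ...overrides };
}

// Scenario helper: one user plus one order per requested amount.
function buildUserWithOrders(amounts: number[]): { user: DepUser; orders: DepOrder[] } {
  const user = buildUser();
  const orders = amounts.map((amount) => buildOrder(user, { amount }));
  return { user, orders };
}

const scenario = buildUserWithOrders([30, 70, 150]);
const total = scenario.orders.reduce((sum, o) => sum + o.amount, 0);
console.log(total); // 250
```

The test states only the amounts it cares about; ids and the user/order relation are handled by the helpers.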
### Idempotent seed
```ts
async function seedIdempotent() {
  // Insert the admin only if it is not there yet
  const exists = await db.users.findOne({ email: 'admin@x.com' });
  if (!exists) {
    await db.users.insert({ email: 'admin@x.com', ... });
  }
}
```
→ Safe to re-run.
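The check-then-insert logic can be exercised against an in-memory stand-in to confirm that a second run changes nothing. This is a synchronous sketch; `FakeUserStore` and `seedAdmin` are hypothetical, and the `role` field is an assumed example column.

```typescript
interface SeedUser {
  email: string;
  role: string;
}

// Hypothetical in-memory store that rejects duplicate emails,
// similar to a unique constraint in a real database.
class FakeUserStore {
  private users: SeedUser[] = [];

  findOne(email: string): SeedUser | undefined {
    return this.users.filter((u) => u.email === email)[0];
  }

  insert(user: SeedUser): void {
    if (this.findOne(user.email)) throw new Error(`duplicate email: ${user.email}`);
    this.users.push(user);
  }

  size(): number {
    return this.users.length;
  }
}

// Returns true when a row was inserted, false when it already existed.
function seedAdmin(store: FakeUserStore): boolean {
  if (store.findOne('admin@x.com')) return false;
  store.insert({ email: 'admin@x.com', role: 'admin' });
  return true;
}

const seedStore = new FakeUserStore();
const firstSeed = seedAdmin(seedStore);
const secondSeed = seedAdmin(seedStore);
console.log(firstSeed, secondSeed, seedStore.size()); // true false 1
```

Against a real database, a unique constraint combined with an upsert (for example `INSERT ... ON CONFLICT DO NOTHING` in PostgreSQL) is usually more robust than check-then-insert when two seed runs can overlap.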
### Data in CI
```
CI gets a fresh DB / container on every run.
- Run migrations
- Run seed
- Run tests
→ Hermetic.
```
### Performance test data
```
Large volumes (100k users, 1M orders):
- Bulk insert (COPY in Postgres)
- Generate a file → load it
- Do not regenerate for every test
bulk_insert(users, 100_000)  # fast
```
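The "generate a file, then load it" step can be sketched as follows. `buildUserCsv` and `chunk` are hypothetical helpers, and the column names are assumptions; writing the string to disk and loading it are left out.

```typescript
// Builds CSV text (header + one line per user) suitable for a bulk load.
function buildUserCsv(count: number): string {
  const lines: string[] = ['id,email,name'];
  for (let i = 1; i <= count; i++) {
    lines.push(`${i},user${i}@test.com,User ${i}`);
  }
  return lines.join('\n') + '\n';
}

// Splits rows into batches for drivers that insert in multi-row statements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error('size must be positive');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const csv = buildUserCsv(100000);
const batches = chunk([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 4);
console.log(csv.split('\n').length - 2); // 100000 data lines
console.log(batches.length); // 3
```

A file like this could then be loaded in `psql` with something along the lines of `\copy users(id,email,name) FROM 'users.csv' WITH (FORMAT csv, HEADER true)`, assuming a `users` table with matching columns.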
### Pitfall: tests that assume prod data
```ts
// ❌ Assumes "user_id 1 is always the admin"
test('admin can delete', () => {
  ...
});

// ✅ Each test explicitly creates its own admin
test('admin can delete', () => {
  const admin = userFactory({ role: 'admin' });
  ...
});
```
### Fixtures vs factory
```
Fixture:
- Small / static / shared
- "Five typical users"
Factory:
- Dynamic / parameterized
- "An arbitrary admin / banned user"
→ The two can be used together.
```
## 🤔 Decision criteria
| Task | Recommendation |
|---|---|
| Simple unit test | Factory (fishery / factory_boy) |
| Integration test | Factory + DB seed |
| E2E test | Docker + large seed |
| Big data | Bulk insert + snapshot |
| Time-dependent | MockDate |
| Snapshot test | Fixed values (no random UUID) |
| Close to prod | Anonymized / synthetic |
## ❌ Anti-patterns
- **Tests assume prod data**: fragile.
- **Every test shares one fixture**: no isolation.
- **PII in tests**: risk of leakage.
- **No cleanup**: affects the next test.
- **Non-deterministic faker**: breaks snapshots.
- **Huge fixtures (10 MB+)**: unmaintainable.
- **Non-idempotent seed**: breaks on re-run.
## 🤖 Hints for LLM use
- Factory + faker = ergonomic.
- Isolate every test (cleanup / transaction).
- Prod-like = anonymized / synthetic.
- Determinism (seed) = reproducible.
## 🔗 Related documents
- [[Testing_Faker_and_Builders]]
- [[AI_Synthetic_Data]]
- [[DB_Migration_Safety]]