id: wiki-2026-0508-characterization-tests-특성화-테스트
title: Characterization Tests (특성화 테스트)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Golden Master Tests, Approval Tests, Pinning Tests, Snapshot Tests
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: testing, legacy-code, feathers, refactoring
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: typescript-python; framework: jest-pytest-approvaltests

Characterization Tests (특성화 테스트)

One-line summary

"Capture the actual behavior of legacy code: a photograph, not a spec." The technique was established by Michael Feathers in Working Effectively with Legacy Code (2004) and is the starting point for refactoring untested legacy code. The question is not "what should it do?" but "what does it do right now?", and the test locks that answer down. As of 2026, snapshot tests, ApprovalTests, and golden master tests all belong to the same family.

Key points

Definition (Feathers)

"A characterization test is a test that characterizes the actual behavior of a piece of code." There is no "correct" answer to check against; the test just records what the system does.

Compared with unit/spec tests

| Aspect | Spec test | Characterization test |
| --- | --- | --- |
| Source of truth | Requirements / spec | Actual current behavior |
| Failing means | Code has a bug | Behavior changed (maybe intentionally) |
| When to write | TDD, before the code | Before refactoring legacy code |
| Update on change | Fix the code | Review the diff, accept if intended |
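The difference in "source of truth" is easiest to see when both kinds of test target the same quirk. The sketch below uses a hypothetical `legacy_shipping_fee` function and an assumed requirement (orders of 10 kg or more ship at the bulk rate); neither comes from a real codebase.

```python
def legacy_shipping_fee(weight_kg: float) -> float:
    """Hypothetical legacy function. The assumed spec says orders of 10 kg
    or more ship at the bulk rate, but the code uses a strict `>`."""
    if weight_kg > 10:
        return 5.0
    return 8.0


def spec_test() -> None:
    # Source of truth: the requirement. A failure here means the code has a bug.
    assert legacy_shipping_fee(10) == 5.0


def characterization_test() -> None:
    # Source of truth: what the code returns today. The value 8.0 was copied
    # from an actual run, not derived from the spec.
    # NOTE: suspected bug (boundary should be >=); pinned on purpose until
    # the refactor is finished.
    assert legacy_shipping_fee(10) == 8.0
```

Against today's code the spec test fails and the characterization test passes. Once the bug is fixed deliberately, the roles flip, and the characterization test is updated after its diff has been reviewed.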

Procedure (Feathers)

  1. Pick a region of code.
  2. Write a test that invokes the code with realistic inputs.
  3. Assert against a placeholder that is sure to fail (e.g. assert result == "FILL_ME").
  4. Run the test and capture the actual output from the failure message.
  5. Replace the placeholder with the actual output.
  6. The test now pins the behavior, so you can refactor with confidence.

Variants

  • Snapshot test (Jest, Vitest): serialize output, compare next run.
  • Approval test (ApprovalTests): write to .approved.txt, manual review on diff.
  • Golden master: large input/output pair, often UI screenshot.
  • Property-based regression: random inputs, save outputs as golden.
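Of these variants, property-based regression is the least obvious to set up, so here is a minimal standard-library sketch. `legacy_discount`, the seed, and the golden file name are hypothetical stand-ins; the point is only that a fixed seed makes the "random" inputs reproducible, so their outputs can be saved as a golden file.

```python
import json
import random
import tempfile
from pathlib import Path


def legacy_discount(qty: int, unit_price_cents: int) -> int:
    """Hypothetical stand-in for the legacy code under test."""
    total = qty * unit_price_cents
    return total - total // 10 if qty >= 10 else total


def generate_cases(seed: int, n: int) -> list[dict]:
    # A fixed seed makes the "random" inputs identical on every run.
    rng = random.Random(seed)
    return [
        {"qty": rng.randint(1, 50), "unit_price_cents": rng.randint(1, 100_000)}
        for _ in range(n)
    ]


def check_against_golden(golden_path: Path, seed: int = 1234, n: int = 200) -> str:
    cases = generate_cases(seed, n)
    actual = [legacy_discount(**c) for c in cases]
    if not golden_path.exists():
        # First run: record current behavior; commit this file after review.
        golden_path.write_text(json.dumps(actual))
        return "recorded"
    expected = json.loads(golden_path.read_text())
    assert actual == expected, "behavior changed; review the diff before accepting"
    return "matched"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        golden = Path(tmp) / "golden.json"
        print(check_against_golden(golden))  # first run records
        print(check_against_golden(golden))  # second run compares
```

Integer cents are used here so that the comparison is exact; with floating-point outputs a tolerance-based comparison would probably be needed.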

Applications

  1. Refactoring legacy monolith without specs.
  2. Migration (framework upgrade, language port).
  3. Compiler / parser output stability.
  4. Report generation (PDFs, CSVs).
  5. UI visual regression (Percy, Chromatic).

💻 Patterns

Feathers procedure (TypeScript + Jest)

import { calculateInvoice } from './legacy';

describe('calculateInvoice — characterization', () => {
  it('records current behavior for typical input', () => {
    const result = calculateInvoice({
      items: [
        { sku: 'A', qty: 2, price: 19.99 },
        { sku: 'B', qty: 1, price: 5.00 },
      ],
      customerType: 'premium',
      country: 'KR',
    });

    // Step 1: write `expect(result).toBe('PLACEHOLDER')`
    // Step 2: run, observe actual:
    //   { subtotal: 44.98, discount: 4.50, vat: 4.05, total: 44.53 }
    // Step 3: paste it as expected
    expect(result).toEqual({
      subtotal: 44.98,
      discount: 4.50,
      vat: 4.05,
      total: 44.53,
    });
  });
});

Jest snapshot

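import { renderInvoiceHtml } from './render';
// Hypothetical fixture module: any fixed, deterministic invoice object works.
import { SAMPLE_INVOICE } from './fixtures';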

test('invoice html — characterized', () => {
  const html = renderInvoiceHtml(SAMPLE_INVOICE);
  expect(html).toMatchSnapshot();
});
// First run: writes __snapshots__/invoice.test.ts.snap
// Future runs: diffs. Use `--update-snapshot` after intentional change.

ApprovalTests (Python)

from approvaltests import verify

def test_pdf_layout():
    pdf_text = render_pdf(sample_data)
    verify(pdf_text)
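# `render_pdf` and `sample_data` are assumed to be defined elsewhere.
# On a first run or a mismatch, ApprovalTests writes a `*.received.txt` file
# next to the test and compares it to the matching `*.approved.txt`.
# CI fails on a diff; a developer reviews it, then renames received → approved.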
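The received/approved workflow described in the comments above can be sketched with the standard library alone. This is an illustration of the mechanism, not the ApprovalTests implementation; `verify_text` and the file-naming scheme are assumptions made for the example.

```python
from pathlib import Path


def verify_text(name: str, received: str, directory: Path) -> None:
    """Sketch of the approval workflow; not the ApprovalTests implementation."""
    approved_file = directory / f"{name}.approved.txt"
    received_file = directory / f"{name}.received.txt"
    approved = approved_file.read_text() if approved_file.exists() else None
    if received == approved:
        # Match: clean up any stale received file and pass.
        received_file.unlink(missing_ok=True)
        return
    # Mismatch or first run: keep the received output for human review.
    received_file.write_text(received)
    raise AssertionError(
        f"{received_file.name} differs from {approved_file.name}; "
        "review the diff, then rename received to approved to accept"
    )
```

The first call for a given name always fails, because nothing has been approved yet. That is intentional: a human has to look at the output once before it becomes the baseline.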

Golden master with multiple inputs

import json
from pathlib import Path
from legacy_pricing import compute_price

def test_pricing_golden_master():
    cases = json.loads(Path("fixtures/cases.json").read_text())
    actual = [compute_price(c["input"]) for c in cases]
    expected = json.loads(Path("fixtures/expected.json").read_text())
    assert actual == expected

Generating the golden initially

# tools/regenerate_golden.py — run once, then commit
import json
from pathlib import Path
from legacy_pricing import compute_price

cases = json.loads(Path("fixtures/cases.json").read_text())
out = [compute_price(c["input"]) for c in cases]
Path("fixtures/expected.json").write_text(json.dumps(out, indent=2))

Differential testing (old vs refactored)

import { computePriceLegacy } from './pricing-legacy';
import { computePriceNew } from './pricing-new';
import fc from 'fast-check';

test('refactor preserves behavior', () => {
  fc.assert(
    fc.property(
      fc.record({
        qty: fc.integer({ min: 1, max: 100 }),
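        // fc.double accepts arbitrary double bounds; in fast-check v3+,
        // fc.float requires 32-bit float bounds and generates NaN by default.
        unitPrice: fc.double({ min: 0.01, max: 9999, noNaN: true }),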
      }),
      (input) => {
        expect(computePriceNew(input))
          .toBeCloseTo(computePriceLegacy(input), 2);
      },
    ),
    { numRuns: 1000 },
  );
});

Capture I/O of legacy via instrumentation

# Wrap legacy fn, log all inputs+outputs in production for a week,
# then replay as test fixtures
from functools import wraps
import json, time

def record(path):
    def deco(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            with open(path, "a") as f:
                f.write(json.dumps({
                    "ts": time.time(),
                    "args": args, "kwargs": kwargs,
                    "result": result,
                }, default=str) + "\n")
            return result
        return wrapper
    return deco

@record("fixtures/legacy_calls.jsonl")
def legacy_compute(*args, **kwargs): ...  # real signature elided in the original
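The recorder above covers only the capture half. A replay half might look like the following sketch, which assumes the JSONL layout written by `record` (`args`, `kwargs`, `result`); the `replay` and `normalize` helpers are hypothetical names introduced here.

```python
import json
from pathlib import Path
from typing import Any, Callable


def normalize(value: Any) -> Any:
    # Round-trip through JSON exactly as the recorder did (default=str),
    # so tuples become lists and non-JSON types become strings on both sides.
    return json.loads(json.dumps(value, default=str))


def replay(path: Path, fn: Callable[..., Any]) -> int:
    """Re-run every recorded call against `fn`; return the number checked."""
    checked = 0
    for line in path.read_text().splitlines():
        if not line.strip():
            continue
        call = json.loads(line)
        actual = fn(*call["args"], **call["kwargs"])
        assert normalize(actual) == call["result"], f"mismatch for {call['args']}"
        checked += 1
    return checked
```

One caveat: arguments that were not JSON-native were stringified by `default=str` at capture time, so those calls cannot be replayed faithfully without a custom decoder. Recorded production data may also need scrubbing before it is committed as a fixture.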

Decision criteria

| Situation | Approach |
| --- | --- |
| Pure legacy function | Inline assertion (Feathers procedure) |
| Large structured output | Snapshot or ApprovalTests |
| Visual UI | Storybook + Chromatic / Percy |
| Two implementations (refactor) | Differential testing + property-based |
| Production behavior unknown | Record-replay from instrumentation |

Default: snapshot tests for serializable output, ApprovalTests for human-reviewed diffs (PDF, HTML), differential tests during a refactor.

🔗 Graph

🤖 LLM usage

When to use: starting a legacy code refactor, putting a first test net around an untested codebase, framework migration, output-shape stability (PDF/CSV/JSON). When not to use: greenfield TDD (use spec tests), rapidly evolving designs (snapshots churn), security-critical code (needs a real spec).

Anti-patterns

  • Treating a snapshot as the spec: a failing snapshot is auto-updated without review, and bugs merge silently.
  • Huge unreadable snapshot: a 1000-line JSON blob nobody reads; split it into focused snapshots.
  • No fixture review process: golden master changes auto-merge; require a reviewer.
  • Characterization without a refactor goal: the tests pin legacy bugs forever; note known bugs explicitly and fix them later.
  • Time/random in capture: snapshots become nondeterministic; freeze the clock and the seed.
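For the last anti-pattern, one common remedy is to inject the clock and the random source so a test can pin both. The sketch below uses a hypothetical `build_report` function; the fixed instant and seed are arbitrary choices.

```python
import json
import random
from datetime import datetime, timezone
from typing import Callable


def build_report(now: Callable[[], datetime], rng: random.Random) -> str:
    """Hypothetical report builder with its time and randomness injected."""
    return json.dumps(
        {
            "generated_at": now().isoformat(),
            "sample_id": rng.randint(0, 999_999),
        },
        sort_keys=True,
    )


def frozen_now() -> datetime:
    # Fixed instant used only in tests.
    return datetime(2026, 1, 1, tzinfo=timezone.utc)


def test_report_is_deterministic() -> None:
    first = build_report(frozen_now, random.Random(42))
    second = build_report(frozen_now, random.Random(42))
    assert first == second  # identical output, so it is safe to snapshot
```

If the legacy code reads the clock or a global random generator directly, patching those at the test boundary is an alternative, at the cost of coupling the test to implementation details.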

🧪 Verification / duplicates

  • Verified (Feathers "Working Effectively with Legacy Code" 2004 / Jest docs / ApprovalTests / Beck "Test Driven Development").
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Feathers procedure + snapshot/approval/differential patterns |