2nd/10_Wiki/Topics/Coding/Quality_Engineering_Excellence.md
2026-05-10 22:08:15 +09:00


id: quality-engineering-excellence
title: Engineering Excellence - DORA / SPACE / DX metric
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: quality, productivity, vibe-coding
tech_stack: language: process; applicable_to: Engineering
applied_in:
aliases: DORA, DORA metrics, SPACE, DX, developer experience, deployment frequency, lead time, MTTR

Engineering Excellence (DORA / SPACE / DX)

"What do good teams measure?" The four DORA metrics are the baseline. SPACE is the holistic view. DX is the modern, friction-focused lens. Vanity metrics (LOC, commit count) are out.

📖 Key concepts

  • DORA: Delivery + reliability.
  • SPACE: 5 dimensions (satisfaction and well-being, performance, activity, communication and collaboration, efficiency and flow).
  • DX: developer experience.

💻 Code patterns

DORA 4 metrics

1. Deployment Frequency: daily vs monthly.
2. Lead Time for Changes: commit → prod.
3. Change Failure Rate: % of deploys that cause an incident.
4. MTTR (Mean Time To Recovery): incident → service restored.

Elite vs Low

Elite:
- Daily+ deploy.
- < 1 hour lead time.
- < 5% failure.
- < 1 hour MTTR.

Low:
- Monthly deploy.
- 1+ month lead time.
- 16-30% failure.
- 1+ week MTTR.

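→ Elite teams ship at least 2x more with at least 2x fewer incidents; speed and stability improve together.
→ Exact tier thresholds shift between the yearly State of DevOps reports.

→ "Accelerate" book (Forsgren, Humble, Kim).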
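
As a rough illustration, the elite and low thresholds listed above can be turned into a small classifier. This is a minimal sketch: `TeamMetrics` and `dora_tier` are hypothetical names, and since this note gives no thresholds for the tiers between elite and low, anything else is reported as "in between".

```python
from dataclasses import dataclass


@dataclass
class TeamMetrics:
    deploys_per_day: float
    lead_time_hours: float
    change_failure_rate: float  # fraction of deploys, 0.0 to 1.0
    mttr_hours: float


def dora_tier(m: TeamMetrics) -> str:
    """Classify a team using only the elite/low thresholds from this note."""
    is_elite = (
        m.deploys_per_day >= 1
        and m.lead_time_hours < 1
        and m.change_failure_rate < 0.05
        and m.mttr_hours < 1
    )
    is_low = (
        m.deploys_per_day <= 1 / 30        # monthly or rarer
        or m.lead_time_hours >= 30 * 24    # 1+ month
        or m.change_failure_rate >= 0.16   # the 16-30% band and above
        or m.mttr_hours >= 7 * 24          # 1+ week
    )
    if is_elite:
        return "elite"
    if is_low:
        return "low"
    return "in between"  # middle tiers are not specified in this note


print(dora_tier(TeamMetrics(3.0, 0.5, 0.02, 0.4)))  # prints "elite"
```

A tier label is only a summary; the trend of each underlying number is usually more useful than the label itself.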

Measure (automated)

# GitHub
- PR open: timestamp.
- Deploy success: timestamp.
- Lead time = deploy - first commit.

# DataDog / Honeycomb / Sleuth
- Auto-collect.
- Per-team / per-service.
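
If no managed tool is in place, the four metrics can be derived from plain deploy records. The sketch below assumes a hypothetical `Deploy` record whose timestamps you have already exported from your Git host and incident tracker; it also assumes an incident starts at the deploy that caused it, which is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    first_commit_at: datetime               # earliest commit in the change
    deployed_at: datetime                   # successful production deploy
    caused_incident: bool = False
    restored_at: Optional[datetime] = None  # service restored, if it failed


def _hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deploys: list[Deploy], window_days: int) -> dict:
    """Compute the four DORA metrics over a fixed time window."""
    failed = [x for x in deploys if x.caused_incident]
    lead = [_hours(x.deployed_at - x.first_commit_at) for x in deploys]
    # Assumption: the incident clock starts at the failing deploy.
    restore = [_hours(x.restored_at - x.deployed_at) for x in failed if x.restored_at]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "lead_time_p50_hours": median(lead) if lead else None,
        "change_failure_rate": len(failed) / len(deploys) if deploys else None,
        "mttr_hours": sum(restore) / len(restore) if restore else None,
    }


HOUR, DAY = timedelta(hours=1), timedelta(days=1)
t0 = datetime(2026, 5, 1, 9, 0)
deploys = [
    Deploy(t0, t0 + 2 * HOUR),
    Deploy(t0 + DAY, t0 + DAY + 4 * HOUR, caused_incident=True,
           restored_at=t0 + DAY + 4 * HOUR + timedelta(minutes=30)),
    Deploy(t0 + 2 * DAY, t0 + 2 * DAY + 6 * HOUR),
    Deploy(t0 + 3 * DAY, t0 + 3 * DAY + 1 * HOUR),
]
metrics = dora_metrics(deploys, window_days=7)
print(metrics)
```

The median (p50) is used for lead time rather than the mean so that one long-running change does not dominate the number.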

SPACE framework

S - Satisfaction
P - Performance (output / outcome)
A - Activity (volume — but careful)
C - Communication / collaboration
E - Efficiency / flow

→ Multi-dimensional. Activity alone = risky (it gets gamed).

The activity trap

"Every dev makes X commits per week"
→ Goodhart's law: when a measure becomes a target, it stops being a good measure.
→ Padding commits, busy work.

→ What you measure is activity, but the goal is outcome.

DX (Developer Experience)

Build time, deploy time, dev env setup, hot reload, test time, dashboard ...

→ Measure "friction".
Good DX = velocity.

→ "DX" by Abi Noda (research).

Build time

< 1 min: green.
1-5 min: yellow.
> 5 min: red.

→ Cache + parallel + smaller bundle.

Deploy time

< 5 min: elite.
5-30 min: high.
> 30 min: low.

→ Depends on the CI / CD pipeline.

Local dev setup

Onboarding time:
- < 1 hour: elite.
- < 1 day: good.
- > 1 week: not a real setup.

→ Docker compose, devcontainer, automated.

Test time

Unit: < 30 sec.
Integration: < 5 min.
E2E: < 15 min.

→ Slow tests get skipped often.

Survey

"Is this codebase easy to work in?"
"Are PRs reviewed quickly?"
"Is the tooling good?"

→ Quarterly survey + score.
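
One way to turn the quarterly survey into a score is sketched below. It assumes answers on a 1-5 agreement scale and treats 4 or 5 as "favorable"; both the scale and the `survey_scores` helper are assumptions for illustration, not something this note prescribes.

```python
QUESTIONS = [
    "Is this codebase easy to work in?",
    "Are PRs reviewed quickly?",
    "Is the tooling good?",
]


def survey_scores(responses: list[list[int]]) -> dict:
    """responses[i][j] is respondent i's 1-5 answer to QUESTIONS[j]."""
    per_question = {}
    for j, question in enumerate(QUESTIONS):
        answers = [r[j] for r in responses]
        per_question[question] = {
            "mean": sum(answers) / len(answers),
            # Share of respondents answering 4 or 5.
            "favorable": sum(a >= 4 for a in answers) / len(answers),
        }
    means = [s["mean"] for s in per_question.values()]
    return {"per_question": per_question, "overall": sum(means) / len(means)}


scores = survey_scores([[4, 2, 5], [5, 3, 4], [3, 1, 5]])
for question, s in scores["per_question"].items():
    print(f"{question}  mean={s['mean']:.2f}  favorable={s['favorable']:.0%}")
print(f"overall={scores['overall']:.2f}")
```

Reporting per-question scores next to the overall number keeps a single weak area (here, review speed) from being averaged away.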

Lighthouse / Web Vitals (UX)

LCP: < 2.5s.
INP: < 200ms (replaced FID, < 100ms, as a Core Web Vital in March 2024).
CLS: < 0.1.

→ User experience metric.

SLO (Service Level Objective)

99.9% uptime = 43.2 min / month down (30-day month).
99.95% = 21.6 min.
99.99% = 4.32 min.
99.999% (five 9s) = 26 sec, very hard.

→ Match SLA / cost.
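
The downtime budget is simple arithmetic: allowed downtime = period length × (1 − SLO). A small sketch, assuming a 30-day month (the function name is illustrative):

```python
def downtime_budget_minutes(slo_percent: float, days: int = 30) -> float:
    """Allowed downtime in minutes for a given SLO over `days` days."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (100 - slo_percent) / 100


for slo in (99.9, 99.95, 99.99, 99.999):
    minutes = downtime_budget_minutes(slo)
    print(f"{slo}% -> {minutes:.2f} min ({minutes * 60:.0f} sec)")
```

For a 30-day month this gives about 43.2 min at 99.9%, 21.6 min at 99.95%, 4.32 min at 99.99%, and roughly 26 sec at 99.999%.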

Incident (MTTR breakdown)

Detection time: alert → human aware.
Acknowledgement: human aware → action.
Resolution: action → fix.
Recovery: fix → user happy.

→ Measure and improve each phase.
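
To see which phase dominates, the four phases above can be computed from five timestamps per incident. This is a sketch only: the key names (`alert_at`, `aware_at`, and so on) are hypothetical and would need to be mapped to whatever your paging or incident tool records.

```python
from datetime import datetime

# (phase name, start key, end key), following the phase boundaries above.
PHASES = [
    ("detection", "alert_at", "aware_at"),
    ("acknowledgement", "aware_at", "action_at"),
    ("resolution", "action_at", "fixed_at"),
    ("recovery", "fixed_at", "recovered_at"),
]


def phase_minutes(incident: dict[str, datetime]) -> dict[str, float]:
    """Return the duration of each phase in minutes, plus the total."""
    out = {
        name: (incident[end] - incident[start]).total_seconds() / 60
        for name, start, end in PHASES
    }
    out["total"] = sum(out.values())
    return out


incident = {
    "alert_at": datetime(2026, 5, 9, 10, 0),
    "aware_at": datetime(2026, 5, 9, 10, 5),
    "action_at": datetime(2026, 5, 9, 10, 15),
    "fixed_at": datetime(2026, 5, 9, 11, 0),
    "recovered_at": datetime(2026, 5, 9, 11, 10),
}
breakdown = phase_minutes(incident)
print(breakdown)
```

In this example resolution takes 45 of the 70 minutes, so faster diagnosis or rollback would help more than faster paging.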

Postmortem

For every P0/P1 incident:
- Timeline.
- Root cause (5 whys).
- Action items (concrete).
- Public (blameless).

→ [[Productivity_Postmortem]].

Code review metric

- Time to first review (target < 4 hour).
- Approval count (1-2).
- LOC per PR (< 400).
- Iterations per PR (< 3).

→ Big PR = slow review = lead time ↑.

→ A "small PR culture" is a big lever.
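
The targets above can be checked per PR. A minimal sketch, assuming a hypothetical `PullRequest` record already populated from your Git host's API; the flags are prompts for a conversation, not a gate.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    hours_to_first_review: float
    approvals: int
    lines_changed: int
    iterations: int  # review round-trips before merge


def review_flags(pr: PullRequest) -> list[str]:
    """List which of the note's review targets this PR misses."""
    flags = []
    if pr.hours_to_first_review >= 4:
        flags.append("first review took 4+ hours")
    if not 1 <= pr.approvals <= 2:
        flags.append("approval count outside 1-2")
    if pr.lines_changed >= 400:
        flags.append("400+ lines changed")
    if pr.iterations >= 3:
        flags.append("3+ review iterations")
    return flags


print(review_flags(PullRequest(1.5, 1, 120, 1)))   # prints []
print(review_flags(PullRequest(9.0, 1, 950, 4)))
```

Aggregate these at team level; per-person rankings invite exactly the gaming this note warns about.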

Test coverage

60-80% = baseline.
100% = brittle.
0% = scary.

→ Coverage is only a proxy for quality.
Mutation testing is the real signal.
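
A hand-rolled illustration of why coverage is only a proxy: the weak suite below executes every line of `is_adult` (100% line coverage) yet cannot tell the original from a mutant where `>=` became `>`. All names here are made up for the example; real mutation tools such as mutmut or Stryker generate and run mutants automatically.

```python
def is_adult(age: int) -> bool:
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    # Mutant: ">=" replaced with ">".
    return age > 18


def weak_suite(fn) -> bool:
    """Covers every line of fn but never tests the boundary value."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    """Same checks plus the boundary at exactly 18."""
    return weak_suite(fn) and fn(18) is True


def mutant_killed(suite, mutant) -> bool:
    """A mutant is 'killed' when the suite fails against it."""
    return not suite(mutant)


print("weak suite kills mutant:", mutant_killed(weak_suite, is_adult_mutant))
print("strong suite kills mutant:", mutant_killed(strong_suite, is_adult_mutant))
```

The share of mutants a suite kills (the mutation score) says more about test quality than the percentage of lines it touches.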

Measuring tech debt

- TODO comment count.
- Deprecated API usage (Snyk, Sourcegraph).
- Large files (> 500 LOC).
- Cyclomatic complexity > 10.

→ Track the "debt ratio" as a trend.
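
Three of these signals can be approximated for Python source with the standard library alone. This is a sketch under assumptions: complexity is estimated as 1 plus the number of branching nodes, which is cruder than what Sonar or CodeClimate compute, and `debt_signals` is a hypothetical helper.

```python
import ast

# Node types counted as a decision point (an approximation).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def complexity(func: ast.AST) -> int:
    """Approximate cyclomatic complexity: 1 + number of branching nodes."""
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func))


def debt_signals(source: str, max_lines: int = 500, max_complexity: int = 10) -> dict:
    """Collect simple debt signals from one Python source string."""
    tree = ast.parse(source)
    funcs = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    return {
        "todo_count": source.count("TODO"),
        "too_long": len(source.splitlines()) > max_lines,
        "complex_functions": [f.name for f in funcs if complexity(f) > max_complexity],
    }


SAMPLE = '''
def f(x):
    # TODO: handle negatives
    if x > 0:
        return 1
    return 0
'''
print(debt_signals(SAMPLE))
```

Run it over every file on a schedule and plot the totals; the absolute numbers matter less than whether they rise or fall.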

CodeClimate / Sonar

Auto metric:
- Maintainability.
- Test coverage.
- Duplication.
- Complexity.

→ Use as a quality gate on PRs.

Engineering productivity ≠ output

"Velocity = more code written" ≠ "higher productivity".
Productivity = customer outcome.

→ Every commit should carry value.
"What does this PR fix for the user?"

Goodhart's law

"Every dev makes X commits per week"
→ Padding.

"Coverage 100%"
→ Fake tests.

"N PRs per week"
→ Big PRs get split artificially.

→ Once a metric becomes a target, it gets gamed.

→ Mix of metrics + judgment.

Real-world

  • Google: monorepo + DORA + SPACE.
  • Spotify: tribe model + autonomy metric.
  • Microsoft: origin of SPACE.
  • Pinterest: DORA + DX.
  • Intuit: developer survey.

Dashboard

Per team:
- Deploy / day.
- Lead time p50.
- Incident MTTR.
- PR cycle time.
- DX score.

→ Trend track.

→ Sleuth, LinearB, Code Climate, Faros.

LinearB / Sleuth (managed)

GitHub / GitLab → automatic metrics.
- Cycle time visualization.
- Bottleneck identification.
- Per-team comparison.

Engineering ladder

Promotion criteria:
- Junior: complete task.
- Senior: own feature.
- Staff: own system.
- Principal: cross-team impact.

→ Metric + qualitative.

Avoid

- "Most active" award (commit count).
- LOC / day target.
- Coverage 100% gate.
- Strict deploy frequency.
- Comparing individuals (team-level is OK).

Best practice

1. DORA baseline.
2. SPACE survey (quarterly).
3. DX friction (build, deploy, test).
4. Incident postmortem.
5. Mix: metric + survey + judgment.

🤔 Decision criteria

| What to measure | Tool |
| --- | --- |
| DORA | Sleuth / LinearB / GitHub Actions |
| SPACE | Survey + dashboard |
| DX | Build / test time + survey |
| Code quality | Sonar / CodeClimate |
| Test coverage | Codecov |
| Performance | Lighthouse CI / Web Vitals |
| Incident | PagerDuty / Linear |

Anti-patterns

  • Activity only: padding / busy work.
  • Measuring LOC: misaligned incentives.
  • Individual leaderboard: morale ↓.
  • 100% coverage gate: brittle tests.
  • Strict daily deploy: fake deploys.
  • Ignoring Goodhart's law: the metrics become fake.
  • No survey: flying blind.

🤖 Hints for LLM use

  • DORA is the baseline (4 metrics).
  • SPACE is holistic (5 dimensions).
  • DX is modern (friction).
  • Always keep Goodhart's law in mind.

🔗 Related documents