| id | title | category | status | canonical_id | aliases | duplicate_of | source_trust_level | confidence_score | verification_status | tags | raw_sources | last_reinforced | github_commit | tech_stack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| wiki-2026-0508-dora-metrics | DORA Metrics | 10_Wiki/Topics | verified | self | | none | A | 0.9 | applied | | | 2026-05-10 | pending | |
DORA Metrics
One-line summary
"The four numbers of software delivery performance": deployment frequency, lead time for changes, change failure rate, and mean time to restore (MTTR). The source of truth is Forsgren, Humble, and Kim's "Accelerate" (2018) plus the yearly DORA report. As of 2026, a fifth metric (reliability) is treated as official.
Key points
The 4 (now 5)
- Deployment Frequency (DF) — production deploys per period. Elite: on-demand (multiple/day).
- Lead Time for Changes (LT) — commit → production. Elite: < 1 day.
- Change Failure Rate (CFR) — % deploys causing incident/rollback. Elite: 0–5%.
- Mean Time to Restore (MTTR) — incident detection → resolution. Elite: < 1 hour.
- Reliability (added 2021/2022 reports) — meeting/exceeding SLO targets.
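As a rough illustration of how the definitions above turn into numbers, the sketch below computes the four delivery metrics from in-memory records. The `Deploy` and `Incident` shapes, the `causedIncident` flag, and the use of a median for lead time are assumptions made for this example, not part of any DORA specification.

```typescript
// Hypothetical record shapes; field names are assumptions, not a standard schema.
interface Deploy {
  firstCommitAt: Date;
  deployedAt: Date;
  causedIncident: boolean;
}
interface Incident {
  detectedAt: Date;
  resolvedAt: Date;
}
interface FourKeys {
  deploysPerDay: number;
  medianLeadTimeHours: number;
  changeFailureRate: number;
  meanTimeToRestoreHours: number;
}

const HOUR_MS = 3_600_000;

// Median of a list; NaN when the list is empty.
function median(values: number[]): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function fourKeys(deploys: Deploy[], incidents: Incident[], windowDays: number): FourKeys {
  const leadTimes = deploys.map(
    d => (d.deployedAt.getTime() - d.firstCommitAt.getTime()) / HOUR_MS);
  const restoreTimes = incidents.map(
    i => (i.resolvedAt.getTime() - i.detectedAt.getTime()) / HOUR_MS);
  const failed = deploys.filter(d => d.causedIncident).length;
  return {
    deploysPerDay: deploys.length / windowDays,
    medianLeadTimeHours: median(leadTimes),
    // NaN signals "no data" rather than a misleading 0.
    changeFailureRate: deploys.length === 0 ? NaN : failed / deploys.length,
    meanTimeToRestoreHours: restoreTimes.length === 0
      ? NaN
      : restoreTimes.reduce((a, b) => a + b, 0) / restoreTimes.length,
  };
}
```

Returning all four values from one function makes it harder to report a single number in isolation.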
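Performance bands (2024 report)
Each band lists deployment frequency, lead time, change failure rate, and time to restore, in that order. Thresholds are re-derived from survey clusters each year, so confirm them against the report edition you cite.
- Elite: deploys on demand; LT < 1 day; CFR 0–5%; MTTR < 1 hour.
- High: deploys weekly to monthly; LT 1 day–1 week; CFR 0–10%; MTTR < 1 day.
- Medium: deploys monthly to every 6 months; LT 1 week–1 month; CFR 0–15%; MTTR < 1 week.
- Low: deploys less often than every 6 months; LT > 6 months; CFR > 64%; MTTR > 1 week.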
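The bands can be read as a lookup, sketched below under some assumptions: a team's overall band is taken to be its weakest metric, boundaries are treated as inclusive, and values falling in the gaps the list leaves open (for example a CFR between 15% and 64%) are assigned to Low. The `TeamMetrics` shape and the function names are hypothetical.

```typescript
type Band = "Elite" | "High" | "Medium" | "Low";
const ORDER: Band[] = ["Elite", "High", "Medium", "Low"];

// Returns the first band whose upper limit the value does not exceed.
// limits = [Elite max, High max, Medium max]; anything larger is Low.
function bandFor(value: number, limits: [number, number, number]): Band {
  const idx = limits.findIndex(limit => value <= limit);
  return idx === -1 ? "Low" : ORDER[idx];
}

// Hypothetical input shape; units are chosen so that "smaller is better" everywhere.
interface TeamMetrics {
  daysBetweenDeploys: number; // average gap between production deploys
  leadTimeDays: number;
  changeFailureRate: number;  // fraction, 0..1
  restoreHours: number;
}

function overallBand(m: TeamMetrics): Band {
  const bands = [
    bandFor(m.daysBetweenDeploys, [1, 30, 180]),
    bandFor(m.leadTimeDays, [1, 7, 30]),
    bandFor(m.changeFailureRate, [0.05, 0.10, 0.15]),
    bandFor(m.restoreHours, [1, 24, 168]),
  ];
  // The weakest metric decides the overall band (an assumption of this sketch).
  const worst = Math.max(...bands.map(b => ORDER.indexOf(b)));
  return ORDER[worst];
}
```

Taking the weakest metric is a deliberately conservative choice: a team cannot reach Elite on deploy frequency alone.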
Accelerators (capabilities)
- Trunk-based development.
- Continuous integration.
- Test automation.
- Loosely coupled architecture.
- Generative culture (Westrum).
- Database change automation.
- Empowered teams.
💻 Patterns
Lead time SQL (GitHub + production deploy events)
-- Measures the time from the first commit in a PR to its production deploy.
-- Assumes deployments.sha holds the PR's merge commit SHA.
WITH deploys AS (
  SELECT service, deploy_id, deployed_at, sha
  FROM deployments
  WHERE environment = 'production'
),
commits AS (
  SELECT pr.merge_commit_sha AS sha, MIN(c.committed_at) AS first_commit_at
  FROM pull_requests pr
  JOIN commits c ON c.pr_id = pr.id
  GROUP BY pr.merge_commit_sha
)
SELECT d.service,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM d.deployed_at - c.first_commit_at)) AS lt_p50_seconds,
       PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM d.deployed_at - c.first_commit_at)) AS lt_p95_seconds
FROM deploys d
JOIN commits c USING (sha)
WHERE d.deployed_at > NOW() - INTERVAL '90 days'
GROUP BY d.service;
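To sanity-check the p50 and p95 columns outside the database, a small TypeScript helper can mirror the linear interpolation that `PERCENTILE_CONT` performs. This is a minimal sketch; the function name `percentileCont` is made up for the example.

```typescript
// Continuous percentile with linear interpolation between the two nearest
// sorted values, which is how PERCENTILE_CONT is defined.
function percentileCont(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no values");
  if (p < 0 || p > 1) throw new RangeError("p must be in [0, 1]");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = p * (sorted.length - 1); // zero-based fractional position
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (rank - lo);
}

// Lead times in seconds for four deploys.
const leadTimes = [10, 20, 30, 40];
console.log(percentileCont(leadTimes, 0.5)); // 25
```

Reporting p50 and p95 together, as the query does, keeps a handful of slow changes from hiding behind a healthy median.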
Emitting a deploy event from GitHub Actions
# DORA_API and TOKEN are assumed to be provided through the job's env or secrets.
- name: Record deployment
  if: success()
  run: |
    curl --fail -X POST "$DORA_API/deployments" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/json" \
      -d "{\"service\":\"orders\",\"sha\":\"${{ github.sha }}\",\"environment\":\"production\",\"deployed_at\":\"$(date -u +%FT%TZ)\"}"
Change failure (PagerDuty + deploy correlation)
// db and daysAgo are assumed helpers defined elsewhere.
async function changeFailureRate(service: string, days = 30) {
  const deploys = await db.deployments.count({ service, since: daysAgo(days) });
  if (deploys === 0) return 0; // no deploys in the window: avoid dividing by zero
  const failedDeploys = await db.deployments.count({
    service, since: daysAgo(days),
    correlatedIncidentWithin: '4h', // count a deploy as failed if an incident opens within 4h of it
  });
  return failedDeploys / deploys;
}
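The `correlatedIncidentWithin` filter above hides the actual matching rule. One possible reading, sketched here with in-memory data, attributes each incident to the most recent deploy of the same service that happened no more than four hours earlier. The `DeployEvent` and `IncidentEvent` shapes and the attribution rule itself are assumptions of this example.

```typescript
interface DeployEvent { id: string; service: string; deployedAt: Date; }
interface IncidentEvent { service: string; createdAt: Date; }

const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

// Returns the ids of deploys that an incident was attributed to.
function failedDeployIds(
  deploys: DeployEvent[],
  incidents: IncidentEvent[],
  windowMs: number = FOUR_HOURS_MS,
): Set<string> {
  const failed = new Set<string>();
  for (const inc of incidents) {
    let culprit: DeployEvent | undefined;
    for (const d of deploys) {
      const gap = inc.createdAt.getTime() - d.deployedAt.getTime();
      // Skip other services, deploys after the incident, and deploys outside the window.
      if (d.service !== inc.service || gap < 0 || gap > windowMs) continue;
      // Keep the latest qualifying deploy.
      if (!culprit || d.deployedAt.getTime() > culprit.deployedAt.getTime()) culprit = d;
    }
    if (culprit) failed.add(culprit.id);
  }
  return failed;
}
```

Attributing an incident to only one deploy avoids inflating the failure count when several deploys land inside the same window.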
MTTR from PagerDuty
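import { api } from '@pagerduty/pdjs';

// daysAgo and SVC_ID are assumed to be defined elsewhere.
const pd = api({ token: process.env.PD_TOKEN! });
// The list endpoint is paginated; this reads only the first page.
const { data } = await pd.get('/incidents', {
  data: { since: daysAgo(30), service_ids: [SVC_ID], statuses: ['resolved'] },
});
// Seconds from incident creation to resolution.
const durations: number[] = data.incidents.map((i: { created_at: string; resolved_at: string }) =>
  (new Date(i.resolved_at).getTime() - new Date(i.created_at).getTime()) / 1000);
// An empty window would otherwise produce NaN.
const mttr = durations.length ? durations.reduce((a, b) => a + b, 0) / durations.length : null;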
Four Keys (Google) on BigQuery
-- Reference: https://github.com/dora-team/fourkeys
-- Illustrative flattened schema; the reference implementation derives separate
-- deployments, changes, and incidents tables from events_raw.
SELECT
  COUNTIF(event_type = 'deployment') AS deploys,
  AVG(TIMESTAMP_DIFF(deploy_time, first_commit_time, MINUTE)) AS lt_minutes,
  COUNTIF(failed) / NULLIF(COUNTIF(event_type = 'deployment'), 0) AS cfr,
  AVG(TIMESTAMP_DIFF(resolved_time, incident_time, MINUTE)) AS mttr_minutes
FROM `project.four_keys.events_raw`
WHERE deploy_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY);
Grafana dashboard panel (PromQL-style)
# Deployment frequency (deploys per day per service)
sum by (service) (rate(deployments_total{env="production"}[7d])) * 86400
# Change failure rate (increase() turns each counter into a 30-day delta;
# sum cannot be applied to a raw range vector)
sum by (service) (increase(deployments_failed_total[30d]))
  / sum by (service) (increase(deployments_total[30d]))
# Lead time p50
histogram_quantile(0.5, sum by (le, service) (rate(deploy_lead_time_seconds_bucket[30d])))
Reliability (SLO-aligned 5th metric)
// Percentage of measurement windows in which the service met or exceeded its SLO.
// prom (a Prometheus client) and svc (the service name) are assumed to be defined elsewhere.
const reliability = await prom.query(`
  (sum_over_time(slo_compliance{service="${svc}"}[30d]) /
   count_over_time(slo_compliance{service="${svc}"}[30d])) * 100
`);
Anti-gaming guardrails
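// Any single metric can be gamed in isolation, so always read them together.
// Units assumed: df in deploys per day, lt and mttr in seconds, cfr as a fraction.
const DAY = 86_400, HOUR = 3_600;
const elite = df > 1 && lt < DAY && cfr < 0.05 && mttr < HOUR;
// Elite requires all four at once: a high deploy frequency must not hide a high CFR.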
Decision criteria
| Situation | Approach |
|---|---|
| Greenfield team | Adopt Four Keys (open source) on BigQuery |
| GitHub-centric | dora-team/fourkeys + Cloud Run pipelines |
| Multi-tool | LinearB / Sleuth / Faros AI / Jellyfish (SaaS) |
| Self-host | Apache DevLake (Apache Software Foundation) |
| Enterprise governance | Faros + custom dashboards |
Default: Apache DevLake (open source) or the Four Keys reference implementation; review weekly with the team; show all 4 (5) metrics together, never a single metric.
🔗 Graph
- Parent: DevOps · Continuous-Delivery
- Variants: Engineering-Metrics
- Applications: Trunk-Based-Development · Continuous-Integration · SRE
- Adjacent: Postmortem-Culture
🤖 LLM usage
When to use: explaining metric definitions, authoring SQL/PromQL queries, interpreting trends, and generating talking points for retrospectives. When not to use: individual performance evaluation (DORA metrics are team-level, never individual) or tuning metrics so they look good (gaming).
❌ Anti-patterns
- Single-metric optimization: pushing deploy frequency up on its own makes CFR explode. Always read the four together.
- Individual performance ranking: explicitly an anti-pattern in DORA research. Team level only.
- Vanity deploys: counting empty commits or config-only changes makes the number meaningless.
- MTTR from "ticket close": measure the end of customer impact, not ticket administration.
- Comparing teams in different domains: a fintech service and an internal tool have different baselines.
- No deployment instrumentation: do not rely on manual spreadsheets; emit deploy events automatically.
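The vanity-deploy item can be enforced mechanically before counting. The sketch below drops deploys that changed no files or touched only configuration paths. The `DeployRecord` shape and the `CONFIG_ONLY` patterns are assumptions for illustration; what counts as "config-only" has to be decided per repository.

```typescript
interface DeployRecord { sha: string; changedFiles: string[]; }

// Assumed convention for config-only paths; adjust per repository.
const CONFIG_ONLY: RegExp[] = [/\.ya?ml$/, /\.env(\..+)?$/, /^config\//];

// A deploy is substantive if at least one changed file is not a config path.
// Empty commits change no files, so they are never substantive.
function isSubstantive(d: DeployRecord): boolean {
  return d.changedFiles.some(file => !CONFIG_ONLY.some(re => re.test(file)));
}

// Deploys that should count toward deployment frequency.
function countableDeploys(deploys: DeployRecord[]): DeployRecord[] {
  return deploys.filter(isSubstantive);
}
```

Filtering at counting time, rather than at emit time, keeps the raw deploy events intact for change failure correlation.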
🧪 Verification / duplicates
- Verified (Forsgren/Humble/Kim "Accelerate" 2018, Google DORA 2024 State of DevOps report, dora-team/fourkeys, Apache DevLake).
- Trust level: A.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: DORA 4(5) metrics, queries, anti-patterns |