DORA Metrics

In one line

"The four numbers of software delivery performance": deployment frequency, lead time for changes, change failure rate, and mean time to restore (MTTR). Forsgren, Humble, and Kim's "Accelerate" (2018) and the yearly DORA report are the sources of truth. As of 2026, this note treats a fifth metric (reliability) as official.

Key points

The four (now five) metrics

Deployment Frequency (DF) — production deploys per period. Elite: on-demand (multiple/day).
Lead Time for Changes (LT) — commit → production. Elite: < 1 day.
Change Failure Rate (CFR) — % deploys causing incident/rollback. Elite: 0–5%.
Mean Time to Restore (MTTR) — incident detection → resolution. Elite: < 1 hour.
Reliability (added 2021/2022 reports) — meeting/exceeding SLO targets.
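The definitions above can be turned into arithmetic over raw events. The following is a minimal sketch, assuming hypothetical `Deploy` and `Incident` record shapes (the field names are illustrative and not taken from any particular tool) and a precomputed `failed` flag per deploy. It uses the median for lead time and the mean for restore time.

```typescript
// Hypothetical record shapes; field names are assumptions for illustration.
interface Deploy {
  deployedAt: Date;    // when the change reached production
  firstCommitAt: Date; // earliest commit included in the change
  failed: boolean;     // true if the deploy caused an incident or rollback
}
interface Incident {
  detectedAt: Date;
  resolvedAt: Date;
}
interface FourKeys {
  deploysPerDay: number;
  leadTimeMedianHours: number;
  changeFailureRate: number; // fraction between 0 and 1
  mttrHours: number;
}

const HOUR_MS = 60 * 60 * 1000;

// Median of a list of numbers; returns 0 for an empty list.
function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] ?? 0;
  const lower = sorted[mid - 1] ?? upper;
  return sorted.length % 2 === 1 ? upper : (lower + upper) / 2;
}

// Compute the four metrics for one team over a period of `periodDays` days.
function fourKeys(deploys: Deploy[], incidents: Incident[], periodDays: number): FourKeys {
  const leadTimes = deploys.map(
    d => (d.deployedAt.getTime() - d.firstCommitAt.getTime()) / HOUR_MS);
  const restoreTimes = incidents.map(
    i => (i.resolvedAt.getTime() - i.detectedAt.getTime()) / HOUR_MS);
  const failedCount = deploys.filter(d => d.failed).length;
  const restoreSum = restoreTimes.reduce((a, b) => a + b, 0);
  return {
    deploysPerDay: deploys.length / periodDays,
    leadTimeMedianHours: median(leadTimes),
    changeFailureRate: deploys.length === 0 ? 0 : failedCount / deploys.length,
    mttrHours: restoreTimes.length === 0 ? 0 : restoreSum / restoreTimes.length,
  };
}
```

Returning all four values from one function keeps them together at the call site, which matches the advice elsewhere in this note to never report a single metric in isolation.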

Performance bands (2024 report)

Band	Deployment frequency	Lead time	CFR	MTTR
Elite	Frequent deploys	< 1 day	0–5%	< 1 hour
High	Weekly to monthly	1 day to 1 week	0–10%	< 1 day
Medium	Monthly to every 6 months	1 week to 1 month	0–15%	< 1 week
Low	Less often than every 6 months	> 6 months	> 64%	> 1 week
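The bands above can be applied mechanically. Here is a rough sketch that classifies a team by lead time, CFR, and MTTR and reports the worst of the three bands. The function names are hypothetical, boundaries are treated as exclusive, a month is approximated as 30 days, and any CFR above 15% is treated as Low; all of these are assumptions made for illustration. Deployment frequency is left out because the bands describe it qualitatively.

```typescript
type Band = "Elite" | "High" | "Medium" | "Low";
const ORDER: Band[] = ["Elite", "High", "Medium", "Low"];

// Time thresholds in hours; a month is approximated as 30 days (assumption).
const DAY_H = 24;
const WEEK_H = 7 * 24;
const MONTH_H = 30 * 24;

// Return the first band whose upper bound the value stays under, else "Low".
function bandFor(value: number, upperBounds: [number, number, number]): Band {
  const idx = upperBounds.findIndex(limit => value < limit);
  return idx === -1 ? "Low" : (ORDER[idx] ?? "Low");
}

// Classify a team from lead time (hours), CFR (fraction), and MTTR (hours).
function classify(leadTimeHours: number, cfr: number, mttrHours: number): Band {
  const bands: Band[] = [
    bandFor(leadTimeHours, [DAY_H, WEEK_H, MONTH_H]),
    bandFor(cfr, [0.05, 0.10, 0.15]), // assumption: above 15% counts as Low
    bandFor(mttrHours, [1, DAY_H, WEEK_H]),
  ];
  // The overall band is the worst individual band, so one weak metric cannot hide.
  const worst = Math.max(...bands.map(b => ORDER.indexOf(b)));
  return ORDER[worst] ?? "Low";
}
```

Taking the worst band rather than an average is a deliberate choice in this sketch: a team with a one-hour lead time and a 50% failure rate is reported as Low, not as something in between.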

Accelerators (capabilities)

Trunk-based development.
Continuous integration.
Test automation.
Loosely coupled architecture.
Generative culture (Westrum).
Database change automation.
Empowered teams.

💻 Patterns

Lead time SQL (GitHub + production deploy events)

-- Lead time: first commit in the PR → production deploy
WITH deploys AS (
  SELECT service, deploy_id, deployed_at, sha
  FROM deployments WHERE environment = 'production'
),
commits AS (
  SELECT pr.merge_commit_sha AS sha, MIN(c.committed_at) AS first_commit_at
  FROM pull_requests pr JOIN commits c ON c.pr_id = pr.id
  GROUP BY pr.merge_commit_sha
)
SELECT d.service,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM d.deployed_at - c.first_commit_at)) AS lt_p50_seconds,
       PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM d.deployed_at - c.first_commit_at)) AS lt_p95_seconds
FROM deploys d JOIN commits c USING (sha)
WHERE d.deployed_at > NOW() - INTERVAL '90 days'
GROUP BY d.service;

Emitting a deploy event from GitHub Actions

- name: Record deployment
  if: success()
  # DORA_API and TOKEN must be present in the job environment (for example, from secrets).
  run: |
    curl -fsS -X POST "$DORA_API/deployments" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/json" \
      -d "{\"service\":\"orders\",\"sha\":\"${{ github.sha }}\",\"environment\":\"production\",\"deployed_at\":\"$(date -u +%FT%TZ)\"}"

Change failure (PagerDuty + deploy correlation)

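// `db` and `daysAgo` are helpers assumed to be defined elsewhere.
async function changeFailureRate(service: string, days = 30): Promise<number> {
  const deploys = await db.deployments.count({ service, since: daysAgo(days) });
  if (deploys === 0) return 0; // no deploys in the window: avoid dividing by zero
  const failedDeploys = await db.deployments.count({
    service, since: daysAgo(days),
    correlatedIncidentWithin: '4h', // count deploys followed by an incident within 4h
  });
  return failedDeploys / deploys;
}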
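The 4h correlation above is the part that usually needs custom logic. A minimal in-memory sketch of it follows, assuming hypothetical `DeployEvent` and `IncidentEvent` shapes and attributing an incident to every deploy of the same service that preceded it by at most the window. These are illustrative assumptions, not the behavior of PagerDuty or of any specific database layer.

```typescript
// Hypothetical event shapes; field names are assumptions for illustration.
interface DeployEvent {
  id: string;
  service: string;
  deployedAt: Date;
}
interface IncidentEvent {
  service: string;
  createdAt: Date;
}

const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

// A deploy counts as failed if an incident for the same service was created
// at or after the deploy time and no later than `windowMs` afterwards.
function isFailedDeploy(
  deploy: DeployEvent,
  incidents: IncidentEvent[],
  windowMs: number = FOUR_HOURS_MS,
): boolean {
  const start = deploy.deployedAt.getTime();
  return incidents.some(i => {
    const delta = i.createdAt.getTime() - start;
    return i.service === deploy.service && delta >= 0 && delta <= windowMs;
  });
}

// Fraction of deploys with a correlated incident; 0 when there are no deploys.
function correlatedFailureRate(
  deploys: DeployEvent[],
  incidents: IncidentEvent[],
  windowMs: number = FOUR_HOURS_MS,
): number {
  if (deploys.length === 0) return 0;
  const failed = deploys.filter(d => isFailedDeploy(d, incidents, windowMs)).length;
  return failed / deploys.length;
}
```

With this rule, one incident can mark several deploys as failed when they land close together. Teams that deploy many times per day may prefer to attribute each incident only to the most recent preceding deploy.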

MTTR from PagerDuty

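import { api } from '@pagerduty/pdjs';
// `daysAgo` and `SVC_ID` are assumed to be defined elsewhere.
const pd = api({ token: process.env.PD_TOKEN! });
const { data } = await pd.get('/incidents', {
  data: { since: daysAgo(30), service_ids: [SVC_ID], statuses: ['resolved'] },
});
// Seconds from incident creation to resolution.
const durations: number[] = data.incidents.map((i: { created_at: string; resolved_at: string }) =>
  (new Date(i.resolved_at).getTime() - new Date(i.created_at).getTime()) / 1000);
// Guard against an empty result, which would otherwise produce NaN.
const mttr = durations.length ? durations.reduce((a, b) => a + b, 0) / durations.length : 0;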

Four Keys (Google) on BigQuery

-- Reference: https://github.com/dora-team/fourkeys
SELECT
  COUNTIF(event_type = 'deployment') AS deploys,
  AVG(TIMESTAMP_DIFF(deploy_time, first_commit_time, MINUTE)) AS lt_minutes,
  COUNTIF(failed) / NULLIF(COUNTIF(event_type = 'deployment'), 0) AS cfr,
  AVG(TIMESTAMP_DIFF(resolved_time, incident_time, MINUTE)) AS mttr_minutes
FROM `project.four_keys.events_raw`
WHERE deploy_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY);

Grafana dashboard panel (PromQL-style)

# Deployment frequency (deploys per day per service)
sum by (service) (rate(deployments_total{env="production"}[7d])) * 86400

# Change failure rate (increase() converts the counters into 30-day totals)
sum by (service) (increase(deployments_failed_total[30d]))
/ sum by (service) (increase(deployments_total[30d]))

# Lead time p50
histogram_quantile(0.5, sum by (le, service) (rate(deploy_lead_time_seconds_bucket[30d])))

Reliability (SLO-aligned 5th metric)

// Percentage of measurement windows in which the service met or exceeded its SLO.
// `prom` is an assumed Prometheus client and `svc` holds the service name.
const reliability = await prom.query(`
  (sum_over_time(slo_compliance{service="${svc}"}[30d]) /
   count_over_time(slo_compliance{service="${svc}"}[30d])) * 100
`);
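The same ratio can be computed without Prometheus when per-window counts are already at hand. The sketch below assumes a hypothetical `SloWindow` shape holding good and total request counts for one measurement window, and an SLO target expressed as a fraction; neither comes from the DORA reports.

```typescript
// Hypothetical shape: request counts observed in one measurement window.
interface SloWindow {
  good: number;  // requests that met the SLO criteria
  total: number; // all requests in the window
}

// Percentage of measured windows whose good/total ratio meets the target.
// Windows with no traffic are skipped rather than counted as failures.
function reliabilityPercent(windows: SloWindow[], sloTarget: number): number {
  const measured = windows.filter(w => w.total > 0);
  if (measured.length === 0) return 0;
  const compliant = measured.filter(w => w.good / w.total >= sloTarget).length;
  return (compliant / measured.length) * 100;
}
```

Skipping empty windows is a design choice: counting them as compliant would inflate the score for low-traffic services, and counting them as failures would penalize them.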

Anti-gaming guardrails

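// Any single metric can be gamed in isolation, so always evaluate them together.
// Units: df in deploys/day, lt and mttr in seconds, cfr as a fraction (0..1).
const DAY = 86400, HOUR = 3600;
const elite = df > 1 && lt < DAY && cfr < 0.05 && mttr < HOUR;
// A team is not elite if fast delivery hides a high CFR.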

Decision criteria

Situation	Approach
Greenfield team	Adopt Four Keys (open source) on BigQuery
GitHub-centric	dora-team/fourkeys + Cloud Run pipelines
Multi-tool	LinearB / Sleuth / Faros AI / Jellyfish (SaaS)
Self-host	Apache DevLake (Apache Software Foundation)
Enterprise governance	Faros + custom dashboards

Default: Apache DevLake (open source) or the Four Keys reference implementation; review weekly with the team; show all 4 (5) metrics together, never a single metric.

🔗 Graph

Parent: DevOps · Continuous-Delivery
Variant: Engineering-Metrics
Applications: Trunk-Based-Development · Continuous-Integration · SRE
Adjacent: Postmortem-Culture

🤖 LLM usage

When to use: explaining metric definitions, authoring SQL/PromQL queries, interpreting trends, generating retrospective talking points. When not to use: evaluating individual performance (DORA metrics are team-level, never individual); tuning metrics to look good (gaming).

❌ Anti-patterns

Single metric optimization: pushing deploy frequency alone makes CFR explode. Always read all four together.
Individual performance ranking: explicitly an anti-pattern in DORA research. Team-level only.
Vanity deploys: counting empty commits or config-only changes makes the number meaningless.
MTTR from "ticket close": measure when customer impact ends, not when the ticket is administratively closed.
Comparing teams in different domains: a fintech product and an internal tool have different baselines.
No deployment instrumentation: do not rely on a manual spreadsheet. Emit deploy events automatically.
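The vanity-deploy item above can be enforced when counting deploys. The following sketch is illustrative only: the `DeployRecord` shape and the `CONFIG_ONLY` file patterns are assumptions, and each team would need its own definition of what counts as a config-only change.

```typescript
// Hypothetical shape: one production deploy and the files it changed.
interface DeployRecord {
  sha: string;
  changedFiles: string[];
}

// Assumed patterns for files that do not count as a code change.
const CONFIG_ONLY: RegExp[] = [/\.ya?ml$/, /\.json$/, /\.env$/, /\.md$/];

// A deploy is "vanity" if it changes nothing, or only config-like files.
function isVanityDeploy(deploy: DeployRecord): boolean {
  if (deploy.changedFiles.length === 0) return true; // empty commit
  return deploy.changedFiles.every(file => CONFIG_ONLY.some(p => p.test(file)));
}

// Keep only the deploys that should count toward deployment frequency.
function countableDeploys(deploys: DeployRecord[]): DeployRecord[] {
  return deploys.filter(d => !isVanityDeploy(d));
}
```

Config changes can still cause incidents, so a team might exclude them from deployment frequency while keeping them in the change failure rate denominator.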

🧪 Verification / Duplicates

Verified (Forsgren/Humble/Kim "Accelerate" 2018, Google DORA 2024 State of DevOps report, dora-team/fourkeys, Apache DevLake).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup — DORA 4(5) metrics, queries, anti-patterns
