2nd/10_Wiki/Topics/Architecture/Logging_Diagnostics.md
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
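Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirects)
- Link-name normalization: fixed broken links, pointed links directly at redirect targets, mapped concepts (~2,400 changes)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections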

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00

id: wiki-2026-0508-logging-diagnostics
title: Logging Diagnostics
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Logging, Application Logging, Diagnostic Logging
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: observability, logging, diagnostics, sre
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: typescript; framework: pino

Logging Diagnostics

In one line

"Structured events are the X-ray of the production runtime." Application logging is the forensic record of every incident and the narrative of system behavior. 2026 best practice: structured JSON logs + OpenTelemetry semantic conventions + sampling at scale + cardinality discipline. Plain-text logs are treated as deprecated for new services; loggers should be correlated with trace context.

Core concepts

The three pillars (observability)

  • Logs: discrete events, high cardinality, narrative.
  • Metrics: aggregated time-series, low cardinality, dashboards/alerts.
  • Traces: causal chain across services.
  • Modern unified backbone: OpenTelemetry → Loki/Tempo/Prometheus, or Datadog/Honeycomb.

Log levels

  • TRACE: fine-grained internal state (off in prod).
  • DEBUG: development diagnostics (sampled or off in prod).
  • INFO: business events and lifecycle (default level).
  • WARN: degraded but recoverable.
  • ERROR: actionable failure, a candidate for paging.
  • FATAL: process-terminating.
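The way a configured level filters events can be sketched as a numeric threshold. This is a minimal illustration, assuming the numeric severities that pino uses by default (10 for trace through 60 for fatal); `isEnabled` is a hypothetical helper, not a library function.

```typescript
// Numeric severities, ordered so that a higher number means more severe.
// The values mirror pino's defaults; other libraries use different numbers.
const LEVELS = {
  trace: 10,
  debug: 20,
  info: 30,
  warn: 40,
  error: 50,
  fatal: 60,
} as const;

type Level = keyof typeof LEVELS;

// Hypothetical helper: an event is emitted only when its severity is at
// or above the configured threshold.
function isEnabled(event: Level, threshold: Level): boolean {
  return LEVELS[event] >= LEVELS[threshold];
}

// With a production threshold of "info", DEBUG is dropped and ERROR passes.
console.log(isEnabled("debug", "info")); // false
console.log(isEnabled("error", "info")); // true
```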

Structured logging principles

  1. JSON output (or logfmt).
  2. Fixed schema: timestamp, level, service, trace_id, span_id, message, ...attrs.
  3. Correlation IDs propagated (W3C Trace Context).
  4. PII redaction at source.
  5. Sampling for high-volume paths.
  6. Cardinality discipline: no unbounded values in indexed fields.
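Principles 2 and 6 can be sketched together: a fixed top-level schema, plus an allow-list that keeps unbounded values out of indexed fields. The `INDEXED_KEYS` set, the attribute names, and both helper functions are assumptions made for illustration; a real backend defines its own indexing rules.

```typescript
// Fixed top-level schema from principle 2; everything else goes in attrs.
interface LogRecord {
  timestamp: string;
  level: "trace" | "debug" | "info" | "warn" | "error" | "fatal";
  service: string;
  trace_id?: string;
  span_id?: string;
  message: string;
  attrs: Record<string, unknown>;
}

// Hypothetical allow-list of attribute keys that the backend may index.
// Unbounded values such as emails or raw URLs stay out of this set.
const INDEXED_KEYS = new Set(["http_method", "status_code", "region"]);

// Split attributes into indexed (bounded) and unindexed (free-form) groups.
function splitAttrs(attrs: Record<string, unknown>) {
  const indexed: Record<string, unknown> = {};
  const unindexed: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(attrs)) {
    (INDEXED_KEYS.has(key) ? indexed : unindexed)[key] = value;
  }
  return { indexed, unindexed };
}

// Serialize one record as a single JSON line.
function toJsonLine(record: LogRecord): string {
  return JSON.stringify(record);
}

const example: LogRecord = {
  timestamp: new Date().toISOString(),
  level: "info",
  service: "checkout",
  message: "order placed",
  attrs: { status_code: 200, user_email: "a@example.com" },
};
console.log(toJsonLine(example));
console.log(splitAttrs(example.attrs));
```

Keeping the split explicit makes it easy to review which fields a new log statement would add to the index.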

Applications

  1. Incident investigation (search by trace_id).
  2. Audit trail (compliance — separate stream).
  3. Business event analytics (BI pipeline ingestion).
  4. SLO error budget calculation.
  5. Anomaly detection input.
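Application 4 comes down to simple arithmetic once request outcomes have been counted from logs. The sketch below assumes a request-based SLO over a single window; the function name and the figures are illustrative only.

```typescript
// Error budget for a request-based SLO over one window.
// slo is a fraction, e.g. 0.999 for "99.9% of requests succeed".
// Note: slo = 1 leaves no budget, so `consumed` is not meaningful there.
function errorBudget(slo: number, totalRequests: number, failedRequests: number) {
  const allowedFailures = (1 - slo) * totalRequests;
  const consumed = failedRequests / allowedFailures;
  return { allowedFailures, consumed, remaining: allowedFailures - failedRequests };
}

// 1,000,000 requests at a 99.9% SLO allow roughly 1,000 failures;
// 250 failures counted so far consume roughly a quarter of the budget.
const budget = errorBudget(0.999, 1000000, 250);
console.log(budget);
```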

💻 Patterns

1. Pino (Node.js, fast structured logging)

import pino from "pino";

export const logger = pino({
  level: process.env.LOG_LEVEL ?? "info",
  redact: ["password", "*.authorization", "creditCard"],
  formatters: {
    level: (label) => ({ level: label }),
  },
  base: {
    service: "checkout",
    env: process.env.NODE_ENV,
    version: process.env.GIT_SHA,
  },
});

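// userId, orderId, and amount come from the surrounding request handler.
logger.info({ userId, orderId, amount }, "order placed");
// pino's default timestamp is epoch milliseconds:
// {"level":"info","time":1710000000000,"service":"checkout","userId":"u1",...}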

2. OpenTelemetry log correlation

import { trace } from "@opentelemetry/api";
import pino from "pino";

const baseLogger = pino();

export function getLogger() {
  const span = trace.getActiveSpan();
  const ctx = span?.spanContext();
  return baseLogger.child({
    trace_id: ctx?.traceId,
    span_id: ctx?.spanId,
  });
}

// Usage
getLogger().error({ err }, "payment failed");
// Now searchable by trace_id across services.
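
Cross-service correlation depends on the trace ID travelling between services, which W3C Trace Context does through the `traceparent` HTTP header. The OpenTelemetry SDK normally handles this; the sketch below only illustrates the version-00 header layout, and `parseTraceparent` is a hypothetical helper rather than an OpenTelemetry API.

```typescript
interface TraceContext {
  traceId: string;
  spanId: string;
  sampled: boolean;
}

// Version 00 layout: version "-" trace-id "-" parent-id "-" trace-flags,
// all lowercase hex (32, 16, and 2 characters respectively).
const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT.exec(header.trim());
  if (!match) return null;
  const [, traceId, spanId, flags] = match;
  // All-zero IDs are invalid per the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  // Bit 0 of the flags byte is the "sampled" flag.
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

const ctx = parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
console.log(ctx?.traceId); // 4bf92f3577b34da6a3ce929d0e0e4736
```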

3. Python structlog

import structlog

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
)

log = structlog.get_logger()

structlog.contextvars.bind_contextvars(request_id=req_id, user_id=uid)
log.info("checkout_started", cart_size=len(cart))

4. Sampling (head + tail)

import type { IncomingMessage } from "node:http";

// Head sampling: decide once, at request entry
function shouldLog(req: IncomingMessage): boolean {
  if (req.url?.startsWith("/health")) return false;      // drop healthchecks
  if (req.headers["x-debug"]) return true;               // force on
  return Math.random() < 0.01;                           // 1% baseline for everything else
}

// Tail sampling (in OTel collector): keep all errors + slow + 1% baseline
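
The tail-sampling rule is normally expressed as collector configuration rather than application code. The decision logic it describes can still be sketched in a few lines; the `TraceSummary` shape and the thresholds below are assumptions made for illustration.

```typescript
// Summary of a finished trace, as a collector might see it (hypothetical shape).
interface TraceSummary {
  hasError: boolean;
  durationMs: number;
}

// Assumed thresholds, for illustration only.
const SLOW_MS = 1000;
const BASELINE_RATE = 0.01;

// Decide after the trace has completed. `rand` is injectable so the
// decision can be tested deterministically.
function keepTrace(t: TraceSummary, rand: () => number = Math.random): boolean {
  if (t.hasError) return true;               // keep every error
  if (t.durationMs >= SLOW_MS) return true;  // keep every slow trace
  return rand() < BASELINE_RATE;             // keep 1% of the rest
}

console.log(keepTrace({ hasError: true, durationMs: 12 })); // true
```

Unlike head sampling, this decision sees the outcome of the request, which is why errors and slow traces can be kept unconditionally.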

5. Error logging with stack + cause

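try {
  await chargeCard(order);
} catch (err) {
  // Pass the Error itself: pino's default `err` serializer records type,
  // message, and stack (recent versions also fold the `cause` chain into
  // message and stack). Copying `cause` into a plain object by hand would
  // serialize a nested Error as {} because its fields are not enumerable.
  logger.error({
    err,
    orderId: order.id,
    customerId: order.customerId,
  }, "charge failed");
  throw err;
}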

6. Redaction (PII safety)

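const SENSITIVE = /\b\d{16}\b|\b\d{3}-\d{2}-\d{4}\b/g;  // 16-digit card numbers, US SSNs

// Redact inside string values only; numeric values pass through unchanged.
// Running the regex over JSON.stringify output breaks JSON.parse when a
// bare 16-digit number matches.
function sanitize(value: unknown): unknown {
  if (typeof value === "string") return value.replace(SENSITIVE, "[REDACTED]");
  if (Array.isArray(value)) return value.map(sanitize);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(Object.entries(value).map(([k, v]) => [k, sanitize(v)]));
  }
  return value;
}

logger.info(sanitize(payload), "received webhook");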

7. Audit log (separate stream)

const audit = pino({
  level: "info",
  base: { stream: "audit" },
  // separate transport → tamper-evident store (S3 + object lock)
});

audit.info({
  actor: user.id,
  action: "user.delete",
  target: targetId,
  ip: req.ip,
  outcome: "success",
}, "audit");

8. Go slog (stdlib, 1.21+)

import (
    "log/slog"
    "os"
)

logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
    Level: slog.LevelInfo,
}))
slog.SetDefault(logger)

slog.Info("order placed",
    "user_id", userID,
    "order_id", orderID,
    "amount", amount,
)

Decision criteria

| Situation | Approach |
| --- | --- |
| New service | Structured JSON + OTel correlation. |
| High-volume path (>1k rps) | Head sampling + tail sampling. |
| Compliance / audit | Separate audit stream + immutable store. |
| Legacy plain-text logs | Parse → enrich → forward (Vector, Fluent Bit). |
| Edge / IoT | Compact binary (CBOR) + batched upload. |
| Real-time alerting on log content | Stream → Loki/ELK with regex rules. |

Default: a new service in 2026 uses OTel logs + structured JSON + Pino/structlog/slog. No plain text.
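
The "parse → enrich → forward" route for legacy plain-text logs is usually configured in a shipper such as Vector or Fluent Bit. As a rough illustration of the parse and enrich steps only, the sketch below assumes a legacy line of the form `<timestamp> <LEVEL> <message with key=value pairs>`; that format and `parseLegacyLine` are assumptions, not taken from any particular system.

```typescript
// Assumed legacy format: "2026-05-10T12:00:00Z ERROR payment failed order=42"
const LEGACY_LINE = /^(\S+)\s+(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\s+(.*)$/;

interface ParsedLine {
  timestamp: string;
  level: string;
  service: string;
  message: string;
  attrs: Record<string, string>;
}

function parseLegacyLine(line: string, service: string): ParsedLine | null {
  const match = LEGACY_LINE.exec(line);
  if (!match) return null; // leave unparseable lines for a fallback route
  const [, timestamp, level, rest] = match;
  const attrs: Record<string, string> = {};
  // Parse: pull key=value pairs out of the message into attrs.
  const message = rest
    .replace(/\b(\w+)=(\S+)/g, (_m, key: string, value: string) => {
      attrs[key] = value;
      return "";
    })
    .replace(/\s+/g, " ")
    .trim();
  // Enrich: attach the service name that the plain-text line lacked.
  return { timestamp, level: level.toLowerCase(), service, message, attrs };
}

console.log(parseLegacyLine("2026-05-10T12:00:00Z ERROR payment failed order=42", "checkout"));
```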

🔗 Graph

🤖 LLM usage

When to use: log schema design, sampling strategy, log-to-trace correlation, redaction policy review. When not to use: cases that need only metrics/traces (the cost of logs exceeds their value), or a single-developer side project (basic console.log is enough).

Anti-patterns

  • String interpolation logs: log.info("user " + id + " did " + action) is unparseable and unsearchable. Use structured fields instead.
  • PII leak: missing redaction puts personal data in logs and risks a GDPR breach.
  • Unbounded cardinality: using user_email as an indexed field leads to a storage explosion.
  • Logging in a tight loop: logging on every iteration of a hot path creates an I/O bottleneck.
  • Catch and silent log: } catch (e) { logger.error(e); } carries no context; add fields such as orderId/userId.
  • Plain text in 2026: grep-only logs make search and correlation painful.
  • No trace correlation: isolated logs per service force a manual reconstruction of the cross-service narrative during an incident.
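
Two of the anti-patterns above (string interpolation and logging in a tight loop) share one fix: count inside the loop and emit a single structured event afterwards. A minimal sketch, where `Emit` stands in for a structured logger call and `processBatch` is a hypothetical function:

```typescript
// Hypothetical sink; a real service would call its structured logger here.
type Emit = (fields: Record<string, unknown>, message: string) => void;

// Process items without logging per iteration; emit one summary event
// with structured fields instead of an interpolated string.
function processBatch(items: number[], emit: Emit): void {
  let ok = 0;
  let failed = 0;
  for (const item of items) {
    // Stand-in for real work: negative values count as failures.
    if (item >= 0) ok++;
    else failed++;
  }
  emit({ total: items.length, ok, failed }, "batch processed");
}

processBatch([1, 2, -3], (fields, message) =>
  console.log(JSON.stringify({ ...fields, message })),
);
```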

🧪 Verification / duplicates

  • Verified (OpenTelemetry Logs spec, Google SRE Book Ch.6, Pino/structlog official docs).
  • Confidence: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: structured logging + OTel correlation + sampling/redaction patterns |