2nd/10_Wiki/Topics/Coding/Mobile_Crash_Free_SLO.md
2026-05-09 21:08:02 +09:00

id: mobile-crash-free-slo
title: Crash-free SLO (maintaining 99.5%)
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: mobile, crash, slo, reliability, vibe-coding
tech_stack:
  language: Swift / Kotlin / RN
  applicable_to: iOS, Android
applied_in:
aliases: crash-free, crashlytics SLO, ANR rate, mean time to detect, MTTD

Crash-free SLO

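A crash-free users rate of 99.5% or higher is a widely used industry benchmark. Holding it takes measurement, alerting, and a fast hotfix channel. Tooling and release mechanics differ between iOS and Android. Crashlytics velocity alerts automate the detection of sudden spikes.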

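📖 Key concepts

  • Crash-free users: the share of users who saw no crash within the measurement window (for example, 24h).
  • Crash-free sessions: the share of sessions that ended without a crash.
  • Velocity: a rapid rise in a crash's impact over a short time; it should trigger an immediate alert.
  • ANR rate (Android): tracked as a separate SLO.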
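
The two ratios above can be derived from raw session records. The following is a minimal sketch, not a Crashlytics or Sentry API: the `SessionRecord` shape and both function names are assumptions made for illustration.

```typescript
// Hypothetical session record; real SDKs and exports use different shapes.
interface SessionRecord {
  userId: string;
  crashed: boolean;
}

// Share of sessions that ended without a crash (0..1).
function crashFreeSessions(sessions: SessionRecord[]): number {
  if (sessions.length === 0) return 1;
  const clean = sessions.filter((s) => !s.crashed).length;
  return clean / sessions.length;
}

// Share of distinct users with no crashed session in the window (0..1).
function crashFreeUsers(sessions: SessionRecord[]): number {
  const users = new Set<string>();
  const crashedUsers = new Set<string>();
  for (const s of sessions) {
    users.add(s.userId);
    if (s.crashed) crashedUsers.add(s.userId);
  }
  if (users.size === 0) return 1;
  return (users.size - crashedUsers.size) / users.size;
}

// Four sessions from three users; user "a" crashed once.
const sample: SessionRecord[] = [
  { userId: "a", crashed: false },
  { userId: "a", crashed: true },
  { userId: "b", crashed: false },
  { userId: "c", crashed: false },
];
```

In this sample, 3 of 4 sessions are crash-free but only 2 of 3 users are. A user with many sessions counts once in the user metric, so the user-based rate tends to be the lower and stricter of the two.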

💻 Code patterns

Crashlytics velocity alert

Firebase Console → Crashlytics → Velocity Alerts
- Threshold: 1% sessions affected
- Window: 1 hour
- Slack / email integration
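
For teams that compute the same signal from their own telemetry, the threshold logic can be sketched as below. This is not how Crashlytics implements its alert; `WindowStats`, `isVelocityBreach`, and the `minSessions` guard are hypothetical names and assumptions.

```typescript
// Hypothetical counters for one crash issue over a sliding window (e.g. 1 hour).
interface WindowStats {
  totalSessions: number;    // all sessions observed in the window
  affectedSessions: number; // sessions that hit this crash issue
}

// Returns true when the issue affects more than `threshold` of sessions.
// `minSessions` suppresses alerts on tiny samples (an assumption, tune as needed).
function isVelocityBreach(
  stats: WindowStats,
  threshold: number = 0.01,
  minSessions: number = 500,
): boolean {
  if (stats.totalSessions < minSessions) return false;
  return stats.affectedSessions / stats.totalSessions > threshold;
}
```

With the defaults, 150 affected sessions out of 10,000 (1.5%) breaches the threshold, while 50 out of 100 does not because the sample is below `minSessions`.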

Sentry release health

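// At app startup. Assumes the React Native SDK; on web use the matching Sentry package.
import * as Sentry from '@sentry/react-native';

Sentry.init({
  dsn, // project DSN, defined elsewhere
  release: `app@${version}+${build}`, // version and build come from app metadata
  environment: 'prod',
  tracesSampleRate: 0.1, // sample 10% of transactions
  enableAutoSessionTracking: true, // sessions feed the release health metrics
});

Sentry Dashboard → Releases → Crash-free users / sessions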

SLO definition

SLO:
  crash_free_users_30d: 99.5%
  crash_free_sessions_30d: 99.8%
  ANR_rate_24h: < 0.47% (Play Console bad-behavior threshold)
  startup_latency_p95: < 2s

Alerts:
  - velocity > 1% in 1h → page oncall
  - daily crash-free < 99% → page
  - new crash signature affecting > 50 users / 1h → page
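
The targets above can be checked mechanically. The sketch below uses hypothetical field and function names and assumes all rates are expressed as fractions; it also shows the error-budget arithmetic that a 99.5% target implies.

```typescript
// Hypothetical metric snapshot; rates are fractions (0.995 means 99.5%).
interface Metrics {
  crashFreeUsers30d: number;
  crashFreeSessions30d: number;
  anrRate24h: number;
  startupLatencyP95Ms: number;
}

// Targets taken from the SLO definition above.
const TARGETS: Metrics = {
  crashFreeUsers30d: 0.995,
  crashFreeSessions30d: 0.998,
  anrRate24h: 0.0047,
  startupLatencyP95Ms: 2000,
};

// Returns the names of the SLOs that the snapshot violates.
function violatedSlos(m: Metrics): string[] {
  const violations: string[] = [];
  if (m.crashFreeUsers30d < TARGETS.crashFreeUsers30d) violations.push("crash_free_users_30d");
  if (m.crashFreeSessions30d < TARGETS.crashFreeSessions30d) violations.push("crash_free_sessions_30d");
  if (m.anrRate24h >= TARGETS.anrRate24h) violations.push("ANR_rate_24h");
  if (m.startupLatencyP95Ms >= TARGETS.startupLatencyP95Ms) violations.push("startup_latency_p95");
  return violations;
}

// Fraction of the error budget still unspent; negative once the SLO is blown.
function remainingErrorBudget(actual: number, target: number): number {
  const budget = 1 - target; // allowed failure fraction, 0.005 for a 99.5% target
  const spent = 1 - actual;  // observed failure fraction
  return (budget - spent) / budget;
}
```

For example, a 30-day crash-free users rate of 99.75% against the 99.5% target leaves about half of the error budget unspent.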

Debug information

import FirebaseCrashlytics

// Attach context so crash reports can be filtered by user, plan, and experiment.
Crashlytics.crashlytics().setUserID(user.id)
Crashlytics.crashlytics().setCustomValue(plan, forKey: "plan")
Crashlytics.crashlytics().setCustomValue(experimentVariant, forKey: "exp:onboarding")
Crashlytics.crashlytics().log("entered checkout flow")

Automated symbol upload

Fastlane: upload_symbols_to_crashlytics
or a build script

Without dSYMs, crash reports cannot be symbolicated and are effectively useless.

Hotfix channel

  • iOS: TestFlight, an expedited review request (roughly 24h), and pausing the phased release.
  • Android: Play Console internal testing → staged rollout at 5% → full rollout.

# Fastlane: automated emergency hotfix
lane :hotfix do
  ensure_git_branch(branch: 'hotfix/.*')  # only run from hotfix branches
  match(type: 'appstore')                 # distribution signing for TestFlight
  build_app
  upload_to_testflight(distribute_external: true, groups: ['Beta'])
end

Phased release / staged rollout

# iOS: App Store Connect API
upload_to_app_store(
  phased_release: true,  # automatic gradual rollout over 7 days
)

# Android
upload_to_play_store(track: 'production', rollout: '0.05')

If an issue is found in the new version, halt the rollout.
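
The halt decision can be made explicit as a small policy function. This is a sketch under assumptions: the stage fractions, the thresholds, and every name below are hypothetical, and actually pausing a release still happens through App Store Connect or the Play Console.

```typescript
// Hypothetical rollout stages as user fractions; adjust to your release policy.
const STAGES: number[] = [0.05, 0.2, 0.5, 1.0];

type RolloutAction =
  | { kind: "halt" }
  | { kind: "hold" }
  | { kind: "advance"; to: number };

// Decide the next rollout step from the new build's crash-free users rate.
function nextRolloutAction(
  currentFraction: number,
  crashFreeRate: number,
  haltBelow: number = 0.99,         // mirrors the "daily crash-free < 99%" alert
  advanceAtOrAbove: number = 0.995, // mirrors the 99.5% SLO target
): RolloutAction {
  if (crashFreeRate < haltBelow) return { kind: "halt" };
  if (crashFreeRate < advanceAtOrAbove) return { kind: "hold" };
  const next = STAGES.find((s) => s > currentFraction);
  return next === undefined ? { kind: "hold" } : { kind: "advance", to: next };
}
```

A build at 5% with a 98.5% crash-free rate is halted, one at 99.3% stays at 5%, and one at 99.7% advances to the next stage.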

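Kill switch (server-controlled enforcement)

// Fetch remote config at app startup
const config = await fetchRemoteConfig();
if (config.minSupportedBuild > currentBuild) {
  // This build is no longer supported: block the app until the user updates.
  showUpdateRequiredScreen();
  return;
}
if (config.killFeature?.checkout) {
  // disable checkout
}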
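
One design question the snippet leaves open is what to do when the remote config cannot be fetched. The sketch below separates the decision from the fetch so it can be unit tested; the `RemoteConfig` shape, `SAFE_DEFAULT`, and `decideStartup` are hypothetical, and falling back to "nothing disabled" is an assumption (some teams prefer the last cached config).

```typescript
// Hypothetical remote-config shape; field names follow the snippet above.
interface RemoteConfig {
  minSupportedBuild: number;
  killFeature: Record<string, boolean>;
}

// Fallback when the fetch fails: no forced update, no feature disabled.
const SAFE_DEFAULT: RemoteConfig = { minSupportedBuild: 0, killFeature: {} };

type StartupDecision =
  | { kind: "updateRequired" }
  | { kind: "run"; disabledFeatures: string[] };

// Pure decision function: pass `null` when the config could not be loaded.
function decideStartup(config: RemoteConfig | null, currentBuild: number): StartupDecision {
  const c = config ?? SAFE_DEFAULT;
  if (c.minSupportedBuild > currentBuild) return { kind: "updateRequired" };
  const disabledFeatures = Object.keys(c.killFeature).filter((k) => c.killFeature[k] === true);
  return { kind: "run", disabledFeatures };
}
```

Keeping the decision pure means the forced-update and kill-switch paths can be tested without a network call or a UI.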

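Client error rate (non-crash)

import React, { ErrorInfo } from 'react';
// Sentry is assumed to be imported and initialized at app startup.

class GlobalErrorBoundary extends React.Component<React.PropsWithChildren<{}>, { hasError: boolean }> {
  state = { hasError: false };
  componentDidCatch(error: Error, info: ErrorInfo) {
    // Report the render error with the React component stack attached.
    Sentry.captureException(error, { extra: { componentStack: info.componentStack } });
    this.setState({ hasError: true });
  }
  render() {
    return this.state.hasError ? null : this.props.children; // swap null for a fallback UI
  }
}
// JS unhandled promise rejection (browser-style event API)
window.addEventListener('unhandledrejection', (e) => {
  Sentry.captureException(e.reason);
});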

Postmortem template

# Crash spike 2026-05-09
- Affected: 23,000 users (3.4%)
- Window: 14:00 - 15:30 UTC
- Root cause: a nil response from a new endpoint hit a force unwrap
- Fix: optional unwrap + fallback
- Hotfix: 1.4.2 (3 hours)
- Prevention: enable a force-unwrap lint at build time, add integration tests

🤔 Decision criteria

Monitor | Tool
--- | ---
General SLO | Crashlytics + Sentry
ANR (Android) | Crashlytics + Play Console
Startup latency | Firebase Performance / Xcode Instruments
Memory / battery | MetricKit (iOS) / Battery Historian (Android)
Network errors | In-house logging + Sentry breadcrumbs
Crash impact on user journeys | Mixpanel / Amplitude

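Anti-patterns

  • Missing dSYM / mapping files: stack traces are unreadable.
  • No crash velocity alert: large spikes go unnoticed.
  • Unrestrained force unwrap (!) in Swift: crashes on nil. Use optional unwrapping instead.
  • Catching an error, swallowing it, and continuing: this sets up a bigger crash later. Record the error and degrade gracefully instead.
  • No phased release in production: a bad build hits everyone at once.
  • No kill switch: in an emergency you are left waiting for store review.
  • Raw PII in crash logs: a GDPR risk. Mask it.
  • Supporting old versions indefinitely: set minSupportedBuild and force updates.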
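
The PII item above can be addressed by scrubbing messages before they reach a crash reporter. The sketch below is illustrative only: `maskPii` is a hypothetical helper, and the two patterns are deliberately simple assumptions that will miss many real-world formats.

```typescript
// Simple patterns for likely PII; treat them as a starting point, not full coverage.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const LONG_DIGITS = /\b\d{9,}\b/g; // phone-like or card-like digit runs

// Replace likely PII before the message is passed to a crash log or breadcrumb.
function maskPii(message: string): string {
  return message.replace(EMAIL, "<email>").replace(LONG_DIGITS, "<number>");
}
```

A message such as "login failed for jane.doe@example.com" becomes "login failed for <email>", while short numbers like an order count are left untouched.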

🤖 Hints for LLM use

  • Recommend using both Crashlytics and Sentry (each has different strengths).
  • Combine velocity alerts, automated dSYM upload, and phased release.
  • Use a kill switch for emergency response.

🔗 Related documents