---
id: mobile-crash-free-slo
title: "Crash-free SLO: Maintaining 99.5%"
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [mobile, crash, slo, reliability, vibe-coding]
tech_stack: { language: "Swift / Kotlin / RN", applicable_to: ["iOS", "Android"] }
applied_in: []
aliases: [crash-free, crashlytics SLO, ANR rate, mean time to detect, MTTD]
---

# Crash-free SLO

> 99.5%+ crash-free users is the commonly cited industry standard. You need **measurement + alerting + a fast hotfix channel**. Details differ between iOS and Android. Automate **Crashlytics velocity alerts**.

## 📖 Core Concepts

- Crash-free users: the share of users who saw no crash during the window (for example, 24h).
- Crash-free sessions: the share of sessions that ended without a crash.
- Velocity: a rapid rise in crashes within a short window; it should trigger an immediate alert.
- ANR rate (Android): tracked as a separate SLO.

## 💻 Code Patterns

### Crashlytics velocity alert

```
Firebase Console → Crashlytics → Velocity Alerts
- Threshold: 1% sessions affected
- Window: 1 hour
- Slack / email integration
```

### Sentry release health

```ts
import * as Sentry from '@sentry/react-native';

// At app launch. `dsn`, `version`, and `build` come from your app config.
Sentry.init({
  dsn,
  release: `app@${version}+${build}`,
  environment: 'prod',
  tracesSampleRate: 0.1,
  enableAutoSessionTracking: true, // required for crash-free users / sessions
});
```

```
Sentry Dashboard → Releases → Crash-free users / sessions
```

### SLO definition

```yaml
SLO:
  crash_free_users_30d: 99.5%
  crash_free_sessions_30d: 99.8%
  ANR_rate_24h: < 0.47%   # Play Console threshold
  startup_latency_p95: < 2s
Alerts:
  - velocity > 1% in 1h → page oncall
  - daily crash-free < 99% → page
  - new crash signature affecting > 50 users / 1h → page
```

### Debug context

```swift
import FirebaseCrashlytics

// Attach user and experiment context so crash reports can be segmented.
Crashlytics.crashlytics().setUserID(user.id)
Crashlytics.crashlytics().setCustomValue(plan, forKey: "plan")
Crashlytics.crashlytics().setCustomValue(experimentVariant, forKey: "exp:onboarding")
Crashlytics.crashlytics().log("entered checkout flow")
```

### Automated symbol upload

```
Fastlane: upload_symbols_to_crashlytics
or a build script
```

Without dSYMs, stack traces cannot be symbolicated, so the crash reports are effectively useless.

### Hotfix channel

- iOS: TestFlight + expedited review (about 24h) + pausing the phased release.
- Android: Play Console internal testing → staged rollout 5% → full.

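The staged rollout step above implies a go/no-go check before widening each stage. A minimal sketch of such a gate follows, assuming per-build user counts can be exported from your crash reporter. The `RolloutStats` shape, the `rolloutDecision` function, and the `minUsers` floor are hypothetical and not part of Crashlytics, Sentry, or Play Console. The 99.5% and 99% thresholds are the ones from the SLO definition.

```typescript
// Hypothetical input: counts for one build during a staged rollout window.
interface RolloutStats {
  activeUsers: number;  // users who ran the build in the window
  crashedUsers: number; // of those, users who saw at least one crash
}

type Decision = 'proceed' | 'hold' | 'halt';

// Crash-free users as a percentage (0 to 100).
function crashFreeUsersPct(s: RolloutStats): number {
  if (s.activeUsers <= 0) return 100; // no data yet, nothing crashed
  return (1 - s.crashedUsers / s.activeUsers) * 100;
}

// Go/no-go rule for widening the rollout.
// sloPct and pagePct mirror the SLO and paging thresholds;
// minUsers is an assumed floor so tiny samples do not drive the decision.
function rolloutDecision(
  s: RolloutStats,
  sloPct = 99.5,
  pagePct = 99.0,
  minUsers = 1000,
): Decision {
  if (s.activeUsers < minUsers) return 'hold'; // sample too small
  const pct = crashFreeUsersPct(s);
  if (pct < pagePct) return 'halt';            // below paging threshold
  return pct >= sloPct ? 'proceed' : 'hold';   // between thresholds: wait
}
```

For example, `rolloutDecision({ activeUsers: 2000, crashedUsers: 40 })` gives 98% crash-free users and returns `'halt'`, which is the point where the rollout would be paused and the hotfix lane used.
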
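```ruby
# Fastlane: automated emergency hotfix lane
lane :hotfix do
  ensure_git_branch(branch: 'hotfix/.*') # only run from hotfix/* branches
  match(type: 'appstore')                # fetch distribution signing assets
  build_app
  upload_to_testflight(distribute_external: true, groups: ['Beta'])
end
```

### Phased release / staged rollout

```ruby
# iOS: App Store Connect API
upload_to_app_store(
  phased_release: true, # rolls out automatically over 7 days
)

# Android: start at 5% of users
upload_to_play_store(track: 'production', rollout: '0.05')
```

If an issue surfaces in the new version, halt the rollout.

### Kill switch (server-controlled enforcement)

```ts
// Fetch remote config at app launch.
// Wrapped in a function so the early return is valid.
async function checkRemoteGates() {
  const config = await fetchRemoteConfig();
  if (config.minSupportedBuild > currentBuild) {
    showUpdateRequiredScreen();
    return;
  }
  if (config.killFeature.checkout) {
    // disable checkout
  }
}
```

### Client error rate (non-crash)

```ts
import React, { ErrorInfo } from 'react';
import * as Sentry from '@sentry/react-native';
class GlobalErrorBoundary extends React.Component<React.PropsWithChildren, { hasError: boolean }> {
  state = { hasError: false };
  componentDidCatch(error: Error, info: ErrorInfo) {
    Sentry.captureException(error, { extra: { componentStack: info.componentStack } });
    this.setState({ hasError: true });
  }
  render() {
    return this.state.hasError ? null : this.props.children;
  }
}
```

```ts
// JS unhandled rejection (browser / WebView API; in React Native,
// rely on the SDK's own unhandled-rejection handling instead)
window.addEventListener('unhandledrejection', (e) => {
  Sentry.captureException(e.reason);
});
```

### Postmortem template

```markdown
# Crash spike 2026-05-09
- Affected: 23,000 users (3.4%)
- Window: 14:00 - 15:30 UTC
- Root cause: nil response from a new endpoint → force unwrap
- Fix: optional unwrap + fallback
- Hotfix: 1.4.2 (3 hours)
- Prevention: enable a force-unwrap lint at build time, add an integration test
```

## 🤔 Decision Criteria

| What to monitor | Tool |
|---|---|
| General SLO | Crashlytics + Sentry |
| ANR (Android) | Crashlytics + Play Console |
| Startup latency | Firebase Performance / Xcode Instruments |
| Memory / battery | MetricKit (iOS) / Battery Historian (Android) |
| Network errors | In-house logging + Sentry breadcrumbs |
| Crash impact on user journeys | Mixpanel / Amplitude |

## ❌ Anti-patterns

- **Missing dSYM / mapping files**: stack traces are unreadable.
- **No crash velocity alert**: a large spike goes unnoticed.
- **Unrestrained force unwrap (`!`) in Swift**: crashes on nil. Use optional binding (`if let` / `guard let`) instead.
- **Catch, swallow, and continue**: leads to a bigger crash later. Record the error and degrade gracefully instead.
- **No phased release in production**: a bad build hits every user at once.
- **No kill switch**: in an emergency you are stuck waiting for store review.
- **PII left as-is in crash logs**: a GDPR problem. Mask it.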
- **Supporting old versions indefinitely**: set `minSupportedBuild` and force an update.

## 🤖 LLM Usage Hints

- Recommend using both Crashlytics and Sentry (each has different strengths).
- Velocity alerts + automated dSYM upload + phased release.
- Use a kill switch for emergency response.

## 🔗 Related Documents

- [[Native_Crash_Reporting]]
- [[Mobile_CI_CD_Fastlane]]
- [[Native_ANR_Freeze_Debugging]]