| id | title | category | status | source_trust_level | verification_status | created_at | updated_at | tags | tech_stack | applied_in | aliases |
|---|---|---|---|---|---|---|---|---|---|---|---|
| mobile-crash-free-slo | Crash-free SLO: maintaining 99.5% | Coding | draft | B | conceptual | 2026-05-09 | 2026-05-09 | | | | |
Crash-free SLO
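A crash-free users rate of 99.5% or higher is the commonly cited industry standard. Holding it takes measurement, alerting, and a fast hotfix channel. The details differ between iOS and Android. Crashlytics velocity alerts fire automatically.
📖 Key concepts
- Crash-free users: the share of users who did not experience a crash during a 24h window.
- Crash-free sessions: the share of sessions that ended without a crash.
- Velocity: a rapid rise in crashes within a short period, which should trigger an immediate alert.
- ANR rate (Android): the Application Not Responding rate, tracked under its own SLO.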
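The two crash-free ratios above are easy to conflate, so here is a minimal sketch of how they could be computed from raw session records. The `SessionRecord` shape and the `crashFreeMetrics` function are hypothetical names for illustration; Crashlytics and Sentry compute these figures for you, and their exact definitions may differ.

```typescript
// One record per app session inside the measurement window.
// Field names are illustrative, not taken from any SDK.
interface SessionRecord {
  userId: string;
  crashed: boolean;
}

interface CrashFreeMetrics {
  crashFreeUsers: number;    // fraction in [0, 1]
  crashFreeSessions: number; // fraction in [0, 1]
}

function crashFreeMetrics(sessions: SessionRecord[]): CrashFreeMetrics {
  if (sessions.length === 0) {
    // No traffic in the window: report fully crash-free instead of dividing by zero.
    return { crashFreeUsers: 1, crashFreeSessions: 1 };
  }
  const allUsers = new Set<string>();
  const crashedUsers = new Set<string>();
  let crashedSessions = 0;
  for (const s of sessions) {
    allUsers.add(s.userId);
    if (s.crashed) {
      crashedUsers.add(s.userId);
      crashedSessions += 1;
    }
  }
  return {
    crashFreeUsers: 1 - crashedUsers.size / allUsers.size,
    crashFreeSessions: 1 - crashedSessions / sessions.length,
  };
}

// User "a" crashes in one of two sessions; user "b" never crashes.
const metrics = crashFreeMetrics([
  { userId: "a", crashed: true },
  { userId: "a", crashed: false },
  { userId: "b", crashed: false },
  { userId: "b", crashed: false },
]);
console.log(metrics.crashFreeUsers);    // 0.5
console.log(metrics.crashFreeSessions); // 0.75
```

A single crash counts against the user for the whole window but only against one session, which is why the users figure is normally the lower of the two.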
💻 Code patterns
Crashlytics velocity alert
Firebase Console → Crashlytics → Velocity Alerts
- Threshold: 1% sessions affected
- Window: 1 hour
- Slack / email integration
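If you also run an in-house crash pipeline, the same rule can be expressed in a few lines. This is a sketch of the threshold logic only, not how Crashlytics implements it; `VelocityWindow`, `shouldFireVelocityAlert`, and the `minSessions` guard are assumptions introduced for illustration.

```typescript
// Counts for one issue over one alert window (for example, the last hour).
interface VelocityWindow {
  totalSessions: number;    // all sessions seen in the window
  affectedSessions: number; // sessions that hit this specific issue
}

function shouldFireVelocityAlert(
  w: VelocityWindow,
  thresholdFraction = 0.01, // 1% of sessions, matching the setting above
  minSessions = 100,        // assumed guard against noise on low traffic
): boolean {
  if (w.totalSessions < minSessions) {
    return false; // too little traffic for the ratio to be meaningful
  }
  return w.affectedSessions / w.totalSessions >= thresholdFraction;
}

console.log(shouldFireVelocityAlert({ totalSessions: 10000, affectedSessions: 150 })); // true
console.log(shouldFireVelocityAlert({ totalSessions: 10000, affectedSessions: 50 }));  // false
console.log(shouldFireVelocityAlert({ totalSessions: 40, affectedSessions: 5 }));      // false
```

The minimum-traffic guard is a design choice: without it, a handful of crashes during a quiet hour would page someone.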
Sentry release health
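// At app startup. Assumes the @sentry/react-native SDK;
// dsn, version, and build come from your app configuration.
import * as Sentry from '@sentry/react-native';

Sentry.init({
  dsn,
  release: `app@${version}+${build}`, // must match the release used when uploading symbols
  environment: 'prod',
  tracesSampleRate: 0.1,              // sample 10% of transactions
  enableAutoSessionTracking: true,    // required for crash-free session metrics
});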
Sentry Dashboard → Releases → Crash-free users / sessions
SLO definition
SLO:
  crash_free_users_30d: 99.5%
  crash_free_sessions_30d: 99.8%
  ANR_rate_24h: < 0.47% (Play Console bad-behavior threshold)
  startup_latency_p95: < 2s
Alerts:
  - velocity > 1% in 1h → page on-call
  - daily crash-free < 99% → page
  - new crash signature affecting > 50 users / 1h → page
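To make the targets above checkable, the following sketch evaluates measured values against them and reports how much of the crash error budget is left. All names (`SloValues`, `violatedObjectives`, `errorBudgetRemaining`) are hypothetical, and the measured numbers in the example are made up for illustration.

```typescript
// Targets and measurements share one shape. Rates are fractions, latency is in ms.
interface SloValues {
  crashFreeUsers30d: number;    // minimum allowed
  crashFreeSessions30d: number; // minimum allowed
  anrRate24h: number;           // must stay below this
  startupLatencyP95Ms: number;  // must stay below this
}

const TARGETS: SloValues = {
  crashFreeUsers30d: 0.995,
  crashFreeSessions30d: 0.998,
  anrRate24h: 0.0047,
  startupLatencyP95Ms: 2000,
};

// Returns the names of the objectives that are currently violated.
function violatedObjectives(observed: SloValues, targets: SloValues = TARGETS): string[] {
  const violations: string[] = [];
  if (observed.crashFreeUsers30d < targets.crashFreeUsers30d) violations.push("crash_free_users_30d");
  if (observed.crashFreeSessions30d < targets.crashFreeSessions30d) violations.push("crash_free_sessions_30d");
  if (observed.anrRate24h >= targets.anrRate24h) violations.push("ANR_rate_24h");
  if (observed.startupLatencyP95Ms >= targets.startupLatencyP95Ms) violations.push("startup_latency_p95");
  return violations;
}

// Fraction of the error budget still unspent: 1 means untouched, 0 or below means exhausted.
function errorBudgetRemaining(target: number, observed: number): number {
  const budget = 1 - target;
  if (budget <= 0) return 0; // a 100% target leaves no budget to spend
  return 1 - (1 - observed) / budget;
}

// Illustrative measurements, not real data.
const observed: SloValues = {
  crashFreeUsers30d: 0.997,
  crashFreeSessions30d: 0.9975,
  anrRate24h: 0.003,
  startupLatencyP95Ms: 1800,
};
console.log(violatedObjectives(observed));       // [ 'crash_free_sessions_30d' ]
console.log(errorBudgetRemaining(0.995, 0.997)); // about 0.4, i.e. 40% of the budget left
```

Tracking the remaining budget rather than only pass/fail gives an early signal: a release that burns half the monthly budget in a day deserves attention even though the 30-day SLO still holds.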
Debug context
// Use an opaque internal ID here, never an email or other PII.
Crashlytics.crashlytics().setUserID(user.id)
Crashlytics.crashlytics().setCustomValue(plan, forKey: "plan")
Crashlytics.crashlytics().setCustomValue(experimentVariant, forKey: "exp:onboarding")
Crashlytics.crashlytics().log("entered checkout flow")
Automate symbol upload
Fastlane: upload_symbols_to_crashlytics
or a build script
Without dSYMs, crash reports cannot be symbolicated, so the stack traces are effectively useless.
Hotfix channel
- iOS: TestFlight, an expedited review request (about 24h), and pausing the phased release.
- Android: Play Console internal testing → staged rollout at 5% → full rollout.
# Fastlane: automated emergency hotfix
lane :hotfix do
  ensure_git_branch(branch: 'hotfix/.*') # only run from a hotfix/* branch
  match                                  # sync signing certificates and profiles
  build_app
  upload_to_testflight(distribute_external: true, groups: ['Beta'])
end
Phased release / staged rollout
# iOS: App Store Connect API
upload_to_app_store(
  phased_release: true, # automatic 7-day gradual rollout
)
# Android
upload_to_play_store(track: 'production', rollout: '0.05') # start at 5% of users
If an issue is found in the new version, halt the rollout.
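The halt decision can be made mechanical by comparing the new release against the previous one. The sketch below is one possible gate, not a store feature; `ReleaseHealth`, `decideRollout`, and both default thresholds are assumptions chosen for illustration.

```typescript
type RolloutDecision = "halt" | "hold" | "advance";

interface ReleaseHealth {
  crashFreeSessions: number; // fraction in [0, 1]
  sessions: number;          // sample size behind that fraction
}

function decideRollout(
  candidate: ReleaseHealth,
  baseline: ReleaseHealth,
  minSessions = 1000,    // assumed minimum sample before trusting the numbers
  maxRegression = 0.002, // assumed tolerated drop in crash-free sessions
): RolloutDecision {
  if (candidate.sessions < minSessions) {
    return "hold"; // not enough data yet: keep the current rollout percentage
  }
  const regression = baseline.crashFreeSessions - candidate.crashFreeSessions;
  return regression > maxRegression ? "halt" : "advance";
}

const baseline: ReleaseHealth = { crashFreeSessions: 0.998, sessions: 500000 };
console.log(decideRollout({ crashFreeSessions: 0.99, sessions: 5000 }, baseline));   // "halt"
console.log(decideRollout({ crashFreeSessions: 0.9975, sessions: 5000 }, baseline)); // "advance"
console.log(decideRollout({ crashFreeSessions: 0.9, sessions: 200 }, baseline));     // "hold"
```

Comparing against the previous release rather than the absolute SLO catches regressions while the rollout is still small enough to stop cheaply.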
Kill switch (server-controlled enforcement)
// Fetch remote config at app startup
const config = await fetchRemoteConfig();
if (config.minSupportedBuild > currentBuild) {
  showUpdateRequiredScreen(); // block the app until the user updates
  return;
}
if (config.killFeature.checkout) {
  // Disable checkout
}
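One thing the snippet above leaves open is what happens when the remote config fetch fails at startup. A possible approach, sketched here under assumptions, is to fall back to a cached config and then to safe defaults so that a config outage never blocks launch. The `RemoteConfig` shape mirrors the fields used above; `resolveStartup`, `StartupAction`, and `SAFE_DEFAULT` are hypothetical names.

```typescript
interface RemoteConfig {
  minSupportedBuild: number;
  killFeature: Record<string, boolean>;
}

type StartupAction =
  | { kind: "forceUpdate" }
  | { kind: "run"; disabledFeatures: string[] };

// Used when neither a fresh nor a cached config is available:
// allow every build and disable nothing.
const SAFE_DEFAULT: RemoteConfig = { minSupportedBuild: 0, killFeature: {} };

function resolveStartup(
  currentBuild: number,
  fetched: RemoteConfig | null,        // null when the fetch failed or timed out
  cached: RemoteConfig | null = null,  // last config that was fetched successfully
): StartupAction {
  const config = fetched ?? cached ?? SAFE_DEFAULT;
  if (config.minSupportedBuild > currentBuild) {
    return { kind: "forceUpdate" };
  }
  const disabledFeatures = Object.keys(config.killFeature)
    .filter((name) => config.killFeature[name])
    .sort();
  return { kind: "run", disabledFeatures };
}

const live: RemoteConfig = { minSupportedBuild: 120, killFeature: { checkout: true, search: false } };
console.log(resolveStartup(100, live)); // { kind: 'forceUpdate' }
console.log(resolveStartup(130, live)); // { kind: 'run', disabledFeatures: [ 'checkout' ] }
console.log(resolveStartup(100, null)); // { kind: 'run', disabledFeatures: [] }
```

Keeping the decision in a pure function separates it from networking and UI, which makes the kill switch itself easy to unit test.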
Client error rate (non-crash)
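class GlobalErrorBoundary extends React.Component<React.PropsWithChildren, { hasError: boolean }> {
  state = { hasError: false };

  componentDidCatch(error: Error, info: React.ErrorInfo) {
    // Report render-time errors together with the React component stack.
    Sentry.captureException(error, { extra: { componentStack: info.componentStack } });
    this.setState({ hasError: true });
  }

  render() {
    // Replace null with a fallback UI.
    return this.state.hasError ? null : this.props.children;
  }
}

// Unhandled promise rejections (JS)
window.addEventListener('unhandledrejection', (e) => {
  Sentry.captureException(e.reason);
});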
Postmortem template
# Crash spike 2026-05-09
- Affected: 23,000 users (3.4%)
- Window: 14:00 - 15:30 UTC
- Root cause: a nil response from the new endpoint reached a force unwrap
- Fix: optional unwrapping with a fallback
- Hotfix: 1.4.2 (3 hours)
- Prevention: enable a force-unwrap lint rule at build time; add an integration test
🤔 Decision criteria
| What to monitor | Tool |
|---|---|
| General SLO | Crashlytics + Sentry |
| ANR (Android) | Crashlytics + Play Console |
| Startup latency | Firebase Performance / Xcode Instruments |
| Memory / battery | MetricKit (iOS) / Battery Historian (Android) |
| Network errors | In-house logging + Sentry breadcrumbs |
| Crash impact on user journeys | Mixpanel / Amplitude |
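❌ Anti-patterns
- Missing dSYM / mapping files: stack traces become unreadable.
- No crash velocity alert: large spikes go unnoticed.
- Unrestrained force unwrap (!) in Swift: crashes on nil. Use optional unwrapping instead.
- Catch, swallow, and continue: leads to a bigger crash later. Record the error and degrade gracefully instead.
- No phased release in production: a bad build hits every user at once.
- No kill switch: in an emergency you are stuck waiting for store review.
- Raw PII in crash logs: a GDPR risk. Mask it.
- Supporting old versions indefinitely: set minSupportedBuild and force an update.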
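For the PII item in the list above, masking can happen in one place just before a message is handed to the crash reporter. The sketch below is a minimal, assumption-heavy example: `maskPii` is a hypothetical helper, and its two patterns (emails and long digit runs) are nowhere near a complete PII detector.

```typescript
// Matches common email shapes. Deliberately simple, not RFC-complete.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// Matches runs of 9 or more digits, such as phone or card numbers.
const LONG_DIGITS_PATTERN = /\b\d{9,}\b/g;

function maskPii(message: string): string {
  return message
    .replace(EMAIL_PATTERN, "[email]")
    .replace(LONG_DIGITS_PATTERN, "[number]");
}

console.log(maskPii("checkout failed for jane.doe@example.com, card 4111111111111111"));
// → "checkout failed for [email], card [number]"
```

Routing every breadcrumb and log line through a single helper like this keeps the masking rules in one reviewable place instead of scattered across call sites.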
🤖 LLM usage hints
- Recommend using both Crashlytics and Sentry, since each has different strengths.
- Combine velocity alerts, automated dSYM upload, and phased release.
- Use a kill switch for emergency response.