---
id: backend-maintenance-mode
title: "Maintenance Mode: Gradual / Read-only / Banner"
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [backend, maintenance, vibe-coding]
tech_stack: { language: "TS", applicable_to: ["Backend"] }
applied_in: []
aliases: [maintenance mode, read-only mode, downtime, planned outage, kill switch]
---

# Maintenance Mode

> Migrations and other large changes need a temporary block. Escalate gradually: **banner → read-only → full block**. This is better than taking the service fully down. Integrate it with kill switches and feature flags.

## 📖 Core Concepts

- Banner only: a "Maintenance scheduled at X" notice.
- Read-only: GET is allowed; POST/PUT/DELETE are blocked.
- Restricted: only admins are allowed.
- Full block: 503 for all traffic.

## 💻 Code Patterns

### Feature flag based

```ts
// Modes: 'off' | 'banner' | 'readonly' | 'admin-only' | 'full'
const SAFE_METHODS = new Set(['GET', 'HEAD', 'OPTIONS']);

app.use(async (req, res, next) => {
  // Read the flag per request so a toggle takes effect without a restart
  // (cache it briefly if the flag store is slow).
  let mode: string | undefined;
  try {
    mode = await flags.get('maintenance');
  } catch {
    return next(); // flag store unavailable: fail open
  }

  switch (mode) {
    case 'off':
      return next();
    case 'banner':
      res.setHeader('X-Maintenance-Banner', 'Scheduled at 2026-05-10 02:00 UTC');
      return next();
    case 'readonly':
      if (!SAFE_METHODS.has(req.method)) {
        return res.status(503).set('Retry-After', '1800').json({
          type: '...',
          title: 'Read-only mode',
          status: 503,
          detail: 'Writes are temporarily disabled',
          retryAfter: 1800,
        });
      }
      return next();
    case 'admin-only':
      // Assumes an auth middleware has already populated req.user.
      if (!req.user?.isAdmin) {
        return res.status(503).set('Retry-After', '1800').json({
          type: '...',
          title: 'Maintenance',
          status: 503,
        });
      }
      return next();
    case 'full':
      return res.status(503).set('Retry-After', '1800').json({
        type: '...',
        title: 'Maintenance',
        status: 503,
      });
    default:
      // Unknown value: pass through rather than leave the request hanging.
      return next();
  }
});
```

### Reverse proxy block (nginx)

```nginx
# Return 503 to everyone except admin IPs while the maintenance file exists.
# nginx does not allow nested `if`, so the decision is built up in a variable.

# http context: admin IP allowlist
geo $is_admin {
    default  0;
    10.0.0.1 1;
    10.0.0.2 1;
}

server {
    set $maintenance 0;
    if (-f /var/www/maintenance.html) { set $maintenance 1; }
    if ($is_admin)                    { set $maintenance 0; }
    if ($maintenance)                 { return 503; }

    error_page 503 /maintenance.html;
    location = /maintenance.html {
        root /var/www;
        internal;
        add_header Retry-After 1800 always;
    }

    location / {
        proxy_pass http://app;
    }
}
```

```bash
# Toggle
touch /var/www/maintenance.html  # ON
rm /var/www/maintenance.html     # OFF
```

### CDN level (Cloudflare Worker)

```ts
interface Env {
  KV: KVNamespace; // KV binding that holds the 'maintenance' key
}

// Hypothetical allowlist check; the IPs are placeholders.
// CF-Connecting-IP is the client IP header set by Cloudflare.
const ADMIN_IPS = new Set(['10.0.0.1', '10.0.0.2']);
function isAdminIp(req: Request): boolean {
  return ADMIN_IPS.has(req.headers.get('CF-Connecting-IP') ?? '');
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const mode = await env.KV.get('maintenance');
    if (mode === 'full' && !isAdminIp(req)) {
      return new Response('Maintenance', {
        status: 503,
        headers: { 'Retry-After': '1800', 'Content-Type': 'text/html' },
      });
    }
    return fetch(req);
  },
};
```

### Banner UI

```tsx
function App() {
  // Positional useQuery arguments (TanStack Query v4 and earlier).
  const { data: status } = useQuery(['maintenance'], fetchStatus);

  return (
    <>
      {status?.maintenance?.scheduled && (
        <div>
          ⚠️ Scheduled maintenance: {format(status.maintenance.start)} - {format(status.maintenance.end)}
        </div>
      )}
      {status?.maintenance?.readonly && (
        <div>
          🔒 Read-only mode active. Writes are temporarily disabled.
        </div>
      )}
      {/* ... */}
    </>
  );
}
```

### DB migration with read-only

```bash
# 1. Read-only mode ON (block writes)
# 2. Wait for in-flight writes to complete
# 3. Run the migration (large backfill, partition rebuild)
# 4. Verify
# 5. Read-only mode OFF
```

```sql
-- PG read-only role
CREATE ROLE readonly;
-- Make new sessions of app_user default to read-only transactions.
-- Existing sessions are unaffected, and a session can still override
-- the default with SET, so this is a guard rail, not a hard lock.
ALTER USER app_user SET default_transaction_read_only = on;
```

### Kill switch (emergency)

```ts
// Controlled from an external KV store or config
async function checkKillSwitch(feature: string): Promise<boolean> {
  return (await redis.get(`kill:${feature}`)) === '1';
}

app.post('/api/payments', async (req, res) => {
  if (await checkKillSwitch('payments')) {
    return res.status(503).json({
      title: 'Payments temporarily unavailable',
      detail: 'We are working to restore service. Try again in a few minutes.',
    });
  }
  // ...
});
```

→ Turn a feature off the moment a bug is found, without waiting for a deploy.

### Status page

```
status.acme.com: shown to users.
- Statuspage.io / Better Stack / self-hosted.
- Announce in advance: "Scheduled maintenance: 2026-05-10 02:00 UTC".
```

### Communication (users)

```
1. Email (24h+ before): for large maintenance.
2. Banner (web): from 1h before and during.
3. API response (header): on every response.
4. Status page: always.
5. Twitter / social media: during incidents.
```

### Retry-After per API

```ts
res.status(503).set({
  'Retry-After': '300',
  'X-Maintenance-Mode': 'true',
}).json({
  type: 'https://api.acme.com/errors/maintenance',
  title: 'Maintenance',
  detail: 'API temporarily unavailable, retry in 5 minutes',
  retryAfter: 300,
});
```

→ Clients can retry automatically.

### Soft launch (visible to admins only)

```ts
// The new feature is deployed to prod but usable only by admins and beta testers
if (newFeature.enabled) {
  if (!req.user?.isAdmin && !req.user?.betaTester) {
    return res.status(404).end(); // to regular users it looks like it does not exist
  }
}
```

→ Stealth deploy + soft test.
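
### Client-side retry (Retry-After)

The 503 + `Retry-After` responses above only help if the client honors the header. The following is a minimal sketch of that client side, not a definitive implementation. `parseRetryAfterMs`, `retryOn503`, and their options are hypothetical names introduced here for illustration. It assumes the server sends `Retry-After` either as delta-seconds or as an HTTP-date, and it stops retrying when the header is missing or asks for a longer wait than the caller accepts.

```ts
// Hypothetical helpers for illustration; none of these names come from a library.

interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

interface RetryOptions {
  maxAttempts?: number; // total attempts, including the first one
  maxWaitMs?: number;   // give up if the server asks for a longer wait
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Converts a Retry-After value (delta-seconds or HTTP-date) to milliseconds.
// Returns null when the header is missing or cannot be parsed.
function parseRetryAfterMs(
  header: string | null,
  nowMs: number = Date.now(),
): number | null {
  if (header === null) return null;
  const value = header.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  const dateMs = Date.parse(value);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
}

// Re-sends the request while the server answers 503 with a usable Retry-After.
async function retryOn503<R extends MinimalResponse>(
  send: () => Promise<R>,
  { maxAttempts = 3, maxWaitMs = 60_000, sleep = defaultSleep }: RetryOptions = {},
): Promise<R> {
  let res = await send();
  for (let attempt = 1; attempt < maxAttempts && res.status === 503; attempt++) {
    const waitMs = parseRetryAfterMs(res.headers.get('Retry-After'));
    if (waitMs === null || waitMs > maxWaitMs) break; // let the caller decide
    await sleep(waitMs);
    res = await send();
  }
  return res;
}
```

A call might look like `retryOn503(() => fetch(url, { method: 'POST', body }))`. Passing a function rather than a prepared request lets each attempt build a fresh request body. With the 1800-second values used in this document, the default `maxWaitMs` of one minute makes the helper return the 503 immediately, which is usually what an interactive client wants: show the read-only banner instead of blocking.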

### Database maintenance

```sql
-- Keep locks short during large migrations
-- Use online schema-change tools such as pg_repack (PostgreSQL) or gh-ost (MySQL)
-- Or switch the database to read-only.
-- Note: this changes the default for new sessions only; existing connections
-- must reconnect (for example, by recycling the connection pool) to pick it up.
ALTER DATABASE app SET default_transaction_read_only = on;
-- Run the migration
ALTER DATABASE app SET default_transaction_read_only = off;
```

### Rolling restart

```yaml
# K8s
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
```

→ Pods are replaced one at a time, with the new pod started before an old one is terminated, so the service is not interrupted.

### Runbook (written in advance)

```markdown
# Maintenance Runbook: DB Schema Migration v2

## Pre-checks
- [ ] Backup latest snapshot taken
- [ ] Migration tested on staging
- [ ] Rollback script ready
- [ ] Status page updated
- [ ] On-call notified

## Steps (estimated 30min)
1. Enable read-only mode at 02:00 UTC
2. Wait for write queue drain (5 min)
3. Run migration: `pnpm migrate:up`
4. Verify schema: `pnpm verify`
5. Disable read-only mode
6. Monitor errors for 30 min

## Rollback
1. Enable read-only mode
2. Run rollback: `pnpm migrate:down`
3. Disable read-only mode
4. Investigate

## Communication
- Status page: "Scheduled maintenance" 24h before
- Email: 24h before
- During: hourly status updates
```

### Test maintenance mode

```ts
// BASE_URL is an assumption: point it at the running test server.
const BASE_URL = 'http://localhost:3000';

test('maintenance read-only blocks writes', async () => {
  await flags.set('maintenance', 'readonly');
  try {
    const r = await fetch(`${BASE_URL}/api/orders`, { method: 'POST', body: '...' });
    expect(r.status).toBe(503);

    const get = await fetch(`${BASE_URL}/api/orders`);
    expect(get.status).toBe(200);
  } finally {
    // Always restore the flag, even if an assertion fails.
    await flags.set('maintenance', 'off');
  }
});
```

## 🤔 Decision Criteria

| Task | Mode |
|---|---|
| Schema migration (safe) | None: use zero-downtime tools |
| Schema migration (risky) | Read-only, 5-30 min |
| Major refactor | Banner + monitor |
| Emergency bug | Kill switch (specific feature) |
| Pricing change | Banner only |
| DB hardware change | Full maintenance window |

## ❌ Anti-patterns

- **Unannounced maintenance (no prior notice)**: users complain.
- **`HTTP 200 + maintenance message`**: clients cannot tell they should retry. Use 503 + Retry-After.
- **Blocking admins / staff too**: debugging becomes impossible.
- **No kill switch**: a serious bug has to wait for a deploy.
- **Banner only, no actual blocking**: users keep trying and requests break.
- **DB read-only with some write paths missed**: partial breakage.
- **No rollback plan**: forward-only, so a failure turns into a bigger incident.

## 🤖 LLM Usage Hints

- Escalate gradually (banner → read-only → block).
- Kill switch per feature.
- Status page + user communication.
- Runbook + rollback prepared in advance.

## 🔗 Related Documents

- [[Backend_Feature_Flags_Deep]]
- [[Backend_Graceful_Shutdown]]
- [[DB_Migration_Safety]]