---
id: backend-maintenance-mode
title: Maintenance Mode - Gradual / Read-only / Banner
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [backend, maintenance, vibe-coding]
tech_stack: { language: "TS", applicable_to: ["Backend"] }
applied_in: []
aliases: [maintenance mode, read-only mode, downtime, planned outage, kill switch]
---
# Maintenance Mode
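> Migrations and other large changes call for temporarily blocking traffic. Escalate gradually: **banner → read-only → full block**. This is better than going completely down. Integrate it with kill switches and feature flags.
## 📖 Core Concepts
- Banner only: a notice such as "Maintenance scheduled at X"; all requests still work.
- Read-only: GET is allowed; POST/PUT/DELETE are blocked.
- Restricted: only admins are allowed.
- Full block: 503 for all traffic.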
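The four levels above can be read as a single decision function. The following is a minimal sketch, assuming the mode names used by the flag-based middleware later in this note; `decide`, `Decision`, and `SAFE_METHODS` are hypothetical names chosen for illustration, not part of any library.

```ts
// Hypothetical helper: maps a maintenance mode to an allow/block decision.
type Mode = 'off' | 'banner' | 'readonly' | 'admin-only' | 'full';

interface Decision {
  allow: boolean;      // whether the request may proceed
  showBanner: boolean; // whether the client should display a notice
}

// Methods that do not modify state and stay available in read-only mode.
const SAFE_METHODS = new Set(['GET', 'HEAD']);

function decide(mode: Mode, method: string, isAdmin: boolean): Decision {
  switch (mode) {
    case 'off':
      return { allow: true, showBanner: false };
    case 'banner':
      return { allow: true, showBanner: true };
    case 'readonly':
      return { allow: SAFE_METHODS.has(method.toUpperCase()), showBanner: true };
    case 'admin-only':
      return { allow: isAdmin, showBanner: true };
    case 'full':
      // Blocks everyone at the app level; any admin bypass would have to
      // happen earlier, for example through a proxy or CDN IP allowlist.
      return { allow: false, showBanner: true };
  }
}
```

Keeping the decision pure (no request or response objects) makes each level easy to unit test before it is wired into middleware.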
## 💻 Code Patterns
### Feature flag based
```ts
// Flag values: 'off' | 'banner' | 'readonly' | 'admin-only' | 'full'
app.use(async (req, res, next) => {
  // Read the flag on every request so a toggle takes effect without a restart.
  const mode = await flags.get('maintenance');
  switch (mode) {
    case 'off':
      return next();
    case 'banner':
      res.setHeader('X-Maintenance-Banner', 'Scheduled at 2026-05-10 02:00 UTC');
      return next();
    case 'readonly':
      // Safe methods pass; all writes are rejected.
      if (req.method !== 'GET' && req.method !== 'HEAD') {
        return res.status(503).set('Retry-After', '1800').json({
          type: '...',
          title: 'Read-only mode',
          status: 503,
          detail: 'Writes are temporarily disabled',
          retryAfter: 1800,
        });
      }
      return next();
    case 'admin-only':
      // Assumes an earlier auth middleware has populated req.user.
      if (!req.user?.isAdmin) {
        return res.status(503).set('Retry-After', '1800').json({
          type: '...', title: 'Maintenance', status: 503,
        });
      }
      return next();
    case 'full':
      return res.status(503).set('Retry-After', '1800').json({
        type: '...', title: 'Maintenance', status: 503,
      });
    default:
      // Unknown flag values fall through to normal service.
      return next();
  }
});
```
### Reverse proxy block (nginx)
```nginx
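# While the maintenance file exists, return 503 to everyone except allowlisted admin IPs.
# nginx does not allow nested "if" blocks, so the decision is built up in a variable.
server {
    set $maintenance 0;
    if (-f /var/www/maintenance.html) {
        set $maintenance 1;
    }
    # Admin IP allowlist: these clients bypass maintenance
    if ($remote_addr ~ ^(10\.0\.0\.1|10\.0\.0\.2)$) {
        set $maintenance 0;
    }
    if ($maintenance) {
        return 503;
    }
    error_page 503 @maintenance;
    # Named location: serves the page without re-running the server-level checks
    location @maintenance {
        root /var/www;
        rewrite ^ /maintenance.html break;
    }
    location / {
        proxy_pass http://app;
    }
}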
```
```bash
# Toggle
touch /var/www/maintenance.html # ON
rm /var/www/maintenance.html # OFF
```
### CDN level (Cloudflare Worker)
```ts
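interface Env {
  KV: KVNamespace; // KV binding that stores the current maintenance mode
}

// Example allowlist (documentation address); replace with real admin IPs.
const ADMIN_IPS = new Set(['203.0.113.10']);

// Cloudflare sets CF-Connecting-IP to the client's address.
function isAdminIp(req: Request): boolean {
  return ADMIN_IPS.has(req.headers.get('CF-Connecting-IP') ?? '');
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const mode = await env.KV.get('maintenance');
    if (mode === 'full' && !isAdminIp(req)) {
      return new Response('Maintenance', {
        status: 503,
        headers: { 'Retry-After': '1800', 'Content-Type': 'text/html' },
      });
    }
    return fetch(req); // pass through to the origin
  },
};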
```
### Banner UI
```tsx
function App() {
  const { data: status } = useQuery(['maintenance'], fetchStatus);
  return (
    <>
      {status?.maintenance?.scheduled && (
        <div className="bg-yellow-100 border-b border-yellow-300 px-4 py-2 text-sm">
          Scheduled maintenance: {format(status.maintenance.start)} - {format(status.maintenance.end)}
        </div>
      )}
      {status?.maintenance?.readonly && (
        <div className="bg-orange-100 border-b border-orange-400 px-4 py-2 text-sm">
          🔒 Read-only mode active. Writes are temporarily disabled.
        </div>
      )}
      <Routes>...</Routes>
    </>
  );
}
```
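The banner component reads `status.maintenance.scheduled`, `readonly`, `start`, and `end`, but the note does not show where that payload comes from. One possible server-side shape is sketched below; `buildStatus`, `MaintenanceWindow`, and `StatusPayload` are hypothetical names, and the rule that a window counts as "scheduled" until its end time is an assumption.

```ts
// Hypothetical payload builder for the status endpoint the banner polls.
type MaintenanceMode = 'off' | 'banner' | 'readonly' | 'admin-only' | 'full';

interface MaintenanceWindow {
  start: string; // ISO 8601 timestamp
  end: string;   // ISO 8601 timestamp
}

interface StatusPayload {
  maintenance: {
    scheduled: boolean;
    readonly: boolean;
    start?: string;
    end?: string;
  };
}

function buildStatus(
  mode: MaintenanceMode,
  scheduledWindow: MaintenanceWindow | null,
  nowMs: number,
): StatusPayload {
  // Assumption: a window is "scheduled" as long as it has not ended yet.
  const scheduled =
    scheduledWindow !== null && Date.parse(scheduledWindow.end) > nowMs;
  return {
    maintenance: {
      scheduled,
      readonly: mode === 'readonly',
      // Only expose the times while the window is still relevant.
      ...(scheduled && scheduledWindow
        ? { start: scheduledWindow.start, end: scheduledWindow.end }
        : {}),
    },
  };
}
```

Deriving both banners from the same flag and window keeps the UI from announcing a state the backend is not actually enforcing.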
### DB migration with read-only
```bash
# 1. Read-only mode ON (block writes)
# 2. Wait for in-flight writes to complete
# 3. Run the migration (large backfill, partition rebuild)
# 4. Verify
# 5. Read-only mode OFF
```
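Step 2 is easy to get wrong, because flipping the flag does not stop writes that have already started. One way to know when it is safe to migrate is to count in-flight writes inside the app. The sketch below is a minimal illustration; `WriteTracker` is a hypothetical helper, and it only sees writes handled by the process it runs in.

```ts
// Hypothetical helper: counts in-flight writes and reports when they reach zero.
class WriteTracker {
  private inFlight = 0;
  private waiters: Array<() => void> = [];

  // Call when a write request starts.
  begin(): void {
    this.inFlight += 1;
  }

  // Call when a write request finishes, whether it succeeded or failed.
  end(): void {
    if (this.inFlight === 0) return; // ignore unbalanced calls
    this.inFlight -= 1;
    if (this.inFlight === 0) {
      const ready = this.waiters;
      this.waiters = [];
      for (const notify of ready) notify();
    }
  }

  // Runs the callback once no writes are in flight (immediately if already idle).
  onDrained(callback: () => void): void {
    if (this.inFlight === 0) callback();
    else this.waiters.push(callback);
  }

  // Number of writes currently in progress.
  get pending(): number {
    return this.inFlight;
  }
}
```

With several app instances, each one would need to drain before the migration starts, so a fixed wait (as in the runbook later in this note) is often used as a simpler fallback.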
```sql
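-- PG read-only role (it has no privileges until SELECT is granted to it separately)
CREATE ROLE readonly;
-- Make app_user's new sessions default to read-only transactions.
-- This applies to new connections only, and a session can still override it with SET,
-- so treat it as a guard rail rather than a hard guarantee.
ALTER USER app_user SET default_transaction_read_only = on;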
```
### Kill switch (emergency)
```ts
// Controlled from an external KV store or config
async function checkKillSwitch(feature: string): Promise<boolean> {
  return (await redis.get(`kill:${feature}`)) === '1';
}
app.post('/api/payments', async (req, res) => {
  if (await checkKillSwitch('payments')) {
    return res.status(503).json({
      title: 'Payments temporarily unavailable',
      detail: 'We are working to restore service. Try again in a few minutes.',
    });
  }
  // ...
});
```
→ Turn the feature off the moment a bug is found, without waiting for a deploy.
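Calling Redis on every request puts the kill switch itself on the critical path. A common variation keeps the switches in memory, refreshes them in the background, and decides explicitly what to do when the data is stale. The sketch below uses hypothetical names (`KillSwitchCache`, `maxAgeMs`, `failClosed`); the caller is assumed to supply the refresh loop and the clock.

```ts
// Hypothetical in-memory view of kill switches, refreshed by a background poller.
class KillSwitchCache {
  private killed = new Set<string>();
  private lastRefreshMs = Number.NEGATIVE_INFINITY;
  private readonly maxAgeMs: number;
  private readonly failClosed: ReadonlySet<string>;

  // maxAgeMs: how old a snapshot may be before it counts as stale.
  // failClosed: features treated as killed whenever the snapshot is stale.
  constructor(maxAgeMs: number, failClosed: ReadonlySet<string> = new Set()) {
    this.maxAgeMs = maxAgeMs;
    this.failClosed = failClosed;
  }

  // Called by the poller with the features currently switched off.
  refresh(killedFeatures: Iterable<string>, nowMs: number): void {
    this.killed = new Set(killedFeatures);
    this.lastRefreshMs = nowMs;
  }

  // Synchronous lookup: no network call on the request path.
  isKilled(feature: string, nowMs: number): boolean {
    const stale = nowMs - this.lastRefreshMs > this.maxAgeMs;
    if (stale && this.failClosed.has(feature)) return true;
    // Otherwise use the last known state, even if it is stale.
    return this.killed.has(feature);
  }
}
```

Which features belong in `failClosed` is a product decision: failing closed protects risky paths such as payments, at the cost of an outage if the flag store itself goes down.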
### Status page
```
status.acme.com: shown to users.
- Statuspage.io / Better Stack / self-hosted.
- Announce in advance: "Scheduled maintenance: 2026-05-10 02:00 UTC".
```
### Communication (users)
```
1. Email (24h+ ahead): large maintenance.
2. Banner (web): from 1h before, and during.
3. API response (header): on every response.
4. Status page: always.
5. Twitter / social media: during incidents.
```
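The lead times in the list above can be derived from the maintenance window instead of being tracked by hand. The sketch below assumes the 24h and 1h offsets from this note; `planNotices` and the channel names are hypothetical, and the status page entry uses the "24h before" timing from the runbook later in this note.

```ts
type Channel = 'email' | 'banner' | 'status-page';

interface Notice {
  channel: Channel;
  sendAt: Date;
}

const HOUR_MS = 60 * 60 * 1000;

// Returns when each advance notice should go out, earliest first.
function planNotices(windowStart: Date, isMajor: boolean): Notice[] {
  const start = windowStart.getTime();
  const notices: Notice[] = [
    { channel: 'status-page', sendAt: new Date(start - 24 * HOUR_MS) },
    { channel: 'banner', sendAt: new Date(start - 1 * HOUR_MS) },
  ];
  if (isMajor) {
    // Email is reserved for large maintenance, at least 24h ahead.
    notices.push({ channel: 'email', sendAt: new Date(start - 24 * HOUR_MS) });
  }
  return notices.sort((a, b) => a.sendAt.getTime() - b.sendAt.getTime());
}
```

For example, `planNotices(new Date('2026-05-10T02:00:00Z'), true)` schedules the email and status page notices for 2026-05-09 02:00 UTC and the banner for 2026-05-10 01:00 UTC.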
### Per-API Retry-After
```ts
res.status(503).set({
  'Retry-After': '300',
  'X-Maintenance-Mode': 'true',
}).json({
  type: 'https://api.acme.com/errors/maintenance',
  title: 'Maintenance',
  detail: 'API temporarily unavailable, retry in 5 minutes',
  retryAfter: 300,
});
```
→ Clients can retry automatically.
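On the client side, automatic retry depends on reading the header correctly: `Retry-After` may carry either a number of seconds or an HTTP-date. The sketch below shows one way to turn it into a delay; `retryDelayMs` and the 5 second fallback are assumptions made for illustration.

```ts
// Converts a Retry-After header value into a delay in milliseconds.
// The header is either a non-negative integer of seconds or an HTTP-date.
function retryDelayMs(
  headerValue: string | null,
  nowMs: number,
  fallbackMs = 5000, // assumed default when the header is missing or unreadable
): number {
  if (headerValue === null) return fallbackMs;
  const trimmed = headerValue.trim();
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return fallbackMs;
  // A date in the past means the client may retry right away.
  return Math.max(0, dateMs - nowMs);
}
```

A caller would typically pass `response.headers.get('Retry-After')` and `Date.now()`, then wait that long before reissuing the request, ideally with a cap on the number of attempts.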
### Soft launch (visible to admins only)
```ts
// The new feature is deployed to prod, but only admins and beta testers can use it
if (newFeature.enabled) {
  if (!req.user?.isAdmin && !req.user?.betaTester) {
    return res.status(404).end(); // to regular users it looks as if it does not exist
  }
}
```
→ Stealth deploy + soft test.
### Database maintenance
```sql
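-- Keep locks short during large migrations.
-- Prefer zero-downtime tools such as pg_repack or gh-ost,
-- or fall back to read-only mode.
-- Note: this changes the default for new sessions only; existing connections
-- (for example pooled ones) keep their setting until they reconnect.
ALTER DATABASE app SET default_transaction_read_only = on;
-- Run the migration (in a session that overrides the setting, since it needs to write)
ALTER DATABASE app SET default_transaction_read_only = off;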
```
### Rolling restart
```yaml
# K8s
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
```
→ Pods are terminated and replaced one at a time, so the service is not interrupted.
### Runbook (written in advance)
```markdown
# Maintenance Runbook — DB Schema Migration v2
## Pre-checks
- [ ] Backup latest snapshot taken
- [ ] Migration tested on staging
- [ ] Rollback script ready
- [ ] Status page updated
- [ ] On-call notified
## Steps (estimated 30min)
1. Enable read-only mode at 02:00 UTC
2. Wait for write queue drain (5 min)
3. Run migration: `pnpm migrate:up`
4. Verify schema: `pnpm verify`
5. Disable read-only mode
6. Monitor errors for 30 min
## Rollback
1. Enable read-only mode
2. Run rollback: `pnpm migrate:down`
3. Disable read-only mode
4. Investigate
## Communication
- Status page: "Scheduled maintenance" 24h before
- Email: 24h before
- During: hourly status updates
```
### Test maintenance mode
```ts
test('maintenance read-only blocks writes', async () => {
  await flags.set('maintenance', 'readonly');
  try {
    const post = await fetch('/api/orders', { method: 'POST', body: '...' });
    expect(post.status).toBe(503);
    const get = await fetch('/api/orders');
    expect(get.status).toBe(200);
  } finally {
    // Always restore the flag so a failed assertion does not leak into other tests.
    await flags.set('maintenance', 'off');
  }
});
```
## 🤔 Decision Criteria
| Task | Mode |
|---|---|
| Schema migration (safe) | None (use zero-downtime tools) |
| Schema migration (risky) | Read-only 5-30min |
| Major refactor | Banner + monitor |
| Emergency bug | Kill switch (specific feature) |
| Pricing change | Banner only |
| DB hardware change | Full maintenance window |
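## ❌ Anti-patterns
- **Unannounced maintenance (no advance notice)**: users get frustrated.
- **`HTTP 200 + maintenance message`**: clients cannot tell that they should retry. Use 503 + Retry-After.
- **Blocking admins / staff too**: debugging becomes impossible.
- **No kill switch**: a serious bug has to wait for a deploy.
- **Banner only, no actual blocking**: users keep trying and their requests break.
- **DB read-only but some write paths missed**: partial breakage.
- **No rollback plan**: forward-only, so a failure turns into a bigger incident.
## 🤖 LLM Usage Hints
- Escalate gradually (banner → read-only → block).
- Kill switch per feature.
- Status page + user communication.
- Prepare the runbook + rollback in advance.
## 🔗 Related Documents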
- [[Backend_Feature_Flags_Deep]]
- [[Backend_Graceful_Shutdown]]
- [[DB_Migration_Safety]]