---
id: backend-maintenance-mode
title: Maintenance Mode - Gradual / Read-only / Banner
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [backend, maintenance, vibe-coding]
tech_stack: { language: "TS", applicable_to: ["Backend"] }
applied_in: []
aliases: [maintenance mode, read-only mode, downtime, planned outage, kill switch]
---

# Maintenance Mode

> Migrations and large changes sometimes require temporarily blocking traffic. Escalate gradually: **Banner → read-only → full block**. This is better than going fully down. Integrate it with kill switches and feature flags.

## 📖 Core Concepts

- Banner only: shows a notice such as "Maintenance scheduled at X"; nothing is blocked.
- Read-only: GET (and HEAD) requests succeed; POST/PUT/PATCH/DELETE are blocked.
- Restricted: only admins get through.
- Full block: all traffic receives 503.

## 💻 Code Patterns

### Feature-flag based

```ts
// `app` (Express) and `flags` (feature-flag client) are assumed to exist.
// Flag values: 'off' | 'banner' | 'readonly' | 'admin-only' | 'full'

app.use(async (req, res, next) => {
  // Read the flag per request so a toggle takes effect without a restart.
  const mode = await flags.get('maintenance');

  switch (mode) {
    case 'off':
      return next();
    case 'banner':
      res.setHeader('X-Maintenance-Banner', 'Scheduled at 2026-05-10 02:00 UTC');
      return next();
    case 'readonly':
      if (req.method !== 'GET' && req.method !== 'HEAD') {
        // Send Retry-After as a header too, so generic clients can see it.
        return res.status(503).set('Retry-After', '1800').json({
          type: '...',
          title: 'Read-only mode',
          status: 503,
          detail: 'Writes are temporarily disabled',
          retryAfter: 1800,
        });
      }
      return next();
    case 'admin-only':
      // req.user is assumed to be set by an earlier auth middleware.
      if (!req.user?.isAdmin) {
        return res.status(503).json({
          type: '...', title: 'Maintenance', status: 503,
        });
      }
      return next();
    case 'full':
      return res.status(503).set('Retry-After', '1800').json({
        type: '...', title: 'Maintenance', status: 503,
      });
    default:
      // Unknown flag value: fail open instead of leaving the request hanging.
      return next();
  }
});
```

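Reading the flag store on every request adds a round trip, while reading it only once at startup means a toggle is never noticed. A short TTL cache is one possible middle ground. The sketch below is an illustration only: `withTtlCache`, `FlagReader`, and the injectable `now` clock are hypothetical names, not part of any flag library.

```typescript
// Hypothetical reader signature; a real app would call its flag service here.
type FlagReader = (key: string) => Promise<string | null>;

interface CachedFlags {
  get(key: string): Promise<string | null>;
}

// Wraps a flag reader with a short-lived in-memory cache.
// `now` is injectable so expiry can be tested without real timers.
function withTtlCache(
  read: FlagReader,
  ttlMs: number,
  now: () => number = Date.now,
): CachedFlags {
  const cache = new Map<string, { value: string | null; expiresAt: number }>();
  return {
    async get(key: string): Promise<string | null> {
      const hit = cache.get(key);
      // Serve the cached value while it is still fresh.
      if (hit && hit.expiresAt > now()) return hit.value;
      // Otherwise re-read the store and remember the result.
      const value = await read(key);
      cache.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    },
  };
}
```

The TTL bounds how long a toggle can take to propagate, so for an emergency switch a value of a few seconds is probably more appropriate than minutes.
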
### Reverse proxy block (nginx)

```nginx
# If the maintenance file exists, everyone except admin IPs gets 503.
# nginx does not allow nested `if`, so the allowlist lives in a geo block
# (http context) and the decision is made with a variable.
geo $is_admin {
    default  0;
    10.0.0.1 1;
    10.0.0.2 1;
}

server {
    error_page 503 /maintenance.html;

    location = /maintenance.html {
        root /var/www;
        internal;
    }

    location / {
        set $maintenance 0;
        if (-f /var/www/maintenance.html) {
            set $maintenance 1;
        }
        # Admin IP allowlist
        if ($is_admin) {
            set $maintenance 0;
        }
        if ($maintenance) {
            return 503;
        }
        proxy_pass http://app;
    }
}
```

```bash
# Toggle
# Note: touch creates an empty page; copy a prepared HTML file to show real content.
touch /var/www/maintenance.html   # ON
rm /var/www/maintenance.html      # OFF
```

### CDN level (Cloudflare Worker)

```ts
// KVNamespace comes from @cloudflare/workers-types.
interface Env {
  KV: KVNamespace;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const mode = await env.KV.get('maintenance');

    // isAdminIp is a project helper (not shown) that checks the caller's IP.
    if (mode === 'full' && !isAdminIp(req)) {
      return new Response('Maintenance', {
        status: 503,
        headers: { 'Retry-After': '1800', 'Content-Type': 'text/html' },
      });
    }

    // Otherwise pass the request through to the origin.
    return fetch(req);
  },
};
```

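The Worker above calls `isAdminIp` without defining it. One possible shape, assuming the allowlist is a fixed set of addresses and the client IP is taken from the `CF-Connecting-IP` request header that Cloudflare adds, is sketched below. `ADMIN_IPS` and the structural `RequestLike` type are hypothetical; the structural type only keeps the sketch independent of the Workers typings.

```typescript
// Minimal structural types so the sketch does not depend on Workers typings.
interface HeaderBag {
  get(name: string): string | null;
}
interface RequestLike {
  headers: HeaderBag;
}

// Hypothetical allowlist; replace with your own admin or office IPs.
const ADMIN_IPS: ReadonlySet<string> = new Set(['10.0.0.1', '10.0.0.2']);

// Returns true only when the connecting IP is present and allowlisted.
function isAdminIp(
  req: RequestLike,
  allowlist: ReadonlySet<string> = ADMIN_IPS,
): boolean {
  const ip = req.headers.get('CF-Connecting-IP');
  return ip !== null && allowlist.has(ip.trim());
}
```

A missing header is treated as "not admin", so the check fails closed if the request did not arrive through Cloudflare.
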
### Banner UI

```tsx
import { useQuery } from '@tanstack/react-query';

// fetchStatus and format are app-specific helpers (not shown).
function App() {
  // Object form works in TanStack Query v4 and v5.
  const { data: status } = useQuery({ queryKey: ['maintenance'], queryFn: fetchStatus });

  return (
    <>
      {status?.maintenance?.scheduled && (
        <div className="bg-yellow-100 border-b border-yellow-300 px-4 py-2 text-sm">
          ⚠️ Scheduled maintenance: {format(status.maintenance.start)} - {format(status.maintenance.end)}
        </div>
      )}
      {status?.maintenance?.readonly && (
        <div className="bg-orange-100 border-b border-orange-400 px-4 py-2 text-sm">
          🔒 Read-only mode active. Writes are temporarily disabled.
        </div>
      )}
      <Routes>...</Routes>
    </>
  );
}
```

### DB migration with read-only

```bash
# 1. Turn read-only mode ON (block writes)
# 2. Wait for in-flight writes to complete
# 3. Run the migration (large backfill, partition rebuild)
# 4. Verify
# 5. Turn read-only mode OFF
```

```sql
-- PG read-only role
CREATE ROLE readonly;
-- Make new sessions for app_user default to read-only transactions.
-- Existing sessions are unaffected, and a session can still override the
-- default with SET, so treat this as a guard rail, not a security boundary.
ALTER USER app_user SET default_transaction_read_only = on;
```

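The five steps in this section can be sketched as one orchestration function. This is a minimal illustration, not a prescribed implementation: `MigrationDeps` and every function on it are hypothetical hooks that a real project would wire to its flag client, queue, and migration tool. The rollback branch is an added assumption about what to do when the migration or verification fails.

```typescript
type Mode = 'off' | 'readonly';

// Hypothetical hooks; wire these to your own flag client and migration tool.
interface MigrationDeps {
  setMode(mode: Mode): Promise<void>;
  drainWrites(): Promise<void>;
  migrate(): Promise<void>;
  verify(): Promise<boolean>;
  rollback(): Promise<void>;
}

async function runMigrationWindow(
  deps: MigrationDeps,
): Promise<'migrated' | 'rolled-back'> {
  await deps.setMode('readonly'); // 1. block writes
  await deps.drainWrites();       // 2. wait for in-flight writes

  let ok = false;
  try {
    await deps.migrate();         // 3. run the migration
    ok = await deps.verify();     // 4. verify
  } catch {
    ok = false;                   // a thrown error counts as a failed migration
  }

  if (!ok) {
    // If rollback itself throws, the error propagates and the system
    // stays read-only until a human intervenes.
    await deps.rollback();
  }

  await deps.setMode('off');      // 5. re-enable writes
  return ok ? 'migrated' : 'rolled-back';
}
```

Writes are re-enabled only after a known-good outcome (successful migration or successful rollback); any other failure leaves read-only mode on.
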
### Kill switch (emergency)

```ts
// Controlled from an external KV store or config (`redis` is an assumed client)
async function checkKillSwitch(feature: string): Promise<boolean> {
  return (await redis.get(`kill:${feature}`)) === '1';
}

app.post('/api/payments', async (req, res) => {
  if (await checkKillSwitch('payments')) {
    return res.status(503).json({
      title: 'Payments temporarily unavailable',
      detail: 'We are working to restore service. Try again in a few minutes.',
    });
  }
  // ...
});
```

→ Turn a feature off the moment a bug is found, without waiting for a deploy.

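A kill switch that depends on an external store has its own failure mode: what happens when the store is unreachable. The sketch below shows one possible policy, reusing the last known state and otherwise falling back to a configurable default. `makeKillSwitch`, `KillSwitchReader`, and the `failClosed` parameter are hypothetical names introduced for illustration.

```typescript
// Hypothetical reader; in practice this would wrap a Redis or KV lookup.
type KillSwitchReader = (key: string) => Promise<string | null>;

// Returns a checker that remembers the last value it read per feature.
function makeKillSwitch(read: KillSwitchReader, failClosed = false) {
  const lastKnown = new Map<string, boolean>();

  return async function isKilled(feature: string): Promise<boolean> {
    try {
      const killed = (await read(`kill:${feature}`)) === '1';
      lastKnown.set(feature, killed);
      return killed;
    } catch {
      // Store unreachable: reuse the last known state if there is one,
      // otherwise use the configured default.
      return lastKnown.get(feature) ?? failClosed;
    }
  };
}
```

Whether to fail open or closed depends on the feature: for payments, disabling on uncertainty may be safer, while for a read path it is probably not.
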
### Status page

```
status.acme.com: shown to users.
- Statuspage.io / Better Stack / self-hosted.
- Announce in advance: "Scheduled maintenance: 2026-05-10 02:00 UTC".
```

### Communication (users)

```
1. Email (24h+ before): for major maintenance.
2. Banner (web): from 1h before, and during.
3. API response (header): on every response.
4. Status page: always.
5. Twitter / social media: during incidents.
```

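The lead times in the list above (email 24h before, banner from 1h before and during) can be derived from the maintenance window itself. A minimal sketch, assuming the window is given as two `Date` values; `planNotifications` and `NotificationPlan` are hypothetical names.

```typescript
interface NotificationPlan {
  emailAt: Date;     // latest send time: 24h before the window starts
  bannerFrom: Date;  // banner appears 1h before the window starts
  bannerUntil: Date; // banner stays up until the window ends
}

const HOUR_MS = 60 * 60 * 1000;

function planNotifications(start: Date, end: Date): NotificationPlan {
  if (end.getTime() <= start.getTime()) {
    throw new RangeError('end must be after start');
  }
  return {
    emailAt: new Date(start.getTime() - 24 * HOUR_MS),
    bannerFrom: new Date(start.getTime() - HOUR_MS),
    bannerUntil: end,
  };
}
```
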
### Per-API Retry-After

```ts
res.status(503).set({
  'Retry-After': '300',
  'X-Maintenance-Mode': 'true',
}).json({
  type: 'https://api.acme.com/errors/maintenance',
  title: 'Maintenance',
  detail: 'API temporarily unavailable, retry in 5 minutes',
  retryAfter: 300,
});
```

→ Clients that honor `Retry-After` can back off and retry automatically.

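Automatic retry only happens if the client implements it. The sketch below shows what that might look like on the client side, assuming a 503 means "retry later". `Retry-After` can carry either a number of seconds or an HTTP-date, so both are handled. `parseRetryAfter`, `fetchWithRetry`, and the 1000 ms fallback delay are hypothetical choices made for illustration.

```typescript
interface RetryableResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Converts a Retry-After value (delay in seconds or HTTP-date) to milliseconds.
function parseRetryAfter(value: string | null, now: Date = new Date()): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const date = Date.parse(trimmed);
  if (Number.isNaN(date)) return null;
  // A date in the past means "retry now".
  return Math.max(0, date - now.getTime());
}

// Retries while the server answers 503, waiting as instructed between attempts.
// `sleep` is injectable so the loop can be tested without real delays.
async function fetchWithRetry<R extends RetryableResponse>(
  doFetch: () => Promise<R>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> =
    (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<R> {
  for (let attempt = 1; ; attempt++) {
    const res = await doFetch();
    if (res.status !== 503 || attempt >= maxAttempts) return res;
    // Fall back to 1s when the header is missing or unparseable.
    await sleep(parseRetryAfter(res.headers.get('Retry-After')) ?? 1000);
  }
}
```

Usage would look like `await fetchWithRetry(() => fetch(url))`. Only idempotent requests should be retried this way unless the API guarantees that a 503 means the write was not applied.
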
### Soft launch (visible to admins only)

```ts
// The new feature is deployed to prod, but only admins and beta testers can use it
if (newFeature.enabled) {
  if (!req.user?.isAdmin && !req.user?.betaTester) {
    return res.status(404).end(); // to everyone else it looks like it does not exist
  }
}
```

→ Stealth deploy + soft test.

### Database maintenance

```sql
-- Keep locks short during large migrations
-- Use zero-downtime tools such as pg_repack or gh-ost

-- Or switch the database to read-only
-- (applies to new sessions only; recycle pooled connections so they pick it up)
ALTER DATABASE app SET default_transaction_read_only = on;
-- Run the migration; its own session must SET default_transaction_read_only = off
ALTER DATABASE app SET default_transaction_read_only = off;
```

### Rolling restart

```yaml
# K8s
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
```

→ Pods are replaced one at a time, with each new pod started before an old one is terminated, so service is not interrupted.

### Runbook (written in advance)

```markdown
# Maintenance Runbook: DB Schema Migration v2

## Pre-checks
- [ ] Backup latest snapshot taken
- [ ] Migration tested on staging
- [ ] Rollback script ready
- [ ] Status page updated
- [ ] On-call notified

## Steps (estimated 30min)
1. Enable read-only mode at 02:00 UTC
2. Wait for write queue drain (5 min)
3. Run migration: `pnpm migrate:up`
4. Verify schema: `pnpm verify`
5. Disable read-only mode
6. Monitor errors for 30 min

## Rollback
1. Enable read-only mode
2. Run rollback: `pnpm migrate:down`
3. Disable read-only mode
4. Investigate

## Communication
- Status page: "Scheduled maintenance" 24h before
- Email: 24h before
- During: hourly status updates
```

### Test maintenance mode

```ts
// BASE_URL is an assumed env var pointing at the server under test;
// Node's fetch needs an absolute URL.
const BASE = process.env.BASE_URL ?? 'http://localhost:3000';

test('maintenance read-only blocks writes', async () => {
  await flags.set('maintenance', 'readonly');
  try {
    const r = await fetch(`${BASE}/api/orders`, { method: 'POST', body: '...' });
    expect(r.status).toBe(503);

    const get = await fetch(`${BASE}/api/orders`);
    expect(get.status).toBe(200);
  } finally {
    // Always reset the flag so a failed assertion does not leak into other tests.
    await flags.set('maintenance', 'off');
  }
});
```

## 🤔 Decision Criteria

| Task | Mode |
|---|---|
| Schema migration (safe) | None; use zero-downtime tools |
| Schema migration (risky) | Read-only for 5-30 min |
| Major refactor | Banner + monitor |
| Emergency bug | Kill switch (specific feature) |
| Pricing change | Banner only |
| DB hardware change | Full maintenance window |

## ❌ Anti-patterns

- **Unannounced maintenance (no advance notice)**: users get frustrated.
- **`HTTP 200 + maintenance message`**: clients cannot detect the outage and will not retry. Use 503 + Retry-After.
- **Blocking admins / staff too**: debugging becomes impossible.
- **No kill switch**: a serious bug has to wait for a deploy.
- **Banner only, with no actual blocking**: users try anyway and things break.
- **DB read-only with some write paths missed**: partial breakage.
- **No rollback plan**: forward-only; a failure turns into a bigger incident.

## 🤖 LLM Usage Hints

- Escalate gradually (banner → read-only → block).
- Kill switch per feature.
- Status page + user communication.
- Prepare the runbook and rollback in advance.

## 🔗 Related Documents

- [[Backend_Feature_Flags_Deep]]
- [[Backend_Graceful_Shutdown]]
- [[DB_Migration_Safety]]