---
id: wiki-2026-0508-backups
title: Backups
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Backup Strategy, Disaster Recovery, 백업]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [backup, dr, ops, sre]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: applied
tech_stack:
  language: Bash/Python
  framework: restic/borg/AWS Backup
---
# Backups
## In One Line
> **"A backup only counts once its restore has been verified."** Backups rest on three things: the 3-2-1 rule (3 copies, 2 media, 1 offsite), RTO/RPO targets, and regular restore drills. The 2026 standard: incremental dedup (restic/borg) + immutable object lock (S3 Object Lock, Azure Immutable Blob) + ransomware-resistant air gap.
## Core Concepts
### 3-2-1-1-0 Rule (modern)
- **3** copies of data.
- **2** different media types.
- **1** offsite copy.
- **1** immutable / air-gapped copy (anti-ransomware, added to the rule from 2020 onward).
- **0** errors after restore verification.
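
The rule above can be checked mechanically against an inventory of copies. The following is a minimal sketch, assuming a hypothetical `BackupCopy` record; the field names and the `check_32110` helper are illustrative and not part of any backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (production copy included). Fields are illustrative."""
    media: str          # e.g. "disk", "object-store", "tape"
    offsite: bool
    immutable: bool
    verify_errors: int  # errors found by the last restore verification


def check_32110(copies: list[BackupCopy]) -> dict[str, bool]:
    """Return a pass/fail flag for each part of the 3-2-1-1-0 rule."""
    return {
        "3_copies": len(copies) >= 3,
        "2_media": len({c.media for c in copies}) >= 2,
        "1_offsite": any(c.offsite for c in copies),
        "1_immutable": any(c.immutable for c in copies),
        "0_errors": bool(copies) and all(c.verify_errors == 0 for c in copies),
    }


copies = [
    BackupCopy("disk", offsite=False, immutable=False, verify_errors=0),
    BackupCopy("disk", offsite=False, immutable=False, verify_errors=0),
    BackupCopy("object-store", offsite=True, immutable=True, verify_errors=0),
]
report = check_32110(copies)
```

Returning one flag per rule component, rather than a single boolean, makes it easier to see which part of the rule a setup is missing.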
### RTO vs RPO
- **RTO (Recovery Time Objective)**: the maximum acceptable time from outage to service recovery.
- **RPO (Recovery Point Objective)**: the maximum acceptable window of data loss.
- With RTO = 1 h and RPO = 15 min, a hot standby is needed.
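
To make the targets concrete, here is a rough sketch that compares a backup schedule against RTO/RPO. It assumes a deliberately simple model: each backup captures the state at its start time, and the worst case is a failure just before the next backup completes. The function names and the sample durations are illustrative assumptions.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta, backup_duration: timedelta) -> timedelta:
    """Failure just before the next backup finishes loses one interval plus one run."""
    return backup_interval + backup_duration


def meets_targets(
    rpo: timedelta,
    rto: timedelta,
    backup_interval: timedelta,
    backup_duration: timedelta,
    restore_duration: timedelta,
) -> dict[str, bool]:
    """Compare a schedule and a measured restore time against RPO and RTO."""
    return {
        "rpo_ok": worst_case_data_loss(backup_interval, backup_duration) <= rpo,
        "rto_ok": restore_duration <= rto,
    }


rpo, rto = timedelta(minutes=15), timedelta(hours=1)

# Nightly backup that takes 1 h to run and 3 h to restore
nightly = meets_targets(rpo, rto, timedelta(hours=24), timedelta(hours=1), timedelta(hours=3))

# Log shipping every 5 min to a standby that can be promoted in 10 min
standby = meets_targets(rpo, rto, timedelta(minutes=5), timedelta(minutes=1), timedelta(minutes=10))
```

Under these sample numbers the nightly schedule misses both targets, while the standby setup meets them. The `restore_duration` input should come from an actual restore drill, not an estimate.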
### Backup Types
- **Full**: everything; slow and large, but simple to restore.
- **Incremental**: changes since the last backup of any kind; fast and small, but restore needs the whole chain.
- **Differential**: changes since the last full; a middle ground.
- **Snapshot (CoW)**: ZFS/btrfs/LVM/EBS; instant and space-efficient.
- **Continuous (CDC)**: every transaction; Postgres WAL, MySQL binlog.
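
The difference between incremental and differential backups shows up at restore time. The sketch below computes which backups must be applied, in order, to reach a given point. The `Backup` record and `restore_chain` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    seq: int   # position in time order (illustrative)
    kind: str  # "full", "incremental", or "differential"


def restore_chain(backups: list[Backup], target_seq: int) -> list[Backup]:
    """Return the backups to apply, oldest first, to restore the state at target_seq."""
    ordered = sorted(backups, key=lambda b: b.seq)
    idx = next((i for i, b in enumerate(ordered) if b.seq == target_seq), None)
    if idx is None:
        raise ValueError(f"no backup with seq {target_seq}")
    chain = []
    while True:
        current = ordered[idx]
        chain.append(current)
        if current.kind == "full":
            break
        if current.kind == "differential":
            # A differential depends only on the most recent full before it.
            idx = next((i for i in range(idx - 1, -1, -1) if ordered[i].kind == "full"), None)
        else:
            # An incremental depends on the backup immediately before it.
            idx = idx - 1 if idx > 0 else None
        if idx is None:
            raise ValueError("chain has no full backup to start from")
    chain.reverse()
    return chain


history = [
    Backup(0, "full"),
    Backup(1, "incremental"),
    Backup(2, "incremental"),
    Backup(3, "differential"),
    Backup(4, "incremental"),
]
```

For this history, restoring to `seq=2` needs backups 0, 1, 2, while restoring to `seq=4` needs only 0, 3, 4 because the differential short-circuits the earlier incrementals.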
### Applications
1. DB backup (pg_basebackup + WAL archive).
2. File backup (restic, borg, Time Machine).
3. VM/disk snapshot (EBS, GCP PD, ZFS).
4. Object store replication (S3 CRR).
5. App-level (export-import, logical dump).
## 💻 Patterns
### restic encrypted incremental backup
```bash
# Set the repository once so every command below targets the same repo
export RESTIC_REPOSITORY=s3:s3.amazonaws.com/my-backup-bucket
# Init repo (one-time)
restic init
# Daily backup
restic backup /var/data \
  --exclude '*.tmp' --tag daily --host "$(hostname)"
# Retention: keep 7 daily, 4 weekly, 12 monthly snapshots
restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --prune
# Verify: check repository structure and read a 10% sample of the data
restic check --read-data-subset=10%
```
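
A backup job that silently stops running is easy to miss, so a freshness check on the newest snapshot is a useful complement. The sketch below parses the output of `restic snapshots --json` and reports the age of the latest snapshot. The helper names and the sample JSON are illustrative assumptions; only the `time` field of each snapshot is used.

```python
import json
import re
from datetime import datetime, timedelta, timezone


def parse_snapshot_time(value: str) -> datetime:
    """Parse an RFC 3339 timestamp, normalizing the fraction to microseconds."""
    value = re.sub(r"\.(\d+)", lambda m: "." + m.group(1)[:6].ljust(6, "0"), value, count=1)
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def latest_snapshot_age(snapshots_json: str, now: datetime) -> timedelta:
    """Return how old the newest snapshot is relative to `now` (timezone-aware)."""
    snapshots = json.loads(snapshots_json)
    if not snapshots:
        raise ValueError("repository has no snapshots")
    newest = max(parse_snapshot_time(s["time"]) for s in snapshots)
    return now - newest


# Hypothetical sample of what `restic snapshots --json` might return
SAMPLE = json.dumps([
    {"time": "2026-05-09T02:00:00.123456789+09:00", "short_id": "a1b2c3d4"},
    {"time": "2026-05-10T02:00:00.5Z", "short_id": "e5f6a7b8"},
])

age = latest_snapshot_age(SAMPLE, datetime(2026, 5, 10, 3, 0, 0, tzinfo=timezone.utc))
stale = age > timedelta(hours=26)  # threshold is an arbitrary example for a daily job
```

In practice the JSON would come from running the restic command (for example via `subprocess.run`), and a stale result would raise an alert.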
### Postgres PITR setup
```bash
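# postgresql.conf (server settings, not shell commands)
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://pg-wal/%f'
# Base backup (shell): tar format, gzip-compressed, with progress output
pg_basebackup -D /backup/base -Ft -z -P -U replicator
# Restore: on PostgreSQL 12+, set restore_command + recovery_target_time in postgresql.conf
# and create an empty recovery.signal file; PostgreSQL 11 and earlier use recovery.conf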
```
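
PITR only works if the WAL archive has no holes between the base backup and the recovery target. The following sketch looks for missing segments in a list of archived file names. It assumes the default 16 MiB `wal_segment_size` (256 segments per log file number) and treats each timeline separately; the function names are illustrative.

```python
import re

WAL_NAME = re.compile(r"^[0-9A-F]{24}$")
SEGMENTS_PER_LOG = 0x100  # holds for the default 16 MiB wal_segment_size


def wal_position(name: str) -> tuple[int, int]:
    """Return (timeline, linear segment number) for a WAL segment file name."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def find_wal_gaps(names: list[str]) -> list[tuple[str, str]]:
    """Return (before, after) pairs between which segments are missing."""
    by_timeline: dict[int, list[tuple[int, str]]] = {}
    for name in names:
        if not WAL_NAME.match(name):
            continue  # skip .history, .backup, and other non-segment files
        timeline, pos = wal_position(name)
        by_timeline.setdefault(timeline, []).append((pos, name))
    gaps = []
    for timeline in sorted(by_timeline):
        segments = sorted(by_timeline[timeline])
        for (p1, n1), (p2, n2) in zip(segments, segments[1:]):
            if p2 - p1 > 1:
                gaps.append((n1, n2))
    return gaps


archived = [
    "000000010000000000000001",
    "000000010000000000000002",
    "000000010000000000000004",  # segment ...03 is missing
    "00000001.history",
]
gaps = find_wal_gaps(archived)
```

The file list could come from listing the archive bucket; compressed archives would need their suffix stripped before the name check.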
### S3 Object Lock (immutable, ransomware-resistant)
```bash
# Requires a versioned bucket with Object Lock enabled
aws s3api put-object-lock-configuration \
  --bucket my-backup-bucket \
  --object-lock-configuration '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":30}}}'
```
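
The two retention modes differ in who can delete a locked object version before its retention period ends. The sketch below models that decision in plain Python so the semantics are easy to reason about; it does not call AWS, and the function names are illustrative. It assumes the retention period is counted in days from object creation, as with the default retention configured above.

```python
from datetime import datetime, timedelta, timezone


def retain_until(created: datetime, days: int) -> datetime:
    """End of the retention period for a version created at `created`."""
    return created + timedelta(days=days)


def version_delete_allowed(
    mode: str,
    created: datetime,
    days: int,
    now: datetime,
    can_bypass_governance: bool = False,
) -> bool:
    """Model whether a locked object version may be deleted at `now`."""
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError(f"unknown Object Lock mode: {mode}")
    if now >= retain_until(created, days):
        return True  # retention has expired, normal permissions apply
    if mode == "GOVERNANCE":
        # Users with the bypass permission may still delete during retention.
        return can_bypass_governance
    return False  # COMPLIANCE: no one can delete until retention expires


created = datetime(2026, 5, 1, tzinfo=timezone.utc)
during = datetime(2026, 5, 10, tzinfo=timezone.utc)
after = datetime(2026, 6, 1, tzinfo=timezone.utc)
```

This is why COMPLIANCE mode is the one suited to ransomware resistance: even a compromised administrator credential cannot shorten or bypass the retention.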
### Restore drill automation
```bash
#!/usr/bin/env bash
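# Nightly drill: restore the latest snapshot to a scratch dir and verify checksums
set -euo pipefail
SCRATCH=$(mktemp -d)
trap 'rm -rf "$SCRATCH"' EXIT  # clean up even when a step fails
CHECKSUMS=$(realpath expected_checksums.sha256)
restic -r s3:.../backup restore latest --target "$SCRATCH"
# Assumes the manifest lists paths relative to the restored root
(cd "$SCRATCH" && sha256sum --strict -c "$CHECKSUMS")
echo "drill ok: $(date -Iseconds)" | tee -a /var/log/restore-drill.log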
```
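
The drill depends on an `expected_checksums.sha256` manifest that has to be produced at backup time. A minimal sketch of generating one in the format `sha256sum -c` reads (hex digest, two spaces, relative path) might look like this. The function names are illustrative, and file names containing newlines or backslashes are not handled.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> str:
    """Return sha256sum-compatible lines for every regular file under root."""
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        lines.append(f"{sha256_file(path)}  {rel}\n")
    return "".join(lines)
```

Writing `build_manifest(Path("/var/data"))` to a file right after the backup gives the drill a fixed reference to compare the restored tree against.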
### ZFS snapshot + send
```bash
# Instant CoW snapshot
zfs snapshot "tank/data@$(date +%Y%m%d-%H%M)"
# Incremental send to remote (the remote must already hold @yesterday)
zfs send -i tank/data@yesterday tank/data@today | ssh backup-host zfs recv tank/data
```
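
An incremental send needs a base snapshot that exists on both sides. The sketch below picks the newest common snapshot and builds the matching `zfs send` argument list, falling back to a full send when nothing is shared. It assumes both lists hold snapshot names (the part after `@`) ordered oldest to newest and compares them by name only; the helper names are illustrative.

```python
from typing import Optional


def pick_incremental_base(local: list[str], remote: list[str]) -> Optional[str]:
    """Newest snapshot present on both sides, or None if a full send is needed."""
    remote_set = set(remote)
    for snap in reversed(local):
        if snap in remote_set:
            return snap
    return None


def build_send_command(dataset: str, local: list[str], remote: list[str]) -> Optional[list[str]]:
    """Return the zfs send argv for the newest local snapshot, or None if up to date."""
    if not local:
        raise ValueError("no local snapshots to send")
    newest = local[-1]
    if newest in remote:
        return None  # remote already has the newest snapshot
    target = f"{dataset}@{newest}"
    base = pick_incremental_base(local[:-1], remote)
    if base is None:
        return ["zfs", "send", target]
    return ["zfs", "send", "-i", f"{dataset}@{base}", target]


local_snaps = ["20260508-0200", "20260509-0200", "20260510-0200"]
remote_snaps = ["20260508-0200", "20260509-0200"]
command = build_send_command("tank/data", local_snaps, remote_snaps)
```

Returning an argument list rather than a shell string keeps the command safe to pass to `subprocess` without quoting concerns.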
## Decision Criteria
| Situation | Approach |
|---|---|
| Files, small-mid | restic / borg |
| Postgres prod | pg_basebackup + WAL archive (PITR) |
| MySQL prod | xtrabackup + binlog |
| VM | snapshot + offsite replica |
| Multi-cloud | S3-compatible + CRR |
| Compliance (WORM) | S3 Object Lock COMPLIANCE mode |
**Default**: restic to S3 with Object Lock + nightly restore drill.
## 🔗 Graph
- Parent: [[SRE]]
- Variant: [[CI_CD_Pipeline]]
- Application: [[카오스 몽키(Chaos Monkey)]]
- Adjacent: [[Secret_Management]] · [[Logging_and_Error_Handling]]
## 🤖 LLM Usage
**When**: backup script generation, restore runbook drafting, log anomaly summarization.
**When not**: actual restore execution; that needs a manual gate.
## ❌ Anti-patterns
- **No restore test**: the most common failure; backups run fine but the restore does not work.
- **Single copy**: a single disk failure loses everything.
- **No encryption**: the backup itself becomes an attack vector; encryption at rest is mandatory.
- **No immutability**: ransomware encrypts the backups too.
- **Forever retention**: costs explode, and it may violate GDPR.
## 🧪 Verification / Duplicates
- Verified: restic docs; AWS Backup whitepaper; Veeam 3-2-1-1-0 guide; PostgreSQL PITR docs.
- Trust level: A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — 3-2-1-1-0 + restic/PG PITR/S3 Object Lock |