| id | title | category | status | source_trust_level | verification_status | created_at | updated_at | tags | tech_stack | applied_in | aliases |
|---|---|---|---|---|---|---|---|---|---|---|---|
| db-time-series-patterns | Time-series: TimescaleDB / downsampling / retention | Coding | draft | B | conceptual | 2026-05-09 | 2026-05-09 | | | | |
Time-series Patterns
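Metrics, events, logs, and IoT readings are all time-series data. TimescaleDB (a Postgres extension) is the modern default: plain SQL plus time-axis optimizations. InfluxDB and Prometheus are also options.
📖 Core concepts
- Hypertable: a table partitioned automatically by time.
- Continuous aggregate: a materialized view kept up to date in near real time (downsampled to 1m / 1h / 1d).
- Compression: old chunks are compressed automatically (10-30x).
- Retention policy: chunks are dropped automatically after N days.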
💻 Code patterns
TimescaleDB hypertable
CREATE EXTENSION IF NOT EXISTS timescaledb;
CREATE TABLE metrics (
  time TIMESTAMPTZ NOT NULL,
  device_id TEXT NOT NULL,
  cpu DOUBLE PRECISION,
  mem DOUBLE PRECISION
);
SELECT create_hypertable('metrics', 'time', chunk_time_interval => INTERVAL '1 day');
CREATE INDEX metrics_device_time ON metrics(device_id, time DESC);
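To make the effect of `chunk_time_interval` concrete, here is a minimal Python sketch of how a timestamp maps to a one-day chunk. It assumes chunk boundaries fall on multiples of the interval counted from the Unix epoch, which is a simplification; actual chunk placement is decided by TimescaleDB. The helper `chunk_range` is hypothetical and not part of any library.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_range(ts: datetime, interval: timedelta = timedelta(days=1)):
    """Return the [start, end) range of the chunk that would hold ts.

    Assumption: boundaries are multiples of `interval` since the Unix epoch.
    """
    index = (ts - EPOCH) // interval  # whole intervals elapsed since the epoch
    start = EPOCH + index * interval
    return start, start + interval


start, end = chunk_range(datetime(2026, 5, 9, 10, 0, tzinfo=timezone.utc))
print(start.isoformat(), end.isoformat())
```

A query with a `WHERE time > ...` predicate only needs to touch the chunks whose range overlaps the predicate, which is why time filters matter so much on hypertables.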
Insert (bulk)
COPY metrics FROM STDIN;
-- or a multi-row INSERT
INSERT INTO metrics(time, device_id, cpu, mem)
VALUES ('2026-05-09 10:00', 'd1', 0.5, 0.3),
       ('2026-05-09 10:01', 'd1', 0.6, 0.3);
Query (time range)
-- last hour
SELECT time_bucket('1 minute', time) AS bucket,
       device_id,
       avg(cpu) AS avg_cpu
FROM metrics
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY bucket, device_id
ORDER BY bucket;
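What the query above computes can be sketched in plain Python: each timestamp is floored to the start of its bucket, then values are averaged per (bucket, device). This is only an illustration of the semantics. The functions `time_bucket` and `avg_cpu_per_bucket` below are hypothetical helpers, and aligning buckets to the Unix epoch is an assumption that happens to give the usual boundaries for minute and hour widths.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_bucket(width: timedelta, ts: datetime) -> datetime:
    """Floor ts to the start of its bucket (epoch-aligned, an assumption)."""
    return EPOCH + ((ts - EPOCH) // width) * width


def avg_cpu_per_bucket(rows, width=timedelta(minutes=1)):
    """rows: iterable of (time, device_id, cpu). Returns {(bucket, device_id): avg}."""
    groups = defaultdict(list)
    for ts, device_id, cpu in rows:
        groups[(time_bucket(width, ts), device_id)].append(cpu)
    # Sort keys so the result is ordered by bucket, like ORDER BY bucket.
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


t0 = datetime(2026, 5, 9, 10, 0, tzinfo=timezone.utc)
rows = [
    (t0 + timedelta(seconds=5), "d1", 0.4),
    (t0 + timedelta(seconds=35), "d1", 0.6),
    (t0 + timedelta(seconds=70), "d1", 0.8),
]
for (bucket, device), avg in avg_cpu_per_bucket(rows).items():
    print(bucket.isoformat(), device, avg)
```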
Continuous aggregate
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS hour,
       device_id,
       avg(cpu) AS avg_cpu,
       max(cpu) AS max_cpu,
       count(*) AS samples
FROM metrics
GROUP BY hour, device_id;
-- refresh policy (keeps the view close to real time)
SELECT add_continuous_aggregate_policy('metrics_hourly',
  start_offset => INTERVAL '3 hours',
  end_offset => INTERVAL '5 minutes',
  schedule_interval => INTERVAL '5 minutes');
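The two offsets define a sliding window relative to the time the policy runs. The sketch below works through the arithmetic for the values used above, under the assumption that a run covers `[now - start_offset, now - end_offset)` and only materializes hourly buckets that lie completely inside that window. The helper names are hypothetical; this is not TimescaleDB code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def refresh_window(now, start_offset, end_offset):
    """Window one policy run looks at: [now - start_offset, now - end_offset)."""
    return now - start_offset, now - end_offset


def complete_buckets(start, end, width):
    """Bucket start times that lie entirely inside [start, end)."""
    first = EPOCH - ((EPOCH - start) // width) * width   # round start up
    last_end = EPOCH + ((end - EPOCH) // width) * width  # round end down
    count = max(0, (last_end - first) // width)
    return [first + i * width for i in range(count)]


now = datetime(2026, 5, 9, 12, 0, tzinfo=timezone.utc)
start, end = refresh_window(now, timedelta(hours=3), timedelta(minutes=5))
for bucket in complete_buckets(start, end, timedelta(hours=1)):
    print(bucket.isoformat())
```

With a run at 12:00 the window is 09:00 to 11:55, so under these assumptions the 09:00 and 10:00 buckets are refreshed and the still-open 11:00 bucket is left for a later run.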
Compression
ALTER TABLE metrics SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id',
  timescaledb.compress_orderby = 'time DESC'
);
SELECT add_compression_policy('metrics', INTERVAL '7 days');
-- chunks older than 7 days are compressed automatically
Retention
SELECT add_retention_policy('metrics', INTERVAL '90 days');
-- chunks older than 90 days are dropped automatically
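Taken together, the 7-day compression policy and the 90-day retention policy give each chunk a simple lifecycle. The Python sketch below classifies a chunk by the age of its range end. It is a simplified model under the assumption that both policies compare the chunk's end time against the policy interval; `chunk_state` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

COMPRESS_AFTER = timedelta(days=7)   # from add_compression_policy
DROP_AFTER = timedelta(days=90)      # from add_retention_policy


def chunk_state(chunk_end: datetime, now: datetime) -> str:
    """Classify a chunk by the age of the end of its time range."""
    age = now - chunk_end
    if age > DROP_AFTER:
        return "dropped"
    if age > COMPRESS_AFTER:
        return "compressed"
    return "uncompressed"


now = datetime(2026, 5, 9, tzinfo=timezone.utc)
for day in (datetime(2026, 5, 8, tzinfo=timezone.utc),
            datetime(2026, 4, 20, tzinfo=timezone.utc),
            datetime(2026, 1, 1, tzinfo=timezone.utc)):
    print(day.date(), chunk_state(day, now))
```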
Time-bucket gap fill
SELECT time_bucket_gapfill('1 minute', time) AS bucket,
       device_id,
       locf(avg(cpu)) AS cpu -- last observation carried forward
FROM metrics
WHERE time > NOW() - INTERVAL '1 hour'
  AND time <= NOW()
GROUP BY bucket, device_id;
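The gap-filling idea can be sketched in a few lines of Python: emit every bucket in the requested range, and where a bucket has no data, reuse the most recent value seen. `gapfill_locf` is a hypothetical helper that only illustrates the technique for a single device; buckets before the first observation stay `None`, mirroring a NULL result.

```python
from datetime import datetime, timedelta, timezone


def gapfill_locf(buckets, start, end, width):
    """buckets: {bucket_start: value} with gaps.

    Returns [(bucket_start, value)] for every bucket in [start, end),
    carrying the last observed value forward into empty buckets.
    """
    filled, last = [], None
    t = start
    while t < end:
        if t in buckets:
            last = buckets[t]
        filled.append((t, last))
        t += width
    return filled


t0 = datetime(2026, 5, 9, 10, 0, tzinfo=timezone.utc)
minute = timedelta(minutes=1)
observed = {t0: 0.5, t0 + 3 * minute: 0.7}  # 10:01, 10:02 and 10:04 are missing
for bucket, cpu in gapfill_locf(observed, t0, t0 + 5 * minute, minute):
    print(bucket.time(), cpu)
```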
Downsampling stages
raw (1s) → 1 minute (cont. agg) → 1 hour → 1 day
Each stage gets its own retention period.
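A quick calculation shows why tiered retention pays off. The sketch below counts the rows kept per device at each stage. Only the 90-day raw retention comes from the retention example in this note; the retention periods for the other stages are illustrative assumptions, not recommendations.

```python
SECONDS_PER_DAY = 86_400

# (stage, seconds per row, retention in days)
# Retention for the aggregated stages is an illustrative assumption.
STAGES = [
    ("raw 1s", 1, 90),
    ("1 minute", 60, 365),
    ("1 hour", 3_600, 5 * 365),
    ("1 day", 86_400, 10 * 365),
]


def rows_retained(seconds_per_row: int, retention_days: int) -> int:
    """Rows kept per device for one stage at steady state."""
    return SECONDS_PER_DAY // seconds_per_row * retention_days


for name, seconds_per_row, retention_days in STAGES:
    print(f"{name:>8}: {rows_retained(seconds_per_row, retention_days):>10,} rows/device")
```

Under these assumptions the raw stage dominates storage even though it has the shortest retention, which is the case for keeping raw data short-lived and long history only in the coarse stages.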
InfluxDB (alternative)
# Line protocol
metrics,device=d1 cpu=0.5,mem=0.3 1715238000000000000
// Flux query
from(bucket: "default")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "metrics")
  |> aggregateWindow(every: 1m, fn: mean)
Prometheus (metrics + alerting)
# scrape config
scrape_configs:
  - job_name: api
    static_configs:
      - targets: ['api:9090']
PromQL:
rate(http_requests_total[5m])
histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
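As a rough intuition for what `rate(...[5m])` returns, the Python sketch below computes a per-second increase from counter samples inside a window and treats any drop in value as a counter reset. It is a deliberately simplified model: it does not reproduce how Prometheus extrapolates to the window boundaries, and `simple_rate` is a hypothetical helper.

```python
def simple_rate(samples):
    """samples: list of (unix_seconds, counter_value) in the window, oldest first.

    Returns the average per-second increase, or None with fewer than 2 samples.
    Simplification: no extrapolation to the window edges.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted; count the new value as the increase.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# The counter resets between t=60 and t=120.
samples = [(0, 100), (60, 160), (120, 10), (180, 70)]
print(simple_rate(samples))
```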
🤔 Decision criteria
| Data | Recommendation |
|---|---|
| Metrics + alerting | Prometheus + Grafana |
| IoT / sensors (long-term storage) | TimescaleDB |
| Trading / analytics | TimescaleDB / ClickHouse |
| Simple events | Postgres + partitioning |
| Very large (PB scale) | ClickHouse / Druid |
| Short-lived (real-time, 30 days) | Prometheus / InfluxDB |
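❌ Anti-patterns
- A single table with 1B+ rows in plain Postgres: indexes grow huge and queries slow down.
- Separate indexes on time and device: a composite (device_id, time DESC) index works best.
- No retention: the table grows forever.
- No compression: 5-10x more disk.
- No continuous aggregate, scanning raw data on every query: minute-level queries take hours.
- Time stored as TEXT: values do not sort chronologically and range queries are slow. Use TIMESTAMPTZ.
- Transactional workload on the monitoring DB: an HFT app sharing one Postgres instance with monitoring causes stalls.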
🤖 LLM usage hints
- TimescaleDB: if you know Postgres, it carries over as-is.
- The four building blocks: hypertable + continuous aggregate + compression + retention.
- Use time_bucket + gapfill + locf.