| id | title | category | status | source_trust_level | verification_status | created_at | updated_at | tags | tech_stack | applied_in | aliases |
|---|---|---|---|---|---|---|---|---|---|---|---|
| cs-compression-algorithms | Compression: gzip / brotli / zstd / lz4 | Coding | draft | B | conceptual | 2026-05-09 | 2026-05-09 | | | | |
Compression Algorithms
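Compression for network transfer and disk storage: gzip (legacy), brotli (web), zstd (modern), lz4 (speed). The trade-off is compression ratio versus CPU cost, so choose per use case.
📖 Key Concepts
- Ratio: original size divided by compressed size (e.g. 3x); a higher ratio means smaller output.
- Speed: measured separately for compression and decompression.
- Memory: working-set footprint during compression and decompression; smaller is better.
- Streaming: incremental compression of data as it arrives, without buffering the whole input.
💻 Code Patterns
Comparison (approximate)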
| Algorithm | Ratio | Compress speed | Decompress speed | Use case |
|---|---|---|---|---|
| gzip | 3-5x | medium | fast | legacy web, logs |
| brotli | 5-7x | slow-ish | fast | web (HTTP) |
| zstd | 4-6x | fast | very fast | modern default |
| lz4 | 2-3x | very fast | very fast | memory cache, snapshots |
| snappy | 2-3x | very fast | very fast | big data (Cassandra) |
| xz | 5-10x | slow | slow | backup |
| zlib | 3-5x | medium | fast | legacy |
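Node usage
import zlib from 'node:zlib';
import { promisify } from 'node:util';
// Sample input: highly repetitive, so it compresses well
const buf = Buffer.from('hello'.repeat(1000));
// gzip
const gzip = promisify(zlib.gzip);
const gunzip = promisify(zlib.gunzip);
const gzipped = await gzip(buf);
const gunzipped = await gunzip(gzipped);
// Brotli (distinct names: a `const` cannot be redeclared in the same scope)
const brotliCompress = promisify(zlib.brotliCompress);
const brotliDecompress = promisify(zlib.brotliDecompress);
const brCompressed = await brotliCompress(buf);
const brDecompressed = await brotliDecompress(brCompressed);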
Streaming
import { createGzip } from 'node:zlib';
import { pipeline } from 'node:stream/promises';
import fs from 'node:fs';
await pipeline(
fs.createReadStream('input.txt'),
createGzip({ level: 6 }),
fs.createWriteStream('output.txt.gz'),
);
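The reverse direction works the same way with `createGunzip`. The following is a minimal round-trip sketch, assuming a throwaway directory under the OS temp folder; the helper names `gzipFile` and `gunzipFile` and the file names are hypothetical, chosen only for illustration.

```javascript
import { createGzip, createGunzip } from 'node:zlib';
import { pipeline } from 'node:stream/promises';
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

// Compress src into dest chunk by chunk; the whole file is never held in memory.
async function gzipFile(src, dest) {
  await pipeline(fs.createReadStream(src), createGzip({ level: 6 }), fs.createWriteStream(dest));
}

// Decompress src into dest, again as a stream.
async function gunzipFile(src, dest) {
  await pipeline(fs.createReadStream(src), createGunzip(), fs.createWriteStream(dest));
}

// Demo in a temporary directory (hypothetical file names).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'gzip-demo-'));
const input = path.join(dir, 'input.txt');
const gzPath = path.join(dir, 'input.txt.gz');
const restoredPath = path.join(dir, 'roundtrip.txt');

fs.writeFileSync(input, 'one line of log text\n'.repeat(10000));
await gzipFile(input, gzPath);
await gunzipFile(gzPath, restoredPath);

console.log({
  original: fs.statSync(input).size,
  compressed: fs.statSync(gzPath).size,
});
```

`pipeline` from `node:stream/promises` propagates errors from any stage and closes all streams, which is why it is preferred here over chaining `.pipe()` calls.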
zstd (modern, recommended)
yarn add @mongodb-js/zstd # or node-zstandard
import zstd from '@mongodb-js/zstd';
const compressed = await zstd.compress(buffer, 3); // level 1-22
const decompressed = await zstd.decompress(compressed);
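HTTP: gzip / brotli (automatic)
// Express (`app` is an Express application)
import compression from 'compression';
app.use(compression({
  level: 6,
  threshold: 1024, // only compress responses larger than 1 KB
  filter: (req, res) => {
    // Compress only text-like content types
    const t = res.getHeader('Content-Type');
    return /text|json|javascript|css|svg/.test(String(t));
  },
}));
// Hono (`app` is a Hono application; this middleware handles gzip and deflate, not brotli)
import { compress } from 'hono/compress';
app.use(compress()); // or compress({ encoding: 'gzip' }) to force a single scheme
→ Both middlewares inspect Accept-Encoding automatically and pick a suitable algorithm.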
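What the middleware does with Accept-Encoding can be sketched roughly as follows. This is a simplified illustration, not the code of either library: the function names and the `SERVER_PREFERENCE` order are assumptions, and a full implementation would also rank encodings by their q-values rather than only checking that q is above zero.

```javascript
// Parse an Accept-Encoding header into a map of encoding -> q-value.
function parseAcceptEncoding(header) {
  const prefs = new Map();
  for (const part of (header ?? '').split(',')) {
    const [name, ...params] = part.trim().toLowerCase().split(';');
    if (!name) continue;
    const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
    const q = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
    prefs.set(name.trim(), Number.isNaN(q) ? 0 : q);
  }
  return prefs;
}

// Hypothetical server-side preference: brotli first, then gzip.
const SERVER_PREFERENCE = ['br', 'gzip'];

// Return the first server-preferred encoding the client accepts (q > 0),
// or 'identity' (no compression) if none matches.
function pickEncoding(header) {
  const prefs = parseAcceptEncoding(header);
  for (const enc of SERVER_PREFERENCE) {
    const q = prefs.has(enc) ? prefs.get(enc) : (prefs.get('*') ?? 0);
    if (q > 0) return enc;
  }
  return 'identity';
}

console.log(pickEncoding('gzip, deflate, br'));
console.log(pickEncoding('br;q=0, gzip;q=0.5'));
```

The chosen value is what the server would then send back in the Content-Encoding response header, together with `Vary: Accept-Encoding` so caches keep the variants apart.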
nginx
gzip on;
gzip_types text/css application/javascript application/json;
gzip_min_length 1024;
gzip_comp_level 6;
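# brotli_* directives need the third-party ngx_brotli module
brotli on;
brotli_types text/css application/javascript application/json;
brotli_comp_level 6;
→ Brotli is supported by all modern browsers and typically yields smaller text assets than gzip at a comparable level; the exact gain depends on content and level.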
Pre-compression (static)
# Compress at build time so no CPU is spent at runtime
brotli -k -q 11 dist/*.js dist/*.css # maximum compression
gzip -k -9 dist/*.js dist/*.css
# Or use a Vite plugin
// vite.config.ts
import { defineConfig } from 'vite';
import compression from 'vite-plugin-compression';
export default defineConfig({
  plugins: [
    compression({ algorithm: 'gzip', ext: '.gz' }),
    compression({ algorithm: 'brotliCompress', ext: '.br' }),
  ],
});
# Serve pre-compressed files (nginx; brotli_static needs the ngx_brotli module)
gzip_static on;
brotli_static on;
→ Compress once at build time; nginx then serves the .gz / .br files as-is.
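If neither the CLI tools nor a bundler plugin is available, the same build-time step can be approximated with `node:zlib` alone. A minimal sketch, assuming the asset is already in memory; the `precompress` helper and the sample asset are hypothetical.

```javascript
import { gzipSync, brotliCompressSync, constants } from 'node:zlib';

// Produce maximum-compression gzip and brotli variants of one asset.
// Slow settings are acceptable here because this runs once per build.
function precompress(source) {
  const gz = gzipSync(source, { level: constants.Z_BEST_COMPRESSION });
  const br = brotliCompressSync(source, {
    params: {
      [constants.BROTLI_PARAM_QUALITY]: constants.BROTLI_MAX_QUALITY,
      [constants.BROTLI_PARAM_SIZE_HINT]: source.length,
    },
  });
  return { gz, br };
}

// Stand-in for a bundled JS file (hypothetical content).
const asset = Buffer.from('export const answer = 42;\n'.repeat(500));
const { gz, br } = precompress(asset);
console.log({ original: asset.length, gz: gz.length, br: br.length });
```

In a real build script the two buffers would be written next to the original as `<file>.gz` and `<file>.br`, which are the names `gzip_static` and `brotli_static` look for.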
Compressible vs. incompressible
Compresses well:
Text (JSON, XML, HTML, CSS, JS, logs, source code)
Compresses poorly:
Image (JPEG, PNG, WebP: already compressed)
Video (MP4, WebM)
Audio (MP3, AAC)
Binary (PDF, archives)
Random / encrypted data
→ Trying to compress images or video only burns CPU; the output does not get smaller.
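The difference is easy to observe directly. In the sketch below, random bytes stand in for already-compressed or encrypted content (an assumption made so the example needs no media files), and the `gzipRatio` helper is hypothetical.

```javascript
import { gzipSync } from 'node:zlib';
import { randomBytes } from 'node:crypto';

// Ratio = original size / compressed size; values near 1 mean no gain.
function gzipRatio(buf) {
  return buf.length / gzipSync(buf).length;
}

// Repetitive JSON text versus the same number of random bytes.
const text = Buffer.from(
  JSON.stringify({ user: 'alice', role: 'admin', active: true }).repeat(2000)
);
const random = randomBytes(text.length);

console.log({
  textRatio: gzipRatio(text).toFixed(2),
  randomRatio: gzipRatio(random).toFixed(2),
});
```

The random input typically comes out slightly larger than it went in, because gzip still adds its header and block framing.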
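Database column (Postgres TOAST)
Large TEXT / BYTEA values are compressed automatically with PGLZ once a row exceeds the TOAST threshold (about 2 kB by default; 8 kB is the page size).
LZ4 is also an option (Postgres 14+, when the server is built with LZ4 support).
ALTER TABLE x ALTER COLUMN data SET COMPRESSION lz4;
→ Saves disk space with little effect on query speed. The setting applies to newly written values; existing rows keep their current compression.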
Compression in storage
Parquet: Snappy (default) / gzip / zstd / brotli
ORC: Snappy / zlib / lzo
ClickHouse: lz4 / zstd
Cassandra: Snappy / lz4 / zstd
RocksDB: Snappy / lz4 / zstd
→ zstd is the best modern all-rounder (ratio + speed).
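Network: sockets
// WebSocket compression (Node `ws` client; the browser WebSocket constructor takes no such option)
import WebSocket from 'ws';
const ws = new WebSocket(url, { perMessageDeflate: true });
→ Enable it when large messages are sent frequently.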
Brotli vs gzip (web-specific)
Brotli ships with a built-in static dictionary of strings that are common in HTML, JS, and CSS.
For the same input file, brotli output is roughly 5-15% smaller than gzip.
→ Modern web: brotli with a gzip fallback.
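To check the size difference on your own markup, both codecs are available in `node:zlib`. A small sketch, assuming a synthetic HTML fragment (the markup below is made up for illustration; real pages will give different numbers).

```javascript
import {
  gzipSync, gunzipSync,
  brotliCompressSync, brotliDecompressSync,
  constants,
} from 'node:zlib';

// Synthetic HTML: 200 similar but not identical product cards.
const html = Buffer.from(
  Array.from({ length: 200 }, (_, i) =>
    `<div class="card"><a href="/items/${i}">Item ${i}</a>` +
    `<span class="price">${(i * 1.5).toFixed(2)}</span></div>`
  ).join('\n')
);

// Maximum settings for both, as would be used for pre-compressed static assets.
const gz = gzipSync(html, { level: 9 });
const br = brotliCompressSync(html, {
  params: { [constants.BROTLI_PARAM_QUALITY]: 11 },
});

const sizes = { original: html.length, gzip: gz.length, brotli: br.length };
console.log(sizes);
```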
Compression bomb (security)
1 KB compressed → 1 GB decompressed.
A server that decompresses without checking runs out of memory (OOM).
→ Enforce a maximum decompressed size.
import { gunzipSync } from 'node:zlib';
const MAX_SIZE = 100 * 1024 * 1024; // 100MB
const decompressed = gunzipSync(buf, { maxOutputLength: MAX_SIZE });
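When the limit is exceeded, `gunzipSync` throws instead of returning, so callers need to handle that case. A minimal sketch, assuming a much smaller 1 MB limit for the demo; the `safeGunzip` wrapper is hypothetical, and returning `null` on overflow is only one possible policy.

```javascript
import { gzipSync, gunzipSync } from 'node:zlib';

const DEMO_LIMIT = 1024 * 1024; // 1 MB, deliberately small for this demo

// Decompress, or return null if the output would exceed the limit.
function safeGunzip(buf, limit = DEMO_LIMIT) {
  try {
    return gunzipSync(buf, { maxOutputLength: limit });
  } catch (err) {
    if (err.code === 'ERR_BUFFER_TOO_LARGE') return null; // over the limit
    throw err; // corrupt input or another failure: let it surface
  }
}

// 10 MB of zeros shrinks to a few kilobytes, which is what makes bombs cheap to send.
const bomb = gzipSync(Buffer.alloc(10 * 1024 * 1024));
const small = gzipSync(Buffer.from('hello'));

console.log({ bombBytes: bomb.length, bombResult: safeGunzip(bomb) });
console.log(safeGunzip(small).toString());
```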
LZ4 (memory cache)
import LZ4 from 'lz4js';
const compressed = LZ4.compress(buf);
const decompressed = LZ4.decompress(compressed);
→ Very fast, so it suits compressing values before caching them (for example in Redis).
Snappy (big data, Hadoop / Cassandra)
- Very fast compression and decompression.
- Weak ratio (2-3x).
- Used in big data scenarios.
Choosing a compression level
gzip / brotli / zstd: 1 (fast) up to 9 / 11 / 22 (slower, smaller output)
Real-time stream: level 1-3
HTTP response: 6 (default)
Static asset: 11 (brotli max, applied at build time)
Backup: max
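The effect of the level is best judged on representative data. The sketch below runs gzip at levels 1, 6, and 9 over synthetic log lines (the log format is invented for illustration) and reports size and time for each.

```javascript
import { gzipSync } from 'node:zlib';

// Synthetic log data: similar lines with varying fields.
const lines = Array.from({ length: 5000 }, (_, i) =>
  `2026-05-09T12:00:${String(i % 60).padStart(2, '0')}Z INFO request ` +
  `id=${i} status=${200 + (i % 5)} ms=${(i * 7) % 311}`
);
const data = Buffer.from(lines.join('\n'));

// Compress the same input at a fast, a default, and the maximum level.
const results = [1, 6, 9].map((level) => {
  const start = process.hrtime.bigint();
  const size = gzipSync(data, { level }).length;
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { level, size, ms };
});

console.table(results);
```

Timings from a single run are noisy; repeating the loop several times and taking the median gives a steadier picture.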
Measurement
const original = data.length;
const t0 = Date.now();
const compressed = await zstd.compress(data, 3);
const t1 = Date.now();
const decompressed = await zstd.decompress(compressed);
const t2 = Date.now();
console.log({
original,
compressed: compressed.length,
ratio: (original / compressed.length).toFixed(2),
compressMs: t1 - t0,
decompressMs: t2 - t1,
});
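Dictionary compression (large savings)
Sending JSON with the same schema every time repeats the same keys in every message.
A pre-built dictionary makes each message smaller.
zstd supports a dictionary mode:
zstd --train *.json -o dict
zstd -D dict input.json
→ Can be 50-80% smaller, mainly for small payloads that share structure.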
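The same idea can be tried without installing zstd, because `node:zlib` accepts a preset `dictionary` for deflate/inflate. Note that this swaps in zlib's preset dictionary, not a trained zstd dictionary, so the numbers will differ; the dictionary contents and the message shape below are hypothetical.

```javascript
import { deflateSync, inflateSync } from 'node:zlib';

// Hand-written dictionary: byte strings expected to recur in every message.
// (zstd --train derives something similar from samples automatically.)
const dictionary = Buffer.from(
  '{"userId":,"eventType":"","timestamp":"","payload":{"page":"","referrer":""}}'
);

// One small message following the shared schema.
const message = Buffer.from(JSON.stringify({
  userId: 1842,
  eventType: 'page_view',
  timestamp: '2026-05-09T12:00:00Z',
  payload: { page: '/pricing', referrer: '/home' },
}));

const plain = deflateSync(message);                    // no dictionary
const withDict = deflateSync(message, { dictionary }); // keys matched against the dictionary
const restored = inflateSync(withDict, { dictionary }); // the reader needs the same dictionary

console.log({ raw: message.length, plain: plain.length, withDict: withDict.length });
```

Both sides must hold an identical dictionary, so in practice it is versioned and shipped alongside the schema.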
🤔 Decision Criteria
| Use case | Recommendation |
|---|---|
| HTTP response (dynamic) | brotli (level 4-6) + gzip fallback |
| Static asset (build) | brotli max + gzip max, pre-compressed |
| Database column | zstd / lz4 |
| Memory cache | lz4 / snappy |
| Backup | zstd / xz |
| Streaming pipe | zstd / lz4 |
| Big data analytic | snappy / zstd |
| Real-time game | lz4 |
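❌ Anti-patterns
- Recompressing already-compressed files: wastes CPU. Check the type first.
- Compressing small data (< 1 KB): header overhead can outweigh the savings.
- Decompressing without a size limit: exposes the server to OOM attacks via decompression bombs.
- Compressing static assets on every request: pre-compress them instead.
- Brotli only, with no gzip fallback: breaks older clients.
- Level 22 for real-time traffic: high latency.
- Compressing every Content-Type: images and similar formats do not shrink.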
🤖 LLM Usage Hints
- Web: brotli + gzip fallback (handled automatically by the library).
- Storage: zstd (modern).
- Speed-critical: lz4 / snappy.
- Pre-compress static assets.