[G1-Sync] Manual knowledge update
---
id: cs-hashing-strategies
title: Hashing Strategies - MD5 / SHA / xxHash / Argon2
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [cs, hashing, vibe-coding]
tech_stack: { language: "TS", applicable_to: ["Backend"] }
applied_in: []
aliases: [hash, MD5, SHA-256, xxHash, Argon2, password hash, content addressing]
---

# Hashing Strategies

> Different use cases call for different hashes: **Cryptographic (SHA-256, BLAKE3) vs Fast (xxHash, MurmurHash) vs Password (Argon2, bcrypt)**. Picking the wrong one breaks either security or performance.
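
## 📖 Core Concepts
- Cryptographic: collision- and preimage-resistant; slower than non-crypto hashes, but still fast enough for bulk data.
- Fast non-crypto: speed-optimized, no security guarantees.
- Password: deliberately slow (and ideally memory-hard) to block brute force.
- Content-addressed: the hash of the data is its ID (Git, IPFS).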

## 💻 Code Patterns

### Recommendations by use case
```
Password hash:        Argon2id, bcrypt, scrypt
Content address:      SHA-256, BLAKE3
Tamper detection:     SHA-256, HMAC
Cache key / sharding: xxHash, MurmurHash
File integrity:       SHA-256, BLAKE3
HMAC (signing):       HMAC-SHA-256
ID generation:        UUID, Snowflake (not hashes)
```

### Cryptographic hash (secure, general-purpose)
```ts
import { createHash } from 'node:crypto';
import { createReadStream } from 'node:fs';

const hash = createHash('sha256').update('hello').digest('hex');
// Other algorithm names: 'sha512' / 'sha3-256' / 'blake2b512'

// File hash: stream the file so large files never sit fully in memory.
function hashFile(path: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const hash = createHash('sha256');
    const stream = createReadStream(path);
    stream.on('data', (chunk) => hash.update(chunk));
    stream.on('end', () => resolve(hash.digest('hex')));
    stream.on('error', reject);
  });
}
```

### BLAKE3 (modern, faster than SHA-256)
```bash
yarn add blake3
```

```ts
import { hash } from 'blake3';
const result = hash('hello').toString('hex');
```

→ Typically several times faster than software SHA-256, more so with SIMD and multithreading; the gap narrows on CPUs with SHA hardware extensions. Designed for 128-bit security, which matches SHA-256's collision resistance.

### xxHash (very fast, non-crypto)
```bash
yarn add xxhash-wasm
```

```ts
import xxhash from 'xxhash-wasm';

// The default export is an async factory that instantiates the WASM module.
const { h64ToString, h32 } = await xxhash();
const hash = h64ToString('hello'); // 16-char hex string

// Or as a 32-bit unsigned number
const num = h32('hello');
```

→ Multiple GB/s natively. Use for cache keys, sharding, and non-security checksums.

### MurmurHash (fast, popular)
```ts
import murmurhash from 'murmurhash';
const hash = murmurhash.v3('hello'); // 32-bit number
```

→ Used by Cassandra (Murmur3Partitioner) and by Kafka's default partitioner (murmur2).

### Password hashing (Argon2)
```bash
yarn add argon2
```

```ts
import argon2 from 'argon2';

const hash = await argon2.hash('password', {
  type: argon2.argon2id,
  memoryCost: 65536, // in KiB, i.e. 64 MiB
  timeCost: 3,       // iterations
  parallelism: 4,    // lanes
});
// '$argon2id$v=19$m=65536,t=3,p=4$...' (salt and parameters are embedded)

const valid = await argon2.verify(hash, 'password');
```

→ Memory-hard, which makes GPU/ASIC brute force expensive.

### bcrypt (legacy but OK)
```ts
import bcrypt from 'bcrypt';

const hash = await bcrypt.hash('password', 12); // cost 12
const valid = await bcrypt.compare('password', hash);
```

→ In use since 1999 and stable. Not memory-hard, and it truncates input at 72 bytes, so it resists GPU attacks less well than Argon2. Prefer Argon2id for new systems.

### Cost of a password hash
```
Argon2id (defaults, same as the parameters above):
- 64 MiB memory
- 3 iterations
- roughly 100 ms per verify (hardware-dependent)

→ ~100 ms on every login is acceptable.
  Every brute-force guess pays the same cost, so attacks become very slow.
```
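
The trade-off above is plain arithmetic, which the following sketch works through. The dictionary size and the fast-hash guess rate are assumptions picked for illustration, not measurements; only the ~100 ms figure comes from the note.

```ts
// Time to try every candidate password at a given guess rate.
function secondsToExhaust(candidates: number, guessesPerSecond: number): number {
  return candidates / guessesPerSecond;
}

const candidates = 1e8;   // assumed size of the attacker's dictionary
const slowRate = 1 / 0.1; // ~100 ms per guess => 10 guesses/s per core
const fastRate = 1e9;     // assumed guess rate for a fast hash on a GPU

const slowDays = secondsToExhaust(candidates, slowRate) / 86_400;
const fastSeconds = secondsToExhaust(candidates, fastRate);

console.log(`slow hash: ~${slowDays.toFixed(0)} days per core`);
console.log(`fast hash: ~${fastSeconds.toFixed(1)} s`);
```

Under these assumptions the same dictionary takes about 116 days per core against the slow hash and about a tenth of a second against the fast one.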

### HMAC (signed message)
```ts
import { createHmac, timingSafeEqual } from 'node:crypto';

function sign(message: string, secret: string): string {
  return createHmac('sha256', secret).update(message).digest('hex');
}

// Verify: recompute the signature and compare in constant time.
function verify(msg: string, sig: string, secret: string): boolean {
  const expected = Buffer.from(sign(msg, secret), 'hex');
  const actual = Buffer.from(sig, 'hex');
  // timingSafeEqual throws on unequal lengths, so check the length first.
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

→ Webhook signatures, JWT (HS256), API authentication.

→ [[Backend_Webhook_Patterns]].

### Content-addressed (Git, IPFS)
```ts
import { createHash } from 'node:crypto';
import { CID } from 'multiformats/cid';
import { sha256 } from 'multiformats/hashes/sha2';

// Git: SHA-1 over "<type> <byte length>\0<content>" (legacy; SHA-256 is the planned successor)
const blobHash = createHash('sha1').update('blob 11\0hello world').digest('hex');

// IPFS: multihash allows several algorithms (default = SHA-256)
const digest = await sha256.digest(new TextEncoder().encode('hello'));
const cid = CID.create(1, 0x55, digest); // CIDv1, 0x55 = raw codec
```

→ Same content = same hash, so duplicates are stored once.
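
### Hash for cache key
```ts
import xxhash from 'xxhash-wasm';

const { h64ToString } = await xxhash();

// Long string / object → short cache key.
// `CacheableRequest` is a hypothetical shape with an already-parsed body.
type CacheableRequest = { url: string; body: unknown };

function cacheKey(req: CacheableRequest): string {
  // JSON.stringify is order-sensitive, so keep property order stable.
  const key = JSON.stringify({ url: req.url, body: req.body });
  return h64ToString(key); // 16 hex chars
}

// `redis`, `req`, and `response` come from the surrounding app.
await redis.set(`cache:${cacheKey(req)}`, response);
```

→ xxHash is fast; SHA is overkill here.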

### Hash for sharding (modulo)
```ts
import xxhash from 'xxhash-wasm';

const { h32 } = await xxhash();

function shardKey(userId: string, numShards: number): number {
  return h32(userId) % numShards;
}
```

→ Changing `numShards` remaps most keys. [[CS_Consistent_Hashing]] is better: resharding moves only a small fraction of keys.
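
To make the difference from modulo sharding concrete, here is a minimal consistent-hashing ring. It is a sketch under stated assumptions: `HashRing`, the shard names, and the choice of 100 virtual nodes are hypothetical, and an inline FNV-1a hash stands in for xxHash only so that the example runs without dependencies.

```ts
// FNV-1a 32-bit: a tiny non-crypto hash, used here only to keep the sketch dependency-free.
function fnv1a(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

class HashRing {
  private points: { pos: number; node: string }[] = [];
  private readonly vnodes: number;

  constructor(nodes: string[], vnodes = 100) {
    this.vnodes = vnodes;
    for (const node of nodes) this.add(node);
  }

  // Place `vnodes` virtual points per node on the ring to even out the distribution.
  add(node: string): void {
    for (let i = 0; i < this.vnodes; i++) {
      this.points.push({ pos: fnv1a(`${node}#${i}`), node });
    }
    this.points.sort((a, b) => a.pos - b.pos);
  }

  // A key belongs to the first point at or after its hash, wrapping around the ring.
  lookup(key: string): string {
    if (this.points.length === 0) throw new Error('empty ring');
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].pos < h) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].node;
  }
}

const keys = Array.from({ length: 10_000 }, (_, i) => `user-${i}`);
const ring = new HashRing(['shard-a', 'shard-b', 'shard-c']);
const before = keys.map((k) => ring.lookup(k));
ring.add('shard-d');
const after = keys.map((k) => ring.lookup(k));
const moved = keys.filter((_, i) => before[i] !== after[i]).length;
console.log(`moved ${moved} of ${keys.length} keys`);
```

Every key that changes owner moves to the new shard; no key is shuffled between the existing shards, which is what modulo sharding cannot guarantee.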

### Hash table (HashMap)
```
JS Map / Object are hash tables.
The hash function is internal to the engine (V8).

→ No need to implement one yourself.
```

### MD5 (deprecated for security)
```
MD5:   practical collisions since 2004.
SHA-1: practical collision since 2017 (SHAttered).

Use:
- Non-security checksum (accidental corruption only): MD5 / SHA-1 OK
- Security: SHA-256 / SHA-3 / BLAKE3
```

### SHA-1 vs SHA-256 vs SHA-3
```
SHA-1:   deprecated (security)
SHA-256: the standard default
SHA-512: 64-bit word operations (often faster than SHA-256 on 64-bit CPUs without SHA extensions)
SHA-3:   Keccak sponge construction (different family from SHA-2)
BLAKE3:  usually the fastest of these in software
```

### Salt (password)
```ts
// ❌ Same password → same hash (pseudocode)
hash('password')

// ✅ Salt
hash(salt + password)
// The salt is random and unique per user, stored next to the hash.
// Argon2 / bcrypt generate and embed the salt automatically.
```
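
The pseudocode above can be made runnable with Node's built-in scrypt, which the note lists as a password hash. This is a sketch, not a recommended storage format: `hashPassword`, `verifyPassword`, and the `"<salt hex>:<key hex>"` layout are assumptions made for illustration.

```ts
import { randomBytes, scryptSync, timingSafeEqual } from 'node:crypto';

const KEY_LEN = 32;

// Stored format (an assumption of this sketch): "<salt hex>:<key hex>".
function hashPassword(password: string): string {
  const salt = randomBytes(16); // fresh random salt for every password
  const key = scryptSync(password, salt, KEY_LEN);
  return `${salt.toString('hex')}:${key.toString('hex')}`;
}

function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, keyHex] = stored.split(':');
  if (!saltHex || !keyHex) return false;
  const expected = Buffer.from(keyHex, 'hex');
  if (expected.length !== KEY_LEN) return false; // malformed record
  const actual = scryptSync(password, Buffer.from(saltHex, 'hex'), KEY_LEN);
  return timingSafeEqual(actual, expected);
}

const a = hashPassword('password');
const b = hashPassword('password');
console.log(a === b);                       // false: different salts
console.log(verifyPassword('password', a)); // true
console.log(verifyPassword('wrong', a));    // false
```

Because each record carries its own salt, two users with the same password get different stored values, which defeats precomputed rainbow tables.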

### Pepper
```ts
import argon2 from 'argon2';

const pepper = process.env.PEPPER!; // server-side secret, never stored in the DB
const hash = await argon2.hash(password + pepper, { type: argon2.argon2id });
```

→ Salt lives in the DB; pepper lives in an env var or secret manager. If only the DB leaks, the hashes cannot be attacked without the pepper.

### Timing attack
```ts
import { timingSafeEqual } from 'node:crypto';

// ❌ String comparison can return at the first mismatch, leaking timing
if (sig === expected) { /* ... */ }

// ✅ Constant-time comparison (throws if lengths differ, so check first)
const a = Buffer.from(sig);
const b = Buffer.from(expected);
if (a.length === b.length && timingSafeEqual(a, b)) { /* ... */ }
```
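
`timingSafeEqual` requires both buffers to have the same length. One common way around that, sketched below, is to compare fixed-size HMAC digests of the two values instead of the values themselves. The helper name `safeEqual` and the per-process random key are assumptions of this sketch.

```ts
import { createHmac, randomBytes, timingSafeEqual } from 'node:crypto';

// Random per-process key, so the digests being compared are not predictable.
const compareKey = randomBytes(32);

function safeEqual(a: string, b: string): boolean {
  const da = createHmac('sha256', compareKey).update(a).digest();
  const db = createHmac('sha256', compareKey).update(b).digest();
  return timingSafeEqual(da, db); // both digests are always 32 bytes
}

console.log(safeEqual('abc', 'abc'));  // true
console.log(safeEqual('abc', 'abcd')); // false, and no length error
```

The comparison time then depends only on the digest size, not on where the inputs first differ or how long they are.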

### Password upgrade (rehash)
```ts
// `db`, `currentParams`, and `createSession` are app-specific.
async function login(email: string, password: string) {
  const user = await db.users.findByEmail(email);

  // Same error for an unknown user and a wrong password.
  if (!user || !(await argon2.verify(user.passwordHash, password))) {
    throw new Error('Invalid');
  }

  // Upgrade the hash if it was created with outdated cost parameters
  if (argon2.needsRehash(user.passwordHash, currentParams)) {
    const newHash = await argon2.hash(password, currentParams);
    await db.users.update(user.id, { passwordHash: newHash });
  }

  return createSession(user);
}
```

→ Raise the cost parameters over time as hardware gets faster; each user is migrated at their next login.

### Hash chain (blockchain-style)
```ts
// Block hash (pseudocode):
hash(prev_block_hash + transaction_data)

// Tampering with one block invalidates every later block.
// Used by Bitcoin / Ethereum.
```
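
The pseudocode above can be expanded into a small runnable sketch. The `Block` shape, the all-zero genesis value, and the function names are assumptions made for illustration; real blockchains hash a structured block header rather than a bare string.

```ts
import { createHash } from 'node:crypto';

interface Block {
  prevHash: string;
  data: string;
  hash: string;
}

const GENESIS_PREV = '0'.repeat(64); // placeholder "previous hash" for the first block

// prevHash is always 64 hex chars, so concatenating it with data is unambiguous.
function blockHash(prevHash: string, data: string): string {
  return createHash('sha256').update(prevHash).update(data).digest('hex');
}

function buildChain(items: string[]): Block[] {
  const chain: Block[] = [];
  let prevHash = GENESIS_PREV;
  for (const data of items) {
    const hash = blockHash(prevHash, data);
    chain.push({ prevHash, data, hash });
    prevHash = hash;
  }
  return chain;
}

// Returns the index of the first invalid block, or -1 if the chain is intact.
function firstInvalid(chain: Block[]): number {
  let prevHash = GENESIS_PREV;
  for (let i = 0; i < chain.length; i++) {
    const b = chain[i];
    if (b.prevHash !== prevHash || b.hash !== blockHash(b.prevHash, b.data)) return i;
    prevHash = b.hash;
  }
  return -1;
}

const chain = buildChain(['tx-1', 'tx-2', 'tx-3']);
console.log(firstInvalid(chain)); // -1
chain[1].data = 'tx-2 (edited)';
console.log(firstInvalid(chain)); // 1
```

Recomputing the hash of the edited block does not help an attacker: the next block still stores the old hash as its `prevHash`, so the break simply moves one block later.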

### Merkle tree
```
             [hash root]
             /         \
      [hash A]         [hash B]
      /      \         /      \
   [h1]     [h2]    [h3]     [h4]
    |        |       |        |
   [d1]     [d2]    [d3]     [d4]
```

→ To verify d2: compute h2 = hash(d2), combine it with the sibling h1 to get hash A, then combine hash A with hash B to get the root. The proof is only log(N) hashes.

→ Git, IPFS, blockchain.
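
A minimal sketch of building the tree and checking a proof for one leaf follows. It assumes the leaf count is a power of two and skips the leaf/node domain separation that production designs add; the function names are hypothetical.

```ts
import { createHash } from 'node:crypto';

const sha256 = (data: Buffer): Buffer => createHash('sha256').update(data).digest();
const hashPair = (l: Buffer, r: Buffer): Buffer => sha256(Buffer.concat([l, r]));

// Build all levels bottom-up; assumes the leaf count is a power of two.
function buildLevels(leaves: string[]): Buffer[][] {
  let level = leaves.map((d) => sha256(Buffer.from(d)));
  const levels = [level];
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) next.push(hashPair(level[i], level[i + 1]));
    levels.push(next);
    level = next;
  }
  return levels;
}

// Proof for leaf `index`: the sibling hash at each level, bottom-up.
function prove(levels: Buffer[][], index: number): Buffer[] {
  const proof: Buffer[] = [];
  for (let l = 0; l < levels.length - 1; l++) {
    proof.push(levels[l][index ^ 1]);
    index >>= 1;
  }
  return proof;
}

// Recompute the path from the leaf to the root using only the proof.
function verify(leaf: string, index: number, proof: Buffer[], root: Buffer): boolean {
  let h = sha256(Buffer.from(leaf));
  for (const sibling of proof) {
    h = index & 1 ? hashPair(sibling, h) : hashPair(h, sibling);
    index >>= 1;
  }
  return h.equals(root);
}

const levels = buildLevels(['d1', 'd2', 'd3', 'd4']);
const root = levels[levels.length - 1][0];
const proof = prove(levels, 1); // proof for d2: [h1, hash B]
console.log(proof.length, verify('d2', 1, proof, root)); // 2 true
console.log(verify('dX', 1, proof, root));               // false
```

The verifier needs only the leaf, its index, the two sibling hashes, and the root; it never sees d1, d3, or d4.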

### Bloom filter (probabilistic)
```ts
import xxhash from 'xxhash-wasm';

const xh = await xxhash();
const M = 1024; // bytes in the bit array (M * 8 bits); example value
const K = 4;    // hash functions per key; example value
const bf = new Uint8Array(M);

function add(key: string) {
  for (let i = 0; i < K; i++) {
    const idx = xh.h32(key, i) % (M * 8); // seed i gives the i-th hash function
    bf[idx >> 3] |= (1 << (idx & 7));
  }
}

function maybe(key: string): boolean {
  for (let i = 0; i < K; i++) {
    const idx = xh.h32(key, i) % (M * 8);
    if (!(bf[idx >> 3] & (1 << (idx & 7)))) return false; // definitely absent
  }
  return true; // probably present
}
```

→ [[CS_Bloom_Filter]].
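
The filter code leaves open how to pick `M` and `K`. A sketch of the standard sizing formulas follows; `bloomParams` is a hypothetical helper, and its `bytes` result corresponds to `M` while `hashes` corresponds to `K`.

```ts
// Standard sizing formulas: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
function bloomParams(expectedItems: number, falsePositiveRate: number) {
  const bits = Math.ceil((-expectedItems * Math.log(falsePositiveRate)) / (Math.LN2 * Math.LN2));
  const hashes = Math.max(1, Math.round((bits / expectedItems) * Math.LN2));
  return { bits, bytes: Math.ceil(bits / 8), hashes };
}

const sizing = bloomParams(1_000_000, 0.01);
console.log(sizing);
```

For one million items at a 1% false-positive rate this comes out to roughly 9.6 million bits (about 1.2 MB) and 7 hash functions.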
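
### Hash collision
```
Cryptographic (SHA-256): ~2^128 attempts for a collision (birthday bound). Does not happen in practice.

Non-crypto (xxHash 64-bit): on the order of 2^32 items (about 5 billion for a 50% chance, birthday paradox).
- 100 K items: effectively never.
- 1 B items: plausible (roughly 3%).
Non-crypto hashes also give no protection against deliberately crafted collisions.

→ Correctness-critical = SHA-256. Small key sets = xxHash OK.
```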
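
The figures above follow from the birthday approximation, sketched here for checking other sizes. It assumes hash outputs are uniformly random, which is an idealization for non-crypto hashes; `collisionProbability` is a hypothetical helper name.

```ts
// Probability of at least one collision among n random b-bit hashes:
// p ≈ 1 - exp(-n^2 / 2^(b+1)).
function collisionProbability(items: number, bits: number): number {
  const x = (items * items) / Math.pow(2, bits + 1);
  return -Math.expm1(-x); // expm1 keeps precision when x is tiny
}

for (const n of [1e5, 1e9, 5e9]) {
  console.log(n, collisionProbability(n, 64).toExponential(2));
}
```

With a 64-bit hash, 100 K items give a probability below one in a billion, 1 B items give a few percent, and about 5 B items reach roughly even odds.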

### Comparison table
```
Algorithm    Speed      Security  Use case
MD5          Fast       Broken    Legacy checksum
SHA-1        Fast       Broken    Git (legacy)
SHA-256      Medium     Strong    Default crypto
SHA-3        Medium     Strong    New crypto
BLAKE3       Fast       Strong    Modern crypto
xxHash       Very fast  None      Cache, shard
MurmurHash   Very fast  None      Cache, shard
FNV          Very fast  None      Cache (small keys)
HMAC-SHA256  Medium     Strong    Sign / verify
Argon2id     Slow       Strong    Password
bcrypt       Slow       Strong    Password
scrypt       Slow       Strong    Password (memory-hard)
```

### Performance (rough, hardware-dependent)
```
SHA-256:    ~500 MB/s (1 thread, software; much higher with SHA extensions)
SHA-3:      ~400 MB/s
BLAKE3:     ~3 GB/s (multi-thread)
xxHash:     5-10 GB/s
MurmurHash: ~5 GB/s

Argon2id:       ~100 ms / verify (intentionally)
bcrypt cost 12: ~250 ms
```

### Hash + ID
```ts
import { createHash } from 'node:crypto';

// Content-addressable storage (`s3` is an app-specific client)
const id = createHash('sha256').update(content).digest('hex');
await s3.put(`/objects/${id}`, content);
// Same content = same id (dedup).
```

### Snowflake / UUID + hash (composite)
```
Snowflake: time + machine + seq.
UUID v7:   time + random.

The ID itself is not a hash.

But:
  hash(snowflake_id) → stable, evenly distributed shard key.
```

### Hash-based deduplication
```ts
import { createHash } from 'node:crypto';

const sha256 = (data: Buffer): string =>
  createHash('sha256').update(data).digest('hex');

// File dedup (`db.files` is an app-specific store)
async function dedupe(file: Buffer): Promise<string> {
  const hash = sha256(file);
  if (await db.files.exists(hash)) return hash; // already stored
  await db.files.put(hash, file);
  return hash;
}
```

→ Same file = 1 copy.
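
The same idea can be tried without a database. The sketch below uses an in-memory `Map` as the store; `ContentStore` is a hypothetical class standing in for whatever blob storage the app uses.

```ts
import { createHash } from 'node:crypto';

class ContentStore {
  private readonly blobs = new Map<string, Buffer>();

  // Returns the content ID; storing the same bytes twice keeps a single copy.
  put(content: Buffer): string {
    const id = createHash('sha256').update(content).digest('hex');
    if (!this.blobs.has(id)) this.blobs.set(id, content);
    return id;
  }

  get(id: string): Buffer | undefined {
    return this.blobs.get(id);
  }

  get size(): number {
    return this.blobs.size;
  }
}

const store = new ContentStore();
const id1 = store.put(Buffer.from('hello world'));
const id2 = store.put(Buffer.from('hello world'));
const id3 = store.put(Buffer.from('hello world!'));
console.log(id1 === id2, id1 === id3, store.size); // true false 2
```

Because the ID is derived from the bytes, a reader can also re-hash what `get` returns to confirm the blob was not corrupted.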

### Ethereum-style hash
```
Keccak-256: the original Keccak submission. Ethereum adopted it before SHA-3 was finalized;
the final standard (FIPS 202) changed the padding, so Keccak-256 and SHA3-256 give different outputs.
```

### Common mistakes
```
- MD5 for passwords: broken.
- SHA-256 for passwords: too fast (brute force).
- Storing passwords in plain text: never.
- No salt: rainbow tables.
- One hash function for every use case: wrong tool.
- Comparing signatures without timingSafeEqual: timing attack.
```

### When to use what
```
DB password column: Argon2id hash.
Session ID:         cryptographically random (not a hash).
File integrity:     SHA-256.
Git-like CAS:       BLAKE3 (modern) / SHA-256.
Cache key:          xxHash.
Webhook signature:  HMAC-SHA256.
JWT signing:        HMAC-SHA256 (HS256) or RS256.
URL-safe ID:        base64url(random) or NanoID.
```
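
For the two entries that are random values rather than hashes (session ID and URL-safe ID), a minimal sketch with Node's CSPRNG looks like this. The helper name `randomId` and the byte counts are assumptions chosen for illustration.

```ts
import { randomBytes } from 'node:crypto';

// 16 random bytes = 128 bits of entropy, encoded as 22 URL-safe characters.
function randomId(bytes = 16): string {
  return randomBytes(bytes).toString('base64url');
}

const sessionId = randomId(32); // 256-bit session token
const urlId = randomId();       // 128-bit URL-safe ID
console.log(sessionId.length, urlId.length); // 43 22
```

Nothing is hashed here: unpredictability comes from the random source, so there is no input an attacker could guess and re-hash.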

### Library
```ts
// Node built-in
import { createHash, createHmac, randomBytes } from 'node:crypto';

// Modern
import { hash as blake3 } from 'blake3';
import argon2 from 'argon2';
import xxhash from 'xxhash-wasm';

// Web Crypto (browser + edge)
const buffer = await crypto.subtle.digest('SHA-256', new TextEncoder().encode('hello'));
const hex = Array.from(new Uint8Array(buffer)).map(b => b.toString(16).padStart(2, '0')).join('');
```

### Web Crypto (edge / browser)
```ts
async function sha256(text: string): Promise<string> {
  const buf = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(text));
  return Array.from(new Uint8Array(buf))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}
```

→ Works on Cloudflare Workers / Deno / Bun.

## 🤔 Decision Criteria
| Use | Recommendation |
|---|---|
| Password | Argon2id |
| File integrity | SHA-256 / BLAKE3 |
| Cache key | xxHash |
| Webhook sig | HMAC-SHA256 |
| Random ID | randomBytes (not hash) |
| Sharding | xxHash + consistent hashing |
| Git-like | SHA-256 |
| Tamper-evident | Merkle + SHA-256 |

## ❌ Anti-patterns
- **SHA-256 for passwords**: too fast, so brute force is cheap.
- **MD5 in production**: broken.
- **No salt**: rainbow tables.
- **Signature comparison without timingSafeEqual**: timing attack.
- **Hash as the only ID**: collision risk (xxHash at large scale).
- **Expensive hash for non-security work**: needless latency.
- **Assuming `node:crypto` on edge runtimes**: some expose only Web Crypto, so the code fails at runtime.

## 🤖 LLM Usage Hints
- Pick the hash that fits the use case.
- Argon2id = password.
- SHA-256 = secure default.
- xxHash = speed.
- timingSafeEqual = compare.

## 🔗 Related Documents
- [[CS_Bloom_Filter]]
- [[CS_Consistent_Hashing]]
- [[Security_OWASP_Top_10_Practical]]