id: cs-hashing-strategies
title: Hashing Strategies: MD5 / SHA / xxHash / Argon2
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: cs, hashing, vibe-coding
tech_stack: language = TS, applicable_to = Backend
applied_in:
aliases: hash, MD5, SHA-256, xxHash, Argon2, password hash, content addressing

Hashing Strategies

Different use cases call for different hashes: cryptographic (SHA-256, BLAKE3), fast non-cryptographic (xxHash, MurmurHash), and password hashing (Argon2, bcrypt). Picking the wrong one breaks either security or performance.

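📖 Core concepts

  • Cryptographic: collision- and preimage-resistant; slower than non-cryptographic hashes.
  • Fast non-crypto: optimized for speed, with no security guarantees.
  • Password: deliberately slow, to blunt brute force.
  • Content-addressed: the hash of the data is its ID (Git, IPFS).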

💻 Code patterns

Recommendations by use case

Password hash:        Argon2id, bcrypt, scrypt
Content address:       SHA-256, BLAKE3
Tamper detection:      SHA-256, HMAC
Cache key / sharding:  xxHash, MurmurHash
File integrity:        SHA-256, BLAKE3
HMAC (signing):        HMAC-SHA-256
ID generation:         UUID, Snowflake

Cryptographic hash (slow, secure)

import { createHash } from 'node:crypto';

const hash = createHash('sha256').update('hello').digest('hex');
// 'sha256' / 'sha512' / 'sha3-256' / 'blake2b512'

// File hash
import { createReadStream } from 'node:fs';

async function hashFile(path: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const hash = createHash('sha256');
    const stream = createReadStream(path);
    stream.on('data', (chunk) => hash.update(chunk));
    stream.on('end', () => resolve(hash.digest('hex')));
    stream.on('error', reject);
  });
}
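
As a quick check on the streaming approach above, feeding data to the hash chunk by chunk yields the same digest as hashing it in one call. This is a minimal sketch using only `node:crypto`; `sha256Chunks` is an illustrative helper name, not part of any library.

```typescript
import { createHash } from 'node:crypto';

// Hash a sequence of chunks incrementally, the way a file stream would feed them.
function sha256Chunks(chunks: string[]): string {
  const hash = createHash('sha256');
  for (const chunk of chunks) hash.update(chunk);
  return hash.digest('hex');
}

const oneShot = createHash('sha256').update('hello').digest('hex');
const streamed = sha256Chunks(['he', 'll', 'o']);
console.log(oneShot === streamed); // true: chunk boundaries do not affect the digest
```

This is why `hashFile` can process arbitrarily large files with constant memory.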

BLAKE3 (modern, faster than SHA-256)

yarn add blake3
import { hash } from 'blake3';
const result = hash('hello').toString('hex');

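→ Several times faster than SHA-256 in software (the exact ratio depends on the CPU; hardware SHA extensions narrow the gap). Comparable security level.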

xxHash (very fast, non-crypto)

yarn add xxhash-wasm
import xxhash from 'xxhash-wasm';

const { h64ToString, h32 } = await xxhash();
const hash = h64ToString('hello');  // 16-character hex string

// or as a 32-bit number
const num = h32('hello');

→ 10 GB/s and up. Good for cache keys, sharding, and non-security checksums.

import murmurhash from 'murmurhash';
const hash = murmurhash.v3('hello');  // 32-bit number

→ Used by Cassandra (Murmur3Partitioner). Java's HashMap does not use it; it relies on each key's hashCode().

Password hashing (Argon2)

yarn add argon2
import argon2 from 'argon2';

const hash = await argon2.hash('password', {
  type: argon2.argon2id,
  memoryCost: 65536,  // in KiB, so 64 MiB
  timeCost: 3,
  parallelism: 4,
});
// '$argon2id$v=19$m=65536,t=3,p=4$...'

const valid = await argon2.verify(hash, 'password');

→ Memory-hard, which makes GPU brute force expensive.

bcrypt (legacy but OK)

import bcrypt from 'bcrypt';

const hash = await bcrypt.hash('password', 12);  // cost 12
const valid = await bcrypt.compare('password', hash);

→ Around since 1999 and stable. Not memory-hard, so weaker than Argon2 against GPU attacks. For new systems, use Argon2.

Password hash cost

Argon2id (defaults):
- 64 MiB memory
- 3 iterations
- roughly 100 ms per verify (hardware-dependent)

→ 100 ms on every login is acceptable.
   Brute force becomes very slow.

HMAC (signed message)

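import { createHmac, timingSafeEqual } from 'node:crypto';

// `secret` and `message` come from your config and the incoming request
const sig = createHmac('sha256', secret).update(message).digest('hex');

// Verify
function verify(msg: string, sig: string, secret: string): boolean {
  const expected = createHmac('sha256', secret).update(msg).digest();
  const given = Buffer.from(sig, 'hex');
  // timingSafeEqual throws on a length mismatch, so check the length first
  return given.length === expected.length && timingSafeEqual(given, expected);
}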

→ Webhook signature, JWT, API auth.

See Backend_Webhook_Patterns.

Content-addressed (Git, IPFS)

// Git: SHA-1 (legacy → SHA-256 future)
const blobHash = createHash('sha1').update('blob 11\0hello world').digest('hex');

// IPFS: several hash functions supported (default = SHA-256)
import { CID } from 'multiformats/cid';
import { sha256 } from 'multiformats/hashes/sha2';

const hash = await sha256.digest(new TextEncoder().encode('hello'));
const cid = CID.create(1, 0x55, hash);  // 0x55 = raw codec

→ Same content = same hash. Dedup.
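
The Git line above hard-codes the length `11`. A slightly more general sketch computes the header from the content's byte length; `gitBlobId` is an illustrative name, and the code assumes only `node:crypto`.

```typescript
import { createHash } from 'node:crypto';

// Object ID Git assigns to a blob: SHA-1 over "blob <byteLength>\0" followed by the content.
function gitBlobId(content: string | Buffer): string {
  const body = Buffer.isBuffer(content) ? content : Buffer.from(content, 'utf8');
  const header = Buffer.from(`blob ${body.length}\0`, 'utf8');
  return createHash('sha1').update(header).update(body).digest('hex');
}

console.log(gitBlobId(''));            // e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (Git's empty blob)
console.log(gitBlobId('hello world')); // matches the hand-built 'blob 11\0hello world' hash
```

The length is in bytes, not characters, which is why the content is converted to a Buffer before measuring.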

Hash for cache key

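// Long string / object → short cache key
const { h64ToString } = await xxhash();  // xxhash() is an async factory; initialise once

function cacheKey(req: { url: string; body: unknown }): string {
  const key = JSON.stringify({ url: req.url, body: req.body });  // body = already-parsed payload
  return h64ToString(key);  // 16 hex chars
}

await redis.set(`cache:${cacheKey(req)}`, response);

→ xxHash is fast. SHA is overkill here.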

Hash for sharding (modulo)

const { h32 } = await xxhash();  // initialise once

function shardKey(userId: string, numShards: number): number {
  return h32(userId) % numShards;
}

See CS_Consistent_Hashing for the better option: resharding then moves only a small share of keys.
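
To make the resharding difference concrete, here is a rough sketch comparing modulo sharding with a small consistent-hash ring when a fifth shard is added. It is only an illustration under stated assumptions: `hash32` uses the first 4 bytes of SHA-256 as a dependency-free stand-in for xxHash, and `HashRing`, the shard names, and the 100 virtual nodes per shard are hypothetical choices.

```typescript
import { createHash } from 'node:crypto';

// Stand-in 32-bit hash (assumption: xxHash would fill this role in production).
function hash32(key: string): number {
  return createHash('sha256').update(key).digest().readUInt32BE(0);
}

// Modulo sharding: simple, but changing numShards remaps most keys.
function moduloShard(key: string, numShards: number): number {
  return hash32(key) % numShards;
}

// Consistent hashing: each shard owns many points on a ring; a key belongs to
// the first point at or after its own hash, wrapping around at the end.
class HashRing {
  private points: { at: number; shard: string }[] = [];
  private vnodes: number;

  constructor(shards: string[], vnodes = 100) {
    this.vnodes = vnodes;
    for (const s of shards) this.add(s);
  }

  add(shard: string): void {
    for (let i = 0; i < this.vnodes; i++) {
      this.points.push({ at: hash32(`${shard}#${i}`), shard });
    }
    this.points.sort((a, b) => a.at - b.at);
  }

  lookup(key: string): string {
    const h = hash32(key);
    let lo = 0;
    let hi = this.points.length; // binary search for the first point >= h
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].at < h) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].shard;
  }
}

const keys = Array.from({ length: 2000 }, (_, i) => `user-${i}`);
const ring = new HashRing(['s0', 's1', 's2', 's3']);
const before = keys.map((k) => ring.lookup(k));
ring.add('s4');
const after = keys.map((k) => ring.lookup(k));

const movedRing = keys.filter((_, i) => before[i] !== after[i]).length;
const movedModulo = keys.filter((k) => moduloShard(k, 4) !== moduloShard(k, 5)).length;
console.log({ movedRing, movedModulo }); // the ring moves far fewer keys than modulo
```

On the ring, every key that moves goes to the new shard; with modulo, keys are reshuffled among all shards.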

Hash table (HashMap)

JS Map and plain objects are already hash tables.
The hash function is internal to V8.

→ No need to implement your own.

MD5 (deprecated for security)

MD5: collision found (2004).
SHA-1: collision found (2017).

Use:
- Non-security checksum: MD5 / SHA-1 OK
- Security: SHA-256 / SHA-3 / BLAKE3

SHA-1 vs SHA-256 vs SHA-3

SHA-1: deprecated (security)
SHA-256: the standard default
SHA-512: 64-bit native (often faster than SHA-256 on 64-bit CPUs without SHA hardware extensions)
SHA-3: Keccak (different family)
BLAKE3: fastest of these in software

Salt (password)

// ❌ Same password → same hash
hash('password')

// ✅ Salt
hash(salt + password)
// The salt is unique per user.
// Argon2 and bcrypt generate and store the salt automatically.
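
For a dependency-free illustration of salting, the sketch below uses scrypt from `node:crypto` (one of the password hashes listed in the recommendations). The `salt:key` storage format and the helper names `hashPassword` / `verifyPassword` are assumptions made for this example, not a standard encoding.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from 'node:crypto';

// Store salt and derived key together as "salt:key" in hex (illustrative format).
function hashPassword(password: string): string {
  const salt = randomBytes(16);               // fresh random salt for every hash
  const key = scryptSync(password, salt, 32); // deliberately slow, memory-hard derivation
  return `${salt.toString('hex')}:${key.toString('hex')}`;
}

function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, keyHex] = stored.split(':');
  const expected = Buffer.from(keyHex, 'hex');
  const actual = scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return timingSafeEqual(actual, expected);   // equal lengths by construction
}

const a = hashPassword('password');
const b = hashPassword('password');
console.log(a === b);                       // false: different salts give different hashes
console.log(verifyPassword('password', a)); // true
console.log(verifyPassword('wrong', a));    // false
```

Because each stored value carries its own salt, a precomputed rainbow table is useless and identical passwords are not visible as identical rows.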

Pepper

const pepper = process.env.PEPPER!;  // server-side secret
const hash = await argon2.hash(password + pepper, ...);

→ The salt is stored in the DB with the hash; the pepper lives outside it (env var). Extra protection if the DB leaks.

Timing attack

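// ❌ Early-exit string comparison leaks timing
if (sig === expected) ...

// ✅ Constant-time comparison (throws if the buffers differ in length, so check first)
import { timingSafeEqual } from 'node:crypto';
const a = Buffer.from(sig), b = Buffer.from(expected);
if (a.length === b.length && timingSafeEqual(a, b)) ...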
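
`timingSafeEqual` throws when its two inputs differ in length. One common workaround, sketched here as an assumption rather than a prescribed method, is to HMAC both values with a per-process random key so the comparison always runs over two 32-byte digests. `safeEqual` and `compareKey` are illustrative names.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from 'node:crypto';

// Per-process random key, so the digests being compared are not predictable.
const compareKey = randomBytes(32);

// Compare two strings in constant time regardless of their lengths:
// reduce both to fixed-size HMAC digests, then compare the digests.
function safeEqual(a: string, b: string): boolean {
  const da = createHmac('sha256', compareKey).update(a).digest();
  const db = createHmac('sha256', compareKey).update(b).digest();
  return timingSafeEqual(da, db);
}

console.log(safeEqual('abc123', 'abc123'));  // true
console.log(safeEqual('abc123', 'abc1234')); // false, and no exception on length mismatch
```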

Password upgrade (rehash)

async function login(email: string, password: string) {
  const user = await db.users.findByEmail(email);

  // Same error for an unknown user and a wrong password
  if (!user || !(await argon2.verify(user.passwordHash, password))) {
    throw new Error('Invalid');
  }

  // Upgrade the hash if it was created with outdated cost parameters
  if (argon2.needsRehash(user.passwordHash, { ...currentParams })) {
    const newHash = await argon2.hash(password, currentParams);
    await db.users.update(user.id, { passwordHash: newHash });
  }

  return createSession(user);
}

→ Raise the cost parameters over time; existing hashes are upgraded at each user's next login.
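
The rehash check works because an encoded Argon2 hash carries its own parameters (the `$argon2id$v=19$m=65536,t=3,p=4$...` prefix shown earlier). The sketch below parses that prefix by hand to show the idea; `parseArgon2Params` and `isOutdated` are hypothetical helpers written for this note, not the library's implementation, and the sample salt and hash segments are placeholders.

```typescript
interface Argon2Params {
  type: string;
  version: number;
  memoryCost: number; // KiB
  timeCost: number;
  parallelism: number;
}

// Parse the parameter section of "$argon2id$v=19$m=65536,t=3,p=4$<salt>$<hash>".
function parseArgon2Params(encoded: string): Argon2Params | null {
  const m = /^\$(argon2(?:id|i|d))\$v=(\d+)\$m=(\d+),t=(\d+),p=(\d+)\$/.exec(encoded);
  if (!m) return null;
  return {
    type: m[1],
    version: Number(m[2]),
    memoryCost: Number(m[3]),
    timeCost: Number(m[4]),
    parallelism: Number(m[5]),
  };
}

// Treat a hash as outdated if it is unparseable, uses another variant,
// or was created with a lower version, memory cost, or time cost than the target.
function isOutdated(encoded: string, target: Argon2Params): boolean {
  const p = parseArgon2Params(encoded);
  if (!p || p.type !== target.type || p.version < target.version) return true;
  return p.memoryCost < target.memoryCost || p.timeCost < target.timeCost;
}

const sample = '$argon2id$v=19$m=65536,t=3,p=4$c2FsdA$aGFzaA'; // placeholder salt/hash
const current: Argon2Params = { type: 'argon2id', version: 19, memoryCost: 65536, timeCost: 3, parallelism: 4 };
console.log(isOutdated(sample, current));                         // false
console.log(isOutdated(sample, { ...current, memoryCost: 131072 })); // true
```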

Hash chain (blockchain)

// Block hash:
hash(prev_block_hash + transaction_data)

// Tampering with one block invalidates every block after it.
// Bitcoin / Ethereum.
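
A minimal sketch of that chaining rule, using SHA-256 from `node:crypto`. The `Block` shape, the all-zero `GENESIS_PREV` placeholder, and the function names are assumptions for illustration; real blockchains hash a structured block header rather than a simple string concatenation.

```typescript
import { createHash } from 'node:crypto';

interface Block {
  prevHash: string;
  data: string;
  hash: string;
}

const sha256Hex = (s: string): string => createHash('sha256').update(s).digest('hex');
const GENESIS_PREV = '0'.repeat(64); // placeholder meaning "no previous block"

// Each block's hash covers the previous block's hash plus its own data.
function buildChain(items: string[]): Block[] {
  const chain: Block[] = [];
  let prevHash = GENESIS_PREV;
  for (const data of items) {
    const hash = sha256Hex(prevHash + data);
    chain.push({ prevHash, data, hash });
    prevHash = hash;
  }
  return chain;
}

// Recompute every link; returns the index of the first broken block, or -1 if intact.
function firstInvalid(chain: Block[]): number {
  let prevHash = GENESIS_PREV;
  for (let i = 0; i < chain.length; i++) {
    const b = chain[i];
    if (b.prevHash !== prevHash || b.hash !== sha256Hex(b.prevHash + b.data)) return i;
    prevHash = b.hash;
  }
  return -1;
}

const chain = buildChain(['tx-a', 'tx-b', 'tx-c']);
console.log(firstInvalid(chain)); // -1
chain[1].data = 'tx-b (edited)';
console.log(firstInvalid(chain)); // 1
```

If the attacker also recomputes the edited block's hash, the mismatch simply moves to the next block's `prevHash`, so every later block would have to be rewritten.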

Merkle tree

[hash root]
  /         \
[hash A]   [hash B]
  /  \      /   \
[h1] [h2] [h3] [h4]
 |    |    |    |
[d1] [d2] [d3] [d4]

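→ To verify d2: compute h2 = hash(d2), combine it with the sibling h1 to get hash A, then combine hash A with hash B to get the root. The proof is O(log N) hashes.

→ Used by Git, IPFS, and blockchains.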
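
The four-leaf tree in the diagram can be sketched as follows. To keep it short, the code assumes the number of leaves is a power of two; `buildLevels`, `prove`, and `verify` are illustrative names, and real systems differ in details such as leaf/node prefixes and odd-leaf handling.

```typescript
import { createHash } from 'node:crypto';

const h = (data: Buffer | string): Buffer => createHash('sha256').update(data).digest();
const pair = (l: Buffer, r: Buffer): Buffer => h(Buffer.concat([l, r]));

interface ProofStep {
  sibling: Buffer;
  siblingOnLeft: boolean;
}

// Build all levels bottom-up (assumes leaves.length is a power of two).
function buildLevels(leaves: string[]): Buffer[][] {
  const levels: Buffer[][] = [leaves.map((d) => h(d))];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next: Buffer[] = [];
    for (let i = 0; i < prev.length; i += 2) next.push(pair(prev[i], prev[i + 1]));
    levels.push(next);
  }
  return levels;
}

// Collect the sibling hash at each level on the path from leaf `index` to the root.
function prove(levels: Buffer[][], index: number): ProofStep[] {
  const proof: ProofStep[] = [];
  let i = index;
  for (let lvl = 0; lvl < levels.length - 1; lvl++) {
    const isRight = i % 2 === 1;
    proof.push({ sibling: levels[lvl][isRight ? i - 1 : i + 1], siblingOnLeft: isRight });
    i = Math.floor(i / 2);
  }
  return proof;
}

// Fold the leaf hash with each sibling; the result must equal the known root.
function verify(leaf: string, proof: ProofStep[], root: Buffer): boolean {
  let acc = h(leaf);
  for (const { sibling, siblingOnLeft } of proof) {
    acc = siblingOnLeft ? pair(sibling, acc) : pair(acc, sibling);
  }
  return acc.equals(root);
}

const levels = buildLevels(['d1', 'd2', 'd3', 'd4']);
const root = levels[levels.length - 1][0];
const proofD2 = prove(levels, 1);         // [h1 on the left, hash B on the right]
console.log(verify('d2', proofD2, root)); // true
console.log(verify('dX', proofD2, root)); // false
```

The verifier needs only the leaf, two sibling hashes, and the root, never d1, d3, or d4 themselves.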

Bloom filter (probabilistic)

import xxhash from 'xxhash-wasm';

const xh = await xxhash();
const bf = new Uint8Array(M);  // M bytes = M * 8 bits

function add(key: string) {
  for (let i = 0; i < K; i++) {
    const idx = xh.h32(key + i) % (M * 8);
    bf[idx >> 3] |= (1 << (idx & 7));
  }
}

function maybe(key: string): boolean {
  for (let i = 0; i < K; i++) {
    const idx = xh.h32(key + i) % (M * 8);
    if (!(bf[idx >> 3] & (1 << (idx & 7)))) return false;
  }
  return true;  // probably
}

See CS_Bloom_Filter.
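
The snippet above leaves `M` and `K` undefined. One way to pick them is the standard Bloom filter sizing formulas, m = -n·ln(p) / (ln 2)² bits and k = (m/n)·ln 2 hash functions, sketched below. `bloomSize` is an illustrative helper; its `bytes` result corresponds to `M` and `hashes` to `K`.

```typescript
// Size a Bloom filter for an expected item count and target false-positive rate.
function bloomSize(
  expectedItems: number,
  falsePositiveRate: number,
): { bits: number; bytes: number; hashes: number } {
  const bits = Math.ceil((-expectedItems * Math.log(falsePositiveRate)) / (Math.LN2 * Math.LN2));
  const hashes = Math.max(1, Math.round((bits / expectedItems) * Math.LN2));
  return { bits, bytes: Math.ceil(bits / 8), hashes };
}

const sizing = bloomSize(1_000_000, 0.01);
console.log(sizing); // roughly 9.6 million bits (about 1.2 MB) and 7 hash functions
```

Halving the false-positive rate costs only about 1.44 extra bits per item, which is why Bloom filters stay compact even at low error rates.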

Hash collision

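Cryptographic (SHA-256): about 2^128 hashes are needed for a 50% collision chance (birthday bound). Will not happen in practice.

Non-crypto (xxHash 64): about 2^32 (~4 billion) hashes for a roughly 50% chance (birthday paradox). This covers accidental collisions only; a non-crypto hash offers no protection against deliberately crafted ones.
- 100 K items: effectively never.
- 1 B items: possible (a few percent).

→ If a collision would be critical, use SHA-256. For small key sets, xxHash is fine.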
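
These figures follow from the birthday approximation P ≈ 1 − exp(−n² / 2^(b+1)) for n items and a b-bit hash. A small sketch for checking other sizes; `collisionProbability` is an illustrative name, and the formula assumes a uniformly distributed hash.

```typescript
// Birthday approximation: P(collision) ≈ 1 - exp(-n^2 / 2^(bits + 1)).
function collisionProbability(items: number, hashBits: number): number {
  const exponent = (items * items) / Math.pow(2, hashBits + 1);
  return -Math.expm1(-exponent); // 1 - e^(-x), accurate even for tiny x
}

for (const n of [1e5, 1e9, 5e9]) {
  console.log(n, collisionProbability(n, 64).toExponential(2));
}
```

For a 64-bit hash this gives a negligible chance at 100 K items, a few percent at 1 B items, and about one in two near 5 B items.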

Comparison table

Algorithm       Speed       Security   Use case
MD5             Fast        Broken     Legacy checksum
SHA-1           Fast        Broken     Git (legacy)
SHA-256         Medium      Strong     Default crypto
SHA-3           Medium      Strong     New crypto
BLAKE3          Fast        Strong     Modern crypto
xxHash          Very fast   None       Cache, shard
MurmurHash      Very fast   None       Cache, shard
FNV             Very fast   None       Cache (small keys)
HMAC-SHA256     Medium      Strong     Sign / verify
Argon2id        Slow        Strong     Password (memory-hard)
bcrypt          Slow        Strong     Password
scrypt          Slow        Strong     Password (memory-hard)

Performance (approximate, hardware-dependent)

SHA-256:        500 MB/s (1 thread)
SHA-3:          400 MB/s
BLAKE3:         3 GB/s (multi-thread)
xxHash:         5-10 GB/s
MurmurHash:     5 GB/s

Argon2id:       ~100ms / verify (intentionally)
bcrypt cost 12: ~250ms

Hash + ID

// Content-addressable storage
const id = createHash('sha256').update(content).digest('hex');
await s3.put(`/objects/${id}`, content);
// Same content = same id (dedup).

Snowflake / UUID + hash (composite)

Snowflake: time + machine + seq.
UUID v7: time + random.

The ID itself is not a hash.

But:
hash(snowflake_id) → consistent shard key.

Hash-based deduplication

// File dedup
async function dedupe(file: Buffer) {
  const hash = createHash('sha256').update(file).digest('hex');
  if (await db.files.exists(hash)) return hash;  // already stored
  await db.files.put(hash, file);
  return hash;
}

→ Same file = 1 copy.
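
The same idea can be tried without a database or S3. Below is a minimal in-memory sketch; `ContentStore` is a hypothetical stand-in for the `db.files` / `s3` calls above, keyed by the SHA-256 of the content.

```typescript
import { createHash } from 'node:crypto';

// In-memory content-addressed store: the key is the SHA-256 of the bytes.
class ContentStore {
  private objects = new Map<string, Buffer>();

  // Returns the content's ID; stores the bytes only if they are new.
  put(content: Buffer): string {
    const id = createHash('sha256').update(content).digest('hex');
    if (!this.objects.has(id)) this.objects.set(id, content);
    return id;
  }

  get(id: string): Buffer | undefined {
    return this.objects.get(id);
  }

  get size(): number {
    return this.objects.size;
  }
}

const store = new ContentStore();
const id1 = store.put(Buffer.from('same bytes'));
const id2 = store.put(Buffer.from('same bytes'));
console.log(id1 === id2, store.size); // true 1
```

Since the ID is derived from the content, a reader can also re-hash what `get` returns to detect corruption.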

Ethereum-style hash

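Keccak-256: the original Keccak design, adopted by Ethereum before NIST finalized SHA-3. It differs from standardized SHA3-256 in its padding, so the two produce different digests for the same input.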

Common mistakes

- MD5 for passwords: broken.
- SHA-256 for passwords: too fast, so brute force is cheap.
- Storing passwords in plain text: never.
- No salt: exposed to rainbow tables.
- One hash function for every use case: wrong tool for the job.
- Comparing signatures without timingSafeEqual: timing attack.

When to use what

DB password column:        Argon2id hash.
Session ID:                 cryptographically random (not hash).
File integrity:            SHA-256.
Git-like CAS:               BLAKE3 (modern) / SHA-256.
Cache key:                  xxHash.
Webhook signature:          HMAC-SHA256.
JWT signing:                HMAC-SHA256 or RS256.
URL-safe ID:                base64url(random) or NanoID.

Library

// Node built-in
import { createHash, createHmac, randomBytes } from 'node:crypto';

// Modern
import { hash as blake3 } from 'blake3';
import argon2 from 'argon2';
import xxhash from 'xxhash-wasm';

// Web Crypto (browser + edge)
const buffer = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(text));
const hex = Array.from(new Uint8Array(buffer)).map(b => b.toString(16).padStart(2, '0')).join('');

Web Crypto (edge / browser)

async function sha256(text: string): Promise<string> {
  const buf = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(text));
  return Array.from(new Uint8Array(buf))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

→ Runs on Cloudflare Workers, Deno, and Bun.
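
Web Crypto also covers HMAC, which matters for verifying webhook signatures on runtimes where `node:crypto` may not be available. The following is a sketch under that assumption, using the global `crypto.subtle`; `importHmacKey`, `signHex`, and `verifyHex` are illustrative names.

```typescript
// HMAC-SHA-256 sign/verify using only Web Crypto (no node:crypto import).
const enc = new TextEncoder();

async function importHmacKey(secret: string) {
  return crypto.subtle.importKey(
    'raw',
    enc.encode(secret),
    { name: 'HMAC', hash: 'SHA-256' },
    false,
    ['sign', 'verify'],
  );
}

async function signHex(secret: string, message: string): Promise<string> {
  const key = await importHmacKey(secret);
  const sig = await crypto.subtle.sign('HMAC', key, enc.encode(message));
  return Array.from(new Uint8Array(sig))
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}

async function verifyHex(secret: string, message: string, sigHex: string): Promise<boolean> {
  if (!/^([0-9a-f]{2})+$/i.test(sigHex)) return false; // reject malformed hex early
  const sig = Uint8Array.from(sigHex.match(/../g)!, (byte) => parseInt(byte, 16));
  const key = await importHmacKey(secret);
  // The signature comparison happens inside the runtime's verify().
  return crypto.subtle.verify('HMAC', key, sig, enc.encode(message));
}

signHex('secret', 'payload').then(async (sig) => {
  console.log(sig.length);                                   // 64 hex characters
  console.log(await verifyHex('secret', 'payload', sig));    // true
  console.log(await verifyHex('secret', 'tampered', sig));   // false
});
```

Using `subtle.verify` instead of comparing hex strings by hand avoids writing a separate constant-time comparison.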

🤔 Decision criteria

Use case         Recommendation
Password         Argon2id
File integrity   SHA-256 / BLAKE3
Cache key        xxHash
Webhook sig      HMAC-SHA256
Random ID        randomBytes (not a hash)
Sharding         xxHash + consistent hashing
Git-like         SHA-256
Tamper-evident   Merkle + SHA-256

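Anti-patterns

  • SHA-256 for passwords: open to brute force.
  • MD5 in production: broken.
  • No salt: exposed to rainbow tables.
  • Comparing signatures without timingSafeEqual: timing attack.
  • Using a hash as the only ID: collision risk (xxHash at large scale).
  • An expensive hash for a non-security job: needless latency.
  • Assuming node:crypto exists on edge runtimes where only Web Crypto is available: runtime errors.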

🤖 Hints for LLM use

  • Pick the hash that fits the use case.
  • Argon2id = password.
  • SHA-256 = secure default.
  • xxHash = speed.
  • timingSafeEqual = compare.

🔗 Related documents