| id | title | category | status | source_trust_level | verification_status | created_at | updated_at | tags | tech_stack | applied_in | aliases |
|---|---|---|---|---|---|---|---|---|---|---|---|
| cs-consistent-hashing | Consistent Hashing: Sharding / Cache Distribution | Coding | draft | B | conceptual | 2026-05-09 | 2026-05-09 | | | | |
Consistent Hashing
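With hash mod N, adding or removing a node remaps almost every key. Consistent hashing remaps only about 1/N of the keys. Memcached clients, DynamoDB, Cassandra, and load balancers use it.
📖 Core Concepts
- Hash ring: hash values are arranged on a circle covering 0..2^32 - 1.
- Each node also occupies a position on the ring.
- A key is hashed and assigned to the next node clockwise from its position.
- Virtual nodes: one physical node occupies many virtual positions, which evens out the distribution.
💻 Code Patterns
Simple hash ring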
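import { createHash } from 'node:crypto';
class HashRing {
  // Ring position -> physical node name.
  private ring: Map<number, string> = new Map();
  // Ring positions in ascending order, used for binary search.
  private sortedKeys: number[] = [];
  // Virtual-node count per node, so remove() undoes exactly what add() did.
  private counts: Map<string, number> = new Map();
  constructor(nodes: string[], private vnodes = 256) {
    for (const node of nodes) this.add(node);
  }
  // First 4 bytes of the MD5 digest as an unsigned 32-bit ring position.
  private hash(s: string): number {
    const md5 = createHash('md5').update(s).digest();
    return md5.readUInt32BE(0);
  }
  // count = number of virtual nodes for this node (defaults to the ring-wide setting).
  add(node: string, count = this.vnodes) {
    this.counts.set(node, count);
    for (let i = 0; i < count; i++) {
      const h = this.hash(`${node}#${i}`);
      this.ring.set(h, node);
    }
    this.sortedKeys = [...this.ring.keys()].sort((a, b) => a - b);
  }
  remove(node: string) {
    const count = this.counts.get(node) ?? 0;
    for (let i = 0; i < count; i++) {
      const h = this.hash(`${node}#${i}`);
      // Skip positions that another node overwrote through a hash collision.
      if (this.ring.get(h) === node) this.ring.delete(h);
    }
    this.counts.delete(node);
    this.sortedKeys = [...this.ring.keys()].sort((a, b) => a - b);
  }
  get(key: string): string {
    if (this.sortedKeys.length === 0) throw new Error('empty');
    const h = this.hash(key);
    // Binary search: first ring position clockwise from h (position >= h).
    let lo = 0, hi = this.sortedKeys.length - 1;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (this.sortedKeys[mid] < h) lo = mid + 1;
      else hi = mid;
    }
    // If h lies past the last position, wrap around to the first one.
    const idx = this.sortedKeys[lo] >= h ? lo : 0;
    return this.ring.get(this.sortedKeys[idx])!;
  }
}
const ring = new HashRing(['node1', 'node2', 'node3']);
console.log(ring.get('user-42')); // always the same node for the same key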
Effect of add / remove
N nodes → N+1: about 1/(N+1) of the keys move.
mod N: almost all keys move (about N/(N+1) of them).
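The two figures above can be checked empirically. The following is a minimal sketch rather than the HashRing class itself: `hash32`, `buildRing`, `lookup`, the node names, the 4 → 5 node scenario, and the 10,000 sample keys are all hypothetical choices made for illustration.

```typescript
import { createHash } from 'node:crypto';

// First 4 bytes of MD5 as an unsigned 32-bit integer (illustrative hash choice).
function hash32(s: string): number {
  return createHash('md5').update(s).digest().readUInt32BE(0);
}

type Ring = { positions: number[]; owner: Map<number, string> };

// Place `vnodes` virtual positions per node on the ring.
function buildRing(nodes: string[], vnodes = 256): Ring {
  const owner = new Map<number, string>();
  for (const node of nodes) {
    for (let i = 0; i < vnodes; i++) owner.set(hash32(`${node}#${i}`), node);
  }
  const positions = Array.from(owner.keys()).sort((a, b) => a - b);
  return { positions, owner };
}

// Binary search for the first position >= hash(key), wrapping to the start.
function lookup(ring: Ring, key: string): string {
  const h = hash32(key);
  let lo = 0, hi = ring.positions.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (ring.positions[mid] < h) lo = mid + 1;
    else hi = mid;
  }
  return ring.owner.get(ring.positions[lo % ring.positions.length])!;
}

const before = ['n1', 'n2', 'n3', 'n4'];
const after = [...before, 'n5'];
const keys = Array.from({ length: 10000 }, (_, i) => `key-${i}`);

// Naive scheme: owner index = hash mod N.
const movedMod = keys.filter(
  (k) => hash32(k) % before.length !== hash32(k) % after.length,
).length;

// Consistent hashing: compare owners on the 4-node and 5-node rings.
const ringBefore = buildRing(before);
const ringAfter = buildRing(after);
const movedKeys = keys.filter((k) => lookup(ringBefore, k) !== lookup(ringAfter, k));

console.log(`mod N moved: ${((100 * movedMod) / keys.length).toFixed(1)}%`);
console.log(`ring moved:  ${((100 * movedKeys.length) / keys.length).toFixed(1)}%`);
```

On the ring, every key that moves ends up on the newly added node; the existing nodes never exchange keys with each other.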
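Replication (replicas = the next N distinct nodes on the ring)
// Method to add inside HashRing: walk clockwise and collect the first n distinct physical nodes.
getReplicas(key: string, n: number): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  const h = this.hash(key);
  // First ring position >= h (linear scan for brevity); wrap to 0 if none.
  let idx = this.sortedKeys.findIndex((k) => k >= h);
  if (idx === -1) idx = 0;
  // Visit each ring position at most once, so n > node count cannot loop forever.
  for (let step = 0; step < this.sortedKeys.length && result.length < n; step++) {
    const node = this.ring.get(this.sortedKeys[(idx + step) % this.sortedKeys.length])!;
    if (!seen.has(node)) {
      seen.add(node);
      result.push(node);
    }
  }
  return result;
}
→ Cassandra / DynamoDB distribute N replicas this way.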
Jump consistent hash (Google)
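// O(1) memory, O(log N) time: efficient for large clusters
function jumpHash(key: bigint, numBuckets: number): number {
  let b = -1n, j = 0n;
  let k = BigInt.asUintN(64, key);
  while (j < BigInt(numBuckets)) {
    b = j;
    // Wrap the generator state to 64 bits; unbounded BigInt growth would stall the loop at j = 0.
    k = BigInt.asUintN(64, k * 2862933555777941757n + 1n);
    j = (b + 1n) * (1n << 31n) / ((k >> 33n) + 1n);
  }
  return Number(b);
}
→ Buckets only, with no hash ring. Buckets are numbered 0..N-1, so capacity can only grow or shrink at the end of the range; removing an arbitrary node is hard.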
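To make the "add only" behavior concrete, here is a small sketch that maps string keys to buckets and grows the cluster from 10 to 11 buckets. The function is restated with the state wrapped to 64 bits so the block runs on its own; `keyOf` (first 8 bytes of MD5 as the 64-bit key), the key names, and the bucket counts are assumptions made for illustration.

```typescript
import { createHash } from 'node:crypto';

function jumpHash(key: bigint, numBuckets: number): number {
  let k = BigInt.asUintN(64, key);
  let b = -1n, j = 0n;
  while (j < BigInt(numBuckets)) {
    b = j;
    // 64-bit linear congruential step; asUintN emulates uint64 wraparound.
    k = BigInt.asUintN(64, k * 2862933555777941757n + 1n);
    j = ((b + 1n) * (1n << 31n)) / ((k >> 33n) + 1n);
  }
  return Number(b);
}

// Hypothetical helper: derive a 64-bit key from a string.
function keyOf(s: string): bigint {
  return createHash('md5').update(s).digest().readBigUInt64BE(0);
}

const keys = Array.from({ length: 5000 }, (_, i) => keyOf(`user-${i}`));
let moved = 0;          // keys whose bucket changed after growing 10 -> 11
let movedElsewhere = 0; // keys that moved to a bucket other than the new one
for (const k of keys) {
  const oldBucket = jumpHash(k, 10);
  const newBucket = jumpHash(k, 11);
  if (oldBucket !== newBucket) {
    moved++;
    if (newBucket !== 10) movedElsewhere++;
  }
}
console.log(`moved ${moved} of ${keys.length}; moved to an old bucket: ${movedElsewhere}`);
```

Keys only ever move into the newly appended bucket, which is why growth is cheap. Removing bucket 3 of 10, by contrast, has no direct expression in this scheme without renumbering.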
Rendezvous (HRW)
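import { createHash } from 'node:crypto';
// Score of a (node, key) pair: first 4 bytes of MD5 as an unsigned 32-bit integer.
function hash(s: string): number {
  return createHash('md5').update(s).digest().readUInt32BE(0);
}
// Pick the node with the highest score for this key.
function rendezvous(key: string, nodes: string[]): string {
  let max = -Infinity;
  let chosen = '';
  for (const n of nodes) {
    const h = hash(`${n}:${key}`);
    if (h > max) { max = h; chosen = n; }
  }
  return chosen;
}
→ Balanced without virtual nodes, but each lookup costs O(N).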
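Rendezvous hashing also gives minimal disruption on node removal, which the sketch below checks. It restates the function so the block is self-contained; `score`, the node names, and the sample keys are hypothetical, and MD5 is just one possible scoring hash.

```typescript
import { createHash } from 'node:crypto';

// Score of a (node, key) pair as an unsigned 32-bit integer.
function score(node: string, key: string): number {
  return createHash('md5').update(`${node}:${key}`).digest().readUInt32BE(0);
}

// Highest score wins; ties go to the node listed first.
function rendezvous(key: string, nodes: string[]): string {
  let best = -1;
  let chosen = '';
  for (const n of nodes) {
    const s = score(n, key);
    if (s > best) { best = s; chosen = n; }
  }
  return chosen;
}

const nodes = ['a', 'b', 'c', 'd', 'e'];
const survivors = nodes.filter((n) => n !== 'c'); // simulate losing node 'c'
const keys = Array.from({ length: 2000 }, (_, i) => `key-${i}`);

let movedFromSurvivor = 0; // keys that changed owner although their owner survived
let orphaned = 0;          // keys that were on 'c' and had to be reassigned
for (const k of keys) {
  const before = rendezvous(k, nodes);
  const after = rendezvous(k, survivors);
  if (before === 'c') orphaned++;
  else if (before !== after) movedFromSurvivor++;
}
console.log(`reassigned from 'c': ${orphaned}, moved between survivors: ${movedFromSurvivor}`);
```

Only the keys owned by the removed node are reassigned, because removing a non-winning score never changes which remaining score is the maximum.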
Maglev (Google, LB)
- The key-to-backend mapping is precomputed into a lookup table.
- O(1) lookup with near-uniform distribution.
- Suited to large load balancers and packet routing.
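The lookup-table idea can be sketched as follows. This is an illustrative reading of the table-population step, not production code: each backend gets a permutation of table slots from an offset and a skip, and backends take turns claiming their next free slot. The use of two MD5 words for offset and skip, the table size 251, and the names `buildMaglevTable` and `pick` are assumptions; the table size must be prime so that every permutation covers all slots.

```typescript
import { createHash } from 'node:crypto';

// Build a lookup table of size m (m must be prime) mapping slot -> backend name.
function buildMaglevTable(backends: string[], m: number): string[] {
  const n = backends.length;
  if (n === 0) throw new Error('no backends');
  const offsets: number[] = [];
  const skips: number[] = [];
  for (const b of backends) {
    const d = createHash('md5').update(b).digest();
    offsets.push(d.readUInt32BE(0) % m);
    skips.push((d.readUInt32BE(4) % (m - 1)) + 1); // skip is in 1..m-1
  }
  const next = new Array<number>(n).fill(0);   // progress through each permutation
  const table = new Array<number>(m).fill(-1); // -1 marks an empty slot
  let filled = 0;
  while (true) {
    // Round-robin: each backend claims the next free slot of its permutation.
    for (let i = 0; i < n; i++) {
      let slot = (offsets[i] + next[i] * skips[i]) % m;
      while (table[slot] !== -1) {
        next[i]++;
        slot = (offsets[i] + next[i] * skips[i]) % m;
      }
      table[slot] = i;
      next[i]++;
      filled++;
      if (filled === m) return table.map((idx) => backends[idx]);
    }
  }
}

// O(1) lookup: hash the key and index into the precomputed table.
function pick(table: string[], key: string): string {
  const h = createHash('md5').update(key).digest().readUInt32BE(0);
  return table[h % table.length];
}

const table = buildMaglevTable(['backend-1', 'backend-2', 'backend-3'], 251);
console.log(pick(table, '10.0.0.7:51234'));
```

Because backends claim slots in turn, their slot counts differ by at most one, which is where the near-uniform distribution comes from.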
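Usage example: distributed cache
// Memcached client
// clients is a hypothetical map from node name to a connected cache client.
declare const clients: Record<string, { get(key: string): Promise<unknown> }>;
const ring = new HashRing(['cache-1', 'cache-2', 'cache-3']);
async function get(key: string) {
  const node = ring.get(key); // the same key always routes to the same cache node
  const client = clients[node];
  return client.get(key);
}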
Load balancer
Client request → LB → Backend
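The LB applies consistent hashing, so the same client lands on the same backend (sticky).
HAProxy (hash-type consistent) / nginx (hash ... consistent; ip_hash is sticky but is not consistent hashing) / Envoy ring_hash.
# Envoy
clusters:
  - name: backend
    lb_policy: RING_HASH
    ring_hash_lb_config: { minimum_ring_size: 1024 }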
Node weight (heterogeneous)
// Stronger node = more vnodes.
// Assumes add() accepts an optional per-node vnode count as its second argument.
ring.add('big-node', 512); // 2x vnodes
ring.add('small-node', 256);
🤔 Decision Criteria
| Situation | Recommendation |
|---|---|
| Distributed cache | Hash ring + vnodes |
| Big cluster (1000s of nodes) | Jump hash |
| Replication needed | Hash ring + N successors |
| LB sticky sessions | ring_hash / Maglev |
| Small cluster (about 5 nodes) | Rendezvous |
| Frequent add/remove | Hash ring (hard with jump hash) |
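❌ Anti-patterns
- No vnodes: uneven distribution (e.g. 50% / 30% / 20%).
- Bad hash function: the distribution breaks down. Use a uniformly distributing hash such as md5 or xxhash.
- Too many vnodes (10K+): costs memory and time.
- Synchronous rebalancing that interrupts traffic: migrate gradually instead, reading from the old node while writing to both.
- Replication factor < 2: data is lost when a node dies.
- Ignoring hash collisions: use hashes of 64 bits or more.
- Key distribution not verified: skewed traffic.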
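The first anti-pattern can be measured directly by computing how much of the ring each node owns. The sketch below is an illustration under assumptions: `hash32`, `ownership`, and the node names are hypothetical, and a 32-bit MD5 prefix stands in for whatever hash a real deployment uses.

```typescript
import { createHash } from 'node:crypto';

const SPACE = 2 ** 32; // size of the 32-bit ring

function hash32(s: string): number {
  return createHash('md5').update(s).digest().readUInt32BE(0);
}

// Fraction of the ring owned by each node: a node owns the arc that ends
// at each of its positions and starts at the previous position.
function ownership(nodes: string[], vnodes: number): Map<string, number> {
  const points: Array<[number, string]> = [];
  for (const node of nodes) {
    for (let i = 0; i < vnodes; i++) points.push([hash32(`${node}#${i}`), node]);
  }
  points.sort((a, b) => a[0] - b[0]);
  const share = new Map<string, number>();
  for (const node of nodes) share.set(node, 0);
  for (let i = 0; i < points.length; i++) {
    const [pos, node] = points[i];
    // The first point's arc wraps around from the last point.
    const prev = i === 0 ? points[points.length - 1][0] - SPACE : points[i - 1][0];
    share.set(node, (share.get(node) ?? 0) + (pos - prev) / SPACE);
  }
  return share;
}

const nodes = ['node1', 'node2', 'node3'];
for (const vnodes of [1, 256]) {
  const parts: string[] = [];
  ownership(nodes, vnodes).forEach((s, node) => parts.push(`${node}=${(100 * s).toFixed(1)}%`));
  console.log(`vnodes=${vnodes}: ${parts.join(' ')}`);
}
```

With a single position per node the shares depend entirely on where three hashes happen to land, while 256 positions per node pull each share toward one third.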
🤖 LLM Usage Hints
- Hash ring + vnodes (256-512) is the standard choice.
- Large cluster: jump hash.
- LB: Envoy ring_hash.