---
id: cs-consistent-hashing
title: Consistent Hashing for Sharding / Cache Distribution
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [cs, hashing, sharding, vibe-coding]
tech_stack: { language: "TS", applicable_to: ["Backend"] }
applied_in: []
aliases: [consistent hashing, hashring, virtual nodes, jump hash, rendezvous]
---
# Consistent Hashing
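> With `hash mod N`, adding or removing a node remaps nearly every key. **With consistent hashing, only about 1/N of the keys move.** Used by Memcached clients, DynamoDB, Cassandra, and load balancers.
## 📖 Core Concepts
- Hash ring: hash outputs are placed on a circle covering 0..2^32 - 1.
- Each node also occupies a position on the ring.
- A key's hash is assigned to the next node clockwise from it.
- Virtual nodes: one physical node takes many virtual positions on the ring, which evens out the load.
## 💻 Code Patterns
### Simple hash ring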
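```ts
import { createHash } from 'node:crypto';

class HashRing {
  // ring position -> physical node
  private ring: Map<number, string> = new Map();
  private sortedKeys: number[] = [];
  // vnode count per node, so remove() deletes exactly what add() inserted
  private weights: Map<string, number> = new Map();

  constructor(nodes: string[], private vnodes = 256) {
    for (const node of nodes) this.add(node);
  }

  // First 4 bytes of MD5 as an unsigned 32-bit ring position.
  private hash(s: string): number {
    const md5 = createHash('md5').update(s).digest();
    return md5.readUInt32BE(0);
  }

  // `count` lets a stronger node claim more virtual positions (weighting).
  add(node: string, count = this.vnodes) {
    this.weights.set(node, count);
    for (let i = 0; i < count; i++) {
      const h = this.hash(`${node}#${i}`);
      this.ring.set(h, node);
    }
    this.sortedKeys = [...this.ring.keys()].sort((a, b) => a - b);
  }

  remove(node: string) {
    const count = this.weights.get(node) ?? this.vnodes;
    for (let i = 0; i < count; i++) {
      const h = this.hash(`${node}#${i}`);
      // Leave positions that a colliding vnode of another node has taken over.
      if (this.ring.get(h) === node) this.ring.delete(h);
    }
    this.weights.delete(node);
    this.sortedKeys = [...this.ring.keys()].sort((a, b) => a - b);
  }

  get(key: string): string {
    if (this.sortedKeys.length === 0) throw new Error('empty');
    const h = this.hash(key);
    // Binary search for the first ring position >= h (the next one clockwise).
    let lo = 0, hi = this.sortedKeys.length - 1;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (this.sortedKeys[mid] < h) lo = mid + 1;
      else hi = mid;
    }
    // Past the largest position: wrap around to the start of the ring.
    const idx = this.sortedKeys[lo] >= h ? lo : 0;
    return this.ring.get(this.sortedKeys[idx])!;
  }
}

const ring = new HashRing(['node1', 'node2', 'node3']);
console.log(ring.get('user-42')); // the same key always maps to the same node
```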
### Effect of add / remove
```
Ring, N nodes → N+1: about 1/(N+1) of the keys move.
mod N → mod (N+1):   about N/(N+1) of the keys move (nearly all).
```
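The two lines above can be checked empirically. The following is a minimal sketch, not a benchmark: `h32`, `buildRing`, `ringLookup`, and `movedFraction` are hypothetical helper names, the keys are synthetic (`user-0`, `user-1`, ...), and the ring lookup is a linear scan kept short for readability.

```typescript
import { createHash } from 'node:crypto';

// 32-bit position from the first 4 bytes of MD5 (same scheme as the ring above).
const h32 = (s: string): number =>
  createHash('md5').update(s).digest().readUInt32BE(0);

// Sorted list of [position, node] pairs, one per virtual node.
function buildRing(nodes: string[], vnodes = 256): Array<[number, string]> {
  const points: Array<[number, string]> = [];
  for (const node of nodes) {
    for (let i = 0; i < vnodes; i++) points.push([h32(`${node}#${i}`), node]);
  }
  return points.sort((a, b) => a[0] - b[0]);
}

// First position >= hash wins; wrap to the first point otherwise.
// Assumes a non-empty ring. Linear scan for brevity.
function ringLookup(points: Array<[number, string]>, key: string): string {
  const h = h32(key);
  const hit = points.find(([pos]) => pos >= h) ?? points[0];
  return hit[1];
}

// Fraction of sample keys whose owner changes when one node is added.
function movedFraction(numNodes: number, numKeys = 5000) {
  const before = Array.from({ length: numNodes }, (_, i) => `node${i}`);
  const after = [...before, `node${numNodes}`];
  const ringBefore = buildRing(before);
  const ringAfter = buildRing(after);
  let movedMod = 0;
  let movedRing = 0;
  for (let i = 0; i < numKeys; i++) {
    const key = `user-${i}`;
    const h = h32(key);
    if (before[h % before.length] !== after[h % after.length]) movedMod++;
    if (ringLookup(ringBefore, key) !== ringLookup(ringAfter, key)) movedRing++;
  }
  return { mod: movedMod / numKeys, ring: movedRing / numKeys };
}

// Going from 4 to 5 nodes: mod should land near 4/5 and ring near 1/5.
// Exact values depend on the hash and the sample keys.
console.log(movedFraction(4));
```

On the ring, the keys that move are the ones the new node takes over; every other key keeps its owner.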
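### Replication (replicas = the next N distinct nodes on the ring)
```ts
// Walk clockwise from the key's position and collect the first n distinct nodes.
// `sortedKeys`, `ring`, and `hash` are the HashRing internals from above,
// passed in explicitly so the function stands on its own.
function getReplicas(
  key: string,
  n: number,
  sortedKeys: number[],
  ring: Map<number, string>,
  hash: (s: string) => number,
): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  const h = hash(key);
  // First position >= h; no match means wrap to index 0.
  // (Linear here for brevity; the binary search in get() does the same job.)
  let start = sortedKeys.findIndex((k) => k >= h);
  if (start === -1) start = 0;
  // At most one full lap, so the loop ends even when n exceeds the node count.
  for (let step = 0; step < sortedKeys.length && result.length < n; step++) {
    const node = ring.get(sortedKeys[(start + step) % sortedKeys.length])!;
    if (!seen.has(node)) {
      seen.add(node);
      result.push(node);
    }
  }
  return result;
}
```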
→ Cassandra and Dynamo-style stores spread N replicas this way.
### Jump consistent hash (Google)
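```ts
// O(1) memory, O(log N) expected time: efficient for large clusters
function jumpHash(key: bigint, numBuckets: number): number {
  // The algorithm works on an unsigned 64-bit key.
  let k = BigInt.asUintN(64, key);
  let b = -1;
  let j = 0;
  while (j < numBuckets) {
    b = j;
    // 64-bit LCG step; BigInt never overflows, so wrap to 64 bits explicitly.
    k = BigInt.asUintN(64, k * 2862933555777941757n + 1n);
    // The published algorithm does this division in floating point.
    j = Math.floor((b + 1) * (2 ** 31 / Number((k >> 33n) + 1n)));
  }
  return b;
}
```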
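→ Maps keys to bucket numbers only; there is no ring. Buckets can only be added (or dropped) at the end of the range, so removing an arbitrary node is hard.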
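A quick way to see the "add only" behaviour: when the bucket count grows from N to N+1, a key either stays put or moves to the new bucket N. The sketch below repeats `jumpHash` so it runs on its own; `growStats` and the synthetic key sequence are illustrative assumptions, not part of the algorithm.

```typescript
// Jump consistent hash, repeated here so the snippet is self-contained.
function jumpHash(key: bigint, numBuckets: number): number {
  let k = BigInt.asUintN(64, key);
  let b = -1;
  let j = 0;
  while (j < numBuckets) {
    b = j;
    k = BigInt.asUintN(64, k * 2862933555777941757n + 1n);
    j = Math.floor((b + 1) * (2 ** 31 / Number((k >> 33n) + 1n)));
  }
  return b;
}

// Grow from `from` to `from + 1` buckets and tally where keys go.
function growStats(from: number, numKeys = 10000) {
  let moved = 0;
  let movedElsewhere = 0; // moved to a bucket other than the new one
  for (let i = 0; i < numKeys; i++) {
    // Arbitrary spread of 64-bit keys (illustrative; real keys would be hashed IDs).
    const key = BigInt(i) * 0x9e3779b97f4a7c15n;
    const oldBucket = jumpHash(key, from);
    const newBucket = jumpHash(key, from + 1);
    if (oldBucket !== newBucket) {
      moved++;
      if (newBucket !== from) movedElsewhere++;
    }
  }
  return { movedFraction: moved / numKeys, movedElsewhere };
}

// movedElsewhere stays 0; movedFraction should be roughly 1/(from + 1).
console.log(growStats(10));
```

Because the jump sequence depends only on the key, shrinking from N+1 back to N is the mirror image: only keys in the last bucket move. Taking out a bucket in the middle has no such guarantee, which is why arbitrary removal needs an extra indirection layer.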
### Rendezvous (HRW)
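```ts
import { createHash } from 'node:crypto';

// 32-bit score for a (node, key) pair; same MD5-based scheme as the ring above.
const hash = (s: string): number =>
  createHash('md5').update(s).digest().readUInt32BE(0);

// Pick the node whose combined hash with the key is highest.
function rendezvous(key: string, nodes: string[]): string {
  let max = -Infinity;
  let chosen = '';
  for (const n of nodes) {
    const h = hash(`${n}:${key}`);
    if (h > max) { max = h; chosen = n; }
  }
  return chosen;
}
```
→ Balanced without virtual nodes, but each lookup costs O(N).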
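Rendezvous hashing also keeps disruption minimal: when a node leaves, only the keys it owned get a new owner. The sketch below checks that on synthetic keys; `score` and `removalImpact` are hypothetical names, and `rendezvous` is repeated so the snippet runs on its own.

```typescript
import { createHash } from 'node:crypto';

// 32-bit score for a (node, key) pair.
const score = (s: string): number =>
  createHash('md5').update(s).digest().readUInt32BE(0);

function rendezvous(key: string, nodes: string[]): string {
  let max = -Infinity;
  let chosen = '';
  for (const n of nodes) {
    const s = score(`${n}:${key}`);
    if (s > max) { max = s; chosen = n; }
  }
  return chosen;
}

// Count keys that change owner after `removed` leaves, split by previous owner.
function removalImpact(nodes: string[], removed: string, numKeys = 2000) {
  const remaining = nodes.filter((n) => n !== removed);
  let movedFromRemoved = 0;
  let movedFromOthers = 0;
  for (let i = 0; i < numKeys; i++) {
    const key = `user-${i}`;
    const before = rendezvous(key, nodes);
    const after = rendezvous(key, remaining);
    if (before === after) continue;
    if (before === removed) movedFromRemoved++;
    else movedFromOthers++;
  }
  return { movedFromRemoved, movedFromOthers };
}

// movedFromOthers is 0: a key's winner among the remaining nodes is unchanged.
console.log(removalImpact(['a', 'b', 'c', 'd', 'e'], 'c'));
```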
### Maglev (Google, LB)
- Precomputes the key-to-backend mapping into a lookup table.
- O(1) lookup with near-uniform distribution.
- Suited to large load balancers and packet routing.
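The table precomputation can be sketched roughly as follows. This is a simplified illustration of the permutation-based fill, not a production implementation: the function names, the table size of 251 (a small prime), and the use of two slices of one MD5 digest as the `offset` and `skip` hashes are all assumptions made for the example.

```typescript
import { createHash } from 'node:crypto';

// Build a Maglev-style lookup table of size m (m should be prime and
// comfortably larger than the number of backends).
function buildMaglevTable(backends: string[], m = 251): string[] {
  if (backends.length === 0) throw new Error('no backends');
  // Each backend gets its own permutation of slots: (offset + j * skip) mod m.
  const perms = backends.map((name) => {
    const d = createHash('md5').update(name).digest();
    return {
      offset: d.readUInt32BE(0) % m,
      skip: (d.readUInt32BE(4) % (m - 1)) + 1,
    };
  });
  const table: number[] = new Array(m).fill(-1);
  const next: number[] = new Array(backends.length).fill(0);
  let filled = 0;
  while (filled < m) {
    // Backends take turns claiming their next preferred slot that is still free.
    for (let i = 0; i < backends.length && filled < m; i++) {
      const { offset, skip } = perms[i];
      let slot = (offset + next[i] * skip) % m;
      while (table[slot] !== -1) {
        next[i]++;
        slot = (offset + next[i] * skip) % m;
      }
      table[slot] = i;
      next[i]++;
      filled++;
    }
  }
  return table.map((i) => backends[i]);
}

// O(1) lookup: hash the key and index into the table.
function maglevLookup(table: string[], key: string): string {
  const h = createHash('md5').update(key).digest().readUInt32BE(0);
  return table[h % table.length];
}

const table = buildMaglevTable(['be-1', 'be-2', 'be-3']);
console.log(maglevLookup(table, 'client-10.0.0.7'));
```

Because backends claim slots in strict rotation, their slot counts differ by at most one, which is where the near-uniform distribution comes from.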
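### Usage example: distributed cache
```ts
// Memcached client
// `clients` maps node name -> connected cache client (assumed to be set up elsewhere).
declare const clients: Record<string, { get(key: string): Promise<unknown> }>;

const ring = new HashRing(['cache-1', 'cache-2', 'cache-3']);

async function get(key: string) {
  const node = ring.get(key); // the same key always routes to the same cache node
  const client = clients[node];
  return client.get(key);
}
```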
### Load balancer
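```
Client request → LB → Backend
The LB applies consistent hashing → same client = same backend (sticky).
HAProxy (hash-type consistent) / nginx (hash ... consistent) / Envoy ring_hash.
Note: nginx ip_hash is sticky but is not consistent hashing.
```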
```yaml
# Envoy
clusters:
  - name: backend
    lb_policy: RING_HASH
    ring_hash_lb_config: { minimum_ring_size: 1024 }
```
### Node weight (heterogeneous)
```ts
// A stronger node gets more vnodes.
// Assumes add() accepts an optional per-node vnode count as its second argument.
ring.add('big-node', 512); // 2x vnodes
ring.add('small-node', 256);
```
## 🤔 Decision Criteria
| Situation | Recommendation |
|---|---|
| Distributed cache | Hash ring + vnodes |
| Big cluster (1000s of nodes) | Jump hash |
| Replication needed | Hash ring + N successors |
| LB sticky sessions | ring_hash / Maglev |
| Small cluster (~5 nodes) | Rendezvous |
| Frequent add/remove | Hash ring (hard with jump hash) |
## ❌ Anti-patterns
- **No vnodes**: uneven distribution (e.g. 50% / 30% / 20%).
- **Bad hash function**: the distribution breaks down. Use a uniform hash such as md5 or xxhash.
- **Too many vnodes (10K+)**: costs memory and lookup/rebuild time.
- **Synchronous rebalancing that interrupts traffic**: migrate gradually, with read-from-old + write-to-both.
- **Replication factor < 2**: data is lost when a node dies.
- **Ignoring hash collisions**: use 64 bits or more.
- **Not validating key distribution**: leads to skewed traffic.
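Several of the items above (no vnodes, unvalidated distribution) can be caught with a quick measurement before rollout. The sketch below estimates each node's share of a synthetic key sample for a given vnode count; `h32`, `loadShare`, and `imbalance` are hypothetical helpers, and a real check would replay actual production keys instead of `user-N`.

```typescript
import { createHash } from 'node:crypto';

// 32-bit position from the first 4 bytes of MD5.
const h32 = (s: string): number =>
  createHash('md5').update(s).digest().readUInt32BE(0);

// Share of sample keys owned by each node for a given vnode count.
function loadShare(nodes: string[], vnodes: number, numKeys = 5000): Record<string, number> {
  const points: Array<[number, string]> = [];
  for (const node of nodes) {
    for (let i = 0; i < vnodes; i++) points.push([h32(`${node}#${i}`), node]);
  }
  points.sort((a, b) => a[0] - b[0]);

  const share: Record<string, number> = {};
  for (const node of nodes) share[node] = 0;
  for (let i = 0; i < numKeys; i++) {
    const h = h32(`user-${i}`);
    // First position >= h, wrapping to the first point (linear scan for brevity).
    const owner = (points.find(([pos]) => pos >= h) ?? points[0])[1];
    share[owner] += 1 / numKeys;
  }
  return share;
}

// Largest share divided by the ideal 1/N share; 1.0 means perfectly even.
function imbalance(share: Record<string, number>): number {
  const values = Object.values(share);
  return Math.max(...values) * values.length;
}

const nodes = ['node1', 'node2', 'node3'];
for (const v of [1, 16, 256]) {
  console.log(v, 'vnodes → imbalance', imbalance(loadShare(nodes, v)).toFixed(2));
}
```

The imbalance figure typically shrinks toward 1.0 as the vnode count grows; the exact numbers depend on the node names, the hash, and the key sample.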
## 🤖 LLM Usage Hints
- Hash ring + vnodes (256-512) = the standard choice.
- Large cluster = jump hash.
- LB = Envoy ring_hash.
## 🔗 Related Documents
- [[DB_Sharding_Strategies]]
- [[CS_Bloom_Filter]]
- [[Backend_WebSocket_Scaling]]