id: cs-consistent-hashing
title: Consistent Hashing: Sharding / Cache Distribution
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: cs, hashing, sharding, vibe-coding
tech_stack: language TS; applicable_to Backend
applied_in:
aliases: consistent hashing, hashring, virtual nodes, jump hash, rendezvous

Consistent Hashing

With hash mod N, adding or removing a node remaps almost every key. Consistent hashing moves only about 1/N of the keys. It is used by Memcached clients, DynamoDB, Cassandra, and load balancers.

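📖 Core concepts

  • Hash ring: hash values are placed on a circle covering 0..2^32 - 1.
  • Each node also occupies a position on the ring.
  • A key is hashed and assigned to the next node clockwise from its position.
  • Virtual nodes: each physical node is placed at many virtual positions, which evens out the distribution.

💻 Code patterns

Simple hash ring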

import { createHash } from 'node:crypto';

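class HashRing {
  private ring = new Map<number, string>();          // ring position -> node
  private sortedKeys: number[] = [];                  // ring positions, ascending
  private vnodeCounts = new Map<string, number>();    // node -> number of vnodes

  constructor(nodes: string[], private vnodes = 256) {
    for (const node of nodes) this.add(node);
  }

  // First 32 bits of MD5. Used for distribution only, not for security.
  private hash(s: string): number {
    return createHash('md5').update(s).digest().readUInt32BE(0);
  }

  private resort() {
    this.sortedKeys = [...this.ring.keys()].sort((a, b) => a - b);
  }

  // A larger vnode count gives the node a proportionally larger share of keys.
  add(node: string, vnodes = this.vnodes) {
    this.vnodeCounts.set(node, vnodes);
    for (let i = 0; i < vnodes; i++) {
      this.ring.set(this.hash(`${node}#${i}`), node);
    }
    this.resort();
  }

  remove(node: string) {
    const vnodes = this.vnodeCounts.get(node) ?? 0;
    for (let i = 0; i < vnodes; i++) {
      const h = this.hash(`${node}#${i}`);
      // Skip positions that a colliding vnode of another node has taken over.
      if (this.ring.get(h) === node) this.ring.delete(h);
    }
    this.vnodeCounts.delete(node);
    this.resort();
  }

  get(key: string): string {
    if (this.sortedKeys.length === 0) throw new Error('empty ring');
    const h = this.hash(key);
    // Binary search for the first ring position >= h (the next one clockwise).
    let lo = 0, hi = this.sortedKeys.length - 1;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (this.sortedKeys[mid] < h) lo = mid + 1;
      else hi = mid;
    }
    // If h is past the last position, wrap around to the first one.
    const idx = this.sortedKeys[lo] >= h ? lo : 0;
    return this.ring.get(this.sortedKeys[idx])!;
  }
}

const ring = new HashRing(['node1', 'node2', 'node3']);
console.log(ring.get('user-42'));  // always maps to the same node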

Effect of add / remove

Hash ring, N → N+1 nodes: about 1/(N+1) of the keys move.
mod N, N → N+1 nodes: almost all keys move (about N/(N+1)).
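The mod N figure can be checked empirically. The following is a minimal sketch, not a benchmark: it assumes keys named key-0, key-1, ... and reuses the 32-bit MD5 prefix as the hash. The helper names hash32 and fractionMovedModN are made up for this illustration.

```typescript
import { createHash } from 'node:crypto';

// 32-bit hash from the first four bytes of MD5 (distribution only, not security).
function hash32(s: string): number {
  return createHash('md5').update(s).digest().readUInt32BE(0);
}

// Fraction of keys whose bucket changes when going from n to n + 1 nodes
// under plain "hash mod N" placement.
function fractionMovedModN(n: number, numKeys: number): number {
  let moved = 0;
  for (let i = 0; i < numKeys; i++) {
    const h = hash32(`key-${i}`);
    if (h % n !== h % (n + 1)) moved++;
  }
  return moved / numKeys;
}

// For 4 -> 5 nodes the result should land near 4/5.
console.log(fractionMovedModN(4, 10_000).toFixed(2));
```

A key keeps its bucket only when h mod N equals h mod (N+1), which happens for roughly one key in N+1, so the rest are remapped.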

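Replication (replicas = the next N distinct nodes on the ring)

// Method to add inside HashRing: the first n distinct nodes clockwise from the key.
getReplicas(key: string, n: number): string[] {
  const result: string[] = [];
  const len = this.sortedKeys.length;
  if (len === 0) return result;
  const h = this.hash(key);
  // First ring position >= h; wrap to index 0 if h is past the last position.
  // (Linear scan for brevity; the binary search from get() works here too.)
  let start = this.sortedKeys.findIndex((pos) => pos >= h);
  if (start < 0) start = 0;
  // Walk at most one full lap, skipping vnodes of nodes that were already chosen.
  for (let step = 0; step < len && result.length < n; step++) {
    const node = this.ring.get(this.sortedKeys[(start + step) % len])!;
    if (!result.includes(node)) result.push(node);
  }
  return result;
}

→ Cassandra and Dynamo-style stores (such as DynamoDB) place N replicas this way.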

Jump consistent hash (Google)

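// O(1) memory, O(log N) time: efficient for large clusters.
function jumpHash(key: bigint, numBuckets: number): number {
  let k = BigInt.asUintN(64, key);
  let b = -1;
  let j = 0;
  while (j < numBuckets) {
    b = j;
    // 64-bit LCG step; asUintN emulates unsigned 64-bit wraparound.
    k = BigInt.asUintN(64, k * 2862933555777941757n + 1n);
    // Floating-point division, as in the published reference code.
    j = Math.floor((b + 1) * (2 ** 31 / (Number(k >> 33n) + 1)));
  }
  return b;
}

→ Maps keys to bucket numbers 0..N-1 only; there is no ring. Buckets can only be added or removed at the end of the range, so removing an arbitrary node is hard.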
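The "add only" behavior can be illustrated by counting where keys go when the bucket count grows by one. This is a sketch under assumptions: keys are the integers 0..numKeys-1, and the names jumpConsistentHash and countMoves are chosen for this example only.

```typescript
// Jump consistent hash with explicit 64-bit wraparound on the key state.
function jumpConsistentHash(key: bigint, numBuckets: number): number {
  let k = BigInt.asUintN(64, key);
  let b = -1;
  let j = 0;
  while (j < numBuckets) {
    b = j;
    k = BigInt.asUintN(64, k * 2862933555777941757n + 1n);
    j = Math.floor((b + 1) * (2 ** 31 / (Number(k >> 33n) + 1)));
  }
  return b;
}

// Classify each key by what happens when the cluster grows from `from` to `to` buckets.
function countMoves(from: number, to: number, numKeys: number) {
  let stayed = 0;
  let movedToNew = 0;       // landed in one of the newly added buckets
  let movedElsewhere = 0;   // moved between pre-existing buckets
  for (let i = 0; i < numKeys; i++) {
    const before = jumpConsistentHash(BigInt(i), from);
    const after = jumpConsistentHash(BigInt(i), to);
    if (before === after) stayed++;
    else if (after >= from) movedToNew++;
    else movedElsewhere++;
  }
  return { stayed, movedToNew, movedElsewhere };
}

console.log(countMoves(10, 11, 10_000));
```

Keys only ever move into the new bucket, never between existing ones, which is why growth is cheap. There is no equivalent operation for dropping bucket 3 out of 10, since bucket numbers must stay contiguous.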

Rendezvous (HRW)

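import { createHash } from 'node:crypto';

// 32-bit hash from MD5, the same helper as HashRing.hash above.
const hash = (s: string): number =>
  createHash('md5').update(s).digest().readUInt32BE(0);

// Each key goes to the node with the highest hash(node, key) score.
function rendezvous(key: string, nodes: string[]): string {
  let max = -Infinity;
  let chosen = '';
  for (const n of nodes) {
    const h = hash(`${n}:${key}`);
    if (h > max) { max = h; chosen = n; }
  }
  return chosen;
}

→ Balanced without virtual nodes, but each lookup costs O(N).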
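Rendezvous hashing also keeps movement minimal when a node leaves: only the keys that lived on the removed node are reassigned. The sketch below checks this for synthetic keys; score, pickNode, and movedAfterRemoval are hypothetical names for this example.

```typescript
import { createHash } from 'node:crypto';

// Score of a (node, key) pair: 32-bit MD5 prefix, used for distribution only.
function score(node: string, key: string): number {
  return createHash('md5').update(`${node}:${key}`).digest().readUInt32BE(0);
}

// Highest-random-weight selection: the node with the largest score wins.
function pickNode(key: string, nodes: string[]): string {
  let best = -1;
  let chosen = '';
  for (const n of nodes) {
    const s = score(n, key);
    if (s > best) { best = s; chosen = n; }
  }
  return chosen;
}

// Count reassigned keys after removing one node, split by where they lived before.
function movedAfterRemoval(nodes: string[], removed: string, numKeys: number) {
  const remaining = nodes.filter((n) => n !== removed);
  let fromRemoved = 0;
  let fromOthers = 0;
  for (let i = 0; i < numKeys; i++) {
    const key = `key-${i}`;
    const before = pickNode(key, nodes);
    const after = pickNode(key, remaining);
    if (before === after) continue;
    if (before === removed) fromRemoved++;
    else fromOthers++;
  }
  return { fromRemoved, fromOthers };
}

console.log(movedAfterRemoval(['n1', 'n2', 'n3', 'n4'], 'n2', 5_000));
```

A key that was not on the removed node already had its highest score elsewhere, so removing a lower-scoring candidate cannot change its owner.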

Maglev (Google, LB)

  • Precomputes the key-to-backend mapping into a lookup table.
  • O(1) lookup with near-uniform distribution.
  • Suited to large load balancers and packet routing.
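The table-building idea can be sketched as follows. This is a simplified illustration of Maglev-style population, not Google's implementation: it assumes a small prime table size (251), derives each backend's offset and skip from MD5 bytes, and uses the made-up name buildMaglevTable.

```typescript
import { createHash } from 'node:crypto';

// Build a lookup table of `tableSize` slots (must be prime) filled with backend names.
function buildMaglevTable(backends: string[], tableSize = 251): string[] {
  const n = backends.length;
  if (n === 0) throw new Error('no backends');

  // Each backend gets its own permutation of slots: (offset + j * skip) mod tableSize.
  const perms = backends.map((b) => {
    const d = createHash('md5').update(b).digest();
    return {
      offset: d.readUInt32BE(0) % tableSize,
      skip: (d.readUInt32BE(4) % (tableSize - 1)) + 1,
    };
  });

  const next = new Array<number>(n).fill(0);        // position in each permutation
  const table = new Array<number>(tableSize).fill(-1);
  let filled = 0;

  // Backends take turns claiming their next preferred slot that is still empty.
  while (filled < tableSize) {
    for (let i = 0; i < n && filled < tableSize; i++) {
      let slot: number;
      do {
        slot = (perms[i].offset + next[i] * perms[i].skip) % tableSize;
        next[i]++;
      } while (table[slot] !== -1);
      table[slot] = i;
      filled++;
    }
  }
  return table.map((i) => backends[i]);
}

const table = buildMaglevTable(['be-1', 'be-2', 'be-3']);
// A lookup is then a single array index: table[hashOfKey % table.length].
console.log(table.length);
```

Because backends claim slots in round-robin order, their slot counts differ by at most one, which is where the near-uniform distribution comes from. A prime table size ensures every permutation visits all slots.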

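Usage example: distributed cache

// Memcached client (uses the HashRing class above)
const ring = new HashRing(['cache-1', 'cache-2', 'cache-3']);

// Assumption: `clients` maps each node name to an already-connected cache client.
declare const clients: Record<string, { get(key: string): Promise<unknown> }>;

async function get(key: string) {
  const node = ring.get(key);     // the same key always goes to the same cache node
  const client = clients[node];
  return client.get(key);
}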

Load balancer

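Client request → LB → Backend
The LB applies consistent hashing so that the same client reaches the same backend (sticky).
Supported by HAProxy (hash-type consistent), nginx (hash $remote_addr consistent; note that ip_hash is sticky but is not a consistent hash), and Envoy (ring_hash).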
# Envoy
clusters:
- name: backend
  lb_policy: RING_HASH
  ring_hash_lb_config: { minimum_ring_size: 1024 }

Node weight (heterogeneous)

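// A more powerful node gets more vnodes.
// Assumes add(node, vnodes) accepts a per-node vnode count as its second argument.
ring.add('big-node', 512);   // 2x vnodes
ring.add('small-node', 256);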

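🤔 Decision criteria

| Situation | Recommendation |
| --- | --- |
| Distributed cache | Hash ring + vnodes |
| Big cluster (1000s of nodes) | Jump hash |
| Replication required | Hash ring + N successors |
| Sticky load balancing | ring_hash / Maglev |
| Small cluster (about 5 nodes) | Rendezvous |
| Frequent add/remove | Hash ring (hard with jump hash) |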

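Anti-patterns

  • No vnodes: uneven distribution (for example, 50% / 30% / 20% across three nodes).
  • Poor hash function: the distribution breaks down. Use a well-mixed hash such as MD5 or xxHash.
  • Too many vnodes (10K+ per node): costs memory and rebuild time.
  • Synchronous rebalancing that interrupts traffic: migrate gradually instead, reading from the old node and writing to both during the transition.
  • Replication factor below 2: data is lost when a node dies.
  • Ignoring hash collisions: use hashes of 64 bits or more.
  • Not validating the key distribution: hot keys lead to skewed traffic.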
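The first and last items can be checked before deployment by measuring each node's share of a sample of keys. The sketch below is one way to do that under assumptions: synthetic keys named key-0, key-1, ..., the 32-bit MD5 prefix as the hash, and the hypothetical helper name keyShares. Real traffic should be sampled instead where possible.

```typescript
import { createHash } from 'node:crypto';

const hash32 = (s: string): number =>
  createHash('md5').update(s).digest().readUInt32BE(0);

// Share of keys owned by each node on a ring with `vnodes` positions per node.
function keyShares(nodes: string[], vnodes: number, numKeys: number): Record<string, number> {
  // Build the ring as a sorted list of (position, node) points.
  const points: { pos: number; node: string }[] = [];
  for (const node of nodes) {
    for (let i = 0; i < vnodes; i++) points.push({ pos: hash32(`${node}#${i}`), node });
  }
  points.sort((a, b) => a.pos - b.pos);

  const counts: Record<string, number> = {};
  for (const node of nodes) counts[node] = 0;

  for (let k = 0; k < numKeys; k++) {
    const h = hash32(`key-${k}`);
    // Binary search for the first point with pos >= h.
    let lo = 0;
    let hi = points.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (points[mid].pos < h) lo = mid + 1;
      else hi = mid;
    }
    // lo === points.length means h is past the last point: wrap to the first.
    counts[points[lo % points.length].node]++;
  }

  const shares: Record<string, number> = {};
  for (const node of nodes) shares[node] = counts[node] / numKeys;
  return shares;
}

console.log(keyShares(['a', 'b', 'c'], 1, 20_000));    // typically lopsided
console.log(keyShares(['a', 'b', 'c'], 256, 20_000));  // typically close to 1/3 each
```

Comparing the two printed results shows why a single position per node is risky: the arc lengths, and therefore the shares, depend entirely on where three hashes happen to land.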

🤖 LLM usage hints

  • Hash ring + vnodes (256-512) is the standard choice.
  • Large cluster: jump hash.
  • Load balancer: Envoy ring_hash.

🔗 Related documents