2nd/10_Wiki/Topics/Coding/CS_Rate_Limit_Algorithms.md
2026-05-09 21:08:02 +09:00

id: cs-rate-limit-algorithms
title: Rate Limit Algorithms: Token / Leaky / Sliding
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: cs, rate-limit, algorithm, vibe-coding
tech_stack: language = TS / Redis; applicable_to = Backend
applied_in:
aliases: token bucket, leaky bucket, sliding window, fixed window, GCRA

Rate Limit Algorithms

Four algorithms: fixed window, sliding window, token bucket, and leaky bucket. They trade off accuracy, memory, and burst tolerance. For distributed limiting on Redis, use a sliding window or GCRA.

📖 Key Concepts

  • Burst: are short spikes allowed?
  • Fairness: are users treated evenly?
  • Accuracy: is the count exact at window boundaries?
  • Memory: how much state is kept per user?

💻 Code Patterns

Fixed window (simplest)

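import Redis from "ioredis"; // assumed client; the calls in this note follow its API

const redis = new Redis();

async function fixedWindow(key: string, limit: number, windowSec: number): Promise<boolean> {
  // One counter per key per window; the window index is part of the key name
  const bucket = `rl:${key}:${Math.floor(Date.now() / 1000 / windowSec)}`;
  const count = await redis.incr(bucket);
  // First hit in this window: set a TTL so stale counters clean themselves up
  if (count === 1) await redis.expire(bucket, windowSec);
  return count <= limit;
}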

⚠️ Bursts are possible at the window boundary: with a limit of 100 per minute, 100 requests at 0:59 plus another 100 at 1:00 means 200 requests within about one second.
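
The boundary burst can be reproduced without Redis. The sketch below is an in-memory stand-in for the fixed-window counter; the `FixedWindowSim` class and its injected `nowMs` argument are assumptions made for illustration and are not part of the original code.

```typescript
// In-memory stand-in for the Redis fixed-window counter, with the clock injected
// so the boundary case can be replayed deterministically.
class FixedWindowSim {
  private counts = new Map<number, number>();

  constructor(private limit: number, private windowSec: number) {}

  allow(nowMs: number): boolean {
    // Same bucket index as the Redis version: floor(seconds / windowSec)
    const bucket = Math.floor(nowMs / 1000 / this.windowSec);
    const count = (this.counts.get(bucket) ?? 0) + 1;
    this.counts.set(bucket, count);
    return count <= this.limit;
  }
}

// Replay: 100 requests at 0:59.5 and 100 more at 1:00.0, limit 100 per minute.
function replayBoundaryBurst(): number {
  const limiter = new FixedWindowSim(100, 60);
  let accepted = 0;
  for (let i = 0; i < 100; i++) if (limiter.allow(59500)) accepted++;
  for (let i = 0; i < 100; i++) if (limiter.allow(60000)) accepted++;
  return accepted;
}

// All 200 requests are accepted although they arrive within half a second.
const acceptedAcrossBoundary = replayBoundaryBurst();
```

The two batches land in different window indexes (0 and 1), so each one sees a fresh counter.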

Sliding window log

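async function slidingWindowLog(key: string, limit: number, windowMs: number): Promise<boolean> {
  const now = Date.now();
  const bucket = `rl:${key}`;

  // Drop timestamps that have fallen out of the window, then count the rest.
  // Note: these steps are not atomic; concurrent callers can overshoot the limit
  // unless the sequence runs inside MULTI or a Lua script.
  await redis.zremrangebyscore(bucket, 0, now - windowMs);
  const count = await redis.zcard(bucket);
  if (count >= limit) return false;

  // Members must be unique, otherwise two requests in the same millisecond
  // overwrite each other and are counted once.
  const member = `${now}:${Math.random().toString(36).slice(2)}`;
  await redis.zadd(bucket, now, member);
  await redis.pexpire(bucket, windowMs);
  return true;
}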

→ Accurate, but memory per key grows up to limit entries (one timestamp per request).

Sliding window counter (the Cloudflare approach)

async function slidingWindowCounter(key: string, limit: number, windowSec: number): Promise<boolean> {
  const now = Date.now() / 1000;
  const cur = Math.floor(now / windowSec);
  const prev = cur - 1;
  const elapsed = (now / windowSec) - cur; // 0..1

  const [curCount, prevCount] = await Promise.all([
    redis.get(`rl:${key}:${cur}`).then(Number),
    redis.get(`rl:${key}:${prev}`).then(Number),
  ]);

  // Weighted sum: the previous window counts only for the part that still overlaps the sliding window
  const estimate = prevCount * (1 - elapsed) + curCount;
  if (estimate >= limit) return false;

  await redis.incr(`rl:${key}:${cur}`);
  await redis.expire(`rl:${key}:${cur}`, windowSec * 2);
  return true;
}

→ Avoids the fixed-window boundary burst while staying close to exact, with small memory (two counters per key).
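
A worked example of the weighted estimate may help. The function below isolates the formula from the Redis code; the name `slidingEstimate` and the sample numbers are illustrative assumptions.

```typescript
// Weighted estimate used by the sliding window counter.
// prevCount: requests in the previous fixed window
// curCount:  requests so far in the current fixed window
// elapsed:   fraction of the current window that has passed, in [0, 1)
function slidingEstimate(prevCount: number, curCount: number, elapsed: number): number {
  return prevCount * (1 - elapsed) + curCount;
}

// Limit 100 per minute. The previous window saw 100 requests, and we are
// 25% into the current one with 20 requests so far.
const estimate = slidingEstimate(100, 20, 0.25); // 100 * 0.75 + 20 = 95
const allowed = estimate < 100;                  // true: the estimate is below the limit

// Right at the boundary (elapsed = 0) the full previous window still counts,
// so the fixed-window double burst is rejected.
const atBoundary = slidingEstimate(100, 0, 0);   // 100
const burstAllowed = atBoundary < 100;           // false
```

The estimate assumes the previous window's requests were spread evenly, which is why the result is an approximation rather than an exact count.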

Token bucket

class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  
  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }
  
  consume(n = 1): boolean {
    this.refill();
    if (this.tokens < n) return false;
    this.tokens -= n;
    return true;
  }
  
  private refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRefill = now;
  }
}

→ Allows bursts up to capacity. The standard pattern.
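
To see the burst and refill behaviour concretely, here is a small sketch that mirrors the token bucket logic but takes the clock as a parameter. `ClockedTokenBucket` and its `nowMs` argument are assumptions introduced so the run is deterministic.

```typescript
// Token bucket with an injected clock (nowMs) so refill behaviour can be
// checked deterministically. Same arithmetic as a Date.now()-based bucket.
class ClockedTokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(private capacity: number, private refillPerSec: number, startMs: number) {
    this.tokens = capacity;
    this.lastRefill = startMs;
  }

  consume(nowMs: number, n = 1): boolean {
    // Refill in proportion to elapsed time, capped at capacity
    const elapsedSec = (nowMs - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = nowMs;
    if (this.tokens < n) return false;
    this.tokens -= n;
    return true;
  }
}

// capacity 5, refill 1 token per second
const demoBucket = new ClockedTokenBucket(5, 1, 0);

// Six requests at t = 0: the first five drain the bucket, the sixth is rejected.
const burst = [0, 0, 0, 0, 0, 0].map((t) => demoBucket.consume(t));

// Two seconds later two tokens have been refilled, so two of three requests pass.
const afterTwoSec = [2000, 2000, 2000].map((t) => demoBucket.consume(t));
```

The second argument of `consume` also covers cost-weighted use: an expensive call can pass `n = 10` against the same bucket.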

Token bucket Redis (atomic Lua)

-- token-bucket.lua
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])  -- per second
local now = tonumber(ARGV[3])
local cost = tonumber(ARGV[4])

local bucket = redis.call('HMGET', key, 'tokens', 'last')
local tokens = tonumber(bucket[1]) or capacity
local last = tonumber(bucket[2]) or now

-- Clamp at 0 so a caller with a lagging clock cannot produce a negative refill
local elapsed = math.max(0, now - last) / 1000
tokens = math.min(capacity, tokens + elapsed * refill_rate)

local allowed = 0
if tokens >= cost then
  tokens = tokens - cost
  allowed = 1
end

-- Persist state either way; HSET with multiple fields replaces the deprecated HMSET (Redis 4.0+)
redis.call('HSET', key, 'tokens', tokens, 'last', now)
-- After capacity / refill_rate seconds the bucket would be full again, so the key can expire
redis.call('PEXPIRE', key, math.ceil(capacity / refill_rate * 1000))
return allowed
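
One way to call this script from TypeScript is sketched below. The `EvalClient` interface, the `consumeToken` helper, and the `rl:tb:` key prefix are hypothetical; the interface only mirrors the `eval(script, numKeys, ...args)` shape that ioredis exposes, so the sketch does not depend on a specific client.

```typescript
// Minimal shape of the client this sketch needs. ioredis provides
// eval(script, numKeys, ...keysAndArgs); other clients may differ.
interface EvalClient {
  eval(script: string, numKeys: number, ...args: (string | number)[]): Promise<unknown>;
}

// Hypothetical helper: tokenBucketLua is the text of token-bucket.lua.
async function consumeToken(
  client: EvalClient,
  tokenBucketLua: string,
  key: string,
  capacity: number,
  refillPerSec: number,
  cost = 1,
): Promise<boolean> {
  // KEYS[1] = bucket key; ARGV = capacity, refill rate (per second), now (ms), cost
  const result = await client.eval(
    tokenBucketLua, 1, `rl:tb:${key}`, capacity, refillPerSec, Date.now(), cost,
  );
  // The script returns 1 when the request is allowed and 0 when it is rejected
  return Number(result) === 1;
}
```

Because the whole read-refill-write sequence runs inside one script, concurrent callers cannot interleave between the read and the write. Sending the script text on every call is the simplest option; registering it once (for example with ioredis `defineCommand`) avoids resending it.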

Leaky bucket

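// Each request is a drop in the bucket; drops leak out at a constant rate.
class LeakyBucket {
  private queue: number[] = [];
  private lastLeak = 0; // time (ms) up to which leaking has been accounted for

  constructor(private capacity: number, private leakPerSec: number) {}

  add(now: number): boolean {
    this.leak(now);
    if (this.queue.length >= this.capacity) return false;
    this.queue.push(now);
    return true;
  }

  private leak(now: number) {
    // An empty bucket has nothing to leak; restart the leak clock from now
    if (this.queue.length === 0) { this.lastLeak = now; return; }
    const intervalMs = 1000 / this.leakPerSec; // time per leaked request
    const leaked = Math.floor((now - this.lastLeak) / intervalMs);
    if (leaked <= 0) return;
    // Remove only as many requests as the elapsed time allows, oldest first.
    // (Comparing each entry's age to one interval would drain the whole queue at once.)
    this.queue.splice(0, leaked);
    this.lastLeak += leaked * intervalMs;
  }
}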

→ Constant output rate. Well suited to middleware and network traffic shaping.
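
The constant output rate is easiest to see when the bucket is used as a shaper that schedules departures. The sketch below is an illustration under stated assumptions: the `shape` function is hypothetical, arrival times are expected in ascending order, and a request that departs immediately does not occupy a slot.

```typescript
// Leaky bucket as a shaper: given arrival times (ms, ascending), compute when
// each request leaves the bucket. Requests that find the bucket full are
// dropped and reported as null.
function shape(arrivalsMs: number[], capacity: number, leakPerSec: number): (number | null)[] {
  const intervalMs = 1000 / leakPerSec; // spacing between two departures
  const departures: number[] = [];      // departure times of accepted requests
  let lastDeparture = -Infinity;

  return arrivalsMs.map((t) => {
    // Requests still waiting at time t are those scheduled to depart later than t
    const queued = departures.filter((d) => d > t).length;
    if (queued >= capacity) return null;
    // Leave immediately if the line is idle, otherwise one interval after the previous one
    const depart = Math.max(t, lastDeparture + intervalMs);
    lastDeparture = depart;
    departures.push(depart);
    return depart;
  });
}

// Five requests arrive at once; capacity 3, leak rate 10 per second (one every 100 ms).
const shaped = shape([0, 0, 0, 0, 0], 3, 10);
// shaped: [0, 100, 200, 300, null]
```

However bursty the input, accepted requests leave at most once per interval, which is the property that distinguishes this from a token bucket.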

GCRA (Generic Cell Rate Algorithm)

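// Memory: one number per key (the theoretical arrival time, TAT). Exact.
// Each call is O(1). The GET/SET pair below is not atomic; wrap it in a Lua script under concurrency.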
async function gcra(key: string, periodMs: number, burst: number): Promise<boolean> {
  const now = Date.now();
  const arrival = await redis.get(`gcra:${key}`).then(Number);
  const tat = Math.max(arrival || 0, now);
  
  const newTat = tat + periodMs;
  const allowAt = newTat - burst * periodMs;
  
  if (now < allowAt) return false;
  await redis.set(`gcra:${key}`, newTat, 'PX', burst * periodMs);
  return true;
}

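→ Memory-efficient and exact. Reportedly used at Stripe; the redis-cell module offers it as a ready-made Redis command.

Distributed (multiple servers)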
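
The GCRA arithmetic is easier to follow with a fixed clock. The sketch below keeps the same formulas in memory; `GcraSim`, the key `"u1"`, and the sample parameters are assumptions for illustration only.

```typescript
// In-memory GCRA with an injected clock. Stores a single number per key:
// the theoretical arrival time (TAT) of the next conforming request.
class GcraSim {
  private tatByKey = new Map<string, number>();

  constructor(private periodMs: number, private burst: number) {}

  allow(key: string, nowMs: number): boolean {
    const tat = Math.max(this.tatByKey.get(key) ?? 0, nowMs);
    const newTat = tat + this.periodMs;
    // Earliest time at which this request would conform
    const allowAt = newTat - this.burst * this.periodMs;
    if (nowMs < allowAt) return false;
    this.tatByKey.set(key, newTat);
    return true;
  }
}

// One request per 100 ms on average, bursts of up to 3.
const gcraDemo = new GcraSim(100, 3);

// Four requests at t = 0: three fit in the burst, the fourth is rejected.
const firstFour = [0, 0, 0, 0].map((t) => gcraDemo.allow("u1", t));

// One period later exactly one slot has been freed.
const at100 = gcraDemo.allow("u1", 100);    // true
const again100 = gcraDemo.allow("u1", 100); // false
```

Each accepted request pushes the TAT forward by one period, and a request is rejected once the TAT would run more than `burst` periods ahead of the current time.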

// Redis SETEX + atomic INCR
// or the Lua script above
// or a nearby dedicated rate-limit service (Envoy + ratelimit)

Multiple keys (per IP + per user)

const ipOk = await rateLimit(`ip:${ip}`, 1000, 60);
const userOk = await rateLimit(`user:${userId}`, 100, 60);
if (!ipOk || !userOk) return 429;

Cost-weighted (expensive endpoints)

// Token bucket: a normal request costs 1 token, an expensive one costs 10
const cost = endpoint === '/expensive' ? 10 : 1;
const ok = await tokenBucket.consume(cost);

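🤔 Decision Criteria

| Situation | Recommendation |
| --- | --- |
| Simple, inaccuracy acceptable | Fixed window |
| Accurate, small memory | Sliding window counter / GCRA |
| Bursts allowed | Token bucket |
| Constant output (queue-like) | Leaky bucket |
| Distributed, large scale | Redis Lua + sliding window / GCRA |
| Accurate and efficient | GCRA |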

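Anti-patterns

  • In-memory state only (per server): limits are applied unevenly across servers in a distributed deployment.
  • Fixed window in production: risk of boundary bursts.
  • No limit in production: vulnerable to DoS.
  • Limiting by a single identity type: limit IP, user, and API key separately.
  • Returning 429 only after expensive work (the request already reached a DB query): reject at the gateway instead.
  • No Retry-After header: clients retry without bound.
  • Same limit in test environments: keep the strict limit for prod only.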
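
For the Retry-After item, a minimal sketch of deriving the header value from token bucket state follows. The function names and the `RejectedResponse` shape are hypothetical; adapt them to the framework in use.

```typescript
// Seconds until a token bucket holding `tokens` can cover `cost` again.
function retryAfterSeconds(tokens: number, cost: number, refillPerSec: number): number {
  if (tokens >= cost) return 0;
  return Math.ceil((cost - tokens) / refillPerSec);
}

// Hypothetical response shape for a rejected request.
interface RejectedResponse {
  status: 429;
  headers: { "Retry-After": string };
}

function rejectWithRetryAfter(tokens: number, cost: number, refillPerSec: number): RejectedResponse {
  // Retry-After carries whole seconds; never advertise 0 on a rejection
  const seconds = Math.max(1, retryAfterSeconds(tokens, cost, refillPerSec));
  return { status: 429, headers: { "Retry-After": String(seconds) } };
}

// 0.5 tokens left, request costs 10, refill 2 tokens per second:
// 9.5 missing tokens / 2 per second = 4.75 s, rounded up to 5.
const rejected = rejectWithRetryAfter(0.5, 10, 2);
```

Rounding up keeps well-behaved clients from retrying a moment too early and being rejected again.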

🤖 Hints for LLM Use

  • Distributed + accurate = GCRA / sliding window counter.
  • Bursts = token bucket.
  • Multiple keys (IP + user + API key).
  • 429 + Retry-After.

🔗 Related Documents