"매 부하가 늘 때 매 graceful하게 capacity를 키울 수 있는 능력". Scalability는 매 단일 dimension(traffic, data, compute)이 아니라 매 multi-axis property. 2026년에는 매 K8s HPA + KEDA, 매 serverless auto-scale, 매 LLM token-throughput scaling이 매 일상.
Core
Two axes
Vertical (scale-up): a bigger machine. Hits its limit quickly.
Horizontal (scale-out): more machines. Requires stateless services.
```ts
// Externalize session state to Redis
import express from "express";
import session from "express-session";
import RedisStore from "connect-redis"; // default export, as in connect-redis v7
import { createClient } from "redis";

const redis = createClient({ url: "redis://redis:6379" });
await redis.connect();

const app = express();
app.use(
  session({
    store: new RedisStore({ client: redis }),
    secret: process.env.SESSION_SECRET!,
    resave: false,
    saveUninitialized: false,
  }),
);
// Any instance now sees the same session.
```
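DB sharding (hash-based)
```ts
import { crc32 } from "node:zlib"; // built into recent Node.js releases (22.2+)
import { Pool } from "pg"; // assumption: each shard is reached through a pg Pool

// One pool per shard, keyed by shard name and populated at startup.
const pool: Record<string, Pool> = {};

function shardFor(userId: string): string {
  const hash = crc32(userId); // unsigned 32-bit, so the modulo is never negative
  return `db-shard-${hash % 8}`;
}

async function getUser(id: string) {
  const shard = shardFor(id);
  return pool[shard].query("SELECT * FROM users WHERE id=$1", [id]);
}
```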
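One property of `hash % N` sharding is easier to see in numbers: the key-to-shard mapping is stable only while the shard count stays fixed. The sketch below is an illustration under stated assumptions, not part of the setup above. It swaps CRC32 for a hand-written FNV-1a hash purely to stay dependency-free, and `fnv1a`, `shardIndex`, and `movedFraction` are hypothetical helper names.

```typescript
// FNV-1a 32-bit hash (stand-in for CRC32, used here only to avoid dependencies).
// It hashes UTF-16 code units, which is adequate for ASCII ids like these.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value unsigned
  }
  return hash >>> 0;
}

// Same key and same shard count always give the same shard index.
function shardIndex(userId: string, shardCount: number): number {
  return fnv1a(userId) % shardCount;
}

// Fraction of ids whose shard changes when the shard count changes.
function movedFraction(ids: string[], from: number, to: number): number {
  const moved = ids.filter(
    (id) => shardIndex(id, from) !== shardIndex(id, to),
  ).length;
  return moved / ids.length;
}

const sampleIds = Array.from({ length: 10_000 }, (_, i) => `user-${i}`);
const fraction = movedFraction(sampleIds, 8, 9);
console.log(`keys remapped when growing 8 -> 9 shards: ${(fraction * 100).toFixed(1)}%`);
```

With a reasonably uniform hash, most keys land on a different shard after the count changes, which is why the shard count is usually fixed up front or a scheme such as consistent hashing is considered instead.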
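Read replica
```ts
import postgres from "postgres";

const writeDb = postgres({ host: "primary" });
const readDb = postgres({ host: "replica.read" });

// Writes go to the primary. Order is the application's own type, defined elsewhere.
async function placeOrder(o: Order) {
  return writeDb`INSERT INTO orders ...`;
}

// Reads go to the replica, which may lag slightly behind the primary.
async function listOrders(uid: string) {
  return readDb`SELECT * FROM orders WHERE uid=${uid}`;
}
```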
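Because replicas usually apply changes asynchronously, a read issued right after a write can miss that write. One common mitigation is to pin a user's reads to the primary for a short window after they write. The following is a minimal sketch of that idea; `ReplicaRouter`, `recordWrite`, `targetForRead`, and the window length are all hypothetical, and the in-process `Map` is an assumption made for brevity.

```typescript
type Target = "primary" | "replica";

class ReplicaRouter {
  private lastWriteAt = new Map<string, number>();
  private pinMs: number;

  constructor(pinMs: number) {
    this.pinMs = pinMs; // how long reads stay on the primary after a write
  }

  // Call after a successful write for this user.
  recordWrite(userId: string, now: number = Date.now()): void {
    this.lastWriteAt.set(userId, now);
  }

  // Reads inside the pin window go to the primary, all others to the replica.
  targetForRead(userId: string, now: number = Date.now()): Target {
    const last = this.lastWriteAt.get(userId);
    if (last !== undefined && now - last < this.pinMs) return "primary";
    return "replica";
  }
}

const router = new ReplicaRouter(2_000);
router.recordWrite("u1");
console.log(router.targetForRead("u1")); // inside the window
console.log(router.targetForRead("u2")); // no recent write
```

With several service instances, the last-write timestamp would itself need to live outside the process (a cookie or Redis, for example) to keep the service stateless.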
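```ts
// Cache-aside read with a 60-second TTL.
// Assumes `redis` is the node-redis client from above and `db` is a pg Pool.
async function getProduct(id: string) {
  const cached = await redis.get(`p:${id}`);
  if (cached) return JSON.parse(cached);

  const { rows } = await db.query("SELECT * FROM products WHERE id=$1", [id]);
  const p = rows[0];
  // node-redis spells this command setEx (setex is the ioredis spelling).
  if (p) await redis.setEx(`p:${id}`, 60, JSON.stringify(p));
  return p;
}
```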
Decision criteria
| Situation | Approach |
| --- | --- |
| Traffic spike (predictable) | HPA + capacity planning. |
| Burst (unpredictable) | Serverless / KEDA scale-to-zero. |
| Data exceeds a single node | Sharding. |
| Reads far outnumber writes | Replica. |
| Global users | Multi-region + edge cache. |
| LLM serving | vLLM tensor parallelism (TP) + KV-cache-aware routing. |
Default: stateless service + HPA + Redis cache + read replica.
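Since HPA is part of that default, it helps to know roughly how it picks a replica count: it scales the current count by the ratio of the observed metric to the target and rounds up, leaving the count alone when the ratio is close to 1. The sketch below approximates that rule. `desiredReplicas` and its option names are hypothetical, and the real controller also handles stabilization windows, unready pods, and missing metrics, none of which are modeled here.

```typescript
interface ScaleBounds {
  min: number;
  max: number;
  tolerance?: number; // assumed default of 0.1, mirroring the HPA default
}

// Approximates: desired = ceil(currentReplicas * currentMetric / targetMetric)
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number,
  targetMetric: number,
  bounds: ScaleBounds,
): number {
  const tolerance = bounds.tolerance ?? 0.1;
  const ratio = currentMetric / targetMetric;

  // Within tolerance of the target: keep the current replica count.
  const raw =
    Math.abs(ratio - 1) <= tolerance
      ? currentReplicas
      : Math.ceil(currentReplicas * ratio);

  // Clamp to the configured minimum and maximum replica counts.
  return Math.min(bounds.max, Math.max(bounds.min, raw));
}

// 4 replicas at 90% average CPU with a 60% target.
console.log(desiredReplicas(4, 90, 60, { min: 2, max: 10 }));
```

The clamp matters in practice: a very low metric would otherwise shrink the deployment below its minimum, and a large spike would request more replicas than the cluster is planned for.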