"매 cache lookup 의 fail (data 의 cache 외 fetch) ratio". CPU L1/L2/L3 cache, browser HTTP cache, application-level memoization, CDN edge cache 의 universal metric — miss / (hit + miss). 2026 perspective: Apple Silicon M4 의 huge L2 (16MB+), 매 Cloudflare/Fastly tiered cache, 매 LLM prompt-cache (Anthropic) 의 cost-driven optimization.
Key points
Hierarchy (CPU)
L1: 32-128KB per core, ~1-4 cycles.
L2: typically 256KB-2MB per core, ~10-15 cycles.
L3: 8-128MB shared, ~40 cycles.
DRAM: ~100ns, i.e. roughly 200-300 cycles at 2-3GHz. A miss that falls through to DRAM is very expensive.
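The latencies above combine into an average memory access time (AMAT) through the recurrence AMAT_i = hit_i + miss_ratio_i × AMAT_(i+1). The sketch below is illustrative only: the function name `amat` is hypothetical, the latencies are picked from the ranges listed above, and the per-level miss ratios are made-up assumptions, not measurements.

```python
def amat(levels, memory_cycles):
    """Average memory access time in cycles.

    levels: list of (hit_latency_cycles, local_miss_ratio), ordered L1 first.
    memory_cycles: latency of the final backing store (DRAM).
    """
    total = memory_cycles
    # Work backwards from DRAM: AMAT_i = hit_i + miss_ratio_i * AMAT_(i+1)
    for hit_latency, miss_ratio in reversed(levels):
        total = hit_latency + miss_ratio * total
    return total


# Assumed values: L1 = 4 cycles, L2 = 12 cycles, L3 = 40 cycles, DRAM = 250 cycles.
# The miss ratios (5%, 30%, 50%) are invented for illustration.
levels = [(4, 0.05), (12, 0.30), (40, 0.50)]
print(f"AMAT ~ {amat(levels, memory_cycles=250):.2f} cycles")
```

With these assumed numbers the result is about 7 cycles, even though DRAM costs 250: the L1 miss ratio dominates, which is why a small change in miss ratio on the hottest cache moves the average so much.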
Miss types (3C)
Compulsory (cold): the first access to a line.
Capacity: the working set is larger than the cache.
Conflict: lines collide in the same set of a set-associative cache.
(+) Coherence: invalidation by another core.
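The difference between these miss types is easier to see on a toy simulator. The following is a minimal sketch, assuming a set-associative cache with LRU replacement and 64-byte lines; `count_misses` is a hypothetical helper, not a model of any real CPU.

```python
from collections import OrderedDict


def count_misses(addresses, num_sets, ways, line_size=64):
    """Simulate a set-associative LRU cache and return the number of misses."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_size
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)          # hit: mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)    # evict the least recently used line
            cache_set[line] = True
    return misses


# Both configurations hold 8 lines in total.
# Lines 0 and 8 alternate; in a direct-mapped cache they share set 0.
ping_pong = [0, 8 * 64] * 4
conflict = count_misses(ping_pong, num_sets=8, ways=1)   # every access misses
no_conflict = count_misses(ping_pong, num_sets=1, ways=8)  # only the 2 cold misses

# Sweeping 16 lines twice through an 8-line fully associative cache:
# the working set does not fit, so the second pass misses as well.
sweep = [i * 64 for i in range(16)] * 2
capacity = count_misses(sweep, num_sets=1, ways=8)

print(conflict, no_conflict, capacity)
```

In the ping-pong trace the extra misses disappear once associativity is raised at the same total size, which marks them as conflict misses; in the sweep they remain even with full associativity, which marks them as capacity misses.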
Measurement
CPU: perf stat -e cache-misses,cache-references (Linux).
HTTP: cf-cache-status: HIT/MISS response header (Cloudflare).
App: hit/miss counters on the cache wrapper.
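For the application-level case, the counters can live directly on the cache wrapper. A minimal sketch, assuming a read-through cache in front of a loader function; the class name `CountingCache` and the `loader` parameter are hypothetical, and the store is deliberately unbounded to keep the example short.

```python
class CountingCache:
    """Read-through cache that counts hits and misses (no eviction, sketch only)."""

    def __init__(self, loader):
        self._loader = loader   # called on a miss to fetch the value
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._loader(key)
        return self._store[key]

    @property
    def miss_ratio(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


cache = CountingCache(loader=lambda k: k * k)
for key in [1, 2, 1, 1, 3, 2]:
    cache.get(key)
print(cache.hits, cache.misses, cache.miss_ratio)
```

For plain function memoization, the standard library's `functools.lru_cache` already exposes the same counters through `cache_info()`.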
Applications
Tuning the data layout (SoA vs AoS) of hot loops.
CDN cache key strategy.
Prefix stability for LLM prompt caching.
💻 Patterns
Pattern 1: Cache-friendly layout (SoA)
// BAD (AoS): padding and strided access
const particles = [{ x: 0, y: 0, vx: 1, vy: 1 }, /* ... */];

// GOOD (SoA): sequential, cache-line dense (N is the particle count)
const xs = new Float32Array(N), ys = new Float32Array(N);
const vxs = new Float32Array(N), vys = new Float32Array(N);
for (let i = 0; i < N; i++) {
  xs[i] += vxs[i];
  ys[i] += vys[i];
}
GET /api/posts/123
Cache-Control: public, max-age=300, stale-while-revalidate=86400
ETag: "abc123"
# CDN response
cf-cache-status: HIT
age: 42
Pattern 5: Stable key (CDN)
// BAD: query parameter order varies
fetch(`/api/list?b=2&a=1`);
fetch(`/api/list?a=1&b=2`); // different cache key for the same resource
// GOOD: canonical order (args is a plain object of query parameters, sorted by name)
const params = new URLSearchParams();
Object.entries(args)
  .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
  .forEach(([k, v]) => params.set(k, v));
fetch(`/api/list?${params}`);
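The same normalization can also be applied where the cache key is computed, so that clients that do not sort their parameters still share one entry. A minimal sketch using only the standard library; `canonical_url` is a hypothetical helper name, and it assumes that parameter order carries no meaning for the endpoint.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url):
    """Return the URL with its query parameters sorted by name, then value."""
    parts = urlsplit(url)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


print(canonical_url("/api/list?b=2&a=1"))
print(canonical_url("/api/list?a=1&b=2"))
```

Both calls produce the same string, so both requests map to the same cache key.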
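Pattern 6: LLM prompt cache (Anthropic)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Stable system prefix -> cache hit (up to ~90% cost reduction on the cached input tokens)
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,  # required by the API; 1024 is an arbitrary choice
    system=[
        {
            "type": "text",
            "text": LARGE_STABLE_PREFIX,
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": query}],
)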
Pattern 7: CPU perf measurement
perf stat -e cache-references,cache-misses,L1-dcache-load-misses ./app
# miss rate = cache-misses / cache-references
Pattern 8: Prefetch hint
<!-- Link prefetch (browser) -->
<link rel="prefetch" href="/next-page" as="document">
<!-- Software prefetch (native code, e.g. __builtin_prefetch in GCC/Clang or _mm_prefetch on x86) -->
Decision criteria
| Situation | Approach |
| --- | --- |
| Hot loop is slow | Profile cache-misses, switch to SoA. |
| CDN hit rate is low | Normalize the cache key, raise max-age. |
| LLM cost is high | Prompt-cache a stable prefix. |
| Working set > cache | Partition (tiling), reduce the working set. |
| Frequently read but changing data | LRU + TTL. |
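The "LRU + TTL" approach for frequently read but changing data can be sketched as follows. This is one possible shape under stated assumptions, not a production implementation: the class name `LruTtlCache` and its parameters are hypothetical, it is not thread-safe, and the injectable `clock` exists only to make expiry testable.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """Bounded cache: evicts the least recently used entry and expires old ones."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()   # key -> (expires_at, value), oldest first

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]          # expired: treat as a miss
            return default
        self._data.move_to_end(key)      # refresh recency on a hit
        return value

    def put(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)   # evict the least recently used entry


now = [0.0]                                  # fake clock for the demonstration
cache = LruTtlCache(max_entries=2, ttl_seconds=10, clock=lambda: now[0])
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")                               # "a" is now more recent than "b"
cache.put("c", 3)                            # over capacity: "b" is evicted
evicted = cache.get("b")
now[0] = 11                                  # past the TTL of every entry
expired = cache.get("a")
print(evicted, expired)
```

The size bound keeps memory predictable, while the TTL caps how long a changed value can be served stale; neither replaces explicit invalidation when stale reads are unacceptable.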
Default: measure first (perf, CDN headers, app counters); target a 90%+ hit rate on hot caches.
When: designing a cache strategy, setting a hit rate target, structuring a prompt-cache prefix.
When not: assuming a hardware-specific cache line size; check the vendor documentation instead.
❌ Anti-patterns
Cache everything: cold data wastes cache space.
Unstable cache key: hit rate near zero.
No eviction: unbounded memory growth.
Measuring without a baseline: hit rate alone is meaningless; tie it to a latency or cost outcome.
Caching mutable data without invalidation: stale-read bugs.