"매 compile 매 first call, 매 reuse 매 hot path". JIT compilation 매 source / bytecode / IR 의 native code 의 runtime translation — 매 profile-guided 의 hot region 의 optimize. 2026 ML 시대 매 JAX jit, PyTorch 2.x torch.compile, Mojo, JuliaLang 매 mainstream.
Key points
JIT mechanics
Trace: capture the input shapes and dtypes and record the computational graph.
Specialize: generate a kernel specialized to those fixed shapes.
Cache: map (function, signature) → compiled artifact.
Recompile: a shape change causes a cache miss and a recompile, so avoid shape changes in hot loops.
Compilation cache: a persistent on-disk cache mitigates cold-start cost.
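The trace, cache, and recompile steps above can be modeled without any ML framework. The following is a toy sketch in plain Python, not how JAX or torch.compile are implemented: ToyJit and its signature function are hypothetical names, "compilation" is only simulated by counting, and list length plus element type stand in for shape and dtype.

```python
from typing import Any, Callable, Dict, Tuple


class ToyJit:
    """Toy model of a JIT cache; compilation is simulated, not real."""

    def __init__(self, fn: Callable[..., Any]) -> None:
        self.fn = fn
        self.cache: Dict[Tuple, Callable[..., Any]] = {}
        self.compiles = 0

    @staticmethod
    def signature(args: tuple) -> Tuple:
        # Stand-in for (shape, dtype): length and element type of each list.
        return tuple(
            (len(a), type(a[0]).__name__ if a else "empty") for a in args
        )

    def __call__(self, *args: list) -> Any:
        sig = self.signature(args)
        if sig not in self.cache:       # cache miss: "trace + compile"
            self.compiles += 1
            self.cache[sig] = self.fn   # a real JIT stores specialized code here
        return self.cache[sig](*args)   # cache hit path: reuse the artifact


@ToyJit
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


dot([1.0, 2.0], [3.0, 4.0])            # miss: first signature
dot([5.0, 6.0], [7.0, 8.0])            # hit: same lengths and element types
dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])  # miss: the "shape" changed
print(dot.compiles)                    # prints 2
```

The point of the sketch is the cache key: anything that changes the signature (here, list length) forces a new entry, which is why shape changes inside a hot loop are expensive.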
Applications
ML training loop (JAX, torch.compile).
Numerical Python (Numba @njit).
JavaScript engines (V8, JSC).
Database query compilation (PostgreSQL LLVM JIT, Spark whole-stage codegen).
💻 Patterns
Pattern 1: JAX jit (2026 standard)
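    import jax
    import jax.numpy as jnp

    @jax.jit
    def attention(q, k, v):
        # q, k, v: (batch, heads, seq, head_dim)
        scores = jnp.einsum("bhqd,bhkd->bhqk", q, k) / jnp.sqrt(q.shape[-1])
        weights = jax.nn.softmax(scores, axis=-1)
        return jnp.einsum("bhqk,bhkd->bhqd", weights, v)

    # Example inputs (illustrative shapes)
    q = k = v = jnp.ones((2, 4, 16, 32))

    # First call: trace + compile (slow)
    # Subsequent calls with the same shapes and dtypes: cached (fast)
    out = attention(q, k, v)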
Pattern 2: torch.compile (PyTorch 2.x)
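    import torch

    # MyTransformer and dataloader are assumed to be defined elsewhere,
    # with batches already on the GPU.
    model = MyTransformer().cuda()
    compiled = torch.compile(model, mode="reduce-overhead", fullgraph=True)

    for batch in dataloader:
        out = compiled(batch)  # first batch is slow (compile), later batches are fast
        loss = out.mean()      # placeholder scalar loss; backward() needs a scalar
        loss.backward()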
Pattern 3: Static argnums (avoid retrace)
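    from functools import partial

    import jax
    import jax.numpy as jnp

    @partial(jax.jit, static_argnums=(1,))
    def topk(logits, k):
        return jax.lax.top_k(logits, k)

    logits = jnp.arange(100.0)  # illustrative input

    # k is a static argument: each distinct value gets its own compilation.
    topk(logits, 10)  # compiled and specialized for k=10
    topk(logits, 20)  # new compile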
Persistent compilation cache (JAX)

    import os

    # Set before importing jax so the setting is read at startup.
    os.environ["JAX_COMPILATION_CACHE_DIR"] = "/var/cache/jax"

    import jax

    jax.config.update("jax_persistent_cache_min_entry_size_bytes", 0)
    jax.config.update("jax_persistent_cache_min_compile_time_secs", 1.0)

    # Run a prewarm script on the first deployment so later pods cold-start fast.
Pattern 6: Recompilation detection
    from collections import Counter

    class CompileCounter:
        def __init__(self):
            self.count = Counter()

        def trace(self, fn_name: str, sig: tuple):
            # Call once per compile event for this (function, signature) pair.
            self.count[(fn_name, sig)] += 1
            if self.count[(fn_name, sig)] > 3:
                print(f"thrash: {fn_name} recompiled {self.count[(fn_name, sig)]} times")

    # Usage: hook into jax.config or the torch dynamo logger
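The counter above flags one (function, signature) pair being compiled repeatedly. Shape-driven recompilation usually looks different: one function accumulates many distinct signatures. A complementary sketch, assuming you can obtain a function name and a hashable signature for each compile event; SignatureTracker and max_signatures are hypothetical names, not part of JAX or PyTorch.

```python
from collections import defaultdict


class SignatureTracker:
    """Flags functions that have been compiled for too many distinct signatures."""

    def __init__(self, max_signatures: int = 3) -> None:
        self.max_signatures = max_signatures
        self.seen = defaultdict(set)  # fn_name -> set of signatures seen

    def record(self, fn_name: str, sig: tuple) -> bool:
        """Record one compile event; return True once fn_name exceeds the limit."""
        self.seen[fn_name].add(sig)
        return len(self.seen[fn_name]) > self.max_signatures


tracker = SignatureTracker(max_signatures=3)

# Simulate a sequence length that drifts by one each step (a common cause of thrash).
flags = [
    tracker.record("attention", ((8, seq_len, 64), "float32"))
    for seq_len in (128, 129, 130, 131)
]
print(flags)  # prints [False, False, False, True]
```

Repeating an already-seen signature does not raise the count, so steady-state calls with stable shapes never trigger the flag.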
Pattern 7: Mojo JIT (2026)
    fn matmul[M: Int, N: Int, K: Int](a: Tensor, b: Tensor) -> Tensor:
        # Shapes are compile-time parameters, so the loops are specialized
        # per shape and can be SIMD auto-vectorized.
        var c = Tensor[DType.float32](M, N)
        for i in range(M):
            for j in range(N):
                var s: Float32 = 0
                for k in range(K):
                    s += a[i, k] * b[k, j]
                c[i, j] = s
        return c
Decision criteria
| Situation | Approach |
| --- | --- |
| Numerical Python tight loop | Numba @njit |
| ML training | JAX jit or torch.compile |
| Variable shapes | Avoid JIT, or use torch.compile(dynamic=True) |
| One-shot script | JIT overhead is not worth it |
| Long-running server | JIT + persistent cache |
Default: for ML, use torch.compile(mode="reduce-overhead") or jax.jit; for tight numerical loops, use Numba.
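The one-shot versus long-running distinction comes down to simple break-even arithmetic: JIT pays off once the per-call savings have repaid the one-time compile cost. A rough sketch, with a hypothetical helper name and made-up timings chosen only for illustration; measure your own workload before deciding.

```python
import math
from typing import Optional


def break_even_calls(
    compile_s: float, eager_call_s: float, jit_call_s: float
) -> Optional[int]:
    """Number of calls after which JIT has repaid its compile time.

    Returns None when the compiled call is not faster than the eager call,
    in which case JIT never pays off.
    """
    saving = eager_call_s - jit_call_s
    if saving <= 0:
        return None
    return math.ceil(compile_s / saving)


# Illustrative numbers only: 20 s compile, 0.5 s eager step, 0.25 s compiled step.
print(break_even_calls(compile_s=20.0, eager_call_s=0.5, jit_call_s=0.25))  # prints 80
```

A script that runs a function a handful of times sits well below the break-even point, while a server handling thousands of requests passes it quickly. A persistent cache effectively drives compile_s toward zero on later starts.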