---
id: wiki-2026-0508-predictive-coding
title: Predictive Coding
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Predictive Coding Networks, PCN, Hierarchical Predictive Coding]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [neuroscience, computational-neuroscience, free-energy, brain-models]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch / JAX
---

# Predictive Coding

## One-Line Summary

> **"The brain is a prediction machine: a hierarchical loop of top-down predictions and bottom-up errors."** The idea started with Rao & Ballard's (1999) model of visual cortex, was generalized by Friston's free-energy principle, and has been revisited in deep learning since the 2020s as an alternative to backprop. Each layer predicts the activity of the layer below; only prediction errors propagate upward.

## Core Concepts

### Rao-Ballard (1999)

- Hierarchical generative model: layer L predicts the activity of layer L-1.
- Prediction error: `e_L = r_{L-1} - W_L r_L`.
- Errors drive the higher-level representations; representations drive the top-down predictions.
- End-stopping and surround suppression arise as emergent properties.

### Free-energy principle (Friston)

- The brain minimizes variational free energy, an upper bound on surprise.
- Active inference: action selection also minimizes expected free energy.
- Unifies perception, action, and learning under one objective.

### Modern PC neural networks

- **PCN as a backprop alternative**: uses only local, Hebbian-like updates.
- **Equilibrium propagation** (Scellier & Bengio): a related fixed-point training method.
- **Z-IL (Zero-divergence Inference Learning)**: PC becomes exactly equivalent to BP under specific conditions, such as feedforward-initialized states and layer-wise update timing (Song et al. 2020).
- 2024-2026 work: scaling PC to ImageNet, transformer-PC hybrids.

### Advantages over backprop

1. Local plasticity (biologically plausible).
2. No need to store activations for a backward pass.
3. Natural fit for online / continual learning.
4. Less exposed to the weight transport problem, because updates use only locally available activity and error (BP equivalence still assumes tied feedback weights).
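The Rao-Ballard prediction error and the free-energy objective described above can be made concrete with a small numeric sketch. This is an illustration under stated assumptions, not a reference implementation: it uses NumPy rather than PyTorch for brevity, column vectors as in the formula `e_L = r_{L-1} - W_L r_L`, and a simplified energy `F = 0.5 * sum ||e_L||^2` (which corresponds to Gaussian errors with unit variance). The layer sizes, the step size, and the helper names `prediction_errors` and `energy` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-level hierarchy: r0 is the input, r1 and r2 are latent states.
r0 = rng.normal(size=4)
r1 = rng.normal(size=3)
r2 = rng.normal(size=2)
W1 = rng.normal(size=(4, 3)) * 0.1  # layer 1 predicts r0
W2 = rng.normal(size=(3, 2)) * 0.1  # layer 2 predicts r1


def prediction_errors(r0, r1, r2, W1, W2):
    """Compute e_L = r_{L-1} - W_L r_L for L = 1, 2."""
    e1 = r0 - W1 @ r1
    e2 = r1 - W2 @ r2
    return e1, e2


def energy(e1, e2):
    """Assumed energy: F = 0.5 * (||e1||^2 + ||e2||^2)."""
    return 0.5 * (e1 @ e1 + e2 @ e2)


e1, e2 = prediction_errors(r0, r1, r2, W1, W2)
F_before = energy(e1, e2)

# One gradient step on the middle state r1: dF/dr1 = -W1^T e1 + e2.
step = 0.1
r1_new = r1 - step * (-W1.T @ e1 + e2)
F_after = energy(*prediction_errors(r0, r1_new, r2, W1, W2))

print(F_before, F_after)
```

The gradient for `r1` has two terms: one pulls `r1` toward explaining the layer below, the other pulls it toward the prediction from the layer above. With a small step the energy decreases, which is the "fast inference" half of the nested loop.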
## 💻 Patterns

### Minimal PC layer (PyTorch)

```python
import torch
import torch.nn as nn


class PCLayer(nn.Module):
    """One level of the hierarchy: holds a state r and predicts the layer below."""

    def __init__(self, dim_below, dim_above):
        super().__init__()
        # Generative (top-down) weights, shape (dim_above, dim_below).
        self.W = nn.Parameter(torch.randn(dim_above, dim_below) * 0.1)
        self.r = None  # representation state, reset for each batch

    def init_state(self, batch_size, device):
        # States are updated by hand during inference, so no autograd tracking is needed.
        self.r = torch.zeros(batch_size, self.W.shape[0], device=device)

    def predict(self):
        return self.r @ self.W  # top-down prediction of the layer below

    def error(self, below):
        return below - self.predict()  # bottom-up prediction error
```

### Inference loop (energy minimization)

```python
def pc_inference(layers, x, n_steps=20, lr_r=0.1):
    """Relax the states r by gradient descent on F = 0.5 * sum_i ||e_i||^2."""
    for L in layers:
        L.init_state(x.size(0), x.device)
    activity = [x] + [L.r for L in layers]  # activity[0] is the clamped input

    def compute_errors():
        return [activity[i] - L.predict() for i, L in enumerate(layers)]

    with torch.no_grad():  # gradients are written out explicitly below
        for _ in range(n_steps):
            errors = compute_errors()
            for i, L in enumerate(layers):
                # dF/dr_i = -e_i W_i^T (error below) + e_{i+1} (error against the layer above)
                grad = -errors[i] @ L.W.T
                if i + 1 < len(layers):
                    grad = grad + errors[i + 1]
                L.r = L.r - lr_r * grad
                activity[i + 1] = L.r
        errors = compute_errors()  # recompute so errors match the final states
    return errors, activity
```

### Weight update (local Hebbian)

```python
def pc_weight_update(layers, errors, activity, lr_w=0.01):
    """Local update: dW is the product of the state above and the error below."""
    with torch.no_grad():
        for i, L in enumerate(layers):
            r_above = activity[i + 1]  # same tensor as L.r after inference
            dW = r_above.T @ errors[i] / errors[i].size(0)  # batch average
            L.W += lr_w * dW  # descends F, since dF/dW = -r^T e
```

### Active inference (action selection)

```python
import torch


def select_action(model, state, candidate_actions):
    """Pick the action minimizing expected free energy G = risk + ambiguity.

    `model` is assumed to expose transition, entropy and kl_to_preferred.
    """
    G = []
    for a in candidate_actions:
        next_belief = model.transition(state, a)
        ambiguity = model.entropy(next_belief)     # epistemic term
        risk = model.kl_to_preferred(next_belief)  # pragmatic term
        G.append(float(ambiguity + risk))
    best = int(torch.argmin(torch.tensor(G)))
    return candidate_actions[best]
```

### Z-IL (PC ≡ BP under specific conditions)

```python
# Song et al. 2020: with states initialized by a feedforward pass and
# layer-wise timed updates, PC weight updates equal those produced by BP.
# Critical detail: feedback weights = transpose of forward weights (tied).
```

## Decision Criteria

| Situation | Approach |
|---|---|
| Biological plausibility required | Predictive coding |
| Energy efficiency on neuromorphic HW | PC / spiking PC |
| SOTA accuracy on ImageNet | Backprop CNN/ViT (still wins) |
| Continual learning | PC w/ uncertainty-weighted errors |
| Interpretation of cortical hierarchy | PC as theory |

**Default**: BP for engineering; PC for neuroscience modeling or neuromorphic deployment.

## 🔗 Graph

- Parent: [[Computational-Neuroscience-RL|Computational-Neuroscience]] · [[Free-Energy-Principle]]
- Variants: [[Active-Inference]]
- Applications: [[Bayesian-Brain]] · [[Neuromorphic-Computing]]
- Adjacent: [[데이터 사이언스 및 ML 엔지니어링|Backpropagation]] · [[Variational-Inference]]

## 🤖 LLM Usage

**When**: brain-inspired model design, biologically plausible learning, continual learning, neuromorphic chips.

**When not**: pure engineering goals, where backprop is faster and more accurate.

## ❌ Anti-patterns

- **PC as a drop-in BP replacement**: still slower and less accurate at scale.
- **Confusing inference with learning**: PC has nested loops (fast inference over states, slow updates of weights).
- **Ignoring weight symmetry**: untied feedback breaks BP equivalence.
- **Free-energy hand-waving**: the free-energy equation must be operationalized concretely.

## 🧪 Verification / Duplicates

- Verified (Rao & Ballard 1999 Nat Neurosci, Friston 2010, Song et al. 2020 NeurIPS).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full PC theory + modern PC NN code |