"The brain is a prediction machine: a hierarchical loop of top-down predictions and bottom-up prediction errors." The idea started with Rao & Ballard's (1999) model of visual cortex, was generalized by Friston's free-energy principle, and has drawn renewed attention in deep learning since the 2020s as an alternative to backprop. Each layer predicts the activity of the layer below; only prediction errors propagate upward.
Core ideas
Rao-Ballard (1999)
Hierarchical generative model: layer L predicts layer L-1 activity.
End-stopping and surround suppression arise as emergent properties of the model.
Free-energy principle (Friston)
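One common way to write the hierarchy down is sketched below. It rests on simplifying assumptions that are not taken from the source: linear predictions, unit-variance Gaussian errors, and row-vector states r_l with r_0 equal to the input x. The original Rao & Ballard model additionally includes nonlinearities and priors on the states and weights.

```latex
\begin{aligned}
e_l &= r_{l-1} - r_l W_l, \qquad l = 1, \dots, L, \quad r_0 = x \\
F &= \tfrac{1}{2} \sum_{l=1}^{L} \lVert e_l \rVert^2 \\
\Delta r_l &\propto -\frac{\partial F}{\partial r_l} = e_l W_l^{\top} - e_{l+1}, \qquad e_{L+1} \equiv 0 \\
\Delta W_l &\propto -\frac{\partial F}{\partial W_l} = r_l^{\top} e_l
\end{aligned}
```

Both updates use only quantities available at level l and its neighbours, which is the sense in which the scheme is called local.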
The brain minimizes variational free energy, which is an upper bound on surprise (negative log model evidence).
Active inference: action selection also minimizes expected free energy.
Unifies perception, action, learning under one objective.
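To make the "upper bound on surprise" claim concrete, the sketch below evaluates variational free energy for a hypothetical two-state discrete model. The prior and likelihood values are illustrative assumptions, not taken from any cited paper. For any belief q(z), F[q] = E_q[ln q(z) - ln p(x, z)] is at least the surprise -ln p(x), with equality when q is the exact posterior.

```python
import math

# Hypothetical toy generative model: binary hidden cause z, one observed outcome x.
prior = [0.7, 0.3]        # p(z)
likelihood = [0.2, 0.9]   # p(x | z) for the outcome that was actually observed

joint = [p * l for p, l in zip(prior, likelihood)]  # p(x, z)
evidence = sum(joint)                               # p(x)
surprise = -math.log(evidence)
posterior = [j / evidence for j in joint]           # p(z | x)


def free_energy(q):
    """Variational free energy F[q] = E_q[ln q(z) - ln p(x, z)]."""
    return sum(
        qz * (math.log(qz) - math.log(j)) for qz, j in zip(q, joint) if qz > 0
    )


f_guess = free_energy([0.5, 0.5])  # an arbitrary belief
f_exact = free_energy(posterior)   # the exact posterior

print(f"surprise={surprise:.4f}  F(guess)={f_guess:.4f}  F(posterior)={f_exact:.4f}")
```

Perception in this picture amounts to adjusting q so that F approaches the surprise; the gap between the two is the KL divergence from q to the true posterior.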
Modern PC neural networks
PCN as backprop alternative: local Hebbian-like updates only.
Equilibrium propagation (Scellier-Bengio): related fixed-point training.
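Z-IL (Zero-divergence Inference Learning): PC weight updates exactly equal BP updates under specific conditions (hidden prediction errors initialized to zero by a feedforward pass, inference rate 1, each layer's weights updated at one specific inference step) (Song et al. 2020). Approximate equivalence at inference convergence was shown earlier by Whittington & Bogacz (2017).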
2024-2026 work: scaling PC to ImageNet, transformer-PC hybrids.
Advantages over backprop
Local plasticity (biologically plausible).
No separate global backward pass: each layer updates from its own state and error, so no activations are cached for a later gradient sweep.
Natural for online / continual learning.
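Can sidestep the weight transport problem when feedback weights are learned separately, although untied feedback gives up exact BP equivalence.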
💻 Patterns
Minimal PC layer (PyTorch)
import torch
import torch.nn as nn


class PCLayer(nn.Module):
    def __init__(self, dim_below, dim_above):
        super().__init__()
        # W maps the state of this layer to a prediction of the layer below
        self.W = nn.Parameter(torch.randn(dim_above, dim_below) * 0.1)
        self.r = None  # state, set per batch

    def init_state(self, batch_size, device):
        self.r = torch.zeros(
            batch_size, self.W.shape[0], device=device, requires_grad=True
        )

    def predict(self):
        return self.r @ self.W  # top-down prediction of layer below

    def error(self, below):
        return below - self.predict()
Inference loop (energy minimization)
def pc_inference(layers, x, n_steps=20, lr_r=0.1):
    # x: input at the bottom of the hierarchy
    for L in layers:
        L.init_state(x.size(0), x.device)
    activity = [x] + [L.r for L in layers]

    def compute_errors():
        # error at level i: activity[i] minus the prediction made by layer i
        return [activity[i] - L.predict() for i, L in enumerate(layers)]

    for _ in range(n_steps):
        errors = compute_errors()
        # update r via gradient descent on free energy
        for i, L in enumerate(layers):
            grad = -errors[i] @ L.W.T
            if i + 1 < len(layers):
                grad = grad + errors[i + 1]
            L.r = (L.r - lr_r * grad).detach().requires_grad_(True)
            activity[i + 1] = L.r
    # recompute so the returned errors match the final states
    return compute_errors()
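The inference loop above only relaxes the states; learning wraps it in a slower outer loop that updates the weights. The sketch below shows that nesting for a single linear layer. It uses NumPy rather than PyTorch to stay dependency-light, and the data, dimensions, and learning rates are arbitrary assumptions. The weight step uses only the layer's own state r and error e (the negative gradient of F = 0.5 * ||e||^2 with respect to W is r^T e), which is what "local Hebbian-like update" refers to.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 6))             # batch of 8 inputs at the bottom level
W = rng.normal(scale=0.1, size=(3, 6))  # (dim_above, dim_below), as in PCLayer


def infer(x, W, n_steps=50, lr_r=0.1):
    """Fast loop: relax the state r with the weights held fixed."""
    r = np.zeros((x.shape[0], W.shape[0]))
    for _ in range(n_steps):
        e = x - r @ W           # bottom-up prediction error
        r = r + lr_r * e @ W.T  # descend F = 0.5 * ||e||^2 with respect to r
    return r


energies = []
lr_w = 0.1
for epoch in range(30):
    r = infer(x, W)  # inner (fast) loop, restarted from zero each time
    e = x - r @ W
    energies.append(0.5 * float(np.sum(e ** 2)))
    # Outer (slow) loop: the step for W needs only r and e from this layer.
    W = W + lr_w * (r.T @ e) / x.shape[0]

print(f"energy before learning: {energies[0]:.3f}, after: {energies[-1]:.3f}")
```

Keeping the two loops separate matters: updating W before r has settled mixes inference error with learning signal.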
Active inference: action selection
def select_action(model, state, candidate_actions):
    """Pick the action minimizing expected free energy G = risk + ambiguity."""
    # `model` is any object providing transition, entropy and kl_to_preferred
    G = []
    for a in candidate_actions:
        next_belief = model.transition(state, a)
        ambiguity = model.entropy(next_belief)
        risk = model.kl_to_preferred(next_belief)
        G.append(float(ambiguity + risk))
    return candidate_actions[int(torch.argmin(torch.tensor(G)))]
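The `model` object in `select_action` is left abstract. As one way to make it concrete, the sketch below defines a hypothetical `ToyModel` for a two-state world and a torch-free restatement of the same selection rule. All probabilities, the class name, and the reading of `entropy` as expected observation entropy (ambiguity) are illustrative assumptions.

```python
import math


class ToyModel:
    """Hypothetical two-state world; every number here is illustrative."""

    def __init__(self):
        # p(next state | action); the current state is ignored for simplicity
        self.trans = {"stay": [0.9, 0.1], "move": [0.2, 0.8]}
        # p(observation | state): state 0 is observed reliably, state 1 is noisy
        self.obs = [[0.95, 0.05], [0.5, 0.5]]
        # preferred distribution over states (the "goal")
        self.preferred = [0.1, 0.9]

    def transition(self, state, action):
        return self.trans[action]

    def entropy(self, belief):
        # ambiguity: expected entropy of observations under the predicted states
        return sum(
            b * -sum(p * math.log(p) for p in row if p > 0)
            for b, row in zip(belief, self.obs)
        )

    def kl_to_preferred(self, belief):
        # risk: KL divergence from predicted states to preferred states
        return sum(
            b * math.log(b / c) for b, c in zip(belief, self.preferred) if b > 0
        )


def expected_free_energy(model, state, action):
    belief = model.transition(state, action)
    return model.entropy(belief) + model.kl_to_preferred(belief)


def select_action(model, state, candidate_actions):
    return min(
        candidate_actions, key=lambda a: expected_free_energy(model, state, a)
    )


model = ToyModel()
G = {a: expected_free_energy(model, 0, a) for a in ("stay", "move")}
best = select_action(model, 0, ["stay", "move"])
print(G, best)
```

In this toy setup "move" leads to noisier observations (higher ambiguity) but lands much closer to the preferred states (lower risk), so the trade-off between the two terms is visible in the numbers.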
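Z-IL (PC ≡ BP under specific conditions)
# Song et al. 2020: with hidden prediction errors initialized to zero (feedforward pass),
# inference rate 1, and each layer's weights updated at one specific inference step,
# PC weight updates equal those produced by BP exactly.
# Critical detail: feedback weights = transpose of forward weights (tied).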
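The equivalence can be checked numerically on a small network. The sketch below is an illustration, not the authors' code, under these assumptions: a tanh MLP with predictions W[l] f(x[l]), squared-error loss, hidden states initialized by a feedforward pass, the output clamped to the target, inference rate 1, and the weight matrix feeding level L - t read out at inference step t. Weights are held fixed during inference for simplicity. The `B` argument holds the feedback matrices, so the same function also shows what happens when they are not tied to the forward weights.

```python
import numpy as np

f = np.tanh


def df(a):
    return 1.0 - np.tanh(a) ** 2


def forward(W, x0):
    a = [x0]
    for Wl in W:
        a.append(Wl @ f(a[-1]))  # a[l+1] = W[l] f(a[l])
    return a


def bp_updates(W, x0, y):
    """Descent directions of 0.5 * ||y - a_L||^2 computed by backprop."""
    L = len(W)
    a = forward(W, x0)
    delta = [None] * (L + 1)
    delta[L] = y - a[L]
    for l in range(L - 1, 0, -1):
        delta[l] = df(a[l]) * (W[l].T @ delta[l + 1])
    return [np.outer(delta[l + 1], f(a[l])) for l in range(L)]


def zil_updates(W, B, x0, y):
    """PC updates under the conditions above; B[l] carries errors from level l+1 to l."""
    L = len(W)
    x = forward(W, x0)  # feedforward init: all hidden errors start at zero
    x[L] = y.copy()     # clamp the output to the target
    updates = [None] * L
    for t in range(L):
        eps = [None] + [x[l] - W[l - 1] @ f(x[l - 1]) for l in range(1, L + 1)]
        k = L - 1 - t   # weight matrix read out at step t
        updates[k] = np.outer(eps[k + 1], f(x[k]))
        # synchronous inference step on the hidden states with rate 1
        hidden = [
            x[l] + (-eps[l] + df(x[l]) * (B[l] @ eps[l + 1])) for l in range(1, L)
        ]
        x = [x[0]] + hidden + [x[L]]
    return updates


rng = np.random.default_rng(0)
sizes = [4, 5, 3, 2]
W = [rng.normal(scale=0.5, size=(sizes[l + 1], sizes[l])) for l in range(3)]
x0, y = rng.normal(size=sizes[0]), rng.normal(size=sizes[-1])

bp = bp_updates(W, x0, y)
tied = zil_updates(W, [Wl.T for Wl in W], x0, y)
untied = zil_updates(W, [rng.normal(scale=0.5, size=Wl.T.shape) for Wl in W], x0, y)

print("tied feedback matches BP:", all(np.allclose(z, b) for z, b in zip(tied, bp)))
print("untied feedback matches BP:", np.allclose(untied[0], bp[0]))
```

With tied feedback the error that reaches each layer at its readout step is the backprop delta; with random feedback matrices only the top layer's update still agrees.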
Decision criteria

| Situation | Approach |
|---|---|
| Biological plausibility required | Predictive coding |
| Energy efficiency on neuromorphic HW | PC / spiking PC |
| SOTA accuracy on ImageNet | Backprop CNN/ViT (still wins) |
| Continual learning | PC w/ uncertainty-weighted errors |
| Interpretation of cortical hierarchy | PC as theory |
Default: BP for engineering; PC for neuroscience modeling or neuromorphic deployment.
When to use: brain-inspired model design, biologically plausible learning, continual learning, neuromorphic chips.
When not to use: pure engineering goals, where backprop is faster and more accurate.
❌ Anti-patterns
PC as drop-in BP replacement: still slower and less accurate at scale.
Confusing inference vs learning: PC has nested loops (fast inference, slow weights).
Ignoring weight symmetry: untied feedback breaks BP equivalence.
Free-energy hand-wave: equation must be operationalized concretely.
🧪 Verification / duplicates
Verified (Rao & Ballard 1999 Nat Neurosci, Friston 2010, Song et al. 2020 NeurIPS).
Confidence: A.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full PC theory + modern PC NN code |