---
id: wiki-2026-0508-feedback-loops
title: Feedback Loops
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Feedback Control, Closed Loop, Cybernetic Feedback]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [systems, control-theory, cybernetics, dynamics]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: control-systems
---

# Feedback Loops

## One-line summary
> **"A system's output re-enters as its input, and that re-entry determines whether the system stabilizes or amplifies."** Wiener's *Cybernetics* (1948) is the unifying frame. Modern instances as of 2026 include RLHF, autoscaling, climate tipping points, and social media engagement loops.

## Core

### 2 polarities
- **Negative (balancing)**: dampens deviation. Examples: thermostat, homeostasis, PID controller.
- **Positive (reinforcing)**: amplifies deviation. Examples: viral growth, asset bubble, ice-albedo feedback, runaway selection.

### 5 archetypes (Senge)
- **Limits to growth**: a reinforcing loop coupled with a balancing loop, producing an S-curve.
- **Shifting the burden**: a quick fix weakens the capacity to address the underlying issue.
- **Tragedy of the commons**: individually reinforcing behavior leads to collective collapse.
- **Fixes that fail**: a short-term fix backfires in the long term.
- **Success to the successful**: a winner-take-all reinforcing loop.

### Stability concepts
- **Gain**: the output/input ratio.
- **Phase margin**: the stability buffer; above 45° is generally considered robust.
- **Time delay**: adds phase lag and drives instability (analyzed with Bode and Nyquist plots).
- **Setpoint vs. error**: error = setpoint (target) minus the actual measured value.

### Applications
1. PID controller (industrial process).
2. RLHF (LLM preference loop).
3. Autoscaling (Kubernetes HPA, target CPU).
4. Insulin-glucose homeostasis.
5. Market price discovery.

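The polarity and time-delay concepts above can be illustrated with a small simulation. This is a minimal sketch, not a model of any particular plant: it assumes a discrete-time integrator whose state moves by `gain * error` each step, and `simulate_loop` and all of its parameters are hypothetical names introduced only for illustration. A positive gain gives a balancing loop, a negative gain a reinforcing one, and `delay` makes the controller act on a stale measurement.

```python
def simulate_loop(gain: float, delay: int = 0, steps: int = 60,
                  setpoint: float = 1.0, x0: float = 0.0) -> list[float]:
    """Simulate x[t+1] = x[t] + gain * (setpoint - x[t - delay]).

    Hypothetical toy model: gain > 0 is negative (balancing) feedback,
    gain < 0 is positive (reinforcing) feedback.
    """
    xs = [x0]
    for t in range(steps):
        # The controller only sees the state from `delay` steps ago;
        # before enough history exists it sees the initial state.
        measured = xs[max(0, t - delay)]
        error = setpoint - measured
        xs.append(xs[-1] + gain * error)
    return xs


balancing = simulate_loop(gain=0.5)            # error halves every step
reinforcing = simulate_loop(gain=-0.1)         # deviation grows 10% per step
delayed = simulate_loop(gain=0.8, delay=3)     # balancing gain, stale measurement

print(f"balancing final error:   {abs(balancing[-1] - 1.0):.2e}")
print(f"reinforcing final error: {abs(reinforcing[-1] - 1.0):.2e}")
print(f"delayed peak error:      {max(abs(v - 1.0) for v in delayed[-10:]):.2e}")
```

In this toy model the third run uses a gain that is stable without delay yet oscillates with growing amplitude once the measurement lags by three steps, which is the failure mode that phase margin is meant to guard against.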
## 💻 Patterns

### PID controller
```python
class PID:
    def __init__(self, kp: float, ki: float, kd: float, setpoint: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, measurement: float, dt: float) -> float:
        # Error is target minus actual (negative feedback)
        error = self.setpoint - measurement
        self.integral += error * dt
        # Guard against division by zero when dt is not positive
        derivative = (error - self.prev_error) / dt if dt > 0 else 0.0
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

### Logistic growth (limits-to-growth archetype)
```python
import numpy as np
from scipy.integrate import odeint

def logistic(N, t, r, K):
    # Reinforcing term (r * N) damped by the balancing term (1 - N / K)
    return r * N * (1 - N / K)

t = np.linspace(0, 50, 500)
# r = 0.3 (growth rate), K = 1000 (carrying capacity), N(0) = 1
N = odeint(logistic, y0=1, t=t, args=(0.3, 1000))
```

### Autoscaling reactive loop
```python
import math

def autoscale_step(current_replicas: int, cpu_utilization: float,
                   target: float = 0.7, max_replicas: int = 100) -> int:
    # Round up, as the Kubernetes HPA formula does, so a partial replica
    # of extra load still adds capacity instead of being truncated away
    desired = math.ceil(current_replicas * cpu_utilization / target)
    return max(1, min(desired, max_replicas))
```

### Reinforcement learning (RLHF reward model loop)
```python
def compute_advantages(rewards: list[float]) -> list[float]:
    # Simplified stand-in: mean-baseline advantages.
    # Real PPO setups typically use a learned value model instead.
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]

def rlhf_iteration(policy, reward_model, prompts, ppo_optimizer):
    # policy, reward_model, and ppo_optimizer are duck-typed placeholders,
    # not the API of any specific library
    rollouts = [policy.generate(p) for p in prompts]
    rewards = [reward_model.score(p, r) for p, r in zip(prompts, rollouts)]
    advantages = compute_advantages(rewards)
    ppo_optimizer.step(policy, rollouts, advantages)
    # Loop closes: policy → output → reward → policy update
```

### Stability check (Bode phase margin)
```python
import numpy as np
from scipy.signal import TransferFunction, bode

# Open-loop transfer function G(s) = 1 / (s^3 + 2s^2 + 3s + 1), 3rd order
open_loop = TransferFunction([1], [1, 2, 3, 1])
w, mag, phase = bode(open_loop)  # w in rad/s, mag in dB, phase in degrees

# Phase margin: phase at gain crossover + 180°
# Gain crossover is where the magnitude falls through 0 dB.
crossings = np.where((mag[:-1] > 0) & (mag[1:] <= 0))[0]
# None when the gain never crosses 0 dB, as for this plant (|G(jw)| < 1 for all w > 0)
phase_margin = 180.0 + phase[crossings[0] + 1] if crossings.size else None
```

### Detect runaway positive feedback
```python
import numpy as np

def detect_runaway(time_series: list[float], window: int = 10,
                   threshold: float = 1.5) -> bool:
    """Exponential growth detector: slope of a log-linear fit.

    Flags the series when its fitted growth over the last `window`
    samples exceeds a factor of `threshold`.
    """
    if len(time_series) < window:
        return False
    # Clamp to a small positive value so the log is defined
    y = np.log(np.maximum(time_series[-window:], 1e-9))
    slope = np.polyfit(range(window), y, 1)[0]
    return bool(slope > np.log(threshold) / window)
```

## Decision criteria
| Situation | Approach |
|---|---|
| Process regulation, setpoint tracking | PID (negative feedback) |
| Growth modeling | logistic / Gompertz (mixed) |
| Cascading failure prevention | rate limiters + circuit breakers |
| Slow process w/ delay | feed-forward + Smith predictor |
| ML training | RLHF / GRPO with KL regularization |

**Default**: use negative feedback as the default for stability. Put an explicit guard (rate limit, kill switch) on any positive feedback loop.

## 🔗 Graph
- Parents: [[Cybernetics Foundations|Cybernetics]] · [[Control Theory]] · [[Systems_Thinking|Systems Thinking]]
- Applications: [[RLHF]] · [[Homeostasis (항상성)|Homeostasis]]

## 🤖 LLM usage
**When to use**: archetype identification, initial PID gain estimation, converting system dynamics diagrams to stock-flow form.
**When not to use**: safety-critical control gain tuning. Hardware-in-the-loop testing and verification of the actual phase margin are required there.

## ❌ Anti-patterns
- **Ignoring delay**: time delay destabilizes a PID loop; dead-time compensation is needed.
- **Assuming higher gain means better tracking**: excessive gain causes oscillation and noise amplification.
- **Open-loop control for safety-critical systems**: open loop cannot reject disturbances; closed loop is required.
- **Deploying a reinforcing loop without guards**: optimizing a viral metric can let social harm run away (engagement maximization → polarization).

## 🧪 Verification / duplicates
- Verified (Wiener "Cybernetics" 1948, Åström & Murray "Feedback Systems" 2nd ed, Sterman "Business Dynamics" 2000, Senge "Fifth Discipline" rev. ed).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: added PID, Senge archetypes, RLHF/autoscaling |