---
id: wiki-2026-0508-signal-in-noise
title: Signal in Noise
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [SNR, Signal-to-Noise Ratio, Detectability]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [signal-processing, statistics, detection-theory, snr]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: scipy / numpy / torch
---

# Signal in Noise

## One-line summary

> **"'Signal in noise' is the problem of detecting informative variation against a stochastic background."** The lineage runs from Shannon's channel capacity (1948) and the Wiener filter (1949) to modern denoising diffusion (Song & Ermon 2019, EDM2 2024). As of 2026 it is a universal motif, from query–document retrieval in LLM RAG pipelines and gravitational-wave detection (LIGO-Voyager) to ML feature engineering.

## Core concepts

### SNR definitions

- **Power SNR**: `P_signal / P_noise` (linear).
- **dB SNR**: `10·log10(P_signal/P_noise)`.
- **PSNR (image)**: `20·log10(MAX/RMSE)`.
- **Detection SNR (matched filter)**: `⟨s, s⟩ / σ²`, signal energy over noise variance. For a known signal in white Gaussian noise it sets detection performance; the sufficient statistic itself is the correlation `⟨x, s⟩`.

### Detection theory 4-quadrant

- **Hit / Miss / FA / CR**: hit, miss, false alarm, correct rejection. Sweeping the decision threshold traces the ROC curve, summarized by AUC.
- **d′ (d-prime)**: `Z(hit) − Z(FA)`, where `Z` is the inverse standard normal CDF; a measure of perceptual sensitivity.
- **Likelihood ratio**: thresholding it gives the optimal Neyman–Pearson decision (maximum hit rate at a fixed false-alarm rate).

### Noise types

- **White Gaussian**: i.i.d. samples; the baseline assumption.
- **Pink (1/f)**: brain LFP, financial returns.
- **Shot**: Poisson statistics; photon detectors.
- **Quantization**: the fundamental floor set by ADC bit depth.

### Applications

1. RAG retrieval (relevant doc = signal, distractor = noise): a reranker raises the SNR.
2. RLHF reward modeling: robust losses for noisy preference labels (Wu et al 2024).
3. Sensor fusion (Kalman): the trade-off between process noise and measurement noise.
4. Diffusion models: the noise schedule is effectively the generation curriculum.
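
### Worked sketch: Neyman–Pearson detection of a known signal

The detection SNR, d′, and Neyman–Pearson items in the core concepts above can be tied together in one small example. The sketch below assumes a known signal `s` in white Gaussian noise of standard deviation `sigma`; the function names `np_detector` and `simulate` are hypothetical helpers introduced for illustration, not part of any library. It computes the threshold on the correlation statistic `T(x) = <x, s>` for a target false-alarm rate and compares the theoretical hit rate with a Monte Carlo estimate.

```python
import numpy as np
from scipy.stats import norm


def np_detector(s, sigma, pfa):
    """Neyman-Pearson detector for a known signal s in white Gaussian noise.

    Returns (threshold on T(x) = <x, s>, theoretical detection probability).
    Assumes noise samples are i.i.d. N(0, sigma^2).
    """
    s = np.asarray(s, dtype=float)
    energy = float(s @ s)                # <s, s>
    d = np.sqrt(energy) / sigma          # square root of the detection SNR <s, s> / sigma^2
    # H0: T ~ N(0, sigma^2 * energy);  H1: T ~ N(energy, sigma^2 * energy)
    threshold = sigma * np.sqrt(energy) * norm.isf(pfa)
    pd = norm.sf(norm.isf(pfa) - d)
    return threshold, pd


def simulate(s, sigma, pfa, trials=20000, seed=0):
    """Monte Carlo estimate of (hit rate, false-alarm rate) at the NP threshold."""
    rng = np.random.default_rng(seed)
    s = np.asarray(s, dtype=float)
    threshold, _ = np_detector(s, sigma, pfa)
    t0 = rng.normal(0.0, sigma, size=(trials, s.size)) @ s        # noise only (H0)
    t1 = (rng.normal(0.0, sigma, size=(trials, s.size)) + s) @ s  # signal + noise (H1)
    return float(np.mean(t1 > threshold)), float(np.mean(t0 > threshold))


if __name__ == "__main__":
    t = np.arange(64) / 64
    s = np.sin(2 * np.pi * 5 * t)        # illustrative template: 5 cycles over 64 samples
    thr, pd = np_detector(s, sigma=2.0, pfa=0.05)
    hit, fa = simulate(s, sigma=2.0, pfa=0.05)
    print(f"threshold={thr:.3f}  theory P_D={pd:.3f}  empirical hit={hit:.3f}  FA={fa:.3f}")
```

Under these assumptions, `norm.ppf(pd) - norm.ppf(pfa)` equals `sqrt(<s, s>) / sigma`, so d′ is the square root of the detection SNR. With colored or non-Gaussian noise the statistic and threshold would need to change (for example by pre-whitening), which this sketch does not cover.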

## 💻 Patterns

### SNR / PSNR (numpy)

```python
import numpy as np

def snr_db(signal, noise):
    """SNR in dB from separate signal and noise arrays."""
    ps = np.mean(signal ** 2)
    pn = np.mean(noise ** 2)
    return 10.0 * np.log10(ps / pn)

def psnr(y_true, y_pred, max_val=1.0):
    """PSNR in dB; max_val is the peak value (1.0 for floats in [0, 1], 255 for 8-bit)."""
    # Cast to float so unsigned integer images do not wrap on subtraction.
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mse = np.mean(diff ** 2)
    return 20.0 * np.log10(max_val / np.sqrt(mse + 1e-12))
```

### Matched filter (1D)

```python
import numpy as np
from scipy.signal import correlate

def matched_filter(x, template):
    # Matched filtering convolves with the time-reversed template, which is
    # the same as cross-correlating with the template itself (no reversal here).
    h = template / np.linalg.norm(template)
    y = correlate(x, h, mode="same")
    return y  # peaks where the template is centered in x
```

### Wiener filter (frequency domain)

```python
import numpy as np

def wiener(x_noisy, sxx, snn):
    # sxx, snn: signal and noise PSDs on the rfft grid (length len(x_noisy)//2 + 1)
    X = np.fft.rfft(x_noisy)
    H = sxx / (sxx + snn + 1e-12)  # Wiener gain per frequency bin, in [0, 1]
    return np.fft.irfft(H * X, n=len(x_noisy))
```

### Adaptive threshold via d′

```python
import numpy as np
from scipy.stats import norm

def d_prime(hit_rate, fa_rate, eps=1e-6):
    # Clip away from 0 and 1 so the inverse normal CDF stays finite.
    h = np.clip(hit_rate, eps, 1 - eps)
    f = np.clip(fa_rate, eps, 1 - eps)
    return norm.ppf(h) - norm.ppf(f)
```

### Spectral subtraction (denoise speech)

```python
import numpy as np

def spectral_subtract(y, noise_psd, frame=512, hop=128, alpha=1.0):
    # noise_psd: noise power per rfft bin of a windowed frame (length frame//2 + 1)
    win = np.hanning(frame)
    out = np.zeros_like(y, dtype=float)
    wsum = np.zeros_like(y, dtype=float)  # overlap-add window normalization
    for i in range(0, len(y) - frame + 1, hop):  # +1 so the last full frame is included
        seg = y[i:i+frame] * win
        Y = np.fft.rfft(seg)
        # Subtract the noise magnitude, floor at zero, keep the noisy phase.
        mag = np.maximum(np.abs(Y) - alpha * np.sqrt(noise_psd), 0)
        out[i:i+frame] += np.fft.irfft(mag * np.exp(1j*np.angle(Y)), n=frame) * win
        wsum[i:i+frame] += win ** 2
    return out / np.maximum(wsum, 1e-9)
```

### Diffusion-style denoiser score (toy)

```python
import torch, torch.nn as nn

class ScoreNet(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d+1, 256), nn.SiLU(), nn.Linear(256, d))

    def forward(self, x, sigma):
        # sigma: scalar tensor or one noise level per sample
        s = sigma.reshape(-1, 1).expand(x.size(0), 1)
        # The network predicts the noise; dividing by -sigma turns it into a score.
        return -self.net(torch.cat([x, s], dim=1)) / s   # ≈ ∇ log p_σ(x)
```

### RAG reranker SNR boost (cross-encoder)

```python
import numpy as np
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")  # multilingual cross-encoder reranker

def rerank(query, candidates, k=5):
    pairs = [[query, c] for c in candidates]
    scores = np.asarray(reranker.predict(pairs))
    order = scores.argsort()[::-1][:k]  # highest relevance score first
    return [(candidates[i], float(scores[i])) for i in order]
```

## Decision criteria

| Situation | Approach |
|---|---|
| Known template | Matched filter |
| Stationary noise PSD known | Wiener |
| Speech / audio enhancement | Spectral subtraction / RNNoise |
| Image denoising | NLM / BM3D / Diffusion (e.g. DiffBIR) |
| RAG noise (irrelevant docs) | Cross-encoder reranker |
| Binary detection | ROC + Neyman–Pearson |

**Default**: for detection tasks use d′/ROC; for denoising pick the method that fits the problem domain (speech → spectral, image → diffusion prior, retrieval → reranker).

## 🔗 Graph

- Parent: [[Entropy in Information Theory|Information Theory]] · [[Signal-Processing-Foundations]]
- Variants: [[Noise]] · [[Information-Entropy]]
- Applications: [[Kalman-Filter-and-State-Tracking]] · [[Particle-Filter-Algorithms]]
- Adjacent: [[Statistical-Power]] · [[Information Retrieval Evaluation Metrics]]

## 🤖 LLM usage

**When**: retrieval quality debugging, A/B significance checks, sensor pipelines, image/audio generation quality, agent observation fusion.

**When not**: deterministic logic with no stochastic component (e.g. compile-time invariants).

## ❌ Anti-patterns

- **Confusing SNR units**: linear vs dB. State the unit in every plot legend.
- **Violating the stationarity assumption**: for nonstationary signals, estimate local SNR with an STFT or wavelets.
- **Hardcoding thresholds**: when the base rate changes, track d′ and adapt the threshold.
- **Relying on the reranker alone**: if bi-encoder recall is insufficient, the correct document never reaches the top-k that gets reranked.

## 🧪 Verification / Review

- Verified against Kay 1998, "Fundamentals of Statistical Signal Processing, vol II"; Macmillan & Creelman 2005, "Detection Theory"; Karras EDM2 paper 2024.
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: SNR/PSNR/d-prime, matched/Wiener/spectral patterns, 2026 RAG-reranker context |