---
id: wiki-2026-0508-signal-in-noise
title: Signal in Noise
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [SNR, Signal-to-Noise Ratio, Detectability]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [signal-processing, statistics, detection-theory, snr]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: scipy / numpy / torch
---

# Signal in Noise

## One-line summary
> **"'Signal in noise' is the problem of detecting informative variation against a stochastic background."** The lineage runs from Shannon's channel capacity (1948) and the Wiener filter (1949) to modern denoising diffusion (Song & Ermon 2019, EDM2 2024). As of 2026 it is a universal motif, spanning query–document retrieval in LLM RAG pipelines, gravitational-wave detection (LIGO-Voyager), and ML feature engineering.

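## Core concepts

### SNR definitions
- **Power SNR**: `P_signal / P_noise` (linear).
- **dB SNR**: `10·log10(P_signal/P_noise)`.
- **PSNR (image)**: `20·log10(MAX/RMSE)`, equivalently `10·log10(MAX²/MSE)`.
- **Detection SNR (matched filter)**: `⟨s, s⟩ / σ²`, the template energy over the noise variance. For a known signal in white Gaussian noise, the matched-filter output is the sufficient statistic and this ratio sets its detection performance.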
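
The unit conversions behind these definitions can be sketched with the standard library alone. The helper names below are illustrative and do not come from any library. The point is that power ratios take a factor of 10, while amplitude ratios (such as PSNR's `MAX/RMSE`) take a factor of 20.

```python
import math

def power_ratio_to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)

def db_to_power_ratio(db):
    """Convert decibels back to a linear power ratio."""
    return 10.0 ** (db / 10.0)

def amplitude_ratio_to_db(ratio):
    """Amplitude ratios square into power ratios, hence the factor of 20."""
    return 20.0 * math.log10(ratio)

# A power ratio of 100 and an amplitude ratio of 10 describe the same SNR.
snr_from_power = power_ratio_to_db(100.0)
snr_from_amplitude = amplitude_ratio_to_db(10.0)
```

Both values come out at 20 dB, which is the consistency check to run whenever a plot or log mixes linear and dB quantities.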

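### Detection theory: four outcomes
- **Hit / Miss / False alarm (FA) / Correct rejection (CR)**: sweeping the decision threshold traces the ROC curve, which AUC summarizes.
- **d′ (d-prime)**: `Z(hit) − Z(FA)`, where `Z` is the inverse standard normal CDF; a measure of perceptual sensitivity.
- **Likelihood ratio**: the optimal Neyman–Pearson decision statistic.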
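
A minimal sketch of a Neyman–Pearson operating point, assuming the equal-variance Gaussian model: the test statistic is `N(0, 1)` under noise and `N(d, 1)` under signal, so for a matched filter `d` would be `sqrt(⟨s, s⟩ / σ²)`. The function names are illustrative, and only the standard library is used.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides cdf and inv_cdf

def np_threshold(p_fa):
    """Threshold on a unit-variance test statistic for a target false-alarm rate."""
    return _STD.inv_cdf(1.0 - p_fa)

def detection_probability(d, p_fa):
    """P_D for a mean shift d (in noise std units) at false-alarm rate p_fa."""
    return 1.0 - _STD.cdf(np_threshold(p_fa) - d)

def auc_from_d(d):
    """Area under the ROC curve for the equal-variance Gaussian model."""
    return _STD.cdf(d / math.sqrt(2.0))

# ROC operating points for d = 2 at a few false-alarm rates.
roc_points = [(p_fa, detection_probability(2.0, p_fa)) for p_fa in (0.01, 0.05, 0.1, 0.5)]
```

Fixing `p_fa` and reading off `P_D` is the Neyman–Pearson view. Sweeping `p_fa` from 0 to 1 yields the full ROC curve, and at `d = 0` the curve collapses onto the chance diagonal.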

### Noise types
- **White Gaussian**: i.i.d. samples; the baseline assumption.
- **Pink (1/f)**: brain LFP, financial returns.
- **Shot**: Poisson statistics, e.g. photon detectors.
- **Quantization**: the fundamental floor set by ADC bit depth.
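
The quantization floor can be checked numerically. The sketch below assumes a uniform mid-tread quantizer without clipping and a full-scale sine input, and compares the measured SNR with the textbook rule of about `6.02·bits + 1.76` dB. The function names and the test frequency are illustrative choices.

```python
import numpy as np

def quantize(x, bits):
    """Uniform mid-tread quantizer for inputs nominally in [-1, 1] (no clipping)."""
    step = 2.0 / (2 ** bits)
    return np.round(x / step) * step

def quantization_snr_db(bits, n=100_000):
    """Measured SNR of a full-scale sine after quantization."""
    t = np.arange(n)
    x = np.sin(2 * np.pi * 0.01234567 * t)  # frequency picked to avoid short periodic patterns
    err = quantize(x, bits) - x
    return 10.0 * np.log10(np.mean(x ** 2) / np.mean(err ** 2))

def ideal_snr_db(bits):
    """Textbook rule for a full-scale sine: 6.02 * bits + 1.76 dB."""
    return 6.02 * bits + 1.76

measured = quantization_snr_db(8)
expected = ideal_snr_db(8)
```

Each extra bit buys roughly 6 dB, so no downstream filter can recover detail that falls below this floor.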

### Applications
1. RAG retrieval (relevant doc = signal, distractor = noise): a reranker raises the SNR.
2. RLHF reward modeling: robust losses for noisy preference labels (Wu et al 2024).
3. Sensor fusion (Kalman): the trade-off between process noise and measurement noise.
4. Diffusion models: the noise schedule acts as the generation curriculum.
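
The process-versus-measurement trade-off in the Kalman item can be made concrete with a scalar filter. This is a minimal sketch assuming a random-walk state observed directly; `q` (process noise variance) and `r` (measurement noise variance) are the only tuning knobs, and the function names are illustrative.

```python
def steady_state_gain(q, r, iters=500):
    """Iterate the scalar Kalman recursion for a random-walk state.

    q: process noise variance, r: measurement noise variance.
    """
    p = 1.0  # initial state variance (arbitrary; the recursion forgets it)
    k = 0.0
    for _ in range(iters):
        p_pred = p + q              # predict: uncertainty grows by q
        k = p_pred / (p_pred + r)   # gain: fraction of the innovation to trust
        p = (1.0 - k) * p_pred      # update: uncertainty shrinks
    return k

def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Filter a sequence of scalar measurements of a random-walk state."""
    x, p, out = x0, p0, []
    for z in measurements:
        p = p + q
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        out.append(x)
    return out

gain_trust_sensor = steady_state_gain(q=1.0, r=0.01)   # high gain: follow the measurements
gain_trust_model = steady_state_gain(q=0.01, r=1.0)    # low gain: smooth heavily
```

Only the ratio `q / r` matters for the steady-state gain, which is why tuning a Kalman filter amounts to stating how much of the observed variation is signal and how much is noise.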

## 💻 Patterns

### SNR / PSNR (numpy)
```python
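import numpy as np

def snr_db(signal, noise):
    """SNR in dB from separate signal and noise arrays."""
    ps = np.mean(np.asarray(signal, dtype=float) ** 2)  # mean signal power
    pn = np.mean(np.asarray(noise, dtype=float) ** 2)   # mean noise power
    return 10.0 * np.log10(ps / pn)

def psnr(y_true, y_pred, max_val=1.0):
    """Peak SNR in dB; use max_val=255 for 8-bit images."""
    # Cast to float first so uint8 inputs do not wrap around on subtraction.
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mse = np.mean(diff ** 2)
    return 20.0 * np.log10(max_val / np.sqrt(mse + 1e-12))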
```
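
A companion sketch for building test cases: it mixes white Gaussian noise into a clean signal at a requested SNR. The helper name `add_noise_at_snr` is hypothetical, and the noise is rescaled so that its empirical power hits the target exactly rather than only in expectation.

```python
import numpy as np

def add_noise_at_snr(signal, target_snr_db, rng):
    """Return (noisy, noise) with noise scaled to hit target_snr_db exactly."""
    signal = np.asarray(signal, dtype=float)
    noise = rng.standard_normal(signal.shape)
    p_signal = np.mean(signal ** 2)
    p_noise_target = p_signal / (10.0 ** (target_snr_db / 10.0))
    # Rescale the sampled noise so its empirical power equals the target.
    noise *= np.sqrt(p_noise_target / np.mean(noise ** 2))
    return signal + noise, noise

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 5 * np.linspace(0.0, 1.0, 1000))
noisy, noise = add_noise_at_snr(clean, 10.0, rng)
measured_snr_db = 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
```

Generating inputs at a known SNR makes it possible to plot any detector's or denoiser's performance as a function of SNR rather than at a single arbitrary noise level.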

### Matched filter (1D)
```python
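import numpy as np
from scipy.signal import correlate

def matched_filter(x, template):
    # Matched filtering = cross-correlating x with the template, which is the
    # same as convolving with the time-reversed template. Do not reverse the
    # template before calling correlate, or the two reversals cancel out.
    h = np.asarray(template, dtype=float)
    h = h / np.linalg.norm(h)
    y = correlate(x, h, mode="same")
    return y  # with mode="same", the peak sits near the center of the embedded template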
```
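
A self-contained usage sketch with NumPy only, assuming a Barker-13 code as the known template (chosen because its autocorrelation sidelobes are small) and a hypothetical embed position. It uses `np.correlate` in `"valid"` mode so that the output index is directly the candidate start position.

```python
import numpy as np

# Barker-13 code: its autocorrelation sidelobes have magnitude at most 1.
template = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], dtype=float)

rng = np.random.default_rng(0)
x = 0.4 * rng.standard_normal(300)   # white Gaussian background
true_start = 120                      # hypothetical embed position
x[true_start:true_start + len(template)] += template

# Cross-correlate x with the unit-norm template; "valid" mode makes the
# output index equal to the candidate start position of the template.
h = template / np.linalg.norm(template)
y = np.correlate(x, h, mode="valid")
detected_start = int(np.argmax(y))
```

With a unit-norm template the output noise keeps the input noise standard deviation, while the peak grows with the square root of the template energy. That gain is the detection SNR from the definitions section.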

### Wiener filter (frequency domain)
```python
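import numpy as np

def wiener(x_noisy, sxx, snn):
    # sxx, snn: signal and noise PSDs sampled on the rfft grid
    # (length len(x_noisy)//2 + 1, same scaling for both).
    X = np.fft.rfft(x_noisy)
    H = sxx / (sxx + snn + 1e-12)  # Wiener gain per frequency bin, in [0, 1)
    return np.fft.irfft(H * X, n=len(x_noisy))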
```
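
A toy demonstration of the frequency-domain Wiener gain, assuming oracle PSDs: the clean spectrum is known and the noise is white with known variance. In practice both would have to be estimated. All variable names are illustrative.

```python
import numpy as np

n, sigma = 1024, 0.5
rng = np.random.default_rng(0)
t = np.arange(n)
clean = np.sin(2 * np.pi * 8 * t / n)          # exactly 8 cycles: one rfft bin
noisy = clean + sigma * rng.standard_normal(n)

# Oracle PSDs on the rfft grid, both in units of |rfft|^2.
sxx = np.abs(np.fft.rfft(clean)) ** 2          # signal power per bin
snn = np.full(n // 2 + 1, n * sigma ** 2)      # white noise: E|N_k|^2 = n * sigma^2

H = sxx / (sxx + snn)                          # Wiener gain per bin
denoised = np.fft.irfft(H * np.fft.rfft(noisy), n=n)

mse_before = np.mean((noisy - clean) ** 2)
mse_after = np.mean((denoised - clean) ** 2)
```

Because the sinusoid occupies a single bin, the gain is close to 1 there and close to 0 elsewhere, so almost all of the noise power is removed. Broadband signals see a smaller improvement.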

### Adaptive threshold via d′
```python
import numpy as np
from scipy.stats import norm

def d_prime(hit_rate, fa_rate, eps=1e-6):
    # Clip rates away from 0 and 1 so the inverse normal CDF stays finite.
    h = np.clip(hit_rate, eps, 1 - eps)
    f = np.clip(fa_rate, eps, 1 - eps)
    return norm.ppf(h) - norm.ppf(f)
```
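
Once d′ is known, the threshold itself can adapt to the base rate. The sketch below assumes the equal-variance Gaussian model (noise `N(0, 1)`, signal `N(d, 1)`) with equal costs for misses and false alarms. The function names are illustrative, and only the standard library is used.

```python
import math
from statistics import NormalDist

def optimal_criterion(d, p_signal):
    """Minimum-error threshold on the decision axis.

    Assumes noise ~ N(0, 1), signal ~ N(d, 1), equal error costs, d > 0.
    """
    beta = (1.0 - p_signal) / p_signal      # likelihood-ratio threshold
    return d / 2.0 + math.log(beta) / d

def error_rate(d, p_signal, threshold):
    """Total error probability at a given threshold."""
    std = NormalDist()
    miss = std.cdf(threshold - d)           # signal present, response "no"
    false_alarm = 1.0 - std.cdf(threshold)  # signal absent, response "yes"
    return p_signal * miss + (1.0 - p_signal) * false_alarm

balanced = optimal_criterion(2.0, 0.5)      # midpoint between the two means
rare = optimal_criterion(2.0, 0.05)         # rare signal: stricter threshold
```

When the signal becomes rare the threshold moves up, and keeping the balanced threshold in that regime costs a visibly higher error rate.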

### Spectral subtraction (denoise speech)
```python
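import numpy as np

def spectral_subtract(y, noise_psd, frame=512, hop=128, alpha=1.0):
    # noise_psd: noise power per rfft bin (length frame//2 + 1), estimated
    # from windowed noise-only frames with the same frame size.
    win = np.hanning(frame)
    out = np.zeros(len(y), dtype=float)
    wsum = np.zeros(len(y), dtype=float)  # overlap-add normalizer
    for i in range(0, len(y) - frame + 1, hop):  # +1 keeps the last full frame
        seg = y[i:i+frame] * win
        Y = np.fft.rfft(seg)
        # Subtract the noise magnitude, floor at zero, keep the noisy phase.
        mag = np.maximum(np.abs(Y) - alpha * np.sqrt(noise_psd), 0)
        out[i:i+frame] += np.fft.irfft(mag * np.exp(1j*np.angle(Y)), n=frame) * win
        wsum[i:i+frame] += win ** 2
    return out / np.maximum(wsum, 1e-9)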
```
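
The `noise_psd` argument has to come from somewhere. One common option, sketched here under the assumption that a noise-only stretch of the recording is available, is to average the windowed power spectrum over that stretch. `estimate_noise_psd` is a hypothetical helper that uses the same Hann window and frame size as the subtraction loop, so the scaling matches `np.abs(Y) ** 2`.

```python
import numpy as np

def estimate_noise_psd(noise_only, frame=512, hop=128):
    """Average |rfft(window * frame)|^2 over noise-only frames."""
    noise_only = np.asarray(noise_only, dtype=float)
    win = np.hanning(frame)
    acc = np.zeros(frame // 2 + 1)
    count = 0
    for i in range(0, len(noise_only) - frame + 1, hop):
        spec = np.fft.rfft(noise_only[i:i + frame] * win)
        acc += np.abs(spec) ** 2
        count += 1
    if count == 0:
        raise ValueError("need at least one full frame of noise")
    return acc / count

rng = np.random.default_rng(0)
sigma = 0.1
noise_psd = estimate_noise_psd(sigma * rng.standard_normal(16000))
# For white noise the expected value in every bin is sigma^2 * sum(win^2).
expected_level = sigma ** 2 * np.sum(np.hanning(512) ** 2)
```

If the estimate uses a different window or frame length than the subtraction step, the two scalings disagree and `alpha` silently absorbs the mismatch.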

### Diffusion-style denoiser score (toy)
```python
import torch, torch.nn as nn

class ScoreNet(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d+1, 256), nn.SiLU(), nn.Linear(256, d))
    def forward(self, x, sigma):
        # Accept a scalar sigma or one sigma per sample; shape it to (batch, 1).
        s = sigma.reshape(-1, 1).expand(x.size(0), 1)
        # The network predicts the noise eps, so score ≈ -eps / sigma.
        return -self.net(torch.cat([x, s], dim=1)) / s  # ≈ ∇ log p_σ(x)
```
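
To see what a score network is supposed to learn, it helps to look at a case where the score is known in closed form. The NumPy sketch below assumes scalar Gaussian data `x0 ~ N(0, data_std²)` corrupted by Gaussian noise of standard deviation `sigma`, and applies Tweedie's formula `E[x0 | x] = x + σ²·∇ log p_σ(x)`. All names are illustrative.

```python
import numpy as np

def gaussian_score(x, data_std, sigma):
    """Exact score of p_sigma when the clean data is N(0, data_std^2)."""
    return -x / (data_std ** 2 + sigma ** 2)

def tweedie_denoise(x, data_std, sigma):
    """Posterior mean E[x0 | x] = x + sigma^2 * score(x)."""
    return x + sigma ** 2 * gaussian_score(x, data_std, sigma)

rng = np.random.default_rng(0)
data_std, sigma = 2.0, 1.0
x0 = data_std * rng.standard_normal(200_000)
x = x0 + sigma * rng.standard_normal(x0.shape)

# Best linear denoiser fitted by least squares, for comparison.
fitted_gain = np.dot(x, x0) / np.dot(x, x)
theory_gain = data_std ** 2 / (data_std ** 2 + sigma ** 2)   # Wiener gain, 0.8
```

In this Gaussian case the score-based denoiser reduces to multiplying by the Wiener gain `S / (S + N)`, which ties the diffusion view back to the classical filter earlier in this section.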

### RAG reranker SNR boost (cross-encoder)
```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")  # multilingual cross-encoder reranker

def rerank(query, candidates, k=5):
    pairs = [[query, c] for c in candidates]
    scores = reranker.predict(pairs)
    order = scores.argsort()[::-1][:k]
    return [(candidates[i], float(scores[i])) for i in order]
```
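
Whether a reranker actually raises the retrieval SNR can be measured with precision and recall at k. The sketch below is plain Python with hypothetical document ids, and the two ranked lists stand in for the first-stage order and the order after reranking.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant."""
    top = ranked_ids[:k]
    return sum(1 for doc_id in top if doc_id in relevant_ids) / k

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k."""
    top = set(ranked_ids[:k])
    return len(top & set(relevant_ids)) / len(relevant_ids)

# Hypothetical document ids: first-stage order vs. order after reranking.
relevant = {"d2", "d7"}
first_stage = ["d1", "d4", "d2", "d9", "d7", "d3"]
reranked = ["d2", "d7", "d1", "d4", "d9", "d3"]

p_before = precision_at_k(first_stage, relevant, k=2)   # 0 of 2 relevant
p_after = precision_at_k(reranked, relevant, k=2)       # 2 of 2 relevant
candidate_recall = recall_at_k(first_stage, relevant, k=6)
```

Precision at k after reranking is bounded by the recall of the candidate set. If `candidate_recall` is low, no reranker can fix it, because the relevant documents never reach it.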

## Decision criteria

| Situation | Approach |
|---|---|
| Known template | Matched filter |
| Stationary noise PSD known | Wiener |
| Speech / audio enhance | Spectral subtraction / RNNoise |
| Image denoise | NLM / BM3D / Diffusion (e.g. DiffBIR) |
| RAG noise (irrelevant docs) | Cross-encoder reranker |
| Binary detection | ROC + Neyman–Pearson |

**Default**: for detection tasks use d′/ROC; for denoising pick the method that fits the problem domain (speech → spectral, images → diffusion prior, retrieval → reranker).

## 🔗 Graph
- Parent: [[Entropy in Information Theory|Information Theory]] · [[Signal-Processing-Foundations]]
- Variants: [[Noise]] · [[Information-Entropy]]
- Applications: [[Kalman-Filter-and-State-Tracking]] · [[Particle-Filter-Algorithms]]
- Adjacent: [[Statistical-Power]] · [[Information Retrieval Evaluation Metrics]]

## 🤖 LLM usage
**When to use**: retrieval quality debugging, A/B significance checks, sensor pipelines, image/audio generation quality, agent observation fusion.
**When not to use**: deterministic logic with no stochastic component (compile-time invariants).

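## ❌ Anti-patterns
- **Confusing SNR units**: linear vs dB; state the unit in every plot legend.
- **Violating the stationarity assumption**: for nonstationary data, compute a local SNR with an STFT or wavelets.
- **Hardcoding the threshold**: when the base rate changes, adapt the threshold by tracking d′.
- **Relying on the reranker alone**: if bi-encoder recall is insufficient, the correct answer is not among the top-k candidates passed to the reranker.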
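
The stationarity point can be illustrated with a frame-wise SNR. This is a time-domain sketch, simpler than a full STFT, that assumes a clean reference signal is available, as in a simulation or benchmark. The helper name and the step change in noise level are illustrative.

```python
import numpy as np

def local_snr_db(clean, noisy, frame=256, hop=128):
    """Frame-by-frame SNR in dB, given a clean reference signal."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    out = []
    for i in range(0, len(clean) - frame + 1, hop):
        ps = np.mean(clean[i:i + frame] ** 2)
        pn = np.mean(noise[i:i + frame] ** 2)
        out.append(10.0 * np.log10((ps + 1e-12) / (pn + 1e-12)))
    return np.array(out)

rng = np.random.default_rng(0)
n = 4096
clean = np.sin(2 * np.pi * 0.05 * np.arange(n))
noise_std = np.where(np.arange(n) < n // 2, 0.1, 1.0)   # noise gets 10x louder halfway
noisy = clean + noise_std * rng.standard_normal(n)

snr_frames = local_snr_db(clean, noisy)
global_snr = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

The single global figure sits between the two regimes and describes neither, which is why a stationarity assumption should be checked before one SNR number is quoted for a whole recording.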

## 🧪 Verification / Review
- Verified (Kay 1998 "Fundamentals of Statistical Signal Processing, vol II"; Macmillan & Creelman "Detection Theory" 2005; Karras EDM2 paper 2024).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — SNR/PSNR/d-prime, matched/Wiener/spectral patterns, 2026 RAG-reranker context |