2nd/10_Wiki/Topics/Computer_Science_and_Theory/Signal in Noise.md
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
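Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Normalized link names: fixed broken links, pointed redirects directly at their targets, concept mapping (~2,400 changes)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections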

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00


id: wiki-2026-0508-signal-in-noise
title: Signal in Noise
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: SNR, Signal-to-Noise Ratio, Detectability
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: signal-processing, statistics, detection-theory, snr
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: python; framework: scipy / numpy / torch

Signal in Noise

One-line summary

"Signal in noise" is the problem of detecting informative variation against a stochastic background. The lineage runs from Shannon's channel capacity (1948) and the Wiener filter (1949) to modern denoising diffusion (Song & Ermon 2019, EDM2 2024). As of 2026 it is a universal motif, spanning query-document retrieval in LLM RAG pipelines, gravitational-wave detection (LIGO-Voyager), and ML feature engineering.

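Core concepts

SNR definitions

  • Power SNR: P_signal / P_noise (linear).
  • dB SNR: 10·log10(P_signal / P_noise).
  • PSNR (image): 20·log10(MAX / RMSE).
  • Detection SNR (matched filter): ⟨s, s⟩ / σ². For a known signal in white Gaussian noise, the matched-filter output ⟨x, s⟩ is the sufficient statistic, and this ratio sets its detection performance.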
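
The link between detection SNR and detection performance can be made concrete. The following is a minimal sketch, assuming a known signal in white Gaussian noise, where the standard result is P_D = Q(Q⁻¹(P_FA) − √(⟨s, s⟩ / σ²)). The function name `matched_filter_pd` and the example values are illustrative, not part of the original notes.

```python
import numpy as np
from scipy.stats import norm

def matched_filter_pd(s, sigma, p_fa):
    """Detection probability of a matched filter for a known signal s in
    white Gaussian noise (std sigma) at false-alarm probability p_fa."""
    d2 = np.dot(s, s) / sigma ** 2          # detection SNR <s, s> / sigma^2
    # norm.sf is the Gaussian tail function Q; norm.isf is its inverse.
    return norm.sf(norm.isf(p_fa) - np.sqrt(d2))

s = np.ones(16)                              # hypothetical template with energy 16
pd = matched_filter_pd(s, sigma=2.0, p_fa=1e-2)
```

With zero signal energy the expression collapses to P_D = P_FA, which is a quick sanity check on any implementation.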

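Detection theory: four outcomes

  • Hit / Miss / FA (false alarm) / CR (correct rejection): sweeping the decision threshold traces the ROC curve, summarized by AUC.
  • d′ (d-prime): Z(hit) − Z(FA), a measure of perceptual sensitivity.
  • Likelihood ratio: the optimal Neyman-Pearson decision rule.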
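
As an illustration of how hit and false-alarm rates turn into an ROC curve and AUC, here is a small sketch in plain numpy. The helper names `roc_curve` and `auc` are hypothetical, and the sketch assumes scores without ties (tied scores would need to be grouped before sweeping).

```python
import numpy as np

def roc_curve(scores, labels):
    """Return (fa_rate, hit_rate) swept over all thresholds.
    labels: 1 = signal present, 0 = noise only."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")     # highest score first
    y = labels[order]
    hits = np.cumsum(y)                            # hits when cutting after each item
    fas = np.cumsum(1 - y)                         # false alarms at the same cuts
    hit_rate = np.concatenate([[0.0], hits / max(y.sum(), 1)])
    fa_rate = np.concatenate([[0.0], fas / max((1 - y).sum(), 1)])
    return fa_rate, hit_rate

def auc(fa_rate, hit_rate):
    # trapezoidal area under the ROC curve
    return float(np.sum(np.diff(fa_rate) * (hit_rate[1:] + hit_rate[:-1]) / 2.0))

fa, hit = roc_curve([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
area = auc(fa, hit)   # perfectly separated scores
```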

Noise types

  • White Gaussian: i.i.d. samples; the baseline assumption.
  • Pink (1/f): e.g. brain LFP, some financial time series.
  • Shot: Poisson-distributed, e.g. photon detectors.
  • Quantization: the fundamental floor set by ADC bit depth.
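
The four noise types above can be simulated in a few lines. This is a rough sketch under simple assumptions: pink noise is made by spectrally shaping white noise, shot noise is modeled as mean-removed Poisson counts, and the quantizer is uniform with no clipping. All variable names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4096

# White Gaussian: i.i.d. samples with unit variance.
white = rng.normal(0.0, 1.0, n)

# Pink: scale the spectrum of white noise so that power falls as 1/f.
spectrum = np.fft.rfft(rng.normal(0.0, 1.0, n))
freqs = np.fft.rfftfreq(n)
scale = np.zeros_like(freqs)
scale[1:] = freqs[1:] ** -0.5        # amplitude ~ f^(-1/2), DC bin dropped
pink = np.fft.irfft(spectrum * scale, n=n)

# Shot: Poisson counts around a mean rate (variance equals the mean).
mean_rate = 20.0
shot = rng.poisson(mean_rate, n) - mean_rate

# Quantization: rounding error of a uniform 8-bit quantizer on [-1, 1).
bits = 8
step = 2.0 / 2 ** bits
x = rng.uniform(-1.0, 1.0, n)
quant = np.round(x / step) * step - x   # variance is roughly step^2 / 12
```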

Applications

  1. RAG retrieval (relevant doc = signal, distractor = noise): a reranker raises the SNR.
  2. RLHF reward modeling: robust losses for noisy preference labels (Wu et al 2024).
  3. Sensor fusion (Kalman): the trade-off between process noise and measurement noise.
  4. Diffusion models: the noise schedule is effectively the generation curriculum.
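
The process-versus-measurement noise trade-off in item 3 is easiest to see in a scalar Kalman filter. The sketch below assumes a random-walk state model; `kalman_1d`, `q`, and `r` are hypothetical names for the filter, the process-noise variance, and the measurement-noise variance.

```python
import numpy as np

def kalman_1d(z, q, r, x0=0.0, p0=1.0):
    """Scalar random-walk Kalman filter.
    z: measurements, q: process-noise variance, r: measurement-noise variance."""
    x, p = x0, p0
    out = np.empty(len(z))
    for k, zk in enumerate(z):
        p = p + q                    # predict: uncertainty grows by the process noise
        gain = p / (p + r)           # Kalman gain: how much to trust the measurement
        x = x + gain * (zk - x)      # correct the estimate with the innovation
        p = (1.0 - gain) * p         # posterior variance
        out[k] = x
    return out

rng = np.random.default_rng(0)
truth = np.full(500, 3.0)                        # constant hidden state
z = truth + rng.normal(0.0, 1.0, truth.size)     # noisy measurements
est = kalman_1d(z, q=1e-4, r=1.0)
```

A small `q` relative to `r` gives a low gain and heavy smoothing; raising `q` makes the filter follow the measurements more closely at the cost of passing more noise.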

💻 Patterns

SNR / PSNR (numpy)

import numpy as np

def snr_db(signal, noise):
    # mean power of each component, then the ratio in decibels
    ps = np.mean(signal ** 2)
    pn = np.mean(noise ** 2)
    return 10.0 * np.log10(ps / pn)

def psnr(y_true, y_pred, max_val=1.0):
    mse = np.mean((y_true - y_pred) ** 2)
    # the small constant avoids division by zero for identical inputs
    return 20.0 * np.log10(max_val / np.sqrt(mse + 1e-12))
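
To exercise the dB definition, a test signal can be corrupted at a chosen SNR by inverting the formula for the noise power. This is a minimal sketch; `add_noise_at_snr` is a hypothetical helper, and the noise is assumed to be white Gaussian.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db_target, rng=None):
    """Return (noisy, noise) with white Gaussian noise scaled to a target dB SNR."""
    rng = np.random.default_rng() if rng is None else rng
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / 10.0 ** (snr_db_target / 10.0)   # invert 10*log10(ps/pn)
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise, noise

t = np.linspace(0.0, 1.0, 8000, endpoint=False)
clean = np.sin(2 * np.pi * 50.0 * t)
noisy, noise = add_noise_at_snr(clean, 10.0, rng=np.random.default_rng(0))
measured = 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
```

The measured value only approximates the target because the noise power is estimated from a finite sample.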

Matched filter (1D)

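import numpy as np
from scipy.signal import correlate

def matched_filter(x, template):
    # correlate with the unit-norm template (equivalent to convolving with its time reverse)
    h = template / np.linalg.norm(template)
    y = correlate(x, h, mode="same")
    return y  # peaks where the template's center aligns with the embedded signal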
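
A short usage sketch shows why the matched filter helps: a template buried at 0 dB per-sample SNR is still located reliably because the filter integrates over the template length. The pseudo-random ±1 template and the positions are assumptions chosen for illustration; `mode="valid"` is used so the peak index equals the start offset.

```python
import numpy as np

rng = np.random.default_rng(1)
template = rng.choice([-1.0, 1.0], size=256)     # hypothetical known waveform
x = rng.normal(0.0, 1.0, 4096)                   # unit-variance white noise
true_start = 700
x[true_start:true_start + template.size] += template

h = template / np.linalg.norm(template)          # unit-norm matched filter
y = np.correlate(x, h, mode="valid")             # y[k] = sum_n x[k + n] * h[n]
est_start = int(np.argmax(y))                    # estimated start of the template
```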

Wiener filter (frequency domain)

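import numpy as np

def wiener(x_noisy, sxx, snn):
    # sxx, snn: signal and noise PSDs on the rfft grid (length len(x_noisy) // 2 + 1)
    X = np.fft.rfft(x_noisy)
    H = sxx / (sxx + snn + 1e-12)  # Wiener gain per frequency bin, between 0 and 1
    return np.fft.irfft(H * X, n=len(x_noisy))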
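
The frequency-domain Wiener gain needs both PSDs on the same grid and in the same normalization. The sketch below uses an oracle setting (the clean signal is known) purely to illustrate the scaling; in practice `sxx` would have to be estimated. The test signal and `sigma` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4096
t = np.arange(n)
clean = np.sin(2 * np.pi * 0.01 * t) + 0.5 * np.sin(2 * np.pi * 0.03 * t)
sigma = 1.0
noisy = clean + rng.normal(0.0, sigma, n)

# Oracle PSDs on the rfft grid (n // 2 + 1 bins), both expressed as |FFT|^2 / n.
sxx = np.abs(np.fft.rfft(clean)) ** 2 / n
snn = np.full(n // 2 + 1, sigma ** 2)            # white noise: flat PSD equal to its variance

gain = sxx / (sxx + snn + 1e-12)                 # Wiener gain per bin
denoised = np.fft.irfft(gain * np.fft.rfft(noisy), n=n)

mse_before = np.mean((noisy - clean) ** 2)
mse_after = np.mean((denoised - clean) ** 2)
```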

Adaptive threshold via d′

from scipy.stats import norm

def d_prime(hit_rate, fa_rate, eps=1e-6):
    h = np.clip(hit_rate, eps, 1 - eps)
    f = np.clip(fa_rate, eps, 1 - eps)
    return norm.ppf(h) - norm.ppf(f)
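
Sensitivity alone does not fix a threshold; the criterion also depends on the base rate. The following sketch assumes the equal-variance Gaussian model (noise ~ N(0, 1), signal ~ N(d′, 1)), where the error-minimizing threshold follows from the likelihood ratio. `criterion_c` and `optimal_threshold` are hypothetical helper names.

```python
import numpy as np
from scipy.stats import norm

def criterion_c(hit_rate, fa_rate, eps=1e-6):
    """Response bias c = -(Z(hit) + Z(FA)) / 2; zero means an unbiased observer."""
    h = np.clip(hit_rate, eps, 1 - eps)
    f = np.clip(fa_rate, eps, 1 - eps)
    return -0.5 * (norm.ppf(h) + norm.ppf(f))

def optimal_threshold(d, p_signal):
    """Error-minimizing threshold on the decision axis for noise ~ N(0, 1)
    and signal ~ N(d, 1), given the prior probability of a signal."""
    beta = (1.0 - p_signal) / p_signal      # prior odds of noise versus signal
    return d / 2.0 + np.log(beta) / d       # from exp(d*x - d^2/2) >= beta

balanced = optimal_threshold(2.0, 0.5)      # midpoint between the two means
rare = optimal_threshold(2.0, 0.1)          # rarer signals push the threshold up
```

Re-evaluating `optimal_threshold` as the estimated base rate or d′ drifts is one way to avoid a hardcoded cut-off.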

Spectral subtraction (denoise speech)

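import numpy as np

def spectral_subtract(y, noise_psd, frame=512, hop=128, alpha=1.0):
    # noise_psd: noise power per rfft bin (length frame // 2 + 1), estimated
    # from noise-only frames that were windowed the same way as below
    win = np.hanning(frame)
    out = np.zeros_like(y, dtype=float)
    wsum = np.zeros_like(y, dtype=float)  # overlap-add window normalization
    for i in range(0, len(y) - frame + 1, hop):  # +1 so the last full frame is included
        seg = y[i:i+frame] * win
        Y = np.fft.rfft(seg)
        # subtract the noise magnitude estimate and floor at zero
        mag = np.maximum(np.abs(Y) - alpha * np.sqrt(noise_psd), 0)
        # resynthesize with the noisy phase, then overlap-add
        out[i:i+frame] += np.fft.irfft(mag * np.exp(1j*np.angle(Y)), n=frame) * win
        wsum[i:i+frame] += win ** 2
    return out / np.maximum(wsum, 1e-9)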

Diffusion-style denoiser score (toy)

import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d+1, 256), nn.SiLU(), nn.Linear(256, d))

    def forward(self, x, sigma):
        # sigma: scalar tensor or shape (batch,); reshape so it broadcasts per sample
        s = sigma.reshape(-1, 1).expand(x.size(0), 1)
        # after training, the scaled output approximates the score ∇ log p_σ(x)
        return -self.net(torch.cat([x, s], dim=1)) / s

RAG reranker SNR boost (cross-encoder)

from sentence_transformers import CrossEncoder

reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")  # multilingual cross-encoder reranker

def rerank(query, candidates, k=5):
    pairs = [[query, c] for c in candidates]
    scores = reranker.predict(pairs)
    order = scores.argsort()[::-1][:k]
    return [(candidates[i], float(scores[i])) for i in order]

Decision criteria

| Situation | Approach |
| --- | --- |
| Known template | Matched filter |
| Stationary noise, PSD known | Wiener |
| Speech / audio enhancement | Spectral subtraction / RNNoise |
| Image denoising | NLM / BM3D / Diffusion (DiffBIR 2025) |
| RAG noise (irrelevant docs) | Cross-encoder reranker |
| Binary detection | ROC + Neyman-Pearson |

Default: for detection tasks use d′/ROC; for denoising pick the method that fits the problem domain (speech → spectral subtraction, images → diffusion prior, retrieval → reranker).

🔗 Graph

🤖 LLM usage

When to use: retrieval quality debugging, A/B significance checks, sensor pipelines, image/audio generation quality, agent observation fusion. When not to use: deterministic logic with no stochastic component (compile-time invariants).

Anti-patterns

  • Mixing up SNR units: linear vs dB; state the unit in every plot legend.
  • Violating the stationarity assumption: for nonstationary signals, estimate local SNR with an STFT or wavelets.
  • Hardcoded thresholds: when the base rate shifts, adapt the threshold by tracking d′.
  • Relying on the reranker alone: if bi-encoder recall is insufficient, the correct document never reaches the top-k passed to the reranker.

🧪 Verification / review

  • Verified (Kay 1998 "Fundamentals of Statistical Signal Processing, vol II"; Macmillan & Creelman "Detection Theory" 2005; Karras EDM2 paper 2024).
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: SNR/PSNR/d-prime, matched/Wiener/spectral patterns, 2026 RAG-reranker context |