Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization

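10_Wiki/Topics large-scale cleanup:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Link-name normalization: about 2,400 broken-link fixes, direct redirect links, and concept mappings
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections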

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 23:52:15 +09:00


id, title, category, status, canonical_id, aliases, duplicate_of, source_trust_level, confidence_score, verification_status, tags, raw_sources, last_reinforced, github_commit, tech_stack

title

Signal in Noise

One-line summary

"Signal in noise" is the problem of detecting informative variation against a stochastic background. The lineage runs from Shannon's channel capacity (1948) and the Wiener filter (1949) to modern denoising diffusion models (Song & Ermon 2019, EDM2 2024). As of 2026 the same motif recurs in query–document retrieval for LLM RAG pipelines, gravitational-wave detection (LIGO-Voyager), and ML feature engineering.

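Core concepts

SNR definitions

Power SNR: P_signal / P_noise (linear).
dB SNR: 10·log10(P_signal/P_noise).
PSNR (image): 20·log10(MAX/RMSE).
Detection SNR (matched filter): ⟨s, s⟩ / σ², the signal energy over the noise variance. This deflection coefficient sets matched-filter detection performance in white Gaussian noise; the sufficient statistic itself is the correlation ⟨x, s⟩.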
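
The factor 10 versus 20 in the definitions above follows from whether the ratio is taken between powers or amplitudes. A minimal sketch of the conversions, assuming plain float or array inputs (the helper names are hypothetical, not from any library):

```python
import numpy as np

def power_ratio_to_db(ratio):
    # Power ratios use 10 * log10
    return 10.0 * np.log10(ratio)

def db_to_power_ratio(db):
    # Inverse of power_ratio_to_db
    return 10.0 ** (db / 10.0)

def amplitude_ratio_to_db(ratio):
    # Amplitude ratios (e.g. MAX / RMSE in PSNR) use 20 * log10,
    # because power is proportional to amplitude squared
    return 20.0 * np.log10(ratio)

# A 100x power ratio and a 10x amplitude ratio describe the same level in dB
same_level = np.isclose(power_ratio_to_db(100.0), amplitude_ratio_to_db(10.0))
```

This is why PSNR, which divides a peak amplitude by an RMS error, carries a factor of 20 while the power SNR carries a factor of 10.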

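Detection theory: the four outcomes

Hit / Miss / False alarm (FA) / Correct rejection (CR): sweeping the decision threshold over these rates traces the ROC curve, summarized by its AUC.
d′ (d-prime): Z(hit) − Z(FA), where Z is the inverse standard normal CDF. It measures perceptual sensitivity independently of the threshold.
Likelihood ratio: the optimal decision rule under the Neyman–Pearson criterion (maximum hit rate at a fixed false-alarm rate).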
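
The path from the four outcomes to an ROC curve and its AUC can be sketched as follows. This is a minimal illustration with numpy only, assuming higher scores mean "signal" and ignoring tied scores; `roc_curve` and `auc` are hypothetical helpers written for this note, not a library API.

```python
import numpy as np

def roc_curve(scores, labels):
    """Return (fa_rate, hit_rate) obtained by sweeping the threshold downward."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-scores, kind="stable")   # highest score first
    labels = labels[order]
    hits = np.cumsum(labels)                     # signal trials above threshold
    fas = np.cumsum(~labels)                     # noise trials above threshold
    hit_rate = np.concatenate([[0.0], hits / max(labels.sum(), 1)])
    fa_rate = np.concatenate([[0.0], fas / max((~labels).sum(), 1)])
    return fa_rate, hit_rate

def auc(fa_rate, hit_rate):
    # Trapezoidal area under the ROC curve
    widths = np.diff(fa_rate)
    heights = (hit_rate[1:] + hit_rate[:-1]) / 2.0
    return float(np.sum(widths * heights))

# Synthetic equal-variance Gaussian case with d' = 1
rng = np.random.default_rng(0)
signal_scores = rng.normal(1.0, 1.0, 5000)
noise_scores = rng.normal(0.0, 1.0, 5000)
scores = np.concatenate([signal_scores, noise_scores])
labels = np.concatenate([np.ones(5000), np.zeros(5000)])
fa, hit = roc_curve(scores, labels)
area = auc(fa, hit)
```

For the equal-variance Gaussian model the AUC should land near Φ(d′/√2), which gives a quick sanity check on any ROC implementation.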

Noise types

White Gaussian: i.i.d. samples, the baseline assumption.
Pink (1/f): brain LFP, financial returns.
Shot: Poisson-distributed, as in photon detectors.
Quantization: the fundamental noise floor set by ADC bit depth.
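
Each noise type above can be simulated in a few lines, which helps when testing a detector against the right background. The sketch below is illustrative only: the function names are hypothetical, the pink-noise generator uses simple spectral shaping, and the quantizer is an idealized mid-tread design without clipping.

```python
import numpy as np

def white_noise(n, sigma, rng):
    # i.i.d. Gaussian samples: flat power spectrum
    return rng.normal(0.0, sigma, n)

def pink_noise(n, rng):
    # Shape a white spectrum by 1/sqrt(f) so that power falls off as 1/f
    spec = np.fft.rfft(rng.normal(0.0, 1.0, n))
    freqs = np.fft.rfftfreq(n)
    scale = np.zeros_like(freqs)           # DC bin stays at zero
    scale[1:] = 1.0 / np.sqrt(freqs[1:])
    x = np.fft.irfft(spec * scale, n=n)
    return x / x.std()                     # normalize to unit variance

def shot_noise(n, rate, rng):
    # Poisson counts: the variance equals the mean rate
    return rng.poisson(rate, n).astype(float)

def quantize(x, bits, full_scale=1.0):
    # Uniform mid-tread quantizer with step 2 * full_scale / 2**bits
    step = 2.0 * full_scale / (2 ** bits)
    return np.round(np.asarray(x) / step) * step

# Quantization floor for a full-scale sine at 8 bits
t = np.arange(4096)
sine = np.sin(2 * np.pi * 0.01234567 * t)
err = quantize(sine, bits=8) - sine
sqnr_db = 10 * np.log10(np.mean(sine ** 2) / np.mean(err ** 2))
```

The measured `sqnr_db` should sit close to the textbook rule of thumb for a full-scale sine, 6.02·N + 1.76 dB for N bits, which is the sense in which bit depth sets a floor.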

Applications

RAG retrieval (relevant doc = signal, distractor = noise): a reranker raises the SNR of the retrieved set.
RLHF reward modeling: robust losses against preference-label noise (Wu et al 2024).
Sensor fusion (Kalman): a trade-off between process noise and measurement noise.
Diffusion model: the noise schedule effectively acts as the generation curriculum.
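
The process-versus-measurement-noise trade-off in the Kalman item can be made concrete with a scalar filter. This is a minimal sketch assuming a random-walk state model; `kalman_1d` and its parameters `q` (process variance) and `r` (measurement variance) are illustrative names.

```python
import numpy as np

def kalman_1d(z, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x_t = x_{t-1} + w (var q), z_t = x_t + v (var r)."""
    x, p = x0, p0
    out = np.empty(len(z))
    for t, zt in enumerate(z):
        p = p + q               # predict: uncertainty grows by the process noise
        k = p / (p + r)         # gain: how much to trust the new measurement
        x = x + k * (zt - x)    # update the estimate toward the measurement
        p = (1.0 - k) * p       # posterior uncertainty shrinks
        out[t] = x
    return out

rng = np.random.default_rng(0)
truth = 5.0
z = truth + rng.normal(0.0, 1.0, 500)      # noisy readings of a constant

smooth = kalman_1d(z, q=1e-4, r=1.0)       # small q: heavy smoothing
jumpy = kalman_1d(z, q=1e6, r=1.0)         # huge q: output follows the readings
```

A small `q` relative to `r` tells the filter the state barely moves, so it averages many measurements; a large `q` makes the gain approach 1 and the output tracks the raw readings.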

💻 Patterns

SNR / PSNR (numpy)

import numpy as np

def snr_db(signal, noise):
    # mean power of each component, then the power ratio in dB
    ps = np.mean(signal ** 2)
    pn = np.mean(noise ** 2)
    return 10.0 * np.log10(ps / pn)

def psnr(y_true, y_pred, max_val=1.0):
    # max_val / RMSE is an amplitude ratio, hence the factor 20
    mse = np.mean((y_true - y_pred) ** 2)
    return 20.0 * np.log10(max_val / np.sqrt(mse + 1e-12))

Matched filter (1D)

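import numpy as np
from scipy.signal import correlate

def matched_filter(x, template):
    # Cross-correlating with the template already equals convolving with its
    # time-reversed copy, so the template must not be flipped again here.
    h = template / np.linalg.norm(template)
    y = correlate(x, h, mode="same")
    return y  # peaks where the template is centered in x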
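
A usage sketch for the matched-filter idea, under stated assumptions: it swaps in `np.correlate` with `mode="valid"` (instead of the scipy call above) so that the peak index is directly the template's start index, and it uses a synthetic ±1 code as the template. `locate_template` is a hypothetical helper.

```python
import numpy as np

def locate_template(x, template):
    # Unit-norm template: with unit-variance white noise the output is in
    # units of noise standard deviations
    h = template / np.linalg.norm(template)
    # "valid" mode computes y[k] = sum_n x[k + n] * h[n],
    # so the argmax is the estimated start index of the template
    y = np.correlate(x, h, mode="valid")
    return int(np.argmax(y)), y

rng = np.random.default_rng(0)
template = rng.choice([-1.0, 1.0], size=64)   # pseudo-random code, low sidelobes
x = rng.normal(0.0, 1.0, 1000)                # unit-variance white noise
start = 300
x[start:start + 64] += 1.5 * template         # embed a scaled copy of the template

k, y = locate_template(x, template)
peak = float(y[k])
```

With these numbers the noise-free peak is 1.5·√64 = 12 noise standard deviations, which matches the detection SNR ⟨s, s⟩/σ² = 144 on a power scale.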

Wiener filter (frequency domain)

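def wiener(x_noisy, sxx, snn):
    # sxx, snn: signal and noise PSDs sampled on the rfft grid
    # (length len(x_noisy) // 2 + 1); only their ratio matters.
    X = np.fft.rfft(x_noisy)
    H = sxx / (sxx + snn + 1e-12)  # Wiener gain, between 0 and 1 per bin
    return np.fft.irfft(H * X, n=len(x_noisy))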
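
To show what the two PSD arguments look like in practice, here is a hedged usage sketch. It assumes an oracle signal PSD computed from the clean signal (not available in real use, where it must be estimated) and white noise of known variance; `wiener_denoise` restates the frequency-domain gain so the block runs on its own.

```python
import numpy as np

def wiener_denoise(x_noisy, sxx, snn):
    # Per-bin gain sxx / (sxx + snn), applied in the rfft domain
    X = np.fft.rfft(x_noisy)
    H = sxx / (sxx + snn + 1e-12)
    return np.fft.irfft(H * X, n=len(x_noisy))

rng = np.random.default_rng(0)
n = 1024
t = np.arange(n)
clean = np.sin(2 * np.pi * 5 * t / n) + 0.5 * np.sin(2 * np.pi * 12 * t / n)
sigma = 0.8
noisy = clean + rng.normal(0.0, sigma, n)

sxx = np.abs(np.fft.rfft(clean)) ** 2        # oracle signal PSD (assumption)
snn = np.full(n // 2 + 1, n * sigma ** 2)    # white noise: E|N_k|^2 = n * sigma^2
denoised = wiener_denoise(noisy, sxx, snn)

mse_before = float(np.mean((noisy - clean) ** 2))
mse_after = float(np.mean((denoised - clean) ** 2))
```

Both PSDs are expressed in the same unnormalized `|rfft|²` units here; any common scale factor cancels in the gain, but mixing conventions between the two would bias it.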

Adaptive threshold via d′

from scipy.stats import norm

def d_prime(hit_rate, fa_rate, eps=1e-6):
    h = np.clip(hit_rate, eps, 1 - eps)
    f = np.clip(fa_rate, eps, 1 - eps)
    return norm.ppf(h) - norm.ppf(f)
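
Once d′ is estimated, a threshold can be placed adaptively instead of hardcoded. The sketch below assumes the equal-variance Gaussian model (noise ~ N(0, 1), signal ~ N(d′, 1)), equal error costs, and a nonzero d′; `optimal_criterion` and `expected_rates` are hypothetical helper names.

```python
import math

def optimal_criterion(d_prime_value, p_signal):
    # Likelihood-ratio threshold beta = P(noise) / P(signal) for equal error costs
    beta = (1.0 - p_signal) / p_signal
    # Point on the decision axis where LR(x) = exp(d' * x - d'**2 / 2) equals beta
    return d_prime_value / 2.0 + math.log(beta) / d_prime_value

def expected_rates(d_prime_value, criterion):
    # Upper-tail standard normal probability via the complementary error function
    def tail(z):
        return 0.5 * math.erfc(z / math.sqrt(2.0))
    hit = tail(criterion - d_prime_value)   # P(x > c | signal)
    fa = tail(criterion)                    # P(x > c | noise)
    return hit, fa

# The criterion moves up as the signal becomes rarer, with d' held fixed
criteria = {p: optimal_criterion(2.0, p) for p in (0.5, 0.1, 0.01)}
```

With a balanced base rate the criterion sits midway between the two distributions (d′/2); as signals become rarer it shifts upward, trading hits for fewer false alarms while sensitivity itself is unchanged.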

Spectral subtraction (denoise speech)

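import numpy as np

def spectral_subtract(y, noise_psd, frame=512, hop=128, alpha=1.0):
    # noise_psd: noise power per rfft bin of a windowed frame, length frame // 2 + 1
    win = np.hanning(frame)
    out = np.zeros_like(y, dtype=float)
    wsum = np.zeros_like(y, dtype=float)  # overlap-add weights (renamed to avoid shadowing scipy's norm)
    for i in range(0, len(y) - frame + 1, hop):  # +1 keeps the last full frame
        seg = y[i:i+frame] * win
        Y = np.fft.rfft(seg)
        # magnitude-domain subtraction, floored at zero; the noisy phase is reused
        mag = np.maximum(np.abs(Y) - alpha * np.sqrt(noise_psd), 0)
        out[i:i+frame] += np.fft.irfft(mag * np.exp(1j*np.angle(Y)), n=frame) * win
        wsum[i:i+frame] += win ** 2
    return out / np.maximum(wsum, 1e-9)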

Diffusion-style denoiser score (toy)

import torch, torch.nn as nn

class ScoreNet(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d+1, 256), nn.SiLU(), nn.Linear(256, d))
    def forward(self, x, sigma):
        # accept a scalar or a per-sample sigma and shape it to (batch, 1)
        s = torch.as_tensor(sigma, dtype=x.dtype, device=x.device).reshape(-1, 1).expand(x.size(0), 1)
        return -self.net(torch.cat([x, s], dim=1)) / s  # ≈ ∇ log p_σ(x)

RAG reranker SNR boost (cross-encoder)

from sentence_transformers import CrossEncoder

reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")  # multilingual cross-encoder reranker

def rerank(query, candidates, k=5):
    pairs = [[query, c] for c in candidates]
    scores = reranker.predict(pairs)
    order = scores.argsort()[::-1][:k]
    return [(candidates[i], float(scores[i])) for i in order]

Decision criteria

Situation	Approach
Known template	Matched filter
Stationary noise PSD known	Wiener
Speech / audio enhance	Spectral subtraction / RNNoise
Image denoise	NLM / BM3D / Diffusion (DiffBIR 2025)
RAG noise (irrelevant docs)	Cross-encoder reranker
Binary detection	ROC + Neyman–Pearson

Default: use d′/ROC for detection tasks; for denoising, pick the method matched to the problem domain (speech → spectral subtraction, images → diffusion prior, retrieval → reranker).

🔗 Graph

Parent: Entropy in Information Theory · Signal-Processing-Foundations
Variants: Noise · Information-Entropy
Applications: Kalman-Filter-and-State-Tracking · Particle-Filter-Algorithms
Adjacent: Statistical-Power · Information Retrieval Evaluation Metrics

🤖 LLM usage

When to use: retrieval-quality debugging, A/B significance checks, sensor pipelines, image/audio generation quality, agent observation fusion. When not to use: deterministic logic with no stochastic component (e.g. compile-time invariants).

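❌ Anti-patterns

SNR unit confusion: linear vs dB. State the unit in every plot legend.
Violated stationarity assumption: for nonstationary signals, compute a local SNR with an STFT or wavelets.
Hardcoded thresholds: when the base rate changes, track d′ and adapt the threshold.
Relying on the reranker alone: if bi-encoder recall is too low, the correct document is not among the top-k candidates handed to the reranker.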
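
The last anti-pattern can be checked before any reranking happens by measuring first-stage recall. A minimal sketch, assuming document ids are hashable and that a labeled set of relevant ids exists per query (`recall_at_k` is a hypothetical helper):

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the first k retrieved ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    found = relevant.intersection(retrieved_ids[:k])
    return len(found) / len(relevant)

# First-stage candidates for one query, best first (illustrative ids)
retrieved = ["d7", "d2", "d9", "d4", "d1"]
relevant = ["d4", "d8"]

r3 = recall_at_k(retrieved, relevant, k=3)   # neither relevant doc in the top 3
r5 = recall_at_k(retrieved, relevant, k=5)   # d4 is found, d8 was never retrieved
```

If recall at the candidate depth is low, no reranker can recover the missing documents; widening the candidate pool or improving the first-stage retriever comes first.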

🧪 Verification / review

Verified (Kay 1998 "Fundamentals of Statistical Signal Processing, vol II"; Macmillan & Creelman "Detection Theory" 2005; Karras EDM2 paper 2024).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup — SNR/PSNR/d-prime, matched/Wiener/spectral patterns, 2026 RAG-reranker context
