Inexact Science

In one line

"Quantify the uncertainty." An inexact science has no deterministic closed-form answer: noise, bias, and partial observability are inherent. As of 2026, ML interpretability and the social science replication crisis are at the forefront. Tools: Bayesian inference, robust statistics, sensitivity analysis.

Core

Sources of inexactness

Aleatory: inherent randomness (quantum, dice).
Epistemic: ignorance, reducible by collecting more data.
Measurement noise: limited by instrument precision.
Model misspecification: wrong functional form.
Selection bias: non-representative sample.
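
The split between reducible and irreducible uncertainty can be illustrated with a small simulation. This is a minimal sketch, assuming a hypothetical Gaussian data-generating process (`true_mean`, `noise_sd` and the helper `uncertainty_split` are illustrative names, not from the source): the standard error of the mean (epistemic) shrinks roughly like 1/sqrt(n), while the spread of individual outcomes (aleatory) stays near `noise_sd` no matter how much data is collected.

```python
import numpy as np

def uncertainty_split(n, rng, true_mean=2.0, noise_sd=1.0):
    """Return (standard error of the mean, spread of single outcomes) for n draws."""
    sample = rng.normal(true_mean, noise_sd, size=n)
    spread = sample.std(ddof=1)          # aleatory: does not shrink with n
    std_error = spread / np.sqrt(n)      # epistemic: shrinks as n grows
    return std_error, spread

rng = np.random.default_rng(0)
results = {n: uncertainty_split(n, rng) for n in (10, 100, 10_000)}
for n, (std_error, spread) in results.items():
    print(f"n={n:>6}  std error of mean={std_error:.3f}  outcome spread={spread:.3f}")
```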

Mitigation strategies

Bayesian credible intervals (vs frequentist CI).
Bootstrap resampling: distribution-free uncertainty.
Cross-validation: estimate of generalization error.
Sensitivity analysis: perturb parameters and observe the output.
Pre-registration: guards against p-hacking.
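
To make the first item concrete, the sketch below compares a Bayesian credible interval with a frequentist (Wald) confidence interval for a proportion. The data (7 successes in 20 trials) and the flat Beta(1, 1) prior are assumptions chosen for illustration only. The posterior is sampled with NumPy instead of using a closed-form quantile function, to keep the example dependency-free.

```python
import numpy as np

successes, trials = 7, 20          # hypothetical data
rng = np.random.default_rng(0)

# Bayesian: Beta(1, 1) prior + binomial likelihood -> Beta(1 + s, 1 + f) posterior
posterior = rng.beta(1 + successes, 1 + trials - successes, size=100_000)
cred_lo, cred_hi = np.percentile(posterior, [2.5, 97.5])

# Frequentist: Wald interval, p_hat +/- 1.96 * standard error
p_hat = successes / trials
half_width = 1.96 * np.sqrt(p_hat * (1 - p_hat) / trials)
ci_lo, ci_hi = p_hat - half_width, p_hat + half_width

print(f"95% credible interval:   [{cred_lo:.3f}, {cred_hi:.3f}]")
print(f"95% confidence interval: [{ci_lo:.3f}, {ci_hi:.3f}]")
```

The two intervals are numerically close here but answer different questions: the credible interval is a probability statement about the parameter given the data and prior, while the confidence interval describes the long-run behaviour of the procedure.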

Applications

Medical trials (FDA Phase III).
ML model deployment (Bayesian deep learning).
Climate modeling (ensembles).
Economics (DSGE models).

💻 Patterns

1. Bayesian Linear Regression (PyMC)

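import pymc as pm

# x_obs, y_obs: observed 1-D arrays of equal length (assumed to be defined by the caller)
with pm.Model() as model:
    alpha = pm.Normal('alpha', 0, 10)    # prior on the intercept
    beta = pm.Normal('beta', 0, 10)      # prior on the slope
    sigma = pm.HalfNormal('sigma', 5)    # prior on the noise scale
    mu = alpha + beta * x_obs
    y = pm.Normal('y', mu=mu, sigma=sigma, observed=y_obs)
    trace = pm.sample(2000, tune=1000)
# The posterior draws in trace yield credible intervals directly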

2. Bootstrap Confidence Interval

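import numpy as np

def bootstrap_ci(data, stat_fn, n=10_000, alpha=0.05):
    # Percentile bootstrap: resample with replacement and recompute the statistic n times
    boots = [stat_fn(np.random.choice(data, len(data), replace=True))
             for _ in range(n)]
    # Lower and upper percentiles, e.g. a 95% interval for alpha=0.05
    return np.percentile(boots, [100*alpha/2, 100*(1-alpha/2)])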

3. Sensitivity Analysis (Sobol)

import numpy as np
from SALib.analyze import sobol
from SALib.sample.sobol import sample as sobol_sample

problem = {'num_vars': 3, 'names': ['x1','x2','x3'],
           'bounds': [[0,1]]*3}
param_values = sobol_sample(problem, 1024)
# model: the function under study, mapping one parameter vector to a scalar (assumed defined)
Y = np.array([model(p) for p in param_values])
Si = sobol.analyze(problem, Y)  # Si['S1']: first-order indices, Si['ST']: total-order indices

4. Cross-Validation

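from sklearn.model_selection import cross_val_score
# model: any scikit-learn regressor; X, y: feature matrix and target (assumed defined)
scores = cross_val_score(model, X, y, cv=10, scoring='neg_mean_squared_error')
# Mean and fold-to-fold standard deviation of the 10 per-fold MSE values
print(f"MSE: {-scores.mean():.3f} ± {scores.std():.3f}")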

5. Robust Statistics (M-estimator)

from sklearn.linear_model import HuberRegressor
# Outlier-resistant: Huber loss is quadratic for small residuals and linear for large ones
huber = HuberRegressor(epsilon=1.35).fit(X, y)
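
The quadratic-plus-linear idea is easiest to see on a simpler problem than regression. The sketch below swaps in a Huber M-estimate of location (a robust "mean"), computed by iteratively reweighted averaging; the function `huber_location`, the MAD-based scale, and the contaminated sample are all illustrative assumptions and not part of scikit-learn or the source.

```python
import numpy as np

def huber_location(x, delta=1.35, n_iter=50):
    """Huber M-estimate of location via iteratively reweighted averaging."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    scale = 1.4826 * np.median(np.abs(x - mu))   # MAD-based robust scale, held fixed
    for _ in range(n_iter):
        r = np.abs(x - mu) / scale
        # Weight 1 in the quadratic zone, delta/|r| in the linear zone
        w = np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))
        mu = np.sum(w * x) / np.sum(w)
    return mu

rng = np.random.default_rng(0)
clean = rng.normal(10.0, 1.0, size=95)           # hypothetical well-behaved data
data = np.concatenate([clean, np.full(5, 100.0)])  # plus 5% gross outliers

plain_mean = data.mean()
robust_mean = huber_location(data)
print(f"mean={plain_mean:.2f}  huber={robust_mean:.2f}")
```

Points far from the current estimate get weight `delta/|r|`, so their pull is bounded instead of growing with their distance, which is why the robust estimate stays near the bulk of the data while the plain mean is dragged toward the outliers.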

6. Conformal Prediction (Distribution-Free)

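# A 2026 standard: the coverage guarantee is model-agnostic (it assumes exchangeable data)
calib_residuals = np.abs(y_calib - model.predict(X_calib))
n_calib = len(calib_residuals)
# Finite-sample corrected quantile level for 95% coverage
q_level = min(1.0, np.ceil((n_calib + 1) * 0.95) / n_calib)
q_hat = np.quantile(calib_residuals, q_level, method='higher')
# Prediction interval: [pred - q_hat, pred + q_hat]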
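
An end-to-end check of the coverage claim might look like the following. It is a sketch under stated assumptions: the synthetic linear data, the `make_data` and `predict` helpers, and the least-squares line fitted with `np.polyfit` are all stand-ins for whatever model and data are actually in use. It applies split conformal prediction with the finite-sample quantile level ceil((n+1)(1-alpha))/n and then measures empirical coverage on held-out points.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Hypothetical data: y = 3x + 1 plus Gaussian noise."""
    x = rng.uniform(0, 10, n)
    y = 3.0 * x + 1.0 + rng.normal(0, 2.0, n)
    return x, y

x_train, y_train = make_data(500)
x_calib, y_calib = make_data(500)
x_test, y_test = make_data(2000)

# Any point predictor works; here a least-squares line fitted on the training split
coef = np.polyfit(x_train, y_train, deg=1)

def predict(x):
    return np.polyval(coef, x)

alpha = 0.05
scores = np.abs(y_calib - predict(x_calib))          # calibration residuals
n = len(scores)
q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
q_hat = np.quantile(scores, q_level, method="higher")

# Fraction of test points whose true value falls inside [pred - q_hat, pred + q_hat]
coverage = np.mean(np.abs(y_test - predict(x_test)) <= q_hat)
print(f"q_hat={q_hat:.2f}  empirical coverage={coverage:.3f}")
```

The guarantee is marginal (averaged over calibration and test draws), so the empirical coverage on one split will fluctuate around the 95% target.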

Decision criteria

Situation	Approach
Small n, prior knowledge	Bayesian (PyMC, Stan)
Large n, distribution-free	Bootstrap + conformal
Causal claim	RCT > observational + IV/DiD
Heavy outliers	Huber / RANSAC
Multiple comparisons	BH-FDR / Bonferroni

Default: report a point estimate with a 95% interval; prioritize effect size over significance.
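
For the multiple-comparisons row, the difference between the two corrections can be sketched in a few lines. The p-values below are made up for illustration, and `benjamini_hochberg` is a hand-written helper, not a library function. Bonferroni controls the family-wise error rate with a single threshold alpha/m, while Benjamini-Hochberg controls the false discovery rate with a rank-dependent threshold q*k/m.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m      # q * k / m for rank k
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])          # largest rank passing its threshold
        reject[order[: k + 1]] = True             # reject everything up to that rank
    return reject

# Hypothetical p-values from 10 tests
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh = benjamini_hochberg(p_values)
bonferroni = np.asarray(p_values) <= 0.05 / len(p_values)

print("uncorrected p<0.05:", int(np.sum(np.asarray(p_values) < 0.05)))
print("BH-FDR rejections: ", int(bh.sum()))
print("Bonferroni:        ", int(bonferroni.sum()))
```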

🔗 Graph

Parent: Statistics · Probability Theory
Variants: Bayesian Inference
Applications: Statistical-Power · Multivariate-Analysis
Adjacent: Epistemology

🤖 LLM usage

When: study design review, uncertainty communication, suggesting robustness checks. When not: deterministic systems (compilers, hashes), or anywhere cryptographic exactness is required.

❌ Anti-patterns

p<0.05 cult: ignoring effect size and leaving multiple testing uncorrected.
HARKing: hypothesizing after the results are known.
Overconfident point estimate: no ± uncertainty reported.
Garrison the data: removing outliers arbitrarily.
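
Why uncorrected multiple testing is an anti-pattern can be shown with a short simulation. This is a sketch assuming independent tests with continuous test statistics, in which case p-values are uniform on [0, 1] when the null is true; the numbers of tests and simulated studies are arbitrary choices. It estimates how often a study running 20 null tests reports at least one "significant" result at p<0.05.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tests, n_studies, alpha = 20, 100_000, 0.05

# Under a true null, each p-value is uniform on [0, 1] (assumption: independent tests)
p = rng.uniform(size=(n_studies, n_tests))
any_hit = (p < alpha).any(axis=1).mean()      # share of studies with >= 1 false positive
analytic = 1 - (1 - alpha) ** n_tests         # closed form for independent tests

print(f"simulated: {any_hit:.3f}  analytic: {analytic:.3f}")
```

With 20 independent null tests, roughly two studies in three produce at least one spurious finding, even though each individual test keeps its nominal 5% error rate.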

🧪 Verification / duplicates

Verified (Gelman, BDA; Wasserman, All of Statistics; ASA p-value statement).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup — Bayesian/bootstrap/conformal patterns
