2nd/10_Wiki/Topics/AI_and_ML/Sensitivity-Analysis.md
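koriweb d8a80f6272 chore(wiki): canonical normalization of dangling links (768 files / 1,200 links)
Replaced [[wikilinks]] that differed from their target only in spelling (notation variants)
with the target document's canonical title, reconnecting 1,200 broken links. Only
title/filename normalization matches were applied; alias matching was excluded because of
the risk of over-merging (ambiguity guard). Originals are backed up in _link_reconcile_backup/.
Tool: Datacollect/scripts/link_reconcile_apply.mjs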

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 12:24:15 +09:00


---
id: wiki-2026-0508-sensitivity-analysis
title: Sensitivity Analysis
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Sobol, Morris, SALib, Feature Importance]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [statistics, ml-interpretability, uncertainty]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: SALib / scikit-learn / SHAP
---
# Sensitivity Analysis
## One-liner
> **"How much does a change in each input affect the output, and where?"** Sobol indices (variance decomposition), Morris elementary effects (screening), and ML interpretability methods (SHAP, permutation importance) all belong to the sensitivity analysis family. 2026 default: SALib (classic SA) + SHAP (ML models).
## Core concepts
### Local vs Global
- **Local**: gradient at a single point (∂y/∂x). Fast, but misleading for nonlinear models.
- **Global**: samples the full input space (Sobol/Morris/FAST). Gives the more honest picture.
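A minimal sketch of why a local gradient can mislead, assuming a hypothetical test function `f(x1, x2) = x1 + 4·x2²` and inputs uniform on [-1, 1] (neither is from the original note). At the center of the range the gradient says `x2` does nothing, while sampling the whole range shows `x2` contributes most of the output variance.

```python
import numpy as np

def f(x1, x2):
    # Hypothetical test function: x1 acts linearly, x2 acts through a square.
    return x1 + 4.0 * x2 ** 2

def local_gradient(func, point, eps=1e-6):
    """Central finite-difference gradient of func at a single point."""
    point = np.asarray(point, dtype=float)
    grad = np.zeros_like(point)
    for i in range(point.size):
        step = np.zeros_like(point)
        step[i] = eps
        grad[i] = (func(*(point + step)) - func(*(point - step))) / (2 * eps)
    return grad

# Local view at the center of the input range: x2 looks irrelevant.
grad_at_origin = local_gradient(f, [0.0, 0.0])

# Global view: f is additive, so varying one input over its full range
# (the other held at 0) gives that input's variance contribution.
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=20_000)
var_from_x1 = np.var(f(u, 0.0))  # theory: 1/3
var_from_x2 = np.var(f(0.0, u))  # theory: 16 * (1/5 - 1/9) = 64/45

print("local gradient:", grad_at_origin)
print("variance from x1:", var_from_x1, "variance from x2:", var_from_x2)
```

The shortcut of varying one input at a time only recovers variance contributions because this toy function is additive; with interactions, a variance-based method such as Sobol is needed.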
### Method classification
- **Screening (Morris)**: cheap; identifies the important factors among many. Costs r·(k+1) runs for r trajectories and k inputs.
- **Variance-based (Sobol)**: S1 (first-order) and ST (total-order) indices. Needs a Saltelli sample of N·(2k+2) runs.
- **Regression-based**: standardized regression coefficients (SRC).
- **ML feature importance**: permutation, SHAP, integrated gradients.
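The regression-based entry has no pattern of its own further down, so here is a minimal NumPy sketch of standardized regression coefficients. The helper name `standardized_regression_coefficients` and the synthetic data are illustrative assumptions, not part of any library. Each SRC is the least-squares slope rescaled by std(x_i)/std(y); for a near-linear model with independent inputs, the squared SRCs sum to roughly R².

```python
import numpy as np

def standardized_regression_coefficients(X, y):
    """Return (SRC per input column, R^2) from an ordinary least-squares fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # intercept + inputs
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    src = beta[1:] * X.std(axis=0) / y.std()        # rescale slopes to unit variance
    residual = y - design @ beta
    r2 = 1.0 - residual.var() / y.var()
    return src, r2

# Synthetic, hypothetical data: x1 dominates, x2 matters a little, x3 is inert.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(2000, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.1, size=2000)

src, r2 = standardized_regression_coefficients(X, y)
print("SRC:", src, "R^2:", r2)
```

A low R² is the signal that the linear fit does not describe the model well, in which case the SRC ranking should not be trusted.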
### Applications
1. Engineering tolerance: which parameter causes the yield drop.
2. Climate/epidemiology models: input uncertainty propagation.
3. ML model debugging: which features drive the prediction.
4. Hyperparameter search prior: tune only the important hyperparameters.
## 💻 Patterns
### Sobol indices (SALib)
```python
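# Note: newer SALib releases deprecate this module in favor of SALib.sample.sobol.
from SALib.sample import saltelli
from SALib.analyze import sobol
import numpy as np

# `model` is assumed to be a user-supplied function model(x1, x2, x3) -> float.
problem = {
    'num_vars': 3,
    'names': ['x1', 'x2', 'x3'],
    'bounds': [[0, 1]] * 3,
}
# N = 1024 base samples -> N * (2k + 2) = 8192 model runs for k = 3.
param_values = saltelli.sample(problem, 1024)
Y = np.array([model(*row) for row in param_values])
Si = sobol.analyze(problem, Y)
print(Si['S1'], Si['ST'])  # first-order + total-order indices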
```
### Morris screening
```python
from SALib.sample.morris import sample
from SALib.analyze import morris

# Reuses `problem` and `model` from the Sobol example above.
X = sample(problem, N=100, num_levels=4)  # 100 trajectories -> 100 * (k + 1) runs
Y = np.array([model(*r) for r in X])
Mi = morris.analyze(problem, X, Y, num_levels=4)
print(Mi['mu_star'], Mi['sigma'])  # importance, nonlinearity/interaction
```
### Permutation importance (sklearn)
```python
from sklearn.inspection import permutation_importance

# `model` is a fitted estimator; `features` lists the column names of X_val.
r = permutation_importance(model, X_val, y_val, n_repeats=20, random_state=0)
for i in r.importances_mean.argsort()[::-1]:  # most important first
    print(f"{features[i]}: {r.importances_mean[i]:.3f} ± {r.importances_std[i]:.3f}")
```
### SHAP for any model
```python
import shap
explainer = shap.TreeExplainer(xgb_model) # or shap.Explainer for general
sv = explainer(X_val)
shap.plots.beeswarm(sv) # global
shap.plots.waterfall(sv[0]) # local
```
### Tornado plot (one-at-a-time)
```python
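# `model` takes keyword arguments; `defaults` maps name -> baseline value,
# `bounds` maps name -> (lo, hi).
base = model(**defaults)  # baseline output, the center line of the tornado
deltas = []
for k, (lo, hi) in bounds.items():
    lo_y = model(**{**defaults, k: lo})  # vary one input, hold the rest at defaults
    hi_y = model(**{**defaults, k: hi})
    deltas.append((k, hi_y - lo_y))
deltas.sort(key=lambda x: abs(x[1]), reverse=True)  # widest bar first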
```
### Variance decomposition w/ ANOVA
```python
import statsmodels.api as sm
from statsmodels.formula.api import ols
m = ols('y ~ x1 + x2 + x3 + x1:x2', data=df).fit()
print(sm.stats.anova_lm(m, typ=2))
```
## Decision criteria
| Situation | Approach |
|---|---|
| 100+ inputs, screen first | Morris |
| <20 inputs, full ranking | Sobol |
| ML black-box | SHAP / permutation |
| Linear-ish model | SRC |
| One-shot intuition | Tornado |
**Default**: SALib Sobol (simulation), SHAP (ML model).
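The choice between Morris and Sobol is largely a model-run budget question. A rough sketch using the run-count formulas from this note; the helper names `morris_runs` and `sobol_runs` are hypothetical, and the N·(k+2) case assumes second-order indices are skipped, as SALib's sampler allows.

```python
def morris_runs(k: int, r: int) -> int:
    """Model runs for Morris screening: r trajectories of k + 1 points each."""
    return r * (k + 1)

def sobol_runs(k: int, n: int, second_order: bool = True) -> int:
    """Model runs for a Saltelli sample with n base samples and k inputs."""
    return n * (2 * k + 2) if second_order else n * (k + 2)

# 100 inputs: screening is cheap, a full Sobol analysis is not.
print(morris_runs(k=100, r=10))   # 1010
print(sobol_runs(k=100, n=1024))  # 206848

# 10 inputs: Sobol becomes affordable for a cheap simulator.
print(sobol_runs(k=10, n=1024))   # 22528
```

A common workflow is to screen with Morris first, fix the clearly unimportant inputs, and then run Sobol on the reduced set.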
## 🔗 Graph
- Parent: [[Statistics]] · [[Epistemic-Uncertainty|Uncertainty-Quantification]]
- Variants: [[SHAP]]
- Applications: [[Hyperparameters|Hyperparameter-Tuning]]
- Adjacent: [[Bayesian Inference]] · [[Monte-Carlo]]
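## 🤖 LLM usage
**When**: quantifying which inputs dominate the result of a simulation or model; ranking ML feature importance.
**When not**: strong correlation between inputs breaks the independence assumption behind Sobol indices. Use conditional SA or Shapley-based methods instead.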
## ❌ Anti-patterns
- **OAT only**: one-at-a-time analysis misses interactions.
- **Sample too small**: Sobol with N<512 gives very noisy estimates.
- **Ignoring correlated inputs**: violates the independence assumption.
- **SHAP = causal**: SHAP is attribution, not causality.
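The OAT pitfall is easy to reproduce. A minimal sketch, assuming a hypothetical pure-interaction model `y = x1 * x2` with both defaults at 0: every one-at-a-time delta is zero, yet the output clearly varies once both inputs move together.

```python
import numpy as np

def model(x1, x2):
    # Hypothetical pure-interaction model: no input matters on its own.
    return x1 * x2

defaults = {"x1": 0.0, "x2": 0.0}
bounds = {"x1": (-1.0, 1.0), "x2": (-1.0, 1.0)}

# One-at-a-time: vary each input across its range, hold the other at its default.
oat_deltas = {}
for name, (lo, hi) in bounds.items():
    lo_y = model(**{**defaults, name: lo})
    hi_y = model(**{**defaults, name: hi})
    oat_deltas[name] = hi_y - lo_y

# Global sampling: both inputs vary at once (theoretical variance is 1/9).
rng = np.random.default_rng(0)
x1 = rng.uniform(-1.0, 1.0, size=20_000)
x2 = rng.uniform(-1.0, 1.0, size=20_000)
global_variance = np.var(model(x1, x2))

print("OAT deltas:", oat_deltas)
print("global output variance:", global_variance)
```

A tornado plot of this model would show two zero-width bars, which is why OAT results should be read as intuition rather than as a ranking.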
## 🧪 Verification / duplicates
- Verified (Saltelli 2010, SALib docs, scikit-learn inspection).
- Trust level A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — Sobol/Morris/SHAP unified treatment |