---
id: wiki-2026-0508-causal-inference
title: Causal Inference
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [인과 추론, causal inference, do-calculus, Pearl, DAG, counterfactual, SCM, RCT, propensity score]
duplicate_of: none
source_trust_level: A
confidence_score: 0.93
verification_status: applied
tags: [statistics, causal-inference, dag, do-calculus, pearl, ab-testing, dowhy, econml, observational-study]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python / R
  framework: DoWhy / EconML / CausalML / pgmpy
---

# Causal Inference

## 📌 One-line insight

> **"Correlation is not causation."** Judea Pearl's ladder of causation frames the problem: observational data alone has limits, and RCTs, DAGs, and counterfactuals are the ways around them. Causal reasoning is a foundational capability for modern AI and one of the weakest areas of LLMs. It is critical in policy, medicine, and business.

## 📖 Core concepts

### Pearl's Ladder of Causation

1. **Association** (P(y|x)): correlation. This is what standard ML learns.
2. **Intervention** (P(y|do(x))): "What if I change x?"
3. **Counterfactual** (P(y_x|x', y')): "What would have happened if x had been different?"

→ LLMs are mostly stuck on rung 1.

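
The gap between rung 1 and rung 2 can be illustrated with a small simulation. This is a minimal sketch under assumed numbers: the structural model (a binary confounder `z`, a true effect of +0.1 of `x` on `y`) and every name in it are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def outcome(x, z):
    """Hypothetical mechanism for Y: the true effect of x is +0.1."""
    return rng.binomial(1, 0.1 + 0.1 * x + 0.5 * z)

# Observational world: the confounder z drives both x and y.
z = rng.binomial(1, 0.5, size=n)
x_obs = rng.binomial(1, 0.2 + 0.6 * z)
y_obs = outcome(x_obs, z)

# Rung 1 (association): P(y=1 | x=1) - P(y=1 | x=0)
assoc = y_obs[x_obs == 1].mean() - y_obs[x_obs == 0].mean()

# Rung 2 (intervention): set x by fiat for everyone, which cuts the z -> x arrow.
y_do1 = outcome(np.ones(n), z)
y_do0 = outcome(np.zeros(n), z)
interv = y_do1.mean() - y_do0.mean()

print(f"association: {assoc:.3f}, intervention: {interv:.3f}")
```

Under these assumptions the associational difference comes out near 0.4 while the interventional difference stays near the true 0.1. The difference between the two is the bias contributed by the confounder.
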
### Key concepts

#### Confounder
- A common cause Z of both X and Y (X ← Z → Y) that creates a spurious X-Y association.
- Example: ice cream sales ↔ drowning (Z = summer).

#### Mediator
- X → M → Y: M lies on the causal path and carries X's effect to Y.

#### Collider
- X → Z ← Y.
- Conditioning on Z induces a spurious correlation between X and Y!

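
Collider bias is easy to reproduce numerically. The sketch below assumes a hypothetical setup in which `x` and `y` are independent and both feed into `z`. "Conditioning" is approximated by keeping only rows where `z` exceeds an arbitrary threshold.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical data: X and Y are independent causes of Z (X -> Z <- Y).
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + rng.normal(scale=0.5, size=n)

# Unconditionally, X and Y are uncorrelated.
corr_all = np.corrcoef(x, y)[0, 1]

# Conditioning on the collider: keep only rows where Z is large.
selected = z > 1.0
corr_selected = np.corrcoef(x[selected], y[selected])[0, 1]

print(f"all rows: {corr_all:.3f}, selected on Z: {corr_selected:.3f}")
```

Within the selected rows a high `x` makes a high `y` less necessary to clear the threshold, so the two become negatively correlated even though neither causes the other.
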
#### Backdoor path
- X ← Z → Y.
- Controlling for Z closes the path.

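
When a set of observed variables Z blocks every backdoor path from X to Y and contains no descendant of X, the interventional distribution can be written using observational quantities only. For discrete Z, the standard backdoor adjustment formula is:

$$
P(y \mid do(x)) = \sum_{z} P(y \mid x, z)\, P(z)
$$

The key difference from plain conditioning is the weight: strata are weighted by P(z), not by P(z | x).
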
#### Frontdoor
- X → M → Y, used when the confounder between X and Y is unobserved but the mediator M is observed.

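
Assuming M intercepts every directed path from X to Y, there is no unblocked backdoor path from X to M, and X blocks every backdoor path from M to Y, the standard front-door adjustment formula for discrete variables is:

$$
P(y \mid do(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid x', m)\, P(x')
$$

The formula chains two identifiable effects: X on M, then M on Y with X held as the adjustment variable.
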
### Methods

#### RCT (gold standard)
- Randomization breaks the link between confounders and treatment.
- Limited by ethics and cost.

#### Observational + adjustment
- **Propensity Score Matching (PSM)**.
- **Inverse Probability Weighting (IPW)**.
- **Regression discontinuity (RDD)**.
- **Difference-in-differences (DiD)**.
- **Instrumental variables (IV)**.
- **Synthetic control**.

#### Causal graph (DAG)
- Makes the assumptions explicit.
- do-calculus determines whether the effect is identifiable.

#### ML-based
- **Causal forest** (Wager-Athey).
- **Double ML** (Chernozhukov).
- **CausalGAN / counterfactual VAE**.

### Simpson's paradox
- The trend in the aggregate reverses within subgroups.
- Examples: Berkeley admissions, kidney stone treatment.
- → Stratify by the confounder.

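
The kidney stone case can be worked through with the commonly cited counts from that study. The figures are quoted from memory of the standard textbook presentation and should be checked against a primary source before reuse. The sketch compares the aggregate success rate, the per-stratum rates, and a rate standardized over stone size.

```python
# Commonly cited kidney stone figures: (successes, patients) per treatment and stone size.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}
sizes = ("small", "large")

def rate(pairs):
    """Pooled success rate over a list of (successes, patients) pairs."""
    successes = sum(s for s, _ in pairs)
    patients = sum(p for _, p in pairs)
    return successes / patients

# Aggregate view: ignores stone size.
overall = {arm: rate(list(by_size.values())) for arm, by_size in data.items()}

# Stratified view: success rate within each stone size.
within = {size: {arm: rate([data[arm][size]]) for arm in data} for size in sizes}

# Standardized rate: weight each stratum by its share of all patients.
stratum_n = {size: sum(data[arm][size][1] for arm in data) for size in sizes}
total_n = sum(stratum_n.values())
adjusted = {
    arm: sum(within[size][arm] * stratum_n[size] / total_n for size in sizes)
    for arm in data
}

print("overall:", overall)
print("within strata:", within)
print("adjusted:", adjusted)
```

Treatment B looks better in the aggregate, yet A is better within both stone sizes and after standardization. The reversal happens because A was given more often to the harder large-stone cases.
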
### Applications
1. **A/B test** plus follow-up causal analysis.
2. **Pricing**: price → demand.
3. **Marketing attribution**: channel → conversion.
4. **Medicine**: treatment effect.
5. **Policy**: minimum wage.
6. **Education**: program effect.
7. **Recommender**: a click is not the same as a caused conversion.

### Modern tools
- **DoWhy** (Microsoft): 4-step framework.
- **EconML** (Microsoft).
- **CausalML** (Uber).
- **pgmpy**: graphical models.
- **GeNIe / Hugin**: visual modeling tools.
- **DAGitty**: web-based DAG editor.

### LLM limitations
- Strong at association.
- Emit spurious relationships with confidence.
- Weak at causal reasoning.
- Trend: hybrid systems (LLM + symbolic causal model).

## 💻 Patterns

### DoWhy (4-step framework)
```python
from dowhy import CausalModel

# Assumes df is a pandas DataFrame containing the columns named below.

# 1. Model
model = CausalModel(
    data=df,
    treatment='ad_exposure',
    outcome='conversion',
    common_causes=['age', 'income', 'past_purchases'],
)

# 2. Identify
estimand = model.identify_effect(proceed_when_unidentifiable=False)
print(estimand)

# 3. Estimate
estimate = model.estimate_effect(estimand, method_name='backdoor.propensity_score_matching')
print(estimate.value)

# 4. Refute
refutation = model.refute_estimate(estimand, estimate, method_name='placebo_treatment_refuter')
print(refutation)
```

### Propensity Score Matching
```python
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Assumes df is a pandas DataFrame with binary `treatment` and numeric `outcome`
# columns, and X_covariates holds the covariates for the same rows.

# Propensity score = P(treatment=1 | covariates)
ps_model = LogisticRegression(max_iter=1000)
ps_model.fit(X_covariates, df['treatment'])
# Store the score as a column so it stays aligned with df whatever its index is.
df = df.assign(ps=ps_model.predict_proba(X_covariates)[:, 1])

# Match each treated unit to its nearest control on the score (with replacement)
treated = df[df['treatment'] == 1]
control = df[df['treatment'] == 0]

knn = NearestNeighbors(n_neighbors=1).fit(control[['ps']].to_numpy())
matches = knn.kneighbors(treated[['ps']].to_numpy(), return_distance=False)

# Matching treated units to controls estimates the ATT (effect on the treated), not the ATE
att = treated['outcome'].mean() - control.iloc[matches.flatten()]['outcome'].mean()
```

### IPW (Inverse Probability Weighting)
```python
import numpy as np

def ipw_ate(df, treatment, outcome, ps):
    """Normalized IPW estimate of the ATE; `ps` is P(treatment=1 | covariates) per row."""
    t = df[treatment].to_numpy()
    y = df[outcome].to_numpy()
    # Treated rows are weighted by 1/ps, control rows by 1/(1 - ps).
    weight = np.where(t == 1, 1 / ps, 1 / (1 - ps))
    treated_avg = (y * t * weight).sum() / weight[t == 1].sum()
    control_avg = (y * (1 - t) * weight).sum() / weight[t == 0].sum()
    return treated_avg - control_avg
```

### Difference-in-Differences (DiD)
```python
import statsmodels.api as sm

# Panel data: pre/post × treatment/control.
# Assumes df has `period`, `group`, and `outcome` columns and that
# `treatment_period` is the first period in which the treatment applies.
df['post'] = (df['period'] >= treatment_period).astype(int)
df['treated'] = (df['group'] == 'treated').astype(int)
df['interaction'] = df['post'] * df['treated']

model = sm.OLS(df['outcome'], sm.add_constant(df[['post', 'treated', 'interaction']])).fit()
# The interaction coefficient is the DiD estimate of the treatment effect
print(model.summary())
```

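
### Instrumental variables (2SLS sketch)

Instrumental variables are listed among the methods but have no pattern of their own, so here is a minimal sketch of two-stage least squares with a single instrument, written in plain NumPy on simulated data. The data-generating process, the value of `true_effect`, and the helper `ols_slope` are assumptions made for illustration. A real analysis would use a dedicated estimator that also reports valid standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
true_effect = 2.0   # hypothetical ground truth, used only for this simulation

u = rng.normal(size=n)                     # unobserved confounder
z = rng.normal(size=n)                     # instrument: moves x, no direct path to y
x = 0.8 * z + u + rng.normal(size=n)
y = true_effect * x + 3.0 * u + rng.normal(size=n)

def ols_slope(a, b):
    """Slope of the least-squares regression of b on a (with intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

# Naive regression of y on x is biased because u affects both.
naive = ols_slope(x, y)

# Stage 1: predict x from the instrument. Stage 2: regress y on the prediction.
x_hat = x.mean() + ols_slope(z, x) * (z - z.mean())
iv = ols_slope(x_hat, y)

print(f"naive: {naive:.3f}, 2SLS: {iv:.3f}")
```

The estimate is only as good as the instrument: `z` must affect `x`, and it must reach `y` only through `x`. The second condition cannot be tested from the data alone.
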
### Causal Forest (heterogeneous treatment effect)
```python
from econml.dml import CausalForestDML

# Assumes df / df_test are pandas DataFrames; `features` are the columns that
# drive effect heterogeneity and `confounders` are additional controls.
cf = CausalForestDML(
    n_estimators=200,
    discrete_treatment=True,
    random_state=42,
)
cf.fit(Y=df['outcome'], T=df['treatment'], X=df[features], W=df[confounders])

# Conditional average treatment effect (CATE) for each test row
cate = cf.effect(df_test[features])

# 95% confidence interval
lower, upper = cf.effect_interval(df_test[features], alpha=0.05)
```

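
### Double ML (partialling-out sketch)

The idea behind Double ML can be sketched without any causal library: residualize both treatment and outcome on the confounders with cross-fitting, then regress residual on residual. The example below is a simplified illustration on simulated data. Polynomial regression (`fit_predict`) stands in for an arbitrary ML learner, and the data-generating process and `theta` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
theta = 1.5   # hypothetical true effect of t on y

x = rng.uniform(-2, 2, size=n)                              # observed confounder
t = np.sin(x) + rng.normal(size=n)                          # treatment depends on x
y = theta * t + 2.0 * np.exp(x / 2) + rng.normal(size=n)    # outcome depends on t and x

def fit_predict(x_train, target_train, x_test, degree=5):
    """Nuisance model: polynomial regression standing in for any ML learner."""
    coefs = np.polyfit(x_train, target_train, degree)
    return np.polyval(coefs, x_test)

# Cross-fitting: residualize each fold with models trained on the other fold.
folds = np.array_split(rng.permutation(n), 2)
t_res = np.empty(n)
y_res = np.empty(n)
for k, test in enumerate(folds):
    train = folds[1 - k]
    t_res[test] = t[test] - fit_predict(x[train], t[train], x[test])
    y_res[test] = y[test] - fit_predict(x[train], y[train], x[test])

# Final stage: regress outcome residuals on treatment residuals.
theta_hat = (t_res @ y_res) / (t_res @ t_res)

# For comparison: regression of y on t that ignores x.
naive = np.cov(t, y)[0, 1] / np.var(t, ddof=1)

print(f"naive: {naive:.3f}, double ML: {theta_hat:.3f}")
```

Cross-fitting keeps each row's residual from being computed by a model that saw that row, which is what allows flexible learners in the nuisance step. In practice `econml.dml` provides maintained estimators for this.
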
### DAG + Pearl's do-calculus (pgmpy)
```python
# Depending on the installed pgmpy version, this class may instead be named
# DiscreteBayesianNetwork; adjust the import if BayesianNetwork is unavailable.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference.CausalInference import CausalInference

# X → Y, X → Z → Y
model = BayesianNetwork([('X', 'Y'), ('X', 'Z'), ('Z', 'Y')])
# ... add CPDs ...

ci = CausalInference(model)

# P(Y | do(X = 1))
result = ci.query(variables=['Y'], do={'X': 1})
print(result)
```

### Synthetic control (state policy effect)
```python
# A weighted combination of control units is fitted to mimic the treated unit
from synthetic_control import SyntheticControl  # hypothetical library, for illustration only

sc = SyntheticControl(
    treated_unit='California',
    control_pool=other_states,
    pre_period=range(1990, 2000),
    post_period=range(2000, 2010),
)
sc.fit(predictors=['gdp', 'unemployment', 'income'])
effect = sc.treatment_effect()
```

### Refutation (sensitivity analysis)
```python
from dowhy import CausalModel

# Reuses `model`, `estimand`, and `estimate` from the DoWhy pattern.

# 1. Placebo treatment
refute_placebo = model.refute_estimate(
    estimand, estimate, method_name='placebo_treatment_refuter',
)
# A placebo effect close to 0 suggests the original estimate is robust.

# 2. Random common cause
refute_random = model.refute_estimate(
    estimand, estimate, method_name='random_common_cause',
)

# 3. Data subset
refute_subset = model.refute_estimate(
    estimand, estimate, method_name='data_subset_refuter',
)
```

### Simpson's paradox detector
```python
def detect_simpson(df, x_col, y_col, group_col):
    """Flag a sign reversal between the overall and the per-group correlation."""
    # Aggregate correlation
    overall_corr = df[x_col].corr(df[y_col])

    # Correlation within each subgroup
    subgroup_corrs = df.groupby(group_col)[[x_col, y_col]].apply(
        lambda g: g[x_col].corr(g[y_col])
    )

    if overall_corr > 0 and (subgroup_corrs < 0).all():
        return "Simpson's paradox: overall +, subgroups all -"
    if overall_corr < 0 and (subgroup_corrs > 0).all():
        return "Simpson's paradox: overall -, subgroups all +"
    return None
```

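
### Regression discontinuity (sharp RDD sketch)

RDD has no pattern of its own, so the following is a minimal sketch of a sharp design estimated by local linear regression in NumPy. The simulated data, `true_jump`, and the fixed `bandwidth` are assumptions for illustration only. Applied work normally selects the bandwidth with a data-driven procedure and reports robust standard errors.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
cutoff = 0.0
true_jump = 2.0    # hypothetical effect at the threshold
bandwidth = 0.5    # hypothetical fixed choice

running = rng.uniform(-1, 1, size=n)             # running variable
treated = (running >= cutoff).astype(float)      # sharp design: cutoff decides treatment
y = 1.0 + 1.5 * running + true_jump * treated + rng.normal(size=n)

# Local linear regression inside the bandwidth, with separate slopes on each side.
near = np.abs(running - cutoff) <= bandwidth
r = running[near] - cutoff
d = treated[near]
design = np.column_stack([np.ones(r.size), d, r, d * r])
coefs, *_ = np.linalg.lstsq(design, y[near], rcond=None)

# The coefficient on the treatment indicator is the jump in y at the cutoff.
rdd_effect = coefs[1]
print(f"estimated jump at cutoff: {rdd_effect:.3f}")
```

The estimate is local: it describes units near the cutoff and relies on those units being unable to sort themselves precisely onto one side of it.
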
## 🤔 Decision criteria
| Situation | Method |
|---|---|
| New feature launch | A/B test (RCT) |
| Historical data | DoWhy + matching |
| Heterogeneous effect | Causal Forest |
| Panel data | DiD |
| Cutoff threshold | RDD |
| Hidden confounder, valid instrument available | Instrumental Variables |
| Single treated unit | Synthetic Control |
| Many confounders modeled with ML | Double ML |

**Default**: RCT first. For observational data, use DoWhy and refute the estimate with sensitivity checks.

## 🔗 Graph
- Parent: [[Statistics]] · [[Decision Theory]]
- Variants: [[DAG]] · [[Do-Calculus]] · [[Counterfactual]]
- Adjacent: [[Bayesian Statistics]] · [[Anthropic-Principle]] · [[Beliefs]] · [[Algorithmic Fairness]]

## 🤖 LLM usage
**When**: policy decisions, marketing attribution, medical treatment, root cause analysis, fairness counterfactuals.
**When not**: pure prediction (standard ML is fine), or relying on an LLM alone (weak on rungs 2-3).

## ❌ Anti-patterns
- **Correlation = causation**: the classic mistake.
- **Controlling for a collider**: induces a spurious correlation.
- **No DAG**: assumptions stay hidden.
- **Single method**: no sensitivity check.
- **No refutation**: the estimate is fragile.
- **Unaware of Simpson's paradox**: aggregate conclusions mislead.
- **Trusting an LLM's causal claims**: they are association-level only.

## 🧪 Verification / duplicates
- Verified (Pearl "Book of Why", Hernán "Causal Inference: What If", DoWhy paper).
- Trust level A.
- Related: [[Bayesian Statistics]] · [[Algorithmic Fairness]] · [[Bias-Correction-Algorithm]] · [[A/B Testing]] · [[Anthropic-Principle]].

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Pearl ladder + DAG + DoWhy / PSM / DiD / Causal Forest code |