---
id: wiki-2026-0508-structural-equation-modeling
title: Structural Equation Modeling
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [SEM, Path-Analysis, CFA-SEM]
duplicate_of: none
source_trust_level: A
confidence_score: 0.88
verification_status: applied
tags: [statistics, sem, latent-variable, causal, psychometrics]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python-r
  framework: semopy-lavaan
---

# Structural Equation Modeling

## In one line

> **"Latent variables and paths, fitted together."** SEM fits a measurement model (CFA: observed indicators → latent factors) and a structural model (paths among latent variables) simultaneously. It originated with Jöreskog's LISREL in the 1970s; as of 2026 it is a standard method in psychology, marketing research, and epidemiology.

## Core concepts

### Components

- **Measurement model (CFA)**: observed indicators → latent factor (e.g., 4 items → "Anxiety").
- **Structural model**: directional paths among latent variables (e.g., Anxiety → Performance).
- **Latent variable**: an unobserved construct, modeled separately from measurement error.
- **Indicator**: an observed measurement; reflective (the factor causes the indicator) vs. formative (the indicators form the construct).

### Estimation

- **MLE** (default): assumes multivariate normality.
- **WLSMV**: for ordinal/categorical data.
- **Bayesian SEM**: for small samples and complex models.

### Fit indices

- **χ² / df**: < 3 is good.
- **CFI / TLI**: ≥ 0.95 is good.
- **RMSEA**: ≤ 0.06 is good, ≤ 0.08 is acceptable.
- **SRMR**: ≤ 0.08 is good.
- Do not rely on a single index; evaluate several together.

### Applications

1. Psychology: Big Five, depression-anxiety pathway.
2. Marketing: brand → satisfaction → loyalty.
3. Education: SES → study time → achievement.
4. Epidemiology: stress → cortisol → disease.

## 💻 Patterns

### 1. semopy (Python): basic CFA with a structural path

```python
import semopy

# df: pandas DataFrame with one column per indicator (q1..q4, d1..d3)
desc = """
# measurement
Anxiety =~ q1 + q2 + q3 + q4
Depression =~ d1 + d2 + d3
# structural
Depression ~ Anxiety
"""
model = semopy.Model(desc)
res = model.fit(df)
print(semopy.calc_stats(model))  # fit statistics (chi2, CFI, TLI, RMSEA, ...)
```

### 2. lavaan (R): gold standard

```r
library(lavaan)

mod <- '
  # measurement
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  # structure
  textual ~ visual
  speed   ~ textual
'
fit <- sem(mod, data=HolzingerSwineford1939)
summary(fit, fit.measures=TRUE, standardized=TRUE)
```

### 3. Mediation analysis

```r
mod <- '
  Y ~ c*X + b*M
  M ~ a*X
  ab := a*b        # indirect effect
  total := c + ab  # total effect
'
fit <- sem(mod, data=df, se="bootstrap", bootstrap=5000)
parameterEstimates(fit, boot.ci.type="bca.simple")
```

### 4. Multi-group invariance

```r
# mod: a CFA measurement model; "country" is the grouping column in df
fit_config <- cfa(mod, data=df, group="country")
fit_metric <- cfa(mod, data=df, group="country", group.equal="loadings")
anova(fit_config, fit_metric)  # tests whether equal loadings worsen fit
```

### 5. Latent growth curve

```r
mod <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4  # intercept
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4  # linear slope
  i ~~ s
'
fit <- growth(mod, data=df)
```

### 6. Modification indices

```r
modindices(fit, sort=TRUE, maximum.number=10)
# Estimates how much fit would improve if each omitted path were added.
# Not theory-driven: adding paths from this list risks cherry-picking.
```

### 7. Bootstrap CI for indirect effect

```r
fit <- sem(mod, data=df, se="bootstrap", bootstrap=5000)
parameterEstimates(fit, boot.ci.type="bca.simple", standardized=TRUE)
```

### 8. Bayesian SEM (blavaan)

```r
library(blavaan)
fit <- bsem(mod, data=df, n.chains=4, burnin=2000, sample=4000)
```

## Decision criteria

| Situation | Approach |
|---|---|
| Continuous indicators, large n | MLR (robust ML) |
| Ordinal / Likert | WLSMV |
| Small n (<200) | Bayesian (blavaan) |
| Latent + multiple regression | SEM > separate regressions (errors-in-variables) |
| Mediation | Bootstrap CI for the indirect effect |
| Longitudinal | LGC / cross-lagged panel |
| Complex / non-recursive | SEM (path-only OK) |

**Default**: lavaan (R) is the most mature option → use semopy when restricted to Python → report multiple fit indices.
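
The fit-index cutoffs listed under the fit indices heading can be applied mechanically once a model has been estimated. The sketch below is a minimal, standard-library-only illustration: `rmsea` and `assess_fit` are hypothetical helper names (not part of semopy or lavaan), the input values are made up for illustration, and the RMSEA formula uses the N − 1 convention, which is an assumption since some software divides by N instead.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate from the model chi-square.

    Assumes the (n - 1) convention; some packages use n instead.
    """
    if df <= 0 or n <= 1:
        raise ValueError("RMSEA needs df > 0 and n > 1")
    # Misfit beyond what df alone would explain, floored at zero.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def assess_fit(chi2: float, df: int, n: int,
               cfi: float, tli: float, srmr: float) -> dict:
    """Check each index against the cutoffs quoted in this note."""
    r = rmsea(chi2, df, n)
    checks = {
        "chi2_df": chi2 / df < 3,   # chi2/df < 3
        "cfi": cfi >= 0.95,         # CFI >= 0.95
        "tli": tli >= 0.95,         # TLI >= 0.95
        "rmsea": r <= 0.06,         # RMSEA <= 0.06 (good)
        "srmr": srmr <= 0.08,       # SRMR <= 0.08
    }
    return {"rmsea": r, "checks": checks, "n_passed": sum(checks.values())}


# Hypothetical values, for illustration only (not from a real dataset).
result = assess_fit(chi2=48.0, df=24, n=301, cfi=0.96, tli=0.94, srmr=0.05)
for name, ok in result["checks"].items():
    print(f"{name}: {'pass' if ok else 'fail'}")
print(f"RMSEA = {result['rmsea']:.3f}, passed {result['n_passed']} of 5")
```

Reporting every check rather than a single pass/fail verdict follows the note's advice to evaluate several indices together; a model that fails only one cutoff deserves inspection, not automatic rejection.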

## 🔗 Graph

- Parent: [[Statistics]] · [[Multivariate-Analysis]]
- Variants: [[Regression-Analysis-Foundations]] · [[Principal-Component-Analysis]]
- Applications: [[Decision Theory]] · [[Knowledge-Structure]]
- Adjacent: [[Standard-Deviation-and-Variance]] · [[Statistical-Power]]

## 🤖 LLM usage

**When**: drafting a model specification, interpreting fit indices, explaining mediation, writing up a paper.

**When not**: the estimation itself; use lavaan or semopy for that.

## ❌ Anti-patterns

- **Chasing modification indices**: adding paths only to improve fit capitalizes on chance and abandons theory.
- **Rejecting a model on χ² alone**: with large n, χ² rejects nearly any model, so weigh it together with RMSEA and CFI.
- **Confusing reflective and formative indicators**: the wrong specification biases the estimates.
- **Causal claims from cross-sectional SEM**: a directional path is not causal evidence by itself; a longitudinal or experimental design is needed.
- **Underidentified model**: df < 0 → the parameters cannot be uniquely estimated.
- **n < 200 with many parameters**: estimates are unstable; Bayesian estimation is recommended.

## 🧪 Verification / duplicates

- Verified (Kline, *Principles and Practice of SEM*; lavaan docs; Hu & Bentler 1999 cutoffs).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: SEM components, fit, lavaan/semopy patterns. |