id: wiki-2026-0508-structural-equation-modeling
title: Structural Equation Modeling
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: SEM, Path-Analysis, CFA-SEM
duplicate_of: none
source_trust_level: A
confidence_score: 0.88
verification_status: applied
tags: statistics, sem, latent-variable, causal, psychometrics
raw_sources: (empty)
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python-r
  framework: semopy-lavaan

Structural Equation Modeling

In one line

"Latent variables and paths, unified." SEM simultaneously fits a measurement model (CFA: observed indicators load on latent factors) and a structural model (paths among the latent factors). It dates to Jöreskog's LISREL in the 1970s and, as of 2026, is a standard method in psychology, marketing research, and epidemiology.

Key concepts

Components

  • Measurement model (CFA): observed indicators load on a latent factor (e.g., 4 items → "Anxiety").
  • Structural model: directional paths among latent factors (e.g., Anxiety → Performance).
  • Latent variable: an unobserved construct, estimated separately from measurement error.
  • Indicator: an observed measurement, either reflective (the factor causes the indicator) or formative (the indicators define the construct).
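
To make the measurement model concrete, the sketch below computes the covariance matrix implied by a single reflective factor: each off-diagonal entry is the product of two loadings and the factor variance, and each diagonal entry adds that indicator's residual variance. The function name `implied_covariance` and all numeric values are illustrative assumptions, not estimates from a fitted model.

```python
def implied_covariance(loadings, factor_var, residual_vars):
    """Model-implied covariance of a one-factor reflective model.

    sigma[i][j] = loadings[i] * loadings[j] * factor_var, with the
    residual (measurement-error) variance added on the diagonal.
    """
    p = len(loadings)
    sigma = [[loadings[i] * loadings[j] * factor_var for j in range(p)]
             for i in range(p)]
    for i in range(p):
        sigma[i][i] += residual_vars[i]
    return sigma


# Hypothetical values: three indicators of one latent factor.
sigma = implied_covariance([1.0, 0.8, 0.6], 1.0, [0.5, 0.36, 0.64])
for row in sigma:
    print([round(v, 2) for v in row])
```

Covariances between indicators depend only on the loadings and the factor variance, which is how the model separates shared construct variance from indicator-specific error.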

Estimation

  • ML (default): assumes multivariate normality.
  • WLSMV: for ordinal or categorical data.
  • Bayesian SEM: for small samples or complex models.

Fit indices

  • χ² / df: < 3 is good.
  • CFI / TLI: ≥ 0.95 is good.
  • RMSEA: ≤ 0.06 is good, ≤ 0.08 is acceptable.
  • SRMR: ≤ 0.08 is good.
  • Never judge fit by a single index; evaluate several together.
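
As a rough illustration of where these numbers come from, the sketch below computes χ²/df, RMSEA, CFI, and TLI from the model and baseline chi-square statistics and then checks them against the cutoffs listed above. The function names and input values are hypothetical. RMSEA here uses the N − 1 convention, and some software divides by N instead. SRMR is omitted because it needs the residual correlation matrix.

```python
import math


def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """Fit indices from model (m) and baseline (b) chi-square statistics."""
    # RMSEA: excess misfit per degree of freedom, scaled by sample size.
    rmsea = math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))
    # CFI: noncentrality of the model relative to the baseline model.
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, d_m)
    cfi = 1.0 - d_m / d_b if d_b > 0 else 1.0
    # TLI: compares chi-square/df ratios, so it penalizes complexity.
    ratio_b = chi2_b / df_b
    ratio_m = chi2_m / df_m
    tli = (ratio_b - ratio_m) / (ratio_b - 1.0)
    return {"chi2_df": ratio_m, "rmsea": rmsea, "cfi": cfi, "tli": tli}


def meets_cutoffs(ix):
    """Apply the rule-of-thumb cutoffs from the list above."""
    return {
        "chi2_df": ix["chi2_df"] < 3,
        "cfi": ix["cfi"] >= 0.95,
        "tli": ix["tli"] >= 0.95,
        "rmsea": ix["rmsea"] <= 0.06,
    }


# Hypothetical statistics for a model with 24 df fitted to 301 cases.
ix = fit_indices(chi2_m=85.0, df_m=24, chi2_b=900.0, df_b=36, n=301)
print({k: round(v, 3) for k, v in ix.items()})
print(meets_cutoffs(ix))
```

Reporting the whole dictionary rather than one entry follows the advice above to evaluate several indices together.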

Applications

  1. Psychology — Big Five, depression-anxiety pathway.
  2. Marketing — brand → satisfaction → loyalty.
  3. Education — SES → study time → achievement.
  4. Epidemiology — stress → cortisol → disease.

💻 Patterns

1. semopy (Python): basic SEM (CFA plus one structural path)

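import semopy

# df is assumed to be a pandas DataFrame with columns q1..q4 and d1..d3
desc = """
# measurement
Anxiety =~ q1 + q2 + q3 + q4
Depression =~ d1 + d2 + d3
# structural
Depression ~ Anxiety
"""
model = semopy.Model(desc)
res = model.fit(df)              # returns the optimizer result
print(model.inspect())           # parameter estimates and standard errors
print(semopy.calc_stats(model))  # fit statistics (chi2, CFI, TLI, RMSEA, ...)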

2. lavaan (R) — gold standard

library(lavaan)
mod <- '
  # measurement
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  # structure
  textual ~ visual
  speed   ~ textual
'
fit <- sem(mod, data=HolzingerSwineford1939)
summary(fit, fit.measures=TRUE, standardized=TRUE)

3. Mediation analysis

mod <- '
  Y ~ c*X + b*M
  M ~ a*X
  ab := a*b              # indirect effect
  total := c + ab
'
fit <- sem(mod, data=df, se="bootstrap", bootstrap=5000)
parameterEstimates(fit, boot.ci.type="bca.simple")
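
For readers without R, the idea behind the bootstrapped indirect effect can be sketched in plain Python. This is a simplified stand-in, not what lavaan does internally: it uses observed variables only, estimates a and b by ordinary least squares, and reports a percentile interval rather than the BCa interval requested above. The function names and the simulated data (true a = 0.5, b = 0.4) are assumptions made for illustration.

```python
import random
import statistics


def indirect_effect(x, m, y):
    """a*b from two OLS fits: M ~ X (slope a) and Y ~ X + M (slope b on M)."""
    mx, mm, my = statistics.fmean(x), statistics.fmean(m), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    # Partial slope of M in the two-predictor regression Y ~ X + M.
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b


def percentile_bootstrap(x, m, y, n_boot=500, seed=0):
    """95% percentile interval for a*b from case resampling."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    return draws[int(0.025 * n_boot)], draws[int(0.975 * n_boot) - 1]


# Simulated data with a true indirect effect of 0.5 * 0.4 = 0.2.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(300)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

ab = indirect_effect(x, m, y)
lo, hi = percentile_bootstrap(x, m, y)
print(f"indirect = {ab:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Resampling is used because the product a*b is generally not normally distributed, so a symmetric interval built from a standard error can mislead.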

4. Multi-group invariance

fit_config <- cfa(mod, data=df, group="country")
fit_metric <- cfa(mod, data=df, group="country", group.equal="loadings")
anova(fit_config, fit_metric)  # chi-square difference test: does constraining the loadings worsen fit?

5. Latent growth curve

mod <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
  i ~~ s
'
fit <- growth(mod, data=df)
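
The fixed loadings in the growth model encode the time axis: the intercept factor contributes equally at every wave, and the slope factor contributes in proportion to its loading. A minimal sketch of the implied mean trajectory follows, where `implied_means` is a hypothetical helper and the factor means are made-up values.

```python
def implied_means(intercept_mean, slope_mean, slope_loadings=(0, 1, 2, 3)):
    """Expected score at each wave: E[y_t] = mean(i) + loading_t * mean(s)."""
    return [intercept_mean + lam * slope_mean for lam in slope_loadings]


# Hypothetical factor means: start at 10 and grow 2 units per wave.
print(implied_means(10.0, 2.0))  # [10.0, 12.0, 14.0, 16.0]
```

Unequally spaced waves can be represented by changing the slope loadings, for example (0, 1, 3, 6).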

6. Modification indices

modindices(fit, sort=TRUE, maximum.number=10)
# Predicts how much fit would improve if a given path were added to the model
# Adding paths without theoretical justification risks cherry-picking

7. Bootstrap CI for indirect effect

fit <- sem(mod, data=df, se="bootstrap", bootstrap=5000)
parameterEstimates(fit, boot.ci.type="bca.simple", standardized=TRUE)

8. Bayesian SEM (blavaan)

library(blavaan)
fit <- bsem(mod, data=df, n.chains=4, burnin=2000, sample=4000)

Decision criteria

| Situation | Approach |
| --- | --- |
| Continuous indicators, large n | MLR (robust ML) |
| Ordinal / Likert | WLSMV |
| Small n (<200) | Bayesian (blavaan) |
| Latent + multiple regression | SEM over separate regressions (errors-in-variables) |
| Mediation | Bootstrap CI for the indirect effect |
| Longitudinal | LGC / cross-lagged panel |
| Complex / non-recursive | SEM (path-only OK) |

Default: lavaan (R) is the most mature option; if restricted to Python, use semopy; in either case, report multiple fit indices.
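
The errors-in-variables argument for preferring SEM over separate regressions can be illustrated with the classical attenuation result: when a predictor is measured with random error, the expected regression slope shrinks by the predictor's reliability. The helper name and numbers below are hypothetical, and the result assumes a single predictor with classical (independent, additive) measurement error.

```python
def attenuated_slope(true_slope, reliability):
    """Expected OLS slope when the predictor has classical measurement error.

    reliability = var(true score) / var(observed score), between 0 and 1.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return true_slope * reliability


# A true slope of 0.5 measured through a scale with reliability 0.7.
print(round(attenuated_slope(0.5, 0.7), 3))
```

A latent-variable model estimates the path on the error-free construct, so it targets the unattenuated slope.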

🔗 Graph

🤖 LLM usage

When to use: drafting model specifications, interpreting fit indices, explaining mediation, writing papers. When not to use: the estimation itself; run lavaan or semopy for that.

Anti-patterns

  • Chasing modification indices: adding paths only to improve fit capitalizes on chance and abandons theory.
  • Rejecting a model on χ² alone: with large n, χ² rejects almost any model, so weigh it together with RMSEA and CFI.
  • Reflective vs formative confusion: a wrong specification biases the estimates.
  • Causal claims from cross-sectional SEM: a directional path is not evidence of causation; longitudinal or experimental data are needed.
  • Underidentified model: df < 0 means the parameters cannot be uniquely estimated.
  • n < 200 with many parameters: estimates are unstable; Bayesian estimation is recommended.
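
The identification check in the list above can be done by hand before fitting: count the unique variances and covariances among the observed variables and subtract the number of free parameters. The sketch below uses hypothetical helper names. A non-negative df is necessary for identification but not sufficient.

```python
def model_df(n_observed, n_free_params):
    """Degrees of freedom = unique (co)variances minus free parameters."""
    n_moments = n_observed * (n_observed + 1) // 2
    return n_moments - n_free_params


def identification_status(df):
    """Classify a model by its degrees of freedom (counting rule only)."""
    if df < 0:
        return "underidentified"
    return "just-identified" if df == 0 else "overidentified"


# One factor, four indicators, first loading fixed to 1:
# 3 loadings + 4 residual variances + 1 factor variance = 8 parameters.
print(identification_status(model_df(4, 8)))

# One factor, two indicators: 1 loading + 2 residuals + 1 variance = 4.
print(identification_status(model_df(2, 4)))
```

The two-indicator factor has only three observed moments for four parameters, which is why a standalone factor usually needs at least three indicators.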

🧪 Verification / duplicates

  • Verified against Kline, Principles and Practice of Structural Equation Modeling; the lavaan documentation; and the Hu & Bentler (1999) cutoffs.
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: SEM components, fit, lavaan/semopy patterns. |