---
id: wiki-2026-0508-independent-component-analysis-i
title: Independent Component Analysis (ICA)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [ICA, FastICA, blind source separation, cocktail party, sklearn ICA]
duplicate_of: none
source_trust_level: A
confidence_score: 0.94
verification_status: applied
tags: [statistics, ica, source-separation, signal-processing, dimensionality]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: scikit-learn / MNE-Python
---

# Independent Component Analysis (ICA)

## One-line summary

> **"Separate a mixed signal into statistically independent sources."** The classic example is the cocktail party problem. Versus PCA: ICA optimizes for non-Gaussianity / independence, while PCA maximizes variance. Best-known algorithm: FastICA. Applications: EEG artifact removal, audio source separation.

## Core concepts

### Model

- X = AS, where X is the observed mixture, A is the mixing matrix, and S holds the independent sources.
- Estimate W ≈ A⁻¹, then recover the sources as S ≈ WX.

### vs PCA

- **PCA**: uncorrelated components, ordered by maximum variance.
- **ICA**: statistically independent components (a stronger condition than uncorrelatedness).

### Applications

1. **EEG**: removing eye-blink and muscle artifacts.
2. **Audio**: source separation (legacy; modern systems use deep learning).
3. **fMRI**: identifying the default mode network.
4. **Finance**: factor extraction.
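
### Worked check: mixing model and PCA vs ICA

The model and the PCA contrast above can be checked numerically. The sketch below is illustrative only: the mixing matrix `A`, the choice of uniform and Laplace sources, and the helper names `abs_corr` and `square_corr` are assumptions made for this example, not part of any library. It shows that a known `W = A⁻¹` recovers the sources, and that PCA whitening yields uncorrelated outputs that are still mixtures of the sources, which is the gap ICA is meant to close.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Two independent, unit-variance, non-Gaussian sources (uniform and Laplace)
S = np.c_[
    rng.uniform(-np.sqrt(3), np.sqrt(3), n),
    rng.laplace(scale=1 / np.sqrt(2), size=n),
]
A = np.array([[1.0, 0.5], [0.5, 1.0]])  # hypothetical mixing matrix
X = S @ A.T                              # row-wise form of X = AS

# With A known, W = A^-1 recovers the sources
W = np.linalg.inv(A)
S_rec = X @ W.T

# PCA whitening: decorrelate, then rescale each axis to unit variance
Xc = X - X.mean(axis=0)
eigvals, E = np.linalg.eigh(np.cov(Xc.T))
Z = (Xc @ E) / np.sqrt(eigvals)

def abs_corr(a, b):
    """|Pearson correlation| between columns of a and columns of b."""
    k = a.shape[1]
    return np.abs(np.corrcoef(a.T, b.T)[:k, k:])

def square_corr(Y):
    """Correlation between the squared columns; nonzero signals dependence."""
    return np.corrcoef((Y ** 2).T)[0, 1]

print("recovered with W:", np.allclose(S_rec, S))
print("cov(Z):\n", np.cov(Z.T).round(3))
print("|corr(Z, S)|:\n", abs_corr(Z, S).round(2))
print("squared-column corr, sources:   ", round(square_corr(S), 3))
print("squared-column corr, PCA output:", round(square_corr(Z), 3))
```

For this symmetric `A`, each whitened component should correlate roughly equally with both sources instead of matching one of them, and the squared columns of `Z` remain correlated even though `Z` itself has identity covariance. In practice `A` is unknown, so ICA has to find the remaining rotation from the data by maximizing non-Gaussianity.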
## 💻 Patterns

### scikit-learn FastICA

```python
from sklearn.decomposition import FastICA

ica = FastICA(n_components=3, random_state=0)
S = ica.fit_transform(X)   # X: [n_samples, n_features]
A = ica.mixing_            # estimated mixing matrix
# X ≈ S @ A.T + ica.mean_  (FastICA centers X before fitting)
```

### Cocktail party (3 mics, 3 speakers)

```python
import numpy as np
from sklearn.decomposition import FastICA

# Three sources: sine, square wave, Gaussian noise
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)
s2 = np.sign(np.sin(3 * t))
s3 = rng.normal(size=2000)
S = np.c_[s1, s2, s3]

A = np.array([[1, 1, 1], [0.5, 2, 1], [1.5, 1, 2]])
X = S @ A.T  # three microphone recordings

ica = FastICA(n_components=3, random_state=0)
S_estimated = ica.fit_transform(X)
```

### EEG artifact removal (MNE)

```python
import mne

raw = mne.io.read_raw_edf('eeg.edf', preload=True)
ica = mne.preprocessing.ICA(n_components=20, random_state=0).fit(raw)

# Detect eye-blink components using the EOG channel
# ('EOG' is a placeholder; use the channel name from your recording)
eog_idx, scores = ica.find_bads_eog(raw, ch_name='EOG')
ica.exclude = eog_idx
ica.apply(raw)  # modifies raw in place
```

### Whitening (preprocessing)

```python
import numpy as np

def whiten(X):
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered.T)
    U, eigvals, _ = np.linalg.svd(cov)
    # ZCA whitening matrix; the small constant guards against division by zero
    W = U @ np.diag(1 / np.sqrt(eigvals + 1e-9)) @ U.T
    return X_centered @ W
```

### FastICA (manual)

```python
def fast_ica(X, n_components, max_iter=200, tol=1e-4, seed=None):
    """Simplified FastICA via deflation.

    Returns W (n_components x n_features), the unmixing matrix in whitened
    space: sources are estimated as whiten(X) @ W.T.
    """
    rng = np.random.default_rng(seed)
    X_white = whiten(X)
    n_samples, n_features = X_white.shape
    W = np.zeros((n_components, n_features))
    for i in range(n_components):
        w = rng.standard_normal(n_features)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            wx = X_white @ w
            g = np.tanh(wx)        # nonlinearity
            g_prime = 1 - g ** 2   # its derivative
            w_new = (X_white.T @ g) / n_samples - g_prime.mean() * w
            # Deflate: orthogonalize against components already found
            for j in range(i):
                w_new -= (w_new @ W[j]) * W[j]
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w @ w_new) - 1) < tol
            w = w_new  # keep the latest estimate, including on convergence
            if converged:
                break
        W[i] = w
    return W
```

### Validate (correlation with known sources)

```python
def correlate_with_sources(estimated, true_sources):
    # For each estimated component, take its best |correlation| with any true source
    n = estimated.shape[1]
    matches = []
    for i in range(n):
        best_corr = max(
            abs(np.corrcoef(estimated[:, i], true_sources[:, j])[0, 1])
            for j in range(n)
        )
        matches.append(best_corr)
    return matches  # ideally close to 1
```

### Audio source separation (legacy)

```python
import numpy as np
import scipy.io.wavfile as wav
from sklearn.decomposition import FastICA

sr, mix = wav.read('mix.wav')  # stereo: shape [n_samples, 2]
ica = FastICA(n_components=2, random_state=0)
S = ica.fit_transform(mix.astype(np.float64))
S /= np.abs(S).max(axis=0)     # ICA scale is arbitrary; normalize to [-1, 1]
wav.write('source1.wav', sr, S[:, 0].astype(np.float32))
wav.write('source2.wav', sr, S[:, 1].astype(np.float32))
```

### Component selection (EEG)

```python
def detect_artifact_components(ica, raw):
    # Relies on EOG / ECG reference channels being present in raw
    # (an assumption about the recording)
    eog_idx, _ = ica.find_bads_eog(raw)
    ecg_idx, _ = ica.find_bads_ecg(raw)
    return sorted(set(eog_idx) | set(ecg_idx))  # union without duplicates
```

## Decision criteria

| Situation | Approach |
|---|---|
| Source separation | ICA (small N) |
| Modern audio separation | Deep learning (Demucs), not ICA |
| EEG artifact removal | MNE-Python ICA |
| fMRI | Spatial ICA |
| Variance reduction | PCA (not ICA) |

**Default**: use ICA for EEG / fMRI and deep learning for modern audio. Always whiten as a preprocessing step.

## 🔗 Graph

- Parent: [[Statistics]]
- Variant: [[FastICA]]
- Adjacent: [[PCA]] · [[Factor-Analysis]]

## 🤖 LLM usage

**When to use**: EEG / fMRI analysis; separation problems with a small number of sources.

**When not to use**: modern audio separation (use Demucs).

## ❌ Anti-patterns

- **Skipping whitening**: the iteration may converge poorly or not at all.
- **More components than sources**: the extra components model noise.
- **Gaussian sources**: ICA fails because it relies on non-Gaussianity (at most one source may be Gaussian).
- **Assuming an order**: the order of ICA components is arbitrary, as are their sign and scale.

## 🧪 Verification / duplicates

- Verified (Hyvärinen FastICA 2000, MNE docs).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: ICA + sklearn / MNE / manual / cocktail code |