---
id: wiki-2026-0508-gaussian-processes
title: Gaussian Processes (GP)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [GP, Gaussian process, kernel methods, Bayesian regression, GPR, sparse GP]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [machine-learning, gaussian-process, bayesian, kernel-methods, regression, gpytorch]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: GPyTorch / scikit-learn / GPy
---

# Gaussian Processes

## In one line
> **"A distribution over functions"**, specified by a mean function and a kernel (covariance function). State of the art on small datasets, with strong uncertainty quantification. Modern practice: GPyTorch, deep kernels, and sparse GPs for large N. The backbone of Bayesian optimization.

## Core concepts

### Model
- **Prior**: f ~ GP(m(x), k(x, x')).
- **Posterior**: the prior conditioned on the observed data.
- **Predictive**: a mean and a variance at every test point.

### Kernels
- **RBF / Gaussian**: the usual default.
- **Matérn**: less smooth than RBF.
- **Linear**.
- **Periodic**: for cyclic structure.
- **Composite**: sums and products of kernels.

### Versus other models
- **vs linear regression**: captures nonlinear relationships.
- **vs neural networks**: uncertainty is native, and it works well with small N.
- **vs random forests**: smooth predictions with calibrated uncertainty.
- **Limitation**: exact inference costs O(N³) time and O(N²) memory.

### Modern developments
- **Sparse GP** (FITC, VFE).
- **Deep Kernel Learning** (Wilson 2016).
- **Neural Tangent Kernel** (NTK).
- **GPyTorch** (scalable).

### Applications
1. **Bayesian optimization**: hyperparameter tuning, A/B testing.
2. **Surrogate modeling**.
3. **Time series**.
4. **Active learning**.
5. **Geostatistics** (kriging).
6. **Robotics**.

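
### Posterior in closed form (NumPy sketch)

The prior, posterior, and predictive steps listed under the model above can be sketched directly in NumPy. This is a minimal illustration, not any library's implementation: it assumes a zero mean function, an RBF kernel, and Gaussian observation noise. The function names `rbf_kernel` and `gp_posterior` and the parameters `length_scale`, `variance`, and `noise` are illustrative choices. The Cholesky factorization is the step responsible for the O(N³) cost.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """RBF kernel: variance * exp(-|a - b|^2 / (2 * length_scale^2))."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2, length_scale=1.0, variance=1.0):
    """Posterior mean and latent-function variance of a zero-mean GP at X_test."""
    K = rbf_kernel(X_train, X_train, length_scale, variance) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test, length_scale, variance)
    L = np.linalg.cholesky(K)  # the O(N^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    # The prior variance at each test point is k(x, x) = variance for the RBF kernel.
    var = variance - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Usage: one test point on top of an observation, one far from all data.
X_train = np.array([[-2.0], [0.0], [1.5]])
y_train = np.sin(X_train).ravel()
X_test = np.array([[1.5], [10.0]])
mean, var = gp_posterior(X_train, y_train, X_test, noise=1e-6)
```

At the observed input the posterior mean follows the observation and the variance collapses toward zero; far from the data the prediction reverts to the prior (mean near 0, variance near `variance`). This reversion is what makes the predictive variance useful for Bayesian optimization and active learning.
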
## 💻 Patterns

### scikit-learn GP
```python
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# X_train, y_train, X_test are assumed to be NumPy arrays of shape (N, D), (N,), (M, D).
kernel = C(1.0) * RBF(length_scale=1.0)
# Restarts reduce the risk of a poor local optimum in the kernel hyperparameters.
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5)
gp.fit(X_train, y_train)
mean, std = gp.predict(X_test, return_std=True)
```

### GPyTorch (scalable)
```python
import gpytorch
import torch

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, X, y, likelihood):
        super().__init__(X, y, likelihood)
        self.mean = gpytorch.means.ConstantMean()
        self.cov = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(self.mean(x), self.cov(x))

# X_train, y_train, X_test are assumed to be float tensors.
# Data and model must live on the same device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
X_train, y_train, X_test = X_train.to(device), y_train.to(device), X_test.to(device)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(X_train, y_train, likelihood).to(device)
optim = torch.optim.Adam(model.parameters(), lr=0.1)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

model.train(); likelihood.train()
for _ in range(100):
    optim.zero_grad()
    out = model(X_train)
    loss = -mll(out, y_train)  # maximize the marginal log likelihood
    loss.backward(); optim.step()

model.eval(); likelihood.eval()
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    pred = likelihood(model(X_test))
mean, std = pred.mean, pred.stddev
```

### Bayesian opt (acquisition)
```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mean, std, best_y):
    # EI for maximization; clip std so zero predictive variance does not divide by zero.
    std = np.maximum(std, 1e-12)
    z = (mean - best_y) / std
    return (mean - best_y) * norm.cdf(z) + std * norm.pdf(z)

def upper_confidence(mean, std, kappa=2.0):
    # Larger kappa favors exploration.
    return mean + kappa * std

def thompson_sample(gp, X_pool):
    # gp is a fitted scikit-learn GaussianProcessRegressor.
    # Returns the index in X_pool of the maximizer of one posterior sample.
    return gp.sample_y(X_pool, random_state=None).flatten().argmax()
```

### Sparse GP (large N)
```python
import gpytorch

class SparseGP(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        var_dist = gpytorch.variational.CholeskyVariationalDistribution(inducing_points.size(0))
        var_strategy = gpytorch.variational.VariationalStrategy(self, inducing_points, var_dist)
        super().__init__(var_strategy)
        self.mean = gpytorch.means.ConstantMean()
        self.cov = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(self.mean(x), self.cov(x))

# Train with gpytorch.mlls.VariationalELBO rather than the exact marginal log likelihood.
```

### Deep Kernel Learning
```python
import gpytorch
from torch import nn

class DeepKernelGP(gpytorch.models.ExactGP):
    def __init__(self, X, y, likelihood):
        super().__init__(X, y, likelihood)
        # The network maps raw inputs to a 16-dimensional feature space for the kernel.
        self.feature_extractor = nn.Sequential(
            nn.Linear(X.size(-1), 64), nn.ReLU(),
            nn.Linear(64, 16),
        )
        self.mean = gpytorch.means.ConstantMean()
        self.cov = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        feat = self.feature_extractor(x)
        return gpytorch.distributions.MultivariateNormal(self.mean(feat), self.cov(feat))
```

### Multi-output GP
```python
import gpytorch

# Expects y of shape (N, n_outputs) and a MultitaskGaussianLikelihood.
class MultiOutputGP(gpytorch.models.ExactGP):
    def __init__(self, X, y, likelihood, n_outputs):
        super().__init__(X, y, likelihood)
        self.mean = gpytorch.means.MultitaskMean(gpytorch.means.ConstantMean(), num_tasks=n_outputs)
        self.cov = gpytorch.kernels.MultitaskKernel(gpytorch.kernels.RBFKernel(), num_tasks=n_outputs)

    def forward(self, x):
        return gpytorch.distributions.MultitaskMultivariateNormal(self.mean(x), self.cov(x))
```

### Time series (with periodic kernel)
```python
# Inside a GP model's __init__:
periodic_kernel = gpytorch.kernels.PeriodicKernel()
trend_kernel = gpytorch.kernels.RBFKernel()
# Initial value, set after construction (the constructor has no length_scale parameter).
trend_kernel.lengthscale = 10.0
self.cov = gpytorch.kernels.ScaleKernel(periodic_kernel + trend_kernel)
```

### BO loop
```python
import numpy as np

# sample_random(bounds, n) and fit_gp(X, y) are assumed helpers: uniform sampling
# inside the bounds, and a fitted scikit-learn GaussianProcessRegressor.
def bo_loop(objective, bounds, n_iter=50, n_init=5):
    X = sample_random(bounds, n_init)
    y = np.array([objective(x) for x in X])
    for _ in range(n_iter):
        gp = fit_gp(X, y)
        candidates = sample_random(bounds, 1000)
        # return_std=True is required; predict returns only the mean by default.
        mean, std = gp.predict(candidates, return_std=True)
        ei = expected_improvement(mean, std, best_y=y.max())
        x_next = candidates[ei.argmax()]
        y_next = objective(x_next)
        X = np.vstack([X, x_next]); y = np.append(y, y_next)
    return X[y.argmax()]
```

### Calibration check
```python
import numpy as np

def calibration_plot(gp, X_test, y_test):
    mean, std = gp.predict(X_test, return_std=True)
    z_scores = (y_test - mean) / std  # should be approximately N(0, 1) if calibrated
    return np.histogram(z_scores)
```

## Decision criteria

| Situation | Approach |
|---|---|
| Small N + uncertainty | Exact GP |
| Large N | Sparse GP |
| Deep features | Deep Kernel |
| Bayesian opt | Standard GP + EI |
| Time series | Periodic + RBF |
| Multi-output | Multi-task GP |

**Default**: GPyTorch with an RBF or Matérn kernel; a sparse GP for N > 1000; a deep kernel for high-dimensional inputs; EI as the acquisition function for Bayesian optimization.

## 🔗 Graph
- Parent: [[Kernel-Methods]]
- Variant: [[Sparse-GP]]
- Applications: [[Bayesian-Optimization]] · [[Active Learning]]
- Adjacent: [[Epistemic-Uncertainty]]

## 🤖 LLM usage
**When**: small N; uncertainty is needed; Bayesian optimization; surrogate modeling.
**When not**: N > 100k (use a sparse GP instead); image or sequence data.

## ❌ Anti-patterns
- **Default kernel without thought**: encodes the wrong assumptions about the function.
- **No length-scale optimization**: the model underfits.
- **Exact GP with N > 10k**: runs out of memory or is very slow.
- **GP for images**: a deep model works better.

## 🧪 Verification / duplicates
- Verified against Rasmussen & Williams, *Gaussian Processes for Machine Learning*, and GPyTorch.
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-04-26 | Auto |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: exact + sparse + deep kernel + BO code |