---
id: wiki-2026-0508-boosting-xgboost-lightgbm
title: Boosting Algorithms (XGBoost / LightGBM / CatBoost)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [boosting, gradient boosting, GBM, XGBoost, LightGBM, CatBoost, AdaBoost, ensemble]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [ml, boosting, xgboost, lightgbm, catboost, ensemble, tabular-data, kaggle, gradient-boosting]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: XGBoost / LightGBM / CatBoost / scikit-learn
---

# Boosting Algorithms

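## 📌 One-line insight
> **"An army of mistake notebooks."** Weak learners are trained sequentially, and each new learner puts more weight on the errors of the previous ones. Boosting is still king on tabular data (vs. deep learning) and the default choice on Kaggle. XGBoost, LightGBM, and CatBoost form the trinity.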

## 📖 Core

### Algorithm history
1. **AdaBoost** (1995): re-weights misclassified samples.
2. **Gradient Boosting** (Friedman 1999): fits residuals.
3. **XGBoost** (2014): regularization + parallelism.
4. **LightGBM** (2017): GOSS + EFB.
5. **CatBoost** (2017): ordered boosting + categorical features.

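### Origin of the Gradient Boosting Machine (GBM)
- Models are added sequentially.
- At each step, the new model fits the negative gradient of the loss (the pseudo-residual; for squared error this is the ordinary residual).
- Stage-wise: models already added stay fixed.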
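
The stage-wise loop can be sketched in a few lines. This is a toy illustration rather than any library's implementation: it assumes squared-error loss (so the negative gradient is just the residual), a single feature, and depth-1 trees (stumps) as weak learners. The helper names `fit_stump`, `predict_stump`, and `gradient_boost` are made up for this example.

```python
import numpy as np

def fit_stump(x, residual):
    """Least-squares depth-1 regression tree on one feature."""
    best = None
    # Candidate thresholds: every unique value except the max, so both sides are non-empty.
    for threshold in np.unique(x)[:-1]:
        left = x <= threshold
        left_value, right_value = residual[left].mean(), residual[~left].mean()
        sse = ((residual[left] - left_value) ** 2).sum() + ((residual[~left] - right_value) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    base = y.mean()                              # initial constant model
    pred = np.full_like(y, base, dtype=float)
    stumps, history = [], []
    for _ in range(n_rounds):
        residual = y - pred                      # negative gradient of 0.5 * (y - pred)^2
        stump = fit_stump(x, residual)           # the weak learner fits the residual
        pred = pred + learning_rate * predict_stump(stump, x)  # earlier stumps stay fixed
        stumps.append(stump)
        history.append(float(np.mean((y - pred) ** 2)))
    return base, stumps, history

rng = np.random.default_rng(0)
x = rng.uniform(0, 6, size=200)
y = np.sin(x) + rng.normal(scale=0.1, size=200)
base, stumps, history = gradient_boost(x, y)
print(f"train MSE: first round {history[0]:.4f}, last round {history[-1]:.4f}")
```

For other losses only the `residual` line changes: it becomes the negative gradient of that loss with respect to the current prediction.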

### XGBoost (eXtreme Gradient Boosting)
- Regularization (L1, L2) on leaf weights.
- Second-order Taylor expansion of the loss.
- Sparsity-aware split finding.
- Parallel computation (per feature, within a tree).
- Built-in missing value handling.
- Cache-aware access.
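
To make the second-order expansion concrete, here is a small sketch of the closed-form leaf weight and split gain that follow from it. It assumes logistic loss and L2 regularization only (the L1 term is left out for brevity). The function names are illustrative, not XGBoost API.

```python
import numpy as np

def logistic_grad_hess(y_true, raw_score):
    """First and second derivatives of log loss w.r.t. the raw (pre-sigmoid) score."""
    p = 1.0 / (1.0 + np.exp(-raw_score))
    return p - y_true, p * (1.0 - p)

def leaf_weight(g, h, reg_lambda=1.0):
    """Optimal leaf value: w* = -G / (H + lambda)."""
    return -g.sum() / (h.sum() + reg_lambda)

def split_gain(g, h, left_mask, reg_lambda=1.0, gamma=0.0):
    """Loss reduction from splitting a node into left/right children."""
    def score(gs, hs):
        return gs.sum() ** 2 / (hs.sum() + reg_lambda)
    children = score(g[left_mask], h[left_mask]) + score(g[~left_mask], h[~left_mask])
    return 0.5 * (children - score(g, h)) - gamma

y = np.array([0.0, 0.0, 1.0, 1.0])
raw = np.zeros(4)                        # initial raw score 0 -> p = 0.5 everywhere
g, h = logistic_grad_hess(y, raw)
left = np.array([True, True, False, False])
gain = split_gain(g, h, left)
w_left = leaf_weight(g[left], h[left])
print(gain, w_left)
```

A split is only worth making when the gain is positive, which is how `gamma` acts as a split threshold, and a larger `reg_lambda` shrinks every leaf weight toward zero.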

### LightGBM (Microsoft)
- **GOSS** (Gradient-based One-Side Sampling): keeps all high-gradient samples and randomly subsamples the low-gradient ones.
- **EFB** (Exclusive Feature Bundling): merges mutually exclusive sparse features.
- Leaf-wise growth (vs. level-wise) → deeper trees.
- Typically the fastest of the three; friendly to large datasets.
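
The sampling step of GOSS can be sketched with NumPy. This is a simplified illustration of the idea, not LightGBM's internal code: it keeps the `top_rate` fraction of rows with the largest absolute gradient, draws an `other_rate` fraction of the dataset at random from the remainder, and up-weights those random rows by `(1 - top_rate) / other_rate` so gradient sums stay roughly unbiased. The function name `goss_sample` is hypothetical.

```python
import numpy as np

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = len(gradients)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    order = np.argsort(-np.abs(gradients))       # largest |gradient| first
    top_idx = order[:n_top]                      # always kept
    other_idx = rng.choice(order[n_top:], size=n_other, replace=False)
    weights = np.ones(n_top + n_other)
    weights[n_top:] = (1.0 - top_rate) / other_rate  # compensate for under-sampling
    return np.concatenate([top_idx, other_idx]), weights

rng = np.random.default_rng(1)
gradients = rng.normal(size=1000)
idx, weights = goss_sample(gradients)
print(len(idx), weights[0], weights[-1])
```

With the rates above, each tree is built on 30% of the rows while the hardest examples are never dropped.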

### CatBoost (Yandex)
- Native categorical feature support.
- Ordered boosting → mitigates target leakage.
- GPU support.
- Good defaults.
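
The leakage-avoiding idea is easiest to see in the categorical encoding. The sketch below is a simplified, single-permutation version of an ordered target statistic: each row is encoded using only the targets of rows that come earlier in a random order, so a row never sees its own label. It is an illustration of the principle under these assumptions, not CatBoost's implementation. The function name and the `prior_weight` smoothing parameter are chosen for this example.

```python
import numpy as np

def ordered_target_stat(categories, targets, prior, prior_weight=1.0, seed=0):
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(categories))     # artificial "time" order
    sums, counts = {}, {}
    encoded = np.empty(len(categories), dtype=float)
    for i in order:
        c = categories[i]
        s, k = sums.get(c, 0.0), counts.get(c, 0)
        # Smoothed mean of the targets seen so far for this category.
        encoded[i] = (s + prior_weight * prior) / (k + prior_weight)
        # Only now does this row's own target become visible to later rows.
        sums[c] = s + targets[i]
        counts[c] = k + 1
    return encoded, order

categories = ["a", "a", "b", "a", "b"]
targets = [1, 0, 1, 1, 0]
encoded, order = ordered_target_stat(categories, targets, prior=0.5)
print(encoded)
```

A plain target mean computed over all rows would include each row's own label, which is exactly the leakage this ordering avoids.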

### Hyperparameters (cross-tool)

#### Tree
- `max_depth` (XGBoost) / `num_leaves` (LightGBM): `max_depth` typically 5-10; `num_leaves` is a leaf count, usually kept below `2^max_depth`.
- `min_child_weight` / `min_data_in_leaf`: prevents overfitting.

#### Learning
- `learning_rate` (η): 0.01-0.3. Smaller values need more trees.
- `n_estimators` / `num_boost_round`: 100-10000.
- `subsample`: row sampling (0.7-1.0).
- `colsample_bytree`: feature sampling.

#### Regularization
- `reg_alpha` (L1): sparsity.
- `reg_lambda` (L2): shrinks leaf weights.
- `gamma` / `min_split_loss`: minimum loss reduction required to split.
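
Since the names differ between libraries, a lookup table can help when porting a configuration. The mapping below is a rough guide for the scikit-learn style wrappers, assuming recent versions. The generic keys and the `translate` helper are made up for this sketch, and the equivalents are approximate: XGBoost's `min_child_weight` is a Hessian sum rather than a row count, and CatBoost's `rsm` samples features per split rather than per tree.

```python
# Approximate parameter equivalents; None means no direct counterpart.
PARAM_MAP = {
    "tree_size":      {"xgboost": "max_depth",        "lightgbm": "num_leaves",        "catboost": "depth"},
    "min_leaf":       {"xgboost": "min_child_weight", "lightgbm": "min_child_samples", "catboost": "min_data_in_leaf"},
    "learning_rate":  {"xgboost": "learning_rate",    "lightgbm": "learning_rate",     "catboost": "learning_rate"},
    "n_trees":        {"xgboost": "n_estimators",     "lightgbm": "n_estimators",      "catboost": "iterations"},
    "row_sample":     {"xgboost": "subsample",        "lightgbm": "subsample",         "catboost": "subsample"},
    "col_sample":     {"xgboost": "colsample_bytree", "lightgbm": "colsample_bytree",  "catboost": "rsm"},
    "l1":             {"xgboost": "reg_alpha",        "lightgbm": "reg_alpha",         "catboost": None},
    "l2":             {"xgboost": "reg_lambda",       "lightgbm": "reg_lambda",        "catboost": "l2_leaf_reg"},
    "min_split_gain": {"xgboost": "gamma",            "lightgbm": "min_split_gain",    "catboost": None},
}

def translate(generic_params, library):
    """Rename generic keys for one library, dropping keys it has no counterpart for."""
    out = {}
    for key, value in generic_params.items():
        name = PARAM_MAP[key][library]
        if name is not None:
            out[name] = value
    return out

print(translate({"n_trees": 500, "l1": 0.1, "l2": 3.0}, "catboost"))
```

Values do not always transfer one-to-one (for example `tree_size` is a depth for XGBoost and CatBoost but a leaf count for LightGBM), so treat the result as a starting point for tuning.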

### Preventing overfitting
- **Early stopping**: stop when the validation metric plateaus.
- **Low learning rate + many trees**: best practice.
- **Subsample rows + columns**.
- **Regularization** (`reg_alpha`, `reg_lambda`).
- **Max depth limit**.
- **Min child weight**.
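
Early stopping itself is a simple patience rule. The sketch below shows the logic on a plain list of per-round validation scores. It is a library-independent illustration (the name `best_iteration` is made up here), and the real libraries apply the same idea during training.

```python
def best_iteration(val_scores, patience=50, higher_is_better=True):
    """Return (index, score) of the best round, stopping after `patience` rounds without improvement."""
    best_idx, best = 0, None
    for i, score in enumerate(val_scores):
        improved = best is None or (score > best if higher_is_better else score < best)
        if improved:
            best_idx, best = i, score
        elif i - best_idx >= patience:
            break                      # plateau: stop adding trees
    return best_idx, best

# Validation AUC per boosting round (toy numbers).
scores = [0.70, 0.75, 0.80, 0.79, 0.78, 0.81]
result = best_iteration(scores, patience=2)
print(result)
```

With `patience=2` the loop stops at round 4 and never sees the later 0.81, which is the trade-off a small patience value makes.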

### Tabular dominance (vs. DL)
- Small-to-medium tabular: boosting > NN.
- Categorical / mixed: CatBoost wins.
- Large tabular: LightGBM is fast.
- Image / text / audio: NN dominant.
- Reason: tabular data lacks the invariances and spatial structure that NNs exploit.

### Modern competitors
- **TabNet, FT-Transformer**: tabular NNs.
- They come close, but boosting still matches them.
- Kaggle 2024-2026: LightGBM + ensembles dominate.

## 💻 Patterns

### XGBoost (basic)
```python
import xgboost as xgb
from sklearn.model_selection import train_test_split

# X and y are assumed to be loaded already (feature matrix and binary labels)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

model = xgb.XGBClassifier(
    n_estimators=1000,
    learning_rate=0.05,
    max_depth=6,
    min_child_weight=3,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.1,
    reg_lambda=1.0,
    objective='binary:logistic',
    eval_metric='auc',
    early_stopping_rounds=50,
    n_jobs=-1,
    random_state=42,
)

model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=100)
# Positive-class probability on a separate held-out set (X_test is assumed to exist)
preds = model.predict_proba(X_test)[:, 1]
```

### LightGBM (fast)
```python
import lightgbm as lgb

model = lgb.LGBMClassifier(
    n_estimators=2000,
    learning_rate=0.03,
    num_leaves=63,           # 2^6 - 1, just under a full depth-6 tree
    min_child_samples=20,
    feature_fraction=0.8,
    bagging_fraction=0.8,
    bagging_freq=5,
    reg_alpha=0.1,
    reg_lambda=0.1,
    objective='binary',
    metric='auc',
    n_jobs=-1,
    random_state=42,
)

model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    callbacks=[lgb.early_stopping(50), lgb.log_evaluation(100)],
)
```

### CatBoost (categorical-friendly)
```python
from catboost import CatBoostClassifier

cat_features = ['gender', 'country', 'product_id']

model = CatBoostClassifier(
    iterations=2000,
    learning_rate=0.03,
    depth=6,
    l2_leaf_reg=3,
    cat_features=cat_features,
    eval_metric='AUC',
    early_stopping_rounds=50,
    random_seed=42,
    verbose=100,
)

model.fit(X_train, y_train, eval_set=(X_val, y_val))
```

### Hyperparameter tune (Optuna)
```python
import lightgbm as lgb
import optuna

def objective(trial):
    params = {
        'n_estimators': 5000,   # upper bound; early stopping picks the actual count
        'metric': 'auc',        # required so best_score_ has an 'auc' entry
        'learning_rate': trial.suggest_float('lr', 0.01, 0.1, log=True),
        'num_leaves': trial.suggest_int('num_leaves', 16, 256),
        'min_child_samples': trial.suggest_int('mcs', 5, 100),
        'feature_fraction': trial.suggest_float('ff', 0.5, 1.0),
        'bagging_fraction': trial.suggest_float('bf', 0.5, 1.0),
        'bagging_freq': 1,      # bagging_fraction has no effect unless bagging_freq > 0
        'reg_alpha': trial.suggest_float('reg_alpha', 1e-3, 10, log=True),
        'reg_lambda': trial.suggest_float('reg_lambda', 1e-3, 10, log=True),
    }
    model = lgb.LGBMClassifier(**params, n_jobs=-1, random_state=42)
    model.fit(X_train, y_train, eval_set=[(X_val, y_val)],
              callbacks=[lgb.early_stopping(50, verbose=False)])
    # Best validation AUC reached before early stopping
    return model.best_score_['valid_0']['auc']

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
```

### SHAP (interpretability)
```python
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

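# Global importance
shap.summary_plot(shap_values, X_test)

# Single prediction (first row). Depending on the SHAP version and model type,
# shap_values may be a list of per-class arrays and expected_value an array;
# in that case select one class before indexing the row.
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0])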
```

### Stacking (meta-ensemble)
```python
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression

estimators = [
    ('xgb', xgb.XGBClassifier(...)),
    ('lgb', lgb.LGBMClassifier(...)),
    ('cat', CatBoostClassifier(...)),
]

stack = StackingClassifier(
    estimators=estimators,
    final_estimator=LogisticRegression(),
    cv=5,
    n_jobs=-1,
)
stack.fit(X_train, y_train)
```

→ A go-to winning combination on Kaggle.

### GPU acceleration
```python
# XGBoost GPU (the device argument requires XGBoost >= 2.0)
xgb.XGBClassifier(tree_method='hist', device='cuda')

# LightGBM GPU
lgb.LGBMClassifier(device='gpu')

# CatBoost GPU
CatBoostClassifier(task_type='GPU', devices='0')
```

## 🤔 Decision criteria
| Situation | Tool |
|---|---|
| Default tabular | LightGBM |
| Small-medium dataset | XGBoost |
| Categorical-heavy | CatBoost |
| Large dataset (10M+) | LightGBM (GOSS) |
| GPU available | XGBoost / CatBoost GPU |
| Kaggle | LightGBM + ensemble |
| Production simple | LightGBM (fast) |
| Interpretability | XGBoost + SHAP |

**Default**: LightGBM as the baseline. CatBoost for categorical-heavy data. Stack them for an ensemble.

## 🔗 Graph
- Parents: [[Ensemble-Methods]] · [[Decision Tree]] · [[데이터 사이언스 및 ML 엔지니어링|Gradient-Descent]]
- Variants: [[XGBoost]] · [[LightGBM]] · [[CatBoost]] · [[AdaBoost]] · [[GBM]]
- Applications: [[Kaggle]] · [[SHAP]] · [[Stacking]]
- Adjacent: [[Random-Forest]] · [[Bagging]] · [[Bias vs Variance Trade-off]] · [[Optuna]]

## 🤖 LLM usage
**When**: tabular tasks. Fraud detection. Kaggle. Risk scoring. Conversion prediction.
**When not**: image / text / audio (use DL). Sequences (RNN / Transformer).

## ❌ Anti-patterns
- **No early stopping**: overfits.
- **High learning rate (0.5+)**: unstable.
- **Trusting defaults**: tune for the specific problem.
- **One-hot encoding high-cardinality categoricals**: forfeits CatBoost's native handling.
- **No SHAP**: the model cannot be interpreted.
- **Forcing DL on tabular data**: loses to boosting.
- **Single tool**: misses ensemble gains.

## 🧪 Verification / duplicates
- Verified (Chen XGBoost, Ke LightGBM, Prokhorenkova CatBoost, Kaggle dominance).
- Trust level: A.
- Related: [[XGBoost]] · [[LightGBM]] · [[CatBoost]] · [[Random-Forest]] · [[Bias vs Variance Trade-off]].

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: XGB / LGB / Cat + hyperparameters + SHAP + stacking + tuning code |