id, title, status, canonical_id, aliases, duplicate_of, source_trust_level, confidence_score, verification_status, tags, raw_sources, last_reinforced, github_commit, tech_stack

Miscellaneous AI Topics

One-line definition

A collection of niche AI topics that do not fit cleanly into the mainstream categories (NLP, CV, RL, LLM) but come up often in practice. It serves as a temporary home until a dedicated page exists, and as a gathering place for concepts too small to grow into a standalone page.

Core

Frequently overlooked areas

Causal Inference + ML: do-calculus, double ML, uplift modeling.
Differential Privacy in ML: DP-SGD, ε-budget, combination with federated learning.
Active Learning: minimizing label cost, uncertainty/diversity sampling.
Curriculum Learning: training in easy → hard order.
Continual / Lifelong Learning: catastrophic forgetting, EWC, replay.
Meta-Learning: MAML, Reptile, relationship to in-context learning.
Around AutoML: NAS, HPO, AutoFE.
Tabular Deep Learning: TabNet, FT-Transformer, comparison with GBMs.
Time Series Foundation Models: TimesFM, Chronos, Moirai.
Symbolic + Neural: neuro-symbolic, program synthesis.
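
Of the areas listed above, curriculum learning has no pattern further down this page, so here is a minimal sketch of the easy → hard ordering. The function name `curriculum_stages`, the cumulative-stage schedule, and the sentence-length difficulty proxy are illustrative assumptions, not a standard API.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_stages(
    examples: Sequence[T],
    difficulty: Callable[[T], float],
    n_stages: int = 3,
) -> Iterator[List[T]]:
    """Yield growing training subsets, easiest examples first."""
    if n_stages < 1:
        raise ValueError("n_stages must be >= 1")
    ordered = sorted(examples, key=difficulty)  # easy -> hard
    for stage in range(1, n_stages + 1):
        # Cumulative cutoff: stage k exposes the easiest k/n_stages fraction.
        cutoff = max(1, round(len(ordered) * stage / n_stages)) if ordered else 0
        yield ordered[:cutoff]


# Hypothetical difficulty proxy: longer sentences count as harder.
sentences = ["a b c d e f", "a", "a b c", "a b"]
stages = list(
    curriculum_stages(sentences, difficulty=lambda s: len(s.split()), n_stages=2)
)
```

Each stage would be used for some number of training epochs before moving to the next; how long to stay on a stage is left open here.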

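Applications

Tracking research trends, identifying candidates for promotion to a standalone page, and serving as a one-page "these exist too" index during onboarding.

💻 Patterns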

Active Learning loop

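import numpy as np

def active_learning_loop(model, pool_X, oracle, budget=1000, batch=20):
    # Assumes `model` was already fitted on a small seed set, so that
    # predict_proba works on the first iteration.
    labeled_X, labeled_y = [], []
    while len(labeled_y) < budget and len(pool_X) > 0:
        probs = model.predict_proba(pool_X)
        # Least confidence: 1 - max class probability per sample.
        uncertainty = 1 - probs.max(axis=1)
        # Pick the most uncertain samples, capped by remaining budget and pool.
        k = min(batch, budget - len(labeled_y), len(pool_X))
        idx = uncertainty.argsort()[-k:]
        x_new = pool_X[idx]
        y_new = oracle(x_new)
        labeled_X.extend(x_new); labeled_y.extend(y_new)
        pool_X = np.delete(pool_X, idx, axis=0)
        # Refit on everything labeled so far.
        model.fit(np.asarray(labeled_X), np.asarray(labeled_y))
    return model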
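
The loop above scores the pool with least confidence only. As a sketch of how that score could be swapped for other common uncertainty measures, the helper below (`uncertainty_scores` is a hypothetical name, not part of any library) computes least confidence, margin, and entropy from the same `predict_proba` output.

```python
import numpy as np


def uncertainty_scores(probs: np.ndarray, method: str = "least_confidence") -> np.ndarray:
    """Score each row of class probabilities; higher means more uncertain."""
    probs = np.asarray(probs, dtype=float)
    if method == "least_confidence":
        return 1.0 - probs.max(axis=1)
    if method == "margin":
        # A small gap between the top two classes means high uncertainty.
        top2 = np.sort(probs, axis=1)[:, -2:]
        return 1.0 - (top2[:, 1] - top2[:, 0])
    if method == "entropy":
        # Clip to avoid log(0) on one-hot rows.
        clipped = np.clip(probs, 1e-12, 1.0)
        return -(clipped * np.log(clipped)).sum(axis=1)
    raise ValueError(f"unknown method: {method}")


probs = np.array([[0.5, 0.5], [0.9, 0.1]])
scores = {
    m: uncertainty_scores(probs, m)
    for m in ("least_confidence", "margin", "entropy")
}
```

All three measures rank the first row (a 50/50 prediction) as the more uncertain one; they mainly differ once there are more than two classes.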

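EWC (Elastic Weight Consolidation) core

def ewc_loss(model, task_loss, fisher, theta_star, lam=1000):
    # fisher / theta_star: dicts keyed by parameter name, holding the diagonal
    # Fisher information and the parameter values saved after the previous task.
    penalty = 0.0
    for n, p in model.named_parameters():
        if n in fisher:
            # Penalize drift from the old weights, scaled by their importance.
            penalty += (fisher[n] * (p - theta_star[n])**2).sum()
    return task_loss + lam * penalty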
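
`ewc_loss` takes `fisher` as given. One common way to obtain it is to average squared per-sample gradients of the log-likelihood on the old task (the empirical diagonal Fisher). The NumPy sketch below does this for a toy logistic regression; the model, the data, and the names `empirical_fisher_diag` and `ewc_penalty` are assumptions made for illustration.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def empirical_fisher_diag(w: np.ndarray, X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Diagonal empirical Fisher for logistic regression p(y=1|x) = sigmoid(x @ w)."""
    # Per-sample gradient of the log-likelihood w.r.t. w: (y - p) * x
    per_sample_grads = (y - sigmoid(X @ w))[:, None] * X
    # Mean of squared per-sample gradients, one value per weight.
    return (per_sample_grads ** 2).mean(axis=0)


def ewc_penalty(w, w_star, fisher, lam=1000.0):
    """lam * sum_i F_i * (w_i - w*_i)^2, the same penalty form as ewc_loss."""
    return lam * float((fisher * (w - w_star) ** 2).sum())


rng = np.random.default_rng(0)
# Toy old-task data: only feature 0 varies, feature 1 is always zero.
X_old = np.column_stack([rng.normal(size=200), np.zeros(200)])
y_old = (X_old[:, 0] > 0).astype(float)
w_star = np.array([2.0, 0.0])  # stand-in for the weights learned on the old task

fisher = empirical_fisher_diag(w_star, X_old, y_old)
move_important = ewc_penalty(w_star + np.array([0.5, 0.0]), w_star, fisher)
move_unused = ewc_penalty(w_star + np.array([0.0, 0.5]), w_star, fisher)
```

Moving the weight the old task relied on is penalized, while moving the weight it never used costs nothing, which is the behavior EWC is after.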

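MAML inner step (concept)

import torch
from torch.func import functional_call

def maml_step(model, loss_fn, x_s, y_s, x_q, y_q, lr=0.01):
    names, params = zip(*model.named_parameters())
    # Inner step on the support set; create_graph=True keeps second-order terms.
    grads = torch.autograd.grad(loss_fn(model(x_s), y_s), params, create_graph=True)
    fast_weights = {n: p - lr * g for n, p, g in zip(names, params, grads)}
    # Query loss with the fast weights; backpropagating through it
    # updates the original parameters (the outer step).
    return loss_fn(functional_call(model, fast_weights, (x_q,)), y_q)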

DP-SGD noise addition

# Example using Opacus; assumes model, optim, and loader
# (a torch DataLoader) are already defined.
from opacus import PrivacyEngine
engine = PrivacyEngine()
model, optim, loader = engine.make_private(
    module=model, optimizer=optim, data_loader=loader,
    noise_multiplier=1.1, max_grad_norm=1.0,
)
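
To show what `noise_multiplier` and `max_grad_norm` control, here is a conceptual NumPy sketch of a single DP-SGD gradient computation: clip each per-sample gradient, sum, add Gaussian noise, then average. It is not Opacus's implementation and does no privacy accounting; `dp_sgd_gradient` is a hypothetical helper.

```python
import numpy as np


def dp_sgd_gradient(per_sample_grads, max_grad_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average."""
    rng = np.random.default_rng() if rng is None else rng
    grads = np.asarray(per_sample_grads, dtype=float)  # shape: (batch, n_params)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Scale down only the gradients whose L2 norm exceeds max_grad_norm.
    scale = np.minimum(1.0, max_grad_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Noise std is noise_multiplier * max_grad_norm, added to the summed gradient.
    noise = rng.normal(0.0, noise_multiplier * max_grad_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / len(grads)


grads = np.array([[3.0, 4.0], [0.3, 0.4]])  # L2 norms 5.0 and 0.5
noiseless = dp_sgd_gradient(grads, max_grad_norm=1.0, noise_multiplier=0.0)
noisy = dp_sgd_gradient(grads, rng=np.random.default_rng(0))
```

With the noise switched off, only the first gradient is rescaled (to norm 1.0), which makes the clipping step easy to inspect in isolation.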

Uplift Tree (causal ML)

from causalml.inference.tree import UpliftTreeClassifier
# t_arr holds group labels as strings; control rows are labeled 'control'.
uplift = UpliftTreeClassifier(control_name='control')
uplift.fit(X, treatment=t_arr, y=y_arr)
te = uplift.predict(X_test)  # treatment effect per row

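Decision criteria

Condition	Split into a standalone page?
Linked from 4 or more other pages	Split
5 or more code examples accumulated	Split
A decision-criteria table has been created	Split
Simple one-line term definition	Keep in misc
Short-lived trend topic	Keep in misc

Default: split into a standalone page once a topic has 4 or more links or could fill 250+ lines.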
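
The criteria above can be written as a small predicate. This is a sketch only: the field names are hypothetical, and checking the "keep in misc" conditions before the split conditions is an assumption, since the table does not say which wins when both apply.

```python
from dataclasses import dataclass


@dataclass
class TopicStats:
    inbound_links: int = 0           # links from other pages
    code_examples: int = 0           # accumulated code examples
    has_decision_table: bool = False
    potential_lines: int = 0         # rough estimate of a standalone page's length
    is_one_line_definition: bool = False
    is_short_lived_trend: bool = False


def should_split(stats: TopicStats) -> bool:
    """Return True when the topic should get its own page."""
    # Assumed precedence: "keep in misc" conditions are checked first.
    if stats.is_one_line_definition or stats.is_short_lived_trend:
        return False
    return (
        stats.inbound_links >= 4
        or stats.code_examples >= 5
        or stats.has_decision_table
        or stats.potential_lines >= 250
    )
```

For example, `should_split(TopicStats(inbound_links=4))` is true, while a topic with three links and an estimated 100 lines stays in misc.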

🔗 Graph

Parent: Machine-Learning
Variants: Active Learning · Continual-Learning
Applications: AutoML · Federated-Learning · Differential-Privacy
Adjacent: Neural-Architecture-Search-NAS · Hyperparameters

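🤖 LLM usage

When: adding a one-line entry to misc when a new trend keyword appears, deciding whether to split a topic into a standalone page (grep the link count), brainstorming category reassignment.

When not: keeping a core concept in misc permanently (the graph can only expand once it is promoted to a standalone page). Committing LLM answers with low definitional accuracy as-is.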
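
The link-count check mentioned above could be done in a few lines of Python instead of grep. The sketch assumes links are written as `[[Target]]`, `[[Target|label]]`, or `[[Target#section]]` and that matching is case-insensitive; both the syntax and the `inbound_link_count` helper are assumptions, and the pages are an in-memory stand-in for real files.

```python
import re
from typing import Dict

# Assumed link syntax: [[Target]], [[Target|label]], or [[Target#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def inbound_link_count(pages: Dict[str, str], target: str) -> int:
    """Count pages other than `target` that link to it at least once."""
    wanted = target.strip().lower()
    count = 0
    for name, text in pages.items():
        if name.strip().lower() == wanted:
            continue  # self-links do not count
        links = {m.strip().lower() for m in WIKILINK.findall(text)}
        if wanted in links:
            count += 1
    return count


# Hypothetical in-memory wiki: page name -> page body.
pages = {
    "Machine-Learning": "See [[Active Learning]] and [[AutoML]].",
    "AutoML": "Related: [[Active Learning|AL]] and [[active learning#loop]].",
    "Active Learning": "Self reference [[Active Learning]].",
    "Hyperparameters": "No links here.",
}
n = inbound_link_count(pages, "Active Learning")
```

Counting linking pages rather than raw link occurrences matches the "linked from 4 or more other pages" criterion; a page that mentions the target twice still counts once.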

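❌ Anti-patterns

Dumping every new concept into misc and leaving it there permanently → the wiki turns into a "junk drawer".
The misc page bloating to 1000+ lines because topics are never split into standalone pages.
Misplacing in misc a topic that belongs in a formal category (NLP/CV/RL).
Frontmatter tags so broad that search loses its discriminating power.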

🧪 Verification / duplication

Verified source: Papers With Code category tree, NeurIPS/ICML workshop tracks, ML reading lists (e.g., d2l.ai, Papers With Code SOTA). Trust level: A.

This page is a meta/category index and does not duplicate any other page.

🕓 Changelog

2026-05-08 Phase 1: initial stub.
2026-05-10 Manual cleanup: rewritten as a FULL meta-index. Added the split-criteria table and 5 code examples (active/EWC/MAML/DP-SGD/uplift).
