2nd/10_Wiki/Topics/AI_and_ML/Pattern-Recognition.md
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
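Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture and unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirects)
- Normalized link names: fixed broken links, pointed links directly at redirect targets, and mapped concepts (~2,400 changes)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections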

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00


id: wiki-2026-0508-pattern-recognition
title: Pattern Recognition
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases:
  - Pattern Classification
  - Statistical Pattern Recognition
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags:
  - pattern-recognition
  - classification
  - ml
  - signal-processing
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: scikit-learn, pytorch

Pattern Recognition

In one line

"Raw signal → category label." The field began as statistical pattern recognition in the 1960s (Bayes, kNN, LDA), moved on to neural pattern recognition in the 1980s, and by the 2020s has been absorbed into deep learning as a sub-field. Modern framing: supervised classification plus representation learning. The canonical reference is Bishop's PRML (2006).

Key points

History & framing

  • 1960s: Bayesian decision theory + linear classifiers.
  • 1970s-80s: HMM (speech), template matching (OCR).
  • 1990s: SVMs and kernel methods dominate.
  • 2010s: deep learning absorbs the field; "pattern recognition" becomes a vintage term.
  • Modern: ML/DL classification + representation learning.

Classical approaches

  • Statistical: Bayesian decision rules, MAP and maximum-likelihood (ML) estimation, GMM, LDA, QDA.
  • Neural: perceptron → MLP → CNN → transformer.
  • Structural / syntactic: grammar-based, less common today.
  • Template matching: cross-correlation, used in OCR, fingerprint.
  • Kernel methods: SVM with RBF/poly kernels.
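
As a rough illustration of the template-matching entry above, the sketch below scores every offset of a small template against an image using normalized cross-correlation. The function name `normalized_cross_correlation` and the toy cross-shaped template are assumptions made for this example, not part of any specific library; real systems would normally use an optimized routine rather than explicit loops.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Slide `template` over `image` and return the NCC score at each offset."""
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for r in range(scores.shape[0]):
        for c in range(scores.shape[1]):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            # Flat patches (zero variance) carry no pattern, so score them 0.
            scores[r, c] = (p * t).sum() / denom if denom > 0 else 0.0
    return scores

# Hypothetical toy data: a 3x3 cross pasted into a faintly noisy 12x12 image.
rng = np.random.default_rng(0)
template = np.array([[0.0, 1.0, 0.0],
                     [1.0, 1.0, 1.0],
                     [0.0, 1.0, 0.0]])
image = 0.05 * rng.standard_normal((12, 12))
image[5:8, 4:7] += template

scores = normalized_cross_correlation(image, template)
best = tuple(int(i) for i in np.unravel_index(scores.argmax(), scores.shape))
print("best match at (row, col):", best)
```

Subtracting the means and dividing by the norms makes the score insensitive to brightness and contrast changes, which is the usual reason to prefer the normalized form over a raw dot product.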

Pipeline (classic)

  1. Sensor / data acquisition.
  2. Preprocessing (denoise, normalize).
  3. Feature extraction (HOG, SIFT, MFCC), hand-crafted.
  4. Classifier (SVM, RF, NN).
  5. Post-processing (smoothing, thresholding).
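
The steps above can be sketched as a scikit-learn `Pipeline`. This is a minimal sketch under assumptions: the bundled digits dataset stands in for sensor data, and PCA stands in for the feature-extraction step (it is learned from data rather than hand-crafted like HOG, SIFT, or MFCC). Step 5, post-processing, is left out.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Step 1 (acquisition): 8x8 grayscale digit images, flattened to 64 values.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipeline = Pipeline([
    ("preprocess", StandardScaler()),    # step 2: normalize each feature
    ("features", PCA(n_components=20)),  # step 3: stand-in for HOG/SIFT/MFCC
    ("classifier", SVC(kernel="rbf")),   # step 4: classifier
])
pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Wrapping the stages in one pipeline means the scaler and PCA are fitted on the training split only, which avoids leaking test statistics into preprocessing.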

Modern deep pipeline

  • Raw input → end-to-end DNN → label.
  • Feature extraction = learned (no HOG/SIFT).
  • Backbone (ResNet, ViT, CLIP) → head (linear / MLP).

Applications

  1. Computer vision (face, OCR, object detection).
  2. Speech recognition (ASR systems such as Whisper).
  3. Biometrics (fingerprint, iris).
  4. Medical imaging (radiology AI).
  5. Anomaly detection (fraud, network intrusion).

💻 Patterns

Classic: Bayesian classifier

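from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
# Hold out a test set so the score is not measured on the training data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
clf = GaussianNB().fit(X_train, y_train)
print(clf.score(X_test, y_test))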

Classic: SVM with RBF

from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

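# Scale features first: RBF kernels are sensitive to feature magnitudes.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)  # X_train, y_train: your training features and labels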

Modern: CNN classification

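import torch.nn as nn
import torchvision.models as tvm

num_classes = 10  # placeholder: set to the number of classes in your dataset
model = tvm.resnet50(weights=tvm.ResNet50_Weights.IMAGENET1K_V2)
# Transfer learning: swap the 1000-class ImageNet head for a new linear layer.
model.fc = nn.Linear(model.fc.in_features, num_classes)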

Modern: CLIP zero-shot

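import clip
import torch
from PIL import Image

model, preprocess = clip.load("ViT-B/32", device="cpu")
image = Image.open("example.jpg")  # placeholder path: any RGB image
classes = ["cat", "dog", "bird"]
text = clip.tokenize([f"a photo of a {c}" for c in classes])
with torch.no_grad():
    img_feat = model.encode_image(preprocess(image).unsqueeze(0))
    txt_feat = model.encode_text(text)
    # Normalize so the dot product is cosine similarity, then apply CLIP's
    # usual 100x logit scale before the softmax.
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)
print(classes[probs.argmax().item()])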

HOG + SVM (classic CV pipeline)

from skimage.feature import hog
from sklearn.svm import LinearSVC

# images: equally sized 2-D grayscale arrays; labels: one class label per image.
features = [hog(img, pixels_per_cell=(8, 8)) for img in images]
clf = LinearSVC().fit(features, labels)

Anomaly detection

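from sklearn.ensemble import IsolationForest

# contamination: the expected fraction of outliers (here 1%).
detector = IsolationForest(contamination=0.01, random_state=0).fit(X_train)
anomalies = detector.predict(X_test) == -1  # predict() returns -1 for outliers, 1 for inliers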

Decision criteria

| Situation | Approach |
|---|---|
| Small tabular data | RF / GBDT / SVM |
| Image | Pretrained CNN/ViT (transfer learning) |
| Speech / audio | Fine-tuned Whisper / wav2vec |
| Few-shot / zero-shot | CLIP / SigLIP / VLM |
| Anomaly (no labels) | IsolationForest / autoencoder |
| Real-time embedded | Quantized CNN (MobileNet) |

Default: for image and speech, fine-tune a pretrained foundation model; for tabular data, use GBDT.
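
One hedged way to act on the tabular default while keeping a simple baseline in view is to cross-validate both side by side. The sketch assumes scikit-learn's bundled breast-cancer dataset as a stand-in for small tabular data and uses `GradientBoostingClassifier` as the GBDT; the names `candidates` and `results` are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    # Linear baseline: scaling helps the solver converge.
    "logistic baseline": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # Tree ensembles do not need feature scaling.
    "GBDT": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    results[name] = scores.mean()
    print(f"{name}: ROC-AUC {scores.mean():.3f} +/- {scores.std():.3f}")
```

If the baseline is already close to the GBDT, the simpler model may be the better choice for maintenance and interpretability.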

🔗 Graph

🤖 LLM usage

When: framing a classification problem, choosing between classical and DL approaches, deciding on transfer learning. When not: generative tasks (use diffusion models or LLMs instead).

Anti-patterns

  • Hand-crafted features in 2026: in most cases pretrained CNN features outperform HOG/SIFT.
  • No baseline: jumping to DL without first trying a logistic regression or RF baseline.
  • Class imbalance ignored: 99% accuracy on a 99/1 split is trivial, since always predicting the majority class achieves it. Use F1, ROC-AUC, or other balanced metrics.
  • Test contamination: leakage across the train/test split (especially with time series).
  • Calibration ignored: raw softmax outputs are often not well-calibrated probabilities. Use Platt scaling or temperature scaling.
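
The class-imbalance point can be checked with a few lines. The sketch assumes a hypothetical 99/1 label split and a degenerate "classifier" that always predicts the majority class; it compares plain accuracy with balanced accuracy and F1.

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

# Hypothetical 99/1 split: 990 negatives, 10 positives.
y_true = np.array([0] * 990 + [1] * 10)
# A "classifier" that always predicts the majority class.
y_pred = np.zeros_like(y_true)

acc = accuracy_score(y_true, y_pred)
bal_acc = balanced_accuracy_score(y_true, y_pred)  # mean of per-class recall
f1 = f1_score(y_true, y_pred, zero_division=0)     # F1 of the positive class

print(f"accuracy={acc:.2f} balanced_accuracy={bal_acc:.2f} f1={f1:.2f}")
```

Accuracy comes out at 0.99 even though no positive is ever found, while balanced accuracy drops to 0.50 and F1 to 0.00, which is why the balanced metrics are the ones to report on skewed data.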

🧪 Verification / duplicates

  • Verified (Bishop, "Pattern Recognition and Machine Learning", 2006; Duda, Hart & Stork, "Pattern Classification", 2nd ed., 2001).
  • Trust level: A.

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: classical-to-modern framing, pipeline patterns |