2nd/10_Wiki/Topics/AI_and_ML/Bounding-Box-Regression.md
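koriweb d8a80f6272 chore(wiki): normalize dangling links to canonical titles (768 files / 1,200 links)
Replaced [[wikilinks]] that differed only in spelling (notation variants) with the canonical title of the target document,
reconnecting 1,200 broken links. Only normalized title/filename matches were applied; alias matching was
excluded because of the risk of over-merging (ambiguity guard). Originals are backed up in _link_reconcile_backup/.
Tool: Datacollect/scripts/link_reconcile_apply.mjs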

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 12:24:15 +09:00


id: wiki-2026-0508-bounding-box-regression
title: Bounding Box Regression
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: bbox regression, object detection, IoU, anchor box, NMS, DETR, YOLO, mAP
duplicate_of: none
source_trust_level: A
confidence_score: 0.93
verification_status: applied
tags: object-detection, bbox, computer-vision, iou, nms, yolo, detr, anchor-free, mAP
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch / Ultralytics / Detectron2

Bounding Box Regression

📌 One-line insight

"The exact address of each object in an image." Predict four numbers (x, y, w, h) plus a class label. This is the core of object detection. Modern direction: anchor-free detectors and DETR (transformer-based), which needs no NMS.

📖 Core concepts

Representation

(x, y, w, h)

  • Center plus size.

(x1, y1, x2, y2)

  • Corner coordinates (top-left and bottom-right).

(cx, cy, w, h) normalized

  • Relative to the image size (0 to 1).

Polar / RotatedBox

  • Oriented boxes (aerial imagery, scene text).
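
The formats above differ only by simple arithmetic. A minimal sketch of the conversions, assuming (x, y) is the box center as stated above; the helper names are hypothetical and not taken from any library. Some datasets (COCO annotations, for example) store the top-left corner in (x, y, w, h) instead, so check the convention before converting.

```python
def xywh_to_xyxy(box):
    """Convert a center-size box (cx, cy, w, h) to corners (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def xyxy_to_xywh(box):
    """Convert corners (x1, y1, x2, y2) to a center-size box (cx, cy, w, h)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize_xywh(box, img_w, img_h):
    """Scale a pixel-space (cx, cy, w, h) box into image-relative [0, 1] units."""
    cx, cy, w, h = box
    return (cx / img_w, cy / img_h, w / img_w, h / img_h)


# Illustrative box: center (50, 40), size 20 x 10, in a 100 x 80 image
corners = xywh_to_xyxy((50.0, 40.0, 20.0, 10.0))            # (40.0, 35.0, 60.0, 45.0)
relative = normalize_xywh((50.0, 40.0, 20.0, 10.0), 100, 80)
print(corners, relative)
```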

IoU (Intersection over Union)

IoU = \frac{|A \cap B|}{|A \cup B|}
  • Ranges from 0 to 1.
  • Measures the overlap between the ground-truth (GT) box and the predicted box.
  • Basis of NMS.
  • Component of mAP.
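
As a quick worked example (the two boxes are made up for illustration), take A = (0, 0, 2, 2) and B = (1, 1, 3, 3) in corner format. Each box has area 4, and they share a 1 × 1 square:

```latex
|A \cap B| = (2 - 1)(2 - 1) = 1, \qquad
|A \cup B| = 4 + 4 - 1 = 7, \qquad
IoU = \frac{1}{7} \approx 0.143
```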

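Loss

L1 / L2

  • Simple.
  • Scale-dependent: the same relative error costs more on a large box than on a small one.

IoU loss

  • 1 - IoU.
  • Scale-invariant.

GIoU / DIoU / CIoU

  • Variants of IoU loss.
  • Still give a gradient when the boxes do not overlap, where plain IoU loss is flat.
  • CIoU loss = (1 - IoU) + normalized center-distance penalty + aspect-ratio penalty.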
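
To illustrate the non-overlap point, here is a minimal pure-Python sketch of GIoU for two corner-format boxes. The function name and the sample boxes are illustrative assumptions. For two disjoint predictions, plain IoU is 0 in both cases, while GIoU still distinguishes the nearer box from the farther one.

```python
def giou(a, b):
    """Generalized IoU of two (x1, y1, x2, y2) boxes with positive area."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Area of the smallest axis-aligned box that encloses both a and b
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    # IoU minus the fraction of the enclosing box not covered by the union
    return inter / union - (enclose - union) / enclose


gt = (0.0, 0.0, 2.0, 2.0)
near = (3.0, 0.0, 5.0, 2.0)   # disjoint from gt, 1 unit away
far = (8.0, 0.0, 10.0, 2.0)   # disjoint from gt, 6 units away

# GIoU loss is 1 - GIoU, so the farther box is penalized more
print(1 - giou(gt, near), 1 - giou(gt, far))
```

With these boxes GIoU is about -0.2 for the near box and about -0.6 for the far one, so the loss keeps pushing a non-overlapping prediction toward the target.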

Anchors

Anchor-based (Faster R-CNN, SSD, YOLOv3-v5)

  • Lay out N predefined boxes in advance.
  • Match each GT box to its closest anchor (typically by IoU).
  • Regress offsets from the matched anchor.
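
The offset regression above can be sketched with the Faster R-CNN style parameterization: center shifts are scaled by the anchor size and width/height are regressed in log space. The function names and sample numbers are hypothetical; other detectors (the YOLO family, for instance) use different encodings.

```python
import math


def encode(gt, anchor):
    """Regression targets (tx, ty, tw, th) for a GT box relative to an anchor.

    Both boxes are in (cx, cy, w, h) format.
    """
    gx, gy, gw, gh = gt
    ax, ay, aw, ah = anchor
    return ((gx - ax) / aw, (gy - ay) / ah, math.log(gw / aw), math.log(gh / ah))


def decode(offsets, anchor):
    """Invert encode(): turn predicted offsets back into a (cx, cy, w, h) box."""
    tx, ty, tw, th = offsets
    ax, ay, aw, ah = anchor
    return (ax + tx * aw, ay + ty * ah, aw * math.exp(tw), ah * math.exp(th))


anchor = (50.0, 50.0, 20.0, 40.0)
gt = (55.0, 48.0, 30.0, 36.0)
targets = encode(gt, anchor)        # what the network is trained to output
recovered = decode(targets, anchor)  # what inference does with the output
print(targets, recovered)
```

Scaling by the anchor size keeps the targets in a similar numeric range for small and large objects, which is one reason this encoding is common.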

Anchor-free (FCOS, YOLOX, CenterNet)

  • Regress the box directly from a point (a feature-map location or the object center).
  • Fewer hyperparameters.
  • The modern trend.
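
One way to regress a box directly from a point is the FCOS-style target: the distances from the point to the four sides of the box. A minimal sketch with hypothetical helper names; real implementations also handle feature-map strides and the assignment of points to objects.

```python
def encode_ltrb(point, box):
    """Distances (left, top, right, bottom) from a point to the sides of an xyxy box."""
    px, py = point
    x1, y1, x2, y2 = box
    return (px - x1, py - y1, x2 - px, y2 - py)


def decode_ltrb(point, ltrb):
    """Rebuild the (x1, y1, x2, y2) box from a point and its four side distances."""
    px, py = point
    left, top, right, bottom = ltrb
    return (px - left, py - top, px + right, py + bottom)


point = (30.0, 20.0)                 # a location that falls inside the box
box = (10.0, 5.0, 50.0, 45.0)
distances = encode_ltrb(point, box)  # (20.0, 15.0, 20.0, 25.0)
print(distances, decode_ltrb(point, distances))
```

No anchor shapes, scales, or aspect ratios appear anywhere, which is where the reduction in hyperparameters comes from.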

NMS (Non-Maximum Suppression)

  • Keep the highest-scoring box.
  • Drop every remaining box whose IoU with it exceeds the threshold, then repeat.
  • Modern variants: Soft-NMS, Matrix NMS.
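
For contrast with hard NMS, here is a minimal pure-Python sketch of Gaussian Soft-NMS: overlapping boxes have their scores decayed instead of being dropped outright. The function names, the sigma and score_threshold defaults, and the sample boxes are assumptions for illustration, not a reference implementation.

```python
import math


def iou_xyxy(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS. Returns (index, decayed_score) pairs in selection order."""
    scores = list(scores)  # copy so the caller's scores are not modified
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        if scores[best] < score_threshold:
            break  # everything left has decayed below the cutoff
        kept.append((best, scores[best]))
        remaining.remove(best)
        for i in remaining:
            overlap = iou_xyxy(boxes[best], boxes[i])
            # Higher overlap with the selected box means stronger score decay
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
result = soft_nms(boxes, [0.9, 0.8, 0.7])
print(result)
```

In this example the second box overlaps the first heavily, so its score decays and it is selected last rather than removed, while the distant third box keeps its score.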

Modern paradigms

YOLO (v8, v10, v11)

  • Single-stage.
  • Fast.
  • Anchor-free with a decoupled head.

DETR / Deformable DETR

  • Transformer encoder-decoder.
  • Set prediction (no NMS).
  • Hungarian matching loss.

DINO / Grounding DINO

  • DETR variants; Grounding DINO adds open-vocabulary detection.

SAM (Segment Anything)

  • Prompt-based segmentation.
  • Box prompt → mask.

Applications

  1. Autonomous driving: vehicles / pedestrians.
  2. Surveillance: people / faces.
  3. Retail: product detection.
  4. Medical: lesions / cells.
  5. Aerial: oriented bounding boxes.
  6. Robotics: grasping.

Metric

  • mAP (mean Average Precision): AP averaged over classes, computed at a given IoU threshold.
  • mAP@50: IoU threshold 0.5 only.
  • mAP@50:95: average over IoU thresholds 0.5 to 0.95 in steps of 0.05 (COCO).
  • AR (Average Recall).
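
To make the metric concrete, here is a minimal sketch of AP for one class at one IoU threshold, using all-point interpolation of the precision-recall curve. It assumes each detection has already been marked as a true or false positive at the chosen threshold; the function name and input format are hypothetical, and COCO's official evaluation differs in details (101 recall sample points, crowd handling, area ranges).

```python
def average_precision(detections, num_gt):
    """AP for one class at one IoU threshold.

    detections: list of (score, is_tp) pairs, where is_tp says whether the
    detection matched a previously unmatched GT box at the chosen threshold.
    num_gt: number of ground-truth boxes for this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Precision envelope: make precision non-increasing as recall grows
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the envelope, accumulated wherever recall increases
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Three detections for a class with two GT boxes: TP, FP, TP
ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_gt=2)
print(round(ap, 4))
```

mAP at that threshold is the mean of this value over classes, and mAP@50:95 additionally averages over the ten IoU thresholds.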

💻 Patterns

IoU calculation

def iou(box1, box2):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    # Corners of the intersection rectangle
    x1 = max(box1[0], box2[0])
    y1 = max(box1[1], box2[1])
    x2 = min(box1[2], box2[2])
    y2 = min(box1[3], box2[3])

    # Clamp at 0 so disjoint boxes get zero intersection
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0

NMS

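import torch

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS. boxes: (N, 4) tensor in (x1, y1, x2, y2); scores: (N,) tensor.

    Uses iou() from the previous pattern. For real workloads,
    torchvision.ops.nms(boxes, scores, iou_threshold) is a vectorized alternative.
    """
    indices = scores.argsort(descending=True)
    kept = []
    while len(indices) > 0:
        idx = indices[0]
        kept.append(idx.item())  # keep the highest-scoring remaining box
        if len(indices) == 1:
            break
        rest = indices[1:]
        # IoU of the kept box against every remaining candidate
        ious = torch.tensor(
            [float(iou(boxes[idx], boxes[i])) for i in rest], device=rest.device
        )
        indices = rest[ious <= iou_threshold]  # drop heavy overlaps
    return kept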

CIoU loss

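import math

import torch

def ciou_loss(pred, gt, eps=1e-7):
    """Per-box CIoU loss for paired boxes of shape (N, 4) in (x1, y1, x2, y2) format."""
    # Widths and heights
    pw, ph = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    gw, gh = gt[:, 2] - gt[:, 0], gt[:, 3] - gt[:, 1]

    # Element-wise IoU (computed inline; the original called an undefined iou_tensor)
    iw = (torch.min(pred[:, 2], gt[:, 2]) - torch.max(pred[:, 0], gt[:, 0])).clamp(min=0)
    ih = (torch.min(pred[:, 3], gt[:, 3]) - torch.max(pred[:, 1], gt[:, 1])).clamp(min=0)
    inter = iw * ih
    iou_val = inter / (pw * ph + gw * gh - inter + eps)

    # Squared distance between box centers
    px, py = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    gx, gy = (gt[:, 0] + gt[:, 2]) / 2, (gt[:, 1] + gt[:, 3]) / 2
    rho2 = (px - gx)**2 + (py - gy)**2

    # Squared diagonal of the smallest enclosing box
    cx1 = torch.min(pred[:, 0], gt[:, 0])
    cy1 = torch.min(pred[:, 1], gt[:, 1])
    cx2 = torch.max(pred[:, 2], gt[:, 2])
    cy2 = torch.max(pred[:, 3], gt[:, 3])
    c2 = (cx2 - cx1)**2 + (cy2 - cy1)**2 + eps

    # Aspect-ratio consistency term
    v = (4 / math.pi**2) * (torch.atan(gw / (gh + eps)) - torch.atan(pw / (ph + eps)))**2
    with torch.no_grad():  # alpha is a weighting factor and is usually not differentiated
        alpha = v / (1 - iou_val + v + eps)

    return 1 - iou_val + rho2 / c2 + alpha * v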

YOLO inference (Ultralytics)

from ultralytics import YOLO

model = YOLO('yolov8n.pt')
results = model('image.jpg', conf=0.25, iou=0.45)

for r in results:
    for box in r.boxes:
        xyxy = box.xyxy[0]  # (x1, y1, x2, y2)
        conf = box.conf[0]
        cls = int(box.cls[0])
        print(f'{model.names[cls]}: {conf:.2f} at {xyxy}')

DETR (Hungarian matching)

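import torch
from scipy.optimize import linear_sum_assignment
from torchvision.ops import box_convert, generalized_box_iou

@torch.no_grad()
def hungarian_matcher(pred_logits, pred_boxes, gt_labels, gt_boxes):
    """Optimal one-to-one matching between N predictions and M GT boxes.

    Assumption: boxes are normalized (cx, cy, w, h), as in DETR.
    """
    # Cost matrix (N, M): classification + box L1 + GIoU
    cost_class = -pred_logits.softmax(-1)[:, gt_labels]
    cost_bbox = torch.cdist(pred_boxes, gt_boxes, p=1)
    # generalized_box_iou expects (x1, y1, x2, y2) boxes
    cost_giou = -generalized_box_iou(
        box_convert(pred_boxes, "cxcywh", "xyxy"),
        box_convert(gt_boxes, "cxcywh", "xyxy"),
    )

    cost = 1.0 * cost_class + 5.0 * cost_bbox + 2.0 * cost_giou

    # Returns (pred_indices, gt_indices) as NumPy arrays
    pred_idx, gt_idx = linear_sum_assignment(cost.cpu().numpy())
    return pred_idx, gt_idx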

Custom training (Ultralytics)

from ultralytics import YOLO

model = YOLO('yolov8n.yaml')
model.train(
    data='coco.yaml',
    epochs=100,
    batch=16,
    imgsz=640,
    optimizer='AdamW',
    lr0=0.001,
    cos_lr=True,
)

# Export the trained model to ONNX
model.export(format='onnx')

mAP calculation

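from torchmetrics.detection import MeanAveragePrecision

# pred_boxes_list: one dict per image with 'boxes' (N, 4), 'scores' (N,), 'labels' (N,) tensors
# gt_boxes_list: one dict per image with 'boxes' (M, 4) and 'labels' (M,) tensors
metric = MeanAveragePrecision(box_format='xyxy', iou_type='bbox')
metric.update(preds=pred_boxes_list, target=gt_boxes_list)
result = metric.compute()
print(f"mAP@50:95: {result['map']:.4f}")
print(f"mAP@50: {result['map_50']:.4f}")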

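🤔 Decision criteria

| Situation | Model |
| --- | --- |
| Real-time | YOLOv8/10 |
| Accuracy | DINO / Co-DETR |
| Edge | YOLOv8n / NanoDet |
| Open-vocab | Grounding DINO |
| Aerial / oriented | RotatedBox + RoITrans |
| Crowd | DETR (no NMS) |
| Few-shot | Meta-learning + finetune |

Default: start from YOLOv8 as the baseline. The DETR family is the state of the art for accuracy. For segmentation, use SAM.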

🔗 Graph

🤖 LLM usage

When to use: detection tasks, model selection, loss design, deployment optimization. When not to use: classification (no localization needed), segmentation (use SAM / Mask R-CNN).

Anti-patterns

  • L2 loss only: scale-dependent.
  • Default NMS threshold: needs tuning for the specific task.
  • Default anchors: do not reflect the dataset's box statistics.
  • Reporting only mAP@50: hides performance under the stricter mAP@50:95.
  • Ignoring class imbalance: minority classes fail.
  • Augmenting the test set: leakage.

🧪 Verification / duplicates

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: IoU + NMS + DETR + PyTorch / Ultralytics code |