---
id: wiki-2026-0508-bounding-box-regression
title: Bounding Box Regression
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [bbox regression, object detection, IoU, anchor box, NMS, DETR, YOLO, mAP]
duplicate_of: none
source_trust_level: A
confidence_score: 0.93
verification_status: applied
tags: [object-detection, bbox, computer-vision, iou, nms, yolo, detr, anchor-free, mAP]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch / Ultralytics / Detectron2
---

# Bounding Box Regression

## 📌 One-line insight

> **"The exact address of every object in an image."** Predict four numbers (x, y, w, h) plus a class label. This is the core of object detection. Modern direction: anchor-free heads, and DETR (transformer) pipelines that need no NMS.

## 📖 Core

### Representation

#### (x, y, w, h)
- Center + size (YOLO convention; COCO annotations store the top-left corner + size instead).

#### (x1, y1, x2, y2)
- Corner coordinates.

#### (cx, cy, w, h) normalized
- Relative to the image size (0-1).

#### Polar / RotatedBox
- Oriented boxes (aerial imagery, text).

### IoU (Intersection over Union)

$$IoU = \frac{|A \cap B|}{|A \cup B|}$$

- Ranges from 0 to 1.
- Measures the overlap between the ground-truth (GT) box and the predicted box.
- Basis of NMS.
- Component of mAP.

### Loss

#### L1 / L2
- Simple.
- Scale-dependent.

#### IoU loss
- Defined as (1 - IoU).
- Scale-invariant.

#### GIoU / DIoU / CIoU
- Variants of IoU.
- Still provide a gradient when the boxes do not overlap, where plain IoU loss is flat.
- CIoU = IoU + center-distance penalty + aspect-ratio penalty.

### Anchors

#### Anchor-based (Faster R-CNN, SSD, YOLOv3-v5)
- Lay out N predefined boxes in advance.
- Match each GT box to its closest anchor.
- Regress the offsets from that anchor.

#### Anchor-free (FCOS, YOLOX, CenterNet)
- Regress the box directly from a point.
- Fewer hyperparameters.
- The modern trend.

### NMS (Non-Maximum Suppression)
- Keep the highest-scoring box.
- Drop every box whose IoU with it exceeds the threshold, then repeat on the remaining boxes.
- Modern variants: Soft-NMS, Matrix NMS.

### Modern paradigms

#### YOLO (v8, v10, v11)
- Single-stage.
- Fast.
- Anchor-free + decoupled head.

#### DETR / Deformable DETR
- Transformer encoder-decoder.
- Set prediction (no NMS).
- Hungarian matching loss.

#### DINO / Grounding DINO
- DETR variants; Grounding DINO adds open-vocabulary detection.
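
DETR-family detectors usually emit boxes in the normalized (cx, cy, w, h) form listed under Representation, while IoU and NMS code generally expects pixel corner coordinates. The conversion can be sketched as below. The function name `cxcywh_norm_to_xyxy_px` and the clamping step are illustrative assumptions, not part of any library.

```python
def cxcywh_norm_to_xyxy_px(box, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2).

    Hypothetical helper for illustration; inputs are assumed to lie in [0, 1].
    """
    cx, cy, w, h = box
    # Move from center/size to corners, then scale to pixels
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    # Clamp to the image, since predictions near the border can spill over
    x1, x2 = max(0.0, x1), min(float(img_w), x2)
    y1, y2 = max(0.0, y1), min(float(img_h), y2)
    return (x1, y1, x2, y2)


# A box centered in a 640x480 image, a quarter as wide and half as tall
box = cxcywh_norm_to_xyxy_px((0.5, 0.5, 0.25, 0.5), img_w=640, img_h=480)
print(box)
```

Keeping this conversion in one place avoids the common bug of mixing formats between the loss, the matcher, and the metric.
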
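
#### SAM (Segment Anything)
- Prompt-based segmentation.
- Bbox prompt → mask.

### Applications

1. **Autonomous driving**: vehicles / pedestrians.
2. **Surveillance**: people / faces.
3. **Retail**: product detection.
4. **Medical**: lesions / cells.
5. **Aerial**: oriented bboxes.
6. **Robotics**: grasping.

### Metric
- **mAP** (mean Average Precision): computed per IoU threshold.
- **mAP@50**: IoU threshold 0.5 only.
- **mAP@50:95**: average over IoU thresholds 0.5 to 0.95 in steps of 0.05 (COCO).
- **AR** (Average Recall).

## 💻 Patterns

### IoU calculation

```python
def iou(box1, box2):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    # Corners of the intersection rectangle
    x1 = max(box1[0], box2[0])
    y1 = max(box1[1], box2[1])
    x2 = min(box1[2], box2[2])
    y2 = min(box1[3], box2[3])

    # Clamp at 0 so that disjoint boxes give zero intersection
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0
```

### NMS

```python
import torch

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS. boxes: (N, 4) xyxy tensor, scores: (N,) tensor."""
    # For production use, torchvision.ops.nms is a vectorized equivalent.
    indices = scores.argsort(descending=True)
    kept = []
    while len(indices) > 0:
        idx = indices[0]
        kept.append(idx.item())
        if len(indices) == 1:
            break
        rest = indices[1:]
        # Reuses iou() from the block above; .tolist() keeps it on plain floats
        ious = torch.tensor(
            [iou(boxes[idx].tolist(), boxes[i].tolist()) for i in rest]
        )
        # Keep only boxes that do not overlap the selected box too much
        indices = rest[ious <= iou_threshold]
    return kept
```

### CIoU loss

```python
import math
import torch

def iou_tensor(pred, gt, eps=1e-7):
    """Element-wise IoU for paired (N, 4) xyxy boxes."""
    iw = (torch.min(pred[:, 2], gt[:, 2]) - torch.max(pred[:, 0], gt[:, 0])).clamp(min=0)
    ih = (torch.min(pred[:, 3], gt[:, 3]) - torch.max(pred[:, 1], gt[:, 1])).clamp(min=0)
    inter = iw * ih
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    return inter / (area_p + area_g - inter + eps)

def ciou_loss(pred, gt, eps=1e-7):
    """Per-box CIoU loss for paired (N, 4) xyxy boxes."""
    iou_val = iou_tensor(pred, gt, eps)

    # Squared distance between box centers
    px, py = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    gx, gy = (gt[:, 0] + gt[:, 2]) / 2, (gt[:, 1] + gt[:, 3]) / 2
    rho2 = (px - gx)**2 + (py - gy)**2

    # Squared diagonal of the smallest enclosing box
    cx1 = torch.min(pred[:, 0], gt[:, 0])
    cy1 = torch.min(pred[:, 1], gt[:, 1])
    cx2 = torch.max(pred[:, 2], gt[:, 2])
    cy2 = torch.max(pred[:, 3], gt[:, 3])
    c2 = (cx2 - cx1)**2 + (cy2 - cy1)**2 + eps

    # Aspect-ratio consistency term
    pw, ph = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    gw, gh = gt[:, 2] - gt[:, 0], gt[:, 3] - gt[:, 1]
    v = (4 / math.pi**2) * (torch.atan(gw / (gh + eps)) - torch.atan(pw / (ph + eps)))**2
    # alpha is a trade-off weight and is treated as a constant (no gradient)
    with torch.no_grad():
        alpha = v / (1 - iou_val + v + eps)

    return 1 - iou_val + rho2 / c2 + alpha * v
```

### YOLO inference (Ultralytics)

```python
from ultralytics import YOLO

model = YOLO('yolov8n.pt')
# conf: minimum confidence to keep a box; iou: IoU threshold used by NMS
results = model('image.jpg', conf=0.25, iou=0.45)

for r in results:
    for box in r.boxes:
        xyxy = box.xyxy[0]  # (x1, y1, x2, y2)
        conf = box.conf[0]
        cls = int(box.cls[0])
        print(f'{model.names[cls]}: {conf:.2f} at {xyxy}')
```

### DETR (Hungarian matching)

```python
import torch
from scipy.optimize import linear_sum_assignment
from torchvision.ops import box_convert, generalized_box_iou

@torch.no_grad()
def hungarian_matcher(pred_logits, pred_boxes, gt_labels, gt_boxes):
    """Optimal one-to-one matching between N predictions and M GT boxes.

    Shapes for one image: pred_logits (N, C), pred_boxes (N, 4),
    gt_labels (M,), gt_boxes (M, 4). Boxes are assumed to be (cx, cy, w, h).
    """
    # Cost matrix (N, M): classification + bbox L1 + GIoU
    cost_class = -pred_logits.softmax(-1)[:, gt_labels]
    cost_bbox = torch.cdist(pred_boxes, gt_boxes, p=1)
    # generalized_box_iou expects xyxy boxes, so convert first
    cost_giou = -generalized_box_iou(
        box_convert(pred_boxes, 'cxcywh', 'xyxy'),
        box_convert(gt_boxes, 'cxcywh', 'xyxy'),
    )

    cost = 1.0 * cost_class + 5.0 * cost_bbox + 2.0 * cost_giou
    # Returns (pred_indices, gt_indices) as NumPy arrays
    pred_idx, gt_idx = linear_sum_assignment(cost.cpu().numpy())
    return pred_idx, gt_idx
```

### Custom training (Detectron2 / Ultralytics)

```python
from ultralytics import YOLO

# A .yaml config builds the model with random weights (training from scratch);
# pass 'yolov8n.pt' instead to fine-tune from pretrained weights.
model = YOLO('yolov8n.yaml')
model.train(
    data='coco.yaml',
    epochs=100,
    batch=16,
    imgsz=640,
    optimizer='AdamW',
    lr0=0.001,
    cos_lr=True,
)

# Export
model.export(format='onnx')
```

### mAP calculation

```python
from torchmetrics.detection import MeanAveragePrecision

metric = MeanAveragePrecision(box_format='xyxy', iou_type='bbox')
# preds: one dict per image with 'boxes' (N, 4), 'scores' (N,), 'labels' (N,)
# target: one dict per image with 'boxes' (M, 4), 'labels' (M,)
metric.update(preds=pred_boxes_list, target=gt_boxes_list)
result = metric.compute()
print(f"mAP@50:95: {result['map']:.4f}")
print(f"mAP@50: {result['map_50']:.4f}")
```

## 🤔 Decision criteria

| Situation | Model |
|---|---|
| Real-time | YOLOv8/10 |
| Accuracy | DINO / Co-DETR |
| Edge | YOLOv8n / NanoDet |
| Open-vocab | Grounding DINO |
| Aerial / oriented | RotatedBox + RoITrans |
| Crowd | DETR (no NMS) |
| Few-shot | Meta-learning + finetune |

**Default**: start with YOLOv8 as the baseline. SOTA accuracy sits with the DETR family. Use SAM for segmentation.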
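
The "Crowd" row reflects a weakness of hard NMS: two real objects that overlap heavily can suppress each other. Soft-NMS, listed among the NMS variants, decays the score of an overlapping box instead of deleting it. A minimal pure-Python sketch of the Gaussian variant follows; the names `iou_xyxy` and `soft_nms` and the default `sigma` and `score_threshold` values are assumptions chosen for illustration.

```python
import math


def iou_xyxy(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS. Returns (index, decayed_score) pairs in selection order."""
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the box with the highest current score
        best_pos = max(range(len(remaining)), key=lambda p: remaining[p][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        decayed = []
        for i, s in remaining:
            overlap = iou_xyxy(boxes[best_idx], boxes[i])
            # Decay the score smoothly with overlap instead of dropping the box
            s = s * math.exp(-(overlap ** 2) / sigma)
            if s >= score_threshold:
                decayed.append((i, s))
        remaining = decayed
    return kept


# Two heavily overlapping boxes plus one far-away box
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
print(result)
```

In this example the second box survives with a reduced score and is ranked after the distant box, whereas hard NMS at a 0.5 threshold would discard it.
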

## 🔗 Graph
- Parent: [[Object-Detection]] · [[Computer Vision|Computer-Vision]]
- Variants: [[YOLO]] · [[Faster-R-CNN]] · [[DETR]] · [[SAM]]
- Applications: [[Autonomous Vehicles]]
- Loss: [[Focal-Loss]]
- Adjacent: [[Anchor-Box]] · [[NMS]] · [[mAP]]

## 🤖 LLM usage

**When**: detection tasks, model selection, loss design, deployment optimization.

**When not**: classification (no localization needed); segmentation (use SAM / Mask R-CNN).

## ❌ Anti-patterns
- **L2 loss only**: scale-dependent.
- **Default NMS threshold**: needs task-specific tuning.
- **Default anchors**: do not reflect the dataset's box statistics.
- **mAP@50 only**: hides performance under the stricter mAP@50:95.
- **Ignoring class imbalance**: minority classes fail.
- **Augmenting the test set**: leakage.

## 🧪 Verification / duplicates
- Verified (Faster R-CNN, YOLO papers, DETR, SAM).
- Trust level A.
- Related: [[YOLO]] · [[DETR]] · [[Object-Detection]] · [[SAM]] · [[Autonomous Vehicles]].

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: IoU + NMS + DETR + PyTorch / Ultralytics code |