id: wiki-2026-0508-medical-imaging-data-augmentation
title: Medical Imaging Data Augmentation
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Medical Augmentation, MONAI Augmentation, 의료영상 증강
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: medical-imaging, data-augmentation, monai, deep-learning, segmentation
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: monai-pytorch
Medical Imaging Data Augmentation
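One-line summary
"Patient data is scarce, and anatomy must not be distorted." Compared with natural images, medical image augmentation has to apply geometric, intensity, and synthesis transforms carefully under three constraints: (1) very little data is available, (2) labels must be accurate at the pixel level, and (3) unrealistic deformations ruin the diagnosis.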
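Key points
Domain characteristics
3D volumes (CT/MRI), DICOM/NIfTI formats, and widely varying voxel spacing.
Labels are segmentation masks, bounding boxes, or patient-level diagnoses; any affine transform must be applied to image and label in sync.
Intensity distributions differ per modality, e.g. HU scale (CT) and bias field (MRI).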
Transform categories
Spatial / geometric: flip, rotate (small angles), translate, scale, elastic deformation, B-spline.
Intensity: brightness/contrast, gamma, Gaussian noise, Rician noise (MRI), bias field, MR motion artifact.
Spacing / resolution: random resampling, low-resolution simulation.
Topology-preserving: medical variants of mixup/cutmix, kept patch-aware so that the lesion mask is not broken.
Synthesis: lesion synthesis with GANs/diffusion models, healthy↔lesion translation.
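As an illustration of the intensity category, here is a minimal NumPy sketch of Rician noise, assuming the usual model in which an MR magnitude image is the modulus of a complex signal with independent Gaussian noise on the real and imaginary channels. The function name `add_rician_noise` and the toy volume are illustrative and not taken from any library.

```python
import numpy as np

def add_rician_noise(image, std, rng):
    """Return the magnitude of (image + n_real) + i * n_imag with Gaussian n_real, n_imag."""
    real = image + rng.normal(0.0, std, size=image.shape)
    imag = rng.normal(0.0, std, size=image.shape)
    return np.sqrt(real ** 2 + imag ** 2)

rng = np.random.default_rng(0)
mri = rng.uniform(0.0, 1.0, size=(8, 32, 32))     # toy volume scaled to [0, 1]
noisy = add_rician_noise(mri, std=0.05, rng=rng)

# In signal-free background the result is Rayleigh distributed, so its mean is above zero.
background = add_rician_noise(np.zeros(1000), std=0.05, rng=rng)
```

Unlike additive Gaussian noise, the output is never negative and dark background is biased upward, which is why the two noise types are listed separately.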
Libraries
MONAI: the PyTorch-based standard for medical imaging; Compose and dictionary transforms.
TorchIO: 3D-friendly, with a rich set of MRI artifacts.
Albumentations: 2D slices and X-rays.
NVIDIA DALI: GPU augmentation pipelines.
💻 Patterns
1. MONAI dict transform pipeline
from monai.transforms import (
    Compose, LoadImaged, EnsureChannelFirstd, Spacingd, Orientationd,
    ScaleIntensityRanged, RandCropByPosNegLabeld, RandAffined,
    RandGaussianNoised, RandBiasFieldd, ToTensord,
)

train_t = Compose([
    LoadImaged(keys=["img", "seg"]),
    EnsureChannelFirstd(keys=["img", "seg"]),
    # Normalize orientation and voxel spacing before any random transform.
    Orientationd(keys=["img", "seg"], axcodes="RAS"),
    Spacingd(keys=["img", "seg"], pixdim=(1, 1, 1), mode=("bilinear", "nearest")),
    # CT-style HU window [-200, 300] mapped to [0, 1]; applied to the image only.
    ScaleIntensityRanged(keys="img", a_min=-200, a_max=300, b_min=0, b_max=1, clip=True),
    # Foreground/background-balanced patches; returns 4 patches per volume.
    RandCropByPosNegLabeld(keys=["img", "seg"], label_key="seg",
                           spatial_size=(96, 96, 96), pos=1, neg=1, num_samples=4),
    # Small rotation (radians) and scaling; nearest-neighbor keeps the labels discrete.
    RandAffined(keys=["img", "seg"], rotate_range=0.1, scale_range=0.1,
                mode=("bilinear", "nearest"), prob=0.5),
    RandGaussianNoised(keys="img", std=0.01, prob=0.2),
    # Bias field is an MRI artifact; it is usually dropped for CT input.
    RandBiasFieldd(keys="img", coeff_range=(0.0, 0.1), prob=0.2),
    ToTensord(keys=["img", "seg"]),
])
2. Elastic deformation
Deforms anatomy in a natural-looking way. If sigma is too small the deformation looks unrealistic; if it is too large the model can underfit.
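The role of sigma can be seen in a minimal 2D sketch written with NumPy only. It assumes the classic formulation in which a random displacement field is Gaussian-smoothed with `sigma` and scaled by `alpha`; the helper names `smooth` and `elastic_deform_2d` are illustrative, and a real pipeline would more likely use a library transform (MONAI exposes `Rand3DElasticd` with `sigma_range` and `magnitude_range`). The image is sampled bilinearly and the mask with nearest neighbor so that label values stay discrete.

```python
import numpy as np

def smooth(field, sigma):
    """Separable Gaussian blur with edge padding (NumPy only)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    for axis in range(field.ndim):
        pad = [(radius, radius) if a == axis else (0, 0) for a in range(field.ndim)]
        padded = np.pad(field, pad, mode="edge")
        field = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded)
    return field

def elastic_deform_2d(image, mask, alpha, sigma, rng):
    """Warp image (bilinear) and mask (nearest) with one shared displacement field."""
    h, w = image.shape
    dy = smooth(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dx = smooth(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sy = np.clip(yy + dy, 0, h - 1)   # sampling coordinates, clamped to the image
    sx = np.clip(xx + dx, 0, w - 1)

    y0, x0 = np.floor(sy).astype(int), np.floor(sx).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = sy - y0, sx - x0
    warped = (image[y0, x0] * (1 - wy) * (1 - wx) + image[y0, x1] * (1 - wy) * wx
              + image[y1, x0] * wy * (1 - wx) + image[y1, x1] * wy * wx)
    warped_mask = mask[np.rint(sy).astype(int), np.rint(sx).astype(int)]
    return warped, warped_mask

rng = np.random.default_rng(0)
image = rng.random((64, 64))
mask = np.zeros((64, 64), dtype=np.int64)
mask[20:40, 24:44] = 2                      # toy lesion with label value 2
warped, warped_mask = elastic_deform_2d(image, mask, alpha=30.0, sigma=4.0, rng=rng)
same_img, same_mask = elastic_deform_2d(image, mask, alpha=0.0, sigma=4.0, rng=rng)
```

Sharing one displacement field between image and mask is what keeps the label aligned; the `alpha` and `sigma` values here are placeholders, not recommended settings.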
3. TorchIO MRI artifact
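TorchIO provides ready-made MRI artifact transforms such as `RandomMotion`, `RandomGhosting`, `RandomSpike`, and `RandomBiasField`. To show what one of these artifacts does without depending on the library, the sketch below simulates a bias field in plain NumPy as the exponential of a random low-order polynomial. This is a stand-in under that assumption, not TorchIO's implementation, and `random_bias_field` with its parameters is a hypothetical helper.

```python
import itertools
import numpy as np

def random_bias_field(shape, order=3, max_coeff=0.1, rng=None):
    """Smooth multiplicative field: exp of a random polynomial over [-1, 1]^d."""
    rng = np.random.default_rng() if rng is None else rng
    axes = [np.linspace(-1.0, 1.0, n) for n in shape]
    grids = np.meshgrid(*axes, indexing="ij")
    log_field = np.zeros(shape)
    # Sum random-weighted monomials whose total degree is between 1 and `order`.
    for powers in itertools.product(range(order + 1), repeat=len(shape)):
        if 0 < sum(powers) <= order:
            term = np.ones(shape)
            for grid, power in zip(grids, powers):
                term = term * grid ** power
            log_field += rng.uniform(-max_coeff, max_coeff) * term
    return np.exp(log_field)

rng = np.random.default_rng(0)
mri = rng.uniform(0.0, 1.0, size=(8, 16, 16))   # toy MRI volume
field = random_bias_field(mri.shape, rng=rng)
biased = mri * field                             # apply to the image only, never to the mask
```

Because the field is multiplicative and slowly varying, it changes intensities but leaves the geometry, and therefore the segmentation mask, untouched.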
4. CT HU window robust
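One way to make a model robust to the HU window is to jitter the window center and width at training time. The sketch below is a minimal NumPy version of that idea; the default center 50 and width 500 reproduce the [-200, 300] window used in the MONAI pipeline above, while the jitter ranges and the function name `random_hu_window` are illustrative assumptions.

```python
import numpy as np

def random_hu_window(volume_hu, center=50.0, width=500.0,
                     center_jitter=20.0, width_jitter=50.0, rng=None):
    """Clip a CT volume to a randomly jittered HU window and rescale it to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    c = center + rng.uniform(-center_jitter, center_jitter)
    w = width + rng.uniform(-width_jitter, width_jitter)
    lo, hi = c - w / 2.0, c + w / 2.0
    return np.clip((volume_hu - lo) / (hi - lo), 0.0, 1.0)

rng = np.random.default_rng(0)
samples_hu = np.array([-1000.0, 50.0, 1500.0])   # roughly air, soft tissue, dense bone
windowed = random_hu_window(samples_hu, rng=rng)
```

At inference time the same function can be called with both jitter values set to 0 so that the window is fixed.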
5. Lesion-aware MixUp (segmentation)
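One possible reading of "lesion-aware" mixing for segmentation is a CutMix-style paste in which the pasted box always contains the donor's whole lesion, and the mask is pasted together with the image instead of being blended. The sketch below follows that interpretation; the helper names are hypothetical, and it assumes both samples already share orientation, spacing, and shape.

```python
import numpy as np

def lesion_bbox(mask, margin):
    """Bounding box (tuple of slices) around all non-zero labels, or None if empty."""
    coords = np.argwhere(mask > 0)
    if coords.size == 0:
        return None
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))

def lesion_aware_cutmix(img_a, seg_a, img_b, seg_b, margin=2):
    """Paste the full lesion box of sample B into sample A, image and mask together."""
    box = lesion_bbox(seg_b, margin)
    img, seg = img_a.copy(), seg_a.copy()
    if box is not None:
        img[box] = img_b[box]   # hard paste, so intensities and labels stay consistent
        seg[box] = seg_b[box]
    return img, seg

img_a = np.zeros((16, 16, 16))
seg_a = np.zeros((16, 16, 16), dtype=np.int64)
img_b = np.ones((16, 16, 16))
seg_b = np.zeros((16, 16, 16), dtype=np.int64)
seg_b[4:8, 5:9, 6:10] = 1                      # toy lesion in the donor sample
mixed_img, mixed_seg = lesion_aware_cutmix(img_a, seg_a, img_b, seg_b)
```

Because the box is derived from the lesion mask, the lesion is never cut in half, which is the failure mode of a randomly placed CutMix box. Whether the pasted result is anatomically plausible still needs visual review.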
6. Diffusion synthetic lesion
7. Test-Time Augmentation (TTA)
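A minimal flip-based TTA sketch in NumPy: the model is run on the original volume and on one flipped copy per axis, each prediction is flipped back, and the results are averaged. `predict` is a hypothetical callable that returns a per-voxel map with the same shape as its input; `toy_predict` merely stands in for a trained network.

```python
import numpy as np

def flip_tta(predict, volume, axes=(0, 1, 2)):
    """Average predictions over the identity and single-axis flips, un-flipping each output."""
    outputs = [predict(volume)]
    for axis in axes:
        flipped_pred = predict(np.flip(volume, axis=axis))
        outputs.append(np.flip(flipped_pred, axis=axis))   # back to the original frame
    return np.mean(outputs, axis=0)

def toy_predict(volume):
    """Stand-in model: a voxel-wise threshold, which is exactly flip-equivariant."""
    return (volume > 0.5).astype(np.float64)

rng = np.random.default_rng(0)
volume = rng.random((8, 16, 16))
tta_prob = flip_tta(toy_predict, volume)
```

Only invertible transforms that keep anatomy plausible should be used here. If left/right laterality matters for the task, the corresponding axis can be removed from `axes`.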
Decision criteria
| Situation | Augmentation |
| --- | --- |
| Small segmentation dataset | strong elastic + intensity + crop by pos/neg label |
| Classification (X-ray) | mild affine + cutout, with lesion protection |
| MRI, multi-site | bias field + intensity histogram matching |
| CT, multi-protocol | HU window jitter + spacing resample |
| Very few labels | add synthetic data (GAN/diffusion) + self-supervised pretraining |
Default: MONAI Compose + spacing/orientation normalization → mild affine + intensity + (for 3D) crop-by-label.
🔗 Graph
🤖 LLM usage
When to use: pipeline boilerplate, recommending suitable transforms per modality, code review.
When not to use: judging clinical plausibility (requires verification by a radiologist).
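❌ Anti-patterns
90° rotation or large scaling → breaks the CT coordinate frame and the anatomy.
Bilinear interpolation on a segmentation mask: corrupts the labels.
Applying intensity normalization after augmentation → distribution mismatch.
Mixing synthetic data into the real evaluation set: leakage.
Treating every slice of a patient as an independent sample: patient-level leakage. The unit of splitting must be the patient.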
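The patient-level split from the last item can be sketched with the standard library alone. The record layout (a dict with a `patient_id` key) and the function name `split_by_patient` are assumptions made for illustration.

```python
import random
from collections import defaultdict

def split_by_patient(slice_records, val_fraction=0.2, seed=0):
    """Split slice records so that all slices of a patient land in the same subset."""
    by_patient = defaultdict(list)
    for record in slice_records:
        by_patient[record["patient_id"]].append(record)

    patients = sorted(by_patient)                 # deterministic order before shuffling
    random.Random(seed).shuffle(patients)
    n_val = max(1, round(len(patients) * val_fraction))
    val_ids = set(patients[:n_val])

    train = [r for p in patients if p not in val_ids for r in by_patient[p]]
    val = [r for p in patients if p in val_ids for r in by_patient[p]]
    return train, val

# Toy data: 5 patients with 10 slices each.
records = [{"patient_id": f"P{i // 10}", "slice_index": i % 10} for i in range(50)]
train_records, val_records = split_by_patient(records)
```

Augmentation is then applied only to `train_records`, and no patient contributes slices to both subsets.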
🧪 Verification / Duplicates
🕓 Changelog
| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup |