2nd/10_Wiki/Topics/AI_and_ML/Point-Cloud-Processing.md
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
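Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Link name normalization: fixed broken links, pointed links straight at redirect targets, mapped concepts (~2,400 items)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections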

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00


id: wiki-2026-0508-point-cloud-processing
title: Point Cloud Processing
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: point-cloud, 3d-deep-learning, lidar-processing
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: 3d, point-cloud, pointnet, lidar, sparse-conv
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: Python; framework: PyTorch / Open3D

Point Cloud Processing

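One-line summary

"Deep learning on unordered 3D point sets." PointNet (Qi et al., 2017) was the first widely used neural network that is permutation-invariant over raw point sets. The field then moved through PointNet++, sparse convolution (MinkowskiEngine, Spconv), and transformers (PTv3) toward modern 3D foundation models. It is a core technology for LiDAR-based autonomous driving, robotics, and AR/VR.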

Key points

Challenges

  • Unordered: N points admit N! orderings, so the network needs a symmetric (order-independent) function.
  • Sparse and irregular: a voxel grid laid over the scene is mostly empty.
  • Scale variance: a LiDAR sweep has on the order of 100K points, a CAD model on the order of 1K.
  • No canonical orientation: SE(3) equivariance is desirable.
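
The first challenge can be made concrete with a small dependency-free sketch. It assumes a toy per-point feature function (`pointwise_feature`, a hypothetical stand-in for a shared MLP) and shows that a channel-wise max over all points gives the same result for any ordering of the cloud.

```python
import random


def pointwise_feature(p):
    """Toy per-point feature map; a hypothetical stand-in for a shared MLP."""
    x, y, z = p
    return (x + y, y * z, x * x + z)


def max_pool(features):
    """Channel-wise max over all points: a symmetric function."""
    return tuple(max(channel) for channel in zip(*features))


def global_descriptor(points):
    """Apply the same function to every point, then pool symmetrically."""
    return max_pool([pointwise_feature(p) for p in points])


random.seed(0)
cloud = [(random.random(), random.random(), random.random()) for _ in range(100)]

shuffled = cloud[:]
random.shuffle(shuffled)  # same set of points, different order

same = global_descriptor(cloud) == global_descriptor(shuffled)
print(same)
```

Any symmetric reduction (sum, mean, max) would pass this check; max is the one PointNet uses.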

Lineage

  • PointNet (2017): shared per-point MLP + max-pool. Permutation invariant, but captures no local structure.
  • PointNet++ (2017): hierarchical sampling and grouping (FPS + ball query).
  • Voxel + sparse conv (2018-2020): MinkowskiEngine, Spconv. Computation runs only on non-empty voxels.
  • Graph and point-convolution methods: DGCNN (dynamic graph), KPConv (kernel point convolution).
  • Transformers: Point Transformer v1 (2021), v2 (2022), v3 (2024). PTv3 reported state-of-the-art results on ScanNet at publication.
  • 3D foundation models (2024-2025): Sonata, PointTransformerV3 (as a backbone), Uni3D.
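
To illustrate why sparse convolution libraries store only non-empty voxels, here is a minimal dependency-free sketch. It is not how MinkowskiEngine or Spconv are implemented; the helper names `voxelize` and `dense_grid_size` are hypothetical. It hashes points into integer voxel keys and compares the occupied count with the size of a dense grid over the same bounding box.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Map each point to an integer voxel key; only occupied voxels get an entry."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        voxels[key].append((x, y, z))
    return voxels


def dense_grid_size(voxels):
    """Number of cells a dense grid spanning the same bounding box would need."""
    size = 1
    for axis in range(3):
        lo = min(key[axis] for key in voxels)
        hi = max(key[axis] for key in voxels)
        size *= hi - lo + 1
    return size


# Four points: the first two share a voxel, the other two are far apart.
points = [(0.01, 0.02, 0.03), (0.04, 0.01, 0.02),
          (1.05, 1.05, 1.05), (-1.95, 0.55, 0.05)]
voxels = voxelize(points, voxel_size=0.1)

occupied = len(voxels)
dense = dense_grid_size(voxels)
print(occupied, dense)
```

Even in this tiny example the dense grid is three orders of magnitude larger than the occupied set, and the gap widens at LiDAR scene scale.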

Tasks

  1. Classification: ModelNet40 (CAD), ScanObjectNN.
  2. Part segmentation: ShapeNet-Part.
  3. Semantic segmentation: ScanNet, S3DIS, SemanticKITTI (LiDAR).
  4. Detection: KITTI, nuScenes, Waymo (3D bounding boxes).
  5. Registration: ICP, DGR, GeoTransformer.
  6. Reconstruction: NeRF, Gaussian Splatting.

Applications

  1. Autonomous driving (LiDAR perception).
  2. Robotic manipulation (depth → grasp).
  3. AR/VR (scene understanding).
  4. BIM / construction (as-built scans).

💻 Patterns

Open3D — load + visualize

import open3d as o3d

pcd = o3d.io.read_point_cloud("scan.ply")  # path is a placeholder
print(pcd)  # e.g. "PointCloud with 1234567 points."
pcd = pcd.voxel_down_sample(voxel_size=0.05)  # keep one point per voxel (units of the cloud)
pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)  # returns (cloud, inlier indices)
pcd.estimate_normals()  # uses the default neighbour search
o3d.visualization.draw_geometries([pcd])

PointNet (minimal)

import torch
import torch.nn as nn

class PointNetCls(nn.Module):
    def __init__(self, num_classes=40):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU())
        self.mlp2 = nn.Sequential(nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU())
        self.mlp3 = nn.Sequential(nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, num_classes))

    def forward(self, x):  # (B, 3, N)
        x = self.mlp3(self.mlp2(self.mlp1(x)))
        x = x.max(dim=2)[0]  # permutation-invariant max pool over the N points
        return self.head(x)

Farthest Point Sampling (FPS) — PointNet++

import torch

def fps(xyz, npoint):
    # xyz: (B, N, 3) float tensor; returns (B, npoint) indices of the sampled points
    B, N, _ = xyz.shape
    centroids = torch.zeros(B, npoint, dtype=torch.long, device=xyz.device)
    distance = torch.full((B, N), 1e10, device=xyz.device)
    farthest = torch.randint(0, N, (B,), device=xyz.device)
    batch_idx = torch.arange(B, device=xyz.device)
    for i in range(npoint):
        centroids[:, i] = farthest
        centroid = xyz[batch_idx, farthest, :].unsqueeze(1)
        dist = ((xyz - centroid) ** 2).sum(-1)
        distance = torch.minimum(distance, dist)
        farthest = distance.argmax(-1)
    return centroids
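
In PointNet++, FPS is followed by a ball query that groups neighbours around each sampled centroid. The following is a minimal dependency-free sketch of that grouping step, not the reference implementation. The function name and the padding rule (repeat the first neighbour found when a ball holds fewer than `nsample` points) are assumptions modelled on common implementations.

```python
def ball_query(points, centroid_idx, radius, nsample):
    """For each centroid index, return `nsample` neighbour indices within `radius`.

    Balls with fewer than `nsample` neighbours are padded by repeating the
    first neighbour found, so every group has the same length.
    """
    r2 = radius * radius
    groups = []
    for ci in centroid_idx:
        cx, cy, cz = points[ci]
        group = []
        for j, (x, y, z) in enumerate(points):
            if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r2:
                group.append(j)
                if len(group) == nsample:
                    break
        # The centroid is itself one of the points, so `group` is never empty.
        group += [group[0]] * (nsample - len(group))
        groups.append(group)
    return groups


points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.2, 0.0), (5.0, 5.0, 5.0)]
groups = ball_query(points, centroid_idx=[0, 3], radius=0.5, nsample=3)
print(groups)
```

Fixed-size groups are what allow the per-group features to be stacked into a dense tensor and fed to a shared MLP.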

MinkowskiEngine — sparse 3D conv

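import torch
import MinkowskiEngine as ME

# Build a sparse tensor from one point cloud.
# Assumes `points` is an (N, 3) float tensor and `voxel_size` a float, both defined elsewhere.
coords = torch.unique(torch.floor(points / voxel_size).int(), dim=0)  # one row per occupied voxel
coords = ME.utils.batched_coordinates([coords])  # one list entry per cloud; prepends the batch index
feats = torch.ones(coords.shape[0], 3)  # constant placeholder features; use RGB/normals if available
x = ME.SparseTensor(features=feats, coordinates=coords)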

class SparseUNet(ME.MinkowskiNetwork):
    def __init__(self, in_channels=3, out_channels=20, D=3):
        super().__init__(D)
        self.conv1 = ME.MinkowskiConvolution(in_channels, 32, kernel_size=3, dimension=D)
        self.bn1 = ME.MinkowskiBatchNorm(32)
        self.relu = ME.MinkowskiReLU()
        # ... encoder/decoder
    def forward(self, x):
        return self.relu(self.bn1(self.conv1(x)))

Spconv (traveller59/spconv): fast alternative

import torch.nn as nn
import spconv.pytorch as spconv

class SimpleSpconv(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = spconv.SparseSequential(
            spconv.SubMConv3d(3, 32, 3, padding=1),
            nn.BatchNorm1d(32), nn.ReLU(),
            spconv.SparseConv3d(32, 64, 2, stride=2),  # downsample
            nn.BatchNorm1d(64), nn.ReLU(),
        )
    def forward(self, voxel_features, voxel_coords, batch_size, spatial_shape):
        # voxel_features: (M, 3) float; voxel_coords: (M, 4) int32 as (batch_idx, z, y, x)
        x = spconv.SparseConvTensor(voxel_features, voxel_coords, spatial_shape, batch_size)
        return self.net(x)

KITTI LiDAR — load + crop

import numpy as np

def load_kitti_bin(path):
    # KITTI LiDAR: (N, 4) — x, y, z, intensity
    return np.fromfile(path, dtype=np.float32).reshape(-1, 4)

def crop_fov(pts, x_range=(-50, 50), y_range=(-50, 50), z_range=(-3, 1)):
    mask = ((pts[:, 0] > x_range[0]) & (pts[:, 0] < x_range[1]) &
            (pts[:, 1] > y_range[0]) & (pts[:, 1] < y_range[1]) &
            (pts[:, 2] > z_range[0]) & (pts[:, 2] < z_range[1]))
    return pts[mask]
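
The loader above relies on the KITTI `.bin` layout: a flat stream of float32 values, four per point (x, y, z, intensity). As a sanity check without the dataset or NumPy, the following standard-library sketch writes and reads that layout. The helper names and the file name `000000.bin` are hypothetical, and little-endian byte order is assumed.

```python
import os
import struct
import tempfile


def write_kitti_bin(path, points):
    """Write (x, y, z, intensity) tuples as consecutive little-endian float32 values."""
    with open(path, "wb") as f:
        for p in points:
            f.write(struct.pack("<4f", *p))


def read_kitti_bin(path):
    """Read the file back as a list of (x, y, z, intensity) tuples."""
    with open(path, "rb") as f:
        data = f.read()
    return list(struct.iter_unpack("<4f", data))


# Values chosen to be exactly representable in float32.
points = [(1.5, -2.0, 0.25, 0.5), (10.0, 3.0, -1.0, 0.0)]
path = os.path.join(tempfile.mkdtemp(), "000000.bin")

write_kitti_bin(path, points)
loaded = read_kitti_bin(path)
print(loaded)
```

Each point occupies 16 bytes, so the file size divided by 16 gives the point count, which is a quick way to spot a truncated scan.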

Gaussian Splatting export (modern 3D recon)

# 3DGS: a scene is a set of points with Gaussian parameters (μ, Σ, color, opacity)
# Successor to NeRF in many pipelines: fast training and rendering
import gsplat
# https://github.com/nerfstudio-project/gsplat
# Render: gsplat.rasterization(means, quats, scales, opacities, colors, ...)

Point Transformer v3 (SOTA 2024)

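# Assumes the Pointcept repository (github.com/Pointcept/Pointcept) is checked out and importable
from pointcept.models.point_transformer_v3 import PointTransformerV3

# Backbone only; in Pointcept the class count is set on the segmentor that wraps the backbone
model = PointTransformerV3(
    in_channels=6,
    enc_depths=(2, 2, 2, 6, 2), enc_channels=(32, 64, 128, 256, 512),
)
# Input: dict with 'feat', 'coord', 'grid_coord', 'offset' (input_dict comes from the data pipeline)
out = model(input_dict)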

Decision criteria

| Situation | Method |
| --- | --- |
| Small clouds (<10K points), classification | PointNet++ |
| LiDAR scene segmentation (>100K points) | Sparse conv (Spconv / MinkowskiEngine) |
| SOTA segmentation | PTv3 |
| 3D detection (autonomous driving) | CenterPoint / TransFusion |
| Reconstruction from images | Gaussian Splatting |
| Registration | GeoTransformer |
| Quick prototyping | Open3D |

Defaults: Spconv (LiDAR) / PTv3 (indoor) / Open3D (general utilities).

🔗 Graph

🤖 LLM usage

When to use: LiDAR scene understanding, semantic segmentation of indoor scans, robot perception. When not to use: the data is already dense images (an image CNN is sufficient), or the task is mesh-native (use mesh networks).

Anti-patterns

  • Dense voxel grids: they run out of memory at scene scale; use a sparse representation.
  • Skipping normalization: normalize each cloud to the unit sphere or unit cube.
  • PointNet on large scenes: a single max-pool over 100K points loses too much information.
  • Forgetting the T-Net (PointNet): omitting the input transform leaves the model more sensitive to rotations (SO(3)).
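
The normalization point can be sketched in a few lines. This dependency-free version (the function name is hypothetical) centers a cloud at its centroid and rescales it so the farthest point lies on the unit sphere. A real pipeline would more likely do the same thing with NumPy or PyTorch tensor operations.

```python
import math


def normalize_unit_sphere(points):
    """Center a cloud at its centroid and scale so the farthest point has norm 1."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    centered = [(x - cx, y - cy, z - cz) for x, y, z in points]
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if scale == 0.0:
        return centered  # degenerate cloud: all points coincide
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]


# A small cloud offset from the origin along x.
cloud = [(10.0, 0.0, 0.0), (14.0, 0.0, 0.0), (12.0, 2.0, 0.0), (12.0, -2.0, 0.0)]
normalized = normalize_unit_sphere(cloud)
max_norm = max(math.sqrt(x * x + y * y + z * z) for x, y, z in normalized)
print(normalized, max_norm)
```

Centering removes the dependence on where the object sits in the sensor frame, and the scaling removes the dependence on its absolute size.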

🧪 Verification / duplicates

  • Verified against: PointNet / PointNet++ (Qi et al., 2017), MinkowskiEngine docs, Spconv repo, PTv3 (2024).
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: PointNet/Sparse/PTv3 + LiDAR + Open3D + GS |