---
id: wiki-2026-0508-point-cloud-processing
title: Point Cloud Processing
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [point-cloud, 3d-deep-learning, lidar-processing]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [3d, point-cloud, pointnet, lidar, sparse-conv]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch / Open3D
---

# Point Cloud Processing

## In One Line
> **"Deep learning on unordered 3D point sets."** PointNet (Qi et al., 2017) pioneered permutation-invariant networks that consume raw points directly. The line of work then runs PointNet++ → sparse convolution (MinkowskiEngine, Spconv) → transformers (PTv3) → modern 3D foundation models. It is core to LiDAR-based autonomous driving, robotics, and AR/VR.

## Core Concepts

### Challenges
- **Unordered**: N points admit N! orderings, so the network needs a symmetric (order-independent) aggregation function.
- **Sparse and irregular**: a voxel grid over the scene is mostly empty.
- **Scale variance**: a LiDAR sweep has ~100K points, while a CAD model is often sampled at ~1K.
- **No canonical orientation**: SE(3) equivariance is desirable.

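
The "unordered" challenge can be illustrated with a minimal NumPy sketch. It assumes a random weight matrix (`weights`, a hypothetical stand-in for a trained shared per-point layer) and checks that max pooling over the point axis gives the same global feature after the points are shuffled, whereas the flattened coordinate vector does not survive the shuffle.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(128, 3))   # toy cloud: N=128 points, xyz
weights = rng.normal(size=(3, 16))   # stand-in for a shared per-point layer (assumption)

def global_feature(pts):
    """Shared per-point map followed by a symmetric max over the point axis."""
    per_point = np.maximum(pts @ weights, 0.0)   # (N, 16), same weights for every point
    return per_point.max(axis=0)                 # (16,), independent of point order

shuffled = points[rng.permutation(len(points))]

# Max pooling ignores point order; a flattened input vector does not.
same_after_shuffle = np.allclose(global_feature(points), global_feature(shuffled))
flatten_changes = not np.allclose(points.reshape(-1), shuffled.reshape(-1))
print(same_after_shuffle, flatten_changes)
```

Any symmetric reduction (max, sum, mean) would pass the same check; max is used here because it is the aggregation PointNet uses.
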
### Lineage
- **PointNet (2017)**: per-point MLP + max-pool. Permutation-invariant, but captures no local structure.
- **PointNet++ (2017)**: hierarchical sampling + grouping (FPS + ball query).
- **Voxel + sparse conv** (2018-2020): MinkowskiEngine, Spconv. Computation runs only on non-empty voxels.
- **Graph and point-convolution methods**: DGCNN, KPConv (kernel point convolution).
- **Transformers**: Point Transformer (v1 2021, v2 2022, v3 2024). PTv3 reported state-of-the-art results on ScanNet at publication.
- **3D foundation models (2024-2025)**: Sonata, PointTransformerV3-based backbones, Uni3D.

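
The grouping step named in the PointNet++ entry (ball query around sampled centroids) can be sketched as follows. This is a minimal NumPy version for illustration, not the reference CUDA implementation; the function name `ball_query`, the padding rule, and the use of the first 8 points as centroids are assumptions made for this example.

```python
import numpy as np

def ball_query(xyz, centers, radius, k):
    """For each center, return indices of up to k points within `radius`.

    xyz: (N, 3), centers: (M, 3). Groups with fewer than k neighbours are
    padded by repeating the first neighbour found, so the output is (M, k).
    """
    d2 = ((centers[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)   # (M, N) squared distances
    groups = np.empty((len(centers), k), dtype=np.int64)
    for m, row in enumerate(d2):
        idx = np.flatnonzero(row <= radius ** 2)[:k]
        if idx.size == 0:                      # no neighbour: fall back to the nearest point
            idx = np.array([row.argmin()])
        groups[m] = np.concatenate([idx, np.full(k - idx.size, idx[0])])
    return groups

rng = np.random.default_rng(0)
xyz = rng.uniform(-1.0, 1.0, size=(512, 3))
centers = xyz[:8]                               # stand-in for FPS-selected centroids
groups = ball_query(xyz, centers, radius=0.3, k=16)
local = xyz[groups] - centers[:, None, :]       # (8, 16, 3) neighbourhoods, centre-relative
```

Expressing each neighbourhood relative to its centroid, as in the last line, is what lets a small PointNet applied per group learn local shape rather than absolute position.
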
### Tasks
1. **Classification**: ModelNet40 (CAD), ScanObjectNN.
2. **Part segmentation**: ShapeNet-Part.
3. **Semantic segmentation**: ScanNet, S3DIS, SemanticKITTI (LiDAR).
4. **Detection**: KITTI, nuScenes, Waymo (3D bounding boxes).
5. **Registration**: ICP, DGR, GeoTransformer.
6. **Reconstruction**: NeRF, Gaussian Splatting.

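
For the registration task, the closed-form alignment step that ICP repeats after each nearest-neighbour matching round can be sketched with an SVD (the Kabsch solution). The sketch below assumes correspondences are already known row by row, so it covers only the alignment step, not a full ICP loop; `kabsch` and the synthetic test transform are names and values chosen for this example.

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) with dst ≈ src @ R.T + t.

    src, dst: (N, 3) arrays with known row-to-row correspondences.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Synthetic check: rotate 30 degrees about z, translate, then recover the transform.
rng = np.random.default_rng(0)
src = rng.normal(size=(200, 3))
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
dst = src @ R_true.T + t_true
R_est, t_est = kabsch(src, dst)
```

Learned methods such as GeoTransformer mainly replace the correspondence search; a closed-form rigid fit of this kind is still a common final step.
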
### Applications
1. Autonomous driving (LiDAR perception).
2. Robotic manipulation (depth → grasp).
3. AR/VR (scene understanding).
4. BIM / construction (as-built scans).

## 💻 Patterns

### Open3D: load + visualize
```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("scan.ply")
print(pcd)  # PointCloud with 1234567 points
pcd = pcd.voxel_down_sample(voxel_size=0.05)  # voxel size in the cloud's own units
pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)  # (cloud, kept indices)
pcd.estimate_normals()
o3d.visualization.draw_geometries([pcd])
```

### PointNet (minimal)
```python
import torch
import torch.nn as nn

class PointNetCls(nn.Module):
    def __init__(self, num_classes=40):
        super().__init__()
        # Shared per-point MLPs, written as 1x1 convolutions over the point axis
        self.mlp1 = nn.Sequential(nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU())
        self.mlp2 = nn.Sequential(nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU())
        self.mlp3 = nn.Sequential(nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, num_classes))

    def forward(self, x):  # x: (B, 3, N)
        x = self.mlp3(self.mlp2(self.mlp1(x)))  # (B, 1024, N)
        x = x.max(dim=2)[0]  # permutation-invariant max pool → (B, 1024)
        return self.head(x)  # (B, num_classes)
```

### Farthest Point Sampling (FPS): PointNet++
```python
import torch

def fps(xyz, npoint):
    # xyz: (B, N, 3) → indices of the sampled points, shape (B, npoint)
    B, N, _ = xyz.shape
    centroids = torch.zeros(B, npoint, dtype=torch.long, device=xyz.device)
    distance = torch.full((B, N), 1e10, device=xyz.device)  # distance to nearest chosen point
    farthest = torch.randint(0, N, (B,), device=xyz.device)  # random seed point per batch
    batch_idx = torch.arange(B, device=xyz.device)
    for i in range(npoint):
        centroids[:, i] = farthest
        centroid = xyz[batch_idx, farthest, :].unsqueeze(1)  # (B, 1, 3)
        dist = ((xyz - centroid) ** 2).sum(-1)  # squared distance to the newest centroid
        distance = torch.minimum(distance, dist)
        farthest = distance.argmax(-1)  # next pick: point farthest from all chosen so far
    return centroids
```

### MinkowskiEngine: sparse 3D conv
```python
import torch
import MinkowskiEngine as ME

# Placeholder inputs (assumptions for this snippet): one (N, 3) cloud and a voxel size
points = torch.rand(10000, 3) * 10.0
voxel_size = 0.05

# Build sparse tensor from point cloud
coords = torch.floor(points / voxel_size).int()
# batched_coordinates takes a list with one (N_i, 3) tensor per sample in the batch
coords = ME.utils.batched_coordinates([coords])
feats = torch.ones(coords.shape[0], 3)  # or RGB/normal
x = ME.SparseTensor(features=feats, coordinates=coords)

class SparseUNet(ME.MinkowskiNetwork):
    def __init__(self, in_channels=3, out_channels=20, D=3):
        super().__init__(D)
        self.conv1 = ME.MinkowskiConvolution(in_channels, 32, kernel_size=3, dimension=D)
        self.bn1 = ME.MinkowskiBatchNorm(32)
        self.relu = ME.MinkowskiReLU()
        # ... encoder/decoder

    def forward(self, x):
        return self.relu(self.bn1(self.conv1(x)))
```

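
To see why sparse convolution libraries store only occupied voxels, the quantization step above can be reproduced in plain NumPy and the occupancy measured directly. This is a sketch on a synthetic scene (a flat ground patch plus one vertical pole, both invented for the example); `voxelize` is a hypothetical helper, not part of MinkowskiEngine or Spconv.

```python
import numpy as np

def voxelize(points, voxel_size):
    """Quantize (N, 3) points to integer voxel coords; keep one row per occupied voxel."""
    coords = np.floor(points / voxel_size).astype(np.int32)
    coords -= coords.min(axis=0)                       # shift so indices start at 0
    unique, inverse = np.unique(coords, axis=0, return_inverse=True)
    return unique, inverse.reshape(-1)                 # occupied voxels, point → voxel map

rng = np.random.default_rng(0)
# Toy scene: 50K ground points on a 20 m x 20 m patch, plus a 10 m vertical pole.
ground = np.hstack([rng.uniform(0.0, 20.0, size=(50_000, 2)),
                    rng.normal(0.0, 0.02, size=(50_000, 1))])
pole = np.hstack([np.full((1_000, 2), 10.0),
                  rng.uniform(0.0, 10.0, size=(1_000, 1))])
points = np.vstack([ground, pole])

occupied, point_to_voxel = voxelize(points, voxel_size=0.1)
grid_shape = occupied.max(axis=0) + 1
dense_cells = int(np.prod(grid_shape, dtype=np.int64))
occupancy = len(occupied) / dense_cells
print(f"{len(occupied)} occupied of {dense_cells} cells ({occupancy:.2%})")
```

The tall pole forces the bounding grid to span the full height, so almost all cells are empty; a sparse tensor keeps only the `occupied` rows and their features.
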
### Spconv (traveller59/spconv): a fast alternative
```python
import torch.nn as nn
import spconv.pytorch as spconv

class SimpleSpconv(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = spconv.SparseSequential(
            spconv.SubMConv3d(3, 32, 3, padding=1),  # submanifold conv: active set unchanged
            nn.BatchNorm1d(32), nn.ReLU(),
            spconv.SparseConv3d(32, 64, 2, stride=2),  # downsample
            nn.BatchNorm1d(64), nn.ReLU(),
        )

    def forward(self, voxel_features, voxel_coords, batch_size, spatial_shape):
        # voxel_features: (M, 3); voxel_coords: (M, 4) int32, batch index first,
        # then spatial indices in the same order as spatial_shape
        x = spconv.SparseConvTensor(voxel_features, voxel_coords, spatial_shape, batch_size)
        return self.net(x)
```

### KITTI LiDAR: load + crop
```python
import numpy as np

def load_kitti_bin(path):
    # KITTI LiDAR .bin: flat float32 stream, reshaped to (N, 4) = x, y, z, intensity
    return np.fromfile(path, dtype=np.float32).reshape(-1, 4)

def crop_fov(pts, x_range=(-50, 50), y_range=(-50, 50), z_range=(-3, 1)):
    # Axis-aligned range crop in the sensor frame (a box filter, not an angular FOV)
    mask = ((pts[:, 0] > x_range[0]) & (pts[:, 0] < x_range[1]) &
            (pts[:, 1] > y_range[0]) & (pts[:, 1] < y_range[1]) &
            (pts[:, 2] > z_range[0]) & (pts[:, 2] < z_range[1]))
    return pts[mask]
```

### Gaussian Splatting export (modern 3D recon)
```python
# 3DGS: points + Gaussian params (μ, Σ, color, opacity)
# Widely used as a successor to NeRF: fast training and rendering
import gsplat
# https://github.com/nerfstudio-project/gsplat
# Render: gsplat.rasterization(means, quats, scales, opacities, colors, ...)
```

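### Point Transformer v3 (SOTA 2024)
```python
# Assumes Pointcept is installed, e.g. from a checkout of
# https://github.com/Pointcept/Pointcept
from pointcept.models.point_transformer_v3 import PointTransformerV3

# PTv3 is a backbone. In the Pointcept configs the class count (e.g. 20 for ScanNet)
# is set on the wrapping segmentor rather than here; verify against your checkout.
model = PointTransformerV3(
    in_channels=6,
    enc_depths=(2, 2, 2, 6, 2), enc_channels=(32, 64, 128, 256, 512),
)
# Input: dict with 'feat', 'coord', 'grid_coord', 'offset'
out = model(input_dict)  # input_dict must be built by the caller
```
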
## Decision Criteria
| Situation | Method |
|---|---|
| Classification of small clouds (<10K points) | PointNet++ |
| LiDAR scene segmentation (>100K points) | Sparse conv (Spconv / MinkowskiEngine) |
| SOTA segmentation | PTv3 |
| 3D detection (autonomous driving) | CenterPoint / TransFusion |
| Reconstruction from images | Gaussian Splatting |
| Registration | GeoTransformer |
| Quick prototyping | Open3D |

**Defaults**: Spconv (LiDAR) / PTv3 (indoor) / Open3D (general utilities).

## 🔗 Graph
- Parent: [[3D-Deep-Learning]] · [[Computer Vision|Computer-Vision]]
- Applications: [[Autonomous-Driving]] · [[Gaussian-Splatting]]
- Adjacent: [[NeRF]]

## 🤖 LLM Usage
**When to use**: LiDAR scene understanding, semantic segmentation of indoor scans, robot perception.
**When not to use**: the data is already dense images (an image CNN is sufficient), or the task is mesh-native (use mesh networks).

## ❌ Anti-patterns
- **Dense voxel grids**: they run out of memory on large scenes; use a sparse representation.
- **Skipping normalization**: normalize each cloud to the unit sphere or unit cube.
- **PointNet on large scenes**: a single max-pool over 100K points loses local information.
- **Dropping the T-Net (PointNet)**: omitting the input transform leaves the model sensitive to rotations (SO(3)).

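
Two of the anti-patterns above are easy to make concrete. The sketch below shows one common way to normalize a cloud to the unit sphere, and a back-of-the-envelope memory estimate for a dense feature grid. The helper names and the scene dimensions (100 m x 100 m x 4 m at 5 cm voxels, 32 float32 channels) are assumptions chosen for illustration; under them the dense grid needs roughly 38 GiB for a single feature map.

```python
import numpy as np

def normalize_unit_sphere(points):
    """Center an (N, 3) cloud at its centroid and scale it to fit the unit sphere."""
    centered = points - points.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / scale if scale > 0 else centered

def dense_grid_gib(extent_m, voxel_size_m, channels, bytes_per_value=4):
    """Memory (GiB) of a dense grid covering `extent_m` (x, y, z) at the given voxel size."""
    cells = np.prod(np.ceil(np.asarray(extent_m) / voxel_size_m))
    return cells * channels * bytes_per_value / 2**30

rng = np.random.default_rng(0)
cloud = rng.normal(loc=[5.0, -2.0, 1.0], scale=[3.0, 1.0, 0.5], size=(1024, 3))
unit = normalize_unit_sphere(cloud)

# Hypothetical LiDAR crop: 100 m x 100 m x 4 m at 5 cm voxels, 32 feature channels.
gib = dense_grid_gib((100.0, 100.0, 4.0), 0.05, channels=32)
print(f"max radius {np.linalg.norm(unit, axis=1).max():.3f}, dense grid {gib:.1f} GiB")
```

Unit-sphere normalization suits object-level clouds such as ModelNet40 samples; metric LiDAR scenes are usually kept in metres and cropped instead.
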
## 🧪 Verification / Duplicates
- Verified against: PointNet/PointNet++ (Qi et al., 2017), MinkowskiEngine docs, Spconv repo, PTv3 (2024).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: PointNet/Sparse/PTv3 + LiDAR + Open3D + GS |