---
id: wiki-2026-0508-compute-shader
title: Compute Shader (WebGPU)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [compute shader, WebGPU compute, GPGPU, WGSL, GPU-driven rendering, indirect draw]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [webgpu, compute-shader, gpgpu, wgsl, gpu-driven-rendering, three-js, particle-system, simulation]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: WGSL / WebGPU
  framework: Three.js / Babylon.js / wgpu-rs
---

# Compute Shader

## In one line

> **"Parallelism across thousands of GPU cores."** WebGPU brings GPGPU to the web: particles, fluid simulation, culling, ML inference. Example figures: CPU 30 ms for 10K particles versus GPU 2 ms for 100K particles, which is about 150× higher per-particle throughput.

## Core

### Use cases

1. **Particle systems**: millions of particles.
2. **Fluid simulation**: SPH or grid-based.
3. **Cloth / soft-body**.
4. **Procedural terrain**.
5. **GPU-driven rendering**: culling and indirect draw.
6. **Compute skinning**: vertex transforms on the GPU.
7. **Image processing**: blur, filters.
8. **GPGPU**: ML inference, numerical computing.

### Compared with vertex / fragment shaders

- **Vertex**: runs once per vertex.
- **Fragment**: runs once per pixel (fragment).
- **Compute**: arbitrary computation with storage read/write.

### Core concepts

#### Workgroup

- A group of threads (invocations), e.g. 8×8×1 = 64 threads.
- Invocations in the same workgroup can share memory.
- Mapped onto hardware execution units (warps / wavefronts).

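To make the workgroup arithmetic concrete, here is a minimal JavaScript sketch (plain CPU code, no WebGPU device needed) of how a dispatch size follows from the element count and how a 1D global invocation index is derived. The helper names `workgroupCount` and `globalIndex` are hypothetical, chosen only for illustration.

```javascript
// Number of workgroups needed so every element gets one invocation.
// Rounds up, so the last workgroup may contain idle invocations.
function workgroupCount(elementCount, workgroupSize) {
  return Math.ceil(elementCount / workgroupSize);
}

// 1D equivalent of WGSL's global_invocation_id.x:
// workgroup_id.x * workgroup_size + local_invocation_id.x
function globalIndex(workgroupId, localId, workgroupSize) {
  return workgroupId * workgroupSize + localId;
}

const groups = workgroupCount(1000, 64);
console.log(groups);                   // 16 workgroups, i.e. 1024 invocations
console.log(globalIndex(15, 63, 64));  // 1023, past the end of a 1000-element array
```

Because the count rounds up, the trailing invocations have no element to process, which is why the kernels in this note start with an `arrayLength` bounds check.
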
#### Storage buffer / texture

- Readable and writable (a sampled texture is read-only).
- Essential for fluid simulation and similar workloads.

#### Workgroup variable (shared memory)

- Shared by all invocations in one workgroup.
- Often cited as 10-100× faster than global (storage) memory.
- The basis for reductions and prefix sums.

#### Indirect draw

- The GPU generates its own draw commands.
- Minimizes CPU-GPU synchronization.

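As a rough illustration of what "the GPU generates the draw command" means in practice: a non-indexed indirect draw reads four consecutive 32-bit unsigned integers from a buffer. The sketch below packs those arguments on the CPU; the helper name `packDrawIndirectArgs` is hypothetical, and the vertex count of 36 is an arbitrary example value.

```javascript
// Argument order read by a non-indexed indirect draw:
// vertexCount, instanceCount, firstVertex, firstInstance (each a u32).
function packDrawIndirectArgs({ vertexCount, instanceCount = 0, firstVertex = 0, firstInstance = 0 }) {
  return new Uint32Array([vertexCount, instanceCount, firstVertex, firstInstance]);
}

// instanceCount starts at 0; a culling compute pass is expected to increment it.
const args = packDrawIndirectArgs({ vertexCount: 36 });
console.log(args.byteLength); // 16
```

In a WebGPU application this array would be uploaded with `device.queue.writeBuffer()` into a buffer created with `GPUBufferUsage.INDIRECT | GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST`, written by a compute pass, and then consumed by `drawIndirect(buffer, 0)` on the render pass, so the CPU never needs to read the visible-instance count back.
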
### WGSL (WebGPU Shading Language)

- Syntax: Rust-like.
- Strictly typed.
- One language for vertex, fragment, and compute stages.

### Sync / async

- GPU work is asynchronous by default.
- Dependencies between invocations in a workgroup need explicit barriers (`workgroupBarrier()`, `storageBarrier()`).
- Readback to the CPU is expensive; avoid it where possible.

### Modern applications

- **Three.js WebGPU renderer**: v160+.
- **Babylon.js**.
- **wgpu-rs**: native and web.
- **Hokusai** (Expo 2025 Osaka): 1M-particle fluid.
- **Million-component BIM platform**.

## 💻 Patterns

### Basic compute shader (WGSL)

```wgsl
// Add two arrays element-wise
@group(0) @binding(0) var<storage, read> input_a: array<f32>;
@group(0) @binding(1) var<storage, read> input_b: array<f32>;
@group(0) @binding(2) var<storage, read_write> output: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  let idx = id.x;
  // The dispatch rounds up to whole workgroups, so guard the tail
  if (idx >= arrayLength(&input_a)) { return; }
  output[idx] = input_a[idx] + input_b[idx];
}
```

### JavaScript dispatch (WebGPU)

```js
// Assumes `dataA` and `dataB` are equal-length Float32Arrays and
// `wgslSource` holds the WGSL shader from the previous section.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error('WebGPU is not available');
const device = await adapter.requestDevice();

// Input buffers: storage buffers filled from the CPU
const makeInput = (data) => {
  const buffer = device.createBuffer({
    size: data.byteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(buffer, 0, data);
  return buffer;
};
const inputA = makeInput(dataA);
const inputB = makeInput(dataB);

// Output buffer: COPY_SRC so the result can be copied out later
const output = device.createBuffer({
  size: dataA.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
});

// Pipeline
const module = device.createShaderModule({ code: wgslSource });
const pipeline = device.createComputePipeline({
  layout: 'auto',
  compute: { module, entryPoint: 'main' },
});

const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [
    { binding: 0, resource: { buffer: inputA } },
    { binding: 1, resource: { buffer: inputB } },
    { binding: 2, resource: { buffer: output } },
  ],
});

// Dispatch: one invocation per element, 64 per workgroup
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
pass.dispatchWorkgroups(Math.ceil(dataA.length / 64));
pass.end();
device.queue.submit([encoder.finish()]);
```

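### Particle system (Three.js WebGPU)

```js
// Import paths vary between Three.js releases; recent ones export TSL from 'three/tsl'.
import * as THREE from 'three/webgpu';
import { Fn, If, instanceIndex, storage, uniform } from 'three/tsl';

// Assumes `renderer` is an initialized THREE.WebGPURenderer.
const N_PARTICLES = 100_000;
const dt = uniform(1 / 60);
const gravity = uniform(9.8);

// GPU-resident position and velocity buffers
const makeBuffer = () =>
  storage(new THREE.StorageInstancedBufferAttribute(N_PARTICLES, 3), 'vec3', N_PARTICLES);
const positions = makeBuffer();
const velocities = makeBuffer();

const particleUpdate = Fn(() => {
  const pos = positions.element(instanceIndex);
  const vel = velocities.element(instanceIndex);

  // Integrate: gravity changes velocity, velocity changes position
  vel.y.subAssign(gravity.mul(dt));
  pos.addAssign(vel.mul(dt));

  // Boundary: bounce off the ground plane with damping
  If(pos.y.lessThan(0), () => {
    pos.y.assign(0);
    vel.y.mulAssign(-0.8);
  });
})().compute(N_PARTICLES);

await renderer.computeAsync(particleUpdate);
```
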
### Fluid simulation (SPH-style)

```wgsl
// Per-particle neighbor search and force accumulation.
// sph_force() is assumed to be defined elsewhere (pressure + viscosity terms).
struct Particle {
  pos: vec3<f32>,
  vel: vec3<f32>,
}

struct SimParams {
  dt: f32,
  smoothing_length: f32,
}

// Ping-pong buffers: read the previous step, write the next one, so no
// invocation reads a neighbor that another invocation is updating.
@group(0) @binding(0) var<storage, read> particles_in: array<Particle>;
@group(0) @binding(1) var<storage, read_write> particles_out: array<Particle>;
@group(0) @binding(2) var<uniform> params: SimParams;

@compute @workgroup_size(64)
fn step(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  let n = arrayLength(&particles_in);
  if (i >= n) { return; }

  let me = particles_in[i];
  var force = vec3<f32>(0.0, -9.8, 0.0);

  // Brute-force O(N^2) neighbor sum (simplified; real SPH uses a spatial grid)
  for (var j = 0u; j < n; j++) {
    if (j == i) { continue; }
    let other = particles_in[j];
    let r = other.pos - me.pos;
    let d = length(r);
    if (d < params.smoothing_length) {
      force += sph_force(me, other, r, d);
    }
  }

  var next = me;
  next.vel += force * params.dt;
  next.pos += next.vel * params.dt;
  particles_out[i] = next;
}
```

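The kernel's own comment notes that real SPH uses a spatial grid instead of the brute-force loop. A minimal CPU-side sketch of that idea follows, assuming a uniform grid whose cell size equals the smoothing length, so that every neighbor within range lies in the 3×3×3 block of cells around a particle. The function names (`cellOf`, `buildGrid`, `neighbors`) and the sample positions are hypothetical; a GPU version would typically use sorted cell indices rather than a `Map`.

```javascript
// Map a position [x, y, z] to integer cell coordinates.
function cellOf(pos, cellSize) {
  return pos.map((c) => Math.floor(c / cellSize));
}

// Bucket particle indices by cell key.
function buildGrid(positions, cellSize) {
  const grid = new Map();
  positions.forEach((pos, index) => {
    const key = cellOf(pos, cellSize).join(',');
    if (!grid.has(key)) grid.set(key, []);
    grid.get(key).push(index);
  });
  return grid;
}

// Indices of particles closer than cellSize to particle i (excluding i).
function neighbors(positions, grid, i, cellSize) {
  const [cx, cy, cz] = cellOf(positions[i], cellSize);
  const found = [];
  for (let dx = -1; dx <= 1; dx++) {
    for (let dy = -1; dy <= 1; dy++) {
      for (let dz = -1; dz <= 1; dz++) {
        const bucket = grid.get(`${cx + dx},${cy + dy},${cz + dz}`) ?? [];
        for (const j of bucket) {
          if (j === i) continue;
          const d = Math.hypot(...positions[j].map((c, k) => c - positions[i][k]));
          if (d < cellSize) found.push(j);
        }
      }
    }
  }
  return found;
}

const positions = [[0.1, 0.1, 0.1], [0.3, 0.1, 0.1], [5, 5, 5], [0.9, 0.1, 0.1]];
const grid = buildGrid(positions, 0.5);
console.log(neighbors(positions, grid, 0, 0.5)); // [ 1 ]
```

Each particle now inspects at most 27 cells instead of all N particles, which is what makes particle counts in the hundreds of thousands tractable.
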
### GPU-driven culling (frustum)

```wgsl
// InstanceData, Camera and in_frustum() are assumed to be defined elsewhere.
struct DrawArgs {
  vertex_count: u32,
  instance_count: atomic<u32>,  // reset to 0 before each cull pass
  first_vertex: u32,
  first_instance: u32,
}

@group(0) @binding(0) var<storage, read> instances: array<InstanceData>;
@group(0) @binding(1) var<storage, read_write> draw_args: array<DrawArgs>;
@group(0) @binding(2) var<uniform> camera: Camera;
@group(0) @binding(3) var<storage, read_write> visible_indices: array<u32>;

@compute @workgroup_size(64)
fn cull(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&instances)) { return; }

  if (in_frustum(instances[i].bounding_box, camera.frustum)) {
    // Reserve a slot in the compacted list and bump the indirect instance count
    let slot = atomicAdd(&draw_args[0].instance_count, 1u);
    visible_indices[slot] = i;
  }
}
```

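The kernel calls `in_frustum` without showing it. One common way to write such a test is the "positive vertex" check of an axis-aligned bounding box against six planes; the JavaScript sketch below is a CPU reference for that approach, not necessarily the test the original implementation uses. The plane and box representations are assumptions made for this example.

```javascript
// A plane is [nx, ny, nz, d] with the normal pointing into the frustum:
// a point p is on the inside when dot(n, p) + d >= 0.
// A box is { min: [x, y, z], max: [x, y, z] }.
function inFrustum(box, planes) {
  for (const [nx, ny, nz, d] of planes) {
    // Corner of the box farthest along the plane normal ("positive vertex").
    const px = nx >= 0 ? box.max[0] : box.min[0];
    const py = ny >= 0 ? box.max[1] : box.min[1];
    const pz = nz >= 0 ? box.max[2] : box.min[2];
    // If even that corner is behind the plane, the whole box is outside.
    if (nx * px + ny * py + nz * pz + d < 0) return false;
  }
  return true; // conservative: may keep a few boxes that are actually outside
}

// Toy "frustum": the cube from -1 to 1 expressed as six inward-facing planes.
const planes = [
  [1, 0, 0, 1], [-1, 0, 0, 1],
  [0, 1, 0, 1], [0, -1, 0, 1],
  [0, 0, 1, 1], [0, 0, -1, 1],
];
console.log(inFrustum({ min: [0.5, 0.5, 0.5], max: [1.5, 1.5, 1.5] }, planes)); // true (straddles the boundary)
console.log(inFrustum({ min: [2, 2, 2], max: [3, 3, 3] }, planes));             // false
```

The test is conservative on purpose: a false positive costs one extra draw, while a false negative would make a visible object disappear.
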
### Compute skinning (vertex transform pre-pass)

```wgsl
// Assumed vertex layout: up to four bone influences per vertex.
struct Vertex {
  position: vec3<f32>,
  bone_idx: vec4<u32>,
  bone_weight: vec4<f32>,
}

@group(0) @binding(0) var<storage, read> bone_matrices: array<mat4x4<f32>>;
@group(0) @binding(1) var<storage, read> base_vertices: array<Vertex>;
@group(0) @binding(2) var<storage, read_write> skinned: array<vec4<f32>>;

@compute @workgroup_size(64)
fn skin(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&base_vertices)) { return; }
  let v = base_vertices[i];

  // Linear blend skinning: weighted sum of the bone-transformed positions
  var pos = vec4<f32>(0.0);
  for (var b = 0u; b < 4u; b++) {
    pos += bone_matrices[v.bone_idx[b]] * vec4<f32>(v.position, 1.0) * v.bone_weight[b];
  }

  // The render pass then reads `skinned` as its vertex positions
  skinned[i] = pos;
}
```

### Workgroup shared memory (reduction)

```wgsl
@group(0) @binding(0) var<storage, read> input: array<f32>;
// One partial sum per workgroup
@group(0) @binding(1) var<storage, read_write> output: array<f32>;

// Named `scratch` because `shared` is a reserved word in WGSL
var<workgroup> scratch: array<f32, 64>;

@compute @workgroup_size(64)
fn sum_reduce(
  @builtin(local_invocation_id) lid: vec3<u32>,
  @builtin(global_invocation_id) gid: vec3<u32>,
  @builtin(workgroup_id) wid: vec3<u32>,
) {
  // Load one element per invocation, padding with 0 past the end of the input
  var value = 0.0;
  if (gid.x < arrayLength(&input)) { value = input[gid.x]; }
  scratch[lid.x] = value;
  workgroupBarrier();

  // Tree reduction: halve the active range each step
  for (var stride = 32u; stride > 0u; stride >>= 1u) {
    if (lid.x < stride) {
      scratch[lid.x] += scratch[lid.x + stride];
    }
    workgroupBarrier();
  }

  if (lid.x == 0u) {
    output[wid.x] = scratch[0];
  }
}
```

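A CPU reference is a convenient way to check a kernel like the one above. The JavaScript sketch below emulates the same tree reduction one workgroup at a time and then combines the per-workgroup partial sums, which on the GPU would be a second pass or a small CPU loop. The function names are hypothetical, and the group size is assumed to be a power of two.

```javascript
// Emulate one workgroup: copy its slice into scratch memory (zero-padded),
// then halve the active range each step.
function reduceWorkgroup(input, groupId, groupSize) {
  const scratch = new Float32Array(groupSize);
  for (let lid = 0; lid < groupSize; lid++) {
    const gid = groupId * groupSize + lid;
    scratch[lid] = gid < input.length ? input[gid] : 0;
  }
  for (let stride = groupSize >> 1; stride > 0; stride >>= 1) {
    // On the GPU these iterations run in parallel, separated by workgroupBarrier().
    for (let lid = 0; lid < stride; lid++) {
      scratch[lid] += scratch[lid + stride];
    }
  }
  return scratch[0];
}

// First pass: one partial sum per workgroup. Second pass: combine the partials.
function reduceAll(input, groupSize = 64) {
  const groups = Math.ceil(input.length / groupSize);
  let total = 0;
  for (let g = 0; g < groups; g++) total += reduceWorkgroup(input, g, groupSize);
  return total;
}

const data = Float32Array.from({ length: 1000 }, (_, i) => i + 1);
console.log(reduceAll(data)); // 500500
```

With 64 invocations the loop takes 6 steps (strides 32, 16, 8, 4, 2, 1) instead of 63 sequential additions, which is the point of staging the data in workgroup memory.
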
### Async render (Three.js)

```js
// Render only after the compute pass has finished
async function frame() {
  await renderer.computeAsync(particleUpdate);
  await renderer.renderAsync(scene, camera);
}
```

## 🤔 Decision criteria

| Situation | Approach |
|---|---|
| 100K+ particles | Compute shader |
| Fluid simulation | Compute + storage texture |
| Frustum culling | GPU-driven culling |
| ML inference (browser) | WebGPU + WGSL |
| Image processing | Compute + storage texture |
| Skinned meshes (many) | Compute skinning |
| < 10K particles | CPU is fine |
| < 1000 instances | CPU instancing |

**Default**: WebGPU + Three.js v160+ for the web; wgpu-rs for native.

## 🔗 Graph

- Parent: [[WebGPU]] · [[Computer-Graphics]]
- Variants: [[WGSL]] · [[GPU-Driven-Rendering]] · [[Indirect-Draw]]
- Applications: [[Three-js]] · [[Particle-System]]
- Adjacent: [[CSS Animations]] · [[Web-Performance]] · [[Bottlenecks]] · [[Bioenergetics]] (energy-efficient)

## 🤖 LLM usage

**When**: GPU compute on the web; large particle systems or simulations; GPU-driven rendering; in-browser ML.

**When not**: small tasks the CPU handles fine; projects that need a WebGL-only fallback.

## ❌ Anti-patterns

- **CPU-GPU readback every frame**: causes sync stalls.
- **Wrong workgroup size** (e.g. 8): underutilizes the GPU.
- **Missing barriers**: race conditions.
- **Storage textures without WebGPU**: unsupported.
- **Synchronous compute + render**: stalls.
- **No fallback for older browsers**: the page breaks.

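For the missing-fallback case, a small feature check at startup is usually enough. The sketch below is one possible shape for it: `pickBackend` is a hypothetical helper, and the `'webgpu'` / `'webgl'` return values are placeholders for whatever renderer setup the application actually has. The navigator-like object is passed in as a parameter so the logic can also run outside a browser.

```javascript
// Decide which rendering backend to use.
async function pickBackend(nav) {
  if (!nav || !nav.gpu) return 'webgl';       // WebGPU API not exposed at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'webgpu' : 'webgl';      // null adapter: no suitable GPU
  } catch {
    return 'webgl';                           // treat any failure as "unsupported"
  }
}

// In a browser: const backend = await pickBackend(navigator);
pickBackend({}).then((backend) => console.log(backend)); // webgl
```

Checking for a non-null adapter matters because `navigator.gpu` can exist on a machine where no usable adapter is available.
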
## 🧪 Verification / duplicates

- Verified (WebGPU spec, Three.js webgpu, Hokusai exhibition).
- Trust level A.
- Related: [[CSS Animations]] · [[Web-Performance]] · [[Bottlenecks]] · [[Baseline-Project]] · [[20k skinned instances demo]].

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-04-19 | Auto-mapped |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: workgroup + WGSL / Three.js / fluid / culling / skinning code |