---
id: wiki-2026-0508-compute-shader
title: Compute Shader (WebGPU)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [compute shader, WebGPU compute, GPGPU, WGSL, GPU-driven rendering, indirect draw]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [webgpu, compute-shader, gpgpu, wgsl, gpu-driven-rendering, three-js, particle-system, simulation]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: WGSL / WebGPU
  framework: Three.js / Babylon.js / wgpu-rs
---

# Compute Shader

## One-line summary

> **"Parallelism across thousands of GPU cores."** WebGPU brings GPGPU to the web: particles, fluid simulation, culling, ML inference. Example figures: CPU 30 ms for 10K particles vs. GPU 2 ms for 100K particles, roughly 150× higher per-particle throughput.

## Key points

### Use cases

1. **Particle system**: millions of particles.
2. **Fluid simulation**: SPH, grid-based.
3. **Cloth / soft-body**.
4. **Procedural terrain**.
5. **GPU-driven rendering**: culling, indirect draw.
6. **Compute skinning**: vertex transforms on the GPU.
7. **Image processing**: blur, filters.
8. **GPGPU**: ML inference, numerical computing.

### vs. vertex / fragment shader

- **Vertex**: runs once per vertex.
- **Fragment**: runs once per pixel.
- **Compute**: arbitrary computation with storage read/write.

### Key concepts

#### Workgroup

- A group of threads (e.g., 8×8×1 = 64 threads).
- Threads in a group share workgroup memory.
- Maps onto hardware execution units (warp / wavefront).

#### Storage buffer / texture

- Read and write access (a sampled texture is read-only).
- Essential for fluid simulation and similar workloads.

#### Workgroup variable (shared memory)

- Shared by all threads in a workgroup.
- Roughly 10-100× faster than global memory.
- The basis for reduction and prefix sum.

#### Indirect draw

- The GPU generates the draw commands itself.
- Minimizes CPU-GPU synchronization.

### WGSL (WebGPU Shading Language)

- Rust-like syntax.
- Strictly typed.
- One language for vertex, fragment, and compute stages.

### Sync / async

- GPU work is asynchronous by default.
- Data dependencies inside a shader need explicit barriers (e.g., `workgroupBarrier()`); WebGPU orders passes within a queue automatically.
- Readback to the CPU is expensive; avoid it where possible.
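
### Dispatch arithmetic (CPU-side sketch)

The workgroup concept above can be made concrete with a small CPU-side sketch in plain JavaScript. It is not WebGPU API: `workgroupCount` and `simulateDispatch` are hypothetical helper names introduced here for illustration. Assuming a 1D dispatch, it shows why the host rounds the workgroup count up and why each invocation needs a bounds check.

```javascript
// Hypothetical helpers for illustration only; nothing here touches the GPU.

// Number of workgroups needed to cover `elementCount` items,
// rounding up so the last partial group is still launched.
function workgroupCount(elementCount, workgroupSize) {
  return Math.ceil(elementCount / workgroupSize);
}

// Walk every invocation of a 1D dispatch the way the GPU would launch it
// and count how many pass the bounds check.
function simulateDispatch(elementCount, workgroupSize) {
  const groups = workgroupCount(elementCount, workgroupSize);
  let active = 0;
  let idle = 0;
  for (let workgroupId = 0; workgroupId < groups; workgroupId++) {
    for (let localId = 0; localId < workgroupSize; localId++) {
      // Same relationship as global_invocation_id in a 1D dispatch
      const globalId = workgroupId * workgroupSize + localId;
      if (globalId >= elementCount) {
        idle++; // this thread must return early or it would touch out-of-range data
      } else {
        active++;
      }
    }
  }
  return { groups, active, idle };
}

console.log(simulateDispatch(1000, 64)); // { groups: 16, active: 1000, idle: 24 }
```

With 1000 elements and a workgroup size of 64, 16 groups launch 1024 threads, so 24 of them fall past the end of the data. That is the case the early-return bounds check in a compute shader guards against.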
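### Modern applications

- **Three.js WebGPU renderer**: v160+.
- **Babylon.js**.
- **wgpu-rs**: native + web.
- **Hokusai** (Expo 2025 Osaka): 1M-particle fluid.
- **Million-component BIM platform**.

## 💻 Patterns

### Basic compute shader (WGSL)

```wgsl
// Add two arrays element-wise
@group(0) @binding(0) var<storage, read> input_a: array<f32>;
@group(0) @binding(1) var<storage, read> input_b: array<f32>;
@group(0) @binding(2) var<storage, read_write> output: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  let idx = id.x;
  // The last workgroup may extend past the end of the arrays
  if (idx >= arrayLength(&input_a)) { return; }
  output[idx] = input_a[idx] + input_b[idx];
}
```

### JavaScript dispatch (WebGPU)

```js
// Assumes `data` and `dataB` are Float32Arrays of equal length
// and `wgslSource` holds the WGSL shader above.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error('WebGPU is not available');
const device = await adapter.requestDevice();

// Buffers
const inputA = device.createBuffer({
  size: data.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
});
device.queue.writeBuffer(inputA, 0, data);

const inputB = device.createBuffer({
  size: dataB.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
});
device.queue.writeBuffer(inputB, 0, dataB);

// COPY_SRC lets the result be copied out later if a readback is needed
const output = device.createBuffer({
  size: data.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
});

// Pipeline
const module = device.createShaderModule({ code: wgslSource });
const pipeline = device.createComputePipeline({
  layout: 'auto',
  compute: { module, entryPoint: 'main' },
});

const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [
    { binding: 0, resource: { buffer: inputA } },
    { binding: 1, resource: { buffer: inputB } },
    { binding: 2, resource: { buffer: output } },
  ],
});

// Dispatch: one thread per element, 64 threads per workgroup
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
pass.dispatchWorkgroups(Math.ceil(data.length / 64));
pass.end();
device.queue.submit([encoder.finish()]);
```

### Particle system (Three.js WebGPU)

```js
// TSL import paths have changed between Three.js releases; adjust to your version.
import * as THREE from 'three/webgpu';
import { Fn, If, instanceIndex, storage } from 'three/tsl';

// Assumes `renderer` is an initialized THREE.WebGPURenderer and that
// N_PARTICLES, gravity, and dt are plain JS numbers.
const positions = storage(
  new THREE.StorageInstancedBufferAttribute(N_PARTICLES, 3), 'vec3', N_PARTICLES);
const velocities = storage(
  new THREE.StorageInstancedBufferAttribute(N_PARTICLES, 3), 'vec3', N_PARTICLES);

const particleUpdate = Fn(() => {
  const pos = positions.element(instanceIndex);
  const vel = velocities.element(instanceIndex);

  // Gravity accelerates the particle, then velocity moves it
  vel.y.subAssign(gravity * dt);
  pos.addAssign(vel.mul(dt));

  // Boundary: bounce off the floor, keeping 80% of the vertical speed
  If(pos.y.lessThan(0), () => {
    pos.y.assign(0);
    vel.y.mulAssign(-0.8);
  });
})().compute(N_PARTICLES);

await renderer.computeAsync(particleUpdate);
```

### Fluid simulation (SPH-style)

```wgsl
// Per-particle neighbor search + force computation.
// Particle, SimParams, and sph_force() are assumed to be defined elsewhere.
@group(0) @binding(0) var<storage, read_write> particles: array<Particle>;
@group(0) @binding(1) var<uniform> params: SimParams;

@compute @workgroup_size(64)
fn step(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  let count = arrayLength(&particles);
  if (i >= count) { return; }

  var force = vec3<f32>(0.0, -9.8, 0.0);

  // Brute-force neighbor sum (simplified; real SPH uses a spatial grid)
  for (var j = 0u; j < count; j++) {
    if (j == i) { continue; }
    let r = particles[j].pos - particles[i].pos;
    let d = length(r);
    if (d < params.smoothing_length) {
      force += sph_force(particles[i], particles[j], r, d);
    }
  }

  // Caveat: other invocations may still be reading pos while it is written here;
  // production code would double-buffer the particle array.
  particles[i].vel += force * params.dt;
  particles[i].pos += particles[i].vel * params.dt;
}
```

### GPU-driven culling (frustum)

```wgsl
// Instance, Camera, and in_frustum() are assumed to be defined elsewhere.
// DrawArgs follows the drawIndirect argument layout.
struct DrawArgs {
  vertex_count: u32,
  instance_count: atomic<u32>,
  first_vertex: u32,
  first_instance: u32,
}

@group(0) @binding(0) var<storage, read> instances: array<Instance>;
@group(0) @binding(1) var<storage, read_write> draw_args: array<DrawArgs>;
@group(0) @binding(2) var<uniform> camera: Camera;
// Added binding: the original used visible_indices without declaring it
@group(0) @binding(3) var<storage, read_write> visible_indices: array<u32>;

@compute @workgroup_size(64)
fn cull(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&instances)) { return; }

  if (in_frustum(instances[i].bounding_box, camera.frustum)) {
    // instance_count must be reset to 0 before each culling pass
    let slot = atomicAdd(&draw_args[0].instance_count, 1u);
    visible_indices[slot] = i;
  }
}
```

### Compute skinning (vertex transform pre-pass)

```wgsl
// Vertex (position, bone_idx, bone_weight) is assumed to be defined elsewhere.
@group(0) @binding(0) var<storage, read> bone_matrices: array<mat4x4<f32>>;
@group(0) @binding(1) var<storage, read> base_vertices: array<Vertex>;
@group(0) @binding(2) var<storage, read_write> skinned: array<vec4<f32>>;

@compute @workgroup_size(64)
fn skin(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&base_vertices)) { return; }

  let v = base_vertices[i];
  var pos = vec4<f32>(0.0);
  // Blend up to four bone transforms by weight
  for (var b = 0u; b < 4u; b++) {
    pos += bone_matrices[v.bone_idx[b]] * vec4<f32>(v.position, 1.0) * v.bone_weight[b];
  }
  skinned[i] = pos;
}
// The render pass then reads `skinned` as its vertex input.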
```

### Workgroup shared memory (reduction)

```wgsl
// Bindings added so the snippet is complete; the binding numbers are assumptions.
@group(0) @binding(0) var<storage, read> input: array<f32>;
@group(0) @binding(1) var<storage, read_write> output: array<f32>;

// Renamed from `shared`, which is on WGSL's reserved-word list.
var<workgroup> partial_sums: array<f32, 64>;

@compute @workgroup_size(64)
fn sum_reduce(
  @builtin(local_invocation_id) lid: vec3<u32>,
  @builtin(global_invocation_id) gid: vec3<u32>,
  @builtin(workgroup_id) wid: vec3<u32>,
) {
  // Out-of-range threads contribute 0 but still reach every barrier
  partial_sums[lid.x] = select(0.0, input[gid.x], gid.x < arrayLength(&input));
  workgroupBarrier();

  // Tree reduction: halve the number of active threads each step
  for (var stride = 32u; stride > 0u; stride >>= 1u) {
    if (lid.x < stride) {
      partial_sums[lid.x] += partial_sums[lid.x + stride];
    }
    workgroupBarrier();
  }

  // One partial sum per workgroup
  if (lid.x == 0u) {
    output[wid.x] = partial_sums[0];
  }
}
```

### Async render (Three.js)

```js
// Render only after the compute pass has finished
async function frame() {
  await renderer.computeAsync(particleUpdate);
  await renderer.renderAsync(scene, camera);
}
```

## 🤔 Decision criteria

| Situation | Approach |
|---|---|
| 100K+ particles | Compute shader |
| Fluid sim | Compute + storage texture |
| Frustum culling | GPU-driven culling |
| ML inference (browser) | WebGPU + WGSL |
| Image processing | Compute + storage texture |
| Skinned mesh (many) | Compute skinning |
| < 10K particles | CPU is fine |
| < 1000 instances | CPU instancing |

**Default**: WebGPU + Three.js v160+ for web. wgpu-rs for native.

## 🔗 Graph

- Parent: [[WebGPU]] · [[Computer-Graphics]]
- Variants: [[WGSL]] · [[GPU-driven Rendering]] · [[Indirect Draw]]
- Applications: [[Three.js]] · [[Particle-System]]
- Adjacent: [[CSS Animations]] · [[Web-Performance]] · [[Bottlenecks]] · [[Bioenergetics]] (energy-efficient)

## 🤖 LLM usage

**When to use**: GPU compute on the web. Large particle systems or simulations. GPU-driven rendering. In-browser ML.

**When not to use**: small tasks (CPU is fine). Projects that need a WebGL-only fallback.

## ❌ Anti-patterns

- **CPU-GPU readback every frame**: causes a sync stall.
- **Wrong workgroup size** (e.g., 8): underutilizes the hardware.
- **No barrier**: race condition.
- **Using storage textures without WebGPU**: unsupported.
- **Running compute and render synchronously**: stalls the pipeline.
- **No fallback for older browsers**: the page breaks.

## 🧪 Verification / duplicates

- Verified (WebGPU spec, Three.js webgpu, Hokusai exhibition).
- Trust level: A.
- Related: [[CSS Animations]] · [[Web-Performance]] · [[Bottlenecks]] · [[Baseline (Web Platform Features)]] · [[20k skinned instances demo]].

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-04-19 | Auto-mapped |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: workgroup + WGSL / Three.js / fluid / culling / skinning code |