| id | title | category | status | canonical_id | aliases | duplicate_of | source_trust_level | confidence_score | verification_status | tags | raw_sources | last_reinforced | github_commit | tech_stack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| wiki-2026-0508-compute-shader | Compute Shader (WebGPU) | 10_Wiki/Topics | verified | self | | none | A | 0.9 | applied | | | 2026-05-10 | pending | |
Compute Shader
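One-line summary
"Thousands of GPU cores working in parallel." WebGPU brings general-purpose GPU computing (GPGPU) to the web: particles, fluid simulation, culling, ML inference. Example figures: about 30 ms on the CPU for 10K particles versus about 2 ms on the GPU for 100K particles, which is roughly 150× the per-particle throughput (30 ms / 10K = 3 µs per particle; 2 ms / 100K = 0.02 µs per particle).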
Key points
Use cases
- Particle systems: millions of particles.
- Fluid simulation: SPH, grid-based.
- Cloth / soft-body.
- Procedural terrain.
- GPU-driven rendering: culling, indirect draw.
- Compute skinning: vertex transforms on the GPU.
- Image processing: blur, filters.
- GPGPU: ML inference, numerical computation.
Compared with vertex / fragment shaders
- Vertex: runs once per vertex.
- Fragment: runs once per pixel (fragment).
- Compute: arbitrary computation with read/write access to storage.
Core concepts
Workgroup
- A group of threads (invocations), e.g., 8×8×1 = 64 threads.
- Shares workgroup memory.
- Maps onto the hardware scheduling unit (warp / wavefront).
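The workgroup size fixes how many workgroups a dispatch needs: the item count per axis divided by the workgroup size, rounded up. A minimal sketch, where `workgroupCounts` is a hypothetical helper name and the item counts are illustrative:

```javascript
// Hypothetical helper: number of workgroups needed to cover `itemCounts`
// items on each axis, given the shader's @workgroup_size.
function workgroupCounts(itemCounts, workgroupSize) {
  return itemCounts.map((n, axis) => Math.ceil(n / workgroupSize[axis]));
}

// 1D particle list with @workgroup_size(64)
const counts1d = workgroupCounts([100000, 1, 1], [64, 1, 1]);
// 2D image with @workgroup_size(8, 8, 1), i.e. 64 threads per workgroup
const counts2d = workgroupCounts([1920, 1080, 1], [8, 8, 1]);

console.log(counts1d); // [1563, 1, 1]
console.log(counts2d); // [240, 135, 1]
```

Because the count is rounded up, the last workgroup usually contains invocations past the end of the data, which is why the shaders in this note start with a bounds check.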
Storage buffer / texture
- Readable and writable (a sampled texture is read-only).
- Essential for fluid simulation and similar workloads.
Workgroup variable (shared memory)
- Shared by all threads in a workgroup.
- Typically 10-100× faster than global memory, depending on the hardware.
- The basis for reduction and prefix sum.
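To make the prefix-sum idea concrete, here is a CPU emulation of a workgroup-level inclusive scan. It is a sketch of the Hillis-Steele variant, one common choice and not necessarily what a given engine uses; `workgroupInclusiveScan` is a hypothetical name.

```javascript
// CPU emulation of a workgroup-level inclusive prefix sum (Hillis-Steele scan).
// Each outer iteration stands for one step between two workgroupBarrier() calls.
function workgroupInclusiveScan(values) {
  let current = Array.from(values);
  for (let offset = 1; offset < current.length; offset *= 2) {
    // Every "thread" i reads the previous step's data, so write into a copy.
    const next = current.slice();
    for (let i = offset; i < current.length; i++) {
      next[i] = current[i] + current[i - offset];
    }
    current = next;
  }
  return current;
}

const scanned = workgroupInclusiveScan([3, 1, 4, 1, 5, 9, 2, 6]);
console.log(scanned); // [3, 4, 8, 9, 14, 23, 25, 31]
```

On the GPU the two arrays would live in workgroup memory, and the copy per step is what keeps threads from reading values that a neighbor has already overwritten.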
Indirect draw
- The GPU generates the draw commands.
- Minimizes CPU-GPU synchronization.
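For reference, the arguments that a non-indexed `drawIndirect()` call reads are four consecutive 32-bit unsigned integers. A minimal sketch of the initial buffer contents, where `makeDrawIndirectArgs` is a hypothetical helper and the vertex count is illustrative:

```javascript
// Layout read by drawIndirect(): vertexCount, instanceCount, firstVertex,
// firstInstance, each a 32-bit unsigned integer (16 bytes in total).
function makeDrawIndirectArgs({ vertexCount, instanceCount = 0, firstVertex = 0, firstInstance = 0 }) {
  return new Uint32Array([vertexCount, instanceCount, firstVertex, firstInstance]);
}

// instanceCount starts at 0 so that a culling pass can count visible instances into it.
const indirectArgs = makeDrawIndirectArgs({ vertexCount: 36 });
console.log(indirectArgs.byteLength); // 16
```

In a browser these bytes would be uploaded to a buffer created with `GPUBufferUsage.INDIRECT | GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST`, and the render pass would then call `drawIndirect(buffer, 0)`.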
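WGSL (WebGPU Shading Language)
- Syntax: Rust-like.
- Strictly typed.
- One language for vertex / fragment / compute.
Sync / async
- GPU work is asynchronous by default.
- Dependencies inside a workgroup need an explicit barrier (e.g., workgroupBarrier()); WebGPU itself orders dependent passes submitted to the queue.
- Readback to the CPU is expensive (avoid it).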
Modern applications
- Three.js WebGPU renderer: v160+.
- Babylon.js.
- wgpu-rs: native + web.
- Hokusai (Expo 2025 Osaka): 1M-particle fluid.
- Million-component BIM platform.
💻 Patterns
Basic compute shader (WGSL)
// Add two arrays element-wise
@group(0) @binding(0) var<storage, read> input_a: array<f32>;
@group(0) @binding(1) var<storage, read> input_b: array<f32>;
@group(0) @binding(2) var<storage, read_write> output: array<f32>;
@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
let idx = id.x;
if (idx >= arrayLength(&input_a)) { return; }
output[idx] = input_a[idx] + input_b[idx];
}
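The bounds check in the shader matters because a dispatch always launches whole workgroups. The CPU emulation below is only a sketch of that behavior, not how a GPU schedules work; `emulateAddDispatch` is a hypothetical name.

```javascript
// CPU emulation of the dispatch: every workgroup runs WORKGROUP_SIZE invocations,
// so the last group may contain invocations past the end of the arrays.
const WORKGROUP_SIZE = 64;

function emulateAddDispatch(inputA, inputB) {
  const output = new Float32Array(inputA.length);
  const workgroups = Math.ceil(inputA.length / WORKGROUP_SIZE);
  let skipped = 0;
  for (let group = 0; group < workgroups; group++) {
    for (let local = 0; local < WORKGROUP_SIZE; local++) {
      const idx = group * WORKGROUP_SIZE + local; // plays the role of global_invocation_id.x
      if (idx >= inputA.length) { skipped++; continue; } // same guard as the shader
      output[idx] = inputA[idx] + inputB[idx];
    }
  }
  return { output, workgroups, skipped };
}

const a = Float32Array.from({ length: 100 }, (_, i) => i);
const b = Float32Array.from({ length: 100 }, (_, i) => 2 * i);
const result = emulateAddDispatch(a, b);
console.log(result.workgroups, result.skipped, result.output[99]); // 2 28 297
```

With 100 elements and a workgroup size of 64, two workgroups run 128 invocations and 28 of them exit early at the guard.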
JavaScript dispatch (WebGPU)
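// Assumes `data` and `dataB` are equal-length Float32Arrays and `wgslSource`
// holds the WGSL above (`dataB` is an illustrative name).
const adapter = await navigator.gpu?.requestAdapter();
if (!adapter) throw new Error('WebGPU is not available in this browser');
const device = await adapter.requestDevice();
// Buffers: two read-only inputs and one output
const inputA = device.createBuffer({
  size: data.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
});
device.queue.writeBuffer(inputA, 0, data);
const inputB = device.createBuffer({
  size: dataB.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
});
device.queue.writeBuffer(inputB, 0, dataB);
const output = device.createBuffer({
  size: data.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
});
// Pipeline
const module = device.createShaderModule({ code: wgslSource });
const pipeline = device.createComputePipeline({
  layout: 'auto',
  compute: { module, entryPoint: 'main' },
});
const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [
    { binding: 0, resource: { buffer: inputA } },
    { binding: 1, resource: { buffer: inputB } },
    { binding: 2, resource: { buffer: output } },
  ],
});
// Dispatch: one workgroup per 64 elements, rounded up
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
pass.dispatchWorkgroups(Math.ceil(data.length / 64));
pass.end();
device.queue.submit([encoder.finish()]);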
Particle system (Three.js WebGPU)
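// Export locations vary by three.js release: newer versions export the TSL
// helpers (Fn, If, storage, instanceIndex) from 'three/tsl' instead.
import { Fn, If, instanceIndex, storage, StorageInstancedBufferAttribute } from 'three/webgpu';
// Assumes `renderer` is an initialized WebGPURenderer and that N_PARTICLES,
// gravity and dt are plain JS numbers defined elsewhere.
const positions = storage(new StorageInstancedBufferAttribute(N_PARTICLES, 3), 'vec3', N_PARTICLES);
const velocities = storage(new StorageInstancedBufferAttribute(N_PARTICLES, 3), 'vec3', N_PARTICLES);
const particleUpdate = Fn(() => {
  const pos = positions.element(instanceIndex);
  const vel = velocities.element(instanceIndex);
  // Gravity changes the velocity; the velocity then moves the particle
  vel.y.subAssign(gravity * dt);
  pos.addAssign(vel.mul(dt));
  // Boundary: bounce off the ground plane and lose 20% of the speed
  If(pos.y.lessThan(0), () => {
    pos.y.assign(0);
    vel.y.mulAssign(-0.8);
  });
})().compute(N_PARTICLES);
await renderer.computeAsync(particleUpdate);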
Fluid simulation (SPH-style)
// Struct layouts are inferred from the usage below; sph_force() is assumed
// to be defined elsewhere.
struct Particle { pos: vec3<f32>, vel: vec3<f32> }
struct SimParams { smoothing_length: f32, dt: f32 }
// For each particle: search its neighbors, then accumulate the forces
@group(0) @binding(0) var<storage, read_write> particles: array<Particle>;
@group(0) @binding(1) var<uniform> params: SimParams;
@compute @workgroup_size(64)
fn step(@builtin(global_invocation_id) id: vec3<u32>) {
    let i = id.x;
    if (i >= arrayLength(&particles)) { return; }
    var force = vec3<f32>(0.0, -9.8, 0.0);
    // Neighbor sum (simplified: real SPH uses a spatial grid).
    // Caution: other invocations may already have moved particles[j];
    // a double-buffered (ping-pong) layout avoids reading half-updated state.
    for (var j = 0u; j < arrayLength(&particles); j++) {
        if (j == i) { continue; }
        let r = particles[j].pos - particles[i].pos;
        let d = length(r);
        if (d < params.smoothing_length) {
            force += sph_force(particles[i], particles[j], r, d);
        }
    }
    particles[i].vel += force * params.dt;
    particles[i].pos += particles[i].vel * params.dt;
}
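The shader comment mentions that real SPH uses a spatial grid instead of the all-pairs loop. A CPU sketch of that idea follows, assuming a uniform grid whose cell size equals the smoothing length; `buildGrid` and `neighborsOf` are hypothetical names and the positions are illustrative.

```javascript
// Hypothetical uniform-grid neighbor search. Cell size equals the smoothing
// length, so all neighbors of a particle lie in its own cell or the 26 around it.
const cellOf = (value, cellSize) => Math.floor(value / cellSize);

function buildGrid(positions, cellSize) {
  const grid = new Map();
  positions.forEach(([x, y, z], index) => {
    const key = `${cellOf(x, cellSize)},${cellOf(y, cellSize)},${cellOf(z, cellSize)}`;
    if (!grid.has(key)) grid.set(key, []);
    grid.get(key).push(index);
  });
  return grid;
}

function neighborsOf(i, positions, grid, cellSize) {
  const [x, y, z] = positions[i];
  const found = [];
  for (let dx = -1; dx <= 1; dx++) {
    for (let dy = -1; dy <= 1; dy++) {
      for (let dz = -1; dz <= 1; dz++) {
        const key = `${cellOf(x, cellSize) + dx},${cellOf(y, cellSize) + dy},${cellOf(z, cellSize) + dz}`;
        for (const j of grid.get(key) ?? []) {
          if (j === i) continue;
          const [px, py, pz] = positions[j];
          if (Math.hypot(px - x, py - y, pz - z) < cellSize) found.push(j);
        }
      }
    }
  }
  return found.sort((p, q) => p - q);
}

const smoothingLength = 0.5;
const positions = [[0.1, 0.1, 0.1], [0.4, 0.1, 0.1], [0.7, 0.1, 0.1], [5, 5, 5]];
const grid = buildGrid(positions, smoothingLength);
console.log(neighborsOf(1, positions, grid, smoothingLength)); // [0, 2]
```

Each particle now inspects at most 27 cells instead of every other particle. A GPU version would typically store the grid as sorted cell indices in storage buffers, which this sketch does not attempt.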
GPU-driven culling (frustum)
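// InstanceData, Camera and in_frustum() are assumed to be defined elsewhere.
// DrawArgs follows the drawIndirect layout; instance_count must be atomic
// so that atomicAdd can be applied to it.
struct DrawArgs {
    vertex_count: u32,
    instance_count: atomic<u32>,
    first_vertex: u32,
    first_instance: u32,
}
@group(0) @binding(0) var<storage, read> instances: array<InstanceData>;
@group(0) @binding(1) var<storage, read_write> draw_args: array<DrawArgs>;
@group(0) @binding(2) var<uniform> camera: Camera;
// Compacted list of visible instance indices, read by the draw call
@group(0) @binding(3) var<storage, read_write> visible_indices: array<u32>;
@compute @workgroup_size(64)
fn cull(@builtin(global_invocation_id) id: vec3<u32>) {
    let i = id.x;
    if (i >= arrayLength(&instances)) { return; }
    if (in_frustum(instances[i].bounding_box, camera.frustum)) {
        // Reserve the next free slot; instance_count is reset to 0 before each pass
        let slot = atomicAdd(&draw_args[0].instance_count, 1u);
        visible_indices[slot] = i;
    }
}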
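The shader leaves `in_frustum` undefined. One common way to implement it is an AABB-versus-plane test, sketched here on the CPU. The plane convention (`[nx, ny, nz, d]`, inside when `n·p + d >= 0`), the name `aabbInFrustum`, and the box-shaped stand-in frustum are all assumptions for illustration.

```javascript
// Hypothetical CPU counterpart of in_frustum(): a box is culled when it lies
// entirely behind any one of the six planes.
function aabbInFrustum(box, planes) {
  return planes.every(([nx, ny, nz, d]) => {
    // Pick the box corner that lies farthest along the plane normal.
    const px = nx >= 0 ? box.max[0] : box.min[0];
    const py = ny >= 0 ? box.max[1] : box.min[1];
    const pz = nz >= 0 ? box.max[2] : box.min[2];
    return nx * px + ny * py + nz * pz + d >= 0;
  });
}

// Stand-in "frustum": the cube -10 <= x, y, z <= 10 (real camera planes are slanted).
const planes = [
  [1, 0, 0, 10], [-1, 0, 0, 10],
  [0, 1, 0, 10], [0, -1, 0, 10],
  [0, 0, 1, 10], [0, 0, -1, 10],
];
const boxes = [
  { min: [-1, -1, -1], max: [1, 1, 1] }, // fully inside
  { min: [9, 0, 0], max: [12, 1, 1] },   // straddles the +x plane
  { min: [20, 0, 0], max: [22, 1, 1] },  // fully outside
];

// Same compaction the shader performs with atomicAdd, done sequentially here.
const visibleIndices = boxes.flatMap((box, i) => (aabbInFrustum(box, planes) ? [i] : []));
console.log(visibleIndices); // [0, 1]
```

The test is conservative: a box near a frustum corner can pass even though it is not visible, which costs some extra drawing but never hides a visible instance.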
Compute skinning (vertex transform pre-pass)
// Vertex layout is inferred from the usage below.
struct Vertex {
    position: vec3<f32>,
    bone_idx: vec4<u32>,
    bone_weight: vec4<f32>,
}
@group(0) @binding(0) var<storage, read> bone_matrices: array<mat4x4<f32>>;
@group(0) @binding(1) var<storage, read> base_vertices: array<Vertex>;
@group(0) @binding(2) var<storage, read_write> skinned: array<vec4<f32>>;
@compute @workgroup_size(64)
fn skin(@builtin(global_invocation_id) id: vec3<u32>) {
    let i = id.x;
    if (i >= arrayLength(&base_vertices)) { return; }
    let v = base_vertices[i];
    var pos = vec4<f32>(0.0);
    // Blend up to four bone transforms by weight (linear blend skinning)
    for (var b = 0u; b < 4u; b++) {
        pos += bone_matrices[v.bone_idx[b]] * vec4<f32>(v.position, 1.0) * v.bone_weight[b];
    }
    skinned[i] = pos;
}
// The render pass then reads `skinned`.
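A CPU reference for a single vertex is handy for checking the shader's output. The sketch below assumes column-major matrices (as in WGSL) and four influences per vertex; `mulMat4Vec4` and `skinVertex` are hypothetical names and the bone data is illustrative.

```javascript
// Multiply a column-major 4x4 matrix (16 numbers) by a vec4.
function mulMat4Vec4(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * v[col];
    }
  }
  return out;
}

// Linear blend skinning for one vertex, mirroring the loop in skin().
function skinVertex(vertex, boneMatrices) {
  const pos = [0, 0, 0, 0];
  for (let b = 0; b < 4; b++) {
    const transformed = mulMat4Vec4(boneMatrices[vertex.boneIdx[b]], [...vertex.position, 1]);
    for (let k = 0; k < 4; k++) pos[k] += transformed[k] * vertex.boneWeight[b];
  }
  return pos;
}

const identity = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];
const shiftX2 = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 2, 0, 0, 1]; // translation by (2, 0, 0)
const skinnedVertex = skinVertex(
  { position: [1, 0, 0], boneIdx: [0, 1, 0, 0], boneWeight: [0.5, 0.5, 0, 0] },
  [identity, shiftX2],
);
console.log(skinnedVertex); // [2, 0, 0, 1]
```

Half of the identity result (1, 0, 0) plus half of the shifted result (3, 0, 0) gives (2, 0, 0); the weights sum to 1, so w stays 1.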
Workgroup shared memory (reduction)
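// Binding slots are illustrative.
@group(0) @binding(0) var<storage, read> input: array<f32>;
@group(0) @binding(1) var<storage, read_write> output: array<f32>;
// Named `scratch` because `shared` is a reserved word in WGSL
var<workgroup> scratch: array<f32, 64>;
@compute @workgroup_size(64)
fn sum_reduce(
    @builtin(local_invocation_id) lid: vec3<u32>,
    @builtin(global_invocation_id) gid: vec3<u32>,
    @builtin(workgroup_id) wid: vec3<u32>,
) {
    // Load one element per thread; pad with 0.0 past the end of the input
    var v = 0.0;
    if (gid.x < arrayLength(&input)) { v = input[gid.x]; }
    scratch[lid.x] = v;
    workgroupBarrier();
    // Tree reduction: halve the active range at every step
    for (var stride = 32u; stride > 0u; stride >>= 1u) {
        if (lid.x < stride) {
            scratch[lid.x] += scratch[lid.x + stride];
        }
        workgroupBarrier();
    }
    // Thread 0 writes one partial sum per workgroup
    if (lid.x == 0u) {
        output[wid.x] = scratch[0];
    }
}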
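The reduction produces one partial sum per workgroup, not the final total. The CPU emulation below is a sketch for checking GPU results; it assumes the same 64-wide workgroup and zero padding for the last group, and `emulateSumReduce` is a hypothetical name.

```javascript
const REDUCE_WORKGROUP_SIZE = 64;

// Emulates sum_reduce on the CPU: returns one partial sum per workgroup.
function emulateSumReduce(input) {
  const groups = Math.ceil(input.length / REDUCE_WORKGROUP_SIZE);
  const partials = new Array(groups).fill(0);
  for (let wid = 0; wid < groups; wid++) {
    // Load phase: one element per "thread", zero-padded past the end.
    const scratch = new Float64Array(REDUCE_WORKGROUP_SIZE);
    for (let lid = 0; lid < REDUCE_WORKGROUP_SIZE; lid++) {
      const gid = wid * REDUCE_WORKGROUP_SIZE + lid;
      scratch[lid] = gid < input.length ? input[gid] : 0;
    }
    // Tree reduction: each stride corresponds to one barrier-separated step.
    for (let stride = REDUCE_WORKGROUP_SIZE >> 1; stride > 0; stride >>= 1) {
      for (let lid = 0; lid < stride; lid++) {
        scratch[lid] += scratch[lid + stride];
      }
    }
    partials[wid] = scratch[0];
  }
  return partials;
}

const values = Array.from({ length: 200 }, (_, i) => i + 1); // 1..200
const partials = emulateSumReduce(values);
const total = partials.reduce((sum, x) => sum + x, 0);
console.log(partials, total); // [2080, 6176, 10272, 1572] 20100
```

On the GPU the partial sums still need a second pass (or a small CPU-side sum after readback) to produce the total.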
Async render (Three.js)
// Render only after the compute pass has finished
async function frame() {
await renderer.computeAsync(particleUpdate);
await renderer.renderAsync(scene, camera);
}
🤔 Decision criteria
| Situation | Approach |
|---|---|
| 100K+ particle | Compute shader |
| Fluid sim | Compute + storage texture |
| Frustum culling | GPU-driven culling |
| ML inference (browser) | WebGPU + WGSL |
| Image processing | Compute + storage texture |
| Skinned mesh (many) | Compute skinning |
| < 10K particle | CPU OK |
| < 1000 instance | CPU instance |
Default: WebGPU + Three.js v160+ for web. wgpu-rs for native.
🔗 Graph
- Parent: WebGPU · Computer-Graphics
- Variants: WGSL · GPU-driven Rendering · Indirect Draw
- Applications: Three.js · Particle-System
- Adjacent: CSS Animations · Web-Performance · Bottlenecks · Bioenergetics (energy-efficient)
🤖 LLM usage
When to use: GPU compute on the web, large particle counts or simulations, GPU-driven rendering, in-browser ML.
When not to use: small tasks (the CPU is fine), or when a WebGL-only fallback is required.
❌ Anti-patterns
- CPU-GPU readback every frame: causes sync stalls.
- Wrong workgroup size (e.g., 8): underutilizes the GPU.
- Missing barrier: race condition.
- Storage textures without WebGPU: unsupported.
- Synchronous compute + render: stalls.
- No fallback for older browsers: the page breaks.
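For the missing-fallback case, a feature check before choosing a backend is usually enough. A minimal sketch, where `pickBackend` is a hypothetical helper that takes navigator-like and canvas-like objects so it can run outside a browser:

```javascript
// Hypothetical backend picker. `nav` is a navigator-like object and
// `canvas` a canvas-like object; both are passed in so the check is testable.
function pickBackend(nav, canvas) {
  if (nav && nav.gpu) return 'webgpu';
  if (canvas && typeof canvas.getContext === 'function' && canvas.getContext('webgl2')) {
    return 'webgl2';
  }
  return 'cpu';
}

// Stand-ins for three kinds of browser environment.
const modern = pickBackend({ gpu: {} }, null);
const older = pickBackend({}, { getContext: (kind) => (kind === 'webgl2' ? {} : null) });
const minimal = pickBackend({}, { getContext: () => null });
console.log(modern, older, minimal); // webgpu webgl2 cpu
```

In a real page the call would be `pickBackend(navigator, document.createElement('canvas'))`. The presence of `navigator.gpu` does not guarantee that `requestAdapter()` returns an adapter, so the WebGPU path still needs its own null check.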
🧪 Verification / duplicates
- Verified (WebGPU spec, Three.js webgpu, Hokusai exhibition).
- Trust level: A.
- Related: CSS Animations · Web-Performance · Bottlenecks · Baseline (Web Platform Features) · 20k skinned instances demo.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-04-19 | Auto-mapped |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: workgroup + WGSL / Three.js / fluid / culling / skinning code |