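koriweb d8a80f6272 chore(wiki): canonical normalization of dangling links (768 files / 1,200 links)

Replaces [[wikilinks]] that differ only in spelling (notation variants) with the canonical title of the target document, reconnecting 1,200 broken links. Only normalized title/filename matches are applied; alias matching is excluded because of the risk of over-merging (ambiguity guard). Originals are backed up in _link_reconcile_backup/.
Tool: Datacollect/scripts/link_reconcile_apply.mjs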

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-08 12:24:15 +09:00


id, title, category, status, canonical_id, aliases, duplicate_of, source_trust_level, confidence_score, verification_status, tags, raw_sources, last_reinforced, github_commit, tech_stack

title

Compute Shader

In one line

"Thousands of GPU cores working in parallel." WebGPU brings GPGPU to the web: particles, fluid simulation, culling, ML inference. Example figures: CPU 30 ms for 10K particles vs. GPU 2 ms for 100K particles, i.e. about 150× higher per-particle throughput (10× the particles in 1/15 of the time).

Key points

Use cases

Particle system: millions of particles.
Fluid simulation: SPH, grid-based.
Cloth / soft-body.
Procedural terrain.
GPU-driven rendering: culling, indirect draw.
Compute skinning: vertex transforms on the GPU.
Image processing: blur, filters.
GPGPU: ML inference, numerical computation.

Compared with vertex / fragment shaders

Vertex: runs once per vertex.
Fragment: runs once per pixel.
Compute: arbitrary computation with read/write access to storage.

Core concepts

Workgroup

A group of threads (e.g., 8×8×1 = 64 threads).
Threads in the same group share workgroup memory.
Mapped onto hardware execution units (warps / wavefronts).
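
To make the workgroup arithmetic concrete, here is a minimal CPU-side sketch in JavaScript. The helper names `workgroupCount` and `globalInvocationId` are hypothetical and not part of any API; the sketch only mirrors how a global invocation ID is derived from the workgroup ID, the workgroup size, and the local ID.

```javascript
// Number of workgroups needed to cover `itemCount` items (ceiling division).
function workgroupCount(itemCount, workgroupSize) {
  return Math.ceil(itemCount / workgroupSize);
}

// Per axis: global_invocation_id = workgroup_id * workgroup_size + local_invocation_id.
function globalInvocationId(workgroupId, localId, workgroupSize) {
  return workgroupId.map((w, axis) => w * workgroupSize[axis] + localId[axis]);
}

const size = [8, 8, 1];                                   // 8 x 8 x 1 = 64 threads per workgroup
const threadsPerGroup = size[0] * size[1] * size[2];      // 64
const groups = workgroupCount(100_000, threadsPerGroup);  // 1563 groups cover 100,000 items
const id = globalInvocationId([2, 1, 0], [3, 5, 0], size); // [19, 13, 0]
```

Because 1563 × 64 is slightly more than 100,000, the last workgroup has idle threads, which is why the shaders on this page start with a bounds check.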

Storage buffer / texture

Readable and writable (a sampled texture is read-only).
Essential for fluid simulation and similar workloads.

Workgroup variable (shared memory)

Shared among the threads of one workgroup.
Roughly 10-100× faster than global memory.
The basis for reduction and prefix sum.
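
As an illustration of the prefix-sum pattern mentioned above, the following is a CPU simulation in JavaScript of a Hillis-Steele inclusive scan, one common way to scan inside a single workgroup. The function name `inclusiveScan` is hypothetical, and copying the array on each step stands in for the barrier a GPU version would place between steps.

```javascript
// CPU simulation of a Hillis-Steele inclusive scan over one workgroup's data.
// Each iteration of the outer loop corresponds to one barrier-separated step on the GPU.
function inclusiveScan(values) {
  let current = Float64Array.from(values);
  for (let offset = 1; offset < current.length; offset *= 2) {
    const next = current.slice();          // double buffer: read `current`, write `next`
    for (let lane = offset; lane < current.length; lane++) {
      next[lane] = current[lane] + current[lane - offset];
    }
    current = next;                         // stands in for workgroupBarrier()
  }
  return Array.from(current);
}

const scanned = inclusiveScan([3, 1, 4, 1, 5, 9, 2, 6]);
// scanned is [3, 4, 8, 9, 14, 23, 25, 31]
```

The scan needs log2(N) steps, so a 64-thread workgroup finishes in 6 steps instead of the 63 additions a serial loop would chain together.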

Indirect draw

The GPU generates its own draw commands.
Minimizes CPU-GPU synchronization.
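
A small sketch of what "the GPU generates its own draw commands" means in terms of data: a non-indexed indirect draw in WebGPU reads four consecutive 32-bit unsigned integers from a buffer. The helper name `makeDrawIndirectArgs` is hypothetical, and the buffer upload is only described in comments because it needs a live device.

```javascript
// Argument layout read by GPURenderPassEncoder.drawIndirect():
// [vertexCount, instanceCount, firstVertex, firstInstance], each a u32.
function makeDrawIndirectArgs(vertexCount, instanceCount = 0, firstVertex = 0, firstInstance = 0) {
  return new Uint32Array([vertexCount, instanceCount, firstVertex, firstInstance]);
}

// Example: a 36-vertex mesh with zero visible instances so far.
const args = makeDrawIndirectArgs(36);

// In a WebGPU app (not run here), `args` would be written to a buffer created with
// GPUBufferUsage.INDIRECT | GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST.
// A compute pass then raises instanceCount, and the render pass calls
// pass.drawIndirect(argsBuffer, 0) without the CPU ever reading the count.
```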

WGSL (WebGPU Shading Language)

Syntax: Rust-like.
Strictly typed.
One language for vertex, fragment, and compute stages.

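Sync / async

GPU work is asynchronous by default.
Dependencies between threads of a workgroup need an explicit barrier (e.g., workgroupBarrier()).
Readback to the CPU is expensive; avoid it where possible.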
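
When a readback cannot be avoided, it usually goes through a mappable staging buffer. The sketch below is a minimal example under stated assumptions: `readBackFloats` is a hypothetical helper, `device` is a `GPUDevice`, and `source` is a storage buffer that was created with `GPUBufferUsage.COPY_SRC`.

```javascript
// Copy `byteLength` bytes from a GPU buffer into a Float32Array on the CPU.
// Assumes `source` was created with GPUBufferUsage.COPY_SRC.
async function readBackFloats(device, source, byteLength) {
  // Storage buffers cannot be mapped directly, so copy into a staging buffer first.
  const staging = device.createBuffer({
    size: byteLength,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });
  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(source, 0, staging, 0, byteLength);
  device.queue.submit([encoder.finish()]);

  // mapAsync resolves only after the GPU has finished the copy: this is the costly wait.
  await staging.mapAsync(GPUMapMode.READ);
  const result = new Float32Array(staging.getMappedRange().slice(0)); // copy before unmapping
  staging.unmap();
  staging.destroy();
  return result;
}
```

Awaiting this every frame serializes the CPU behind the GPU, so results that are only needed for rendering are better left in GPU buffers.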

Modern applications

Three.js WebGPU renderer: v160+.
Babylon.js.
wgpu-rs: native + web.
Hokusai (Expo 2025 Osaka): 1M-particle fluid.
Million-component BIM platform.

💻 Patterns

Basic compute shader (WGSL)

// Add two arrays element-wise
@group(0) @binding(0) var<storage, read> input_a: array<f32>;
@group(0) @binding(1) var<storage, read> input_b: array<f32>;
@group(0) @binding(2) var<storage, read_write> output: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  let idx = id.x;
  if (idx >= arrayLength(&input_a)) { return; }
  output[idx] = input_a[idx] + input_b[idx];
}

JavaScript dispatch (WebGPU)

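// Assumes `data` and `dataB` are equal-length Float32Arrays and `wgslSource` holds the shader above.
const adapter = await navigator.gpu?.requestAdapter();
if (!adapter) throw new Error('WebGPU is not available in this browser');
const device = await adapter.requestDevice();

// Buffers: two uploaded inputs and one output the shader writes to
const createInput = (values) => {
  const buffer = device.createBuffer({
    size: values.byteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(buffer, 0, values);
  return buffer;
};
const inputA = createInput(data);
const inputB = createInput(dataB);
const output = device.createBuffer({
  size: data.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC, // COPY_SRC allows a later readback
});

// Pipeline
const module = device.createShaderModule({ code: wgslSource });
const pipeline = device.createComputePipeline({
  layout: 'auto',
  compute: { module, entryPoint: 'main' },
});

const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [
    { binding: 0, resource: { buffer: inputA } },
    { binding: 1, resource: { buffer: inputB } },
    { binding: 2, resource: { buffer: output } },
  ],
});

// Dispatch: one workgroup per 64 elements, matching @workgroup_size(64)
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
pass.dispatchWorkgroups(Math.ceil(data.length / 64));
pass.end();
device.queue.submit([encoder.finish()]);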

Particle system (Three.js WebGPU)

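// Import paths vary between three.js releases; recent ones expose TSL from 'three/tsl'.
import * as THREE from 'three/webgpu';
import { Fn, If, instanceIndex, storage } from 'three/tsl';

// Assumes `renderer` is an initialized THREE.WebGPURenderer and that
// N_PARTICLES, dt, and gravity are plain JavaScript numbers.
const positionAttribute = new THREE.StorageInstancedBufferAttribute(N_PARTICLES, 3);
const velocityAttribute = new THREE.StorageInstancedBufferAttribute(N_PARTICLES, 3);
const positions = storage(positionAttribute, 'vec3', N_PARTICLES);
const velocities = storage(velocityAttribute, 'vec3', N_PARTICLES);

const particleUpdate = Fn(() => {
  const pos = positions.element(instanceIndex);
  const vel = velocities.element(instanceIndex);
  vel.y.subAssign(gravity * dt); // gravity changes velocity, not position
  pos.addAssign(vel.mul(dt));
  // Boundary: bounce off the floor at y = 0 and lose 20% of the speed
  If(pos.y.lessThan(0), () => {
    pos.y.assign(0);
    vel.y.mulAssign(-0.8);
  });
})().compute(N_PARTICLES);

// computeAsync resolves once the compute work has been submitted; it does not return a buffer.
await renderer.computeAsync(particleUpdate);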

Fluid simulation (SPH-style)

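// For each particle: search its neighbors and accumulate forces.
// `Particle` (with `pos` and `vel`), `SimParams`, and `sph_force` are assumed to be declared elsewhere.
// Reads use `particles_in` and writes use `particles_out` (ping-pong buffers), so no thread
// reads a position that another thread is updating in the same dispatch.
@group(0) @binding(0) var<storage, read> particles_in: array<Particle>;
@group(0) @binding(1) var<storage, read_write> particles_out: array<Particle>;
@group(0) @binding(2) var<uniform> params: SimParams;

@compute @workgroup_size(64)
fn step(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  let count = arrayLength(&particles_in);
  if (i >= count) { return; }

  let p = particles_in[i];
  var force = vec3<f32>(0.0, -9.8, 0.0); // gravity

  // Brute-force O(N^2) neighbor sum (simplified; real SPH uses a spatial grid)
  for (var j = 0u; j < count; j++) {
    if (j == i) { continue; }
    let r = particles_in[j].pos - p.pos;
    let d = length(r);
    if (d < params.smoothing_length) {
      force += sph_force(p, particles_in[j], r, d);
    }
  }

  var next = p;
  next.vel += force * params.dt;
  next.pos += next.vel * params.dt;
  particles_out[i] = next;
}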
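
The comment in the shader above notes that real SPH uses a spatial grid instead of testing every pair. The following CPU sketch in JavaScript shows the idea under simple assumptions: cells are cubes whose edge equals the smoothing length `h`, so every neighbor within `h` lies in one of the 27 surrounding cells. `buildGrid` and `neighbors` are hypothetical names, and a GPU version would typically use sorted cell indices in storage buffers rather than a `Map`.

```javascript
// Bucket particle indices by grid cell; the cell edge equals the smoothing length h.
function buildGrid(positions, h) {
  const grid = new Map();
  positions.forEach(([x, y, z], index) => {
    const key = `${Math.floor(x / h)},${Math.floor(y / h)},${Math.floor(z / h)}`;
    if (!grid.has(key)) grid.set(key, []);
    grid.get(key).push(index);
  });
  return grid;
}

// Visit the 27 cells around particle i and keep the particles closer than h.
function neighbors(positions, grid, i, h) {
  const [x, y, z] = positions[i];
  const [cx, cy, cz] = [x, y, z].map((v) => Math.floor(v / h));
  const result = [];
  for (let dx = -1; dx <= 1; dx++) {
    for (let dy = -1; dy <= 1; dy++) {
      for (let dz = -1; dz <= 1; dz++) {
        for (const j of grid.get(`${cx + dx},${cy + dy},${cz + dz}`) ?? []) {
          const d = Math.hypot(positions[j][0] - x, positions[j][1] - y, positions[j][2] - z);
          if (j !== i && d < h) result.push(j);
        }
      }
    }
  }
  return result.sort((a, b) => a - b);
}

const pts = [[0.1, 0.1, 0.1], [0.4, 0.1, 0.1], [0.6, 0.1, 0.1], [1.2, 0.1, 0.1], [5, 5, 5]];
const grid = buildGrid(pts, 0.5);
const near = neighbors(pts, grid, 1, 0.5); // [0, 2]
```

With a roughly uniform particle density, each particle inspects a bounded number of candidates, so the per-step cost grows close to linearly with the particle count instead of quadratically.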

GPU-driven culling (frustum)

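// `InstanceData`, `Camera`, and `in_frustum` are assumed to be declared elsewhere in the module.
// DrawArgs mirrors the drawIndirect layout; instance_count is atomic so threads can append safely.
struct DrawArgs {
  vertex_count: u32,
  instance_count: atomic<u32>,
  first_vertex: u32,
  first_instance: u32,
}

@group(0) @binding(0) var<storage, read> instances: array<InstanceData>;
@group(0) @binding(1) var<storage, read_write> draw_args: array<DrawArgs>;
@group(0) @binding(2) var<uniform> camera: Camera;
@group(0) @binding(3) var<storage, read_write> visible_indices: array<u32>;

@compute @workgroup_size(64)
fn cull(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&instances)) { return; }

  if (in_frustum(instances[i].bounding_box, camera.frustum)) {
    // Reserve the next free slot; instance_count must be reset to 0 before each dispatch.
    let slot = atomicAdd(&draw_args[0].instance_count, 1u);
    visible_indices[slot] = i;
  }
}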
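
The shader above relies on an `in_frustum` helper that this page does not define. As a hedged reference, here is one common formulation in JavaScript: an axis-aligned box is tested against six planes using the box corner farthest along each plane normal. The data shapes (`{ normal, d }` planes, `{ min, max }` boxes) and the name `inFrustum` are assumptions made for this sketch.

```javascript
// Plane: { normal: [nx, ny, nz], d }, where points inside satisfy dot(normal, p) + d >= 0.
// Box:   { min: [x, y, z], max: [x, y, z] }.
function inFrustum(box, planes) {
  return planes.every(({ normal, d }) => {
    // Pick the box corner farthest along the plane normal (the "positive vertex").
    const corner = normal.map((n, axis) => (n >= 0 ? box.max[axis] : box.min[axis]));
    const distance = normal[0] * corner[0] + normal[1] * corner[1] + normal[2] * corner[2] + d;
    return distance >= 0; // if even this corner is outside, the whole box is outside
  });
}

// A cube-shaped "frustum" for illustration: -1 <= x, y, z <= 1.
const planes = [
  { normal: [1, 0, 0], d: 1 }, { normal: [-1, 0, 0], d: 1 },
  { normal: [0, 1, 0], d: 1 }, { normal: [0, -1, 0], d: 1 },
  { normal: [0, 0, 1], d: 1 }, { normal: [0, 0, -1], d: 1 },
];
const visible = inFrustum({ min: [0.5, 0.5, 0.5], max: [1.5, 1.5, 1.5] }, planes); // true
const culled = inFrustum({ min: [2, 0, 0], max: [3, 1, 1] }, planes);              // false
```

The test is conservative: it never rejects a visible box, but it can keep a box that sits just outside a frustum corner, which is usually an acceptable trade for its low cost.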

Compute skinning (vertex transform pre-pass)

// `Vertex` is assumed to hold position: vec3<f32>, bone_idx: vec4<u32>, bone_weight: vec4<f32>.
@group(0) @binding(0) var<storage, read> bone_matrices: array<mat4x4<f32>>;
@group(0) @binding(1) var<storage, read> base_vertices: array<Vertex>;
@group(0) @binding(2) var<storage, read_write> skinned: array<vec4<f32>>;

@compute @workgroup_size(64)
fn skin(@builtin(global_invocation_id) id: vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&base_vertices)) { return; }
  let v = base_vertices[i];

  // Blend the vertex position across its four bone influences.
  var pos = vec4<f32>(0.0);
  for (var b = 0u; b < 4u; b++) {
    pos += bone_matrices[v.bone_idx[b]] * vec4<f32>(v.position, 1.0) * v.bone_weight[b];
  }

  skinned[i] = pos;
}

// The render pass then reads `skinned` as its vertex data.

Workgroup shared memory (reduction)

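@group(0) @binding(0) var<storage, read> input: array<f32>;
@group(0) @binding(1) var<storage, read_write> output: array<f32>;

// `shared` is a reserved word in WGSL, so the workgroup array is named `scratch`.
var<workgroup> scratch: array<f32, 64>;

@compute @workgroup_size(64)
fn sum_reduce(
  @builtin(local_invocation_id) lid: vec3<u32>,
  @builtin(global_invocation_id) gid: vec3<u32>,
  @builtin(workgroup_id) wid: vec3<u32>,
) {
  // Out-of-range lanes contribute 0 so a partial final workgroup still sums correctly.
  var value = 0.0;
  if (gid.x < arrayLength(&input)) { value = input[gid.x]; }
  scratch[lid.x] = value;
  workgroupBarrier();

  // Tree reduction: halve the number of active lanes each step
  for (var stride = 32u; stride > 0u; stride >>= 1u) {
    if (lid.x < stride) {
      scratch[lid.x] += scratch[lid.x + stride];
    }
    workgroupBarrier();
  }

  // Lane 0 writes one partial sum per workgroup.
  if (lid.x == 0u) {
    output[wid.x] = scratch[0];
  }
}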
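
A CPU reference is handy for checking what the reduction shader should produce. The sketch below, with the hypothetical name `workgroupSums`, replays the same load and tree phases in JavaScript and returns one partial sum per 64-element chunk; summing those partials (on the CPU or in a second dispatch) gives the total.

```javascript
// CPU reference for the per-workgroup tree reduction: one partial sum per chunk.
function workgroupSums(input, workgroupSize = 64) {
  const sums = [];
  for (let base = 0; base < input.length; base += workgroupSize) {
    // Load phase: lanes past the end of the input contribute 0.
    const scratch = Array.from({ length: workgroupSize }, (_, lane) => input[base + lane] ?? 0);
    // Tree phase: halve the number of active lanes each step.
    for (let stride = workgroupSize >> 1; stride > 0; stride >>= 1) {
      for (let lane = 0; lane < stride; lane++) {
        scratch[lane] += scratch[lane + stride];
      }
    }
    sums.push(scratch[0]);
  }
  return sums;
}

const data = Array.from({ length: 100 }, (_, i) => i + 1); // 1..100
const partial = workgroupSums(data);                        // [2080, 2970]
const total = partial.reduce((a, b) => a + b, 0);           // 5050
```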

Async render (Three.js)

// Render only after the compute pass has been submitted
async function frame() {
  await renderer.computeAsync(particleUpdate);
  await renderer.renderAsync(scene, camera);
}

🤔 Decision criteria

Situation	Approach
100K+ particles	Compute shader
Fluid sim	Compute + storage texture
Frustum culling	GPU-driven culling
ML inference (browser)	WebGPU + WGSL
Image processing	Compute + storage texture
Skinned meshes (many)	Compute skinning
< 10K particles	CPU is fine
< 1000 instances	CPU instancing

Default: WebGPU + Three.js v160+ for the web; wgpu-rs for native.

🔗 Graph

Parent: WebGPU · Computer-Graphics
Variants: WGSL · GPU-driven Rendering · Indirect Draw
Applications: Three.js · Particle-System
Adjacent: CSS Animations · Web-Performance · Bottlenecks · Bioenergetics (energy-efficient)

🤖 LLM usage

When to use: GPU compute on the web, large particle systems or simulations, GPU-driven rendering, in-browser ML. When not to use: small tasks (the CPU is fine), or when a WebGL-only fallback is required.

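❌ Anti-patterns

CPU-GPU readback every frame: causes a sync stall.
Wrong workgroup size (e.g., 8, smaller than a warp / wavefront): underutilizes the hardware.
No barrier between dependent shared-memory accesses: race condition.
Using storage textures without WebGPU: unsupported.
Waiting synchronously on compute before rendering: stalls the frame.
No fallback for older browsers: the page breaks.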
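
For the missing-fallback anti-pattern, a feature check along these lines is one option. It is a minimal sketch: `pickBackend` is a hypothetical helper that takes a navigator-like and a canvas-like object, so it can be exercised outside a browser; in a page you would pass `navigator` and a real canvas element.

```javascript
// Decide which backend to use before building any pipelines.
async function pickBackend(nav, canvas) {
  if (nav.gpu) {
    // requestAdapter() can resolve to null even when navigator.gpu exists.
    const adapter = await nav.gpu.requestAdapter();
    if (adapter) return 'webgpu'; // compute shaders available
  }
  if (canvas.getContext('webgl2')) return 'webgl2'; // no compute: keep the simulation on the CPU
  return 'cpu';
}
```

The returned string can then select between the compute path and a reduced CPU path, for example the "< 10K particles" case from the decision table.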

🧪 Verification / duplicates

Verified (WebGPU spec, Three.js webgpu, Hokusai exhibition).
Confidence: A.
Related: CSS Animations · Web-Performance · Bottlenecks · Baseline (Web Platform Features) · 20k skinned instances demo.

🕓 Changelog

Date	Change
2026-04-19	Auto-mapped
2026-05-08	Phase 1
2026-05-10	Manual cleanup: workgroup + WGSL / Three.js / fluid / culling / skinning code
