Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization

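10_Wiki/Topics large-scale cleanup:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Link-name normalization: fixed broken links, pointed links straight at redirect targets, and mapped concepts (~2,400 cases)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections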

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 23:52:15 +09:00



Indirect Draw

In One Line

"An indirect draw reads its draw-call arguments from a GPU buffer, so the GPU dispatches its own work without a CPU round trip." As of 2026 this is the foundation of GPU-driven rendering pipelines and is supported by Vulkan, D3D12, Metal, and WebGPU. Culling, LOD selection, and instancing are decided on the GPU, which removes most of the CPU draw-call overhead.

Core

vs Direct Draw

Direct: draw(vertexCount, instanceCount, firstVertex, firstInstance), with args supplied by the CPU.
Indirect: drawIndirect(buffer, offset), with args read from a GPU buffer.
Multi-draw indirect (MDI): thousands of draws issued by one CPU command.

Args Layout (WebGPU)

struct DrawIndirectArgs {
  vertexCount: u32,
  instanceCount: u32,
  firstVertex: u32,
  firstInstance: u32,
}
struct DrawIndexedIndirectArgs {
  indexCount: u32,
  instanceCount: u32,
  firstIndex: u32,
  baseVertex: i32,
  firstInstance: u32,
}
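
The two layouts above can be filled in from the CPU for an initial upload. The following is a minimal sketch, not taken from any particular engine: the function names are hypothetical, and little-endian byte order is an assumption that holds on the platforms WebGPU currently targets. It shows that the non-indexed record is 16 bytes, the indexed record is 20 bytes, and baseVertex is the only signed field.

```typescript
// Every field is 4 bytes, so the two records are 16 and 20 bytes long.
const DRAW_INDIRECT_BYTES = 16;
const DRAW_INDEXED_INDIRECT_BYTES = 20;

interface DrawIndirectArgs {
  vertexCount: number;
  instanceCount: number;
  firstVertex: number;
  firstInstance: number;
}

interface DrawIndexedIndirectArgs {
  indexCount: number;
  instanceCount: number;
  firstIndex: number;
  baseVertex: number; // signed (i32)
  firstInstance: number;
}

// Hypothetical helper: serialize one non-indexed record (little-endian assumed).
function packDrawIndirect(a: DrawIndirectArgs): ArrayBuffer {
  const buf = new ArrayBuffer(DRAW_INDIRECT_BYTES);
  const view = new DataView(buf);
  view.setUint32(0, a.vertexCount, true);
  view.setUint32(4, a.instanceCount, true);
  view.setUint32(8, a.firstVertex, true);
  view.setUint32(12, a.firstInstance, true);
  return buf;
}

// Hypothetical helper: serialize one indexed record (little-endian assumed).
function packDrawIndexedIndirect(a: DrawIndexedIndirectArgs): ArrayBuffer {
  const buf = new ArrayBuffer(DRAW_INDEXED_INDIRECT_BYTES);
  const view = new DataView(buf);
  view.setUint32(0, a.indexCount, true);
  view.setUint32(4, a.instanceCount, true);
  view.setUint32(8, a.firstIndex, true);
  view.setInt32(12, a.baseVertex, true); // the only signed field
  view.setUint32(16, a.firstInstance, true);
  return buf;
}
```

The returned ArrayBuffer can be passed to device.queue.writeBuffer as the data argument when the buffer is created with COPY_DST usage.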

Pipeline (GPU-driven)

Compute shader: per-object frustum/occlusion cull → write visible list.
Compute shader: write indirect args buffer (instanceCount=0 for culled).
drawIndexedIndirect (or MDI) reads buffer → renders only visible.
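
As a mental model, the three steps above can be written as a CPU reference in TypeScript. This is only a sketch of the data flow: on the GPU the loop body runs as one compute invocation per object and the counter is an atomic. The names cullAndBuildArgs and CompactedDraw are hypothetical.

```typescript
interface CompactedDraw {
  visible: Uint32Array; // indices of the objects that survived culling
  args: { vertexCount: number; instanceCount: number; firstVertex: number; firstInstance: number };
}

// CPU reference of "cull -> write visible list -> write indirect args".
function cullAndBuildArgs<T>(
  objects: readonly T[],
  isVisible: (object: T) => boolean,
  vertexCount: number,
): CompactedDraw {
  const visible = new Uint32Array(objects.length);
  let count = 0; // plays the role of the atomic instanceCount
  objects.forEach((object, index) => {
    if (isVisible(object)) {
      visible[count] = index; // step 1: append to the visible list
      count += 1;
    }
  });
  return {
    visible: visible.subarray(0, count),
    // step 2: the args record the draw will read; instanceCount is 0 if everything was culled
    args: { vertexCount, instanceCount: count, firstVertex: 0, firstInstance: 0 },
  };
}
```

Step 3 is then a single indirect draw that consumes args, with the vertex shader looking up visible[instance_index] to find the real object.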

Applications

Massive instanced scenes (foliage, crowds, particles).
GPU-driven culling (frustum, occlusion via Hi-Z).
LOD selection on GPU.
Variable-rate / batched rendering (cluster culling, Nanite-style).
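
For the LOD-selection item, one common arrangement is a separate indirect record per LOD level whose instanceCount is the number of instances that picked that level. The sketch below is a CPU reference under that assumption; the distance thresholds and the names selectLod and countPerLod are hypothetical, and real pipelines often use projected screen size rather than raw distance.

```typescript
// Return the LOD index for a distance: 0 is the most detailed level.
// lodDistances[i] is the distance at which level i + 1 takes over (ascending).
function selectLod(distance: number, lodDistances: readonly number[]): number {
  let lod = 0;
  while (lod < lodDistances.length && distance >= (lodDistances[lod] ?? Infinity)) {
    lod += 1;
  }
  return lod;
}

// Count how many instances fall into each LOD; each count would become the
// instanceCount of that LOD's indirect draw record.
function countPerLod(distances: readonly number[], lodDistances: readonly number[]): Uint32Array {
  const counts = new Uint32Array(lodDistances.length + 1);
  for (const d of distances) {
    const lod = selectLod(d, lodDistances);
    counts[lod] = (counts[lod] ?? 0) + 1;
  }
  return counts;
}
```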

💻 Patterns

WebGPU Indirect Draw Setup

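// Args buffer: written by compute (STORAGE), read by drawIndirect (INDIRECT),
// and initialized from the CPU (COPY_DST).
const indirectBuffer = device.createBuffer({
  size: 16,  // 4 x u32: vertexCount, instanceCount, firstVertex, firstInstance
  usage: GPUBufferUsage.INDIRECT | GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
});

// Initialize: 36 vertices, 1000 instances, firstVertex 0, firstInstance 0
device.queue.writeBuffer(indirectBuffer, 0,
  new Uint32Array([36, 1000, 0, 0]));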

// In render pass
pass.setPipeline(pipeline);
pass.setVertexBuffer(0, vertices);
pass.drawIndirect(indirectBuffer, 0);

Culling Compute Shader (WGSL)

// Object, Camera, and frustumTest are assumed to be declared elsewhere.
struct DrawArgs {
  vertexCount: u32,
  instanceCount: atomic<u32>,  // atomicAdd requires an atomic type
  firstVertex: u32,
  firstInstance: u32,
}

@group(0) @binding(0) var<storage, read>       objects: array<Object>;
@group(0) @binding(1) var<storage, read_write> drawArgs: DrawArgs;
@group(0) @binding(2) var<storage, read_write> visibleInstances: array<u32>;
@group(0) @binding(3) var<uniform>              camera: Camera;

@compute @workgroup_size(64)
fn cullCS(@builtin(global_invocation_id) gid: vec3<u32>) {
  let i = gid.x;
  if (i >= arrayLength(&objects)) { return; }
  let obj = objects[i];
  if (frustumTest(obj.bounds, camera.frustum)) {
    // Reserve the next slot in the compacted visible list.
    let slot = atomicAdd(&drawArgs.instanceCount, 1u);
    visibleInstances[slot] = i;
  }
}
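
The shader above calls frustumTest without defining it. One common choice is a bounding-sphere test against the frustum planes, sketched here in TypeScript as a CPU reference. The Plane and Sphere types and the inward-facing plane convention are assumptions made for this illustration, not something the shader specifies.

```typescript
// Plane as (nx, ny, nz, d) with the normal pointing into the frustum,
// so a point p is inside the half-space when dot(n, p) + d >= 0.
type Plane = readonly [number, number, number, number];
interface Sphere {
  center: readonly [number, number, number];
  radius: number;
}

// Conservative test: returns false only when the sphere lies entirely
// outside at least one plane. Spheres near frustum corners may be kept
// even though they are not actually visible.
function sphereInFrustum(sphere: Sphere, planes: readonly Plane[]): boolean {
  const [cx, cy, cz] = sphere.center;
  for (const [nx, ny, nz, d] of planes) {
    const signedDistance = nx * cx + ny * cy + nz * cz + d;
    if (signedDistance < -sphere.radius) {
      return false;
    }
  }
  return true;
}
```

A WGSL version would have the same shape, with the six planes stored in the camera uniform.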

Reset Pass (clear instanceCount)

// Each frame, before culling, zero out instanceCount
device.queue.writeBuffer(indirectBuffer, 4, new Uint32Array([0]));

Multi-Draw Indirect (Vulkan)

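// Draw N different meshes from one buffer.
// drawCount > 1 requires the multiDrawIndirect device feature.
vkCmdDrawIndexedIndirect(cmd, indirectBuf, /*offset*/ 0,
                         /*drawCount*/ N,
                         /*stride*/ sizeof(VkDrawIndexedIndirectCommand));

// Or with a count buffer (drawCount itself lives on the GPU).
// Core in Vulkan 1.2 (drawIndirectCount feature); earlier via VK_KHR_draw_indirect_count.
vkCmdDrawIndexedIndirectCount(cmd, indirectBuf, /*offset*/ 0,
                              countBuf, /*countBufferOffset*/ 0, /*maxDrawCount*/ N,
                              /*stride*/ sizeof(VkDrawIndexedIndirectCommand));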
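
To make the stride argument concrete, here is a sketch of how N tightly packed VkDrawIndexedIndirectCommand records could be laid out in one buffer. It is written in TypeScript as a byte-layout illustration only. MeshRange and packMultiDraw are hypothetical names, little-endian order is assumed, and giving each draw a consecutive firstInstance range is one possible convention rather than a requirement.

```typescript
// VkDrawIndexedIndirectCommand: indexCount, instanceCount, firstIndex,
// vertexOffset (signed), firstInstance. Five 4-byte fields, 20 bytes total.
const COMMAND_STRIDE = 20;

interface MeshRange {
  indexCount: number;
  firstIndex: number;
  vertexOffset: number;
}

// Pack one record per mesh, back to back, so stride == COMMAND_STRIDE.
function packMultiDraw(meshes: readonly MeshRange[], instanceCounts: readonly number[]): ArrayBuffer {
  const buf = new ArrayBuffer(meshes.length * COMMAND_STRIDE);
  const view = new DataView(buf);
  let firstInstance = 0;
  meshes.forEach((mesh, i) => {
    const base = i * COMMAND_STRIDE;
    const instances = instanceCounts[i] ?? 0;
    view.setUint32(base + 0, mesh.indexCount, true);
    view.setUint32(base + 4, instances, true);
    view.setUint32(base + 8, mesh.firstIndex, true);
    view.setInt32(base + 12, mesh.vertexOffset, true);
    view.setUint32(base + 16, firstInstance, true);
    firstInstance += instances; // next draw's instances start after this one's
  });
  return buf;
}
```

In a GPU-driven pipeline the instanceCount fields would normally be written by the culling compute pass instead of the CPU; the layout is the same either way.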

Three.js (R175+ has WebGPU)

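// WebGPURenderer is exported from the 'three/webgpu' entry point,
// which also re-exports core classes such as BatchedMesh.
import { WebGPURenderer, BatchedMesh } from 'three/webgpu';
const renderer = new WebGPURenderer();
// BatchedMesh packs several geometries into shared buffers under one material.
// How the batch is submitted to the GPU depends on the backend and release.
// material, geom1, and geom2 are assumed to exist.
const batched = new BatchedMesh(maxInstances, maxVerts, maxIndices, material);
const id1 = batched.addGeometry(geom1);
const id2 = batched.addGeometry(geom2);
// Each rendered copy is an instance of a previously added geometry.
batched.addInstance(id1);
batched.addInstance(id2);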

Hi-Z Occlusion Culling (sketch)

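// Sample the Hi-Z mip at which the projected bounds cover only a few texels.
// Assumes standard depth (0 = near, 1 = far) and a pyramid that stores the
// farthest depth per texel; projectToScreen, computeMip, and bsphereMinDepth
// are helpers defined elsewhere.
fn occluded(bsphere: vec4<f32>) -> bool {
  let screenRect = projectToScreen(bsphere);
  let mip = computeMip(screenRect);
  let depth = textureSampleLevel(hiZ, samp, screenRect.center, mip).r;
  // Occluded when the sphere's nearest point is behind the farthest occluder.
  return bsphereMinDepth(bsphere) > depth;
}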
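
The computeMip helper is left undefined in the sketch above. A common rule is to pick the first mip whose texel size is at least the longest side of the projected rectangle, so the rectangle overlaps at most 2x2 texels. The TypeScript below is a CPU reference of that rule; the function name and the pixel-based parameters are assumptions for illustration.

```typescript
// At mip m, one Hi-Z texel covers 2^m x 2^m full-resolution pixels.
// Choosing m = ceil(log2(longestSide)) makes the texel at least as large as
// the rect, so the rect can straddle at most two texels along each axis.
function hiZMipForRect(widthPx: number, heightPx: number, mipCount: number): number {
  const longestSide = Math.max(widthPx, heightPx, 1); // guard against sub-pixel rects
  const mip = Math.ceil(Math.log2(longestSide));
  return Math.min(Math.max(mip, 0), mipCount - 1);    // clamp to the available pyramid
}
```

With this choice, sampling the four texels that the rectangle can touch (rather than only the center) keeps the test conservative.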

Decision Criteria

Situation	Approach
<100 unique objects	Direct draw / instancing; the indirect setup is not worth the overhead
1k-1M instances	Indirect draw + GPU cull
Many distinct meshes	Multi-draw indirect (Vulkan/D3D12); batch on WebGPU
Foliage/crowd	Indirect + GPU LOD selection
Mobile / low-end	Direct draw (watch the compute overhead)

Default: a GPU-driven indirect pipeline for large dynamic scenes; direct draw for small scenes.

🔗 Graph

Parent: Graphics Pipeline
Variant: GPU-Driven Rendering
Applications: Frustum Culling · Nanite
Adjacent: WebGPU · Vulkan · Compute Shader

🤖 LLM Usage

When: designing a GPU-driven pipeline, implementing culling, or reducing draw-call overhead. When not: indirect draw in a simple scene is over-engineering; direct draw is fine.

❌ Anti-patterns

CPU readback of indirect buffer: stalls the pipeline. Keep the data resident on the GPU.
Per-frame full buffer rewrite from the CPU: defeats the purpose. Update the buffer from GPU compute instead.
No Hi-Z for occlusion: leads to false positives. Use Hi-Z or conservative AABB tests.
Indirect for tiny scenes: compute dispatch overhead > savings.
WebGL fallback assumed: WebGL has no indirect draw; WebGPU is required.
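
Because an indirect path cannot be assumed in every browser, a renderer usually decides up front which path to take. The sketch below checks for the navigator.gpu entry point; the Backend type and pickDrawBackend are hypothetical names, and the presence of navigator.gpu does not by itself guarantee that an adapter can be obtained.

```typescript
type Backend = "webgpu-indirect" | "direct-fallback";

// Takes a navigator-like object so the check can run outside a browser.
function pickDrawBackend(nav: { gpu?: unknown }): Backend {
  // navigator.gpu is only defined where WebGPU is exposed.
  const hasWebGpu = nav.gpu !== undefined && nav.gpu !== null;
  return hasWebGpu ? "webgpu-indirect" : "direct-fallback";
}
```

In a browser this would be called as pickDrawBackend(navigator), followed by requestAdapter() to confirm that a device is actually available before committing to the indirect path.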

🧪 Verification / Duplicates

Verified (WebGPU spec, Vulkan spec, GPU Gems / Epic Games Nanite material).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup — indirect draw / GPU-driven rendering full content
