id: wiki-2026-0508-draw-call
title: Draw Call
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Drawcall, GPU Submit, Render Command
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: graphics, gpu, performance, rendering
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language C++/Rust; framework Vulkan/Metal/D3D12/WebGPU

Draw Call

In one line

"A single command with which the CPU instructs the GPU to draw one batch." In the 1990s OpenGL glDrawArrays era, draw call overhead added up to milliseconds of CPU time per frame; with modern explicit APIs (Vulkan/D3D12/Metal/WebGPU) plus bindless resources and GPU-driven rendering, the per-call cost has dropped to the microsecond level. As of 2026, vkCmdDrawIndexedIndirectCount plus mesh shaders is the norm.

Key points

Anatomy

  • Set pipeline (shader, blend, depth state).
  • Bind resources (vertex/index buffer, uniform, texture).
  • Issue draw (drawIndexed, dispatch).
  • Submit to queue.

Cost sources

  • Driver validation: the main bottleneck in legacy GL.
  • State change: pipeline / RT / descriptor switch.
  • CPU↔GPU sync: fence wait, map/unmap.
  • Command recording: in modern APIs this can be spread across threads.

Applications

  1. Reduce the draw call count → CPU frame time drops directly when the frame is submission-bound.
  2. Batching (instancing, atlas, indirect).
  3. GPU-driven culling (compute → indirect).
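
As a rough illustration of the batching idea in item 2, the sketch below sorts draw requests by state and collapses identical runs into instanced batches. It is CPU-only C++ and makes no API calls. `DrawItem`, `Batch`, `buildBatches`, and `countPipelineBinds` are hypothetical names introduced for this example, and the integer IDs stand in for real pipeline, material, and mesh handles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical per-object draw request; IDs stand in for real GPU handles.
struct DrawItem {
    uint32_t pipelineId;
    uint32_t materialId;
    uint32_t meshId;
};

// One instanced draw covering `instanceCount` objects that share all state.
struct Batch {
    uint32_t pipelineId;
    uint32_t materialId;
    uint32_t meshId;
    uint32_t instanceCount;
};

// Sort by (pipeline, material, mesh) so identical state becomes adjacent,
// then collapse each run of identical keys into a single instanced batch.
std::vector<Batch> buildBatches(std::vector<DrawItem> items) {
    auto key = [](const DrawItem& d) {
        return std::make_tuple(d.pipelineId, d.materialId, d.meshId);
    };
    std::sort(items.begin(), items.end(),
              [&](const DrawItem& a, const DrawItem& b) { return key(a) < key(b); });

    std::vector<Batch> batches;
    for (const DrawItem& d : items) {
        if (!batches.empty()) {
            Batch& last = batches.back();
            if (last.pipelineId == d.pipelineId &&
                last.materialId == d.materialId &&
                last.meshId == d.meshId) {
                ++last.instanceCount;  // same state: extend the current batch
                continue;
            }
        }
        batches.push_back({d.pipelineId, d.materialId, d.meshId, 1u});
    }
    return batches;
}

// Number of pipeline binds needed when the batches are submitted in order.
std::size_t countPipelineBinds(const std::vector<Batch>& batches) {
    std::size_t binds = 0;
    for (std::size_t i = 0; i < batches.size(); ++i) {
        if (i == 0 || batches[i].pipelineId != batches[i - 1].pipelineId) ++binds;
    }
    return binds;
}
```

Each resulting `Batch` would map to one instanced draw, and sorting with the pipeline as the most significant key keeps the most expensive state switch the rarest.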

💻 Patterns

Vulkan minimal draw

vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
VkBuffer vbs[] = {vertexBuf}; VkDeviceSize off[] = {0};
vkCmdBindVertexBuffers(cmd, 0, 1, vbs, off);
vkCmdBindIndexBuffer(cmd, indexBuf, 0, VK_INDEX_TYPE_UINT32);
vkCmdBindDescriptorSets(cmd, ..., 0, 1, &set, 0, nullptr);
vkCmdDrawIndexed(cmd, indexCount, instanceCount, 0, 0, 0);

Instancing (1 call → N objects)

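// vertex shader
layout(location = 0) in vec3 pos;
layout(location = 4) in mat4 modelMatrix;  // per-instance; a mat4 occupies locations 4-7
layout(push_constant) uniform PC { mat4 vp; };  // view-projection; supplying it as a push constant is an assumption
void main() { gl_Position = vp * modelMatrix * vec4(pos, 1.0); }
// CPU side (the instance buffer binding uses VK_VERTEX_INPUT_RATE_INSTANCE)
vkCmdDrawIndexed(cmd, idxCount, 10000, 0, 0, 0);  // 10k objects, 1 draw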

Indirect draw (GPU-driven)

struct VkDrawIndexedIndirectCommand {  // as declared in the Vulkan headers
    uint32_t indexCount, instanceCount, firstIndex;
    int32_t  vertexOffset; uint32_t firstInstance;
};
// Compute shader culls & writes commands + count to GPU buffer.
// CPU just calls (core since Vulkan 1.2, earlier via VK_KHR_draw_indirect_count):
vkCmdDrawIndexedIndirectCount(cmd, drawBuf, 0, countBuf, 0, MAX_DRAWS,
                              sizeof(VkDrawIndexedIndirectCommand));
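
To make the "cull and write commands + count" step concrete, here is a CPU reference of what such a compute pass might compute. It is a sketch under assumptions, not the actual shader: `DrawIndexedIndirectCommand` mirrors the field layout of `VkDrawIndexedIndirectCommand` so the code compiles without Vulkan headers, while `Plane`, `Object`, `sphereVisible`, and `cullAndCompact` are hypothetical names, and the test is a plain bounding-sphere versus plane check.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors the field layout of VkDrawIndexedIndirectCommand.
struct DrawIndexedIndirectCommand {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
};
static_assert(sizeof(DrawIndexedIndirectCommand) == 20,
              "stride passed to the indirect draw must match");

// Plane in the form n.p + d >= 0 for points on the inside.
struct Plane { float nx, ny, nz, d; };

// Hypothetical per-object record: bounding sphere plus mesh range.
struct Object {
    float cx, cy, cz, radius;
    uint32_t indexCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
};

// True unless the sphere lies entirely behind at least one plane.
bool sphereVisible(const Object& o, const std::vector<Plane>& frustum) {
    for (const Plane& p : frustum) {
        float dist = p.nx * o.cx + p.ny * o.cy + p.nz * o.cz + p.d;
        if (dist < -o.radius) return false;
    }
    return true;
}

// Writes one command per visible object into `out` and returns the count,
// i.e. the value a GPU pass would store in the count buffer.
uint32_t cullAndCompact(const std::vector<Object>& objects,
                        const std::vector<Plane>& frustum,
                        std::vector<DrawIndexedIndirectCommand>& out) {
    out.clear();
    for (uint32_t i = 0; i < objects.size(); ++i) {
        const Object& o = objects[i];
        if (!sphereVisible(o, frustum)) continue;
        // firstInstance carries the object index so the vertex shader can
        // look up per-object data; a non-zero value in an indirect draw
        // requires the drawIndirectFirstInstance device feature.
        out.push_back({o.indexCount, 1u, o.firstIndex, o.vertexOffset, i});
    }
    return static_cast<uint32_t>(out.size());
}
```

On the GPU the compaction would typically use an atomic counter or a prefix sum instead of `push_back`, but the written command layout and the meaning of the count are the same.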

Bindless (descriptor indexing)

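#version 450
#extension GL_EXT_nonuniform_qualifier : require
layout(set=0, binding=0) uniform sampler2D textures[];  // runtime-sized descriptor array
layout(push_constant) uniform PC { uint texIndex; };
layout(location = 0) in vec2 uv;
layout(location = 0) out vec4 color;
// nonuniformEXT is only required when the index can vary within a draw; it is harmless here.
void main() { color = texture(textures[nonuniformEXT(texIndex)], uv); }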

Mesh shader (DX12 / Vulkan)

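#version 460
#extension GL_EXT_mesh_shader : require
layout(local_size_x = 32) in;
layout(triangles, max_vertices = 64, max_primitives = 124) out;
void main() {
    // vertCount / primCount: this meshlet's counts, read from a meshlet buffer (not shown)
    SetMeshOutputsEXT(vertCount, primCount);
    // write gl_MeshVerticesEXT[] and gl_PrimitiveTriangleIndicesEXT[] here; there is no IA stage.
    // Per-meshlet culling / amplification belongs in an optional task shader.
}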
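
The shader above consumes meshlets that respect its `max_vertices = 64` and `max_primitives = 124` limits. One simple way to produce them offline is a greedy split of the index buffer, sketched below. This is an illustrative assumption rather than the method any particular tool uses: `Meshlet` and `buildMeshlets` are hypothetical names, and production builders usually also optimize for spatial locality.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

constexpr std::size_t kMaxVertices = 64;     // matches max_vertices in the shader
constexpr std::size_t kMaxPrimitives = 124;  // matches max_primitives in the shader

struct Meshlet {
    std::vector<uint32_t> vertices;   // unique global vertex indices
    std::vector<uint8_t>  triangles;  // 3 meshlet-local indices per triangle
};

// Greedily packs triangles in index-buffer order, starting a new meshlet
// whenever the next triangle would exceed either limit.
std::vector<Meshlet> buildMeshlets(const std::vector<uint32_t>& indices) {
    assert(indices.size() % 3 == 0);
    std::vector<Meshlet> meshlets;
    Meshlet current;
    std::unordered_map<uint32_t, uint8_t> local;  // global index -> local index

    for (std::size_t t = 0; t + 2 < indices.size(); t += 3) {
        // Count the vertices of this triangle that the meshlet lacks.
        std::size_t newVerts = 0;
        for (std::size_t k = 0; k < 3; ++k) {
            bool seen = local.count(indices[t + k]) != 0;
            for (std::size_t j = 0; j < k && !seen; ++j) {
                seen = indices[t + j] == indices[t + k];  // repeated within the triangle
            }
            if (!seen) ++newVerts;
        }
        bool full = current.vertices.size() + newVerts > kMaxVertices ||
                    current.triangles.size() / 3 + 1 > kMaxPrimitives;
        if (full) {
            meshlets.push_back(std::move(current));
            current = Meshlet{};
            local.clear();
        }
        // Append the triangle, assigning local indices to new vertices.
        for (std::size_t k = 0; k < 3; ++k) {
            uint32_t v = indices[t + k];
            auto it = local.find(v);
            if (it == local.end()) {
                it = local.emplace(v, static_cast<uint8_t>(current.vertices.size())).first;
                current.vertices.push_back(v);
            }
            current.triangles.push_back(it->second);
        }
    }
    if (!current.triangles.empty()) meshlets.push_back(std::move(current));
    return meshlets;
}
```

With a 64-vertex cap, local indices fit in 8 bits, which is why `triangles` stores `uint8_t`.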

Multi-thread command recording (Vulkan)

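// 1 secondary CB per chunk; each must come from a command pool that no other
// thread is using concurrently (command pools are not thread-safe)
parallel_for(0, N, [&](int i) {
    VkCommandBuffer sec = secondaryCBs[i];  // index by chunk so all N buffers are distinct
    vkBeginCommandBuffer(sec, ...);
    record_draws_for_chunk(sec, chunk[i]);
    vkEndCommandBuffer(sec);
});
vkCmdExecuteCommands(primaryCB, N, secondaryCBs.data());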

Decision criteria

| Situation | Approach |
|---|---|
| Thousands of copies of the same mesh | Instancing |
| Diverse mesh, frustum cullable | GPU-driven indirect + compute culling |
| Many materials | Bindless texture + uber-shader |
| Highly detailed geometry | Mesh shader + meshlet |
| Legacy GL/GLES | Atlas + state sort + minimize binds |

Default: Modern → indirect + bindless. Legacy → batch by state.

🔗 Graph

🤖 LLM usage

When to use: renderer architecture, perf budget analysis, interpreting profiling results. When not to use: game design / art direction.

Anti-patterns

  • One draw per object: a legacy pattern; use instancing/indirect instead.
  • Excessive state changes: swapping shaders/pipelines thousands of times per frame.
  • CPU-side culling only: move culling to the GPU and run it in a compute pass.
  • Map/unmap loop: use a persistently mapped buffer + ring.
  • Single-threaded recording: use secondary CBs + parallel_for.
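
The "persistently mapped buffer + ring" fix for the map/unmap loop can be sketched as a small offset allocator. This is a minimal illustration under assumptions: `FrameRing` is a hypothetical class, it only hands out byte offsets, and it relies on the caller having waited on the fence for a frame slot before calling `beginFrame` for that slot.

```cpp
#include <cassert>
#include <cstddef>

// Sub-allocates per-frame slices out of one persistently mapped buffer.
// The capacity is split evenly across the frames in flight so the CPU never
// writes into a slice the GPU may still be reading.
class FrameRing {
public:
    static constexpr std::size_t kOutOfSpace = static_cast<std::size_t>(-1);

    FrameRing(std::size_t capacity, std::size_t framesInFlight)
        : sliceSize_(framesInFlight ? capacity / framesInFlight : 0),
          frames_(framesInFlight) {
        assert(frames_ > 0 && sliceSize_ > 0);
    }

    // Call once per frame, after waiting on that slot's fence.
    void beginFrame(std::size_t frameIndex) {
        base_ = (frameIndex % frames_) * sliceSize_;
        head_ = 0;
    }

    // Returns a byte offset into the mapped buffer, or kOutOfSpace when the
    // current slice is exhausted. `alignment` must be a power of two.
    std::size_t allocate(std::size_t size, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        std::size_t aligned = (base_ + head_ + alignment - 1) & ~(alignment - 1);
        if (aligned + size > base_ + sliceSize_) return kOutOfSpace;
        head_ = aligned + size - base_;
        return aligned;
    }

private:
    std::size_t sliceSize_;
    std::size_t frames_;
    std::size_t base_ = 0;
    std::size_t head_ = 0;
};
```

The returned offset would be added to the pointer obtained from the single map call made at startup, and the same offset passed as the dynamic or bind offset when recording the draw.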

🧪 Verification / duplicates

  • Verified (Vulkan/D3D12 spec, Khronos best practices, GPU Zen).
  • Trust level: A.

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: draw call cost + indirect/bindless/mesh shader |