---
id: wiki-2026-0508-draw-call
title: Draw Call
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Drawcall, GPU Submit, Render Command]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [graphics, gpu, performance, rendering]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: C++/Rust
  framework: Vulkan/Metal/D3D12/WebGPU
---

# Draw Call

## In one line
> **"A single command by which the CPU instructs the GPU to draw one batch."** The per-call CPU overhead that dominated the 1990s OpenGL `glDrawArrays` era has dropped to the microsecond range with modern explicit APIs (Vulkan/D3D12/Metal/WebGPU), bindless resources, and GPU-driven rendering. As of 2026, `vkCmdDrawIndexedIndirectCount` plus mesh shaders is the norm.

## Core

### Anatomy
- Set the pipeline (shader, blend, depth state).
- Bind resources (vertex/index buffers, uniforms, textures).
- Issue the draw (`drawIndexed`; `dispatch` is the compute counterpart).
- Submit to the queue.

### Cost sources
- **Driver validation**: the main bottleneck in legacy GL.
- **State changes**: pipeline / render target / descriptor switches.
- **CPU↔GPU sync**: fence waits, map/unmap.
- **Command recording**: in modern APIs this can be spread across threads.

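The state-change cost listed above is commonly reduced by sorting draws so that the most expensive state changes least often. The following is a minimal C++ sketch of that idea, not any particular renderer's code: `DrawItem`, `makeSortKey`, `sortByState`, `countPipelineBinds`, and the 20-bits-per-field key layout are all assumptions made for illustration, and the integer IDs stand in for real pipeline and descriptor handles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw record; IDs stand in for real API handles.
struct DrawItem {
    uint32_t pipelineId;
    uint32_t materialId;
    uint32_t meshId;
};

// Packs the state into one integer, most expensive state in the highest bits.
// Assumed budget: each ID fits in 20 bits.
uint64_t makeSortKey(const DrawItem& d) {
    assert(d.pipelineId < (1u << 20) && d.materialId < (1u << 20) && d.meshId < (1u << 20));
    return (static_cast<uint64_t>(d.pipelineId) << 40) |
           (static_cast<uint64_t>(d.materialId) << 20) |
            static_cast<uint64_t>(d.meshId);
}

// Orders draws by pipeline first, then material, then mesh.
void sortByState(std::vector<DrawItem>& draws) {
    std::sort(draws.begin(), draws.end(),
              [](const DrawItem& a, const DrawItem& b) {
                  return makeSortKey(a) < makeSortKey(b);
              });
}

// Counts how many pipeline binds the given submission order would need
// (the first draw always needs one).
std::size_t countPipelineBinds(const std::vector<DrawItem>& draws) {
    std::size_t binds = 0;
    for (std::size_t i = 0; i < draws.size(); ++i) {
        if (i == 0 || draws[i].pipelineId != draws[i - 1].pipelineId) {
            ++binds;
        }
    }
    return binds;
}
```

Comparing `countPipelineBinds` before and after `sortByState` gives a rough measure of how many pipeline switches the sort removes for a given frame.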
### Applications
1. Fewer draw calls → lower CPU submission cost, which shortens frame time when the frame is CPU-bound.
2. Batching (instancing, atlases, indirect draws).
3. GPU-driven culling (compute → indirect).

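As one way to picture the batching item above, the sketch below groups scene objects by mesh so that each mesh needs a single instanced draw. It is an illustrative CPU-side helper under stated assumptions: `SceneObject`, `InstancedBatch`, `BatchResult`, and `buildInstancedBatches` are hypothetical names, and `transformIndex` is assumed to index a per-object transform array that the shader reads.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical scene object: which mesh it uses and where its transform lives.
struct SceneObject {
    uint32_t meshId;
    uint32_t transformIndex;
};

// One instanced draw: instanceCount instances starting at firstInstance
// in the flat per-instance array.
struct InstancedBatch {
    uint32_t meshId;
    uint32_t firstInstance;
    uint32_t instanceCount;
};

struct BatchResult {
    std::vector<InstancedBatch> batches;   // one entry per unique mesh
    std::vector<uint32_t> instanceData;    // transform indices, grouped by mesh
};

BatchResult buildInstancedBatches(const std::vector<SceneObject>& objects) {
    // Group transform indices by mesh; std::map keeps the mesh order deterministic.
    std::map<uint32_t, std::vector<uint32_t>> byMesh;
    for (const SceneObject& o : objects) {
        byMesh[o.meshId].push_back(o.transformIndex);
    }

    BatchResult result;
    for (const auto& entry : byMesh) {
        InstancedBatch batch;
        batch.meshId = entry.first;
        batch.firstInstance = static_cast<uint32_t>(result.instanceData.size());
        batch.instanceCount = static_cast<uint32_t>(entry.second.size());
        result.batches.push_back(batch);
        result.instanceData.insert(result.instanceData.end(),
                                   entry.second.begin(), entry.second.end());
    }
    assert(result.instanceData.size() == objects.size());
    return result;
}
```

Each `InstancedBatch` would then map to one `vkCmdDrawIndexed` call, with `instanceCount` and `firstInstance` taken from the batch, so the draw count equals the number of unique meshes rather than the number of objects.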
## 💻 Patterns

### Vulkan minimal draw
```cpp
// Assumes `cmd` is a command buffer in the recording state, inside a render pass.
vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
VkBuffer vbs[] = {vertexBuf};
VkDeviceSize off[] = {0};
vkCmdBindVertexBuffers(cmd, 0, 1, vbs, off);                   // binding 0, one buffer
vkCmdBindIndexBuffer(cmd, indexBuf, 0, VK_INDEX_TYPE_UINT32);
vkCmdBindDescriptorSets(cmd, ..., 0, 1, &set, 0, nullptr);     // `...`: bind point and pipeline layout
vkCmdDrawIndexed(cmd, indexCount, instanceCount, 0, 0, 0);     // firstIndex, vertexOffset, firstInstance = 0
```

### Instancing (1 call → N objects)
```glsl
// vertex shader
#version 450
layout(set = 0, binding = 0) uniform Camera { mat4 vp; }; // assumed camera block; `vp` was undeclared
layout(location = 0) in vec3 pos;
layout(location = 4) in mat4 modelMatrix; // per-instance, occupies locations 4-7
void main() { gl_Position = vp * modelMatrix * vec4(pos, 1.0); }
```
```cpp
// CPU side: the matrix buffer must be bound with VK_VERTEX_INPUT_RATE_INSTANCE
vkCmdDrawIndexed(cmd, idxCount, 10000, 0, 0, 0); // 10k objects, 1 draw
```

### Indirect draw (GPU-driven)
```cpp
// Layout of VkDrawIndexedIndirectCommand, as declared by the Vulkan headers:
//   uint32_t indexCount, instanceCount, firstIndex;
//   int32_t  vertexOffset;
//   uint32_t firstInstance;
// A compute shader culls and writes the commands plus the draw count to GPU buffers.
// The CPU just records:
vkCmdDrawIndexedIndirectCount(cmd, drawBuf, 0, countBuf, 0, MAX_DRAWS,
                              sizeof(VkDrawIndexedIndirectCommand));
```

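To make the culling step concrete, here is a CPU reference of what such a compute pass might do: test each draw's bounding sphere against a set of planes and compact the survivors into a command array plus a count. This is only a sketch under assumptions. `DrawIndexedIndirectCommand` is a local mirror of the Vulkan struct so the example compiles without Vulkan headers, and `Plane`, `MeshDraw`, `sphereVisible`, and `cullAndCompact` are hypothetical names. On the GPU the counter increment would be an atomic add rather than a serial loop.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local mirror of VkDrawIndexedIndirectCommand (same field order and types).
struct DrawIndexedIndirectCommand {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
};
static_assert(sizeof(DrawIndexedIndirectCommand) == 20, "must match the 20-byte GPU layout");

// Plane in the form n.p + d = 0, with the normal pointing into the visible half-space.
struct Plane { float nx, ny, nz, d; };

// Hypothetical per-draw record: bounding sphere plus index-buffer range.
struct MeshDraw {
    float cx, cy, cz, radius;
    uint32_t indexCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
};

// A sphere is rejected only if it lies entirely behind at least one plane.
bool sphereVisible(const MeshDraw& m, const std::vector<Plane>& planes) {
    for (const Plane& p : planes) {
        const float dist = p.nx * m.cx + p.ny * m.cy + p.nz * m.cz + p.d;
        if (dist < -m.radius) return false;
    }
    return true;
}

// Writes one command per visible draw and returns the count, i.e. the value
// that would land in the count buffer.
uint32_t cullAndCompact(const std::vector<MeshDraw>& draws,
                        const std::vector<Plane>& planes,
                        std::vector<DrawIndexedIndirectCommand>& outCommands) {
    assert(outCommands.size() >= draws.size()); // sized for the worst case, like MAX_DRAWS
    uint32_t count = 0;
    for (uint32_t i = 0; i < draws.size(); ++i) {
        if (!sphereVisible(draws[i], planes)) continue;
        // firstInstance carries the original draw index so a shader could look up
        // per-object data; in Vulkan, non-zero values here need drawIndirectFirstInstance.
        outCommands[count++] = DrawIndexedIndirectCommand{
            draws[i].indexCount, 1, draws[i].firstIndex, draws[i].vertexOffset, i};
    }
    return count;
}
```

The returned count plays the role of `countBuf`, and `outCommands` the role of `drawBuf`, in the `vkCmdDrawIndexedIndirectCount` call above.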
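### Bindless (descriptor indexing)
```glsl
#version 450
#extension GL_EXT_nonuniform_qualifier : require
layout(set=0, binding=0) uniform sampler2D textures[];
layout(push_constant) uniform PC { uint texIndex; };
layout(location = 0) in vec2 uv;      // declarations added; `uv` and `color` were undeclared
layout(location = 0) out vec4 color;
// nonuniformEXT is only required when the index can differ between invocations.
void main() { color = texture(textures[nonuniformEXT(texIndex)], uv); }
```
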
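### Mesh shader (DX12 / Vulkan)
```glsl
#version 460
#extension GL_EXT_mesh_shader : require
layout(local_size_x = 32) in;
layout(triangles, max_vertices = 64, max_primitives = 124) out;
void main() {
    // Placeholder counts; a real shader reads them from the meshlet being processed.
    uint vertCount = 64, primCount = 124;
    SetMeshOutputsEXT(vertCount, primCount);
    // Write vertices and primitives per meshlet here; there is no input-assembler stage.
    // Per-meshlet culling and amplification belong in the optional task shader.
}
```
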
### Multi-thread command recording (Vulkan)
```cpp
// One secondary CB per chunk; each needs a command pool no other thread is using concurrently.
// `parallel_for` and `record_draws_for_chunk` are placeholders for the engine's own helpers.
parallel_for(0, N, [&](int i) {
    VkCommandBuffer sec = secondaryCBs[i];   // indexed by chunk so all N entries executed below are recorded
    vkBeginCommandBuffer(sec, ...);
    record_draws_for_chunk(sec, chunk[i]);
    vkEndCommandBuffer(sec);
});
vkCmdExecuteCommands(primaryCB, N, secondaryCBs.data());
```

## Decision criteria
| Situation | Approach |
|---|---|
| Thousands of copies of the same mesh | Instancing |
| Diverse meshes, frustum-cullable | GPU-driven indirect + compute culling |
| Many materials | Bindless textures + uber-shader |
| Highly detailed geometry | Mesh shaders + meshlets |
| Legacy GL/GLES | Atlas + state sort + minimize binds |

**Default**: Modern → indirect + bindless. Legacy → batch by state.

## 🔗 Graph
- Parent: [[GPU Pipeline]] · [[Real-time Rendering]]
- Variants: [[Indirect Draw]]
- Applications: [[Frustum Culling]] · [[Geometry Merging]]
- Adjacent: [[Vulkan]] · [[Metal]] · [[WebGPU]]

## 🤖 LLM usage
**When**: Renderer architecture, perf budget analysis, interpreting profiling results.
**When not**: Game design / art direction.

## ❌ Anti-patterns
- **One draw per object**: a legacy pattern; use instancing or indirect draws.
- **Excessive state changes**: swapping shaders/pipelines thousands of times per frame.
- **CPU-side culling only**: move culling to the GPU with a compute pass.
- **Map/unmap loop**: use a persistently mapped buffer with a ring allocator.
- **Single-threaded recording**: use secondary CBs + `parallel_for`.

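The persistently mapped ring mentioned in the list above can be sketched as a small sub-allocator over one mapped pointer. This is a hedged illustration rather than a complete solution: `FrameRingAllocator` and its methods are hypothetical, the mapped pointer is assumed to come from a single long-lived map of a host-visible buffer, and fence waiting before a frame slot is reused is left to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Splits one persistently mapped buffer into per-frame regions and hands out
// aligned sub-ranges, so no map/unmap is needed per upload.
class FrameRingAllocator {
public:
    // Assumes bytesPerFrame is a multiple of every alignment passed to upload().
    FrameRingAllocator(uint8_t* mapped, uint32_t bytesPerFrame, uint32_t framesInFlight)
        : mapped_(mapped), bytesPerFrame_(bytesPerFrame), framesInFlight_(framesInFlight) {
        assert(mapped != nullptr && bytesPerFrame > 0 && framesInFlight > 0);
    }

    // Call once per frame, after the caller has waited on the fence of the
    // frame that last used this slot.
    void beginFrame(uint64_t frameNumber) {
        frameBase_ = static_cast<uint32_t>(frameNumber % framesInFlight_) * bytesPerFrame_;
        cursor_ = 0;
    }

    // Copies `size` bytes into the current frame's region and returns the byte
    // offset inside the whole buffer (usable as a dynamic offset or bind offset).
    uint32_t upload(const void* data, uint32_t size, uint32_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0); // power of two
        const uint32_t aligned = (cursor_ + alignment - 1) & ~(alignment - 1);
        assert(aligned + size <= bytesPerFrame_);                     // region overflow
        std::memcpy(mapped_ + frameBase_ + aligned, data, size);
        cursor_ = aligned + size;
        return frameBase_ + aligned;
    }

private:
    uint8_t* mapped_;
    uint32_t bytesPerFrame_;
    uint32_t framesInFlight_;
    uint32_t frameBase_ = 0;
    uint32_t cursor_ = 0;
};
```

In a real renderer the returned offset would typically be passed as a dynamic uniform offset or a vertex buffer bind offset. The alignment would come from device limits such as `minUniformBufferOffsetAlignment`.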
## 🧪 Verification / duplicates
- Verified (Vulkan/D3D12 spec, Khronos best practices, GPU Zen).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: draw call cost + indirect/bindless/mesh shader |