"매 CPU 가 GPU 에게 매 한 batch 를 그리라고 매 instructing 하는 single command". 1990s OpenGL glDrawArrays 시대의 매 ms-cost overhead 가 매 modern explicit API (Vulkan/D3D12/Metal/WebGPU) + bindless + GPU-driven rendering 으로 매 micro-second 수준으로 떨어짐. 매 2026 — vkCmdDrawIndexedIndirectCount + mesh shader 가 매 norm.
#version 460#extension GL_EXT_mesh_shader : requirelayout(local_size_x=32)in;layout(triangles,max_vertices=64,max_primitives=124)out;voidmain(){SetMeshOutputsEXT(vertCount,primCount);// amplify / cull per meshlet, no IA stage}
Multi-thread command recording (Vulkan)
// 1 secondary CB per thread
parallel_for(0,N,[&](inti){VkCommandBuffersec=secondaryCBs[threadId];vkBeginCommandBuffer(sec,...);record_draws_for_chunk(sec,chunk[i]);vkEndCommandBuffer(sec);});vkCmdExecuteCommands(primaryCB,N,secondaryCBs.data());
매 결정 기준
상황
Approach
同 mesh 수천 개
Instancing
Diverse mesh, frustum cullable
GPU-driven indirect + compute culling
Many materials
Bindless texture + uber-shader
Highly detailed geometry
Mesh shader + meshlet
Legacy GL/GLES
Atlas + state sort + minimize binds
기본값: Modern → indirect + bindless. Legacy → batch by state.