"매 main shading 전 의 depth-only render". 매 overdraw 의 reduce — 매 expensive pixel shader 의 hidden surface 의 skip. 매 modern GPU 의 early-Z + Hi-Z + 매 deferred / forward+. 매 cost: 매 vertex 2x.
Key points
Motivation
Overdraw: the same pixel is shaded several times.
Expensive shaders: PBR + IBL + many lights raise the per-pixel cost.
Solution: render depth only first, so the main pass can reject occluded fragments.
Mechanism
Pass 1: depth only (vertex shader + null pixel shader), with depth writes on.
Pass 2: full shading with an EQUAL depth test and depth writes off.
The GPU's early-Z rejects occluded fragments before the pixel shader runs.
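The two passes above can be illustrated with a small CPU-side simulation that counts how often the expensive shader would run. This is only a sketch: the `Fragment` struct and both function names are hypothetical, depth is assumed to be smaller when closer, and real hardware rasterizes in tiles rather than one fragment at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical fragment: target pixel index and depth (smaller = closer).
struct Fragment {
    std::size_t pixel;
    float depth;
};

// Single pass with a LESS depth test. Every fragment that is closer than
// what is already stored gets shaded, so far-to-near order is the worst case.
std::size_t shade_count_single_pass(const std::vector<Fragment>& frags,
                                    std::size_t pixel_count) {
    std::vector<float> depth(pixel_count, std::numeric_limits<float>::infinity());
    std::size_t shaded = 0;
    for (const Fragment& f : frags) {
        assert(f.pixel < pixel_count);
        if (f.depth < depth[f.pixel]) {
            depth[f.pixel] = f.depth;
            ++shaded;  // the expensive pixel shader would run here
        }
    }
    return shaded;
}

// Pass 1 writes depth only; pass 2 shades with an EQUAL test and no depth writes.
std::size_t shade_count_with_prepass(const std::vector<Fragment>& frags,
                                     std::size_t pixel_count) {
    std::vector<float> depth(pixel_count, std::numeric_limits<float>::infinity());
    for (const Fragment& f : frags) {  // pass 1: depth only
        assert(f.pixel < pixel_count);
        if (f.depth < depth[f.pixel]) depth[f.pixel] = f.depth;
    }
    std::size_t shaded = 0;
    for (const Fragment& f : frags) {  // pass 2: EQUAL test
        if (f.depth == depth[f.pixel]) ++shaded;
    }
    return shaded;
}
```

With three overlapping fragments on one pixel submitted far to near, the single pass shades all three while the prepass version shades only the closest one.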
Trade-off
Win: the expensive pixel shader is skipped for occluded fragments.
Loss: vertex-stage work doubles (2x).
Net: a win when per-pixel shader cost is high and overdraw is significant.
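The trade-off can be written as a rough cost model. The sketch below is an idealization, not a profiler replacement: `FrameCost` and its fields are hypothetical, the depth-only rasterization cost is ignored, and the prepass is assumed to reject every occluded fragment. Under those assumptions the prepass wins when (overdraw - 1) x pixel cost exceeds the vertex cost.

```cpp
#include <cassert>

// Hypothetical per-frame costs in arbitrary but consistent units.
struct FrameCost {
    double vertex;    // vertex-stage cost of submitting the geometry once
    double pixel;     // cost of shading every visible pixel exactly once
    double overdraw;  // average shaded fragments per visible pixel, >= 1
};

// No prepass: geometry once, every overdrawn fragment pays the pixel cost.
double cost_without_prepass(const FrameCost& c) {
    assert(c.overdraw >= 1.0);
    return c.vertex + c.overdraw * c.pixel;
}

// With prepass (idealized): geometry twice, each visible pixel shaded once.
double cost_with_prepass(const FrameCost& c) {
    return 2.0 * c.vertex + c.pixel;
}

bool prepass_wins(const FrameCost& c) {
    return cost_with_prepass(c) < cost_without_prepass(c);
}
```

For example, vertex = 1, pixel = 4, overdraw = 3 gives 13 without the prepass and 6 with it. A vertex-heavy scene with cheap shading (vertex = 2, pixel = 1, overdraw = 1.5) gives 3.5 against 5, so the prepass loses.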
Modern variants
Hi-Z: a hierarchical depth buffer lets the GPU cull whole tiles.
Forward+: a Z-prepass plus tile-based light culling.
Deferred: the G-buffer pass acts like a prepass.
Visibility buffer: a modern alternative (Unreal Engine 5 Nanite).
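To make the Forward+ entry concrete: the prepass depth buffer gives each screen tile a min/max depth, and lights outside that range can be dropped for the tile. The sketch below covers only this depth-range part and assumes view-space depths in the same units as the light radius. `DepthRange`, `Light`, and both functions are hypothetical names, and a real implementation would also test the tile's side frustum planes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct DepthRange {
    float min_z;
    float max_z;
};

// Hypothetical point light reduced to its extent along the view axis.
struct Light {
    float center_z;
    float radius;
};

// Min/max depth of one tile, read from the prepass depth buffer.
DepthRange tile_depth_range(const std::vector<float>& tile_depths) {
    assert(!tile_depths.empty());
    auto mm = std::minmax_element(tile_depths.begin(), tile_depths.end());
    return {*mm.first, *mm.second};
}

// Returns the indices of lights whose depth extent overlaps the tile's range.
std::vector<std::size_t> cull_lights(const DepthRange& tile,
                                     const std::vector<Light>& lights) {
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < lights.size(); ++i) {
        const float near_z = lights[i].center_z - lights[i].radius;
        const float far_z = lights[i].center_z + lights[i].radius;
        if (far_z >= tile.min_z && near_z <= tile.max_z) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

The main pass then loops only over the surviving light indices for each tile, which is where the saving comes from.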
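Applications
Open world: overdraw from dense vegetation.
Particles: alpha-sorting cost (transparent, so drawn after opaque geometry rather than in the prepass).
PBR-heavy scenes: expensive shaders.
VR: fill rate is critical.
Mobile: tile-based GPUs take a different approach (leave it to TBDR).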
💻 Patterns
Z-prepass (Unreal-style HLSL)
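    // MVP, VSOutput, and PBR_Shade are assumed to be declared elsewhere.

    // Pass 1: depth only
    struct VSInput { float3 pos : POSITION; };

    float4 VS_DepthOnly(VSInput v) : SV_POSITION {
        return mul(float4(v.pos, 1.0), MVP);
    }
    // No pixel shader (or a null one) is bound for this pass.

    // Pass 2: full shading with an EQUAL depth test.
    // Pipeline state (pseudo-config): DepthFunc = EQUAL; DepthWrite = OFF;
    // The position math must match pass 1 exactly, or the EQUAL test can fail.
    float4 PS_Main(VSOutput v) : SV_TARGET {
        return PBR_Shade(v);  // expensive
    }

    // CPU-side heuristic (C++-style): is the prepass worth it?
    bool should_use_depth_prepass(SceneStats stats) {
        // High overdraw + expensive shader → the prepass wins.
        // Cast first so integer counters do not truncate the ratio.
        float avg_overdraw = (float)stats.pixels_shaded / (float)stats.unique_pixels;
        float shader_cost = (float)stats.shader_alu_count;
        // Thresholds are rules of thumb; tune them against profiling data.
        return avg_overdraw > 2.0f && shader_cost > 100.0f;
    }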
Hi-Z chain (compute)
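    // Each level stores the max of the 4 source texels, i.e. the farthest depth
    // with a conventional 0 = near depth buffer (use min for reversed-Z).
    Texture2D<float>   SrcDepth;
    RWTexture2D<float> DstDepth;

    [numthreads(8, 8, 1)]
    void GenerateHiZ(uint3 tid : SV_DispatchThreadID) {
        float4 d;
        d.x = SrcDepth[tid.xy * 2 + uint2(0, 0)];
        d.y = SrcDepth[tid.xy * 2 + uint2(1, 0)];
        d.z = SrcDepth[tid.xy * 2 + uint2(0, 1)];
        d.w = SrcDepth[tid.xy * 2 + uint2(1, 1)];
        DstDepth[tid.xy] = max(max(d.x, d.y), max(d.z, d.w));
    }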
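A CPU reference of the same reduction can help when validating the compute shader, and it also shows how the result is used for culling. This is a minimal sketch under stated assumptions: `DepthImage`, `downsample_max`, and `is_occluded` are hypothetical names, depth is 0 at the near plane, and the source dimensions are even.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side depth image, row-major, depth 0 = near and 1 = far.
struct DepthImage {
    std::size_t width;
    std::size_t height;
    std::vector<float> texels;
    float at(std::size_t x, std::size_t y) const { return texels[y * width + x]; }
};

// One Hi-Z step: each destination texel is the max (farthest) of a 2x2 block.
DepthImage downsample_max(const DepthImage& src) {
    assert(src.width % 2 == 0 && src.height % 2 == 0);
    DepthImage dst{src.width / 2, src.height / 2, {}};
    dst.texels.reserve(dst.width * dst.height);
    for (std::size_t y = 0; y < dst.height; ++y) {
        for (std::size_t x = 0; x < dst.width; ++x) {
            const float a = src.at(2 * x, 2 * y);
            const float b = src.at(2 * x + 1, 2 * y);
            const float c = src.at(2 * x, 2 * y + 1);
            const float d = src.at(2 * x + 1, 2 * y + 1);
            dst.texels.push_back(std::max(std::max(a, b), std::max(c, d)));
        }
    }
    return dst;
}

// Conservative test: the object is hidden if its nearest depth lies behind
// the farthest occluder depth stored in the Hi-Z texel that covers it.
bool is_occluded(const DepthImage& hiz, std::size_t x, std::size_t y,
                 float object_min_depth) {
    return object_min_depth > hiz.at(x, y);
}
```

Storing the max keeps the test conservative: an object is culled only when it is behind everything in the block, so a coarse level never hides something that is actually visible.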
Occlusion query (validation)
    GLuint q;
    glGenQueries(1, &q);
    glBeginQuery(GL_SAMPLES_PASSED, q);
    DrawObject(obj);
    glEndQuery(GL_SAMPLES_PASSED);

    GLuint samples;
    // GL_QUERY_RESULT blocks until the GPU has finished the query.
    glGetQueryObjectuiv(q, GL_QUERY_RESULT, &samples);
    // samples == 0 → fully occluded
    glDeleteQueries(1, &q);
Early-Z disqualification
    // discard or a depth write (SV_Depth) in the pixel shader disables early-Z.
    // alpha is assumed here to be a member of VSOut.
    float4 PS_Bad(VSOut v) : SV_TARGET {
        if (v.alpha < 0.5) discard;  // ❌ breaks early-Z
        return shade(v);
    }

    // ✅ Early-Z friendly: move alpha-tested geometry to a separate pass.
    [earlydepthstencil]
    float4 PS_Good(VSOut v) : SV_TARGET {
        return shade(v);
    }
TBDR mobile (no prepass needed)
    // Mobile GPUs (Adreno, Mali, Apple) use tile-based (deferred) rendering.
    // The hardware already rejects hidden opaque surfaces before the pixel shader runs,
    // so an explicit Z-prepass is redundant.
Decision criteria

| Situation | Approach |
| --- | --- |
| Desktop + heavy PBR | Z-prepass |
| Mobile (TBDR) | No (hardware already does it) |
| Forward+ | Z-prepass + light culling |
| Deferred | G-buffer pass = prepass |
| Particles / transparent | After opaque (no prepass) |
| Modern Unreal Engine 5 | Visibility buffer / Nanite |
Default: desktop + heavy shaders → Z-prepass. Mobile → skip. Modern engines → visibility buffer.
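The decision criteria can be encoded as a small lookup function, for example to drive a renderer setting. Everything here is a hypothetical sketch: `Pipeline`, `RenderContext`, and `recommend` are illustrative names, and the "no prepass" result for light shaders on desktop is an assumption drawn from the trade-off notes rather than a row in the table.

```cpp
#include <cassert>
#include <string>

enum class Pipeline { Forward, ForwardPlus, Deferred, VisibilityBuffer };

// Hypothetical description of what is being rendered and where.
struct RenderContext {
    bool tile_based_gpu;  // mobile TBDR hardware
    bool transparent;     // particles or other blended geometry
    bool heavy_shader;    // e.g. PBR + IBL + many lights
    Pipeline pipeline;
};

std::string recommend(const RenderContext& ctx) {
    // Transparent geometry is drawn after opaque and never enters the prepass.
    if (ctx.transparent) return "after opaque, no prepass";
    // Tile-based hardware already removes hidden surfaces.
    if (ctx.tile_based_gpu) return "no prepass";
    switch (ctx.pipeline) {
        case Pipeline::ForwardPlus:      return "Z-prepass + light culling";
        case Pipeline::Deferred:         return "G-buffer pass";
        case Pipeline::VisibilityBuffer: return "visibility buffer";
        case Pipeline::Forward:          break;
    }
    // Only plain forward rendering reaches this point.
    assert(ctx.pipeline == Pipeline::Forward);
    // Assumption: without an expensive shader the doubled vertex cost is not repaid.
    return ctx.heavy_shader ? "Z-prepass" : "no prepass";
}
```

The order of the checks matters: transparency and tile-based hardware override the pipeline choice, matching the way the table treats them as separate rows.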