---
id: wiki-2026-0508-gpu-webgl-파이프라인의-미세-지연-micro-lat
title: Measuring Micro-latency in the GPU WebGL Pipeline
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [WebGL Micro-latency, GPU Pipeline Latency, WebGL Profiling]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [webgl, gpu, performance, latency, profiling]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: javascript
  framework: webgl2
---

# Measuring Micro-latency in the GPU WebGL Pipeline

## In one line

> **"GPU work is asynchronous: `gl.finish()` is the wrong tool, a query object is the right one."** A WebGL draw call is only enqueued into the GPU command buffer, so timing it with CPU-side `performance.now()` alone can be off from the real GPU time by milliseconds. `EXT_disjoint_timer_query_webgl2` is the right tool wherever the browser exposes it.

## Core

### Separate what you measure

- **CPU time**: JS → WebGL command encoding (validation, draw call dispatch).
- **GPU time**: shader execution + rasterization + blending.
- **Present latency**: frame pacing from swapchain to display (compositor-dependent).

### Measurement tools

- `EXT_disjoint_timer_query_webgl2`: GPU timer query; results are reported in nanoseconds (the effective resolution is implementation-dependent).
- `navigator.gpu.requestAdapter()` (WebGPU): native `timestamp-query` feature.
- Chrome DevTools Performance → GPU track.
- `Spector.js`: frame capture + per-call cost.

### Applications

1. Localizing the bottleneck when a 60→120Hz upgrade shrinks the frame budget (16.6→8.3 ms).
2. Detecting thermal throttling on mobile WebGL (a gradual increase in GPU time).
3. Measuring end-to-end input→render latency in PWA games.

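
As a rough illustration of applications 1 and 2 above, the sketch below computes the frame budget for a refresh rate, labels a frame as CPU- or GPU-bound from separately measured times, and flags a gradual rise in GPU time. The helper names (`frameBudgetMs`, `classifyFrame`, `hasUpwardDrift`) and the 1.2 ratio threshold are assumptions made for illustration; they are not part of any WebGL API.

```javascript
// Hypothetical helpers for illustration; names and thresholds are assumptions.

// Frame budget in milliseconds for a given refresh rate.
function frameBudgetMs(refreshHz) {
  return 1000 / refreshHz;
}

// Classify a frame from separately measured CPU and GPU times.
// Assumes CPU and GPU work overlap, so each is compared to the budget on its own.
function classifyFrame(cpuMs, gpuMs, refreshHz) {
  const budget = frameBudgetMs(refreshHz);
  if (cpuMs <= budget && gpuMs <= budget) return 'within-budget';
  return gpuMs >= cpuMs ? 'gpu-bound' : 'cpu-bound';
}

// Flag a gradual rise in GPU time: compare the mean of the newer half
// of a window against the mean of the older half.
function hasUpwardDrift(gpuMsWindow, ratioThreshold = 1.2) {
  const half = Math.floor(gpuMsWindow.length / 2);
  if (half === 0) return false;
  const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
  const older = mean(gpuMsWindow.slice(0, half));
  const newer = mean(gpuMsWindow.slice(half));
  return older > 0 && newer / older >= ratioThreshold;
}
```

For example, a frame with 4 ms of CPU time and 10 ms of GPU time fits the 60Hz budget, but `classifyFrame(4, 10, 120)` returns `'gpu-bound'` because 10 ms exceeds the 8.3 ms budget. The GPU times fed into these helpers would come from the timer-query results, not from `performance.now()`.
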
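## 💻 Patterns

### Timer Query: measuring per-frame GPU time

```javascript
// ext is null where the extension is unsupported; fall back to CPU timing there.
const ext = gl.getExtension('EXT_disjoint_timer_query_webgl2');
const queries = [];
let frameCount = 0;

function frame() {
  const q = gl.createQuery();
  gl.beginQuery(ext.TIME_ELAPSED_EXT, q);
  drawScene();
  gl.endQuery(ext.TIME_ELAPSED_EXT);
  queries.push({ q, frameId: frameCount++ });

  // Results resolve asynchronously, several frames later.
  // Reading GPU_DISJOINT_EXT resets the flag, so read it once per frame.
  const disjoint = gl.getParameter(ext.GPU_DISJOINT_EXT);
  for (let i = queries.length - 1; i >= 0; i--) {
    const { q, frameId } = queries[i];
    if (!gl.getQueryParameter(q, gl.QUERY_RESULT_AVAILABLE)) continue;
    if (!disjoint) {
      const ns = gl.getQueryParameter(q, gl.QUERY_RESULT);
      console.log(`frame ${frameId} GPU: ${(ns / 1e6).toFixed(2)} ms`);
    }
    // A result that arrives during a disjoint event is unreliable: drop it.
    gl.deleteQuery(q);
    queries.splice(i, 1);
  }
  requestAnimationFrame(frame);
}
```

### Anti-pattern: blocking with gl.finish()

```javascript
// Wrong: causes a pipeline stall; never use this in production.
const t0 = performance.now();
drawScene();
gl.finish(); // CPU↔GPU sync barrier
const t1 = performance.now();
// t1 - t0 is the cost of the stall, not the quantity you wanted to measure
```

### Input→render latency (RAIL model)

```javascript
let inputTs = 0;
// App-defined flag: set to true once the input has been applied to scene state.
let pendingInputProcessed = false;

canvas.addEventListener('pointerdown', e => {
  inputTs = e.timeStamp; // event creation time, same time origin as the rAF timestamp
});

function frame(now) {
  if (inputTs > 0 && pendingInputProcessed) {
    // Input → start of the frame that renders it.
    // GPU execution and present latency are not included.
    const latency = now - inputTs;
    if (latency > 100) console.warn(`input→render: ${latency.toFixed(1)}ms`);
    inputTs = 0;
    pendingInputProcessed = false;
  }
  drawScene();
  requestAnimationFrame(frame);
}
```

### Disjoint detection (possible thermal / power-state event)

The disjoint flag only says that GPU timing was interrupted; it is not a thermal sensor. Treat it as a hint that a power-state change or context switch may have occurred.

```javascript
function checkThermal() {
  const disjoint = gl.getParameter(ext.GPU_DISJOINT_EXT);
  if (disjoint) {
    // GPU context switch or power-state change: measurements are invalid.
    // metricsBuffer is an app-defined telemetry sink.
    metricsBuffer.flush('thermal_event');
  }
}
```

### WebGPU: native timestamp query

```javascript
// The device must be created with the 'timestamp-query' feature:
// adapter.requestDevice({ requiredFeatures: ['timestamp-query'] })
const querySet = device.createQuerySet({ type: 'timestamp', count: 2 });
const encoder = device.createCommandEncoder();
const pass = encoder.beginRenderPass({
  ...,
  timestampWrites: {
    querySet,
    beginningOfPassWriteIndex: 0,
    endOfPassWriteIndex: 1
  }
});
pass.draw(...);
pass.end();
// resolveBuffer must be created with GPUBufferUsage.QUERY_RESOLVE
encoder.resolveQuerySet(querySet, 0, 2, resolveBuffer, 0);
// Read back asynchronously: copy to a MAP_READ buffer, then mapAsync it
```

### Frame pacing (requestVideoFrameCallback)

This callback exists on `HTMLVideoElement` only, so it observes video frames, not a WebGL canvas directly.

```javascript
videoEl.requestVideoFrameCallback((now, meta) => {
  // captureTime exists only for camera or WebRTC sources, so guard against undefined.
  if (meta.captureTime === undefined) return;
  // capture → submitted for composition
  const presentLatency = meta.presentationTime - meta.captureTime;
  // capture → expected on-screen time (an estimate from the user agent)
  const displayLatency = meta.expectedDisplayTime - meta.captureTime;
});
```

### Sliding-window p99 measurement

```javascript
const samples = new Float32Array(120); // 2 s @ 60 Hz
let idx = 0;

function record(ms) {
  samples[idx++ % samples.length] = ms;
  if (idx % 60 === 0) {
    // Sort only the filled part so unwritten zeros do not skew the percentiles.
    const n = Math.min(idx, samples.length);
    const sorted = Array.from(samples.subarray(0, n)).sort((a, b) => a - b);
    console.log('p50:', sorted[Math.floor(n * 0.5)], 'p99:', sorted[Math.floor(n * 0.99)]);
  }
}
```

## Decision criteria

| Situation | Approach |
|---|---|
| WebGL2 + precise measurement | `EXT_disjoint_timer_query_webgl2` |
| WebGPU available | `timestampWrites` |
| Quick debug | Spector.js capture |
| Production telemetry | sliding-window p99 + disjoint flag |
| Cross-browser fallback | CPU time only, with the limitation acknowledged |

**Default**: WebGL2 + timer query with async readback; send p99 and the disjoint flag to telemetry.

## 🔗 Graph

- Parent: [[WebGL]] · [[GPU]]
- Variants: [[WebGPU]] · [[Vulkan]]
- Applications: [[Web-Game-Performance]] · [[Three.js]]
- Adjacent: [[Frame-Pacing]] · [[Browser-Compositor]]

## 🤖 LLM usage

**When**: diagnosing frame budget overruns in WebGL/WebGPU apps, 60→120Hz migration, mobile thermal analysis.

**When not**: general DOM performance (use the Performance API), Canvas2D (no GPU query).

## ❌ Anti-patterns

- **gl.finish() in the frame loop**: stalls the pipeline and distorts the measurement itself.
- **performance.now() only**: ignores GPU asynchrony; the GPU result for frame N may not be available until around frame N+2.
- **Ignoring the disjoint flag**: after a thermal or power-state change the measurement is garbage, yet it gets reported as-is.
- **Single sample**: ignores GC and compositor jitter; measure p99.

## 🧪 Verification / duplicates

- Verified (Khronos `EXT_disjoint_timer_query_webgl2` spec, Chrome DevTools docs).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: organized the WebGL micro-latency measurement patterns |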