2nd/10_Wiki/Topics/Frontend/GPU_WebGL 파이프라인의 미세 지연(Micro-latency) 측정 사례.md

---
id: wiki-2026-0508-gpu-webgl-파이프라인의-미세-지연-micro-lat
title: GPU WebGL Pipeline Micro-latency Measurement Case Study
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [WebGL Micro-latency, GPU Pipeline Latency, WebGL Profiling]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [webgl, gpu, performance, latency, profiling]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: javascript
  framework: webgl2
---

# GPU WebGL Pipeline Micro-latency Measurement Case Study

## One-liner
> **"GPU work is asynchronous: measure with query objects, not `gl.finish()`."** WebGL draw calls are only enqueued in the GPU command buffer, so timing with CPU-side `performance.now()` alone can diverge from actual GPU time by milliseconds. `EXT_disjoint_timer_query_webgl2` is the intended tool wherever the browser exposes it.

## Key points

### What to measure separately
- **CPU time**: JS → WebGL command encoding (validation, draw call dispatch).
- **GPU time**: shader execution + rasterization + blending.
- **Present latency**: frame pacing from swapchain to display (compositor-dependent).
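
To make the CPU/GPU split concrete, the sketch below compares each side against the frame budget for a given refresh rate (1000 / Hz). The function names `frameBudgetMs` and `classifyFrame` are hypothetical, and treating the larger of the two times as "the" bottleneck is a simplifying assumption: CPU and GPU work overlap in practice, and present latency is not covered.

```javascript
// Hypothetical helpers for illustration; not part of any WebGL API.

// Frame budget in milliseconds for a display refresh rate in Hz.
function frameBudgetMs(refreshHz) {
  return 1000 / refreshHz;
}

// Classifies one frame from separately measured CPU and GPU times.
// Assumption: a frame is "within budget" only if both sides fit the budget.
function classifyFrame(cpuMs, gpuMs, refreshHz) {
  const budget = frameBudgetMs(refreshHz);
  if (cpuMs <= budget && gpuMs <= budget) return 'within-budget';
  return gpuMs >= cpuMs ? 'gpu-bound' : 'cpu-bound';
}
```

For example, a frame with 3 ms of CPU time and 10 ms of GPU time fits a 60 Hz budget but would be classified as GPU-bound at 120 Hz.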

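### Measurement tools
- `EXT_disjoint_timer_query_webgl2`: GPU timer queries. Results are reported in nanoseconds, though the effective resolution is implementation-dependent.
- `navigator.gpu.requestAdapter()` (WebGPU): native timing through the optional `timestamp-query` feature.
- Chrome DevTools Performance → GPU track.
- `Spector.js`: frame capture + per-call cost.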

### Applications
1. Localizing bottlenecks when moving from 60 Hz to 120 Hz, where the frame budget shrinks from 16.7 ms to 8.3 ms.
2. Detecting thermal throttling in mobile WebGL (a gradual increase in GPU time).
3. Measuring end-to-end input→render latency in PWA games.
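
The "gradual increase in GPU time" in item 2 can be approximated as a positive trend over a window of GPU-time samples. The sketch below fits a least-squares slope; `slopePerSample`, `looksThrottled`, and the default threshold of 0.01 ms per frame are illustrative assumptions, not values taken from any specification. A rising trend is only a hint: a heavier scene produces the same signal.

```javascript
// Hypothetical helpers for illustration.

// Least-squares slope of samples against their index (ms per frame).
function slopePerSample(samples) {
  const n = samples.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  let meanY = 0;
  for (const y of samples) meanY += y / n;
  let num = 0;
  let den = 0;
  for (let i = 0; i < n; i++) {
    num += (i - meanX) * (samples[i] - meanY);
    den += (i - meanX) ** 2;
  }
  return num / den;
}

// Flags a window whose GPU time rises faster than the given threshold.
// The threshold is an assumed tuning knob and should be calibrated per device.
function looksThrottled(gpuMsSamples, minSlopeMsPerFrame = 0.01) {
  return slopePerSample(gpuMsSamples) > minSlopeMsPerFrame;
}
```

Feeding this only samples that were resolved without a disjoint event keeps invalid timings out of the trend.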

## 💻 Patterns

### Timer Query: measuring per-frame GPU time
```javascript
const ext = gl.getExtension('EXT_disjoint_timer_query_webgl2');
if (!ext) throw new Error('EXT_disjoint_timer_query_webgl2 is unavailable');
const queries = []; // pending queries
let frameCount = 0;

function frame() {
  // Only one TIME_ELAPSED_EXT query can be active at a time.
  const q = gl.createQuery();
  gl.beginQuery(ext.TIME_ELAPSED_EXT, q);
  drawScene();
  gl.endQuery(ext.TIME_ELAPSED_EXT);
  queries.push({ q, frameId: frameCount++ });

  // Results become available asynchronously, typically several frames later.
  // Read GPU_DISJOINT_EXT once per frame: querying it resets the flag.
  const disjoint = gl.getParameter(ext.GPU_DISJOINT_EXT);
  for (let i = queries.length - 1; i >= 0; i--) {
    const { q, frameId } = queries[i];
    if (!gl.getQueryParameter(q, gl.QUERY_RESULT_AVAILABLE)) continue;
    if (!disjoint) {
      const ns = gl.getQueryParameter(q, gl.QUERY_RESULT); // nanoseconds
      console.log(`frame ${frameId} GPU: ${(ns / 1e6).toFixed(2)} ms`);
    }
    // Always release resolved queries; results from a disjoint period are discarded.
    gl.deleteQuery(q);
    queries.splice(i, 1);
  }
  requestAnimationFrame(frame);
}
```

### Anti-pattern: gl.finish() blocking
```javascript
// Don't: this stalls the pipeline; never use it in production
const t0 = performance.now();
drawScene();
gl.finish(); // CPU↔GPU sync barrier
const t1 = performance.now();
// t1 - t0 includes the cost of the stall itself, not just the work being measured
```

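### Input→render latency (RAIL model)
```javascript
let inputTs = 0;
// App-defined flag: set to true once the input has been applied to scene state.
let pendingInputProcessed = false;

canvas.addEventListener('pointerdown', e => {
  // DOMHighResTimeStamp of the event, on the same timeline as the rAF timestamp
  inputTs = e.timeStamp;
});

function frame(now) {
  // `now` is the frame start time, so GPU execution and present latency are not included.
  if (inputTs > 0 && pendingInputProcessed) {
    const latency = now - inputTs;
    // 100 ms is the RAIL response budget
    if (latency > 100) console.warn(`input→render: ${latency.toFixed(1)}ms`);
    inputTs = 0;
    pendingInputProcessed = false;
  }
  drawScene();
  requestAnimationFrame(frame);
}
```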

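### Disjoint detection (thermal throttling)
```javascript
function checkThermal() {
  const disjoint = gl.getParameter(ext.GPU_DISJOINT_EXT);
  if (disjoint) {
    // The GPU timeline was interrupted (e.g. context switch or power state change),
    // so timings from this period are invalid. This may accompany thermal throttling
    // but does not prove it. `metricsBuffer` is an app-defined telemetry sink.
    metricsBuffer.flush('thermal_event');
  }
}
```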

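### WebGPU: native timestamp query
```javascript
// Requires a device created with requiredFeatures: ['timestamp-query'].
const querySet = device.createQuerySet({ type: 'timestamp', count: 2 });
// Two 64-bit timestamps; QUERY_RESOLVE usage is required for resolveQuerySet.
const resolveBuffer = device.createBuffer({
  size: 16,
  usage: GPUBufferUsage.QUERY_RESOLVE | GPUBufferUsage.COPY_SRC,
});
const encoder = device.createCommandEncoder();
const pass = encoder.beginRenderPass({
  ...,
  timestampWrites: { querySet, beginningOfPassWriteIndex: 0, endOfPassWriteIndex: 1 }
});
pass.draw(...);
pass.end();
encoder.resolveQuerySet(querySet, 0, 2, resolveBuffer, 0);
// Read back asynchronously: copy to a MAP_READ buffer, submit, then mapAsync()
```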
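
The asynchronous readback step could look roughly like the sketch below. It assumes `encoder` already contains the `resolveQuerySet` call into `resolveBuffer`, and that `resolveBuffer` was created with `COPY_SRC` usage; the function names `timestampPairToMs` and `readPassTimeMs` are hypothetical. WebGPU timestamps are expressed in nanoseconds, but browsers may quantize them, so very short passes can read as zero.

```javascript
// Converts a [begin, end] pair of resolved timestamps (BigInt nanoseconds) to ms.
function timestampPairToMs(timestamps) {
  const deltaNs = timestamps[1] - timestamps[0];
  return Number(deltaNs) / 1e6;
}

// Assumptions: `encoder` has not been finished yet and already resolved the
// query set into `resolveBuffer`, which has GPUBufferUsage.COPY_SRC.
async function readPassTimeMs(device, encoder, resolveBuffer) {
  const readBuffer = device.createBuffer({
    size: 16, // two 64-bit timestamps
    usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
  });
  encoder.copyBufferToBuffer(resolveBuffer, 0, readBuffer, 0, 16);
  device.queue.submit([encoder.finish()]);

  // mapAsync resolves after the submitted GPU work has completed.
  await readBuffer.mapAsync(GPUMapMode.READ);
  // Copy out of the mapped range before unmapping invalidates it.
  const timestamps = new BigUint64Array(readBuffer.getMappedRange().slice(0));
  readBuffer.unmap();
  readBuffer.destroy();
  return timestampPairToMs(timestamps);
}
```

In a real frame loop a small pool of readback buffers would avoid allocating one per frame; a single buffer is used here only to keep the sketch short.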

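### Frame pacing (requestVideoFrameCallback)
```javascript
// Applies to HTMLVideoElement sources only, not to the WebGL canvas itself.
videoEl.requestVideoFrameCallback((now, meta) => {
  // Time from submitting the frame for composition to its expected display
  const compositorLatency = meta.expectedDisplayTime - meta.presentationTime;
  // captureTime is only present for camera / WebRTC sources
  if (meta.captureTime !== undefined) {
    const presentLatency = meta.presentationTime - meta.captureTime;
  }
});
```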

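### Sliding-window p99 measurement
```javascript
const samples = new Float32Array(120); // 2 s at 60 Hz
let idx = 0;
function record(ms) {
  samples[idx++ % samples.length] = ms;
  // Wait until the window is full; zero-filled slots would skew the percentiles.
  if (idx >= samples.length && idx % 60 === 0) {
    const sorted = samples.slice().sort(); // typed arrays sort numerically
    // Nearest-rank index: ceil(p * n) - 1
    console.log('p50:', sorted[59], 'p99:', sorted[118]);
  }
}
```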

## Decision criteria
| Situation | Approach |
|---|---|
| WebGL2 + precise measurement | `EXT_disjoint_timer_query_webgl2` |
| WebGPU available | `timestampWrites` |
| Quick debug | Spector.js capture |
| Production telemetry | sliding-window p99 + disjoint flag |
| Cross-browser fallback | CPU time only, with the limitation acknowledged |

**Default**: WebGL2 with asynchronous timer-query readback, reporting p99 and the disjoint flag as telemetry.
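
The table can be turned into a small capability check at startup. The sketch below is one possible ordering (WebGPU timestamps first, then the WebGL2 timer query, then CPU-only timing); the function name `pickTimingStrategy` and the returned labels are hypothetical, and the preference order is an assumption that an app on WebGL2 would likely reverse.

```javascript
// Hypothetical helper: picks a timing strategy from what the platform exposes.
// `gl` is a WebGL2RenderingContext or null; `adapter` is a GPUAdapter or null.
function pickTimingStrategy(gl, adapter) {
  // GPUAdapter.features is a set-like object of supported feature names.
  if (adapter && adapter.features && adapter.features.has('timestamp-query')) {
    return 'webgpu-timestamp';
  }
  // getExtension returns null when the extension is not available.
  if (gl && gl.getExtension('EXT_disjoint_timer_query_webgl2')) {
    return 'webgl2-timer-query';
  }
  // Fallback: CPU-side timing only, which cannot see actual GPU execution time.
  return 'cpu-only';
}
```

Recording the chosen strategy alongside each telemetry sample makes it possible to keep CPU-only numbers from being mixed with real GPU timings.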

## 🔗 Graph
- Parent: [[WebGL]] · [[GPU]]
- Variants: [[WebGPU]] · [[Vulkan]]
- Applications: [[Three.js]]
- Adjacent: [[Browser-Compositor]]

## 🤖 LLM usage
**When**: diagnosing frame budget overruns in WebGL/WebGPU apps, 60→120 Hz migration, mobile thermal analysis.
**When not**: general DOM performance (use the Performance API), Canvas2D (no GPU query available).

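## ❌ Anti-patterns
- **gl.finish() in the frame loop**: stalls the pipeline, so the measurement itself is distorted.
- **performance.now() only**: ignores GPU asynchrony; the GPU work for frame N may not complete, and its timing may not be available, until around frame N+2.
- **Ignoring the disjoint flag**: after a thermal or power state change the measurement is garbage, yet it gets reported as is.
- **Single sample**: ignores GC and compositor jitter; measure p99 over a window instead.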

## 🧪 Verification / duplicates
- Verified (Khronos `EXT_disjoint_timer_query_webgl2` spec, Chrome DevTools docs).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: consolidated WebGL micro-latency measurement patterns |