2nd/10_Wiki/Topics/AI_and_ML/Prioritized-Experience-Replay.md
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
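Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Link-name normalization: ~2,400 broken-link fixes, direct redirect resolutions, and concept mappings
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections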

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00


id: wiki-2026-0508-prioritized-experience-replay
title: Prioritized Experience Replay
category: 10_Wiki/Topics
status: duplicate
canonical_id: experience-replay
duplicate_of: Experience-Replay
aliases: PER
source_trust_level: A
confidence_score: 0.9
verification_status: redirected
tags: duplicate, reinforcement-learning, replay-buffer
last_reinforced: 2026-05-10
github_commit: pending

Prioritized Experience Replay

This document is a duplicate of Experience-Replay. It redirects to the canonical document.

Key summary (specialization aspects)

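  • Schaul et al., 2016 (ICLR): samples transitions with probability proportional to TD-error magnitude, so high-error transitions are trained on more often.
  • Sampling probability: P(i) = p_i^α / Σ_k p_k^α with p_i = |δ_i| + ε, i.e. P(i) ∝ |δ_i|^α (α = 0.6 typical for the proportional variant).
  • Importance-sampling weights: w_i = (N · P(i))^(-β), usually normalized by max_i w_i, correct the bias introduced by non-uniform sampling; β is annealed from 0.4 to 1.0 over training.
  • SumTree data structure: O(log N) sampling and O(log N) priority update.
  • Generalizes uniform replay (α = 0 recovers uniform sampling); one of the components of Rainbow DQN.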
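
The sampling rule, the importance-sampling weights, and the SumTree in the bullets above can be put together in a small sketch. This is an illustrative, standard-library-only version and not the reference implementation: the names `SumTree`, `priority`, and `sample`, the `eps` offset, and the stratified draw (one sample per equal-mass segment) are assumptions chosen for the example.

```python
import random


class SumTree:
    """Complete binary tree in a flat list: leaves hold priorities,
    every internal node holds the sum of its two children."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # Indices 0..capacity-2 are internal nodes, the rest are leaves.
        self.nodes = [0.0] * (2 * capacity - 1)

    def total(self) -> float:
        return self.nodes[0]

    def get(self, leaf: int) -> float:
        return self.nodes[leaf + self.capacity - 1]

    def update(self, leaf: int, value: float) -> None:
        """Set one leaf and refresh the sums on the path to the root: O(log N)."""
        idx = leaf + self.capacity - 1
        self.nodes[idx] = value
        while idx > 0:
            idx = (idx - 1) // 2
            self.nodes[idx] = self.nodes[2 * idx + 1] + self.nodes[2 * idx + 2]

    def find(self, mass: float) -> int:
        """Return the leaf whose cumulative-priority interval contains mass: O(log N)."""
        idx = 0
        while idx < self.capacity - 1:  # still at an internal node
            left = 2 * idx + 1
            if mass <= self.nodes[left]:
                idx = left
            else:
                mass -= self.nodes[left]
                idx = left + 1
        return idx - (self.capacity - 1)


def priority(td_error: float, alpha: float = 0.6, eps: float = 1e-6) -> float:
    """Leaf value p_i^alpha with p_i = |delta_i| + eps (eps is an assumed small offset)."""
    return (abs(td_error) + eps) ** alpha


def sample(tree: SumTree, size: int, batch_size: int, beta: float, rng: random.Random):
    """Draw batch_size leaves with P(i) = leaf_i / total and return
    max-normalized weights w_i = (size * P(i)) ** (-beta)."""
    total = tree.total()
    segment = total / batch_size
    leaves, weights = [], []
    for k in range(batch_size):
        # Stratified draw: one uniform sample inside the k-th equal-mass segment.
        mass = (k + rng.random()) * segment
        leaf = tree.find(mass)
        prob = tree.get(leaf) / total
        leaves.append(leaf)
        weights.append((size * prob) ** (-beta))
    max_w = max(weights)
    return leaves, [w / max_w for w in weights]
```

In a training loop under these assumptions, each sampled transition's loss would be scaled by its weight, and `tree.update(leaf, priority(new_td_error))` would be called after the TD errors are recomputed.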

🔗 Graph

🕓 Change history

Date Change
2026-05-08 Phase 1
2026-05-10 Duplicate handling: redirected to canonical Experience-Replay; PER specialization aspects preserved