2nd/10_Wiki/Topics/AI_and_ML/Mechanistic Interpretability & Steering.md
2026-05-10 22:08:15 +09:00

---
id: wiki-2026-0508-mechanistic-interpretability-ste
title: Mechanistic Interpretability & Steering
category: 10_Wiki/Topics
status: duplicate
canonical_id: Mechanistic Interpretability (기계적 해석 가능성)
duplicate_of: Mechanistic Interpretability (기계적 해석 가능성)
aliases:
  - Mech Interp
  - Steering
  - Activation Steering
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags:
  - redirect
  - mech-interp
  - alignment
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: none
  framework: none
---

Mechanistic Interpretability & Steering

This document has been merged into Mechanistic Interpretability (기계적 해석 가능성).

One-line summary

The field that reverse-engineers the internal circuits of neural networks in order to understand and steer their behavior. See the canonical document.

🔗 Graph

🕓 Changelog