---
id: wiki-2026-0508-mechanistic-interpretability-ste
title: "Mechanistic Interpretability & Steering"
category: 10_Wiki/Topics
status: duplicate
canonical_id: "[[Mechanistic Interpretability (기계적 해석 가능성)]]"
duplicate_of: "[[Mechanistic Interpretability (기계적 해석 가능성)]]"
aliases: [Mech Interp, Steering, Activation Steering]
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [redirect, mech-interp, alignment]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: { language: none, framework: none }
---

# Mechanistic Interpretability & Steering

> This document has been merged into [[Mechanistic Interpretability (기계적 해석 가능성)]].

## One-line summary

The field that reverse-engineers the internal circuits of neural networks in order to understand and steer their behavior. See the canonical document for details.

## 🔗 Graph

- Parent: [[Mechanistic Interpretability (기계적 해석 가능성)]] (canonical)
- Related: [[AI Safety & Constitutional AI]], [[AI-Alignment]]

## 🕓 Changelog

- 2026-05-10: Converted to a REDIRECT; canonical = [[Mechanistic Interpretability (기계적 해석 가능성)]].