feat: Update Wiki knowledge assets - add UX Scenarios, Frontend, Game Design, Topics [2026-05-08]

2026-05-08 19:52:07 +09:00
parent 9dd3d40662
commit 5ba5a55c78
3984 changed files with 334557 additions and 28839 deletions
@@ -1,29 +1,25 @@
 ---
-id: AI-COMP-001
-category: Unified
-confidence_score: 1.0
-tags: [ai, [[Deep-Learning|Deep-Learning]], [[Model-Compression|Model-Compression]], [[Quantization|Quantization]], pruning, efficient-ai]
-last_reinforced: 2026-04-26
+id: wiki-2026-0508-model-compression-strategies
+title: Model Compression Strategies
+category: 10_Wiki/Topics/AI_and_ML
+status: merged
+redirect_to: LLM_Optimization_and_Deployment_Strategies
+canonical_id: LLM_Optimization_and_Deployment_Strategies
+aliases: [P-Reinforce-REDIRECT-MODEL-COMP-STRAT]
+duplicate_of: none
+source_trust_level: A
+confidence_score: 0.92
+tags: [redirect]
+raw_sources: []
+last_reinforced: 2026-05-08
+github_commit: pending
+inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
 ---

-# Model Compression Strategies
+# [[Model-Compression-Strategies]]

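-## 📌 One-Line Insight (The Karpathy Summary)
-> "Preserve the model's intelligence but shrink its body ([[Parameter|Parameter]]s), so that intelligence can breathe on every device, beyond the limits of the cloud." Model compression is the set of techniques that reduce a deep learning model's size and computational complexity to speed up inference and cut memory usage.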
+> [!IMPORTANT]
+> This document has been merged into **[[LLM_Optimization_and_Deployment_Strategies]]** under the high-density knowledge asset consolidation policy.

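-## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** "Redundancy Reduction and Precision Scaling": a compression pattern that removes unnecessary connections in a neural network or adjusts numeric precision, sharply lowering resource usage while minimizing accuracy loss.
- **Key strategies:**
-    - **Quantization:** Converts 32-bit weights to 8-bit or 4-bit integers, maximizing compute speed and energy efficiency.
-    - **Weight Pruning:** Sets low-importance weights to zero, making the model sparse.
-    - **Knowledge [[Distillation|Distillation]]:** Transfers the knowledge of a large model to a lightweight, fast small model.
-    - **Low-Rank Factorization:** Decomposes a large matrix into a product of smaller matrices to reduce the parameter count.
- **Significance:** A core infrastructure technology that lets AI models run in real time beyond the lab, at every real-world touchpoint such as mobile, IoT, and automotive.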
-
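-## ⚠️ Contradictions and Updates (Contradictions & RL Update)
- **Conflict with past data:** Moving beyond the assumption that compression always degrades performance, a growing number of cases show that appropriate compression combined with fine-tuning can instead prevent overfitting and improve generalization.
- **Policy change:** The Antigravity project requires quantization validation at 8 bits or higher for every deployment model, treating agent response speed as the top priority.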
-
-## 🔗 Knowledge Links (Graph)
- [[Mobile-AI-Optimization|Mobile-AI-Optimization]], [[Knowledge-Distillation|Knowledge-Distillation]], [[Inference-Optimization|Inference-Optimization]], [[Low-Rank-Adaptation-LoRA|Low-Rank-Adaptation-LoRA]]
- **Raw Source:** 10_Wiki/Topics/AI/Model-Compression-Strategies.md
+---
+*Redirected to: [[LLM_Optimization_and_Deployment_Strategies]]*