---
id: wiki-2026-0508-software-architecture-recovery
title: Software Architecture Recovery
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Architecture Recovery, Reverse Architecting]
duplicate_of: none
source_trust_level: A
confidence_score: 0.85
verification_status: applied
tags: [architecture, reverse-engineering, legacy, static-analysis]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: networkx
---
# Software Architecture Recovery
## In one line
> **"Inferring an architectural model from source code."** The goal is to understand legacy systems whose documentation is lost or outdated. As of 2026, LLM-augmented static analysis (Claude Opus 4.7, GPT-5) is the dominant approach: dependency graph + clustering + LLM-named module summaries.
## Core concepts
### Phases
1. **Extraction**: parse source code, build files, and config into entities (file, class, module).
2. **Abstraction**: build the dependency graph, call graph, and data-flow view.
3. **Clustering**: community detection (Louvain, label propagation) or LLM-based semantic grouping.
4. **Presentation**: C4 diagram, dependency matrix, ADR.
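
The dependency matrix named in the Presentation phase can be sketched in a few lines. This is a minimal illustration, assuming the Abstraction phase has already produced edges as `(source, target)` pairs; the function names `dependency_matrix` and `render` and the module names are hypothetical.

```python
def dependency_matrix(edges):
    """Return (names, matrix) where matrix[i][j] == 1 if names[i] depends on names[j]."""
    names = sorted({node for edge in edges for node in edge})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for src, dst in edges:
        matrix[index[src]][index[dst]] = 1
    return names, matrix


def render(names, matrix):
    """Format the matrix as plain text, one row per module ('x' marks a dependency)."""
    width = max(len(name) for name in names)
    lines = []
    for name, row in zip(names, matrix):
        cells = " ".join("x" if cell else "." for cell in row)
        lines.append(f"{name.ljust(width)}  {cells}")
    return "\n".join(lines)


# Hypothetical module names, for illustration only.
example_edges = [("web", "api"), ("api", "db"), ("api", "auth")]
names, matrix = dependency_matrix(example_edges)
print(render(names, matrix))
```

Rows are the depending module and columns the dependency, so marks on both sides of the diagonal for the same pair indicate a mutual dependency.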
### Techniques
- **Static**: AST parsing, import graph (madge, jdeps, pyan).
- **Dynamic**: trace logs, profilers, distributed tracing (OTel).
- **Hybrid**: merge the static graph with runtime call data.
- **LLM-augmented**: feed each module's README and code to an LLM to get a module summary and an architecture description.
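
One way to picture the hybrid technique is to label every edge by where it was observed. The sketch below assumes static edges arrive as `(source, target)` pairs and runtime data as a mapping from the same pairs to call counts; `merge_static_and_runtime` and the service names are hypothetical.

```python
from collections import Counter


def merge_static_and_runtime(static_edges, runtime_calls):
    """Combine static edges with runtime call counts.

    static_edges: iterable of (src, dst) pairs from import/AST analysis.
    runtime_calls: mapping of (src, dst) -> observed call count.
    """
    static = set(static_edges)
    merged = {}
    for edge in static | set(runtime_calls):
        calls = runtime_calls.get(edge, 0)
        if edge in static and calls:
            source = "both"
        elif edge in static:
            source = "static-only"   # declared but not exercised in the trace window
        else:
            source = "runtime-only"  # invisible to import analysis
        merged[edge] = {"source": source, "calls": calls}
    return merged


# Hypothetical data, for illustration only.
static_edges = [("checkout", "cart"), ("checkout", "legacy_tax")]
runtime_calls = Counter({("checkout", "cart"): 120, ("checkout", "payment"): 45})
merged = merge_static_and_runtime(static_edges, runtime_calls)
for edge, info in sorted(merged.items()):
    print(edge, info)
```

Edges marked `static-only` are candidates for dead code, while `runtime-only` edges usually point at dependencies that imports cannot reveal, such as network calls.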
### Applications
1. Legacy modernization assessment.
2. Microservice decomposition planning.
3. Onboarding new engineers.
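
For decomposition planning, a rough first signal is how many dependencies cross the boundaries of the recovered clusters. The following is a sketch under stated assumptions: clusters are given as a mapping from cluster name to module set, and `cluster_coupling` plus all module names are hypothetical.

```python
from collections import Counter


def cluster_coupling(edges, clusters):
    """Count dependencies that cross cluster boundaries.

    edges: iterable of (src_module, dst_module) pairs.
    clusters: mapping of cluster name -> set of modules.
    """
    owner = {module: name for name, members in clusters.items() for module in members}
    coupling = Counter()
    for src, dst in edges:
        a, b = owner.get(src), owner.get(dst)
        # Skip modules outside every cluster and edges inside one cluster.
        if a is not None and b is not None and a != b:
            coupling[(a, b)] += 1
    return coupling


# Hypothetical clusters and edges, for illustration only.
clusters = {
    "orders": {"orders.api", "orders.model"},
    "billing": {"billing.invoice", "billing.tax"},
}
edges = [
    ("orders.api", "orders.model"),
    ("orders.api", "billing.invoice"),
    ("orders.model", "billing.tax"),
    ("billing.invoice", "billing.tax"),
]
coupling = cluster_coupling(edges, clusters)
print(coupling.most_common())
```

A pair of clusters with few crossing edges in one direction and none in the other is usually easier to split into separate services than a pair with heavy two-way coupling.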
## 💻 Patterns
### Python: import graph extraction
```python
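import ast
import os

import networkx as nx

G = nx.DiGraph()
for root, _, files in os.walk("src"):
    for f in files:
        if not f.endswith(".py"):
            continue
        path = os.path.join(root, f)
        with open(path, encoding="utf-8") as fh:
            tree = ast.parse(fh.read(), filename=path)
        # Dotted module name relative to src/, e.g. src/pkg/a.py -> pkg.a,
        # so node names line up with the names used in import statements.
        mod = os.path.relpath(path, "src").removesuffix(".py").replace(os.sep, ".")
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    G.add_edge(mod, alias.name)
            elif isinstance(node, ast.ImportFrom) and node.module:
                # Relative imports (node.level > 0) are recorded by bare name only.
                G.add_edge(mod, node.module)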
```
### JavaScript/TypeScript: madge dependency graph
```bash
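# Render the dependency graph (--image requires Graphviz to be installed)
npx madge --image graph.svg --extensions ts,tsx src/
# Detect circular dependencies
npx madge --circular --extensions ts,tsx src/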
```
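
For a Python import graph, a comparable cycle check can be sketched with the standard library alone. The code below finds strongly connected components with Kosaraju's algorithm over a plain `dict` of module to imported modules; any component with more than one module is an import cycle. The function names and the example graph are hypothetical. With networkx, `nx.strongly_connected_components(G)` on the `DiGraph` gives the same grouping.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm over a dict of node -> iterable of successors."""
    nodes = set(graph) | {dst for deps in graph.values() for dst in deps}
    seen = set()

    def visit(start, adjacency, out):
        # Iterative DFS that appends nodes to `out` in post-order.
        stack = [(start, iter(adjacency.get(start, ())))]
        seen.add(start)
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(adjacency.get(nxt, ()))))
                    break
            else:
                stack.pop()
                out.append(node)

    # Pass 1: record finishing order on the original graph.
    order = []
    for node in sorted(nodes):
        if node not in seen:
            visit(node, graph, order)

    # Pass 2: walk the reversed graph in decreasing finishing order.
    reverse = {}
    for src, deps in graph.items():
        for dst in deps:
            reverse.setdefault(dst, []).append(src)
    seen.clear()
    components = []
    for node in reversed(order):
        if node not in seen:
            component = []
            visit(node, reverse, component)
            components.append(sorted(component))
    return components


def import_cycles(graph):
    """Groups of modules that import each other, directly or transitively."""
    return [c for c in strongly_connected_components(graph) if len(c) > 1]


# Hypothetical import graph: a -> b -> c -> a forms a cycle, d only imports a.
imports = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
cycles = import_cycles(imports)
print(cycles)
```

This reports groups rather than individual cycle paths, and a module that imports only itself is not reported because its component has a single member.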
### Java: jdeps
```bash
jdeps -verbose:class -recursive app.jar > deps.txt
jdeps --inverse --package com.acme.payment app.jar
```
### Community detection (Louvain)
```python
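import networkx as nx
from networkx.algorithms.community import louvain_communities

# Cluster the undirected view of the import graph G built above.
# resolution > 1 favors smaller communities; seed makes the run reproducible.
modules = louvain_communities(G.to_undirected(), resolution=1.2, seed=42)
for i, m in enumerate(modules):
    print(f"Module {i}: {sorted(m)[:5]}...")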
```
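
Because the `resolution` setting changes how many clusters come out, it helps to score a partition before trusting it. The sketch below computes Newman modularity for an undirected, unweighted graph in plain Python; the helper name and the example graph are hypothetical. networkx ships an equivalent `modularity` function in `networkx.algorithms.community`.

```python
def modularity(edges, communities):
    """Newman modularity for an undirected, unweighted graph.

    edges: iterable of (u, v) pairs; communities: iterable of disjoint node sets.
    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2
    """
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}  # drop self-loops
    m = len(unique)
    if m == 0:
        return 0.0
    degree = {}
    for edge in unique:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    score = 0.0
    for community in communities:
        members = set(community)
        internal = sum(1 for edge in unique if edge <= members)
        degree_sum = sum(degree.get(node, 0) for node in members)
        score += internal / m - (degree_sum / (2 * m)) ** 2
    return score


# Hypothetical graph: two triangles joined by a single bridge edge (c, d).
example_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
split = modularity(example_edges, [{"a", "b", "c"}, {"d", "e", "f"}])
lumped = modularity(example_edges, [{"a", "b", "c", "d", "e", "f"}])
print(f"two clusters: {split:.3f}, one cluster: {lumped:.3f}")
```

A higher score means more edges fall inside clusters than a random graph with the same degrees would produce, so comparing scores across `resolution` values is a cheap sanity check before handing clusters to an LLM for naming.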
### LLM-augmented module naming (Claude Opus 4.7)
```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def name_module(files: list[str], code_snippets: list[str]) -> str:
    """Ask the model for a short module name and a one-line responsibility."""
    snippets = "\n\n".join(code_snippets)
    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=200,
        messages=[{"role": "user", "content":
            f"Files: {files}\n\nSnippets:\n{snippets}\n\n"
            "Give a 3-word module name + 1-line responsibility."}],
    )
    return msg.content[0].text
```
### Runtime trace → architecture (OpenTelemetry)
```python
# Aggregate spans into a service-level call graph.
# fetch_traces is a placeholder for the tracing backend's query API;
# each span is assumed to expose .service and .parent.
from collections import Counter

edges = Counter()
for span in fetch_traces(service="checkout", since="24h"):
    if span.parent and span.parent.service != span.service:
        edges[(span.parent.service, span.service)] += 1
# The highest-count edges are the primary architectural connections.
```
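
Exported spans usually reference their parent by ID rather than by object, so the same aggregation can be sketched over flat records. The record shape below (`span_id`, `parent_id`, `service`) is an assumption for illustration, not an OpenTelemetry SDK type, and `service_edges` is a hypothetical helper.

```python
from collections import Counter


def service_edges(spans):
    """Count caller -> callee service pairs from flat span records.

    Each span is assumed to be a dict with "span_id", "parent_id" (or None)
    and "service".
    """
    service_of = {span["span_id"]: span["service"] for span in spans}
    edges = Counter()
    for span in spans:
        parent_service = service_of.get(span["parent_id"])
        # Keep only calls that cross a service boundary.
        if parent_service and parent_service != span["service"]:
            edges[(parent_service, span["service"])] += 1
    return edges


# Hypothetical trace data, for illustration only.
spans = [
    {"span_id": "1", "parent_id": None, "service": "web"},
    {"span_id": "2", "parent_id": "1", "service": "checkout"},
    {"span_id": "3", "parent_id": "2", "service": "checkout"},
    {"span_id": "4", "parent_id": "2", "service": "payment"},
    {"span_id": "5", "parent_id": "3", "service": "payment"},
]
edges = service_edges(spans)
for (caller, callee), count in edges.most_common(3):
    print(f"{caller} -> {callee}: {count}")
```

Spans whose parent is missing from the batch are skipped, so sampling gaps undercount edges rather than invent them.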
### C4 diagram emission (Structurizr DSL)
```dsl
workspace {
    model {
        user = person "Customer"
        sys = softwareSystem "Shop" {
            web = container "Web"
            api = container "API"
            db = container "Postgres"
        }
        user -> web "uses"
        web -> api "REST"
        api -> db "JDBC"
    }
}
```
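
When containers and relationships come out of the recovery pipeline, the DSL text can be generated instead of written by hand. This is a minimal sketch that only covers one software system with containers and relationships; `to_structurizr` is a hypothetical helper, and identifiers are assumed to already be valid DSL identifiers (letters, digits, underscores).

```python
def to_structurizr(system, containers, relations):
    """Render containers and relationships as Structurizr DSL text.

    containers: mapping of identifier -> display name.
    relations: iterable of (source_id, target_id, description).
    """
    lines = [
        "workspace {",
        "    model {",
        f'        sys = softwareSystem "{system}" {{',
    ]
    for ident, name in containers.items():
        lines.append(f'            {ident} = container "{name}"')
    lines.append("        }")
    for src, dst, desc in relations:
        lines.append(f'        {src} -> {dst} "{desc}"')
    lines += ["    }", "}"]
    return "\n".join(lines)


dsl = to_structurizr(
    "Shop",
    {"web": "Web", "api": "API", "db": "Postgres"},
    [("web", "api", "REST"), ("api", "db", "JDBC")],
)
print(dsl)
```

Names and descriptions are inserted verbatim, so values containing double quotes would need escaping before use.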
## Decision criteria
| Situation | Approach |
|---|---|
| Small monolith (<100k LoC) | Static import graph + manual review |
| Microservices distributed | Distributed tracing (OTel) + service map |
| Legacy COBOL/Java enterprise | Lattix / Structure101 commercial tools |
| Quick high-level overview | LLM (Opus 4.7) on README + top-level dirs |
| Decomposition planning | Static + dynamic + LLM hybrid |
**Default**: static import graph (madge / pyan / jdeps) → Louvain clustering → LLM naming → C4 diagram.
## 🔗 Graph
- Parent: [[Software Architecture]]
- Applied in: [[Legacy Modernization]]
- Adjacent: [[C4 Model (Architecture Documentation)]] · [[Dependency Analysis]] · [[Static Analysis]]
## 🤖 LLM usage
**When to use**: onboarding onto an undocumented codebase, drafting a modernization plan, detecting dependency cycles.
**When not to use**: the current architecture is already well documented; reading the ADRs is enough.
## ❌ Anti-patterns
- **Recovered = correct**: the inferred architecture reflects how the system evolved, not the ideal design. Validate with team.
- **Static only for distributed system**: the runtime topology is lost.
- **LLM hallucination**: a module name can be plausible without being correct. Verify it against the code.
## 🧪 Verification / duplicates
- Verified (Garlan & Schmerl SAR research, 2002-2024; SEI architecture reconstruction guides).
- Trust level: A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — recovery techniques with LLM-augmented analysis |