---
id: wiki-2026-0508-software-architecture-recovery
title: Software Architecture Recovery
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Architecture Recovery, Reverse Architecting]
duplicate_of: none
source_trust_level: A
confidence_score: 0.85
verification_status: applied
tags: [architecture, reverse-engineering, legacy, static-analysis]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: networkx
---

# Software Architecture Recovery

## One-line summary

> **"Source code → inferred architectural model."** The goal is to understand a legacy system whose documentation is lost or outdated. As of 2026, LLM-augmented static analysis (Claude Opus 4.7, GPT-5) is the dominant approach: dependency graph + clustering + LLM-named module summaries.

## Core concepts

### Phases

1. **Extraction**: parse source code, build files, and config into entities (file, class, module).
2. **Abstraction**: derive the dependency graph, call graph, and data flow.
3. **Clustering**: community detection (Louvain, label propagation) plus LLM-based semantic grouping.
4. **Presentation**: C4 diagram, dependency matrix, ADR.

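The dependency matrix named in the Presentation phase can be derived directly from the edges produced by the Abstraction phase. The following is a minimal sketch, not a reference implementation: the function name `dependency_matrix` and the three-module demo data are assumptions made for illustration.

```python
from collections.abc import Iterable


def dependency_matrix(edges: Iterable[tuple[str, str]]) -> tuple[list[str], list[list[int]]]:
    """Return (sorted node names, matrix) with matrix[i][j] == 1 when node i depends on node j."""
    edges = list(edges)
    nodes = sorted({name for edge in edges for name in edge})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        matrix[index[src]][index[dst]] = 1
    return nodes, matrix


# Hypothetical three-module system, for illustration only.
demo_edges = [("web", "api"), ("api", "db"), ("web", "db")]
nodes, matrix = dependency_matrix(demo_edges)
for name, row in zip(nodes, matrix):
    print(f"{name:>3} {row}")
```

A row with many 1s marks a module with many outgoing dependencies; a column with many 1s marks a module that many others depend on.
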
### Techniques

- **Static**: AST parsing, import graphs (madge, jdeps, pyan).
- **Dynamic**: trace logs, profilers, distributed tracing (OTel).
- **Hybrid**: merge the static structure with runtime call data.
- **LLM-augmented**: feed each module's README and code to an LLM to produce a module summary and an architecture description.

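The hybrid technique comes down to joining two edge sets and noting which analysis saw each edge. The sketch below shows one possible shape for that join. The function name `merge_static_dynamic`, the `source` labels, and the demo data are all assumptions, not part of any tool mentioned above.

```python
from collections.abc import Iterable, Mapping

Edge = tuple[str, str]  # (caller, callee)


def merge_static_dynamic(
    static_edges: Iterable[Edge], runtime_calls: Mapping[Edge, int]
) -> dict[Edge, dict]:
    """Combine both views into one edge table keyed by (caller, callee)."""
    static_set = set(static_edges)
    merged = {}
    for edge in static_set | set(runtime_calls):
        calls = runtime_calls.get(edge, 0)
        if edge in static_set and calls:
            source = "both"
        elif edge in static_set:
            source = "static-only"   # declared in code but never observed at runtime
        else:
            source = "runtime-only"  # observed at runtime but invisible to static analysis
        merged[edge] = {"source": source, "calls": calls}
    return merged


# Hypothetical data, for illustration only.
static_edges = [("web", "api"), ("api", "db"), ("api", "legacy_export")]
runtime_calls = {("web", "api"): 120, ("api", "db"): 340, ("api", "queue"): 15}
merged = merge_static_dynamic(static_edges, runtime_calls)
for edge, info in sorted(merged.items()):
    print(edge, info)
```

Edges labelled `static-only` are candidates for dead code or rarely exercised paths. Edges labelled `runtime-only` often come from reflection, dependency injection, or network calls that an import graph cannot see.
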
### Applications

1. Legacy modernization assessment.
2. Microservice decomposition planning.
3. Onboarding new engineers.

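## 💻 Patterns

### Python: import graph extraction

```python
import ast
import os
import networkx as nx

SRC_ROOT = "src"  # package root; adjust for your codebase
G = nx.DiGraph()
for root, _, files in os.walk(SRC_ROOT):
    for f in files:
        if not f.endswith(".py"):
            continue
        path = os.path.join(root, f)
        with open(path, encoding="utf-8") as fh:
            tree = ast.parse(fh.read(), filename=path)
        # Name modules relative to SRC_ROOT so they match absolute import targets.
        mod = os.path.relpath(path, SRC_ROOT).removesuffix(".py").replace(os.sep, ".")
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                G.add_edges_from((mod, alias.name) for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                G.add_edge(mod, node.module)  # relative imports keep their bare name
```
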
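Dependency cycles are usually the first thing worth checking once an import graph exists. The sketch below uses only the standard library's `graphlib`, which reports one cycle when it fails to order the graph. The helper name `find_cycle` and the demo edges are assumptions; a `(source, target)` edge list such as `G.edges` could be passed in the same way.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(edges):
    """Return the nodes of one dependency cycle, or None if the graph is acyclic."""
    deps = {}
    for src, dst in edges:
        deps.setdefault(src, set()).add(dst)
    try:
        TopologicalSorter(deps).prepare()
    except CycleError as err:
        # graphlib stores the cycle in args[1]; its first and last node are the same.
        return err.args[1]
    return None


# Hypothetical module names, for illustration only.
cyclic = [("orders", "billing"), ("billing", "customers"), ("customers", "orders")]
acyclic = [("web", "api"), ("api", "db")]
print(find_cycle(cyclic))
print(find_cycle(acyclic))
```

This reports a single cycle per call, which is enough for a quick health check. Enumerating every cycle needs a dedicated algorithm such as the one behind `networkx.simple_cycles`.
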
### JavaScript: madge dependency graph

```bash
npx madge --image graph.svg --extensions ts,tsx src/   # render the graph (needs Graphviz)
npx madge --circular --extensions ts,tsx src/          # detect cycles
```

### Java: jdeps

```bash
jdeps -verbose:class -R app.jar > deps.txt            # class-level dependencies, followed transitively
jdeps --inverse --package com.acme.payment app.jar    # what depends on com.acme.payment
```

### Community detection (Louvain)

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# G is the import graph from the Python example; edge direction is dropped for clustering.
modules = louvain_communities(G.to_undirected(), resolution=1.2, seed=42)
for i, m in enumerate(modules):
    print(f"Module {i}: {sorted(m)[:5]}...")
```

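Once modules are clustered, the file-level edges can be lifted to cluster-level edges, which is the granularity a container diagram needs. The following is a sketch under stated assumptions: `clusters` is a list of sets (the shape `louvain_communities` returns), and the function name `cluster_edges` and the demo data are hypothetical.

```python
from collections import Counter


def cluster_edges(edges, clusters):
    """Count dependencies that cross cluster boundaries, keyed by (from_cluster, to_cluster)."""
    owner = {node: i for i, members in enumerate(clusters) for node in members}
    counts = Counter()
    for src, dst in edges:
        a, b = owner.get(src), owner.get(dst)
        # Skip nodes outside every cluster (e.g. third-party imports) and intra-cluster edges.
        if a is not None and b is not None and a != b:
            counts[(a, b)] += 1
    return counts


# Hypothetical clusters and edges, for illustration only.
clusters = [{"web.views", "web.forms"}, {"api.orders", "api.users"}]
edges = [
    ("web.views", "api.orders"),
    ("web.forms", "api.users"),
    ("web.views", "web.forms"),   # stays inside cluster 0
    ("api.orders", "requests"),   # third-party, not in any cluster
]
crossing = cluster_edges(edges, clusters)
print(dict(crossing))
```

Counts in both directions between the same pair of clusters suggest that the boundary between them is not clean.
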
### LLM-augmented module naming (Claude Opus 4.7)

```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def name_module(files: list[str], code_snippets: list[str]) -> str:
    """Ask the model for a short module name and a one-line responsibility."""
    snippets = "\n\n".join(code_snippets)
    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=200,
        messages=[{"role": "user", "content":
            f"Files: {files}\n\nSnippets:\n{snippets}\n\n"
            "Give a 3-word module name + 1-line responsibility."}],
    )
    return msg.content[0].text
```

### Runtime trace → architecture (OpenTelemetry)

```python
# Aggregate spans into a service-level call graph.
# fetch_traces is a placeholder for your trace backend's query API (not an OpenTelemetry call);
# each span is assumed to expose .service and .parent.
from collections import Counter

edges = Counter()
for span in fetch_traces(service="checkout", since="24h"):
    if span.parent and span.parent.service != span.service:
        edges[(span.parent.service, span.service)] += 1

# The most frequent edges are the primary architectural connections.
for (caller, callee), n in edges.most_common(10):
    print(f"{caller} -> {callee}: {n}")
```

### C4 diagram emission (Structurizr DSL)

```dsl
workspace {
    model {
        user = person "Customer"
        sys = softwareSystem "Shop" {
            web = container "Web"
            api = container "API"
            db = container "Postgres"
        }
        user -> web "uses"
        web -> api "REST"
        api -> db "JDBC"
    }
}
```

## Decision criteria

| Situation | Approach |
|---|---|
| Small monolith (<100k LoC) | Static import graph + manual review |
| Distributed microservices | Distributed tracing (OTel) + service map |
| Legacy COBOL/Java enterprise | Lattix / Structure101 commercial tools |
| Quick high-level overview | LLM (Opus 4.7) on README + top-level dirs |
| Decomposition planning | Static + dynamic + LLM hybrid |

**Default**: static import graph (madge / pyan / jdeps) → Louvain clustering → LLM naming → C4 diagram.

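The last step of the default pipeline, turning named clusters into a C4 container view, can be done with plain string generation. The sketch below emits the same Structurizr DSL subset used in this note (`workspace`, `model`, `softwareSystem`, `container`, `->`). The function name `to_structurizr`, the `c0`, `c1`, ... identifiers, and the demo names are assumptions, and names are assumed to contain no double quotes.

```python
def to_structurizr(system: str, names: list[str], edges: dict[tuple[int, int], int]) -> str:
    """Render clusters as containers and cross-cluster edge counts as relationships."""
    lines = [
        "workspace {",
        "    model {",
        f'        sys = softwareSystem "{system}" {{',
    ]
    for i, name in enumerate(names):
        lines.append(f'            c{i} = container "{name}"')
    lines.append("        }")
    for (src, dst), count in sorted(edges.items()):
        lines.append(f'        c{src} -> c{dst} "{count} imports"')
    lines += ["    }", "}"]
    return "\n".join(lines)


# Hypothetical cluster names (e.g. produced by the LLM naming step) and edge counts.
dsl = to_structurizr("Recovered system", ["Web UI", "Order API"], {(0, 1): 3})
print(dsl)
```

Because the names come from an LLM and the edges from static analysis, the generated diagram is a draft for the team to review, not a record of the intended design.
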
## 🔗 Graph

- Parent: [[Software Architecture]]
- Application: [[Legacy Modernization]]
- Adjacent: [[C4 Model (Architecture Documentation)]] · [[Dependency Analysis]] · [[Static Analysis]]

## 🤖 LLM usage

**When**: onboarding onto an undocumented codebase, drafting a modernization plan, detecting dependency cycles.
**When not**: the current architecture is already well documented; reading the ADRs is enough.

## ❌ Anti-patterns

- **Recovered = correct**: the inferred architecture reflects how the system evolved, not the ideal design. Validate with the team.
- **Static only for a distributed system**: the runtime topology is lost.
- **LLM hallucination**: a module name can be plausible without being correct. Verify it against the code.

## 🧪 Verification / duplicates

- Verified (Garlan & Schmerl SAR research, 2002–2024; SEI architecture reconstruction guides).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: recovery techniques with LLM-augmented analysis |