---
id: wiki-2026-0508-code-property-graph
title: Code Property Graph
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [CPG, Code Property Graphs, Joern CPG]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [security, sast, cpg, static-analysis]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: scala
  framework: joern
---

# Code Property Graph

## In one line

> **"A CPG merges the AST, CFG, and PDG into a single graph representation."** Yamaguchi et al. proposed it at IEEE S&P in 2014, and Joern implements it. As of 2026 it is the backbone of modern SAST tools such as Joern and Qwiet AI (formerly ShiftLeft); CodeQL's dataflow analysis builds on a closely related idea.

## Core concepts

### The three layers

- **AST** (Abstract Syntax Tree): syntactic structure
- **CFG** (Control Flow Graph): execution order and branches
- **PDG** (Program Dependence Graph): data and control dependencies

### One graph

- Node = AST node
- Edge = AST parent / CFG next / PDG dataflow / call edge
- A query is a graph traversal that detects a vulnerability pattern

### Query languages

- **Joern**: Scala-based DSL (Gremlin-like)
- **CodeQL**: declarative QL language (similar concept)
- **Semgrep**: simpler, mostly AST-level pattern matching, not a full CPG

### Applications

1. SAST: detecting SQLi, XSS, and RCE patterns.
2. Audit: tracing a sensitive sink (`exec`, `eval`) back to a tainted source.
3. Academic research: vulnerability mining, for example retroactively finding known CVEs.
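The "one graph, several edge types" idea above can be sketched in plain Scala without Joern. This is only a toy model under stated assumptions: the names `MiniCpg`, `Node`, `Edge`, `reachable`, and `isTainted` are hypothetical and are not Joern's API, and the five-node program is invented for illustration. It shows that a taint query is a traversal restricted to one edge layer (here, PDG data-dependence edges), while the same nodes also carry AST and CFG edges.

```scala
// Toy code property graph. Every name here (MiniCpg, Node, Edge, reachable,
// isTainted) is hypothetical and is NOT part of Joern's API.
object MiniCpg {
  sealed trait EdgeKind
  case object Ast extends EdgeKind // syntactic parent -> child
  case object Cfg extends EdgeKind // execution order
  case object Pdg extends EdgeKind // data dependence: definition -> use

  final case class Node(id: Int, code: String)
  final case class Edge(src: Int, dst: Int, kind: EdgeKind)

  // Invented three-statement program plus one nested call expression.
  val nodes: Map[Int, Node] = List(
    Node(1, "x = getParameter(\"id\")"),
    Node(2, "getParameter(\"id\")"),
    Node(3, "q = \"SELECT * FROM t WHERE id = \" + x"),
    Node(4, "execute(q)"),
    Node(5, "log(\"done\")")
  ).map(n => n.id -> n).toMap

  // All three layers live in one edge list over the same node ids.
  val edges: List[Edge] = List(
    Edge(1, 2, Ast),                                   // the assignment contains the call
    Edge(1, 3, Cfg), Edge(3, 4, Cfg), Edge(4, 5, Cfg), // statement order
    Edge(2, 1, Pdg), Edge(1, 3, Pdg), Edge(3, 4, Pdg)  // request value flows into execute
  )

  // Node ids reachable from `start` following only edges of `kind` (start included).
  def reachable(start: Int, kind: EdgeKind): Set[Int] = {
    @annotation.tailrec
    def loop(frontier: List[Int], seen: Set[Int]): Set[Int] = frontier match {
      case Nil => seen
      case head :: tail =>
        val next = edges
          .filter(e => e.src == head && e.kind == kind && !seen(e.dst))
          .map(_.dst)
        loop(next ++ tail, seen ++ next)
    }
    loop(List(start), Set(start))
  }

  // A sink is tainted when data-dependence edges connect the source to it.
  def isTainted(source: Int, sink: Int): Boolean =
    reachable(source, Pdg).contains(sink)
}
```

In this sketch `MiniCpg.isTainted(2, 4)` holds because the `getParameter` result flows through `x` and `q` into `execute`, while `MiniCpg.isTainted(2, 5)` does not: the `log` call follows `execute` in the CFG layer but has no data dependence on the source. A real CPG query engine does the same kind of layer-restricted traversal over a far richer schema.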
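## 💻 Patterns

### Installing Joern and importing code

```bash
# 2026 Joern v4: download the installer, then run it
curl -L https://github.com/joernio/joern/releases/latest/download/joern-install.sh -o joern-install.sh
chmod u+x joern-install.sh
./joern-install.sh

# Start the interactive shell; the lines prefixed with joern> are typed inside it
joern
joern> importCode("path/to/project")
joern> cpg.method.l
```

### SQLi pattern detection

```scala
// Joern Scala query: does user input reach a SQL execution call?
// The name patterns are examples; adapt them to the framework under audit.
// `def` (not `val`) because a traversal is consumed once it has been run.
def source = cpg.call.name("getParameter") ++ cpg.call.code("req\\.body.*")
def sink   = cpg.call.name("query|execute").argument

sink.reachableBy(source).l
```

### Hardcoded secret detection

```scala
// String literals that look like long base64-style tokens, skipping test methods
cpg.literal
  .code("\"[A-Za-z0-9+/]{32,}\"")
  .filterNot(_.method.name.toLowerCase.contains("test"))
  .l
```

### Taint tracking in CodeQL (similar concept)

```ql
// Legacy class-based API; recent CodeQL releases favor the module-based
// DataFlow::ConfigSig style. Needs `@kind path-problem` query metadata to run.
import javascript
import DataFlow::PathGraph

class Configuration extends TaintTracking::Configuration {
  Configuration() { this = "UserInputToEval" }

  override predicate isSource(DataFlow::Node source) {
    source instanceof RemoteFlowSource
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(CallExpr c | c.getCalleeName() = "eval" |
      sink.asExpr() = c.getArgument(0))
  }
}

from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "Tainted eval"
```

### Defining custom sinks

```scala
// Example patterns only; check how your frontend represents these calls and property writes.
// Sources are the calls themselves (their results are untrusted), not their URL argument.
def customSources = cpg.call.code("(?s)(fetch|axios\\.get)\\(.*")
def customSinks   = cpg.call.name("dangerouslySetInnerHTML|innerHTML")

customSinks.reachableBy(customSources).l
```

### Exporting the CPG (for visualization)

```scala
// Inside the Joern shell: DOT dumps of individual layers for one method ("main" is a placeholder)
cpg.method.name("main").dotAst.l    // AST
cpg.method.name("main").dotCfg.l    // CFG
cpg.method.name("main").dotCpg14.l  // combined graph as in the 2014 paper
// For GraphML / DOT / Neo4j output of a whole project, see the joern-export command-line tool.
```

### CI integration (Joern Scan)

```yaml
# .github/workflows/joern.yml (step fragment; assumes Joern is already installed on the runner)
- name: Joern Scan
  run: |
    # joern-scan takes the source directory as a positional argument and prints
    # findings as text; confirm the available options with `joern-scan --help`.
    joern-scan ./src | tee joern-report.txt
```

## Decision criteria

| Situation | Approach |
|---|---|
| Quick rule-based scan | Semgrep (syntactic, fast) |
| Deep dataflow analysis | Joern / CodeQL (CPG-based) |
| GitHub-native | CodeQL (Advanced Security) |
| Multi-language audit | Joern (C/C++/Java/Python/JS/PHP) |
| Custom vulnerability mining | Joern Scala script |

**Default**: Semgrep (quick CI gate) + CodeQL (deep weekly audit).

## 🔗 Graph

- Parent: [[SAST]] · [[Static Analysis]]
- Variants: [[AST]]
- Applications: [[Joern]] · [[CodeQL]]
- Adjacent: [[보안 및 시스템 신뢰성 표준|DAST]] · [[SCA_Fundamentals|SCA]] · [[DevSecOps Framework]]

## 🤖 Using LLMs

**When to use**: drafting Joern Scala queries, triaging false positives in CPG results, and suggesting custom sinks and sources.

**When not to use**: letting the LLM detect vulnerabilities on its own, because of the hallucination risk. Treat the CPG-based results as the ground truth.

## ❌ Anti-patterns

- **Building the CPG but never querying it**: the graph goes unused.
- **Relying only on default sources and sinks**: framework-specific ones (Express, Spring) have to be defined manually.
- **Joern timing out on a huge codebase**: import incrementally or split the analysis by module.
- **Alert fatigue**: raising every finding without tuning severity.

## 🧪 Verification / Duplicates

- Verified (Yamaguchi et al. 2014 IEEE S&P; Joern docs; CodeQL docs).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full CPG content |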