---
id: wiki-2026-0508-code-property-graph
title: Code Property Graph
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [CPG, Code Property Graphs, Joern CPG]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [security, sast, cpg, static-analysis]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: scala
  framework: joern
---

# Code Property Graph

## In one line
> **"A CPG merges the AST, CFG, and PDG into a single graph representation."** Proposed by Yamaguchi et al. at IEEE S&P 2014 and implemented in Joern. As of 2026 it is the backbone of modern SAST tools such as Joern and Qwiet AI (formerly ShiftLeft); CodeQL's dataflow analysis follows a similar idea.

## Key points

### Three layers
- **AST** (Abstract Syntax Tree): syntactic structure
- **CFG** (Control Flow Graph): execution order and branches
- **PDG** (Program Dependence Graph): data and control dependencies

### Single graph
- Node = AST node
- Edge = AST parent / CFG next / PDG dataflow / call edge
- Queries are graph traversals that detect vulnerability patterns
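
The idea of one node set carrying several labeled edge sets can be sketched with a toy model. This is a minimal illustration, not Joern's actual schema: the `ToyCPG` class, the node ids, and the edge labels (`AST`, `CFG`, `DDG`, `CDG`) are all assumptions made up for this example, and the three-statement snippet it encodes is hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ToyCPG:
    """One set of nodes (AST nodes) with several labeled edge sets on top."""
    nodes: dict = field(default_factory=dict)  # node id -> source text
    edges: dict = field(default_factory=lambda: defaultdict(list))  # (src, label) -> [dst]

    def add_node(self, node_id, code):
        self.nodes[node_id] = code

    def add_edge(self, src, label, dst):
        self.edges[(src, label)].append(dst)

    def out(self, src, label):
        """Follow outgoing edges of one kind from a node."""
        return list(self.edges.get((src, label), []))


def build_example():
    # Hypothetical snippet:  x = source(); if (x) { sink(x); }
    g = ToyCPG()
    g.add_node("assign", "x = source()")
    g.add_node("cond", "if (x)")
    g.add_node("call", "sink(x)")
    g.add_edge("cond", "AST", "call")    # the call is nested in the if body
    g.add_edge("assign", "CFG", "cond")  # execution order
    g.add_edge("cond", "CFG", "call")
    g.add_edge("assign", "DDG", "cond")  # x is defined here and used there
    g.add_edge("assign", "DDG", "call")
    g.add_edge("cond", "CDG", "call")    # call executes only if cond holds
    return g


cpg = build_example()
print(cpg.out("assign", "DDG"))  # data-dependent uses of x
print(cpg.out("cond", "CDG"))    # statements controlled by the condition
```

Because every layer shares the same node ids, a single query can mix edge kinds, for example following data-dependence edges to a call and then checking which condition controls it.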

### Query language
- **Joern**: Scala-based DSL (Gremlin-like)
- **CodeQL**: declarative QL language (similar concept)
- **Semgrep**: simpler (AST-only), not a full CPG

### Applications
1. SAST: detecting SQLi/XSS/RCE patterns.
2. Audit: tracing a sensitive sink (exec, eval) back to its tainted source.
3. Academic research: vulnerability mining (finding known CVEs retroactively).
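
Tracing a sink back to a tainted source amounts to a backward walk over data-dependence edges. The sketch below shows that idea on a hand-written edge map; the `DDG` dictionary, the node strings, and the `reachable_by` helper are hypothetical names for illustration and do not reproduce Joern's data-flow engine.

```python
from collections import deque

# Hypothetical data-dependence edges: definition -> uses.
DDG = {
    "req.getParameter('id')": ["id"],
    "id": ["sql"],
    '"SELECT 1"': ["constQuery"],
    "sql": ["stmt.execute(sql)"],
    "constQuery": ["stmt.execute(constQuery)"],
}


def reachable_by(sink, sources, ddg):
    """Return one source-to-sink path if any source flows into sink, else None."""
    # Invert the edges so the search can walk backwards from the sink.
    preds = {}
    for src, dsts in ddg.items():
        for dst in dsts:
            preds.setdefault(dst, []).append(src)

    queue = deque([[sink]])
    seen = {sink}
    while queue:
        path = queue.popleft()
        head = path[0]
        if head in sources:
            return path
        for pred in preds.get(head, []):
            if pred not in seen:
                seen.add(pred)
                queue.append([pred] + path)
    return None


SOURCES = {"req.getParameter('id')"}
tainted = reachable_by("stmt.execute(sql)", SOURCES, DDG)
clean = reachable_by("stmt.execute(constQuery)", SOURCES, DDG)
print(tainted)
print(clean)
```

The first call yields a path from the request parameter to the SQL call, while the constant query has no path to a source. A real engine also has to model calls, aliasing, and sanitizers, which this sketch ignores.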

## 💻 Patterns

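### Joern install + import
```bash
# Joern v4 (2026). Review the installer script before running it.
curl -L -o joern-install.sh https://github.com/joernio/joern/releases/latest/download/joern-install.sh
chmod +x joern-install.sh
./joern-install.sh   # may need sudo, depending on the install directory

# Start the interactive shell, then build the CPG from inside it.
joern
joern> importCode("path/to/project")
joern> cpg.method.name.l   // list method names to confirm the import worked
```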

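### SQLi pattern detection
```scala
// Joern query: does user input reach a SQL execution call?
// Sources: results of calls that read request data (Java servlet style).
def source = cpg.call.name("getParameter")
// Sinks: arguments of calls that execute SQL.
def sink = cpg.call.name("query|execute|executeQuery").argument
// reachableBy walks data-flow edges backwards from each sink.
sink.reachableBy(source).l
// Use sink.reachableByFlows(source).p to print the full paths instead.
```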

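### Hardcoded secret detection
```scala
// String literals that look like long base64-style tokens.
cpg.literal
   .code("\"[A-Za-z0-9+/]{32,}\"")
   // Skip literals inside methods whose name contains "test" (case-insensitive).
   .whereNot(_.method.name("(?i).*test.*"))
   .l
```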
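
A length-and-charset pattern like the one above also matches padding strings and repeated characters. One common way to cut that noise is an entropy filter, sketched here outside Joern. The `find_secret_literals` helper and the 3.5-bit threshold are assumptions chosen for illustration, not values taken from any tool.

```python
import math
import re
from collections import Counter

# Same character class and length threshold as the query above.
SECRET_RE = re.compile(r'"([A-Za-z0-9+/]{32,})"')


def shannon_entropy(s):
    """Bits of entropy per character, based on character frequencies."""
    counts = Counter(s)
    return -sum(c / len(s) * math.log2(c / len(s)) for c in counts.values())


def find_secret_literals(source, min_entropy=3.5):
    """Return quoted literals that match the pattern and look random enough."""
    return [
        m.group(1)
        for m in SECRET_RE.finditer(source)
        if shannon_entropy(m.group(1)) >= min_entropy
    ]


SAMPLE = (
    'key = "A1b2C3d4E5f6G7h8I9j0K1l2M3n4O5p6"\n'
    'pad = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"\n'
)
hits = find_secret_literals(SAMPLE)
print(hits)
```

Only the first literal survives: the padding string matches the regex but has zero entropy. The threshold needs tuning per codebase, since hashes and test fixtures can score as high as real keys.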

### CodeQL taint tracking (similar)
```ql
/**
 * @name User input flows to eval
 * @kind path-problem
 * @id js/example/user-input-to-eval
 */

import javascript
import DataFlow::PathGraph

// Legacy class-based API; newer CodeQL releases deprecate it in favor of
// modules that implement DataFlow::ConfigSig.
class Configuration extends TaintTracking::Configuration {
  Configuration() { this = "UserInputToEval" }

  override predicate isSource(DataFlow::Node source) {
    source instanceof RemoteFlowSource
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(CallExpr c | c.getCalleeName() = "eval" |
      sink.asExpr() = c.getArgument(0))
  }
}

from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "Tainted eval"
```

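### Defining custom sinks
```scala
// innerHTML is written via assignment, not a call, so match assignment targets
// and take the assigned value as the sink.
def customSinks = cpg.assignment.where(_.target.code(".*innerHTML.*")).source
// Sources: values returned by HTTP client calls, matched on the call's code.
// The pattern is a starting point; adjust it to how the project calls its client.
def customSources = cpg.call.code("(fetch|axios\\.get)\\(.*")

customSinks.reachableBy(customSources).l
```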

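### Exporting the CPG (for visualization)
```scala
// In the Joern shell: dump one method's combined AST + CFG + PDG view as DOT text.
joern> cpg.method.name("main").dotCpg14.l
// For whole-graph export, use the separate joern-export CLI, for example:
//   joern-export --repr all --format graphml --out out/
// Formats include dot, graphml, graphson, and neo4jcsv; confirm the options
// with joern-export --help for the installed version.
```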

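### CI integration (Joern Scan)
```yaml
# .github/workflows/joern.yml (step fragment)
- name: Joern Scan
  run: |
    # joern-scan takes the source directory as a positional argument and prints
    # findings to stdout; confirm the options with `joern-scan --help`.
    joern-scan ./src | tee joern-report.txt
    # Fail the job if any finding was reported (assumes "Result:" line prefix).
    if grep -q '^Result:' joern-report.txt; then exit 1; fi
```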

## Decision criteria
| Situation | Approach |
|---|---|
| Quick rule-based scan | Semgrep (syntactic, fast) |
| Deep dataflow analysis | Joern (CPG) / CodeQL (dataflow) |
| GitHub-native | CodeQL (Advanced Security) |
| Multi-language audit | Joern (C/C++/Java/Python/JS/PHP) |
| Custom vuln mining | Joern Scala script |

**Default**: Semgrep (quick CI gate) + CodeQL (deep weekly audit).

## 🔗 Graph
- Parent: [[SAST]] · [[Static Analysis]]
- Variants: [[AST]]
- Applications: [[Joern]] · [[CodeQL]]
- Adjacent: [[보안 및 시스템 신뢰성 표준|DAST]] · [[SCA_Fundamentals|SCA]] · [[DevSecOps Framework]]

## 🤖 LLM usage
**When**: drafting Joern Scala queries, triaging false positives in CPG results, and suggesting custom sinks/sources.
**When not**: relying on the LLM alone to detect vulnerabilities, because of the hallucination risk. Treat the CPG-based results as the ground truth.

## ❌ Anti-patterns
- **Building the CPG but never querying it**: the graph goes unused.
- **Relying only on default sources/sinks**: framework-specific ones (Express, Spring) must be defined manually.
- **Joern timing out on a huge codebase**: import incrementally or split the analysis by module.
- **Alert fatigue**: raising every finding without severity tuning.
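
The alert-fatigue item can be made concrete with a small triage step between the scanner and the people who read its output. This is a sketch under assumptions: the `Finding` record, the severity names, and the baseline of already-accepted findings are hypothetical and do not correspond to any specific scanner's report format.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


@dataclass(frozen=True)
class Finding:
    rule: str
    file: str
    line: int
    severity: str


def triage(findings, min_severity="HIGH", baseline=frozenset()):
    """Keep unique findings at or above min_severity that are not baselined."""
    threshold = SEVERITY_RANK[min_severity]
    kept, seen = [], set()
    for f in findings:
        key = (f.rule, f.file, f.line)
        if key in seen or key in baseline:
            continue  # duplicate, or already reviewed and accepted
        seen.add(key)
        # Unknown severity labels rank as 0 and are filtered out.
        if SEVERITY_RANK.get(f.severity, 0) >= threshold:
            kept.append(f)
    return kept


raw = [
    Finding("sqli", "a.py", 10, "HIGH"),
    Finding("sqli", "a.py", 10, "HIGH"),    # duplicate report
    Finding("xss", "b.js", 5, "LOW"),       # below threshold
    Finding("rce", "c.py", 7, "CRITICAL"),  # already in the baseline
]
actionable = triage(raw, baseline={("rce", "c.py", 7)})
print(actionable)
```

Keying the baseline on rule, file, and line is the simplest choice but breaks when code moves; hashing the surrounding snippet is a sturdier alternative.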

## 🧪 Verification / duplicates
- Verified (Yamaguchi et al. 2014 IEEE S&P; Joern docs; CodeQL docs).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: CPG full content |