id: wiki-2026-0508-code-property-graph
title: Code Property Graph
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: CPG, Code Property Graphs, Joern CPG
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: security, sast, cpg, static-analysis
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language = scala, framework = joern

Code Property Graph

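One-line summary

A Code Property Graph (CPG) merges the AST, CFG, and PDG into a single graph representation. Yamaguchi et al. proposed it at IEEE S&P 2014, and Joern implements it. As of 2026 it is the backbone of modern SAST tools such as Joern and Qwiet AI (formerly ShiftLeft), and CodeQL's dataflow analysis rests on a similar idea.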

Key points

Three layers

  • AST (Abstract Syntax Tree): syntactic structure
  • CFG (Control Flow Graph): execution order and branches
  • PDG (Program Dependence Graph): data and control dependencies

Single graph

  • Node = AST node
  • Edge = AST parent / CFG next / PDG dataflow / call edge
  • Queries detect vulnerability patterns by traversing the graph
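
The single-graph idea can be sketched in a few lines of Python. This is only a toy model, not Joern's storage format: the `ToyCPG` class, the node ids, and the two-statement program are all hypothetical, chosen to show how one node set can carry AST, CFG, and PDG edges that a query then follows by label.

```python
from collections import defaultdict


class ToyCPG:
    """A tiny property graph: nodes carry properties, edges carry a layer label."""

    def __init__(self):
        self.nodes = {}                 # node id -> properties
        self.edges = defaultdict(list)  # (source id, label) -> [target ids]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, label, dst):
        self.edges[(src, label)].append(dst)

    def out(self, node_id, label):
        """Follow the edges of one layer (AST, CFG, or PDG) from a node."""
        return list(self.edges.get((node_id, label), []))


# Hypothetical two-statement program:  x = source();  sink(x)
cpg = ToyCPG()
cpg.add_node("assign", type="CALL", code="x = source()")
cpg.add_node("src_call", type="CALL", code="source()")
cpg.add_node("sink_call", type="CALL", code="sink(x)")
cpg.add_node("arg_x", type="IDENTIFIER", code="x")

cpg.add_edge("assign", "AST", "src_call")   # syntax: source() is a child of the assignment
cpg.add_edge("sink_call", "AST", "arg_x")   # syntax: x is an argument of sink(...)
cpg.add_edge("assign", "CFG", "sink_call")  # control flow: the assignment runs first
cpg.add_edge("assign", "PDG", "arg_x")      # data dependence: x is defined here, used there

ast_children = cpg.out("assign", "AST")
cfg_next = cpg.out("assign", "CFG")
pdg_uses = cpg.out("assign", "PDG")
print(ast_children, cfg_next, pdg_uses)
```

Because all three layers share the same nodes, a query can start on a syntactic match and then switch to control-flow or data-dependence edges without leaving the graph.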

Query languages

  • Joern: Scala-based DSL (Gremlin-like)
  • CodeQL: declarative QL language (similar concept)
  • Semgrep: simpler (primarily AST pattern matching), not a full CPG

Applications

  1. SAST: detecting SQLi, XSS, and RCE patterns.
  2. Audit: tracing a sensitive sink (exec, eval) back to a tainted source.
  3. Academic research: vulnerability mining (retroactively finding known CVEs).
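
Tracing a sink back to a tainted source is, at its core, a backward reachability search over dataflow edges. The sketch below illustrates that idea only; it is not how Joern implements `reachableBy`, and the function name `reachable_by`, the edge map, and the node labels are hypothetical.

```python
from collections import deque


def reachable_by(dataflow, sink, sources):
    """Return the sources from which data can flow into `sink`.

    `dataflow` maps each node to the nodes its value flows into
    (forward edges); the search walks those edges backwards from the sink.
    """
    # Invert the forward edges once so we can walk from the sink towards sources.
    reverse = {}
    for src, targets in dataflow.items():
        for dst in targets:
            reverse.setdefault(dst, set()).add(src)

    seen, queue, hits = {sink}, deque([sink]), set()
    while queue:
        node = queue.popleft()
        if node in sources:
            hits.add(node)
        for prev in reverse.get(node, ()):
            if prev not in seen:
                seen.add(prev)
                queue.append(prev)
    return hits


# Hypothetical dataflow edges for a small request handler.
flows = {
    "req.getParameter('id')": ["id"],
    "id": ["sql"],
    "sql": ["stmt.execute(sql)"],
    "config.timeout": ["timeout"],
}
candidate_sources = {"req.getParameter('id')", "config.timeout"}
tainted = reachable_by(flows, "stmt.execute(sql)", candidate_sources)
print(tainted)
```

Only the request parameter is reported, because the configuration value never flows into the SQL call. A production engine additionally has to handle calls, aliasing, and sanitizers, which this sketch ignores.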

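💻 Patterns

Installing Joern and importing code

# Joern v4 (as of 2026): download and run the official installer
curl -L -o joern-install.sh https://github.com/joernio/joern/releases/latest/download/joern-install.sh
chmod +x joern-install.sh && ./joern-install.sh

# Build a CPG from the command line (writes cpg.bin by default)
joern-parse src/

# Or import interactively from the Joern shell and list the method names
joern> importCode("path/to/project")
joern> cpg.method.name.l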

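SQLi pattern detection

// Joern Scala query: does user input reach a SQL execution call?
// Sources: calls to getParameter (servlets) or reads of req.body (Express).
def sources = cpg.call.filter(c => c.name == "getParameter" || c.code.startsWith("req.body"))
def sinks = cpg.call.name("query|execute").argument
sinks.reachableBy(sources).l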

Hardcoded secret detection

cpg.literal
   .code("\"[A-Za-z0-9+/]{32,}\"")            // quoted Base64-style string, 32+ characters
   .whereNot(_.method.name("(?i).*test.*"))   // skip literals inside test methods
   .l
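
To see what that literal pattern does and does not catch, the same regular expression can be tried outside Joern. The Python sketch below assumes the query matches the whole literal, quotes included; the sample literals and the helper name `find_secret_literals` are made up for illustration.

```python
import re

# Same character class and minimum length as the literal query above.
SECRET_PATTERN = re.compile(r'"[A-Za-z0-9+/]{32,}"')


def find_secret_literals(literals):
    """Return the string literals (quotes included) that fully match the pattern."""
    return [lit for lit in literals if SECRET_PATTERN.fullmatch(lit)]


literals = [
    '"hello world"',                                      # too short, contains a space
    '"abcdefghijklmnopqrstuvwxyz0123456789ABCD"',         # 40 Base64-style characters
    '"com/example/app/service/UserAccountServiceImpl"',   # long path: a false positive
]
matches = find_secret_literals(literals)
print(len(matches))
```

The long path matches as well, which suggests why a pattern like this usually needs an extra filter (entropy, allow-lists, or excluded locations) before its findings are raised.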

Taint tracking in CodeQL (similar)

/** @kind path-problem */
// Legacy class-based API; newer CodeQL releases favour module-based DataFlow::ConfigSig configurations.
import javascript
import DataFlow::PathGraph

class Configuration extends TaintTracking::Configuration {
  Configuration() { this = "UserInputToEval" }

  // Sources: any remote (user-controlled) input
  override predicate isSource(DataFlow::Node source) {
    source instanceof RemoteFlowSource
  }

  // Sinks: the first argument of a call to eval
  override predicate isSink(DataFlow::Node sink) {
    exists(CallExpr c | c.getCalleeName() = "eval" |
      sink.asExpr() = c.getArgument(0))
  }
}

from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "Tainted eval"

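Defining custom sinks

// Sinks: values assigned to innerHTML / dangerouslySetInnerHTML
def customSinks = cpg.assignment.where(_.target.code(".*(dangerouslySetInnerHTML|innerHTML)")).source
// Sources: results of HTTP requests made with fetch or axios.get
def customSources = cpg.call.filter(c => c.name == "fetch" || c.code.startsWith("axios.get("))

customSinks.reachableBy(customSources).l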

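Exporting the CPG (for visualization)

joern> cpg.runScript("export-cpg.sc")   // export-cpg.sc is a user-supplied script
// Exports to GraphML / DOT / Neo4j; the joern-export CLI covers the same formats, e.g.
// joern-export --repr all --format graphml --out cpg-export/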

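CI integration (Joern Scan)

# .github/workflows/joern.yml
# Flags and report schema are as given here; check `joern-scan --help` for your version.
- name: Joern Scan
  run: |
    joern-scan --src ./src --output joern-report.json
    # Fail the step when any HIGH-severity finding is present
    jq -e '[.findings[] | select(.severity=="HIGH")] | length == 0' joern-report.json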
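
If the gate needs a tunable threshold rather than a fixed HIGH filter, the same check could be written as a small script. This is a sketch under assumptions: the report shape (`findings` entries with a `severity` field) is taken from the jq filter above, while the `title` field, the severity ranking, and the function name `blocking_findings` are hypothetical.

```python
import json

# Assumed ordering of severities; adjust to whatever the scanner emits.
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def blocking_findings(report, threshold="HIGH"):
    """Return the findings whose severity is at or above `threshold`."""
    floor = SEVERITY_RANK[threshold]
    return [
        finding
        for finding in report.get("findings", [])
        if SEVERITY_RANK.get(str(finding.get("severity", "")).upper(), 0) >= floor
    ]


# Hypothetical report content; in CI this would be loaded from joern-report.json.
report = json.loads("""
{"findings": [
  {"title": "SQL injection", "severity": "HIGH"},
  {"title": "Weak hash", "severity": "MEDIUM"},
  {"title": "Debug flag", "severity": "low"}
]}
""")

blocked = blocking_findings(report)
exit_code = 1 if blocked else 0  # a CI step would pass this to sys.exit()
print(exit_code, [finding["title"] for finding in blocked])
```

Lowering the threshold to "MEDIUM" would also block the weak-hash finding, so the threshold is the knob to turn when the gate is too noisy or too lax.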

Decision criteria

| Situation | Approach |
|---|---|
| Quick rule-based scan | Semgrep (syntactic, fast) |
| Deep dataflow analysis | Joern / CodeQL (CPG-based) |
| GitHub-native | CodeQL (Advanced Security) |
| Multi-language audit | Joern (C/C++/Java/Python/JS/PHP) |
| Custom vuln mining | Joern Scala script |

Default: Semgrep as a quick CI gate, plus CodeQL for a deep weekly audit.

🔗 Graph

🤖 Using LLMs

When to use: drafting Joern Scala queries, triaging false positives in CPG results, and suggesting custom sinks and sources.
When not to use: letting the LLM detect vulnerabilities on its own, because of the hallucination risk. Treat the CPG-based results as the ground truth.

Anti-patterns

  • Building the CPG but never querying it: the graph goes unused.
  • Relying only on the default sources and sinks: framework-specific ones (Express, Spring) have to be defined manually.
  • Timeouts when Joern imports a huge codebase: use incremental import or split the import by module.
  • Alert fatigue: raising every finding without severity tuning.

🧪 Verification / duplicates

  • Verified (Yamaguchi et al. 2014 IEEE S&P; Joern docs; CodeQL docs).
  • Trust level: A.

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: CPG full content |