id: wiki-2026-0508-abstract-syntax-tree
title: Abstract Syntax Tree (AST)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: AST, Syntax Tree, Parse Tree (informal)
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: compiler, parsing, ast, language-tooling, static-analysis
raw_sources: (empty)
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: Python/JavaScript/Rust; framework: ast/Babel/swc/tree-sitter

Abstract Syntax Tree (AST)

In one line

"The tree shape of source code, with the syntactic noise stripped away." An AST is the parser's output: the token sequence converted into a hierarchical tree of nodes (FunctionDecl, BinaryExpr, ...). It is the foundation of compilers, linters, formatters, codemods, and LLM code generation, and, as of 2026, the structural ground-truth layer for LLM agentic coding.

Key concepts

AST vs Concrete Syntax Tree (CST)

  • CST (parse tree): retains every token (parentheses, semicolons, whitespace).
  • AST: keeps only semantically meaningful nodes and drops the cosmetic ones.
  • Modern formatters (Prettier, rustfmt) must keep comments and layout hints, so they work on a CST-like (lossless or comment-preserving) tree; compilers and most codemods work on the AST.
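
The difference can be seen with the Python standard library alone. This is a minimal sketch: `tokenize` stands in for the lossless view (it yields a token stream, not a real CST), and the helper names `token_strings` and `ast_shape` are illustrative, not from the source.

```python
import ast
import io
import tokenize

def token_strings(src: str) -> list[str]:
    """Return every non-blank token string, including parens and comments."""
    toks = tokenize.generate_tokens(io.StringIO(src).readline)
    return [t.string for t in toks if t.string.strip()]

def ast_shape(src: str) -> str:
    """Return the AST dump, which carries no parens, comments, or spacing."""
    return ast.dump(ast.parse(src))

plain = "x = 1 + 2\n"
noisy = "x = (1 + 2)  # sum\n"

# Token view: the noisy spelling keeps '(', ')' and the comment.
print(token_strings(noisy))
# AST view: both spellings produce the same tree.
print(ast_shape(plain) == ast_shape(noisy))
```

Any tool that has to write the file back unchanged apart from its edit needs the first view; a tool that only reasons about meaning can use the second.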

Node anatomy

  • type: BinaryExpression, FunctionDeclaration, ...
  • children: structured fields (left, right, body, params).
  • location: start/end byte offset plus line/column, used for error messages and source maps.
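
The three parts of a node map directly onto Python's `ast` objects. A short sketch, using a made-up one-line input:

```python
import ast

src = "total = price * qty\n"
tree = ast.parse(src)
binop = tree.body[0].value  # right-hand side of the assignment

# type: the node's class name
print(type(binop).__name__)
# children: named, structured fields rather than a flat child list
print(binop._fields)
print(binop.left.id, binop.right.id)
# location: 1-based line number, 0-based column offsets (UTF-8 bytes)
print(binop.lineno, binop.col_offset, binop.end_col_offset)
# the location is enough to recover the node's exact source text
print(ast.get_source_segment(src, binop))
```

Other parsers use different field names (Babel has `loc` and `start`/`end`, tree-sitter has `start_point`/`end_point`), but the same three ingredients appear in each.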

Typical pipeline

  1. Lex → token stream.
  2. Parse → AST.
  3. Analyze (type check, scope resolve).
  4. Transform (optimize, lower).
  5. Emit (codegen, print).
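
The five stages can be walked through end to end with the standard library. This is a toy sketch, not how a production compiler is organized: the "analyze" pass only collects assigned names, and the "transform" pass is an arbitrary rename chosen for illustration.

```python
import ast
import io
import tokenize

src = "area = 3 * 4\n"

# 1. Lex: source text -> token stream
tokens = [t.string for t in tokenize.generate_tokens(io.StringIO(src).readline)
          if t.string.strip()]

# 2. Parse: source text -> AST (CPython's parser tokenizes internally)
tree = ast.parse(src)

# 3. Analyze: a toy scope pass that collects every assigned name
assigned = sorted({n.id for n in ast.walk(tree)
                   if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)})

# 4. Transform: a toy rewrite that renames `area` to `_area`
class Rename(ast.NodeTransformer):
    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == "area":
            node.id = "_area"
        return node

tree = ast.fix_missing_locations(Rename().visit(tree))

# 5. Emit: print source back out, or compile to bytecode and run it
emitted = ast.unparse(tree)
namespace: dict = {}
exec(compile(tree, "<pipeline>", "exec"), namespace)

print(tokens, assigned, emitted, namespace["_area"])
```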

Applications

  1. Compilers (rustc, tsc, clang): the AST sits upstream of the IR.
  2. Linters (ESLint, ruff, clippy): a rule is an AST pattern match.
  3. Formatters (Prettier, Black, gofmt).
  4. Codemods (jscodeshift, ts-morph, libcst): large-scale refactors.
  5. LLM agentic coding (e.g. grounding a model's edits in a tree-sitter parse).
  6. Static analysis / SAST (Semgrep, CodeQL).
  7. IDE (LSP, syntax highlight, jump-to-def).
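
To make the "a rule is an AST pattern match" idea concrete, here is a minimal sketch of a lint rule written against Python's `ast`. The `Finding` and `NoEvalRule` names are hypothetical; real linters such as ruff or ESLint have their own rule APIs.

```python
import ast
from dataclasses import dataclass

@dataclass
class Finding:
    line: int
    col: int
    message: str

class NoEvalRule(ast.NodeVisitor):
    """Toy rule: flag direct calls to the name `eval`."""

    def __init__(self) -> None:
        self.findings: list[Finding] = []

    def visit_Call(self, node: ast.Call) -> None:
        # Pattern: Call(func=Name(id="eval"))
        if isinstance(node.func, ast.Name) and node.func.id == "eval":
            self.findings.append(Finding(node.lineno, node.col_offset, "avoid eval"))
        self.generic_visit(node)  # keep walking into nested calls

def lint(src: str) -> list[Finding]:
    rule = NoEvalRule()
    rule.visit(ast.parse(src))
    return rule.findings

sample = 'note = "eval(x) in a string is fine"\nvalue = eval(user_input)\n'
for f in lint(sample):
    print(f"{f.line}:{f.col} {f.message}")
```

Because the rule matches node shapes rather than text, the `eval(x)` inside the string literal on the first line is not reported.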

💻 Patterns

Python ast: visit + transform

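import ast
import operator

src = "x = 1 + 2 * 3"
tree = ast.parse(src)

# Operators this pass knows how to fold; anything else is left alone.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class ConstFold(ast.NodeTransformer):
    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        self.generic_visit(node)  # fold children first (bottom-up)
        fold = FOLDABLE.get(type(node.op))
        if fold and isinstance(node.left, ast.Constant) and isinstance(node.right, ast.Constant):
            try:
                value = fold(node.left.value, node.right.value)
            except Exception:
                return node  # e.g. mismatched operand types: keep the original node
            # Return a new node instead of mutating in place, keeping the source location.
            return ast.copy_location(ast.Constant(value=value), node)
        return node

new = ast.fix_missing_locations(ConstFold().visit(tree))
print(ast.unparse(new))  # x = 7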

tree-sitter (multi-language, incremental)

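# Assumes a recent py-tree-sitter plus the tree-sitter-python grammar package.
from tree_sitter import Language, Parser
import tree_sitter_python as tspy

PY = Language(tspy.language())
parser = Parser(PY)
tree = parser.parse(b"def add(a, b):\n  return a + b\n")  # input must be bytes
root = tree.root_node
# Each node exposes its grammar type and (row, column) start/end points.
for n in root.children:
    print(n.type, n.start_point, n.end_point)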

Babel codemod (JS/TS)

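import * as t from "@babel/types";
import generate from "@babel/generator";
import { parse } from "@babel/parser";
import traverse from "@babel/traverse";
// Note: under native Node ESM the default imports above may arrive wrapped as
// `{ default: fn }`; if so, call `traverse.default` / `generate.default`.

const ast = parse(`var x = 1;`, { sourceType: "module" });
traverse(ast, {
  VariableDeclaration(path) {
    if (path.node.kind !== "var") return;
    // Rewrite only when every declarator is a plain identifier that is
    // initialized and never reassigned. Hoisting differences between `var`
    // and `const` are not checked here.
    const safe = path.node.declarations.every(
      (d) => t.isIdentifier(d.id) && d.init != null &&
             path.scope.getBinding(d.id.name)?.constant
    );
    if (safe) path.node.kind = "const";
  },
});
console.log(generate(ast).code);  // const x = 1;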

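Rust syn: proc macro

use quote::quote;
use syn::{parse_quote, ItemFn};

fn main() {
    // Parse a function item into syn's AST (needs syn's "full" feature).
    let f: ItemFn = parse_quote! { fn greet() { println!("hi"); } };
    let name = &f.sig.ident;
    // Re-emit the function plus an impl of `Greeter`, a trait assumed to exist at the expansion site.
    let out = quote! { #f impl Greeter for () { fn name() -> &'static str { stringify!(#name) } } };
    println!("{out}"); // a real proc macro would return these tokens instead of printing them
}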

Pattern match (Semgrep-style)

rules:
  - id: dangerous-eval
    pattern: eval($X)
    message: avoid eval
    languages: [python]
    severity: ERROR

LLM-grounded edit (2026)

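# The LLM emits a small anchored edit instead of a free-form rewrite; the anchor is an AST node path.
edit = {"file": "app.py", "node_path": "Module/FunctionDef[name=handler]/body[2]",
        "replace_with": "return JSONResponse({'ok': True})"}
# apply_ast_edit is a hypothetical helper (not a library function). It is expected to
# re-parse the result and reject edits that break syntax; it cannot check semantics.
apply_ast_edit(edit)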
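
The note does not define `apply_ast_edit`, so the following is only one hypothetical way such a helper could work, simplified to take the source text directly instead of a file name. It assumes space indentation, one statement per line range, and supports only the path shape `Module/FunctionDef[name=NAME]/body[INDEX]`.

```python
import ast
import re
import textwrap

def apply_ast_edit(source: str, node_path: str, replace_with: str) -> str:
    """Replace the statement addressed by node_path and return the new source."""
    m = re.fullmatch(r"Module/FunctionDef\[name=(\w+)\]/body\[(\d+)\]", node_path)
    if m is None:
        raise ValueError(f"unsupported node path: {node_path}")
    name, index = m.group(1), int(m.group(2))

    tree = ast.parse(source)
    funcs = [n for n in tree.body if isinstance(n, ast.FunctionDef) and n.name == name]
    if len(funcs) != 1:
        raise LookupError(f"expected exactly one top-level function {name!r}")
    target = funcs[0].body[index]  # IndexError if the anchor no longer exists

    # Splice text by node location so untouched lines keep comments and formatting.
    lines = source.splitlines(keepends=True)
    indent = " " * target.col_offset
    new_block = textwrap.indent(textwrap.dedent(replace_with).strip("\n"), indent) + "\n"
    candidate = "".join(lines[:target.lineno - 1]) + new_block + "".join(lines[target.end_lineno:])

    ast.parse(candidate)  # raises SyntaxError if the edit leaves the file unparseable
    return candidate

src = (
    "def handler(request):\n"
    "    user = request.user  # keep this comment\n"
    "    log(user)\n"
    "    return None\n"
)
path = "Module/FunctionDef[name=handler]/body[2]"
print(apply_ast_edit(src, path, "return JSONResponse({'ok': True})"))
```

The final re-parse only shows that the result is syntactically valid Python. Whether `JSONResponse` is imported, or whether the edit is the right one, still needs type checking or tests.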

Decision criteria

| Situation | Tool |
| --- | --- |
| Single-language Python script tooling | ast (stdlib) |
| Multi-language, incremental (editor) | tree-sitter |
| JS/TS large codemod | jscodeshift / ts-morph |
| Python lossless refactor (preserves comments) | LibCST |
| Compiler frontend, type-aware codemod | language native (rustc API, tsc API) |
| Cross-repo security scan | Semgrep / CodeQL |

Defaults: for cross-language tooling, tree-sitter; for Python-only work, ast plus LibCST.


🤖 LLM usage

When to use: codemods, lint rules, syntactic validation of LLM output, IDE refactoring. When not to: cases where a trivial regex match is enough (e.g. finding TODO comments).
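
For the LLM-output validation case, the cheapest check is simply to parse the generated code before accepting it. A minimal sketch for Python output; `check_python_syntax` is an illustrative name, and the exact error wording depends on the Python version.

```python
import ast

def check_python_syntax(code: str) -> tuple[bool, str]:
    """Return (ok, detail); detail is empty on success, else 'line:col message'."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return False, f"{exc.lineno}:{exc.offset} {exc.msg}"
    return True, ""

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b)\n    return a + b\n"  # missing colon

print(check_python_syntax(good))
print(check_python_syntax(bad))
```

The detail string can be fed back to the model as a repair hint. A passing parse says nothing about whether the code is correct, only that it is well-formed.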

Anti-patterns

  • Parsing code with regex: breaks on nested constructs, quoted strings, and comments; use an AST instead.
  • Mutating while iterating: changing a parent while traversing its children; use the transformer pattern (return a new node).
  • Losing source locations: error messages become useless; preserve locations.
  • Trusting a print round-trip: unparse is lossy (whitespace, comments); use a comment-preserving tool such as LibCST or Prettier.
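
Two of these failure modes are easy to reproduce with the standard library. A small sketch on a made-up snippet:

```python
import ast
import re

src = '''\
def real():
    """Docs mention: def fake(): pass"""
    return 1  # important comment
'''

# Anti-pattern: the regex treats text inside the docstring as a definition.
regex_names = re.findall(r"def (\w+)\(", src)

# AST: only actual FunctionDef nodes count.
ast_names = [n.name for n in ast.walk(ast.parse(src)) if isinstance(n, ast.FunctionDef)]

# Anti-pattern: trusting unparse to round-trip. Comments are not part of the AST.
round_tripped = ast.unparse(ast.parse(src))

print(regex_names)   # ['real', 'fake']
print(ast_names)     # ['real']
print("important comment" in round_tripped)  # False
```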

🧪 Verification / duplicates

  • Verified against: Aho et al., Dragon Book (2nd ed.); Python ast docs; tree-sitter docs (2025); Babel handbook.
  • Trust level: A.
  • AST(Abstract_Syntax_Tree).md redirects to this page.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: canonical AST document; added tree-sitter and LLM-grounded edit |