---
id: wiki-2026-0508-concrete-syntax-tree-cst
title: Concrete Syntax Tree (CST)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Parse Tree, CST, Lossless Syntax Tree]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [parser, ast, tooling, language-engineering]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Rust/JS
  framework: tree-sitter/rowan
---

# Concrete Syntax Tree (CST)

## In One Line

> **"A lossless tree that preserves every token of the source, including whitespace and comments."** An AST keeps only semantic structure, whereas a CST can round-trip the source. Modern tooling such as tree-sitter, rowan (rust-analyzer), and Roslyn builds IDE features, refactoring, and formatters on top of a CST.

## Core

### CST vs AST

- **AST**: semantic nodes only (`if`, `BinaryOp`, etc.); comments and whitespace are discarded.
- **CST**: keeps all tokens plus trivia, so `source == reprint(cst)`.
- Lowering CST → AST is possible; the reverse direction is lossy.

### Core properties

- **Lossless**: printing the tree recovers the original source byte-for-byte.
- **Error-tolerant**: incomplete or invalid code still yields a partial tree.
- **Incremental**: on edit, only the affected subtrees are reparsed (tree-sitter).
- **Untyped or weakly typed**: all nodes are homogeneous; navigation goes through typed wrappers.

### Applications

1. IDE: syntax highlighting, folding, outline, indentation.
2. Refactoring: rename, extract method (formatting preserved).
3. Formatter (prettier, rustfmt): layout decisions rely on the comments and token positions a CST retains.
4. Linter: tree-sitter queries.
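The lossless property from "CST vs AST" can be sketched without any parser library. The example below is a minimal illustration, not how tree-sitter or rowan work internally: it uses a flat token list in place of a real tree, and `tokenize`, `reprint`, and `dropTrivia` are hypothetical names invented for this note. The key assumption is that every input character lands in exactly one token, so concatenating token texts reproduces the source.

```javascript
// Minimal sketch of a lossless token stream (hypothetical helpers, not a library API).
// Every character of the input ends up in exactly one token.
function tokenize(src) {
  // Order matters: trivia first, then identifiers, numbers, and a one-character catch-all.
  const re = /(\s+)|(\/\/[^\n]*)|([A-Za-z_]\w*)|(\d+)|([\s\S])/g;
  const kinds = ['WHITESPACE', 'COMMENT', 'IDENT', 'NUMBER', 'PUNCT'];
  const tokens = [];
  for (const m of src.matchAll(re)) {
    // The first capture group that matched decides the token kind.
    const kind = kinds[m.slice(1).findIndex((g) => g !== undefined)];
    tokens.push({ kind, text: m[0] });
  }
  return tokens;
}

// CST-style view: keep everything, so reprinting is plain concatenation.
function reprint(tokens) {
  return tokens.map((t) => t.text).join('');
}

// AST-style view: drop trivia. This direction is lossy.
function dropTrivia(tokens) {
  return tokens.filter((t) => t.kind !== 'WHITESPACE' && t.kind !== 'COMMENT');
}

const source = 'let  x = 1; // keep me\n';
const cst = tokenize(source);
console.log(reprint(cst) === source);             // prints true: round-trip holds
console.log(reprint(dropTrivia(cst)) === source); // prints false: trivia is gone
```

A real CST nests these tokens under syntax nodes, but the round-trip check stays the same: concatenate the leaf tokens in order and compare with the source.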
## 💻 Patterns

### tree-sitter parsing

```javascript
const Parser = require('tree-sitter');
const TS = require('tree-sitter-typescript').typescript;

const parser = new Parser();
parser.setLanguage(TS);
const tree = parser.parse('const x: number = 1;');

// Pre-order walk over every node with a TreeCursor.
const cursor = tree.walk();
let done = false;
while (!done) {
  console.log(cursor.nodeType, cursor.startIndex, cursor.endIndex);
  if (cursor.gotoFirstChild()) continue;
  // No children: move to the next sibling, climbing up as needed.
  while (!cursor.gotoNextSibling()) {
    if (!cursor.gotoParent()) { done = true; break; }
  }
}
```

### tree-sitter query (S-expression)

```scheme
; Find all function declarations
(function_declaration
  name: (identifier) @func.name
  parameters: (formal_parameters) @func.params)

; Capture import statements (deciding which are unused needs a separate usage pass)
(import_statement
  source: (string) @import.source) @import
```

### Incremental edit

```javascript
const oldTree = parser.parse(oldSrc);

// Insert 8 characters at index 10 (assumes the first line of oldSrc has at least 10 characters).
const newSrc = oldSrc.slice(0, 10) + 'INSERTED' + oldSrc.slice(10);
oldTree.edit({
  startIndex: 10,
  oldEndIndex: 10,
  newEndIndex: 18,
  startPosition: { row: 0, column: 10 },
  oldEndPosition: { row: 0, column: 10 },
  newEndPosition: { row: 0, column: 18 },
});

const newTree = parser.parse(newSrc, oldTree); // reuses unchanged subtrees
```

### rowan (rust-analyzer) typed wrapper

```rust
// Untyped GreenNode underneath, typed wrapper on top.
// `Lang` is assumed to be a marker type implementing `rowan::Language`
// with `Kind = SyntaxKind`; that impl is omitted here.
type SyntaxNode = rowan::SyntaxNode<Lang>;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[allow(non_camel_case_types)]
#[repr(u16)]
enum SyntaxKind { L_PAREN, R_PAREN, IDENT, FN_KW, FN_DEF, ROOT, /* ... */ }

struct FnDef(SyntaxNode);

impl FnDef {
    fn name(&self) -> Option<String> {
        // IDENT is a token, so search the child tokens rather than child nodes.
        self.0
            .children_with_tokens()
            .filter_map(|el| el.into_token())
            .find(|t| t.kind() == SyntaxKind::IDENT)
            .map(|t| t.text().to_string())
    }
}
```

### Lossless rewrite (rename)

```rust
// Replace matching IDENT tokens and reprint; comments and whitespace are preserved.
fn rename(node: &SyntaxNode, old: &str, new: &str) -> String {
    let mut out = String::new();
    // Visit every token in source order, trivia included.
    for el in node.descendants_with_tokens() {
        if let Some(t) = el.as_token() {
            if t.kind() == SyntaxKind::IDENT && t.text() == old {
                out.push_str(new);
            } else {
                out.push_str(t.text());
            }
        }
    }
    out
}
```

## Decision Criteria

| Situation | Approach |
|---|---|
| Compiler / type checker | AST (semantics suffice) |
| IDE / LSP / formatter | CST (lossless required) |
| Refactoring tool | CST + typed wrapper |
| Quick analysis script | tree-sitter query |
| New language design | rowan or tree-sitter base |

**Default**: use a CST for user-facing tools and an AST for compiler-internal passes.

## 🔗 Graph

- Parent: [[Parser]]
- Variant: [[AST]]
- Application: [[Prettier]]

## 🤖 LLM Usage

**When**: source-to-source transformation, IDE-grade tooling, codemods.

**When not**: pure semantic analysis (an AST is enough).

## ❌ Anti-patterns

- **Regex on source**: fragile; use CST queries instead.
- **Formatter on an AST**: comments are lost; a CST is required.
- **Hand-rolled parser**: error recovery is usually missing; use tree-sitter/lark.
- **Full reparse on every keystroke**: use the incremental edit API.

## 🧪 Verification / Duplicates

- Verified (tree-sitter docs, rust-analyzer rowan, Roslyn architecture).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: CST vs AST + tree-sitter/rowan patterns |