---
id: cs-tries-trees
title: Tries / Trees — Prefix / Autocomplete / Routing
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [cs, tree, trie, vibe-coding]
tech_stack: { language: "TS", applicable_to: ["Backend", "Frontend"] }
applied_in: []
aliases: [Trie, prefix tree, radix tree, ART, autocomplete, route matching, suffix tree]
---
# Tries / Trees
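> Prefix-based data structures for **autocomplete, route matching, IP routing, and dictionaries**: Trie / Radix / ART (Adaptive Radix Tree). A natural fit for string keys.
## 📖 Core Concepts
- Trie: each character is one node (one step down the tree per character).
- Radix: compresses chains of single-child nodes into one edge.
- ART: a cache-friendly, modern radix tree with adaptive node sizes.
- Suffix tree: a compressed trie of all suffixes of a string.
## 💻 Code Patterns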
### Basic Trie
```ts
class TrieNode {
  children = new Map<string, TrieNode>();
  isEnd = false;
}
class Trie {
  root = new TrieNode();
  insert(word: string) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) {
        node.children.set(ch, new TrieNode());
      }
      node = node.children.get(ch)!;
    }
    node.isEnd = true;
  }
  search(word: string): boolean {
    const node = this.findNode(word);
    return node?.isEnd ?? false;
  }
  startsWith(prefix: string): boolean {
    return this.findNode(prefix) !== null;
  }
  private findNode(s: string): TrieNode | null {
    let node = this.root;
    for (const ch of s) {
      const next = node.children.get(ch);
      if (!next) return null;
      node = next;
    }
    return node;
  }
}
```
### Autocomplete
```ts
class AutocompleteTrie {
  // ... same fields and methods as Trie above, plus:
  suggestions(prefix: string, max = 10): string[] {
    const node = this.findNode(prefix);
    if (!node) return [];
    const result: string[] = [];
    this.collect(node, prefix, result, max);
    return result;
  }
  // Depth-first walk; children are visited in insertion order (Map order).
  private collect(node: TrieNode, current: string, result: string[], max: number) {
    if (result.length >= max) return;
    if (node.isEnd) result.push(current);
    for (const [ch, child] of node.children) {
      this.collect(child, current + ch, result, max);
    }
  }
}
const trie = new AutocompleteTrie();
['apple', 'app', 'application', 'apply'].forEach(w => trie.insert(w));
trie.suggestions('app'); // ['app', 'apple', 'application', 'apply']
```
### Frequency-based autocomplete
```ts
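// Node variant that records how often the word ending here was inserted.
class FreqNode {
  children = new Map<string, FreqNode>();
  frequency = 0; // > 0 means a word ends at this node
}
class FrequencyTrie {
  root = new FreqNode();
  insert(word: string, freq: number = 1) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) {
        node.children.set(ch, new FreqNode());
      }
      node = node.children.get(ch)!;
    }
    node.frequency += freq;
  }
  topSuggestions(prefix: string, k = 5): string[] {
    const node = this.findNode(prefix);
    if (!node) return [];
    // Full sort for brevity; a size-k heap avoids sorting every candidate.
    const all: { word: string; freq: number }[] = [];
    this.collectAll(node, prefix, all);
    return all
      .sort((a, b) => b.freq - a.freq)
      .slice(0, k)
      .map(x => x.word);
  }
  private findNode(s: string): FreqNode | null {
    let node = this.root;
    for (const ch of s) {
      const next = node.children.get(ch);
      if (!next) return null;
      node = next;
    }
    return node;
  }
  // The word is rebuilt from the path, so nodes do not need to store it.
  private collectAll(node: FreqNode, current: string, out: { word: string; freq: number }[]) {
    if (node.frequency > 0) out.push({ word: current, freq: node.frequency });
    for (const [ch, child] of node.children) this.collectAll(child, current + ch, out);
  }
}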
```
→ Search query autocomplete.
### Radix tree (compressed trie)
```ts
// Keys: "app", "apple", "apply", "application"
// Trie:  a→p→p (end), then l→e (end), l→y (end), l→i→c→a→t→i→o→n (end)
// Radix: "app" (end) → "l" → "e" (end)
//                          → "y" (end)
//                          → "ication" (end)
class RadixNode {
  children = new Map<string, RadixNode>(); // edge label → node
  isEnd = false;
  value?: any;
}
class RadixTree {
  root = new RadixNode();
  insert(key: string, value: any) {
    // Find the common prefix with an existing edge, then split that edge or extend below it.
    // ... the full implementation is more involved and is omitted here
  }
}
}
```
→ Saves memory. Common in URL routing.
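
The split-or-extend step can be sketched as follows. This is a minimal illustration, not a production radix tree: the names `RadixSketch`, `RadixSketchNode`, and `commonPrefixLength` are hypothetical, edges are indexed by their first character (an assumption that keeps each step a single map lookup), and deletion is left out.

```ts
// Hypothetical sketch: edges carry multi-character labels.
class RadixSketchNode<V> {
  // Keyed by the first character of the edge label.
  children = new Map<string, { label: string; node: RadixSketchNode<V> }>();
  isEnd = false;
  value?: V;
}

function commonPrefixLength(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a.charAt(i) === b.charAt(i)) i++;
  return i;
}

class RadixSketch<V> {
  root = new RadixSketchNode<V>();

  insert(key: string, value: V): void {
    let node = this.root;
    let rest = key;
    while (rest.length > 0) {
      const first = rest.charAt(0);
      const edge = node.children.get(first);
      if (!edge) {
        // No edge starts with this character: hang the remainder as a new leaf.
        const leaf = new RadixSketchNode<V>();
        node.children.set(first, { label: rest, node: leaf });
        node = leaf;
        break;
      }
      const shared = commonPrefixLength(rest, edge.label);
      if (shared < edge.label.length) {
        // Partial match: split the edge at the shared prefix.
        const mid = new RadixSketchNode<V>();
        const tail = edge.label.slice(shared);
        mid.children.set(tail.charAt(0), { label: tail, node: edge.node });
        node.children.set(first, { label: edge.label.slice(0, shared), node: mid });
        node = mid;
      } else {
        // The whole edge label matches: descend.
        node = edge.node;
      }
      rest = rest.slice(shared);
    }
    node.isEnd = true;
    node.value = value;
  }

  get(key: string): V | undefined {
    let node = this.root;
    let rest = key;
    while (rest.length > 0) {
      const edge = node.children.get(rest.charAt(0));
      if (!edge || !rest.startsWith(edge.label)) return undefined;
      node = edge.node;
      rest = rest.slice(edge.label.length);
    }
    return node.isEnd ? node.value : undefined;
  }
}

const radix = new RadixSketch<number>();
['apple', 'app', 'apply', 'application'].forEach((k, i) => radix.insert(k, i));
```

After these four inserts the root has a single `"app"` edge, and the node below it has a single `"l"` edge that fans out to `"e"`, `"y"`, and `"ication"`.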
### URL routing (radix tree)
```
GET /users/:id
GET /users/:id/posts
GET /posts/:id
POST /posts

Tree:
/
├── users/
│   └── :id
│       └── posts
└── posts
    └── :id (or default)
```
```ts
// find-my-way (used by Fastify)
import findMyWay from 'find-my-way';
const router = findMyWay();
router.on('GET', '/users/:id', (req, res, params) => {
res.end(`User ${params.id}`);
});
const match = router.find('GET', '/users/123');
// { handler, params: { id: '123' } }
```
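→ Router internals: Fastify (find-my-way) uses a radix tree and Hono ships trie-based routers; Express instead tests routes one by one against compiled regexes.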
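
To make the matching step concrete, here is a small sketch of a router that stores one trie node per path segment. It is not how find-my-way is implemented (that library compresses at the character level); `SegmentRouter`, `RouteNode`, and `RouteHandler` are hypothetical names, static segments always win over `:param` segments, and there is no backtracking.

```ts
type RouteHandler = (params: Record<string, string>) => string;

class RouteNode {
  staticChildren = new Map<string, RouteNode>();
  paramChild?: { name: string; node: RouteNode };
  handlers = new Map<string, RouteHandler>(); // HTTP method → handler
}

class SegmentRouter {
  private root = new RouteNode();

  on(method: string, path: string, handler: RouteHandler): void {
    let node = this.root;
    for (const seg of path.split('/').filter(Boolean)) {
      if (seg.startsWith(':')) {
        // Parameter segment: one shared child per node.
        let param = node.paramChild;
        if (!param) {
          param = { name: seg.slice(1), node: new RouteNode() };
          node.paramChild = param;
        }
        node = param.node;
      } else {
        let next = node.staticChildren.get(seg);
        if (!next) {
          next = new RouteNode();
          node.staticChildren.set(seg, next);
        }
        node = next;
      }
    }
    node.handlers.set(method, handler);
  }

  find(method: string, path: string): { handler: RouteHandler; params: Record<string, string> } | null {
    let node = this.root;
    const params: Record<string, string> = {};
    for (const seg of path.split('/').filter(Boolean)) {
      const next = node.staticChildren.get(seg);
      if (next) {
        node = next;
        continue;
      }
      const param = node.paramChild;
      if (!param) return null;
      params[param.name] = seg; // capture the dynamic segment
      node = param.node;
    }
    const handler = node.handlers.get(method);
    return handler ? { handler, params } : null;
  }
}

const sketchRouter = new SegmentRouter();
sketchRouter.on('GET', '/users/:id', p => `User ${p.id}`);
sketchRouter.on('GET', '/users/:id/posts', p => `Posts of ${p.id}`);
sketchRouter.on('GET', '/posts/:id', p => `Post ${p.id}`);
sketchRouter.on('POST', '/posts', () => 'Created');
```

Lookup cost depends on the number of path segments, not on the number of registered routes, which is the main reason routers prefer a tree over a linear scan.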
### IP routing (longest prefix match)
```
192.168.1.0/24 → router A
192.168.0.0/16 → router B
0.0.0.0/0 → router C (default)
→ Trie of bits.
```
```ts
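class IPTrie {
  // Each bit (0 / 1) of the address selects a child.
  // A node where a prefix ends stores its next hop; lookup returns the deepest one on the path.
}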
```
→ Linux kernel IPv4 routing table (LC-trie).
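
A longest-prefix-match lookup over a plain binary trie might look like the sketch below. It assumes IPv4 in dotted-quad CIDR notation and does no input validation; `Ipv4PrefixTrie`, `BitNode`, and `ipv4ToInt` are hypothetical names, and real routing tables use compressed variants rather than one node per bit.

```ts
class BitNode {
  zero?: BitNode;
  one?: BitNode;
  nextHop?: string; // set only where an inserted prefix ends
}

// "192.168.1.0" → unsigned 32-bit integer (no validation in this sketch).
function ipv4ToInt(ip: string): number {
  return ip.split('.').reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

class Ipv4PrefixTrie {
  private root = new BitNode();

  insert(cidr: string, nextHop: string): void {
    const [ip, lenText] = cidr.split('/');
    const addr = ipv4ToInt(ip);
    const len = Number(lenText);
    let node = this.root;
    for (let i = 0; i < len; i++) {
      const bit = (addr >>> (31 - i)) & 1; // walk from the most significant bit
      if (bit === 1) {
        if (!node.one) node.one = new BitNode();
        node = node.one;
      } else {
        if (!node.zero) node.zero = new BitNode();
        node = node.zero;
      }
    }
    node.nextHop = nextHop;
  }

  lookup(ip: string): string | undefined {
    const addr = ipv4ToInt(ip);
    let node: BitNode | undefined = this.root;
    let best = this.root.nextHop; // the /0 default route, if any
    for (let i = 0; i < 32 && node; i++) {
      const bit = (addr >>> (31 - i)) & 1;
      node = bit === 1 ? node.one : node.zero;
      // Remember the deepest prefix seen so far: that is the longest match.
      if (node && node.nextHop !== undefined) best = node.nextHop;
    }
    return best;
  }
}

const routes = new Ipv4PrefixTrie();
routes.insert('192.168.1.0/24', 'router A');
routes.insert('192.168.0.0/16', 'router B');
routes.insert('0.0.0.0/0', 'router C');
routes.lookup('192.168.1.42'); // 'router A' (the /24 beats the /16)
```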
### Suffix tree
```
All suffixes of "banana":
- banana
- anana
- nana
- ana
- na
- a
Suffix tree = a compressed trie of all these suffixes.
```
```ts
// Fast substring search: O(m), m = pattern length.
// Build: O(n), n = text length.
// Use case: bioinformatics, text search.
```
→ Ukkonen's algorithm.
### Aho-Corasick (multi-pattern)
```ts
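// Searches for many patterns in a single pass over the text.
// Trie + failure links.
// AhoCorasick is a placeholder class here, not a specific library.
const ac = new AhoCorasick();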
ac.add('cat');
ac.add('dog');
ac.add('cattle');
ac.build();
const matches = ac.search('thecattleshookhead');
// [{ pattern: 'cat', start: 3 }, { pattern: 'cattle', start: 3 }]
```
→ Spam filter, DNA search, IDS.
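
One way the trie plus failure links could be put together is sketched below. `AhoCorasickSketch`, `AcNode`, and `AcMatch` are hypothetical names chosen to mirror the usage above; the sketch indexes by UTF-16 code unit and copies output lists along failure links for simplicity, which costs extra memory on large pattern sets.

```ts
interface AcMatch { pattern: string; start: number }

class AcNode {
  next = new Map<string, AcNode>();
  fail: AcNode | null = null;
  outputs: string[] = []; // patterns that end here, including those reached via failure links
}

class AhoCorasickSketch {
  private root = new AcNode();

  add(pattern: string): void {
    let node = this.root;
    for (let i = 0; i < pattern.length; i++) {
      const ch = pattern.charAt(i);
      let child = node.next.get(ch);
      if (!child) {
        child = new AcNode();
        node.next.set(ch, child);
      }
      node = child;
    }
    node.outputs.push(pattern);
  }

  // Breadth-first pass: a node's failure link points at the longest proper
  // suffix of its path that is also a path from the root.
  build(): void {
    const queue: AcNode[] = [];
    for (const child of this.root.next.values()) {
      child.fail = this.root;
      queue.push(child);
    }
    while (queue.length > 0) {
      const node = queue.shift()!;
      for (const [ch, child] of node.next) {
        let f = node.fail;
        while (f && !f.next.has(ch)) f = f.fail;
        const fail = f ? f.next.get(ch)! : this.root;
        child.fail = fail;
        child.outputs.push(...fail.outputs);
        queue.push(child);
      }
    }
  }

  search(text: string): AcMatch[] {
    const matches: AcMatch[] = [];
    let node = this.root;
    for (let i = 0; i < text.length; i++) {
      const ch = text.charAt(i);
      // On a mismatch, fall back along failure links instead of restarting.
      while (node !== this.root && !node.next.has(ch)) node = node.fail ?? this.root;
      node = node.next.get(ch) ?? this.root;
      for (const p of node.outputs) matches.push({ pattern: p, start: i - p.length + 1 });
    }
    return matches;
  }
}

const acSketch = new AhoCorasickSketch();
['cat', 'dog', 'cattle'].forEach(p => acSketch.add(p));
acSketch.build();
const acMatches = acSketch.search('thecattleshookhead');
```

The text is scanned once regardless of how many patterns were added, so the search cost is proportional to the text length plus the number of matches.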
### Prefix sum (different from trie)
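```ts
// Running count of 'a' up to each position of a string.
const str = 'banana';
const prefix: number[] = [0];
for (const ch of str) prefix.push(prefix[prefix.length - 1] + (ch === 'a' ? 1 : 0));
// prefix = [0, 0, 1, 1, 2, 2, 3]
// Count of 'a' in the half-open range [l, r): prefix[r] - prefix[l]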
```
### Segment tree
```ts
// Range query / point update (range update additionally needs lazy propagation).
// Each node stores the sum / min / max of its range.
// build, queryHelper and updateHelper are omitted here for brevity.
class SegmentTree {
  tree: number[];
  n: number;
  constructor(arr: number[]) {
    this.n = arr.length;
    this.tree = new Array(4 * this.n);
    this.build(arr, 0, 0, this.n - 1);
  }
  query(l: number, r: number): number {
    return this.queryHelper(0, 0, this.n - 1, l, r);
  }
  update(idx: number, val: number) {
    this.updateHelper(0, 0, this.n - 1, idx, val);
  }
}
```
→ Common for range sum / max / min.
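
For reference, a complete range-sum version with the recursive helpers filled in could look like this. It keeps the same layout as the skeleton (root at index 0, children at `2i + 1` and `2i + 2`, array of size `4n`); the class name `SumSegmentTree` is hypothetical, and swapping `+` for `Math.min` or `Math.max` (with a matching identity value) gives the other variants.

```ts
class SumSegmentTree {
  private tree: number[];
  private n: number;

  constructor(arr: number[]) {
    this.n = arr.length;
    this.tree = new Array(4 * Math.max(1, this.n)).fill(0);
    if (this.n > 0) this.build(arr, 0, 0, this.n - 1);
  }

  private build(arr: number[], node: number, lo: number, hi: number): void {
    if (lo === hi) {
      this.tree[node] = arr[lo];
      return;
    }
    const mid = (lo + hi) >> 1;
    this.build(arr, 2 * node + 1, lo, mid);
    this.build(arr, 2 * node + 2, mid + 1, hi);
    this.tree[node] = this.tree[2 * node + 1] + this.tree[2 * node + 2];
  }

  /** Sum of arr[l..r], both ends inclusive. */
  query(l: number, r: number): number {
    return this.queryHelper(0, 0, this.n - 1, l, r);
  }

  private queryHelper(node: number, lo: number, hi: number, l: number, r: number): number {
    if (r < lo || hi < l) return 0; // no overlap
    if (l <= lo && hi <= r) return this.tree[node]; // node range fully inside the query
    const mid = (lo + hi) >> 1;
    return (
      this.queryHelper(2 * node + 1, lo, mid, l, r) +
      this.queryHelper(2 * node + 2, mid + 1, hi, l, r)
    );
  }

  /** Sets arr[idx] = val and refreshes the sums on the path to the root. */
  update(idx: number, val: number): void {
    this.updateHelper(0, 0, this.n - 1, idx, val);
  }

  private updateHelper(node: number, lo: number, hi: number, idx: number, val: number): void {
    if (lo === hi) {
      this.tree[node] = val;
      return;
    }
    const mid = (lo + hi) >> 1;
    if (idx <= mid) this.updateHelper(2 * node + 1, lo, mid, idx, val);
    else this.updateHelper(2 * node + 2, mid + 1, hi, idx, val);
    this.tree[node] = this.tree[2 * node + 1] + this.tree[2 * node + 2];
  }
}

const seg = new SumSegmentTree([1, 3, 5, 7, 9, 11]);
seg.query(1, 3); // 3 + 5 + 7 = 15
```

Both `query` and `update` touch O(log n) nodes.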
### Fenwick tree (BIT)
```ts
// Prefix sum + point update; range sum over [l, r] = query(r) - query(l - 1).
// Smaller and simpler than a segment tree.
// Indices are 1-based: update(0, ...) would never terminate.
class BIT {
  tree: number[];
  constructor(n: number) {
    this.tree = new Array(n + 1).fill(0);
  }
  update(i: number, delta: number) {
    for (; i < this.tree.length; i += i & -i) this.tree[i] += delta;
  }
  query(i: number): number {
    let sum = 0;
    for (; i > 0; i -= i & -i) sum += this.tree[i];
    return sum;
  }
}
```
→ Inversion count, range sum.
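
The inversion-count use case can be sketched as follows. The block repeats a small Fenwick tree so that it runs on its own; `Fenwick`, `rangeSum`, and `countInversions` are hypothetical names, and values are first compressed to ranks `1..m` so they can serve as 1-based indices.

```ts
class Fenwick {
  private tree: number[];
  constructor(n: number) {
    this.tree = new Array(n + 1).fill(0);
  }
  update(i: number, delta: number): void {
    for (; i < this.tree.length; i += i & -i) this.tree[i] += delta;
  }
  query(i: number): number {
    let sum = 0;
    for (; i > 0; i -= i & -i) sum += this.tree[i];
    return sum;
  }
  /** Sum over the 1-based inclusive range [l, r]. */
  rangeSum(l: number, r: number): number {
    return this.query(r) - this.query(l - 1);
  }
}

// Counts pairs (i, j) with i < j and values[i] > values[j] in O(n log n).
function countInversions(values: number[]): number {
  // Coordinate compression: map each distinct value to a rank in 1..m.
  const sorted = Array.from(new Set(values)).sort((a, b) => a - b);
  const rank = new Map<number, number>();
  sorted.forEach((v, i) => rank.set(v, i + 1));

  const seen = new Fenwick(sorted.length);
  let inversions = 0;
  // Scan right to left: everything already in `seen` lies to the right.
  for (let i = values.length - 1; i >= 0; i--) {
    const r = rank.get(values[i])!;
    inversions += seen.query(r - 1); // strictly smaller values to the right
    seen.update(r, 1);
  }
  return inversions;
}

countInversions([2, 4, 1, 3, 5]); // 3: (2,1), (4,1), (4,3)
```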
### Splay tree / Red-black tree / AVL
```
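Self-balancing BST.
- Splay: recently accessed node is rotated to the root (fast for skewed access; amortized O(log n), not strictly balanced)
- Red-black: balance via color
- AVL: balance via height
Used in:
- TreeMap / TreeSet (Java)
- std::map (C++, typically a red-black tree)
- Linux kernel (red-black tree, e.g. the CFS scheduler's run queue)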
```
### B-tree (DB index)
```
[[CS_BTree_LSM_Storage]]:
Each node holds many keys (tens to hundreds).
Disk-friendly.
Postgres / MySQL InnoDB.
```
### Patricia trie (compressed binary)
```
Radix tree over bits.
- IP routing
- Merkle Patricia Trie (Ethereum state; a 16-way variant keyed by nibbles)
```
### MerkleTrie (Ethereum)
```
Each node's hash is computed from its children's hashes:
- Tamper detection
- Light client (proof)
```
### k-d tree (k-dimensional)
```
A BST over k-dimensional points.
Use:
- Nearest neighbor search
- Range query
- 2D / 3D point cloud
```
```ts
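class KDTree {
  // Each node splits the space along one dimension.
  // The split dimension alternates with depth.
}
// Or use an external library:
import { kdTree } from 'kd-tree-javascript';
// Illustrative sample data and a squared-Euclidean distance function.
type P3 = { x: number; y: number; z: number };
const points: P3[] = [{ x: 1, y: 2, z: 3 }, { x: 4, y: 5, z: 6 }];
const distance = (a: P3, b: P3) => (a.x - b.x) ** 2 + (a.y - b.y) ** 2 + (a.z - b.z) ** 2;
const tree = new kdTree(points, distance, ['x', 'y', 'z']);
const nearest = tree.nearest({ x: 0, y: 0, z: 0 }, 5); // up to 5 nearest, as [point, distance] pairs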
```
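
The alternating-split idea can be sketched by hand for the 2D case. This is a minimal illustration under simplifying assumptions: the tree is built once from a static point list by sorting at every level (O(n log² n)), only the single nearest point is returned, and `buildKd`, `nearestPoint`, `KdNode`, and `Point2` are hypothetical names.

```ts
type Point2 = { x: number; y: number };
interface KdNode { point: Point2; left: KdNode | null; right: KdNode | null }

// Split on x at even depths and on y at odd depths, using the median point.
function buildKd(points: Point2[], depth = 0): KdNode | null {
  if (points.length === 0) return null;
  const axis: keyof Point2 = depth % 2 === 0 ? 'x' : 'y';
  const sorted = [...points].sort((a, b) => a[axis] - b[axis]);
  const mid = sorted.length >> 1;
  return {
    point: sorted[mid],
    left: buildKd(sorted.slice(0, mid), depth + 1),
    right: buildKd(sorted.slice(mid + 1), depth + 1),
  };
}

function squaredDistance(a: Point2, b: Point2): number {
  return (a.x - b.x) ** 2 + (a.y - b.y) ** 2;
}

function nearestPoint(root: KdNode | null, target: Point2): Point2 | null {
  let best: Point2 | null = null;
  let bestDist = Infinity;
  const visit = (node: KdNode | null, depth: number): void => {
    if (!node) return;
    const d = squaredDistance(node.point, target);
    if (d < bestDist) {
      bestDist = d;
      best = node.point;
    }
    const axis: keyof Point2 = depth % 2 === 0 ? 'x' : 'y';
    const diff = target[axis] - node.point[axis];
    const [near, far] = diff < 0 ? [node.left, node.right] : [node.right, node.left];
    visit(near, depth + 1);
    // Cross the splitting line only if a closer point could lie beyond it.
    if (diff * diff < bestDist) visit(far, depth + 1);
  };
  visit(root, 0);
  return best;
}

const kdRoot = buildKd([
  { x: 2, y: 3 }, { x: 5, y: 4 }, { x: 9, y: 6 },
  { x: 4, y: 7 }, { x: 8, y: 1 }, { x: 7, y: 2 },
]);
nearestPoint(kdRoot, { x: 9, y: 2 }); // { x: 8, y: 1 }
```

The pruning test on the far side is what keeps typical lookups well below a full scan; in high dimensions it prunes less and less.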
### Quadtree (2D space)
```ts
// Game collision, geo search.
// Each node covers 4 quadrants.
// Rect, Point, MAX_POINTS, getIdx and split are assumed to be defined elsewhere.
class Quadtree {
  bounds: Rect;
  points: Point[] = [];
  children: Quadtree[] = [];
  constructor(bounds: Rect) {
    this.bounds = bounds;
  }
  insert(p: Point) {
    if (this.children.length > 0) {
      const idx = this.getIdx(p);
      this.children[idx].insert(p);
    } else {
      this.points.push(p);
      if (this.points.length > MAX_POINTS) this.split();
    }
  }
}
```
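
A self-contained version with the split and a range query filled in might look like this. `PointQuadtree`, `Pt`, `Box`, `LEAF_CAPACITY`, and `MAX_DEPTH` are hypothetical names and values; boxes are half-open (`[x, x + w)`), and the depth cap is an added safeguard so that many identical points cannot trigger endless splitting.

```ts
interface Pt { x: number; y: number }
interface Box { x: number; y: number; w: number; h: number } // corner + size, half-open

const LEAF_CAPACITY = 4; // assumed tuning value
const MAX_DEPTH = 8;     // assumed cap against runaway splits on duplicate points

function boxContains(b: Box, p: Pt): boolean {
  return p.x >= b.x && p.x < b.x + b.w && p.y >= b.y && p.y < b.y + b.h;
}
function boxesIntersect(a: Box, b: Box): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w && a.y < b.y + b.h && b.y < a.y + a.h;
}

class PointQuadtree {
  private bounds: Box;
  private depth: number;
  private points: Pt[] = [];
  private children: PointQuadtree[] = [];

  constructor(bounds: Box, depth = 0) {
    this.bounds = bounds;
    this.depth = depth;
  }

  /** Returns false if the point lies outside this tree's bounds. */
  insert(p: Pt): boolean {
    if (!boxContains(this.bounds, p)) return false;
    this.add(p);
    return true;
  }

  private add(p: Pt): void {
    if (this.children.length > 0) {
      this.children[this.childIndex(p)].add(p);
      return;
    }
    this.points.push(p);
    if (this.points.length > LEAF_CAPACITY && this.depth < MAX_DEPTH) this.split();
  }

  // 0 = low x / low y, 1 = high x / low y, 2 = low x / high y, 3 = high x / high y
  private childIndex(p: Pt): number {
    const cx = this.bounds.x + this.bounds.w / 2;
    const cy = this.bounds.y + this.bounds.h / 2;
    return (p.x >= cx ? 1 : 0) + (p.y >= cy ? 2 : 0);
  }

  private split(): void {
    const { x, y, w, h } = this.bounds;
    const hw = w / 2;
    const hh = h / 2;
    this.children = [
      new PointQuadtree({ x, y, w: hw, h: hh }, this.depth + 1),
      new PointQuadtree({ x: x + hw, y, w: hw, h: hh }, this.depth + 1),
      new PointQuadtree({ x, y: y + hh, w: hw, h: hh }, this.depth + 1),
      new PointQuadtree({ x: x + hw, y: y + hh, w: hw, h: hh }, this.depth + 1),
    ];
    const old = this.points;
    this.points = [];
    for (const p of old) this.children[this.childIndex(p)].add(p);
  }

  /** Collects every stored point inside `range`, skipping quadrants that cannot overlap it. */
  query(range: Box, found: Pt[] = []): Pt[] {
    if (!boxesIntersect(this.bounds, range)) return found;
    for (const p of this.points) if (boxContains(range, p)) found.push(p);
    for (const c of this.children) c.query(range, found);
    return found;
  }
}

const qt = new PointQuadtree({ x: 0, y: 0, w: 100, h: 100 });
for (let i = 0; i < 10; i++) {
  for (let j = 0; j < 10; j++) qt.insert({ x: i * 10 + 5, y: j * 10 + 5 });
}
const hits = qt.query({ x: 0, y: 0, w: 30, h: 30 });
```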
### Geohash
```
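Lat/lon → string prefix.
"u4pruyd": 7 chars ≈ 150 m cell.
Shared prefix = same cell:
"u4pru" matches everything inside the ~5 km "u4pru" cell.
Caveat: points just across a cell boundary are close but may share no long prefix, so check neighboring cells too.
→ Trie + geo.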
```
```ts
import geohash from 'ngeohash';
const hash = geohash.encode(37.5, 127.0, 9); // 9 char ≈ 4.8m
const decoded = geohash.decode(hash); // {latitude, longitude}
const neighbors = geohash.neighbors(hash);
```
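
Why nearby points share a prefix becomes clearer from the encoding itself, sketched here from scratch. This is an illustrative implementation of the standard geohash scheme, not ngeohash's source; `geohashEncode` and `GEOHASH_BASE32` are hypothetical names.

```ts
const GEOHASH_BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

// Repeatedly halves the longitude and latitude ranges, emitting one bit per
// halving (longitude first, then alternating), and packs 5 bits per character.
function geohashEncode(lat: number, lon: number, precision = 9): string {
  let latLo = -90, latHi = 90;
  let lonLo = -180, lonHi = 180;
  let hash = '';
  let bitCount = 0;
  let value = 0;
  let useLon = true;
  while (hash.length < precision) {
    if (useLon) {
      const mid = (lonLo + lonHi) / 2;
      if (lon >= mid) { value = value * 2 + 1; lonLo = mid; }
      else { value = value * 2; lonHi = mid; }
    } else {
      const mid = (latLo + latHi) / 2;
      if (lat >= mid) { value = value * 2 + 1; latLo = mid; }
      else { value = value * 2; latHi = mid; }
    }
    useLon = !useLon;
    bitCount++;
    if (bitCount === 5) {
      hash += GEOHASH_BASE32.charAt(value);
      bitCount = 0;
      value = 0;
    }
  }
  return hash;
}

geohashEncode(57.64911, 10.40744, 5); // 'u4pru'
```

Each extra character only narrows the cell chosen by the previous ones, so a longer hash always starts with the shorter hash of the same point. That is what lets a trie over geohash strings answer "everything in this cell" as a prefix query.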
### Use cases summary
```
Trie:
- Autocomplete (search box)
- Spell check
- IP routing
- Dictionary (English words)
- Company jargon
Radix:
- URL router (Fastify, Hono)
- Memory-efficient string key
ART:
- In-memory DB index (HyPer, DuckDB)
- Cache-friendly
Suffix tree:
- DNA / bioinformatics
- Substring search
B-tree:
- DB index (Postgres, MySQL)
- File system (Btrfs, XFS; ext4 uses the B-tree-like HTree for directories)
Segment tree / BIT:
- Range query
- Competitive programming
k-d tree / quadtree:
- Geo search
- Game collision
```
### Performance
```
Trie operations:
- Insert / search: O(L) — L = key length
- Memory: O(N × L) — N = key count
Radix:
- Same as Trie, with less memory (compression)
Hash map (alternative):
- O(1) average lookup (plus O(L) to hash the key), but no prefix queries
- Use trie when prefix matters
```
### Trie vs hash map
```
Trie:
+ Prefix query (autocomplete)
+ Sorted order
+ Lex traversal
- High memory use (per char)
Hash map:
+ O(1) lookup
+ Low memory use
- No prefix
```
### Production library
```
- find-my-way: Fastify router (radix)
- ART: Adaptive Radix Tree (C / Rust)
- Hand-rolled: implementing your own in TS is fine
```
### When NOT to use trie
```
- No prefix queries needed (hash map)
- Large strings, few queries, approximate membership is enough (Bloom filter)
- Memory critical (hash + Bloom)
```
## 🤔 Decision Criteria
| Use case | Recommendation |
|---|---|
| Autocomplete | Trie / Radix |
| URL routing | Radix tree |
| IP routing | Patricia / bit-level radix |
| Substring search over large text | Suffix tree / Aho-Corasick |
| Range query | Segment / BIT |
| Geo search | Quadtree / k-d tree / Geohash |
| In-memory DB | ART |
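## ❌ Anti-patterns
- **Trie everywhere**: a hash map is often enough.
- **Not measuring trie memory**: a large dataset can OOM.
- **Recursing through a deep trie**: risks stack overflow; traverse iteratively.
- **Assuming string keys only**: binary (bit-level) tries are possible too.
- **Building a suffix tree in O(n²)**: use Ukkonen's O(n) algorithm.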
## 🤖 LLM Usage Hints
- Autocomplete = the natural use case for a trie.
- URL routers are often radix trees inside.
- Geo = Geohash + Quadtree.
- DB = B-tree (covered in another document).
## 🔗 Related Documents
- [[CS_BTree_LSM_Storage]]
- [[CS_Big_O_Practical]]
- [[DB_Full_Text_Search]]