---
id: wiki-2026-0508-relational-algebra-in-databases
title: Relational Algebra in Databases
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Relational Algebra, RA, SQL Algebra]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [database, theory, sql, query-optimization]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: sql
  framework: postgres
---

# Relational Algebra in Databases

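## In one line
> **"SQL is syntactic sugar over the algebra."** Codd's (1970) relational algebra is a closed system of set-based operators (σ, π, ⋈, ∪, −, ×): every operator takes relations and returns a relation, so expressions compose freely. The plan trees built by modern query optimizers (Postgres, DuckDB, Snowflake) are essentially expressions in an extended relational algebra.
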
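## Core

### 6 primitive operators
- **σ (Selection)**: row filter. `σ_{age>30}(R)` ≡ `WHERE age>30`.
- **π (Projection)**: column subset. `π_{name,age}(R)` ≡ `SELECT DISTINCT name, age` (a plain `SELECT` keeps duplicates).
- **⋈ (Join)**: theta/equi/natural. `R ⋈_{R.id=S.rid} S`.
- **∪ / − / ∩**: set operations on union-compatible relations.
- **× (Cartesian product)**: `R × S` yields |R|·|S| tuples, so it is expensive.
- **ρ (Rename)**: alias for a relation or an attribute.

Strictly speaking, the six primitives are σ, π, ×, ∪, −, and ρ. Join and intersection are listed here because they are used so often, but both are derivable: `R ⋈_θ S` ≡ `σ_θ(R × S)` and `R ∩ S` ≡ `R − (R − S)`.
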
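The operators listed above map onto SQL roughly as follows. This is a minimal sketch, assuming two hypothetical temporary tables `r` and `s` (names, columns, and data are illustrative and not part of this note's schema). The last two queries show a theta join written both as a join and as a selection over a Cartesian product.

```sql
-- Hypothetical sample relations (illustrative only).
CREATE TEMP TABLE r (id int PRIMARY KEY, name text, age int);
CREATE TEMP TABLE s (rid int, city text);
INSERT INTO r VALUES (1, 'Ann', 41), (2, 'Bob', 25);
INSERT INTO s VALUES (1, 'Oslo'), (1, 'Rome'), (2, 'Kyiv');

-- σ_{age>30}(r): selection keeps the rows that satisfy the predicate.
SELECT * FROM r WHERE age > 30;

-- π_{name,age}(r): projection; DISTINCT restores set semantics.
SELECT DISTINCT name, age FROM r;

-- ρ: rename a relation (r → p) and an attribute (name → person_name).
SELECT p.name AS person_name FROM r AS p;

-- r × s: Cartesian product (2 × 3 = 6 rows here).
SELECT * FROM r CROSS JOIN s;

-- r ⋈_{r.id = s.rid} s, written two equivalent ways:
SELECT * FROM r JOIN s ON r.id = s.rid;            -- join operator
SELECT * FROM r CROSS JOIN s WHERE r.id = s.rid;   -- σ over ×
```

In practice the planner usually treats the two join spellings identically, so the choice between them is about readability rather than performance.
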
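### Derived operators
- **Outer joins** (⟕, ⟖, ⟗): like ⋈, but unmatched tuples are kept and padded with NULLs.
- **Division** (÷): the "for all" quantifier. `R ÷ S` returns the tuples in R that are related to every tuple in S.
- **Aggregation** (γ): grouping plus aggregate functions, e.g. `_{dept}γ_{avg(salary)}(Emp)`. This is an extension of the classical algebra rather than something derivable from the primitives.
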
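To make "null-padded" concrete, the sketch below writes a left outer join directly and then rebuilds the same result from an inner join plus the unmatched rows. The tables `emp_demo` and `dept_demo` are hypothetical and exist only for this illustration.

```sql
-- Hypothetical sample relations (illustrative only).
CREATE TEMP TABLE emp_demo (name text, dept_id int);
CREATE TEMP TABLE dept_demo (id int, title text);
INSERT INTO emp_demo VALUES ('Ann', 10), ('Bob', NULL);
INSERT INTO dept_demo VALUES (10, 'R&D'), (20, 'Ops');

-- emp_demo ⟕ dept_demo: every employee survives; Bob gets NULL for title.
SELECT e.name, d.title
FROM emp_demo e
LEFT JOIN dept_demo d ON e.dept_id = d.id;

-- The same rows built from simpler pieces:
-- (inner join) plus (unmatched left rows padded with NULL).
SELECT e.name, d.title
FROM emp_demo e
JOIN dept_demo d ON e.dept_id = d.id
UNION ALL
SELECT e.name, NULL
FROM emp_demo e
WHERE NOT EXISTS (SELECT 1 FROM dept_demo d WHERE d.id = e.dept_id);
```

The second form uses `UNION ALL` because the two branches cannot overlap, so no duplicate elimination is needed.
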
### Applications
1. Query optimizers rewrite the RA tree (predicate pushdown, join reordering).
2. View materialization relies on algebraic equivalence to decide when a stored result can answer a query.
3. Datalog and differential dataflow engines evaluate the algebraic operators incrementally.

## 💻 Patterns

### Selection pushdown
```sql
-- Logical:  π_{name}(σ_{age>30}(Emp ⋈ Dept))
-- Physical: σ pushed below ⋈, so the join sees a smaller intermediate result
SELECT e.name
FROM Emp e
JOIN Dept d ON e.dept_id = d.id
WHERE e.age > 30;
-- The optimizer pushes σ_{age>30} down onto the scan of Emp.
```

### Projection pushdown
```sql
-- π_{name,salary}(Emp ⋈ Dept): no Dept column appears in the output
-- VERBOSE prints each plan node's Output list, which shows where columns are dropped.
EXPLAIN (VERBOSE, FORMAT TEXT)
SELECT e.name, e.salary
FROM Emp e
JOIN Dept d ON e.dept_id = d.id;
-- From Emp, only name, salary, and dept_id are needed above the scan.
```

### Join reordering (⋈ associative + commutative)
```sql
-- (A ⋈ B) ⋈ C ≡ A ⋈ (B ⋈ C) for inner joins, but the cost differs
SET join_collapse_limit = 12;  -- default is 8; explicit JOINs among up to this many items may be reordered
EXPLAIN ANALYZE
SELECT * FROM small s JOIN big b ON s.k=b.k JOIN huge h ON b.k=h.k;
-- With a hash join, the planner typically picks the smaller input (small) as the build side.
```

### Division via NOT EXISTS
```sql
-- "students who took every required course"
-- Took ÷ Required
SELECT s.id FROM Students s
WHERE NOT EXISTS (              -- there is no required course ...
    SELECT 1 FROM Required r
    WHERE NOT EXISTS (          -- ... that this student has not taken
        SELECT 1 FROM Took t
        WHERE t.student_id = s.id AND t.course_id = r.course_id
    )
);
```

### Aggregation (γ)
```sql
-- _{dept_id}γ_{count(*),avg(salary)}(Emp)
SELECT dept_id, COUNT(*), AVG(salary)
FROM Emp
GROUP BY dept_id;
```

### Set operations
```sql
-- A − B (set difference)
SELECT id FROM ActiveUsers
EXCEPT
SELECT id FROM BannedUsers;

-- A ∩ B
SELECT id FROM Premium INTERSECT SELECT id FROM Annual;
```

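### Equivalence rewriting
```sql
-- σ_{p∧q}(R) ≡ σ_p(σ_q(R))   a conjunctive selection can be split into a cascade
-- σ_p(R ⋈ S) ≡ σ_p(R) ⋈ S    if p references only attributes of R
-- π_L(R ⋈ S) ≡ π_L(π_{L∪join}(R) ⋈ π_{L∪join}(S))
--   where "join" is the set of join attributes and each inner π keeps only the attributes its relation has
```
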
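One way to sanity-check a rewrite rule such as `σ_p(R ⋈ S) ≡ σ_p(R) ⋈ S` is to run both forms and compute their symmetric difference. The sketch below assumes two hypothetical tables, `r_eq` and `s_eq`, created only for this check. Agreement on one data set is a sanity check, not a proof, and `EXCEPT` compares the results as sets, so it ignores duplicate counts.

```sql
-- Hypothetical sample relations (illustrative only).
CREATE TEMP TABLE r_eq (id int, age int);
CREATE TEMP TABLE s_eq (rid int, city text);
INSERT INTO r_eq VALUES (1, 41), (2, 25), (3, 35);
INSERT INTO s_eq VALUES (1, 'Oslo'), (2, 'Kyiv'), (3, 'Rome');

-- Symmetric difference of the two forms; zero rows means they agree on this data.
WITH late_filter AS (    -- σ_{age>30}(R ⋈ S)
    SELECT r.id, r.age, s.city
    FROM r_eq r
    JOIN s_eq s ON r.id = s.rid
    WHERE r.age > 30
),
early_filter AS (        -- σ_{age>30}(R) ⋈ S
    SELECT r.id, r.age, s.city
    FROM (SELECT * FROM r_eq WHERE age > 30) r
    JOIN s_eq s ON r.id = s.rid
)
(SELECT * FROM late_filter EXCEPT SELECT * FROM early_filter)
UNION ALL
(SELECT * FROM early_filter EXCEPT SELECT * FROM late_filter);
```
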
## Decision criteria
| Situation | Operator |
|---|---|
| Filter rows | σ |
| Pick columns | π |
| Combine relations on key | ⋈ |
| Union-compatible merge | ∪ |
| All-quantifier | ÷ |
| Group + aggregate | γ |
| Preserve unmatched rows | ⟕/⟖/⟗ |

**Default**: σ, π, and ⋈ cover roughly 95% of queries (a rule of thumb, not a measurement).

## 🔗 Graph
- Parent: [[SQL]]
- Adjacent: [[Normalization]] · [[ACID]]

## 🤖 LLM usage
**When to use**: explaining the SQL → RA tree translation, suggesting query rewrites, derivations for learning.
**When not to use**: production query plans. Use `EXPLAIN ANALYZE` for those.

## ❌ Anti-patterns
- **Accidental Cartesian product**: a missing JOIN condition yields N×M rows.
- **σ above ⋈**: in the cases where the optimizer cannot push the predicate down, rewrite the query by hand.
- **`SELECT *` in a subquery**: gets in the way of π pushdown.
- **Confusing bags with sets**: SQL works on bags (multisets). `UNION ALL` ≠ ∪; plain `UNION` is the set operator.

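The bag-versus-set point can be seen with a small example. This is a minimal sketch using two hypothetical one-column tables, `a_demo` and `b_demo`, that share one value.

```sql
-- Hypothetical sample relations with an overlapping value (illustrative only).
CREATE TEMP TABLE a_demo (id int);
CREATE TEMP TABLE b_demo (id int);
INSERT INTO a_demo VALUES (1), (2);
INSERT INTO b_demo VALUES (2), (3);

-- Set union (∪): duplicates removed, 3 rows: 1, 2, 3.
SELECT id FROM a_demo UNION SELECT id FROM b_demo;

-- Bag union: duplicates kept, 4 rows: 1, 2, 2, 3 (row order is not guaranteed).
SELECT id FROM a_demo UNION ALL SELECT id FROM b_demo;
```

`UNION ALL` skips the duplicate-elimination step, so it is the cheaper choice whenever the inputs are known to be disjoint or duplicates are acceptable.
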
## 🧪 Verification / duplicates
- Verified (Codd 1970; Garcia-Molina *Database Systems* ch.2.4; Postgres planner docs).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full content (operators + 7 patterns) |