---
id: wiki-2026-0508-relational-algebra-in-databases
title: Relational Algebra in Databases
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Relational Algebra, RA, SQL Algebra]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [database, theory, sql, query-optimization]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: sql
  framework: postgres
---
# Relational Algebra in Databases
## One-line summary
> **"SQL is syntactic sugar over an algebra."** Codd's (1970) relational algebra is a closed system of set-based operators (σ, π, ⋈, ∪, −, ×): every operator takes relations and returns a relation. The plan trees of modern query optimizers (Postgres, DuckDB, Snowflake) are essentially RA expressions.
## Core concepts
### Basic operators
- **σ (Selection)**: row filter. `σ_{age>30}(R)` → `WHERE age>30`.
- **π (Projection)**: column subset. `π_{name,age}(R)` → `SELECT name, age`.
- **⋈ (Join)**: theta/equi/natural. `R ⋈_{R.id=S.rid} S`. Derivable as σ over ×, but treated as basic in practice.
- **∪ / − / ∩**: set ops on union-compatible relations. ∩ is derivable: `R ∩ S = R − (R − S)`.
- **× (Cartesian product)**: `R × S`. Expensive: |R|·|S| rows.
- **ρ (Rename)**: alias.
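
Composing operators makes the closure property easy to see. The following is a minimal Python sketch, not how any database implements these operators. The helper names (`relation`, `select`, `project`, `rename`, `product`, `natural_join`) and the `emp`/`dept` sample data are made up for illustration. A relation is modeled as a set of rows, and each row as a frozenset of `(attribute, value)` pairs.

```python
def relation(rows):
    """Build a relation (a set of rows) from a list of dicts."""
    return {frozenset(r.items()) for r in rows}

def select(pred, rel):            # σ
    return {r for r in rel if pred(dict(r))}

def project(attrs, rel):          # π (set semantics: duplicates collapse)
    return {frozenset((a, v) for a, v in r if a in attrs) for r in rel}

def rename(mapping, rel):         # ρ
    return {frozenset((mapping.get(a, a), v) for a, v in r) for r in rel}

def product(r, s):                # ×, assumes disjoint attribute names
    return {x | y for x in r for y in s}

def natural_join(r, s):           # ⋈ on all shared attribute names
    out = set()
    for x in r:
        for y in s:
            dx, dy = dict(x), dict(y)
            if all(dx[a] == dy[a] for a in dx.keys() & dy.keys()):
                out.add(x | y)
    return out

# Hypothetical sample data.
emp = relation([
    {"name": "Ann", "age": 41, "dept_id": 1},
    {"name": "Bob", "age": 28, "dept_id": 1},
    {"name": "Cy", "age": 35, "dept_id": 2},
])
dept = relation([{"dept_id": 1, "dept": "Eng"}, {"dept_id": 2, "dept": "Ops"}])

# π_{name,dept}(σ_{age>30}(Emp ⋈ Dept)): every step returns a relation.
result = project({"name", "dept"},
                 select(lambda t: t["age"] > 30, natural_join(emp, dept)))

# The same query using primitives only: ρ, ×, σ, π.
dept_r = rename({"dept_id": "d_id"}, dept)
via_primitives = project(
    {"name", "dept"},
    select(lambda t: t["dept_id"] == t["d_id"] and t["age"] > 30,
           product(emp, dept_r)),
)
```

Both expressions yield the same two rows (Ann/Eng and Cy/Ops), which is the sense in which ⋈ is derivable from σ, × and ρ.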
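### Derived operators
- **Outer joins** (⟕, ⟖, ⟗): unmatched rows are kept and null-padded.
- **Division** (÷): the "for all" quantifier. `R ÷ S` = the tuples in R that are related to every tuple in S.
- **Aggregation** (γ): `_{dept}γ_{avg(salary)}(Emp)`, with grouping attributes as the left subscript and aggregates as the right.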
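
Division is the least obvious of these. A standard textbook derivation from the primitives is `R ÷ S = π_A(R) − π_A((π_A(R) × S) − R)`, where A is the set of attributes of R that are not in S. Below is a minimal Python sketch of that identity; the helper names and the `took`/`required` sample data are illustrative assumptions.

```python
def relation(rows):
    """Build a relation (a set of rows) from a list of dicts."""
    return {frozenset(r.items()) for r in rows}

def project(attrs, rel):
    return {frozenset((a, v) for a, v in r if a in attrs) for r in rel}

def product(r, s):
    # Assumes r and s have disjoint attribute names.
    return {x | y for x in r for y in s}

def divide(r, s, quotient_attrs):
    """R ÷ S, assuming attrs(R) is exactly quotient_attrs plus attrs(S)."""
    candidates = project(quotient_attrs, r)      # π_A(R)
    missing = product(candidates, s) - r         # pairs that should exist but do not
    return candidates - project(quotient_attrs, missing)

# Hypothetical sample data.
took = relation([
    {"student_id": 1, "course_id": "db"},
    {"student_id": 1, "course_id": "os"},
    {"student_id": 2, "course_id": "db"},
])
required = relation([{"course_id": "db"}, {"course_id": "os"}])

result = divide(took, required, {"student_id"})
```

Student 2 is dropped because the pair (2, os) is missing from `took`; only student 1 survives.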
### Applications
1. Query optimizers rewrite the RA tree (predicate pushdown, join reordering).
2. View materialization relies on algebraic equivalence.
3. Incremental engines behind Datalog and differential dataflow.
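
To make the first item concrete, here is a toy predicate-pushdown rule over an RA tree. It is a sketch far simpler than a real planner: the classes `Scan`, `Join`, `Select` and the function `push_down` are hypothetical names, predicates are plain labels, and cost is ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scan:
    table: str
    attrs: frozenset

@dataclass(frozen=True)
class Join:
    left: object
    right: object

@dataclass(frozen=True)
class Select:
    predicate: str      # label only, e.g. "age > 30"
    uses: frozenset     # attributes the predicate references
    child: object

def attrs(node):
    """Attributes available in a node's output."""
    if isinstance(node, Scan):
        return node.attrs
    if isinstance(node, Join):
        return attrs(node.left) | attrs(node.right)
    return attrs(node.child)    # Select does not change the schema

def push_down(node):
    """Move each σ below a ⋈ when one side supplies all attributes it uses."""
    if isinstance(node, Select):
        child = push_down(node.child)
        if isinstance(child, Join):
            if node.uses <= attrs(child.left):
                pushed = push_down(Select(node.predicate, node.uses, child.left))
                return Join(pushed, child.right)
            if node.uses <= attrs(child.right):
                pushed = push_down(Select(node.predicate, node.uses, child.right))
                return Join(child.left, pushed)
        return Select(node.predicate, node.uses, child)
    if isinstance(node, Join):
        return Join(push_down(node.left), push_down(node.right))
    return node

# Hypothetical schemas.
emp = Scan("Emp", frozenset({"name", "age", "dept_id"}))
dept = Scan("Dept", frozenset({"id", "dept_name", "floor"}))

plan = Select("age > 30", frozenset({"age"}), Join(emp, dept))
optimized = push_down(plan)     # σ now sits directly on the Emp scan

# A predicate touching both sides cannot move and stays above the join.
stuck = push_down(Select("age > floor", frozenset({"age", "floor"}), Join(emp, dept)))
```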
## 💻 Patterns
### Selection pushdown
```sql
-- Logical: π_{name}(σ_{age>30}(Emp ⋈ Dept))
-- Physical: σ pushed below ⋈, so the intermediate result is smaller
SELECT e.name FROM Emp e JOIN Dept d ON e.dept_id=d.id WHERE e.age > 30;
-- The optimizer pushes σ_{age>30} down to the scan of Emp.
```
### Projection pushdown
```sql
-- π_{name,salary}(Emp ⋈ Dept): no Dept column appears in the output
EXPLAIN (FORMAT TEXT)
SELECT e.name, e.salary FROM Emp e JOIN Dept d ON e.dept_id=d.id;
-- Postgres: only e.name, e.salary, e.dept_id (and d.id for the join) are needed.
-- Add VERBOSE to see each plan node's Output list.
```
### Join reordering (⋈ associative + commutative)
```sql
-- (A ⋈ B) ⋈ C ≡ A ⋈ (B ⋈ C), but the costs differ
SET join_collapse_limit = 12;  -- default is 8
EXPLAIN ANALYZE
SELECT * FROM small s JOIN big b ON s.k=b.k JOIN huge h ON b.k=h.k;
-- Expect the planner to pick small as the hash build side.
```
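
Why the order matters can be shown by counting rows. The sketch below is a back-of-the-envelope model, not the Postgres cost model: each table is reduced to a `Counter` of how many rows carry each value of `k`, and the table sizes are invented.

```python
from collections import Counter

def join(left, right):
    """Equi-join on k, tracking only the row count per key value.

    A key with n rows on the left and m on the right yields n * m rows.
    """
    return Counter({k: n * right[k] for k, n in left.items() if right[k]})

def rows(side):
    return sum(side.values())

# Hypothetical table shapes.
small = Counter({1: 1, 2: 1})                     # 2 rows
big = Counter({k: 10 for k in range(1, 101)})     # 1,000 rows
huge = Counter({k: 100 for k in range(1, 101)})   # 10,000 rows

plan_a_mid = join(small, big)         # (small ⋈ big) first
plan_a_out = join(plan_a_mid, huge)

plan_b_mid = join(big, huge)          # (big ⋈ huge) first
plan_b_out = join(plan_b_mid, small)

print(rows(plan_a_mid), rows(plan_b_mid), rows(plan_a_out))  # 20 100000 2000
```

Both plans return the same 2,000 rows, but the second one builds a 100,000-row intermediate result to get there.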
### Division via NOT EXISTS
```sql
-- "students who took every required course"
-- Took ÷ Required
SELECT s.id FROM Students s
WHERE NOT EXISTS (
  SELECT 1 FROM Required r
  WHERE NOT EXISTS (
    SELECT 1 FROM Took t
    WHERE t.student_id=s.id AND t.course_id=r.course_id
  )
);
```
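
The double negation can be checked against the common counting formulation. This sketch runs both in an in-memory SQLite database through Python's `sqlite3` module (SQLite rather than Postgres, purely so the example is self-contained); the schema and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (id INTEGER PRIMARY KEY);
    CREATE TABLE Required (course_id TEXT PRIMARY KEY);
    CREATE TABLE Took (student_id INTEGER, course_id TEXT);
    INSERT INTO Students VALUES (1), (2), (3);
    INSERT INTO Required VALUES ('db'), ('os');
    INSERT INTO Took VALUES
        (1, 'db'), (1, 'os'), (2, 'db'), (3, 'db'), (3, 'os'), (3, 'ml');
""")

# "There is no required course that the student has not taken."
double_negation = """
    SELECT s.id FROM Students s
    WHERE NOT EXISTS (
      SELECT 1 FROM Required r
      WHERE NOT EXISTS (
        SELECT 1 FROM Took t
        WHERE t.student_id = s.id AND t.course_id = r.course_id
      )
    )
    ORDER BY s.id
"""

# "The student's count of required courses equals the size of Required."
counting = """
    SELECT t.student_id
    FROM Took t JOIN Required r ON r.course_id = t.course_id
    GROUP BY t.student_id
    HAVING COUNT(DISTINCT t.course_id) = (SELECT COUNT(*) FROM Required)
    ORDER BY t.student_id
"""

by_not_exists = [row[0] for row in conn.execute(double_negation)]
by_counting = [row[0] for row in conn.execute(counting)]
```

On this data both return students 1 and 3. The two forms diverge when `Required` is empty: the `NOT EXISTS` form returns every student, while the counting form returns none.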
### Aggregation (γ)
```sql
-- _{dept_id}γ_{count(*),avg(salary)}(Emp)
SELECT dept_id, COUNT(*), AVG(salary)
FROM Emp
GROUP BY dept_id;
```
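
As a rough picture of what γ computes, the sketch below groups a bag of rows in plain Python. The function name `aggregate` and the sample rows are assumptions made for the example.

```python
from collections import defaultdict

def aggregate(rows, group_attr, value_attr):
    """γ: group rows by one attribute, return {group: (count, average)}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_attr]].append(row[value_attr])
    return {g: (len(vals), sum(vals) / len(vals)) for g, vals in groups.items()}

# Hypothetical sample data: a list, because duplicates must be preserved.
emp = [
    {"name": "Ann", "dept_id": 1, "salary": 50},
    {"name": "Bob", "dept_id": 1, "salary": 50},
    {"name": "Cy", "dept_id": 2, "salary": 70},
]

stats = aggregate(emp, "dept_id", "salary")

# Projecting to (dept_id, salary) under set semantics first would merge
# Ann and Bob into one row and undercount department 1.
deduplicated = {(r["dept_id"], r["salary"]) for r in emp}
```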
### Set operations
```sql
-- A − B (set difference)
SELECT id FROM ActiveUsers
EXCEPT
SELECT id FROM BannedUsers;
-- A ∩ B
SELECT id FROM Premium INTERSECT SELECT id FROM Annual;
```
### Equivalence rewriting
```sql
-- σ_{p∧q}(R) ≡ σ_p(σ_q(R))        (a conjunction can be split)
-- σ_p(R ⋈ S) ≡ σ_p(R) ⋈ S         if p references only R
-- π_L(R ⋈ S) ≡ π_L(π_{L∪join}(R) ⋈ π_{L∪join}(S))   where join = the join attributes
```
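
These three rules can be spot-checked on small data. The sketch below uses set semantics with hypothetical helpers (`relation`, `select`, `project`, `natural_join`) and invented rows; passing on one dataset illustrates the rules, it does not prove them.

```python
def relation(rows):
    """Build a relation (a set of rows) from a list of dicts."""
    return {frozenset(r.items()) for r in rows}

def select(pred, rel):
    return {r for r in rel if pred(dict(r))}

def project(attrs, rel):
    return {frozenset((a, v) for a, v in r if a in attrs) for r in rel}

def natural_join(r, s):
    out = set()
    for x in r:
        for y in s:
            dx, dy = dict(x), dict(y)
            if all(dx[a] == dy[a] for a in dx.keys() & dy.keys()):
                out.add(x | y)
    return out

# Hypothetical sample data.
emp = relation([
    {"name": "Ann", "age": 41, "salary": 90, "dept_id": 1},
    {"name": "Bob", "age": 28, "salary": 60, "dept_id": 1},
    {"name": "Cy", "age": 35, "salary": 40, "dept_id": 2},
])
dept = relation([
    {"dept_id": 1, "dept": "Eng", "floor": 3},
    {"dept_id": 2, "dept": "Ops", "floor": 1},
])

def p(t):
    return t["age"] > 30        # references only Emp attributes

def q(t):
    return t["salary"] > 50

# Rule 1: split a conjunction.
split_ok = select(lambda t: p(t) and q(t), emp) == select(p, select(q, emp))

# Rule 2: push σ_p below the join.
pushdown_ok = select(p, natural_join(emp, dept)) == natural_join(select(p, emp), dept)

# Rule 3: project early, keeping the output columns L plus the join column.
L, join_attrs = {"name", "dept"}, {"dept_id"}
keep = L | join_attrs
projection_ok = project(L, natural_join(emp, dept)) == project(
    L, natural_join(project(keep, emp), project(keep, dept))
)
```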
## Decision criteria
| Situation | Operator |
|---|---|
| Filter rows | σ |
| Pick columns | π |
| Combine relations on key | ⋈ |
| Union-compatible merge | ∪ |
| All-quantifier | ÷ |
| Group + aggregate | γ |
| Preserve unmatched | ⟕/⟖/⟗ |
**Default**: σ/π/⋈ cover roughly 95% of queries.
## 🔗 Graph
- Parent: [[SQL]]
- Adjacent: [[Normalization]] · [[ACID]]
## 🤖 LLM usage
**When**: explaining SQL → RA tree conversion, suggesting query rewrites, derivations for learning.
**When not**: production query plans. Use EXPLAIN ANALYZE instead.
## ❌ Anti-patterns
- **Careless Cartesian products**: a missing JOIN condition → N×M rows.
- **σ above ⋈**: where the optimizer cannot push the predicate down, rewrite the query manually.
- **`SELECT *` in a subquery**: hinders π pushdown.
- **Confusing bags and sets**: SQL works on bags (multisets). `UNION ALL` ≠ ∪.
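
The bag/set gap is easy to reproduce outside a database. In this sketch Python's `Counter` stands in for a bag and `set` for a relation; the values are arbitrary.

```python
from collections import Counter

a = ["x", "x", "y"]     # "x" appears twice
b = ["x", "z"]

union_all = Counter(a) + Counter(b)     # UNION ALL: multiplicities add
union = set(a) | set(b)                 # UNION (∪): duplicates removed

except_all = Counter(a) - Counter(b)    # EXCEPT ALL: multiplicities subtract, floor at 0
except_ = set(a) - set(b)               # EXCEPT (−): any match removes the value

print(sum(union_all.values()), len(union))   # 5 3
```

`EXCEPT ALL` leaves one `"x"` behind because A holds two copies and B only one, whereas plain `EXCEPT` removes `"x"` entirely.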
## 🧪 Verification / duplicates
- Verified (Codd 1970; Garcia-Molina *Database Systems* ch.2.4; Postgres planner docs).
- Trust level A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — full content (operators + 7 patterns) |