2nd/10_Wiki/Topics/Backend/Relational Algebra in Databases.md

---
id: wiki-2026-0508-relational-algebra-in-databases
title: Relational Algebra in Databases
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Relational Algebra, RA, SQL Algebra]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [database, theory, sql, query-optimization]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: sql
  framework: postgres
---

# Relational Algebra in Databases

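## One-line summary
> **"SQL is syntactic sugar over relational algebra."** Codd's (1970) relational algebra is a closed system of set-based operators (σ, π, ⋈, ∪, −, ×): every operator takes relations as input and returns a relation. The plan trees of modern query optimizers (Postgres, DuckDB, Snowflake) are essentially RA expressions, extended with bag semantics and physical operator choices.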

## Core concepts

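### 6 primitive operators
- **σ (Selection)**: row filter. `σ_{age>30}(R)` ≡ `WHERE age>30`.
- **π (Projection)**: column subset. `π_{name,age}(R)` ≡ `SELECT DISTINCT name, age` (plain `SELECT` keeps duplicates).
- **× (Cartesian product)**: `R × S`, every pairing of rows. Expensive: |R|·|S| rows.
- **∪ (Union)**: rows in either of two union-compatible relations.
- **− (Difference)**: rows in the first relation but not the second. `∩` is shorthand: `R ∩ S = R − (R − S)`.
- **ρ (Rename)**: alias for a relation or its attributes.

**⋈ (Join)** (theta/equi/natural), e.g. `R ⋈_{R.id=S.rid} S`, is strictly a shorthand for `σ_{R.id=S.rid}(R × S)`, but it is the operator planners actually work with.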
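
The operators above can be sketched in a few lines of Python. This is only an illustration under set semantics, not how any engine implements them: the helper names (`relation`, `select`, `project`, ...) and the `emp`/`dept` sample rows are hypothetical. Each row is a frozenset of `(attribute, value)` pairs so that duplicates collapse.

```python
def relation(*rows):
    """Build a relation (a set of rows) from dict rows. Hypothetical helper."""
    return {frozenset(r.items()) for r in rows}

def select(pred, r):                     # σ_pred(R)
    return {t for t in r if pred(dict(t))}

def project(attrs, r):                   # π_attrs(R)
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def rename(mapping, r):                  # ρ_mapping(R)
    return {frozenset((mapping.get(a, a), v) for a, v in t) for t in r}

def product(r, s):                       # R × S (attribute names must be disjoint)
    return {t | u for t in r for u in s}

def union(r, s):                         # R ∪ S
    return r | s

def difference(r, s):                    # R − S
    return r - s

# Shorthands built only from the primitives above.
def theta_join(pred, r, s):              # R ⋈_pred S = σ_pred(R × S)
    return select(pred, product(r, s))

def intersection(r, s):                  # R ∩ S = R − (R − S)
    return difference(r, difference(r, s))

emp = relation({"name": "Ann", "age": 41, "dept_id": 1},
               {"name": "Bob", "age": 25, "dept_id": 2})
dept = relation({"id": 1, "dname": "Eng"}, {"id": 2, "dname": "Ops"})

older = project({"name"}, select(lambda t: t["age"] > 30, emp))
joined = theta_join(lambda t: t["dept_id"] == t["id"], emp, dept)
```

Because every function returns another set of rows, calls nest freely, which is the closure property the summary refers to.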

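### Derived operators
- **Outer joins** (⟕, ⟖, ⟗): joins that keep unmatched rows, padded with NULLs.
- **Division** (÷): the "for all" quantifier. `R ÷ S` = tuples in R that are related to every tuple in S.
- **Aggregation** (γ): grouping plus aggregate functions, e.g. `_{dept}γ_{avg(salary)}(Emp)`. This is an extension of the classical algebra rather than something derivable from the primitives.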
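
Division and γ can be made concrete with a short Python sketch. It assumes a binary relation `R(a, b)` stored as a set of pairs and `S(b)` as a set of values; `divide`, `group_aggregate` and the sample data are hypothetical names for illustration. Division uses the textbook identity `R ÷ S = π_a(R) − π_a((π_a(R) × S) − R)`.

```python
from collections import defaultdict

def divide(r, s):
    """R(a, b) ÷ S(b), built from projection, product and difference."""
    candidates = {a for a, _ in r}                             # π_a(R)
    required_pairs = {(a, b) for a in candidates for b in s}   # π_a(R) × S
    missing = required_pairs - r                               # pairs R lacks
    return candidates - {a for a, _ in missing}                # drop incomplete a's

def group_aggregate(rows, key, agg):
    """_{key}γ_{agg}(rows): group dict rows by `key`, apply `agg` to each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {k: agg(g) for k, g in groups.items()}

took = {("s1", "db"), ("s1", "os"), ("s2", "db")}
required = {"db", "os"}
complete = divide(took, required)

emp = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 140},
    {"dept": "ops", "salary": 90},
]
avg_salary = group_aggregate(
    emp, "dept", lambda g: sum(r["salary"] for r in g) / len(g)
)
```

Note the edge case: dividing by an empty `S` returns every candidate, since "for all" over an empty set is vacuously true.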

### Applications
1. Query optimizers rewrite the RA tree (predicate pushdown, join reordering).
2. View materialization relies on algebraic equivalences.
3. Incremental engines in Datalog and Differential Dataflow.

## 💻 Patterns

### Selection pushdown
```sql
-- Logical:  π_{name}(σ_{age>30}(Emp ⋈ Dept))
-- Physical: σ pushed below ⋈, so the join sees a smaller intermediate result
SELECT e.name FROM Emp e JOIN Dept d ON e.dept_id=d.id WHERE e.age > 30;
-- The optimizer pushes σ_{age>30} down onto the scan of Emp.
```

### Projection pushdown
```sql
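-- π_{name,salary}(Emp ⋈ Dept): no Dept column appears in the output
EXPLAIN (VERBOSE, FORMAT TEXT)
SELECT e.name, e.salary FROM Emp e JOIN Dept d ON e.dept_id=d.id;
-- Only e.name, e.salary, e.dept_id (plus d.id for the join) are logically needed.
-- VERBOSE prints each node's Output list; a scan may still emit extra columns when that is cheaper.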
```

### Join reordering (⋈ associative + commutative)
```sql
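-- (A ⋈ B) ⋈ C  ≡  A ⋈ (B ⋈ C) for inner joins, but the cost differs
SET join_collapse_limit = 12;  -- let the planner reorder explicit JOIN lists of up to 12 items (default 8)
EXPLAIN ANALYZE
SELECT * FROM small s JOIN big b ON s.k=b.k JOIN huge h ON b.k=h.k;
-- With current statistics, expect the planner to pick small as the build side.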
```
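
Why the join order matters even though the result is the same can be seen in a small Python sketch. The `natural_join` helper and the `small`/`big`/`huge` relations below are hypothetical, sized only to make the intermediate results differ; a real planner works from statistics and cost estimates, not from materialized sets.

```python
def natural_join(r, s):
    """R ⋈ S on all shared attribute names; rows are frozensets of (attr, value) pairs."""
    out = set()
    for t in r:
        dt = dict(t)
        for u in s:
            du = dict(u)
            # Rows match when they agree on every attribute they have in common.
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(t | u)
    return out

small = {frozenset({("k", 1)})}
big = {frozenset({("k", k), ("b", k * 10)}) for k in range(50)}
huge = {frozenset({("k", k), ("h", k * 100)}) for k in range(50)}

# (small ⋈ big) ⋈ huge: the first join already shrinks to a single row.
left_first = natural_join(small, big)
result_left = natural_join(left_first, huge)

# small ⋈ (big ⋈ huge): the first join builds a 50-row intermediate.
right_first = natural_join(big, huge)
result_right = natural_join(small, right_first)
```

Both orders return the same single row, but the intermediate result is 1 row in one order and 50 in the other; that gap is what join reordering tries to minimize.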

### Division via NOT EXISTS
```sql
-- "students who took every required course"
-- Took ÷ Required
SELECT s.id FROM Students s
WHERE NOT EXISTS (
  SELECT 1 FROM Required r
  WHERE NOT EXISTS (
    SELECT 1 FROM Took t
    WHERE t.student_id=s.id AND t.course_id=r.course_id
  )
);
```
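
The double `NOT EXISTS` can be checked on toy data. The sketch below uses Python's built-in `sqlite3` as an in-memory stand-in for Postgres; the table contents are made up, and the second query (counting matched courses) is a common alternative formulation added here for comparison, not something taken from this note.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (id INTEGER PRIMARY KEY);
    CREATE TABLE Required (course_id TEXT PRIMARY KEY);
    CREATE TABLE Took (student_id INTEGER, course_id TEXT);
    INSERT INTO Students VALUES (1), (2), (3);
    INSERT INTO Required VALUES ('db'), ('os');
    INSERT INTO Took VALUES
        (1, 'db'), (1, 'os'), (2, 'db'), (3, 'os'), (3, 'db'), (3, 'ml');
""")

# Division as "there is no required course that the student has not taken".
DIVISION_SQL = """
    SELECT s.id FROM Students s
    WHERE NOT EXISTS (
      SELECT 1 FROM Required r
      WHERE NOT EXISTS (
        SELECT 1 FROM Took t
        WHERE t.student_id = s.id AND t.course_id = r.course_id
      )
    )
    ORDER BY s.id
"""
completed = [row[0] for row in conn.execute(DIVISION_SQL)]

# Alternative: count the required courses each student took.
COUNT_SQL = """
    SELECT t.student_id
    FROM Took t JOIN Required r ON r.course_id = t.course_id
    GROUP BY t.student_id
    HAVING COUNT(DISTINCT t.course_id) = (SELECT COUNT(*) FROM Required)
    ORDER BY t.student_id
"""
completed_by_count = [row[0] for row in conn.execute(COUNT_SQL)]
```

The two formulations agree here but diverge when `Required` is empty: the `NOT EXISTS` form returns every student, while the counting form returns none.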

### Aggregation (γ)
```sql
-- _{dept_id}γ_{count(*),avg(salary)}(Emp)
SELECT dept_id, COUNT(*), AVG(salary)
FROM Emp
GROUP BY dept_id;
```

### Set operations
```sql
-- A − B  (set difference)
SELECT id FROM ActiveUsers
EXCEPT
SELECT id FROM BannedUsers;

-- A ∩ B
SELECT id FROM Premium INTERSECT SELECT id FROM Annual;
```

### Equivalence rewriting
```sql
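-- σ_{p∧q}(R) ≡ σ_p(σ_q(R))   conjunctions can be split into a cascade
-- σ_p(R ⋈ S) ≡ σ_p(R) ⋈ S    if p references only attributes of R
-- π_L(R ⋈ S) ≡ π_L(π_{L_R∪J}(R) ⋈ π_{L_S∪J}(S))
--   L_R, L_S: the attributes of L that come from R and S; J: the join attributes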
```
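
The three rules can be spot-checked on sample data with a minimal Python sketch. The helpers and the `emp`/`dept` rows are hypothetical, and agreement on one dataset illustrates the rules rather than proving them.

```python
def relation(*rows):
    """Build a relation (a set of rows) from dict rows."""
    return {frozenset(r.items()) for r in rows}

def select(pred, r):                      # σ
    return {t for t in r if pred(dict(t))}

def project(attrs, r):                    # π
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def join(pred, r, s):                     # ⋈ as σ over ×
    return {t | u for t in r for u in s if pred(dict(t | u))}

emp = relation(
    {"name": "Ann", "age": 41, "dept_id": 1},
    {"name": "Bob", "age": 25, "dept_id": 2},
    {"name": "Cy", "age": 35, "dept_id": 2},
)
dept = relation({"id": 1, "dname": "Eng"}, {"id": 2, "dname": "Ops"})

def over_30(t):
    return t["age"] > 30

def in_dept_2(t):
    return t["dept_id"] == 2

def on_dept(t):
    return t["dept_id"] == t["id"]

# Rule 1: σ_{p∧q}(R) ≡ σ_p(σ_q(R))
cascade_lhs = select(lambda t: over_30(t) and in_dept_2(t), emp)
cascade_rhs = select(over_30, select(in_dept_2, emp))

# Rule 2: σ_p(R ⋈ S) ≡ σ_p(R) ⋈ S, since over_30 only reads Emp attributes
pushdown_lhs = select(over_30, join(on_dept, emp, dept))
pushdown_rhs = join(on_dept, select(over_30, emp), dept)

# Rule 3: project early, keeping the output attributes plus the join attributes
wanted = {"name", "dname"}
projection_lhs = project(wanted, join(on_dept, emp, dept))
projection_rhs = project(
    wanted,
    join(on_dept, project({"name", "dept_id"}, emp), project({"dname", "id"}, dept)),
)
```

In rule 3 the inner projections must keep `dept_id` and `id`; dropping the join attributes early would make the join predicate unevaluable.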

## Decision criteria
| Situation | Operator |
|---|---|
| Filter rows | σ |
| Pick columns | π |
| Combine relations on key | ⋈ |
| Union-compatible merge | ∪ |
| All-quantifier | ÷ |
| Group + aggregate | γ |
| Preserve unmatched | ⟕/⟖/⟗ |

**Default**: σ/π/⋈ cover the bulk of everyday queries (roughly 95% as a rule of thumb).

## 🔗 Graph
- Parent: [[SQL]]
- Adjacent: [[Normalization]] · [[ACID]]

## 🤖 LLM usage
**When to use**: explaining SQL → RA tree translation, suggesting query rewrites, step-by-step derivations for learning.
**When not to use**: production query plans; use `EXPLAIN ANALYZE` instead.

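## ❌ Anti-patterns
- **Careless Cartesian product**: a missing JOIN condition yields N×M rows.
- **σ above ⋈**: in cases where the optimizer cannot push the predicate down, rewrite the query manually.
- **`SELECT *` in a subquery**: gets in the way of π pushdown.
- **Confusing bags with sets**: SQL works on bags (multisets). `UNION ALL` ≠ ∪; plain `UNION` removes duplicates and matches ∪.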
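
Two of these pitfalls are easy to reproduce. The sketch below uses Python's built-in `sqlite3` as an in-memory stand-in (the behaviour shown is standard SQL, but the tables `A` and `B` and their contents are made up for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER);
    INSERT INTO A VALUES (1), (2), (2);   -- A is a bag: 2 appears twice
    INSERT INTO B VALUES (2), (3);
""")

def ids(sql):
    """Run a single-column query and return its values sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# Bag union keeps every copy; set union collapses duplicates like ∪.
bag_union = ids("SELECT id FROM A UNION ALL SELECT id FROM B")
set_union = ids("SELECT id FROM A UNION SELECT id FROM B")

# Forgetting the join condition silently produces the full product A × B.
cross_rows = conn.execute("SELECT COUNT(*) FROM A, B").fetchone()[0]
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM A JOIN B ON A.id = B.id"
).fetchone()[0]
```

On this data `UNION ALL` returns five values and `UNION` three, and the accidental product has 3 × 2 = 6 rows against 2 for the intended join.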

## 🧪 Verification / Duplicates
- Verified (Codd 1970; Garcia-Molina *Database Systems* ch.2.4; Postgres planner docs).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — full content (operators + 7 patterns) |