---
id: wiki-2026-0508-relational-algebra-in-databases
title: Relational Algebra in Databases
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Relational Algebra, RA, SQL Algebra]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [database, theory, sql, query-optimization]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: sql
  framework: postgres
---

# Relational Algebra in Databases

## One-liner
> **"SQL is syntactic sugar for the algebra."** Codd's (1970) relational algebra is a closed system of set-based operators (σ, π, ρ, ∪, −, ×, with ⋈ derived from them): every operator takes relations as input and returns a relation. The plan trees of modern query optimizers (Postgres, DuckDB, Snowflake) are, in effect, RA expressions.

## Core

### The 6 primitive operators
- **σ (Selection)**: row filter. `σ_{age>30}(R)` ≡ `WHERE age>30`.
- **π (Projection)**: column subset. `π_{name,age}(R)` ≡ `SELECT name, age`.
- **ρ (Rename)**: alias for a relation or attribute.
- **∪ (Union)** and **− (Difference)**: set operations on union-compatible relations.
- **× (Cartesian product)**: `R × S`. Expensive, since it produces |R|·|S| tuples.

### Derived and extended operators
- **⋈ (Join)**: theta/equi/natural. `R ⋈_{R.id=S.rid} S` ≡ `σ_{R.id=S.rid}(R × S)`.
- **∩ (Intersection)**: `R ∩ S` ≡ `R − (R − S)`, again on union-compatible relations.
- **Division** (÷): the "for all" quantifier. `R ÷ S` = the tuples in R that are related to every tuple of S.
- **Outer joins** (⟕, ⟖, ⟗): null-padded joins. An extension, because the result needs NULLs.
- **Aggregation** (γ): `_{dept}γ_{avg(salary)}(Emp)`. An extension that the primitives cannot express.

### Applications
1. Query optimizers rewrite the RA tree (predicate pushdown, join reordering).
2. View materialization relies on algebraic equivalence.
3. Incremental engines in Datalog / differential dataflow.

## 💻 Patterns

### Selection pushdown
```sql
-- Logical:  π_{name}(σ_{age>30}(Emp ⋈ Dept))
-- Physical: σ pushed below ⋈, so the join sees a smaller intermediate result
SELECT e.name
FROM Emp e
JOIN Dept d ON e.dept_id = d.id
WHERE e.age > 30;
-- The optimizer pushes σ_{age>30} down onto Emp before the join.
```

### Projection pushdown
```sql
-- π_{name,salary}(Emp ⋈ Dept): no Dept column appears in the output
EXPLAIN (VERBOSE, FORMAT TEXT)
SELECT e.name, e.salary
FROM Emp e
JOIN Dept d ON e.dept_id = d.id;
-- Postgres typically carries only e.name, e.salary, e.dept_id (and d.id for the join)
-- through the plan; VERBOSE prints each node's Output list so this can be checked.
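
-- Sketch (an illustration, not the planner's actual output): the same pushdown
-- written by hand, assuming only the Emp(name, salary, dept_id) and Dept(id)
-- columns used in the query above.
-- π_{name,salary}(π_{name,salary,dept_id}(Emp) ⋈ π_{id}(Dept))
SELECT e.name, e.salary
FROM (SELECT name, salary, dept_id FROM Emp) AS e
JOIN (SELECT id FROM Dept) AS d ON e.dept_id = d.id;
-- Both forms should return the same rows; comparing their EXPLAIN output shows
-- whether the planner already performs this rewrite.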
```

### Join reordering (⋈ is associative and commutative)
```sql
-- (A ⋈ B) ⋈ C ≡ A ⋈ (B ⋈ C) for inner joins, but the costs differ
-- join_collapse_limit: lets the planner reorder explicit JOINs in lists of up to 12 items (default 8)
SET join_collapse_limit = 12;
EXPLAIN ANALYZE
SELECT *
FROM small s
JOIN big b  ON s.k = b.k
JOIN huge h ON b.k = h.k;
-- The planner is expected to choose small as the hash build side.
```

### Division via NOT EXISTS
```sql
-- "students who took every required course"
-- Took ÷ Required
-- Double negation: no required course exists that the student has not taken.
SELECT s.id
FROM Students s
WHERE NOT EXISTS (
  SELECT 1
  FROM Required r
  WHERE NOT EXISTS (
    SELECT 1
    FROM Took t
    WHERE t.student_id = s.id
      AND t.course_id = r.course_id
  )
);
```

### Aggregation (γ)
```sql
-- _{dept_id}γ_{count(*),avg(salary)}(Emp)
SELECT dept_id, COUNT(*), AVG(salary)
FROM Emp
GROUP BY dept_id;
```

### Set operations
```sql
-- A − B (set difference)
SELECT id FROM ActiveUsers
EXCEPT
SELECT id FROM BannedUsers;

-- A ∩ B
SELECT id FROM Premium
INTERSECT
SELECT id FROM Annual;
```

### Equivalence rewriting
```sql
-- σ_{p∧q}(R) ≡ σ_p(σ_q(R))      a conjunction can be split into cascaded selections
-- σ_p(R ⋈ S) ≡ σ_p(R) ⋈ S       if p references only attributes of R
-- π_L(R ⋈ S) ≡ π_L(π_{L∪join}(R) ⋈ π_{L∪join}(S))
--   each inner projection keeps only the attributes of L and of the join condition
--   that belong to that relation
```

## Decision criteria

| Situation | Operator |
|---|---|
| Filter rows | σ |
| Pick columns | π |
| Combine relations on key | ⋈ |
| Union-compatible merge | ∪ |
| All-quantifier | ÷ |
| Group + aggregate | γ |
| Preserve unmatched | ⟕/⟖/⟗ |

**Default**: as a rule of thumb, σ, π, and ⋈ cover about 95% of queries.

## 🔗 Graph
- Parent: [[SQL]]
- Adjacent: [[Normalization]] · [[ACID]]

## 🤖 LLM usage
**When**: explaining a SQL → RA tree conversion, suggesting query rewrites, derivations for learning.

**When not**: production query plans. Use `EXPLAIN ANALYZE` for those.

## ❌ Anti-patterns
- **Careless Cartesian product**: a missing JOIN condition yields N×M rows.
- **σ above ⋈**: where the optimizer cannot push the filter down, rewrite the query manually.
- **`SELECT *` in a subquery**: hinders π pushdown.
- **Confusing bags and sets**: SQL operates on bags (multisets). `UNION ALL` ≠ ∪; plain `UNION` removes duplicates, as ∪ does.

## 🧪 Verification / Duplicates
- Verified (Codd 1970; Garcia-Molina *Database Systems* ch.2.4; Postgres planner docs).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full content (operators + 7 patterns) |