id: wiki-2026-0508-directed-acyclic-graph-dependenc
title: Directed Acyclic Graph Dependency Management
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: DAG, Build Graph, Task Dependency, Topological Sort
duplicate_of: none
source_trust_level: A
confidence_score: 0.94
verification_status: applied
tags: DAG, dependency, build-system, scheduler, topological-sort
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: python; framework: networkx/airflow

Directed Acyclic Graph Dependency Management

In one line

"A DAG = nodes (tasks) + directed edges (must-run-before) + no cycles." From Make's build graph (1976) through 2026-era Airflow/Dagster pipelines, Bazel/Turborepo monorepos, Spark physical plans, Git commit history, and the React fiber tree, the DAG is the universal data structure for dependency resolution.

Core concepts

Core operations

  • Topological Sort: produces a valid execution order. Kahn's algorithm or DFS, both O(V+E).
  • Cycle Detection: the validity check for a DAG.
  • Transitive Reduction: the minimal edge set with the same reachability.
  • Critical Path: the longest path, which is a lower bound on the makespan.
  • Incremental Recompute: re-run only the dirty subgraph.
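Transitive reduction is the one operation in this list without a pattern further down, so here is a minimal pure-Python sketch. It assumes the input is already a DAG and uses the simple rule that an edge (u, v) is redundant when v remains reachable from u without that edge. The function name and the (nodes, edges) input shape are illustrative choices, and the approach costs roughly O(E·(V+E)), so it is meant for small graphs.

```python
def transitive_reduction(nodes, edges):
    """Return the minimal edge list with the same reachability (input must be a DAG)."""
    succ = {n: set() for n in nodes}
    for u, v in edges:
        succ[u].add(v)

    def reachable(start, skip_edge):
        # Iterative DFS from `start` that ignores the single edge `skip_edge`.
        seen, stack = set(), [start]
        while stack:
            u = stack.pop()
            for v in succ[u]:
                if (u, v) != skip_edge and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    # Keep an edge only if its target cannot be reached some other way.
    return sorted(
        (u, v)
        for u in nodes
        for v in succ[u]
        if v not in reachable(u, (u, v))
    )


reduced = transitive_reduction(
    ["a", "b", "c"],
    [("a", "b"), ("b", "c"), ("a", "c")],  # a->c is implied by a->b->c
)
```

Here the shortcut edge a->c is dropped because a->b->c already implies it.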

Applications

  1. Build systems: Make, Bazel, Buck, Turborepo, Nx.
  2. Workflow orchestration: Airflow, Dagster, Prefect, Argo Workflows.
  3. ML training pipelines: Kubeflow, MLflow, ZenML.
  4. Spreadsheet recalc: Excel, Google Sheets formula engine.
  5. VCS: Git commit DAG, Mercurial.
  6. React/Solid reactivity: the signal dependency graph.

Scheduling strategies

  • List scheduling: assign ready tasks to free workers greedily.
  • HEFT: Heterogeneous Earliest Finish Time, for workers of differing speed (e.g. cloud instances).
  • Critical Path Method (CPM): prioritize tasks by longest remaining path.
  • Work-stealing: dynamic load balancing (Tokio, Rayon).
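As an illustration of the first strategy, the sketch below simulates greedy list scheduling on a fixed number of identical workers. The function name, the `durations`/`deps` input shapes, and the first-come ordering of the ready list are assumptions made for this example; a real scheduler would typically order the ready list by a priority such as critical-path length.

```python
import heapq

def list_schedule(durations, deps, workers):
    """Greedy list scheduling. Returns (makespan, {task: start_time}).

    durations: {task: duration}; deps: {task: [prereqs]}; workers: count >= 1.
    """
    remaining = {t: len(deps.get(t, [])) for t in durations}
    dependents = {t: [] for t in durations}
    for task, prereqs in deps.items():
        for p in prereqs:
            dependents[p].append(task)

    ready = sorted(t for t, n in remaining.items() if n == 0)
    running = []  # min-heap of (finish_time, task)
    start, now, idle = {}, 0, workers
    while ready or running:
        # Hand ready tasks to idle workers.
        while ready and idle > 0:
            task = ready.pop(0)
            start[task] = now
            heapq.heappush(running, (now + durations[task], task))
            idle -= 1
        # Advance the clock to the next completion and release its dependents.
        now, finished = heapq.heappop(running)
        idle += 1
        for d in dependents[finished]:
            remaining[d] -= 1
            if remaining[d] == 0:
                ready.append(d)
    if len(start) != len(durations):
        raise ValueError("cycle detected: some tasks never became ready")
    return now, start


durations = {"a": 2, "b": 3, "c": 1, "d": 2}
deps = {"c": ["a"], "d": ["b", "c"]}
makespan_two, starts = list_schedule(durations, deps, workers=2)
makespan_one, _ = list_schedule(durations, deps, workers=1)
```

With two workers the makespan equals the critical-path length; with one worker it degrades to the sum of all durations.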

💻 Patterns

Topological Sort (Kahn's Algorithm)

from collections import deque, defaultdict

def topo_sort(nodes, edges):
    """Kahn's algorithm. edges: iterable of (u, v), meaning u must run before v."""
    indegree = defaultdict(int)
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        indegree[v] += 1
    # Start from nodes with no unmet prerequisites.
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # Nodes on a cycle never reach indegree 0, so they are missing from order.
    if len(order) != len(nodes):
        raise ValueError("Cycle detected")
    return order

Parallel DAG Executor (asyncio)

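import asyncio

async def run_dag(tasks, deps):
    """tasks: {name: async_fn}, deps: {name: [prereqs]}. deps must be acyclic, or this deadlocks."""
    running = {}  # name -> asyncio.Task

    async def run(name):
        # Wait for all prerequisites; several dependents may await the same Task.
        await asyncio.gather(*(running[d] for d in deps.get(name, [])))
        return await tasks[name]()

    # create_task only schedules the coroutine; bodies start at the next await,
    # by which point `running` holds every task.
    for name in tasks:
        running[name] = asyncio.create_task(run(name))
    return await asyncio.gather(*running.values())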

Incremental Build (Content-Hash)

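def needs_rebuild(node, hashes, prev_hashes):
    # hash_inputs is a helper not shown here; it must combine the node's own
    # sources with its dependencies' hashes. Call nodes in topological order
    # so that hashes[d] is already set for every dependency d.
    own_hash = hash_inputs(node.sources, [hashes[d] for d in node.deps])
    hashes[node.name] = own_hash
    return prev_hashes.get(node.name) != own_hash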
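The hash check above decides node by node. A structural view of the same idea is to compute the dirty subgraph up front: the changed nodes plus everything that transitively depends on them. The sketch below assumes the `{node: [prereqs]}` mapping used by the executor pattern; the function name is illustrative.

```python
from collections import deque

def dirty_subgraph(deps, changed):
    """Return `changed` plus every node that transitively depends on it.

    deps: {node: [prereqs]}; changed: iterable of nodes whose inputs changed.
    """
    # Invert the mapping: prerequisite -> nodes that depend on it.
    dependents = {}
    for node, prereqs in deps.items():
        for p in prereqs:
            dependents.setdefault(p, []).append(node)

    # Breadth-first walk downstream from the changed nodes.
    dirty, queue = set(changed), deque(changed)
    while queue:
        u = queue.popleft()
        for v in dependents.get(u, []):
            if v not in dirty:
                dirty.add(v)
                queue.append(v)
    return dirty


deps = {"lib": [], "app": ["lib"], "test": ["app"], "docs": []}
after_lib_change = dirty_subgraph(deps, ["lib"])
after_docs_change = dirty_subgraph(deps, ["docs"])
```

Only nodes in the returned set need to be considered for re-execution; everything else can be served from the previous run.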

Critical Path

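def critical_path(graph, durations):
    """graph is assumed to be a networkx.DiGraph (nodes, edges, successors)."""
    order = topo_sort(graph.nodes, graph.edges)
    # earliest[n] is the earliest possible finish time of n.
    earliest = {n: durations[n] for n in graph.nodes}
    for u in order:
        for v in graph.successors(u):
            earliest[v] = max(earliest[v], earliest[u] + durations[v])
    # The largest finish time is the critical-path length (makespan lower bound).
    return max(earliest.values()), earliest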

Cycle Detection (DFS)

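WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished

def has_cycle(graph):
    """graph: {node: [successors]}. Nodes that appear only as successors are allowed."""
    color = {}

    def dfs(u):
        color[u] = GRAY
        for v in graph.get(u, ()):
            state = color.get(v, WHITE)
            if state == GRAY:  # back edge to a node on the current path
                return True
            if state == WHITE and dfs(v):
                return True
        color[u] = BLACK
        return False

    # Recursive, so very deep graphs may hit Python's recursion limit.
    return any(dfs(n) for n in graph if color.get(n, WHITE) == WHITE)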

Airflow DAG (Practical)

from airflow.decorators import dag, task
from datetime import datetime

# fetch, clean, and warehouse are placeholders for project-specific code.
@dag(start_date=datetime(2026, 1, 1), schedule="@daily", catchup=False)
def etl():
    @task
    def extract():
        return fetch()

    @task
    def transform(data):
        return clean(data)

    @task
    def load(rows):
        warehouse.write(rows)

    # Passing return values wires the edges extract -> transform -> load.
    load(transform(extract()))

etl()

Decision criteria

| Situation | Approach |
| --- | --- |
| Deterministic, hermetic builds | Bazel / Buck (content-hash) |
| Scheduled data pipelines | Airflow / Dagster |
| JS/TS monorepo | Turborepo / Nx |
| ML experiment tracking | Kubeflow / MLflow / ZenML |
| In-process reactive UI | Signals (Solid/Vue/Svelte) |
| Real-time stream graph | Flink / Spark Structured Streaming |

Default: prefer an explicit (declarative) DAG over implicit ordering, because it enables visualization, auditing, and parallel scheduling.

🔗 Graph

🤖 LLM usage

When to use: expressing the dependencies of a multi-step agent plan; orchestrating a RAG indexing pipeline. When not to use: problems where a cyclic feedback loop is intrinsic (RL, gradient descent); handle those outside the DAG, as unrolled iterations.
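For the agent-plan case, a minimal sketch using the standard library's `graphlib.TopologicalSorter` (Python 3.9+) groups plan steps into batches whose members can run concurrently. The step names and the `plan_batches` helper are hypothetical and only illustrate the shape of such a plan.

```python
from graphlib import TopologicalSorter

def plan_batches(plan):
    """plan: {step: [prerequisite steps]} -> list of batches that can run concurrently."""
    sorter = TopologicalSorter(plan)
    sorter.prepare()  # raises graphlib.CycleError if the plan contains a cycle
    batches = []
    while sorter.is_active():
        # get_ready() returns the steps whose prerequisites are all marked done.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical agent plan; the step names are illustrative only.
plan = {
    "search_docs": [],
    "search_code": [],
    "summarize": ["search_docs", "search_code"],
    "draft_answer": ["summarize"],
}
batches = plan_batches(plan)
```

Each inner list is a set of independent steps, so an agent runtime could dispatch a whole batch in parallel before moving to the next.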

Anti-patterns

  • Hidden side effects: a task mutates shared state directly, which breaks incremental builds.
  • Ignoring transitive dependencies: a missing edge leads to a race condition.
  • Single-task megasinks: a fan-in bottleneck; break the task into shards.
  • Cycle by feature flag: conditional dependencies can create implicit cycles.
  • Over-fine granularity: with nano-tasks, scheduler overhead exceeds the useful work.
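One way to guard against the "cycle by feature flag" case is to validate the graph under every flag combination before deploying. The sketch below assumes conditional edges are described as `{flag: {task: [extra prereqs]}}`, which is an illustrative format rather than any tool's real configuration. It enumerates 2^n combinations, so it only suits a small number of flags.

```python
from graphlib import TopologicalSorter, CycleError
from itertools import product

def cyclic_flag_configs(base_deps, flag_deps):
    """Return the sets of enabled flags under which the dependency graph is cyclic.

    base_deps: {task: [prereqs]} that always apply.
    flag_deps: {flag: {task: [prereqs]}} extra edges added when the flag is on.
    """
    flags = sorted(flag_deps)
    bad = []
    for values in product([False, True], repeat=len(flags)):
        # Build the effective graph for this combination of flags.
        deps = {task: set(prereqs) for task, prereqs in base_deps.items()}
        for flag, enabled in zip(flags, values):
            if enabled:
                for task, prereqs in flag_deps[flag].items():
                    deps.setdefault(task, set()).update(prereqs)
        try:
            TopologicalSorter(deps).prepare()
        except CycleError:
            bad.append({f for f, enabled in zip(flags, values) if enabled})
    return bad


base = {"b": ["a"]}                # b always depends on a
conditional = {
    "cache": {"a": ["b"]},         # with "cache" on, a also depends on b: a cycle
    "lint": {"c": ["a"]},          # harmless extra edge
}
bad_configs = cyclic_flag_configs(base, conditional)
```

Every configuration that enables the hypothetical `cache` flag is reported, which makes the implicit cycle visible before it reaches a scheduler.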

🧪 Verification / Duplicates

  • Verified (Kahn 1962; Bazel docs 2026; Airflow 3.x docs; CLRS Ch.22).
  • Trust level A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full content with topo sort, parallel exec, incremental build, Airflow |