---
id: wiki-2026-0508-self-driving-car-foundations
title: Self-Driving Car Foundations
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Autonomous Driving, AV Stack, Self-Driving]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [autonomous-driving, av, perception, planning, end-to-end]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: pytorch
---

# Self-Driving Car Foundations

## One-line summary
> **"Sense → predict → plan → act, with redundancy at every layer."** The modular stack (perception / prediction / planning / control) has been dominant historically; the trend as of 2026 is toward end-to-end neural stacks (Tesla FSD v12, Wayve GAIA-2). SAE L4 in production: Waymo, Cruise (suspended), Zoox.

## Core concepts

### Sensor stack
- **Camera**: dense semantic information from RGB images.
- **Lidar**: sparse depth measurements (Waymo, Zoox).
- **Radar**: velocity measurement; robust in bad weather.
- **GNSS+IMU**: ego pose and HD-map alignment.
- **Pure vision (Tesla)**: cameras plus neural networks only.
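
Combining these sensors starts with expressing their data in a common frame. As a rough illustration, the sketch below projects a single lidar point into camera pixel coordinates with a pinhole model. The function name, the nested-list calibration format, and the camera axis convention (x right, y down, z forward) are assumptions made for this example, not something the note specifies.

```python
def project_to_image(point_lidar, rotation, translation, fx, fy, cx, cy):
    """Project one 3D lidar point into pinhole-camera pixel coordinates.

    rotation (3x3 nested list) and translation (length-3 list) map a point
    from the lidar frame into the camera frame (x right, y down, z forward).
    Returns (u, v), or None if the point lies behind the image plane.
    """
    # Rigid transform: p_cam = R @ p_lidar + t
    x_c, y_c, z_c = (
        sum(rotation[i][j] * point_lidar[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z_c <= 0.0:
        return None  # behind the camera, not visible
    # Perspective division followed by the intrinsic scaling and offset
    u = fx * x_c / z_c + cx
    v = fy * y_c / z_c + cy
    return u, v


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_to_image([1.0, 0.5, 10.0], IDENTITY, [0.0, 0.0, 0.0],
                         fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
```

In a real stack the rotation and translation come from extrinsic calibration, and lens distortion would need to be handled before or after this step.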

### Modular pipeline
1. **Perception**: 3D detection (CenterPoint, BEVFormer), segmentation, lane detection.
2. **Prediction**: multi-agent trajectory forecasting (MTR, Wayformer).
3. **Planning**: sampling-based (lattice), optimization-based (MPC), or learned.
4. **Control**: PID, MPC, or LQR for lateral and longitudinal control.
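
Of the controllers named in the control stage, PID is the simplest to write down. The following is a minimal sketch of a longitudinal (speed) PID loop, assuming the output is an acceleration command clamped to ±3 m/s². The class layout, the gains, and the anti-windup rule are illustrative choices, not taken from any production stack.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    kp: float
    ki: float
    kd: float
    out_min: float = -3.0  # assumed acceleration limits in m/s^2
    out_max: float = 3.0
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, target, measured, dt):
        """Return a clamped control output for one time step of length dt."""
        error = target - measured
        # No derivative on the first call, when there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        candidate = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate + self.kd * derivative
        out = min(max(raw, self.out_min), self.out_max)
        # Simple anti-windup: keep the integral frozen while the output saturates.
        if raw == out:
            self.integral = candidate
        return out


# Toy closed loop: the "vehicle" is a pure integrator, v += a * dt.
controller = PID(kp=1.0, ki=0.0, kd=0.0)
speed = 0.0
for _ in range(200):
    accel = controller.step(target=15.0, measured=speed, dt=0.1)
    speed += accel * 0.1
```

A real longitudinal controller would also account for actuator delay, road grade, and comfort limits on jerk; lateral control typically uses a separate loop on cross-track and heading error.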

### End-to-end (E2E) trend
- **Tesla FSD v12**: a neural network that maps video to controls.
- **Wayve GAIA-2**: generative world model plus policy.
- **UniAD** (CVPR 2023 best paper): unified perception and planning transformer.
- Advantage: fewer brittle handoffs between modules. Disadvantage: reduced interpretability.

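### SAE levels
- L2 (ADAS): hands-on; the driver remains responsible at all times.
- L3 (Mercedes Drive Pilot): eyes-off under limited conditions; the driver must take over when requested.
- L4 (Waymo, Zoox): full autonomy within a defined operational design domain (ODD).
- L5: full autonomy anywhere; not yet achieved.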

### Applications
1. Robotaxi (Waymo One; Zoox, launch 2024).
2. Trucking (Aurora, Kodiak).
3. Consumer ADAS (Tesla, Mercedes).
4. Last-mile delivery (Nuro).

## 💻 Patterns

### BEV perception (BEVFormer-style)
```python
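import torch
import torch.nn as nn

# ResNet50, DeformableAttention and DetectionHead are placeholders that are
# not defined in this note; supply modules with matching interfaces.
class BEVPerception(nn.Module):
    def __init__(self, n_cams=6, bev_h=200, bev_w=200, embed_dim=256):
        super().__init__()
        self.n_cams = n_cams
        self.image_backbone = ResNet50()  # shared across all cameras
        # One learnable query per BEV grid cell.
        self.bev_queries = nn.Parameter(torch.randn(bev_h * bev_w, embed_dim))
        self.spatial_attn = DeformableAttention()
        self.det_head = DetectionHead(num_classes=10)

    def forward(self, multi_cam_imgs, cam_intrinsics, cam_extrinsics):
        # Per-camera image features.
        feats = [self.image_backbone(img) for img in multi_cam_imgs]
        # BEV queries attend to image features through the camera geometry.
        bev = self.spatial_attn(self.bev_queries, feats, cam_intrinsics, cam_extrinsics)
        boxes_3d = self.det_head(bev)
        return boxes_3d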
```

### Trajectory prediction (multimodal)
```python
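import torch.nn as nn

# AgentEncoder is a placeholder (not defined in this note) assumed to return
# a (batch, 256) context vector per target agent.
class MultiModalPredictor(nn.Module):
    def __init__(self, k_modes=6, horizon_steps=80):  # 80 steps = 8 s @ 10 Hz
        super().__init__()
        self.k = k_modes
        self.horizon_steps = horizon_steps
        self.encoder = AgentEncoder()
        # Each mode regresses one (x, y) waypoint per future step.
        self.mode_head = nn.Linear(256, k_modes * horizon_steps * 2)
        self.score_head = nn.Linear(256, k_modes)

    def forward(self, agent_history, map_features):
        ctx = self.encoder(agent_history, map_features)
        trajs = self.mode_head(ctx).view(-1, self.k, self.horizon_steps, 2)
        scores = self.score_head(ctx).softmax(-1)  # mode probabilities
        return trajs, scores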
```
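
Multimodal predictors are commonly scored with minADE: the average displacement error of whichever mode lands closest to the ground truth. The same "best mode" selection is often used for winner-takes-all training. A minimal sketch in plain Python follows; the function name and the list-of-tuples trajectory format are assumptions for illustration.

```python
import math


def min_ade(pred_modes, ground_truth):
    """Return (minADE, index of best mode).

    pred_modes: list of K trajectories, each a list of (x, y) waypoints.
    ground_truth: list of (x, y) waypoints with the same length as each mode.
    """
    def ade(traj):
        if len(traj) != len(ground_truth):
            raise ValueError("trajectory and ground truth lengths differ")
        # Mean Euclidean distance over all time steps.
        return sum(math.dist(p, g) for p, g in zip(traj, ground_truth)) / len(ground_truth)

    errors = [ade(mode) for mode in pred_modes]
    best = min(range(len(errors)), key=errors.__getitem__)
    return errors[best], best


gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
modes = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],   # parallel track, 1 m off
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.3)],   # drifts slightly at the end
]
error, best_mode = min_ade(modes, gt)
```

Here the second mode wins because it only deviates at the final waypoint. Note that minADE ignores the predicted mode probabilities, so it is usually reported together with a score-aware metric.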

### MPC planner
```python
import casadi as ca

def mpc_plan(x0, ref_path, horizon=20, dt=0.1, wheelbase=2.5):
    """Return the first [accel, steer] of the optimized control sequence.

    ref_path: array of shape (2, horizon + 1) holding the reference x/y per step.
    """
    opti = ca.Opti()
    x = opti.variable(4, horizon + 1)  # [x, y, theta, v]
    u = opti.variable(2, horizon)      # [accel, steer]

    opti.subject_to(x[:, 0] == x0)
    cost = 0
    for t in range(horizon):
        # Kinematic bicycle model, forward-Euler discretization
        nx = x[0, t] + x[3, t] * ca.cos(x[2, t]) * dt
        ny = x[1, t] + x[3, t] * ca.sin(x[2, t]) * dt
        nt = x[2, t] + x[3, t] / wheelbase * ca.tan(u[1, t]) * dt
        nv = x[3, t] + u[0, t] * dt
        opti.subject_to(x[:, t + 1] == ca.vertcat(nx, ny, nt, nv))
        # Track the predicted states; x[:, 0] is fixed, so start at t + 1.
        cost += ca.sumsqr(x[:2, t + 1] - ref_path[:, t + 1]) + 0.1 * ca.sumsqr(u[:, t])
    opti.minimize(cost)
    opti.subject_to(opti.bounded(-3, u[0, :], 3))      # accel limits, m/s^2
    opti.subject_to(opti.bounded(-0.5, u[1, :], 0.5))  # steering limits, rad
    opti.solver("ipopt")
    sol = opti.solve()
    return sol.value(u[:, 0])
```
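
The dynamics constraint inside the MPC loop is a kinematic bicycle model discretized with forward Euler. To make that model easier to inspect on its own, here is a small dependency-free rollout using the same state layout `[x, y, theta, v]`, control layout `[accel, steer]`, and the 2.5 m wheelbase from the planner. The function names `bicycle_step` and `rollout` are made up for this sketch.

```python
import math


def bicycle_step(state, control, dt=0.1, wheelbase=2.5):
    """Advance the kinematic bicycle model by one forward-Euler step."""
    x, y, theta, v = state
    accel, steer = control
    return (
        x + v * math.cos(theta) * dt,
        y + v * math.sin(theta) * dt,
        theta + v / wheelbase * math.tan(steer) * dt,
        v + accel * dt,
    )


def rollout(state, controls, dt=0.1, wheelbase=2.5):
    """Return the list of states visited when applying controls in order."""
    states = [state]
    for control in controls:
        states.append(bicycle_step(states[-1], control, dt, wheelbase))
    return states


# Drive straight at 10 m/s for 1 s, then compare with a gentle left turn.
straight = rollout((0.0, 0.0, 0.0, 10.0), [(0.0, 0.0)] * 10)
turning = rollout((0.0, 0.0, 0.0, 10.0), [(0.0, 0.1)] * 10)
```

Rolling out a candidate control sequence like this is also a cheap way to sanity-check a solver result before handing it to the controller.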

### End-to-end policy (UniAD-style)
```python
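import torch
import torch.nn as nn

class E2EDriving(nn.Module):
    # Schematic only: assumes the sub-modules are adapted to exchange feature
    # tokens rather than the boxes and (trajs, scores) returned by the
    # standalone sketches above.
    def __init__(self):
        super().__init__()
        self.bev_perception = BEVPerception()
        self.predictor = MultiModalPredictor()
        self.planner = nn.TransformerDecoder(...)  # configuration elided in the source

    def forward(self, sensor_data, ego_state, command):
        bev = self.bev_perception(sensor_data)   # BEV feature tokens
        trajs = self.predictor(bev)              # agent-future tokens
        # nn.TransformerDecoder takes `tgt` (the queries) and `memory`;
        # dim=1 concatenates along the token axis of a batch-first layout.
        plan = self.planner(tgt=ego_state, memory=torch.cat([bev, trajs], dim=1),
                            tgt_mask=command_to_mask(command))  # hypothetical helper
        return plan  # future ego trajectory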
```

### Sensor fusion (Kalman-like)
```python
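import numpy as np

# motion_model and build_motion_jacobian stand for the chosen process model
# f(x, dt) and its Jacobian; they are not defined in this note.
class EKF:
    def __init__(self, state, P, Q):
        self.state = np.asarray(state, dtype=float)  # state mean
        self.P = np.asarray(P, dtype=float)          # state covariance
        self.Q = np.asarray(Q, dtype=float)          # process noise covariance

    def predict(self, dt):
        # Linearize around the current state before propagating it.
        F = build_motion_jacobian(self.state, dt)
        self.state = motion_model(self.state, dt)
        self.P = F @ self.P @ F.T + self.Q

    def update(self, z, H, R):
        # Linear measurement model; each sensor supplies its own H and R.
        y = z - H @ self.state                 # innovation
        S = H @ self.P @ H.T + R               # innovation covariance
        K = self.P @ H.T @ np.linalg.inv(S)    # Kalman gain
        self.state = self.state + K @ y
        self.P = (np.eye(len(self.state)) - K @ H) @ self.P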
```
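
To see the predict/update cycle with concrete numbers, the scalar case is the easiest to follow. The sketch below is a one-dimensional linear Kalman filter (not an EKF) that fuses repeated noisy measurements of a single quantity. The static motion model, the function name, and the sample values are assumptions chosen for illustration.

```python
def kalman_1d(measurements, meas_var, process_var=0.0, x0=0.0, p0=1e3):
    """Scalar Kalman filter with a static motion model (F = 1, H = 1).

    Returns a list of (estimate, variance) pairs, one per measurement.
    A large p0 expresses that the initial guess x0 is barely trusted.
    """
    x, p = x0, p0
    history = []
    for z in measurements:
        # Predict: the state is assumed constant, only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement by their variances.
        k = p / (p + meas_var)      # Kalman gain
        x = x + k * (z - x)
        p = (1.0 - k) * p
        history.append((x, p))
    return history


# Four range readings of an object about 10 m away, sensor std dev 0.2 m.
history = kalman_1d([10.2, 9.8, 10.1, 9.9], meas_var=0.04)
final_estimate, final_variance = history[-1]
```

With no process noise the estimate converges toward the running mean of the measurements and the variance shrinks roughly as `meas_var / n`, which is the behaviour the matrix form above generalizes to multiple states and sensors.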

### Safety: redundant trajectory check
```python
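def safety_filter(planned_traj, predicted_agents, ttc_threshold=2.0):
    """Return the plan unchanged, or an emergency-brake trajectory if any
    predicted agent comes within the time-to-collision (TTC) threshold.

    distance, relative_velocity and emergency_brake_trajectory are helpers
    that are not defined in this note.
    """
    for agent_traj in predicted_agents:
        # Compare only the time steps that both trajectories cover.
        for t in range(min(len(planned_traj), len(agent_traj))):
            d = distance(planned_traj[t], agent_traj[t])
            # Closing speed in m/s; zero or negative means the gap is opening.
            v_rel = relative_velocity(planned_traj, agent_traj, t)
            # The 0.1 m/s floor avoids division by zero and keeps TTC finite.
            ttc = d / max(v_rel, 0.1)
            if ttc < ttc_threshold:
                return emergency_brake_trajectory()
    return planned_traj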
```
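
The filter above relies on `distance` and `relative_velocity` without defining them. One possible way to fill them in is sketched below, assuming trajectories are lists of `(x, y)` points sampled at a fixed `dt` and that "relative velocity" means closing speed (positive when the gap is shrinking). The `dt` parameter and the finite-difference scheme are assumptions of this example.

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def relative_velocity(ego_traj, agent_traj, t, dt=0.1):
    """Closing speed in m/s at step t, positive when the gap is shrinking.

    Uses a backward difference of the gap (forward difference at t = 0).
    """
    last = min(len(ego_traj), len(agent_traj)) - 1
    if last < 1:
        return 0.0  # a single sample carries no velocity information
    t0 = max(min(t, last) - 1, 0)
    t1 = t0 + 1
    gap_before = distance(ego_traj[t0], agent_traj[t0])
    gap_after = distance(ego_traj[t1], agent_traj[t1])
    return (gap_before - gap_after) / dt


# Ego approaches a stopped vehicle 20 m ahead at 10 m/s.
ego = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
stopped = [(20.0, 0.0)] * 3
closing = relative_velocity(ego, stopped, t=1)
ttc = distance(ego[1], stopped[1]) / max(closing, 0.1)
```

In this example the gap at step 1 is 19 m with a closing speed of about 10 m/s, so the TTC falls just under the 2.0 s default threshold. Point-to-point distance ignores vehicle extent; a production check would work with footprints or bounding boxes.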

## Decision criteria
| Situation | Approach |
|---|---|
| Robotaxi L4 | Modular + HD map (Waymo pattern) |
| Consumer ADAS | Vision-first E2E (Tesla FSD) |
| Trucking highway | Modular + lidar (Aurora) |
| Research SOTA | E2E transformer (UniAD, GAIA-2) |
| Safety case | Modular (interpretable) + ML perception |

**Default**: for production deployment, a modular stack with E2E modules; for R&D, full E2E plus a world model.

## 🔗 Graph
- Parents: [[Robotics]] · [[Computer Vision]] · [[Reinforcement Learning]]
- Variants: [[End-to-End Driving]]
- Applications: [[Waymo]] · [[Tesla FSD]]
- Adjacent: [[MPC]]

## 🤖 LLM usage
**When to use**: AV stack architecture decisions; perception, prediction, and planning module design.
**When not to use**: ROS-level driver or firmware code (a different domain); hardware certification (ASIL-D).

## ❌ Anti-patterns
- **No attentive safety driver during validation**: potentially fatal, as in the 2018 Uber crash, where the safety operator was not watching the road.
- **Single point of failure**: redundant sensors and compute are mandatory.
- **Relying on HD maps alone**: fragile in new locations. Tesla bets against HD maps; Waymo bets on them.
- **End-to-end without a monitor**: unverifiable. Always include a shadow rule-based safety filter.

## 🧪 Verification / duplicates
- Verified (Waymo Safety Report 2023, Tesla AI Day 2022/2023, UniAD CVPR 2023).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — modular vs E2E, BEVFormer, MPC, safety filter |