---
id: wiki-2026-0508-self-driving-car-foundations
title: Self-Driving Car Foundations
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Autonomous Driving, AV Stack, Self-Driving]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [autonomous-driving, av, perception, planning, end-to-end]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: pytorch
---

# Self-Driving Car Foundations

## One-line summary

> **"Sense → predict → plan → act, with redundancy at every layer."** The modular stack (perception / prediction / planning / control) has been dominant historically; the 2026 trend is toward end-to-end neural approaches (Tesla FSD v12, Wayve GAIA-2). SAE L4 in production: Waymo, Cruise (suspended), Zoox.

## Core concepts

### Sensor stack

- **Camera**: dense semantic information; RGB perception.
- **Lidar**: sparse depth (Waymo, Zoox).
- **Radar**: velocity measurement; robust to weather.
- **GNSS+IMU**: ego pose; alignment to the HD map.
- Tesla pure vision: cameras plus neural networks only.

### Modular pipeline

1. **Perception**: 3D detection (CenterPoint, BEVFormer), segmentation, lane detection.
2. **Prediction**: multi-agent trajectory forecasting (MTR, Wayformer).
3. **Planning**: sampling-based (lattice), optimization-based (MPC), or learned.
4. **Control**: PID, MPC, LQR (lateral and longitudinal).

### End-to-end (E2E) trend

- **Tesla FSD v12**: a neural network mapping video to controls.
- **Wayve GAIA-2**: generative world model plus policy.
- **UniAD** (CVPR 2023 best paper): unified perception and planning transformer.
- Advantage: fewer brittle hand-offs between modules. Disadvantage: reduced interpretability.

### SAE levels

- L2 (ADAS): hands-on; the driver remains responsible.
- L3 (Mercedes Drive Pilot): eyes-off within approved conditions; the driver must take over when requested.
- L4 (Waymo, Zoox): full autonomy within a defined operational design domain (ODD).
- L5: autonomy anywhere; not yet achieved.

### Applications

1. Robotaxi (Waymo One, Zoox launch 2024).
2. Trucking (Aurora, Kodiak).
3. Consumer ADAS (Tesla, Mercedes).
4. Last-mile delivery (Nuro).
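
Returning to the Control stage of the modular pipeline above, which lists PID among its options: the following is a minimal sketch of a longitudinal PID speed controller, not a production implementation. The class name `PIDSpeedController`, the gains, the acceleration limit, and the toy point-mass plant in `simulate` are all assumptions made for illustration; none of them come from a real vehicle stack.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PIDSpeedController:
    """Maps speed error (m/s) to a clamped acceleration command (m/s^2)."""
    kp: float = 0.8           # hypothetical gains, not tuned for any real vehicle
    ki: float = 0.1
    kd: float = 0.05
    accel_limit: float = 3.0  # assumed symmetric actuator limit
    _integral: float = 0.0
    _prev_error: Optional[float] = None

    def step(self, target_speed: float, current_speed: float, dt: float) -> float:
        error = target_speed - current_speed
        # No derivative term on the first call (no previous error yet)
        derivative = 0.0 if self._prev_error is None else (error - self._prev_error) / dt
        candidate_integral = self._integral + error * dt
        raw = self.kp * error + self.ki * candidate_integral + self.kd * derivative
        accel = max(-self.accel_limit, min(self.accel_limit, raw))
        # Simple anti-windup: accumulate the integral only when not saturated
        if accel == raw:
            self._integral = candidate_integral
        self._prev_error = error
        return accel


def simulate(target_speed: float = 10.0, steps: int = 300, dt: float = 0.1) -> float:
    """Toy plant: speed integrates the commanded acceleration directly."""
    ctrl = PIDSpeedController()
    speed = 0.0
    for _ in range(steps):
        speed += ctrl.step(target_speed, speed, dt) * dt
    return speed
```

Calling `simulate()` drives the toy plant from rest toward the 10 m/s target over 30 simulated seconds. In a real stack the plant would be replaced by the vehicle's powertrain and brake interface, and the gains would need tuning against that hardware.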
## 💻 Patterns

### BEV perception (BEVFormer-style)

```python
import torch
import torch.nn as nn

# ResNet50, DeformableAttention, and DetectionHead are placeholder modules
# that this note does not define.
class BEVPerception(nn.Module):
    def __init__(self, n_cams=6, bev_h=200, bev_w=200):
        super().__init__()
        self.image_backbone = ResNet50()  # shared across all cameras
        # One learnable 256-d query per BEV grid cell
        self.bev_queries = nn.Parameter(torch.randn(bev_h * bev_w, 256))
        self.spatial_attn = DeformableAttention()
        self.det_head = DetectionHead(num_classes=10)

    def forward(self, multi_cam_imgs, cam_intrinsics, cam_extrinsics):
        # Per-camera image features
        feats = [self.image_backbone(img) for img in multi_cam_imgs]
        # BEV queries attend to image features using the camera geometry
        bev = self.spatial_attn(self.bev_queries, feats, cam_intrinsics, cam_extrinsics)
        boxes_3d = self.det_head(bev)
        return boxes_3d
```

### Trajectory prediction (multimodal)

```python
# AgentEncoder is a placeholder; it is assumed to return a 256-d context vector per agent.
class MultiModalPredictor(nn.Module):
    def __init__(self, k_modes=6, n_steps=80):  # 80 steps = 8 s @ 10 Hz
        super().__init__()
        self.k = k_modes
        self.n_steps = n_steps
        self.encoder = AgentEncoder()
        # One (x, y) pair per time step, per mode
        self.mode_head = nn.Linear(256, k_modes * n_steps * 2)
        self.score_head = nn.Linear(256, k_modes)

    def forward(self, agent_history, map_features):
        ctx = self.encoder(agent_history, map_features)
        trajs = self.mode_head(ctx).view(-1, self.k, self.n_steps, 2)
        scores = self.score_head(ctx).softmax(-1)  # mode probabilities
        return trajs, scores
```

### MPC planner

```python
import casadi as ca

WHEELBASE = 2.5  # metres

def mpc_plan(x0, ref_path, horizon=20, dt=0.1):
    """ref_path: 2 x horizon array of reference (x, y) positions."""
    opti = ca.Opti()
    x = opti.variable(4, horizon + 1)  # [x, y, theta, v]
    u = opti.variable(2, horizon)      # [accel, steer]
    opti.subject_to(x[:, 0] == x0)
    cost = 0
    for t in range(horizon):
        # Kinematic bicycle model, forward-Euler discretisation
        nx = x[0, t] + x[3, t] * ca.cos(x[2, t]) * dt
        ny = x[1, t] + x[3, t] * ca.sin(x[2, t]) * dt
        nt = x[2, t] + x[3, t] / WHEELBASE * ca.tan(u[1, t]) * dt
        nv = x[3, t] + u[0, t] * dt
        opti.subject_to(x[:, t + 1] == ca.vertcat(nx, ny, nt, nv))
        # Position tracking error plus control effort
        cost += ca.sumsqr(x[:2, t] - ref_path[:, t]) + 0.1 * ca.sumsqr(u[:, t])
    opti.minimize(cost)
    opti.subject_to(opti.bounded(-3, u[0, :], 3))      # accel limits (m/s^2)
    opti.subject_to(opti.bounded(-0.5, u[1, :], 0.5))  # steering limits (rad)
    opti.solver("ipopt")
    sol = opti.solve()
    return sol.value(u[:, 0])  # receding horizon: apply only the first control
```

### End-to-end policy (UniAD-style)

```python
# Schematic only: module interfaces are simplified relative to the classes above,
# the decoder arguments are elided, and command_to_mask is an undefined helper.
class E2EDriving(nn.Module):
    def __init__(self):
        super().__init__()
        self.bev_perception = BEVPerception()
        self.predictor = MultiModalPredictor()
        self.planner = nn.TransformerDecoder(...)

    def forward(self, sensor_data, ego_state, command):
        bev = self.bev_perception(sensor_data)
        trajs = self.predictor(bev)
        # nn.TransformerDecoder takes the query sequence as `tgt`
        plan = self.planner(tgt=ego_state,
                            memory=torch.cat([bev, trajs]),
                            tgt_mask=command_to_mask(command))
        return plan  # future ego trajectory
```

### Sensor fusion (Kalman-like)

```python
import numpy as np

# motion_model and build_motion_jacobian are placeholders for the chosen motion model.
class EKF:
    def __init__(self, state, P, Q):
        self.state = state  # state mean
        self.P = P          # state covariance
        self.Q = Q          # process noise covariance

    def predict(self, dt):
        F = build_motion_jacobian(self.state, dt)
        self.state = motion_model(self.state, dt)
        self.P = F @ self.P @ F.T + self.Q

    def update(self, z, H, R):
        y = z - H @ self.state               # innovation
        S = H @ self.P @ H.T + R             # innovation covariance
        K = self.P @ H.T @ np.linalg.inv(S)  # Kalman gain
        self.state = self.state + K @ y
        self.P = (np.eye(len(self.state)) - K @ H) @ self.P
```

### Safety: redundant trajectory check

```python
# distance, relative_velocity, and emergency_brake_trajectory are placeholder helpers.
def safety_filter(planned_traj, predicted_agents, ttc_threshold=2.0):
    for agent_traj in predicted_agents:
        # Compare only over the time steps both trajectories cover
        for t in range(min(len(planned_traj), len(agent_traj))):
            d = distance(planned_traj[t], agent_traj[t])
            v_rel = relative_velocity(planned_traj, agent_traj, t)  # closing speed
            ttc = d / max(v_rel, 0.1)  # floor avoids division by zero and negative TTC
            if ttc < ttc_threshold:
                return emergency_brake_trajectory()
    return planned_traj
```

## Decision criteria

| Situation | Approach |
|---|---|
| Robotaxi L4 | Modular + HD map (Waymo pattern) |
| Consumer ADAS | Vision-first E2E (Tesla FSD) |
| Trucking highway | Modular + lidar (Aurora) |
| Research SOTA | E2E transformer (UniAD, GAIA-2) |
| Safety case | Modular (interpretable) + ML perception |

**Default**: for production deployment, a modular stack with E2E modules; for R&D, full E2E plus a world model.

## 🔗 Graph

- Parent: [[Robotics]] · [[Computer Vision]] · [[Reinforcement Learning]]
- Variants: [[End-to-End Driving]]
- Applications: [[Waymo]] · [[Tesla FSD]]
- Adjacent: [[MPC]]

## 🤖 LLM usage

**When**: AV stack architecture decisions; perception/prediction/planning module design.

**When not**: ROS-level driver or firmware code (a different domain); hardware certification (ASIL-D).
## ❌ Anti-patterns

- **No attentive safety driver during validation**: can be fatal, as in the Uber 2018 crash, where the safety operator was present but distracted.
- **Single point of failure**: redundant sensors and compute are required.
- **HD map dependency only**: fragile in new locations; Tesla bets against HD maps, Waymo bets on them.
- **End-to-end without monitor**: unverifiable; always include a shadow rule-based safety filter.

## 🧪 Verification / duplicates

- Verified (Waymo Safety Report 2023, Tesla AI Day 2022/2023, UniAD CVPR 2023).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: modular vs E2E, BEVFormer, MPC, safety filter |