[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -2,88 +2,211 @@
 id: wiki-2026-0508-simulation-environments
 title: Simulation Environments
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [AI-SIM-ENV-001]
+aliases: [Sim Environments, Robotics Simulators, RL Environments]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 1.0
-tags: [ai, Reinforcement-Learning, simulation, digital-twin, Physics-engine, Unity, mujoco, sim-to-real]
+confidence_score: 0.9
+verification_status: applied
+tags: [simulation, robotics, rl, mujoco, isaac-sim]
 raw_sources: []
-last_reinforced: 2026-04-26
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
 tech_stack:
-  language: unspecified
-  framework: unspecified
+  language: python
+  framework: mujoco/isaac-sim/carla
 ---

-# Simulation Environments (시뮬레이션 환경)
+# Simulation Environments

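-## 📌 One-Line Insight (The Karpathy Summary)
-> "Create a 'safe universe' that rebuilds real-world physics in digital code, and accelerate the evolution of intelligence by repeating millions of trials at the speed of light": a virtual space for physical or logical interaction, designed so that AI agents can be trained and evaluated.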
+## One-Line Summary
+> **"Simulation environments are the backbone of the training-to-deployment loop in RL, robotics, and autonomous driving."** As of 2026 the dominant stack is MuJoCo (Google DeepMind), Isaac Sim 4.x (NVIDIA), CARLA (driving), and Habitat 3.x (embodied AI), accessed through the Gymnasium / PettingZoo APIs. Sim2real relies on domain randomization; real2sim relies on NeRF / 3DGS scene reconstruction.

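-## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** "Risk-free [[Iteration|Iteration]] and Parallel Experience Collection": testing even extreme situations without damaging real hardware or environments, and running many simulations at once to collect large amounts of training data in a short time.
- **Core components and tools:**
-    - **Physics Engines:** MuJoCo, PyBullet, PhysX, and others compute physical phenomena such as gravity and friction.
-    - **RL Frameworks:** OpenAI Gym (Gymnasium), Unity ML-Agents, and others provide standardized interfaces.
-    - **[[Digital_Twin|Digital Twin]]s:** virtualize a real factory or city as-is to make precise predictions.
- **Significance:** serves as the "verification center for intelligence" that AI must pass through before commercialization in fields where real-world risk is high, such as autonomous driving, robotics, and drone control.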
+## Key Points

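-## ⚠️ Contradictions & Updates
- **Conflict with earlier data:** Simulation was once criticized for the "Sim-to-Real Gap" (the virtual is only virtual). With "Domain Randomization", which deliberately injects noise into the virtual environment, and with refined system identification, knowledge learned in simulation can now be applied to the real world directly.
- **Policy change:** The Antigravity project made passing benchmark tests in simulation environments configured with diverse scenarios a mandatory quality gate before any new agent algorithm is deployed.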
+### Major simulators
+- **MuJoCo**: contact-rich rigid-body dynamics, fast, Apache-2.0-licensed (open-sourced in 2022). The default choice for manipulation.
+- **Isaac Sim 4.x (Omniverse)**: GPU-parallel, photorealistic, USD-based; built for robotics at scale.
+- **CARLA**: driving / autonomous vehicle, sensor stack.
+- **Habitat 3.x**: embodied AI, indoor nav, social.
+- **Genesis** (2025+): unified physics plus photorealistic rendering; still emerging.
+- **Gazebo / Webots**: classic robotics, ROS 2 ecosystem.

-## 🔗 Knowledge Links (Graph)
- [[Reinforcement-Learning|Reinforcement-Learning]], [[Robotics-Foundations|Robotics-Foundations]], [[Self-Driving-Car-Foundations|Self-Driving-Car-Foundations]], [[Reward-Shaping-in-RL|Reward-Shaping-in-RL]]
- **Raw Source:** 10_Wiki/Topics/AI/Simulation-Environments.md
+### API standards
+- **Gymnasium**: the standard single-agent RL API (successor to OpenAI Gym).
+- **PettingZoo**: the multi-agent counterpart.
+- **dm_env**: DeepMind's API.
+- **Isaac Lab**: a standard for GPU-vectorized environments.
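
The Gymnasium convention listed above comes down to two calls: `reset(seed=..., options=...)` returns `(obs, info)`, and `step(action)` returns `(obs, reward, terminated, truncated, info)`. The sketch below mimics that contract with a hypothetical stand-in class, `CountdownEnv`, written in plain Python so it runs without Gymnasium installed. It illustrates the interface only and is not Gymnasium code.

```python
import random


class CountdownEnv:
    """Hypothetical stand-in that follows the Gymnasium reset/step signatures."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps

    def reset(self, *, seed=None, options=None):
        self.rng = random.Random(seed)
        self.state = self.rng.randint(3, 6)  # distance to the goal
        self.t = 0
        return self.state, {}  # (obs, info)

    def step(self, action):
        self.state -= action  # action 1 moves toward the goal, 0 waits
        self.t += 1
        terminated = self.state <= 0  # task outcome: goal reached
        # Time limit only; kept separate from the task outcome.
        truncated = (not terminated) and self.t >= self.max_steps
        return self.state, -1.0, terminated, truncated, {}


def rollout(env, policy, seed=0):
    """Run one episode and return (episode_return, terminated, truncated)."""
    obs, info = env.reset(seed=seed)
    total, terminated, truncated = 0.0, False, False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
    return total, terminated, truncated
```

Keeping `terminated` and `truncated` separate matters for value bootstrapping: a time-limit cutoff is not a true terminal state, so learners typically bootstrap from the last observation in that case.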

-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+### Sim2real techniques
+- **Domain randomization**: randomize physics, textures, lighting, and camera parameters.
+- **System ID**: fit simulator parameters to real-world measurements.
+- **Real2sim**: reconstruct a scene with NeRF / 3DGS, then import it into the simulator.
+- **Adaptive curricula**: progress from easy to hard tasks.
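
System identification, as listed above, amounts to choosing simulator parameters so that simulated trajectories match measured ones. A minimal sketch, assuming a single unknown Coulomb friction coefficient and a block sliding toward rest. The "real" measurements here are synthetic (generated with a hidden coefficient plus noise), and `simulate_velocities` and `fit_friction` are illustrative names, not part of any simulator API.

```python
import random

G = 9.81  # gravitational acceleration, m/s^2


def simulate_velocities(mu, v0=2.0, dt=0.01, steps=60):
    """Toy simulator: block decelerating under Coulomb friction (explicit Euler)."""
    v, out = v0, []
    for _ in range(steps):
        v = max(0.0, v - mu * G * dt)
        out.append(v)
    return out


def fit_friction(measured, lo=0.0, hi=1.0, iters=60):
    """Fit mu by ternary search on the squared trajectory error."""

    def loss(mu):
        sim = simulate_velocities(mu, steps=len(measured))
        return sum((s - m) ** 2 for s, m in zip(sim, measured))

    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if loss(a) < loss(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


# Synthetic stand-in for real measurements: hidden mu = 0.30 plus sensor noise.
rng = random.Random(0)
measured = [v + rng.gauss(0.0, 0.005) for v in simulate_velocities(0.30)]
mu_hat = fit_friction(measured)
```

With a real simulator the same loop would replace `simulate_velocities` with a rollout of the physics model and would usually fit several parameters at once with a proper optimizer; the one-dimensional search is only meant to show the structure.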

-**When to use this knowledge:**
- *(TODO)*
+### Applications
+1. RL policy training (manipulation, locomotion).
+2. Synthetic data generation (perception).
+3. Driving stack regression (CARLA scenarios).
+4. Embodied agent (VLM + action) training.

-**When not to use it:**
- *(TODO)*
+## 💻 Patterns

-## 🧪 Validation Status
+### MuJoCo + Gymnasium
+```python
+import mujoco
+import gymnasium as gym
+import numpy as np

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 automatic normalization. Body needs verification.)*
+class ReachEnv(gym.Env):
+    def __init__(self):
+        self.model = mujoco.MjModel.from_xml_path("reach.xml")
+        self.data = mujoco.MjData(self.model)
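+        self.action_space = gym.spaces.Box(-1.0, 1.0, (self.model.nu,), dtype=np.float32)
+        obs_dim = self.model.nq + self.model.nv + 3  # qpos + qvel + target xyz
+        # float64 matches the dtype of the arrays that _obs() concatenates
+        self.observation_space = gym.spaces.Box(-np.inf, np.inf, (obs_dim,), dtype=np.float64)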

-## 🧬 Duplicate Check
+    def reset(self, *, seed=None, options=None):
+        super().reset(seed=seed)  # seeds self.np_random
+        mujoco.mj_resetData(self.model, self.data)
+        self.target = self.np_random.uniform(-0.3, 0.3, 3)
+        return self._obs(), {}

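- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (automatic normalization)
- **Reason:** Phase 1 normalization: updated the old template and filled in missing fields.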
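+    def step(self, action):
+        self.data.ctrl[:] = action
+        mujoco.mj_step(self.model, self.data)
+        ee = self.data.site("ee").xpos  # world position of the end-effector site
+        rew = -np.linalg.norm(ee - self.target)  # negative distance to the target
+        term = bool(rew > -0.02)  # success: closer than 0.02 model units
+        return self._obs(), float(rew), term, False, {}  # never truncates on its own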

-## 🕓 Changelog
-
-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
-
-## 💻 Code Patterns
-
-**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*
-
-```text
-# TODO
+    def _obs(self):
+        return np.concatenate([self.data.qpos, self.data.qvel, self.target])
 ```

-## 🤔 Decision Criteria
+### Isaac Lab (GPU-vectorized, 4096 envs)
+```python
+from isaaclab.envs import ManagerBasedRLEnv, ManagerBasedRLEnvCfg
+from isaaclab.scene import InteractiveSceneCfg
+from isaaclab.assets import ArticulationCfg
+from isaaclab_assets.robots.franka import FRANKA_PANDA_CFG

-**When to choose option A:**
- *(TODO)*
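+from isaaclab.utils import configclass
+
+# Skeleton only. Assumes the simulator app was already launched
+# (for example via isaaclab.app.AppLauncher) before these imports run.
+@configclass
+class FrankaSceneCfg(InteractiveSceneCfg):
+    # Assets live on the scene config; {ENV_REGEX_NS} expands per environment.
+    robot: ArticulationCfg = FRANKA_PANDA_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
+
+@configclass
+class FrankaCfg(ManagerBasedRLEnvCfg):
+    decimation = 2
+    episode_length_s = 5.0
+    scene = FrankaSceneCfg(num_envs=4096, env_spacing=2.5)
+    # A runnable task also needs observations, actions, rewards, and terminations configs.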

-**When to choose option B:**
- *(TODO)*
+env = ManagerBasedRLEnv(cfg=FrankaCfg())  # 4096 envs stepped in parallel on a single GPU
+```

-**Default:**
-> *(TODO)*
+### CARLA driving scenario
+```python
+import carla
+client = carla.Client("localhost", 2000)
+client.set_timeout(10.0)  # seconds
+world = client.load_world("Town05")

-## ❌ Anti-Patterns
+bp = world.get_blueprint_library().find("vehicle.tesla.model3")
+spawn = world.get_map().get_spawn_points()[0]
+ego = world.spawn_actor(bp, spawn)
+ego.set_autopilot(True)

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
+cam_bp = world.get_blueprint_library().find("sensor.camera.rgb")
+cam_bp.set_attribute("image_size_x", "1280")
+cam_bp.set_attribute("image_size_y", "720")
+cam = world.spawn_actor(cam_bp, carla.Transform(carla.Location(x=1.5, z=2.4)), attach_to=ego)
+cam.listen(lambda img: img.save_to_disk(f"out/{img.frame:08d}.png"))
+```
+
+### Domain randomization (sim2real)
+```python
+import numpy as np
+
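+def snapshot_nominal(model):
+    # Copy the nominal parameters once, before the first randomization.
+    return {k: getattr(model, k).copy() for k in ("body_mass", "geom_friction", "actuator_gainprm")}
+
+def randomize_episode(model, nominal, rng):
+    # Scale from the nominal values so perturbations do not compound across episodes.
+    # mass
+    model.body_mass[:] = nominal["body_mass"] * rng.uniform(0.8, 1.2, model.nbody)
+    # friction (one factor per geom, applied to all three coefficients)
+    model.geom_friction[:] = nominal["geom_friction"] * rng.uniform(0.7, 1.3, (model.ngeom, 1))
+    # gravity
+    model.opt.gravity[2] = -9.81 * rng.uniform(0.95, 1.05)
+    # actuator gain
+    model.actuator_gainprm[:, 0] = nominal["actuator_gainprm"][:, 0] * rng.uniform(0.9, 1.1, model.nu)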
+```
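
Adaptive curricula, from the technique list, can be layered on top of randomization like the above: start with narrow ranges and widen them as the policy's success rate improves. A minimal sketch under that assumption. `RangeCurriculum`, its thresholds, and its step size are hypothetical choices for illustration, not values taken from any library.

```python
from collections import deque


class RangeCurriculum:
    """Widen or narrow a symmetric randomization range based on recent success."""

    def __init__(self, start=0.05, limit=0.30, step=0.05, window=20,
                 promote_at=0.8, demote_at=0.4):
        self.width = start  # current half-width, e.g. 0.05 -> scale in [0.95, 1.05]
        self.start, self.limit, self.step = start, limit, step
        self.promote_at, self.demote_at = promote_at, demote_at
        self.results = deque(maxlen=window)

    def bounds(self):
        """Multiplicative scale bounds, suitable for rng.uniform(low, high)."""
        return 1.0 - self.width, 1.0 + self.width

    def record(self, success):
        """Log one episode outcome; adjust difficulty once the window is full."""
        self.results.append(bool(success))
        if len(self.results) < self.results.maxlen:
            return
        rate = sum(self.results) / len(self.results)
        if rate >= self.promote_at:
            self.width = min(self.limit, self.width + self.step)
            self.results.clear()  # re-evaluate at the new difficulty
        elif rate <= self.demote_at:
            self.width = max(self.start, self.width - self.step)
            self.results.clear()
```

In use, one would call `low, high = cur.bounds()` at each reset, pass those bounds to the per-episode randomizer in place of fixed constants, and call `cur.record(success)` when the episode ends.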
+
+### PPO training (SB3 + parallel envs)
+```python
+from stable_baselines3 import PPO
+from stable_baselines3.common.vec_env import SubprocVecEnv
+
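+# ReachEnv is the environment class from the MuJoCo + Gymnasium pattern.
+def make_env(seed):
+    def _init():
+        env = ReachEnv()
+        env.reset(seed=seed)  # give each worker its own seed
+        return env
+    return _init
+
+if __name__ == "__main__":  # guard needed when worker processes are spawned
+    vec = SubprocVecEnv([make_env(i) for i in range(16)])
+    # Each rollout collects n_steps * 16 envs = 32,768 transitions.
+    model = PPO("MlpPolicy", vec, n_steps=2048, batch_size=512, learning_rate=3e-4, verbose=1)
+    model.learn(total_timesteps=2_000_000)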
+```
+
+### Habitat 3 (embodied)
+```python
+import habitat
+from habitat.config.default import get_config
+
+cfg = get_config("benchmark/nav/objectnav/objectnav_hm3d.yaml")
+env = habitat.Env(config=cfg)
+obs = env.reset()
+for _ in range(100):
+    obs = env.step({"action": "move_forward"})
+    if env.episode_over:
+        break
+```
+
+### Real2sim (3DGS scene)
+```python
+# Phone capture → 3DGS reconstruction → MuJoCo / Isaac scene
+# nerfstudio + 3DGS export
+# ns-train splatfacto --data captures/kitchen
+# ns-export gaussian-splat --load-config out/config.yml --output-dir scene/
+# Convert mesh + texture to USD / GLB, then import into the simulator
+```
+
+## Decision Criteria
+| Situation | Simulator |
+|---|---|
+| Manipulation, fast iter | MuJoCo |
+| Massive parallel RL | Isaac Lab |
+| Photoreal sensor | Isaac Sim |
+| Driving | CARLA |
+| Embodied indoor | Habitat 3 |
+| ROS 2 ecosystem | Gazebo |
+
+**Default**: MuJoCo for prototyping, Isaac Lab for scale, sim2real with domain randomization (DR).
+
+## 🔗 Graph
+- Parent: [[Reinforcement-Learning]] · [[Robotics]]
+- Variants: [[MuJoCo]] · [[Isaac-Sim]] · [[CARLA]] · [[Habitat]]
+- Applications: [[Sim2Real]] · [[Synthetic-Data]] · [[Embodied-AI]]
+- Adjacent: [[Gymnasium]] · [[3D-Gaussian-Splatting]] · [[Domain-Randomization]]
+
+## 🤖 LLM Usage
+**When to use**: generating env configs from a scenario or task description, drafting reward functions, scaffolding scene XML.
+**When not to use**: physics tuning (system ID), real-robot deployment.
+
+## ❌ Anti-Patterns
+- **No domain randomization**: leaves the sim2real gap unaddressed.
+- **Tiny env count**: leaves Isaac Lab's GPU parallelism (for example, 4096 envs) unused.
+- **Hardcoded scene**: use USD or procedural generation instead.
+- **Simulation-only eval**: skips real-robot validation.
+
+## 🧪 Validation / Duplicates
+- Verified (DeepMind MuJoCo, NVIDIA Isaac, CARLA, FAIR Habitat).
+- Trust level A.
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup — full simulator stack with patterns |