---
id: wiki-2026-0508-simulation-environments
title: Simulation Environments
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Sim Environments, Robotics Simulators, RL Environments]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [simulation, robotics, rl, mujoco, isaac-sim]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: mujoco/isaac-sim/carla
---

# Simulation Environments

## One-Line Summary

> **"Simulation environments are the backbone of the training-to-deployment loop in RL, robotics, and autonomous driving."** As of 2026 the dominant stack is MuJoCo (Google DeepMind), Isaac Sim 4.x (NVIDIA), CARLA (driving), Habitat 3.x (embodied AI), and the Gymnasium / PettingZoo APIs. Sim2real relies on domain randomization; real2sim relies on NeRF / 3DGS scene reconstruction.

## Key Points

### Major simulators

- **MuJoCo**: rigid-body, contact-rich, fast; open source under Apache 2.0 since 2022. The default choice for manipulation.
- **Isaac Sim 4.x (Omniverse)**: GPU-parallel, photorealistic, USD-based; built for robotics at scale.
- **CARLA**: driving / autonomous vehicles, with a full sensor stack.
- **Habitat 3.x**: embodied AI, indoor navigation, social (human-robot) tasks.
- **Genesis** (2025+): unified physics plus photorealistic rendering; still emerging.
- **Gazebo / Webots**: classic robotics simulators in the ROS 2 ecosystem.

### API standards

- **Gymnasium**: the standard single-agent RL API (successor to Gym).
- **PettingZoo**: the multi-agent counterpart.
- **dm_env**: DeepMind's API.
- **Isaac Lab**: the standard for GPU-vectorized environments.

### Sim2real techniques

- **Domain randomization**: randomize physics, textures, lighting, and camera parameters.
- **System ID**: fit simulator parameters to measurements taken on the real system.
- **Real2sim**: reconstruct a scene with NeRF / 3DGS, then import it into the simulator.
- **Adaptive curricula**: progress from easy to hard tasks.

### Applications

1. RL policy training (manipulation, locomotion).
2. Synthetic data generation (perception).
3. Driving stack regression (CARLA scenarios).
4. Embodied agent (VLM + action) training.

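
The system ID item under sim2real techniques comes down to choosing simulator parameters so that simulated trajectories match measured ones. A minimal sketch of that idea, assuming a hypothetical 1-D sliding block whose only unknown is its Coulomb friction coefficient. `simulate_block`, `trajectory_error`, and `fit_friction` are illustrative names, not part of any simulator API, and the "measured" trajectory is a stand-in generated from the same toy model:

```python
def simulate_block(mu, v0=2.0, dt=0.01, steps=100, g=9.81):
    """Roll out a block sliding on a flat surface with friction coefficient mu."""
    x, v, positions = 0.0, v0, []
    for _ in range(steps):
        v = max(0.0, v - mu * g * dt)  # friction decelerates the block until it stops
        x += v * dt
        positions.append(x)
    return positions


def trajectory_error(mu, measured):
    """Sum of squared position errors between a sim rollout and the measurement."""
    simulated = simulate_block(mu, steps=len(measured))
    return sum((s - m) ** 2 for s, m in zip(simulated, measured))


def fit_friction(measured, candidates):
    """Grid search: return the candidate whose rollout best matches the measurement."""
    return min(candidates, key=lambda mu: trajectory_error(mu, measured))


# Stand-in for real measurements: a rollout with a coefficient hidden from the fit.
measured = simulate_block(0.35)
candidates = [i / 1000 for i in range(1001)]  # 0.000, 0.001, ..., 1.000
mu_hat = fit_friction(measured, candidates)
print(f"estimated friction: {mu_hat:.3f}")
```

With a real robot the measurement would come from logged sensor data, the rollout from the actual simulator, and the grid search would usually give way to a gradient-free or gradient-based optimizer over several parameters at once.
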
## 💻 Patterns

### MuJoCo + Gymnasium

```python
import gymnasium as gym
import mujoco
import numpy as np


class ReachEnv(gym.Env):
    def __init__(self):
        self.model = mujoco.MjModel.from_xml_path("reach.xml")
        self.data = mujoco.MjData(self.model)
        self.action_space = gym.spaces.Box(-1.0, 1.0, (self.model.nu,), dtype=np.float32)
        obs_dim = self.model.nq + self.model.nv + 3  # joint positions + velocities + target xyz
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, (obs_dim,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)  # seeds self.np_random
        mujoco.mj_resetData(self.model, self.data)
        mujoco.mj_forward(self.model, self.data)  # refresh derived quantities such as site positions
        self.target = self.np_random.uniform(-0.3, 0.3, 3)
        return self._obs(), {}

    def step(self, action):
        self.data.ctrl[:] = action
        mujoco.mj_step(self.model, self.data)
        ee = self.data.site("ee").xpos  # end-effector site named "ee" in reach.xml
        rew = -np.linalg.norm(ee - self.target)  # negative distance to the target
        term = bool(rew > -0.02)  # success once within 2 cm
        return self._obs(), float(rew), term, False, {}

    def _obs(self):
        obs = np.concatenate([self.data.qpos, self.data.qvel, self.target])
        return obs.astype(np.float32)  # match the declared observation dtype
```

### Isaac Lab (GPU-vectorized, 4096 envs)

```python
# Isaac Sim must already be running (e.g. via isaaclab.app.AppLauncher) before these imports.
from isaaclab.assets import ArticulationCfg
from isaaclab.envs import ManagerBasedRLEnv, ManagerBasedRLEnvCfg
from isaaclab.scene import InteractiveSceneCfg
from isaaclab.utils import configclass
from isaaclab_assets.robots.franka import FRANKA_PANDA_CFG


@configclass
class FrankaSceneCfg(InteractiveSceneCfg):
    # Assets live in the scene config; {ENV_REGEX_NS} expands to one prim per environment.
    robot: ArticulationCfg = FRANKA_PANDA_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")


@configclass
class FrankaCfg(ManagerBasedRLEnvCfg):
    decimation = 2
    episode_length_s = 5.0
    scene = FrankaSceneCfg(num_envs=4096, env_spacing=2.5)
    # Observation, action, reward, and termination manager configs are omitted in this sketch.


env = ManagerBasedRLEnv(cfg=FrankaCfg())  # 4096 envs in parallel on a single GPU
```

### CARLA driving scenario

```python
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.load_world("Town05")

# Spawn the ego vehicle at the first predefined spawn point.
bp = world.get_blueprint_library().find("vehicle.tesla.model3")
spawn = world.get_map().get_spawn_points()[0]
ego = world.spawn_actor(bp, spawn)
ego.set_autopilot(True)

# Attach a 1280x720 RGB camera to the ego vehicle.
cam_bp = world.get_blueprint_library().find("sensor.camera.rgb")
cam_bp.set_attribute("image_size_x", "1280")
cam_bp.set_attribute("image_size_y", "720")
cam = world.spawn_actor(
    cam_bp,
    carla.Transform(carla.Location(x=1.5, z=2.4)),
    attach_to=ego,
)
cam.listen(lambda img: img.save_to_disk(f"out/{img.frame:08d}.png"))
```

### Domain randomization (sim2real)

```python
import numpy as np


def snapshot_nominal(model):
    """Copy the nominal parameters once, right after loading the model."""
    return {
        "body_mass": model.body_mass.copy(),
        "geom_friction": model.geom_friction.copy(),
        "actuator_gainprm": model.actuator_gainprm.copy(),
    }


def randomize_episode(model, nominal, rng):
    # Scale from the nominal values so perturbations do not compound across episodes.
    # mass
    model.body_mass[:] = nominal["body_mass"] * rng.uniform(0.8, 1.2, model.nbody)
    # friction (one factor per geom, applied to all three coefficients)
    model.geom_friction[:] = nominal["geom_friction"] * rng.uniform(0.7, 1.3, (model.ngeom, 1))
    # gravity
    model.opt.gravity[2] = -9.81 * rng.uniform(0.95, 1.05)
    # actuator gain
    model.actuator_gainprm[:, 0] = nominal["actuator_gainprm"][:, 0] * rng.uniform(0.9, 1.1, model.nu)
```

### PPO training (SB3 + parallel envs)

```python
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import SubprocVecEnv

# ReachEnv is the environment from the MuJoCo + Gymnasium pattern above.


def make_env(seed):
    def _init():
        env = ReachEnv()
        env.reset(seed=seed)
        return env
    return _init


if __name__ == "__main__":  # required when subprocesses are started with "spawn"
    vec = SubprocVecEnv([make_env(i) for i in range(16)])
    model = PPO("MlpPolicy", vec, n_steps=2048, batch_size=512,
                learning_rate=3e-4, verbose=1)
    model.learn(total_timesteps=2_000_000)
```

### Habitat 3 (embodied)

```python
import habitat
from habitat.config.default import get_config

cfg = get_config("benchmark/nav/objectnav/objectnav_hm3d.yaml")
env = habitat.Env(config=cfg)
obs = env.reset()
for _ in range(100):
    obs = env.step({"action": "move_forward"})
    if env.episode_over:
        break
```

### Real2sim (3DGS scene)

```python
# phone capture → 3DGS reconstruction → MuJoCo / Isaac scene
# nerfstudio + 3DGS export:
#   ns-train splatfacto --data captures/kitchen
#   ns-export gaussian-splat --load-config out/config.yml --output-dir scene/
# Then convert mesh + texture to USD / GLB and import into the simulator.
```

## Decision Criteria

| Situation | Simulator |
|---|---|
| Manipulation, fast iteration | MuJoCo |
| Massively parallel RL | Isaac Lab |
| Photorealistic sensors | Isaac Sim |
| Driving | CARLA |
| Embodied indoor | Habitat 3 |
| ROS 2 ecosystem | Gazebo |

**Default**: MuJoCo for prototyping, Isaac Lab for scale, sim2real with domain randomization (DR).

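
The default above pairs sim2real with domain randomization, and the adaptive-curricula idea (easy to hard) can decide how wide that randomization is at any point in training. A minimal sketch under stated assumptions: `RandomizationCurriculum`, its thresholds, and the success signal are hypothetical and not taken from any library; the sampled multiplier would be applied to a nominal parameter such as mass or friction.

```python
import random
from collections import deque


class RandomizationCurriculum:
    """Widen a domain-randomization range as the recent success rate improves."""

    def __init__(self, max_scale=0.3, step=0.05, window=20, promote_at=0.8):
        self.scale = 0.0  # half-width of the range; 0.0 means nominal parameters only
        self.max_scale = max_scale
        self.step = step
        self.promote_at = promote_at
        self.results = deque(maxlen=window)

    def record(self, success):
        """Log one episode outcome; widen the range once a full window is good enough."""
        self.results.append(bool(success))
        window_full = len(self.results) == self.results.maxlen
        if window_full and sum(self.results) / len(self.results) >= self.promote_at:
            self.scale = min(self.max_scale, self.scale + self.step)
            self.results.clear()  # require a fresh window at the new difficulty

    def sample_multiplier(self, rng):
        """Multiplier for a nominal parameter, drawn from [1 - scale, 1 + scale]."""
        return rng.uniform(1.0 - self.scale, 1.0 + self.scale)


# Usage: report each episode's outcome, then sample the next episode's multiplier.
rng = random.Random(0)
curriculum = RandomizationCurriculum(window=5, step=0.1)
for _ in range(5):
    curriculum.record(success=True)
friction_multiplier = curriculum.sample_multiplier(rng)
```

Clearing the window after each promotion keeps a streak earned at an easier setting from triggering several promotions in a row.
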
## 🔗 Graph

- Parent: [[Reinforcement-Learning]] · [[Robotics]]
- Applications: [[Sim2Real]] · [[Synthetic-Data]] · [[Embodied-AI]]
- Adjacent: [[3D-Gaussian-Splatting]]

## 🤖 LLM Usage

**When to use**: generating an env config from a scenario or task description, drafting a reward function, scaffolding scene XML.

**When not to use**: physics tuning (system ID), real-robot deployment.

## ❌ Anti-patterns

- **No domain randomization**: the policy overfits to a single simulator configuration and the sim2real gap stays wide.
- **Tiny env count**: leaves Isaac Lab's capacity for thousands of parallel environments (e.g. 4096) unused.
- **Hardcoded scene**: use USD assets or procedural generation instead.
- **Simulation-only eval**: skips real-robot validation.

## 🧪 Verification / Duplicates

- Verified (DeepMind MuJoCo, NVIDIA Isaac, CARLA, FAIR Habitat).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full simulator stack with patterns |