id: wiki-2026-0508-simulation-environments
title: Simulation Environments
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Sim Environments, Robotics Simulators, RL Environments
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: simulation, robotics, rl, mujoco, isaac-sim
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: python; framework: mujoco/isaac-sim/carla

Simulation Environments

In one line

Simulation environments are the backbone of the training-to-deployment loop for RL, robotics, and autonomous driving. The dominant stack in 2026 is MuJoCo (Google DeepMind), Isaac Sim 4.x (NVIDIA), CARLA (driving), and Habitat 3.x (embodied AI), accessed through the Gymnasium / PettingZoo APIs. Sim2real relies on domain randomization; real2sim relies on NeRF / 3DGS scene reconstruction.

Key points

Major simulators

  • MuJoCo: contact-rich rigid-body physics, fast, open source under Apache 2.0 (since 2022). The default for manipulation.
  • Isaac Sim 4.x (Omniverse): GPU-parallel, photorealistic, USD-based; built for robotics at scale.
  • CARLA: driving / autonomous vehicles, with a full sensor stack.
  • Habitat 3.x: embodied AI, indoor navigation, social tasks.
  • Genesis (2025+): unified physics plus photorealistic rendering; still emerging.
  • Gazebo / Webots: classic robotics simulators in the ROS 2 ecosystem.

API standards

  • Gymnasium: the standard single-agent RL API (successor to Gym).
  • PettingZoo: the multi-agent counterpart.
  • dm_env: DeepMind's environment API.
  • Isaac Lab: the standard for GPU-vectorized environments.
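
As a dependency-free illustration of what these APIs standardize, the sketch below mimics the Gymnasium contract: `reset(seed=...)` returns `(obs, info)` and `step(action)` returns `(obs, reward, terminated, truncated, info)`. The `LineWorld` class and `rollout` helper are hypothetical names used only here; a real environment would subclass `gymnasium.Env` and declare its observation and action spaces.

```python
import random


class LineWorld:
    """Toy 1-D environment that follows the Gymnasium reset/step contract."""

    def __init__(self, goal=5, max_steps=20):
        self.goal = goal
        self.max_steps = max_steps
        self.rng = random.Random()
        self.pos = 0
        self.t = 0

    def reset(self, *, seed=None, options=None):
        if seed is not None:
            self.rng.seed(seed)
        self.pos = 0
        self.t = 0
        return self.pos, {}  # (observation, info)

    def step(self, action):
        # action: 0 = move left, 1 = move right
        self.pos += 1 if action == 1 else -1
        self.t += 1
        terminated = self.pos == self.goal  # the task itself ended (goal reached)
        truncated = (not terminated) and self.t >= self.max_steps  # time limit hit
        reward = 1.0 if terminated else -0.01
        return self.pos, reward, terminated, truncated, {}


def rollout(env, policy, seed=0):
    """Run one episode; return (total_reward, steps, terminated, truncated)."""
    obs, info = env.reset(seed=seed)
    total, steps = 0.0, 0
    while True:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        steps += 1
        if terminated or truncated:
            return total, steps, terminated, truncated
```

Keeping `terminated` (the task ended) separate from `truncated` (the episode was cut off) matters for value bootstrapping, which is why the five-tuple distinguishes them.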

Sim2real techniques

  • Domain randomization: randomize physics, textures, lighting, and camera parameters.
  • System ID: fit simulator parameters to real measurements.
  • Real2sim: reconstruct a scene with NeRF / 3DGS, then import it into the simulator.
  • Adaptive curricula: progress from easy to hard tasks.
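
To make the system ID bullet concrete, here is a minimal sketch under strong assumptions: the "simulator" is a one-parameter model of velocity decay (`dv/dt = -damping * v`), and the "real measurement" is a synthetic trace generated with a known damping value. The names `simulate`, `loss`, and `fit_damping` are hypothetical. With a real robot the trace would come from logged sensor data, and the search would run over actual simulator parameters such as friction or actuator gains.

```python
def simulate(damping, v0=1.0, dt=0.01, steps=200):
    """Explicit-Euler rollout of dv/dt = -damping * v; returns the velocity trace."""
    v, trace = v0, []
    for _ in range(steps):
        v += dt * (-damping * v)
        trace.append(v)
    return trace


def loss(damping, measured):
    """Sum of squared errors between the simulated and the measured trace."""
    sim = simulate(damping, steps=len(measured))
    return sum((s - m) ** 2 for s, m in zip(sim, measured))


def fit_damping(measured, lo=0.0, hi=5.0, iters=60):
    """Golden-section search for the damping value that minimizes the loss."""
    phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if loss(c, measured) < loss(d, measured):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2


# Synthetic stand-in for a real measurement log (generated with damping = 1.7).
measured = simulate(1.7)
estimate = fit_damping(measured)
```

A one-dimensional search is enough here because the loss has a single minimum in the damping value; fitting several parameters at once usually calls for a gradient-free or least-squares optimizer instead.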

Applications

  1. RL policy training (manipulation, locomotion).
  2. Synthetic data generation (perception).
  3. Driving stack regression (CARLA scenarios).
  4. Embodied agent (VLM + action) training.

💻 Patterns

MuJoCo + Gymnasium

import mujoco
import gymnasium as gym
import numpy as np

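class ReachEnv(gym.Env):
    """Reach task: drive the end-effector site "ee" to a random 3-D target."""

    def __init__(self):
        self.model = mujoco.MjModel.from_xml_path("reach.xml")
        self.data = mujoco.MjData(self.model)
        self.action_space = gym.spaces.Box(-1.0, 1.0, (self.model.nu,), dtype=np.float32)
        obs_dim = self.model.nq + self.model.nv + 3
        # qpos / qvel are float64, so declare the observation dtype to match.
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, (obs_dim,), dtype=np.float64)
        self.target = np.zeros(3)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)  # seeds self.np_random
        mujoco.mj_resetData(self.model, self.data)
        self.target = self.np_random.uniform(-0.3, 0.3, 3)
        mujoco.mj_forward(self.model, self.data)  # refresh derived quantities after the reset
        return self._obs(), {}

    def step(self, action):
        self.data.ctrl[:] = np.clip(action, -1.0, 1.0)
        mujoco.mj_step(self.model, self.data)
        ee = self.data.site("ee").xpos
        dist = np.linalg.norm(ee - self.target)
        terminated = bool(dist < 0.02)  # success: end effector within 0.02 of the target
        # No time limit here: wrap the env in gymnasium.wrappers.TimeLimit to get truncation.
        return self._obs(), float(-dist), terminated, False, {}

    def _obs(self):
        return np.concatenate([self.data.qpos, self.data.qvel, self.target])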

Isaac Lab (GPU-vectorized, 4096 envs)

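from isaaclab.app import AppLauncher

# Isaac Sim must be running before the other isaaclab modules are imported.
simulation_app = AppLauncher(headless=True).app

from isaaclab.assets import ArticulationCfg
from isaaclab.envs import ManagerBasedRLEnv, ManagerBasedRLEnvCfg
from isaaclab.scene import InteractiveSceneCfg
from isaaclab.utils import configclass
from isaaclab_assets.robots.franka import FRANKA_PANDA_CFG

@configclass
class FrankaSceneCfg(InteractiveSceneCfg):
    # Scene assets live on the scene config; the regex path clones the robot per env.
    robot: ArticulationCfg = FRANKA_PANDA_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")

@configclass
class FrankaCfg(ManagerBasedRLEnvCfg):
    decimation: int = 2
    episode_length_s: float = 5.0
    scene: FrankaSceneCfg = FrankaSceneCfg(num_envs=4096, env_spacing=2.5)
    # Still required: observations, actions, rewards, and terminations manager configs.

env = ManagerBasedRLEnv(cfg=FrankaCfg())  # 4096 environments in parallel on a single GPU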

CARLA driving scenario

import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.load_world("Town05")

# Spawn the ego vehicle at the first map spawn point and enable autopilot.
bp = world.get_blueprint_library().find("vehicle.tesla.model3")
spawn = world.get_map().get_spawn_points()[0]
ego = world.spawn_actor(bp, spawn)
ego.set_autopilot(True)

# Attach a 1280x720 RGB camera to the ego vehicle and save every frame under out/.
cam_bp = world.get_blueprint_library().find("sensor.camera.rgb")
cam_bp.set_attribute("image_size_x", "1280")
cam_bp.set_attribute("image_size_y", "720")
cam = world.spawn_actor(cam_bp, carla.Transform(carla.Location(x=1.5, z=2.4)), attach_to=ego)
cam.listen(lambda img: img.save_to_disk(f"out/{img.frame:08d}.png"))

Domain randomization (sim2real)

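import numpy as np

def snapshot_nominal(model):
    # Copy the nominal values once; scaling in place every episode would compound.
    return {
        "mass": model.body_mass.copy(),
        "friction": model.geom_friction.copy(),
        "gain": model.actuator_gainprm[:, 0].copy(),
    }

def randomize_episode(model, nominal, rng):
    # mass: one scale factor per body
    model.body_mass[:] = nominal["mass"] * rng.uniform(0.8, 1.2, model.nbody)
    # friction: one scale factor per geom, applied to all three coefficients
    model.geom_friction[:] = nominal["friction"] * rng.uniform(0.7, 1.3, (model.ngeom, 1))
    # gravity
    model.opt.gravity[2] = -9.81 * rng.uniform(0.95, 1.05)
    # actuator gain (position actuators also need biasprm scaled to stay consistent)
    model.actuator_gainprm[:, 0] = nominal["gain"] * rng.uniform(0.9, 1.1, model.nu)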

PPO training (SB3 + parallel envs)

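import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import SubprocVecEnv

def make_env(seed):
    def _init():
        # ReachEnv (defined above) never truncates, so cap the episode length here.
        # 200 steps is an arbitrary choice.
        env = gym.wrappers.TimeLimit(ReachEnv(), max_episode_steps=200)
        env.reset(seed=seed)
        return env
    return _init

if __name__ == "__main__":  # needed by SubprocVecEnv on spawn-based platforms
    vec = SubprocVecEnv([make_env(i) for i in range(16)])
    model = PPO("MlpPolicy", vec, n_steps=2048, batch_size=512, learning_rate=3e-4, verbose=1)
    model.learn(total_timesteps=2_000_000)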
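
The adaptive-curriculum idea (easy → hard) can be layered on top of a training loop like this one. The sketch below is one possible scheduler, not part of Stable-Baselines3: `SuccessRateCurriculum` is a hypothetical class that raises the difficulty level when the recent success rate is high and lowers it when it is low. The thresholds and window size are assumptions to tune per task.

```python
from collections import deque


class SuccessRateCurriculum:
    """Raise difficulty when recent success is high, lower it when success is low."""

    def __init__(self, levels, window=50, promote_at=0.8, demote_at=0.3):
        self.levels = list(levels)  # ordered easy -> hard
        self.idx = 0
        self.window = window
        self.promote_at = promote_at
        self.demote_at = demote_at
        self.results = deque(maxlen=window)

    @property
    def level(self):
        return self.levels[self.idx]

    def record(self, success):
        """Log one episode outcome; adjust the level once the window is full."""
        self.results.append(bool(success))
        if len(self.results) < self.window:
            return self.level
        rate = sum(self.results) / self.window
        if rate >= self.promote_at and self.idx < len(self.levels) - 1:
            self.idx += 1
            self.results.clear()  # re-measure at the new difficulty
        elif rate <= self.demote_at and self.idx > 0:
            self.idx -= 1
            self.results.clear()
        return self.level
```

For a reach task, the levels could be target-sampling ranges such as `[0.1, 0.2, 0.3]`, with `record(terminated)` called at the end of each episode and `level` read when sampling the next target.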

Habitat 3 (embodied)

import habitat
from habitat.config.default import get_config

# Requires the HM3D ObjectNav scenes and episodes to be available locally.
cfg = get_config("benchmark/nav/objectnav/objectnav_hm3d.yaml")
env = habitat.Env(config=cfg)
obs = env.reset()
for _ in range(100):
    obs = env.step({"action": "move_forward"})
    if env.episode_over:
        break
env.close()

Real2sim (3DGS scene)

# Phone capture → 3DGS reconstruction → MuJoCo / Isaac scene
# nerfstudio training + 3DGS export:
# ns-train splatfacto --data captures/kitchen
# ns-export gaussian-splat --load-config out/config.yml --output-dir scene/
# Convert the mesh + textures to USD / GLB, then import into the simulator.

Decision criteria

| Situation | Simulator |
|---|---|
| Manipulation, fast iteration | MuJoCo |
| Massively parallel RL | Isaac Lab |
| Photorealistic sensors | Isaac Sim |
| Driving | CARLA |
| Embodied indoor | Habitat 3 |
| ROS 2 ecosystem | Gazebo |

Default: MuJoCo for prototyping, Isaac Lab for scale, and domain randomization for sim2real.

🔗 Graph

🤖 LLM usage

When to use: generating env configs from a scenario / task description, drafting reward functions, scaffolding scene XML. When not to use: physics tuning (system ID), real-robot deployment.

Anti-patterns

  • No domain randomization: leaves the sim2real gap unaddressed.
  • Tiny env count: leaves Isaac Lab's 4096-way parallelism unused.
  • Hardcoded scene: use USD / procedural generation instead.
  • Simulation-only eval: skips real-robot validation.

🧪 Verification / duplicates

  • Verified (DeepMind MuJoCo, NVIDIA Isaac, CARLA, FAIR Habitat).
  • Trust level: A.

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full simulator stack with patterns |