Isaac Gym

A library for simulating humanoids in Isaac Gym.

This library is built on top of the Isaac Gym library and Humanoid-gym and provides a simple interface for running experiments with GPR. We have defined the following tasks:

Getting up
Walking

We currently support GPR, Zeroth, XBot, Unitree G1 and H1 with more humanoids to come.

The walking task works reliably with the upper body fixed. The getting up task is still an open challenge!

Getting Started

This repository requires Python 3.8 due to compatibility issues with underlying libraries. We hope to support more recent Python versions in the future.

Clone this repository:

git clone https://github.com/kscalelabs/sim.git
cd sim

Create a new conda environment and install the package:

conda create --name kscale-sim-library python=3.8.19
conda activate kscale-sim-library
make install-dev

Install third-party dependencies:

Manually download IsaacGym_Preview_4_Package.tar.gz from https://developer.nvidia.com/isaac-gym, and run:

tar -xvf IsaacGym_Preview_4_Package.tar.gz
conda env config vars set ISAACGYM_PATH=`pwd`/isaacgym
conda deactivate
conda activate kscale-sim-library
make install-third-party-external

Currently, our inference and export package kinfer is not compatible with Python 3.8, so we hack it in by accessing it locally as a submodule.

Consequently, you’ll also need to add it to your PYTHONPATH like this:

conda env config vars set PYTHONPATH=third_party/kinfer

export PYTHONPATH=third_party/kinfer

Running GPR experiments

Run a small training with visualization with the following command:

python sim/train.py --task=gpr --num_envs=4

or full training in the headless mode:

python sim/train.py --task=gpr --num_envs=4096 --headless

Run evaluation with the following command:

python sim/play.py --task gpr --sim_device cpu

See this doc for more beginner tips.

Adding a new robot

Creating a new embodiment is quite straightforward. The best way is to copy an existing robot and modify it.

The best starting point would be GPR.

Create a folder with a new robot here.
Add joint.py file setting up basic properties and joint configuration - see an example.
- Make sure to update the p_gains (def stiffness) and d_gains (def damping), such that the joints match the URDF joint names.
Update height and default rotation of your humanoid.
Add a new embodiment configuration and environment here.
Add a new embodiment to the registry. To run things export your default path:

export MODEL_DIR=sim/resources/NEW_HUMANOID

and kick off the training:

python sim/train.py --env-id NEW_HUMANOID --num-envs 4

Making it stand

The best way to start is to make your new robot stand. In this case you want to comment out the rewards that are responsible for making the robot walk.
If the robot flies away, inspect your joint limits. During training, we introduce a lot of noise to the default joint positions as well as masses. You might need either adapt the limits or the noise level.
Isaac Sim often hits velocities nans when some joints are hitting their limits - you can change the engine parameters to fix this issue.
The revolute joints cannot have 0 velocity in the URDF definition - otherwise, engine will go nans as well.
Observe the reward for orientation and default joint position. The model should just farm these two rewards.
If the robot still struggles at keeping the standing pose, you can also change the urdf definition using the script to fix the upper body or change joint limits.

Making it walk

We set up the logic so that your new robot should start to walk with basic configuration after some modifications.

To get things going it’s best to start from the good standing pose, with knees slightly bent.
The gait reward is the most crucial single reward. You have to adapt the hyperparameters to your robot design to get it to work.
If the robots tends to jump, use only one limb you will have to adapt the overall rewards schema to enforce proper behavior.
If the setup is correct, you should see positive results after first 200-300 epochs. However, the robust policy will requires 2000-3000 epochs.

Performance Analysis

Further analysis can be done on the IsaacGym simulation outputs after doing the initial training.

You will be prompted to make an account with Weights and Biases WandB when first performing train.py. Basic statistics on the ML training process will be provided online afterwards.

For information on the kinematics and controls of the model, it is possible to generate an HDF5 dataset when running play.py. This currently includes:

x, y, and yaw commands
Joint positiions and velocities
Joint ‘actions’, relating to the input torque
Among others; see sim/sim/play.py for the exact outputs.

To save this dataset, specify the argument play.py (...) -log_h5. This will save the HDF5 binary to your current working directory. Note that HDF5 is a binary format, so you will likely need an outside tool, such as h5dump (included in Debian) or h5py to interface with the files.

Common Issues

After cloning Isaac Gym, sometimes the bindings mysteriously disappear.

To fix this, update the submodule:

git submodule update --init --recursive

If you observe errors with libpython3.8.so.1.0, you can try the following:

export LD_LIBRARY_PATH=PATH_TO_YOUR_ENV/lib:$LD_LIBRARY_PATH

If you still see segmentation faults, you can try the following:

sudo apt-get vulkan1

FAQ

Frequently Seen Bugs and Errors

Onshape
- Only revolute and fastened joints
- Set weights or material
- Add joint limits in Onshape
Remove Base to Body Fixed Joint and the Base link. The K-Scale Onshape to URDF converter includes these but they were removed in the GPR URDF.
Flying Robot (walking then flying away like there is a strong gust)
- Check effort and velocity in limits. For small servos around effort=“24” velocity=“30”, For large humanoids like GPR, effort=“80” and velocity=“5”
- Check def stiffness and def damping. For stiffness servos should be ~20 for smaller robots using servos, ~150 for larger humanoids. For damping, should be ~0.5 for servo robots and ~10 for larger humanoids using actuators.

Expected Behavior

These are subject to how you change some of the training params, but generally using the repo mostly as is

Episode Length should be >2k
Mean reward ~40 for quadruped.
Mean reward ~[ ] for humanoid

Mujoco MinPPO