Commit e6651b9a by richie.li

first commit

# These are some examples of commonly ignored file patterns.
# You should customize this list as applicable to your project.
# Learn more about .gitignore:
# https://www.atlassian.com/git/tutorials/saving-changes/gitignore
# Node artifact files
node_modules/
dist/
# Compiled Java class files
*.class
# Compiled Python bytecode
*.py[cod]
# Log files
*.log
# Package files
*.jar
# Maven
target/
dist/
# JetBrains IDE
.idea/
# Unit test reports
TEST*.xml
# Generated by MacOS
.DS_Store
# Generated by Windows
Thumbs.db
# Applications
*.app
*.exe
*.war
# Large media files
*.mp4
*.tiff
*.avi
*.flv
*.mov
*.wmv
# VS Code
.vscode
# logs
logs
runs
# other
*.egg-info
__pycache__
*.swp
MUJOCO_LOG.TXT
English | [中文](README.zh_CN.md)
## Introduction
[AgiBot X1](https://www.zhiyuan-robot.com/qzproduct/169.html) is a modular humanoid robot with a high number of degrees of freedom (DoF), developed and open-sourced by AgiBot. It is built on AgiBot's open-source framework `AimRT` as middleware and uses reinforcement learning for locomotion control.
This project contains the reinforcement learning training code used by AgiBot X1. It can be used together with the [inference software](https://aimrt.org/) provided with AgiBot X1 for walking debugging on the real robot and in simulation, or other robot models can be imported for training.
![](doc/id.jpg)
## Start
### Install Dependencies
1. Create a new Python 3.8 virtual environment:
   - `conda create -n myenv python=3.8`
2. Install PyTorch 1.13 with CUDA 11.7:
   - `conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia`
3. Install NumPy 1.23:
   - `conda install numpy=1.23`
4. Install Isaac Gym:
   - Download and install Isaac Gym Preview 4 from https://developer.nvidia.com/isaac-gym.
   - `cd isaacgym/python && pip install -e .`
   - Run an example with `cd examples && python 1080_balls_of_solitude.py`.
   - Consult `isaacgym/docs/index.html` for troubleshooting.
5. Install the training code dependencies:
   - Clone this repository.
   - `pip install -e .`
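Before launching training, it can help to confirm that the interpreter and packages match the steps above. The following is a minimal sketch and not part of the repository: `check_environment` is a hypothetical helper that only reports the Python version and whether each package can be located, without importing it.

```python
import importlib.util
import sys

# Import names of the packages installed in the steps above.
REQUIRED_PACKAGES = ("torch", "numpy", "isaacgym")


def check_environment(packages=REQUIRED_PACKAGES, python_version=(3, 8)):
    """Return a dict mapping each check to True (ok) or False (problem)."""
    report = {"python_ok": tuple(sys.version_info[:2]) == tuple(python_version)}
    for name in packages:
        # find_spec locates a top-level package without importing it
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'missing or mismatched'}")
```

A `False` entry only says that the package was not found on the current interpreter's path; it does not check versions or GPU support.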
### Usage
#### Train:
```python scripts/train.py --task=x1_dh_stand --run_name=<run_name> --headless```
- The trained model will be saved in `/log/<experiment_name>/exported_data/<date_time><run_name>/model_<iteration>.pt`, where `<experiment_name>` is defined in the config file.
![](doc/train.gif)
#### Play:
```python scripts/play.py --task=x1_dh_stand --load_run=<date_time><run_name>```
![](doc/play.gif)
#### Generate the JIT Model:
``` python scripts/export_policy_dh.py --task=x1_dh_stand --load_run=<date_time><run_name> ```
- The JIT model will be saved in ``` log/exported_policies/<date_time>```
#### Generate the ONNX Model:
``` python scripts/export_onnx_dh.py --task=x1_dh_stand --load_run=<date_time> ```
- The ONNX model will be saved at ```log/exported_policies/<date_time>```
#### Parameter Descriptions:
- task: Task name
- resume: Resume training from a checkpoint
- experiment_name: Name of the experiment to run or load.
- run_name: Name of the run.
- load_run: Name of the run to load when resume=True. If -1, the last run is loaded.
- checkpoint: Saved model checkpoint number. If -1, the last checkpoint is loaded.
- num_envs: Number of environments to create.
- seed: Random seed.
- max_iterations: Maximum number of training iterations.
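The `-1` convention for `checkpoint` can be pictured as picking the highest-numbered `model_<iteration>.pt` file in a run directory. The sketch below is an assumption about how such a lookup might work, not the repository's loader; `resolve_checkpoint` and the directory passed to it are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# File name pattern from the Train section: model_<iteration>.pt
MODEL_PATTERN = re.compile(r"^model_(\d+)\.pt$")


def resolve_checkpoint(run_dir, checkpoint=-1):
    """Return the path of model_<checkpoint>.pt, or the newest one if -1."""
    run_dir = Path(run_dir)
    if checkpoint != -1:
        return run_dir / f"model_{checkpoint}.pt"
    iterations = []
    for path in run_dir.iterdir():
        match = MODEL_PATTERN.match(path.name)
        if match:
            iterations.append(int(match.group(1)))
    if not iterations:
        raise FileNotFoundError(f"no model_<iteration>.pt files in {run_dir}")
    # Compare numerically so that model_1000.pt beats model_200.pt
    return run_dir / f"model_{max(iterations)}.pt"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for iteration in (0, 200, 1000):
            (Path(tmp) / f"model_{iteration}.pt").touch()
        print(resolve_checkpoint(tmp).name)  # model_1000.pt
```

Sorting by the parsed integer rather than by file name avoids the lexicographic trap where `model_200.pt` would sort after `model_1000.pt`.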
### Add New Environments
1. Create a new folder under the `envs/` directory, and then create a configuration file `<your_env>_config.py` and an environment file `<your_env>_env.py` in the folder. The two files should inherit `LeggedRobotCfg` and `LeggedRobot` respectively.
2. Place the URDF, mesh, and MJCF files of the new robot in the `resources/` folder.
- Configure the URDF path, PD gain, body name, default_joint_angles, experiment_name, etc., for the new robot in `<your_env>_config.py`.
3. Register the new robot in `humanoid/envs/__init__.py`.
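The three steps above can be sketched as follows. This is an illustrative, self-contained stand-in rather than the project's code: `LeggedRobotCfg`, `LeggedRobot`, and `TaskRegistry` below are simplified placeholders with a flat, made-up field layout, and `MyRobotCfg`, `MyRobotCfgPPO`, `MyRobotEnv`, the URDF path, and the `my_robot` task name are all hypothetical. In the real project the base classes come from the `humanoid` package and the registration line goes in `humanoid/envs/__init__.py`.

```python
# Placeholders standing in for the repository's base classes (assumption:
# the real classes carry many more fields and a nested layout).
class LeggedRobotCfg:
    urdf_path = ""
    experiment_name = "default"
    default_joint_angles = {}


class LeggedRobot:
    def __init__(self, cfg):
        self.cfg = cfg


# Step 1: contents of <your_env>_config.py and <your_env>_env.py
class MyRobotCfg(LeggedRobotCfg):
    # Step 2: point the config at the files placed under resources/
    urdf_path = "resources/robots/my_robot/urdf/my_robot.urdf"  # hypothetical
    experiment_name = "my_robot"
    default_joint_angles = {"left_knee": 0.3, "right_knee": 0.3}  # hypothetical


class MyRobotCfgPPO:
    max_iterations = 1000  # hypothetical training setting


class MyRobotEnv(LeggedRobot):
    pass


# Step 3: a toy registry with the same call shape as the real register()
class TaskRegistry:
    def __init__(self):
        self._tasks = {}

    def register(self, name, env_class, env_cfg, train_cfg):
        self._tasks[name] = (env_class, env_cfg, train_cfg)

    def make_env(self, name):
        env_class, env_cfg, _ = self._tasks[name]
        return env_class(env_cfg)


task_registry = TaskRegistry()
task_registry.register("my_robot", MyRobotEnv, MyRobotCfg(), MyRobotCfgPPO())
env = task_registry.make_env("my_robot")
print(type(env).__name__, env.cfg.urdf_path)
```

Once a task is registered under a name, that name is what `--task` selects on the command line.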
### sim2sim
Use Mujoco for sim2sim validation:
```
python scripts/sim2sim.py --task=x1_dh_stand --load_model /path/to/exported_policies/
```
![](doc/mujoco.gif)
### Usage of Joystick
We use the Logitech F710 joystick. While `play.py` or `sim2sim.py` is running, press and hold button 4 and move the sticks to make the robot move forward/backward, strafe left/right, or rotate.
![](doc/joy_map.jpg)
| Button | Command |
| -------------------- |:--------------------:|
| 4 + 1- | Move forward |
| 4 + 1+ | Move backward |
| 4 + 0- | Strafe left |
| 4 + 0+ | Strafe right |
| 4 + 3- | Rotate counterclockwise |
| 4 + 3+ | Rotate clockwise |
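The table can be read as a mapping from axis readings to velocity commands that is only active while button 4 is held. The sketch below illustrates that reading; it is not the code in `play.py` or `sim2sim.py`. The function name, the `max_*` scaling values, and the sign convention (forward, left, and counterclockwise are positive) are assumptions.

```python
def joystick_to_command(axes, buttons, max_vx=1.0, max_vy=0.5, max_yaw=1.0):
    """Map raw axis values in [-1, 1] to a (vx, vy, yaw_rate) command.

    axes:    sequence indexed by axis number (axes 0, 1 and 3 are used)
    buttons: sequence of booleans indexed by button number (4 enables motion)
    """
    if not buttons[4]:
        # Without the enable button the command stays at zero
        return (0.0, 0.0, 0.0)
    vx = -axes[1] * max_vx    # 1- is forward, 1+ is backward
    vy = -axes[0] * max_vy    # 0- is strafe left, 0+ is strafe right
    yaw = -axes[3] * max_yaw  # 3- is counterclockwise, 3+ is clockwise
    return (vx, vy, yaw)


if __name__ == "__main__":
    pressed = [False, False, False, False, True]
    # Stick pushed fully along 1- while rotating halfway along 3+
    print(joystick_to_command([0.0, -1.0, 0.0, 0.5], pressed))
```

Gating every axis on button 4 means an accidental bump of a stick does not move the robot.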
## Directory Structure
```
.
|— humanoid # Main code directory
| |—algo # Algorithm directory
| |—envs # Environment directory
| |—scripts # Script directory
| |—utils # Utility and function directory
|— logs # Model directory
|— resources # Resource library
| |— robots # Robot urdf, mjcf, mesh
|— README.md # README document
```
> References
> * [GitHub - leggedrobotics/legged_gym: Isaac Gym Environments for Legged Robots](https://github.com/leggedrobotics/legged_gym)
> * [GitHub - leggedrobotics/rsl_rl: Fast and simple implementation of RL algorithms, designed to run fully on GPU.](https://github.com/leggedrobotics/rsl_rl)
> * [GitHub - roboterax/humanoid-gym: Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer https://arxiv.org/abs/2404.05695](https://github.com/roboterax/humanoid-gym)
[English](README.md) | 中文
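## Introduction
[AgiBot X1](https://www.zhiyuan-robot.com/qzproduct/169.html) is a modular humanoid robot with a high number of degrees of freedom, developed and open-sourced by AgiBot. The X1 software system is built on AgiBot's open-source component `AimRT` as middleware and uses reinforcement learning for locomotion control.
This project contains the reinforcement learning training code used by AgiBot X1. It can be used together with the [inference software](https://aimrt.org/) that accompanies AgiBot X1 for walking debugging on the real robot and in simulation, or other robot models can be imported for training.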
![](doc/id.jpg)
## Running the Code
### Install Dependencies
1. Create a new Python 3.8 virtual environment:
   - `conda create -n myenv python=3.8`
2. Install PyTorch 1.13 with CUDA 11.7:
   - `conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia`
3. Install NumPy 1.23:
   - `conda install numpy=1.23`
4. Install Isaac Gym:
   - Download and install Isaac Gym Preview 4 from https://developer.nvidia.com/isaac-gym.
   - `cd isaacgym/python && pip install -e .`
   - Run an example with `cd examples && python 1080_balls_of_solitude.py`.
   - Consult `isaacgym/docs/index.html` for troubleshooting.
5. Install the training code dependencies:
   - Clone this repository.
   - `pip install -e .`
### Usage
#### Train:
```python scripts/train.py --task=x1_dh_stand --run_name=<run_name> --headless```
- The trained model is saved in `/log/<experiment_name>/exported_data/<date_time><run_name>/model_<iteration>.pt`, where `<experiment_name>` is defined in the config file.
![](doc/train.gif)
#### Play:
```python scripts/play.py --task=x1_dh_stand --load_run=<date_time><run_name>```
![](doc/play.gif)
#### Generate the JIT Model:
``` python scripts/export_policy_dh.py --task=x1_dh_stand --load_run=<date_time><run_name> ```
- The JIT model is saved in ``` log/exported_policies/<date_time>```
#### Generate the ONNX Model:
``` python scripts/export_onnx_dh.py --task=x1_dh_stand --load_run=<date_time> ```
- The ONNX model is saved in ```log/exported_policies/<date_time>```
#### Parameter Descriptions:
- task: Task name
- resume: Resume training from a checkpoint
- experiment_name: Name of the experiment to run or load.
- run_name: Name of the run.
- load_run: Name of the run to load when resume=True. If -1, the last run is loaded.
- checkpoint: Saved model checkpoint number. If -1, the last checkpoint is loaded.
- num_envs: Number of environments to create.
- seed: Random seed.
- max_iterations: Maximum number of training iterations.
### 添加新环境
1.在 `envs/`目录下创建一个新文件夹,在新文件夹下创建一个配置文件`<your_env>_config.py`和环境文件`<your_env>_env.py`,这两个文件要分别继承`LeggedRobotCfg`和`LeggedRobot`
2.将新机器的urdf, mesh, mjcf放到 `resources/`文件夹下
- 在`<your_env>_config.py`里配置新机器的urdf path,PD gain,body name, default_joint_angles, experiment_name等
3.在`humanoid/envs/__init__.py`里注册你的新机器
### sim2sim
Use MuJoCo for sim2sim validation:
```
python scripts/sim2sim.py --task=x1_dh_stand --load_model /path/to/exported_policies/
```
![](doc/mujoco.gif)
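### Usage of Joystick
We use the Logitech F710 joystick. While `play.py` or `sim2sim.py` is running, press and hold button 4 and move the sticks to make the robot move forward/backward, strafe left/right, or rotate.
![](doc/joy_map.jpg)
| Button | Command |
| -------------------- |:--------------------:|
| 4 + 1- | Move forward |
| 4 + 1+ | Move backward |
| 4 + 0- | Strafe left |
| 4 + 0+ | Strafe right |
| 4 + 3- | Rotate counterclockwise |
| 4 + 3+ | Rotate clockwise |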
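## Directory Structure
```
.
|— humanoid # Main code directory
| |—algo # Algorithm directory
| |—envs # Environment directory
| |—scripts # Script directory
| |—utils # Utility and function directory
|— logs # Model directory
|— resources # Resource library
| |— robots # Robot urdf, mjcf, mesh
|— README.md # README document
```
> References: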
>
> * [GitHub - leggedrobotics/legged_gym: Isaac Gym Environments for Legged Robots](https://github.com/leggedrobotics/legged_gym)
> * [GitHub - leggedrobotics/rsl_rl: Fast and simple implementation of RL algorithms, designed to run fully on GPU.](https://github.com/leggedrobotics/rsl_rl)
> * [GitHub - roboterax/humanoid-gym: Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer https://arxiv.org/abs/2404.05695](https://github.com/roboterax/humanoid-gym)
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
import os
LEGGED_GYM_ROOT_DIR = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
LEGGED_GYM_ENVS_DIR = os.path.join(LEGGED_GYM_ROOT_DIR, 'humanoid', 'envs')
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
from .vec_env import VecEnv
from .ppo import *
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
from .dh_ppo import DHPPO
from .dh_on_policy_runner import DHOnPolicyRunner
from .actor_critic_dh import ActorCriticDH
from .rollout_storage import RolloutStorage
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
import torch
import torch.nn as nn
from torch.distributions import Normal


class ActorCriticDH(nn.Module):
    def __init__(self, num_short_obs,
                 num_proprio_obs,
                 num_critic_obs,
                 num_actions,
                 actor_hidden_dims=[256, 256, 256],
                 critic_hidden_dims=[256, 256, 256],
                 state_estimator_hidden_dims=[256, 128, 64],
                 in_channels=66,
                 kernel_size=[6, 4],
                 filter_size=[32, 16],
                 stride_size=[3, 2],
                 lh_output_dim=64,
                 init_noise_std=1.0,
                 activation=nn.ELU(),
                 **kwargs):
        if kwargs:
            print("ActorCriticDH.__init__ got unexpected arguments, which will be ignored: " + str([key for key in kwargs.keys()]))
        super(ActorCriticDH, self).__init__()

        # define actor net and critic net
        # self.num_short_obs = int(cfg.env.num_single_obs * cfg.env.short_frame_stack), 5 history
        # lh_output_dim is cnn output
        # 3 is state estimator output
        mlp_input_dim_a = num_short_obs + lh_output_dim + 3
        # num_privileged_obs = int(c_frame_stack * single_num_privileged_obs), 3 history
        mlp_input_dim_c = num_critic_obs

        # Policy
        actor_layers = []
        actor_layers.append(nn.Linear(mlp_input_dim_a, actor_hidden_dims[0]))
        actor_layers.append(activation)
        for l in range(len(actor_hidden_dims)):
            if l == len(actor_hidden_dims) - 1:
                # num_actions policy output(12)
                actor_layers.append(nn.Linear(actor_hidden_dims[l], num_actions))
            else:
                actor_layers.append(nn.Linear(actor_hidden_dims[l], actor_hidden_dims[l + 1]))
                actor_layers.append(activation)
        self.actor = nn.Sequential(*actor_layers)

        # Value function
        critic_layers = []
        critic_layers.append(nn.Linear(mlp_input_dim_c, critic_hidden_dims[0]))
        critic_layers.append(activation)
        for l in range(len(critic_hidden_dims)):
            if l == len(critic_hidden_dims) - 1:
                critic_layers.append(nn.Linear(critic_hidden_dims[l], 1))
            else:
                critic_layers.append(nn.Linear(critic_hidden_dims[l], critic_hidden_dims[l + 1]))
                critic_layers.append(activation)
        self.critic = nn.Sequential(*critic_layers)

        print(f"Actor MLP: {self.actor}")
        print(f"Critic MLP: {self.critic}")

        # Action noise
        self.std = nn.Parameter(init_noise_std * torch.ones(num_actions))
        self.distribution = None
        # disable args validation for speedup; set_default_validate_args is a
        # static method, so it has to be called rather than assigned to
        Normal.set_default_validate_args(False)

        # define long_history CNN
        long_history_layers = []
        self.in_channels = in_channels
        cnn_output_dim = num_proprio_obs
        for out_channels, kernel, stride in zip(filter_size, kernel_size, stride_size):
            long_history_layers.append(nn.Conv1d(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel, stride=stride))
            long_history_layers.append(nn.ReLU())
            # output length of an unpadded 1-D convolution
            cnn_output_dim = (cnn_output_dim - kernel + stride) // stride
            in_channels = out_channels
        # flattened size = remaining length * channels of the last conv layer
        cnn_output_dim *= out_channels
        long_history_layers.append(nn.Flatten())
        long_history_layers.append(nn.Linear(cnn_output_dim, 128))
        long_history_layers.append(nn.ELU())
        long_history_layers.append(nn.Linear(128, lh_output_dim))
        self.long_history = nn.Sequential(*long_history_layers)
        print(f"long_history CNN: {self.long_history}")

        # define state_estimator MLP
        # self.num_short_obs = int(cfg.env.num_single_obs * cfg.env.short_frame_stack), 5 history
        self.num_short_obs = num_short_obs
        state_estimator_input_dim = num_short_obs
        state_estimator_output_dim = 3
        state_estimator_layers = []
        state_estimator_layers.append(nn.Linear(state_estimator_input_dim, state_estimator_hidden_dims[0]))
        state_estimator_layers.append(activation)
        for l in range(len(state_estimator_hidden_dims)):
            if l == len(state_estimator_hidden_dims) - 1:
                state_estimator_layers.append(nn.Linear(state_estimator_hidden_dims[l], state_estimator_output_dim))
            else:
                state_estimator_layers.append(nn.Linear(state_estimator_hidden_dims[l], state_estimator_hidden_dims[l + 1]))
                state_estimator_layers.append(activation)
        self.state_estimator = nn.Sequential(*state_estimator_layers)
        print(f"state_estimator MLP: {self.state_estimator}")

        self.num_proprio_obs = num_proprio_obs

    @staticmethod
    # not used at the moment
    def init_weights(sequential, scales):
        linear_layers = (mod for mod in sequential if isinstance(mod, nn.Linear))
        for idx, module in enumerate(linear_layers):
            torch.nn.init.orthogonal_(module.weight, gain=scales[idx])

    def reset(self, dones=None):
        pass

    def forward(self):
        raise NotImplementedError

    @property
    def action_mean(self):
        return self.distribution.mean

    @property
    def action_std(self):
        return self.distribution.stddev

    @property
    def entropy(self):
        return self.distribution.entropy().sum(dim=-1)

    def update_distribution(self, observations):
        mean = self.actor(observations)
        self.distribution = Normal(mean, mean*0. + self.std)

    def act(self, observations, **kwargs):
        # the most recent frames sit at the end of the observation vector
        short_history = observations[..., -self.num_short_obs:]
        es_vel = self.state_estimator(short_history)
        compressed_long_history = self.long_history(observations.view(-1, self.in_channels, self.num_proprio_obs))
        actor_obs = torch.cat((short_history, es_vel, compressed_long_history), dim=-1)
        self.update_distribution(actor_obs)
        return self.distribution.sample()

    def get_actions_log_prob(self, actions):
        return self.distribution.log_prob(actions).sum(dim=-1)

    def act_inference(self, observations):
        short_history = observations[..., -self.num_short_obs:]
        es_vel = self.state_estimator(short_history)
        compressed_long_history = self.long_history(observations.view(-1, self.in_channels, self.num_proprio_obs))
        actor_obs = torch.cat((short_history, es_vel, compressed_long_history), dim=-1)
        actions_mean = self.actor(actor_obs)
        return actions_mean

    def evaluate(self, critic_observations, **kwargs):
        value = self.critic(critic_observations)
        return value
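The size bookkeeping in `ActorCriticDH.__init__` can be checked without building the network. The sketch below repeats the arithmetic for the long-history CNN and for the actor input, using the default `kernel_size`, `stride_size`, `filter_size`, and `lh_output_dim`. The helper names are hypothetical, and the values `num_proprio_obs=47` and `num_short_obs=235` are made up for illustration, since the real ones come from the environment config.

```python
def conv1d_out_len(length, kernel, stride):
    """Output length of an unpadded 1-D convolution with dilation 1."""
    return (length - kernel) // stride + 1


def long_history_flat_dim(num_proprio_obs, kernel_size=(6, 4),
                          stride_size=(3, 2), filter_size=(32, 16)):
    """Size of the flattened CNN feature map fed to the first Linear layer."""
    length = num_proprio_obs
    for kernel, stride in zip(kernel_size, stride_size):
        length = conv1d_out_len(length, kernel, stride)
    # After the last conv layer there are filter_size[-1] channels
    return length * filter_size[-1]


def actor_input_dim(num_short_obs, lh_output_dim=64, estimator_dim=3):
    """Short history + compressed long history + state-estimator output."""
    return num_short_obs + lh_output_dim + estimator_dim


print(long_history_flat_dim(47))  # 47 -> 14 -> 6 positions, times 16 channels = 96
print(actor_input_dim(235))       # 235 + 64 + 3 = 302
```

`(length - kernel) // stride + 1` is the same quantity as the `(cnn_output_dim - kernel_size + stride_size) // stride_size` expression used in the class; the two forms differ only in where the `+ 1` is folded in.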
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
import torch
from typing import Tuple, Union
from abc import ABC, abstractmethod


# minimal interface of the environment
class VecEnv(ABC):
    num_envs: int
    num_obs: int
    num_short_obs: int
    num_privileged_obs: int
    num_actions: int
    max_episode_length: int
    privileged_obs_buf: torch.Tensor
    obs_buf: torch.Tensor
    rew_buf: torch.Tensor
    reset_buf: torch.Tensor
    episode_length_buf: torch.Tensor  # current episode duration
    extras: dict
    device: torch.device

    @abstractmethod
    def step(self, actions: torch.Tensor) -> Tuple[torch.Tensor, Union[torch.Tensor, None], torch.Tensor, torch.Tensor, dict]:
        pass

    @abstractmethod
    def reset(self, env_ids: Union[list, torch.Tensor]):
        pass

    @abstractmethod
    def get_observations(self) -> torch.Tensor:
        pass

    @abstractmethod
    def get_privileged_observations(self) -> Union[torch.Tensor, None]:
        pass
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
from humanoid import LEGGED_GYM_ROOT_DIR, LEGGED_GYM_ENVS_DIR
from .base.legged_robot import LeggedRobot
from .x1.x1_dh_stand_config import X1DHStandCfg, X1DHStandCfgPPO
from .x1.x1_dh_stand_env import X1DHStandEnv
from humanoid.utils.task_registry import task_registry
task_registry.register( "x1_dh_stand", X1DHStandEnv, X1DHStandCfg(), X1DHStandCfgPPO() )
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
import inspect


class BaseConfig:
    def __init__(self) -> None:
        """ Initializes all member classes recursively. Skips the built-in '__class__' attribute."""
        self.init_member_classes(self)

    @staticmethod
    def init_member_classes(obj):
        # iterate over all attribute names
        for key in dir(obj):
            # disregard builtin attributes
            # if key.startswith("__"):
            if key == "__class__":
                continue
            # get the corresponding attribute object
            var = getattr(obj, key)
            # check if the attribute is a class
            if inspect.isclass(var):
                # instantiate the class
                i_var = var()
                # set the attribute to the instance instead of the type
                setattr(obj, key, i_var)
                # recursively init members of the attribute
                BaseConfig.init_member_classes(i_var)
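A short demonstration may make the effect of `init_member_classes` easier to see. The sketch below restates the class in compact form so that it runs on its own; `DemoCfg` and its fields are hypothetical and not taken from the repository.

```python
import inspect


class BaseConfig:
    """Compact restatement of the class above, for a standalone demo."""

    def __init__(self):
        self.init_member_classes(self)

    @staticmethod
    def init_member_classes(obj):
        for key in dir(obj):
            if key == "__class__":
                continue
            var = getattr(obj, key)
            if inspect.isclass(var):
                instance = var()
                # Shadow the nested class with an instance on this object
                setattr(obj, key, instance)
                BaseConfig.init_member_classes(instance)


class DemoCfg(BaseConfig):  # hypothetical config, not from the repository
    class env:
        num_envs = 4096

    class control:
        class gains:
            stiffness = 30.0


cfg = DemoCfg()
cfg.env.num_envs = 16  # changes this config object only, not the class

print(inspect.isclass(DemoCfg.env), inspect.isclass(cfg.env))  # True False
print(cfg.control.gains.stiffness, DemoCfg.env.num_envs)       # 30.0 4096
```

Because every nested class is replaced by an instance, two config objects can be edited independently even though their fields are declared as class attributes.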
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
import sys
from isaacgym import gymapi
from isaacgym import gymutil
import numpy as np
import torch
# Base class for RL tasks
class BaseTask():

    def __init__(self, cfg, sim_params, physics_engine, sim_device, headless):
        self.gym = gymapi.acquire_gym()

        self.sim_params = sim_params
        self.physics_engine = physics_engine
        self.sim_device = sim_device
        sim_device_type, self.sim_device_id = gymutil.parse_device_str(
            self.sim_device)
        self.headless = headless

        # env device is GPU only if sim is on GPU and use_gpu_pipeline=True; otherwise returned tensors are copied to CPU by PhysX.
        if sim_device_type == 'cuda' and sim_params.use_gpu_pipeline:
            self.device = self.sim_device
        else:
            self.device = 'cpu'

        # graphics device for rendering, -1 for no rendering
        self.graphics_device_id = self.sim_device_id
        if self.headless:
            self.graphics_device_id = -1

        self.num_envs = cfg.env.num_envs
        self.num_obs = cfg.env.num_observations
        self.num_short_obs = int(cfg.env.num_single_obs * cfg.env.short_frame_stack)
        self.num_privileged_obs = cfg.env.num_privileged_obs
        self.num_actions = cfg.env.num_actions
        self.num_single_obs = cfg.env.num_single_obs

        # optimization flags for pytorch JIT
        torch._C._jit_set_profiling_mode(False)
        torch._C._jit_set_profiling_executor(False)

        # allocate buffers
        self.obs_buf = torch.zeros(
            self.num_envs, self.num_obs, device=self.device, dtype=torch.float)
        self.rew_buf = torch.zeros(
            self.num_envs, device=self.device, dtype=torch.float)
        # new reward buffers for exp rewards
        self.neg_reward_buf = torch.zeros(
            self.num_envs, device=self.device, dtype=torch.float)
        self.pos_reward_buf = torch.zeros(
            self.num_envs, device=self.device, dtype=torch.float)

        self.reset_buf = torch.ones(
            self.num_envs, device=self.device, dtype=torch.long)
        self.episode_length_buf = torch.zeros(
            self.num_envs, device=self.device, dtype=torch.long)
        self.time_out_buf = torch.zeros(
            self.num_envs, device=self.device, dtype=torch.bool)
        if self.num_privileged_obs is not None:
            self.privileged_obs_buf = torch.zeros(
                self.num_envs, self.num_privileged_obs, device=self.device, dtype=torch.float)
        else:
            self.privileged_obs_buf = None

        self.extras = {}
        # create envs, sim and viewer
        self.create_sim()
        self.gym.prepare_sim(self.sim)
        self.enable_viewer_sync = True
        self.viewer = None

        # if running with a viewer, set up keyboard shortcuts
        if not self.headless:
            self.viewer = self.gym.create_viewer(
                self.sim, gymapi.CameraProperties())
            # subscribe to keyboard shortcuts
            self.gym.subscribe_viewer_keyboard_event(
                self.viewer, gymapi.KEY_ESCAPE, "QUIT")
            self.gym.subscribe_viewer_keyboard_event(
                self.viewer, gymapi.KEY_V, "toggle_viewer_sync")

        # a 720x480 camera sensor is attached to the first env in both
        # viewer and headless modes (self.envs is expected to be populated
        # by the subclass's create_sim())
        camera_properties = gymapi.CameraProperties()
        camera_properties.width = 720
        camera_properties.height = 480
        self.camera_handle = self.gym.create_camera_sensor(
            self.envs[0], camera_properties)

    def get_observations(self):
        return self.obs_buf

    def get_privileged_observations(self):
        return self.privileged_obs_buf

    def get_rma_observations(self):
        return self.rma_obs_buf

    def reset_idx(self, env_ids):
        """Reset selected robots"""
        raise NotImplementedError

    def reset(self):
        """ Reset all robots"""
        self.reset_idx(torch.arange(self.num_envs, device=self.device))
        obs, privileged_obs, _, _, _ = self.step(torch.zeros(
            self.num_envs, self.num_actions, device=self.device, requires_grad=False))
        return obs, privileged_obs

    def step(self, actions):
        raise NotImplementedError

    def render(self, sync_frame_time=True):
        if self.viewer:
            # check for window closed
            if self.gym.query_viewer_has_closed(self.viewer):
                sys.exit()

            # check for keyboard events
            for evt in self.gym.query_viewer_action_events(self.viewer):
                if evt.action == "QUIT" and evt.value > 0:
                    sys.exit()
                elif evt.action == "toggle_viewer_sync" and evt.value > 0:
                    self.enable_viewer_sync = not self.enable_viewer_sync

            # fetch results
            if self.device != 'cpu':
                self.gym.fetch_results(self.sim, True)

            # step graphics
            if self.enable_viewer_sync:
                self.gym.step_graphics(self.sim)
                self.gym.draw_viewer(self.viewer, self.sim, True)
                if sync_frame_time:
                    self.gym.sync_frame_time(self.sim)
            else:
                self.gym.poll_viewer_events(self.viewer)
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
from humanoid import LEGGED_GYM_ROOT_DIR
import os
from humanoid.envs import *
from humanoid.utils import get_args, task_registry
from datetime import datetime
import torch
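def get_load_path(root, load_run=-1, checkpoint=-1):
    # `checkpoint` is accepted for signature compatibility but is not used here:
    # the last file (after sorting) in the selected run directory is loaded.
    try:
        runs = os.listdir(root)
        runs.sort()
        if "exported" in runs:
            runs.remove("exported")
        last_run = os.path.join(root, runs[-1])
    except (OSError, IndexError):
        # missing/unreadable directory, or no runs inside it
        raise ValueError("No runs in this directory: " + root)
    if load_run == -1:
        load_run = last_run
    else:
        load_run = os.path.join(root, load_run)

    models = [file for file in os.listdir(load_run)]
    # left-pad names with zeros so that shorter names sort first
    models.sort(key=lambda m: "{0:0>15}".format(m))
    model = models[-1]

    load_path = os.path.join(load_run, model)
    return load_path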

def export_onnx(args):
    env_cfg, train_cfg = task_registry.get_cfgs(name=args.task)

    # load jit
    log_root = os.path.join(LEGGED_GYM_ROOT_DIR, 'logs', train_cfg.runner.experiment_name, 'exported_policies')
    model_path = get_load_path(log_root, load_run=args.load_run, checkpoint=args.checkpoint)
    print("Load model from:", model_path)
    jit_model = torch.jit.load(model_path)
    jit_model.eval()

    current_date_time = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
    root_path = os.path.join(LEGGED_GYM_ROOT_DIR, 'logs',
                             train_cfg.runner.experiment_name, 'exported_onnx',
                             current_date_time)
    os.makedirs(root_path, exist_ok=True)
    dir_name = args.task.split('_')[0] + "_policy.onnx"
    path = os.path.join(root_path, dir_name)
    example_input = torch.randn(1, env_cfg.env.num_observations)

    # export onnx model
    torch.onnx.export(jit_model,                 # JIT model
                      example_input,             # model example input
                      path,                      # model output path
                      export_params=True,        # export model params
                      opset_version=11,          # ONNX opset version
                      do_constant_folding=True,  # optimize constant variable folding
                      input_names=['input'],     # model input name
                      output_names=['output'],   # model output name
                      )
    print("Export onnx model to: ", path)

if __name__ == '__main__':
    args = get_args()
    if args.load_run is None:
        args.load_run = -1
    export_onnx(args)
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
from humanoid import LEGGED_GYM_ROOT_DIR
import os
import copy
from humanoid.envs import *
from humanoid.utils import get_args, task_registry, Logger
from humanoid.utils.helpers import get_load_path, class_to_dict
from datetime import datetime
import torch
from humanoid.algo.ppo import ActorCriticDH
class ExportedDH(torch.nn.Module):
    def __init__(self, actor, long_history, state_estimator, num_short_obs, in_channels, num_proprio_obs):
        super().__init__()
        self.actor = copy.deepcopy(actor).cpu()
        self.long_history = copy.deepcopy(long_history).cpu()
        self.state_estimator = copy.deepcopy(state_estimator).cpu()
        self.num_short_obs = num_short_obs
        self.in_channels = in_channels
        self.num_proprio_obs = num_proprio_obs

    def forward(self, observations):
        short_history = observations[..., -self.num_short_obs:]
        es_vel = self.state_estimator(short_history)
        compressed_long_history = self.long_history(observations.view(-1, self.in_channels, self.num_proprio_obs))
        actor_obs = torch.cat((short_history, es_vel, compressed_long_history), dim=-1)
        actions_mean = self.actor(actor_obs)
        return actions_mean

    def export(self, path):
        self.to("cpu")
        traced_script_module = torch.jit.script(self)
        traced_script_module.save(path)
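
# Illustrative sketch (not part of the repository): the slicing done in
# ExportedDH.forward, reproduced with plain Python lists for one sample.
# It assumes the flat observation holds `in_channels` frames of
# `num_single_obs` values each, stored frame after frame, so that the
# trailing slice covers the last `short_frame_stack` frames. The helper
# name `split_observation` and the toy sizes are hypothetical.

```python
def split_observation(obs, in_channels, num_single_obs, short_frame_stack):
    """Split one flat observation into per-frame rows and the trailing short history."""
    assert len(obs) == in_channels * num_single_obs
    # List equivalent of observations.view(-1, in_channels, num_proprio_obs) for one sample.
    frames = [obs[i * num_single_obs:(i + 1) * num_single_obs]
              for i in range(in_channels)]
    # List equivalent of observations[..., -num_short_obs:].
    num_short_obs = short_frame_stack * num_single_obs
    short_history = obs[-num_short_obs:]
    return frames, short_history


obs = list(range(4 * 3))  # 4 frames of 3 values each (toy sizes)
frames, short_history = split_observation(
    obs, in_channels=4, num_single_obs=3, short_frame_stack=2)
print(frames)
print(short_history)
```

# Under this assumption the short history is simply the last two frames of
# the long history, which is why both can be cut from the same input tensor.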

def export_policy(args):
    env_cfg, train_cfg = task_registry.get_cfgs(name=args.task)
    train_cfg_dict = class_to_dict(train_cfg)
    policy_cfg = train_cfg_dict["policy"]
    num_critic_obs = env_cfg.env.num_privileged_obs
    if env_cfg.terrain.measure_heights:
        num_critic_obs = env_cfg.env.c_frame_stack * (env_cfg.env.single_num_privileged_obs + env_cfg.terrain.num_height)
    num_short_obs = env_cfg.env.short_frame_stack * env_cfg.env.num_single_obs
    actor_critic_class = eval(train_cfg_dict["runner"]["policy_class_name"])
    actor_critic: ActorCriticDH = actor_critic_class(
        num_short_obs, env_cfg.env.num_single_obs, num_critic_obs, env_cfg.env.num_actions, **policy_cfg
    )

    # load policy
    log_root_encoder = os.path.join(LEGGED_GYM_ROOT_DIR, 'logs', train_cfg.runner.experiment_name, 'exported_data')
    model_path = get_load_path(log_root_encoder, load_run=args.load_run, checkpoint=args.checkpoint)
    print("Load model from:", model_path)
    loaded_dict = torch.load(model_path)
    actor_critic.load_state_dict(loaded_dict["model_state_dict"])
    exported_policy = ExportedDH(actor_critic.actor,
                                 actor_critic.long_history,
                                 actor_critic.state_estimator,
                                 num_short_obs,
                                 policy_cfg["in_channels"],
                                 env_cfg.env.num_single_obs)

    current_date_time = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
    root_path = os.path.join(LEGGED_GYM_ROOT_DIR, 'logs',
                             train_cfg.runner.experiment_name, 'exported_policies',
                             current_date_time)
    os.makedirs(root_path, exist_ok=True)
    dir_name = "policy_dh.jit"
    path = os.path.join(root_path, dir_name)
    exported_policy.export(path)
    print("Export policy to:", path)

if __name__ == '__main__':
    EXPORT_POLICY = True
    args = get_args()
    if args.load_run is None:
        args.load_run = -1
    if args.checkpoint is None:
        args.checkpoint = -1
    export_policy(args)
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
from humanoid.envs import *
from humanoid.utils import get_args, task_registry

def train(args):
    env, env_cfg = task_registry.make_env(name=args.task, args=args)
    ppo_runner, train_cfg, log_dir = task_registry.make_alg_runner(env=env, name=args.task, args=args)
    ppo_runner.learn(num_learning_iterations=train_cfg.runner.max_iterations, init_at_random_ep_len=False)


if __name__ == '__main__':
    args = get_args()
    train(args)
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
from .helpers import class_to_dict, get_load_path, get_args, export_policy_as_jit, set_seed, update_class_from_dict
from .task_registry import task_registry
from .logger import Logger
from .math import *
from .terrain import Terrain
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
import os
import copy
import torch
import numpy as np
import random
from isaacgym import gymapi
from isaacgym import gymutil
from humanoid import LEGGED_GYM_ROOT_DIR, LEGGED_GYM_ENVS_DIR

def class_to_dict(obj) -> dict:
    if not hasattr(obj, "__dict__"):
        return obj
    result = {}
    for key in dir(obj):
        if key.startswith("_"):
            continue
        element = []
        val = getattr(obj, key)
        if isinstance(val, list):
            for item in val:
                element.append(class_to_dict(item))
        else:
            element = class_to_dict(val)
        result[key] = element
    return result
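
# Illustrative sketch (not part of the repository): what class_to_dict
# returns for a small nested config class. The function body is copied from
# above so the snippet runs on its own; `DemoCfg` and its fields are
# hypothetical stand-ins for the real config classes.

```python
def class_to_dict(obj):
    # Values without a __dict__ (numbers, strings, ...) are returned unchanged.
    if not hasattr(obj, "__dict__"):
        return obj
    result = {}
    for key in dir(obj):
        if key.startswith("_"):
            continue
        val = getattr(obj, key)
        if isinstance(val, list):
            element = [class_to_dict(item) for item in val]
        else:
            element = class_to_dict(val)
        result[key] = element
    return result


class DemoCfg:
    seed = 1

    class runner:
        max_iterations = 100
        experiment_name = "demo"

    class policy:
        hidden_dims = [64, 32]


cfg_dict = class_to_dict(DemoCfg)
print(cfg_dict)
```

# Nested classes become nested dicts, which is what allows code such as
# train_cfg_dict["policy"] to be unpacked with ** into a constructor.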

def update_class_from_dict(obj, dict):
    for key, val in dict.items():
        attr = getattr(obj, key, None)
        if isinstance(attr, type):
            update_class_from_dict(attr, val)
        else:
            setattr(obj, key, val)
    return
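
# Illustrative sketch (not part of the repository): update_class_from_dict
# recurses only into attributes that are themselves classes and overwrites
# everything else. The function is copied here (with the second parameter
# renamed to `values`) so the snippet runs on its own; `DemoCfg` is a
# hypothetical config.

```python
def update_class_from_dict(obj, values):
    for key, val in values.items():
        attr = getattr(obj, key, None)
        if isinstance(attr, type):
            # Nested config class: descend instead of replacing it with a dict.
            update_class_from_dict(attr, val)
        else:
            setattr(obj, key, val)


class DemoCfg:
    seed = 1

    class runner:
        max_iterations = 100
        run_name = ""


update_class_from_dict(DemoCfg, {"seed": 7, "runner": {"max_iterations": 5}})
print(DemoCfg.seed, DemoCfg.runner.max_iterations, repr(DemoCfg.runner.run_name))
```

# Keys that are absent from the dict (run_name here) keep their old values.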

def set_seed(seed):
    if seed == -1:
        seed = np.random.randint(0, 10000)
    print("Setting seed: {}".format(seed))

    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # For cudnn backend to ensure reproducibility
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

def parse_sim_params(args, cfg):
    # code from Isaac Gym Preview 2
    # initialize sim params
    sim_params = gymapi.SimParams()

    # set some values from args
    if args.physics_engine == gymapi.SIM_FLEX:
        if args.device != "cpu":
            print("WARNING: Using Flex with GPU instead of PHYSX!")
    elif args.physics_engine == gymapi.SIM_PHYSX:
        sim_params.physx.use_gpu = args.use_gpu
        sim_params.physx.num_subscenes = args.subscenes
    sim_params.use_gpu_pipeline = args.use_gpu_pipeline

    # if sim options are provided in cfg, parse them and update/override above:
    if "sim" in cfg:
        gymutil.parse_sim_config(cfg["sim"], sim_params)

    # Override num_threads if passed on the command line
    if args.physics_engine == gymapi.SIM_PHYSX and args.num_threads > 0:
        sim_params.physx.num_threads = args.num_threads

    return sim_params

def get_load_path(root, load_run=-1, checkpoint=-1):
    try:
        runs = os.listdir(root)
        runs.sort()
        if "exported" in runs:
            runs.remove("exported")
        last_run = os.path.join(root, runs[-1])
    except (OSError, IndexError):
        # missing/unreadable directory, or no runs inside it
        raise ValueError("No runs in this directory: " + root)
    if load_run == -1:
        load_run = last_run
    else:
        load_run = os.path.join(root, load_run)

    if checkpoint == -1:
        models = [file for file in os.listdir(load_run) if "model" in file]
        # left-pad names with zeros so that e.g. model_900.pt sorts before model_1000.pt
        models.sort(key=lambda m: "{0:0>15}".format(m))
        model = models[-1]
    else:
        model = "model_{}.pt".format(checkpoint)

    load_path = os.path.join(load_run, model)
    return load_path
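
# Illustrative sketch (not part of the repository): why get_load_path sorts
# checkpoint names with the zero-padding key instead of a plain sort. The
# file names and the helper `latest_checkpoint` are made up for the demo.

```python
def latest_checkpoint(names):
    """Pick the newest checkpoint the same way get_load_path does."""
    models = [n for n in names if "model" in n]
    # Left-pad every name to 15 characters with "0": shorter names (smaller
    # iteration numbers) then compare as smaller strings.
    models.sort(key=lambda m: "{0:0>15}".format(m))
    return models[-1]


names = ["model_900.pt", "model_1000.pt", "model_50.pt", "events.out"]
plain_last = sorted(n for n in names if "model" in n)[-1]
padded_last = latest_checkpoint(names)
print(plain_last)   # lexicographic order puts "model_900.pt" last
print(padded_last)  # padded order puts "model_1000.pt" last
```

# The padding width of 15 is taken from the code above; names longer than
# that are compared unpadded, so the trick only holds for short file names.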

def update_cfg_from_args(env_cfg, cfg_train, args):
    # seed
    if env_cfg is not None:
        # num envs
        if args.num_envs is not None:
            env_cfg.env.num_envs = args.num_envs
    if cfg_train is not None:
        if args.seed is not None:
            cfg_train.seed = args.seed
        # alg runner parameters
        if args.max_iterations is not None:
            cfg_train.runner.max_iterations = args.max_iterations
        if args.resume:
            cfg_train.runner.resume = args.resume
        if args.experiment_name is not None:
            cfg_train.runner.experiment_name = args.experiment_name
        if args.run_name is not None:
            cfg_train.runner.run_name = args.run_name
        if args.load_run is not None:
            cfg_train.runner.load_run = args.load_run
        if args.checkpoint is not None:
            cfg_train.runner.checkpoint = args.checkpoint

    return env_cfg, cfg_train
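
# Illustrative sketch (not part of the repository): the override rule used
# by update_cfg_from_args, shown on stand-in objects. A command-line value
# replaces the config value only when it was actually passed (is not None);
# `resume` is a store_true flag, so it is only ever switched on. The helper
# `apply_runner_overrides` and the SimpleNamespace objects are hypothetical.

```python
from types import SimpleNamespace


def apply_runner_overrides(runner_cfg, args):
    """Copy CLI values onto runner_cfg, skipping options that were not passed."""
    for name in ("max_iterations", "experiment_name", "run_name",
                 "load_run", "checkpoint"):
        value = getattr(args, name)
        if value is not None:
            setattr(runner_cfg, name, value)
    if args.resume:
        runner_cfg.resume = True
    return runner_cfg


runner_cfg = SimpleNamespace(max_iterations=3000, experiment_name="demo",
                             run_name="", load_run=-1, checkpoint=-1,
                             resume=False)
args = SimpleNamespace(max_iterations=10, experiment_name=None, run_name=None,
                       load_run=None, checkpoint=None, resume=True)
apply_runner_overrides(runner_cfg, args)
print(runner_cfg.max_iterations, runner_cfg.experiment_name, runner_cfg.resume)
```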

def get_args():
    custom_parameters = [
        {
            "name": "--task",
            "type": str,
            "default": "XBotL_free",
            "help": "Name of the registered task to train or run.",
        },
        {
            "name": "--resume",
            "action": "store_true",
            "default": False,
            "help": "Resume training from a checkpoint",
        },
        {
            "name": "--experiment_name",
            "type": str,
            "help": "Name of the experiment to run or load. Overrides config file if provided.",
        },
        {
            "name": "--run_name",
            "type": str,
            "help": "Name of the run. Overrides config file if provided.",
        },
        {
            "name": "--load_run",
            "type": str,
            "help": "Name of the run to load when resume=True. If -1: will load the last run. Overrides config file if provided.",
        },
        {
            "name": "--checkpoint",
            "type": int,
            "help": "Saved model checkpoint number. If -1: will load the last checkpoint. Overrides config file if provided.",
        },
        {
            "name": "--headless",
            "action": "store_true",
            "default": False,
            "help": "Force display off at all times",
        },
        {
            "name": "--horovod",
            "action": "store_true",
            "default": False,
            "help": "Use horovod for multi-gpu training",
        },
        {
            "name": "--rl_device",
            "type": str,
            "default": "cuda:0",
            "help": "Device used by the RL algorithm, (cpu, gpu, cuda:0, cuda:1 etc..)",
        },
        {
            "name": "--num_envs",
            "type": int,
            "help": "Number of environments to create. Overrides config file if provided.",
        },
        {
            "name": "--seed",
            "type": int,
            "help": "Random seed. Overrides config file if provided.",
        },
        {
            "name": "--max_iterations",
            "type": int,
            "help": "Maximum number of training iterations. Overrides config file if provided.",
        },
    ]
    # parse arguments
    args = gymutil.parse_arguments(
        description="RL Policy", custom_parameters=custom_parameters
    )

    # name alignment
    args.sim_device_id = args.compute_device_id
    args.sim_device = args.sim_device_type
    if args.sim_device == "cuda":
        args.sim_device += f":{args.sim_device_id}"
    return args

def export_policy_as_jit(actor_critic, path):
    os.makedirs(path, exist_ok=True)
    path = os.path.join(path, "policy_1.pt")
    model = copy.deepcopy(actor_critic.actor).to("cpu")
    traced_script_module = torch.jit.script(model)
    traced_script_module.save(path)
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
import torch
from torch import Tensor
import numpy as np
from isaacgym.torch_utils import quat_apply, normalize
from typing import Tuple
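
# @ torch.jit.script
def quat_apply_yaw(quat, vec):
    # keep only the z and w components (columns 2 and 3) of the quaternion, then renormalize
    quat_yaw = quat.clone().view(-1, 4)
    quat_yaw[:, :2] = 0.
    quat_yaw = normalize(quat_yaw)
    return quat_apply(quat_yaw, vec)


# @ torch.jit.script
def wrap_to_pi(angles):
    # wraps angles into (-pi, pi]; note that the input tensor is modified in place
    angles %= 2*np.pi
    angles -= 2*np.pi * (angles > np.pi)
    return angles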
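
# Illustrative sketch (not part of the repository): the same wrapping rule
# as wrap_to_pi, written for a single float with the standard library so it
# can be checked without torch. `wrap_to_pi_scalar` is a hypothetical name.

```python
import math


def wrap_to_pi_scalar(angle):
    """Map an angle in radians into the interval (-pi, pi]."""
    angle %= 2 * math.pi          # first bring it into [0, 2*pi)
    if angle > math.pi:           # then shift the upper half down by one turn
        angle -= 2 * math.pi
    return angle


for a in (0.25 * math.pi, 1.5 * math.pi, -0.5 * math.pi, 5 * math.pi):
    print(a, "->", wrap_to_pi_scalar(a))
```

# Unlike the tensor version above, this returns a new value; the tensor
# version uses in-place operators and therefore also changes its argument.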

# @ torch.jit.script
def torch_rand_sqrt_float(lower, upper, shape, device):
    # type: (float, float, Tuple[int, int], str) -> Tensor
    r = 2*torch.rand(*shape, device=device) - 1
    r = torch.where(r < 0., -torch.sqrt(-r), torch.sqrt(r))
    r = (r + 1.) / 2.
    return (upper - lower) * r + lower
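
# Illustrative sketch (not part of the repository): the sampling rule of
# torch_rand_sqrt_float for a single value, using only the standard library.
# Taking a signed square root of a uniform sample in [-1, 1) pushes values
# away from the middle of the range and toward `lower` and `upper`. The
# helper name `rand_sqrt_float` and the sample count are hypothetical.

```python
import math
import random


def rand_sqrt_float(lower, upper, rng):
    r = 2.0 * rng.random() - 1.0             # uniform in [-1, 1)
    r = math.copysign(math.sqrt(abs(r)), r)  # signed square root, still in [-1, 1]
    r = (r + 1.0) / 2.0                      # rescale to [0, 1]
    return (upper - lower) * r + lower


rng = random.Random(0)
samples = [rand_sqrt_float(-1.0, 1.0, rng) for _ in range(10000)]
mean_abs = sum(abs(s) for s in samples) / len(samples)
print(mean_abs)  # a plain uniform sample would average about 0.5 here
```

# For this rule the expected distance from the midpoint is 2/3 of the
# half-range, compared with 1/2 for a uniform draw.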
# SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2021 ETH Zurich, Nikita Rudin
# SPDX-FileCopyrightText: Copyright (c) 2024 Beijing RobotEra TECHNOLOGY CO.,LTD. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Copyright (c) 2024, AgiBot Inc. All rights reserved.
import os
from typing import Tuple
from datetime import datetime
from humanoid.algo import VecEnv
from humanoid.algo import DHOnPolicyRunner
from humanoid import LEGGED_GYM_ROOT_DIR, LEGGED_GYM_ENVS_DIR
from .helpers import get_args, update_cfg_from_args, class_to_dict, get_load_path, set_seed, parse_sim_params
from humanoid.envs.base.legged_robot_config import LeggedRobotCfg, LeggedRobotCfgPPO

class TaskRegistry():
    def __init__(self):
        self.task_classes = {}
        self.env_cfgs = {}
        self.train_cfgs = {}

    def register(self, name: str, task_class: VecEnv, env_cfg: LeggedRobotCfg, train_cfg: LeggedRobotCfgPPO):
        self.task_classes[name] = task_class
        self.env_cfgs[name] = env_cfg
        self.train_cfgs[name] = train_cfg

    def get_task_class(self, name: str) -> VecEnv:
        return self.task_classes[name]

    def get_cfgs(self, name) -> Tuple[LeggedRobotCfg, LeggedRobotCfgPPO]:
        train_cfg = self.train_cfgs[name]
        env_cfg = self.env_cfgs[name]
        # copy seed
        env_cfg.seed = train_cfg.seed
        return env_cfg, train_cfg

    def make_env(self, name, args=None, env_cfg=None) -> Tuple[VecEnv, LeggedRobotCfg]:
        """ Creates an environment either from a registered name or from the provided config file.

        Args:
            name (string): Name of a registered env.
            args (Args, optional): Isaac Gym command line arguments. If None, get_args() will be called. Defaults to None.
            env_cfg (Dict, optional): Environment config file used to override the registered config. Defaults to None.

        Raises:
            ValueError: Error if no registered env corresponds to 'name'

        Returns:
            isaacgym.VecTaskPython: The created environment
            Dict: the corresponding config file
        """
        # if no args passed get command line arguments
        if args is None:
            args = get_args()
        # check if there is a registered env with that name
        if name in self.task_classes:
            task_class = self.get_task_class(name)
        else:
            raise ValueError(f"Task with name: {name} was not registered")
        if env_cfg is None:
            # load config files
            env_cfg, _ = self.get_cfgs(name)
        # override cfg from args (if specified)
        env_cfg, _ = update_cfg_from_args(env_cfg, None, args)
        set_seed(env_cfg.seed)
        # parse sim params (convert to dict first)
        sim_params = {"sim": class_to_dict(env_cfg.sim)}
        sim_params = parse_sim_params(args, sim_params)
        env = task_class(cfg=env_cfg,
                         sim_params=sim_params,
                         physics_engine=args.physics_engine,
                         sim_device=args.sim_device,
                         headless=args.headless)
        self.env_cfg_for_wandb = env_cfg
        return env, env_cfg

    def make_alg_runner(self, env, name=None, args=None, train_cfg=None, log_root="default") -> Tuple[DHOnPolicyRunner, LeggedRobotCfgPPO, str]:
        """ Creates the training algorithm either from a registered name or from the provided config file.

        Args:
            env (isaacgym.VecTaskPython): The environment to train (TODO: remove from within the algorithm)
            name (string, optional): Name of a registered env. If None, the config file will be used instead. Defaults to None.
            args (Args, optional): Isaac Gym command line arguments. If None, get_args() will be called. Defaults to None.
            train_cfg (Dict, optional): Training config file. If None, 'name' will be used to get the config file. Defaults to None.
            log_root (str, optional): Logging directory for Tensorboard. Set to None to avoid logging (at test time for example).
                Logs will be saved in <log_root>/<date_time><run_name>. Defaults to "default"=<path_to_LEGGED_GYM>/logs/<experiment_name>/exported_data.

        Raises:
            ValueError: Error if neither 'name' nor 'train_cfg' is provided
            Warning: If both 'name' and 'train_cfg' are provided, 'name' is ignored

        Returns:
            DHOnPolicyRunner: The created runner
            LeggedRobotCfgPPO: the corresponding training config
            str: the log directory (None if log_root is None)
        """
        # if no args passed get command line arguments
        if args is None:
            args = get_args()
        # if config files are passed use them, otherwise load from the name
        if train_cfg is None:
            if name is None:
                raise ValueError("Either 'name' or 'train_cfg' must be not None")
            # load config files
            _, train_cfg = self.get_cfgs(name)
        else:
            if name is not None:
                print(f"'train_cfg' provided -> Ignoring 'name={name}'")
        # override cfg from args (if specified)
        _, train_cfg = update_cfg_from_args(None, train_cfg, args)

        current_date_time_str = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
        if log_root == "default":
            log_root = os.path.join(LEGGED_GYM_ROOT_DIR, 'logs', train_cfg.runner.experiment_name, 'exported_data')
            log_dir = os.path.join(log_root, current_date_time_str + train_cfg.runner.run_name)
        elif log_root is None:
            log_dir = None
        else:
            log_dir = os.path.join(log_root, current_date_time_str + train_cfg.runner.run_name)

        train_cfg_dict = class_to_dict(train_cfg)
        # env_cfg_for_wandb is set by make_env(), so make_env() must be called first
        env_cfg_dict = class_to_dict(self.env_cfg_for_wandb)
        all_cfg = {**train_cfg_dict, **env_cfg_dict}

        runner_class = eval(train_cfg_dict["runner_class_name"])
        runner = runner_class(env, all_cfg, log_dir, device=args.rl_device)
        # save resume path before creating a new log_dir
        resume = train_cfg.runner.resume
        if resume:
            # load previously trained model
            resume_path = get_load_path(log_root, load_run=train_cfg.runner.load_run, checkpoint=train_cfg.runner.checkpoint)
            print(f"Loading model from: {resume_path}")
            runner.load(resume_path, load_optimizer=False)
        return runner, train_cfg, log_dir

# make global task registry
task_registry = TaskRegistry()
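
The three `log_root` cases handled in `make_alg_runner` can be easier to reason about in isolation. The following is a minimal sketch, not part of the repository: `resolve_log_dir` is a hypothetical helper, and `root_dir` stands in for `LEGGED_GYM_ROOT_DIR`. It assumes, as in the code above, that the timestamp and `run_name` are concatenated without a separator.

```python
import os
from datetime import datetime
from typing import Optional


def resolve_log_dir(log_root: Optional[str], root_dir: str, experiment_name: str,
                    run_name: str, now: datetime) -> Optional[str]:
    """Hypothetical helper mirroring the log_root branches of make_alg_runner."""
    if log_root is None:
        # logging disabled (for example at test time)
        return None
    if log_root == "default":
        # default location: <root_dir>/logs/<experiment_name>/exported_data
        log_root = os.path.join(root_dir, "logs", experiment_name, "exported_data")
    # timestamp and run name are concatenated directly, with no separator
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return os.path.join(log_root, stamp + run_name)


if __name__ == "__main__":
    fixed = datetime(2024, 1, 2, 3, 4, 5)
    print(resolve_log_dir("default", "/repo", "x1", "_walk", fixed))
    print(resolve_log_dir("/tmp/runs", "/repo", "x1", "_walk", fixed))
    print(resolve_log_dir(None, "/repo", "x1", "_walk", fixed))
```

Passing the timestamp in as `now` rather than calling `datetime.now()` inside the helper is a choice made here only so the result is reproducible in a test.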
\ No newline at end of file
<mujoco model="flat">
  <statistic center="0 0 0.55" extent="1.1"/>
  <visual>
    <headlight diffuse="0.6 0.6 0.6" ambient="0.3 0.3 0.3" specular="0 0 0"/>
    <rgba haze="0.15 0.25 0.35 1"/>
    <global azimuth="150" elevation="-20"/>
  </visual>
  <visual>
    <rgba com="0.502 1.0 0 0.5" contactforce="0.98 0.4 0.4 1" contactpoint="1.0 1.0 0.6 0.4"/>
    <scale com="1" forcewidth="0.03" contactwidth="0.01" contactheight="0.02" framewidth="0.05" framelength="0.6"/>
    <map force="0.005"/>
  </visual>
  <asset>
    <texture name="skybox" type="skybox" builtin="gradient" rgb1="0.2 0.3 0.4" rgb2="0 0 0" width="1000" height="1000" mark="random" random="0.001" markrgb="1 1 1"/>
    <texture type="2d" name="groundplane" builtin="checker" mark="edge" rgb1="0.2 0.3 0.4" rgb2="0.1 0.2 0.3" markrgb="0.8 0.8 0.8" width="1000" height="1000"/>
    <material name="groundplane" texture="groundplane" texuniform="true" texrepeat="5 5" reflectance="0.2"/>
  </asset>
  <worldbody>
    <light pos="0 0 10" dir="0 0 -1" directional="true"/>
    <geom name="floor" size="0 3 .125" type="plane" material="groundplane" conaffinity="7" condim="3" friction="1"/>
  </worldbody>
</mujoco>
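
A quick way to sanity-check a scene file like the one above without launching a simulator is to parse it with Python's standard library. The sketch below is illustrative only: `floor_summary` is a hypothetical helper, and `FLAT_SCENE_XML` is a trimmed copy of the scene (only the `asset` and `worldbody` parts). It checks that the floor geom exists and that the material it references is defined; it does not validate the file against MuJoCo itself.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the flat scene above (assets and worldbody only).
FLAT_SCENE_XML = """
<mujoco model="flat">
  <asset>
    <texture type="2d" name="groundplane" builtin="checker" mark="edge" rgb1="0.2 0.3 0.4" rgb2="0.1 0.2 0.3" markrgb="0.8 0.8 0.8" width="1000" height="1000"/>
    <material name="groundplane" texture="groundplane" texuniform="true" texrepeat="5 5" reflectance="0.2"/>
  </asset>
  <worldbody>
    <light pos="0 0 10" dir="0 0 -1" directional="true"/>
    <geom name="floor" size="0 3 .125" type="plane" material="groundplane" conaffinity="7" condim="3" friction="1"/>
  </worldbody>
</mujoco>
"""


def floor_summary(xml_text):
    """Hypothetical helper: summarize the 'floor' geom of a MuJoCo scene."""
    root = ET.fromstring(xml_text)
    floor = root.find("./worldbody/geom[@name='floor']")
    if floor is None:
        raise ValueError("scene has no geom named 'floor'")
    # collect the names of all materials declared under <asset>
    materials = {m.get("name") for m in root.findall("./asset/material")}
    return {
        "model": root.get("model"),
        "type": floor.get("type"),
        "size": [float(v) for v in floor.get("size").split()],
        "material_defined": floor.get("material") in materials,
    }


if __name__ == "__main__":
    print(floor_summary(FLAT_SCENE_XML))
```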