# end_to_end_planning module

The *end_to_end_planning* module contains the *EndToEndPlanningRLLearner* class, which inherits from the abstract
class *LearnerRL*.

### Class EndToEndPlanningRLLearner
Bases: `engine.learners.LearnerRL`

The *EndToEndPlanningRLLearner* is an agent that can be used to train quadrotor robots equipped with a depth sensor to
follow a provided trajectory while avoiding obstacles.

The [EndToEndPlanningRLLearner](/src/opendr/planning/end_to_end_planning/e2e_planning_learner.py) class has the
following public methods:

#### `EndToEndPlanningRLLearner` constructor

Constructor parameters:

- **env**: *gym.Env*\
  Reinforcement learning environment to train or evaluate the agent on.
- **lr**: *float, default=3e-4*\
  Specifies the initial learning rate to be used during training.
- **n_steps**: *int, default=1024*\
  Specifies the number of steps to run in the environment per update.
- **iters**: *int, default=5e4*\
  Specifies the total number of steps the training should run for.
- **batch_size**: *int, default=64*\
  Specifies the batch size during training.
- **checkpoint_after_iter**: *int, default=500*\
  Specifies after how many training steps a checkpoint is saved.
- **temp_path**: *str, default=''*\
  Specifies a path where the algorithm stores log files and saves checkpoints.
- **device**: *{'cpu', 'cuda'}, default='cuda'*\
  Specifies the device to be used.
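
A minimal construction sketch (the `AgiEnv` simulation environment ships with this module and is used in the examples below; the `temp_path` value is only a placeholder):

```python
from opendr.planning.end_to_end_planning import EndToEndPlanningRLLearner, AgiEnv

# AgiEnv requires the simulation setup described later in this document.
env = AgiEnv()

# The keyword values below are the documented defaults, spelled out for clarity;
# temp_path is a placeholder scratch directory.
learner = EndToEndPlanningRLLearner(
    env,
    lr=3e-4,
    n_steps=1024,
    batch_size=64,
    checkpoint_after_iter=500,
    temp_path='./end_to_end_planning_tmp',
    device='cuda',
)
```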

#### `EndToEndPlanningRLLearner.fit`
```python
EndToEndPlanningRLLearner.fit(self, env, logging_path, silent, verbose)
```

Trains the agent on the environment.

Parameters:

- **env**: *gym.Env, default=None*\
  If specified, this environment is used for training.
- **logging_path**: *str, default=''*\
  Path for logging and checkpointing.
- **silent**: *bool, default=False*\
  Disables verbosity.
- **verbose**: *bool, default=True*\
  Enables verbosity.
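
A minimal training call, sketched under the assumption that `learner` and `env` were created as above (the logging path is a placeholder):

```python
# Train on the given environment and write logs and checkpoints
# under the provided path.
learner.fit(env=env, logging_path='./end_to_end_planning_tmp', silent=False, verbose=True)
```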

#### `EndToEndPlanningRLLearner.eval`
```python
EndToEndPlanningRLLearner.eval(self, env)
```
Evaluates the agent on the specified environment.

Parameters:

- **env**: *gym.Env, default=None*\
  Environment to evaluate on.
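
A sketch of an evaluation call; the structure of the returned value is not documented above, so it is simply captured and printed here:

```python
# Evaluate the current policy on the environment and inspect
# whatever statistics the method returns.
results = learner.eval(env)
print(results)
```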

#### `EndToEndPlanningRLLearner.save`
```python
EndToEndPlanningRLLearner.save(self, path)
```
Saves the model in the path provided.

Parameters:

- **path**: *str*\
  Path to save the model, including the filename.

#### `EndToEndPlanningRLLearner.load`
```python
EndToEndPlanningRLLearner.load(self, path)
```
Loads a model from the path provided.

Parameters:

- **path**: *str*\
  Path of the model to be loaded.
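
A save/load round trip might look as follows (sketch; the file path is a placeholder, chosen as a `.zip` to match the pretrained model shipped with the module):

```python
# Persist the trained policy, then restore it into a fresh learner.
learner.save('./end_to_end_planning_tmp/saved_model.zip')

restored = EndToEndPlanningRLLearner(env)
restored.load('./end_to_end_planning_tmp/saved_model.zip')
```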

#### `EndToEndPlanningRLLearner.infer`
```python
EndToEndPlanningRLLearner.infer(self, batch, deterministic)
```
Performs inference on a single observation or a list of observations.

Parameters:

- **batch**: *dict or list of dict, default=None*\
  Single observation or list of observations.
- **deterministic**: *bool, default=True*\
  Whether to use deterministic actions from the policy.
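
A single-observation inference sketch; the `(action, _states)` return shape follows the pretrained-model example later in this document:

```python
# Get one observation from the environment and query the policy
# deterministically for the corresponding action.
obs = env.reset()
action, _states = learner.infer(obs, deterministic=True)
```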

### Simulation environment setup

The environment includes an Ardupilot-controlled quadrotor in a Webots simulation.
Installation instructions for Ardupilot are available [here](https://github.com/ArduPilot/ardupilot).

The files required to complete the Ardupilot setup can be downloaded by running the [`download_ardupilot_files.py`](src/opendr/planning/end_to_end_planning/download_ardupilot_files.py) script.
The downloaded files (zipped as `ardupilot.zip`) should replace the corresponding files under the Ardupilot installation.
In order to run Ardupilot in Webots 2021a, the controller code should be replaced. (For older versions of Webots, these files can be skipped.)
The world file for the environment, used for both training and testing, is provided under `/ardupilot/libraries/SITL/examples/webots/worlds/`.

Install the `mavros` package for ROS communication with Ardupilot.
Instructions are available [here](https://github.com/mavlink/mavros/blob/master/mavros/README.md#installation).
Source installation is recommended.

### Running the environment

The following steps should be executed to establish ROS communication between the Gym environment and the simulation.
- Start Webots and open the provided world file.
The simulation time should stop at the first time step and wait for the Ardupilot software to run.
- Run the following script from the Ardupilot directory: `./libraries/SITL/examples/Webots/dronePlus.sh`, which starts software-in-the-loop execution of the Ardupilot software.
- Run `roscore`.
- Run `roslaunch mavros apm.launch`, which creates the ROS communication for Ardupilot.
- Run the following ROS nodes in `src/opendr/planning/end_to_end_planning/src`:
  - `children_robot`, which activates the required sensors on the quadrotor and creates the ROS communication for them.
  - `take_off`, which makes the quadrotor take off.
  - `range_image`, which converts the depth image into array format to be used as input to the learner.

After these steps, the [AgiEnv](src/opendr/planning/end_to_end_planning/envs/agi_env.py) gym environment can send action commands to the simulated drone and receive depth images and pose information from the simulation.

### Examples

Training in the Webots environment:

```python
from opendr.planning.end_to_end_planning import EndToEndPlanningRLLearner, AgiEnv

env = AgiEnv()
learner = EndToEndPlanningRLLearner(env, n_steps=1024)
learner.fit(logging_path='./end_to_end_planning_tmp')
```

Running a pretrained model:

```python
from opendr.planning.end_to_end_planning import EndToEndPlanningRLLearner, AgiEnv

env = AgiEnv()
learner = EndToEndPlanningRLLearner(env)
learner.load('{$OPENDR_HOME}/src/opendr/planning/end_to_end_planning/pretrained_model/saved_model.zip')
obs = env.reset()
sum_of_rew = 0
number_of_timesteps = 20
for i in range(number_of_timesteps):
    action, _states = learner.infer(obs, deterministic=True)
    obs, rewards, dones, info = env.step(action)
    sum_of_rew += rewards
    if dones:
        obs = env.reset()
print("Reward collected is:", sum_of_rew)
```

### Performance Evaluation

TABLE 1: Speed (FPS) and energy consumption for inference on various platforms.

|                 | TX2   | Xavier | RTX 2080 Ti |
| --------------- | ----- | ------ | ----------- |
| FPS Evaluation  | 153.5 | 201.6  | 973.6       |
| Energy (Joules) | 0.12  | 0.051  | \-          |

TABLE 2: Platform compatibility evaluation.

| Platform                                     | Test results |
| -------------------------------------------- | ------------ |
| x86 - Ubuntu 20.04 (bare installation - CPU) | Pass         |
| x86 - Ubuntu 20.04 (bare installation - GPU) | Pass         |
| x86 - Ubuntu 20.04 (pip installation)        | Pass         |
| x86 - Ubuntu 20.04 (CPU docker)              | Pass         |
| x86 - Ubuntu 20.04 (GPU docker)              | Pass         |
| NVIDIA Jetson TX2                            | Pass         |
| NVIDIA Jetson Xavier AGX                     | Pass         |