Commit cdd2142

implicitron v0 (#1133)
Co-authored-by: Jeremy Francis Reizenstein <[email protected]>
1 parent 0e377c6 commit cdd2142

90 files changed

+17075
-0
lines changed

+276
@@ -0,0 +1,276 @@
# Introduction

Implicitron is a PyTorch3D-based framework for new-view synthesis via neural-network-based representations.

# License

Implicitron is distributed as part of PyTorch3D under the [BSD license](https://github.com/facebookresearch/pytorch3d/blob/main/LICENSE).
It includes code from the [SRN](http://github.com/vsitzmann/scene-representation-networks) and [IDR](http://github.com/lioryariv/idr) repos.
See [LICENSE-3RD-PARTY](https://github.com/facebookresearch/pytorch3d/blob/main/LICENSE-3RD-PARTY) for their licenses.

# Installation

There are three ways to set up Implicitron, depending on the level of flexibility required.
If you only want to train or evaluate models as they are implemented, changing only the parameters, you can just install the package.
Implicitron also provides a flexible API that supports user-defined plug-ins;
if you want to re-implement some of the components without changing the high-level pipeline, you need to create a custom launcher script.
The most flexible option, though, is cloning the PyTorch3D repo and building it from source, which allows changing the code in arbitrary ways.
Below, we describe all three options in more detail.

## [Option 1] Running an executable from the package

This option allows you to use the code as is, without changing the implementations.
Only the configuration can be changed (see [Configuration system](#configuration-system)).

For this setup, install the dependencies and PyTorch3D from conda following [the guide](https://github.com/facebookresearch/pytorch3d/blob/master/INSTALL.md#1-install-with-cuda-support-from-anaconda-cloud-on-linux-only). Then, install the Implicitron-specific dependencies:

```shell
pip install "hydra-core>=1.1" visdom lpips matplotlib
```

The runner executable is available as the `pytorch3d_implicitron_runner` shell command.
See the [Running](#running) section below for examples of training and evaluation commands.

## [Option 2] Supporting custom implementations

To plug in custom implementations, for example of the renderer or implicit-function protocols, you need to create your own runner script and import the plug-in implementations there.
First, install PyTorch3D and the Implicitron dependencies as described in the previous section.
Then, implement the custom script; copying `pytorch3d/projects/implicitron_trainer/experiment.py` is a good place to start.
See [Custom plugins](#custom-plugins) for more information on how to import implementations and enable them in the configs.

## [Option 3] Cloning PyTorch3D repo

This is the most flexible way to set up Implicitron, as it allows changing the code directly.
For example, you can modify the high-level rendering pipeline or implement loss functions that are not yet supported.
Please follow the instructions to [install PyTorch3D from a local clone](https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md#2-install-from-a-local-clone).
Then, install the Implicitron-specific dependencies:

```shell
pip install "hydra-core>=1.1" visdom lpips matplotlib
```

You are still encouraged to implement custom plugins as above where possible, as it makes reusing the code easier.
The executable is located in `pytorch3d/projects/implicitron_trainer`.

# Running

This section assumes that you use the executable provided by the installed package.
If you have a custom `experiment.py` script (as in Option 2 above), replace the executable with the path to your script.

## Training

To run training, pass a yaml config file, followed by a list of overridden arguments.
For example, to train NeRF on the first skateboard sequence from the CO3D dataset, you can run:
```shell
pytorch3d_implicitron_runner --config-path ./configs/ --config-name repro_singleseq_nerf dataset_args.dataset_root=<DATASET_ROOT> dataset_args.category='skateboard' dataset_args.test_restrict_sequence_id=0 test_when_finished=True exp_dir=<CHECKPOINT_DIR>
```

Here, `--config-path` points to the config directory relative to the `pytorch3d_implicitron_runner` location;
`--config-name` picks the config (in this case, `repro_singleseq_nerf.yaml`);
`test_when_finished` launches the evaluation script once training is finished.
Replace `<DATASET_ROOT>` with the location where the dataset in Implicitron format is stored
and `<CHECKPOINT_DIR>` with a directory where checkpoints will be dumped during training.
Other configuration parameters can be overridden in the same way.
See the [Configuration system](#configuration-system) section for more information.
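
When launching many runs, it can be convenient to assemble the override list programmatically. The following is only a sketch: `build_command` is a hypothetical helper that is not part of Implicitron, and the two paths are placeholders. It builds the same kind of command as the training example from a dictionary of overrides.

```python
import shlex


def build_command(config_name, overrides, config_path="./configs/"):
    """Assemble a pytorch3d_implicitron_runner call (hypothetical helper)."""
    cmd = [
        "pytorch3d_implicitron_runner",
        "--config-path", config_path,
        "--config-name", config_name,
    ]
    # Each override becomes one `dotted.key=value` argument.
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


cmd = build_command(
    "repro_singleseq_nerf",
    {
        "dataset_args.dataset_root": "/path/to/dataset",  # placeholder
        "dataset_args.category": "skateboard",
        "dataset_args.test_restrict_sequence_id": 0,
        "test_when_finished": True,
        "exp_dir": "/path/to/checkpoints",  # placeholder
    },
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to launch the run.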

## Evaluation

To run evaluation on the latest checkpoint after (or during) training, simply add `eval_only=True` to your training command.

For example, to evaluate on the NeRF skateboard sequence, you can run:
```shell
pytorch3d_implicitron_runner --config-path ./configs/ --config-name repro_singleseq_nerf dataset_args.dataset_root=<DATASET_ROOT> dataset_args.category='skateboard' dataset_args.test_restrict_sequence_id=0 exp_dir=<CHECKPOINT_DIR> eval_only=True
```
Evaluation prints the metrics to `stdout` and dumps them to a json file in `exp_dir`.

## Visualisation

The visualisation script renders a video from a trained model along a pre-defined camera trajectory.
In order for it to work, `ffmpeg` needs to be installed:

```shell
conda install ffmpeg
```

Here is an example of calling the script:
```shell
projects/implicitron_trainer/visualize_reconstruction.py exp_dir=<CHECKPOINT_DIR> visdom_show_preds=True n_eval_cameras=40 render_size="[64,64]" video_size="[256,256]"
```

The argument `n_eval_cameras` sets the number of rendering viewpoints sampled on a trajectory, which defaults to a circular fly-around;
`render_size` sets the size of a render passed to the model, which can be resized to `video_size` before writing.

Rendered videos of images, masks, and depth maps will be saved to `<CHECKPOINT_DIR>/vis`.
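
To illustrate what sampling `n_eval_cameras` viewpoints on a circular fly-around means, here is a minimal sketch in plain Python. It is not the code used by `visualize_reconstruction.py`: the function name, the radius, the height, and the choice of z as the "up" axis are all assumptions made for this example.

```python
import math


def circular_camera_positions(n_eval_cameras, radius=1.0, height=0.0):
    """Return n_eval_cameras (x, y, z) positions evenly spaced on a circle."""
    positions = []
    for i in range(n_eval_cameras):
        angle = 2.0 * math.pi * i / n_eval_cameras  # evenly spaced azimuth
        positions.append(
            (radius * math.cos(angle), radius * math.sin(angle), height)
        )
    return positions


positions = circular_camera_positions(n_eval_cameras=40, radius=3.0)
print(len(positions))
```

A larger `n_eval_cameras` gives a smoother video at the cost of more renders.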

# Configuration system

We use hydra and OmegaConf to parse the configs.
The config schema and default values are defined by the dataclasses implementing the modules.
More specifically, if a class derives from `Configurable`, its fields can be set in config yaml files or overridden in CLI.
For example, `GenericModel` has a field `render_image_width` with the default value 400.
If it is specified in the yaml config file or in the CLI command, the new value will be used.

Configurables can form hierarchies.
For example, `GenericModel` has a field `raysampler: RaySampler`, which is also Configurable.
In the config, inner parameters can be propagated using the `_args` postfix, e.g. to change `raysampler.n_pts_per_ray_training` (the number of sampled points per ray), the node `raysampler_args.n_pts_per_ray_training` should be specified.
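
The idea that dataclass fields define the schema, with nested members exposed under an `_args` node, can be sketched with standard-library dataclasses. This is a simplified illustration and not how `Configurable` is implemented: the two classes below are stand-ins that only borrow the names from the text, `default_config` is a hypothetical helper, and the default of 64 is chosen for the example.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class RaySampler:  # stand-in, not the Implicitron class
    n_pts_per_ray_training: int = 64


@dataclass
class GenericModel:  # stand-in, not the Implicitron class
    render_image_width: int = 400
    raysampler: RaySampler = field(default_factory=RaySampler)


def default_config(cls):
    """Collect dataclass defaults; nested dataclasses go under `<field>_args`."""
    config = {}
    instance = cls()
    for f in fields(cls):
        value = getattr(instance, f.name)
        if is_dataclass(value):
            config[f"{f.name}_args"] = default_config(type(value))
        else:
            config[f.name] = value
    return config


print(default_config(GenericModel))
# {'render_image_width': 400, 'raysampler_args': {'n_pts_per_ray_training': 64}}
```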

The root of the hierarchy is defined by the `ExperimentConfig` dataclass.
It has top-level fields like `eval_only`, which was used above for running evaluation by adding a CLI override.
Additionally, it has non-leaf nodes like `generic_model_args`, which dispatches the config parameters to `GenericModel`. Thus, changing the model parameters may be achieved in two ways: either by editing the config file, e.g.
```yaml
generic_model_args:
    render_image_width: 800
    raysampler_args:
        n_pts_per_ray_training: 128
```

or, equivalently, by adding the following to `pytorch3d_implicitron_runner` arguments:

```shell
generic_model_args.render_image_width=800 generic_model_args.raysampler_args.n_pts_per_ray_training=128
```

See the documentation in `pytorch3d/implicitron/tools/config.py` for more details.
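
The equivalence between the nested yaml form and the dotted CLI form can be illustrated with a small sketch. It is not Hydra's or OmegaConf's implementation: `apply_override` is a hypothetical helper that handles only `key=value` pairs with integer or string values.

```python
import copy


def apply_override(config, override):
    """Return a copy of `config` with one `dotted.key=value` override applied."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    result = copy.deepcopy(config)
    node = result
    for name in parents:
        node = node.setdefault(name, {})  # walk down, creating nodes if needed
    # Keep integers as int; a real parser handles many more value types.
    node[leaf] = int(raw_value) if raw_value.lstrip("-").isdigit() else raw_value
    return result


config = {
    "generic_model_args": {
        "render_image_width": 400,
        "raysampler_args": {"n_pts_per_ray_training": 64},
    }
}
for item in [
    "generic_model_args.render_image_width=800",
    "generic_model_args.raysampler_args.n_pts_per_ray_training=128",
]:
    config = apply_override(config, item)
print(config)
```

After both overrides are applied, the dictionary has the same nesting and values as the yaml fragment in this section.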

## Replaceable implementations

Sometimes changing the model parameters does not provide enough flexibility, and you want to provide a new implementation for a building block.
The configuration system also supports it!
Abstract classes like `BaseRenderer` derive from `ReplaceableBase` instead of `Configurable`.
This means that other Configurables can refer to them using the base type, while the specific implementation is chosen in the config using the `_class_type`-postfixed node.
In that case, the `_args` node name has to include the implementation type.
More specifically, to change renderer settings, the config will look like this:
```yaml
generic_model_args:
    renderer_class_type: LSTMRenderer
    renderer_LSTMRenderer_args:
        num_raymarch_steps: 10
        hidden_size: 16
```

See the documentation in `pytorch3d/implicitron/tools/config.py` for more details on the configuration system.

## Custom plugins

If you have an idea for another implementation of a replaceable component, it can be plugged in without changing the core code.
For that, you need to set up Implicitron through option 2 or 3 above.
Let's say you want to implement a renderer that accumulates opacities similar to an X-ray machine.
First, create a module `x_ray_renderer.py` with a class deriving from `BaseRenderer`:

```python
import torch
# NOTE: the module path of the renderer base classes is assumed here
from pytorch3d.implicitron.models.renderer.base import (
    BaseRenderer,
    EvaluationMode,
    RendererOutput,
)
from pytorch3d.implicitron.tools.config import registry

@registry.register
class XRayRenderer(BaseRenderer, torch.nn.Module):
    n_pts_per_ray: int = 64

    # if there are other base classes, make sure to call `super().__init__()` explicitly
    def __post_init__(self):
        super().__init__()
        # custom initialization

    def forward(
        self,
        ray_bundle,
        implicit_functions=[],
        evaluation_mode: EvaluationMode = EvaluationMode.EVALUATION,
        **kwargs,
    ) -> RendererOutput:
        ...
```

Please note the `@registry.register` decorator, which registers the plug-in as an implementation of `BaseRenderer`.
IMPORTANT: In order for it to run, the class (or its enclosing module) has to be imported in your launch script. Additionally, this has to be done before parsing the root configuration class `ExperimentConfig`.
Simply add `import x_ray_renderer` (or `from . import x_ray_renderer` if `experiment.py` is part of a package) at the beginning of `experiment.py`.

After that, you should be able to change the config with:
```yaml
generic_model_args:
    renderer_class_type: XRayRenderer
    renderer_XRayRenderer_args:
        n_pts_per_ray: 128
```

to replace the implementation and potentially override the parameters.

# Code and config structure

As described above, the config structure is parsed automatically from the module hierarchy.
In particular, model parameters are contained in the `generic_model_args` node, and dataset parameters in the `dataset_args` node.

Here is the class structure (single-line edges show aggregation, while double lines show available implementations):
```
generic_model_args: GenericModel
└-- sequence_autodecoder_args: Autodecoder
└-- raysampler_args: RaySampler
└-- renderer_*_args: BaseRenderer
    ╘== MultiPassEmissionAbsorptionRenderer
    ╘== LSTMRenderer
    ╘== SignedDistanceFunctionRenderer
        └-- ray_tracer_args: RayTracing
        └-- ray_normal_coloring_network_args: RayNormalColoringNetwork
└-- implicit_function_*_args: ImplicitFunctionBase
    ╘== NeuralRadianceFieldImplicitFunction
    ╘== SRNImplicitFunction
        └-- raymarch_function_args: SRNRaymarchFunction
        └-- pixel_generator_args: SRNPixelGenerator
    ╘== SRNHyperNetImplicitFunction
        └-- hypernet_args: SRNRaymarchHyperNet
        └-- pixel_generator_args: SRNPixelGenerator
    ╘== IdrFeatureField
└-- image_feature_extractor_args: ResNetFeatureExtractor
└-- view_sampler_args: ViewSampler
└-- feature_aggregator_*_args: FeatureAggregatorBase
    ╘== IdentityFeatureAggregator
    ╘== AngleWeightedIdentityFeatureAggregator
    ╘== AngleWeightedReductionFeatureAggregator
    ╘== ReductionFeatureAggregator
solver_args: init_optimizer
dataset_args: dataset_zoo
dataloader_args: dataloader_zoo
```

Please look at the annotations of the respective classes or functions for the lists of hyperparameters.

# Reproducing CO3D experiments

Common Objects in 3D (CO3D) is a large-scale dataset of videos of rigid objects grouped into 50 common categories.
Implicitron provides implementations and config files to reproduce the results from [the paper](https://arxiv.org/abs/2109.00512).
Please follow [the link](https://github.com/facebookresearch/co3d#automatic-batch-download) for the instructions to download the dataset.
In training and evaluation scripts, use the download location as `<DATASET_ROOT>`.
Alternatively, you can define the environment variable `CO3D_DATASET_ROOT` instead of passing the dataset root explicitly.
To reproduce the experiments from the paper, use the following configs. For single-sequence experiments:

| Method          | config file                         |
|-----------------|-------------------------------------|
| NeRF            | repro_singleseq_nerf.yaml           |
| NeRF + WCE      | repro_singleseq_nerf_wce.yaml       |
| NerFormer       | repro_singleseq_nerformer.yaml      |
| IDR             | repro_singleseq_idr.yaml            |
| SRN             | repro_singleseq_srn_noharm.yaml     |
| SRN + γ         | repro_singleseq_srn.yaml            |
| SRN + WCE       | repro_singleseq_srn_wce_noharm.yaml |
| SRN + WCE + γ   | repro_singleseq_srn_wce.yaml        |

For multi-sequence experiments (without generalisation to new sequences):

| Method          | config file                                |
|-----------------|--------------------------------------------|
| NeRF + AD       | repro_multiseq_nerf_ad.yaml                |
| SRN + AD        | repro_multiseq_srn_ad_hypernet_noharm.yaml |
| SRN + γ + AD    | repro_multiseq_srn_ad_hypernet.yaml        |

For multi-sequence experiments (with generalisation to new sequences):

| Method          | config file                          |
|-----------------|--------------------------------------|
| NeRF + WCE      | repro_multiseq_nerf_wce.yaml         |
| NerFormer       | repro_multiseq_nerformer.yaml        |
| SRN + WCE       | repro_multiseq_srn_wce_noharm.yaml   |
| SRN + WCE + γ   | repro_multiseq_srn_wce.yaml          |
@@ -0,0 +1,83 @@
defaults:
- default_config
- _self_
exp_dir: ./data/exps/base/
architecture: generic
visualize_interval: 0
visdom_port: 8097
dataloader_args:
  batch_size: 10
  dataset_len: 1000
  dataset_len_val: 1
  num_workers: 8
  images_per_seq_options:
  - 2
  - 3
  - 4
  - 5
  - 6
  - 7
  - 8
  - 9
  - 10
dataset_args:
  dataset_root: ${oc.env:CO3D_DATASET_ROOT}
  load_point_clouds: false
  mask_depths: false
  mask_images: false
  n_frames_per_sequence: -1
  test_on_train: true
  test_restrict_sequence_id: 0
generic_model_args:
  loss_weights:
    loss_mask_bce: 1.0
    loss_prev_stage_mask_bce: 1.0
    loss_autodecoder_norm: 0.01
    loss_rgb_mse: 1.0
    loss_prev_stage_rgb_mse: 1.0
  output_rasterized_mc: false
  chunk_size_grid: 102400
  render_image_height: 400
  render_image_width: 400
  num_passes: 2
  implicit_function_NeuralRadianceFieldImplicitFunction_args:
    n_harmonic_functions_xyz: 10
    n_harmonic_functions_dir: 4
    n_hidden_neurons_xyz: 256
    n_hidden_neurons_dir: 128
    n_layers_xyz: 8
    append_xyz:
    - 5
    latent_dim: 0
  raysampler_args:
    n_rays_per_image_sampled_from_mask: 1024
    min_depth: 0.0
    max_depth: 0.0
    scene_extent: 8.0
    n_pts_per_ray_training: 64
    n_pts_per_ray_evaluation: 64
    stratified_point_sampling_training: true
    stratified_point_sampling_evaluation: false
  renderer_MultiPassEmissionAbsorptionRenderer_args:
    n_pts_per_ray_fine_training: 64
    n_pts_per_ray_fine_evaluation: 64
    append_coarse_samples_to_fine: true
    density_noise_std_train: 1.0
  view_sampler_args:
    masked_sampling: false
  image_feature_extractor_args:
    stages:
    - 1
    - 2
    - 3
    - 4
    proj_dim: 16
    image_rescale: 0.32
    first_max_pool: false
solver_args:
  breed: adam
  lr: 0.0005
  lr_policy: multistep
  max_epochs: 2000
  momentum: 0.9
  weight_decay: 0.0
@@ -0,0 +1,16 @@
generic_model_args:
  image_feature_extractor_args:
    add_images: true
    add_masks: true
    first_max_pool: true
    image_rescale: 0.375
    l2_norm: true
    name: resnet34
    normalize_image: true
    pretrained: true
    stages:
    - 1
    - 2
    - 3
    - 4
    proj_dim: 32
@@ -0,0 +1,16 @@
generic_model_args:
  image_feature_extractor_args:
    add_images: true
    add_masks: true
    first_max_pool: false
    image_rescale: 0.375
    l2_norm: true
    name: resnet34
    normalize_image: true
    pretrained: true
    stages:
    - 1
    - 2
    - 3
    - 4
    proj_dim: 16
