Commit 22bedf9

simplify examples structure (#1247)
* simplify examples structure
* update changelog
* fix imports
* rename example
* rename scripts
* changelog
1 parent 16f4cc9 commit 22bedf9

20 files changed: +138 -84 lines changed

CHANGELOG.md

+3 -1
@@ -8,7 +8,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
 ### Added
 
-- Added parity test between a vanilla MNIST model and lightning model ([#1284](https://github.com/PyTorchLightning/pytorch-lightning/pull/1284))
+- Added parity test between a vanilla MNIST model and lightning model ([#1284](https://github.com/PyTorchLightning/pytorch-lightning/pull/1284))
+- Added parity test between a vanilla RNN model and lightning model ([#1351](https://github.com/PyTorchLightning/pytorch-lightning/pull/1351))
 - Added Reinforcement Learning - Deep Q-network (DQN) lightning example ([#1232](https://github.com/PyTorchLightning/pytorch-lightning/pull/1232))
 - Added support for hierarchical `dict` ([#1152](https://github.com/PyTorchLightning/pytorch-lightning/pull/1152))
 - Added `TrainsLogger` class ([#1122](https://github.com/PyTorchLightning/pytorch-lightning/pull/1122))
@@ -40,6 +41,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Give warnings for unimplemented required lightning methods ([#1317](https://github.com/PyTorchLightning/pytorch-lightning/pull/1317))
 - Enhanced load_from_checkpoint to also forward params to the model ([#1307](https://github.com/PyTorchLightning/pytorch-lightning/pull/1307))
 - Made `evaluate` method private >> `Trainer._evaluate(...)`. ([#1260](https://github.com/PyTorchLightning/pytorch-lightning/pull/1260))
+- Simplify the PL examples structure (shallower and more readable) ([#1247](https://github.com/PyTorchLightning/pytorch-lightning/pull/1247))
 
 ### Deprecated
 

pl_examples/README.md

+62 -9
@@ -1,14 +1,67 @@
 # Examples
-This folder has 4 sections:
+This folder has 3 sections:
 
-### Basic examples
-These show the most common use of Lightning for either CPU or GPU training.
+## Basic Examples
+Use these examples to test how lightning works.
 
-### Domain templates
-These are templates to show common approaches such as GANs and RL.
+#### Test on CPU
+```bash
+python cpu_template.py
+```
 
-### Full examples
-Contains examples demonstrating ImageNet training, Semantic Segmentation, etc.
+---
+#### Train on a single GPU
+```bash
+python gpu_template.py --gpus 1
+```
 
-### Multi-node examples
-These show how to run jobs on a GPU cluster using lightning.
+---
+#### DataParallel (dp)
+Train on multiple GPUs using DataParallel.
+
+```bash
+python gpu_template.py --gpus 2 --distributed_backend dp
+```
+
+---
+#### DistributedDataParallel (ddp)
+
+Train on multiple GPUs using DistributedDataParallel
+```bash
+python gpu_template.py --gpus 2 --distributed_backend ddp
+```
+
+---
+#### DistributedDataParallel+DP (ddp2)
+
+Train on multiple GPUs using DistributedDataParallel + dataparallel.
+On a single node, uses all GPUs for 1 model. Then shares gradient information
+across nodes.
+```bash
+python gpu_template.py --gpus 2 --distributed_backend ddp2
+```
+
+## Multi-node example
+
+This demo launches a job using 2 GPUs on 2 different nodes (4 GPUs total).
+To run this demo do the following:
+
+1. Log into the jumphost node of your SLURM-managed cluster.
+2. Create a conda environment with Lightning and a GPU PyTorch version.
+3. Choose a script to submit
+
+### DDP
+Submit this job to run with DistributedDataParallel (2 nodes, 2 gpus each)
+```bash
+sbatch ddp_job_submit.sh YourEnv
+```
+
+### DDP2
+Submit this job to run with a different implementation of DistributedDataParallel.
+In this version, each node acts like DataParallel but syncs across nodes like DDP.
+```bash
+sbatch ddp2_job_submit.sh YourEnv
+```
+
+## Domain templates
+These are templates to show common approaches such as GANs and RL.
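
Reviewer note: the README commands above pass `--gpus` and `--distributed_backend` on the command line. The diff does not show how `gpu_template.py` parses them, so the following is only a minimal sketch of a parser that would accept those flags. The function name `build_parser`, the defaults, and the help strings are assumptions, not the script's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the README commands."""
    parser = argparse.ArgumentParser(description="Sketch of the README flags")
    # Assumed default: 0 GPUs means CPU training.
    parser.add_argument("--gpus", type=int, default=0,
                        help="number of GPUs to train on")
    # The three backend names come from the README; the default is an assumption.
    parser.add_argument("--distributed_backend", type=str, default=None,
                        choices=["dp", "ddp", "ddp2"],
                        help="multi-GPU backend named in the README")
    return parser


# Equivalent of: python gpu_template.py --gpus 2 --distributed_backend ddp
args = build_parser().parse_args(["--gpus", "2", "--distributed_backend", "ddp"])
print(args.gpus, args.distributed_backend)
```

Restricting `choices` to the three documented backends makes a typo such as `--distributed_backend dpp` fail at parse time rather than during training.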

pl_examples/__init__.py

+1 -1
@@ -140,7 +140,7 @@ def optimize_on_cluster(hyperparams):
 
 """
 
-from .basic_examples.lightning_module_template import LightningTemplateModel
+from pl_examples.models.lightning_template import LightningTemplateModel
 
 __all__ = [
     'LightningTemplateModel'
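
Reviewer note: this commit moves `LightningTemplateModel` to `pl_examples.models.lightning_template`, so scripts outside the repository that still import from the old path will need the same one-line change. As a rough sketch, a hypothetical helper (`update_import_line` and `RENAMED_MODULES` are illustrative names, not part of the commit) could rewrite such lines:

```python
# Old -> new module path, taken from the import changes in this commit.
RENAMED_MODULES = {
    "pl_examples.basic_examples.lightning_module_template":
        "pl_examples.models.lightning_template",
}


def update_import_line(line: str) -> str:
    """Rewrite a line that mentions an old module path; leave other lines as-is."""
    for old_path, new_path in RENAMED_MODULES.items():
        if old_path in line:
            return line.replace(old_path, new_path)
    return line


old_line = ("from pl_examples.basic_examples.lightning_module_template "
            "import LightningTemplateModel")
print(update_import_line(old_line))
```

This is plain string replacement, so it would also touch comments or strings that mention the old path; that seems acceptable for a one-off migration but is worth checking by hand.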

pl_examples/basic_examples/README.md

+25 -2
@@ -1,4 +1,4 @@
-# Basic Examples
+## Basic Examples
 Use these examples to test how lightning works.
 
 #### Test on CPU
@@ -36,4 +36,27 @@ On a single node, uses all GPUs for 1 model. Then shares gradient information
 across nodes.
 ```bash
 python gpu_template.py --gpus 2 --distributed_backend ddp2
-```
+```
+
+
+# Multi-node example
+
+This demo launches a job using 2 GPUs on 2 different nodes (4 GPUs total).
+To run this demo do the following:
+
+1. Log into the jumphost node of your SLURM-managed cluster.
+2. Create a conda environment with Lightning and a GPU PyTorch version.
+3. Choose a script to submit
+
+#### DDP
+Submit this job to run with DistributedDataParallel (2 nodes, 2 gpus each)
+```bash
+sbatch ddp_job_submit.sh YourEnv
+```
+
+#### DDP2
+Submit this job to run with a different implementation of DistributedDataParallel.
+In this version, each node acts like DataParallel but syncs across nodes like DDP.
+```bash
+sbatch ddp2_job_submit.sh YourEnv
+```
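
Reviewer note: the multi-node README text describes 2 nodes with 2 GPUs each and says that under DDP2 each node behaves like DataParallel while syncing across nodes like DDP. The arithmetic can be sketched as below. `sync_layout` is a hypothetical helper, and "one gradient-syncing group per GPU" for `ddp` is an assumption based on the usual DistributedDataParallel setup rather than something the diff states.

```python
def sync_layout(num_nodes: int, gpus_per_node: int, backend: str) -> dict:
    """Count GPUs and gradient-syncing groups for a multi-node job (sketch)."""
    total_gpus = num_nodes * gpus_per_node
    if backend == "ddp":
        # Assumption: every GPU holds its own replica and syncs with all others.
        sync_groups = total_gpus
    elif backend == "ddp2":
        # Per the README: each node acts like DataParallel, nodes sync like DDP.
        sync_groups = num_nodes
    else:
        raise ValueError(f"unsupported backend for this sketch: {backend!r}")
    return {
        "total_gpus": total_gpus,
        "sync_groups": sync_groups,
        "gpus_per_group": total_gpus // sync_groups,
    }


# The README demo: 2 nodes with 2 GPUs each.
for backend in ("ddp", "ddp2"):
    print(backend, sync_layout(num_nodes=2, gpus_per_node=2, backend=backend))
```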

pl_examples/basic_examples/cpu_template.py

+1 -1
@@ -8,7 +8,7 @@
 import torch
 
 import pytorch_lightning as pl
-from pl_examples.basic_examples.lightning_module_template import LightningTemplateModel
+from pl_examples.models.lightning_template import LightningTemplateModel
 
 SEED = 2334
 torch.manual_seed(SEED)

pl_examples/basic_examples/gpu_template.py

+1 -1
@@ -8,7 +8,7 @@
 import torch
 
 import pytorch_lightning as pl
-from pl_examples.basic_examples.lightning_module_template import LightningTemplateModel
+from pl_examples.models.lightning_template import LightningTemplateModel
 
 SEED = 2334
 torch.manual_seed(SEED)

pl_examples/multi_node_examples/multi_node_ddp2_demo.py → pl_examples/basic_examples/multi_node_ddp2_demo.py

+1 -1
@@ -8,7 +8,7 @@
 import torch
 
 import pytorch_lightning as pl
-from pl_examples.basic_examples.lightning_module_template import LightningTemplateModel
+from pl_examples.models.lightning_template import LightningTemplateModel
 
 SEED = 2334
 torch.manual_seed(SEED)

pl_examples/multi_node_examples/multi_node_ddp_demo.py → pl_examples/basic_examples/multi_node_ddp_demo.py

+1 -1
@@ -8,7 +8,7 @@
 import torch
 
 import pytorch_lightning as pl
-from pl_examples.basic_examples.lightning_module_template import LightningTemplateModel
+from pl_examples.models.lightning_template import LightningTemplateModel
 
 SEED = 2334
 torch.manual_seed(SEED)

pl_examples/domain_templates/gan.py → pl_examples/domain_templates/generative_adversarial_net.py

+1 -1
@@ -1,6 +1,6 @@
 """
 To run this template just do:
-python gan.py
+python generative_adversarial_net.py
 
 After a few epochs, launch TensorBoard to see the images being generated at every batch:
 

pl_examples/full_examples/semantic_segmentation/semseg.py → pl_examples/domain_templates/semantic_segmentation.py

+1 -1
@@ -6,10 +6,10 @@
 import torch.nn.functional as F
 import torchvision.transforms as transforms
 from PIL import Image
-from models.unet.model import UNet
 from torch.utils.data import DataLoader, Dataset
 
 import pytorch_lightning as pl
+from pl_examples.models.unet import UNet
 
 
 class KITTI(Dataset):

pl_examples/full_examples/imagenet/__init__.py

Whitespace-only changes.

pl_examples/full_examples/semantic_segmentation/models/unet/model.py

-44
This file was deleted.
File renamed without changes.

pl_examples/full_examples/semantic_segmentation/models/unet/parts.py → pl_examples/models/unet.py

+41
@@ -3,6 +3,47 @@
 import torch.nn.functional as F
 
 
+class UNet(nn.Module):
+    """
+    Architecture based on U-Net: Convolutional Networks for Biomedical Image Segmentation
+    Link - https://arxiv.org/abs/1505.04597
+
+    Parameters:
+        num_classes (int): Number of output classes required (default 19 for KITTI dataset)
+        bilinear (bool): Whether to use bilinear interpolation or transposed
+            convolutions for upsampling.
+    """
+
+    def __init__(self, num_classes=19, bilinear=False):
+        super().__init__()
+        self.layer1 = DoubleConv(3, 64)
+        self.layer2 = Down(64, 128)
+        self.layer3 = Down(128, 256)
+        self.layer4 = Down(256, 512)
+        self.layer5 = Down(512, 1024)
+
+        self.layer6 = Up(1024, 512, bilinear=bilinear)
+        self.layer7 = Up(512, 256, bilinear=bilinear)
+        self.layer8 = Up(256, 128, bilinear=bilinear)
+        self.layer9 = Up(128, 64, bilinear=bilinear)
+
+        self.layer10 = nn.Conv2d(64, num_classes, kernel_size=1)
+
+    def forward(self, x):
+        x1 = self.layer1(x)
+        x2 = self.layer2(x1)
+        x3 = self.layer3(x2)
+        x4 = self.layer4(x3)
+        x5 = self.layer5(x4)
+
+        x6 = self.layer6(x5, x4)
+        x6 = self.layer7(x6, x3)
+        x6 = self.layer8(x6, x2)
+        x6 = self.layer9(x6, x1)
+
+        return self.layer10(x6)
+
+
 class DoubleConv(nn.Module):
     """
     Double Convolution and BN and ReLU
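
Reviewer note: the `UNet` constructor added above pairs each `Up` stage with one encoder output in `forward`. The following is a small pure-Python sketch that checks the channel widths line up, using only the numbers visible in the diff. `trace_channels` and the two tables are hypothetical names, and the check says nothing about the internals of `Down` and `Up`, which this hunk does not show.

```python
# (layer name, in_channels, out_channels), copied from UNet.__init__ in the diff.
ENCODER = [("layer1", 3, 64), ("layer2", 64, 128), ("layer3", 128, 256),
           ("layer4", 256, 512), ("layer5", 512, 1024)]
DECODER = [("layer6", 1024, 512), ("layer7", 512, 256),
           ("layer8", 256, 128), ("layer9", 128, 64)]


def trace_channels(num_classes: int = 19) -> list:
    """Walk the layers in forward() order and verify the channel counts agree."""
    trace = []
    channels = 3  # RGB input, matching DoubleConv(3, 64)
    skip_channels = []
    for name, in_ch, out_ch in ENCODER:
        assert in_ch == channels, f"{name} expects {in_ch}, got {channels}"
        channels = out_ch
        skip_channels.append(out_ch)
        trace.append((name, in_ch, out_ch))
    skip_channels.pop()  # x5 is the bottleneck input, not a skip connection
    for name, in_ch, out_ch in DECODER:
        skip = skip_channels.pop()  # x4, x3, x2, x1 in that order
        assert in_ch == channels, f"{name} expects {in_ch}, got {channels}"
        assert out_ch == skip, f"{name} output {out_ch} != skip width {skip}"
        channels = out_ch
        trace.append((name, in_ch, out_ch))
    # Final 1x1 convolution maps the last feature map to class scores.
    trace.append(("layer10", channels, num_classes))
    return trace


for name, in_ch, out_ch in trace_channels():
    print(f"{name}: {in_ch} -> {out_ch}")
```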

pl_examples/multi_node_examples/README.md

-21
This file was deleted.

pl_examples/multi_node_examples/__init__.py

Whitespace-only changes.
