Pytorch Tutorial 5 - MNIST classification - Jupyter and Markdown #91

Merged 8 commits on Oct 21, 2021

`simple_applications/pytorch/mnist/README.md` (40 changes: 22 additions & 18 deletions)
… PopTorch. To learn more about PopTorch, see our [PyTorch for the IPU: User Guide](https://docs.graphcore.ai/projects/poptorch-user-guide/en/latest/).

## How to use this demo

### 1) Prepare the environment.

Install the Poplar SDK following the instructions in the [Getting Started](https://docs.graphcore.ai/en/latest/getting-started.html)
guide for your IPU system. Make sure to run the `enable.sh` scripts for Poplar
Then install the package requirements:

```bash
pip install -r requirements.txt
```

### 2) Run the program.
Note that the PopTorch Python API only supports Python 3. Data will be
automatically downloaded using torchvision utils.

```bash
python3 mnist_poptorch.py
```

### 3) Hyperparameters
Set the hyperparameters for this demo. If you're running this example in
a Jupyter notebook and wish to modify them, re-run all the cells below.
For further reading on hyperparameters, see [Hyperparameters (machine learning)](https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)).


```python
# Batch size for training
batch_size = 8

# Device iteration - batches per step. Number of iterations the device should
# run over the data before returning to the user.
# This is equivalent to running the IPU in a loop over the specified
# number of iterations, with a new batch of data each time. However, increasing
# deviceIterations is more efficient because the loop runs on the IPU directly
# (see the short illustration after this cell).
device_iterations = 50

# Batch size for testing
test_batch_size = 80
epochs = 10

# Learning rate
learning_rate = 0.03
```
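`device_iterations` is not an attribute of the model itself; in PopTorch it is
set on the execution options that are later passed to the data loader and to
the wrapped model. A minimal illustration, assuming the `poptorch.Options` API
(`poptorch` is imported in the next cell):

```python
opts = poptorch.Options()
# Run `device_iterations` iterations on the IPU for every call from the host.
opts.deviceIterations(device_iterations)
```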

Import required libraries:

```python
import poptorch
import torch.optim as optim
```

Download the datasets for MNIST and set up data loaders.
Source: [The MNIST Database](http://yann.lecun.com/exdb/mnist/)
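The cell that builds the datasets and data loaders is collapsed in this diff
view. The sketch below shows one typical way to set them up with `torchvision`
and `poptorch.DataLoader`; the normalisation constants and the local dataset
path are illustrative assumptions, and `opts` is the options object created
above.

```python
import torchvision

datasets_path = "~/.torch/datasets"  # hypothetical download location

transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize((0.1307,), (0.3081,)),  # common MNIST stats
])

training_dataset = torchvision.datasets.MNIST(
    datasets_path, train=True, download=True, transform=transform)
test_dataset = torchvision.datasets.MNIST(
    datasets_path, train=False, download=True, transform=transform)

# poptorch.DataLoader wraps torch.utils.data.DataLoader and takes the execution
# options into account, so each host-side step feeds
# batch_size * device_iterations samples to the IPU.
training_data = poptorch.DataLoader(
    opts, training_dataset, batch_size=batch_size, shuffle=True)

# A separate, default set of options is used here for the test loader.
test_data = poptorch.DataLoader(
    poptorch.Options(), test_dataset, batch_size=test_batch_size)
```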


The model is defined as follows; only part of its definition is visible in this diff:

```python
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        # ... (the earlier layers are collapsed in this diff view)
        self.layer3_act = nn.ReLU()
        self.layer3_dropout = torch.nn.Dropout(0.5)
        self.layer4 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.layer1(x)
        # ... (the remaining layer calls are collapsed in this diff view)
        return x
```

Next we define a thin wrapper around `torch.nn.Module` that will use the
cross-entropy loss function; see [cross-entropy loss](https://en.wikipedia.org/wiki/Cross_entropy#Cross-entropy_loss_function_and_logistic_regression)
for background. Note that the network outputs raw scores (logits):
`CrossEntropyLoss` applies the softmax internally, so the model does not need
an explicit `Softmax` layer.

This class creates a custom module that composes the neural network and the
cross-entropy loss module into one object; under the hood, its forward pass
will invoke the network and then the loss function.
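The definition of this wrapper is collapsed in the diff view. Below is a
minimal sketch of what such a wrapper can look like; the class name and the
exact `forward` signature are illustrative assumptions, while the `loss`
attribute name matches the printed structure shown afterwards.

```python
import torch


class TrainingModelWithLoss(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model
        self.loss = torch.nn.CrossEntropyLoss()

    def forward(self, args, labels=None):
        output = self.model(args)
        # Without labels (inference) only the prediction is returned;
        # with labels (training) the loss is computed as well.
        if labels is None:
            return output
        loss = self.loss(output, labels)
        return output, loss


model = Network()
model_with_loss = TrainingModelWithLoss(model)
print(model_with_loss)
```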
The tail of the printed model structure looks like this:

        (layer3_act): ReLU()
        (layer3_dropout): Dropout(p=0.5, inplace=False)
        (layer4): Linear(in_features=128, out_features=10, bias=True)
      )
      (loss): CrossEntropyLoss()
    )


Now we apply the model wrapping function, which will perform a shallow copy
of the PyTorch model. To train the model we will use Stochastic Gradient
Descent with no momentum ([SGD](https://docs.graphcore.ai/projects/poptorch-user-guide/en/latest/reference.html#poptorch.optim.SGD)).


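The cells that wrap the model for training and run the training loop are
collapsed in this diff view. A rough sketch of these steps, assuming the
`poptorch.optim.SGD` and `poptorch.trainingModel` APIs and the `opts`,
`model_with_loss` and `training_data` objects defined above:

```python
# SGD with no momentum, as described above.
optimizer = poptorch.optim.SGD(model_with_loss.parameters(), lr=learning_rate)

# trainingModel performs a shallow copy of the wrapped module and compiles it
# for the IPU on the first call.
training_model = poptorch.trainingModel(model_with_loss, opts, optimizer=optimizer)

for epoch in range(epochs):
    for data, labels in training_data:
        # A single call runs `device_iterations` iterations on the device.
        output, losses = training_model(data, labels)

# Release the IPU so it can be used to compile the inference model below.
training_model.detachFromDevice()
```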

Let's check the validation loss on the IPU using the trained model. The weights
in `model.parameters()` will be copied from the IPU to the host, and the weights
from the trained model will be reused to compile the new inference model.
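The validation cell itself is collapsed in the diff view. A minimal sketch of
the inference pass, assuming the `poptorch.inferenceModel` API and a simple
argmax accuracy; `sum_acc` and `test_data` are the names expected by the print
statement below.

```python
model.eval()  # disable dropout for evaluation
inference_model = poptorch.inferenceModel(model)

sum_acc = 0.0
for data, labels in test_data:
    output = inference_model(data)
    predictions = output.argmax(dim=1)
    # Accumulate per-batch accuracy as a percentage.
    sum_acc += (predictions == labels).float().mean().item() * 100
```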


Finally, the accuracy on the test set is:

```python
print("Accuracy on test set: {:0.2f}%".format(sum_acc / len(test_data)))
```

    Accuracy on test set: 99.24%


Release resources:
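The final cell is collapsed in the diff view; it releases the IPU attached to
the inference model, presumably along these lines:

```python
inference_model.detachFromDevice()
```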