
Commit 1f77d03

make mentions of mps in docs. ty good people in issue #28
1 parent a6bffee commit 1f77d03

File tree: 2 files changed (+3, −1 lines)


Diff for: README.md (+2 lines)

```diff
@@ -125,6 +125,8 @@ $ python train.py config/train_shakespeare_char.py --device=cpu --compile=False
 
 Where we decrease the context length to just 64 characters and only use a batch size of 8.
 
+Finally, on Apple Silicon MacBooks you can use `--device mps` ("Metal Performance Shaders"), which can significantly accelerate training (2-3X). You will need a specific version of PyTorch. See [Issue 28](https://github.com/karpathy/nanoGPT/issues/28).
+
 ## benchmarking
 
 For model benchmarking `bench.py` might be useful. It's identical to what happens in the meat of the training loop of `train.py`, but omits much of the other complexities.
```
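The MPS note added to the README boils down to a simple device-preference order: CUDA if present, then MPS on Apple Silicon, then CPU. A minimal sketch of that selection logic — `pick_device` and its boolean flags are illustrative helpers, not part of nanoGPT:

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Choose a torch device string in preference order: cuda > mps > cpu."""
    if cuda_ok:
        return 'cuda'
    if mps_ok:
        return 'mps'
    return 'cpu'

# With PyTorch installed, the flags would come from:
#   torch.cuda.is_available()
#   torch.backends.mps.is_available()  (requires a recent PyTorch with Metal support)
print(pick_device(cuda_ok=False, mps_ok=True))  # an Apple Silicon Mac without CUDA -> mps
```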

Diff for: train.py (+1, −1 lines)

```diff
@@ -67,7 +67,7 @@
 # DDP settings
 backend = 'nccl' # 'nccl', 'gloo', etc.
 # system
-device = 'cuda' # examples: 'cpu', 'cuda', 'cuda:0', 'cuda:1', etc.
+device = 'cuda' # examples: 'cpu', 'cuda', 'cuda:0', 'cuda:1' etc., or try 'mps' on macbooks
 dtype = 'bfloat16' # 'float32' or 'bfloat16'
 compile = True # use PyTorch 2.0 to compile the model to be faster
 # -----------------------------------------------------------------------------
```
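The `device` global changed here is meant to be overridden from the command line, e.g. `--device=mps` (nanoGPT routes this through its `configurator.py`, which `exec`s overrides into module globals). A simplified, hypothetical sketch of such `--key=value` parsing — `parse_overrides` is illustrative and not the actual configurator code:

```python
def parse_overrides(argv, defaults):
    """Apply --key=value CLI overrides to a dict of default settings."""
    settings = dict(defaults)
    for arg in argv:
        if arg.startswith('--') and '=' in arg:
            key, value = arg[2:].split('=', 1)
            if key in settings:  # only known settings may be overridden
                settings[key] = value
    return settings

defaults = {'device': 'cuda', 'dtype': 'bfloat16'}
# e.g. `python train.py --device=mps` on an Apple Silicon MacBook
# (in a real script, argv would be sys.argv[1:]):
print(parse_overrides(['--device=mps'], defaults)['device'])  # -> mps
```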
