Commit e6fc189

make_distribute_tutorial_work_in_google_colab (#3022)
* make_distribute_tutorial work in google_colab

Co-authored-by: Svetlana Karslioglu <[email protected]>
1 parent 01eeee6 commit e6fc189

File tree

1 file changed, +9 -3 lines


intermediate_source/dist_tuto.rst (+9, -3)
@@ -47,6 +47,7 @@ the following template.
 """run.py:"""
 #!/usr/bin/env python
 import os
+import sys
 import torch
 import torch.distributed as dist
 import torch.multiprocessing as mp
@@ -66,8 +67,12 @@ the following template.
 if __name__ == "__main__":
     world_size = 2
     processes = []
-    mp.set_start_method("spawn")
-    for rank in range(world_size):
+    if "google.colab" in sys.modules:
+        print("Running in Google Colab")
+        mp.get_context("spawn")
+    else:
+        mp.set_start_method("spawn")
+    for rank in range(world_size):
         p = mp.Process(target=init_process, args=(rank, world_size, run))
         p.start()
         processes.append(p)
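The hunk above branches on ``"google.colab" in sys.modules`` because ``mp.set_start_method("spawn")`` changes a process-wide default and raises a ``RuntimeError`` if the start method has already been set, which can happen in a long-lived notebook kernel; ``mp.get_context("spawn")`` instead returns a context object without touching the global default. The following is a minimal sketch of the context-based pattern using Python's standard ``multiprocessing`` module (which ``torch.multiprocessing`` wraps); the ``square`` and ``run_in_spawn_context`` names are illustrative, not from the tutorial.

```python
import multiprocessing as mp


def square(x, q):
    # Child process: compute a value and report it back through a queue.
    q.put(x * x)


def run_in_spawn_context(x):
    # get_context("spawn") yields a context bound to the "spawn" start
    # method without mutating the process-wide default, so it is safe to
    # call repeatedly -- unlike set_start_method(), which raises if the
    # start method was already fixed earlier in the session.
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    p = ctx.Process(target=square, args=(x, q))
    p.start()
    result = q.get()   # read before join() to avoid blocking on a full pipe
    p.join()
    return result


if __name__ == "__main__":
    print(run_in_spawn_context(3))
```

Processes created from the context still use the "spawn" semantics (a fresh interpreter that re-imports the module), which is why the worker must be a module-level function and the launch code must sit under the ``if __name__ == "__main__":`` guard.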
@@ -156,7 +161,8 @@ we should not modify the sent tensor nor access the received tensor before ``req
 In other words,
 
 - writing to ``tensor`` after ``dist.isend()`` will result in undefined behaviour.
-- reading from ``tensor`` after ``dist.irecv()`` will result in undefined behaviour.
+- reading from ``tensor`` after ``dist.irecv()`` will result in undefined
+  behaviour, until ``req.wait()`` has been executed.
 
 However, after ``req.wait()``
 has been executed we are guaranteed that the communication took place,
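The rule the clarified sentence states, that the received tensor may only be read once ``req.wait()`` has returned, can be sketched as a self-contained two-process example. This is an assumption-laden demo, not the tutorial's code: the ``worker``/``demo`` names, the result queue, and the localhost ``MASTER_ADDR``/``MASTER_PORT`` values are illustrative choices for a single-machine ``gloo`` run.

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def worker(rank, world_size, queue):
    # Single-machine rendezvous; address and port are demo assumptions.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    tensor = torch.zeros(1)
    if rank == 0:
        tensor += 1
        # Non-blocking send: writing to `tensor` before req.wait()
        # would be undefined behaviour.
        req = dist.isend(tensor=tensor, dst=1)
    else:
        # Non-blocking receive: reading `tensor` before req.wait()
        # would be undefined behaviour.
        req = dist.irecv(tensor=tensor, src=0)
    req.wait()  # after this, the transfer is complete on this rank
    if rank == 1:
        queue.put(tensor.item())  # safe to read now
    dist.destroy_process_group()


def demo():
    world_size = 2
    ctx = mp.get_context("spawn")
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(r, world_size, queue))
             for r in range(world_size)]
    for p in procs:
        p.start()
    received = queue.get()
    for p in procs:
        p.join()
    return received


if __name__ == "__main__":
    print("rank 1 received:", demo())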
