Single node DDP: "Default process group is not initialized" #2254
Comments
Can you post code to reproduce? Just a minimal example that breaks. BTW, the GPU template is fixed... |
done, let me post my env as well |
ok wait... i think i see it. one sec |
I just tested the merged changes with both ddp and ddp_spawn, and again got this:
|
try again. that was a typo |
cheers, works now! |
Still having the |
I still have this bug as well. One temporary workaround is to create a new single-GPU trainer just to run the test.
|
Right, I know it works on single gpu. I have a large test set and ideally want faster inference using multiple gpus. |
Can we re-open this issue? I am still having the |
+1, doesn't look like the issue is resolved yet. |
Having the same problem. I also tried downgrading pl to an older version, like 0.7.5, and using that version to do the inference. But a model trained and saved with 0.8.x does not seem to be directly compatible with the older version. |
Version: 0.8.4. Trained with ddp, got "Default process group is not initialized" when running trainer.test(). |
could you try master? this is fixed there |
Just tried it, it works fine now! Thank you! |
@williamFalcon Trying 0.8.5. Trained with ddp and tested with ddp, but got the following error message:
Any idea? Thanks! |
🐛 Bug
Unable to start single node ddp training on 0.8.0
To Reproduce
Was going to run the gpu_template, but... #2235. Both methods of running the template result in the same error.