Setup CI as running on TPU #963

Closed
2 tasks
vfdev-5 opened this issue Apr 22, 2020 · 4 comments · Fixed by #981

Comments

@vfdev-5
Collaborator

vfdev-5 commented Apr 22, 2020

🚀 Feature

Ignite will support distributed training on TPU (e.g. #960). Currently, metric computation is affected in the same way as for DDP on GPUs; that should be addressed in a separate issue/PR.
The idea of this issue is to set up CI that emulates running on TPU, as is done in pytorch/xla.

  • Set up another workflow on our CircleCI, as is done for xla:
  • Add a simple test marked as TPU (@pytest.mark.tpu)
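
A minimal sketch of what such a marked test could look like. The test name, the `HAS_XLA` helper and the skip condition are illustrative assumptions, not taken from the Ignite code base; the `tpu` marker would also need to be registered in the pytest configuration to avoid unknown-marker warnings.

```python
import importlib.util

import pytest

# Assumption: torch_xla may be missing on ordinary CI machines, so we detect it
# up front and skip the test instead of failing at import time.
HAS_XLA = importlib.util.find_spec("torch_xla") is not None


@pytest.mark.tpu
@pytest.mark.skipif(not HAS_XLA, reason="torch_xla is not installed")
def test_tensor_on_xla_device():
    # Imports are kept local so that collecting this file works without torch_xla.
    import torch
    import torch_xla.core.xla_model as xm

    device = xm.xla_device()
    x = torch.ones(2, 2, device=device)

    # The tensor should live on an XLA device and hold the expected values.
    assert "xla" in str(x.device)
    assert x.sum().item() == 4.0
```

With the marker in place, `pytest -m tpu` would select only these tests, so the XLA workflow can run them separately from the rest of the suite.
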
@erip
Contributor

erip commented Apr 25, 2020

I'd be interested in helping with this, but it seems like it'll require some administration on the CI side (setting env vars at least). Are TPU instances available freely for CI through CircleCI or is TPU virtualized through docker? Not sure how that works...

@vfdev-5
Collaborator Author

vfdev-5 commented Apr 25, 2020

@erip thanks! I think what is done on the xla CircleCI is CPU emulation. Could you take a look at how they suggest contributors work on xla development and run tests, so that we can understand how to set up our tests? In our case, we won't need to rebuild xla etc.; we can just use their Docker image and set up the CPU emulation.

but it seems like it'll require some administration on the CI side (setting env vars at least)

Let me check whether I can activate CircleCI-specific workflows on PRs. But anyway, I think it is just a matter of adding another .circleci/another_config.yml file. Let me check. Otherwise, we can opt for GitHub Actions.

EDIT: it seems we can only have a single .circleci/config.yml. So, let's create this XLA CI workflow with GitHub Actions.

@erip
Contributor

erip commented Apr 25, 2020

It seems like if we want to use the XLA docker images that pytorch/pytorch and pytorch/xla use in GitHub Actions, we'll need to develop an action that wraps the container. What's not immediately clear is how the tests actually get run from there. 😄 I'll need to do some reading, but just commenting here to document for myself later.

@vfdev-5
Collaborator Author

vfdev-5 commented Apr 25, 2020

@erip I think it is simpler than that:

@erip erip mentioned this issue Apr 25, 2020
3 tasks