Tutorial for AOTI Python runtime #2997
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2997
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 194388e with merge base 96b9c27.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @svekars, do I ignore this https://github.com/pytorch/tutorials/actions/runs/10360994812/job/28680461319?pr=2997 or do I need to add some checks in the tutorial? The failure is because the machine doesn't support Triton.
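One way to make the tutorial degrade gracefully on machines without Triton or CUDA is a small capability check at the top of the script. The helper below is a sketch, not code from the PR; the `HAS_TRITON` and `device` names are made up for illustration.

```python
import importlib.util

import torch

# Hypothetical guard: fall back to CPU when CUDA or Triton is missing,
# so the tutorial still runs (more slowly) on machines like the CI runner.
HAS_TRITON = importlib.util.find_spec("triton") is not None
device = "cuda" if torch.cuda.is_available() and HAS_TRITON else "cpu"
print(f"Running tutorial on device: {device}")
```

A check like this lets the doc build skip GPU-only paths instead of failing outright.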
Just a few editorial nits. Also, it feels a bit short for a full-size intermediate tutorial; we should either add more or move it to recipes. We also need to add entries to either index.rst or recipes_source/recipes_index.rst (depending on whether it's a recipe or a tutorial).
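For reference, entries in the tutorials index files use the `customcarditem` directive; a hypothetical entry (the header, description, image, and link values below are placeholders, not the PR's actual metadata) might look like:

```rst
.. customcarditem::
   :header: Tutorial for AOTI Python runtime
   :card_description: Compile a model ahead of time with AOTInductor and run it from Python.
   :image: _static/img/thumbnails/cropped/generic-pytorch-logo.png
   :link: ../recipes/aoti_python_runtime.html
   :tags: Export,AOTInductor
```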
#
# .. note::
#
#    This API also supports :func:`torch.compile` options like `mode`
Suggested change:
- # This API also supports :func:`torch.compile` options like `mode`
+ # This API also supports :func:`torch.compile` options like ``mode`` and others.
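For context, `mode` here refers to `torch.compile`'s compilation-mode option, which the AOTInductor export path accepts analogously. A minimal sketch (the `Add1` module is made up for illustration):

```python
import torch


class Add1(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x) + 1


# `mode` selects a compilation strategy, e.g. "default", "reduce-overhead",
# or "max-autotune". Compilation is lazy: it is triggered on the first call
# with real inputs, not at torch.compile() time.
compiled = torch.compile(Add1(), mode="max-autotune")
```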
Sure, once the content is finalized and looks good, we can move it where you think it's appropriate.
# a shared library that can be run in a non-Python environment.
#
#
# In this tutorial, you will learn an end-to-end example of how to use AOTInductor for Python runtime.
It will make the story more complete by explaining the "why" part here, e.g. eliminating recompilation at run time, max-autotune ahead of time, etc.
Done. I haven't mentioned eliminating recompilation, since the tutorial doesn't show that.
example_inputs = (torch.randn(2, 3, 224, 224, device=device),)

# min=2 is not a bug and is explained in the 0/1 Specialization Problem
batch_dim = torch.export.Dim("batch", min=2, max=32)
I believe it is ok to use min=1 here, but we can't feed in an example input with batch size 1.
Batch size 1 is a commonly tried example input, hence I set min=2.
Co-authored-by: Angela Yi <[email protected]> Co-authored-by: Svetlana Karslioglu <[email protected]>
…torials into tutorial/aoti_python
Please double-check the formatting here. Also, this needs to be added to recipes_index.rst. But otherwise, from the publishing perspective, LGTM.
Co-authored-by: Svetlana Karslioglu <[email protected]>
@svekars I fixed the indentation of Pre-requisites. It's still not rendering correctly. Any suggestions?
######################################################################
# We see that there is a drastic speedup in first inference time using AOTInductor compared
# to ``torch.compile``
Do you have some example numbers to share here? So readers can get some rough idea without actually running the code.
On the rendered HTML, the tutorial shows 2.92 ms vs 7000 ms. It might be good to collect this number over a range of models, similar to how we show the perf difference between compile and eager.
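A simple way to collect such first-inference numbers is to time the first call separately from warmed-up calls. A generic, model-agnostic sketch (the helper name is made up; the stand-in workload replaces a real model call):

```python
import time


def first_call_ms(fn, *args):
    """Return the latency of the first call to fn in milliseconds."""
    t0 = time.perf_counter()
    fn(*args)
    return (time.perf_counter() - t0) * 1000.0


# Stand-in workload; in the tutorial this would be the compiled model's
# first forward pass, where torch.compile pays its compilation cost.
latency = first_call_ms(lambda n: sum(range(n)), 100_000)
```

Running the same helper on an AOTInductor-loaded model and on a freshly `torch.compile`-d model is how numbers like 2.92 ms vs 7000 ms would be produced.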
* Tutorial for AOTI Python runtime --------- Co-authored-by: Svetlana Karslioglu <[email protected]> Co-authored-by: Angela Yi <[email protected]>
Description
We have an AOT Inductor tutorial showing inference on the C++ runtime here.
This tutorial shows how to run AOTI on the Python runtime, with torch.compile options like max-autotune mode.
Checklist