[PT2E] observers do not handle inputs with different shapes correctly #2112
Comments
yeah we haven't tested per-tensor that much I think, we'd need to fix the per-tensor use case
I have thought about this scenario a bit before. I think the fix for this one would be using granularity.
Do you mean using granularity in all cases or only for per-tensor?
I think using block size = -1 for per-tensor also makes sense. What do you think?
just for per-tensor for now, we can assert that only one of block_size and granularity is specified
yeah this could make sense, but then the question is: can we express all dynamic shapes with block_size?
I think so. For a certain dimension:
- block size = -1: use the full size of that dimension at runtime (per-tensor along that dimension);
- block size = 1: per-element along that dimension (e.g., per-token/per-row);
- block size = k (a fixed positive value): blocks of size k along that dimension.

All cases can be covered. How does that sound?
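To make the scheme above concrete, here is a minimal sketch. The helper name resolve_block_size is hypothetical, not an existing torchao API; it only shows how a stored block size containing -1 could be resolved against the actual input shape at quantization time.

```python
from typing import Sequence, Tuple


def resolve_block_size(block_size: Sequence[int], shape: Sequence[int]) -> Tuple[int, ...]:
    """Resolve a (possibly dynamic) block size against a concrete input shape.

    For each dimension:
      * -1 -> use the full size of that dimension (per-tensor along this dim)
      *  1 -> per-element along this dimension (e.g. per-token / per-row)
      *  k -> fixed block size k along this dimension
    """
    assert len(block_size) == len(shape), "rank mismatch"
    resolved = []
    for b, s in zip(block_size, shape):
        if b == -1:
            resolved.append(s)
        else:
            assert s % b == 0, f"dim of size {s} is not divisible by block size {b}"
            resolved.append(b)
    return tuple(resolved)


# Per-tensor for a 4-D input, independent of the actual spatial size:
print(resolve_block_size((-1, -1, -1, -1), (16, 3, 224, 224)))  # (16, 3, 224, 224)
print(resolve_block_size((-1, -1, -1, -1), (16, 3, 56, 56)))    # (16, 3, 56, 56)
# Per-row for a 2-D input (block size 1 on the leading dim, full last dim):
print(resolve_block_size((1, -1), (128, 4096)))                 # (1, 4096)
```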
I remember discussing this with @drisspg and @vkuzo before. There is another dimension to this, which is the number of dims (rank of the tensor). E.g., for per-token/per-row, it is actually: all dims except the last should use block size 1, and the last dim uses the full dimension size (see torchao/quantization/observer.py, line 81 at commit 2fcab01).
I'm not sure if we will have an input tensor with variable rank though. So in theory this is not enough, but maybe it's enough in practice?
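As a small illustration of the rank point above: a rule derived from the granularity can adapt to the input's rank, whereas a fixed-length block_size tuple cannot. The helper per_row_block_size below is hypothetical; the actual torchao logic is at the observer.py line referenced above.

```python
from typing import Sequence, Tuple


def per_row_block_size(shape: Sequence[int]) -> Tuple[int, ...]:
    """Per-token / per-row: block size 1 on every dim except the last,
    which uses the full dimension size."""
    return tuple([1] * (len(shape) - 1) + [shape[-1]])


print(per_row_block_size((128, 4096)))     # (1, 4096)
print(per_row_block_size((8, 128, 4096)))  # (1, 1, 4096), different rank, same rule
```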
I see. Yeah, maybe an observer would not see multiple ranks in practice, but I am not sure. Could you discuss it further and decide what to do? Thanks.
yeah we'll discuss in our team meeting tomorrow to decide on what to do
Summary
The observers in torchao, such as `AffineQuantizedMinMaxObserver`, support different quantization granularities by keeping block sizes. For example, if the granularity is `PerTensor` and the input shape is `(16, 3, 224, 224)`, the observer keeps block sizes = `(16, 3, 224, 224)`. However, the block sizes become wrong if inputs with different shapes are passed in. For example, if another input with shape `(16, 3, 56, 56)` comes, the block sizes are updated to `(16, 3, 56, 56)`, which is wrong for inputs with shape `(16, 3, 224, 224)`.
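A minimal toy sketch of this behavior (not the actual torchao implementation; the class below is purely illustrative): a per-tensor observer that records its block size from whatever input it last saw ends up with a block size that matches only that last input.

```python
import torch


class ToyMinMaxObserver(torch.nn.Module):
    """Tracks min/max and, like the described observer, keeps a block_size
    derived from the input shape when the granularity is per-tensor."""

    def __init__(self):
        super().__init__()
        self.min_val = None
        self.max_val = None
        self.block_size = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-tensor granularity: block size == full input shape,
        # overwritten on every call.
        self.block_size = tuple(x.shape)
        cur_min, cur_max = x.amin(), x.amax()
        self.min_val = cur_min if self.min_val is None else torch.minimum(self.min_val, cur_min)
        self.max_val = cur_max if self.max_val is None else torch.maximum(self.max_val, cur_max)
        return x


obs = ToyMinMaxObserver()
obs(torch.randn(16, 3, 224, 224))
obs(torch.randn(16, 3, 56, 56))
print(obs.block_size)  # (16, 3, 56, 56), stale for the (16, 3, 224, 224) inputs
```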
How to reproduce
The code to reproduce is similar to the reproducer here: #2094
One needs to bypass the #2094 issue by changing the source code manually, and also run the `converted_model` after `convert_pt2e`:
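The full reproducer is in #2094 and is not repeated here; the sketch below only outlines the shape of the flow. The model, the quantizer, and the exact export/PT2E API calls are assumptions and may differ from the actual reproducer and from the torch/torchao versions in use.

```python
# Outline only: the quantizer is a placeholder and the export API may differ by version.
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3)).eval()
example_inputs = (torch.randn(16, 3, 224, 224),)

exported = torch.export.export(model, example_inputs).module()
quantizer = ...  # placeholder: a Quantizer configured to attach torchao's
                 # AffineQuantizedMinMaxObserver with PerTensor granularity
prepared = prepare_pt2e(exported, quantizer)

# Calibrate with two different spatial sizes so the observer sees both shapes.
prepared(torch.randn(16, 3, 224, 224))
prepared(torch.randn(16, 3, 56, 56))

converted_model = convert_pt2e(prepared)

# Running the converted model with the first shape now misbehaves, because the
# recorded block sizes correspond to the last-seen shape (16, 3, 56, 56).
converted_model(torch.randn(16, 3, 224, 224))
```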