[RVV] Add qu8-gemm/qu8-igemm kernels for rvv #8105
Looks good overall. You call this m4, but the store is m1? In the gemm config you set NR to 4 * hardware_config->vlenb / sizeof(int32_t). 4x4: typically on large convolutions it helps to be taller... the overread from reading the left side as bytes is amortized by applying the same weights to them. The quantization you could do as uint8, like Arm does. Typically fewer registers are needed for 8-bit than for float, and/or min/max may be faster as 8-bit instead of float. The gemm-config doesn't specify a packw kernel? Weird... we are using the reference code. As you can see, our qu8 requires a kernel zero point, which typically requires a 16-bit implementation, which is not ideal.
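To illustrate the remark about the kernel zero point, here is a minimal scalar sketch, assuming the kernel subtracts the zero point from each uint8 weight before multiplying. The function name qu8_dot and its signature are hypothetical and are not taken from XNNPACK. The difference of two uint8 values lies in [-255, 255], which does not fit in 8 bits, hence the widening to 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical scalar reference for illustration only; not XNNPACK code.
// Accumulates a[i] * (w[i] - kernel_zero_point) over k elements.
static int32_t qu8_dot(const uint8_t* a, const uint8_t* w, size_t k,
                       uint8_t kernel_zero_point) {
  int32_t acc = 0;
  for (size_t i = 0; i < k; i++) {
    // w[i] - kernel_zero_point is in [-255, 255]: 9 significant bits,
    // so the weight operand must be widened to at least int16.
    const int16_t wv = (int16_t) ((int16_t) w[i] - (int16_t) kernel_zero_point);
    const int16_t av = (int16_t) a[i];
    // The product of two 16-bit operands is accumulated in 32 bits.
    acc += (int32_t) av * (int32_t) wv;
  }
  return acc;
}
```

With a signed 8-bit weight format and no kernel zero point, the subtraction disappears and the weights could stay in 8 bits up to the multiply, which is presumably why the 16-bit path is described as not ideal.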
qu8_gemm_config.minmax.igemm[XNN_MR_TO_INDEX(4)] = xnn_init_hmp_igemm_ukernel((xnn_igemm_ukernel_fn) xnn_qu8_igemm_minmax_fp32_ukernel_4x4v__rvv);
qu8_gemm_config.init.qu8 = xnn_init_qu8_conv_minmax_fp32_scalar_params;
qu8_gemm_config.mr = 4;
qu8_gemm_config.nr = 4 * hardware_config->vlenb / sizeof(int32_t);
qu8_gemm_config.nr = hardware_config->vlenb / sizeof(uint8_t);
The 4v kernel is m1, so NR = vlenb.
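As a quick check of the two NR expressions in this thread, the sketch below evaluates both for a given vlenb (the vector register length in bytes). The helper names nr_as_written and nr_as_suggested are hypothetical and exist only for this comparison. Under these definitions the two expressions give the same value, so the suggested form reads as a statement of intent (one uint8 element per byte of an m1 register) rather than a change in the resulting NR.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helpers for illustration; vlenb is the vector length in bytes.

// NR as written in the PR: four registers' worth of int32 lanes.
static size_t nr_as_written(size_t vlenb) {
  return 4 * vlenb / sizeof(int32_t);
}

// NR as suggested in review: one register's worth of uint8 lanes.
static size_t nr_as_suggested(size_t vlenb) {
  return vlenb / sizeof(uint8_t);
}
```

For example, with VLEN = 128 bits, vlenb is 16 and both helpers return 16.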
Yes, I realized partway through this that I really should have changed the naming here to be 1v rather than 4v, as you say. But I also realized that I had done the same for the other Q gemm/igemm kernels, so I should probably go back and fix those in my next pass. Thanks for all the other notes above; I hope to get back to this in a couple of days.
@dsharlet and @fbarchard please review when you are able. Thank you.