A first sample version of FloatQuant
#159
base: feature/float_quant
Conversation
Sample `FloatQuant` function implemented. A sample use of the function can be found in the `Examples`. `±inf` are clipped to `±max_val`. `±NaN` are mapped to `NaN`. The zero is always representable. I tested with subnormals (to be understood as subnormals for the output representation) and the quantizer represented the subnormals with no loss (I didn't test this part extensively, though). I tested the function against the Brevitas `FloatQuant` implementation: they do not always match. For example, I think `0.3125` should be representable (`x == xq`) by a float quantizer with 4 bits for the mantissa, 4 bits for the exponent, 0 bias and 1 bit for the sign. The Brevitas `FloatQuant` implementation quantizes it to `0.25`. I am not sure which result should be considered correct for this case.
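For reference, the raw arithmetic behind the `0.3125` example (a sketch of the decomposition only, not a ruling on which implementation is correct):

```python
import math

# Decompose 0.3125 into mantissa and exponent with plain float arithmetic,
# independent of any particular quantizer implementation.
m, e = math.frexp(0.3125)  # frexp yields m in [0.5, 1): (0.625, -1)
m, e = m * 2, e - 1        # renormalize to [1, 2): 0.3125 == 1.25 * 2**-2
assert (m, e) == (1.25, -2)
# 1.25 is 1.01 in binary, so the fraction needs only 2 of the 4 available
# mantissa bits; representability then hinges on whether the exponent -2 is
# reachable under the chosen 4-bit exponent encoding with a bias of 0.
```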
Brevitas developer here, thanks for this concrete example - I will look into it ASAP!

Hi @nickfraser

Yes, please do - if you can provide minimal examples as well, this will make it much easier for us 🙏. Note, if you have proposed solutions, feel free to make PRs as well (pointing to …
Co-authored-by: Nicolo Ghielmetti <[email protected]>
… provided. Some other tests have been added
```python
    exponent_bias=None,
    max_val=None,
    rounding_mode="ROUND",
    lt_subnorm_to_zero=False,
```
@maltanar, this name is terrible! Please help me find a better one 🤦♂️
… quantization logic. Now QONNX and Brevitas float quantisers match.
Thanks @nghielme! Looking good, but needs a few fixes before merging.
#### Sample Implementation

TODO

let's add a comment that links this back to the source file, in case it changes in the future but we forget to update it here, e.g. `# see src/qonnx/custom_op/general/floatquant.py for up-to-date implementation`
```python
import numpy as np

from qonnx.custom_op.general.floatquant import compute_max_val, float_quantize
```
wrong name imported? the impl seems to have been renamed to `float_quant` (and is no longer `float_quantize`)
```python
    exponent_bias,
    signed,
```
since we have a notion of a default exponent bias (as implemented by the `compute_default_exponent_bias` function), I suggest enabling the default by setting `exponent_bias=None`. In that case `signed` should either be moved up in the parameter list to come before the params with default values, or be assigned some default value (`True`?) itself
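For concreteness, a minimal sketch of how that suggestion might look, assuming the renamed `float_quant` and the `compute_default_exponent_bias` helper mentioned above (the `exponent_bitwidth`/`mantissa_bitwidth` names and the exact parameter order are guesses, not the final API):

```python
def compute_default_exponent_bias(exponent_bitwidth):
    # conventional IEEE-style bias, assumed here for illustration
    return (2 ** (exponent_bitwidth - 1)) - 1

def float_quant(
    X,
    scale,
    exponent_bitwidth,
    mantissa_bitwidth,
    exponent_bias=None,  # None -> fall back to the default bias
    signed=True,         # given a default so it may follow exponent_bias
    max_val=None,
    rounding_mode="ROUND",
    lt_subnorm_to_zero=False,
):
    if exponent_bias is None:
        exponent_bias = compute_default_exponent_bias(exponent_bitwidth)
    ...
```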
```python
assert np.all(float_quantize(testcase_a, unit_scale, 2, 3) == testcase_a)
assert np.all(float_quantize(testcase_b, unit_scale, 2, 3) == testcase_b)
assert np.all(float_quantize(testcase_c, unit_scale, 2, 3) == compute_max_val(2, 3))
assert np.all(float_quantize(testcase_d, unit_scale, 3, 2) == compute_max_val(3, 2))
assert np.all(float_quantize(testcase_e, unit_scale, 2, 1) == compute_max_val(2, 1))
assert np.all(float_quantize(testcase_f, unit_scale, 2, 3, lt_subnorm_to_zero=True) == 0.0)
```
missing `signed` and perhaps `exponent_bias` args?
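A hedged sketch of what the fixed assertions might look like, assuming the renamed `float_quant` with positional `exponent_bias` and `signed` and the default-bias helper from the earlier comment (names and order to be confirmed against the actual signature):

```python
bias = compute_default_exponent_bias(2)  # = 1 for a 2-bit exponent
assert np.all(float_quant(testcase_a, unit_scale, 2, 3, bias, True) == testcase_a)
assert np.all(
    float_quant(testcase_c, unit_scale, 2, 3, bias, True) == compute_max_val(2, 3)
)
```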
```python
assert compute_max_val(2, 3) == 7.5   # FP6 E2M3
assert compute_max_val(3, 2) == 28.0  # FP6 E3M2
assert compute_max_val(2, 1) == 6.0   # FP4 E2M1
```
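As a sanity check on those constants: the usual minifloat maximum is `(2 - 2**-M) * 2**(2**E - 1 - bias)`, assuming the IEEE-style default bias `2**(E-1) - 1` and no exponent encodings reserved for inf/NaN. A re-derivation independent of the qonnx implementation:

```python
def expected_max_val(exp_bits, man_bits, bias=None):
    # assumes default bias 2**(E-1) - 1 and that the all-ones exponent
    # encodes a normal value rather than inf/NaN
    if bias is None:
        bias = 2 ** (exp_bits - 1) - 1
    max_exponent = 2**exp_bits - 1 - bias  # largest unbiased exponent
    max_mantissa = 2 - 2**-man_bits        # 1.11...1 with man_bits bits
    return max_mantissa * 2.0**max_exponent

assert expected_max_val(2, 3) == 7.5   # (2 - 1/8) * 2**2
assert expected_max_val(3, 2) == 28.0  # (2 - 1/4) * 2**4
assert expected_max_val(2, 1) == 6.0   # (2 - 1/2) * 2**2
```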
please also add the test with the example executing QONNX exported from Brevitas & comparing against the (pre-generated) reference values from Brevitas, which you shared with me previously. Please also consider that the function should be tested more extensively.
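A minimal sketch of such a test, assuming pre-generated artifacts on disk (the file names below are hypothetical placeholders for the model and reference values shared earlier):

```python
import numpy as np
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.core.onnx_exec import execute_onnx

def test_float_quant_matches_brevitas_reference():
    # hypothetical artifact paths; substitute the real shared files
    model = ModelWrapper("floatquant_exported_from_brevitas.onnx")
    inp = np.load("floatquant_test_input.npy")
    ref = np.load("floatquant_brevitas_reference_output.npy")
    iname = model.graph.input[0].name
    oname = model.graph.output[0].name
    out = execute_onnx(model, {iname: inp})[oname]
    assert np.allclose(out, ref)
```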