Description
Currently, there are a number of constraints on the quantization parameters, namely the sign of the scales (e.g. https://github.com/Xilinx/brevitas/blob/dev/src/brevitas/core/scaling/standalone.py#L274) and the sign of the zero-point (e.g. https://github.com/Xilinx/brevitas/blob/dev/src/brevitas/core/zero_point.py#L300).
These constraints were imposed, at least in part, for compatibility with certain export flows, but they can lead to sub-optimal performance, as discussed for the scales in #1308. An analogous discussion is needed for the zero-point.
However, lifting these constraints might break backward compatibility and would require a good amount of refactoring, since some parts of the codebase implicitly assume them.
A non-exhaustive list of issues that are likely to come up when lifting these constraints is provided below:
- Avoiding duplicated code paths for signed/unsigned scale and zero-point computation.
- Handling of Po2 scales.
- Interaction of signed scales with MSE.
- Impact of signed scales in the quantization of the zero-point.
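
To make two of the points above concrete, here is a minimal, framework-free sketch of what a negative scale does to a plain affine quantizer and to a power-of-two restriction. It does not use the Brevitas API: `quantize`, `representable_range` and `po2_scale` are hypothetical helpers written only for illustration, and the ceil-based Po2 rounding is an assumption rather than a description of the current implementation.

```python
import math


def quantize(x, scale, zero_point, qmin, qmax):
    """Affine fake-quantization of a scalar: quantize, clamp, dequantize."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))
    return (q - zero_point) * scale


def representable_range(scale, zero_point, qmin, qmax):
    """Real-valued interval covered by the integer range [qmin, qmax]."""
    a = (qmin - zero_point) * scale
    b = (qmax - zero_point) * scale
    # With a negative scale the endpoints swap, so sort them explicitly.
    return (min(a, b), max(a, b))


def po2_scale(scale):
    """Round a scale up to a power of two, keeping its sign (assumed scheme).

    math.log2 raises ValueError for negative inputs, so the sign has to be
    split off before taking the logarithm.
    """
    sign = -1.0 if scale < 0 else 1.0
    return sign * 2.0 ** math.ceil(math.log2(abs(scale)))


# Unsigned 8-bit integers with zero_point = 0:
print(representable_range(0.5, 0, 0, 255))   # (0.0, 127.5)
print(representable_range(-0.5, 0, 0, 255))  # covers -127.5 up to 0

# A negative input is clipped to 0 with a positive scale,
# but is representable once the scale is allowed to be negative.
print(quantize(-3.2, 0.5, 0, 0, 255))   # 0.0
print(quantize(-3.2, -0.5, 0, 0, 255))  # -3.0

print(po2_scale(-0.3))  # -0.5
```

The swapped endpoints in `representable_range` are an example of an implicit assumption (lower bound at `qmin`, upper bound at `qmax`) that code written for positive scales may rely on.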