Quantization
Generates a quantization parameter file for int8 inference with the ncnn framework.
Vector (and Scalar) Quantization, in PyTorch
[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework
PyTorch implementation of Data-Free Quantization Through Weight Equalization and Bias Correction.
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Official inference framework for 1-bit LLMs
A simple network quantization demo written from scratch in PyTorch (the core idea is sketched after this list).
Neural Network Compression Framework for enhanced OpenVINO™ inference
A highly parallelized implementation of non-maximum suppression for object detection, as used in self-driving cars.
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
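
Several of the entries above, in particular the from-scratch PyTorch demo, reduce to the same primitive: mapping float tensors to low-bit integers via a scale (and optionally a zero point). Below is a minimal sketch of symmetric per-tensor int8 quantization in plain PyTorch; the function names and the toy tensor are illustrative and not taken from any of the listed projects.

```python
import torch


def quantize_int8(x: torch.Tensor):
    """Symmetric per-tensor int8 quantization.

    Returns the int8 tensor and the scale needed to dequantize it.
    """
    # The scale maps the largest absolute value onto the int8 range [-127, 127].
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = torch.round(x / scale).clamp(-127, 127).to(torch.int8)
    return q, scale


def dequantize_int8(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float tensor from int8 values and a scale."""
    return q.to(torch.float32) * scale


if __name__ == "__main__":
    w = torch.randn(64, 128)            # toy weight tensor (illustrative)
    q, scale = quantize_int8(w)
    w_hat = dequantize_int8(q, scale)
    # Per-element rounding error is bounded by roughly scale / 2.
    print("scale:", scale.item())
    print("max abs error:", (w - w_hat).abs().max().item())
```

The full toolkits listed here (PPQ, NNCF, Neural Compressor, AIMET, and others) layer calibration, per-channel scales, zero points, and graph rewriting on top of this basic operation.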