Computer Science > Machine Learning

arXiv:2211.14928 (cs)

[Submitted on 27 Nov 2022]

Title: Class-based Quantization for Neural Networks

Authors: Wenhao Sun, Grace Li Zhang, Huaxi Gu, Bing Li, Ulf Schlichtmann

Abstract: Deep neural networks (DNNs) contain a huge number of weights and multiply-and-accumulate (MAC) operations, which makes them challenging to deploy on resource-constrained platforms such as mobile phones. Quantization reduces both the size and the computational complexity of DNNs. Existing quantization methods either require hardware overhead to achieve non-uniform quantization or focus on model-wise and layer-wise uniform quantization, which are not as fine-grained as filter-wise quantization. In this paper, we propose a class-based quantization method that determines the minimum number of quantization bits for each filter or neuron in a DNN individually. The method first evaluates the importance score of each filter or neuron with respect to the number of classes in the dataset. The larger the score, the more important the filter or neuron, and thus the more quantization bits it should receive. A search algorithm then exploits these differences in importance to determine the number of quantization bits for each filter or neuron. Experimental results demonstrate that the proposed method maintains inference accuracy with low bit-width quantization. Given the same number of quantization bits, it also achieves better inference accuracy than existing methods.
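To make the idea of filter-wise bit allocation concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not specify how importance scores are computed or how the search algorithm works, so the sketch takes the scores as given and uses a simple linear score-to-bits mapping as a stand-in for the search. All function names (`quantize_filter`, `bits_from_scores`, `quantize_layer`) and the 2-to-8-bit range are hypothetical choices made for illustration.

```python
import numpy as np


def quantize_filter(w, bits):
    """Symmetric uniform quantization of one filter to `bits` bits (bits >= 2)."""
    qmax = 2 ** (bits - 1) - 1          # largest integer level, e.g. 127 for 8 bits
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()                 # an all-zero filter needs no quantization
    scale = max_abs / qmax              # step size between adjacent levels
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                    # de-quantized values on the uniform grid


def bits_from_scores(scores, min_bits=2, max_bits=8):
    """Assumed stand-in for the paper's search: map scores linearly to bit widths."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.full(scores.shape, max_bits, dtype=int)
    norm = (scores - scores.min()) / span
    return np.round(min_bits + norm * (max_bits - min_bits)).astype(int)


def quantize_layer(weights, scores, min_bits=2, max_bits=8):
    """Quantize each filter (first axis of `weights`) with its own bit width."""
    bits = bits_from_scores(scores, min_bits, max_bits)
    quantized = np.stack(
        [quantize_filter(w, int(b)) for w, b in zip(weights, bits)]
    )
    return quantized, bits
```

For a convolutional layer with weights of shape `(out_channels, in_channels, kH, kW)`, calling `quantize_layer(weights, scores)` with one score per output channel returns the quantized weights together with the bit width chosen for each filter. Filters with higher scores keep more levels, which mirrors the abstract's statement that more important filters should receive more quantization bits.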

Comments:	Accepted by DATE 2023 (Design, Automation and Test in Europe)
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2211.14928 [cs.LG]
	(or arXiv:2211.14928v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2211.14928

Submission history

From: Grace Li Zhang
[v1] Sun, 27 Nov 2022 20:25:46 UTC (1,698 KB)
