
LUTIN: Efficient Neural Network Inference with Table Lookup

Published: 09 September 2024

Abstract

DNN models are becoming increasingly large and complex, yet they are increasingly deployed on commodity devices that demand low power and latency but lack specialized accelerators. We introduce LUTIN (LUT-based INference), which reduces the amount of matrix multiplication in DNN inference by converting it into table lookups. LUTIN's key innovation is its use of hyperparameter optimization to refine the quantization process and vector partitioning, allowing it to run efficiently on a variety of hardware. By reducing off-chip memory accesses and designing a cache-efficient data layout, LUTIN lowers energy consumption while improving utilization of the available CPU cache, even on devices with limited processing power. Our approach goes beyond the traditional limit of 8-bit quantization, investigating lower bit-widths that further shrink LUT size while still meeting accuracy requirements. Experimental results show that LUTIN achieves up to a 2.34x speedup in latency and a 2.04x improvement in energy efficiency over full-precision models.
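To make the abstract's core idea concrete, the sketch below illustrates the general product-quantization-style recipe for replacing a matrix-vector product with table lookups: input vectors are partitioned into subvectors, each subvector is quantized to one of a small set of prototypes, and the partial dot products of every prototype with the weight matrix are precomputed into a LUT, so online inference reduces to indexing and accumulation. This is a generic illustration of the family of techniques, not LUTIN's actual algorithm; all dimensions, the random "centroids", and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, chosen only for illustration).
d_in, d_out = 16, 4        # input / output feature sizes
n_sub, k = 4, 8            # number of subvectors, prototypes per subvector
d_sub = d_in // n_sub      # length of each subvector

W = rng.standard_normal((d_out, d_in))  # weight matrix of one layer

# Offline: choose k prototype subvectors per partition. Here they are
# random; in practice they would be learned (e.g. k-means on activations).
centroids = rng.standard_normal((n_sub, k, d_sub))

# Offline: precompute the lookup table of partial dot products:
#   lut[s, c, o] = centroids[s, c] . W[o, s*d_sub : (s+1)*d_sub]
W_r = W.reshape(d_out, n_sub, d_sub)            # [o, s, d]
lut = np.einsum('skd,osd->sko', centroids, W_r)  # [n_sub, k, d_out]

def lut_matvec(x):
    """Approximate W @ x using only nearest-prototype encoding + lookups."""
    xs = x.reshape(n_sub, d_sub)
    # Encode: nearest prototype per subvector (the only arithmetic online
    # besides the final accumulation).
    codes = np.argmin(((xs[:, None, :] - centroids) ** 2).sum(-1), axis=1)
    # The matrix multiplication becomes n_sub table lookups plus a sum.
    return lut[np.arange(n_sub), codes].sum(axis=0)

x = rng.standard_normal(d_in)
approx, exact = lut_matvec(x), W @ x
print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

The multiply-heavy work moves entirely offline into building `lut`; the online path touches only `n_sub * d_out` table entries per output vector, which is why LUT size and cache-friendly layout (the concerns the abstract highlights) dominate the achievable latency and energy savings.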



    Published In

    ISLPED '24: Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design
    August 2024
    384 pages
    ISBN:9798400706882
    DOI:10.1145/3665314
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. LUT
    2. DNN
    3. inference

    Qualifiers

    • Research-article


    Acceptance Rates

    Overall Acceptance Rate 398 of 1,159 submissions, 34%
