
LUTIN: Efficient Neural Network Inference with Table Lookup

Published: 09 September 2024

Abstract

DNN models are becoming increasingly large and complex, yet they are increasingly deployed on commodity devices that demand low power and latency but lack specialized accelerators. We introduce LUTIN (LUT-based INference), which reduces the amount of matrix multiplication in DNN inference by converting it into table lookups. LUTIN's key innovation is its use of hyperparameter optimization to refine the quantization process and vector partitioning, allowing it to run efficiently on a variety of hardware. By reducing off-chip memory accesses and designing a cache-efficient data layout, LUTIN lowers energy consumption while improving utilization of the available CPU cache, even on devices with limited processing power. Our approach goes beyond the traditional limit of 8-bit quantization, investigating lower bit-widths that further shrink LUT size while still meeting accuracy requirements. Experimental results show that LUTIN achieves up to a 2.34x speedup in latency and a 2.04x improvement in energy efficiency over full-precision models.
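To make the abstract's core idea concrete, the sketch below illustrates the general product-quantization-style recipe for replacing a matrix-vector product with table lookups: input vectors are partitioned into subvectors, each subvector is quantized to one of a small set of prototypes, and the partial dot products of every prototype with the weight matrix are precomputed into a LUT, so online inference reduces to indexing and accumulation. This is a generic illustration of the family of techniques, not LUTIN's actual algorithm; all dimensions, the random "centroids", and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, chosen only for illustration).
d_in, d_out = 16, 4        # input / output feature sizes
n_sub, k = 4, 8            # number of subvectors, prototypes per subvector
d_sub = d_in // n_sub      # length of each subvector

W = rng.standard_normal((d_out, d_in))  # weight matrix of one layer

# Offline: choose k prototype subvectors per partition. Here they are
# random; in practice they would be learned (e.g. k-means on activations).
centroids = rng.standard_normal((n_sub, k, d_sub))

# Offline: precompute the lookup table of partial dot products:
#   lut[s, c, o] = centroids[s, c] . W[o, s*d_sub : (s+1)*d_sub]
W_r = W.reshape(d_out, n_sub, d_sub)            # [o, s, d]
lut = np.einsum('skd,osd->sko', centroids, W_r)  # [n_sub, k, d_out]

def lut_matvec(x):
    """Approximate W @ x using only nearest-prototype encoding + lookups."""
    xs = x.reshape(n_sub, d_sub)
    # Encode: nearest prototype per subvector (the only arithmetic online
    # besides the final accumulation).
    codes = np.argmin(((xs[:, None, :] - centroids) ** 2).sum(-1), axis=1)
    # The matrix multiplication becomes n_sub table lookups plus a sum.
    return lut[np.arange(n_sub), codes].sum(axis=0)

x = rng.standard_normal(d_in)
approx, exact = lut_matvec(x), W @ x
print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

The multiply-heavy work moves entirely offline into building `lut`; the online path touches only `n_sub * d_out` table entries per output vector, which is why LUT size and cache-friendly layout (the concerns the abstract highlights) dominate the achievable latency and energy savings.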



    Published In

    ISLPED '24: Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design
    August 2024
    384 pages
    ISBN:9798400706882
    DOI:10.1145/3665314
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. LUT
    2. DNN
    3. inference

    Qualifiers

    • Research-article


    Acceptance Rates

    Overall Acceptance Rate 398 of 1,159 submissions, 34%
