DOI: 10.1145/3431920.3439477
Poster

NPE: An FPGA-based Overlay Processor for Natural Language Processing

Published: 17 February 2021

Abstract

In recent years, transformer-based models have shown state-of-the-art results for Natural Language Processing (NLP). In particular, the introduction of the BERT language model brought with it breakthroughs in tasks such as question answering and natural language inference, advancing applications that allow humans to interact naturally with embedded devices. FPGA-based overlay processors have been shown to be effective solutions for edge image and video processing applications, which mostly rely on low-precision linear matrix operations. In contrast, transformer-based NLP techniques employ a variety of higher-precision nonlinear operations at a significantly higher frequency. We present NPE, an FPGA-based overlay processor that can efficiently execute a variety of NLP models. NPE offers software-like programmability to the end user and, unlike FPGA designs that implement specialized accelerators for each nonlinear function, can be upgraded for future NLP models without requiring reconfiguration. NPE can meet real-time conversational AI latency targets for the BERT language model with 4x lower power than CPUs and 6x lower power than GPUs. We also show that NPE uses 3x fewer FPGA resources than comparable BERT network-specific accelerators in the literature. NPE provides a cost-effective and power-efficient FPGA-based solution for Natural Language Processing at the edge.

Published In

FPGA '21: The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
February 2021
240 pages
ISBN:9781450382182
DOI:10.1145/3431920
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. BERT
  2. FPGA
  3. NLP
  4. accelerator
  5. machine learning
  6. nonlinear
  7. overlay
  8. processor

Qualifiers

  • Poster

Conference

FPGA '21

Acceptance Rates

Overall Acceptance Rate 125 of 627 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 0
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 11 Dec 2024

Cited By

  • (2024) ONE-SA: Enabling Nonlinear Operations in Systolic Arrays For Efficient and Flexible Neural Network Inference. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE) (1-6). DOI: 10.23919/DATE58400.2024.10546535. Online publication date: 25-Mar-2024
  • (2024) Enhancing Long Sequence Input Processing in FPGA-Based Transformer Accelerators through Attention Fusion. Proceedings of the Great Lakes Symposium on VLSI 2024 (599-603). DOI: 10.1145/3649476.3658810. Online publication date: 12-Jun-2024
  • (2024) SimBU: Self-Similarity-Based Hybrid Binary-Unary Computing for Nonlinear Functions. IEEE Transactions on Computers 73:9 (2192-2205). DOI: 10.1109/TC.2024.3398512. Online publication date: Sep-2024
  • (2024) An FPGA-based Multi-Core Overlay Processor for Transformer-based Models. 2024 2nd International Symposium of Electronics Design Automation (ISEDA) (697-702). DOI: 10.1109/ISEDA62518.2024.10617729. Online publication date: 10-May-2024
  • (2024) An FPGA-Based Efficient Streaming Vector Processing Engine for Transformer-Based Models. 2024 2nd International Symposium of Electronics Design Automation (ISEDA) (722-727). DOI: 10.1109/ISEDA62518.2024.10617499. Online publication date: 10-May-2024
  • (2024) DTrans: A Dataflow-transformation FPGA Accelerator with Nonlinear-operators fusion aiming for the Generative Model. 2024 34th International Conference on Field-Programmable Logic and Applications (FPL) (339-345). DOI: 10.1109/FPL64840.2024.00054. Online publication date: 2-Sep-2024
  • (2024) LORA: A Latency-Oriented Recurrent Architecture for GPT Model on Multi-FPGA Platform with Communication Optimization. 2024 34th International Conference on Field-Programmable Logic and Applications (FPL) (332-338). DOI: 10.1109/FPL64840.2024.00053. Online publication date: 2-Sep-2024
  • (2024) TransFRU: Efficient Deployment of Transformers on FPGA with Full Resource Utilization. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC) (521-526). DOI: 10.1109/ASP-DAC58780.2024.10473976. Online publication date: 22-Jan-2024
  • (2024) SWAT: An Efficient Swin Transformer Accelerator Based on FPGA. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC) (515-520). DOI: 10.1109/ASP-DAC58780.2024.10473931. Online publication date: 22-Jan-2024
  • (2024) Vision Transformer-based overlay processor for Edge Computing. Applied Soft Computing 156 (111421). DOI: 10.1016/j.asoc.2024.111421. Online publication date: May-2024