- research-article, August 2022
Self adaptive reconfigurable arrays (SARA): learning flexible GEMM accelerator configuration and mapping-space using ML
DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference, Pages 583–588, https://doi.org/10.1145/3489517.3530506
This work demonstrates a scalable reconfigurable accelerator (RA) architecture designed to extract maximum performance and energy efficiency for GEMM workloads. We also present a self-adaptive (SA) unit, which runs a learnt model for one-shot ...
- research-article, December 2021
RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU
Geonhwa Jeong, Eric Qin, Ananda Samajdar, Christopher J. Hughes, Sreenivas Subramoney, Hyesoon Kim, Tushar Krishna
2021 58th ACM/IEEE Design Automation Conference (DAC), Pages 253–258, https://doi.org/10.1109/DAC18074.2021.9586257
As AI-based applications become pervasive, CPU vendors are starting to incorporate matrix engines within the datapath to boost efficiency. Systolic arrays have been the premier architectural choice as matrix engines in offload accelerators. However, we ...
- research-article, October 2018
GeneSys: enabling continuous learning through neural network evolution in hardware
MICRO-51: Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, Pages 855–866, https://doi.org/10.1109/MICRO.2018.00074
Modern deep learning systems rely on (a) a hand-tuned neural network topology, (b) massive amounts of labelled training data, and (c) extensive training over large-scale compute resources to build a system that can perform efficient image classification ...
- research-article, June 2018
Euphrates: algorithm-SoC co-design for low-power mobile continuous vision
ISCA '18: Proceedings of the 45th Annual International Symposium on Computer Architecture, Pages 547–560, https://doi.org/10.1109/ISCA.2018.00052
Continuous computer vision (CV) tasks increasingly rely on convolutional neural networks (CNN). However, CNNs have massive compute demands that far exceed the performance and energy constraints of mobile devices. In this paper, we propose and develop an ...
- research-article, March 2018
MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects
ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 461–475, https://doi.org/10.1145/3173162.3173176
Deep neural networks (DNN) have demonstrated highly promising results across computer vision and speech recognition, and are becoming foundational for ubiquitous AI. The computational complexity of these algorithms and a need for high energy-efficiency ...
Also published in: ACM SIGPLAN Notices, Volume 53, Issue 2
- research-article, October 2017
Rethinking NoCs for Spatial Neural Network Accelerators
NOCS '17: Proceedings of the Eleventh IEEE/ACM International Symposium on Networks-on-Chip, Article No.: 19, Pages 1–8, https://doi.org/10.1145/3130218.3130230
Applications across image processing, speech recognition, and classification heavily rely on neural network-based algorithms that have demonstrated highly promising results in accuracy. However, such algorithms involve massive computations that are not ...