Abstract
FPGAs have become a popular platform for deploying Convolutional Neural Networks (CNNs), and many researchers have explored how to map and deploy CNNs on FPGAs. Verifying these deployments at design time, however, remains one of the biggest challenges, and the need for design-time verification is growing as CNNs are increasingly used in safety-critical applications. To the best of our knowledge, this is the first work to propose a 2-Level 3-Way (2L-3W) hardware–software co-verification methodology at design time. 2L-3W provides a step-by-step guide for the successful mapping, deployment, and verification of CNNs on FPGA boards. The 2-Level verification ensures that the implementation at each stage (software and hardware) follows the desired behavior. The 3-Way co-verification provides a cross-paradigm (software, design architecture, and hardware) layer-by-layer parameter check to ensure that the CNN is correctly implemented and mapped onto the FPGA board. The proposed 2L-3W co-verification methodology has been evaluated on several test cases. In each case, a Python script compares three sets of results to obtain a similarity score: the prediction and layer-by-layer output of the CNN deployed on the PYNQ FPGA board (hardware), the intermediate layer-by-layer output of the CNN implemented in Vivado HLS (design architecture), and the prediction and layer-by-layer output of the software-level implementation (Caffe). The comparison quantifies how successfully the CNN has been mapped to the FPGA and, when the mapping is unsuccessful, helps identify at design time which layer needs to be debugged. We demonstrate the technique on LeNet CNN and LeNet-3D CNN (a Caffe-inspired network for the CIFAR-10 dataset), and the co-verification yields layer-by-layer similarity scores of 99%.
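The three-way, layer-by-layer comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' script: the similarity metric (percentage of elements that agree within a tolerance), the 99% threshold used as a pass criterion, the function names, and the toy layer outputs are all assumptions made for illustration only.

```python
from itertools import combinations
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical container: layer name -> flattened activations of that layer.
LayerOutputs = Dict[str, Sequence[float]]


def similarity(a: Sequence[float], b: Sequence[float], tol: float = 1e-2) -> float:
    """Percentage of elements that agree within tol (an assumed metric)."""
    if len(a) != len(b):
        raise ValueError("layer outputs differ in size")
    if not a:
        return 100.0
    matches = sum(1 for x, y in zip(a, b) if abs(x - y) <= tol)
    return 100.0 * matches / len(a)


def three_way_check(
    sources: Dict[str, LayerOutputs], layers: List[str], tol: float = 1e-2
) -> Dict[str, Dict[Tuple[str, str], float]]:
    """Score every pair of sources (e.g. software, design, hardware) per layer."""
    return {
        layer: {
            (p, q): similarity(sources[p][layer], sources[q][layer], tol)
            for p, q in combinations(sources, 2)
        }
        for layer in layers
    }


def first_failing_layer(
    report: Dict[str, Dict[Tuple[str, str], float]],
    layers: List[str],
    threshold: float = 99.0,
) -> Optional[str]:
    """First layer, in network order, where any pair scores below threshold."""
    for layer in layers:
        if min(report[layer].values()) < threshold:
            return layer
    return None


# Toy data only: these values are invented, not measurements from the paper.
layers = ["conv1", "pool1", "ip1"]
sources = {
    "caffe": {"conv1": [0.1, 0.2, 0.3, 0.4], "pool1": [0.2, 0.4], "ip1": [1.0, 2.0]},
    "hls": {"conv1": [0.1, 0.2, 0.3, 0.4], "pool1": [0.2, 0.4], "ip1": [1.0, 2.0]},
    "pynq": {"conv1": [0.1, 0.2, 0.3, 0.4], "pool1": [0.2, 0.9], "ip1": [1.0, 2.0]},
}
report = three_way_check(sources, layers)
print(first_failing_layer(report, layers))
```

In this toy run the hardware output diverges at pool1, so that layer is reported as the first one to debug. Checking layers in network order matters because an error in an early layer usually propagates to every later one.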
Funding
This work is partially supported by the National Science Foundation (NSF) under grant CNS #1852126, the Carnegie Classification Funding from the College of Engineering, and the Center for Manufacturing Research (CMR) at Tennessee Technological University.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Odetola, T.A., Groves, K.M., Mohammed, Y. et al. 2L-3W: 2-Level 3-Way Hardware–Software Co-verification for the Mapping of Convolutional Neural Network (CNN) onto FPGA Boards. SN COMPUT. SCI. 3, 60 (2022). https://doi.org/10.1007/s42979-021-00954-5