
A Scalable 3D Array Architecture for Accelerating Convolutional Neural Networks

  • Conference paper
  • First Online:
Cognitive Systems and Information Processing (ICCSIP 2021)

Abstract

Convolutional neural networks (CNNs) are widely used in computer vision and image recognition, and their structures have grown increasingly complex. This complexity poses performance and storage-capacity challenges for hardware implementation. To address these challenges, this paper proposes a novel 3D array architecture for accelerating CNNs. The proposed architecture has several benefits. First, a multilevel cache strategy improves data reuse, thereby reducing the access frequency to external memory. Second, performance and throughput are balanced across the 3D array nodes by novel workload- and weight-partitioning schemes. Third, computation and transmission are performed simultaneously, resulting in higher parallelism and lower hardware storage requirements. Finally, an efficient data-mapping strategy is proposed for better scalability of the entire system. Experimental results show that the proposed 3D array architecture effectively improves the overall computing performance of the system.
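The workload-partitioning idea mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual scheme: it assumes that a convolutional layer's output volume is split by output channels and spatial tiles across a 3D grid of nodes so that each node receives a near-equal share of the work. All names, the grid shape, and the partitioning granularity are assumptions for illustration only.

```python
# Hypothetical sketch: distribute a conv layer's output volume across a
# 3D grid of compute nodes by splitting channels (z-axis), output rows
# (y-axis), and output columns (x-axis) into contiguous chunks.
from itertools import product


def _chunks(n, parts):
    """Split range [0, n) into `parts` contiguous spans of near-equal size."""
    base, rem = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        size = base + (1 if i < rem else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def partition_layer(out_channels, out_h, out_w, grid=(2, 2, 2)):
    """Map each grid node (z, y, x) to its channel block and spatial tile."""
    gz, gy, gx = grid
    c_blocks = _chunks(out_channels, gz)
    r_tiles = _chunks(out_h, gy)
    c_tiles = _chunks(out_w, gx)
    return {
        (z, y, x): {"channels": cb, "rows": rt, "cols": ct}
        for (z, cb), (y, rt), (x, ct) in product(
            enumerate(c_blocks), enumerate(r_tiles), enumerate(c_tiles))
    }


# Example: a 64-channel 56x56 output split across a 2x2x2 node grid.
mapping = partition_layer(out_channels=64, out_h=56, out_w=56)
node_work = {
    node: (s["channels"][1] - s["channels"][0])
    * (s["rows"][1] - s["rows"][0])
    * (s["cols"][1] - s["cols"][0])
    for node, s in mapping.items()
}
# Here every node gets an equal share: 32 channels x 28 rows x 28 cols.
```

Because the spans are contiguous, neighbouring nodes share tile borders, which is the kind of locality a multilevel cache hierarchy can exploit to reduce external-memory traffic.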



Acknowledgement

This work was supported by the Hundred Talents Program of Chinese Academy of Sciences under grant No. Y9BEJ11001. This research was primarily conducted at Suzhou Institute of Nano-Tech and Nano-Bionics (SINANO).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xin Liu.

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Ji, Y. et al. (2022). A Scalable 3D Array Architecture for Accelerating Convolutional Neural Networks. In: Sun, F., Hu, D., Wermter, S., Yang, L., Liu, H., Fang, B. (eds) Cognitive Systems and Information Processing. ICCSIP 2021. Communications in Computer and Information Science, vol 1515. Springer, Singapore. https://doi.org/10.1007/978-981-16-9247-5_7


  • DOI: https://doi.org/10.1007/978-981-16-9247-5_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-9246-8

  • Online ISBN: 978-981-16-9247-5

  • eBook Packages: Computer Science, Computer Science (R0)
