Generalized Exponentiation Using STT Magnetic Tunnel Junctions: Circuit Design, Performance, and Application to Neural Network Gradient Decay

  • Original Research
  • Published:
SN Computer Science

Abstract

Nonlinear functions such as square and square root are critical in fields such as signal processing and machine learning, but computing them in the digital domain incurs power, area, and delay overheads. Selective computation in the analog domain is a viable alternative, although it brings its own prominent challenges of increased noise and reduced accuracy. Herein, we propose a reconfigurable analog circuit capable of performing generalized exponentiation within a mixed-signal field-programmable array. The resulting analog block, composed of magnetic tunnel junctions together with FET-based sensing and amplification circuits, is circuit-switched-configurable with terminal-level control. The design is configured to rapidly evaluate various arithmetic operations within acceptable error tolerances for selected applications. Compared with a state-of-the-art approximate digital multiplier, our design yields an approximately 95% reduction in area and a stable output within a period comparable to single-cycle execution. In addition, the analog circuit allows efficient and versatile computation of activation functions in a neural network architecture; simulation results demonstrate the possibility of reducing network size while retaining accuracy through such an approach.
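
To make the idea of a configurable exponent as an activation function concrete, the following is a minimal numerical sketch, not the authors' circuit model or training setup. It assumes an odd-symmetric power function sign(x)·|x|^p with p > 0; the function names and this particular extension to negative inputs are hypothetical choices made for illustration, and the sigmoid appears only as a familiar baseline.

```python
import math


def signed_power(x: float, p: float) -> float:
    """Odd-symmetric power activation sign(x) * |x|**p (assumes p > 0)."""
    return math.copysign(abs(x) ** p, x)


def signed_power_grad(x: float, p: float) -> float:
    """Derivative of signed_power with respect to x: p * |x|**(p - 1)."""
    if x == 0.0 and p < 1.0:
        # For roots (p < 1) the slope grows without bound as x approaches 0.
        raise ValueError("gradient is unbounded at x = 0 when p < 1")
    return p * abs(x) ** (p - 1.0)


def sigmoid_grad(x: float) -> float:
    """Derivative of the logistic sigmoid, used here only for comparison."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


if __name__ == "__main__":
    # p = 0.5 is a square root, p = 2.0 is a square.
    for p in (0.5, 2.0):
        for x in (0.25, 1.0, 4.0):
            print(
                f"p={p} x={x} f={signed_power(x, p):.4f} "
                f"df={signed_power_grad(x, p):.4f} "
                f"sigmoid'={sigmoid_grad(x):.4f}"
            )
```

In this sketch, the square-root setting (p = 0.5) has a gradient of 0.25 at x = 4, whereas the sigmoid gradient at the same input is roughly 0.018, which illustrates why the choice of exponent can matter for how quickly gradients shrink.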

Acknowledgements

This work was supported in part by the Center for Probabilistic Spin Logic for Low-Energy Boolean and Non-Boolean Computing (CAPSL), one of the Nanoelectronic Computing Research (nCORE) Centers as task 2759.006, a Semiconductor Research Corporation (SRC) program sponsored by the NSF through CCF-1739635, and by NSF through ECCS-1810256.

Author information

Corresponding author

Correspondence to Adrian Tatulian.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Hardware for AI, Machine Learning and Emerging Electronic Systems” guest edited by Himanshu Thapliyal, Saraju Mohanty and VS Kanchana Bhaaskaran.

About this article

Cite this article

Tatulian, A., DeMara, R.F. Generalized Exponentiation Using STT Magnetic Tunnel Junctions: Circuit Design, Performance, and Application to Neural Network Gradient Decay. SN COMPUT. SCI. 3, 148 (2022). https://doi.org/10.1007/s42979-022-01039-7

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s42979-022-01039-7

Keywords