Abstract
Nonlinear functions such as square and square root are critical in fields such as signal processing and machine learning, but computing them in the digital domain incurs power, area, and delay overheads. Selective computation in the analog domain is a viable alternative, although it comes at the cost of increased noise and reduced accuracy. Herein, we propose a reconfigurable analog circuit capable of performing generalized exponentiation within a mixed-signal field-programmable array. The resulting analog block of magnetic tunnel junctions, together with FET-based sensing and amplification circuits, is circuit-switched-configurable with terminal-level control. The design is configured to rapidly evaluate various arithmetic operations within acceptable error tolerances for selected applications. Compared to a state-of-the-art approximate digital multiplier, our design yields an approximately 95% reduction in area and reaches a stable output within a period comparable to single-cycle execution. In addition, the analog circuit allows efficient and versatile computation of activation functions in a neural network architecture; simulation results demonstrate the possibility of reducing network size while retaining accuracy through such an approach.
Acknowledgements
This work was supported in part by the Center for Probabilistic Spin Logic for Low-Energy Boolean and Non-Boolean Computing (CAPSL), one of the Nanoelectronic Computing Research (nCORE) Centers as task 2759.006, a Semiconductor Research Corporation (SRC) program sponsored by the NSF through CCF-1739635, and by NSF through ECCS-1810256.
Ethics declarations
Conflict of Interest
The authors declare no conflict of interest.
Additional information
This article is part of the topical collection “Hardware for AI, Machine Learning and Emerging Electronic Systems” guest edited by Himanshu Thapliyal, Saraju Mohanty and VS Kanchana Bhaaskaran.
About this article
Cite this article
Tatulian, A., DeMara, R.F. Generalized Exponentiation Using STT Magnetic Tunnel Junctions: Circuit Design, Performance, and Application to Neural Network Gradient Decay. SN COMPUT. SCI. 3, 148 (2022). https://doi.org/10.1007/s42979-022-01039-7