Memristor-CMOS Analog Coprocessor for Acceleration of High-Performance Computing Applications

Authors:

Nihar Athreyas,

J. Joshua Yang

ACM Journal on Emerging Technologies in Computing Systems (JETC), Volume 14, Issue 3

Article No.: 38, Pages 1 - 30

https://doi.org/10.1145/3269985

Published: 01 November 2018

Abstract

Vector-matrix multiplication underlies major applications in machine vision, deep learning, and scientific simulation. These applications require high computational speed and run on platforms that are constrained in size, weight, and power. With transistor scaling coming to an end, existing digital hardware architectures will not be able to meet this increasing demand. Analog computation, with its rich set of primitives and inherently parallel architecture, can be faster, more efficient, and more compact for some of these applications. One such primitive is vector-matrix multiplication based on a memristor-CMOS crossbar array. In this article, we develop a memristor-CMOS analog coprocessor architecture that can handle floating-point computation. To demonstrate the analog coprocessor at the system level, we use a new electronic design automation tool called PSpice Systems Option, which performs integrated cosimulation of MATLAB/Simulink and PSpice. The analog coprocessor performs better than the other processors it is compared against, with an observed speedup of up to 12× over projected GPU performance. Using the PSpice Systems Option tool, we simulate image-processing applications and solutions to partial differential equations on the analog coprocessor model.
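
The crossbar primitive named in the abstract can be illustrated with a minimal sketch. It assumes an ideal array in which each column current is the sum of row voltages times cell conductances (Ohm's law plus Kirchhoff's current law), and it handles signed matrix entries with a pair of non-negative conductance arrays. This is a generic illustration, not the coprocessor architecture developed in the article: the function names, the `g_max` value, and the differential-pair mapping are all assumptions introduced here, and device non-idealities, converters, and floating-point handling are left out.

```python
def crossbar_vmm(voltages, conductances):
    """Ideal crossbar: column current I_j = sum_i V_i * G_ij."""
    rows = len(conductances)
    cols = len(conductances[0])
    if len(voltages) != rows:
        raise ValueError("one input voltage per crossbar row is required")
    return [
        sum(voltages[i] * conductances[i][j] for i in range(rows))
        for j in range(cols)
    ]


def map_signed_matrix(matrix, g_max=1e-3):
    """Split a signed matrix into two non-negative conductance arrays.

    g_max (in siemens) is a hypothetical maximum device conductance,
    chosen only for illustration.
    """
    peak = max(abs(w) for row in matrix for w in row) or 1.0
    scale = g_max / peak  # conductance per unit of matrix value
    g_pos = [[max(w, 0.0) * scale for w in row] for row in matrix]
    g_neg = [[max(-w, 0.0) * scale for w in row] for row in matrix]
    return g_pos, g_neg, scale


def analog_vmm(vector, matrix, g_max=1e-3):
    """Compute vector x matrix through the idealized crossbar model."""
    g_pos, g_neg, scale = map_signed_matrix(matrix, g_max)
    i_pos = crossbar_vmm(vector, g_pos)
    i_neg = crossbar_vmm(vector, g_neg)
    # Subtract the two column currents, then undo the conductance scaling.
    return [(p - n) / scale for p, n in zip(i_pos, i_neg)]
```

For example, `analog_vmm([1.0, 2.0], [[1.0, -2.0], [3.0, 4.0]])` returns values that match the digital product `[7.0, 6.0]` up to floating-point rounding. In the sketch, all column currents are computed from the same applied voltages, which is the parallelism the abstract attributes to analog computation.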

Index Terms

Memristor-CMOS Analog Coprocessor for Acceleration of High-Performance Computing Applications
1. Hardware
  1. Emerging technologies
    1. Analysis and design of emerging devices and systems
      1. Emerging architectures

Published In

ACM Journal on Emerging Technologies in Computing Systems Volume 14, Issue 3

July 2018

150 pages

ISSN: 1550-4832

EISSN: 1550-4840

DOI: 10.1145/3287773

Editor:
Yuan Xie
University of California, Santa Barbara, USA

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 01 November 2018

Accepted: 01 August 2018

Revised: 01 July 2018

Received: 01 December 2017

Published in JETC Volume 14, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

DARPA contract

Article Metrics

Total Citations: 6
Total Downloads: 467
Downloads (Last 12 months): 48
Downloads (Last 6 weeks): 6

Reflects downloads up to 03 Jan 2025

Cited By

S. Tong, H. Bao, J. Li, L. Yang, H. Zhou, Y. Li, and X. Miao. 2024. Energy-Efficient Brain Floating Point Convolutional Neural Network Using Memristors. IEEE Transactions on Electron Devices 71, 5 (May 2024), 3293-3300. https://doi.org/10.1109/TED.2024.3379953

M. Liehr, K. Beckmann, and N. Cady. 2022. Impact of Switching Variability, Memory Window, and Temperature on Vector Matrix Operations Using 65nm CMOS Integrated Hafnium Dioxide-based ReRAM Devices. In 2022 IEEE 31st Microelectronics Design & Test Symposium (MDTS) (23 May 2022), 1-6. https://doi.org/10.1109/MDTS54894.2022.9826924

X. Ji, Z. Dong, G. Zhou, C. Lai, Y. Yan, and D. Qi. 2021. Memristive System Based Image Processing Technology: A Review and Perspective. Electronics 10, 24 (20 Dec. 2021), 3176. https://doi.org/10.3390/electronics10243176

F. Kiani, J. Yin, Z. Wang, J. Yang, and Q. Xia. 2021. All Hardware-based Two-layer Perceptron Implemented in Memristor Crossbar Arrays. In 2021 IEEE International Symposium on Circuits and Systems (ISCAS) (May 2021), 1-5. https://doi.org/10.1109/ISCAS51556.2021.9401793

M. Liehr, J. Hazra, K. Beckmann, S. Rafiq, and N. Cady. 2020. Impact of Switching Variability of 65nm CMOS Integrated Hafnium Dioxide-based ReRAM Devices on Distinct Level Operations. In 2020 IEEE International Integrated Reliability Workshop (IIRW) (Oct. 2020), 1-4. https://doi.org/10.1109/IIRW49815.2020.9312855

C. Wang, D. Feng, W. Tong, J. Liu, Z. Li, J. Chang, Y. Zhang, B. Wu, J. Xu, W. Zhao, Y. Li, and R. Ren. 2019. Cross-point Resistive Memory. ACM Transactions on Design Automation of Electronic Systems 24, 4 (20 June 2019), 1-37. https://dl.acm.org/doi/10.1145/3325067
