Time-Series Forecasting and Sequence Learning Using Memristor-based Reservoir System

Published: 10 December 2024

Abstract

Pushing the frontiers of time-series information processing in the ever-growing domain of edge devices with stringent resources has been impeded by the systems’ limited ability to process information and learn locally on the device. Local processing and learning of time-series information typically demand intensive computations and massive storage, as the process involves retrieving information and tuning hundreds of parameters back in time. In this work, we developed a memristor-based echo state network accelerator that features efficient temporal data processing and in situ online learning. The proposed design is benchmarked using various datasets involving real-world tasks, such as forecasting load energy consumption and weather conditions. The experimental results illustrate that the hardware model experiences only a marginal degradation in performance compared to the software counterpart. This is mainly attributed to the limited precision and dynamic range of the network parameters when emulated using memristor devices. The proposed system is evaluated for lifespan, robustness, and energy-delay product. It is observed that the system demonstrates reasonable robustness to device failure rates below 10%, which may occur due to stuck-at faults. Furthermore, a 247× reduction in energy consumption is achieved when compared to a custom CMOS digital design implemented at the same technology node.

1 Introduction

Empowering edge devices with the capability to process and learn continuously on the device is essential when handling stationary and non-stationary time-series data, as remote cloud services have proven to be slow and insecure [14]. While such an objective enhances reliability and adaptability to edge constraints, it is challenging to satisfy in devices with limited resources. In this work, local processing and learning from time-series data are explored within the context of reservoir networks, particularly the echo state network (ESN) [12]. The ESN was developed to handle continuous data of a stationary and non-stationary nature and has powered a large number of applications, including prosthetic finger control [42], stock market prediction [35], cyber-attack and anomaly detection [9, 32], weather forecasting [6], and modeling dynamic motions in bio-mimetic robots [29]. The network features efficient temporal data processing and unrivaled training speed [3, 4]. Furthermore, it does not suffer from the known training problems in most recurrent neural networks, such as vanishing and exploding gradients, which occur due to the recurrent connections and the backward propagation of the gradient through time [41]. Thus, running the ESN on edge devices using dedicated custom-designed hardware not only enables the algorithm to make temporal decisions in real time but also yields a significant improvement in power efficiency and throughput [8, 17].
These reasons have urged the research community to investigate physical implementations of the ESN. There are several digital [7, 10, 16] and mixed-signal implementations of ESNs in the literature using memristor devices, MOSFET switching, and photonics [18, 30, 37, 41]. However, this work places emphasis on memristor-based implementations, owing to their numerous advantages over other approaches, such as integration with CMOS devices and fabrication using the standard foundry process [13]. Furthermore, the intrinsic non-linearity of the memristor enhances the separability of the reservoir and the echo state properties [24]. An early attempt at a memristive ESN, a hardware implementation using a memristor double-crossbar array, was presented by Hassan et al. in 2017. That design was evaluated for autonomous signal generation using the Mackey-Glass dataset, with the simulation done in MATLAB. In 2019, Wen et al. [36] presented a memristor-based ESN trained in an online fashion using least mean square (LMS). The network was evaluated for short-term power load forecasting. In the same year, a memristor-based reservoir computing system, which leverages the concept of the virtual node, was introduced by Moon et al. [24]. The virtual node concept involves building a single reservoir node and using it to virtually emulate a chain of other nodes in the reservoir. The proposed system features high power efficiency but suffers from high latency and a short lifespan; it is demonstrated on speech recognition and time-series forecasting. In 2022, the same concept was adopted to build a cyclic reservoir computing system based on memristor devices, implemented in a 65nm technology node and verified on a handwritten vowel recognition task [20]. Recently, Nair et al. [25] proposed a memristive ESN with an extended synaptic sampling machine (ESSM) to ensure synaptic stochasticity and power efficiency. The proposed neural system is benchmarked on classification tasks using ECG, MNIST, Fashion MNIST, and CIFAR10.
While a few ESN implementations are equipped with in situ learning to endow the ESN with fast learning and adaptation capabilities, none extend the learning to cover structural plasticity (synaptogenesis), which is imperative for controlling the network’s sparsity level. To the best of our knowledge, few of them utilize leaky-integrated discrete-time continuous-value neurons in the reservoir layer. Having leaky-integrated neurons is essential when dealing with temporal data, as they control the speed of the reservoir update dynamics and, eventually, the duration of the short-term memory of the ESN [21]. Beyond the absence of leaky-integrated neurons, none of the aforementioned implementations take into consideration the limited precision of signal digitization and the undesired leakage of the sample and hold (S/H) circuits during network training and testing. This article addresses these problems with key contributions summarized as follows:
Introducing an energy-efficient memristive ESN accelerator, targeting AI-powered edge devices.
Enhancing the hardware-aware adaptation and the dynamic diversity of the ESN by enabling in situ learning (synaptogenesis and synaptic plasticity) and using leaky-integrated neurons.
Investigating the potential of enhancing the lifespan of the ESN and its robustness to device failure.
Evaluating the proposed accelerator for robustness, latency, network lifespan, and energy-delay product (EDP).
The rest of the article is organized as follows. Section 2 discusses the theory of the ESN and training procedure. Section 3 presents the system design and implementation of the ESN. The design methodology is introduced in Section 4. Sections 5 and 6 discuss the results and conclude the article, respectively.

2 Overview of ESN

The ESN is a class of reservoir networks originally proposed by Jaeger [12] in 2001 to circumvent the challenges associated with training recurrent neural networks (vanishing and exploding gradients when using backpropagation through time [8]). The algorithm is inspired by the structure of the neocortex and attempts to model its dynamics and sequence learning [41]. The high-level diagram of the ESN, shown in Figure 1, comprises three consecutive layers of \(n_u\) input neurons, \(n_r\) reservoir (or hidden) neurons, and \(n_o\) readout (or output) neurons. The input layer serves as a buffer, holding the information presented to the network. The reservoir works as a feature extractor that provides non-linear dynamics and memory, enabling the network to process information of a temporal nature. The output layer is a linear classifier. When the ESN is used for time-series forecasting, it operates in three phases: initialization, prediction, and training, summarized in Algorithm 1. During the initialization phase (lines 2–5), the strength of the synaptic connections (weights) is randomly initialized and the features associated with the structure of the network, such as the reservoir feedback connections, are set. Besides the synaptic initialization, all non-zero elements of the reservoir weight matrix are scaled so that the network operates at the edge of chaos.

Once the initialization phase is finished, which occurs only once, the prediction phase begins. Here, the input examples (\(U = \lbrace {u}^{1}, {u}^{2}, \ldots , {u}^{n_m}\rbrace\)) are presented to the network in an online fashion. Given an example, \({u} \in \mathbb {R}^{n_u \times n_t}\), its features are sequentially presented to the network, where they are first multiplied by the synaptic weights connecting the input and reservoir layers, \(W_{ri} \in \mathbb {R}^{n_r \times n_u}\), and then summed up. Concurrently, within the reservoir layer, the outputs of the reservoir neurons from the previous timestep (\(x^{\lt t-1\gt }\)) are multiplied by their corresponding feedback connection weights, \(W_{rr} \in \mathbb {R}^{n_r \times n_r}\), and summed up. The weighted sums from both input and feedback connections are added and then subjected to a hyperbolic tangent (tanh) activation function (g) to introduce non-linearity. The output of the tanh function, \(\hat{x}^{\lt t\gt }\), represents the reservoir neuron’s internal state, which is used along with the leakage rate (\(\delta\)) to determine the final output of the reservoir neurons, \(x^{\lt t\gt }\) (see lines 8 and 9). The reservoir neurons in this work capture the leaky-integrated discrete-time continuous-value feature to ensure broader control over the reservoir update dynamics. The outputs of the reservoir neurons are then transmitted to the readout layer via the weighted synaptic connections, \(W_{or} \in \mathbb {R}^{n_o \times n_r}\), and the weighted sum is subsequently fed to a sigmoid activation function (f) to compute the predicted value. Once the predicted value is computed, the training phase starts to minimize the error between the predicted and targeted values (for forecasting, the targeted value is \(u^{\lt t+n_p\gt }\) in the time-series data, where \(n_p\) denotes the number of prediction steps). Unlike other networks, the training in the ESN is confined solely to the output layer. This makes the algorithm well known for its fast training, thereby making it attractive for numerous applications.
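To make the update concrete, the following is a minimal NumPy sketch of one prediction step (mirroring lines 7–10 of Algorithm 1). The weight names, the sparse initialization, and the spectral-radius scaling are illustrative assumptions; the hardware realizes this datapath with memristor crossbars rather than software.

```python
import numpy as np

def esn_step(u_t, x_prev, W_ri, W_rr, W_or, leak):
    """One ESN prediction step: leaky-integrated reservoir update followed by the readout.

    u_t    : input features at time t, shape (n_u,)
    x_prev : reservoir outputs from the previous timestep, shape (n_r,)
    W_ri   : input-to-reservoir weights, shape (n_r, n_u)
    W_rr   : reservoir feedback weights, shape (n_r, n_r)
    W_or   : reservoir-to-readout weights, shape (n_o, n_r)
    leak   : leakage rate (delta) in [0, 1]
    """
    x_hat = np.tanh(W_ri @ u_t + W_rr @ x_prev)        # internal state, g = tanh
    x_t = leak * x_hat + (1.0 - leak) * x_prev         # leaky integration
    y_t = 1.0 / (1.0 + np.exp(-(W_or @ x_t)))          # readout, f = sigmoid
    return x_t, y_t

# Illustrative initialization for a 1-input, 100-neuron, 1-output ESN.
rng = np.random.default_rng(0)
n_u, n_r, n_o = 1, 100, 1
W_ri = rng.uniform(-0.5, 0.5, (n_r, n_u))
W_rr = rng.uniform(-0.5, 0.5, (n_r, n_r)) * (rng.random((n_r, n_r)) < 0.1)  # sparse feedback
W_rr *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_rr)))  # scale toward the edge of chaos
W_or = rng.uniform(-0.5, 0.5, (n_o, n_r))
x, y = esn_step(np.array([0.42]), np.zeros(n_r), W_ri, W_rr, W_or, leak=0.3)
```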
Various training algorithms can be used to train the ESN, such as backpropagation-decorrelation, FORCE learning [31], and ridge regression, with ridge regression being the most common approach [21]. Ridge regression involves finding the optimal set of weights that minimizes the squared error between the outputs of the network (\(\hat{Y}\)) and the ground truth labels (Y). It is carried out by multiplying the Moore-Penrose generalized inverse of the reservoir output (X) by the ground truth labels. Orthogonal projection and Tikhonov regularization (\(\lambda\)) are recommended here to overcome overfitting and enhance stability (see (1)).
\begin{equation} W_{or} = YX^T (XX^T + \lambda I)^{-1} \end{equation}
(1)
Fig. 1. A high-level representation of the ESN, which consists of input, reservoir, and readout layers. The input serves as a buffer, whereas the reservoir and readout layers are dedicated to feature extraction and classification, respectively.
Although such a training approach is effective and recommended for training the ESN, it is computationally costly and requires massive memory to store thousands of labels and states (reservoir outputs). One may resort to recursive least squares (RLS) to mitigate the computational and memory requirements, but this approach is still not feasible for edge devices with stringent resource constraints. It is important to note that the aforementioned training approaches suffer from a divergence problem when the network runs for several timesteps without training, generally caused by the accumulation of small errors. Such a problem is inevitable and cannot be avoided by altering the structure of the network, but it can be mitigated by bringing the reservoir dynamics back to their original state by re-presenting the ground truth input to the reservoir [24]. However, such an approach may work when dealing with stationary data but not real-world data of a non-stationary nature. Therefore, to overcome the challenges associated with finding the optimal set of parameters that ensure the best network performance while securing fast learning and adaptation, we use the online LMS algorithm due to its simplicity and minimal use of computational resources and storage. The LMS is used with weight decay regularization (LMS+L2) to optimize ESN performance even further as the network is exposed to new events (see lines 13–20). Besides regularization, the gradients are sparsified (line 16) to avoid insignificant changes to the network parameters and to enhance the learning process in hardware. Furthermore, the weights are tuned periodically, every \(n_{up}\) iterations, rather than at every iteration. This expedites network adaptation and minimizes energy consumption.
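A compact sketch of the readout update described above, LMS with L2 weight decay, gradient sparsification, and periodic updates; the learning rate lr, regularization factor lam, threshold theta, and update period n_up are placeholder names and values, not the tuned hardware parameters.

```python
import numpy as np

def lms_l2_update(W_or, x_t, y_pred, y_target, lr=0.01, lam=1e-4, theta=1e-3):
    """One LMS + weight-decay (L2) update of the readout weights.

    The gradient is sparsified: entries with magnitude below `theta` are zeroed,
    so only significant changes are written to the devices. The derivative of the
    output non-linearity is ignored, as is common for LMS-style rules.
    """
    err = y_pred - y_target                    # network output error
    grad = np.outer(err, x_t)                  # LMS gradient estimate
    grad[np.abs(grad) < theta] = 0.0           # gradient sparsification
    return W_or - lr * (grad + lam * W_or)     # gradient step plus L2 decay

# Updates are applied every n_up samples rather than at every iteration.
rng = np.random.default_rng(1)
W_or, x_t = rng.uniform(-0.1, 0.1, (1, 100)), rng.random(100)
n_up, step = 10, 20
if step % n_up == 0:
    W_or = lms_l2_update(W_or, x_t, y_pred=np.array([0.70]), y_target=np.array([0.65]))
```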

3 System Design

The general architecture of an ESN, as covered in the overview section, is composed of consecutive layers of fully and sparsely connected neurons. In hardware, the same architecture is captured, but the inter-layer connections are buffered to control data flow (Figure 2). During the forward pass, the input signals are sampled and held and then presented to the reservoir layer via weighted connections modeled by memristor devices. Once the weighted sum of the input, together with the weighted reservoir neuron responses from the previous timestep, reaches the reservoir neurons, each neuron's non-linear response is determined. The network output is then computed from the reservoir response by the readout layer, which is trained in an online fashion as alluded to earlier.
Fig. 2. The schematic of the proposed ESN accelerator, including input, reservoir, and readout layers. Each layer comprises S/H circuits to discretize and temporally hold the time-series data, memristor devices to emulate the synaptic weights, and neuron circuits (leaky-integrated discrete-time continuous-value neurons in the reservoir and point neurons in the output layer). In the output layer, an additional unit, the training circuitry, enables in situ learning.
The input layer is equipped with an S/H circuit1 to discretize the continuous time-series input signal and to hold it until the reservoir neurons are ready to receive a stimulus. The stimulus to the reservoir neurons is presented via a set of memristor devices integrated into the crossbar structure. These memristors model the synaptic weights bridging the input and reservoir layers. Using one memristor to model each synaptic weight in the network is not common, as it can only model unipolar weights. Thus, two cross-coupled memristors (2M structure) are typically used, where the net difference between their conductances results in a bipolar weight representation. Using two memristors doubles the resources utilized in each layer, leading to a less compact and less energy-efficient design. To ensure efficiency, one may resort to a 1M1R structure. In 1M1R, an \(n_u \times (n_r+1)\) crossbar is used, where each memristor in the crossbar corresponds to one weight representation and the additional column is reserved for a reference resistor (or untunable memristor). Hence, the net difference between the conductance of any tunable memristor and the corresponding reference resistor results in a bipolar weight representation [44]. While the 1M1R structure uses fewer resources, it suffers from a limited dynamic range and may not be favorable when implementing streaming networks due to memristor endurance limitations. In this work, we adopted the 2M structure and attempted to leverage network and device inherent features to minimize resource usage while enhancing network robustness and performance.
\begin{equation} \begin{aligned} \hat{x_i}^{\lt t\gt } = g\left[\left(\frac{R_f}{M^-_{i1}} - \frac{R_f}{M^+_{i1}}\right)~ u^{\lt t\gt }+ \left(\frac{R_f}{M^-_{i2}} - \frac{R_f}{M^+_{i2}}\right) ~x^{\lt t-1\gt }_{1} + \cdots + \left(\frac{R_f}{M^-_{i(n_h+1)}} - \frac{R_f}{M^+_{i(n_h+1)}}\right)~x^{\lt t-1\gt }_{n_h}\right] \end{aligned} \end{equation}
(2)
\begin{equation} x_i^{\lt t\gt } \simeq \underbrace{\frac{M_z || M_y}{(M_z || M_y) + M_x}}_{\delta }\hat{x}^{\lt t\gt }+ \underbrace{\frac{M_z||M_x}{(M_z || M_x)+M_y}}_{1-\delta }x^{\lt t-1\gt } \end{equation}
(3)
When the stimulus reaches the reservoir layer, the internal states of all the reservoir neurons, \(\hat{x}^{\lt t\gt }\), are updated. This process begins by computing the sum of the inputs (\(u^{\lt t\gt }\) and \(x^{\lt t-1\gt }\)) weighted by non-tunable memristor devices (random fixed weights), as described in (2). Then, the non-linearity, represented by the tanh function, is approximated by leveraging the linear operating region and limited bias of pMOS-input operational amplifiers (Op-Amps). After determining the internal state of a reservoir neuron, the leaky-integrated feature is captured to assess how much information from the previous activation should be carried over to the current hidden state. Implementing this feature in hardware may involve using at least one additional Op-Amp, leading to high power consumption and a large footprint area. To circumvent this issue, we introduce the so-called leakage cell. The leakage cell combines the internal state and the previous activation using three memristors connected as shown in Figure 2. These memristors model the leakage terms, with the internal state being multiplied by the leakage rate (\(\delta\)), modeled by \(\frac{M_z || M_y}{(M_z || M_y) + M_x}\), and the previous reservoir neuron output being multiplied by (\(1 - \delta\)), modeled by \(\frac{M_z||M_x}{(M_z || M_x)+M_y}\).2 This approach reduces the power consumption and area of the reservoir layer by more than 2\(\times\) and 170\(\times\), respectively, compared to using Op-Amps.

Once the final outputs of the reservoir neurons are ready, they are relayed to the feedback circuit. In the feedback circuit, the activation of each reservoir neuron is forwarded to the output layer to compute the final network output, \(\hat{y}\), and is also sampled and held to be used as feedback in the upcoming iterations. Because the activation of a reservoir neuron is the output of a tanh function, it can be either positive or negative, and storing a negative value in an S/H circuit is not feasible. To this end, we superimpose the output of each reservoir neuron on a fixed DC offset to bring all outputs into the positive range during the sampling phase. The DC offset is canceled once the stored values are re-presented to the reservoir. Once the output of the reservoir layer is ready, it is multiplied by the corresponding weights, and the final output of the ESN is computed. Then the training phase starts, which is discussed in detail in the next subsection.
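As a quick numerical check of the leakage-cell approximation in Equation (3), the sketch below computes \(\delta\) and \(1-\delta\) from three example resistances; the specific values are illustrative, chosen only so that, as footnote 2 requires, a large \(M_z\) makes the two terms sum to approximately 1.

```python
def parallel(a, b):
    """Equivalent resistance of two resistors in parallel."""
    return a * b / (a + b)

# Illustrative resistances; M_z is much larger than M_x and M_y (see footnote 2)
# so that the two leakage terms below sum to roughly 1.
M_x, M_y, M_z = 200e3, 800e3, 10e6

delta = parallel(M_z, M_y) / (parallel(M_z, M_y) + M_x)            # multiplies x_hat
one_minus_delta = parallel(M_z, M_x) / (parallel(M_z, M_x) + M_y)  # multiplies x_prev

print(delta, one_minus_delta, delta + one_minus_delta)  # ~0.79, ~0.20, ~0.98
```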
It is important to highlight that each crossbar in the reservoir layer uses a 2M configuration. Although this configuration is well known to possess a 10\(\times\) smaller footprint compared to 1T1M [39], it is harder to configure. Reconfigurability is an essential feature in an ESN hardware implementation, as it endows the network with an additional degree of freedom to continuously change its structure and sparsity level, ensuring network stability. In this work, reconfigurability (synaptogenesis) is enabled through the Ziksa writing scheme, which we proposed in prior work [47]. Ziksa is used to suppress the effect of the synaptic connections that are not involved in computation, promoting network sparsity. This suppression is achieved by setting the conductance of the memristor emulating the positive portion of a weight (\(M^+\)) equal to that representing the negative portion (\(M^-\)), such that the net current flowing through the synapse for a given input is zero. Our approach significantly reduces the resources required to enable sparsity, unlike the work of Kume et al. [18], which suggests using switching transistors and 1-bit memory cells or flip-flops to control each MOSFET device in the crossbar.
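The following sketch illustrates how a bipolar weight can map onto a 2M conductance pair and how a synapse is suppressed by equating the two conductances. The feedback resistance, the mid-point mapping, and the device range are assumptions for illustration, not the fabricated circuit values.

```python
R_F = 100e3                        # assumed feedback resistance of the summing stage
G_MIN, G_MAX = 1 / 2e6, 1 / 200e3  # assumed bounds for a 200kOhm-2MOhm memristor

def weight_to_pair(w):
    """Map a bipolar weight onto a (G_a, G_b) conductance pair around a mid-point.

    The effective weight is R_F * (G_a - G_b); which device feeds the inverting
    input fixes the sign convention.
    """
    g_mid = (G_MIN + G_MAX) / 2
    dg = w / (2 * R_F)
    g_a = min(max(g_mid + dg, G_MIN), G_MAX)
    g_b = min(max(g_mid - dg, G_MIN), G_MAX)
    return g_a, g_b

def pair_to_weight(g_a, g_b):
    return R_F * (g_a - g_b)

g_a, g_b = weight_to_pair(0.15)
print(pair_to_weight(g_a, g_b))    # ~0.15

# Ziksa-style suppression: equalize the two conductances so the synapse
# contributes zero net current for any input (a zero weight).
g_a = g_b
print(pair_to_weight(g_a, g_b))    # 0.0
```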

3.1 ESN Accelerator Training

As discussed in the overview section, training in ESNs is confined solely to the readout layer, and it can be performed either analytically via ridge regression or iteratively using LMS. In this work, we choose LMS to ensure fast learning and minimize resource usage. The LMS training equation consists of two terms: a gradient term and a regularization term. The gradient is estimated by first computing the network output error, the difference between the predicted value and the ground truth, within the training circuitry block. Typically, a subtractor built around a non-inverting operational amplifier is used for this purpose. The output of the subtractor is multiplied by the activations of the reservoir neurons to estimate the gradient required to modulate the corresponding memristor conductance. The gradient is digitized using a 6-bit Flash-ADC (optimized for power-constrained devices), and the output is sent to a global controller, where it is temporarily stored. Then, the regularization term is estimated as follows. First, a fixed test voltage3 is applied to the memristor that needs to be updated (\(M_u\)). The output voltage is then captured by the ADC, and the result is sent to the global controller to estimate the value of the memristor conductance according to (4).
\begin{equation} M_u = R_f \times \frac{V_{test}}{V_o} \end{equation}
(4)
Once the memristor conductance is estimated, it is multiplied by the regularization factor, and the result is added to the gradient to estimate the amount of change that needs to be made to the memristor. This amount (\(\Phi\)) is then translated into a pulse signal with a fixed amplitude and variable duration (the duration reflects the amount of change to be made to the memristor) using the conversion circuit shown in Figure 3, where the output (\(V_m\)) is given by Equations (5) and (6). Here, \(V_{int}\) denotes the output of the integrator, and \(V_r\) is a reference input voltage, which, along with R and C, controls the speed of integration.
\begin{equation} V_m = {\left\lbrace \begin{array}{ll}T_h,& \Phi \gt V_{int} \\ 0, & otherwise \end{array}\right.} \end{equation}
(5)
\begin{equation} T_h = \Phi \times \frac{V_r}{RC} \end{equation}
(6)
Fig. 3. The designed circuit used to convert (\(\Phi\)) into a pulse signal of fixed amplitude and variable duration. The circuit encompasses an integrator and comparator with internal positive feedback.
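A small sketch of the update path described by Equations (4)–(6): estimating the device under update from the read-back voltage, combining the gradient with the weight-decay term to obtain \(\Phi\), and converting \(\Phi\) into a pulse duration. All constants (R_F, V_R, R, C) and example magnitudes are placeholders, not the designed circuit values.

```python
R_F = 100e3            # assumed feedback resistance of the read-out stage
V_TEST = 0.05          # 50 mV test voltage (footnote 3)
V_R, R, C = 1.0, 100e3, 1e-9   # assumed integrator reference voltage, resistance, capacitance

def estimate_memristance(v_out):
    """Estimate the memristor under update from the read-back voltage, Eq. (4)."""
    return R_F * V_TEST / v_out

def update_amount(grad, memristance, reg):
    """Combine the sparsified gradient with the weight-decay term to obtain Phi."""
    return grad + reg * memristance

def pulse_duration(phi):
    """Translate Phi into the duration of a fixed-amplitude pulse, Eq. (6)."""
    return phi * V_R / (R * C)

m_u = estimate_memristance(v_out=0.02)     # e.g., 20 mV read-back -> 250 kOhm
t_h = pulse_duration(update_amount(grad=2e-9, memristance=m_u, reg=1e-15))
# The numeric magnitudes above are placeholders; the actual values depend on the
# fabricated circuit and the chosen units.
```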
It is imperative to mention that the changes to the memristors occur in a sequential fashion, one memristor at a time. This approach allows for more precise tuning and reduces power consumption during the training process. While such a training approach is known to be slow, we speed it up by means of gradient sparsification, which limits the changes to the devices with gradient values above a predefined threshold (\(\Theta\)). The training operation is verified in Cadence Virtuoso for a small-scale network (1×20×1),4 whereas the large-scale ESN verification is done using a Python model. The Python model takes into account all hardware constraints, such as the limited range and precision of the weight representation, the non-linearity of the memristor devices, and the leakage in the S/H circuits.

4 Methodology

4.1 Design Space Exploration

To select the hyperparameters that result in a compact design and optimal performance, the particle swarm optimization (PSO) algorithm is used. The algorithm is integrated with the ESN software model and set into a constrained mode, in which the feasible points for the hyperparameters are defined based on the hardware constraints, including the limited range of the weight representation, the crossbar size, the precision, and so on. In this work, PSO employs a swarm of 50 particles to optimize the following: the number of reservoir neurons, which controls the high-dimensional space of feature extraction; the frequency of synaptic updates (\(n_{up}\)); the sparsity level within the reservoir layer, which influences the computational cost; the leakage rate, which moderates the speed of the reservoir dynamic updates; the regularization term, which mitigates overfitting; and the learning rate, which regulates convergence speed.5 The algorithm runs for 30 iterations with the target of minimizing the wMAPE of real-world benchmarks when performing forecasting tasks.
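A compact sketch of the constrained search described above. The function evaluate_wmape is a hypothetical callback that would build the hardware-aware ESN model from a given hyperparameter vector and return its forecasting wMAPE; the bounds in the commented example are illustrative.

```python
import numpy as np

def pso_search(evaluate_wmape, bounds, n_particles=50, n_iters=30,
               inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Constrained particle swarm search over ESN hyperparameters.

    `bounds` lists one (low, high) pair per hyperparameter, e.g., reservoir size,
    update frequency n_up, sparsity level, leakage rate, regularization term, and
    learning rate. Positions are clipped to the feasible region imposed by the
    hardware constraints.
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    pos = rng.uniform(lo, hi, (n_particles, len(bounds)))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([evaluate_wmape(p) for p in pos])
    gbest = pbest[np.argmin(pbest_cost)].copy()

    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)                 # keep particles feasible
        cost = np.array([evaluate_wmape(p) for p in pos])
        better = cost < pbest_cost
        pbest[better], pbest_cost[better] = pos[better], cost[better]
        gbest = pbest[np.argmin(pbest_cost)].copy()
    return gbest, pbest_cost.min()

# Hypothetical bounds: [n_r, n_up, sparsity, leakage rate, L2 term, learning rate]
# best, best_wmape = pso_search(evaluate_wmape,
#                               bounds=[(50, 200), (1, 50), (0.05, 0.5),
#                                       (0.1, 1.0), (1e-5, 1e-2), (1e-3, 0.1)])
```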

4.2 Device Non-Idealities and Process Variabilities

There are several operational and reliability concerns when using memristor devices to model the synaptic connections in neural network accelerators, such as endurance limitations and device non-idealities. The endurance limitation is highly impacted by several factors, including the programming voltage, device structure, and device material. For instance, using an excessive writing voltage can lead to a major deterioration in memristor endurance, whereas the proper use of the programming voltage leads to a significant improvement [34]. Crossing the endurance limit may drive the memristor devices to exhibit various behaviors that can severely impact network performance. For example, Yang et al. [40] observed a reduction in the metal-oxide memristor (Pt/TaOx/Ta) resistance ratio once the switching cycles surpass \(6\times 10^9\). Once the device switching window collapses, the device is stuck at the reset resistance rather than being shorted. Kim et al. [15] observed different behavior: after surpassing the switching-cycle limit, the memristor experienced a stuck-at failure. In this work, since we are modeling the physical characteristics of the device proposed by Yang et al. [40], we consider that the memristors experience stuck-at failure when passing the endurance limit. The endurance limit considered is \(1\times 10^9\) cycles when tuned with a fixed voltage pulse of \(\pm\)1.2v for set and reset.
Regarding device non-idealities, cycle-to-cycle and device-to-device variabilities characterize the time-varying stability of memristors and their uniformity when integrated into a crossbar structure [33]. These non-idealities typically arise from the device material and imperfect manufacturing processes [1]. In this work, the cycle-to-cycle variability of the device resistance range is modeled as a variation in the weight range. Device-to-device variation is applied to the memristor threshold and resistance (write variation). Here, write variation refers to the variability in the rate of change in device resistance during the learning process, which is modeled by adding noise to the learning rule. Process variation and limitations are considered for the developed 6-bit flash ADC, the comparators, and the operational amplifiers.6
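A sketch of how these non-idealities can be injected into the Python model: write variation as noise on the update rule and stuck-at faults as frozen conductances. The fault fractions, noise level, and conductance range are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def apply_write_variation(delta_g, sigma=0.10):
    """Device-to-device write variation: perturb the rate of conductance change."""
    return delta_g * (1.0 + sigma * rng.standard_normal(np.shape(delta_g)))

def inject_stuck_faults(G, g_on, g_off, stuck_on_frac=0.05, stuck_off_frac=0.05):
    """Freeze a random subset of devices at the high or low conductance state.

    Returns the faulty conductance matrix and a mask of healthy devices so that
    later updates can skip (or compensate for) the frozen cells.
    """
    roll = rng.random(G.shape)
    G = G.copy()
    G[roll < stuck_on_frac] = g_on                                               # stuck-on
    G[(roll >= stuck_on_frac) & (roll < stuck_on_frac + stuck_off_frac)] = g_off  # stuck-off
    healthy = roll >= stuck_on_frac + stuck_off_frac
    return G, healthy

G = rng.uniform(1 / 2e6, 1 / 200e3, (100, 100))        # nominal conductances
G_faulty, healthy = inject_stuck_faults(G, g_on=1 / 200e3, g_off=1 / 2e6)
```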

4.3 Device Model

The memristor device model used in this work is VTEAM [19], fitted to the physical device characteristics reported by Yang et al. [40], as shown in Figure 4. The model is given by Equations (7) and (8),7 where \(G_{on}\) and \(G_{off}\) represent the maximum and minimum conductance of the memristor device, and w and D denote the device state variable and the device thickness, respectively. The change in device conductance is non-linear and is governed by the change in the state variable when a voltage surpassing the device threshold is applied across its terminals. To achieve a better fit, the Z-window function, proposed in previous work [45], is used. The Z-window has a broad range of parameters to control the device characteristics. Furthermore, it possesses attractive features, such as overcoming the boundary lock problem, scaling, and non-symmetrical behavior. The Z-window is given in (9), where \(\delta\), k, and p refer to the sliding level (over the x-axis), the scalability factor, and the slope of the falling curve as it approaches the boundaries of the device. To comply with the device characteristics, the technology node, and the targeted application constraints, the following has been considered: (i) the memristor offers a high conductance range, (ii) the set and reset voltages are no more than 1.2v, and (iii) the device variability is limited to 10%. Table 1 shows all device parameters used in the developed ESN accelerator.
\begin{equation} G_{mem} = \frac{w}{D} \times G_{on} + \left(1 - \frac{w}{D}\right) \times G_{off} \end{equation}
(7)
\begin{equation} \frac{\Delta w}{\Delta t} = {\left\lbrace \begin{array}{ll}k_{off}.\Big (\frac{v(t)}{v_{off}} - 1\Big)^{\alpha _{off}}.f_{z}(w),&0 \lt v_{off} \lt v \\ 0, &v_{on} \lt v\lt v_{off} \\ k_{on}.\Big (\frac{v(t)}{v_{on}} - 1\Big)^{\alpha _{on}}.f_{z}(w),&v \lt v_{on} \lt 0 \end{array}\right.} \end{equation}
(8)
\begin{equation} f_z(w) = \frac{k[1-2 (\frac{w}{D} - \delta)]^p}{e^{\tau (\frac{w}{D} - \delta)^p}} \end{equation}
(9)
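A minimal sketch of the device model in Equations (7) and (8). The constants are illustrative rather than the fitted values, and the window term is left as a pluggable callable so that the Z-window of Equation (9), with its fitted parameters, can be substituted.

```python
# Illustrative VTEAM constants; the fitted values follow the device in [40] and
# Table 1 (thresholds of +/-1 V). K_OFF/K_ON are chosen only so the short demo
# below shows visible switching.
D = 1.0                            # normalized device thickness
K_OFF, K_ON = 1e8, -1e8            # state-change magnitudes (illustrative)
A_OFF, A_ON = 3, 3                 # alpha exponents
V_OFF, V_ON = 1.0, -1.0            # threshold voltages
G_ON, G_OFF = 1 / 200e3, 1 / 2e6   # conductance bounds (reservoir/readout devices)

def vteam_step(w, v, dt, f_z=lambda w: 1.0):
    """Update the state variable per Eq. (8); f_z is the window function.

    In practice, the Z-window of Eq. (9) with its fitted parameters replaces the
    unity placeholder used here.
    """
    if v > V_OFF:
        dw = K_OFF * (v / V_OFF - 1) ** A_OFF * f_z(w)
    elif v < V_ON:
        dw = K_ON * (v / V_ON - 1) ** A_ON * f_z(w)
    else:
        dw = 0.0                               # below threshold: no change
    return min(max(w + dw * dt, 0.0), D)

def conductance(w):
    """Map the state variable to device conductance, Eq. (7)."""
    return (w / D) * G_ON + (1 - w / D) * G_OFF

w = 0.2
for _ in range(100):                           # one hundred 1.2 V set pulses of 10 ns
    w = vteam_step(w, v=1.2, dt=10e-9)
print(conductance(w))
```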
Parameter | Value (Reservoir and Readout) | Value (Leakage Cell)
Memristor range | 200k\(\Omega\) to 2M\(\Omega\) | 100k\(\Omega\) to 10M\(\Omega\)
Memristor threshold | \(\pm\)1v | \(\pm\)1v
Full switching pulses | 41 | 67
Training voltage | \(\pm\)1.2v | \(\pm\)1.2v
Endurance | \(1\times 10^{9}\) | \(1\times 10^{9}\)
Switching time | \(\lt\)10ns | \(\lt\)10ns
Table 1. Memristor Device Parameters Used in the Developed ESN Accelerator
Fig. 4. Fitting of the used memristor device model to the physical device characteristics provided in the work of Yang et al. [40].

4.4 Univariate Benchmarks

To verify the operation of the proposed memristor-based ESN accelerator, several univariate benchmarks have been used in the time-series forecasting task:
PJM energy: This dataset holds 145,366 samples representing regional energy consumption in the United States [27]. The energy consumption is recorded every hour for the period from 01/01/2010 to 07/01/2018. The ESN is employed for short-term energy consumption forecasting—that is, the energy consumption for the next 50 to 100 hours.
Daily temperature: The daily minimum temperature in Melbourne, Australia, recorded between 1981 and 1990 [11]. The dataset contains 3,605 noisy samples. We smoothed the data with a moving average (window size = 5) to reduce the noise and used the ESN to predict the temperature for the next day.
Mackey-Glass: This dataset is generated by a non-linear time-delayed differential equation modeling a chaotic system [22] (see (10)). It is widely used as a benchmark for time-series forecasting tasks. In this work, the parameters of the Mackey-Glass equation are set as \(\beta = 0.25, \gamma =0.1, \tau = 18,~\text{and}~ n=10\), and the number of generated samples used in the forecasting task with the ESN is 4,000.
\begin{equation} \frac{dx}{dt} = \beta \frac{x(t-\tau)}{1+[x(t-\tau)]^n} - \gamma x(t) \end{equation}
(10)
NARMA10: The classical non-linear autoregressive moving average (NARMA) system was introduced by Atiya and Parlos [2]. It represents a dynamical system that is difficult to model due to its non-linearity and long-term dependencies. In this work, 4,000 samples from the 10th-order NARMA system are employed for the forecasting task. The system is defined by (11), where \(y(t)\) and \(s(t)\) are the output and the input of the system at time t, respectively.
\begin{equation} y(t+1) = 0.3y(t) + 0.05y(t)\sum \limits ^{9}_{i=0} y(t-i) + 1.5s(t-9)s(t)+0.1 \end{equation}
(11)
All benchmark features are scaled to range between 0 and 1 so that they have the same interval as the reservoir and output neurons. Furthermore, the first 100 samples are allocated to the initial washout period.
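A sketch of how the two synthetic benchmarks and the [0, 1] scaling can be generated in software. The Euler discretization with a unit time step for Mackey-Glass and the uniform drive in [0, 0.5] for NARMA10 are assumptions; the text does not specify either.

```python
import numpy as np

def mackey_glass(n, beta=0.25, gamma=0.1, tau=18, p=10, dt=1.0, x0=1.2):
    """Generate a Mackey-Glass series by Euler-integrating Eq. (10)."""
    hist = int(tau / dt)
    x = np.full(n + hist, x0)
    for t in range(hist, n + hist - 1):
        x_tau = x[t - hist]
        x[t + 1] = x[t] + dt * (beta * x_tau / (1 + x_tau ** p) - gamma * x[t])
    return x[hist:]

def narma10(n, seed=0):
    """Generate a 10th-order NARMA sequence per Eq. (11)."""
    rng = np.random.default_rng(seed)
    s = rng.uniform(0.0, 0.5, n)               # assumed range for the drive s(t)
    y = np.zeros(n)
    for t in range(9, n - 1):
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * np.sum(y[t - 9:t + 1])
                    + 1.5 * s[t - 9] * s[t]
                    + 0.1)
    return y, s

def minmax_scale(x):
    """Scale features to [0, 1] to match the neuron output range."""
    return (x - x.min()) / (x.max() - x.min())

mg = minmax_scale(mackey_glass(4000))
na, drive = narma10(4000)
```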

5 Experimental Results and Discussion

5.1 Time-Series Forecasting

To quantify the performance of the proposed memristor-based ESN accelerator, time-series forecasting is evaluated using the weighted mean absolute percentage error (wMAPE) metric, which is given by (12). With the weights (\(\varpi _t\)) set to unity, the wMAPE determines the average difference between the actual and predicted values while capturing substantial fluctuations in magnitude [23].
\begin{equation} wMAPE = \frac{\sum \nolimits ^{0.5 \times n_t}_{t=1} \varpi _t|y^{\lt t\gt } - \hat{y}^{\lt t\gt }|}{\sum \nolimits ^{0.5 \times n_t}_{t=1} \varpi _t|y^{\lt t\gt }|} \end{equation}
(12)
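For reference, the metric can be computed directly from Equation (12); the snippet below assumes unit weights and operates on whatever test segment is passed in (the summation limits in the paper correspond to the second half of the sequence).

```python
import numpy as np

def wmape(y_true, y_pred, weights=None):
    """Weighted mean absolute percentage error, Eq. (12); unit weights by default."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    w = np.ones_like(y_true) if weights is None else np.asarray(weights, float)
    return np.sum(w * np.abs(y_true - y_pred)) / np.sum(w * np.abs(y_true))

print(wmape([100.0, 120.0, 80.0], [95.0, 130.0, 78.0]))   # ~0.057
```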
Figure 5(a) depicts the wMAPE of the software and the developed hardware model of the ESN when using the PJM energy dataset to predict the load energy consumption for the next 50 hours. The wMAPE, which is recorded every time 250 samples are introduced to the network, starts with a high value and then gradually decreases over time as the network learns and captures short- and long-term dependencies. In the absence of the regularization term in the learning equation, the network takes a longer time to adapt. Furthermore, it struggles to re-adapt when there is a major change in the input patterns. The developed hardware model, which uses regularization, follows a trend similar to that of the software model. However, one may notice a gap in performance (\(\sim\)3.9%) between the software and hardware models, which can be attributed to several factors. Among these are the non-idealities of the memristor devices, the limited precision of the ADC used when modulating the conductance of the memristors, and the undesired leakage in the S/H circuits, which leads to a drop in the stored charge.
Fig. 5. (a) The wMAPE of the proposed ESN models (SW and HW), calculated while predicting the load energy consumption for the next 50 hours and computed every 250 samples. (b, c) The impact of various levels of stuck-on and stuck-off faults in memristor devices on network performance for the time-series forecasting task when using the 1M1R and 2M crossbar structures, respectively.
Table 2 shows the wMAPE when forecasting data from stationary and non-stationary benchmarks for various timesteps. The wMAPE is calculated when training the proposed memristor-based ESN accelerator with LMS and with LMS+L2 regularization, using point and leaky-integrated neurons in the reservoir layer. It can be seen that having leaky-integrated neurons significantly improves network performance, as it enables the ESN to form temporal dependencies with controllable dynamic updates.
Benchmark | Forecast | LMS (LIN) | LMS+L2 | LMS+L2 (LIN)
PJM-Energy | 50-Step | 0.073\(\pm\)0.0011 | 0.087\(\pm\)0.0024 | 0.061\(\pm\)0.0084
PJM-Energy | 100-Step | 0.075\(\pm\)0.010 | 0.092\(\pm\)0.023 | 0.066\(\pm\)0.0089
Mackey-Glass | 50-Step | 0.053\(\pm\)0.0069 | 0.079\(\pm\)0.0273 | 0.047\(\pm\)0.0004
Mackey-Glass | 100-Step | 0.060\(\pm\)0.011 | 0.082\(\pm\)0.033 | 0.047\(\pm\)0.0067
Daily-Temp | 50-Step | 0.075\(\pm\)0.0052 | 0.097\(\pm\)0.0189 | 0.073\(\pm\)0.0015
Daily-Temp | 100-Step | 0.093\(\pm\)0.029 | 0.0105\(\pm\)0.025 | 0.083\(\pm\)0.0172
NARMA10 | 50-Step | 0.191\(\pm\)0.0039 | 0.198\(\pm\)0.0069 | 0.189\(\pm\)0.0034
NARMA10 | 100-Step | 0.195\(\pm\)0.0039 | 0.211\(\pm\)0.0192 | 0.191\(\pm\)0.0039
Table 2. Forecasting wMAPE of Univariate Stationary and Non-Stationary Benchmarks Using the Proposed Memristor-Based ESN When Trained with Both LMS and LMS with L2 Regularization, and When Using Point and Leaky-Integrated Neurons (LINs) in the Reservoir Layer

5.2 Device Failure Effect

There are several types of device failure that a network has to deal with, especially when using memristor devices. A memristor may experience a deviation in its characteristics, such as a change in the conductance range due to excessive switching (known as an aging fault), or it may exhibit a fault right after the fabrication process, such as a stuck-at fault. In this work, we primarily focus on stuck-at faults because they are the most common and may severely impact network performance [38]. There are three types of stuck-at fault: stuck-at, stuck-on, and stuck-off. Previous work has shown that the stuck-at fault replicates the memristor device failure after the forming process and that such a fault has a marginal effect on network performance [46]. Thus, this work solely investigates the impact of stuck-on (high-conductance state) and stuck-off (low-conductance state) faults on predicting future events in time-series data. In Figure 5, we show the change in wMAPE when using the PJM energy dataset to predict the load energy consumption for the next 50 hours in the presence of various levels of stuck-on and stuck-off faults occurring in the readout and reservoir layers. The fault impact is investigated in both the 1M1R and 2M crossbar structures. In the 1M1R structure, it is found that, regardless of the fault type, a fault rate below 10% has a marginal impact, causing approximately a \(\pm\)1.3% change in the wMAPE value. However, the impact is exacerbated beyond 10% and grows roughly exponentially with the fault rate (see Figure 5(b)). In the 2M structure, a similar impact may be observed, but it can be suppressed by leveraging the programmable nature of memristor devices. When using two memristors to represent each synaptic weight, one can (i) leverage the intact memristor to tune the weight modeled by the faulty device, so that the weight is not frozen but rather has a limited dynamic range, and (ii) enforce sparsity (i.e., zero weight values) by equating the conductances of the intact and faulty devices. Figure 5(c) illustrates the wMAPE in the presence of stuck-on and stuck-off faults when sparsity is enforced.

5.3 Network Lifespan

The network lifespan (LSP) is defined by the network's ability to sustain learning and acquire new knowledge [43]. In memristor-based architectures, the lifespan is highly affected by the memristor's limited endurance, leading to either a gradual or a severe drop in the dynamic range of the network parameters, a loss of elasticity, and eventually a drop in network performance. Typically, memristor devices, particularly oxide-based devices, offer an endurance ranging between \(10^{6}\) and \(10^{12}\) cycles [5]. This limited endurance is sufficient for the network to learn and update continuously, but not for a long period of time.
Estimating the lifespan of a memristor-based network is not a trivial task, as it is influenced by several factors, such as memristor endurance variability, changing input statistics, and network convergence time. In this work, a first-order estimate of the lifespan is considered, as given in (13), where \(E_d\) and \(\sigma\) are the device endurance and its variability, respectively, and \(U_f\) denotes the update frequency of the memristor devices during learning.
\begin{equation} LSP = \frac{E_d \pm \sigma }{U_f} \end{equation}
(13)
For the PJM energy dataset, the input samples are recorded and presented to the network every hour. Thus, the lifespan of the proposed ESN accelerator, when trained on the PJM energy dataset, is \(\sim\)115,740 years, given \(E_d=10^{9}\). However, this number drops linearly with any increase in the update frequency. For instance, for Mackey-Glass, if an update occurs every 100ms, the lifespan of the same network drops to 3.21 years. Thus, we suggest the following two techniques to enhance the lifespan. First, use two memristors in a differential configuration to model each synaptic weight. This not only results in a wide weight dynamic range but also helps extend the lifetime of the network, assuming the training process is conducted in an alternating manner. Second, utilize an R-M configuration to limit the voltage drop across the memristor devices during the forming process and conventional switching. Reducing the voltage drop can be achieved either by explicitly integrating the memristor with a proper series resistor [15] or inherently in a 1T1M configuration. Integrating memristor devices into a crossbar structure is also expected to extend the device endurance due to the wire parasitic resistance [28]. Owing to its simplicity and effectiveness, the first approach (alternating tuning) is adopted in the developed ESN accelerator, yielding an \(\sim 2\times\) enhancement in the lifespan of the network.
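The estimates above can be reproduced directly from Equation (13); the 360-day year in the sketch below is an assumption chosen because it matches the quoted figures.

```python
def lifespan_years(endurance, variability, update_period_s, days_per_year=360):
    """First-order lifespan estimate, Eq. (13), as a (low, high) range in years."""
    seconds_per_year = days_per_year * 24 * 3600
    low = (endurance - variability) * update_period_s / seconds_per_year
    high = (endurance + variability) * update_period_s / seconds_per_year
    return low, high

print(lifespan_years(1e9, 0, update_period_s=3600))   # PJM energy, hourly updates: ~115,740 years
print(lifespan_years(1e9, 0, update_period_s=0.1))    # Mackey-Glass, 100 ms updates: ~3.2 years
```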

5.4 Latency

The latency, the time required to process each input sample presented to the memristor-based ESN accelerator, is calculated while processing the univariate time-series datasets. During forecasting, the worst-case latency is estimated to be 45.83ns, to which the various units in the network contribute unequally (see Figure 6(a)). The units that account for \(\sim\)58% of the delay are the neuron circuits, which are implemented using two Op-Amps to capture the non-linearity. This issue manifests only in the output layer, as each reservoir neuron uses one Op-Amp and a leakage cell to capture the non-linearity and the leaky-integrated feature, respectively.
Fig. 6. (a, c) The propagation delay and power consumption of the main circuit blocks used by the proposed memristor-based ESN accelerator. (b) The power consumption of the proposed accelerator recorded over time while forecasting future events from the PJM energy dataset.
When comparing the presented memristor-based ESN accelerator with a digital counterpart clocked at 50MHz, a \(\sim 607\times\) reduction in latency is witnessed when performing vector-matrix multiplication, owing to the extensive parallelism and in-memory computing of the crossbar architecture, and a \(\sim 4\times\) reduction when applying the non-linearity. It is important to mention that, to ensure precision when estimating the latency of the developed ESN accelerator, we built a large-scale network (1\(\times\)105\(\times\)1) in Cadence Virtuoso and estimated the time required to propagate an input signal through the individual components (developed under the 65nm process) and layers. The estimation of the propagation time is done using Cadence Virtuoso-ADE while considering resistance and capacitance parasitics, extracted in Mentor Graphics-Calibre.

5.5 Energy-Delay Product

The EDP of the proposed memristor-based accelerator implemented with a 65nm process is 153.68nJ\(\times\)ns. It is estimated in Cadence Virtuoso while forecasting the load energy consumption from the PJM energy dataset. It is important to note that estimating the EDP of a large-scale mixed-signal system tends to be challenging due to the disparity in time scale between the simulation and the actual hardware. For instance, the samples of the PJM dataset are recorded every hour, and estimating the energy consumption using the same time scale can take days or weeks of simulation. Thus, to speed up the process, the time scale was compressed by a factor of \(2.7\times 10^{-9}\). Figure 6(b) depicts the power consumption of the proposed ESN accelerator recorded over time. It can be observed that static power dominates the dynamic power consumption, which manifests when input samples are presented to the network. In Figure 6(c), we report the breakdown of the average power consumption among the main units. One may observe that most of the power budget is directed toward the Op-Amps. Thus, in this work, we strive to reduce the number of Op-Amps used in the reservoir layer, which results in a more than 2\(\times\) reduction in power consumption.
It is essential to highlight that the joint enhancements in power consumption, latency, and architecture are reflected in the entire system, leading to a 247\(\times\) reduction in the energy consumption of the proposed memristive ESN accelerator compared to the digital counterpart [42]. This makes the proposed accelerator more suitable for edge devices with stringent resources. Table 3 provides a high-level comparison of the proposed ESN accelerator with previous works. Our accelerator offers in situ training, allowing for (i) processing of stationary and non-stationary data locally on the device, and (ii) hardware-aware learning and faster adaptation. Furthermore, it has low latency8 compared to other implementations reported in the literature.
Algorithm | Mixed-ESN [17] | LS-ESN [36] | Cyclic-ESN [20] | ESSM-ESN [25] | This Work
Task | Prediction | Forecasting | Classification | Classification | Forecasting
Reservoir size | 30 | 100 | 8 | 128\(\times\)64\(\times\)28 | 105
Input\(\times\)Output size | 1\(\times\)1 | 1\(\times\)1 | 8\(\times\)5 | 76\(\times\)1 | 1\(\times\)1
Power dissipation | 0.202mW | – | 0.327mW\(^{\rm c}\) | 58.38mW\(^{\rm a}\) | 73.17mW
Benchmarks | ESD & PF | Load Power | Vowel Recognition | ECG | PJM Energy
Latency | 7.62s | – | 50ns | \(\lt\)\(^{\rm b}\) | 45.83ns
Training | Off-Chip | On-Chip | Off-Chip | Off-Chip | On-Chip
Technology node | PTM 45nm | – | Standard 65nm | PTM 22nm | Standard 65nm
Table 3. Comparison of the Proposed ESN Accelerator with Previous Work
\(^{\rm a}\) In the work of Nair et al. [25], the power consumption when training the ESSM-ESN model is 2.04W.
\(^{\rm b}\) The latency is reported solely for the individual circuit blocks used in ESSM-ESN.
\(^{\rm c}\) For the same network size and based on the provided information, the power consumption of the work of Liang et al. [20] can reach \(\sim\)163.04mW when the latency is 50ns.
One may note that these implementations are on different substrates, and therefore this table offers a high-level reference template for ESN hardware rather than an absolute comparison.

6 Conclusion

In this article, we proposed an energy-efficient memristor-based ESN accelerator to enable time-series data processing and learning on edge devices with stringent resources. The proposed design features in situ training, enabling processing of stationary and non-stationary data, hardware-aware learning, and fast adaptation. When evaluated for time-series forecasting using standard benchmarks, the hardware model was found to experience a marginal degradation in performance compared to the software counterpart. This is attributed to several factors, among which are the non-idealities of the memristor devices, the limited precision of the ADC, and the undesired leakage in the S/H circuits. To extend the lifespan of the network, we suggest an alternating training approach. In terms of latency, we observe a significant reduction while forecasting compared to digital implementations, because the most extensive operations, the multiply-accumulate operations, are performed concurrently and in memory. Regarding power and energy consumption, we notice that most of the power budget is directed toward the Op-Amps used to model the activation functions and to capture the leaky-integrated feature. Thus, we strive to reduce the number of Op-Amps used for these purposes via the introduction of the memristor-based leakage cell, which results in a more than 2\(\times\) reduction in power consumption. The combined enhancements in architecture, latency, and power consumption give rise to a 247\(\times\) reduction in the energy consumption of the proposed memristive ESN accelerator compared to a digital counterpart implemented at the same technology node.

Footnotes

1
To ensure high energy efficiency and a wide input range, we used the S/H circuit presented in the work of O’Halloran and Sarpeshkar [26]. The circuit is modified to generate a differential output and to hold multiple clones of the sampled input, which will be used during the prediction and training phases.
2
The resistance of the \(M_z\) memristor should be set much larger than that of \(M_x\) and \(M_y\) so that the approximated \(\delta\) and \(1-\delta\) always sum to approximately 1.
3
Due to technology limitations, the test signal (\(V_{test}\)) is set to be 50mV to avoid any undesired clipping or distortion in the generated output.
4
The training is verified while considering memristor cycle-to-cycle and device-to-device variabilities. Normally distributed variability is considered, with a mean defined by the device parameters and a standard deviation equal to 10% of the mean.
5
The leakage rate, regularization term, and learning rate are set to be global parameters.
6
All analog units are tested for fabrication process, voltage supply, and ambient temperature variations using corner analysis.
7
\(k_{off}\), \(k_{on}\), \(\alpha _{on}\), and \(\alpha _{off}\) are constants, and \(v_{off}\) and \(v_{on}\) are the memristor threshold voltages.
8
No comparison in terms of energy or EDP is conducted due to the lack of information in the preceding references.

References

[1]
Peyman Pouyan, Esteve Amat, and Antonio Rubio. 2016. Memristive crossbar memory lifetime evaluation and reconfiguration strategies. IEEE Transactions on Emerging Topics in Computing 6, 2 (2016), 207–218.
[2]
Amir F. Atiya and Alexander G. Parlos. 2000. New results on recurrent network training: Unifying the algorithms and accelerating convergence. IEEE Transactions on Neural Networks 11, 3 (2000), 697–709.
[3]
Filippo Maria Bianchi, Simone Scardapane, Sigurd Løkse, and Robert Jenssen. 2020. Reservoir computing approaches for representation and classification of multivariate time series. IEEE Transactions on Neural Networks and Learning Systems 32, 5 (2020), 2169–2179.
[4]
Luca Cerina, Marco D. Santambrogio, Giuseppe Franco, Claudio Gallicchio, and Alessio Micheli. 2020. EchoBay: Design and optimization of echo state networks under memory and time constraints. ACM Transactions on Architecture and Code Optimization 17, 3 (2020), 1–24.
[5]
Marionna Coll, Josep Fontcuberta, Matthias Althammer, Manuel Bibes, Hans Boschker, Alberto Calleja, Guanglei Cheng, M. Cuoco, Regina Dittmann, B. Dkhil, I. El Baggari, M. Fanciulli, I. Fina, E. Fortunato, C. Frontera, S. Fujita, V. Garcia, S. T. B. Goennenwein, C. G. Granqvist, J. Grollier, R. Gross, A. Hagfeldt, G. Herranz, K. Hono, E. Houwman, M. Huijben, A. Kalaboukhov, D. J. Keeble, G. Koster, L. F. Kourkoutis, J. Levy, M. Lira-Cantu, J. L. MacManus-Driscoll, Jochen Mannhart, R. Martins, S. Menzel, T. Mikolajick, M. Napari, M. D. Nguyen, G. Niklasson, C. Paillard, S. Panigrahi, G. Rijnders, F. Sánchez, P. Sanchis, S. Sanna, D. G. Schlom, U. Schroeder, K. M. Shen, A. Siemon, M. Spreitzer, H. Sukegawa, R. Tamayo, J. van den Brink, N. Pryds, and F. Miletto Granozio. 2019. Towards oxide electronics: A roadmap. Applied Surface Science 482 (2019), 1–93.
[6]
Arkadeep De, Arpan Nandi, Arjun Mallick, Asif Iqbal Middya, and Sarbani Roy. 2023. Forecasting chaotic weather variables with echo state networks and a novel swing training approach. Knowledge-Based Systems 269 (2023), 110506.
[7]
Victor M. Gan, Yibin Liang, Lianjun Li, Lingjia Liu, and Yang Yi. 2021. A cost-efficient digital ESN architecture on FPGA for OFDM symbol detection. ACM Journal on Emerging Technologies in Computing Systems 17, 4 (2021), 1–15.
[8]
Daniel J. Gauthier, Erik Bollt, Aaron Griffith, and Wendson A. S. Barbosa. 2021. Next generation reservoir computing. Nature Communications 12, 1 (2021), 5564.
[9]
Kian Hamedani, Lingjia Liu, Rachad Atat, Jinsong Wu, and Yang Yi. 2017. Reservoir computing meets smart grids: Attack detection using delayed feedback networks. IEEE Transactions on Industrial Informatics 14, 2 (2017), 734–743.
[10]
Kentaro Honda and Hakaru Tamukoh. 2020. A hardware-oriented echo state network and its FPGA implementation. Journal of Robotics, Networking and Artificial Life 7, 1 (2020), 58–62.
[11]
Rob J. Hyndman and Yangzhuoran Yang. 2018. Daily Minimum Temperatures in Melbourne, Australia (1981–1990). Retrieved November 5, 2024 from https://pkg.yangzhuoranyang.com/tsdl/
[12]
Herbert Jaeger. 2001. The “Echo State” Approach to Analysing and Training Recurrent Neural Networks—With an Erratum Note. GMD Technical Report 148. German National Research Center for Information Technology, Bonn, Germany.
[13]
Weiwen Jiang, Bike Xie, Chun-Chen Liu, and Yiyu Shi. 2019. Integrating memristors and CMOS for better AI. Nature Electronics 2, 9 (2019), 376–377.
[14]
Changhyeon Kim, Sanghoon Kang, Dongjoo Shin, Sungpill Choi, Youngwoo Kim, and Hoi-Jun Yoo. 2019. A 2.1 TFLOPS/W mobile deep RL accelerator with transposable PE array and experience compression. In Proceedings of the 2019 IEEE International Solid-State Circuits Conference (ISSCC ’19). IEEE, 136–138.
[15]
Kyung Min Kim, J. Joshua Yang, John Paul Strachan, Emmanuelle Merced Grafals, Ning Ge, Noraica Davila Melendez, Zhiyong Li, and R. Stanley Williams. 2016. Voltage divider effect for the improvement of variability and endurance of TaOx memristor. Scientific Reports 6 (2016), 20085.
[16]
Denis Kleyko, Edward Paxon Frady, Mansour Kheffache, and Evgeny Osipov. 2020. Integer echo state networks: Efficient reservoir computing for digital hardware. IEEE Transactions on Neural Networks and Learning Systems 33, 4 (2020), 1688–1701.
[17]
Dhireesha Kudithipudi, Qutaiba Saleh, Cory Merkel, James Thesing, and Bryant Wysocki. 2016. Design and analysis of a neuromemristive reservoir computing architecture for biosignal processing. Frontiers in Neuroscience 9 (2016), 502.
[18]
Yuki Kume, Song Bian, and Takashi Sato. 2020. A tuning-free hardware reservoir based on MOSFET crossbar array for practical echo state network implementation. In Proceedings of the 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC ’20). IEEE, 458–463.
[19]
Shahar Kvatinsky, Misbah Ramadan, Eby G. Friedman, and Avinoam Kolodny. 2015. VTEAM: A general model for voltage-controlled memristors. IEEE Transactions on Circuits and Systems II: Express Briefs 62, 8 (2015), 786–790.
[20]
Xiangpeng Liang, Yanan Zhong, Jianshi Tang, Zhengwu Liu, Peng Yao, Keyang Sun, Qingtian Zhang, Bin Gao, Hadi Heidari, He Qian, and Huaqiang Wu. 2022. Rotating neurons for all-analog implementation of cyclic reservoir computing. Nature Communications 13, 1 (2022), 1549.
[21]
Mantas Lukoševičius. 2012. A practical guide to applying echo state networks. In Neural Networks: Tricks of the Trade. Springer, 659–686.
[22]
Michael C. Mackey and Leon Glass. 1977. Oscillation and chaos in physiological control systems. Science 197, 4300 (1977), 287–289.
[23]
Seyed Mahdi Miraftabzadeh, Cristian Giovanni Colombo, Michela Longo, and Federica Foiadelli. 2023. A day-ahead photovoltaic power prediction via transfer learning and deep neural networks. Forecasting 5, 1 (2023), 213–228.
[24]
John Moon, Wen Ma, Jong Hoon Shin, Fuxi Cai, Chao Du, Seung Hwan Lee, and Wei D Lu. 2019. Temporal data classification and forecasting using a memristor-based reservoir computing system. Nature Electronics 2, 10 (2019), 480–487.
[25]
Vineeta V. Nair, Chithra Reghuvaran, Deepu John, Bhaskar Choubey, and Alex James. 2023. ESSM: Extended synaptic sampling machine with stochastic echo state neuro-memristive circuits. IEEE Journal on Emerging and Selected Topics in Circuits and Systems 13, 4 (2023), 965–974.
[26]
Micah O’Halloran and Rahul Sarpeshkar. 2004. A 10-nW 12-bit accurate analog storage cell with 10-aA leakage. IEEE Journal of Solid-State Circuits 39, 11 (2004), 1985–1996.
[27]
PJM Interconnection LLC. n.d. PJM Hourly Energy Consumption Data. Retrieved November 6, 2024 from https://www.kaggle.com/datasets/robikscube/hourly-energy-consumption/data
[28]
Sayeef Salahuddin, Kai Ni, and Suman Datta. 2018. The era of hyper-scaling in electronics. Nature Electronics 1, 8 (2018), 442–450.
[29]
MennaAllah Soliman, Mostafa A. Mousa, Mahmood A. Saleh, Mahmoud Elsamanty, and Ahmed G. Radwan. 2021. Modelling and implementation of soft bio-mimetic turtle using echo state network and soft pneumatic actuators. Scientific Reports 11, 1 (2021), 12076.
[30]
Mariia Sorokina. 2020. Multidimensional fiber echo state network analogue. Journal of Physics: Photonics 2, 4 (2020), 044006.
[31]
David Sussillo and Larry F. Abbott. 2009. Generating coherent patterns of activity from chaotic neural networks. Neuron 63, 4 (2009), 544–557.
[32]
Waseem Ullah, Tanveer Hussain, Zulfiqar Ahmad Khan, Umair Haroon, and Sung Wook Baik. 2022. Intelligent dual stream CNN and echo state network for anomaly detection. Knowledge-Based Systems 253 (2022), 109456.
[33]
Ioannis Vourkas and Georgios Ch. Sirakoulis. 2016. Memristor-Based Nanoelectronic Computing Circuits and Architectures. Vol. 19. Springer.
[34]
Guoming Wang, Shibing Long, Zhaoan Yu, Meiyun Zhang, Yang Li, Dinglin Xu, Hangbing Lv, Qi Liu, Xiaobing Yan, Ming Wang, Xiaoxin Xu, Hongtao Liu, Baohe Yang, and Ming Liu. 2015. Impact of program/erase operation on the performances of oxide-based resistive switching memory. Nanoscale Research Letters 10, 1 (2015), 39.
[35]
Wei-Jia Wang, Yong Tang, Jason Xiong, and Yi-Cheng Zhang. 2021. Stock market index prediction based on reservoir computing models. Expert Systems with Applications 178 (2021), 115022.
[36]
Shiping Wen, Rui Hu, Yin Yang, Tingwen Huang, Zhigang Zeng, and Yong-Duan Song. 2018. Memristor-based echo state network with online least mean square. IEEE Transactions on Systems, Man, and Cybernetics: Systems 49, 9 (2018), 1787–1796.
[37]
Ewelina Wlazlak, Piotr Zawal, and Konrad Szacilowski. 2020. Neuromorphic applications of a multivalued [\(SnI_4 {(C_6H_5)_2 SO}_2\)] memristor incorporated in the echo state machine. ACS Applied Electronic Materials 2, 2 (2020), 329–338.
[38]
Jiawei Xu, Yuxiang Huan, Kunlong Yang, Yiqiang Zhan, Zhuo Zou, and Li-Rong Zheng. 2018. Optimized near-zero quantization method for flexible memristor based neural network. IEEE Access 6 (2018), 29320–29331.
[39]
Chris Yakopcic. 2014. Memristor Device Modeling and Circuit Design for Read Out Integrated Circuits, Memory Architectures, and Neuromorphic Systems. University of Dayton.
[40]
J. Joshua Yang, M.-X. Zhang, John Paul Strachan, Feng Miao, Matthew D. Pickett, Ronald D. Kelley, G. Medeiros-Ribeiro, and R. Stanley Williams. 2010. High switching endurance in TaOx memristive devices. Applied Physics Letters 97, 23 (2010), 232102.
[41]
Heng Zhang and Danilo Vasconcellos Vargas. 2023. A survey on reservoir computing and its interdisciplinary applications beyond traditional machine learning. IEEE Access 11 (2023), 81033–81070.
[42]
Abdullah M. Zyarah, Alaa M. Abdul-Hadi, and Dhireesha Kudithipudi. 2024. Reservoir network with structural plasticity for human activity recognition. IEEE Transactions on Emerging Topics in Computational Intelligence 8, 5 (2024), 3228–3238.
[43]
Abdullah M. Zyarah, Kevin Gomez, and Dhireesha Kudithipudi. 2020. Neuromorphic system for spatial and temporal information processing. IEEE Transactions on Computers 69, 8 (2020), 1099–1112.
[44]
Abdullah M. Zyarah and Dhireesha Kudithipudi. 2018. Semi-trained memristive crossbar computing engine with in situ learning accelerator. ACM Journal on Emerging Technologies in Computing Systems 14, 4 (2018), 1–16.
[45]
Abdullah M. Zyarah and Dhireesha Kudithipudi. 2019. Neuromemrisitive architecture of HTM with on-device learning and neurogenesis. ACM Journal on Emerging Technologies in Computing Systems 15, 3 (2019), 1–24.
[46]
Abdullah M. Zyarah and Dhireesha Kudithipudi. 2019. Neuromemristive multi-layer random projection network with on-device learning. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN ’19). IEEE, 1–8.
[47]
Abdullah M. Zyarah, Nicholas Soures, Lydia Hays, Robin B. Jacobs-Gedrim, Sapan Agarwal, Matthew Marinella, and Dhireesha Kudithipudi. 2017. Ziksa: On-chip learning accelerator with memristor crossbars for multilevel neural networks. In Proceedings of the 2017 IEEE International Symposium on Circuits and Systems (ISCAS ’17). IEEE, 1–4.
