
Spiking-NeRF: Spiking Neural Network for Energy-Efficient Neural Rendering

Published: 26 August 2024

Abstract

Artificial Neural Networks (ANNs) have achieved remarkable performance in many artificial intelligence tasks. As the application scenarios become more sophisticated, the computation and energy consumption of ANNs are also constantly increasing, which poses a challenge for deploying ANNs on energy-constrained devices. Spiking Neural Networks (SNNs) provide a promising solution to build energy-efficient neural networks. However, the current training methods of SNNs cannot output values as precise as those of ANNs. This limits the applications of SNNs to relatively simple image classification tasks. In this article, we extend the application of SNNs to neural rendering tasks and propose an energy-efficient spiking neural rendering model, called Spiking-NeRF (Spiking Neural Radiance Fields). We first analyze the ANN-to-SNN conversion theory and propose an output scheme for SNNs to obtain the precise scene property values. Then we customize the parameter normalization method for the special network architecture of neural rendering. Furthermore, we present an early termination strategy (ETS) based on the discrete nature of spikes to reduce energy consumption. We evaluate the performance of Spiking-NeRF on both realistic and synthetic scenes. Experimental results show that Spiking-NeRF can achieve comparable rendering performance to ANN-based NeRF with up to \(2.27\times\) energy reduction.

1 Introduction

The past two decades have witnessed the tremendous success of Artificial Neural Networks (ANNs) in various applications including computer vision [23], speech recognition [32], and natural language processing [20]. More recently, as an emerging and very promising direction in computer graphics, neural rendering successfully uses ANNs to learn scene representation and synthesize photorealistic images of the scene [48]. In particular, the seminal work of Neural Radiance Fields (NeRF) [30] uses volume rendering [29] to project the scene representations into images and leads to an “explosion” of developments in the neural rendering field [49]. However, like other applications of ANNs [45], neural rendering has to process a huge number of model parameters and input data during the inference process, and thus incurs substantial energy consumption due to the large amount of computation, which poses a great challenge to the deployment of neural rendering on energy-constrained devices.
Many techniques have been proposed to improve the energy efficiency of neural networks. First, at the algorithm level, researchers have developed techniques such as pruning [14, 52], quantization [8], and knowledge distillation [19] to build a lightweight ANN model in the training stage. Second, at the computing hardware level, customized neural network accelerators [7] have been developed, which have been shown to achieve orders of magnitude higher energy efficiency than state-of-the-art Complementary Metal Oxide Semiconductor designs. Third, inspired by biological neurons [17], researchers devise energy-efficient Spiking Neural Networks (SNNs) as a new computing paradigm [18]. SNNs apply a series of discrete binary spikes to transfer information, instead of the continuous activation values in ANNs. This leads to more energy-efficient computation by substituting multiplication with addition in customized neuromorphic chips [1, 38]. SNNs have recently been applied to image classification tasks and have achieved high accuracy and high energy efficiency, even on deep neural networks (such as VGG and ResNet) and complex datasets (such as CIFAR-10 and ImageNet) [4, 9].
Due to the non-differentiable nature of the spikes in SNNs, gradient-based back-propagation algorithms used in ANNs cannot be applied to train SNNs. Researchers have proposed three training methods for SNNs, including the Spike Timing-Dependent Plasticity (STDP) rule-based learning method [42], the spike-based error back-propagation algorithm [25], and the ANN-to-SNN conversion method [5, 10, 21, 41]. The STDP rule updates the parameters of SNNs based on the order in which the spikes arrive at the neurons [40]. The spike-based error back-propagation algorithm approximates the non-linear output function of the spiking neurons as a differentiable function and then trains the SNNs using the gradient-based back-propagation algorithm [24, 33]. As illustrated in Figure 1, the conversion method converts the parameters of pre-trained ANNs to those of SNNs and has almost no additional computational overhead compared to the training process of ANNs.
Fig. 1. The flowchart of ANN-to-SNN conversion method.
The aforementioned three training methods of SNNs have been successfully applied in image classification tasks. However, the STDP-based training method is only suitable for training shallow SNNs, and the accuracy of STDP-based SNNs on complex datasets such as CIFAR-10 is far lower than that of ANNs [11]. Besides, the spike-based error back-propagation algorithm introduces a surrogate function to perform the back-propagation process, which increases the complexity of the training process and limits its scalability to deep SNNs and complex datasets [24]. In contrast to these two methods, the ANN-to-SNN conversion method applies the scaled parameters of ANNs to SNNs [10], has excellent scalability, and yields the best-performing SNNs [4, 16].
In this article, we focus on developing energy-efficient spiking neural rendering using the ANN-to-SNN conversion method. Although SNNs have been studied thoroughly in image classification tasks, it is more challenging to implement SNNs in neural rendering. The conversion method establishes a proportional relationship between the firing rates of spiking neurons in converted SNNs and the activation values of analog neurons in source ANNs. In image classification tasks, the converted SNNs only need to ensure that the output value of the correct class is the largest. Therefore, the ANN-to-SNN conversion method can ensure that the converted SNN has a high classification accuracy. However, in neural rendering, ANNs output the numerical values of the scene properties. To guarantee the rendering performance, the converted SNNs must output the same values as the source ANNs. In addition, the neural network architecture used for neural rendering differs from the normal classification-purposed ANNs—there are hidden layers whose inputs include extra inserted values in addition to the outputs from the previous layer (see Figure 2). Considering the aforementioned two unique properties, we cannot apply the traditional ANN-to-SNN conversion method to yield SNNs with good rendering performance.
Fig. 2. An overview of NeRF’s scene representation and neural network architecture.
To overcome these two challenges, in this work, we first propose a precise output decoding scheme for SNNs through the mathematical analysis of the conversion process. We then customize the parameter normalization method for the special network architecture of neural rendering. Moreover, considering that the colors of the sampled points contribute variably to the color of the pixel, we apply an Early Termination Strategy (ETS) to reduce the energy consumption. Combining these three methods, we propose Spiking-NeRF, an energy-efficient neural rendering model based on SNNs.
The contributions of this article can be summarized as follows:
Through analyzing the mathematical relationship between ANNs and converted SNNs, we propose a precise output decoding scheme for SNNs, since neural rendering requires the precise numerical values of scene properties.
We customize the parameter normalization method for the special network architecture of neural rendering and develop an ETS based on the color weights of the sampled points and the discrete nature of spikes.
We evaluate the performance of Spiking-NeRF on both realistic scenes and synthetic scenes. Experimental results show that Spiking-NeRF can achieve comparable rendering performance for all scenes and reduce energy consumption by up to \(2.27\times\) compared to the ANN-based NeRF.
The remainder of this article is organized as follows: In Section 2, we review the progress of research on SNNs based on ANN-to-SNN conversion methods. In Section 3, we first introduce the background of neural rendering techniques and then introduce the work of NeRF in detail. Afterward, in Section 4, we propose a precise output scheme, a customized parameter normalization method, and an ETS to build the energy-efficient Spiking-NeRF. Finally, we present our experimental results in Section 5, followed by conclusions in Section 6.

2 Related Work

ANN-to-SNN conversion is a promising approach for training SNNs. The existing works on the ANN-to-SNN conversion method have yielded SNNs with high accuracy for image classification tasks. Cao et al. [5] first propose the ANN-to-SNN conversion method and apply it to image classification tasks. Through mapping the parameters from ANNs to SNNs and replacing the Rectified Linear Unit (ReLU) activation with the Integrate-and-Fire (IF) model, the converted SNNs can achieve similar accuracy to ANNs while reducing the energy consumption by two orders of magnitude. To further reduce the accuracy loss, Diehl et al. [10] propose model-based and data-based parameter normalization methods. However, their method results in intolerably low firing rates of spiking neurons if applied to deep SNNs. Based on the model-based normalization method, Rueckauer et al. [41] present a robust parameter normalization method to improve the firing rates. Besides, they adopt the reset by subtraction method to reduce conversion loss. Combining these two approaches, they successfully convert deep ANNs (like VGG-16) to SNNs and report good performance of converted SNNs on MNIST, CIFAR-10, and ImageNet. As deeper networks achieve better performance on image classification tasks, Han et al. [13] propose the Residual Membrane Potential (RMP) neurons and convert ResNet-24 and ResNet-34 to SNNs. However, their method requires a large number of time steps for the converted SNNs to achieve comparable accuracy to ANNs. To reduce the number of time steps during the inference process of the converted SNNs, Deng and Gu [9] shift the initial membrane potential of every spiking neuron to increase its firing rate, and Ho and Chang [16] improve the data-based normalization method and propose Trainable Clipping Layers (TCL) to restrict the maximum activation value of each layer. Parameter normalization based on the trainable parameters in the TCL can greatly increase the firing rates of the spiking neurons and reduce the number of time steps required for the converted SNNs. Based on TCL, Bu et al. [4] optimize the initial potentials of spiking neurons, resulting in state-of-the-art classification accuracy. Furthermore, Liu et al. [27] propose an efficient conversion framework and achieve the fewest inference time steps for SNNs on image classification tasks.
Moreover, some works have extended the application of SNNs by exploiting the ANN-to-SNN conversion method. Kim et al. [21] introduce a channel-wise normalization method for convolutional neural networks and signed spiking neurons with an imbalanced threshold for the Leaky-ReLU activation function, and then propose the first spike-based energy-efficient object detection model. Tan et al. [46] propose robust firing rates of spiking neurons to reduce the conversion loss and extend the application of SNNs to deep Q-networks. Despite these efforts, the applications of SNNs are still limited at present. In this article, we apply the ANN-to-SNN conversion method to develop an energy-efficient spiking neural rendering model.

3 Preliminary

Synthesizing photo-realistic images and videos is crucial in the field of computer graphics and has been the focus of research in recent decades. Traditional techniques generate synthetic images of scenes using rendering algorithms such as rasterization or ray tracing. However, these techniques typically require a significant amount of expensive manual effort to synthesize high-quality images [48]. With the remarkable development of artificial intelligence, the computer graphics community tries to combine basic principles of computer graphics with machine learning to solve rendering problems, which is known as neural rendering. The emerging neural rendering techniques use neural networks to learn the geometry, appearance, illumination, and other properties of a scene [49]. In comparison to traditional rendering techniques, neural rendering achieves state-of-the-art rendering performance using pre-trained neural network models and represents a leap forward toward the goal of synthesizing photo-realistic image and video content. In recent years, numerous studies have explored various ways of accomplishing rendering using learnable scene properties, contributing to tremendous progress in the field of neural rendering. Among them, the innovative work of the NeRF [30] has made a breakthrough in novel view synthesis and leads to a significant surge in developments of neural rendering.
Using a Multi-Layer Perceptron (MLP), NeRF represents a static scene as a continuous five-dimensional function [30]. The function \(F:(\textbf{x},\textbf{d})\rightarrow(\textbf{c},\sigma)\) regresses from a three-dimensional (3D) position coordinate \(\textbf{x}=(x,y,z)\) and a two-dimensional viewing direction \(\textbf{d}=(\theta,\phi)\) to an emitted color \(\textbf{c}=(r,g,b)\) and volume density \(\sigma\). Figure 2 illustrates an overview of NeRF’s scene representation and neural network architecture. A ray \(\textbf{r}(h)=\textbf{o}+h\textbf{d}\) is emitted from the camera’s center of projection o along the direction d and passes through one pixel on the image plane. When rendering the pixel, a set of points with distances h is sampled along the ray \(\textbf{r}(h)=\textbf{o}+h\textbf{d}\). For each point with \(h_{i}\in\textbf{h}\), its corresponding 3D position coordinates can be obtained from the camera ray as \(\textbf{x}=\textbf{r}(h_{i})\). To enable the MLP to learn high-frequency details, NeRF separately preprocesses each of the three coordinate values in x and the two components of direction d with the following sinusoidal encoding function:
\begin{align}\gamma(p)=[\text{sin}(2^{0}p),\text{cos}(2^{0}p),\dots,\text{sin}(2^{L}p), \text{cos}(2^{L}p)]^{T},\end{align}
(1)
where \(L\) is an integer hyper-parameter. It is worth noting that in Figure 2, \(\gamma(\textbf{x})\) and \(\gamma(\textbf{d})\) are inserted into stage 2 and stage 3, respectively. This unique network architecture is widely adopted in the field of neural rendering to improve the learning performance [6, 30, 35]. To obtain the color of a certain pixel, the estimated colors and densities of the sampled points along the ray are used to approximate the volume rendering integral by numerical quadrature [29]:
\begin{align}\displaystyle\hat{C}(\textbf{r})=\sum_{i=1}^{N}w_{i}\textbf{c}_{i},\end{align}
(2)
\begin{align}\displaystyle\text{with}\ w_{i}=(1-\text{exp}(-\sigma_{i}\delta_{i}))\text{exp }\left(-\sum_{j=1}^{i-1}\sigma_{j}\delta_{j}\right),\end{align}
(3)
where \(\hat{C}(\textbf{r})\) is the estimated pixel color, \(\textbf{c}_{i}\) and \(\sigma_{i}\) are the estimated color and density of the \(i\)th point, \(\delta_{i}=h_{i+1}-h_{i}\) is the distance between two adjacent sampled points along the ray, and \(w_{i}\) evaluates the importance of \(\textbf{c}_{i}\) to the pixel.
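As an illustration of Equations (1)–(3), the following is a minimal PyTorch sketch of the positional encoding and the per-ray compositing weights. The function names and tensor layouts are our own choices for this sketch, not NeRF's reference implementation.

```python
import torch

def positional_encoding(p: torch.Tensor, L: int) -> torch.Tensor:
    """Sinusoidal encoding of Equation (1), applied elementwise to the input p."""
    freqs = 2.0 ** torch.arange(L + 1, dtype=p.dtype)           # 2^0, 2^1, ..., 2^L
    angles = p[..., None] * freqs                                # shape (..., L+1)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

def composite_weights(sigma: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    """Color weights w_i of Equation (3) for one ray.
    sigma, delta: (N,) densities and distances between adjacent samples."""
    alpha = 1.0 - torch.exp(-sigma * delta)                      # opacity contributed by each sample
    trans = torch.exp(-torch.cumsum(sigma * delta, dim=0))       # transmittance including sample i
    trans = torch.cat([torch.ones_like(trans[:1]), trans[:-1]])  # shift so the sum runs over j < i
    return alpha * trans

def render_pixel(rgb: torch.Tensor, sigma: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    """Estimated pixel color of Equation (2); rgb has shape (N, 3)."""
    w = composite_weights(sigma, delta)
    return (w[:, None] * rgb).sum(dim=0)
```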
To improve the rendering performance, NeRF utilizes both coarse and fine neural networks for scene reconstruction. During the inference process, NeRF first samples \(N_{c}\) points along each ray and feeds them to the coarse network to perform the forward propagation process. Based on the outputs of the coarse network and Equation (3), the color weight \(w_{i}\) of each sampled point is calculated. The color weights of the \(N_{c}\) sampled points are then normalized using \(\hat{w}_{i}=w_{i}/\sum^{N_{c}}_{j=1}w_{j}\) to estimate the distribution of the color weights along the entire ray. Subsequently, \(N_{f}\) points with larger color weights are sampled according to the estimated distribution, and the resulting \(N_{c}+N_{f}\) sampled points are jointly input into the fine network for inference. This hierarchical sampling strategy effectively feeds more points that contribute significantly to the pixel color into the fine network, thereby enhancing the rendering performance of NeRF. However, this strategy also dramatically increases the computational overhead of NeRF. For instance, if \(N_{c}=N_{f}=64\) is selected, NeRF must perform forward propagation on \(3\times 10^{7}\) input sampled points to render an image of size \(400\times 400\). Additionally, dozens or even hundreds of images must be rendered to reconstruct a complete scene, which inevitably results in significant energy consumption. To address this challenge, this article proposes Spiking-NeRF, an energy-efficient neural rendering model that utilizes the SNN to implement neural rendering.
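The resampling step of this hierarchical strategy can be sketched as follows. This is a schematic inverse-CDF sampler under our own simplifying assumptions (piecewise-constant weights, no stratification), not the exact sampler used in the NeRF codebase.

```python
import torch

def sample_fine_depths(h_coarse: torch.Tensor, w_coarse: torch.Tensor, n_fine: int) -> torch.Tensor:
    """Draw n_fine extra depths along a ray from the distribution defined by the
    normalized coarse color weights (inverse-CDF sampling)."""
    w_hat = w_coarse / (w_coarse.sum() + 1e-8)        # normalized weights \hat{w}_i
    cdf = torch.cumsum(w_hat, dim=0)
    u = torch.rand(n_fine)                            # uniform samples in [0, 1)
    idx = torch.searchsorted(cdf, u).clamp(max=h_coarse.numel() - 1)
    return h_coarse[idx]                              # depths of the additional fine samples

# The N_c coarse depths and the N_f resampled depths are concatenated, sorted,
# and fed jointly to the fine network.
```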

4 Spiking-NeRF

In NeRF, synthesizing photo-realistic images requires the neural network to estimate the precise values of color \(\textbf{c}_{i}\) and density \(\sigma_{i}\) for each sampled point. As a result, if we apply SNNs to NeRF, the converted SNNs should output the same numerical values as the source ANNs. In this section, we first present a precise output decoding scheme by analyzing the inference process and mathematical relationship of ANNs and SNNs. With the proposed output scheme, the converted SNNs can output precise values corresponding to the source ANNs. For the two layers in NeRF with the inserted \(\gamma(\textbf{x})\) and \(\gamma(\textbf{d})\) (see Figure 2), we develop a customized conversion method for their parameters and propose a spiking neural rendering model, called Spiking-NeRF. Finally, we propose an ETS for Spiking-NeRF that reduces energy consumption by analyzing the computing paradigm of volume rendering and exploiting the discrete nature of the spikes.

4.1 Conversion Method

4.1.1 Inference Process of ANNs and SNNs.

The ANNs to be converted to SNNs are ordinarily trained with ReLU activation. Specifically, for an \(L\)-layer fully-connected ANN with ReLU, the inference process of the analog neurons can be formulated as
\begin{align}f^{l}(\boldsymbol{x})=\text{max}(0,\boldsymbol{W}^{l}_{A}f^{l-1}(\boldsymbol{x})+\boldsymbol{b}^{l}_{A}),\end{align}
(4)
where matrix \(\boldsymbol{W}^{l}_{A}\) (\(l=1,2,\dots,L\)) denotes the weight matrix of the ANN between layer \(l-1\) and layer \(l\), vector \(\boldsymbol{b}^{l}_{A}\) indicates the bias of the ANN in layer \(l\), vector \(\boldsymbol{x}\) refers to the input of the ANN, and \(f^{l}(\boldsymbol{x})\) denotes the activation values of layer \(l\). Utilizing Equation (4), we can derive the analog values predicted by the output layer:
\begin{align}f^{L}(\boldsymbol{x})=\boldsymbol{W}^{L}_{A}f^{L-1}(\cdots f^{1}(\boldsymbol{x})\cdots)+\boldsymbol{b}^{L}_{A}.\end{align}
(5)
Different from ANNs, SNNs simulate spiking neurons with a series of discrete spikes over \(T\) time steps. In our work, we consider the IF model [5, 10] for spiking neurons. Figure 3 illustrates the forward propagation process of a spiking neuron with the IF model, which consists of an integration phase and a firing phase. In the derivation of the equations below, we consider the forward propagation process for all spiking neurons in each layer. The length of the spike train in the SNN is the total time step \(T\). At a certain time step \(t\), the spiking neurons in layer \(l\) receive the spikes from layer \(l-1\) and enter the integration phase; the integrated membrane potentials \(\boldsymbol{z}^{l}(t)\) can be computed as
Fig. 3. The forward propagation process of a spiking neuron with the IF model.
\begin{align}\boldsymbol{z}^{l}(t)=\boldsymbol{v}^{l}(t-1)+\boldsymbol{W}^{l}_{S}\boldsymbol{\theta}^{l-1}(t)V_{th}^{l-1}+ \boldsymbol{b}^{l}_{S},\end{align}
(6)
where \(\boldsymbol{W}^{l}_{S}\) and \(\boldsymbol{b}^{l}_{S}\) are the synaptic weight and bias of the spiking neurons at layer \(l\), \(\boldsymbol{v}^{l}(t-1)\) is the membrane potential of layer \(l\) at time step \(t-1\), \(V_{th}^{l-1}\) is the firing threshold potential of layer \(l-1\), and \(\boldsymbol{\theta}^{l-1}(t)\) is a vector denoting whether each neuron in layer \(l-1\) fires a spike at time step \(t\). The spiking neurons whose integrated membrane potentials exceed \(V_{th}^{l}\) enter the firing phase and fire spikes to the following layer. Therefore, \(\boldsymbol{\theta}^{l}(t)\) is defined as follows:
\begin{align}\boldsymbol{\theta}^{l}(t)=U(\boldsymbol{z}^{l}(t)-V_{th}^{l}),\end{align}
(7)
where \(U(\alpha)\) is the unit step function. Once they fire spikes, the spiking neurons reset their integrated potentials in \(\boldsymbol{z}^{l}(t)\). To reduce the information loss, we adopt the reset by subtraction strategy following the previous works [13, 41]. The residual membrane potentials (RMPs) after reset can be formulated as
\begin{align}\boldsymbol{v}^{l}(t)=\boldsymbol{v}^{l}(t-1)+\boldsymbol{W}^{l}_{S}\boldsymbol{\theta}^{l-1}(t)V_{th}^{l-1}+ \boldsymbol{b}^{l}_{S}-\boldsymbol{\theta}^{l}(t)V_{th}^{l}.\end{align}
(8)
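The integrate-fire-reset cycle of Equations (6)–(8) can be written compactly as below. This is a sketch in PyTorch with hypothetical argument names, shown for a single layer and a single time step.

```python
import torch

def if_layer_step(v_prev: torch.Tensor, spikes_prev: torch.Tensor,
                  W_s: torch.Tensor, b_s: torch.Tensor,
                  v_th_prev: float, v_th: float):
    """One time step of an IF layer with reset-by-subtraction.
    v_prev: (n_out,) membrane potentials at t-1; spikes_prev: (n_in,) binary spikes from layer l-1."""
    z = v_prev + W_s @ (spikes_prev * v_th_prev) + b_s   # integration phase, Equation (6)
    spikes = (z >= v_th).to(z.dtype)                      # firing phase, Equation (7)
    v = z - spikes * v_th                                 # reset by subtraction, Equation (8)
    return spikes, v
```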

4.1.2 Precise Output Decoding Scheme.

In general, the initial membrane potentials \(\boldsymbol{v}^{l}(0)\) are set to zero. Thus we integrate Equation (8) over \(T\) time steps and obtain:
\begin{align}\boldsymbol{v}^{l}(T)=\boldsymbol{W}^{l}_{S}\sum_{t=1}^{T}\boldsymbol{\theta}^{l-1}(t)V_{th}^{l-1}+T \boldsymbol{b}^{l}_{S}-\sum_{t=1}^{T}\boldsymbol{\theta}^{l}(t)V_{th}^{l}.\end{align}
(9)
During \(T\) time steps, the firing rates of spiking neurons in layer \(l\) are denoted as \(\boldsymbol{r}^{l}(T)=\frac{\sum_{t=1}^{T}\boldsymbol{\theta}^{l}(t)}{T}\) and fall into the interval \([0,1]\). Dividing Equation (9) by \(T\), we get:
\begin{align}\boldsymbol{r}^{l}(T)V_{th}^{l}=\boldsymbol{W}^{l}_{S}\boldsymbol{r}^{l-1}(T)V_{th}^{l-1}+\boldsymbol{b}^{l}_{ S}-\frac{\boldsymbol{v}^{l}(T)}{T}.\end{align}
(10)
If the total time step \(T\) is large enough, the values of the RMPs \(\boldsymbol{v}^{l}(T)\) can be neglected compared to \(T\). Thus, we can assume \(\frac{\boldsymbol{v}^{l}(T)}{T}\) approaches zero, and Equation (10) can be reformulated as:
\begin{align}\boldsymbol{r}^{l}(T)V_{th}^{l} & =\boldsymbol{W}^{l}_{S}\boldsymbol{r}^{l-1}(T)V_{th}^{l-1}+\boldsymbol{b}^{l}_{S} \\& =\text{max}(0,\boldsymbol{W}^{l}_{S}\boldsymbol{r}^{l-1}(T)V_{th}^{l-1}+\boldsymbol{b}^{l }_{S}).\end{align}
(11)
For a pre-trained ANN with ReLU activation function, the activation value in layer \(l\) has an upper bound \(\lambda^{l}=\text{max}\{f^{l}(\boldsymbol{x})\}\) when running on the dataset. We normalize the activation of analog neurons as \(p^{l}(\boldsymbol{x})=\frac{f^{l}(\boldsymbol{x})}{\lambda^{l}}\), and reformulate Equation (4) as
\begin{align}p^{l}(\boldsymbol{x})\lambda^{l}=\text{max}(0,\boldsymbol{W}^{l}_{A}p^{l-1}(\boldsymbol{x})\lambda^{l- 1}+\boldsymbol{b}^{l}_{A}). \end{align}
(12)
The essential principle of ANN-to-SNN conversion is that the firing rates of spiking neurons are proportional to the activation values of analog neurons. The firing rates \(\boldsymbol{r}^{l}(T)\) of spiking neurons and the normalized activations \(p^{l}(\boldsymbol{x})\) of analog neurons both lie in \([0,1]\). Typically, the threshold potentials of all layers in SNNs are set to \(1\). Comparing Equation (11) with Equation (12), we can conclude that ANNs can be converted to SNNs through the following parameter normalization rules [10, 41]:
\begin{align}\boldsymbol{W}^{l}_{S}=\boldsymbol{W}^{l}_{A}\frac{\lambda^{l-1}}{\lambda^{l}}; \quad\boldsymbol{b}^{l}_{S}=\boldsymbol{b}^{l}_{A}\frac{1}{\lambda^{l}}.\end{align}
(13)
Since the test set is unknown, in practice the maximum activation \(\lambda^{l}\) is estimated by the inference process on the training set.
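A minimal sketch of this normalization step is given below, assuming the maximum activations \(\lambda^{l}\) have already been collected by running the ANN on the training set; the function and argument names are illustrative.

```python
import torch

def estimate_lambda(activations: list[torch.Tensor]) -> float:
    """Estimate lambda^l as the maximum ReLU activation of layer l observed on the training set."""
    return max(a.max().item() for a in activations)

def normalize_hidden_layer(W_a: torch.Tensor, b_a: torch.Tensor,
                           lam_prev: float, lam_cur: float):
    """Parameter normalization of Equation (13) for a hidden layer."""
    W_s = W_a * (lam_prev / lam_cur)
    b_s = b_a / lam_cur
    return W_s, b_s
```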
With the above parameter normalization method, there are two output decoding schemes for the output neurons of SNNs [21]: one outputs the firing rates \(\boldsymbol{r}^{L}(T)\), and the other outputs the integrated membrane potentials \(\boldsymbol{z}^{L}(t)\). Unfortunately, neither of these output schemes generates the precise output values corresponding to the source ANNs. \(\boldsymbol{r}^{L}(T)\) always falls into \([0,1]\), and \(\boldsymbol{z}^{L}(t)\) changes linearly as \(T\) increases. Actually, both of them are only proportional to the activation values of ANNs [41]. Therefore, the converted SNNs applying these two output schemes cannot output the precise values corresponding to the source ANNs. To guarantee the rendering performance, we propose an output decoding scheme which enables the converted SNNs to output the precise values corresponding to the output values of the source ANNs.
In our proposed precise output decoding scheme, the neurons in the output layer only integrate membrane potentials without firing spikes. Through Equation (6), we can obtain the integrated membrane potentials of the output layer \(L\) over \(T\) time steps. Then we divide it by \(T\) and get
\begin{align}\frac{\boldsymbol{z}^{L}(T)}{T}=\boldsymbol{W}^{L}_{S}\boldsymbol{r}^{L-1}(T)V_{th}^{L-1}+\boldsymbol{b}^{L}_{ S}.\end{align}
(14)
In the same manner as Equation (12), we reformulate Equation (5) as \(f^{L}(\boldsymbol{x})=\boldsymbol{W}^{L}_{A}p^{L-1}(\boldsymbol{x})\lambda^{L-1}+\boldsymbol{b}^{L}_{A}\). Comparing it with Equation (14), we can conclude that the parameters of the output layer should obey the following conversion rules:
\begin{align}\boldsymbol{W}^{L}_{S}=\boldsymbol{W}^{L}_{A}\lambda^{L-1};\quad\boldsymbol{b}^{L}_{S}= \boldsymbol{b}^{L}_{A}.\end{align}
(15)
Equation (11) describes the information transfer between adjacent layers of SNNs. Substituting it into Equation (14), we get
\begin{align}\frac{\boldsymbol{z}^{L}(T)}{T}=\boldsymbol{W}^{L}_{S}(\cdots(\boldsymbol{W}^{1}_{S}\boldsymbol{r}^{0}(t)+\boldsymbol{ b}^{1}_{S})\cdots)+\boldsymbol{b}^{L}_{S},\end{align}
(16)
where \(\boldsymbol{r}^{0}(t)\) is the firing rate of the input spikes. To obtain the numerical relationship between the source ANNs and the converted SNNs, we substitute Equations (13) and (15) into Equation (16) and get
\begin{align}\frac{\boldsymbol{z}^{L}(T)}{T}=\boldsymbol{W}^{L}_{A}(\cdots(\boldsymbol{W}^{1}_{A}\boldsymbol{r}^{0}(t)+\boldsymbol{ b}^{1}_{A})\cdots)+\boldsymbol{b}^{L}_{A}.\end{align}
(17)
Comparing Equation (17) with Equation (5), we see that the integrated membrane potentials of the converted SNN in layer \(L\), divided by the total time step \(T\), equal the output values of the source ANN, i.e., \(\frac{\boldsymbol{z}^{L}(T)}{T}=f^{L}(\boldsymbol{x})\). Therefore, to obtain the precise output values, we adopt \(\frac{\boldsymbol{z}^{L}(T)}{T}\) as the output of Spiking-NeRF.
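The two ingredients of the precise decoding scheme, the output-layer conversion rule of Equation (15) and the non-firing output neurons, could look roughly as follows. This is a sketch with hypothetical names, not the released implementation.

```python
import torch

def convert_output_layer(W_a_L: torch.Tensor, b_a_L: torch.Tensor, lam_prev: float):
    """Output-layer conversion rule of Equation (15)."""
    return W_a_L * lam_prev, b_a_L.clone()

def decode_precise_output(W_s_L: torch.Tensor, b_s_L: torch.Tensor,
                          prev_spikes: torch.Tensor, v_th_prev: float) -> torch.Tensor:
    """Precise output decoding: output neurons integrate but never fire.
    prev_spikes: (T, n_in) binary spike train of layer L-1; returns z^L(T) / T."""
    T = prev_spikes.shape[0]
    z = torch.zeros(W_s_L.shape[0])
    for t in range(T):
        z += W_s_L @ (prev_spikes[t] * v_th_prev) + b_s_L   # Equation (6) without the firing phase
    return z / T                                            # approximately f^L(x) of the source ANN
```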

4.1.3 Customized Parameter Normalization Method for Spiking-NeRF.

According to Equation (1), the maximum value of the input layer is 1; thus the weight and bias between the input layer and its next layer obey the following normalization rule:
\begin{align}\boldsymbol{W}^{l}_{S}=\boldsymbol{W}^{l}_{A}\frac{1}{\lambda^{l}};\quad\boldsymbol{b}^{l }_{S}=\boldsymbol{b}^{l}_{A}\frac{1}{\lambda^{l}}.\end{align}
(18)
Moreover, the neural network architecture of NeRF is different from ordinary fully-connected ANNs. As illustrated in Figure 2, the inputs of stage 2 and stage 3 are the activation values of the previous layer concatenated with \(\gamma(\textbf{x})\) and \(\gamma(\textbf{d})\), respectively. For the parameters of these two layers, we have to apply different parameter normalization rules to the two parts of the concatenated inputs. The parameters for the activation values of the previous layer (blue arrows in Figure 2) still follow the normalization rules of Equation (13):
\begin{align}\boldsymbol{W}^{l}_{S}[P]=\boldsymbol{W}^{l}_{A}[P]\frac{\lambda^{l-1}}{\lambda^{ l}};\quad\boldsymbol{b}^{l}_{S}[P]=\boldsymbol{b}^{l}_{A}[P]\frac{1}{\lambda^{l}}.\end{align}
(19)
Here, \(\boldsymbol{W}^{l}_{A}[P]\) and \(\boldsymbol{b}^{l}_{A}[P]\) represent the weight and bias corresponding to the activation values of the previous layer, respectively. As with the input layer, the maximum value of the inserted input equals 1 rather than \(\lambda^{l-1}\) according to Equation (1). Therefore, the parameters for the inserted inputs (green arrows in Figure 2) obey the following normalization rules during conversion:
\begin{align}\boldsymbol{W}^{l}_{S}[I]=\boldsymbol{W}^{l}_{A}[I]\frac{1}{\lambda^{l}};\quad\boldsymbol{b}^{l}_{S}[I]=\boldsymbol{b}^{l}_{A}[I]\frac{1}{\lambda^{l}},\end{align}
(20)
where \(\boldsymbol{W}^{l}_{A}[I]\) and \(\boldsymbol{b}^{l}_{A}[I]\) respectively represent the weight and bias corresponding to the inserted inputs in the weight matrix \(\boldsymbol{W}^{l}_{A}\) and bias vector \(\boldsymbol{b}^{l}_{A}\) of the current layer. In order to output the precise values, Spiking-NeRF uses the proposed precise output decoding scheme. Therefore, the parameter normalization rule for the output layer is expressed in Equation (15), and the parameter normalization rule for the remaining layers is defined by Equation (13). In conclusion, the overall parameter normalization flow for converting NeRF to Spiking-NeRF is summarized in Figure 4 and Algorithm 1.
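For the two concatenation layers, the split normalization of Equations (19) and (20) can be sketched as below, assuming the first n_prev input columns correspond to the previous-layer activations and the remaining columns to the inserted \(\gamma(\cdot)\) features; the names are illustrative.

```python
import torch

def normalize_concat_layer(W_a: torch.Tensor, b_a: torch.Tensor,
                           lam_prev: float, lam_cur: float, n_prev: int):
    """Normalization for a layer whose input is [previous activations, inserted gamma features]."""
    W_s = W_a.clone()
    W_s[:, :n_prev] = W_a[:, :n_prev] * (lam_prev / lam_cur)   # part P, Equation (19)
    W_s[:, n_prev:] = W_a[:, n_prev:] / lam_cur                # part I, Equation (20): max input is 1
    b_s = b_a / lam_cur                                        # bias rule is the same in both equations
    return W_s, b_s
```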
Fig. 4. The parameter conversion flow for Spiking-NeRF. The colored rectangles indicate the weight matrices and bias vectors of NeRF. Different colors indicate different parameter normalization rules, as illustrated by the legend on the right side of the figure.

4.2 ETS

The parameter \(w_{i}\) in Equation (2) represents the importance of the color of the \(i\)th sampled point along the ray to the pixel; we refer to it as the color weight. As formulated in Equation (3), since the distance \(\delta_{i}\) is non-negative and the volume density \(\sigma_{i}\) is rectified by the ReLU function, the color weight \(w_{i}\) is a non-negative value in the interval [0, 1]. A larger color weight \(w_{i}\) means that \(\textbf{c}_{i}\) contributes more to the pixel, and \(w_{i}=0\) means that \(\textbf{c}_{i}\) has no contribution to the pixel. Figure 5 illustrates the percentage of sampled points with color weights equal to zero and those with color weight \(\in\) (0, 1] in the training set of realistic scenes and synthetic scenes [30], respectively. For realistic scenes, the sampled points with \(w_{i}=0\) are more than \(50\%\). Since each synthetic scene consists of a physical object and a white background (such as the Ship image in Figure 10(a)), the sampled points with \(w_{i}=0\) account for a larger percentage in the synthetic scenes. For example, the sampled points with \(w_{i}=0\) in the Chair scene account for as much as \(94.68\%\). The color weights of the sampled points are obtained based on the outputs of the neural network. Therefore, in NeRF, the number of computational operations required by every sampled point is the same regardless of its contribution to the pixel color, which inevitably results in significant energy overhead.
Fig. 5. The percentage of sampled points with color weight equal to zero and those with color weight \(\in\) (0, 1] in the training set of realistic scenes and synthetic scenes.
As demonstrated in the previous section, the converted SNNs transmit discrete spikes and can generate precise output values compared with the source ANNs. We use this property of the converted SNNs and propose an ETS according to the color weights of sampled points, thus reducing the computational operations and energy consumption. The proposed ETS is summarized in Figure 6 and Algorithm 2. In general, all inputs in SNNs have the same total time step \(T\). But during the inference process of Spiking-NeRF, we can obtain \(\sigma_{i,T_{temp}}\) from the membrane potential and use Equation (3) to calculate the color weight of the sampled point at a proper time step \(T_{temp}\) (\(T_{temp}<T\)), and then determine whether to terminate the inference process of the sampled point based on its color weight \(w_{i,T_{temp}}\). Specifically, the actual time step \(T_{actual}\) of the \(i\)th sampled point is determined by the following function:
Fig. 6. The flowchart of the proposed ETS.
\begin{align}T_{actual}(w_{i,T_{temp}})=\left\{\begin{matrix}T & w_{i,T_{temp}} > 0 \\T_{temp} & w_{i,T_{temp}}=0\end{matrix}\right.\end{align}
(21)
Since a color weight equal to zero indicates that the \(i\)th sampled point has no contribution to the color of the pixel, we terminate the inference process of the sampled points with \(w_{i,T_{temp}}=0\) and output \(\frac{\boldsymbol{z}^{L}_{i}(T_{temp})}{T_{temp}}\). To guarantee the rendering performance, the sampled points with positive color weight (\(w_{i,T_{temp}}>0\)) are still fed into Spiking-NeRF for the remaining \(T-T_{temp}\) time steps (blue dashed input spikes in Figure 6), and Spiking-NeRF outputs \(\frac{\boldsymbol{z}^{L}_{i}(T)}{T}\) at time step \(T\) (blue dashed MP in Figure 6). Therefore, our proposed ETS can reduce the computational overhead and energy consumption of Spiking-NeRF while ensuring image rendering quality.
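The control flow of the ETS for one sampled point might look like the sketch below. The callables step_snn, decode, and color_weight are placeholders for the per-time-step SNN update, the precise decoding of Section 4.1.2, and the weight computation of Equation (3); the assumption that the decoded output exposes the density as its last channel, and the simplification of computing the weight from that density alone, are ours.

```python
def run_point_with_ets(step_snn, decode, color_weight, T: int, T_temp: int):
    """Early Termination Strategy (Equation (21)) for a single sampled point.
    step_snn(t) advances the SNN one time step and returns the accumulated output
    potential z^L(t); decode(z, t) returns z / t; color_weight(density) returns w_i."""
    z = None
    for t in range(1, T + 1):
        z = step_snn(t)
        if t == T_temp:
            color_and_density = decode(z, t)          # (c_i, sigma_i) estimated at T_temp
            if color_weight(color_and_density[-1]) == 0:
                return decode(z, t)                   # w_{i,T_temp} = 0: terminate early
    return decode(z, T)                               # otherwise run for all T time steps
```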

5 Experimental Results

5.1 Implementation Details

We evaluate the rendering performance of Spiking-NeRF on both synthetic and realistic datasets [30], and compare its energy consumption with that of NeRF. All the experiments are implemented on NVIDIA GeForce 2080Ti GPUs with PyTorch framework [37].

5.1.1 Datasets.

The synthetic dataset contains eight scenes of different objects. For each scene, there are \(100\) views for training and \(200\) views for testing. The image of each view has \(400\times 400\) pixels. The complex realistic dataset consists of eight real-world scenes captured by mobile phones. Each scene is composed of \(20\)–\(60\) images, all at \(504\times 378\) pixels, and one eighth of the images are used for testing. These 16 scenes constitute the complete dataset employed in NeRF [30]. Therefore, the experiments in this article also evaluate Spiking-NeRF on these 16 scenes.

5.1.2 Input Encoding for Spiking-NeRF.

According to Equation (1), the inputs of NeRF are mapped with Fourier features before being fed into the neural network. The intensity values of the mapped inputs are in the interval \([-1,1]\). In Spiking-NeRF, we use Poisson rate coding [15] to generate input spike trains, whose firing rates are approximately equal to the absolute values of the corresponding inputs in the ANN. The spike train of the \(i\)th input neuron is positive or negative depending on the sign of its corresponding analog input value in the ANN.
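A minimal sketch of this signed rate coding is shown below, using a Bernoulli draw per time step as an approximation of Poisson rate coding; the function name and tensor layout are our own.

```python
import torch

def signed_poisson_encode(x: torch.Tensor, T: int) -> torch.Tensor:
    """Signed rate coding of the encoded inputs x in [-1, 1].
    Returns a (T, *x.shape) spike train with values in {-1, 0, +1}, where the
    firing rate approximates |x| and the spike sign matches the sign of x."""
    prob = x.abs().clamp(max=1.0).unsqueeze(0).expand(T, *x.shape)
    spikes = torch.bernoulli(prob)          # one Bernoulli draw per time step
    return spikes * torch.sign(x)
```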

5.1.3 Training Setup.

Following the practice in NeRF [30], we employ a 10-layer MLP with ReLU activation functions, as shown in Figure 2, to reconstruct each scene. Furthermore, in order to reduce the conversion loss, trainable clipping layers [16] are added after the ReLU activation for the hidden layers of the MLP, and we perform the parameter normalization for Spiking-NeRF using the parameters of the trainable clipping layers in NeRF. All the models are trained with the Adam [22] optimizer for 200,000 iterations. The batch size is set to 1,024 rays. Following [30], we use an initial learning rate of \(5\times 10^{-4}\) and decay it exponentially as the iterations increase. The number of sampled points for synthetic scenes is set to \(N_{c}=64\) and \(N_{f}=128\), while the number of sampled points for realistic scenes is set to \(N_{c}=N_{f}=64\). The hyper-parameter \(L\) in Equation (1) is set to \(10\) for the 3D coordinate x and \(4\) for the view direction d. For the parameters in the trainable clipping layers, we set their initial values to \(600\) and \(100\) when training synthetic scenes and realistic scenes respectively, with an \(L_{2}\)-regularization coefficient of \(1\times 10^{-1}\).
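For reference, the setup above can be collected into a single configuration block; the field names below are illustrative, only the values come from the text.

```python
# Hypothetical configuration names; the values restate the training setup above.
train_config = dict(
    optimizer="Adam",
    iterations=200_000,
    batch_size_rays=1024,
    lr_init=5e-4,                    # decayed exponentially over iterations
    n_coarse=64,
    n_fine_synthetic=128,            # realistic scenes use n_fine = 64
    L_position=10,                   # hyper-parameter L of Equation (1) for x
    L_direction=4,                   # and for the view direction d
    tcl_init={"synthetic": 600, "realistic": 100},
    tcl_l2_coeff=1e-1,
)
```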

5.1.4 Evaluation Metrics.

The performance of Spiking-NeRF is evaluated from three aspects:
The quality of rendered images. The Peak Signal-to-Noise Ratio (PSNR) of the rendered image is used as the metric for image quality assessment. PSNR is calculated from the mean square error between the ground truth images and the renderings. A larger PSNR value means that the rendered image is more photo-realistic.
The energy consumption during rendering. We report the ratio of the energy consumed by NeRF to that consumed by Spiking-NeRF when rendering a single image. The energy consumption is determined by the type and the quantity of computational operations of neural networks [21, 24, 27, 41]. During the inference process, the computational operations that occur in analog neurons are Multiply-Accumulate (MAC) operations, while integrating discrete binary spikes in spiking neurons requires Accumulate (AC) operations. In the case of ANNs, the quantity of MAC operations is determined by the network architecture, so rendering each image requires the same quantity of MAC operations. In the case of SNNs, the quantity of AC operations is determined by both the firing rates of spiking neurons and the network architecture. Since the firing rates vary from image to image, the quantity of AC operations required by Spiking-NeRF is also different when rendering each image. To make a fair comparison, we calculate the number of AC operations required for Spiking-NeRF to render all images in the test set and take the average value as the number of AC operations required to render a single image. Following the practice in [24], we utilize 3.2 pJ/MAC and 0.1 pJ/AC as the energy consumption baseline.
The scalability of the proposed ETS. To further validate the scalability of our proposed ETS, we change the neural network architecture of NeRF and apply ETS on it. Considering the enormous training cost of NeRF, we use the following four network architectures for experiments:
Case (a) contains 1.5\(\times\) as many neurons as NeRF (384 neurons in each hidden layer);
Case (b) contains 0.5\(\times\) as many neurons as NeRF (128 neurons in each hidden layer);
Case (c) adds one layer to stage 1 and stage 2 in Figure 2, respectively (total 12 layers);
Case (d) reduces one layer in stage 1 and stage 2, respectively (total 8 layers).

5.1.5 Number of Computational Operations.

Since the neural network of NeRF is an MLP, we only consider the number of computational operations in fully connected layers. For an \(L\)-layer fully connected neural network, let \(f_{in,l}\) denote the number of neurons in layer \(l-1\) and \(f_{out,l}\) denote the number of neurons in layer \(l\). The number of MAC operations required for a single forward propagation process of ANN can be calculated as
\begin{align}\sum^{L}_{l=2}f_{in,l}\times f_{out,l}.\end{align}
(22)
For SNN with total time step \(T\), the number of AC operations required for a single forward propagation process can be calculated as
\begin{align}\sum^{T}_{t=1}\left(\sum^{L}_{l=2}s_{l-1}(t)\times f_{out,l}\right),\end{align}
(23)
where \(s_{l-1}(t)\) represents the total number of spikes fired by the spiking neurons in layer \(l-1\) at time step \(t\).
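A small sketch of these operation counts and of the corresponding energy estimate (3.2 pJ/MAC and 0.1 pJ/AC, as stated in Section 5.1.4) is given below; the function names and data layout are our own.

```python
MAC_PJ, AC_PJ = 3.2, 0.1   # energy per operation in picojoules (baseline from Section 5.1.4)

def ann_macs(layer_sizes: list[int]) -> int:
    """Equation (22): MACs for one ANN forward pass.
    layer_sizes lists the number of neurons in each layer, from input to output."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

def snn_acs(spikes_per_step: list[list[int]], layer_sizes: list[int]) -> int:
    """Equation (23): ACs for one SNN forward pass over T time steps.
    spikes_per_step[t][l] is the total number of spikes fired at time step t by
    layer l, which feed the layer of size layer_sizes[l + 1]."""
    return sum(s * n_out
               for step in spikes_per_step
               for s, n_out in zip(step, layer_sizes[1:]))

def energy_joules(macs: int = 0, acs: int = 0) -> float:
    """Energy estimate in joules from the operation counts."""
    return (macs * MAC_PJ + acs * AC_PJ) * 1e-12
```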

5.2 Rendering Performance

We evaluate the rendering performance of Spiking-NeRF quantitatively and qualitatively. Table 1 and Table 2 show the average PSNR values of the images rendered by NeRF, Spiking-NeRF without ETS, and Spiking-NeRF on the test sets of the eight realistic scenes and the eight synthetic scenes, respectively. Spiking-NeRF refers to the spiking neural rendering model with ETS applied. Figures 7 and 8 illustrate the results of rendering an image from the Fern and Ship scenes by Spiking-NeRF without ETS when the total time step \(T\) ranges from \(8\) to \(1{,}024\).
Realistic Scenes | Fortress | Fern | Orchids | Room | Flower | Horns | Leaves | T-Rex
NeRF | 32.48 | 26.72 | 21.13 | 32.73 | 28.06 | 28.65 | 22.24 | 27.81
Spiking-NeRF without ETS (\(T=8\)) | 14.51 | 15.91 | 9.12 | 17.07 | 10.35 | 13.05 | 9.68 | 15.05
Spiking-NeRF without ETS (\(T=16\)) | 18.80 | 18.95 | 10.12 | 20.93 | 14.03 | 17.62 | 12.02 | 17.79
Spiking-NeRF without ETS (\(T=32\)) | 23.51 | 21.85 | 12.17 | 24.99 | 18.51 | 21.94 | 15.15 | 20.88
Spiking-NeRF without ETS (\(T=64\)) | 28.04 | 24.25 | 15.27 | 28.40 | 22.72 | 25.25 | 18.35 | 23.89
Spiking-NeRF without ETS (\(T=128\)) | 30.97 | 25.76 | 18.67 | 30.85 | 25.72 | 27.29 | 20.73 | 26.08
Spiking-NeRF without ETS (\(T=256\)) | 32.06 | 26.45 | 20.59 | 31.97 | 27.22 | 28.23 | 22.83 | 27.17
Spiking-NeRF without ETS (\(T=512\)) | 32.27 | 26.68 | 21.13 | 32.46 | 27.77 | 28.56 | 22.18 | 27.58
Spiking-NeRF without ETS (\(T=1{,}024\)) | 32.38 | 26.76 | 21.23 | 32.62 | 27.96 | 28.66 | 22.27 | 27.72
Spiking-NeRF (\(T=256\)) | 32.05 | 26.46 | 20.58 | 31.95 | 27.22 | 28.23 | 22.81 | 27.16
Table 1. The Average PSNR Value of the Test Set of Realistic Scenes Rendered by (1) NeRF, (2) Spiking-NeRF without ETS, and (3) Spiking-NeRF
Comparing the last two rows (both at \(T=256\)) shows the negligible effect of ETS on PSNR.
Synthetic Scenes | Mic | Materials | Lego | Hotdog | Ship | Drums | Ficus | Chair
NeRF | 33.28 | 29.33 | 31.63 | 36.79 | 29.37 | 25.54 | 29.17 | 33.74
Spiking-NeRF without ETS (\(T=8\)) | 9.81 | 10.36 | 9.81 | 14.85 | 13.22 | 11.14 | 13.98 | 12.24
Spiking-NeRF without ETS (\(T=16\)) | 14.37 | 14.26 | 13.59 | 18.95 | 17.33 | 14.56 | 19.38 | 15.29
Spiking-NeRF without ETS (\(T=32\)) | 19.97 | 19.09 | 18.83 | 23.76 | 21.79 | 17.98 | 24.14 | 20.40
Spiking-NeRF without ETS (\(T=64\)) | 26.11 | 23.56 | 24.50 | 26.33 | 25.49 | 21.59 | 27.09 | 26.33
Spiking-NeRF without ETS (\(T=128\)) | 30.57 | 26.73 | 28.96 | 33.13 | 27.75 | 24.23 | 28.44 | 30.47
Spiking-NeRF without ETS (\(T=256\)) | 32.37 | 28.37 | 30.95 | 35.84 | 28.74 | 25.23 | 28.89 | 32.83
Spiking-NeRF without ETS (\(T=512\)) | 32.83 | 29.01 | 31.43 | 36.77 | 29.10 | 25.47 | 29.06 | 33.56
Spiking-NeRF without ETS (\(T=1{,}024\)) | 32.96 | 29.23 | 31.65 | 37.02 | 29.21 | 25.52 | 29.10 | 33.73
Spiking-NeRF (\(T=256\)) | 32.37 | 28.36 | 30.98 | 35.83 | 28.77 | 25.23 | 28.91 | 32.84
Table 2. The Average PSNR Value of the Test Set of Synthetic Scenes Rendered by (1) NeRF, (2) Spiking-NeRF without ETS, and (3) Spiking-NeRF
Comparing the last two rows (both at \(T=256\)) shows the negligible effect of ETS on PSNR.
Fig. 7. The rendering performance of Spiking-NeRF without ETS on Fern with various time step \(T\).
Fig. 8. The rendering performance of Spiking-NeRF without ETS on Ship with various time step \(T\).
As illustrated in Tables 1 and 2, the average PSNR value of the images on the test set rendered by Spiking-NeRF without ETS increases with the total time step \(T\). Meanwhile, the increment of average PSNR gradually decays as \(T\) increases. Specifically, when the total time step \(T\) increases from 8 to 16, the average PSNR values of Fern and Ship increase by 3.04 and 4.11, respectively. However, when \(T\) increases from 512 to 1,024, the average PSNR values for Fern and Ship only increase by 0.08 and 0.11, respectively. When \(T\) is larger than 256, the PSNR value increases extremely slowly. Moreover, it can be observed from Figures 7 and 8 that the rendered images have almost no visual difference at \(T=256\), \(T=512\), and \(T=1{,}024\), and the variation of the PSNR value is also very small. However, for an SNN, the number of computational operations is proportional to \(T\). An increase in \(T\) from 256 to 512 doubles the computational operations of the SNN, and an increase in \(T\) from 256 to 1,024 quadruples them. To balance the tradeoff between PSNR and energy consumption, we set the total time step \(T\) of Spiking-NeRF to 256 for all scenes.
To implement our proposed ETS, we first sweep the hyper-parameter \(T_{temp}\) on the training set of each scene. The sweep range of \(T_{temp}\) is [8, 88], with an interval of \(8\). For each scene, we select an optimal \(T_{temp}\) that balances the tradeoff between the PSNR and energy consumption on the training set, and then apply the selected \(T_{temp}\) to the inference process of the test set. Figure 9 illustrates the sweep results of four realistic scenes and four synthetic scenes, and we demonstrate how \(T_{temp}\) is selected for each scene. During the sweep process, the PSNR values of Flower, Lego, Ficus, and Materials are almost constant. However, on these scenes, the ratio of energy consumed by NeRF and Spiking-NeRF has a maximum value. We take the time step with the maximum energy ratio as \(T_{temp}\). Therefore, we select \(T_{temp}=16\) for Flower, \(T_{temp}=32\) for Lego, \(T_{temp}=16\) for Ficus, and \(T_{temp}=16\) for Materials to reduce energy consumption. In addition, the ratio of energy consumed by NeRF and Spiking-NeRF decreases as \(T_{temp}\) increases for Fern, T-Rex, Horns, and Ship. Moreover, on these four scenes, the PSNR values of images rendered by Spiking-NeRF gradually stabilize as \(T_{temp}\) increases. After the PSNR values have stabilized, we take the time step with the largest energy ratio as \(T_{temp}\). Therefore, we select \(T_{temp}=56\) for Fern, \(T_{temp}=56\) for T-Rex, \(T_{temp}=72\) for Horns, and \(T_{temp}=16\) for Ship to reduce energy consumption. The \(T_{temp}\) for the remaining scenes is determined by the same approach and is reported in Table 3. We also investigate the relationship between the selection of \(T_{temp}\) and the complexity of the image. We assess the image complexity in terms of the variance of the pixel values and find that a smaller variance corresponds to a smaller value of \(T_{temp}\).
Fig. 9. The PSNR and energy ratio of Spiking-NeRF when sweeping \(T_{temp}\) from 8 to 88 on the training set of realistic scenes and synthetic scenes. The total time step \(T\) is set to 256.
Dataset | ACs (w/o ETS) | Energy (J) (w/o ETS) | Ratio (w/o ETS) | \(T_{temp}\) | ACs (with ETS) | Energy (J) (with ETS) | Ratio (with ETS)
Fern | 4.60E+14 | 46.00 | 1.18 | 56 | 3.03E+14 | 30.30 | 1.78
T-Rex | 4.86E+14 | 48.60 | 1.11 | 56 | 2.91E+14 | 29.10 | 1.86
Horns | 4.72E+14 | 47.20 | 1.15 | 72 | 2.80E+14 | 28.00 | 1.93
Room | 4.57E+14 | 45.70 | 1.18 | 64 | 2.86E+14 | 28.60 | 1.89
Flower | 5.28E+14 | 52.80 | 1.20 | 16 | 3.06E+14 | 30.06 | 1.77
Leaves | 4.98E+14 | 49.80 | 1.09 | 48 | 2.73E+14 | 27.30 | 1.98
Orchid | 4.79E+14 | 47.90 | 1.13 | 24 | 3.04E+14 | 30.40 | 1.78
Fortress | 4.91E+14 | 49.10 | 1.10 | 24 | 2.53E+14 | 25.30 | 2.14
Mic | 8.67E+14 | 86.70 | 0.70 | 16 | 3.56E+14 | 35.60 | 1.70
Lego | 6.93E+14 | 69.30 | 0.87 | 32 | 4.41E+14 | 44.10 | 1.37
Ship | 5.53E+14 | 55.30 | 1.09 | 16 | 3.42E+14 | 34.20 | 1.77
Chair | 8.23E+14 | 82.30 | 0.73 | 24 | 4.10E+14 | 41.00 | 1.48
Drums | 7.41E+14 | 71.40 | 0.82 | 16 | 3.04E+14 | 30.40 | 1.99
Ficus | 7.72E+14 | 77.20 | 0.78 | 16 | 2.67E+14 | 26.70 | 2.27
Hotdog | 6.95E+14 | 69.50 | 0.87 | 24 | 4.83E+14 | 48.30 | 1.25
Materials | 7.89E+14 | 78.90 | 0.77 | 16 | 3.64E+14 | 36.40 | 1.66
Table 3. The Number of Operations and Energy Rendering a Single Image Required by NeRF, Spiking-NeRF without ETS When \(T=256\), and Spiking-NeRF When \(T=256\)
When applying ETS to Spiking-NeRF, as shown in the last rows of Tables 1 and 2, the average PSNR values of all scenes on the test set rendered by Spiking-NeRF and Spiking-NeRF without ETS at \(T=256\) have almost no difference. Moreover, we compare the rendering performance of the proposed precise output decoding scheme with that of the traditional decoding schemes (firing rates and membrane potentials). Figure 10(a) and (b) show that the rendered images of Fern and Ship have no observable visual difference between NeRF and Spiking-NeRF with the proposed decoding scheme. However, as shown in Figure 10(c) and (d), Spiking-NeRF cannot generate recognizable images with either membrane potentials or firing rates as output. Besides, we also validate the effectiveness of the customized parameter normalization method for Spiking-NeRF. When Spiking-NeRF does not apply the customized parameter normalization method for \(\gamma(\textbf{x})\) inserted to stage 2 and \(\gamma(\textbf{d})\) inserted to stage 3, the rendering results on Fern and Ship are shown in Figure 11(b) and (d), respectively, while Figure 11(a) and (c) show the ground truth. There is an obvious visual difference between Figure 11(a) and (b), and between Figure 11(c) and (d). In addition, the PSNR values of Figure 11(b) and (d) are very low.
Fig. 10. The rendering performance of (a) NeRF, (b) Spiking-NeRF with the proposed decoding scheme, (c) Spiking-NeRF with firing rates as output, and (d) Spiking-NeRF with membrane potentials as output.
Fig. 11. The visual difference between the ground truth (a and c) and the images rendered by Spiking-NeRF without the customized parameter normalization method (b and d) on Fern and Ship.

5.3 Energy Consumption

We compare the energy consumed by NeRF, Spiking-NeRF without ETS, and Spiking-NeRF when rendering a single image. We first calculate the average number of computational operations required to render a single image by NeRF and Spiking-NeRF respectively, and then evaluate the total energy consumed to render a single image based on the energy consumption baseline. When rendering a single image in realistic scenes and synthetic scenes, NeRF requires 1.69E+13 MACs and 1.89E+13 MACs, and consumes 54.08 J and 60.48 J of energy, respectively. The reduction in energy consumption achieved by Spiking-NeRF comes from two aspects:
The computational operations in ANNs are energy-intensive MAC operations, while the computational operations in SNNs are less energy-consuming AC operations. As reported in Table 3, when \(T=256\), Spiking-NeRF without ETS shows a slight energy-efficiency advantage over NeRF on realistic scenes while consuming more energy than NeRF on synthetic scenes.
The proposed ETS reduces the number of AC operations in the SNN. After implementing our proposed ETS, Spiking-NeRF has a more significant energy-efficiency advantage. The energy consumption of Spiking-NeRF with \(T=256\) is \(1.25\times\) to \(2.27\times\) lower than that of NeRF.
The reduction in energy consumption achieved by the proposed ETS is associated with the value of \(T_{temp}\) and the percentage of positive color weights \(w_{i,T_{temp}}\). A smaller \(T_{temp}\) or a lower percentage of positive \(w_{i,T_{temp}}\) corresponds to a greater reduction in the energy consumption of the converted SNN with ETS. Moreover, the energy reduced by ETS is also related to the activation values of the source ANN. When the activation values of the source ANN are smaller, the proportion of activated spiking neurons during the interval \((0,T_{temp})\) is lower. Consequently, the computational operations conducted by the converted SNN during the interval \((0,T_{temp})\) are fewer. Therefore, the smaller the activation values of the source ANN, the more energy consumption can be reduced by applying the proposed ETS to the converted SNN.
Dataset | NeRF MACs | Energy | Case (a) MACs | Energy | Case (b) MACs | Energy | Case (c) MACs | Energy | Case (d) MACs | Energy
Fern | 1.69E+13 | 54.08 J | 3.70E+13 | 118.40 J | 0.45E+13 | 14.40 J | 2.17E+13 | 69.44 J | 1.21E+13 | 38.72 J
Ship | 1.89E+13 | 60.48 J | 4.14E+13 | 132.48 J | 0.51E+13 | 16.32 J | 2.43E+13 | 77.76 J | 1.35E+13 | 43.20 J
Table 4. The Number of MAC Operations (Left Column) and Energy Consumption (Right Column) Required by Different ANN Architectures When Rendering a Single Image

5.4 Scalability

The number of MAC operations and the energy consumption required to render a single image by NeRF and the other four ANN architectures are shown in Table 4. We convert the four neural networks described in Section 5.1.4 into SNNs and apply the proposed ETS on the realistic Fern scene and the synthetic Ship scene. In order to ensure the fairness of the experiments, we follow the experimental setup of vanilla Spiking-NeRF in Section 5.2 and set the total time step \(T\) of every SNN to 256. As reported in Table 5, the average PSNR values of the images rendered by the SNN and the ANN on the test set are comparable regardless of the network architecture.
Case (a) and Case (b) have the same number of layers as vanilla NeRF. In these two cases, the average PSNR value of the converted SNN has the same tendency to increase or decrease as the average PSNR value of the source ANN. In Case (c), the average PSNR value of the ANN surpasses that of vanilla NeRF. However, the average PSNR of the converted SNN is observed to be inferior to that of vanilla NeRF. This phenomenon can be attributed to the fact that spiking neurons in deeper layers are harder to activate. On the contrary, if the number of network layers decreases, then the difference between the SNN and the ANN becomes smaller. Therefore, the difference between the average PSNR values of the SNN and the ANN in Case (c) is larger than for vanilla NeRF, while the difference in Case (d) is smaller than for vanilla NeRF.
Dataset | NeRF ANN | SNN | Case (a) ANN | SNN | Case (b) ANN | SNN | Case (c) ANN | SNN | Case (d) ANN | SNN
Fern | 26.72 | 26.46 | 27.05 | 26.80 | 25.45 | 24.98 | 26.89 | 26.39 | 26.56 | 26.40
Ship | 29.37 | 28.77 | 29.80 | 29.42 | 28.07 | 27.37 | 29.42 | 28.57 | 28.94 | 28.58
Table 5. The Average PSNR Value on the Test Set of ANN (Left Column) and SNN with ETS (Right Column) with Different Neural Network Architectures
The total time step \(T\) of SNN is \(256\).
The number of AC operations and the energy consumption required to render a single image by the converted SNNs (without and with ETS) for the different network architectures are shown in Table 6. As Case (a) and Case (c) have more complicated neural network architectures than vanilla NeRF, the computational operations of the converted SNNs increase. In these two cases, SNNs without ETS have a slight energy reduction over ANNs. The proposed ETS reduces the number of computational operations when SNNs render images. Therefore, SNNs with ETS achieve a larger energy reduction than SNNs without ETS. For example, in Case (a), the energy consumption of the source ANN is 2.40\(\times\) that of the converted SNN with ETS on the Fern dataset. The computational operations of Case (b) and Case (d) are fewer than those of vanilla NeRF. In these two cases, the energy consumption of the converted SNN without ETS is more than that of the source ANN. In Case (b), the energy consumption of the source ANN on the Fern dataset is 0.5\(\times\) as much as that of the converted SNN. Nevertheless, after applying our proposed ETS, the converted SNNs consume less energy than the source ANNs. In conclusion, the ETS proposed in this article enables the converted SNNs to have lower energy consumption than the source ANNs for all experimental neural networks.
As reported in Table 6, the energy ratios for Case (a) and Case (c) with ETS are both greater than those of NeRF, while the energy ratios for Case (b) and Case (d) with ETS are lower than those of NeRF. This is because deeper or wider ANNs require more MAC operations. After a deeper or wider ANN is converted into an SNN, the ETS can eliminate more AC operations in the converted SNN. Therefore, the proposed ETS shows better performance on deeper and wider neural networks.
Network and Dataset | ACs (w/o ETS) | Energy (J) (w/o ETS) | Ratio (w/o ETS) | \(T_{temp}\) | ACs (with ETS) | Energy (J) (with ETS) | Ratio (with ETS)
NeRF, Fern | 4.60E+14 | 46.00 | 1.18 | 56 | 3.03E+14 | 30.30 | 1.78
NeRF, Ship | 5.53E+14 | 55.30 | 1.09 | 16 | 3.42E+14 | 34.20 | 1.77
Case (a), Fern | 9.24E+14 | 92.40 | 1.28 | 56 | 4.94E+14 | 49.40 | 2.40
Case (a), Ship | 1.05E+15 | 105.0 | 1.26 | 16 | 6.11E+14 | 61.60 | 2.15
Case (b), Fern | 2.57E+14 | 25.70 | 0.56 | 56 | 1.33E+14 | 13.30 | 1.08
Case (b), Ship | 2.64E+14 | 26.40 | 0.62 | 16 | 1.37E+14 | 13.70 | 1.19
Case (c), Fern | 6.20E+14 | 62.00 | 1.12 | 64 | 3.44E+14 | 34.40 | 2.02
Case (c), Ship | 7.32E+14 | 73.20 | 1.06 | 24 | 4.35E+14 | 43.50 | 1.79
Case (d), Fern | 5.15E+14 | 51.50 | 0.75 | 32 | 2.48E+14 | 24.80 | 1.56
Case (d), Ship | 4.76E+14 | 47.60 | 0.91 | 16 | 3.33E+14 | 33.30 | 1.30
Table 6. The Number of AC Operations and Energy Consumption Required to Render a Single Image by SNN without ETS and SNN with ETS for Different Neural Network Architectures
The total time step \(T\) of SNN is \(256\).

5.5 Discussion

5.5.1 Estimation of Energy Consumption.

The experimental results presented in this article are obtained through simulation on a general-purpose GPU platform, rather than by running Spiking-NeRF on a neuromorphic chip. Therefore, to compare the energy consumption, we use the energy consumed by an AC operation and a MAC operation on the same hardware platform as the energy consumption baseline. In reality, the energy efficiency of SNNs is even more pronounced when running on neuromorphic chips. If we follow the practice in [21] and estimate the energy consumption of Spiking-NeRF on the neuromorphic chip TrueNorth and of NeRF on the NVIDIA TITAN V100 GPU, our findings indicate that Spiking-NeRF can achieve a reduction in energy consumption of 2.18\(\times\) to 3.95\(\times\).

5.5.2 Discussion with Other Works Based on NeRF.

Although NeRF is an innovative method for reconstructing scenes, it does not perform well in rendering complex scenes and consumes a large amount of time and energy during both training and inference. Numerous optimization works have been conducted based on NeRF. At the algorithm level, researchers extend NeRF to handle dynamic scenes [26, 34], deformable objects [12, 36], scenes with changing illumination and occluders [28, 47], neural relighting tasks [3, 43], generative models [44, 51], and so on. In addition, there are also some works that improve the image quality [2, 50, 53] or reduce the energy consumption [31] of NeRF. At the computing hardware level, researchers have proposed specialized hardware architectures for NeRF to achieve energy-efficient rendering [39, 54]. Different from the aforementioned related works, our method achieves energy efficiency at the computing-paradigm level. Notably, our article is the first work applying SNNs to neural rendering. The algorithm-level studies based on NeRF can leverage our proposed methods to convert ANNs to SNNs, thereby achieving energy-efficient spiking neural rendering. Furthermore, similar to the hardware-level studies, the neuromorphic hardware architecture for the deployment of Spiking-NeRF can be customized to achieve energy efficiency.

6 Conclusion

In this article, we propose Spiking-NeRF, an energy-efficient spiking neural rendering model. We first propose a precise output decoding scheme for SNNs, which allows the converted SNNs to output the precise values corresponding to the source ANNs. Then we customize the parameter normalization method for the special network architecture of neural rendering. Furthermore, we present an ETS to improve the energy-efficiency of Spiking-NeRF. We evaluate the performance of Spiking-NeRF on both realistic scenes and synthetic scenes. Experimental results show that compared to ANN-based NeRF, Spiking-NeRF can achieve comparable rendering performance with up to \(2.27\times\) energy reduction.

References

[1] Filipp Akopyan, Jun Sawada, Andrew Cassidy, Rodrigo Alvarez-Icaza, John Arthur, Paul Merolla, Nabil Imam, Yutaka Nakamura, Pallab Datta, Gi-Joon Nam, et al. 2015. TrueNorth: Design and Tool Flow of a 65 mW 1 Million Neuron Programmable Neurosynaptic Chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 34, 10 (2015), 1537–1557.
[2] Jonathan T. Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P. Srinivasan. 2021. Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5855–5864.
[3] Mark Boss, Raphael Braun, Varun Jampani, Jonathan T. Barron, Ce Liu, and Hendrik Lensch. 2021. NeRD: Neural Reflectance Decomposition from Image Collections. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 12684–12694.
[4] Tong Bu, Jianhao Ding, Zhaofei Yu, and Tiejun Huang. 2022. Optimized Potential Initialization for Low-Latency Spiking Neural Networks. arXiv: 2202.01440. Retrieved from https://arxiv.org/abs/2202.01440
[5] Yongqiang Cao, Yang Chen, and Deepak Khosla. 2015. Spiking Deep Convolutional Neural Networks for Energy-Efficient Object Recognition. International Journal of Computer Vision 113, 1 (2015), 54–66.
[6] Anpei Chen, Minye Wu, Yingliang Zhang, Nianyi Li, Jie Lu, Shenghua Gao, and Jingyi Yu. 2018. Deep Surface Light Fields. Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT) 1, 1 (2018), 1–17.
[7] Yu-Hsin Chen, Tien-Ju Yang, Joel Emer, and Vivienne Sze. 2019. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices. IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS) 9, 2 (2019), 292–308.
[8] Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, and Kailash Gopalakrishnan. 2018. PACT: Parameterized Clipping Activation for Quantized Neural Networks. arXiv: 1805.06085. Retrieved from https://arxiv.org/abs/1805.06085
[9] Shikuang Deng and Shi Gu. 2021. Optimal Conversion of Conventional Artificial Neural Networks to Spiking Neural Networks. arXiv: 2103.00476. Retrieved from https://arxiv.org/abs/2103.00476
[10] Peter U. Diehl, Daniel Neil, Jonathan Binas, Matthew Cook, Shih-Chii Liu, and Michael Pfeiffer. 2015. Fast-Classifying, High-Accuracy Spiking Deep Networks Through Weight and Threshold Balancing. In Proceedings of 2015 International Joint Conference on Neural Networks (IJCNN). IEEE, 1–8.
[11] Paul Ferré, Franck Mamalet, and Simon J. Thorpe. 2018. Unsupervised Feature Learning With Winner-Takes-All Based STDP. Frontiers in Computational Neuroscience 12 (2018), 24.
[12] Guy Gafni, Justus Thies, Michael Zollhofer, and Matthias Nießner. 2021. Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 8649–8658.
[13] Bing Han, Gopalakrishnan Srinivasan, and Kaushik Roy. 2020. RMP-SNN: Residual Membrane Potential Neuron for Enabling Deeper High-Accuracy and Low-Latency Spiking Neural Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 13558–13567.
[14] Song Han, Jeff Pool, John Tran, and William Dally. 2015. Learning Both Weights and Connections for Efficient Neural Network. Advances in Neural Information Processing Systems (NeurIPS) 28 (2015), 1135–1143.
[15] David Heeger. 2000. Poisson Model of Spike Generation. Handout, Stanford University 5, 1–13 (2000), 76.
[16] Nguyen-Dong Ho and Ik-Joon Chang. 2021. TCL: An ANN-to-SNN Conversion with Trainable Clipping Layers. In Proceedings of 2021 58th ACM/IEEE Design Automation Conference (DAC). IEEE, 793–798.
[17] Alan L. Hodgkin and Andrew F. Huxley. 1952. A Quantitative Description of Membrane Current and Its Application to Conduction and Excitation in Nerve. The Journal of Physiology 117, 4 (1952), 500.
[18] Eric Hunsberger and Chris Eliasmith. 2015. Spiking Deep Networks with LIF Neurons. arXiv: 1510.08829. Retrieved from https://arxiv.org/abs/1510.08829
[19] Xiao Jin, Baoyun Peng, Yichao Wu, Yu Liu, Jiaheng Liu, Ding Liang, Junjie Yan, and Xiaolin Hu. 2019. Knowledge Distillation via Route Constrained Optimization. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 1345–1354.
[20] Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2017. Google’s Multilingual Neural Machine Translation System: Enabling Zero-shot Translation. Transactions of the Association for Computational Linguistics 5 (2017), 339–351.
[21] Seijoon Kim, Seongsik Park, Byunggook Na, and Sungroh Yoon. 2020. Spiking-YOLO: Spiking Neural Network for Energy-efficient Object Detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 11270–11277.
[22] Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv: 1412.6980. Retrieved from https://arxiv.org/abs/1412.6980
[23] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems (NeurIPS) 25 (2012), 1097–1105.
[24] Chankyu Lee, Syed Shakib Sarwar, Priyadarshini Panda, Gopalakrishnan Srinivasan, and Kaushik Roy. 2020. Enabling Spike-Based Backpropagation for Training Deep Neural Network Architectures. Frontiers in Neuroscience (2020), 119.
[25] Jun Haeng Lee, Tobi Delbruck, and Michael Pfeiffer. 2016. Training Deep Spiking Neural Networks Using Backpropagation. Frontiers in Neuroscience 10 (2016), 508.
[26] Zhengqi Li, Simon Niklaus, Noah Snavely, and Oliver Wang. 2021. Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 6498–6508.
[27] Fangxin Liu, Wenbo Zhao, Zhezhi He, Yanzhi Wang, Zongwu Wang, Changzhi Dai, Xiaoyao Liang, and Li Jiang. 2021. Improving Neural Network Efficiency via Post-Training Quantization with Adaptive Floating-Point. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 5281–5290.
[28] Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, and Daniel Duckworth. 2021. NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 7210–7219.
[29] Nelson Max. 1995. Optical Models for Direct Volume Rendering. IEEE Transactions on Visualization and Computer Graphics (T-VCG) 1, 2 (1995), 99–108.
[30] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. In Proceedings of European Conference on Computer Vision (ECCV). Springer, 405–421.
[31] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. 2022. Instant Neural Graphics Primitives with a Multiresolution Hash Encoding. ACM Transactions on Graphics (ToG) 41, 4 (2022), 1–15.
[32] Ali Bou Nassif, Ismail Shahin, Imtinan Attili, Mohammad Azzeh, and Khaled Shaalan. 2019. Speech Recognition Using Deep Neural Networks: A Systematic Review. IEEE Access 7 (2019), 19143–19165.
[33] Emre O. Neftci, Hesham Mostafa, and Friedemann Zenke. 2019. Surrogate Gradient Learning in Spiking Neural Networks: Bringing the Power of Gradient-Based Optimization to Spiking Neural Networks. IEEE Signal Processing Magazine 36, 6 (2019), 51–63.
[34] Julian Ost, Fahim Mannan, Nils Thuerey, Julian Knodt, and Felix Heide. 2021. Neural Scene Graphs for Dynamic Scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2856–2865.
[35] Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. 2019. DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Keunhong Park, Utkarsh Sinha, Peter Hedman, Jonathan T. Barron, Sofien Bouaziz, Dan B. Goldman, Ricardo Martin-Brualla, and Steven M. Seitz. 2021. HyperNeRF: A Higher-dimensional Representation for Topologically Varying Neural Radiance Fields. arXiv: 2106.13228. Retrieved from https://arxiv.org/abs/2106.13228
[37] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems (NeurIPS) 32 (2019), 8024–8035.
[38] Ning Qiao, Hesham Mostafa, Federico Corradi, Marc Osswald, Fabio Stefanini, Dora Sumislawska, and Giacomo Indiveri. 2015. A Reconfigurable On-Line Learning Spiking Neuromorphic Processor Comprising 256 Neurons and 128K Synapses. Frontiers in Neuroscience 9 (2015), 141.
[39] Chaolin Rao, Huangjie Yu, Haochuan Wan, Jindong Zhou, Yueyang Zheng, Minye Wu, Yu Ma, Anpei Chen, Binzhe Yuan, Pingqiang Zhou, et al. 2022. ICARUS: A Specialized Architecture for Neural Radiance Fields Rendering. ACM Transactions on Graphics (TOG) 41, 6 (2022), 1–14.
[40] Kaushik Roy, Akhilesh Jaiswal, and Priyadarshini Panda. 2019. Towards Spike-Based Machine Intelligence with Neuromorphic Computing. Nature 575, 7784 (2019), 607–617.
[41] Bodo Rueckauer, Iulia-Alexandra Lungu, Yuhuang Hu, Michael Pfeiffer, and Shih-Chii Liu. 2017. Conversion of Continuous-valued Deep Networks to Efficient Event-Driven Networks for Image Classification. Frontiers in Neuroscience 11 (2017), 682.
[42] Gopalakrishnan Srinivasan, Priyadarshini Panda, and Kaushik Roy. 2018. STDP-Based Unsupervised Feature Learning Using Convolution-Over-Time in Spiking Neural Networks for Energy-Efficient Neuromorphic Computing. ACM Journal on Emerging Technologies in Computing Systems (JETC) 14, 4 (2018), 1–12.
[43] Pratul P. Srinivasan, Boyang Deng, Xiuming Zhang, Matthew Tancik, Ben Mildenhall, and Jonathan T. Barron. 2021. NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 7495–7504.
[44] Shih-Yang Su, Frank Yu, Michael Zollhöfer, and Helge Rhodin. 2021. A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose. Advances in Neural Information Processing Systems (NeurIPS) 34 (2021), 12278–12291.
[45] Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, and Joel S. Emer. 2017. Efficient Processing of Deep Neural Networks: A Tutorial and Survey. Proceedings of the IEEE 105, 12 (2017), 2295–2329.
[46] Weihao Tan, Devdhar Patel, and Robert Kozma. 2021. Strategy and Benchmark for Converting Deep Q-Networks to Event-Driven Spiking Neural Networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 9816–9824.
[47] Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, and Ren Ng. 2021. Learned Initializations for Optimizing Coordinate-Based Neural Representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2846–2855.
[48] Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, et al. 2020. State of the Art on Neural Rendering. In Proceedings of Computer Graphics Forum, Vol. 39. Wiley Online Library, 701–727.
[49] Ayush Tewari, Justus Thies, Ben Mildenhall, Pratul Srinivasan, Edgar Tretschk, W. Yifan, Christoph Lassner, Vincent Sitzmann, Ricardo Martin-Brualla, Stephen Lombardi, Tomas Simon, Christian Theobalt, Matthias Nießner, Jonathan T. Barron, Gordon Wetzstein, Michael Zollhöfer, and Vladislav Golyanik. 2022. Advances in Neural Rendering. In Proceedings of Computer Graphics Forum, Vol. 41. Wiley Online Library, 703–735.
[50] Dor Verbin, Peter Hedman, Ben Mildenhall, Todd Zickler, Jonathan T. Barron, and Pratul P. Srinivasan. 2022. Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 5481–5490.
[51] Hongyi Xu, Thiemo Alldieck, and Cristian Sminchisescu. 2021. H-NeRF: Neural Radiance Fields for Rendering and Temporal Reconstruction of Humans in Motion. Advances in Neural Information Processing Systems (NeurIPS) 34 (2021), 14955–14966.
[52] Tien-Ju Yang, Yu-Hsin Chen, and Vivienne Sze. 2017. Designing Energy-efficient Convolutional Neural Networks Using Energy-Aware Pruning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 5687–5695.
[53] Alex Yu, Vickie Ye, Matthew Tancik, and Angjoo Kanazawa. 2021. pixelNeRF: Neural Radiance Fields from One or Few Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 4578–4587.
[54] Yueyang Zheng, Chaolin Rao, Haochuan Wan, Yuliang Zhou, Pingqiang Zhou, Jingyi Yu, and Xin Lou. 2022. An RRAM-Based Neural Radiance Field Processor. In 2022 IEEE 35th International System-on-Chip Conference (SOCC). IEEE, 1–5.
