research-article

Open access

Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference

Authors:

Christian Williams, Alexander Sludds, Homa Esfahanizadeh, Manya Ghobadi

ACM SIGCOMM '23: Proceedings of the ACM SIGCOMM 2023 Conference

Pages 452 - 472

https://doi.org/10.1145/3603269.3604821

Published: 01 September 2023

Abstract

The massive growth of machine learning-based applications and the end of Moore's law have created a pressing need to redesign computing platforms. We propose Lightning, the first reconfigurable photonic-electronic smartNIC to serve real-time deep neural network inference requests. Lightning uses a fast datapath to feed traffic from the NIC into the photonic domain without creating digital packet processing and data movement bottlenecks. To do so, Lightning leverages a novel reconfigurable count-action abstraction that keeps track of the required computation operations of each inference packet. Our count-action abstraction decouples the compute control plane from the data plane by counting the number of operations in each task and triggers the execution of the next task(s) without interrupting the dataflow. We evaluate Lightning's performance using four platforms: a prototype, chip synthesis, emulations, and simulations. Our prototype demonstrates the feasibility of performing 8-bit photonic multiply-accumulate operations with 99.25% accuracy. To the best of our knowledge, our prototype is the highest-frequency photonic computing system, capable of serving real-time inference queries at 4.055 GHz end-to-end. Our simulations with large DNN models show that compared to Nvidia A100 GPU, A100X DPU, and Brainwave smartNIC, Lightning accelerates the average inference serve time by 337×, 329×, and 42×, while consuming 352×, 419×, and 54× less energy, respectively.
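The count-action idea described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class names (`CountActionRule`, `CountActionTable`), the task label `layer1_mac`, and the logged action string are hypothetical and are not taken from the paper, and the integer multiply-add merely stands in for a photonic multiply-accumulate. It shows the one property the abstract states: the control side only counts completed operations and triggers the next task when the count is reached, while the data loop itself is never paused.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CountActionRule:
    """Hypothetical rule: after `target` operations of a task, fire `actions`."""
    target: int
    actions: List[Callable[[], None]] = field(default_factory=list)
    count: int = 0


class CountActionTable:
    """Toy software model of a count-action table (not Lightning's hardware)."""

    def __init__(self) -> None:
        self.rules: Dict[str, CountActionRule] = {}

    def add_rule(self, task: str, target: int, *actions: Callable[[], None]) -> None:
        self.rules[task] = CountActionRule(target, list(actions))

    def observe(self, task: str) -> bool:
        """Record one completed operation; return True if the actions fired."""
        rule = self.rules[task]
        rule.count += 1
        if rule.count < rule.target:
            return False
        rule.count = 0  # reset so the same rule can serve the next request
        for action in rule.actions:
            action()
        return True


def run_dot_product(table: CountActionTable, task: str,
                    xs: List[int], ws: List[int]) -> int:
    """Stream operands through a multiply-accumulate loop while the table counts."""
    acc = 0
    for x, w in zip(xs, ws):
        acc += x * w         # stands in for one multiply-accumulate operation
        table.observe(task)  # the control side only counts; the loop never stalls
    return acc


# Usage: trigger a (hypothetical) follow-up task after four multiply-accumulates.
log: List[str] = []
table = CountActionTable()
table.add_rule("layer1_mac", 4, lambda: log.append("start_layer1_activation"))
result = run_dot_product(table, "layer1_mac", [1, 2, 3, 4], [5, 6, 7, 8])
```

In this model the dataflow loop has no knowledge of what comes next; the table alone decides when the follow-up task starts, which is the decoupling of control from data that the abstract describes.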

Index Terms

Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Analog computers
      2. Optical computing
  2. Real-time systems
    1. Real-time system architecture
2. Hardware

Information & Contributors

Information

Published In

ACM SIGCOMM '23: Proceedings of the ACM SIGCOMM 2023 Conference

September 2023

1217 pages

ISBN:9798400702365

DOI:10.1145/3603269

Chairs: Henning Schulzrinne, Vishal Misra

Program Chairs: Eddie Kohler, David Maltz

Copyright © 2023 Owner/Author(s).

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGCOMM: ACM Special Interest Group on Data Communication

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

DARPA
Air Force AI Accelerator
ARPA-E
NSF (National Science Foundation)
Sloan fellowship
the U.S. Army Research Office through the Institute for Soldier Nanotechnologies (ISN)
NSF Center for Quantum Networks

Conference

ACM SIGCOMM '23

Sponsor:

SIGCOMM

ACM SIGCOMM '23: ACM SIGCOMM 2023 Conference

September 10, 2023

New York, NY, USA

Acceptance Rates

Overall Acceptance Rate 462 of 3,389 submissions, 14%

Bibliometrics & Citations

Bibliometrics

Total Citations: 8
Total Downloads: 4,010
Downloads (Last 12 months): 2,359
Downloads (Last 6 weeks): 201

Reflects downloads up to 15 Jan 2025

Citations

Cited By

Zhang H, Zhang H, Huang Z, Chen Y. (2025). ChipAI: A scalable chiplet-based accelerator for efficient DNN inference using silicon photonics. Journal of Systems Architecture 158 (103308). Online publication date: Jan-2025. https://doi.org/10.1016/j.sysarc.2024.103308

Murali M, Banerjee A, Basu T. (2024). Lithium niobate on insulator: an emerging nanophotonic crystal for optimized light control. Beilstein Journal of Nanotechnology 15 (1415-1426). Online publication date: 14-Nov-2024. https://doi.org/10.3762/bjnano.15.114

Cheng W, Gao Z, Chen T, Ganesan D, Lane N, Shi W. (2024). Real-time Wideband Software-defined Radio with Python Programmability based on RFSoC. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (1772-1774). Online publication date: 4-Dec-2024. https://dl.acm.org/doi/10.1145/3636534.3698855

Andrulis T, Chaudhry G, Suriyakumar V, Emer J, Sze V. (2024). Architecture-Level Modeling of Photonic Deep Neural Network Accelerators. 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (307-309). Online publication date: 5-May-2024. https://doi.org/10.1109/ISPASS61541.2024.00040

Cong G, Yamamoto N, Kou R, Maegami Y, Namiki S, Yamada K. (2024). Vertically hierarchical electro-photonic neural network by cascading element-wise multiplication. APL Photonics 9:5. Online publication date: 28-May-2024. https://doi.org/10.1063/5.0197033

Kesgin B, Teğin U. (2024). Implementing the analogous neural network using chaotic strange attractors. Communications Engineering 3:1. Online publication date: 15-Jul-2024. https://doi.org/10.1038/s44172-024-00242-z

Li Z, Gan R, Chen Z, Deng Z, Gao R, Chen K, Guo C, Zhang Y, Liu L, Yu S, Liu J. (2024). Scalable On-Chip Optoelectronic Ising Machine Utilizing Thin-Film Lithium Niobate Photonics. ACS Photonics 11:4 (1703-1714). Online publication date: 15-Mar-2024. https://doi.org/10.1021/acsphotonics.4c00003

Barabash N, Sidorenko K, Nezhdanov A, Bobrov A. (2024). Accuracy of linear transformations performed on a nonideal Mach–Zehnder interferometer. Optics Communications 566 (130686). Online publication date: Sep-2024. https://doi.org/10.1016/j.optcom.2024.130686
