Research Article | Open Access

Zeroth-Order Optimization of Optical Neural Networks with Linear Combination Natural Gradient and Calibrated Model

Published: 07 November 2024

Abstract

Optical neural networks (ONNs) have attracted great attention due to their low energy consumption and high-speed processing. The usual neural network training scheme performs poorly on ONNs because of their special parameterization and fabrication variations. This paper extends zeroth-order (ZO) optimization, which can be used to train such ONNs, in two ways. The first is a linear combination natural gradient, which mitigates the optimization difficulty caused by the special parameterization of an ONN. The second is a guided direction vector generated by calibration, which provides better search directions than the random vectors used in standard ZO optimization. Experimental results show that the two extensions significantly outperform existing ZO optimization and related methods with little computational overhead.
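
For context, the technique the abstract builds on is two-point zeroth-order gradient estimation, which trains a device by probing its loss along sampled directions rather than backpropagating through it. Below is a minimal sketch in Python/NumPy, assuming a standard two-point estimator and an illustrative 50/50 mixing weight for the guided direction; the function name `zo_gradient_estimate` and the mixing scheme are hypothetical, and the paper's linear combination natural gradient and calibration procedure are not detailed in the abstract.

```python
import numpy as np

def zo_gradient_estimate(loss_fn, theta, mu=1e-3, num_dirs=8, guided_dir=None):
    """Two-point zeroth-order gradient estimate.

    Averages directional finite differences over random Gaussian probes.
    If a guided direction (e.g., from a calibrated model) is supplied,
    it is mixed into each probe; the 0.5/0.5 weighting here is an
    illustrative choice, not the paper's construction.
    """
    d = theta.size
    grad = np.zeros(d)
    for _ in range(num_dirs):
        u = np.random.randn(d)
        if guided_dir is not None:
            # Bias the random probe toward the guided direction.
            u = 0.5 * u + 0.5 * guided_dir / (np.linalg.norm(guided_dir) + 1e-12)
        u /= np.linalg.norm(u)
        # Directional finite difference: only black-box evaluations of
        # the (hardware) loss are needed, no analytic gradient.
        diff = loss_fn(theta + mu * u) - loss_fn(theta - mu * u)
        grad += (diff / (2.0 * mu)) * u
    return grad / num_dirs

# Toy usage on a quadratic loss standing in for the ONN forward pass.
loss = lambda th: np.sum((th - 1.0) ** 2)
theta = np.zeros(4)
for _ in range(200):
    theta -= 0.1 * zo_gradient_estimate(loss, theta)
print(theta)  # approaches the optimum [1, 1, 1, 1]
```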

Published In

DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference
June 2024, 2159 pages
ISBN: 9798400706011
DOI: 10.1145/3649329
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. optical neural network (ONN)
  2. neural network training
  3. zeroth-order (ZO) optimization
  4. natural gradient
  5. calibration

Conference

DAC '24: 61st ACM/IEEE Design Automation Conference
June 23-27, 2024, San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
