Agent-Based Collaborative Random Search for Hyperparameter Tuning and Global Function Optimization †
Figure 1. Hierarchical structure built for $\boldsymbol{\lambda}_o = \{\lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6\}$, where the primary and complementary hyperparameters of each node are highlighted in green and orange, respectively, and the labels are the indexes of $\lambda_i$.
Figure 2. A toy example demonstrating three iterations of the proposed method tuning two hyperparameters $\lambda_1$ and $\lambda_2$ using terminal agents $g^{1}_{\lambda_1}$ and $g^{1}_{\lambda_2}$, respectively. It is assumed that $b = 3$ for each agent.
Figure 3. Average performance of the C-support vector classification (SVC) (first row) and stochastic gradient descent (SGD) (second row) classifiers on two synthetic classification datasets, measured by accuracy. The error bars in each plot are calculated based on the standard error.
Figure 4. Average performance of the passive aggressive (first row) and elastic net (second row) regression algorithms on two synthetic regression datasets, measured by mean squared error (MSE). The error bars in each plot are calculated based on the standard error.
Figure 5. Average values of the Hartmann function optimized under variable iterations, budgets, explorations, and connection thresholds. Each row of the figure pertains to a particular dimension size, and the error bars are calculated based on the standard error.
Figure 6. Average values of the Rastrigin function optimized under variable iterations, budgets, explorations, and connection thresholds. Each row of the figure pertains to a particular dimension size, and the error bars are calculated based on the standard error.
Figure 7. Average values of the Styblinski–Tang function optimized under variable iterations, budgets, explorations, and connection thresholds. Each row of the figure pertains to a particular dimension size, and the error bars are calculated based on the standard error.
Figure 8. Average values of the toy mean absolute error function optimized under variable iterations, budgets, explorations, and connection thresholds. Each row of the figure pertains to a particular dimension size, and the error bars are calculated based on the standard error.
Figure 9. Average function values of four objective functions optimized under a varying number of iterations. Each row of the figure pertains to a particular dimension size, and the error bars are calculated based on the standard error.
Figure 10. Average function values of four objective functions optimized under varying budget values. Each row of the figure pertains to a particular dimension size, and the error bars are calculated based on the standard error.
Abstract
1. Introduction
2. Methodology
2.1. Preliminaries
2.2. Agent-Based Randomized Searching Algorithm
2.2.1. Distributed Hierarchy Formation
Algorithm 1: Distributed formation of the hierarchical agent-based hyperparameter tuning structure.
2.2.2. Collaborative Tuning Process
Algorithm 2: Iterated collaborative tuning procedure.
Algorithm 3: A terminal agent’s randomized tuning process.
3. Results and Discussion
3.1. Computational Complexity
3.2. Empirical Results
| ML Algorithm | No. | Dataset | Performance Metric |
|---|---|---|---|
| C-Support Vector Classification (SVC) [26,27] | 1 | artificial (100, 20) † | accuracy |
| Stochastic Gradient Descent (SGD) Classifier [26] | 2 | artificial (500, 20) † | accuracy |
| Passive Aggressive Regressor [26,28] | 3 | artificial (300, 100) ‡ | mean squared error |
| Elastic Net Regressor [26,29] | 4 | artificial (300, 100) ‡ | mean squared error |
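To make the tabulated setup concrete, the following is a minimal scikit-learn sketch of the four listed estimators evaluated on synthetic data of the tabulated sizes [26,30]. It assumes the (n, m) pairs denote (samples, features); the data generators, the train/test split, and the placeholder hyperparameter values are illustrative assumptions, not the tuned search spaces or evaluation protocol of the paper.

```python
# Hypothetical sketch: the four estimators from the table on synthetic datasets.
# Generator settings, split, and hyperparameter values are assumptions.
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import SGDClassifier, PassiveAggressiveRegressor, ElasticNet
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic classification data (100, 20); the second set would use (500, 20).
Xc, yc = make_classification(n_samples=100, n_features=20, random_state=0)
# Synthetic regression data (300, 100).
Xr, yr = make_regression(n_samples=300, n_features=100, noise=1.0, random_state=0)

Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(Xc, yc, random_state=0)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(Xr, yr, random_state=0)

# Placeholder hyperparameter values; the paper tunes these via the proposed search.
svc = SVC(C=1.0, gamma="scale").fit(Xc_tr, yc_tr)
sgd = SGDClassifier(alpha=1e-4, max_iter=1000).fit(Xc_tr, yc_tr)
par = PassiveAggressiveRegressor(C=1.0, max_iter=1000).fit(Xr_tr, yr_tr)
enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(Xr_tr, yr_tr)

print("SVC accuracy:", accuracy_score(yc_te, svc.predict(Xc_te)))
print("SGD accuracy:", accuracy_score(yc_te, sgd.predict(Xc_te)))
print("PA regressor MSE:", mean_squared_error(yr_te, par.predict(Xr_te)))
print("Elastic net MSE:", mean_squared_error(yr_te, enet.predict(Xr_te)))
```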
| Function | Global Minimum |
|---|---|
| Hartmann, 3D, 4D, 6D [31] | 3D: −3.86278, 4D: −3.135474, 6D: −3.32237 |
| Rastrigin, 3D, 6D, 10D [32] | 3D: 0, 6D: 0, 10D: 0 |
| Styblinski–Tang, 3D, 6D, 10D [33] | 3D: −117.4979, 6D: −234.9959, 10D: −391.6599 |
| Mean Absolute Error, 3D, 6D, 10D † | 3D: 0, 6D: 0, 10D: 0 ‡ |
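For reference, below are straightforward NumPy implementations of two of the benchmark functions above in their standard formulations [32,33]; the Hartmann family is omitted because it requires its usual coefficient matrices, and the toy mean absolute error function is defined in the paper itself. The sanity checks reproduce the global minima listed in the table.

```python
# Standard-form implementations of two benchmark functions from the table.
import numpy as np

def rastrigin(x: np.ndarray) -> float:
    """Rastrigin function; global minimum 0 at x = (0, ..., 0)."""
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x))

def styblinski_tang(x: np.ndarray) -> float:
    """Styblinski–Tang function; global minimum ~ -39.16599*d at x_i ~ -2.903534."""
    x = np.asarray(x, dtype=float)
    return 0.5 * np.sum(x**4 - 16.0 * x**2 + 5.0 * x)

# Sanity checks against the global minima listed in the table.
print(rastrigin(np.zeros(10)))                  # 0.0
print(styblinski_tang(np.full(10, -2.903534)))  # approx. -391.6599
```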
4. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Bergstra, J.; Bengio, Y. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
- Kohavi, R.; John, G.H. Automatic parameter selection by minimizing estimated error. In Machine Learning Proceedings 1995; Elsevier: Amsterdam, The Netherlands, 1995; pp. 304–312. [Google Scholar]
- Bischl, B.; Mersmann, O.; Trautmann, H.; Weihs, C. Resampling Methods for Meta-Model Validation with Recommendations for Evolutionary Computation. Evol. Comput. 2012, 20, 249–275. [Google Scholar] [CrossRef] [PubMed]
- Montgomery, D.C. Design and Analysis of Experiments; John Wiley & Sons: Hoboken, NJ, USA, 2017. [Google Scholar]
- John, G.H. Cross-Validated C4.5: Using Error Estimation for Automatic Parameter Selection; Technical Report; Stanford University: Stanford, CA, USA, 1994. [Google Scholar]
- Močkus, J. On Bayesian methods for seeking the extremum. In Proceedings of the Optimization Techniques IFIP Technical Conference, Novosibirsk, Russia, 1–7 July 1974; Springer: Berlin/Heidelberg, Germany, 1975; pp. 400–404. [Google Scholar]
- Mockus, J. Bayesian Approach to Global Optimization: Theory and Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012; Volume 37. [Google Scholar]
- Feurer, M.; Hutter, F. Hyperparameter optimization. In Automated Machine Learning; Springer: Cham, Switzerland, 2019; pp. 3–33. [Google Scholar]
- Simon, D. Evolutionary Optimization Algorithms; John Wiley & Sons: Hoboken, NJ, USA, 2013. [Google Scholar]
- Alibrahim, H.; Ludwig, S.A. Hyperparameter Optimization: Comparing Genetic Algorithm against Grid Search and Bayesian Optimization. In Proceedings of the 2021 IEEE Congress on Evolutionary Computation (CEC), Kraków, Poland, 28 June–1 July 2021; pp. 1551–1559. [Google Scholar]
- Bellman, R.E. Adaptive Control Processes; Princeton University Press: Princeton, NJ, USA, 1961. [Google Scholar]
- Shahriari, B.; Swersky, K.; Wang, Z.; Adams, R.P.; De Freitas, N. Taking the human out of the loop: A review of Bayesian optimization. Proc. IEEE 2015, 104, 148–175. [Google Scholar] [CrossRef]
- Garcia-Barcos, J.; Martinez-Cantin, R. Fully Distributed Bayesian Optimization with Stochastic Policies. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, Macao, China, 10–16 August 2019. [Google Scholar]
- Young, M.T.; Hinkle, J.D.; Kannan, R.; Ramanathan, A. Distributed Bayesian optimization of deep reinforcement learning algorithms. J. Parallel Distrib. Comput. 2020, 139, 43–52. [Google Scholar] [CrossRef]
- Frazier, P.I. A Tutorial on Bayesian Optimization. arXiv 2018, arXiv:1807.02811. [Google Scholar] [CrossRef]
- Friedrichs, F.; Igel, C. Evolutionary tuning of multiple SVM parameters. Neurocomputing 2005, 64, 107–117. [Google Scholar] [CrossRef]
- Loshchilov, I.; Hutter, F. CMA-ES for hyperparameter optimization of deep neural networks. arXiv 2016, arXiv:1604.07269. [Google Scholar]
- Ryzko, D. Modern Big Data Architectures: A Multi-Agent Systems Perspective; John Wiley & Sons: Hoboken, NJ, USA, 2020. [Google Scholar]
- Esmaeili, A.; Gallagher, J.C.; Springer, J.A.; Matson, E.T. HAMLET: A Hierarchical Agent-Based Machine Learning Platform. ACM Trans. Auton. Adapt. Syst. 2022, 16, 1–46. [Google Scholar] [CrossRef]
- Esmaeili, A.; Ghorrati, Z.; Matson, E.T. Hierarchical Collaborative Hyper-Parameter Tuning. In Proceedings of the Advances in Practical Applications of Agents, Multi-Agent Systems, and Complex Systems Simulation. The PAAMS Collection, L’Aquila, Italy, 13–15 July 2022; Springer International Publishing: Cham, Switzerland, 2022; pp. 127–139. [Google Scholar]
- Bardenet, R.; Brendel, M.; Kégl, B.; Sebag, M. Collaborative hyperparameter tuning. In Proceedings of the International Conference on Machine Learning, PMLR, Atlanta, GA, USA, 17–19 June 2013; pp. 199–207. [Google Scholar]
- Swearingen, T.; Drevo, W.; Cyphers, B.; Cuesta-Infante, A.; Ross, A.; Veeramachaneni, K. ATM: A distributed, collaborative, scalable system for automated machine learning. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; pp. 151–162. [Google Scholar]
- Koch, P.; Golovidov, O.; Gardner, S.; Wujek, B.; Griffin, J.; Xu, Y. Autotune: A derivative-free optimization framework for hyperparameter tuning. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 443–452. [Google Scholar]
- Iranfar, A.; Zapater, M.; Atienza, D. Multi-agent reinforcement learning for hyperparameter optimization of convolutional neural networks. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2021, 41, 1034–1047. [Google Scholar] [CrossRef]
- Parker-Holder, J.; Nguyen, V.; Roberts, S.J. Provably efficient online hyperparameter optimization with population-based bandits. Adv. Neural Inf. Process. Syst. 2020, 33, 17200–17211. [Google Scholar]
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
- Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2011, 2, 1–27. [Google Scholar] [CrossRef]
- Crammer, K.; Dekel, O.; Keshet, J.; Shalev-Shwartz, S.; Singer, Y. Online passive aggressive algorithms. J. Mach. Learn. Res. 2006, 7, 551–585. [Google Scholar]
- Friedman, J.; Hastie, T.; Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 2010, 33, 1. [Google Scholar] [CrossRef] [PubMed]
- Scikit-Learn API Reference. Available online: https://scikit-learn.org/stable/modules/classes.html (accessed on 24 November 2022).
- Jamil, M.; Yang, X.S. A literature survey of benchmark functions for global optimisation problems. Int. J. Math. Model. Numer. Optim. 2013, 4, 150. [Google Scholar] [CrossRef]
- Rudolph, G. Globale Optimierung mit parallelen Evolutionsstrategien. Diplomarbeit (Diploma Thesis), Universität Dortmund, Fachbereich Informatik, Dortmund, Germany, 1990. [Google Scholar]
- Styblinski, M.; Tang, T.S. Experiments in nonconvex optimization: Stochastic approximation with function smoothing and simulated annealing. Neural Netw. 1990, 3, 467–483. [Google Scholar] [CrossRef]
- Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
- Kirkpatrick, S.; Gelatt, C.D., Jr.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680. [Google Scholar] [CrossRef]