Building Surrogate Models Using Trajectories of Agents Trained by Reinforcement Learning

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2024 (ICANN 2024)

Abstract

Sample efficiency in the face of computationally expensive simulations is a common concern in surrogate modeling. Current strategies for minimizing the number of samples needed are less effective in simulated environments with wide state spaces. In response to this challenge, we propose a novel method to efficiently sample simulated deterministic environments using policies trained by Reinforcement Learning. We provide an extensive analysis of these surrogate-building strategies against Latin-Hypercube sampling and Active Learning with Kriging, cross-validating performance across all sampled datasets. The analysis shows that a mixed dataset that includes samples acquired by random agents, expert agents, and agents trained to explore the regions of maximum entropy of the state transition distribution provides the best scores across all datasets, which is crucial for a meaningful state space representation. We conclude that the proposed method improves on the state of the art and clears the path for applying surrogate-aided Reinforcement Learning policy optimization strategies to complex simulators.

Notes

  1. https://github.com/dmlc/xgboost

Abbreviations

AL: Active Learning

Random: Random sampling

Sobol: Sobol sampling

LHS: Latin-Hypercube sampling

RA: Random agent

EA: Expert agent

MEA: Maximum Entropy agent

MA: Mixed agent

MPA: Mixed (Partially) agent

PA: Partial agents

Funding

This research has been supported by the Spanish Ministry (NextGenerationEU Funds) through Project IA4TES (Grant Number: MIA.2021.M04.0008).

Author information

Corresponding author

Correspondence to Julen Cestero.

A Algorithm implementation details

The XGBoost models are trained using the default parameters. For that, we use the XGBoost Python package (see Note 1).

The ANNs underwent a hyperparameter optimization process. The main resulting parameters are the following:

  • 2 hidden layers with 512 and 256 neurons, respectively,

  • learning rate of 0.001,

  • batch size of 64,

  • 25 epochs for the Mujoco environments, and 10 epochs for the other environments,

  • early stopping if the validation loss increased by more than 0.001.
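As an illustration only, the settings listed above can be collected in a small configuration object together with one possible reading of the early-stopping rule. The names `AnnConfig`, `epochs_for`, and `stopping_epoch` are hypothetical, and the sketch assumes that the rule compares the validation loss of consecutive epochs; the paper does not specify this detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnConfig:
    """Hyperparameters reported in the appendix (field names are illustrative)."""
    hidden_layers: tuple = (512, 256)
    learning_rate: float = 0.001
    batch_size: int = 64
    epochs_mujoco: int = 25
    epochs_other: int = 10
    early_stop_delta: float = 0.001

    def epochs_for(self, is_mujoco: bool) -> int:
        # 25 epochs for the Mujoco environments, 10 for the others.
        return self.epochs_mujoco if is_mujoco else self.epochs_other


def stopping_epoch(val_losses, delta):
    """Return the 0-based epoch at which training would stop, or None.

    Assumption: training stops at the first epoch whose validation loss
    exceeds the previous epoch's loss by more than `delta`.
    """
    for i in range(1, len(val_losses)):
        if val_losses[i] - val_losses[i - 1] > delta:
            return i
    return None


if __name__ == "__main__":
    cfg = AnnConfig()
    history = [0.50, 0.40, 0.4005, 0.45]  # made-up validation losses
    print(cfg.epochs_for(is_mujoco=True), stopping_epoch(history, cfg.early_stop_delta))
```

With the made-up loss history above, the small increase at the third epoch stays under the threshold, so training would only stop at the fourth epoch.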

The Gaussian surrogate (used in Kriging along with AL) has the following specifications:

  • Uses LHS for the general sampling procedure of each training step.

  • Samples \(100\,000\) space points every epoch.

  • If the maximum standard deviation over the sampled space is less than 0.01, the epoch is halted. The epoch also stops once 300 points have been added to the training set.

  • We train for 3 epochs. For each epoch, the initial space sample is repeated to prevent overfitting.

Note that the stopping conditions for the Kriging process were added to avoid computational problems, since the training time increased drastically after every step.
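The two stopping conditions can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `latin_hypercube`, `toy_std`, and `run_epoch` are hypothetical names, the predictive standard deviation of the Kriging model is replaced by a stand-in (distance to the nearest training point), candidates are re-drawn at every epoch as one reading of "the initial space sample is repeated", and the demo uses far fewer candidates and a smaller budget than the \(100\,000\) points and 300 additions reported above so that it runs quickly.

```python
import math
import random


def latin_hypercube(n_points, n_dims, rng):
    """Draw n_points in the unit hypercube, one point per stratum in each dimension."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]


def toy_std(point, training_set):
    """Stand-in for the surrogate's predictive std (assumption, not Kriging):
    the distance from `point` to the nearest training point."""
    return min(math.dist(point, t) for t in training_set)


def run_epoch(training_set, candidates, predict_std, std_tol=0.01, max_added=300):
    """Add the most uncertain candidate to the training set until the maximum
    std drops below std_tol or max_added points have been added."""
    added = 0
    while added < max_added:
        stds = [predict_std(c, training_set) for c in candidates]
        worst = max(range(len(candidates)), key=stds.__getitem__)
        if stds[worst] < std_tol:
            break  # first stop condition: surrogate is confident everywhere
        training_set.append(candidates[worst])
        added += 1
    return added  # equals max_added when the budget was the limiting condition


if __name__ == "__main__":
    rng = random.Random(0)
    training_set = latin_hypercube(5, 2, rng)
    for epoch in range(3):  # 3 epochs, as in the appendix
        candidates = latin_hypercube(200, 2, rng)  # reduced from 100 000
        added = run_epoch(training_set, candidates, toy_std, max_added=20)
        print(f"epoch {epoch}: added {added} points, total {len(training_set)}")
```

In a real setting, `predict_std` would wrap a fitted Gaussian process that is retrained after each added point, which is the step whose cost grows with the training set and motivates the point budget.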

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cestero, J., Quartulli, M., Restelli, M. (2024). Building Surrogate Models Using Trajectories of Agents Trained by Reinforcement Learning. In: Wand, M., Malinovská, K., Schmidhuber, J., Tetko, I.V. (eds) Artificial Neural Networks and Machine Learning – ICANN 2024. ICANN 2024. Lecture Notes in Computer Science, vol 15019. Springer, Cham. https://doi.org/10.1007/978-3-031-72341-4_23

  • DOI: https://doi.org/10.1007/978-3-031-72341-4_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72340-7

  • Online ISBN: 978-3-031-72341-4

  • eBook Packages: Computer Science, Computer Science (R0)