Optimizing energy production using policy search and predictive state representations
Pages 2969 - 2977
Abstract
We consider the challenging practical problem of optimizing the power production of a complex of hydroelectric power plants, which involves control over three continuous action variables, uncertainty in the amount of water inflows and a variety of constraints that need to be satisfied. We propose a policy-search-based approach coupled with predictive modelling to address this problem. This approach has some key advantages compared to other alternatives, such as dynamic programming: the policy representation and search algorithm can conveniently incorporate domain knowledge; the resulting policies are easy to interpret, and the algorithm is naturally parallelizable. Our algorithm obtains a policy which outperforms the solution found by dynamic programming both quantitatively and qualitatively.
Information
Publisher
MIT Press
Cambridge, MA, United States
Publication History
Published: 08 December 2014
Qualifiers
- Article