Abstract
We consider a finite state-action uncertain constrained Markov decision process under discounted and average cost criteria. The running costs are defined by random variables and the transition probabilities are known. The uncertainties present in the objective function and the constraints are modelled using chance constraints. We assume that the random cost vectors follow multivariate elliptically symmetric distributions and dependence among the random constraints is driven by a Gumbel–Hougaard copula. We propose two second order cone programming problems whose optimal values give lower and upper bounds of the optimal value of the uncertain constrained Markov decision process. As an application, we study a stochastic version of a service and admission control problem in a queueing system. The proposed approximation methods are illustrated on randomly generated instances of queueing control problem as well as on well known class of Markov decision problems known as Garnets.
Similar content being viewed by others
References
Altman, E. (1999). Constrained Markov decision processes. Chapman and Hall/CRC.
Archibald, T. W., McKinnon, K. I. M., & Thomas, L. C. (1995). On the generation of Markov decision processes. Journal of the Operational Research Society, 46(3), 354–361.
Bertsekas, D. P. (2000). Dynamic programming and optimal control (2nd ed.). Athena Scientific.
Charnes, A., & Cooper, W. W. (1959). Chance-constrained programming. Management Science, 6, 73–79.
Cheng, J., & Lisser, A. (2012). A second-order cone programming approach for linear programs with joint probabilistic constraints. Operations Research, 5, 325–328.
Cheng, J., Houda, M., & Lisser, A. (2015). Chance constrained 0–1 quadratic programs using copulas. Optimization Letters, 9(7), 1283–1295.
Delage, E., & Mannor, S. (2010). Percentile optimization for Markov decision processes with parameter uncertainty. Operations Research, 58(1), 203–213.
El Asri, L., Piot, B., Geist, M., Laroche, R., & Pietquin, O. (2016) Score-based inverse reinforcement learning. In Proceedings of the 2016 international conference on autonomous agents & multiagent systems. AAMAS ’16 (pp 457–465). International Foundation for Autonomous Agents and Multiagent Systems.
Fang, K.-T., Kotz, S., & Ng, K. W. (1990). Symmetric multivariate and related distributions. Chapman and Hall/CRC.
Geng, X., & Xie, L. (2019). Data-driven decision making in power systems with probabilistic guarantees: Theory and applications of chance-constrained optimization. Annual Reviews in Control, 47, 341.
Henrion, R. (2007). Structural properties of linear probabilistic constraints. Optimization, 56, 425–440.
Iyengar, G. N. (2005). Robust dynamic programming. Mathematics of Operations Research, 30(2), 257–280.
Jaworski, P., Durante, F., Härdle, W. K., & Rychlik, T. (2010). Copula theory and its applications. In Proceedings of the workshop Held in Warsaw, Poland, 25–26 September 2009.
Luedtke, J., Ahmed, S., & Nemhauser, G. (2010). An integer programming approach for linear programs with probabilistic constraints. Mathematical Programming, 122(2), 247–272.
Mannor, S., Simester, D., Sun, P., & Tsitsiklis, J. N. (2007). Bias and variance approximation in value function estimates. Management Science, 53(2), 308–322.
Nelsen, R. B. (2006). An introduction to copulas. Springer Series in StatisticsSpringer.
Nilim, A., & El Ghaoui, L. (2005). Robust control of Markov decision processes with uncertain transition matrices. Operations Research, 53(5), 780–798.
Prékopa, A. (1995). Stochastic programming. Kluwer Academic Publishers.
Puterman, M. L. (1994). Markov decision process (1st ed.). Wiley.
Satia, J. K., & Lave, R. E., Jr. (1973). Markovian decision processes with uncertain transition probabilities. Operations Research, 21(3), 728–740.
Sklar, M. (1959). Fonctions de répartition à n dimensions et leurs marges. Publications de l’Institut de statistique de l’Université de Paris, 8, 229–231.
Varagapriya, V., Singh, V. V., & Lisser, A. (2022). Constrained Markov decision processes with uncertain costs. Operations Research Letters, 50(2), 218–223.
White, C. C., & Eldeib, H. K. (1994). Markov decision processes with imprecise transition probabilities. Operations Research, 42(4), 739–749.
Wiesemann, W., Kuhn, D., & Rustem, B. (2013). Robust Markov decision processes. Mathematics of Operations Research, 38(1), 153–183.
Ye, Y. (2011). The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Mathematics of Operations Research, 36(4), 593–603.
Acknowledgements
The research of first author was supported by CSIR, India. The research of second and third author was supported by DST/CEFIPRA Project No. IFC/4117/DST-CNRS-5th call/2017-18/2 and CNRS Project No. AR/SB:2018-07-440, respectively.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Varagapriya, V., Singh, V.V. & Lisser, A. Joint chance-constrained Markov decision processes. Ann Oper Res 322, 1013–1035 (2023). https://doi.org/10.1007/s10479-022-05025-3
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10479-022-05025-3