Markov Decision Processes with Multiple Objectives

Krishnendu Chatterjee, Rupak Majumdar & Thomas A. Henzinger

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3884)

Included in the following conference series:

Annual Symposium on Theoretical Aspects of Computer Science


Abstract

We consider Markov decision processes (MDPs) with multiple discounted reward objectives. Such MDPs occur in design problems where one wishes to optimize several criteria simultaneously, for example, latency and power. The possible trade-offs between the different objectives are characterized by the Pareto curve. We show that every Pareto-optimal point can be achieved by a memoryless strategy; however, unlike in the single-objective case, the memoryless strategy may require randomization. Moreover, we show that the Pareto curve can be approximated in time polynomial in the size of the MDP. Additionally, we study the problem of whether a given value vector is realizable by any strategy, and show that it can be decided in polynomial time; in contrast, the question of whether it is realizable by a deterministic memoryless strategy is NP-complete. These results provide efficient algorithms for design exploration in MDP models with multiple objectives.
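
The claim that a memoryless strategy may need randomization can be illustrated with a minimal sketch. The toy MDP below is an assumption made for illustration and is not taken from the paper: a single state `s` with two self-loop actions, where `a` pays the reward vector (1, 0) and `b` pays (0, 1), and a discount factor of 0.5. The names `evaluate`, `TRANSITIONS`, `REWARDS`, and `GAMMA` are likewise hypothetical. The evaluator uses plain fixed-point iteration, which is one standard way to compute discounted values and not necessarily the method used by the authors.

```python
# Hypothetical toy MDP (not from the paper): one state, two self-loop actions.
GAMMA = 0.5  # discount factor, chosen only for illustration
STATES = ["s"]
# TRANSITIONS[state][action] -> {next_state: probability}
TRANSITIONS = {"s": {"a": {"s": 1.0}, "b": {"s": 1.0}}}
# REWARDS[state][action] -> one reward per objective
REWARDS = {"s": {"a": (1.0, 0.0), "b": (0.0, 1.0)}}


def evaluate(strategy, num_objectives=2, iterations=200):
    """Return the discounted value vector of a memoryless strategy per state.

    strategy[state][action] is the probability of playing `action` in `state`.
    Iterates v <- r_sigma + GAMMA * P_sigma * v, which converges for GAMMA < 1.
    """
    values = {s: [0.0] * num_objectives for s in STATES}
    for _ in range(iterations):
        new_values = {}
        for s in STATES:
            vec = [0.0] * num_objectives
            for action, p_action in strategy[s].items():
                for k in range(num_objectives):
                    # Expected value of objective k in the successor state.
                    future = sum(
                        p * values[t][k]
                        for t, p in TRANSITIONS[s][action].items()
                    )
                    vec[k] += p_action * (REWARDS[s][action][k] + GAMMA * future)
            new_values[s] = vec
        values = new_values
    return values


# The two deterministic memoryless strategies and one randomized strategy.
pure_a = evaluate({"s": {"a": 1.0, "b": 0.0}})["s"]  # approximately (2, 0)
pure_b = evaluate({"s": {"a": 0.0, "b": 1.0}})["s"]  # approximately (0, 2)
mixed = evaluate({"s": {"a": 0.5, "b": 0.5}})["s"]   # approximately (1, 1)
print(pure_a, pure_b, mixed)
```

In this toy model every step yields a total reward of exactly 1 across the two objectives, so the two values of any strategy sum to 1/(1 - 0.5) = 2. The point (1, 1) is therefore Pareto-optimal, yet the only two deterministic memoryless strategies yield (2, 0) and (0, 2), so reaching (1, 1) with a memoryless strategy requires randomization.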

This research was supported in part by the AFOSR MURI grant F49620-00-1-0327, and the NSF grants CCR-0225610, CCR-0234690, and CCR-0427202.



Author information

Authors and Affiliations

UC Berkeley, USA
Krishnendu Chatterjee & Thomas A. Henzinger
UC Los Angeles, USA
Rupak Majumdar
EPFL, Switzerland
Thomas A. Henzinger


Editor information

Editors and Affiliations

LIF, CNRS & Univ. de Provence, Marseille, France
Bruno Durand
RWTH Aachen University, Lehrstuhl Informatik 7, 52056 Aachen, Germany
Wolfgang Thomas


About this paper

Cite this paper

Chatterjee, K., Majumdar, R., Henzinger, T.A. (2006). Markov Decision Processes with Multiple Objectives. In: Durand, B., Thomas, W. (eds) STACS 2006. Lecture Notes in Computer Science, vol 3884. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11672142_26


DOI: https://doi.org/10.1007/11672142_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32301-3
Online ISBN: 978-3-540-32288-7
eBook Packages: Computer Science, Computer Science (R0)
