[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

The Complexity of Decentralized Control of Markov Decision Processes

Published: 01 November 2002 Publication History

Abstract

We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalizations of both the fully observable case and the partially observable case that allow for decentralized control are described. For even two agents, the finite-horizon problems corresponding to both of these models are hard for nondeterministic exponential time. These complexity results illustrate a fundamental difference between centralized and decentralized control of Markov decision processes. In contrast to the problems involving centralized control, the problems we considerprovably do not admit polynomial-time algorithms. Furthermore, assuming EXP ? NEXP, the problems require superexponential time to solve in the worst case.

References

[1]
Altman, E. 2001. Applications of Markov decision processes in telecommunications: A survey. E. Feinberg, A. Shwartz, ed. Markov Decision Processes: Models, Methods, Directions, and Open Problems. Kluwer, New York.
[2]
Babai, L, L. Fortnow, C. Lund. 1991. Non-deterministic exponential time has two-prover interactive protocols. Comput. Complexity1 3-40.
[3]
Blondel, V. D., J. N. Tsitsiklis. 2000. A survey of computational complexity results in systems and control. Automatica36(9) 1249-1274.
[4]
Cassandra, A., M. L. Littman, N. L. Zhang. 1997. Incremental pruning: A simple, fast, exact method for partially observable Markov decision processes. Proc. Thirteenth Ann. Conf. on Uncertainty in Artificial Intelligence, Providence, RI, 54-61.
[5]
Coradeschi, S., L. Karlsson, P. Stone, T. Balch, G. Kraetzschmar, M. Asada. 2000. Overview of RoboCup-99. AI Magazine21(3) 11-18.
[6]
Hansen, E. 1998. Solving POMDPs by searching in policy space. In Proc. Fourteenth Ann. Conf. on Uncertainty in Artificial Intelligence, Madison, WI, 211-219.
[7]
Hsu, K. S. I. Marcus. 1982. Decentralized control of finite state Markov processes. IEEE Trans. Auto. ControlAC-27(2) 426--431.
[8]
Jaakkola. T., S. P. Singh, M. I. Jordan. 1995. Reinforcement learning algorithm for partially observable Markov decision problems. Proc. Adv. in Neural Inform. Processing Systems7 345-352.
[9]
Kaelbling, L. P., M. L. Littman, A. R. Cassandra. 1998. Planning and acting in partially observable stochastic domains. Artificial Intelligence101(1-2), 99-134.
[10]
Lewis, H. 1978. Complexity of solvable cases of the decision problem for predicate calculus. Proc. Nineteenth Sympos. on the Foundations of Comput. Sci., Ann Arbor, MI, 35-47.
[11]
Lusena. C., J. Goldsmith. T. Li, S. Sittinger, C. Wells. 1999. My brain is full: When more memory helps. Proc. Fifteenth Conf. on Uncertainty in Artificial Intelligence, Stockholm, Sweden, 374-381.
[12]
Madani, O., S. Hanks, A. Condon. 1999. On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems. Proc. Sixteenth National Conf. on Artificial Intelligence, Orlando, FL, 541-548.
[13]
Meuleau, N. K-E. Kim, L. Kaelbling, A. R. Cassandra. 1999. Solving POMDPs by searching the space of finite policies. Proc. Fifteenth Conf. on Uncertainty in Artificial Intelligence. Stockholm, Sweden. 417-426
[14]
Mundhenk, M., J. Goldsmith, C. Lusena. E. Allender. 2000. Complexity of finite-horizon Markov decision process problems. J. ACM47(4) 681-720.
[15]
Ooi, J. M., G. W. Wornell. 1996. Decentralized control of a multiple access broadcast channel: Performance bounds. In Proc. 35th Conf. on Decision and Control. Kobe, Japan, 293-298.
[16]
Papadimitriou, C. H. 1994. Computational Complexity. Addison-Wesley, Reading, MA.
[17]
Papadimitriou, C. H., J. Tsitsiklis. 1982. On the complexity of designing distributed protocols. Inform. Control53 211-218.
[18]
Papadimitriou, C. H., J. Tsitsiklis. 1986. Intractable problems in control theory. SIAM J. Control Optim.24(4) 639-654.
[19]
Papadimitriou, C. H., J. Tsitsiklis. 1987. The complexity of Markov decision processes. Math. Oper. Res12(3) 441--450.
[20]
Peshkin, L., K.-E. Kim, N. Meuleau, L. P. Kaelbling. 2000. Learning to cooperate via policy search. Proc. Sixteenth Internat. Conf. on Uncertainty in Artificial Intelligence, Stanford. CA. 489-496.
[21]
Peterson, G. L. J. R. Reif. 1979. Multiple-person alternation. 20th Ann. Sympos. on Foundations of Comput. Sci., San Juan, PR. 348-363.
[22]
Puterman, M. L. 1994. Markov Decision Processes. J. Wiley & Sons, New York.
[23]
Schneider, J., W.-K. Wong, A. Moore, M. Riedmiller. 1999. Distributed value functions. Proc. Sixteenth Internat. Conf. on Machine Learning, Bled. Slovenia, 371-378.
[24]
Zhang, W. 2001. Algorithms for partially observable markov decision processes. Ph.D. thesis. Hong Kong University of Science and Technology, Kowloon, Hong Kong.

Cited By

View all
  • (2025)A multi-agent curiosity reward model for task-oriented dialogue systemsPattern Recognition10.1016/j.patcog.2024.110884157:COnline publication date: 1-Jan-2025
  • (2024)Observer-Aware Planning with Implicit and Explicit CommunicationProceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems10.5555/3635637.3663000(1409-1417)Online publication date: 6-May-2024
  • (2024)Collaborative Attack Sequence Generation Model Based on Multiagent Reinforcement Learning for Intelligent Traffic Signal SystemInternational Journal of Intelligent Systems10.1155/2024/47340302024Online publication date: 1-Jan-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Mathematics of Operations Research
Mathematics of Operations Research  Volume 27, Issue 4
November 2002
205 pages

Publisher

INFORMS

Linthicum, MD, United States

Publication History

Published: 01 November 2002

Author Tags

  1. Markov decision process
  2. computational complexity
  3. decentralized control

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 24 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2025)A multi-agent curiosity reward model for task-oriented dialogue systemsPattern Recognition10.1016/j.patcog.2024.110884157:COnline publication date: 1-Jan-2025
  • (2024)Observer-Aware Planning with Implicit and Explicit CommunicationProceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems10.5555/3635637.3663000(1409-1417)Online publication date: 6-May-2024
  • (2024)Collaborative Attack Sequence Generation Model Based on Multiagent Reinforcement Learning for Intelligent Traffic Signal SystemInternational Journal of Intelligent Systems10.1155/2024/47340302024Online publication date: 1-Jan-2024
  • (2024)Multi-Objective Multi-Agent Planning for Discovering and Tracking Multiple Mobile ObjectsIEEE Transactions on Signal Processing10.1109/TSP.2024.342375572(3669-3685)Online publication date: 1-Jan-2024
  • (2024)Distributed Policy Gradient for Linear Quadratic Networked Control With Limited Communication RangeIEEE Transactions on Signal Processing10.1109/TSP.2024.338639672(2087-2100)Online publication date: 8-Apr-2024
  • (2024)Decentralized policy learning with partial observation and mechanical constraints for multiperson modelingNeural Networks10.1016/j.neunet.2023.11.068171:C(40-52)Online publication date: 17-Apr-2024
  • (2024)Mobile robot sequential decision making using a deep reinforcement learning hyper-heuristic approachExpert Systems with Applications: An International Journal10.1016/j.eswa.2024.124959257:COnline publication date: 10-Dec-2024
  • (2024)An improved sand cat swarm optimization for moving target search by UAVExpert Systems with Applications: An International Journal10.1016/j.eswa.2023.122189238:PEOnline publication date: 27-Feb-2024
  • (2024)Formal contracts mitigate social dilemmas in multi-agent reinforcement learningAutonomous Agents and Multi-Agent Systems10.1007/s10458-024-09682-538:2Online publication date: 18-Oct-2024
  • (2024)Coordinate-aligned multi-camera collaboration for active multi-object trackingMultimedia Systems10.1007/s00530-024-01420-x30:4Online publication date: 29-Jul-2024
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media