More Web Proxy on the site http://driver.im/

article

Planning with durative actions in stochastic domains

Authors:

Daniel S. WeldAuthors Info & Claims

Journal of Artificial Intelligence Research, Volume 31, Issue 1

Pages 33 - 82

Published: 01 January 2008 Publication History

Abstract

Probabilistic planning problems are typically modeled as a Markov Decision Process (MDP). MDPs, while an otherwise expressive model, allow only for sequential, non-durative actions. This poses severe restrictions in modeling and solving a real world planning problem. We extend the MDP model to incorporate-1) simultaneous action execution, 2) durative actions, and 3) stochastic durations. We develop several algorithms to combat the computational explosion introduced by these features. The key theoretical ideas used in building these algorithms are -- modeling a complex problem as an MDP in extended state/action space, pruning of irrelevant actions, sampling of relevant actions, using informed heuristics to guide the search, hybridizing different planners to achieve benefits of both, approximating the problem and replanning. Our empirical evaluation illuminates the different merits in using various algorithms, viz., optimality, empirical closeness to optimality, theoretical error bounds, and speed.

References

[1]

Aberdeen, D., Thiebaux, S., & Zhang, L. (2004). Decision-theoretic military operations planning. In ICAPS'04.

[2]

Aberdeen, D., & Buffet, O. (2007). Concurrent probabilistic temporal planning with policy-gradients. In ICAPS'07.

[3]

Bacchus, F., & Ady, M. (2001). Planning with resources and concurrency: A forward chaining approach. In IJCAI'01, pp. 417-424.

Digital Library

[4]

Barto, A., Bradtke, S., & Singh, S. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72, 81-138.

Digital Library

[5]

Bertsekas, D. (1995). Dynamic Programming and Optimal Control. Athena Scientific.

Digital Library

[6]

Blum, A., & Furst, M. (1997). Fast planning through planning graph analysis. Artificial Intelligence, 90(1-2), 281-300.

Digital Library

[7]

Bonet, B., & Geffner, H. (2003). Labeled RTDP: Improving the convergence of real-time dynamic programming. In ICAPS'03, pp. 12-21.

[8]

Bonet, B., & Geffner, H. (2005). mGPT: A probabilistic planner based on heuristic search. JAIR, 24, 933.

Digital Library

[9]

Boutilier, C., Dean, T., & Hanks, S. (1999). Decision theoretic planning: Structural assumptions and computational leverage. J. Artificial Intelligence Research, 11, 1-94.

Digital Library

[10]

Boyan, J. A., & Littman, M. L. (2000). Exact solutions to time-dependent MDPs. In NIPS'00, p. 1026.

[11]

Bresina, J., Dearden, R., Meuleau, N., Smith, D., & Washington, R. (2002). Planning under continuous time and resource uncertainty : A challenge for AI. In UAI'02.

Digital Library

[12]

Chen, Y., Wah, B. W., & Hsu, C. (2006). Temporal planning using subgoal partitioning and resolution in sgplan. JAIR, 26, 323.

Digital Library

[13]

Cushing, W., Kambhampati, S., Mausam, & Weld, D. S. (2007). When is temporal planning really temporal?. In IJCAI'07.

Digital Library

[14]

Dearden, R., Meuleau, N., Ramakrishnan, S., Smith, D. E., & Washington, R. (2003). Incremental Contingency Planning. In ICAPS'03 Workshop on Planning under Uncertainty and Incomplete Information.

[15]

Do, M. B., & Kambhampati, S. (2001). Sapa: A domain-independent heuristic metric temporal planner. In ECP'01.

[16]

Do, M. B., & Kambhampati, S. (2003). Sapa: A scalable multi-objective metric temporal planner. JAIR, 20, 155-194.

Digital Library

[17]

Edelkamp, S. (2003). Taming numbers and duration in the model checking integrated planning system. Journal of Artificial Intelligence Research, 20, 195-238.

[18]

Feng, Z., Dearden, R., Meuleau, N., & Washington, R. (2004). Dynamic programming for structured continuous Markov decision processes. In UAI'04, p. 154.

Digital Library

[19]

Foss, J., & Onder, N. (2005). Generating temporally contingent plans. In IJCAI'05 Workshop on Planning and Learning in Apriori Unknown or Dynamic Domains.

[20]

Fox, M., & Long, D. (2003). PDDL2.1: An extension to PDDL for expressing temporal planning domains. JAIR Special Issue on 3rd International Planning Competition, 20, 61-124.

Digital Library

[21]

Gerevini, A., & Serina, I. (2002). LPG: A planner based on local search for planning graphs with action graphs. In AIPS'02, p. 281.

[22]

Guestrin, C., Koller, D., & Parr, R. (2001). Max-norm projections for factored MDPs. In IJCAI'01, pp. 673-682.

Digital Library

[23]

Hansen, E., & Zilberstein, S. (2001). LAO*: A heuristic search algorithm that finds solutions with loops. Artificial Intelligence, 129, 35-62.

Digital Library

[24]

Haslum, P., & Geffner, H. (2001). Heuristic planning with time and resources. In ECP'01.

[25]

Hoey, J., St-Aubin, R., Hu, A., & Boutilier, C. (1999). SPUDD: Stochastic planning using decision diagrams. In UAI'99, pp. 279-288.

Digital Library

[26]

Jensen, R. M., & Veloso, M. (2000). OBDD=based universal planning for synchronized agents in non-deterministic domains. Journal of Artificial Intelligence Research, 13, 189.

Digital Library

[27]

Kushmerick, N., Hanks, S., & Weld, D. (1995). An algorithm for probabilistic planning. Artificial Intelligence, 76(1-2), 239-286.

Digital Library

[28]

Laborie, P., & Ghallab, M. (1995). Planning with sharable resource constraints. In IJCAI'95, p. 1643.

Digital Library

[29]

Little, I., Aberdeen, D., & Thiebaux, S. (2005). Prottle: A probabilistic temporal planner. In AAAI'05.

Digital Library

[30]

Little, I., & Thiebaux, S. (2006). Concurrent probabilistic planning in the graphplan framework. In ICAPS'06.

[31]

Long, D., & Fox, M. (2003). The 3rd international planning competition: Results and analysis. JAIR, 20, 1-59.

Digital Library

[32]

Mausam (2007). Stochastic planning with concurrent, durative actions. Ph.d. dissertation, University of Washington.

Digital Library

[33]

Mausam, Bertoli, P., &Weld, D. (2007). A hybridized planner for stochastic domains. In IJCAI'07.

Digital Library

[34]

Mausam, & Weld, D. (2004). Solving concurrent Markov decision processes. In AAAI'04.

Digital Library

[35]

Mausam, & Weld, D. (2005). Concurrent probabilistic temporal planning. In ICAPS'05, pp. 120- 129.

[36]

Mausam, & Weld, D. (2006a). Challenges for temporal planning with uncertain durations. In ICAPS'06.

[37]

Mausam, & Weld, D. (2006b). Probabilistic temporal planning with uncertain durations. In AAAI'06.

Digital Library

[38]

Meuleau, N., Hauskrecht, M., Kim, K.-E., Peshkin, L., Kaelbling, L., Dean, T., & Boutilier, C. (1998). Solving very large weakly coupled Markov Decision Processes. In AAAI'98, pp. 165-172.

Digital Library

[39]

Musliner, D., Murphy, D., & Shin, K. (1991). World modeling for the dynamic construction of real-time control plans. Artificial Intelligence, 74, 83-127.

Digital Library

[40]

Nigenda, R. S., & Kambhampati, S. (2003). Altalt-p: Online parallelization of plans with heuristic state search. Journal of Artificial Intelligence Research, 19, 631-657.

Digital Library

[41]

Penberthy, J., & Weld, D. (1994). Temporal planning with continuous change. In AAAI'94, p. 1010.

Digital Library

[42]

Rohanimanesh, K., & Mahadevan, S. (2001). Decision-Theoretic planning with concurrent temporally extended actions. In UAI'01, pp. 472-479.

Digital Library

[43]

Singh, S., & Cohn, D. (1998). How to dynamically merge markov decision processes. In NIPS'98. The MIT Press.

Digital Library

[44]

Smith, D., & Weld, D. (1999). Temporal graphplan with mutual exclusion reasoning. In IJCAI'99, pp. 326-333 Stockholm, Sweden. San Francisco, CA: Morgan Kaufmann.

Digital Library

[45]

Vidal, V., & Geffner, H. (2006). Branching and pruning: An optimal temporal pocl planner based on constraint programming. AIJ, 170(3), 298-335.

Digital Library

[46]

Younes, H. L. S., & Simmons, R. G. (2004a). Policy generation for continuous-time stochastic domains with concurrency. In ICAPS'04, p. 325.

[47]

Younes, H. L. S., & Simmons, R. G. (2004b). Solving generalized semi-markov decision processes using continuous phase-type distributions. In AAAI'04, p. 742.

Digital Library

[48]

Zhang, W., & Dietterich, T. G. (1995). A reinforcement learning approach to job-shop scheduling. In IJCAI'95, pp. 1114-1120.

Digital Library

Cited By

Jiang YYedidsion HZhang SSharon GStone P(2019)Multi-robot planning with conflicts and synergiesAutonomous Robots10.1007/s10514-019-09848-143:8(2011-2032)Online publication date: 1-Dec-2019
https://dl.acm.org/doi/10.1007/s10514-019-09848-1
Saxe AEarle ARosman B(2017)Hierarchy through composition with multitask LMDPsProceedings of the 34th International Conference on Machine Learning - Volume 7010.5555/3305890.3305993(3017-3026)Online publication date: 6-Aug-2017
https://dl.acm.org/doi/10.5555/3305890.3305993
Zhang SJiang YSharon GStone PLarson KWinikoff MDas SDurfee E(2017)Multirobot Symbolic Planning under Temporal UncertaintyProceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems10.5555/3091125.3091199(501-510)Online publication date: 8-May-2017
https://dl.acm.org/doi/10.5555/3091125.3091199
Show More Cited By

Recommendations

Stochastic planning with concurrent, durative actions
Planning and acting in partially observable stochastic domains

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (mdps) and partially observable ...
A Continuous Planning Framework with Durative Actions
TIME '02: Proceedings of the Ninth International Symposium on Temporal Representation and Reasoning (TIME'02)

This paper describes a continuous planning framework to be used by a planning agent situated within an environment, which reasons about goals with priorities and deadlines, and actions with duration. The framework assumes that goals may be generated ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Journal of Artificial Intelligence Research

Journal of Artificial Intelligence Research Volume 31, Issue 1

January 2008

651 pages

ISSN:1076-9757

Issue’s Table of Contents

Publisher

AI Access Foundation

El Segundo, CA, United States

Publication History

Published: 01 January 2008

Received: 01 February 2007

Published in JAIR Volume 31, Issue 1

Qualifiers

Article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

15
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 06 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Jiang YYedidsion HZhang SSharon GStone P(2019)Multi-robot planning with conflicts and synergiesAutonomous Robots10.1007/s10514-019-09848-143:8(2011-2032)Online publication date: 1-Dec-2019
https://dl.acm.org/doi/10.1007/s10514-019-09848-1
Saxe AEarle ARosman B(2017)Hierarchy through composition with multitask LMDPsProceedings of the 34th International Conference on Machine Learning - Volume 7010.5555/3305890.3305993(3017-3026)Online publication date: 6-Aug-2017
https://dl.acm.org/doi/10.5555/3305890.3305993
Zhang SJiang YSharon GStone PLarson KWinikoff MDas SDurfee E(2017)Multirobot Symbolic Planning under Temporal UncertaintyProceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems10.5555/3091125.3091199(501-510)Online publication date: 8-May-2017
https://dl.acm.org/doi/10.5555/3091125.3091199
Zhao PWang HQi CLiu D(2017)HTN planning with uncontrollable durations for emergency decision-makingJournal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology10.3233/JIFS-16155733:1(255-267)Online publication date: 1-Jan-2017
https://dl.acm.org/doi/10.3233/JIFS-161557
Micheli ADo MSmith D(2015)Compiling away uncertainty in strong temporal planning with uncontrollable durationsProceedings of the 24th International Conference on Artificial Intelligence10.5555/2832415.2832476(1631-1637)Online publication date: 25-Jul-2015
https://dl.acm.org/doi/10.5555/2832415.2832476
Fox MLong DMagazzeni D(2012)Plan-based policies for efficient multiple battery load managementJournal of Artificial Intelligence Research10.5555/2387933.238794044:1(335-382)Online publication date: 1-May-2012
https://dl.acm.org/doi/10.5555/2387933.2387940
Ahuja APalshikar G(2012)Generating project plans for data center transformationsProceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence10.1007/978-3-642-35101-3_65(767-778)Online publication date: 4-Dec-2012
https://dl.acm.org/doi/10.1007/978-3-642-35101-3_65
Fox MLong DMagazzeni D(2011)Automatic construction of efficient multiple battery usage policiesProceedings of the Twenty-First International Conference on International Conference on Automated Planning and Scheduling10.5555/3038485.3038496(74-81)Online publication date: 11-Jun-2011
https://dl.acm.org/doi/10.5555/3038485.3038496
Coles AColes AClark AGilmore S(2011)Cost-sensitive concurrent planning under duration uncertainty for service-level agreementsProceedings of the Twenty-First International Conference on International Conference on Automated Planning and Scheduling10.5555/3038485.3038491(34-41)Online publication date: 11-Jun-2011
https://dl.acm.org/doi/10.5555/3038485.3038491
Fox MLong DMagazzeni D(2011)Automatic construction of efficient multiple battery usage policiesProceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume Three10.5555/2283696.2283832(2620-2625)Online publication date: 16-Jul-2011
https://dl.acm.org/doi/10.5555/2283696.2283832
Show More Cited By

View Options

View options

Media

Figures

Other

Tables

View Issue’s Table of Contents