Abstract
This paper deals with solving multivariate partially observed Markov decision processes (POMDPs). We give sufficient conditions on the cost function, the dynamics of the Markov chain target, and the observation probabilities so that the optimal scheduling policy has a threshold structure with respect to the multivariate TP2 ordering. We present stochastic approximation algorithms to estimate the parameterized threshold policy.
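The abstract does not define the multivariate TP2 ordering. As a rough illustration only, the sketch below assumes the standard lattice definition from Karlin and Rinott: p dominates q in the TP2 sense when p(i ∨ j) q(i ∧ j) ≥ p(i) q(j) for all multi-indices i and j, where ∨ and ∧ are the componentwise maximum and minimum. The function name `tp2_geq`, the dictionary representation of the distributions, and the example numbers are hypothetical and are not taken from the paper.

```python
from itertools import product


def tp2_geq(p, q, tol=1e-12):
    """Check whether p >=_TP2 q under the assumed lattice definition.

    p and q map index tuples (multivariate states) to probabilities.
    Indices missing from a dictionary are treated as having probability 0.
    """
    states = sorted(set(p) | set(q))
    for i, j in product(states, repeat=2):
        join = tuple(max(a, b) for a, b in zip(i, j))  # componentwise max (i v j)
        meet = tuple(min(a, b) for a, b in zip(i, j))  # componentwise min (i ^ j)
        lhs = p.get(join, 0.0) * q.get(meet, 0.0)
        rhs = p.get(i, 0.0) * q.get(j, 0.0)
        if lhs < rhs - tol:
            return False
    return True


def product_dist(marginal_1, marginal_2):
    """Build a bivariate product distribution from two marginals (illustration only)."""
    return {
        (x, y): marginal_1[x] * marginal_2[y]
        for x in range(len(marginal_1))
        for y in range(len(marginal_2))
    }


# Hypothetical example: 'high' puts more mass on larger states than 'low'.
high = product_dist([0.2, 0.8], [0.2, 0.8])
low = product_dist([0.6, 0.4], [0.6, 0.4])

high_dominates_low = tp2_geq(high, low)
low_dominates_high = tp2_geq(low, high)
```

In this example `high` dominates `low` but not the reverse, because each marginal of `high` dominates the corresponding marginal of `low` in the monotone likelihood ratio sense. A threshold policy in the sense of the abstract would then change action as the belief state crosses a boundary that is monotone in this ordering; the check above only illustrates the ordering, not the policy or the paper's algorithms.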
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Krishnamurthy, V. (2009). Optimal Threshold Policies for Multivariate Stopping-Time POMDPs. In: Sossai, C., Chemello, G. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2009. Lecture Notes in Computer Science, vol 5590. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02906-6_73
DOI: https://doi.org/10.1007/978-3-642-02906-6_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02905-9
Online ISBN: 978-3-642-02906-6
eBook Packages: Computer Science (R0)