Simulation-based methods for Markov decision processes

Document ID: 1773797065947367731
Author: Marbach P
Publication year: 1998

Snippet

Markov decision processes have been a popular paradigm for sequential decision making under uncertainty. Dynamic programming provides a framework for studying such problems, as well as for devising algorithms to compute an optimal control policy. Dynamic …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/04—Inference methods or devices
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
      - G06N3/00—Computer systems based on biological models
        - G06N3/02—Computer systems based on biological models using neural network models
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
    - G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
      - G06Q10/00—Administration; Management
      - G06Q30/00—Commerce, e.g. shopping or e-commerce
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
      - H04L12/00—Data switching networks
        - H04L12/54—Store-and-forward switching systems
          - H04L12/56—Packet switching systems

Similar Documents

Publication	Publication Date	Title
Liang et al.	2021	Logistics-involved QoS-aware service composition in cloud manufacturing with deep reinforcement learning
Marbach et al.	2001	Simulation-based optimization of Markov reward processes
Jaakkola et al.	1994	Reinforcement learning algorithm for partially observable Markov decision problems
Kallus et al.	2023	Stochastic optimization forests
Iida et al.	2006	Approximate solutions of a dynamic forecast-inventory model
Goel et al.	2019	Beyond online balanced descent: An optimal algorithm for smoothed online optimization
Serfozo	1981	Optimal control of random walks, birth and death processes, and queues
Dong et al.	2022	Simple agent, complex environment: Efficient reinforcement learning with agent states
Chen et al.	2019	Tailored base-surge policies in dual-sourcing inventory systems with demand learning
Cao	2003	From perturbation analysis to Markov decision processes and reinforcement learning
Oh et al.	2021	Multinomial logit contextual bandits: Provable optimality and practicality
Kerimov et al.	2023	On the optimality of greedy policies in dynamic matching
Marbach	1998	Simulation-based methods for Markov decision processes
Chen et al.	2004	Dynamic programming equations for discounted constrained stochastic control
Tang et al.	2024	Learn to Optimize-A Brief Overview
Altman et al.	1991	Sensitivity of constrained Markov decision processes
Ding et al.	2024	Feature-based inventory control with censored demand
Hernández-Lerma et al.	1998	Infinite-horizon Markov control processes with undiscounted cost criteria: from average to overtaking optimality
Mahadevan	1996	An average-reward reinforcement learning algorithm for computing bias-optimal policies
Chen et al.	2005	Coordinating multiple agents via reinforcement learning
van der Mei et al.	1997	Polling systems in heavy traffic: Exhaustiveness of service policies
Wang et al.	2019	Optimal pricing for tandem queues with finite buffers
Dearden	2001	Structured prioritized sweeping
Wang et al.	2015	Optimal and effective web service composition with trust and user preference
Liu et al.	2018	R2PG: Risk-Sensitive and Reliable Policy Gradient