Simulation-based methods for Markov decision processes
Marbach, 1998
- Document ID
- 1773797065947367731
- Author
- Marbach P
- Publication year
- 1998
Snippet
Markov decision processes have been a popular paradigm for sequential decision making under uncertainty. Dynamic programming provides a framework for studying such problems, as well as for devising algorithms to compute an optimal control policy. Dynamic …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/54—Store-and-forward switching systems
- H04L12/56—Packet switching systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Liang et al. | Logistics-involved QoS-aware service composition in cloud manufacturing with deep reinforcement learning | |
Marbach et al. | Simulation-based optimization of Markov reward processes | |
Jaakkola et al. | Reinforcement learning algorithm for partially observable Markov decision problems | |
Kallus et al. | Stochastic optimization forests | |
Iida et al. | Approximate solutions of a dynamic forecast-inventory model | |
Goel et al. | Beyond online balanced descent: An optimal algorithm for smoothed online optimization | |
Serfozo | Optimal control of random walks, birth and death processes, and queues | |
Dong et al. | Simple agent, complex environment: Efficient reinforcement learning with agent states | |
Chen et al. | Tailored base-surge policies in dual-sourcing inventory systems with demand learning | |
Cao | From perturbation analysis to Markov decision processes and reinforcement learning | |
Oh et al. | Multinomial logit contextual bandits: Provable optimality and practicality | |
Kerimov et al. | On the optimality of greedy policies in dynamic matching | |
Marbach | Simulation-based methods for Markov decision processes | |
Chen et al. | Dynamic programming equations for discounted constrained stochastic control | |
Tang et al. | Learn to Optimize-A Brief Overview | |
Altman et al. | Sensitivity of constrained Markov decision processes | |
Ding et al. | Feature-based inventory control with censored demand | |
Hernández-Lerma et al. | Infinite-horizon Markov control processes with undiscounted cost criteria: from average to overtaking optimality | |
Mahadevan | An average-reward reinforcement learning algorithm for computing bias-optimal policies | |
Chen et al. | Coordinating multiple agents via reinforcement learning | |
van der Mei et al. | Polling systems in heavy traffic: Exhaustiveness of service policies | |
Wang et al. | Optimal pricing for tandem queues with finite buffers | |
Dearden | Structured prioritized sweeping | |
Wang et al. | Optimal and effective web service composition with trust and user preference | |
Liu et al. | R2PG: Risk-Sensitive and Reliable Policy Gradient |