Vuong et al., 2019 - Google Patents

Uncertainty-aware model-based policy optimization

Document ID: 14419615351770162475
Author: Vuong T; Tran K
Publication year: 2019
Publication venue: arXiv preprint arXiv:1906.10717

Snippet

Model-based reinforcement learning has the potential to be more sample efficient than model-free approaches. However, existing model-based methods are vulnerable to model bias, which leads to poor generalization and asymptotic performance compared to model …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management

Similar Documents

Authors	Year	Title
Wang et al.	2019	Exploring model-based planning with policy networks
Bakshy et al.	2018	AE: A domain-agnostic platform for adaptive experimentation
Xu et al.	2018	Learning to explore via meta-policy gradient
Boney et al.	2020	Regularizing model-based planning with energy-based models
Luis et al.	2010	Inductive transfer for learning Bayesian networks
Vuong et al.	2019	Uncertainty-aware model-based policy optimization
Zhao et al.	2022	Adaptive behavior cloning regularization for stable offline-to-online reinforcement learning
Gaier et al.	2018	Data-efficient neuroevolution with kernel-based surrogate models
Song et al.	2023	Efficient evaluation methods for neural architecture search: A survey
Li et al.	2020	Hyper-parameter estimation method with particle swarm optimization
Liu et al.	2012	Type-2 hierarchical fuzzy system for high-dimensional data-based modeling with uncertainties
Zhao et al.	2023	Ode-based recurrent model-free reinforcement learning for pomdps
Xiao et al.	2019	Nonparametric kernel smoother on topology learning neural networks for incremental and ensemble regression
Jawed et al.	2021	Multi-task learning curve forecasting across hyperparameter configurations and datasets
Vuong et al.	0	Policy Optimization In the Face of Uncertainty
Oxenstierna	2017	Predicting house prices using ensemble learning with cluster aggregations
Li et al.	2017	Policy gradient methods with gaussian process modelling acceleration
Gupta et al.	2019	Sequential knowledge transfer across problems
Li et al.	2015	Continuous probabilistic model building genetic network programming using reinforcement learning
Nilsson et al.	2024	Tree Ensembles for Contextual Bandits
Yang et al.	2022	BiES: adaptive policy optimization for model-based offline reinforcement learning
Anitha et al.	2024	Deep artificial neural network based multilayer gated recurrent model for effective prediction of software development effort
Wulur et al.	2021	Planning-integrated Policy for Efficient Reinforcement Learning in Sparse-reward Environments
Li et al.	2021	Bayesian optimization with particle swarm
Faury et al.	2019	Rover descent: Learning to optimize by learning to navigate on prototypical loss surfaces