Vuong et al., 2019 - Google Patents

Uncertainty-aware model-based policy optimization

Document ID: 14419615351770162475
Authors: Vuong T, Tran K
Publication year: 2019
Publication venue: arXiv preprint arXiv:1906.10717

Snippet

Model-based reinforcement learning has the potential to be more sample efficient than model-free approaches. However, existing model-based methods are vulnerable to model bias, which leads to poor generalization and asymptotic performance compared to model …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/12 Computer systems based on biological models using genetic models
    • G06N3/126 Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computer systems based on specific mathematical models
    • G06N7/005 Probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G06F17/30424 Query processing
    • G06F17/30533 Other types of queries
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • G05B13/027 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/18 Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management

Similar Documents

Publication Publication Date Title
Wang et al. Exploring model-based planning with policy networks
Bakshy et al. AE: A domain-agnostic platform for adaptive experimentation
Xu et al. Learning to explore via meta-policy gradient
Boney et al. Regularizing model-based planning with energy-based models
Luis et al. Inductive transfer for learning Bayesian networks
Vuong et al. Uncertainty-aware model-based policy optimization
Zhao et al. Adaptive behavior cloning regularization for stable offline-to-online reinforcement learning
Gaier et al. Data-efficient neuroevolution with kernel-based surrogate models
Song et al. Efficient evaluation methods for neural architecture search: A survey
Li et al. Hyper-parameter estimation method with particle swarm optimization
Liu et al. Type-2 hierarchical fuzzy system for high-dimensional data-based modeling with uncertainties
Zhao et al. ODE-based recurrent model-free reinforcement learning for POMDPs
Xiao et al. Nonparametric kernel smoother on topology learning neural networks for incremental and ensemble regression
Jawed et al. Multi-task learning curve forecasting across hyperparameter configurations and datasets
Vuong et al. Policy Optimization In the Face of Uncertainty
Oxenstierna Predicting house prices using ensemble learning with cluster aggregations
Li et al. Policy gradient methods with Gaussian process modelling acceleration
Gupta et al. Sequential knowledge transfer across problems
Li et al. Continuous probabilistic model building genetic network programming using reinforcement learning
Nilsson et al. Tree Ensembles for Contextual Bandits
Yang et al. BiES: adaptive policy optimization for model-based offline reinforcement learning
Anitha et al. Deep artificial neural network based multilayer gated recurrent model for effective prediction of software development effort
Wulur et al. Planning-integrated Policy for Efficient Reinforcement Learning in Sparse-reward Environments
Li et al. Bayesian optimization with particle swarm
Faury et al. Rover descent: Learning to optimize by learning to navigate on prototypical loss surfaces