Zhao et al., 2022

Adaptive behavior cloning regularization for stable offline-to-online reinforcement learning

Document ID: 6985242959602302250
Authors: Zhao Y, Boney R, Ilin A, Kannala J, Pajarinen J
Publication year: 2022
Publication venue: arXiv preprint arXiv:2210.13846

Snippet

Offline reinforcement learning, by learning from a fixed dataset, makes it possible to learn agent behaviors without interacting with the environment. However, depending on the quality of the offline dataset, such pre-trained agents may have limited performance and …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/12 Computer systems based on biological models using genetic models
    • G06N3/126 Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/04 Architectures, e.g. interconnection topology
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • G05B13/027 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition

Similar Documents

Publication Title
Zhao et al. Adaptive behavior cloning regularization for stable offline-to-online reinforcement learning
Lee et al. Offline-to-online reinforcement learning via balanced replay and pessimistic Q-ensemble
Ajay et al. Is conditional generative modeling all you need for decision-making?
Chen et al. Delay-aware model-based reinforcement learning for continuous control
Vecerik et al. Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards
Kurenkov et al. AC-Teach: A Bayesian actor-critic method for policy learning with an ensemble of suboptimal teachers
Chen et al. Latent-variable advantage-weighted policy optimization for offline RL
Shrestha et al. DeepAveragers: offline reinforcement learning by solving derived non-parametric MDPs
Ma et al. Offline goal-conditioned reinforcement learning via $f$-advantage regression
Cang et al. Behavioral priors and dynamics models: Improving performance and domain transfer in offline RL
Choi et al. Variational empowerment as representation learning for goal-based reinforcement learning
Li et al. ACDER: Augmented curiosity-driven experience replay
Hein et al. Generating interpretable fuzzy controllers using particle swarm optimization and genetic programming
Zhang et al. Efficient experience replay architecture for offline reinforcement learning
ElDahshan et al. Deep reinforcement learning based video games: A review
Vuong et al. Uncertainty-aware model-based policy optimization
Zhao et al. Improving Offline-to-Online Reinforcement Learning with Q-Ensembles
Zhao et al. Ensemble-based offline-to-online reinforcement learning: From pessimistic learning to optimistic exploration
Coelho et al. VQC-based reinforcement learning with data re-uploading: performance and trainability
Yang et al. Continuous control for searching and planning with a learned model
Ma et al. Learning to coordinate from offline datasets with uncoordinated behavior policies
Hepburn et al. Model-based trajectory stitching for improved behavioural cloning and its applications
Lee et al. Addressing distribution shift in online reinforcement learning with offline datasets
Li et al. Offline Reinforcement Learning with Uncertainty Critic Regularization Based on Density Estimation
Liu et al. Judgmentally adjusted Q-values based on Q-ensemble for offline reinforcement learning