Uchendu et al., 2023 - Google Patents

Jump-start reinforcement learning

Uchendu et al., 2023

Document ID: 15331057141078785906
Author: Uchendu I; Xiao T; Lu Y; Zhu B; Yan M; Simon J; Bennice M; Fu C; Ma C; Jiao J; Levine S; Hausman K
Publication year: 2023
Publication venue: International Conference on Machine Learning

External Links

Cited by

Snippet

Reinforcement learning (RL) provides a theoretical framework for continuously improving an agent's behavior via trial and error. However, efficiently learning policies from scratch can be very difficult, particularly for tasks that present exploration challenges. In such settings, it …

Continue reading at proceedings.mlr.press (PDF) (other versions)

230000002787 reinforcement 0 title abstract description 44

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/0635—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means using analogue means
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
- G06N3/0472—Architectures, e.g. interconnection topology using probabilistic elements, e.g. p-rams, stochastic processors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches

Similar Documents

Publication	Publication Date	Title
Uchendu et al.	2023	Jump-start reinforcement learning
Pertsch et al.	2021	Accelerating reinforcement learning with learned skill priors
Wu et al.	2021	Greedy hierarchical variational autoencoders for large-scale video prediction
Chane-Sane et al.	2021	Goal-conditioned reinforcement learning with imagined subgoals
Singh et al.	2020	Cog: Connecting new skills to past experience with offline reinforcement learning
Sæmundsson et al.	2018	Meta reinforcement learning with latent variable gaussian processes
Nagabandi et al.	2020	Deep dynamics models for learning dexterous manipulation
Pertsch et al.	2021	Guided reinforcement learning with learned skills
Heess et al.	2013	Actor-critic reinforcement learning with energy-based policies
Huang et al.	2021	Continual model-based reinforcement learning with hypernetworks
Harutyunyan et al.	2019	The termination critic
Hiraoka et al.	2019	Learning robust options by conditional value at risk optimization
Wang et al.	2021	Deep deterministic policy gradient with compatible critic network
Tschiatschek et al.	2018	Variational inference for data-efficient model learning in pomdps
Elsayed et al.	2014	A surrogate-assisted differential evolution algorithm with dynamic parameters selection for solving expensive optimization problems
Perez et al.	2018	Efficient transfer learning and online adaptation with latent variable models for continuous control
Sacks et al.	2022	Learning to optimize in model predictive control
Zuo et al.	2020	Off-policy adversarial imitation learning for robotic tasks with low-quality demonstrations
Seo et al.	2024	Continuous control with coarse-to-fine reinforcement learning
Chen et al.	2024	Self-supervised reinforcement learning that transfers using random features
Sikchi et al.	2023	Score models for offline goal-conditioned reinforcement learning
Shi et al.	2022	Efficient hierarchical policy network with fuzzy rules
Liu et al.	2019	Hindsight generative adversarial imitation learning
Lin et al.	2020	Exploration-efficient deep reinforcement learning with demonstration guidance for robot control
Galashov et al.	2020	Importance weighted policy learning and adaptation