Uchendu et al., 2023 - Google Patents
Jump-start reinforcement learningUchendu et al., 2023
View PDF- Document ID
- 15331057141078785906
- Author
- Uchendu I
- Xiao T
- Lu Y
- Zhu B
- Yan M
- Simon J
- Bennice M
- Fu C
- Ma C
- Jiao J
- Levine S
- Hausman K
- Publication year
- Publication venue
- International Conference on Machine Learning
External Links
Snippet
Reinforcement learning (RL) provides a theoretical framework for continuously improving an agent's behavior via trial and error. However, efficiently learning policies from scratch can be very difficult, particularly for tasks that present exploration challenges. In such settings, it …
- 230000002787 reinforcement 0 title abstract description 44
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/0635—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means using analogue means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
- G06N3/0472—Architectures, e.g. interconnection topology using probabilistic elements, e.g. p-rams, stochastic processors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Uchendu et al. | Jump-start reinforcement learning | |
Pertsch et al. | Accelerating reinforcement learning with learned skill priors | |
Wu et al. | Greedy hierarchical variational autoencoders for large-scale video prediction | |
Chane-Sane et al. | Goal-conditioned reinforcement learning with imagined subgoals | |
Singh et al. | Cog: Connecting new skills to past experience with offline reinforcement learning | |
Sæmundsson et al. | Meta reinforcement learning with latent variable gaussian processes | |
Nagabandi et al. | Deep dynamics models for learning dexterous manipulation | |
Pertsch et al. | Guided reinforcement learning with learned skills | |
Heess et al. | Actor-critic reinforcement learning with energy-based policies | |
Huang et al. | Continual model-based reinforcement learning with hypernetworks | |
Harutyunyan et al. | The termination critic | |
Hiraoka et al. | Learning robust options by conditional value at risk optimization | |
Wang et al. | Deep deterministic policy gradient with compatible critic network | |
Tschiatschek et al. | Variational inference for data-efficient model learning in pomdps | |
Elsayed et al. | A surrogate-assisted differential evolution algorithm with dynamic parameters selection for solving expensive optimization problems | |
Perez et al. | Efficient transfer learning and online adaptation with latent variable models for continuous control | |
Sacks et al. | Learning to optimize in model predictive control | |
Zuo et al. | Off-policy adversarial imitation learning for robotic tasks with low-quality demonstrations | |
Seo et al. | Continuous control with coarse-to-fine reinforcement learning | |
Chen et al. | Self-supervised reinforcement learning that transfers using random features | |
Sikchi et al. | Score models for offline goal-conditioned reinforcement learning | |
Shi et al. | Efficient hierarchical policy network with fuzzy rules | |
Liu et al. | Hindsight generative adversarial imitation learning | |
Lin et al. | Exploration-efficient deep reinforcement learning with demonstration guidance for robot control | |
Galashov et al. | Importance weighted policy learning and adaptation |