Sæmundsson et al., 2018 - Google Patents
Meta reinforcement learning with latent variable Gaussian processes
- Document ID
- 5749403218492476053
- Author
- Sæmundsson S
- Hofmann K
- Deisenroth M
- Publication year
- 2018
- Publication venue
- arXiv preprint arXiv:1803.07551
Snippet
Learning from small data sets is critical in many practical applications where data collection is time-consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing …
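The central idea named in the title is to share one Gaussian process model across related tasks by augmenting its inputs with a per-task latent variable. The sketch below is a minimal illustration of that idea, not the authors' implementation: the task latents are hand-picked scalars rather than variationally inferred, the kernel is a plain RBF, and the toy data and hyperparameters are assumptions made purely for demonstration.

```python
# Minimal sketch (not the paper's implementation): a Gaussian process whose
# inputs are augmented with a per-task latent value, so data from several
# related tasks can be pooled into one shared model.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-2):
    """Standard GP regression posterior mean with an RBF kernel."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_test, X_train)
    return K_s @ np.linalg.solve(K, y_train)

# Two toy "tasks": the same underlying function with task-specific offsets.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(20, 1))
tasks = {0: np.sin(x) + 0.5, 1: np.sin(x) - 0.5}

# Hypothetical 1-D task latents; in the paper these are latent variables
# learned jointly with the GP, here they are fixed for illustration only.
task_latents = {0: 0.5, 1: -0.5}

# Augment each input with its task latent and pool everything into one GP.
X_train = np.vstack([np.hstack([x, np.full_like(x, task_latents[t])]) for t in tasks])
y_train = np.vstack([tasks[t] for t in tasks]).ravel()

# Predict on task 1, reusing data from both tasks through the shared model.
x_test = np.linspace(-3, 3, 5).reshape(-1, 1)
X_test = np.hstack([x_test, np.full_like(x_test, task_latents[1])])
print(gp_posterior_mean(X_train, y_train, X_test))
```

Because both tasks contribute to the same kernel matrix, predictions for one task borrow statistical strength from the other; generalizing to a new task then reduces to inferring its latent value, which this toy example does not attempt.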
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/0635—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means using analogue means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
- G06N3/0472—Architectures, e.g. interconnection topology using probabilistic elements, e.g. p-rams, stochastic processors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B17/00—Systems involving the use of models or simulators of said systems
- G05B17/02—Systems involving the use of models or simulators of said systems electric
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
Similar Documents
Publication | Title
---|---
Sæmundsson et al. | Meta reinforcement learning with latent variable Gaussian processes
Pertsch et al. | Guided reinforcement learning with learned skills
Uchendu et al. | Jump-start reinforcement learning
Shafiullah et al. | Behavior transformers: Cloning k modes with one stone
Hansen et al. | Stabilizing deep Q-learning with ConvNets and vision transformers under data augmentation
Sharma et al. | Third-person visual imitation learning via decoupled hierarchical controller
Quillen et al. | Deep reinforcement learning for vision-based robotic grasping: A simulated comparative evaluation of off-policy methods
Zhao et al. | Maximum entropy-regularized multi-goal reinforcement learning
Brys et al. | Reinforcement learning from demonstration through shaping
Blondé et al. | Sample-efficient imitation learning via generative adversarial nets
Laskey et al. | DART: Noise injection for robust imitation learning
Haldar et al. | Watch and match: Supercharging imitation with regularized optimal transport
Heess et al. | Actor-critic reinforcement learning with energy-based policies
Shi et al. | Skill-based model-based reinforcement learning
Wang et al. | Skill preferences: Learning to extract and execute robotic skills from human feedback
Montgomery et al. | Reset-free guided policy search: Efficient deep reinforcement learning with stochastic initial states
Tschiatschek et al. | Variational inference for data-efficient model learning in POMDPs
Thakur et al. | Uncertainty aware learning from demonstrations in multiple contexts using Bayesian neural networks
Baert et al. | Maximum causal entropy inverse constrained reinforcement learning
Feng et al. | Finetuning offline world models in the real world
Liu et al. | Distilling motion planner augmented policies into visual control policies for robot manipulation
Torabi et al. | Sample-efficient adversarial imitation learning from observation
Zuo et al. | Off-policy adversarial imitation learning for robotic tasks with low-quality demonstrations
Wang et al. | Consciousness-driven reinforcement learning: An online learning control framework
Liu et al. | Hindsight generative adversarial imitation learning