Sæmundsson et al., 2018 - Google Patents
Meta reinforcement learning with latent variable Gaussian processes
- Document ID
- 5749403218492476053
- Author
- Sæmundsson S
- Hofmann K
- Deisenroth M
- Publication year
- 2018
- Publication venue
- arXiv preprint arXiv:1803.07551
Snippet
Learning from small data sets is critical in many practical applications where data collection is time-consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing …
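The central idea named in the title is to share one Gaussian process model across related tasks by augmenting its inputs with a per-task latent variable. The sketch below is a minimal illustration of that idea, not the authors' implementation: the task latents are hand-picked scalars rather than variationally inferred, the kernel is a plain RBF, and the toy data and hyperparameters are assumptions made purely for demonstration.

```python
# Minimal sketch (not the paper's implementation): a Gaussian process whose
# inputs are augmented with a per-task latent value, so data from several
# related tasks can be pooled into one shared model.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-2):
    """Standard GP regression posterior mean with an RBF kernel."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_test, X_train)
    return K_s @ np.linalg.solve(K, y_train)

# Two toy "tasks": the same underlying function with task-specific offsets.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(20, 1))
tasks = {0: np.sin(x) + 0.5, 1: np.sin(x) - 0.5}

# Hypothetical 1-D task latents; in the paper these are latent variables
# learned jointly with the GP, here they are fixed for illustration only.
task_latents = {0: 0.5, 1: -0.5}

# Augment each input with its task latent and pool everything into one GP.
X_train = np.vstack([np.hstack([x, np.full_like(x, task_latents[t])]) for t in tasks])
y_train = np.vstack([tasks[t] for t in tasks]).ravel()

# Predict on task 1, reusing data from both tasks through the shared model.
x_test = np.linspace(-3, 3, 5).reshape(-1, 1)
X_test = np.hstack([x_test, np.full_like(x_test, task_latents[1])])
print(gp_posterior_mean(X_train, y_train, X_test))
```

Because both tasks contribute to the same kernel matrix, predictions for one task borrow statistical strength from the other; generalizing to a new task then reduces to inferring its latent value, which this toy example does not attempt.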
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/0635—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means using analogue means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
- G06N3/0472—Architectures, e.g. interconnection topology using probabilistic elements, e.g. p-rams, stochastic processors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B17/00—Systems involving the use of models or simulators of said systems
- G05B17/02—Systems involving the use of models or simulators of said systems electric
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
Similar Documents
Publication | Title
---|---
Sæmundsson et al. | Meta reinforcement learning with latent variable Gaussian processes
Pertsch et al. | Guided reinforcement learning with learned skills
Uchendu et al. | Jump-start reinforcement learning
Shafiullah et al. | Behavior transformers: Cloning k modes with one stone
Hansen et al. | Stabilizing deep Q-learning with ConvNets and vision transformers under data augmentation
Sharma et al. | Third-person visual imitation learning via decoupled hierarchical controller
Quillen et al. | Deep reinforcement learning for vision-based robotic grasping: A simulated comparative evaluation of off-policy methods
Zhao et al. | Maximum entropy-regularized multi-goal reinforcement learning
Brys et al. | Reinforcement learning from demonstration through shaping
Blondé et al. | Sample-efficient imitation learning via generative adversarial nets
Laskey et al. | DART: Noise injection for robust imitation learning
Haldar et al. | Watch and match: Supercharging imitation with regularized optimal transport
Heess et al. | Actor-critic reinforcement learning with energy-based policies
Shi et al. | Skill-based model-based reinforcement learning
Wang et al. | Skill preferences: Learning to extract and execute robotic skills from human feedback
Montgomery et al. | Reset-free guided policy search: Efficient deep reinforcement learning with stochastic initial states
Tschiatschek et al. | Variational inference for data-efficient model learning in POMDPs
Thakur et al. | Uncertainty aware learning from demonstrations in multiple contexts using Bayesian neural networks
Baert et al. | Maximum causal entropy inverse constrained reinforcement learning
Feng et al. | Finetuning offline world models in the real world
Liu et al. | Distilling motion planner augmented policies into visual control policies for robot manipulation
Torabi et al. | Sample-efficient adversarial imitation learning from observation
Zuo et al. | Off-policy adversarial imitation learning for robotic tasks with low-quality demonstrations
Wang et al. | Consciousness-driven reinforcement learning: An online learning control framework
Liu et al. | Hindsight generative adversarial imitation learning