Wang et al., 2022 - Google Patents
Skill preferences: Learning to extract and execute robotic skills from human feedback
- Document ID
- 10817062050482384311
- Author
- Wang X
- Lee K
- Hakhamaneshi K
- Abbeel P
- Laskin M
- Publication year
- 2022
- Publication venue
- Conference on Robot Learning
Snippet
A promising approach to solving challenging long-horizon tasks has been to extract behavior priors (skills) by fitting generative models to large offline datasets of demonstrations. However, such generative models inherit the biases of the underlying data …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/0635—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means using analogue means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/002—Quantum computers, i.e. information processing by using quantum superposition, coherence, decoherence, entanglement, nonlocality, teleportation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | | Skill preferences: Learning to extract and execute robotic skills from human feedback |
Pertsch et al. | | Guided reinforcement learning with learned skills |
Dasari et al. | | Transformers for one-shot visual imitation |
Lee et al. | | PEBBLE: Feedback-efficient interactive reinforcement learning via relabeling experience and unsupervised pre-training |
Pertsch et al. | | Accelerating reinforcement learning with learned skill priors |
Sæmundsson et al. | | Meta reinforcement learning with latent variable Gaussian processes |
Sohn et al. | | Hierarchical reinforcement learning for zero-shot generalization with subtask dependencies |
Mandlekar et al. | | IRIS: Implicit reinforcement without interaction at scale for learning control from offline robot manipulation data |
Uchendu et al. | | Jump-start reinforcement learning |
Blondé et al. | | Sample-efficient imitation learning via generative adversarial nets |
Laskey et al. | | Robot grasping in clutter: Using a hierarchy of supervisors for learning from demonstrations |
Kang et al. | | Policy optimization with demonstrations |
Zhao et al. | | Maximum entropy-regularized multi-goal reinforcement learning |
Rajeswaran et al. | | EPOpt: Learning robust neural network policies using model ensembles |
Lu et al. | | AW-Opt: Learning robotic skills with imitation and reinforcement at scale |
Shi et al. | | Skill-based model-based reinforcement learning |
Wang et al. | | RL-VLM-F: Reinforcement learning from vision language foundation model feedback |
Simmons-Edler et al. | | Q-learning for continuous actions with cross-entropy guided policies |
Zhou et al. | | Policy architectures for compositional generalization in control |
Pore et al. | | On simple reactive neural networks for behaviour-based reinforcement learning |
Tschiatschek et al. | | Variational inference for data-efficient model learning in POMDPs |
Ramirez et al. | | Reinforcement learning from expert demonstrations with application to redundant robot control |
Zuo et al. | | Adversarial imitation learning with mixed demonstrations from multiple demonstrators |
Banerjee et al. | | Optimal actor-critic policy with optimized training datasets |
Ding et al. | | Learning a universal human prior for dexterous manipulation from human preference |