Cang et al., 2021

Behavioral priors and dynamics models: Improving performance and domain transfer in offline RL

Document ID: 17015169259498962404
Author: Cang C; Rajeswaran A; Abbeel P; Laskin M
Publication year: 2021
Publication venue: arXiv preprint arXiv:2106.09119

Snippet

Offline Reinforcement Learning (RL) aims to extract near-optimal policies from imperfect offline data without additional environment interactions. Extracting policies from diverse offline datasets has the potential to expand the range of applicability of RL by making the …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
      - G06N3/00—Computer systems based on biological models
        - G06N3/02—Computer systems based on biological models using neural network models
          - G06N3/08—Learning methods
          - G06N3/04—Architectures, e.g. interconnection topology
        - G06N3/12—Computer systems based on biological models using genetic models
          - G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
        - G06N3/004—Artificial life, i.e. computers simulating life
          - G06N3/006—Artificial life, i.e. computers simulating life based on simulated virtual individual or collective life forms, e.g. single "avatar", social simulations, virtual worlds
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/04—Inference methods or devices
          - G06N5/046—Forward inferencing, production systems
        - G06N5/02—Knowledge representation
          - G06N5/022—Knowledge engineering, knowledge acquisition
      - G06N7/00—Computer systems based on specific mathematical models
        - G06N7/005—Probabilistic networks
        - G06N7/02—Computer systems based on specific mathematical models using fuzzy logic
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
            - G06F17/30587—Details of specialised database models
              - G06F17/30595—Relational databases
                - G06F17/30598—Clustering or classification
      - G06F15/00—Digital computers in general; Data processing equipment in general
        - G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines

Similar Documents

Publication	Publication Date	Title
Cang et al.	2021	Behavioral priors and dynamics models: Improving performance and domain transfer in offline RL
Song et al.	2019	V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control
Brown et al.	2020	Better-than-demonstrator imitation learning via automatically-ranked demonstrations
Amin et al.	2021	A survey of exploration methods in reinforcement learning
Alet et al.	2020	Meta-learning curiosity algorithms
Polson et al.	2017	Deep learning: A Bayesian perspective
Abdolmaleki et al.	2018	Relative entropy regularized policy iteration
Plaat et al.	2020	Deep model-based reinforcement learning for high-dimensional problems, a survey
Finn et al.	2016	Generalizing skills with semi-supervised reinforcement learning
Gomez et al.	2008	Accelerated Neural Evolution through Cooperatively Coevolved Synapses
Heylighen et al.	2001	Cybernetics and second-order cybernetics
Lyu et al.	2022	Double check your state before trusting it: Confidence-aware bidirectional offline model-based imagination
Ghosh et al.	2019	Learning to reach goals without reinforcement learning
Kaushik et al.	2018	Multi-objective model-based policy search for data-efficient learning with sparse rewards
Kim et al.	2023	Accelerating reinforcement learning with value-conditional state entropy exploration
Nguyen et al.	2020	Effects of decision complexity in goal-seeking gridworlds: A comparison of instance-based learning and reinforcement learning agents
Peng et al.	2018	Continual match based training in Pommerman: Technical report
Zhang et al.	2023	Efficient experience replay architecture for offline reinforcement learning
Bortkiewicz et al.	2024	Accelerating Goal-Conditioned RL Algorithms and Research
Sun et al.	2022	Constrained MDPs can be solved by eearly-termination with recurrent models
Freire et al.	2021	Sequential memory improves sample and memory efficiency in Episodic Control
Guzman et al.	2022	Adaptive model predictive control by learning classifiers
Gong et al.	2022	Evolutionary symbolic regression from a probabilistic perspective
Kuric et al.	2021	Meta reinforcement learning for fast adaptation of hierarchical policies
KrisshnaKumar et al.	2024	Towards Physically Talented Aerial Robots with Tactically Smart Swarm Behavior thereof: An Efficient Co-design Approach