
Cang et al., 2021 - Google Patents

Behavioral priors and dynamics models: Improving performance and domain transfer in offline RL

Document ID
17015169259498962404
Author
Cang C
Rajeswaran A
Abbeel P
Laskin M
Publication year
2021
Publication venue
arXiv preprint arXiv:2106.09119

Snippet

Offline Reinforcement Learning (RL) aims to extract near-optimal policies from imperfect offline data without additional environment interactions. Extracting policies from diverse offline datasets has the potential to expand the range of applicability of RL by making the …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/04 Architectures, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/12 Computer systems based on biological models using genetic models
    • G06N3/126 Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G06N5/046 Forward inferencing, production systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computer systems based on specific mathematical models
    • G06N7/005 Probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/004 Artificial life, i.e. computers simulating life
    • G06N3/006 Artificial life, i.e. computers simulating life based on simulated virtual individual or collective life forms, e.g. single "avatar", social simulations, virtual worlds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30587 Details of specialised database models
    • G06F17/30595 Relational databases
    • G06F17/30598 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computer systems based on specific mathematical models
    • G06N7/02 Computer systems based on specific mathematical models using fuzzy logic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/18 Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines

Similar Documents

Publication Title
Cang et al. Behavioral priors and dynamics models: Improving performance and domain transfer in offline RL
Brown et al. Better-than-demonstrator imitation learning via automatically-ranked demonstrations
Song et al. V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control
Amin et al. A survey of exploration methods in reinforcement learning
Alet et al. Meta-learning curiosity algorithms
Polson et al. Deep learning: A Bayesian perspective
Abdolmaleki et al. Relative entropy regularized policy iteration
Plaat et al. Deep model-based reinforcement learning for high-dimensional problems, a survey
Gomez et al. Accelerated Neural Evolution through Cooperatively Coevolved Synapses.
Heylighen et al. Cybernetics and second-order cybernetics
Campos et al. Beyond fine-tuning: Transferring behavior in reinforcement learning
Ghosh et al. Learning to reach goals without reinforcement learning
Kaushik et al. Multi-objective model-based policy search for data-efficient learning with sparse rewards
Lyu et al. Double check your state before trusting it: Confidence-aware bidirectional offline model-based imagination
Cobo et al. Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains
Botteghi et al. Unsupervised representation learning in deep reinforcement learning: A review
Zhang et al. Efficient experience replay architecture for offline reinforcement learning
Kim et al. Accelerating reinforcement learning with value-conditional state entropy exploration
Jhunjhunwala Policy extraction via online Q-value distillation
Cerezo et al. Fractal AI: A fragile theory of intelligence
Freire et al. Sequential episodic control
Sun et al. Constrained MDPs can be solved by early-termination with recurrent models
Guzman et al. Adaptive model predictive control by learning classifiers
Kuric et al. Meta reinforcement learning for fast adaptation of hierarchical policies
KrisshnaKumar et al. Towards Physically Talented Aerial Robots with Tactically Smart Swarm Behavior thereof: An Efficient Co-design Approach