Lee et al., 2021 - Google Patents

Bayesian residual policy optimization: Scalable Bayesian reinforcement learning with clairvoyant experts

Document ID
2389416385153745085
Author
Lee G
Hou B
Choudhury S
Srinivasa S
Publication year
2021
Publication venue
2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Snippet

Informed and robust decision making in the face of uncertainty is critical for robots operating in unstructured environments. We formulate this as Bayesian Reinforcement Learning over latent Markov Decision Processes (MDPs). While Bayes-optimality is theoretically the gold …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • G05B13/027 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/04 Architectures, e.g. interconnection topology
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00 Program-control systems
    • G05B2219/30 Nc systems
    • G05B2219/39 Robotics, robotics to robotics hand
    • G05B2219/39376 Hierarchical, learning, recognition and skill level and adaptation servo level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0268 Control of position or course in two dimensions specially adapted to land vehicles using internal positioning means
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0255 Control of position or course in two dimensions specially adapted to land vehicles using acoustic signals, e.g. ultra-sonic signals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6267 Classification techniques
    • G06K9/6279 Classification techniques relating to the number of classes

Similar Documents

Publication Publication Date Title
CN113485380B (en) AGV path planning method and system based on reinforcement learning
Morales et al. A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning
Jesus et al. Deep deterministic policy gradient for navigation of mobile robots in simulated environments
EP3832420A1 (en) Deep learning based motion control of a group of autonomous vehicles
Bansal et al. A Hamilton-Jacobi reachability-based framework for predicting and analyzing human motion for safe planning
Lee et al. Ensemble Bayesian decision making with redundant deep perceptual control policies
Cai et al. Lets-drive: Driving in a crowd by learning from tree search
Chaffre et al. Sim-to-real transfer with incremental environment complexity for reinforcement learning of depth-based robot navigation
CN114020013B (en) Unmanned aerial vehicle formation collision avoidance method based on deep reinforcement learning
Lee et al. Bayesian residual policy optimization: Scalable Bayesian reinforcement learning with clairvoyant experts
Chen et al. Mobile robot obstacle avoidance using short memory: a dynamic recurrent neuro-fuzzy approach
Amiri et al. Learning and reasoning for robot sequential decision making under uncertainty
Liu et al. Episodic memory-based robotic planning under uncertainty
Fischer et al. Sampling-based inverse reinforcement learning algorithms with safety constraints
Yin et al. Autonomous navigation of mobile robots in unknown environments using off-policy reinforcement learning with curriculum learning
Gamal et al. Learning from fuzzy system demonstration: Autonomous navigation of mobile robot in static indoor environment using multimodal deep learning
Zhang et al. Performance guaranteed human-robot collaboration with POMDP supervisory control
Hirose et al. Probabilistic visual navigation with bidirectional image prediction
Ramakrishna et al. Augmenting learning components for safety in resource constrained autonomous robots
Stein et al. Navigating in populated environments by following a leader
Xiao et al. Reinforcement learning-driven dynamic obstacle avoidance for mobile robot trajectory tracking
Malone et al. Efficient motion-based task learning for a serial link manipulator
Cherroun et al. Intelligent systems based on reinforcement learning and fuzzy logic approaches, "Application to mobile robotic"
González-Rodríguez et al. Uncertainty-Aware autonomous mobile robot navigation with deep reinforcement learning
Srinivasan et al. Path planning with user route preference-A reward surface approximation approach using orthogonal Legendre polynomials