Lee et al., 2021 - Google Patents
Bayesian residual policy optimization:: Scalable bayesian reinforcement learning with clairvoyant expertsLee et al., 2021
View PDF- Document ID
- 2389416385153745085
- Author
- Lee G
- Hou B
- Choudhury S
- Srinivasa S
- Publication year
- Publication venue
- 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
External Links
Snippet
Informed and robust decision making in the face of uncertainty is critical for robots operating in unstructured environments. We formulate this as Bayesian Reinforcement Learning over latent Markov Decision Processes (MDPs). While Bayes-optimality is theoretically the gold …
- 238000005457 optimization 0 title abstract description 16
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/39—Robotics, robotics to robotics hand
- G05B2219/39376—Hierarchical, learning, recognition and skill level and adaptation servo level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0268—Control of position or course in two dimensions specially adapted to land vehicles using internal positioning means
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0255—Control of position or course in two dimensions specially adapted to land vehicles using acoustic signals, e.g. ultra-sonic singals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113485380B (en) | AGV path planning method and system based on reinforcement learning | |
Morales et al. | A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning | |
Jesus et al. | Deep deterministic policy gradient for navigation of mobile robots in simulated environments | |
EP3832420A1 (en) | Deep learning based motion control of a group of autonomous vehicles | |
Bansal et al. | A hamilton-jacobi reachability-based framework for predicting and analyzing human motion for safe planning | |
Lee et al. | Ensemble bayesian decision making with redundant deep perceptual control policies | |
Cai et al. | Lets-drive: Driving in a crowd by learning from tree search | |
Chaffre et al. | Sim-to-real transfer with incremental environment complexity for reinforcement learning of depth-based robot navigation | |
CN114020013B (en) | Unmanned aerial vehicle formation collision avoidance method based on deep reinforcement learning | |
Lee et al. | Bayesian residual policy optimization:: Scalable bayesian reinforcement learning with clairvoyant experts | |
Chen et al. | Mobile robot obstacle avoidance using short memory: a dynamic recurrent neuro-fuzzy approach | |
Amiri et al. | Learning and reasoning for robot sequential decision making under uncertainty | |
Liu et al. | Episodic memory-based robotic planning under uncertainty | |
Fischer et al. | Sampling-based inverse reinforcement learning algorithms with safety constraints | |
Yin et al. | Autonomous navigation of mobile robots in unknown environments using off-policy reinforcement learning with curriculum learning | |
Gamal et al. | Learning from fuzzy system demonstration: Autonomous navigation of mobile robot in static indoor environment using multimodal deep learning | |
Zhang et al. | Performance guaranteed human-robot collaboration with POMDP supervisory control | |
Hirose et al. | Probabilistic visual navigation with bidirectional image prediction | |
Ramakrishna et al. | Augmenting learning components for safety in resource constrained autonomous robots | |
Stein et al. | Navigating in populated environments by following a leader | |
Xiao et al. | Reinforcement learning-driven dynamic obstacle avoidance for mobile robot trajectory tracking | |
Malone et al. | Efficient motion-based task learning for a serial link manipulator | |
Cherroun et al. | Intelligent systems based on reinforcement learning and fuzzy logic approaches," Application to mobile robotic" | |
González-Rodríguez et al. | Uncertainty-Aware autonomous mobile robot navigation with deep reinforcement learning | |
Srinivasan et al. | Path planning with user route preference-A reward surface approximation approach using orthogonal Legendre polynomials |