Option compatible reward inverse reinforcement learning
Hwang et al., 2022
- Document ID
- 17888956388082567896
- Author
- Hwang R
- Lee H
- Hwang H
- Publication year
- 2022
- Publication venue
- Pattern Recognition Letters
Snippet
Reinforcement learning in complex environments is a challenging problem. In particular, the success of reinforcement learning algorithms depends on a well-designed reward function. Inverse reinforcement learning (IRL) solves the problem of recovering reward functions from …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/02—Computer systems based on specific mathematical models using fuzzy logic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Moerland et al. | A0c: Alpha zero in continuous action space | |
Ueltzhöffer | Deep active inference | |
Taddy | The technological elements of artificial intelligence | |
Powell | Perspectives of approximate dynamic programming | |
Rezende et al. | Causally correct partial models for reinforcement learning | |
Pinosky et al. | Hybrid control for combining model-based and model-free reinforcement learning | |
Hwang et al. | Option compatible reward inverse reinforcement learning | |
Jaimungal | Reinforcement learning and stochastic optimisation | |
Pynadath et al. | Reinforcement learning for adaptive theory of mind in the sigma cognitive architecture | |
Francon et al. | Effective reinforcement learning through evolutionary surrogate-assisted prescription | |
Jaafra et al. | A review of meta-reinforcement learning for deep neural networks architecture search | |
Kottas et al. | Bi-linear adaptive estimation of fuzzy cognitive networks | |
Huang et al. | CMA evolution strategy assisted by kriging model and approximate ranking | |
Zhang et al. | An end-to-end inverse reinforcement learning by a boosting approach with relative entropy | |
Mousavi et al. | Automatic abstraction controller in reinforcement learning agent via automata | |
Castellini et al. | Explaining the influence of prior knowledge on POMCP policies | |
Boukraichi et al. | A priori compression of convolutional neural networks for wave simulators | |
Liu et al. | Distributional reinforcement learning with epistemic and aleatoric uncertainty estimation | |
Alexandridis et al. | Modelling of nonlinear process dynamics using Kohonen's neural networks, fuzzy systems and Chebyshev series | |
Leventi-Peetz et al. | Scope and sense of explainability for ai-systems | |
Lahiri et al. | Combining counterfactuals with shapley values to explain image models | |
Zhao et al. | Augmenting policy learning with routines discovered from a single demonstration | |
Volodin | Causeoccam: Learning interpretable abstract representations in reinforcement learning environments via model sparsity | |
Méndez-Molina et al. | Carl: A synergistic framework for causal reinforcement learning | |
Wang et al. | Erlang planning network: An iterative model-based reinforcement learning with multi-perspective |