Hwang et al., 2022
Option compatible reward inverse reinforcement learning
- Document ID
- 17888956388082567896
- Author
- Hwang R
- Lee H
- Hwang H
- Publication year
- 2022
- Publication venue
- Pattern Recognition Letters
Snippet
Reinforcement learning in complex environments is a challenging problem. In particular, the success of reinforcement learning algorithms depends on a well-designed reward function. Inverse reinforcement learning (IRL) solves the problem of recovering reward functions from …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/02—Computer systems based on specific mathematical models using fuzzy logic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bengio et al. | | Machine learning for combinatorial optimization: a methodological tour d’horizon |
Jaafra et al. | | Reinforcement learning for neural architecture search: A review |
Hu et al. | | Petri-net-based dynamic scheduling of flexible manufacturing system via deep reinforcement learning with graph convolutional network |
Moerland et al. | | A0c: Alpha zero in continuous action space |
Taddy | | The technological elements of artificial intelligence |
Powell | | Perspectives of approximate dynamic programming |
Hwang et al. | | Option compatible reward inverse reinforcement learning |
Jaimungal | | Reinforcement learning and stochastic optimisation |
Pynadath et al. | | Reinforcement learning for adaptive theory of mind in the sigma cognitive architecture |
Pinosky et al. | | Hybrid control for combining model-based and model-free reinforcement learning |
Francon et al. | | Effective reinforcement learning through evolutionary surrogate-assisted prescription |
Jaafra et al. | | A review of meta-reinforcement learning for deep neural networks architecture search |
Han et al. | | Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks |
Kottas et al. | | Bi-linear adaptive estimation of fuzzy cognitive networks |
Mousavi et al. | | Automatic abstraction controller in reinforcement learning agent via automata |
Arumugam et al. | | Interpreting denoising autoencoders with complex perturbation approach |
Służalec et al. | | Quasi-optimal hp-finite element refinements towards singularities via deep neural network prediction |
Castellini et al. | | Explaining the influence of prior knowledge on POMCP policies |
Schmidhuber | | Learning algorithms for networks with internal and external feedback |
Alexandridis et al. | | Modelling of nonlinear process dynamics using Kohonen's neural networks, fuzzy systems and Chebyshev series |
Leventi-Peetz et al. | | Scope and sense of explainability for AI-systems |
Liu et al. | | Distributional reinforcement learning with epistemic and aleatoric uncertainty estimation |
Lahiri et al. | | Combining counterfactuals with Shapley values to explain image models |
Méndez-Molina et al. | | CARL: A Synergistic Framework for Causal Reinforcement Learning |
Di et al. | | Newton’s method, Bellman recursion and differential dynamic programming for unconstrained nonlinear dynamic games |