[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

Liu et al., 2020 - Google Patents

Overview of reinforcement learning based on value and policy

Liu et al., 2020

Document ID
6085800706183336402
Author
Liu Y
Yang J
Chen L
Guo T
Jiang Y
Publication year
Publication venue
2020 Chinese Control And Decision Conference (CCDC)

External Links

Snippet

Reinforcement learning methods are mainly divided into two categories based on value functions and policies. This article systematically introduces and summarizes reinforcement learning methods from these two categories. First, it summarizes the reinforcement learning …
Continue reading at ieeexplore.ieee.org (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • G06N3/04Architectures, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • G05B13/027Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/12Computer systems based on biological models using genetic models
    • G06N3/126Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/04Inference methods or devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computer systems based on specific mathematical models
    • G06N7/02Computer systems based on specific mathematical models using fuzzy logic
    • G06N7/023Learning or tuning the parameters of a fuzzy system
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5009Computer-aided design using simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/02Knowledge representation
    • G06N5/022Knowledge engineering, knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computer systems based on specific mathematical models
    • G06N7/005Probabilistic networks
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B17/00Systems involving the use of models or simulators of said systems
    • G05B17/02Systems involving the use of models or simulators of said systems electric

Similar Documents

Publication Publication Date Title
Li et al. A reinforcement learning based RMOEA/D for bi-objective fuzzy flexible job shop scheduling
Liu et al. Overview of reinforcement learning based on value and policy
Russell et al. Q-decomposition for reinforcement learning agents
Dash et al. Efficient stock price prediction using a self evolving recurrent neuro-fuzzy inference system optimized through a modified differential harmony search technique
Touati et al. Randomized value functions via multiplicative normalizing flows
Gasic et al. Gaussian processes for fast policy optimisation of pomdp-based dialogue managers
Rosenbloom The Sigma cognitive architecture and system
Juang et al. A locally recurrent fuzzy neural network with support vector regression for dynamic-system modeling
Zhao et al. Asynchronous reinforcement learning algorithms for solving discrete space path planning problems
CN111309880A (en) Multi-agent action strategy learning method, device, medium and computing equipment
Ergen et al. Energy-efficient LSTM networks for online learning
Barto Reinforcement learning and dynamic programming
Liu et al. Prioritized experience replay based on multi-armed bandit
Dong et al. A hybrid algorithm for workflow scheduling in cloud environment
Hung A fuzzy GARCH model applied to stock market scenario using a genetic algorithm
Byeon Advances in Value-based, Policy-based, and Deep Learning-based Reinforcement Learning
Zhao et al. Ensemble-based offline-to-online reinforcement learning: From pessimistic learning to optimistic exploration
Chen et al. Boosting the performance of computing systems through adaptive configuration tuning
Ghazanfari et al. Enhancing nash q-learning and team q-learning mechanisms by using bottlenecks
Schmitt et al. Exploration via epistemic value estimation
Chen et al. Averaged-A3C for asynchronous deep reinforcement learning
Aydin et al. Adaptive operator selection utilising generalised experience
Tsuchiya et al. Explainable Reinforcement Learning Based on Q-Value Decomposition by Expected State Transitions.
Sun et al. Multi-Agent Deep Deterministic Policy Gradient Algorithm Based on Classification Experience Replay
Tang et al. Hierarchical reinforcement learning based on multi-agent cooperation game theory