ADPRL 2009: Nashville, TN, USA
- IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009, Nashville, TN, USA, March 31 - April 1, 2009. IEEE 2009, ISBN 978-1-4244-2761-1
Keynote Lecture
- Dimitri P. Bertsekas:
A unified framework for temporal difference methods. 1-7
Adaptive Dynamic Programming and Reinforcement Learning
- Hirotaka Hachiya, Takayuki Akiyama, Masashi Sugiyama, Jan Peters:
Efficient data reuse in value function approximation. 8-15
- Lili Cui, Huaguang Zhang, Derong Liu, Yongsu Kim:
Constrained optimal control of affine nonlinear discrete-time systems using GHJB method. 16-21
- Xuerui Bai, Dongbin Zhao, Jianqiang Yi:
ADHDP(λ) strategies based coordinated ramps metering with queuing consideration. 22-27
ADP and RL for Controls
- Hongwei Zhang, Jie Huang, Frank L. Lewis:
Algorithm and stability of ATC receding horizon control. 28-35
- Kyriakos G. Vamvoudakis, Draguna L. Vrabie, Frank L. Lewis:
Online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem. 36-41
- Dongsung Huh, Emanuel Todorov:
Real-time motor control using recurrent neural networks. 42-49
- Dan Liu, Emanuel Todorov:
Hierarchical optimal control of a 7-DOF arm model. 50-57
- Tom Erez, William D. Smart:
Coupling perception and action using minimax optimal control. 58-65
Markov Decision Processes
- Jun Ma, Warren B. Powell:
A convergent recursive least squares approximate policy iteration algorithm for multi-dimensional Markov decision process with continuous state and action spaces. 66-73
- Huizhen Yu, Dimitri P. Bertsekas:
Basis function adaptation methods for cost approximation in MDP. 74-81
- Elva Corona-Xelhuantzi, Eduardo F. Morales, Luis Enrique Sucar:
Executing concurrent actions with multiple Markov decision processes. 82-89
- Emanuel Todorov, Yuval Tassa:
Iterative local dynamic programming. 90-95
- Eugene A. Feinberg:
Adaptive computation of optimal nonrandomized policies in constrained average-reward MDPs. 96-100
Architecture of ADP and RL
- Marco A. Wiering, Hado van Hasselt:
The QV family compared to other reinforcement learning algorithms. 101-108
- Philippe Preux, Sertan Girgin, Manuel Loth:
Feature discovery in approximate dynamic programming. 109-116
- Raphaël Fonteneau, Susan A. Murphy, Louis Wehenkel, Damien Ernst:
Inferring bounds on the performance of a control policy from a sample of trajectories. 117-123
- Xin Zhang, Huaguang Zhang, Derong Liu, Yongsu Kim:
Neural-network-based reinforcement learning controller for nonlinear systems with non-symmetric dead-zone inputs. 124-129
- Yogesh P. Awate:
Algorithms for variance reduction in a policy-gradient based actor-critic framework. 130-136
Policy Search in ADP and RL
- Ilya O. Ryzhov, Warren B. Powell:
The knowledge gradient algorithm for online subset selection. 137-144
- Boris Defourny, Damien Ernst, Louis Wehenkel:
Planning under uncertainty, ensembles of disturbance trees and kernelized discrete action spaces. 145-152
- Lucian Busoniu, Damien Ernst, Bart De Schutter, Robert Babuska:
Policy search with cross-entropy optimization of basis functions. 153-160
- Emanuel Todorov:
Eigenfunction approximation methods for linearly-solvable optimal control problems. 161-168
- Jason Pazis, Michail G. Lagoudakis:
Learning continuous-action control policies. 169-176
Statistical and Multiagent RL
- Harm van Seijen, Hado van Hasselt, Shimon Whiteson, Marco A. Wiering:
A theoretical and empirical analysis of Expected Sarsa. 177-184
- Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Kalman Temporal Differences: The deterministic case. 185-192
- Willi Richert, Ulrich Scheller, Markus Koch, Bernd Kleinjohann, Claudius Stern:
Integrating sporadic imitation in Reinforcement Learning robots. 193-198
- Roman V. Belavkin:
Bounds of optimal learning. 199-204
- Ali Akramizadeh, Mohammad B. Menhaj, Ahmad Afshar:
Multiagent reinforcement learning in extensive form games with complete information. 205-211
Applications of ADP and RL
- C. Alexander Simpkins, Emanuel Todorov:
Practical numerical methods for stochastic optimal control of biological systems in continuous time and space. 212-218
- Evangelos A. Theodorou, Jonas Buchli, Stefan Schaal:
Path integral-based stochastic optimal control for rigid body dynamics. 219-225
- Jan Peters, Jens Kober:
Using reward-weighted imitation for robot Reinforcement Learning. 226-232
- H. Daniel Patiño, Santiago Tosetti, Flavio Capraro:
Adaptive Critic Designs-based autonomous unmanned vehicles navigation: Application to robotic farm vehicles. 233-237
- Xiaofeng Lin, Tangbo Liu, Shaojian Song, Chunning Song:
Neuro-controller of cement rotary kiln temperature with adaptive critic designs. 238-242