Overcoming the Long Horizon Barrier for Sample-Efficient Reinforcement Learning with Latent Low-Rank Structure

Authors: Tyler Sam, Yudong Chen, Christina Lee Yu

SIGMETRICS '23: Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems

Pages 85 - 86

https://doi.org/10.1145/3578338.3593562

Published: 19 June 2023

Abstract

Reinforcement learning (RL) methods have become increasingly popular in sequential decision-making tasks due to their empirical success. However, the large state and action spaces of real-world problems modeled as Markov decision processes (MDPs) limit the use of RL algorithms. Given a standard finite-horizon MDP $(S, A, P, R, H)$ with state space $S$, action space $A$, transition kernel $P = \{P_h\}_{h \in [H]}$, reward function $R = \{R_h\}_{h \in [H]}$ bounded between zero and one, and time horizon $H$, one needs $\Omega(|S||A|H^3/\epsilon^2)$ samples from a generative model to learn an $\epsilon$-optimal policy [3], which can be impractical when $S$ and $A$ are large. This tabular RL framework does not capture the fact that many real-world systems have additional structure that, if exploited, should improve computational and statistical efficiency. Moreover, [1] empirically verifies that the optimal and near-optimal action-value functions (both viewed as $|S|$-by-$|A|$ matrices) of classical stochastic control tasks have low rank. Thus, the critical question is: what are the minimal low-rank structural assumptions that allow for computationally and statistically efficient learning?
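To give a feel for the scaling discussed above, the following minimal Python sketch evaluates the |S||A|H^3/ε^2 term with constants and logarithmic factors dropped, and compares the number of entries in an |S|-by-|A| matrix with the number of free parameters of a rank-d matrix of the same shape, d(|S| + |A| - d). The function names, the rank symbol d, and the numeric values are illustrative assumptions, not taken from the paper, and the parameter count is a standard linear-algebra fact rather than a sample complexity result claimed by the authors.

```python
def tabular_lower_bound_scale(num_states, num_actions, horizon, epsilon):
    """Scale of the |S||A|H^3 / eps^2 term (constants and log factors dropped)."""
    return num_states * num_actions * horizon**3 / epsilon**2


def low_rank_degrees_of_freedom(num_states, num_actions, rank):
    """Free parameters of a rank-d |S|-by-|A| matrix: d * (|S| + |A| - d)."""
    return rank * (num_states + num_actions - rank)


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    S, A, H, eps, d = 1000, 1000, 20, 0.1, 5
    print(f"tabular scale |S||A|H^3/eps^2: {tabular_lower_bound_scale(S, A, H, eps):.3e}")
    print(f"entries in a full |S|-by-|A| matrix: {S * A}")
    print(f"free parameters at rank d={d}: {low_rank_degrees_of_freedom(S, A, d)}")
```

Doubling the horizon multiplies the tabular term by eight, while the rank-d parameter count grows with |S| + |A| rather than |S||A|, which is the intuition behind asking what low-rank structure can buy.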

Supplemental Material

Presentation video (MP4 file, 52.09 MB)

References

[1] Rozada, S., and Marques, A. G. Tensor and matrix low-rank value-function approximation in reinforcement learning. arXiv preprint arXiv:2201.09736 (2022).

[2] Shah, D., Song, D., Xu, Z., and Yang, Y. Sample efficient reinforcement learning via low-rank matrix estimation. In Advances in Neural Information Processing Systems (2020), H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, Eds., vol. 33, Curran Associates, Inc., pp. 12092–12103.

[3] Sidford, A., Wang, M., Wu, X., Yang, L., and Ye, Y. Near-optimal time and sample complexities for solving Markov decision processes with a generative model. In Advances in Neural Information Processing Systems (2018), S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, Eds., vol. 31, Curran Associates, Inc.

Index Terms

1. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory
      1. Reinforcement learning
        Sequential decision making

Published In

SIGMETRICS '23: Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems

June 2023

123 pages

ISBN:9798400700743

DOI:10.1145/3578338

General Chair: Evgenia Smirni (William & Mary, US)

Program Chairs: Konstantin Avrachenkov (INRIA Sophia Antipolis, FR), Phillipa Gill (Google, US), Bhuvan Urgaonkar (Penn State University, US and Amazon, US)

ACM SIGMETRICS Performance Evaluation Review Volume 51, Issue 1
SIGMETRICS '23
June 2023
108 pages
ISSN:0163-5999
DOI:10.1145/3606376
Editor: Zhenhua Liu (Stony Brook University)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Abstract

Data Availability

Presentation video https://dl.acm.org/doi/10.1145/3578338.3593562#SIGMETRICS23-38.mp4

Funding Sources

NSF (National Science Foundation)

Conference

SIGMETRICS '23

Sponsor:

SIGMETRICS

SIGMETRICS '23: ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems

June 19 - 23, 2023

Orlando, Florida, United States

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%
