- extended-abstract, May 2024
Fairness of Exposure in Online Restless Multi-armed Bandits
AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, Pages 2474–2476
Restless multi-armed bandits (RMABs) generalize multi-armed bandits: each arm exhibits Markovian behavior and transitions according to its own dynamics. Solutions to RMABs exist for both the offline and online cases. However, they do not ...
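The defining feature noted in this abstract, that every arm evolves whether or not it is pulled, can be illustrated with a small simulation. This is a generic sketch, not the method of any paper listed here: the two-state arm model, all transition probabilities, and the myopic policy below are illustrative assumptions.

```python
import random

# Minimal RMAB sketch: each arm is a two-state (0 = bad, 1 = good) Markov
# chain that transitions on EVERY step, pulled or not. All probabilities
# and rewards are made up for illustration.

class RestlessArm:
    def __init__(self, p_stay_good_active, p_stay_good_passive):
        self.state = 0
        self.p_active = p_stay_good_active    # P(next=1 | state=1, pulled)
        self.p_passive = p_stay_good_passive  # P(next=1 | state=1, not pulled)

    def step(self, pulled):
        # Unlike classic bandits, the arm transitions even when not pulled.
        if self.state == 1:
            p_good = self.p_active if pulled else self.p_passive
        else:
            p_good = 0.5 if pulled else 0.1   # illustrative recovery chance
        self.state = 1 if random.random() < p_good else 0
        return self.state if pulled else 0    # reward only from the pulled arm

def myopic_policy(arms):
    # Pull the arm currently in the best state (ties broken by index);
    # a Whittle-index policy would rank arms by an index instead.
    return max(range(len(arms)), key=lambda i: arms[i].state)

random.seed(0)
arms = [RestlessArm(0.9, 0.6), RestlessArm(0.8, 0.4)]
total = 0
for _ in range(1000):
    k = myopic_policy(arms)
    for i, arm in enumerate(arms):
        total += arm.step(pulled=(i == k))
```

The myopic rule is a deliberately naive baseline; the papers in this listing study stronger index policies (e.g. Whittle-type indices) for exactly this kind of model.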
- research-article, May 2023
Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits
AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, Pages 1294–1302
We study the problem of planning restless multi-armed bandits (RMABs) with multiple actions. This is a popular model for multi-agent systems with applications like multi-channel communication, monitoring and machine maintenance tasks, and healthcare. ...
- research-article, October 2022
Index-aware reinforcement learning for adaptive video streaming at the wireless edge
MobiHoc '22: Proceedings of the Twenty-Third International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing, Pages 81–90, https://doi.org/10.1145/3492866.3549726
We study adaptive video streaming for multiple users in wireless access edge networks with unreliable channels. The key challenge is to jointly optimize the video bitrate adaptation and resource allocation such that the users' cumulative quality of ...
- research-article, May 2022
Networked Restless Multi-Armed Bandits for Mobile Interventions
- Han-Ching Ou,
- Christoph Siebenbrunner,
- Jackson Killian,
- Meredith B. Brooks,
- David Kempe,
- Yevgeniy Vorobeychik,
- Milind Tambe
AAMAS '22: Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, Pages 1001–1009
Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves ...
- research-article, July 2021
Minimizing Age-of-Information in Heterogeneous Multi-Channel Systems: A New Partial-Index Approach
MobiHoc '21: Proceedings of the Twenty-second International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing, Pages 11–20, https://doi.org/10.1145/3466772.3467030
We study how to schedule data sources in a wireless time-sensitive information system with multiple heterogeneous and unreliable channels to minimize the total expected Age-of-Information (AoI). Although one could formulate this problem as a discrete-...
- research-article, July 2020
Index Policies and Performance Bounds for Dynamic Selection Problems
We consider dynamic selection problems, where a decision maker repeatedly selects a set of items from a larger collection of available items. A classic example is the dynamic assortment problem with demand learning, where a retailer chooses items to offer ...
- article, January 2019
Optimal policies for observing time series and related restless bandit problems
The trade-off between the cost of acquiring and processing data, and uncertainty due to a lack of data is fundamental in machine learning. A basic instance of this trade-off is the problem of deciding when to make noisy and costly observations of a ...
- extended-abstract, June 2018
Asymptotic Optimal Control of Markov-Modulated Restless Bandits
SIGMETRICS '18: Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems, Pages 44–46, https://doi.org/10.1145/3219617.3219636
This paper studies optimal control subject to changing conditions. This is an area that has recently received a lot of attention, as it arises in numerous situations in practice. Applications include cloud computing systems with fluctuating arrival rates, ...
Also Published in:
ACM SIGMETRICS Performance Evaluation Review, Volume 46, Issue 1
- research-article, April 2018
Asymptotic Optimal Control of Markov-Modulated Restless Bandits
Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Volume 2, Issue 1, Article No. 7, Pages 1–25, https://doi.org/10.1145/3179410
This paper studies optimal control subject to changing conditions. This is an area that has recently received a lot of attention, as it arises in numerous situations in practice. Applications include cloud computing systems where the arrival rates of new ...
- research-article, December 2014
Exploration vs exploitation with partially observable Gaussian autoregressive arms
VALUETOOLS '14: Proceedings of the 8th International Conference on Performance Evaluation Methodologies and Tools, Pages 209–216, https://doi.org/10.4108/icst.valuetools.2014.258207
We consider a restless bandit problem with Gaussian autoregressive arms, where the state of an arm is only observed when it is played and the state-dependent reward is collected. Since arms are only partially observable, a good decision policy needs to ...
- research-article, June 2014
Index policies for a multi-class queue with convex holding cost and abandonments
SIGMETRICS '14: The 2014 ACM international conference on Measurement and modeling of computer systems, Pages 125–137, https://doi.org/10.1145/2591971.2591983
We investigate a resource allocation problem in a multi-class server with convex holding costs and user impatience under the average cost criterion. In general, the optimal policy has a complex dependency on all the input parameters and state ...
Also Published in:
ACM SIGMETRICS Performance Evaluation Review, Volume 42, Issue 1
- article, August 2013
Green Access Point Selection for Wireless Local Area Networks Enhanced by Cognitive Radio
Mobile Networks and Applications (MNET), Volume 18, Issue 4, Pages 553–566, https://doi.org/10.1007/s11036-013-0437-z
In wireless local area networks (WLANs) made up of Extended Service Sets, access point (AP) selection is a key issue for improving network performance and balancing the traffic load. WLANs operating in the shared Industrial, Scientific and Medical band ...
- research-article, September 2012
Opportunistic schedulers for optimal scheduling of flows in wireless systems with ARQ feedback
ITC '12: Proceedings of the 24th International Teletraffic Congress, Article No. 14, Pages 1–8
In this paper we study three opportunistic schedulers for the problem of optimal multi-class flow-level scheduling in wireless downlink and uplink systems. For user channels we employ the Gilbert-Elliot model of good and bad channel condition with flow-...
- research-article, May 2011
Optimal index rules for single resource allocation to stochastic dynamic competitors
In this paper we present a generic Markov decision process model of optimal single resource allocation to a collection of stochastic dynamic competitors. The main goal is to identify sufficient conditions under which this problem is optimally solved by ...
- research-article, December 2010
Approximation algorithms for restless bandit problems
Journal of the ACM (JACM), Volume 58, Issue 1, Article No. 3, Pages 1–50, https://doi.org/10.1145/1870103.1870106
The restless bandit problem is one of the most well-studied generalizations of the celebrated stochastic multi-armed bandit (MAB) problem in decision theory. In its ultimate generality, the restless bandit problem is known to be PSPACE-hard to ...
- article, July 2009
Index Policies for the Admission Control and Routing of Impatient Customers to Heterogeneous Service Stations
We propose a general Markovian model for the optimal control of admissions and subsequent routing of customers for service provided by a collection of heterogeneous stations. Queue-length information is available to inform all decisions. Admitted ...
- research-article, October 2008
An index policy for multiarmed multimode restless bandits
ValueTools '08: Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools, Article No. 61, Pages 1–6, https://doi.org/10.4108/ICST.VALUETOOLS2008.4410
This paper introduces and addresses the multiarmed multimode restless bandit problem, concerning the optimal dynamic allocation of a shared resource to a collection of projects which can be operated in multiple modes, subject to a peak resource ...
- research-article, October 2008
Admission control and routing to parallel queues with delayed information via marginal productivity indices
ValueTools '08: Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools, Article No. 52, Pages 1–10, https://doi.org/10.4108/ICST.VALUETOOLS2008.4409
This paper addresses the problem of designing and computing a tractable index policy for dynamic job admission control and/or routing in a discrete-time Markovian model of parallel loss queues with one-period delayed state observation, which comes close ...
- research-article, October 2008
Computing an index policy for multiarmed bandits with deadlines
ValueTools '08: Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools, Article No. 13, Pages 1–10, https://doi.org/10.4108/ICST.VALUETOOLS2008.4406
This paper introduces the multiarmed bandit problem with deadlines, which concerns the dynamic selection of a live project to engage, out of a portfolio of Markovian bandit projects expiring after given deadlines, to maximize the expected total ...
- research-article, October 2007
Characterization and computation of restless bandit marginal productivity indices
ValueTools '07: Proceedings of the 2nd international conference on Performance evaluation methodologies and tools, Article No. 74, Pages 1–10
The restless bandit problem furnishes a powerful modeling paradigm for settings involving the optimal dynamic priority allocation to multiple stochastic projects, given as binary-action (active/passive) Markov decision processes (MDPs). Though generally ...