- extended-abstract, May 2024
Fairness of Exposure in Online Restless Multi-armed Bandits
AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, Pages 2474–2476
Restless multi-armed bandits (RMABs) generalize multi-armed bandits: each arm exhibits Markovian behavior and transitions according to its own dynamics. Solutions to RMABs exist for both the offline and online cases. However, they do not ...
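The defining feature noted in this abstract, that every arm evolves whether or not it is pulled, can be illustrated with a small simulation. This is a generic sketch, not the method of any paper listed here: the two-state arm model, all transition probabilities, and the myopic policy below are illustrative assumptions.

```python
import random

# Minimal RMAB sketch: each arm is a two-state (0 = bad, 1 = good) Markov
# chain that transitions on EVERY step, pulled or not. All probabilities
# and rewards are made up for illustration.

class RestlessArm:
    def __init__(self, p_stay_good_active, p_stay_good_passive):
        self.state = 0
        self.p_active = p_stay_good_active    # P(next=1 | state=1, pulled)
        self.p_passive = p_stay_good_passive  # P(next=1 | state=1, not pulled)

    def step(self, pulled):
        # Unlike classic bandits, the arm transitions even when not pulled.
        if self.state == 1:
            p_good = self.p_active if pulled else self.p_passive
        else:
            p_good = 0.5 if pulled else 0.1   # illustrative recovery chance
        self.state = 1 if random.random() < p_good else 0
        return self.state if pulled else 0    # reward only from the pulled arm

def myopic_policy(arms):
    # Pull the arm currently in the best state (ties broken by index);
    # a Whittle-index policy would rank arms by an index instead.
    return max(range(len(arms)), key=lambda i: arms[i].state)

random.seed(0)
arms = [RestlessArm(0.9, 0.6), RestlessArm(0.8, 0.4)]
total = 0
for _ in range(1000):
    k = myopic_policy(arms)
    for i, arm in enumerate(arms):
        total += arm.step(pulled=(i == k))
```

The myopic rule is a deliberately naive baseline; the papers in this listing study stronger index policies (e.g. Whittle-type indices) for exactly this kind of model.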
- research-article, May 2023
Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits
AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, Pages 1294–1302
We study the problem of planning restless multi-armed bandits (RMABs) with multiple actions. This is a popular model for multi-agent systems with applications like multi-channel communication, monitoring and machine maintenance tasks, and healthcare. ...
- research-article, October 2022
Index-aware reinforcement learning for adaptive video streaming at the wireless edge
MobiHoc '22: Proceedings of the Twenty-Third International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing, Pages 81–90, https://doi.org/10.1145/3492866.3549726
We study adaptive video streaming for multiple users in wireless access edge networks with unreliable channels. The key challenge is to jointly optimize the video bitrate adaptation and resource allocation such that the users' cumulative quality of ...
- research-article, May 2022
Networked Restless Multi-Armed Bandits for Mobile Interventions
- Han-Ching Ou,
- Christoph Siebenbrunner,
- Jackson Killian,
- Meredith B. Brooks,
- David Kempe,
- Yevgeniy Vorobeychik,
- Milind Tambe
AAMAS '22: Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, Pages 1001–1009
Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves ...
- research-article, July 2021
Minimizing Age-of-Information in Heterogeneous Multi-Channel Systems: A New Partial-Index Approach
MobiHoc '21: Proceedings of the Twenty-second International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing, Pages 11–20, https://doi.org/10.1145/3466772.3467030
We study how to schedule data sources in a wireless time-sensitive information system with multiple heterogeneous and unreliable channels to minimize the total expected Age-of-Information (AoI). Although one could formulate this problem as a discrete-...
- research-article, July 2020
Index Policies and Performance Bounds for Dynamic Selection Problems
We consider dynamic selection problems, where a decision maker repeatedly selects a set of items from a larger collection of available items. A classic example is the dynamic assortment problem with demand learning, where a retailer chooses items to offer ...
- article, January 2019
Optimal policies for observing time series and related restless bandit problems
The trade-off between the cost of acquiring and processing data, and uncertainty due to a lack of data is fundamental in machine learning. A basic instance of this trade-off is the problem of deciding when to make noisy and costly observations of a ...
- extended-abstract, June 2018
Asymptotic Optimal Control of Markov-Modulated Restless Bandits
SIGMETRICS '18: Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems, Pages 44–46, https://doi.org/10.1145/3219617.3219636
This paper studies optimal control subject to changing conditions. This is an area that has recently received a lot of attention, as it arises in numerous situations in practice. Applications include cloud computing systems with fluctuating arrival rates, ...
Also Published in:
ACM SIGMETRICS Performance Evaluation Review, Volume 46, Issue 1
- research-article, April 2018
Asymptotic Optimal Control of Markov-Modulated Restless Bandits
Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Volume 2, Issue 1, Article No. 7, Pages 1–25, https://doi.org/10.1145/3179410
This paper studies optimal control subject to changing conditions. This is an area that has recently received a lot of attention, as it arises in numerous situations in practice. Applications include cloud computing systems where the arrival rates of new ...
- research-article, December 2014
Exploration vs exploitation with partially observable Gaussian autoregressive arms
VALUETOOLS '14: Proceedings of the 8th International Conference on Performance Evaluation Methodologies and Tools, Pages 209–216, https://doi.org/10.4108/icst.valuetools.2014.258207
We consider a restless bandit problem with Gaussian autoregressive arms, where the state of an arm is only observed when it is played and the state-dependent reward is collected. Since arms are only partially observable, a good decision policy needs to ...
- research-article, June 2014
Index policies for a multi-class queue with convex holding cost and abandonments
SIGMETRICS '14: The 2014 ACM international conference on Measurement and modeling of computer systems, Pages 125–137, https://doi.org/10.1145/2591971.2591983
We investigate a resource allocation problem in a multi-class server with convex holding costs and user impatience under the average cost criterion. In general, the optimal policy has a complex dependency on all the input parameters and state ...
Also Published in:
ACM SIGMETRICS Performance Evaluation Review, Volume 42, Issue 1
- article, August 2013
Green Access Point Selection for Wireless Local Area Networks Enhanced by Cognitive Radio
Mobile Networks and Applications (MNET), Volume 18, Issue 4, Pages 553–566, https://doi.org/10.1007/s11036-013-0437-z
In wireless local area networks (WLANs) made up of Extended Service Sets, access point (AP) selection is a key issue for improving network performance and balancing the traffic load. WLANs operating in the shared Industrial, Scientific and Medical band ...
- research-article, September 2012
Opportunistic schedulers for optimal scheduling of flows in wireless systems with ARQ feedback
ITC '12: Proceedings of the 24th International Teletraffic Congress, Article No. 14, Pages 1–8
In this paper we study three opportunistic schedulers for the problem of optimal multi-class flow-level scheduling in wireless downlink and uplink systems. For user channels we employ the Gilbert-Elliot model of good and bad channel condition with flow-...
- research-article, May 2011
Optimal index rules for single resource allocation to stochastic dynamic competitors
In this paper we present a generic Markov decision process model of optimal single resource allocation to a collection of stochastic dynamic competitors. The main goal is to identify sufficient conditions under which this problem is optimally solved by ...
- research-article, December 2010
Approximation algorithms for restless bandit problems
Journal of the ACM (JACM), Volume 58, Issue 1, Article No. 3, Pages 1–50, https://doi.org/10.1145/1870103.1870106
The restless bandit problem is one of the most well-studied generalizations of the celebrated stochastic multi-armed bandit (MAB) problem in decision theory. In its ultimate generality, the restless bandit problem is known to be PSPACE-hard to ...
- article, July 2009
Index Policies for the Admission Control and Routing of Impatient Customers to Heterogeneous Service Stations
We propose a general Markovian model for the optimal control of admissions and subsequent routing of customers for service provided by a collection of heterogeneous stations. Queue-length information is available to inform all decisions. Admitted ...
- research-article, October 2008
An index policy for multiarmed multimode restless bandits
ValueTools '08: Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools, Article No. 61, Pages 1–6, https://doi.org/10.4108/ICST.VALUETOOLS2008.4410
This paper introduces and addresses the multiarmed multimode restless bandit problem, concerning the optimal dynamic allocation of a shared resource to a collection of projects which can be operated in multiple modes, subject to a peak resource ...
- research-article, October 2008
Admission control and routing to parallel queues with delayed information via marginal productivity indices
ValueTools '08: Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools, Article No. 52, Pages 1–10, https://doi.org/10.4108/ICST.VALUETOOLS2008.4409
This paper addresses the problem of designing and computing a tractable index policy for dynamic job admission control and/or routing in a discrete-time Markovian model of parallel loss queues with one-period delayed state observation, which comes close ...
- research-article, October 2008
Computing an index policy for multiarmed bandits with deadlines
ValueTools '08: Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools, Article No. 13, Pages 1–10, https://doi.org/10.4108/ICST.VALUETOOLS2008.4406
This paper introduces the multiarmed bandit problem with deadlines, which concerns the dynamic selection of a live project to engage, out of a portfolio of Markovian bandit projects expiring after given deadlines, to maximize the expected total ...
- research-article, October 2007
Characterization and computation of restless bandit marginal productivity indices
ValueTools '07: Proceedings of the 2nd international conference on Performance evaluation methodologies and tools, Article No. 74, Pages 1–10
The restless bandit problem furnishes a powerful modeling paradigm for settings involving the optimal dynamic priority allocation to multiple stochastic projects, given as binary-action (active/passive) Markov decision processes (MDPs). Though generally ...