[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3670865.3673460acmconferencesArticle/Chapter ViewAbstractPublication PagesecConference Proceedingsconference-collections
research-article
Free access

Efficient Prior-Free Mechanisms for No-Regret Agents

Published: 17 December 2024 Publication History

Abstract

We study a repeated Principal Agent problem between a long lived Principal and Agent pair in a prior free setting. In our setting, the sequence of realized states of nature may be adversarially chosen, the Agent is non-myopic, and the Principal aims for a strong form of policy regret. Following [Camara et al., 2020], we model the Agent's long-run behavior with behavioral assumptions that relax the common prior assumption (for example, that the Agent has no swap regret). Within this framework, we revisit the mechanism proposed by [Camara et al., 2020], which informally uses calibrated forecasts of the unknown states of nature in place of a common prior. We give two main improvements. First, we give a mechanism that has an exponentially improved dependence (in terms of both running time and regret bounds) on the number of distinct states of nature. To do this, we show that our mechanism does not require truly calibrated forecasts, but rather forecasts that are unbiased subject to only a polynomially sized collection of events --- which can be produced with polynomial overhead. Second, instead of constructing a policy and assuming that the policy is "stable" (informally requiring that under such a policy, all approximately optimal actions lead to approximately the same Principal payoff), we propose a general framework given access to a stable policy oracle. We then instantiate the oracle by developing efficient algorithms in several significant special cases, including the focal linear contracting setting. Taken together, our new mechanism makes the compelling framework proposed by [Camara et al., 2020] more powerful, now able to be realized over polynomially sized state spaces.

References

[1]
Maria-Florina Balcan, Avrim Blum, Nika Haghtalab, and Ariel D. Procaccia. 2015. Commitment Without Regrets: Online Learning in Stackelberg Security Games. Proceedings of the Sixteenth ACM Conference on Economics and Computation (2015). https://api.semanticscholar.org/CorpusID:14830193
[2]
Martino Bernasconi, Matteo Castiglioni, Andrea Celli, Alberto Marchesi, Francesco Trovò, and Nicola Gatti. 2023. Optimal rates and efficient algorithms for online Bayesian persuasion. In International Conference on Machine Learning. PMLR, 2164--2183.
[3]
Martino Bernasconi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti, and Francesco Trovò. 2022. Sequential information design: Learning to persuade in the dark. Advances in Neural Information Processing Systems 35 (2022), 15917--15928.
[4]
Avrim Blum, Nika Haghtalab, and Ariel D Procaccia. 2014. Learning optimal commitment to overcome insecurity. Advances in Neural Information Processing Systems 27 (2014).
[5]
Avrim Blum, MohammadTaghi Hajiaghayi, Katrina Ligett, and Aaron Roth. 2008. Regret minimization and the price of total anarchy. In Proceedings of the fortieth annual ACM symposium on Theory of computing. 373--382.
[6]
Avrim Blum and Yishay Mansour. 2007. From external to internal regret. Journal of Machine Learning Research 8, 6 (2007).
[7]
Patrick Bolton and Mathias Dewatripont. 2004. Contract theory. MIT press.
[8]
Mark Braverman, Jieming Mao, Jon Schneider, and Matt Weinberg. 2018. Selling to a no-regret buyer. In Proceedings of the 2018 ACM Conference on Economics and Computation. 523--538.
[9]
William Brown, Jon Schneider, and Kiran Vodrahalli. 2023. Is Learning in Games Good for the Learners? arXiv preprint arXiv:2305.19496 (2023).
[10]
Linda Cai, S Matthew Weinberg, Evan Wildenhain, and Shirley Zhang. 2023. Selling to Multiple No-Regret Buyers. arXiv preprint arXiv:2307.04175 (2023).
[11]
Modibo K Camara, Jason D Hartline, and Aleck Johnsen. 2020. Mechanisms for a no-regret agent: Beyond the common prior. In 2020 ieee 61st annual symposium on foundations of computer science (focs). IEEE, 259--270.
[12]
Gabriel Carroll. 2015. Robustness and linear contracts. American Economic Review 105, 2 (2015), 536--563.
[13]
Gabriel Carroll. 2021. Contract Theory. (2021).
[14]
Matteo Castiglioni, Alberto Marchesi, and Nicola Gatti. 2021. Bayesian Agency: Linear versus Tractable Contracts. arXiv e-prints (2021), arXiv-2106.
[15]
Sylvain Chassang. 2013. Calibrated incentive contracts. Econometrica 81, 5 (2013), 1935--1971.
[16]
Yiling Chen, Yang Liu, and Chara Podimata. 2020. Learning strategy-aware linear classifiers. Advances in Neural Information Processing Systems 33 (2020), 15265--15276.
[17]
Alon Cohen, Argyrios Deligkas, and Moran Koren. 2022. Learning approximately optimal contracts. In International Symposium on Algorithmic Game Theory. Springer, 331--346.
[18]
Lee Cohen and Yishay Mansour. 2019. Optimal algorithm for bayesian incentive-compatible exploration. In Proceedings of the 2019 ACM Conference on Economics and Computation. 135--151.
[19]
Natalie Collina, Eshwar Ram Arunachaleswaran, and Michael Kearns. 2023. Efficient Stackelberg Strategies for Finitely Repeated Games. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems. 643--651.
[20]
Yuan Deng, Jon Schneider, and Balusubramanian Sivan. 2019. Strategizing against No-regret Learners. arXiv:1909.13861 [cs.GT]
[21]
Jinshuo Dong, Aaron Roth, Zachary Schutzman, Bo Waggoner, and Zhiwei Steven Wu. 2018. Strategic classification from revealed preferences. In Proceedings of the 2018 ACM Conference on Economics and Computation. 55--70.
[22]
Shaddin Dughmi and Haifeng Xu. 2016. Algorithmic bayesian persuasion. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing. 412--425.
[23]
Paul Dütting, Tomer Ezra, Michal Feldman, and Thomas Kesselheim. 2022. Combinatorial contracts. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS). IEEE, 815--826.
[24]
Paul Dütting, Tim Roughgarden, and Inbal Talgam-Cohen. 2019. Simple versus Optimal Contracts. In Proceedings of the 2019 ACM Conference on Economics and Computation (Phoenix, AZ, USA) (EC '19). Association for Computing Machinery, New York, NY, USA, 369--387.
[25]
Dean P Foster and Sergiu Hart. 2018. Smooth calibration, leaky forecasts, finite recall, and nash dynamics. Games and Economic Behavior 109 (2018), 271--293.
[26]
Dean P Foster and Rakesh Vohra. 1999. Regret in the on-line decision problem. Games and Economic Behavior 29, 1--2 (1999), 7--35.
[27]
Dean P Foster and Rakesh V Vohra. 1998. Asymptotic calibration. Biometrika 85, 2 (1998), 379--390.
[28]
Jiarui Gan, Rupak Majumdar, Goran Radanovic, and Adish Singla. 2022. Bayesian persuasion in sequential decision-making. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 5025--5033.
[29]
Sumegha Garg, Christopher Jung, Omer Reingold, and Aaron Roth. 2024. Oracle Efficient Online Multicalibration and Omniprediction. In ACM-SIAM Symposium on Discrete Algorithms.
[30]
Ira Globus-Harris, Declan Harrison, Michael Kearns, Aaron Roth, and Jessica Sorrell. 2023. Multicalibration as Boosting for Regression. In International Conference on Machine Learning, ICML 2023, 23--29 July 2023, Honolulu, Hawaii, USA (Proceedings of Machine Learning Research, Vol. 202), Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett (Eds.). PMLR, 11459--11492. https://proceedings.mlr.press/v202/globus-harris23a.html
[31]
Eyal Gofer and Yishay Mansour. 2016. Lower Bounds on Individual Sequence Regret. Mach. Learn. 103, 1 (apr 2016), 1--26.
[32]
Parikshit Gopalan, Lunjia Hu, Michael P Kim, Omer Reingold, and Udi Wieder. 2023a. Loss Minimization Through the Lens Of Outcome Indistinguishability. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Schloss Dagstuhl-Leibniz-Zentrum für Informatik.
[33]
Parikshit Gopalan, Adam Tauman Kalai, Omer Reingold, Vatsal Sharan, and Udi Wieder. 2022. Omnipredictors. In 13th Innovations in Theoretical Computer Science Conference, ITCS 2022, January 31 - February 3, 2022, Berkeley, CA, USA (LIPIcs, Vol. 215), Mark Braverman (Ed.). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 79:1--79:21.
[34]
Parikshit Gopalan, Michael P Kim, and Omer Reingold. 2023b. Characterizing notions of omniprediction via multicalibration. arXiv preprint arXiv:2302.06726 (2023).
[35]
Sanford J Grossman and Oliver D Hart. 1992. An analysis of the principal-agent problem. In Foundations of Insurance Economics: Readings in Economics and Finance. Springer, 302--340.
[36]
Nika Haghtalab, Thodoris Lykouris, Sloan Nietert, and Alexander Wei. 2022. Learning in stackelberg games with non-myopic agents. In Proceedings of the 23rd ACM Conference on Economics and Computation. 917--918.
[37]
Nika Haghtalab, Chara Podimata, and Kunhe Yang. 2023. Calibrated Stackelberg Games: Learning Optimal Commitments Against Calibrated Agents. arXiv:2306.02704 [cs.GT]
[38]
Ursula Hébert-Johnson, Michael Kim, Omer Reingold, and Guy Rothblum. 2018. Multicalibration: Calibration for the (computationally-identifiable) masses. In International Conference on Machine Learning. PMLR, 1939--1948.
[39]
Chien-Ju Ho, Aleksandrs Slivkins, and Jennifer Wortman Vaughan. 2014. Adaptive contract design for crowdsourcing markets: Bandit algorithms for repeated principal-agent problems. In Proceedings of the fifteenth ACM conference on Economics and computation. 359--376.
[40]
Bengt Holmström. 1979. Moral hazard and observability. The Bell journal of economics (1979), 74--91.
[41]
Bengt Holmstrom and Paul Milgrom. 1987. Aggregation and linearity in the provision of intertemporal incentives. Econometrica: Journal of the Econometric Society (1987), 303--328.
[42]
Sham M Kakade and Dean P Foster. 2008. Deterministic calibration and Nash equilibrium. J. Comput. System Sci. 74, 1 (2008), 115--130.
[43]
Emir Kamenica and Matthew Gentzkow. 2011. Bayesian persuasion. American Economic Review 101, 6 (2011), 2590--2615.
[44]
Yoav Kolumbus and Noam Nisan. 2022. How and why to manipulate your own agent: On the incentives of users of learning agents. Advances in Neural Information Processing Systems 35 (2022), 28080--28094.
[45]
Thodoris Lykouris, Vasilis Syrgkanis, and Éva Tardos. 2016. Learning and efficiency in games with dynamic population. In Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms. SIAM, 120--129.
[46]
Yishay Mansour, Mehryar Mohri, Jon Schneider, and Balasubramanian Sivan. 2022a. Strategizing against learners in bayesian games. In Conference on Learning Theory. PMLR, 5221--5252.
[47]
Yishay Mansour, Aleksandrs Slivkins, Vasilis Syrgkanis, and Zhiwei Steven Wu. 2022b. Bayesian exploration: Incentivizing exploration in Bayesian games. Operations Research 70, 2 (2022), 1105--1127.
[48]
Denis Nekipelov, Vasilis Syrgkanis, and Eva Tardos. 2015. Econometrics for learning agents. In Proceedings of the sixteenth acm conference on economics and computation. 1--18.
[49]
Georgy Noarov, Ramya Ramalingam, Aaron Roth, and Stephan Xie. 2023. High-Dimensional Prediction for Sequential Decision Making. arXiv preprint arXiv:2310.17651 (2023).
[50]
Aaron Roth. 2023. Uncertain: Modern Topics in Uncertainty Estimation.
[51]
Aaron Roth, Aleksandrs Slivkins, Jonathan Ullman, and Zhiwei Steven Wu. 2020. Multidimensional dynamic pricing for welfare maximization. ACM Transactions on Economics and Computation (TEAC) 8, 1 (2020), 1--35.
[52]
Aaron Roth, Jonathan Ullman, and Zhiwei Steven Wu. 2016. Watch and learn: Optimizing from revealed preferences feedback. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing. 949--962.
[53]
Tim Roughgarden. 2015. Intrinsic robustness of the price of anarchy. Journal of the ACM (JACM) 62, 5 (2015), 1--42.
[54]
Mark Sellke and Aleksandrs Slivkins. 2021. The price of incentivizing exploration: A characterization via thompson sampling and sample complexity. In Proceedings of the 22nd ACM Conference on Economics and Computation. 795--796.
[55]
Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I Jordan, and Haifeng Xu. 2022. Sequential information design: Markov persuasion process and its efficient reinforcement learning. arXiv preprint arXiv:2202.10678 (2022).
[56]
Shengjia Zhao, Michael Kim, Roshni Sahoo, Tengyu Ma, and Stefano Ermon. 2021. Calibrating predictions to decisions: A novel approach to multi-class calibration. Advances in Neural Information Processing Systems 34 (2021), 22313--22324.
[57]
Banghua Zhu, Stephen Bates, Zhuoran Yang, Yixin Wang, Jiantao Jiao, and Michael I Jordan. 2022. The sample complexity of online contract design. arXiv preprint arXiv:2211.05732 (2022).
[58]
You Zu, Krishnamurthy Iyer, and Haifeng Xu. 2021. Learning to persuade on the fly: Robustness against ignorance. In Proceedings of the 22nd ACM Conference on Economics and Computation. 927--928.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
EC '24: Proceedings of the 25th ACM Conference on Economics and Computation
July 2024
1340 pages
ISBN:9798400707049
DOI:10.1145/3670865
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 December 2024

Check for updates

Author Tags

  1. policy regret
  2. calibration
  3. contract theory
  4. bayesian persuasion

Qualifiers

  • Research-article

Funding Sources

Conference

EC '24
Sponsor:

Acceptance Rates

Overall Acceptance Rate 664 of 2,389 submissions, 28%

Upcoming Conference

EC '25
The 25th ACM Conference on Economics and Computation
July 7 - 11, 2025
Stanford , CA , USA

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 15
    Total Downloads
  • Downloads (Last 12 months)15
  • Downloads (Last 6 weeks)15
Reflects downloads up to 06 Jan 2025

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media