
When autonomous agents model other agents: An appeal for altered judgment coupled with mouths, ears, and a little more tape

Published: 01 March 2020

Abstract

Agent modeling has rightfully garnered much attention in the design and study of autonomous agents that interact with other agents. However, despite substantial progress to date, existing agent-modeling methods too often (a) have unrealistic computational requirements and data needs; (b) fail to generalize properly across environments, tasks, and associates; and (c) guide behavior toward inefficient (myopic) solutions. Can these challenges be overcome? Or are they simply inherent to a very complex problem? In this reflection, I argue that some of these challenges may be reduced by, first, modeling different processes from those targeted by existing algorithms and, second, considering more deeply the role of non-binding communication signals. Additionally, progress in developing autonomous agents that effectively interact with other agents will be enhanced as we develop and utilize a more comprehensive set of measurement tools and benchmarks. I believe that further development of these areas is critical to creating autonomous agents that effectively model and interact with other agents.



        Published In

        Artificial Intelligence, Volume 280, Issue C, March 2020, 159 pages

        Publisher

        Elsevier Science Publishers Ltd.

        United Kingdom


        Author Tags

        1. Agent modeling
        2. Autonomous agents
        3. Multi-agent systems

        Qualifiers

        • Research-article
