Abstract
When an agent must make decisions that account not only for the environment but also for the minimizing actions of an opponent, as in games, it is essential that the agent can progressively build a profile of its adversary and use that profile to select appropriate actions. However, a decision-making system based solely on such a profile would be unsuitable, because it would deprive the agent of its “own identity” and leave it at the mercy of its opponent. This study therefore proposes an automatic Checkers player, called ACE-RL-Checkers, equipped with a dynamic decision-making module that adapts to the profile of the opponent over the course of the game. In this system, action selection combines a multilayer perceptron neural network with a case library. The neural network represents the “identity” of the agent, i.e., it is a previously trained, static decision-making module. The case library represents the dynamic decision-making module of the agent and is generated by the Automatic Case Elicitation technique. This technique has a pseudo-random exploratory behavior, so the dynamic decisions of the agent are directed either by the opponent’s game profile or by random choice. To avoid frequent pseudo-random decisions in the initial phases of the game, when the agent has very little information about its opponent, this work proposes a new module based on sequential pattern mining that generates a base of experience rules extracted from the game records of human experts. This module improves the agent’s move selection in the initial phases of the game.
Experiments carried out in tournaments between ACE-RL-Checkers and other related agents confirm the superiority of the dynamic architecture proposed herein.
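To make the composition described above more concrete, the following Python sketch shows one way the three knowledge sources could be consulted in turn. It is only an illustration under stated assumptions, not the authors' implementation: the function names, the string encoding of moves and states, the fallback order (experience rule, then case library, then static evaluator), and the prefix-counting stand-in for sequential pattern mining are all hypothetical, and the pseudo-random exploration of Automatic Case Elicitation is left out.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Move = str   # hypothetical encoding, e.g. "11-15"
State = str  # hypothetical board key


def mine_opening_rules(
    games: Sequence[Sequence[Move]], prefix_len: int, min_support: int
) -> Dict[Tuple[Move, ...], Move]:
    """Map each frequent opening prefix to its most common next move.

    A crude stand-in for sequential pattern mining: it only counts
    contiguous prefixes of the expert game records.
    """
    counts: Counter = Counter()
    for game in games:
        for k in range(min(prefix_len, len(game))):
            counts[(tuple(game[:k]), game[k])] += 1

    best: Dict[Tuple[Move, ...], Tuple[Move, int]] = {}
    for (prefix, nxt), n in counts.items():
        # Keep a continuation only if it is frequent enough and beats
        # the best continuation seen so far for the same prefix.
        if n >= min_support and n > best.get(prefix, ("", 0))[1]:
            best[prefix] = (nxt, n)
    return {prefix: move for prefix, (move, _) in best.items()}


def select_move(
    history: Sequence[Move],
    legal_moves: List[Move],
    state: State,
    rules: Dict[Tuple[Move, ...], Move],
    case_library: Dict[State, Dict[Move, float]],
    evaluate: Callable[[State, Move], float],
) -> Tuple[Move, str]:
    """Return the chosen move and the knowledge source that produced it."""
    # 1. Experience rule mined from expert records (opening phase).
    rule_move = rules.get(tuple(history))
    if rule_move in legal_moves:
        return rule_move, "rule"

    # 2. Case library: the best-rated stored move for this state, if any.
    cases = {m: r for m, r in case_library.get(state, {}).items()
             if m in legal_moves}
    if cases:
        return max(cases, key=cases.get), "case"

    # 3. Static evaluator, standing in for the trained neural network.
    return max(legal_moves, key=lambda m: evaluate(state, m)), "network"


expert_games = [
    ["11-15", "23-19", "8-11"],
    ["11-15", "23-19", "9-14"],
    ["11-15", "22-18", "15-22"],
]
opening_rules = mine_opening_rules(expert_games, prefix_len=2, min_support=2)
choice = select_move(["11-15"], ["23-19", "22-18"], "s0",
                     opening_rules, {}, lambda s, m: 0.0)
```

In this sketch the rule base takes priority only while the move history matches a mined prefix, so its influence fades naturally after the opening, which is where the abstract says the agent knows least about its opponent.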
Acknowledgements
The authors thank FAPEMIG (Brazil) for fellowships and financial support.
Cite this article
Neto, H.C., Julia, R.M.S. ACE-RL-Checkers: decision-making adaptability through integration of automatic case elicitation, reinforcement learning, and sequential pattern mining. Knowl Inf Syst 57, 603–634 (2018). https://doi.org/10.1007/s10115-018-1175-0