Abstract
Human social learning is an effective process that has inspired many existing machine learning approaches, such as learning from observation and learning by demonstration. In this paper, we introduce another form of social learning, learning from a casual conversation (LCC), a machine learning approach in which an artificially intelligent agent learns new information through an extended natural language dialog with a human. Our system enables the agent to add or change information in its knowledge base as a result of the human’s conversational text inputs. LCC seeks to close an important gap in the state of the art, which has focused on teaching computer agents how to perform specific tasks. Furthermore, LCC could provide an efficient way to enhance the knowledge base of certain types of systems without requiring the involvement of a programmer. LCC does not require the user to enter specific information; instead, the user can converse naturally with the agent. As part of its learning process, LCC identifies the text inputs from the conversing human that contain information worth learning and then determines whether each input is (1) heretofore unknown, in which case LCC learns it; (2) in agreement with what it already “knows”, in which case LCC ignores it; or (3) in conflict with what it “knows”, in which case LCC must resolve the conflict. LCC’s architecture consists of multiple sub-systems combined to perform the above tasks. Its learning component can add new information to the knowledge base, confirm existing information, and/or update existing information found to be related to the user input. The functionality of the LCC system was rigorously assessed with test statements of various difficulty levels. Furthermore, its acceptance by human users was evaluated by two separate groups of human test subjects: one group that interacted with the system, and a second group that evaluated the logs of the interactions of the first group. The collected results were all found to be acceptable and within the range of our expectations.
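The three outcomes named in the abstract (learn an unknown fact, ignore a known one, resolve a conflicting one) can be illustrated with a minimal sketch. It is not the LCC implementation: the dictionary knowledge base, the `learn` helper, the `Outcome` names, and the "newest input wins" conflict policy are all assumptions made for illustration.

```python
from enum import Enum


class Outcome(Enum):
    ADDED = "added"          # fact was previously unknown
    CONFIRMED = "confirmed"  # fact agrees with stored knowledge
    UPDATED = "updated"      # fact conflicted; stored value was replaced


def learn(knowledge_base, subject, attribute, value):
    """Classify an incoming (subject, attribute, value) fact and apply it.

    Hypothetical helper: letting the newest input win on conflict is an
    assumption for this sketch, not the resolution strategy used by LCC.
    """
    key = (subject.lower(), attribute.lower())
    if key not in knowledge_base:
        knowledge_base[key] = value
        return Outcome.ADDED
    if knowledge_base[key] == value:
        return Outcome.CONFIRMED
    knowledge_base[key] = value
    return Outcome.UPDATED


kb = {}
first = learn(kb, "Alice", "favorite color", "blue")    # unknown fact
second = learn(kb, "Alice", "favorite color", "blue")   # already known
third = learn(kb, "Alice", "favorite color", "green")   # conflicting fact
```

A real system would also need to decide which inputs are worth learning at all and to match paraphrased facts, neither of which exact dictionary lookup attempts.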
Availability of data and materials
All data used can be found in the appendices of the first author's full dissertation, which can be accessed from https://stars.library.ucf.edu
Code availability
Not applicable at the moment, but the authors plan to publish the code soon.
Notes
Behavioral cloning is a type of imitation learning in which the agent receives the states and actions of an expert demonstrator as training data; the learning agent then uses a supervised machine learning approach, such as a classifier, to replicate the demonstrator's policy (Torabi et al., 2018). It is very similar in nature to LfO and LfD.
TF-IDF stands for Term Frequency-Inverse Document Frequency, a numerical measure that reflects the importance of a word in a document or corpus.
SemEval is an ongoing series of evaluations of computational semantic analysis systems. The evaluations explore the nature of meaning in language, a task that is not as intuitive for machines as it is for humans (International workshop on semantic evaluation, 2015).
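As a hedged illustration of the TF-IDF measure defined in the notes above, the following sketch computes one common variant, with term frequency taken as count divided by document length and inverse document frequency as log(N / df). The `tf_idf` function and the toy documents are assumptions for illustration; other weighting and smoothing schemes exist, and this is not necessarily the variant used by the authors.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / df),
    one common variant among several.
    """
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]
scores = tf_idf(docs)
```

In this toy corpus, "the" occurs in every document and so receives a weight of zero, while a term confined to a single short document, such as "bird", receives a comparatively high weight.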
References
Anderson, J. R. (1996). ACT: A simple theory of complex cognition. American Psychologist, 51, 355.
Apté, C., Damerau, F., & Weiss, S. M. (1994). Automated learning of decision rules for text categorization. ACM Transactions on Information Systems (TOIS), 12, 233–251.
Chacón, A., Marco-Sola, S., Espinosa, A., Ribeca, P., & Moure, J. C. (2014). Thread-cooperative, bit-parallel computation of Levenshtein distance on GPU. In Proceedings of the 28th international conference on supercomputing (pp. 103–112).
Chang, Y.-W., Hsieh, C.-J., Chang, K.-W., Ringgaard, M., & Lin, C.-J. (2010). Training and testing low-degree polynomial data mappings via linear SVM. Journal of Machine Learning Research, 11, 1471.
ChatterBot: Machine learning, conversational dialog engine. (2019). Retrieved from, https://chatterbot.readthedocs.io/en/stable/
Chieu, H. L., & Ng, H. T. (2002). A maximum entropy approach to information extraction from semi-structured and free text. In Proceedings of the association for the advancement of artificial intelligence (AAAI), (vol. 2002, pp. 786–791).
Clark, H., & Schaefer, E. (1989). Contributing to discourse. Cognitive Science, 13, 259–294.
Cox, G. (2017). chatterbot.corpus.english.greetings. Retrieved from, https://github.com/gunthercox/chatterbot-corpus/blob/master/chatterbot_corpus/data/english/greetings.yml.
Cox, G. (2019). chatterbot.corpus.english.conversations. Retrieved from, https://github.com/gunthercox/chatterbot-corpus/blob/master/chatterbot_corpus/data/english/greetings.yml.
Dai, W., Xue, G.-R., Yang, Q., & Yu, Y. (2007). Transferring Naïve Bayes classifiers for text classification. In Proceedings of the association for the advancement of artificial intelligence (AAAI) (pp. 540–545).
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805
Dietterich, T. G. (2002). Ensemble learning. The handbook of brain theory and neural networks (vol. 2, pp. 110–125).
Dunford, R., Su, Q., & Tamang, E. (2014). The Pareto principle. The Plymouth Student Scientist, 7, 140–148.
Eggins, S., & Slade, D. (2004). Analysing casual conversation. Equinox Publishing Ltd. Cassell.
Feldman, A. (1959). Mannerisms of speech and gestures in everyday life. New York, NY: International Universities Press.
Finkel, J. R., Grenager, T., & Manning, C. (2005). Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd annual meeting on association for computational linguistics (pp. 363–370).
fuzzywuzzy. Retrieved from, https://github.com/seatgeek/fuzzywuzzy
Ganesan, K., Zhai, C., & Han, J. (2010). Opinosis: A graph-based approach to abstractive summarization of highly redundant opinions. In Proceedings of the 23rd international conference on computational linguistics (pp. 340–348).
Garfinkel, H. (1967). Studies in ethnomethodology. Prentice Hall.
Gilmartin, E., Saam, C., Vogel, C., Campbell, N., & Wade, V. (2018). Just talking-modelling casual conversation. In Proceedings of the 19th annual SIGdial meeting on discourse and dialogue (pp. 51–59).
Goldberg, Y., & Levy, O. (2014). word2vec Explained: Deriving Mikolov et al.’s negative-sampling word-embedding method. arXiv:1402.3722
Goldwasser, D., & Roth, D. (2011). Learning from natural instructions. In Proceedings of international joint conference on artificial intelligence (IJCAI).
International workshop on semantic evaluation (SemEval-2015). http://alt.qcri.org/semeval2015
Jaccard similarity measure. Retrieved from, https://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_score.html
Joseph, V. R., & Vakayil, A. (2021). SPlit: An optimal method for data splitting. Technometrics, 64, 166.
Kuhlmann, G., Stone, P., Mooney, R., & Shavlik, J. (2004). Guiding a reinforcement learner with natural language advice: Initial results in RoboCup soccer. In Proceedings of the association for the advancement of artificial intelligence (AAAI) workshop on supervisory control of learning and adaptive systems.
Li, J., Miller, A. H., Chopra, S., Ranzato, M., & Weston, J. (2016). Learning through dialogue interactions. arXiv:1612.04936
Liu, B., & Mazumder, S. (2021). Lifelong and continual learning dialogue systems: Learning during conversation. In Proceedings of AAAI.
Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective approaches to attention-based neural machine translation. arXiv:1508.04025
Mihalcea, R., Corley, C., & Strapparava, C. (2006). Corpus-based and knowledge-based measures of text semantic similarity. In Proceedings of the association for the advancement of artificial intelligence (AAAI) (pp. 775–780).
Mohammed, A. A. (2019). Machine learning from casual conversation. Doctoral Dissertation, Department of Computer Science, University of Central Florida Electronic Theses and Dissertations. 6297. Retrieved from, https://stars.library.ucf.edu/etd/6297
Naïve Bayes text classification. Retrieved from, https://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html
Nigam, K., Lafferty, J., & McCallum, A. (1999). Using maximum entropy for text classification. In Proceedings of international joint conference on artificial intelligence IJCAI-99 workshop on machine learning for information filtering (pp. 61–67).
Provost, F. J., & Fawcett, T. (1997). Analysis and visualization of classifier performance: Comparison under imprecise class and cost distributions. Knowledge Discovery and Data Mining (KDD) (pp. 43–48).
Random facts. Retrieved from, https://www.factslides.com
Rybski, P. E., Yoon, K., Stolarz, J., & Veloso, M. M. (2007). Interactive robot task training through dialog and demonstration. In Proceedings of the ACM/IEEE international conference on Human-robot interaction (pp. 49–56).
Sacks, H., Schegloff, E. A., & Jefferson, G. (1978). Studies in the organization of conversational interaction (pp. 696–735). Elsevier.
Sultan, M. A., Bethard, S., & Sumner, T. (2015). DLS@CU: Sentence similarity from word alignment and semantic vector composition. In Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015) (pp. 148–153).
Sultan, M. A., Bethard, S., & Sumner, T. (2014). Back to basics for monolingual alignment: Exploiting word similarity and contextual evidence. Transactions of the Association for Computational Linguistics, 2, 219–230.
Torabi, F., Warnell, G., & Stone, P. (2018). Behavioral cloning from observation. arXiv:1805.01954
Torrey, L., Walker, T., Shavlik, J., & Maclin, R. (2005). Using advice to transfer knowledge acquired in one reinforcement learning task to another. In The European conference on machine learning and principles and practice of knowledge discovery in databases (ECML-PKDD) (pp. 412–424).
Traum, D. R., & Hinkelman, E. A. (1992). Conversation acts in task-oriented spoken dialogue. Computational Intelligence, 8, 575–599.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998–6008).
Ventola, E. (1979). The structure of casual conversation in English. Journal of Pragmatics, 3, 267–298.
Weston, J. E. (2016). Dialog-based language learning. In Advances in neural information processing systems (pp. 829–837).
WordNet. Retrieved from, https://wordnet.princeton.edu/
Yujian, L., & Bo, L. (2007). A normalized Levenshtein distance metric. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 1091–1095.
Zhang, H., Yu, H., & Xu, W. (2017). Listen, interact and talk: Learning to speak via interaction. In NIPS workshop on visually-grounded interaction and language.
Funding
Work was supported indirectly through a teaching assistantship from the University of Central Florida for the first author.
Author information
Contributions
AMA: conceptualization; methodology; investigation; software development; data acquisition and curation; writing and editing. AJG: conceptualization; project administration; supervision; manuscript editing.
Ethics declarations
Conflict of interest
There are no conflicts of interest for any of the authors.
Ethical approval
The use of human test subjects and surveys was approved by the Institutional Review Board at the University of Central Florida (SBE-18-1418, dated 8/14/2018). The approval letter can be found in Appendix H of the first author's dissertation, which can be accessed from https://stars.library.ucf.edu/etd/6297/. The authors affirm that the submitted work is original and has not been published or submitted elsewhere.
Consent to participate
Not applicable as the authors did not use any identification data related to the participants in this research.
Consent for publication
Not applicable.
Additional information
Editor: Derek Greene.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mohammed Ali, A.E., Gonzalez, A.J. Machine learning from casual conversation. Mach Learn 112, 4789–4836 (2023). https://doi.org/10.1007/s10994-023-06383-0
DOI: https://doi.org/10.1007/s10994-023-06383-0