Research Article
DOI: 10.1145/3462244.3479932

Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents

Published: 18 October 2021

Abstract

Critical periods are phases during which a toddler’s brain develops in spurts. To promote children’s cognitive development, proper guidance during this stage is critical. However, it is not clear whether such a critical period also exists for the training of AI agents. Similar to human toddlers, well-timed guidance and multimodal interactions might significantly enhance the training efficiency of AI agents as well. To validate this hypothesis, we adapt the notion of critical periods to learning in AI agents and investigate critical periods in a virtual environment for AI agents. We formalize the critical period and toddler-guidance learning in the reinforcement learning (RL) framework, and build a toddler-like environment with the VECA toolkit to mimic the learning characteristics of human toddlers. We study three discrete levels of mutual interaction: weak mentor guidance (sparse reward), moderate mentor guidance (helper reward), and mentor demonstration (behavioral cloning). We also introduce the EAVE dataset, consisting of 30,000 real-world images, to fully reflect the toddler’s viewpoint. We evaluate the impact of critical periods on AI agents from two perspectives: how and when agents are best guided, in both uni- and multimodal learning. Our experimental results show that both uni- and multimodal agents improve noticeably when given moderate mentor guidance during a critical period at 1 million and 2 million training steps. We validate these results with transfer learning on the EAVE dataset and observe the same performance gains for the same critical period and guidance level.
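To make the three guidance levels concrete, the Python sketch below shows one plausible way they could map onto an RL training signal: a sparse environment reward only, a sparse reward plus a shaped helper bonus, or supervision from mentor demonstrations via a behavioral-cloning loss. This is a minimal illustration under assumed names (GuidanceLevel, guided_reward, bc_loss, and the 0.1 shaping coefficient are all hypothetical), not the authors' implementation.

    # Illustrative sketch of the three mentor-guidance regimes described in the
    # abstract; all identifiers and constants are assumptions, not the paper's code.
    import math
    from enum import Enum, auto

    class GuidanceLevel(Enum):
        WEAK = auto()      # weak mentor guidance: sparse environment reward only
        MODERATE = auto()  # moderate mentor guidance: sparse reward + shaped helper reward
        DEMO = auto()      # mentor demonstration: learn from mentor actions (behavioral cloning)

    def guided_reward(env_reward, dist_to_goal, prev_dist_to_goal, level):
        """Per-step training signal under a given guidance level (illustrative)."""
        if level is GuidanceLevel.WEAK:
            return env_reward                                  # sparse reward only
        if level is GuidanceLevel.MODERATE:
            helper = 0.1 * (prev_dist_to_goal - dist_to_goal)  # assumed shaping bonus for approaching the goal
            return env_reward + helper
        return env_reward  # DEMO: reward kept only for logging; learning uses bc_loss below

    def bc_loss(agent_action_logits, mentor_action):
        """Behavioral-cloning loss: cross-entropy between the agent's policy and the mentor's action."""
        exps = [math.exp(x) for x in agent_action_logits]
        total = sum(exps)
        return -math.log(exps[mentor_action] / total + 1e-12)

    # Example: guided_reward(0.0, 2.5, 3.0, GuidanceLevel.MODERATE) -> 0.05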

    Published In

    ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction
    October 2021
    876 pages
    ISBN:9781450384810
    DOI:10.1145/3462244
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 October 2021


    Author Tags

    1. Guidance
    2. Reinforcement Learning
    3. Toddler Object Learning
    4. Virtual Environment for Cognitive Agents

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICMI '21
    Sponsor: ICMI '21: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION
    October 18 - 22, 2021
    Montréal, QC, Canada

    Acceptance Rates

    Overall Acceptance Rate 453 of 1,080 submissions, 42%
