Research Article
DOI: 10.1145/3462244.3479932

Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents

Published: 18 October 2021

Abstract

Critical periods are phases during which a toddler’s brain develops in spurts. To promote children’s cognitive development, proper guidance during this stage is critical. However, it is not clear whether such a critical period also exists for the training of AI agents. Similar to human toddlers, well-timed guidance and multimodal interactions might significantly enhance the training efficiency of AI agents as well. To validate this hypothesis, we adapt the notion of critical periods to learning in AI agents and investigate critical periods in a virtual environment for AI agents. We formalize the critical period and toddler-guidance learning in the reinforcement learning (RL) framework, and build a toddler-like environment with the VECA toolkit to mimic the learning characteristics of human toddlers. We study three discrete levels of mutual interaction: weak mentor guidance (sparse reward), moderate mentor guidance (helper reward), and mentor demonstration (behavioral cloning). We also introduce the EAVE dataset, consisting of 30,000 real-world images, to fully reflect the toddler’s viewpoint. We evaluate the impact of critical periods on AI agents from two perspectives: how and when agents are best guided, in both uni- and multimodal learning. Our experimental results show that both uni- and multimodal agents improve noticeably when given moderate mentor guidance during a critical period at 1 million and 2 million training steps. We validate these results with transfer learning on the EAVE dataset and observe the same performance gains for the same critical period and guidance level.
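To make the three guidance levels concrete, the Python sketch below shows one plausible way they could map onto an RL training signal: a sparse environment reward only, a sparse reward plus a shaped helper bonus, or supervision from mentor demonstrations via a behavioral-cloning loss. This is a minimal illustration under assumed names (GuidanceLevel, guided_reward, bc_loss, and the 0.1 shaping coefficient are all hypothetical), not the authors' implementation.

    # Illustrative sketch of the three mentor-guidance regimes described in the
    # abstract; all identifiers and constants are assumptions, not the paper's code.
    import math
    from enum import Enum, auto

    class GuidanceLevel(Enum):
        WEAK = auto()      # weak mentor guidance: sparse environment reward only
        MODERATE = auto()  # moderate mentor guidance: sparse reward + shaped helper reward
        DEMO = auto()      # mentor demonstration: learn from mentor actions (behavioral cloning)

    def guided_reward(env_reward, dist_to_goal, prev_dist_to_goal, level):
        """Per-step training signal under a given guidance level (illustrative)."""
        if level is GuidanceLevel.WEAK:
            return env_reward                                  # sparse reward only
        if level is GuidanceLevel.MODERATE:
            helper = 0.1 * (prev_dist_to_goal - dist_to_goal)  # assumed shaping bonus for approaching the goal
            return env_reward + helper
        return env_reward  # DEMO: reward kept only for logging; learning uses bc_loss below

    def bc_loss(agent_action_logits, mentor_action):
        """Behavioral-cloning loss: cross-entropy between the agent's policy and the mentor's action."""
        exps = [math.exp(x) for x in agent_action_logits]
        total = sum(exps)
        return -math.log(exps[mentor_action] / total + 1e-12)

    # Example: guided_reward(0.0, 2.5, 3.0, GuidanceLevel.MODERATE) -> 0.05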

    Published In

    ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction
    October 2021
    876 pages
    ISBN:9781450384810
    DOI:10.1145/3462244
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 October 2021


    Author Tags

    1. Guidance
    2. Reinforcement Learning
    3. Toddler Object Learning
    4. Virtual Environment for Cognitive Agents

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICMI '21
    Sponsor: ICMI '21: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION
    October 18 - 22, 2021
    Montréal, QC, Canada

    Acceptance Rates

    Overall Acceptance Rate 453 of 1,080 submissions, 42%
