ICMI 2023: Paris, France
- Elisabeth André, Mohamed Chetouani, Dominique Vaufreydaz, Gale M. Lucas, Tanja Schultz, Louis-Philippe Morency, Alessandro Vinciarelli:
Proceedings of the 25th International Conference on Multimodal Interaction, ICMI 2023, Paris, France, October 9-13, 2023. ACM 2023
Keynote Talks
- Sophie K. Scott:
Multimodal information processing in communication: the nature of faces and voices. 1
- Maja J. Mataric:
A Robot Just for You: Multimodal Personalized Human-Robot Interaction and the Future of Work and Care. 2-3
- Simone Natale:
Projecting life onto machines. 4
Main Track - Long and Short Papers
- Apostolos Kalatzis, Saidur Rahman, Vishnunarayan Girishan Prabhu, Laura M. Stanley, Mike P. Wittie:
A Multimodal Approach to Investigate the Role of Cognitive Workload and User Interfaces in Human-robot Collaboration. 5-14
- Ehsan Yaghoubi, André Peter Kelm, Timo Gerkmann, Simone Frintrop:
Acoustic and Visual Knowledge Distillation for Contrastive Audio-Visual Localization. 15-23
- Jiangquan Li, Shimin Shan, Yu Liu, Kaiping Xu, Xiwen Hu, Mingcheng Xue:
AIUnet: Asymptotic inference with U2-Net for referring image segmentation. 24-32
- Ayane Tashiro, Mai Imamura, Shiro Kumano, Kazuhiro Otsuka:
Analyzing and Recognizing Interlocutors' Gaze Functions from Multimodal Nonverbal Cues. 33-41
- Mai Imamura, Ayane Tashiro, Shiro Kumano, Kazuhiro Otsuka:
Analyzing Synergetic Functional Spectrum from Head Movements and Facial Expressions in Conversations. 42-50
- Kaushal Sharma, Guillaume Chanel:
Annotations from speech and heart rate: impact on multimodal emotion recognition. 51-59
- Hendric Voß, Stefan Kopp:
AQ-GT: a Temporally Aligned and Quantized GRU-Transformer for Co-Speech Gesture Synthesis. 60-69
- Silvan Mertes, Marcel Strobl, Ruben Schlagowski, Elisabeth André:
ASMRcade: Interactive Audio Triggers for an Autonomous Sensory Meridian Response. 70-78
- Toshiharu Horiuchi, Shota Okubo, Tatsuya Kobayashi:
Augmented Immersive Viewing and Listening Experience Based on Arbitrarily Angled Interactive Audiovisual Representation. 79-83
- Zixuan Xiao, Michal Muszynski, Ricards Marcinkevics, Lukas Zimmerli, Adam Daniel Ivankay, Dario Kohlbrenner, Manuel Kuhn, Yves Nordmann, Ulrich Muehlner, Christian F. Clarenbach, Julia E. Vogt, Thomas Brunschwiler:
Breathing New Life into COPD Assessment: Multisensory Home-monitoring for Predicting Severity. 84-93
- Cristina Gena, Francesca Manini, Antonio Lieto, Alberto Lillo, Fabiana Vernero:
Can empathy affect the attribution of mental states to robots? 94-103
- Harshinee Sriram, Cristina Conati, Thalia Shoshana Field:
Classification of Alzheimer's Disease with Deep Learning on Eye-tracking Data. 104-113
- Jia Fu, Jiarui Tan, Wenjie Yin, Sepideh Pashami, Mårten Björkman:
Component attention network for multimodal dance improvisation recognition. 114-118
- Takeshi Saga, Hiroki Tanaka, Satoshi Nakamura:
Computational analyses of linguistic features with schizophrenic and autistic traits along with formal thought disorders. 119-124
- Marilou Beyeler, Yi Fei Cheng, Christian Holz:
Cross-Device Shortcuts: Seamless Attention-guided Content Transfer via Opportunistic Deep Links between Apps and Devices. 125-134
- Muneeb Imtiaz Ahmad, Abdullah Alzahrani:
Crucial Clues: Investigating Psychophysiological Behaviors for Measuring Trust in Human-Robot Interaction. 135-143
- Pepijn Van Aken, Merel M. Jung, Werner Liebregts, Itir Önal Ertugrul:
Deciphering Entrepreneurial Pitches: A Multimodal Deep Learning Approach to Predict Probability of Investment. 144-152
- Kayla Matheus, Ellie Mamantov, Marynel Vázquez, Brian Scassellati:
Deep Breathing Phase Classification with a Social Robot for Mental Health. 153-162
- Vishal Kiran Kuvar, Julia W. Y. Kam, Stephen Hutt, Caitlin Mills:
Detecting When the Mind Wanders Off Task in Real-time: An Overview and Systematic Review. 163-173
- Monisha Singh, Ximi Hoque, Donghuo Zeng, Yanan Wang, Kazushi Ikeda, Abhinav Dhall:
Do I Have Your Attention: A Large Scale Engagement Prediction Dataset and Baselines. 174-182
- Alexander Cao, Jean Utke, Diego Klabjan:
Early Classifying Multimodal Sequences. 183-189
- Dustin Pulver, Prithila Angkan, Paul Hungler, Ali Etemad:
EEG-based Cognitive Load Classification using Feature Masked Autoencoding and Emotion Transfer Learning. 190-197
- Metehan Doyran, Ronald Poppe, Albert Ali Salah:
Embracing Contact: Detecting Parent-Infant Interactions. 198-206
- Wei-Cheng Lin, Lucas Goncalves, Carlos Busso:
Enhancing Resilience to Missing Data in Audio-Text Emotion Recognition with Multi-Scale Chunk Regularization. 207-215
- Yurina Mizuho, Riku Kitamura, Yuta Sugiura:
Estimation of Violin Bow Pressure Using Photo-Reflective Sensors. 216-223
- Hanaë Rateau, Yosra Rekik, Edward Lank:
Ether-Mark: An Off-Screen Marking Menu For Mobile Devices. 224-233
- Bernardo Marques, Samuel S. Silva, Rafael Maio, João Alves, Carlos Ferreira, Paulo Dias, Beatriz Sousa Santos:
Evaluating Outside the Box: Lessons Learned on eXtended Reality Multi-modal Experiments Beyond the Laboratory. 234-242
- Melanie Heck, Jinhee Jeong, Christian Becker:
Evaluating the Potential of Caption Activation to Mitigate Confusion Inferred from Facial Gestures in Virtual Meetings. 243-252
- Leena Mathur, Maja J. Mataric, Louis-Philippe Morency:
Expanding the Role of Affective Phenomena in Multimodal Interaction Research. 253-260
- Monika Gahalawat, Raul Fernandez Rojas, Tanaya Guha, Ramanathan Subramanian, Roland Goecke:
Explainable Depression Detection via Head Motion Patterns. 261-270
- Amy Melniczuk, Egesa Vrapi:
Exploring Feedback Modality Designs to Improve Young Children's Collaborative Actions. 271-281
- Kazi Injamamul Haque, Zerrin Yumak:
FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning. 282-291
- Ayaka Ideno, Takuhiro Kaneko, Tatsuya Harada:
Frame-Level Event Representation Learning for Semantic-Level Generation and Editing of Avatar Motion. 292-300
- Saikat Chakraborty, Noble Thomas, Anup Nandy:
Gait Event Prediction of People with Cerebral Palsy using Feature Uncertainty: A Low-Cost Approach. 301-306
- Yingxue Gao, Huan Zhao, Yufeng Xiao, Zixing Zhang:
GCFormer: A Graph Convolutional Transformer for Speech Emotion Recognition. 307-313
- Yubin Kim, Dong Won Lee, Paul Pu Liang, Sharifa Alghowinem, Cynthia Breazeal, Hae Won Park:
HIINT: Historical, Intra- and Inter- personal Dynamics Modeling with Cross-person Memory Transformer. 314-325
- Yingbo Ma, Mehmet Celepkolu, Kristy Elizabeth Boyer, Collin F. Lynch, Eric N. Wiebe, Maya Israel:
How Noisy is Too Noisy? The Impact of Data Noise on Multimodal Recognition of Confusion and Conflict During Collaborative Learning. 326-335
- Shumpei Otsuchi, Koya Ito, Yoko Ishii, Ryo Ishii, Shinichirou Eitoku, Kazuhiro Otsuka:
Identifying Interlocutors' Behaviors and its Timings Involved with Impression Formation from Head-Movement Features and Linguistic Features. 336-344
- Mansi Sharma, Shuang Chen, Philipp Müller, Maurice Rekrut, Antonio Krüger:
Implicit Search Intent Recognition using EEG and Eye Tracking: Novel Dataset and Cross-User Prediction. 345-354
- Ruoqi Wang, Haifeng Zhang, Shaun Alexander Macdonald, Patrizia Di Campli San Vito:
Increasing Heart Rate and Anxiety Level with Vibrotactile and Audio Presentation of Fast Heartbeat. 355-363
- Louis Lafuma, Guillaume Bouyer, Olivier Goguel, Jean-Yves Pascal Didier:
Influence of hand representation on a grasping task in augmented reality. 364-372
- Cristina Luna Jiménez, Manuel Gil-Martín, Ricardo Kleinlein, Rubén San Segundo, Fernando Fernández Martínez:
Interpreting Sign Language Recognition using Transformers and MediaPipe Landmarks. 373-377
- Laura Birka Hensel, Nutchanon Yongsatianchot, Parisa Torshizi, Elena Minucci, Stacy Marsella:
Large language models in textual analysis for gesture selection. 378-387
- Yasheng Sun, Qianyi Wu, Hang Zhou, Kaisiyuan Wang, Tianshu Hu, Chen-Chieh Liao, Shio Miyafuji, Ziwei Liu, Hideki Koike:
Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation. 388-396
- Jicheng Li, Vuthea Chheang, Pinar Kullu, Eli Brignac, Zhang Guo, Anjana Bhat, Kenneth E. Barner, Roghayeh Leila Barmaki:
MMASD: A Multimodal Dataset for Autism Intervention Analysis. 397-405
- Trang Tran, Yufeng Yin, Leili Tavabi, Joannalyn Delacruz, Brian Borsari, Joshua D. Woolley, Stefan Scherer, Mohammad Soleymani:
Multimodal Analysis and Assessment of Therapist Empathy in Motivational Interviews. 406-415
- Abhishek Mandal, Suzanne Little, Susan Leavy:
Multimodal Bias: Assessing Gender Bias in Computer Vision Models with NLP Techniques. 416-424
- Paul Pu Liang, Yun Cheng, Ruslan Salakhutdinov, Louis-Philippe Morency:
Multimodal Fusion Interactions: A Study of Human and Automatic Quantification. 425-435
- Meng Chen Lee, Mai Trinh, Zhigang Deng:
Multimodal Turn Analysis and Prediction for Multi-party Conversations. 436-444
- Torsten Wörtwein, Nicholas B. Allen, Lisa B. Sheeber, Randy P. Auerbach, Jeffrey F. Cohn, Louis-Philippe Morency:
Neural Mixed Effects for Nonlinear Personalized Predictions. 445-454
- Siska Fitrianie, Iulia Lefter:
On Head Motion for Recognizing Aggression and Negative Affect during Speaking and Listening. 455-464
- Camille Sallaberry, Gwenn Englebienne, Jan B. F. van Erp, Vanessa Evers:
Out of Sight, ... How Asymmetry in Video-Conference Affects Social Interaction. 465-469
- Jack Fitzgerald, Ethan Seefried, James E. Yost, Sangmi Pallickara, Nathaniel Blanchard:
Paying Attention to Wildfire: Using U-Net with Attention Blocks on Multimodal Data for Next Day Prediction. 470-480
- Yekta Said Can, Elisabeth André:
Performance Exploration of RNN Variants for Recognizing Daily Life Stress Levels by Using Multimodal Physiological Signals. 481-487
- Kosmas Pinitas, David Renaudie, Mike Thomsen, Matthew Barthet, Konstantinos Makantasis, Antonios Liapis, Georgios N. Yannakakis:
Predicting Player Engagement in Tom Clancy's The Division 2: A Multimodal Approach via Pixels and Gamepad Actions. 488-497
- Zhanibek Rysbek, Ki Hwan Oh, Milos Zefran:
Recognizing Intent in Collaborative Manipulation. 498-506
- Daksitha Senel Withanage Don, Philipp Müller, Fabrizio Nunnari, Elisabeth André, Patrick Gebhard:
ReNeLiB: Real-time Neural Listening Behavior Generation for Socially Interactive Agents. 507-516
- Alexandria K. Vail, Jeffrey M. Girard, Lauren M. Bylsma, Jay Fournier, Holly A. Swartz, Jeffrey F. Cohn, Louis-Philippe Morency:
Representation Learning for Interpersonal and Multimodal Behavior Dynamics: A Multiview Extension of Latent Change Score Models. 517-526
- Maria Teresa Parreira, Sarah Gillet, Iolanda Leite:
Robot Duck Debugging: Can Attentive Listening Improve Problem Solving? 527-536
- Maneesh Bilalpur, Saurabh Hinduja, Laura A. Cariola, Lisa Sheeber, Nicholas B. Allen, Louis-Philippe Morency, Jeffrey F. Cohn:
SHAP-based Prediction of Mother's History of Depression to Understand the Influence on Child Behavior. 537-544
- G. S. Rajshekar Reddy, Lucca Eloy, Rachel Dickler, Jason G. Reitman, Samuel L. Pugh, Peter W. Foltz, Jamie C. Gorman, Julie L. Harrison, Leanne M. Hirshfield:
Synerg-eye-zing: Decoding Nonlinear Gaze Dynamics Underlying Successful Collaborations in Co-located Teams. 545-554
- Annika Dix, Clarissa Sabrina Arlinghaus, A. Marie Harkin, Sebastian Pannasch:
The Role of Audiovisual Feedback Delays and Bimodal Congruency for Visuomotor Performance in Human-Machine Interaction. 555-563
- Tan Gemicioglu, R. Michael Winters, Yu-Te Wang, Thomas M. Gable, Ivan J. Tashev:
TongueTap: Multimodal Tongue Gesture Recognition with Head-Worn Devices. 564-573
- Mojtaba Kolahdouzi, Ali Etemad:
Toward Fair Facial Expression Recognition with Improved Distribution Alignment. 574-583
- Kapotaksha Das, Mohamed Abouelenien, Mihai G. Burzo, John Elson, Kwaku O. Prakah-Asante, Clay Maranville:
Towards Autonomous Physiological Signal Extraction From Thermal Videos Using Deep Learning. 584-593
- Gauthier Robert Jean Faisandaz, Alix Goguey, Christophe Jouffrais, Laurence Nigay:
µGeT: Multimodal eyes-free text selection technique combining touch interaction and microgestures. 594-603
- Nathan Kammoun, Lakmal Meegahapola, Daniel Gatica-Perez:
Understanding the Social Context of Eating with Multimodal Smartphone Sensing: The Role of Country Diversity. 604-612
- Kaan Gönç, Baturay Saglam, Onat Dalmaz, Tolga Çukur, Suleyman Serdar Kozat, Hamdi Dibeklioglu:
User Feedback-based Online Learning for Intent Classification. 613-621
- Romina Abadi, Laurie M. Wilcox, Robert Scott Allison:
Using Augmented Reality to Assess the Role of Intuitive Physics in the Water-Level Task. 622-630
- Gizem Sogancioglu, Heysem Kaya, Albert Ali Salah:
Using Explainability for Bias Mitigation: A Case Study for Fair Recruitment Assessment. 631-639
- Emily Doherty, Cara A. Spencer, Lucca Eloy, Nitin Kumar, Rachel Dickler, Leanne M. Hirshfield:
Using Speech Patterns to Model the Dimensions of Teamness in Human-Agent Teams. 640-648
- Takao Obi, Kotaro Funakoshi:
Video-based Respiratory Waveform Estimation in Dialogue: A Novel Task and Dataset for Human-Machine Interaction. 649-660
- Hansi Liu, Hongsheng Lu, Kristin J. Dana, Marco Gruteser:
ViFi-Loc: Multi-modal Pedestrian Localization using GAN with Camera-Phone Correspondences. 661-669
- Vijay Kumar Singh, Pragma Kar, Ayush Madhan-Sohini, Madhav Rangaiah, Sandip Chakraborty, Mukulika Maity:
WiFiTuned: Monitoring Engagement in Online Participation by Harmonizing WiFi and Audio. 670-678
Blue Sky Papers
- Jingwei Liu:
A New Theory of Data Processing: Applying Artificial Intelligence to Cognition and Humanity. 679-683
- Radu-Daniel Vatavu:
From Natural to Non-Natural Interaction: Embracing Interaction Design Beyond the Accepted Convention of Natural. 684-688
- Amr Gomaa, Michael Feld:
Towards Adaptive User-centered Neuro-symbolic Learning for Multimodal Interaction with Autonomous Systems. 689-694
Doctoral Consortium
- Sushant Gautam:
Bridging Multimedia Modalities: Enhanced Multimodal AI Understanding and Intelligent Agents. 695-699
- Aswin Balasubramaniam:
Come Fl.. Run with Me: Understanding the Utilization of Drones to Support Recreational Runners' Well Being. 700-705
- Biswesh Mohapatra:
Conversational Grounding in Multimodal Dialog Systems. 706-710
- Antonius Bima Murti Wijaya:
Crowd Behaviour Prediction using Visual and Location Data in Super-Crowded Scenarios. 711-715
- Arnaud Allemang-Trivalle:
Enhancing Surgical Team Collaboration and Situation Awareness through Multimodal Sensing. 716-720
- Monika Gahalawat:
Explainable Depression Detection using Multimodal Behavioural Cues. 721-725
- Laurent P. Mertens:
Modeling Social Cognition and its Neurologic Deficits with Artificial Neural Networks. 726-730
- Cecilia Domingo:
Recording multimodal pair-programming dialogue for reference resolution by conversational agents. 731-735
- Luz Alejandra Magre, Shirley Coyle:
Smart Garments for Immersive Home Rehabilitation Using VR. 736-740
Grand Challenges: Emotion Recognition in the Wild Challenge (EmotiW23)
- Sunan Li, Hailun Lian, Cheng Lu, Yan Zhao, Chuangao Tang, Yuan Zong, Wenming Zheng:
Audio-Visual Group-based Emotion Recognition using Local and Global Feature Aggregation based Multi-Task Learning. 741-745
- Abhinav Dhall, Monisha Singh, Roland Goecke, Tom Gedeon, Donghuo Zeng, Yanan Wang, Kazushi Ikeda:
EmotiW 2023: Emotion Recognition in the Wild Challenge. 746-749
- Anderson Augusma, Dominique Vaufreydaz, Frédérique Letué:
Multimodal Group Emotion Recognition In-the-wild Using Privacy-Compliant Features. 750-754
Grand Challenges: The GENEA Challenge 2023: Full-Body Speech-Driven Gesture Generation in a Dyadic Setting
- Anna Deichler, Shivam Mehta, Simon Alexanderson, Jonas Beskow:
Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation. 755-762
- Leon Harz, Hendric Voß, Stefan Kopp:
FEIN-Z: Autoregressive Behavior Cloning for Speech-Driven Gesture Generation. 763-771
- Zeyu Zhao, Nan Gao, Zhi Zeng, Guixuan Zhang, Jie Liu, Shuwu Zhang:
Gesture Motion Graphs for Few-Shot Speech-Driven Gesture Reenactment. 772-778
- Sicheng Yang, Haiwei Xue, Zhensong Zhang, Minglei Li, Zhiyong Wu, Xiaofei Wu, Songcen Xu, Zonghong Dai:
The DiffuseStyleGesture+ entry to the GENEA Challenge 2023. 779-785
- Vladislav Korzun, Anna Beloborodova, Arkady Ilin:
The FineMotion entry to the GENEA Challenge 2023: DeepPhase for conversational gestures generation. 786-791
- Taras Kucherenko, Rajmund Nagy, Youngwoo Yoon, Jieyeon Woo, Teodor Nikolov, Mihail Tsakov, Gustav Eje Henter:
The GENEA Challenge 2023: A large-scale evaluation of gesture generation models in monadic and dyadic settings. 792-801
- Jonathan Windle, Iain A. Matthews, Ben Milner, Sarah Taylor:
The UEA Digital Humans entry to the GENEA Challenge 2023. 802-810
Workshop Summaries
- Heysem Kaya, Anouk Neerincx, Maryam Najafian, Saeid Safavi:
4th ICMI Workshop on Bridging Social Sciences and AI for Understanding Child Behaviour. 811-813
- Michal Muszynski, Theodoros Kostoulas, Leimin Tian, Edgar Roman-Rangel, Theodora Chaspari, Panos Amelidis:
4th International Workshop on Multimodal Affect and Aesthetic Experience. 814-815
- Hiroki Tanaka, Satoshi Nakamura, Jean-Claude Martin, Catherine Pelachaud:
4th Workshop on Social Affective Multimodal Interaction for Health (SAMIH). 816-817
- Eleonora Ceccaldi, Béatrice Biancardi, Sara Falcone, Silvia Ferrando, Geoffrey Gorisse, Thomas Janssoone, Anna Martin Coesel, Pierre Raimbaud:
ACE: how Artificial Character Embodiment shapes user behaviour in multi-modal interaction. 818-819
- Zakia Hammal, Steffen Walter, Nadia Berthouze:
Automated Assessment of Pain (AAP). 820-821
- Youngwoo Yoon, Taras Kucherenko, Jieyeon Woo, Pieter Wolfert, Rajmund Nagy, Gustav Eje Henter:
GENEA Workshop 2023: The 4th Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents. 822-823
- Fabio Catania, Tanya Talkar, Franca Garzotto, Benjamin R. Cowan, Thomas F. Quatieri, Satrajit S. Ghosh:
Multimodal Conversational Agents for People with Neurodevelopmental Disorders. 824-825
- Daniel C. Tozadore, Lise Aubin, Soizic Gauthier, Barbara Bruno, Salvatore Maria Anzalone:
Multimodal, Interactive Interfaces for Education. 826-827
- Bernd Dudzik, Tiffany Matej Hrkalovic, Dennis Küster, David St-Onge, Felix Putze, Laurence Devillers:
The 5th Workshop on Modeling Socio-Emotional and Cognitive Processes from Multimodal Data in the Wild (MSECP-Wild). 828-829