DOI: 10.1145/3591106.3592243 · ICMR Conference Proceedings · Research article

EMP: Emotion-guided Multi-modal Fusion and Contrastive Learning for Personality Traits Recognition

Published: 12 June 2023

Abstract

Multi-modal personality traits recognition aims to recognize personality traits precisely by exploiting information from multiple modalities, and has received increasing attention for its potential applications in human-computer interaction. Current methods largely fail to extract distinguishable features, remove noise, and align features across modalities, which substantially limits recognition accuracy. To address these issues, we propose an emotion-guided multi-modal fusion and contrastive learning framework for personality traits recognition. Specifically, we first use supervised contrastive learning to extract deeper and more distinguishable features from each modality. Then, exploiting the close correlation between emotions and personality, we employ an emotion-guided multi-modal fusion mechanism that suppresses noise and aligns features from different modalities. Finally, we use an auto-fusion structure to strengthen the interaction between modalities and extract the features essential for the final prediction. Extensive experiments on two benchmark datasets show that our method achieves state-of-the-art performance and robustness.
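The supervised contrastive objective mentioned in the abstract can be sketched as follows. This is a generic, minimal illustration of the SupCon loss (Khosla et al., 2020), not the authors' exact implementation; the helper names, toy embeddings, and temperature value are assumptions for illustration only.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def supcon_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss over one batch.

    embeddings: list of L2-normalised feature vectors, one per sample.
    labels: class label per sample; same-label pairs are treated as positives.
    tau: temperature scaling the similarities.
    """
    n = len(embeddings)
    total, count = 0.0, 0
    for i in range(n):
        # Positives: every other sample sharing sample i's label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Denominator runs over all other samples in the batch.
        denom = sum(math.exp(dot(embeddings[i], embeddings[a]) / tau)
                    for a in range(n) if a != i)
        loss_i = -sum(math.log(math.exp(dot(embeddings[i], embeddings[p]) / tau) / denom)
                      for p in positives) / len(positives)
        total += loss_i
        count += 1
    return total / count
```

Pulling same-label embeddings together and pushing different-label embeddings apart is what yields the "deeper and more distinguishable" per-modality features the abstract refers to: a batch whose labels agree with the embedding geometry incurs a much lower loss than one whose labels are shuffled.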

Supplemental Material

PDF File
Appendix (corrected version). The appendix is uploaded with the proper packaging; because the paper submission system was reset, the authors could not confirm whether the previously uploaded file still existed, so a new copy was uploaded.





      Published In

      ICMR '23: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval
      June 2023
      694 pages
      ISBN:9798400701788
      DOI:10.1145/3591106
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Author Tags

      1. Contrastive Learning
      2. Multi-modal Feature Fusion
      3. Multimedia Content Extraction and Analysis
      4. Personality Traits Recognition

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Data Availability

Appendix (corrected version). Because the paper submission system was reset, the authors could not confirm whether the previously uploaded appendix still existed, so a properly packaged copy was uploaded anew. https://dl.acm.org/doi/10.1145/3591106.3592243#Appendix.pdf

      Conference

      ICMR '23

      Acceptance Rates

      Overall Acceptance Rate 254 of 830 submissions, 31%


      Article Metrics

      • Downloads (Last 12 months)229
      • Downloads (Last 6 weeks)29
      Reflects downloads up to 12 Dec 2024


      Cited By

      • (2024)Active Learning with Task Adaptation Pre-training for Speech Emotion RecognitionJournal of Natural Language Processing10.5715/jnlp.31.82531:3(825-867)Online publication date: 2024
      • (2024)A Descriptive Basketball Highlight Dataset for Automatic Commentary GenerationProceedings of the 32nd ACM International Conference on Multimedia10.1145/3664647.3681178(10316-10325)Online publication date: 28-Oct-2024
      • (2024)Multimodal Graph-Based Audio-Visual Event LocalizationICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)10.1109/ICASSP48485.2024.10448223(7880-7884)Online publication date: 14-Apr-2024
      • (2024)PMTL: Personality-Aware Multitask Learning Model for Emotion Recognition in Conversation2024 6th International Conference on Communications, Information System and Computer Engineering (CISCE)10.1109/CISCE62493.2024.10653433(428-432)Online publication date: 10-May-2024
      • (2024)FINE-LMT: Fine-Grained Feature Learning for Multi-modal Machine TranslationPRICAI 2024: Trends in Artificial Intelligence10.1007/978-981-96-0119-6_32(334-346)Online publication date: 12-Nov-2024
      • (2024)Adaptive information fusion network for multi‐modal personality recognitionComputer Animation and Virtual Worlds10.1002/cav.226835:3Online publication date: 10-Jun-2024
      • (2023)After: Active Learning Based Fine-Tuning Framework for Speech Emotion Recognition2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)10.1109/ASRU57964.2023.10389652(1-8)Online publication date: 16-Dec-2023
