DOI: 10.1145/3266302.3266316

AVEC 2018 Workshop and Challenge: Bipolar Disorder and Cross-Cultural Affect Recognition

Published: 15 October 2018

Abstract

The Audio/Visual Emotion Challenge and Workshop (AVEC 2018), "Bipolar Disorder and Cross-Cultural Affect Recognition", is the eighth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of various approaches to health and emotion recognition from real-life data. This paper presents the major novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline systems on the three proposed tasks: bipolar disorder classification, cross-cultural dimensional emotion recognition, and emotional label generation from individual ratings.
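
The evaluation protocol is not reproduced in this excerpt. For the dimensional emotion sub-challenges, the AVEC series has conventionally ranked systems by Lin's concordance correlation coefficient (CCC) between predicted and gold-standard affect traces, so the minimal Python sketch below is an illustrative assumption rather than the official scoring script; the function name and the toy signals are hypothetical.

    import numpy as np

    def concordance_correlation_coefficient(gold, pred):
        # Lin's CCC jointly penalises deviations in correlation, mean, and scale,
        # which is why it is favoured over Pearson's r for continuous affect traces.
        gold = np.asarray(gold, dtype=float)
        pred = np.asarray(pred, dtype=float)
        mean_g, mean_p = gold.mean(), pred.mean()
        var_g, var_p = gold.var(), pred.var()
        cov = np.mean((gold - mean_g) * (pred - mean_p))
        return 2.0 * cov / (var_g + var_p + (mean_g - mean_p) ** 2)

    # A prediction that tracks the gold trace but is rescaled and shifted
    # still correlates perfectly, yet its CCC drops below 1.0.
    t = np.linspace(0.0, 10.0, 500)
    gold_trace = np.sin(t)
    pred_trace = 0.8 * np.sin(t) + 0.1
    print(round(concordance_correlation_coefficient(gold_trace, pred_trace), 3))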

Published In

AVEC'18: Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop
October 2018
113 pages
ISBN: 9781450359832
DOI: 10.1145/3266302

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. affective computing
  2. bipolar disorder
  3. cross-cultural emotion

Qualifiers

  • Research-article

Conference

MM '18: ACM Multimedia Conference
October 22, 2018
Seoul, Republic of Korea

Acceptance Rates

AVEC '18 paper acceptance rate: 11 of 23 submissions (48%)
Overall acceptance rate: 52 of 98 submissions (53%)
