
Multiple Emotion Tagging for Multimedia Data by Exploiting High-Order Dependencies Among Emotions

Published: 01 December 2015

Abstract

In this paper, a novel approach to multiple-emotion tagging of multimedia data is proposed, which explicitly models higher-order relations among emotions. First, multimedia features are extracted from the data. Second, a traditional multi-label classifier is used to obtain measurements of the multi-emotion labels. Then, we propose a three-layer restricted Boltzmann machine (TRBM) model to capture the higher-order relations among emotion labels, as well as the relations between labels and measurements. Finally, the TRBM model is used to infer a sample's multi-emotion labels by combining the emotion measurements with the dependencies among emotions. Experimental results on four databases demonstrate that our method is more effective than both feature-driven methods and current model-based methods, which capture only pairwise relations among labels with a Bayesian network (BN). Furthermore, a comparison of BN models with the proposed TRBM model verifies that the patterns captured by the latent units of the TRBM contain not only all the dependencies captured by the BN but also many dependencies that the BN cannot capture.
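
To make the pipeline described in the abstract concrete, the sketch below illustrates the general idea of an energy-based model over binary emotion labels, per-emotion classifier measurements, and a hidden layer that captures higher-order label dependencies. This is only an illustration under simplifying assumptions (binary labels, hidden units marginalized in closed form, exhaustive enumeration of label configurations, and hypothetical parameter names such as W, U, b, c); it is not the authors' exact TRBM formulation, learning procedure, or inference algorithm.

```python
"""Illustrative sketch (not the paper's exact model): a small energy-based
model with a label layer y (binary emotion tags), a measurement layer m
(per-emotion classifier scores), and a hidden layer h that captures
higher-order dependencies among labels."""
import itertools
import numpy as np


class TRBMSketch:
    def __init__(self, n_labels, n_hidden, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        # Hypothetical parameters; real training would fit these to data.
        self.W = 0.1 * rng.standard_normal((n_labels, n_hidden))  # label-hidden weights
        self.U = np.ones(n_labels)   # coupling between measurements and labels
        self.b = np.zeros(n_labels)  # label biases
        self.c = np.zeros(n_hidden)  # hidden-unit biases

    def free_energy(self, y, m):
        """Negative unnormalized log-probability of labels y given measurements m,
        with the binary hidden units summed out via the standard RBM identity."""
        pull = -y @ (self.U * m) - y @ self.b
        hidden_terms = -np.sum(np.logaddexp(0.0, y @ self.W + self.c))
        return pull + hidden_terms

    def infer_labels(self, m):
        """Return the binary label vector with the lowest free energy, i.e. the
        most probable joint emotion tagging under this toy model."""
        n = len(self.b)
        best_y, best_f = None, np.inf
        for bits in itertools.product([0, 1], repeat=n):
            y = np.asarray(bits, dtype=float)
            f = self.free_energy(y, m)
            if f < best_f:
                best_y, best_f = y, f
        return best_y


if __name__ == "__main__":
    model = TRBMSketch(n_labels=4, n_hidden=8)
    measurements = np.array([0.9, 0.2, 0.7, 0.1])  # per-emotion classifier scores
    print(model.infer_labels(measurements))
```

Because multimedia emotion tag sets are typically small, enumerating all 2^n label configurations at inference time is tractable in this toy setting; the paper's actual learning and inference procedures are not reproduced here.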

Cited By

  • (2023) Beyond Word Embeddings: Heterogeneous Prior Knowledge Driven Multi-Label Image Classification. IEEE Transactions on Multimedia, 25, 4013–4025. DOI: 10.1109/TMM.2022.3171095. Online publication date: 1 Jan. 2023.
  • (2021) Joint Input and Output Space Learning for Multi-Label Image Classification. IEEE Transactions on Multimedia, 23, 1696–1707. DOI: 10.1109/TMM.2020.3002185. Online publication date: 1 Jan. 2021.
  • (2021) Capturing Emotion Distribution for Multimedia Emotion Tagging. IEEE Transactions on Affective Computing, 12(4), 821–831. DOI: 10.1109/TAFFC.2019.2900240. Online publication date: 1 Oct. 2021.
  • (2017) Inferring Emotional Tags From Social Images With User Demographics. IEEE Transactions on Multimedia, 19(7), 1670–1684. DOI: 10.1109/TMM.2017.2655881. Online publication date: 15 Jun. 2017.

Published In

IEEE Transactions on Multimedia, Volume 17, Issue 12
Dec. 2015
243 pages

Publisher

IEEE Press

Qualifiers

  • Research-article
