Abstract
In this paper, we propose MldrNet, a deep network that learns multi-level deep representations for image emotion classification. Image emotion can be recognized through image semantics, image aesthetics and low-level visual features, from both global and local views. Existing image emotion classification methods, whether based on hand-crafted or deep features, focus mainly on either low-level visual features or semantic-level image representations without taking all of these factors into account. The proposed MldrNet combines deep representations at different levels, i.e. image semantics, image aesthetics and low-level visual features, to effectively classify the emotion types of different kinds of images, such as abstract paintings and web images. Extensive experiments on both Internet images and abstract paintings demonstrate that the proposed method outperforms state-of-the-art methods using deep or hand-crafted features, improving overall classification accuracy by at least 6%.
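The core idea of the abstract, combining representations from several levels of a network before classification, can be illustrated with a minimal sketch. The three feature functions below are simplified stand-ins for the paper's semantic, aesthetic, and low-level branches (the real branches are learned convolutional sub-networks), and the eight-way linear classifier and all names are assumptions for illustration only:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical stand-ins for the three branches described in the abstract.
def semantic_features(img):
    # Placeholder for a deep, semantic-level representation.
    return img.mean(axis=(0, 1))                                   # shape (3,)

def aesthetic_features(img):
    # Placeholder for a mid-level, aesthetics-oriented representation.
    return img.std(axis=(0, 1))                                    # shape (3,)

def low_level_features(img):
    # Placeholder for low-level visual statistics (here: an intensity histogram).
    return np.histogram(img, bins=4, range=(0.0, 1.0))[0] / img.size  # shape (4,)

def classify_emotion(img, W, b):
    # Fuse the multi-level representations by concatenation,
    # then score the fused vector with a linear classifier.
    fused = np.concatenate([semantic_features(img),
                            aesthetic_features(img),
                            low_level_features(img)])              # shape (10,)
    return softmax(W @ fused + b)

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))        # toy RGB image in [0, 1]
W = rng.standard_normal((8, 10))     # 8 emotion categories assumed
b = np.zeros(8)
probs = classify_emotion(img, W, b)  # probability over emotion categories
```

In the actual network the fusion weights and all three branches are trained jointly end to end; the sketch only shows how concatenating features from different levels yields a single vector that one classifier can score.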
Notes
We have 88,298 noisy labeled images and 23,164 manually labeled images, as some images no longer exist on the Internet.
Cite this article
Rao, T., Li, X. & Xu, M. Learning Multi-level Deep Representations for Image Emotion Classification. Neural Process Lett 51, 2043–2061 (2020). https://doi.org/10.1007/s11063-019-10033-9