Abstract
Recent years have witnessed the flourishing of social media platforms (SMPs) such as Twitter, Facebook, and Sina Weibo. The rapid development of these SMPs has produced increasingly large-scale multimedia data, which has proved to have remarkable marketing value. There is an urgent need to classify these social media data into a specified list of entities of interest, such as brands, products, and events, in order to analyze their sales, popularity, or influence. This is a challenging task, however, because microblogs are short and conversational, their images and text are often incompatible, and their data are diverse. In this paper, we present a multi-modal microblog classification method in a multi-task learning framework. First, features of different modalities are extracted for each microblog: TF-IDF features for the text, and low-level visual features and high-level semantic features for the image. Then, multiple related classification tasks are learned simultaneously for each feature to increase the sample size for each task and improve prediction performance. Finally, the outputs of each feature are integrated by a Support Vector Machine that learns how to optimally combine and weight each feature. We evaluate the proposed method on Brand-Social-Net by classifying the 100 brands it contains. Experimental results demonstrate the superiority of the proposed method compared with state-of-the-art approaches.
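To make the text-feature step concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It assumes one common variant (term frequency normalized by document length, inverse document frequency as log(N / df)); the abstract does not state which weighting scheme the authors used, and the function name `tfidf` and the toy posts are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy microblogs, tokenized on whitespace.
posts = [
    "new phone camera is great".split(),
    "great coffee this morning".split(),
    "phone battery died again".split(),
]
weights = tfidf(posts)
```

Under this variant, a term that occurs in only one post (such as "camera") receives a higher weight than a term shared across posts (such as "great"), and a term present in every post receives a weight of zero.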
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61472103) and Key Program (No. 61133003). Sicheng Zhao was also supported by the Ph.D. Short-Term Overseas Visiting Scholar Program of Harbin Institute of Technology.
Cite this article
Zhao, S., Yao, H., Zhao, S. et al. Multi-modal microblog classification via multi-task learning. Multimed Tools Appl 75, 8921–8938 (2016). https://doi.org/10.1007/s11042-014-2342-2
DOI: https://doi.org/10.1007/s11042-014-2342-2