Abstract
The rising popularity of social networking sites has made community detection in such networks a major research interest. Graph-based community detection focuses primarily on the edges connecting the entities in the network. At the same time, these sites produce a large volume of data, much of it text. Document clustering methods use the textual properties of documents to group similar documents together while separating dissimilar ones. This paper treats text data collected from Twitter as a set of documents. The clusters produced by the document clustering methods are associated with the respective users, and these clusters are then compared with the communities detected in the graph representation of the network generated from the users and the relationships between them. NodeXL was used to collect data from Twitter, while Gephi was used to visualize the collected dataset. Different feature representations and clustering methods were applied to cluster the tweets (documents) and, in turn, the users associated with them.
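The comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how tweet clusters are mapped to users or which agreement measure is used, so the majority-vote assignment and the Rand index below are assumptions chosen for illustration, and the function names, user names, labels, and communities are hypothetical toy values rather than data from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


def user_clusters(tweet_users, tweet_labels):
    """Give each user the most common cluster label among their tweets.

    Assumption: majority vote; ties go to the label seen first.
    """
    votes = defaultdict(Counter)
    for user, label in zip(tweet_users, tweet_labels):
        votes[user][label] += 1
    return {user: counts.most_common(1)[0][0] for user, counts in votes.items()}


def rand_index(labels_a, labels_b):
    """Fraction of user pairs that both partitions treat the same way
    (both place the pair together, or both place it apart)."""
    users = sorted(set(labels_a) & set(labels_b))
    pairs = list(combinations(users, 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[u] == labels_a[v]) == (labels_b[u] == labels_b[v])
        for u, v in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: six tweets by four users, with made-up cluster labels.
tweet_users = ["ann", "ann", "bob", "cat", "cat", "dan"]
tweet_labels = [0, 0, 0, 1, 1, 1]
# Hypothetical graph communities for the same four users.
communities = {"ann": "A", "bob": "A", "cat": "B", "dan": "A"}

clusters = user_clusters(tweet_users, tweet_labels)
score = rand_index(clusters, communities)
print(clusters, score)
```

A score of 1.0 would mean the text-based clusters and the graph communities group every pair of users identically; lower values indicate that the two views of the network disagree on more pairs.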
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bhowmik, K., Ralescu, A. (2021). Taking a Close Look at Twitter Communities and Clusters. In: Simian, D., Stoica, L.F. (eds) Modelling and Development of Intelligent Systems. MDIS 2020. Communications in Computer and Information Science, vol 1341. Springer, Cham. https://doi.org/10.1007/978-3-030-68527-0_12
DOI: https://doi.org/10.1007/978-3-030-68527-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68526-3
Online ISBN: 978-3-030-68527-0
eBook Packages: Computer Science (R0)