Abstract
Twitter's popularity makes it an attractive target for malicious users. Most previous approaches rely on account-based features, such as message similarity between tweets and the following-to-follower ratio, which spam accounts can easily manipulate. Spam collusion is a newer way to evade these detection mechanisms, so a more advanced mechanism is needed to identify collusion relations among spam accounts.
We examine spam campaigns, which spread spam tweets, and focus on tweets with high retweet counts. We build a message-passing graph from retweet relations, following relations, and retweet times, and then extract time-evolution features of the graph structure. A latent behavior indexing technique extracts the critical concepts for recognizing spam collusion. We collected 5 million tweets from May 14, 2014 to July 15, 2014, with ground truth labeled by domain experts. Our approach achieves 86% accuracy.
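The graph construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the sender of each retweet is attributed, so the rule used here (the most recent earlier poster that the retweeter follows, falling back to the original author) is an assumption, and the function names, the input layout, and the three per-window features are all hypothetical.

```python
import math


def build_message_passing_graph(source, retweets, follows):
    """Infer who passed the tweet to whom.

    source   -- (user, time) of the original tweet
    retweets -- list of (user, time) pairs
    follows  -- dict mapping a user to the set of users they follow
    Returns a time-ordered list of (sender, receiver, time) edges.
    ASSUMPTION: the sender is the most recent earlier poster that the
    retweeter follows; otherwise the original author.
    """
    posted = [source]  # accounts that have already posted the tweet
    edges = []
    for user, t in sorted(retweets, key=lambda r: r[1]):
        followed = follows.get(user, set())
        candidates = [(u, pt) for u, pt in posted if u in followed]
        if candidates:
            sender = max(candidates, key=lambda c: c[1])[0]
        else:
            sender = source[0]
        edges.append((sender, user, t))
        posted.append((user, t))
    return edges


def time_evolution_features(source, edges, window):
    """Cumulative graph statistics at the end of each time window.

    The features (node count, edge count, maximum cascade depth) are
    hypothetical stand-ins for the paper's time-evolution features.
    """
    root, t0 = source
    if not edges:
        return []
    depth = {root: 0}  # hop distance of each account from the author
    last = max(t for _, _, t in edges)
    n_windows = max(1, math.ceil((last - t0) / window))
    features = []
    idx = 0
    for w in range(1, n_windows + 1):
        cutoff = t0 + w * window
        # Add every edge whose retweet time falls before this cutoff.
        while idx < len(edges) and edges[idx][2] <= cutoff:
            sender, receiver, _ = edges[idx]
            depth[receiver] = depth[sender] + 1
            idx += 1
        features.append({
            "window": w,
            "nodes": len(depth),
            "edges": idx,
            "max_depth": max(depth.values()),
        })
    return features


# Toy data: times are seconds since the original tweet.
source = ("a", 0)
retweets = [("b", 10), ("c", 25), ("d", 70)]
follows = {"b": {"a"}, "c": {"a", "b"}, "d": {"c"}}

edges = build_message_passing_graph(source, retweets, follows)
feats = time_evolution_features(source, edges, window=30)
```

In this toy cascade each account retweets from the previous one, so the graph is a chain whose depth grows over the windows. The resulting per-window vectors are the kind of input a downstream step, such as the latent behavior indexing mentioned above, might consume; that step is not sketched here.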
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, PC., Lee, HM., Tyan, HR., Wu, JS., Wei, TE. (2014). Detecting Spam on Twitter via Message-Passing Based on Retweet-Relation. In: Cheng, SM., Day, MY. (eds) Technologies and Applications of Artificial Intelligence. TAAI 2014. Lecture Notes in Computer Science, vol 8916. Springer, Cham. https://doi.org/10.1007/978-3-319-13987-6_6
DOI: https://doi.org/10.1007/978-3-319-13987-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13986-9
Online ISBN: 978-3-319-13987-6
eBook Packages: Computer Science, Computer Science (R0)