Abstract
Spam emails are unsolicited, often deceptive messages sent or forwarded to individuals or companies; they may carry malware and can expose a recipient's confidential information. Much research has been done on spam detection, but most of it is limited to specific domains. Machine learning is generally used to classify an email as valid (ham) or unwanted (spam). Two feature sets, stopwords and word count, are introduced to determine whether an email is spam or ham on the basis of its textual information and the fields of the email file. The two feature sets are compared on Multinomial Naïve Bayes, Logistic Regression, Linear Support Vector Machine, and Artificial Neural Network algorithms to determine a more reliable method for spam detection. We evaluate the proposed work experimentally on benchmark datasets as well as in real-time evaluation. Detecting spam email on the basis of content, malware, and sender information can greatly reduce the threat to users' confidential information.
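The abstract does not say how the two feature sets are built, so the following is only a minimal sketch of the general idea: word-count features, with and without stopword filtering, fed to a small Multinomial Naïve Bayes classifier. The stopword list, the toy training messages, and the names `word_counts`, `MultinomialNB`, and `build_model` are all assumptions made for illustration and are not taken from the chapter. The other three classifiers named in the abstract are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration only.
STOPWORDS = {"a", "an", "the", "to", "of", "and", "is", "for", "your", "you"}


def word_counts(text, drop_stopwords=False):
    """Lowercase, split into alphabetic tokens, and count each word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if drop_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return Counter(tokens)


class MultinomialNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.n_docs = len(labels)
        self.class_docs = Counter(labels)
        # Total count of each word per class.
        self.word_totals = {c: Counter() for c in self.classes}
        for counts, label in zip(docs, labels):
            self.word_totals[label].update(counts)
        self.vocab = set()
        for totals in self.word_totals.values():
            self.vocab.update(totals)
        return self

    def predict(self, counts):
        best_label, best_score = None, -math.inf
        for c in self.classes:
            total = sum(self.word_totals[c].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_docs[c] / self.n_docs)
            for word, n in counts.items():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                p = (self.word_totals[c][word] + 1) / (total + len(self.vocab))
                score += n * math.log(p)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Toy messages invented for this sketch; not a benchmark dataset.
TRAIN = [
    ("win a free prize now", "spam"),
    ("claim your free money today", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]


def build_model(drop_stopwords):
    """Train on TRAIN using word counts, with or without stopwords."""
    docs = [word_counts(text, drop_stopwords) for text, _ in TRAIN]
    labels = [label for _, label in TRAIN]
    return MultinomialNB().fit(docs, labels)


if __name__ == "__main__":
    for drop in (False, True):
        model = build_model(drop)
        for text in ("free prize money", "project meeting report"):
            print(drop, text, "->", model.predict(word_counts(text, drop)))
```

On data this small the two feature variants agree; a meaningful comparison would need a real corpus and held-out evaluation, as the abstract describes.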
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sethi, M., Chandra, S., Chaudhary, V., Dahiya, Y. (2022). Spam Email Detection Using Machine Learning and Neural Networks. In: Shakya, S., Balas, V.E., Kamolphiwong, S., Du, KL. (eds) Sentimental Analysis and Deep Learning. Advances in Intelligent Systems and Computing, vol 1408. Springer, Singapore. https://doi.org/10.1007/978-981-16-5157-1_22
DOI: https://doi.org/10.1007/978-981-16-5157-1_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-5156-4
Online ISBN: 978-981-16-5157-1
eBook Packages: Intelligent Technologies and Robotics (R0)