Abstract
This paper presents an evolutionary multi-objective optimization formulation of the anti-spam filtering problem that addresses both the classification quality criteria (False Positive and False Negative error rates) and the minimization of email message classification time. This approach is compared with single-objective problem formulations found in the literature, and its advantages for decision support and flexible, adaptive anti-spam filter configuration are demonstrated. A study is performed using the Wirebrush4SPAM anti-spam filtering framework and the SpamAssassin email dataset. The NSGA-II evolutionary multi-objective optimization algorithm was applied to validate and demonstrate this novel approach to the anti-spam filtering optimization problem. The experimental results show that this optimization strategy allows the decision maker (the anti-spam filtering system administrator) to select among a set of optimal and flexible filter configuration alternatives with respect to classification quality and classification efficiency.
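As a rough illustration of what "a set of optimal filter configuration alternatives" means under the three objectives named above, the sketch below keeps only the non-dominated (Pareto-optimal) candidates. It is not the paper's implementation: the `FilterConfig` type, the candidate names and all numeric values are hypothetical, chosen only to show the dominance relation on which NSGA-II's non-dominated sorting is built.

```python
from typing import List, NamedTuple, Tuple


class FilterConfig(NamedTuple):
    """Hypothetical filter configuration with its three measured objectives."""
    name: str
    fp_rate: float      # fraction of legitimate mail classified as spam
    fn_rate: float      # fraction of spam classified as legitimate
    avg_time_ms: float  # mean classification time per message


def objectives(cfg: FilterConfig) -> Tuple[float, float, float]:
    """All three objectives are minimized."""
    return (cfg.fp_rate, cfg.fn_rate, cfg.avg_time_ms)


def dominates(a: FilterConfig, b: FilterConfig) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one."""
    oa, ob = objectives(a), objectives(b)
    no_worse = all(x <= y for x, y in zip(oa, ob))
    strictly_better = any(x < y for x, y in zip(oa, ob))
    return no_worse and strictly_better


def pareto_front(configs: List[FilterConfig]) -> List[FilterConfig]:
    """Keep the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up candidates, for illustration only.
candidates = [
    FilterConfig("A", 0.010, 0.080, 12.0),
    FilterConfig("B", 0.020, 0.050, 9.0),
    FilterConfig("C", 0.020, 0.090, 15.0),  # worse than A on all three objectives
    FilterConfig("D", 0.005, 0.120, 20.0),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy data set, A, B and D each win on at least one objective, so the administrator would choose among them according to local priorities, whereas a single-objective formulation would return only one of them.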
Acknowledgements
The SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) of the University of Vigo for hosting its IT infrastructure.
Funding: This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia) and FEDER (European Union). This work was partially supported by the project "Platform of integration of intelligent techniques for analysis of biomedical information" (TIN2013-47153-C3-3-R) from the Spanish Ministry of Economy and Competitiveness.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Ruano-Ordás, D., Basto-Fernandes, V., Yevseyeva, I., Méndez, J.R. (2017). Evolutionary Multi-objective Scheduling for Anti-Spam Filtering Throughput Optimization. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science, vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_12
DOI: https://doi.org/10.1007/978-3-319-59650-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59649-5
Online ISBN: 978-3-319-59650-1
eBook Packages: Computer Science (R0)