[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Ant colony optimization edge selection for support vector machine speed optimization

Published: 01 August 2020 Publication History

Abstract

Support vector machine (SVM) is a widely used and reliable machine learning algorithm. It has been successfully applied to many real-world problems, with remarkable results. However, it has also been observed that SVM computational complexity increases with the increase in data size. Although many SVM speed optimization techniques have been proposed in the literature, there is still need for further improvement on the performance speed and accuracy of this algorithm. In this paper, a boundary detection algorithm for SVM speed optimization called ant colony optimization instance selection algorithm (ACOISA) is proposed. ACOISA is inspired by edge selection in ant colony optimization (ACO) algorithm, and it performs two primary functions: boundary detection and boundary instance selection. In the algorithm, ACO is used for boundary detection and k-nearest neighbor algorithm is used for boundary instance selection. Different sets of experiments are carried out to validate the efficiency of the proposed technique. All the experiments were performed on 35 datasets containing three well-known e-fraud types (credit card fraud, email spam and phishing email) and 31 other datasets available at UCI data repository. The experimental results showed that the proposed technique efficiently improved SVM training speed in 100% of the datasets used for evaluation, without significantly affecting SVM classification quality. Furthermore, the Freidman’s and Holm’s post hoc tests were conducted to statistically validate the credibility of the results. The statistical test results revealed that the proposed technique is significantly faster, compared to the standard SVM and some existing instance selection techniques, in all cases.

References

[1]
Arbatskaya MN, Mukhopadhaya K, Rasmusen EB (2006) The parking lot problem (December 19, 2006). https://ssrn.com/abstract=571101 or https://doi.org/10.2139/ssrn.571101
[2]
Adewumi AO and Ali MMA multi-level genetic algorithm for a multi-stage space allocation problemMath Comput Model2010511109-12625711641190.90083
[3]
Chen J, Zhang C, Xue X, and Liu C-L Fast instance selection for speeding up support vector machines Knowl Based Syst 2013 45 1-7
[4]
Chapelle OTraining a support vector machine in the primalNeural Comput20071951155-117823092671123.68101
[5]
Panda N, Chang EY, Wu G (2006) Concept boundary detection for speeding up SVMs. In: Proceedings of the 23rd international conference on machine learning, pp 681–688
[6]
Martens D, Baesens B, and Fawcett TEditorial survey: swarm intelligence for data miningMach Learn20118211-423108186
[7]
Wilson DR and Martinez TRReduction techniques for instance-based learning algorithmsMach Learn2000383257-2860954.68126
[8]
Tian J, Yu W, Xie S (2008) An ant colony optimization algorithm for image edge detection. In: 2008 IEEE congress on evolutionary computation (IEEE world congress on computational intelligence), pp 751–756
[9]
Nayak M and Dash P Edge detection improvement by ant colony optimization compared to traditional methods on brain MRI image Commun Appl Electron (CAE) 2016 5 8 19-23
[10]
Gautam A and Biswas M Edge detection technique using ACO with PSO for noisy image Recent developments in machine learning and data analytics 2019 Singapore Springer 383-396
[11]
Olvera-López JA, Carrasco-Ochoa JA, Martínez-Trinidad JF, and Kittler J A review of instance selection methods Artif Intell Rev 2010 34 2 133-143
[12]
Brighton H and Mellish CAdvances in instance selection for instance-based learning algorithmsData Min Knowl Discov200262153-17219179351027.68673
[13]
Hart P The condensed nearest neighbor rule (Corresp.) IEEE Trans Inf Theory 1968 14 3 515-516
[14]
Angiulli F Fast nearest neighbor condensation for large data sets classification IEEE Trans Knowl Data Eng 2007 19 11 1450-1464
[15]
Ritter G, Woodruff H, Lowry S, and Isenhour TAn algorithm for a selective nearest neighbor decision rule (Corresp.)IEEE Trans Inf Theory1975216665-6690323.68023
[16]
Chien-Hsing C, Bo-Han K, Fu C (2006) The generalized condensed nearest neighbor rule as a data reduction method. In: 18th international conference on pattern recognition (ICPR’06), pp 556–559
[17]
Wilson DLAsymptotic properties of nearest neighbor rules using edited dataIEEE Trans Syst Man Cybernet197223408-4213291390276.62060
[18]
Tomek IAn experiment with the edited nearest-neighbor ruleIEEE Trans Syst Man Cybernet197666448-4524288530332.68081
[19]
Devijver PA (1980) On the edited nearest neighbor rule. In: Proceedings of 5th international conference on pattern recognition
[20]
Vázquez F, Sánchez JS, Pla F (2005) A stochastic approach to Wilson’s editing algorithm. In: Iberian conference on pattern recognition and image analysis, pp 35–42
[21]
Aha DW, Kibler D, and Albert MK Instance-based learning algorithms Mach Learn 1991 6 1 37-66
[22]
Zhao K-P, Zhou S-G, Guan J-H, Zhou A-Y (2003) C-pruner: an improved instance pruning algorithm. In: Proceedings of the 2003 international conference on machine learning and cybernetics (IEEE Cat. No. 03EX693), pp 94–99
[23]
Li Y, Hu Z, Cai Y, Zhang W (2005) Support vector based prototype selection method for nearest neighbor rules. In: International conference on natural computation, pp 528–535
[24]
Srisawat A, Phienthrakul T, Kijsirikul B (2006) SV-kNNC: an algorithm for improving the efficiency of k-nearest neighbor. In: Pacific rim international conference on artificial intelligence, pp 975–979
[25]
Kuncheva LI Editing for the k-nearest neighbors rule by a genetic algorithm Pattern Recognit Lett 1995 16 8 809-814
[26]
Kuncheva LI and Bezdek JC “Nearest prototype classification: clustering, genetic algorithms, or random search? IEEE Trans Syst Man Cybernet Part C (Appl Rev) 1998 28 1 160-164
[27]
Cano JR, Herrera F, and Lozano M Using evolutionary algorithms as instance selection for data reduction in KDD: an experimental study IEEE Trans Evol Comput 2003 7 6 561-575
[28]
García S, Cano JR, and Herrera FA memetic algorithm for evolutionary prototype selection: a scaling up approachPattern Recognit20084182693-27091149.68411
[29]
Garain UPrototype reduction using an artificial immune modelPattern Anal Appl2008113353-3632430377
[30]
Cerveron V and Ferri FJ “Another move toward the minimum consistent subset: a tabu search approach to the condensed nearest neighbor rule IEEE Trans Syst Man Cybernet Part B (Cybernet) 2001 31 3 408-413
[31]
Zhang H and Sun GOptimal reference subset selection for nearest neighbor classification by tabu searchPattern Recognit20023571481-14901016.68089
[32]
Olvera-López JA, Carrasco-Ochoa JA, Martínez-Trinidad JF (2005) Sequential search for decremental edition. In: International conference on intelligent data engineering and automated learning, pp 280–285
[33]
Pudil P, Ferri FJ, Novovicova J, Kittler J (1994) Floating search methods for feature selection with nonmonotonic criterion functions. In: Proceedings of the 12th IAPR international conference on pattern recognition, Vol. 3-conference c: signal processing (Cat. No. 94CH3440-5), pp 279–283
[34]
Olvera-López JA, Martínez-Trinidad JF, Carrasco-Ochoa JA (2007) Restricted sequential floating search applied to object selection. In: International workshop on machine learning and data mining in pattern recognition, pp 694–702
[35]
Riquelme JC, Aguilar-Ruiz JS, and Toro M Finding representative patterns with ordered projections Pattern Recognit 2003 36 4 1009-1018
[36]
Raicharoen T and Lursinsap C A divide-and-conquer approach to the pairwise opposite class-nearest neighbor (POC-NN) algorithm Pattern Recognit Lett 2005 26 10 1554-1567
[37]
Narayan BL, Murthy CA, and Pal SK Maxdiff kd-trees for data condensation Pattern Recognit Lett 2006 27 3 187-200
[38]
Caises Y, González A, Leyva E, Pérez R (2009) SCIS: combining instance selection methods to increase their effectiveness over a wide range of domains. In: International conference on intelligent data engineering and automated learning, pp 17–24
[39]
Spillmann B, Neuhaus M, Bunke H, Pękalska E, Duin RP (2006) Transforming strings to vector spaces using prototype selection. In: Joint IAPR international workshops on statistical techniques in pattern recognition (SPR) and structural and syntactic pattern recognition (SSPR), pp 287–296
[40]
Mollineda RA, Ferri FJ, and Vidal EAn efficient prototype merging strategy for the condensed 1-NN rule through class-conditional hierarchical clusteringPattern Recognit200235122771-27821007.68944
[41]
Veenman CJ and Reinders MJ The nearest subclass classifier: a compromise between the nearest mean and nearest neighbor classifier IEEE Trans Pattern Anal Mach Intell 2005 27 9 1417-1429
[42]
Lumini A and Nanni LA clustering method for automatic biometric template selectionPattern Recognit2006393495-4971122.68547
[43]
Paredes R, Vidal E (2000) Weighting prototypes-a new editing approach. In: Proceedings 15th international conference on pattern recognition. ICPR-2000, pp 25–28
[44]
Olvera-López JA, Carrasco-Ochoa JA, Martínez-Trinidad JF (2008) Prototype selection via prototype relevance. In: Iberoamerican congress on pattern recognition, pp 153–160
[45]
Leyva E, González A, and Pérez R Three new instance selection methods based on local sets: a comparative study with several approaches from a bi-objective perspective Pattern Recognit 2015 48 4 1523-1537
[46]
Carbonera JL, Abel M (2015) A density-based approach for instance selection. In: 2015 IEEE 27th international conference on tools with artificial intelligence (ICTAI), Italy, pp 768–774
[47]
Carbonera JL An efficient approach for instance selection International conference on big data analytics and knowledge discovery 2017 Cham Springer
[48]
Carbonera JL, Abel M (2018) Efficient instance selection based on spatial abstraction. In: 2018 IEEE 30th international conference on tools with artificial intelligence (ICTAI), pp 286–292
[49]
Rathee S, Ratnoo S, and Ahuja J Instance selection using multi-objective CHC evolutionary algorithm Information and communication technology for competitive strategies 2019 Springer Springer 475-484
[50]
Anwar IM, Salama KM, Abdelbar AM (2015) ADR-miner: an ant-based data reduction algorithm for classification. In: 2015 IEEE congress on evolutionary computation (CEC), pp 515–521
[51]
Tsai C-F and Cheng K-C Simple instance selection for bankruptcy prediction Knowl-Based Syst 2012 27 333-342
[52]
Koggalage R and Halgamuge S Reducing the number of training samples for fast support vector machine classification Neural Inf Process Lett Rev 2004 2 3 57-65
[53]
Dorigo M (1992) Optimization, learning and natural algorithms. Ph.D. Thesis, Politecnico di Milano, Italy
[54]
Dorigo M and Blum CAnt colony optimization theory: a surveyTheor Comput Sci20053442243-27821788551154.90626
[55]
Katiyar S and Ansari AQ Ant colony optimization: a tutorial review MR Int J Eng Technol 2015 7 2 35-41
[56]
Dorigo M, Birattari M, and Stutzle T Ant colony optimization IEEE Comput Intell Mag 2006 1 4 28-39
[57]
Shekhawat A, Poddar P, and Boswal D Ant colony optimization algorithms: introduction and beyond 2009 Mumbai Department of Computer Science and Engineering, Indian Institute of Tecnology Bombay
[58]
Dorigo M (1992) Optimization, learning and natural algorithms. Ph.D. Thesis, Politecnico di Milano, Italy
[59]
Dorigo M and Gambardella LM Ant colony system: a cooperative learning approach to the traveling salesman problem IEEE Trans Evol Comput 1997 1 1 53-66
[60]
Stutzle T, Hoos H (1997) MAX-MIN ant system and local search for the traveling salesman problem. In: IEEE international conference on evolutionary computation, pp 309–314
[61]
Gutjahr WJFirst steps to the runtime complexity analysis of ant colony optimizationComput Oper Res20083592711-272725863951144.90519
[62]
Neumann F, Sudholt D, and Witt C Computational complexity of ant colony optimization and its hybridization with local search Innovations in swarm intelligence 2009 Berlin, Heidelberg Springer 91-120
[63]
Brighton H and Mellish CAdvances in instance selection for instance-based learning algorithmsData Min Knowl Discov200262153-17219179351027.68673
[64]
Garcı S, Triguero I, Carmona CJ, and Herrera F Evolutionary-based selection of generalized instances for imbalanced classification Knowl Based Syst 2012 25 1 3-12
[65]
Angiulli F and Astorino A Scaling up support vector machines using nearest neighbor condensation IEEE Trans Neural Netw 2010 21 2 351-357
[66]
Angiulli F Fast nearest neighbor condensation for large data sets classification IEEE Trans Knowl Data Eng 2007 19 11 1450-1464
[67]
Al-Yaseen WL, Othman ZA, and Nazri MZA Multi-level hybrid support vector machine and extreme learning machine based on modified K-means for intrusion detection system Expert Syst Appl 2017 67 296-303
[68]
Schlag S, Schmitt M, Schulz C (2019) Faster support vector machines. In: 2019 Proceedings of the twenty-first workshop on algorithm engineering and experiments (ALENEX), pp 199–210
[69]
Akinyelu AA and Adewumi AO On the performance of cuckoo search and bat algorithms based instance selection techniques for SVM speed optimization with application to e-Fraud detection KSII Trans Internet Inf Syst 2018 12 3 1348-1375
[70]
Arora S and Anand P Binary butterfly optimization approaches for feature selection Expert Syst Appl 2019 116 147-160
[71]
Agrawal S, Singh B, Kumar R, and Dey N Dey N, Ashour AS, Fong SJ, and Borra S Chapter 9—Machine learning for medical diagnosis: a neural network classifier optimized via the directed bee colony optimization algorithm U-healthcare monitoring systems 2019 Cambridge Academic Press 197-215
[72]
Majhi SK and Mahapatra P Classification of phishing websites using moth-flame optimized neural network Emerging technologies in data mining and information security 2019 Singapore Springer 39-48
[73]
Mafarja M, Aljarah I, Faris H, Hammouri AI, Al-Zoubi AM, and Mirjalili S Binary grasshopper optimisation algorithm approaches for feature selection problems Expert Syst Appl 2019 117 267-286
[74]
Kumar L and Bharti KK An improved BPSO algorithm for feature selection Recent trends in communication, computing, and electronics 2019 Singapore Springer 505-513
[75]
Zendehboudi A, Baseer M, and Saidur R Application of support vector machine models for forecasting solar and wind energy resources: a review J Clean Prod 2018 199 272-285
[76]
Zhang X, Mei C, Chen D, and Yang Y A fuzzy rough set-based feature selection method using representative instances Knowl Based Syst 2018 151 216-229
[77]
Zawbaa HM, Emary E, Grosan C, and Snasel V Large-dimensionality small-instance set feature selection: a hybrid bio-inspired heuristic approach Swarm Evolut Comput 2018 42 29-42
[78]
Alijla BO, Lim CP, Wong L-P, Khader AT, and Al-Betar MA An ensemble of intelligent water drop algorithm for feature selection optimization problem Appl Soft Comput 2018 65 531-541
[79]
Cervantes J, Garcia-Lamont F, Rodriguez L, López A, Castilla JR, and Trueba A PSO-based method for SVM classification on skewed data sets Neurocomputing 2017 228 187-197
[80]
Shunmugapriya P and Kanmani S A hybrid algorithm using ant and bee colony optimization for feature selection and classification (AC-ABC Hybrid) Swarm Evolut Comput 2017 36 27-36
[81]
Deb K, Pratap A, Agarwal S, and Meyarivan T A fast and elitist multiobjective genetic algorithm: NSGA-II IEEE Trans Evolut Comput 2002 6 2 182-197
[82]
Venkatasalam K, Rajendran P, and Thangavel M Improving the accuracy of feature selection in big data mining using accelerated flower pollination (AFP) Algorithm J Med Syst 2019 43 4 96
[83]
Hsu C-W, Chang C-C, and Lin C-J A practical guide to support vector classification 2003 Taipei Department of Computer Science, National Taiwan University
[84]
Fletcher T (2009) Support vector machines explained. http://www.tristanfletcher.co.uk/SVM%20Explained.pdf. Accessed 7 Dec 2016
[85]
Scholkopf B, Burges C, Smola A (1998) Geometry and invariance in kernel based methods. In: Advances in kernel methods: support vector learning
[86]
Akinyelu AA and Adewumi AO Improved instance selection methods for support vector machine speed optimization Secur Commun Netw 2017 2017 11
[87]
Hsu C-W, Chang C-C, Lin C-J (2003) A practical guide to support vector classification. Technical report, Department of Computer Science, National Taiwan University, no. 1–16
[88]
Chang C-C and Lin C-J LIBSVM: a library for support vector machines ACM Trans Intell Syst Technol (TIST) 2011 2 3 1-27
[89]
Johnson M (2009) SVM.NET. http://www.matthewajohnson.org/software/svm.html. Accessed 5 Aug 2014
[90]
Bergholz A, Chang JH, Paaß G, Reichartz F, Strobel S (2008) Improved phishing detection using model-based features. In: Proceedings of the conference on email and anti-spam (CEAS), Mountain View, CA, pp 1–27
[91]
Witten IH, Frank E, Hall MA, and Pal CJ Data mining: practical machine learning tools and techniques 2016 Burlington Morgan Kaufmann
[92]
Group C (2006) SpamAssassin Data. http://www.csmining.org/index.php/spam-assassin-datasets.html. Accessed 5 Aug 2014
[93]
Nazario J (2005) Phishing Corpus. http://monkey.org/jose/wiki/doku.php?id=PhishingCorpus. Accessed 27 April 2015
[94]
Asuncion A, Newman D (2007). UCI machine learning repository. http://archive.ics.uci.edu/ml/datasets.html. Accessed 15 Aug 2016
[95]
Andrea (2016) Credit card fraud detection. https://www.kaggle.com/dalpozz/creditcardfraud. Accessed 12 Dec 2016
[96]
Shams R, Mercer RE (2013) Classifying spam emails using text and readability features. In: IEEE 13th international conference on data mining, December 7–10, 2013, pp 657–666
[97]
Zitar RA and Hamdan A Genetic optimized artificial immune system in spam detection: a review and a model Artif Intell Rev 2013 40 3 305-377
[98]
Graham P (2002) A plan for spam. http://www.paulgraham.com/spam.html. Accessed 04 Aug 2016
[99]
Shams R, Mercer RE (2013) Classifying spam emails using text and readability features. In: 2013 IEEE 13th international conference on data mining, pp 657–666
[100]
Duncan R (2016) A simple guide to HTML. http://www.simplehtmlguide.com/whatishtml.php. Accessed 13 Sept 2016
[101]
Akinyelu AA, Adewumi AO (2014) Classification of phishing email using random forest machine learning technique. J Appl Math 2014:6, Article ID 425731
[102]
Almomani A, Wan T-C, Altaher A, Manasrah A, ALmomani E, Anbar M, et al. Evolving fuzzy neural network for phishing emails detection J Comput Sci 2012 8 7 1099-1107
[103]
Fette I, Sadeh N, Tomasic A (2007) Learning to detect phishing emails. In: Proceedings of the 16th international conference on World Wide Web, Banff, AB, Canada, pp 649–656
[104]
Zhang N, Yuan Y (2017) CS229 lecture notes, phishing detection using neural network. Department of Statistics, Stanford University, 2012. http://cs229.stanford.edu/proj2012/ZhangYuan-PhishingDetectionUsingNeuralNetwork.pdf. Accessed 10 July 2017
[105]
Bergholz A, De Beer J, Glahn S, Moens M-F, Paaß G, and Strobel S New filtering approaches for phishing email J Comput Secur 2010 18 1 7-35
[106]
Adewumi OA and Akinyelu AAA hybrid firefly and support vector machine classifier for phishing email detectionKybernetes2016456977-9943526146
[107]
Basnet R, Mukkamala S, and Sung AH Prasad B Detection of phishing attacks: a machine learning approach Soft computing applications in industry 2008 Berlin Springer 373-383
[108]
Olvera-López JA, Carrasco-Ochoa JA, and Martínez-Trinidad JFA new fast prototype selection method based on clusteringPattern Anal Appl2010132131-1412645361
[109]
Chou C-H, Kuo B-H, Chang F (2006) The generalized condensed nearest neighbor rule as a data reduction method. In: 18th international conference on pattern recognition (ICPR’06), pp 556–559
[110]
Wilson DR, Martinez TR (1997) Instance pruning techniques. In: Proceedings of the fourteenth international conference on machine learning, pp 403–411
[111]
Brighton H, Mellish C (1999) On the consistency of information filters for lazy learning algorithms. In: Żytkow JM, Rauch J (eds) Principles of data mining and knowledge discovery: third european conference, PKDD’99, Prague, Czech Republic, September 15–18, 1999. Proceedings. Springer, Berlin, pp 283–288
[112]
Mohan A and Remya G A parallel implementation of ant colony optimization for tsp based on mapreduce framework Int J Comput Appl 2014 88 8 9-12
[113]
Papesca M (2017) Edge detection using parallel ant colony optimization with Hadoop MapReduce: implementation and scalability. Master of Science Department of Computer Science, State University of New York
[114]
Wei W, Yang X-L, Zhou B, Feng J, and Shen P-YCombined energy minimization for image reconstruction from few viewsMath Probl Eng2012201215463029994711264.94028
[115]
Gopalakrishnan RC and Kuppusamy V Ant colony optimization approaches to clustering of lung nodules from CT images Comput Math Methods Med 2014 2014 572494
[116]
Carter RJ, Dubchak I, and Holbrook SR A computational approach to identify genes for functional RNAs in genomic sequences Nucl Acids Res 2001 29 19 3928-3938
[117]
Salzberg S Locating protein coding regions in human DNA using a decision tree algorithm J Comput Biol 1995 2 3 473-485
[118]
Allen JE, Pertea M, and Salzberg SL Computational gene prediction using multiple sources of evidence Genome Res 2004 14 1 142-148
[119]
Batrinca B and Treleaven PC Social media analytics: a survey of techniques, tools and platforms AI & Soc 2015 30 1 89-116
[120]
Bache K, Lichman M (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml. Accessed 12 May 2017
[121]
Wei W, Fan X, Song H, and Wang H Video tamper detection based on multi-scale mutual information Multimedia Tools Appl 2019 78 19 27109-27126

Cited By

View all
  • (2023)A Machine Learning-Based Hybrid Approach to Subset Selection Using Binary Ant Colony Optimization FunctionsSN Computer Science10.1007/s42979-023-02251-94:6Online publication date: 8-Nov-2023

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Neural Computing and Applications
Neural Computing and Applications  Volume 32, Issue 15
Aug 2020
1130 pages
ISSN:0941-0643
EISSN:1433-3058
Issue’s Table of Contents

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 August 2020
Accepted: 22 November 2019
Received: 24 October 2017

Author Tags

  1. Machine learning
  2. Support vector machine
  3. Instance selection
  4. Speed optimization
  5. Ant colony optimization

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 21 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)A Machine Learning-Based Hybrid Approach to Subset Selection Using Binary Ant Colony Optimization FunctionsSN Computer Science10.1007/s42979-023-02251-94:6Online publication date: 8-Nov-2023

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media