A hybrid isotonic separation training algorithm with correlation-based isotonic feature selection for binary classification

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Isotonic separation is a classification technique that builds a model by transforming the training set into a linear programming problem (LPP). Solving large-scale LPPs with traditional methods becomes computationally expensive as the data set grows. This paper proposes a hybrid binary classification algorithm, meta-heuristic isotonic separation with particle swarm optimization and convergence criterion (MeHeIS–CPSO), in which a particle swarm optimization-based meta-heuristic is embedded in the training phase to find a solution to the LPP. The proposed framework formulates the LPP as a directed acyclic graph (DAG) and orders the decision variables using a topological sort. It derives a new threshold value from the training set and uses this threshold to set up a convergence criterion. It also deploys a new correlation coefficient-based supervised feature selection technique that selects isotonic features and improves the predictive accuracy of the classifier. Experiments are conducted on publicly available data sets and a synthetic data set. Theoretical, empirical, and statistical analyses show that MeHeIS–CPSO is superior to its predecessors in training time and predictive ability on large data sets. It also outperforms state-of-the-art machine learning and isotonic classification techniques in predictive performance on both small- and large-scale data sets.
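
The abstract does not give the DAG construction in detail, so the following is only a minimal sketch of the general idea of ordering decision variables by a topological sort. It assumes that each training point corresponds to one decision variable and that an edge is drawn whenever one point dominates another in every feature; the helper names `dominates` and `dominance_order` are hypothetical and are not taken from the paper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def dominates(a, b):
    """Return True if point a is >= point b in every feature and a != b.

    This componentwise ordering is an assumption made for illustration;
    the paper's exact isotonic ordering is not stated in the abstract.
    """
    return all(x >= y for x, y in zip(a, b)) and a != b


def dominance_order(points):
    """Return point indices in a topological order of the dominance DAG.

    Each point i lists as predecessors all points j that it dominates,
    so every dominated point appears before the points that dominate it.
    Strict dominance is a partial order, so the graph has no cycles.
    """
    sorter = TopologicalSorter()
    for i, a in enumerate(points):
        predecessors = [j for j, b in enumerate(points) if dominates(a, b)]
        sorter.add(i, *predecessors)
    return list(sorter.static_order())


if __name__ == "__main__":
    # Hypothetical two-feature training points (tuples of feature values).
    sample = [(2, 2), (1, 1), (3, 1), (0, 0)]
    print(dominance_order(sample))
```

Visiting variables in such an order means that, when a variable is processed, every variable it is constrained by has already been seen. This sketch builds the graph by pairwise comparison, which is quadratic in the number of points and is meant only to show the ordering step.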

Author information

Corresponding author

Correspondence to B. Malar.

About this article

Cite this article

Malar, B., Nadarajan, R. & Gowri Thangam, J. A hybrid isotonic separation training algorithm with correlation-based isotonic feature selection for binary classification. Knowl Inf Syst 59, 651–683 (2019). https://doi.org/10.1007/s10115-018-1226-6

  • DOI: https://doi.org/10.1007/s10115-018-1226-6
