A Combination Classification Algorithm Based on Outlier Detection and C4.5

ShengYi Jiang &
Wen Yu

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5678)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications


Abstract

Traditional classifiers are biased towards the majority class on imbalanced data, which results in a high misclassification rate for minority samples. To address this problem, we present a combination classification algorithm based on outlier detection and C4.5. The basic idea is to balance the data distribution by using an outlier factor to partition the whole data set into rare clusters and major clusters. The C4.5 algorithm is then used to build separate decision trees on the rare clusters and on the major clusters. To classify a new object, the algorithm selects the decision tree that corresponds to the type of the cluster nearest to the object. We evaluate the algorithm on data sets from the UCI Machine Learning Repository and compare it with other classification algorithms; the experiments show that it performs much better on extremely imbalanced data sets.
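The pipeline in the abstract can be sketched in Python. The abstract gives neither the clustering procedure, nor the outlier-factor formula, nor the rule for calling a cluster rare, so everything below is an assumption made for illustration: a one-pass clustering with a hypothetical `radius`, an outlier factor defined as the size-weighted mean distance from a cluster centroid to the other centroids, a hypothetical `eps` fraction of the data that rare clusters may cover, and a one-split decision stump standing in for C4.5. All function names are hypothetical as well.

```python
import math
from collections import Counter

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def leader_clusters(X, radius):
    """Assumed clustering step: join the nearest cluster within `radius`, else open a new one."""
    clusters = []  # each cluster is a list of row indices into X
    for i, x in enumerate(X):
        best = min(clusters, key=lambda c: dist(x, centroid([X[j] for j in c])), default=None)
        if best is not None and dist(x, centroid([X[j] for j in best])) <= radius:
            best.append(i)
        else:
            clusters.append([i])
    return clusters

def outlier_factors(X, clusters):
    """Assumed outlier factor: size-weighted mean distance to the other cluster centroids."""
    cents = [centroid([X[j] for j in c]) for c in clusters]
    n = len(X)
    factors = [sum(len(cj) * dist(ci, cents[k]) for k, cj in enumerate(clusters)) / n
               for ci in cents]
    return cents, factors

def split_rare(clusters, factors, eps):
    """Mark the highest-factor clusters as rare while they cover at most eps of the data."""
    n = sum(len(c) for c in clusters)
    rare, covered = set(), 0
    for k in sorted(range(len(clusters)), key=lambda k: factors[k], reverse=True):
        if covered + len(clusters[k]) > eps * n:
            break
        rare.add(k)
        covered += len(clusters[k])
    return rare

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def fit_stump(X, y):
    """Stand-in for C4.5 (not C4.5 itself): one threshold split chosen by information gain."""
    majority = Counter(y).most_common(1)[0][0]
    best_gain, rule = 0.0, None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X})[:-1]:  # skip the max so both sides are non-empty
            left = [lab for x, lab in zip(X, y) if x[f] <= t]
            right = [lab for x, lab in zip(X, y) if x[f] > t]
            gain = entropy(y) - (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
            if gain > best_gain:
                best_gain = gain
                rule = (f, t, Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if rule is None:
        return lambda x: majority
    f, t, left_label, right_label = rule
    return lambda x: left_label if x[f] <= t else right_label

def train(X, y, radius, eps):
    clusters = leader_clusters(X, radius)
    cents, factors = outlier_factors(X, clusters)
    rare = split_rare(clusters, factors, eps)
    models = {}  # True -> model for rare clusters, False -> model for major clusters
    for is_rare in (True, False):
        idx = [j for k, c in enumerate(clusters) if (k in rare) == is_rare for j in c]
        if idx:
            models[is_rare] = fit_stump([X[j] for j in idx], [y[j] for j in idx])

    def predict(x):
        # Route the object to the model matching the type of its nearest cluster.
        nearest = min(range(len(cents)), key=lambda k: dist(x, cents[k]))
        return models[nearest in rare](x)

    return predict, rare, clusters

# Toy data (made up for this sketch): a dense region plus a small, distant group.
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [0.4, 0.4],
     [0.2, 0.4], [0.5, 0.1], [0.3, 0.5], [10.0, 10.0], [10.2, 10.1]]
y = ["neg", "neg", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "pos"]
predict, rare, clusters = train(X, y, radius=2.0, eps=0.3)
print(rare, [predict(p) for p in ([10.1, 9.9], [0.1, 0.1], [0.45, 0.0])])
```

Keeping the learner behind a single `fit_stump` call means a real C4.5 implementation could be substituted without touching the clustering or routing steps; the stump is only there to keep the sketch short and dependency-free.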



Author information

Authors and Affiliations

School of Informatics, Guangdong University of Foreign Studies, 510006, Guangzhou, Guangdong
ShengYi Jiang & Wen Yu


Editor information

Editors and Affiliations

Knowledge Science & Engineering Institute, School of Education Technology, Beijing Normal University, Xinjiekouwai Ave. 19, 100875, Beijing, China
Ronghuai Huang
The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
Qiang Yang
School of Computing Science, Simon Fraser University, 8888 University Drive, V5A 1S6, Burnaby, BC, Canada
Jian Pei
Faculty of Economics, University of Porto, Rua Dr. Roberto Frias, 4200-465, Porto, Portugal
João Gama
School of Information, Zhongguancun, Renmin University, 100872, Beijing, China
Xiaofeng Meng
School of Information Technology and Electrical Engineering, The University of Queensland, 4072, St. Lucia, Queensland, Australia
Xue Li


About this paper

Cite this paper

Jiang, S., Yu, W. (2009). A Combination Classification Algorithm Based on Outlier Detection and C4.5. In: Huang, R., Yang, Q., Pei, J., Gama, J., Meng, X., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2009. Lecture Notes in Computer Science, vol 5678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03348-3_50


DOI: https://doi.org/10.1007/978-3-642-03348-3_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03347-6
Online ISBN: 978-3-642-03348-3
eBook Packages: Computer Science, Computer Science (R0)
