Article

Efficient learning of Naive Bayes classifiers under class-conditional classification noise

Authors:

François Denis,

Christophe Nicolas Magnan,

Liva Ralaivola

ICML '06: Proceedings of the 23rd international conference on Machine learning

Pages 265 - 272

https://doi.org/10.1145/1143844.1143878

Published: 25 June 2006

Abstract

We address the problem of efficiently learning Naive Bayes classifiers under class-conditional classification noise (CCCN). Naive Bayes classifiers rely on the hypothesis that the distributions associated with each class are product distributions. When data are subject to CCCN, these conditional distributions are themselves mixtures of product distributions. We give analytical formulas that make it possible to identify them from data subject to CCCN. We then design a learning algorithm, based on these formulas, that learns Naive Bayes classifiers under CCCN. We present results on artificial datasets and on datasets extracted from the UCI repository. These results show that CCCN can be handled efficiently and successfully.
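
The mixture claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm or their analytical formulas. It assumes binary features, two classes, and label-flip rates that depend only on the true class; all names (`sample_cccn`, `noisy_mixture_weight`, `eta_pos`, `eta_neg`) and all parameter values are hypothetical. It checks that the feature means among examples observed as positive match a mixture of the two clean class-conditional product distributions.

```python
import numpy as np


def sample_cccn(n, prior_pos, theta_pos, theta_neg, eta_pos, eta_neg, seed=0):
    """Sample n binary examples from a two-class Naive Bayes model, then flip
    each label with a probability that depends only on its true class."""
    rng = np.random.default_rng(seed)
    y = rng.random(n) < prior_pos                        # true labels (True = positive)
    theta = np.where(y[:, None], theta_pos, theta_neg)   # per-example Bernoulli parameters
    x = (rng.random((n, len(theta_pos))) < theta).astype(int)
    flip_rate = np.where(y, eta_pos, eta_neg)            # class-conditional noise rate
    y_noisy = np.where(rng.random(n) < flip_rate, ~y, y)
    return x, y.astype(int), y_noisy.astype(int)


def noisy_mixture_weight(prior_pos, eta_pos, eta_neg):
    """P(true label is positive | observed label is positive), by Bayes' rule."""
    kept_pos = prior_pos * (1 - eta_pos)        # true positives that keep their label
    flipped_neg = (1 - prior_pos) * eta_neg     # true negatives flipped to positive
    return kept_pos / (kept_pos + flipped_neg)


# Hypothetical parameters, chosen only for illustration.
theta_pos = np.array([0.9, 0.8, 0.7])
theta_neg = np.array([0.2, 0.3, 0.1])
x, y, y_noisy = sample_cccn(200_000, 0.5, theta_pos, theta_neg, 0.2, 0.1)

w = noisy_mixture_weight(0.5, 0.2, 0.1)               # 0.4 / 0.45 = 8/9
expected = w * theta_pos + (1 - w) * theta_neg        # mixture of the two clean means
empirical = x[y_noisy == 1].mean(axis=0)              # means among observed positives
print(np.round(expected, 3), np.round(empirical, 3))
```

With these assumed rates, the distribution of examples observed as positive is a two-component mixture with weight 8/9 on the clean positive class, so a Naive Bayes model fitted directly to the noisy labels estimates blended parameters rather than the clean ones.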

Index Terms

Efficient learning of Naive Bayes classifiers under class-conditional classification noise
1. Computing methodologies
  1. Machine learning

Information & Contributors

Information

Published In

ICML '06: Proceedings of the 23rd international conference on Machine learning

June 2006

1154 pages

ISBN:1595933832

DOI:10.1145/1143844

Program Chairs:
William Cohen,
Andrew Moore

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Acceptance Rates

ICML '06 Paper Acceptance Rate 140 of 548 submissions, 26%;

Overall Acceptance Rate 140 of 548 submissions, 26%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 6
Total Downloads: 549
Downloads (Last 12 months): 3
Downloads (Last 6 weeks): 0

Reflects downloads up to 05 Mar 2025

Cited By

Dauce, E., Proix, T., & Ralaivola, L. (2015). Reward-based online learning in non-stationary environments: Adapting a P300-speller with a “backspace” key. 2015 International Joint Conference on Neural Networks (IJCNN) (pp. 1-8). Online publication date: Jul-2015.
https://doi.org/10.1109/IJCNN.2015.7280686
Ismail, I., Marsono, M., & Nor, S. (2014). Malware detection using augmented naive Bayes with domain knowledge and under presence of class noise. International Journal of Information and Computer Security, 6:2, 179-197. Online publication date: 1-Oct-2014.
https://dl.acm.org/doi/10.1504/IJICS.2014.065173
Zheng, M., & Kudenko, D. (2012). Automated Event Recognition for Football Commentary Generation. Interdisciplinary Advancements in Gaming, Simulations and Virtual Environments (pp. 300-315). Online publication date: 2012.
https://doi.org/10.4018/978-1-4666-0029-4.ch019
Kudenko, D., & Zheng, M. (2010). Automated Event Recognition for Football Commentary Generation. International Journal of Gaming and Computer-Mediated Simulations, 2:4, 67-84. Online publication date: 1-Oct-2010.
https://dl.acm.org/doi/10.4018/jgcms.2010100105
Ismail, I., Marsono, M., & Nor, S. (2010). Detecting Worms Using Data Mining Techniques. Proceedings of the 2010 Sixth International Conference on Signal-Image Technology and Internet Based Systems (pp. 187-194). Online publication date: 15-Dec-2010.
https://dl.acm.org/doi/10.1109/SITIS.2010.41
Lin, J., & Chen, M. (2008). An Intelligent Agent for Personalized E-Learning. Proceedings of the 2008 Eighth International Conference on Intelligent Systems Design and Applications - Volume 01 (pp. 27-31). Online publication date: 26-Nov-2008.
https://dl.acm.org/doi/10.1109/ISDA.2008.250
