research-article
DOI: 10.1145/1835804.1835899

Learning to combine discriminative classifiers: confidence based

Published: 25 July 2010

Abstract

Much research in data mining and machine learning has led to practical applications. Spam filtering, fraud detection, and user query-intent analysis rely heavily on machine-learned classifiers and have seen corresponding improvements in classification accuracy. Combining multiple classifiers (a.k.a. ensemble learning) is a well-studied approach known to improve the effectiveness of a classifier. To address two key challenges in ensemble learning -- (1) learning the weights of the individual classifiers and (2) the rule for combining their weighted responses -- this paper proposes a novel ensemble classifier, EnLR, that computes weights for the responses of discriminative classifiers and combines their weighted responses into a single response for a test instance. The combination rule aggregates weighted responses, where the weight of an individual classifier is inversely proportional to the variance around its response. Here, variance quantifies the uncertainty of a discriminative classifier's parameters, which in turn depends on the training samples. As opposed to other ensemble methods, where the weight of each individual classifier is learned as part of parameter learning and the same weight is therefore applied to all test instances, our model adjusts its weights per test instance as individual classifiers become confident in their decisions for that instance. Our empirical experiments on various data sets demonstrate that our combined classifier produces effective results when compared with a single classifier, and shows statistically significantly better accuracy than the well-known ensemble methods Bagging and AdaBoost. In addition to robust accuracy, our model handles high volumes of training samples efficiently because its individual classifiers learn independently, which also makes it simple to implement in a distributed computing environment such as Hadoop.
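As a rough illustration of the inverse-variance weighting idea the abstract describes (a sketch only: EnLR's actual estimator, its per-instance variance computation, and the function name below are not taken from the paper, and the variances here are made-up inputs):

```python
import numpy as np

def combine_responses(responses, variances):
    """Inverse-variance weighted combination of per-classifier
    responses for a single test instance: confident (low-variance)
    classifiers receive proportionally larger weights."""
    responses = np.asarray(responses, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    weights /= weights.sum()           # normalize weights to sum to 1
    return float(weights @ responses)

# Three classifiers score one test instance; the first is far more
# confident (variance 0.01), so it dominates the combined response.
combined = combine_responses([0.9, 0.4, 0.5], [0.01, 0.25, 0.25])
print(combined)
```

Because the weights depend on the per-instance variances rather than on globally learned coefficients, the same set of classifiers can be weighted differently on each test instance, which is the distinction the abstract draws against Bagging- or AdaBoost-style fixed weights.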

Supplementary Material

JPG File (kdd2010_lee_lcd_01.jpg)
MOV File (kdd2010_lee_lcd_01.mov)




Published In

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010
1240 pages
ISBN:9781450300551
DOI:10.1145/1835804
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. classification
  2. ensemble learning
  3. logistic regression

Qualifiers

  • Research-article

Conference

KDD '10

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Cited By

  • (2024) VIME: Visual Interactive Model Explorer for Identifying Capabilities and Limitations of Machine Learning Models for Sequential Decision-Making. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, DOI 10.1145/3654777.3676323, pp. 1-21. Online publication date: 13-Oct-2024.
  • (2020) Mammographic Classification Using Stacked Ensemble Learning with Bagging and Boosting Techniques. Journal of Medical and Biological Engineering, DOI 10.1007/s40846-020-00567-y. Online publication date: 8-Oct-2020.
  • (2016) Multi-source Hierarchical Prediction Consolidation. Proceedings of the 25th ACM International Conference on Information and Knowledge Management, DOI 10.1145/2983323.2983676, pp. 2251-2256. Online publication date: 24-Oct-2016.
  • (2012) Genre classification for million song dataset using confidence-based classifiers combination. Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, DOI 10.1145/2348283.2348480, pp. 1083-1084. Online publication date: 12-Aug-2012.
  • (2012) Online active multi-field learning for efficient email spam filtering. Knowledge and Information Systems, 33(1):117-136, DOI 10.1007/s10115-011-0461-x. Online publication date: 1-Oct-2012.
