DOI: 10.1145/2695664.2695754

Pairwise combination of classifiers for ensemble learning on data streams

Published: 13 April 2015

Abstract

This work presents two voting strategies for ensemble learning on data streams, both based on pairwise combination of component classifiers. Despite efforts to build a diverse ensemble, there is always some degree of overlap between the component classifiers' models. Our voting strategies use these overlaps to support the ensemble prediction. We hypothesize that combining pairs of classifiers can alleviate incorrect individual predictions that would otherwise negatively impact the overall ensemble decision. The first strategy, Pairwise Accuracy (PA), combines the shared accuracy estimates of all possible pairs in the ensemble, while the second strategy, Pairwise Patterns (PP), records patterns of pairwise decisions during training and uses these patterns during prediction. We present empirical results comparing ensemble classifiers using their original voting methods against the same ensembles using our proposed methods, on both real and synthetic datasets, with and without concept drift. Our analysis indicates that pairwise voting with PP is able to enhance overall performance, especially on real datasets, and that PA is useful whenever there are noticeable differences in accuracy estimates among ensemble members, which is common during concept drifts.
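
The two strategies can be illustrated with a minimal Python sketch. The abstract does not give the exact combination rules, so everything below is an assumption made for illustration: the class name `PairwiseVoter`, its methods, the choice to let only agreeing pairs vote in PA, the normalisation of pattern counts in PP, and the fallback to a plain majority vote are not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


class PairwiseVoter:
    """Hypothetical helper that tracks pairwise statistics for an ensemble."""

    def __init__(self, n_members):
        self.pairs = list(combinations(range(n_members), 2))
        self.seen = 0
        # PA: number of instances on which both members of a pair were correct.
        self.both_correct = Counter()
        # PP: (pair, (pred_i, pred_j)) -> counts of the true labels observed.
        self.patterns = defaultdict(Counter)

    def update(self, preds, true_label):
        """Record one labelled instance; preds[k] is member k's prediction."""
        self.seen += 1
        for i, j in self.pairs:
            if preds[i] == true_label and preds[j] == true_label:
                self.both_correct[(i, j)] += 1
            self.patterns[((i, j), (preds[i], preds[j]))][true_label] += 1

    def vote_pa(self, preds):
        """Assumed PA rule: agreeing pairs vote with their shared accuracy."""
        scores = Counter()
        for i, j in self.pairs:
            if preds[i] == preds[j] and self.seen:
                scores[preds[i]] += self.both_correct[(i, j)] / self.seen
        return self._best(scores, preds)

    def vote_pp(self, preds):
        """Assumed PP rule: each pair votes with the label distribution
        previously observed for its current pattern of decisions."""
        scores = Counter()
        for i, j in self.pairs:
            observed = self.patterns.get(((i, j), (preds[i], preds[j])))
            if observed:
                total = sum(observed.values())
                for label, count in observed.items():
                    scores[label] += count / total
        return self._best(scores, preds)

    @staticmethod
    def _best(scores, preds):
        # Fall back to a plain majority vote when no pair carries any weight.
        if not any(scores.values()):
            return Counter(preds).most_common(1)[0][0]
        return max(scores, key=scores.get)


voter = PairwiseVoter(n_members=3)
for _ in range(3):
    voter.update(["a", "a", "b"], "a")
    voter.update(["b", "b", "a"], "b")
for _ in range(2):
    voter.update(["a", "b", "b"], "a")  # the majority is wrong on these instances

pa_label = voter.vote_pa(["a", "a", "b"])
pp_label = voter.vote_pp(["a", "b", "b"])
```

In this toy run the pattern `["a", "b", "b"]` was always labelled "a" during training, so the PP sketch returns "a" even though a simple majority vote would return "b". This is only meant to show how recorded pairwise patterns could override an incorrect majority; it says nothing about the performance reported in the paper.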

Published In

SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing
April 2015
2418 pages
ISBN: 9781450331968
DOI: 10.1145/2695664
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. concept drift
  2. data stream mining
  3. ensemble classifiers
  4. machine learning
  5. supervised learning

Qualifiers

  • Research-article

Conference

SAC 2015: Symposium on Applied Computing
April 13 - 17, 2015
Salamanca, Spain

Acceptance Rates

SAC '15 Paper Acceptance Rate 291 of 1,211 submissions, 24%;
Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Cited By

  • (2023) Challenges and New Opportunities in Diverse Approaches of Big Data Stream Analytics. Proceedings of Third International Conference on Sustainable Expert Systems, pages 425-433. DOI: 10.1007/978-981-19-7874-6_31. Online publication date: 23-Feb-2023
  • (2020) Research on Equipment Fault Diagnosis Classification Model Based on Integrated Incremental Dynamic Weight Combination. Data Science, pages 475-489. DOI: 10.1007/978-981-15-7984-4_36. Online publication date: 20-Aug-2020
  • (2017) A Survey on Ensemble Learning for Data Stream Classification. ACM Computing Surveys, 50:2, pages 1-36. DOI: 10.1145/3054925. Online publication date: 27-Mar-2017
  • (2016) Advances in network-based ensemble classifiers for evolving data streams. Proceedings of the 31st Annual ACM Symposium on Applied Computing, pages 958-959. DOI: 10.1145/2851613.2852021. Online publication date: 4-Apr-2016
