
Active learning from stream data using optimal weight classifier ensemble

Published: 01 December 2010

Abstract

In this paper, we propose a new research problem on active learning from data streams, where data volumes grow continuously and labeling all data is expensive and impractical. The objective is to label a small portion of the stream data, from which a model is derived to predict future instances as accurately as possible. To tackle the technical challenges raised by the dynamic nature of stream data, i.e., increasing data volumes and evolving decision concepts, we propose a classifier-ensemble-based active learning framework that selectively labels instances from data streams to build a classifier ensemble. We argue that a classifier ensemble's variance directly corresponds to its error rate, so reducing the ensemble's variance is equivalent to improving its prediction accuracy. Consequently, one should label instances toward the minimization of the variance of the underlying classifier ensemble. Accordingly, we introduce a minimum-variance (MV) principle to guide the instance-labeling process for data streams. In addition, we derive an optimal-weight calculation method to determine the weight values for the classifier ensemble. The MV principle and the optimal-weighting module are combined to build an active learning framework for data streams. Experimental results on synthetic and real-world data demonstrate the performance of the proposed work in comparison with other approaches.



Published In

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Volume 40, Issue 6
December 2010
224 pages

Publisher

IEEE Press

Publication History

Received: 23 July 2009
Revised: 12 November 2009
Accepted: 07 January 2010

Author Tags

1. active learning
2. classifier ensemble
3. stream data
