Abstract
Stream learning in a dynamic feature space has become an active research area. In this setting, each instance of the data stream may carry a different set of features, and the feature space of the classifier may differ from that of the incoming instances. These assumptions fit real-world data stream applications, where the dimensions are neither predetermined nor fixed. This study introduces a general algorithm for data stream classification in a dynamic feature space using feature mapping. In contrast to previous studies, the proposed algorithm is not tied to a specific classifier and can work with whichever classifier best suits the intended application. It discovers relationships between features and uses them to estimate features that the classifier has previously observed but that are unavailable in the current instance. This estimation helps exploit the full potential of the classifier. Furthermore, empirical experiments and comparisons with recent methods show that the proposed algorithm achieves higher accuracy.
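To make the idea of feature mapping concrete, the following is a minimal sketch and not the algorithm proposed in the article. It assumes that each pair of features is related by a simple linear regression whose statistics are updated online, and that a missing feature is estimated by averaging the predictions from all available features. The class name `PairwiseFeatureMapper` and its methods `update` and `complete` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class PairwiseFeatureMapper:
    """Hypothetical sketch: online simple linear regression per feature pair.

    Assumption (not from the article): a missing feature is estimated as the
    average of the pairwise regression predictions from available features.
    """

    def __init__(self):
        self.n = defaultdict(int)          # co-occurrence count per (src, dst)
        self.mean_x = defaultdict(float)   # running mean of src values
        self.mean_y = defaultdict(float)   # running mean of dst values
        self.m2_x = defaultdict(float)     # sum of squared deviations of src
        self.co = defaultdict(float)       # co-moment of src and dst
        self.known = set()                 # every feature seen so far

    def update(self, x):
        """Update pairwise statistics from one instance (dict: name -> value)."""
        self.known.update(x)
        for src, xs in x.items():
            for dst, xd in x.items():
                if src == dst:
                    continue
                k = (src, dst)
                self.n[k] += 1
                dx = xs - self.mean_x[k]            # deviation from old mean
                self.mean_x[k] += dx / self.n[k]
                self.mean_y[k] += (xd - self.mean_y[k]) / self.n[k]
                # Welford-style incremental updates of the second moments.
                self.m2_x[k] += dx * (xs - self.mean_x[k])
                self.co[k] += dx * (xd - self.mean_y[k])

    def complete(self, x):
        """Return a copy of x with previously seen, now missing features estimated."""
        filled = dict(x)
        for dst in self.known - set(x):
            preds = []
            for src, xs in x.items():
                k = (src, dst)
                if self.n.get(k, 0) >= 2 and self.m2_x.get(k, 0.0) > 0.0:
                    slope = self.co[k] / self.m2_x[k]
                    preds.append(self.mean_y[k] + slope * (xs - self.mean_x[k]))
            if preds:  # leave the feature out when no mapping is available
                filled[dst] = sum(preds) / len(preds)
        return filled


# Toy stream in which feature "b" always equals 2 * a + 1.
mapper = PairwiseFeatureMapper()
for i in range(10):
    mapper.update({"a": float(i), "b": 2.0 * i + 1.0})

# A later instance arrives without "b"; the mapper estimates it from "a".
completed = mapper.complete({"a": 20.0})
```

The completed dictionary can then be passed to any incremental classifier that accepts feature dictionaries, which is what keeps this kind of mapping step independent of the classifier. The pairwise statistics need memory quadratic in the number of features, so this sketch only suits small feature spaces.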
Data Availability Statement
All datasets are publicly available, and the sources are cited.
Funding
No funding was received for conducting this study.
Author information
Contributions
Authors contributed equally to this work.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Consent for publication
The authors give full consent for publication.
Financial/non-financial interests
The authors have no relevant financial or non-financial interests to disclose.
Ethics approval and consent to participate
No ethical issues are involved. This research involves no human participants or animals.
About this article
Cite this article
Sajedi, R., Razzazi, M. Data stream classification in dynamic feature space using feature mapping. J Supercomput 80, 12043–12061 (2024). https://doi.org/10.1007/s11227-024-05889-1
DOI: https://doi.org/10.1007/s11227-024-05889-1