DOI: 10.1109/ICDE.2008.4497501
Article

Stop Chasing Trends: Discovering High Order Models in Evolving Data

Published: 07 April 2008

Abstract

Many applications are driven by evolving data: patterns in Web traffic, program execution traces, network event logs, and similar sources are often non-stationary. Building prediction models for evolving data is therefore an important and challenging task. Currently, most approaches work by "chasing trends", that is, they keep learning or updating models from the evolving data and use these impromptu models for online prediction. In many cases, this proves to be both costly and ineffective: much time is wasted on re-learning recurring concepts, yet the classifier may always remain one step behind the current trend. In this paper, we propose to mine high-order models in evolving data. More often than not, a data stream contains only a limited number of concepts, or stable distributions, and the stream switches among them constantly. We mine all such concepts offline from a historical stream and build high-quality models for each of them. At run time, combining historical concept change patterns with cues provided by an online training stream, we find the most likely current concept and use its corresponding models to classify data in an unlabeled stream. The primary advantage of the high-order model approach is its high accuracy. Experiments show that on benchmark datasets, the classification error of the high-order model is only a small fraction of that of the current best approaches. Another important benefit is that, unlike state-of-the-art approaches, our approach does not require users to tune any parameters to achieve satisfactory results on streams with different characteristics.
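
The run-time step described in the abstract (pick the most likely current concept from historical change patterns plus labeled cues, then classify with that concept's model) can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not specify; it is a generic hidden-Markov-style belief update written under stated assumptions. The class name `ConceptTracker`, the `transition` matrix standing in for the mined concept change patterns, and the `error_rate` constant are all hypothetical choices made for illustration.

```python
from typing import Callable, Sequence

import numpy as np

# A per-concept classifier: maps a feature vector to a class label.
Classifier = Callable[[np.ndarray], int]


class ConceptTracker:
    """Illustrative tracker of the most likely current concept.

    Assumptions (not taken from the paper):
      * one pre-trained classifier per concept, learned offline;
      * transition[i, j] = estimated probability of moving from
        concept i to concept j between consecutive labeled examples;
      * each concept's classifier mislabels its own data with a
        fixed probability `error_rate` (must lie strictly in (0, 1)).
    """

    def __init__(self, models: Sequence[Classifier],
                 transition: Sequence[Sequence[float]],
                 error_rate: float = 0.1) -> None:
        self.models = list(models)
        self.transition = np.asarray(transition, dtype=float)
        self.error_rate = error_rate
        k = len(self.models)
        # Start with a uniform belief over the k concepts.
        self.belief = np.full(k, 1.0 / k)

    def observe(self, x: np.ndarray, y: int) -> None:
        """Update the belief with one labeled example from the training stream."""
        # Prediction step: apply the historical change pattern.
        prior = self.belief @ self.transition
        # Evidence step: concepts whose model agrees with the label gain weight.
        hits = np.array([model(x) == y for model in self.models])
        likelihood = np.where(hits, 1.0 - self.error_rate, self.error_rate)
        posterior = prior * likelihood
        self.belief = posterior / posterior.sum()

    def current_concept(self) -> int:
        """Index of the concept with the highest belief."""
        return int(np.argmax(self.belief))

    def classify(self, x: np.ndarray) -> int:
        """Label an unlabeled example with the most likely concept's model."""
        return self.models[self.current_concept()](x)
```

In this sketch the labeled stream only moves the belief over concepts; no model is retrained online, which mirrors the contrast the abstract draws with "chasing trends". The fixed `error_rate` is a simplification of the sketch and should not be read as a parameter of the published method.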


Published In

ICDE '08: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
April 2008
1628 pages
ISBN:9781424418367

Publisher

IEEE Computer Society

United States

Cited By

  • (2020) Novel Class Detection with Concept Drift in Data Stream - AhtNODE. International Journal of Distributed Systems and Technologies 11:1 (15-26). DOI: 10.4018/IJDST.2020010102. Online publication date: 1-Jan-2020
  • (2013) Automated Anomaly Detector Adaptation using Adaptive Threshold Tuning. ACM Transactions on Information and System Security 15:4 (1-30). DOI: 10.1145/2445566.2445569. Online publication date: 1-Apr-2013
  • (2012) Content-based crowd retrieval on the real-time web. Proceedings of the 21st ACM international conference on Information and knowledge management (195-204). DOI: 10.1145/2396761.2396789. Online publication date: 29-Oct-2012
  • (2012) Facing the reality of data stream classification. Knowledge and Information Systems 33:1 (213-244). DOI: 10.1007/s10115-011-0447-8. Online publication date: 1-Oct-2012
  • (2011) A Cluster-Based Context-Tree Model for Multivariate Data Streams with Applications to Anomaly Detection. INFORMS Journal on Computing 23:3 (364-376). DOI: 10.1287/ijoc.1100.0407. Online publication date: 1-Jul-2011
  • (2011) Online outlier detection for data streams. Proceedings of the 15th Symposium on International Database Engineering & Applications (88-96). DOI: 10.1145/2076623.2076635. Online publication date: 21-Sep-2011
  • (2011) Finding semantics in time series. Proceedings of the 2011 ACM SIGMOD International Conference on Management of data (385-396). DOI: 10.1145/1989323.1989364. Online publication date: 12-Jun-2011
  • (2010) Classification and novel class detection of data streams in a dynamic feature space. Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II (337-352). DOI: 10.5555/1888305.1888328. Online publication date: 20-Sep-2010
  • (2010) An efficient approach for mining segment-wise intervention rules in time-series streams. Proceedings of the 11th international conference on Web-age information management (238-249). DOI: 10.5555/1884017.1884049. Online publication date: 15-Jul-2010
  • (2010) Adaptive system anomaly prediction for large-scale hosting infrastructures. Proceedings of the 29th ACM SIGACT-SIGOPS symposium on Principles of distributed computing (173-182). DOI: 10.1145/1835698.1835741. Online publication date: 25-Jul-2010
