DOI: 10.5555/645496.657862
Article

SSDT: A Scalable Subspace-Splitting Classifier for Biased Data

Published: 29 November 2001

Abstract

Decision trees are one of the most extensively used data mining models. Recently, a number of efficient, scalable algorithms for constructing decision trees on large disk-resident datasets have been introduced. In this paper, we study the problem of learning scalable decision trees from datasets with a biased class distribution. Our objective is to build decision trees that are more concise and more interpretable while maintaining the scalability of the model. To achieve this, our approach searches for subspace clusters of data cases of the biased class to enable multivariate splittings based on weighted distances to such clusters. Other approaches to building concise and interpretable models, including multivariate decision trees and association rules, often introduce scalability and performance issues. The SSDT algorithm we present achieves the objective without loss of efficiency, scalability, or accuracy.
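
To make the idea of a multivariate split based on weighted distance to a subspace cluster more concrete, here is a minimal sketch. It is not the SSDT algorithm itself: the abstract does not give the distance formula or the cluster search, so a weighted Euclidean distance is assumed, and the names `weighted_distance` and `split_by_cluster`, the cluster center, the weights, and the threshold are all hypothetical. A weight of zero simply drops a dimension, which is one way to restrict the distance to a subspace.

```python
import math
from typing import List, Sequence, Tuple

Row = Sequence[float]


def weighted_distance(point: Row, center: Row, weights: Row) -> float:
    """Weighted Euclidean distance (an assumed form, not taken from the paper).

    Dimensions with weight 0 are ignored, so the distance is effectively
    measured only inside the subspace spanned by the weighted dimensions.
    """
    return math.sqrt(
        sum(w * (p - c) ** 2 for p, c, w in zip(point, center, weights))
    )


def split_by_cluster(
    rows: List[Row], center: Row, weights: Row, threshold: float
) -> Tuple[List[Row], List[Row]]:
    """Partition rows into those near the cluster and those far from it.

    This plays the role of a single multivariate test at a tree node:
    one comparison that involves several attributes at once.
    """
    near: List[Row] = []
    far: List[Row] = []
    for row in rows:
        if weighted_distance(row, center, weights) <= threshold:
            near.append(row)
        else:
            far.append(row)
    return near, far


if __name__ == "__main__":
    # Hypothetical data: the third attribute is outside the subspace (weight 0).
    data = [(1.0, 1.0, 50.0), (5.0, 5.0, 1.0)]
    near, far = split_by_cluster(
        data, center=(1.0, 1.0, 0.0), weights=(1.0, 1.0, 0.0), threshold=1.0
    )
    print("near:", near)
    print("far:", far)
```

In this sketch the first row falls on the near side even though its third attribute is far from the center, because that attribute carries zero weight. A single test of this kind can stand in for several univariate splits, which is the sense in which such splits may yield a more concise tree.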

Cited By

  • (2002) Supervised ranking in open-domain text summarization. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 465-472. DOI: 10.3115/1073083.1073161. Online publication date: 6-Jul-2002
  • (2002) Modeling (in)variability of human judgments for text summarization. Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 407-408. DOI: 10.1145/564376.564467. Online publication date: 11-Aug-2002

Published In

ICDM '01: Proceedings of the 2001 IEEE International Conference on Data Mining
November 2001
663 pages
ISBN: 0769511198

Publisher

IEEE Computer Society, United States
