Abstract
We propose two models for improving the performance of rule-based classification in unbalanced and highly imprecise domains. Both models are probabilistic frameworks aimed at boosting the performance of basic rule-based classifiers. The first model implements a global-to-local scheme, in which the response of a global rule-based classifier is refined through a probabilistic analysis of the coverage of its rules. In particular, the coverage of each individual rule is used to learn a local probabilistic model, which ultimately refines the predictions of the corresponding rule of the global classifier. The second model implements a dual local-to-global strategy, in which single classification rules are combined within an exponential probabilistic model, so that overall performance improves as a side effect of their mutual influence. Several variants of the basic ideas are studied, and their performance is thoroughly evaluated and compared with state-of-the-art algorithms on standard benchmark datasets.
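The global-to-local scheme can be illustrated with a minimal sketch. It is not the authors' implementation: the ordered rule list, the choice of a Laplace-smoothed naive Bayes as the local model, and all names (`covers`, `LocalNaiveBayes`, `GlobalToLocalClassifier`, the toy weather data) are assumptions made only for illustration. Each rule's covered training instances are used to fit a local model, and that model, rather than the rule's own label, decides the class of any instance the rule fires on.

```python
import math
from collections import Counter, defaultdict


def covers(antecedent, x):
    """True if instance x (a dict of attribute values) satisfies the rule antecedent."""
    return all(x.get(attr) == val for attr, val in antecedent.items())


class LocalNaiveBayes:
    """Laplace-smoothed naive Bayes, assumed here as the local probabilistic model."""

    def __init__(self, classes, alpha=1.0):
        self.classes = list(classes)
        self.alpha = alpha

    def fit(self, X, y):
        self.n = len(y)
        self.class_counts = Counter(y)
        self.value_counts = defaultdict(Counter)  # (class, attribute) -> value counts
        self.values = defaultdict(set)            # attribute -> values seen locally
        for x, c in zip(X, y):
            for attr, val in x.items():
                self.value_counts[(c, attr)][val] += 1
                self.values[attr].add(val)
        return self

    def predict_proba(self, x):
        log_scores = {}
        for c in self.classes:
            prior = (self.class_counts[c] + self.alpha) / (
                self.n + self.alpha * len(self.classes))
            score = math.log(prior)
            for attr, val in x.items():
                num = self.value_counts[(c, attr)][val] + self.alpha
                den = self.class_counts[c] + self.alpha * max(len(self.values[attr]), 1)
                score += math.log(num / den)
            log_scores[c] = score
        top = max(log_scores.values())
        unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(unnorm.values())
        return {c: u / z for c, u in unnorm.items()}


class GlobalToLocalClassifier:
    """Ordered rule list whose firing rule defers to a model learned on its coverage."""

    def __init__(self, rules, default_class, classes):
        self.rules = rules                # list of (antecedent dict, class label)
        self.default_class = default_class
        self.classes = list(classes)

    def fit(self, X, y):
        self.local_models = []
        for antecedent, _ in self.rules:
            covered = [(x, c) for x, c in zip(X, y) if covers(antecedent, x)]
            if covered:
                Xc, yc = zip(*covered)
                self.local_models.append(LocalNaiveBayes(self.classes).fit(Xc, yc))
            else:
                self.local_models.append(None)  # no coverage: keep the rule's label
        return self

    def predict(self, x):
        for (antecedent, label), model in zip(self.rules, self.local_models):
            if covers(antecedent, x):
                if model is None:
                    return label
                proba = model.predict_proba(x)
                return max(proba, key=proba.get)
        return self.default_class


# Hypothetical toy data: the rule "outlook=sunny -> yes" hides a minority "no" region.
X_train = (
    [{"outlook": "sunny", "windy": "no"}] * 4
    + [{"outlook": "sunny", "windy": "yes"}] * 2
    + [{"outlook": "rain", "windy": "no"}]
)
y_train = ["yes"] * 4 + ["no"] * 2 + ["yes"]

rules = [({"outlook": "sunny"}, "yes")]
clf = GlobalToLocalClassifier(rules, default_class="yes", classes=["yes", "no"])
clf.fit(X_train, y_train)
print(clf.predict({"outlook": "sunny", "windy": "yes"}))
```

In this toy setting the global rule alone would label every sunny instance with the majority class, whereas the local model fitted on the rule's coverage can recover the minority cases inside it. This is the kind of refinement the first model aims at, under the simplifying assumptions stated above.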
Cite this article
Costa, G., Manco, G., Ortale, R. et al. From global to local and viceversa: uses of associative rule learning for classification in imprecise environments. Knowl Inf Syst 33, 137–169 (2012). https://doi.org/10.1007/s10115-011-0458-5
DOI: https://doi.org/10.1007/s10115-011-0458-5