research-article

Differential prioritization in feature selection and classifier aggregation for multiclass microarray datasets

Published: 01 June 2007

Abstract

The high dimensionality of microarray datasets makes multiclass tissue classification difficult; the main challenge is selecting features that are both relevant and non-redundant to form the predictor set for classifier training. It is also important to vary the emphasis placed on relevance and on redundancy during the search for the predictor set, which is done through the degree of differential prioritization (DDP). Furthermore, the feature selection (FS) problem can be decomposed in several ways: all-classes-at-once, one-vs.-all (OVA), or pairwise (PW). In multiclass problems, the type of classifier aggregation must also be considered: non-aggregated (a single machine) or aggregated (OVA or PW). We first propose a systematic approach to combining the distinct problems of FS and classification. Then, using eight well-known multiclass microarray datasets, we empirically demonstrate the effectiveness of the DDP in various combinations of FS decomposition types and classifier aggregation methods. Aided by the variable DDP, feature selection leads to classification performance that is better than that of rank-based or equal-priorities scoring methods, and to accuracies higher than previously reported for benchmark datasets with a large number of classes. Finally, based on several criteria, we make general recommendations on the optimal choice of the combination of FS decomposition type and classifier aggregation method for multiclass microarray datasets.
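
The OVA and PW decompositions named in the abstract can be illustrated with a minimal sketch. The function names and the toy labels below are hypothetical and are not taken from the paper; the sketch only shows how a K-class label vector yields K one-vs.-all subproblems and K(K-1)/2 pairwise subproblems.

```python
from itertools import combinations


def ova_subproblems(labels):
    """One binary relabeling per class: that class (1) vs. all others (0)."""
    classes = sorted(set(labels))
    return {k: [1 if y == k else 0 for y in labels] for k in classes}


def pw_subproblems(labels):
    """One binary subproblem per unordered class pair.

    Samples from other classes are dropped; each entry holds the kept
    sample indices and their binary labels (first class of the pair = 1).
    """
    classes = sorted(set(labels))
    problems = {}
    for a, b in combinations(classes, 2):
        idx = [i for i, y in enumerate(labels) if y in (a, b)]
        problems[(a, b)] = (idx, [1 if labels[i] == a else 0 for i in idx])
    return problems


# Hypothetical 4-class label vector for 8 samples.
labels = ["A", "B", "C", "D", "B", "A", "C", "D"]
ova = ova_subproblems(labels)
pw = pw_subproblems(labels)
print(len(ova), len(pw))  # 4 OVA subproblems, 6 PW subproblems
```

The all-classes-at-once setting corresponds to using the original label vector unchanged, so it needs no relabeling step.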


          Reviews

          Yongxi Tan

          The classification of tumor samples and the discovery of biomarkers from DNA microarray gene expression data intensify the need for feature (gene) selection prior to classification. This is due to the high dimensionality of microarray gene expression data, which typically has thousands of variables (genes) but fewer observations (samples), and in which severe collinearity is observed. Generally speaking, a good feature selection method should select features (genes) that correlate highly with the sample class labels (high relevance) and weakly with each other (low redundancy). Existing correlation-based feature selection techniques either ignore redundancy among the selected features or give redundancy and relevance equal weight. In contrast, this paper introduces an important parameter, the degree of differential prioritization (DDP, where 0 < DDP ≤ 1), which characterizes the relative importance of relevance and redundancy in feature selection and classification. Typically, a high DDP gives more priority to maximizing relevance, at the cost of increased redundancy, while a low DDP puts more emphasis on minimizing redundancy, at the cost of decreased relevance. To systematically investigate the effect of DDP on feature selection and classification performance, three feature-selection decomposition methods (all-classes-at-once, one-versus-all, and pairwise) were combined with three conventional classifier aggregation methods (non-aggregated single-machine, aggregated one-versus-all, and aggregated pairwise) to classify eight well-known multiclass microarray benchmark datasets, using varying DDP values and fixed sizes of selected feature sets. It was found that the optimal value of DDP depends on both the combination type and the number of classes in the dataset, and that classification accuracy depends more on the feature-selection decomposition method than on the classifier aggregation method.
In summary, this paper is well written and easy to understand. For those interested in feature selection and classification using high-dimensional microarray gene expression data, this paper is worth reading.
Online Computing Reviews Service
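
The trade-off the review describes can be made concrete with a small sketch. Everything specific here is an assumption for illustration and is not confirmed by this page: the score form relevance^DDP × antiredundancy^(1 − DDP), the use of absolute Pearson correlation for both relevance and redundancy, the greedy forward search, the two-class toy target, and all function and feature names.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 for a constant vector."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def ddp_score(relevance, antiredundancy, ddp):
    """Assumed score: a high ddp favors relevance, a low ddp favors antiredundancy."""
    return (relevance ** ddp) * (antiredundancy ** (1.0 - ddp))


def select_features(features, target, size, ddp):
    """Greedy forward selection of `size` features (hypothetical procedure)."""
    rel = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = [max(rel, key=rel.get)]  # start from the most relevant feature
    while len(selected) < size:
        best, best_score = None, -1.0
        for name in features:
            if name in selected:
                continue
            cand = selected + [name]
            v = sum(rel[f] for f in cand) / len(cand)  # mean relevance
            pairs = list(combinations(cand, 2))
            # Mean antiredundancy, clamped at 0 to guard against rounding.
            u = sum(max(0.0, 1.0 - abs(pearson(features[a], features[b])))
                    for a, b in pairs) / len(pairs)
            score = ddp_score(v, u, ddp)
            if score > best_score:
                best, best_score = name, score
        selected.append(best)
    return selected


# Toy data: g2 nearly duplicates g1; g3 is less relevant but less redundant.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "g1": [1, 2, 1, 2, 5, 6, 5, 6],
    "g2": [1, 2, 1, 2, 5, 6, 5, 4],
    "g3": [3, 1, 4, 2, 4, 3, 5, 4],
}
print(select_features(features, target, 2, ddp=1.0))  # ['g1', 'g2']
print(select_features(features, target, 2, ddp=0.5))  # ['g1', 'g3']
```

With DDP = 1 the antiredundancy term drops out and the near-duplicate g2 is chosen for its relevance; with DDP = 0.5 the less redundant g3 wins, which matches the qualitative behavior described in the review.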

          Published In

          Data Mining and Knowledge Discovery, Volume 14, Issue 3
          Jun 2007
          126 pages

          Publisher

          Kluwer Academic Publishers

          United States

          Publication History

          Published: 01 June 2007
          Accepted: 10 July 2006
          Received: 16 February 2006

          Author Tags

          1. Tissue classification
          2. Microarray data analysis
          3. Multiclass classification
          4. Feature selection
          5. Classifier aggregation

          Qualifiers

          • Research-article

          Cited By

          • (2015) Frequency Decomposition Based Gene Clustering. Proceedings, Part II, of the 22nd International Conference on Neural Information Processing, Volume 9490, pp. 170-181. doi:10.1007/978-3-319-26535-3_20. Online publication date: 9-Nov-2015
          • (2013) Base Model Combination Algorithm for Resolving Tied Predictions for K-Nearest Neighbor OVA Ensemble Models. INFORMS Journal on Computing 25:3, pp. 517-526. doi:10.5555/3214394.3214395. Online publication date: 1-Aug-2013
          • (2013) Positive-versus-Negative Classification for Model Aggregation in Predictive Data Mining. INFORMS Journal on Computing 25:4, pp. 792-807. doi:10.5555/2700769.2700784. Online publication date: 1-Nov-2013
          • (2013) Base Model Combination Algorithm for Resolving Tied Predictions for K-Nearest Neighbor OVA Ensemble Models. INFORMS Journal on Computing 25:3, pp. 517-526. doi:10.5555/2508911.2508922. Online publication date: 1-Jul-2013
          • (2013) A Novel Feature Selection Method for Classification Using a Fuzzy Criterion. Revised Selected Papers of the 7th International Conference on Learning and Intelligent Optimization, Volume 7997, pp. 455-467. doi:10.1007/978-3-642-44973-4_49. Online publication date: 7-Jan-2013
          • (2010) A decision rule-based method for feature selection in predictive data mining. Expert Systems with Applications: An International Journal 37:1, pp. 602-609. doi:10.1016/j.eswa.2009.06.031. Online publication date: 1-Jan-2010
