Abstract
Genes responding similarly to changing conditions are believed to be functionally related. Identification of such functional relations is crucial for annotation of unknown genes as well as the exploration of the underlying regulatory program. Gene expression profiling experiments provide noisy datasets about how cells respond to different experimental conditions. One way of analyzing these datasets is the identification of gene groups with similar expression patterns. A prevailing technique to find gene pairs with correlated expression profiles is to use linear measures like Pearson’s correlation coefficient or Euclidean distance. Similar genes are later compiled into a co-expression network to explore the system-level functionality of genes. However, the noise inherent in microarray datasets reduces the sensitivity of these measures and produces many spurious pairs with no real biological relevance. In this paper, we explore an extrinsic way of calculating similarity of two genes based on their relations with other genes. We show that ‘similar’ pairs identified by extrinsic measures overlap better with known biological annotations available in the Gene Ontology database. Our results also indicate that extrinsic measures are useful in enhancing the quality of co-expression networks and their functional subnetworks.
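The contrast between a direct (intrinsic) measure and an extrinsic one can be illustrated with a minimal sketch. It is not the measure developed in the paper: it assumes Pearson correlation as the underlying relation, and it treats all remaining genes as the "other genes" (probes) against which a pair is compared. The function names and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np


def intrinsic_similarity(expr: np.ndarray) -> np.ndarray:
    """Pearson correlation between every pair of gene rows (genes x samples)."""
    return np.corrcoef(expr)


def extrinsic_similarity(expr: np.ndarray) -> np.ndarray:
    """Compare two genes through their relations with all other genes.

    Assumption for this sketch: the relation of a gene to a probe gene is
    its Pearson correlation with that probe, and two genes are extrinsically
    similar when their correlation profiles over the probes agree.
    Requires at least 4 genes so that every pair has 2 or more probes.
    """
    corr = intrinsic_similarity(expr)
    n = corr.shape[0]
    ext = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            # Probe set: every gene except the pair being compared.
            probes = np.array([k for k in range(n) if k not in (i, j)])
            r = np.corrcoef(corr[i, probes], corr[j, probes])[0, 1]
            ext[i, j] = ext[j, i] = r
    return ext


def toy_expression(seed: int = 0, samples: int = 50) -> np.ndarray:
    """Synthetic data: two modules of three noisy genes each (illustrative only)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=samples)  # hidden signal driving module A
    b = rng.normal(size=samples)  # hidden signal driving module B
    rows = [a + 0.3 * rng.normal(size=samples) for _ in range(3)]
    rows += [b + 0.3 * rng.normal(size=samples) for _ in range(3)]
    return np.vstack(rows)


if __name__ == "__main__":
    expr = toy_expression()
    print(np.round(intrinsic_similarity(expr), 2))
    print(np.round(extrinsic_similarity(expr), 2))
```

In this toy setting, genes from the same module share a correlation profile over the probes, so their extrinsic score is high even when noise weakens their direct correlation. Other relation measures, such as mutual information, could be substituted for the Pearson step.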
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ucar, D., Altiparmak, F., Ferhatosmanoglu, H., Parthasarathy, S. (2009). Mutual Information Based Extrinsic Similarity for Microarray Analysis. In: Rajasekaran, S. (eds) Bioinformatics and Computational Biology. BICoB 2009. Lecture Notes in Computer Science, vol 5462. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00727-9_39
DOI: https://doi.org/10.1007/978-3-642-00727-9_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00726-2
Online ISBN: 978-3-642-00727-9
eBook Packages: Computer Science (R0)