Gene Expression Data Analysis Using a Novel Approach to Biclustering Combining Discrete and Continuous Data

Published: 01 October 2008

Abstract

Many methods exist for detecting patterns in gene expression data. In contrast to classical clustering methods, biclustering can group a set of genes together with a set of conditions (replicates, sets of patients, or drug compounds). However, since the problem is NP-hard, most algorithms use heuristic search functions and may therefore converge to local maxima. By using the results of biclustering on discrete data as the starting point for a local search on continuous data, our algorithm avoids the problem of heuristic initialization. Similar to OPSM, our algorithm aims to detect biclusters whose rows and columns can be ordered such that row values increase across the bicluster's columns and vice versa. Results were generated on the yeast genome (Saccharomyces cerevisiae), a human cancer dataset, and random data. On the yeast genome, 89% of the one hundred largest non-overlapping biclusters were enriched with Gene Ontology annotations. A comparison with OPSM and ISA demonstrated better efficiency when using gene and condition orders. Together, these results on random and real datasets show the ability of our algorithm to capture statistically significant and biologically relevant biclusters.
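
The order-preserving criterion in the abstract can be illustrated with a short Python sketch. It is not the authors' algorithm, which the abstract does not specify in detail. The function names `is_order_preserving` and `extend_rows`, the requirement of strictly increasing values, and the greedy row extension are all assumptions made for illustration. The first function tests whether a submatrix admits a column order under which every row increases and a row order under which every column increases. The second is one simple form of local search, growing a seed bicluster (such as one obtained from discretized data) on the continuous matrix.

```python
import numpy as np

def is_order_preserving(sub):
    """Check whether rows and columns of `sub` can be reordered so that
    values strictly increase along every row and down every column.

    Assumption: strict increase, no missing values, at least one cell.
    """
    sub = np.asarray(sub, dtype=float)
    if sub.ndim != 2 or sub.size == 0:
        raise ValueError("expected a non-empty 2-D array")
    # If a valid column order exists, it must sort the first row,
    # so sorting by the first row gives the only candidate order.
    col_order = np.argsort(sub[0, :])
    rows_increase = np.all(np.diff(sub[:, col_order], axis=1) > 0)
    # Same argument for the row order, using the first column.
    row_order = np.argsort(sub[:, 0])
    cols_increase = np.all(np.diff(sub[row_order, :], axis=0) > 0)
    return bool(rows_increase and cols_increase)

def extend_rows(data, rows, cols):
    """Greedily add rows of `data` to a seed bicluster (hypothetical helper).

    A row is kept only if the enlarged submatrix is still order-preserving.
    """
    rows = list(rows)
    for r in range(data.shape[0]):
        if r in rows:
            continue
        candidate = rows + [r]
        if is_order_preserving(data[np.ix_(candidate, cols)]):
            rows = candidate
    return rows
```

For example, calling `extend_rows(data, seed_rows, seed_cols)` on a continuous expression matrix returns the seed rows plus any further rows that fit the same ordering. A greedy pass like this depends on the order in which rows are visited, which is one reason the quality of the starting point matters.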

Published In

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 5, Issue 4
October 2008
158 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Author Tags

  1. Bioinformatics (genome or protein) databases
  2. Data and knowledge visualization
  3. Data mining
  4. Graph and tree search strategies
  5. Machine learning
