
Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

Published: 01 January 2010

Abstract

The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex data sets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss 1) the integration of data clustering and visualization into one framework, 2) the application of data clustering to 3D gene expression data, 3) the evaluation of the number of clusters k in the context of 3D gene expression clustering, and 4) the improvement of overall analysis quality via dedicated postprocessing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.
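As a rough illustration of item 3, evaluating the number of clusters k, the sketch below clusters synthetic per-cell expression values and reports the within-cluster sum of squares for several values of k. The abstract does not say which clustering algorithm or validity measure the framework uses, so plain k-means with farthest-first seeding and an "elbow" inspection stand in here as assumptions. The data (`cells`, two hypothetical genes, three artificial expression domains) and all function names are made up for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from every
    centroid chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iters=100):
    """Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each cell joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        converged = new_labels == labels
        labels = new_labels
        if converged:
            break
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squares: lower means tighter clusters."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Synthetic stand-in data: 60 cells, each with expression values for two
# hypothetical genes, drawn around three artificial expression domains.
rng = random.Random(42)
centers = [(0.1, 0.1), (0.9, 0.1), (0.5, 0.9)]
cells = [
    (cx + rng.gauss(0, 0.03), cy + rng.gauss(0, 0.03))
    for cx, cy in centers
    for _ in range(20)
]

# Evaluate a range of k; a sharp drop followed by a plateau suggests a
# reasonable number of clusters for this data.
wcss_by_k = {}
for k in range(1, 6):
    labels, centroids = kmeans(cells, k)
    wcss_by_k[k] = wcss(cells, labels, centroids)
    print(k, round(wcss_by_k[k], 4))

labels3, _ = kmeans(cells, 3)
```

In an interactive setting such as the one the abstract describes, a numeric criterion like this would presumably be only one input. The user could also judge a candidate k by looking at how the resulting clusters map onto the embryo.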



Published In

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 7, Issue 1
January 2010
190 pages

Publisher

IEEE Computer Society Press
Washington, DC, United States

Author Tags

1. Bioinformatics visualization
2. cluster visualization
3. data clustering
4. gene expression pattern
5. gene regulation
6. integrating Infovis/Scivis
7. multimodal visualization
8. spatial expression pattern
9. temporal expression variation
10. three-dimensional gene expression
11. visual data mining

Qualifiers

• Article
