Abstract
The complexity, approximability, and algorithmic issues of several clustering problems are studied. These non-traditional clustering problems arise from recent studies in microarray data analysis. We prove the following results. (1) Two variants of the Order-Preserving Submatrix problem are NP-hard. There are polynomial-time algorithms for the Order-Preserving Submatrix problem when the condition set or the gene set is given. (2) The Smooth Subset problem cannot be approximated within ratio 0.5 + δ for any constant δ > 0 unless NP = P. (3) The problem of inferring a plaid model is NP-hard.
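To give some intuition for why result (1) becomes tractable once one side is fixed, here is a minimal sketch. It assumes the usual definition of an order-preserving submatrix: a set of rows (genes) and columns (conditions) for which a single ordering of the columns makes every selected row strictly increasing. The abstract does not state the exact problem variants or the authors' algorithms, so the function names `largest_gene_set` and `largest_condition_set` and the approach are illustrative assumptions, not the paper's method.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]  # rows = genes, columns = conditions


def largest_gene_set(matrix: Matrix,
                     conditions: Sequence[int]) -> Tuple[Tuple[int, ...], List[int]]:
    """Conditions fixed: group rows by the ordering they induce on the
    chosen columns and return the most common ordering with its rows.
    (Hypothetical helper; assumes strict orderings, so rows with ties are skipped.)"""
    groups: Dict[Tuple[int, ...], List[int]] = {}
    for r, row in enumerate(matrix):
        values = [row[c] for c in conditions]
        if len(set(values)) < len(values):
            continue  # a tie means no strictly increasing ordering exists
        order = tuple(sorted(conditions, key=lambda c: row[c]))
        groups.setdefault(order, []).append(r)
    if not groups:
        return (), []
    best = max(groups, key=lambda o: len(groups[o]))
    return best, groups[best]


def largest_condition_set(matrix: Matrix, genes: Sequence[int]) -> List[int]:
    """Genes fixed: column a 'precedes' column b if every chosen row has
    matrix[r][a] < matrix[r][b]. This relation is a strict partial order,
    so a longest chain can be found by dynamic programming.
    (Hypothetical helper; assumes a non-empty matrix and gene list.)"""
    if not genes:
        raise ValueError("at least one gene (row) is required")

    def precedes(a: int, b: int) -> bool:
        return all(matrix[r][a] < matrix[r][b] for r in genes)

    # Sorting by the first chosen row gives a linear extension of the order:
    # if a precedes b, then a has the smaller value in that row.
    cols = sorted(range(len(matrix[0])), key=lambda c: matrix[genes[0]][c])
    best_len: Dict[int, int] = {c: 1 for c in cols}
    prev: Dict[int, Optional[int]] = {c: None for c in cols}
    for i, b in enumerate(cols):
        for a in cols[:i]:
            if best_len[a] + 1 > best_len[b] and precedes(a, b):
                best_len[b] = best_len[a] + 1
                prev[b] = a

    # Walk back from the end of a longest chain.
    node: Optional[int] = max(cols, key=lambda c: best_len[c])
    chain: List[int] = []
    while node is not None:
        chain.append(node)
        node = prev[node]
    chain.reverse()
    return chain


if __name__ == "__main__":
    expression = [
        [1, 5, 3, 9],
        [2, 8, 4, 7],
        [9, 1, 5, 2],
    ]
    print(largest_gene_set(expression, [0, 1, 2]))
    print(largest_condition_set(expression, [0, 1]))
```

Both routines run in time polynomial in the matrix size: the first sorts each row once, and the second compares every pair of columns across the chosen rows. The hardness arises only when rows and columns must be chosen simultaneously.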
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tan, J., Chua, K.S., Zhang, L. (2005). Algorithmic and Complexity Issues of Three Clustering Methods in Microarray Data Analysis. In: Wang, L. (eds) Computing and Combinatorics. COCOON 2005. Lecture Notes in Computer Science, vol 3595. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11533719_10
DOI: https://doi.org/10.1007/11533719_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28061-3
Online ISBN: 978-3-540-31806-4
eBook Packages: Computer Science (R0)