DOI: 10.1145/1081870.1081879
Article

Consistent bipartite graph co-partitioning for star-structured high-order heterogeneous data co-clustering

Published: 21 August 2005

Abstract

Heterogeneous data co-clustering has attracted increasing attention in recent years because of its impact on a wide range of applications. Co-clustering algorithms for two types of heterogeneous data (pair-wise co-clustering), such as documents and terms, have been well studied in the literature, but work on more than two types (high-order co-clustering) is still very limited. As an attempt in this direction, this paper addresses a specific case of high-order co-clustering in which a central type of object connects the other types, so that the inter-relationships form a star structure. This case is a good abstraction of many real-world applications, such as the co-clustering of categories, documents, and terms in text mining. We treat such problems as the fusion of multiple pair-wise co-clustering sub-problems under the constraint of the star structure. Accordingly, we propose the concept of consistent bipartite graph co-partitioning and develop an algorithm based on semi-definite programming (SDP) for efficient computation of the clustering results. Experiments on toy problems and real data both verify the effectiveness of the proposed method.
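To make the star structure concrete, the following is a minimal sketch and not the paper's SDP algorithm. It swaps in a plain spectral relaxation (the second eigenvector of a normalized Laplacian) on the weighted union of a document-term graph and a document-category graph. Because the documents appear only once as vertices, both bipartite cuts are forced to agree on them, which is the consistency idea in its simplest form. The function name `star_spectral_copartition`, the trade-off weight `beta`, and the toy matrices are all assumptions made for illustration.

```python
import numpy as np

def star_spectral_copartition(A, B, beta=0.5):
    """Two-way co-partition of a star-structured tripartite graph.

    A: (n_docs, n_terms) non-negative document-term weights.
    B: (n_docs, n_cats)  non-negative document-category weights.
    beta: hypothetical trade-off in (0, 1) between the two bipartite graphs.
    Returns boolean side indicators for documents, terms, and categories.
    """
    n, m = A.shape
    k = B.shape[1]
    size = n + m + k
    # Vertex order: documents (the central type), then terms, then categories.
    W = np.zeros((size, size))
    W[:n, n:n + m] = beta * A
    W[n:n + m, :n] = beta * A.T
    W[:n, n + m:] = (1.0 - beta) * B
    W[n + m:, :n] = (1.0 - beta) * B.T

    deg = W.sum(axis=1)  # every vertex is assumed to have positive degree
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L_sym = np.eye(size) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    _, vecs = np.linalg.eigh(L_sym)     # eigenvalues come back in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]   # second-smallest generalized eigenvector
    side = fiedler > 0                  # split vertices by sign
    return side[:n], side[n:n + m], side[n + m:]

# Toy data (assumed): two planted groups plus weak background links
# so that the graph stays connected.
block = np.kron(np.eye(2), np.ones((3, 2)))  # 6 documents x 4 columns
A = block + 0.05   # document-term weights
B = block + 0.05   # document-category weights
docs, terms, cats = star_spectral_copartition(A, B)
```

On this toy input, documents 0-2 fall on the same side as terms 0-1 and categories 0-1, and documents 3-5 fall on the other side with the remaining terms and categories. Which side is labelled `True` is arbitrary, since an eigenvector's sign is not determined.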

Published In

KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
August 2005
844 pages
ISBN: 159593135X
DOI: 10.1145/1081870

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. co-clustering
2. consistency
3. high-order heterogeneous data
4. spectral graph

Acceptance Rates

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
