A Scalable Biclustering Method for Heterogeneous Medical Data

Maxence Vandromme^17,18,19,
Julie Jacques^17,
Julien Taillard^17,
Laetitia Jourdan^18,19 &
Clarisse Dhaenens^18,19

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10122)

Included in the following conference series:

International Workshop on Machine Learning, Optimization, and Big Data


Abstract

We define the problem of biclustering on heterogeneous data, that is, data of various types (binary, numeric, etc.). This problem has not yet been investigated in the biclustering literature. We propose a new method, HBC (Heterogeneous BiClustering), designed to extract biclusters from heterogeneous, large-scale, sparse data matrices. The goal of this method is to handle medical data gathered by hospitals (on patients, stays, acts, diagnoses, prescriptions, etc.) and to provide valuable insight into such data. HBC takes advantage of data sparsity and uses a constructive greedy heuristic to build a large number of possibly overlapping biclusters. The proposed method compares favorably with a standard biclustering algorithm on small numeric data sets. Experiments on real-life data sets further confirm its scalability and efficiency.
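The abstract gives no algorithmic detail, so the following is only an illustrative sketch of what a constructive greedy heuristic on sparse, mixed-type data could look like. It is not the authors' HBC. The function names, the `tol` window for numeric columns, the `min_rows` threshold, and the "stop when the bicluster area no longer grows" rule are all assumptions made for this example, as is the toy data.

```python
from collections import defaultdict

def index_by_column(cells):
    """Build col -> {row: value}, storing only non-empty cells (sparse layout)."""
    by_col = defaultdict(dict)
    for (row, col), value in cells.items():
        by_col[col][row] = value
    return by_col

def best_rows(column, rows, numeric, tol):
    """Largest subset of `rows` whose non-empty values in one column are similar.

    Assumed similarity rule: equal values for binary/categorical columns,
    values within a window of width `tol` for numeric columns.
    """
    present = sorted((column[r], r) for r in rows if r in column)
    if not present:
        return set()
    if not numeric:
        groups = defaultdict(set)
        for value, r in present:
            groups[value].add(r)
        return max(groups.values(), key=len)
    best, start = set(), 0
    for end in range(len(present)):  # sliding window over sorted values
        while present[end][0] - present[start][0] > tol:
            start += 1
        if end - start + 1 > len(best):
            best = {r for _, r in present[start:end + 1]}
    return best

def greedy_bicluster(cells, seed_col, numeric_cols=frozenset(), tol=1.0, min_rows=2):
    """Grow one bicluster from a seed column, one column at a time."""
    by_col = index_by_column(cells)
    rows = best_rows(by_col[seed_col], set(by_col[seed_col]),
                     seed_col in numeric_cols, tol)
    cols = [seed_col]
    candidates = set(by_col) - {seed_col}
    while candidates:
        # Greedy step: score each candidate by how many current rows it keeps.
        scored = {c: best_rows(by_col[c], rows, c in numeric_cols, tol)
                  for c in candidates}
        col = max(sorted(scored), key=lambda c: len(scored[c]))
        kept = scored[col]
        # Assumed stopping rule: accept only if rows x columns strictly grows.
        if len(kept) < min_rows or len(kept) * (len(cols) + 1) <= len(rows) * len(cols):
            break
        rows, cols = kept, cols + [col]
        candidates.discard(col)
    return rows, cols

# Hypothetical toy data: (patient, attribute) -> value; missing keys are empty cells.
cells = {
    (0, "diabetes"): 1, (0, "insulin"): 1, (0, "age"): 64,
    (1, "diabetes"): 1, (1, "insulin"): 1, (1, "age"): 66,
    (2, "diabetes"): 1, (2, "insulin"): 1, (2, "age"): 30,
    (3, "diabetes"): 1, (3, "age"): 65,
    (4, "fracture"): 1, (4, "age"): 80,
}
rows, cols = greedy_bicluster(cells, "diabetes", numeric_cols={"age"}, tol=5)
```

Running the function once per seed column yields a collection of biclusters that may share rows and columns, which is one simple way to obtain overlapping results. Because only non-empty cells are indexed, the work per step depends on the number of stored values rather than on the full matrix size.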

C. Dhaenens: This work was partially supported by the ClinMine project (ANR-13-TECS-0009).



Author information

Authors and Affiliations

Alicante, Seclin, France
Maxence Vandromme, Julie Jacques & Julien Taillard
CRIStAL, UMR 9189, University of Lille, CNRS, Centrale Lille, Villeneuve d’Ascq, France
Maxence Vandromme, Laetitia Jourdan & Clarisse Dhaenens
INRIA Lille - Nord Europe, Villeneuve d’Ascq, France
Maxence Vandromme, Laetitia Jourdan & Clarisse Dhaenens


Corresponding author

Correspondence to Maxence Vandromme.

Editor information

Editors and Affiliations

Department of Industrial and Systems Engineering, University of Florida, Gainesville, Florida, USA
Panos M. Pardalos
Semantic Technology Laboratory, National Research Council (CNR), Catania, Italy
Piero Conca
Dipartimento di Sociologia e Metodi della Ricerca Sociale, Università di Catania, Catania, Italy
Giovanni Giuffrida
Department of Mathematics and Computer Science, University of Catania, Catania, Italy
Giuseppe Nicosia


Cite this paper

Vandromme, M., Jacques, J., Taillard, J., Jourdan, L., Dhaenens, C. (2016). A Scalable Biclustering Method for Heterogeneous Medical Data. In: Pardalos, P., Conca, P., Giuffrida, G., Nicosia, G. (eds) Machine Learning, Optimization, and Big Data. MOD 2016. Lecture Notes in Computer Science, vol 10122. Springer, Cham. https://doi.org/10.1007/978-3-319-51469-7_6


DOI: https://doi.org/10.1007/978-3-319-51469-7_6
Published: 25 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51468-0
Online ISBN: 978-3-319-51469-7
eBook Packages: Computer Science, Computer Science (R0)
