Using Medians to Generate Consensus Rankings for Biological Data

Sarah Cohen-Boulakia^19,20,
Alain Denise^19,20,21 &
Sylvie Hamel²²

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6809))

Included in the following conference series:

International Conference on Scientific and Statistical Database Management

1590 Accesses

Abstract

Faced with the deluge of data available in biological databases, it becomes increasingly difficult for scientists to obtain reasonable sets of answers to their biological queries. A critical example appears in medicine, where physicians frequently need to get information about genes associated with a given disease. When they pose such queries to Web portals (e.g., Entrez NCBI) they usually get huge amounts of answers which are not ranked, making them very difficult to be exploited. In the last years, while several ranking approaches have been proposed, none of them is considered as the most promising.

Instead of considering ranking methods as alternative approaches, we propose to generate a consensus ranking to highlight the common points of a set of rankings while minimizing their disagreements. Our work is based on the concept of median, originally defined on permutations: Given m permutations and a distance function, the median problem is to find a permutation that is the closest of the m given permutations. We have investigated the problem of computing a median of a set of m rankings considering different elements and ties, under a generalized Kendall-τ distance. This problem is known to be NP-hard. In this paper, we present a new heuristic for the problem and we demonstrate the benefit of our approach on real queries using four different ranking methods.

Availability: http://bioguide-project.net/bioconsert

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: GBP 19.95; Price includes VAT (United Kingdom)

eBook: GBP 35.99; Price includes VAT (United Kingdom)

Softcover Book: GBP 44.99; Price includes VAT (United Kingdom)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

ConQuR-Bio: Consensus Ranking with Query Reformulation for Biological Data

Confidence Intervals for Quantiles in Ranked Set Sampling

Article 13 November 2019

The linear ordering problem with clusters: a new partial ranking

Article 13 March 2020

References

Ailon, N.: Aggregation of Partial Rankings, p-Ratings and Top-m Lists. Algorithmica 57(2), 284–300 (2010)
Article MATH Google Scholar
Ailon, N., Charikar, M., Newman, N.: Aggregating inconsistent information: Ranking and clustering. In: Proceedings of the 37th STOC, pp. 684–693 (2005)
Google Scholar
Betzler, N., Fellows, M.R., Guo, J., Niedermeier, R., Rosamond, F.A.: Fixed-Parameter Algorithms for Kemeny Scores. In: Fleischer, R., Xu, J. (eds.) AAIM 2008. LNCS, vol. 5034, pp. 60–71. Springer, Heidelberg (2008)
Chapter Google Scholar
Brancotte, B., Biton, A., Bernard-Pierrot, I., Radvanyi, F., Reyal, F., Cohen-Boulakia, S.: Gene List significance at-a-glance with GeneValorization. To Appear in Bioinformatics (Application Note) (February 2011)
Google Scholar
Birkland, A., Yona, G.: Biozon: a system for unification, management and analysis of heterogeneous biological data. BMC Bioinformatics 7, 70 (2006)
Article Google Scholar
Blin, G., Crochemore, M., Hamel, S., Vialette, S.: Medians of an odd number of permutations. To Appear in Pure Mathematics and Applications (2011)
Google Scholar
Cohen-Boulakia, S., Biton, O., Davidson, S., Froidevaux, C.: BioGuideSRS: Querying Multiple Sources with a user-centric perspective. Bioinformatics 23(10), 1301–1303 (2006)
Article Google Scholar
Dwork, C., Kumar, R., Naor, M., Sivakumar, D.: Rank Aggregation Methods for the Web. In: Proceedings of the 10th WWW, pp. 613–622 (2001)
Google Scholar
Fagin, R., Kumar, R., Mahdian, M., Sivakumar, D., Vee, E.: Comparing and Aggregating Rankings with Ties. In: Proceedings of PODS 2004, pp. 47–55 (2004)
Google Scholar
Hussels, P., Trissl, S., Leser, U.: What’s new? What’s certain? – scoring search results in the presence of overlapping data sources. In: Cohen-Boulakia, S., Tannen, V. (eds.) DILS 2007. LNCS (LNBI), vol. 4544, pp. 231–246. Springer, Heidelberg (2007)
Chapter Google Scholar
Kendall, M.: A New Measure of Rank Correlation. Biometrika 30, 81–89 (1938)
Article MATH Google Scholar
Kenyon-Mathieu, C., Schudy, W.: How to rank with few errors. In: Proceedings of the 39th STOC, pp. 95–103 (2007)
Google Scholar
Lacroix, Z., Raschid, L., Vidal, M.E.: Efficient techniques to explore and rank paths in life science data sources. In: Rahm, E. (ed.) DILS 2004. LNCS (LNBI), vol. 2994, pp. 187–202. Springer, Heidelberg (2004)
Chapter Google Scholar
Laignel, N.: Ranking biological data taking into account user’s preferences, Master thesis report (co-supervised by S. Cohen-Boulakia, C. Froidevaux, and U. Leser), University of Paris-Sud XI (2010)
Google Scholar
Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank Citation Ranking: Bringing Order to the Web. Stanford University, Stanford (1998)
Google Scholar
Shafer, P., Isganitis, T., Yona, G.: Hubs of knowledge: using the functional link structure in Biozon to mine for biologically significant entities. BMC Bioinformatics 7, 71 (2006)
Article Google Scholar
Varadarajan, R., Hritidis, V., Raschid, L., Vidal, M., Ibanez, L., Rodriguez-Drumond, H.: Flexible and Efficient Querying and Ranking on Hyperlinked Data Sources. In: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, pp. 553–564 (2009)
Google Scholar
Wilf, H.S.: Generatingfunctionology, p. 147. Academic Press, NY (1990)
MATH Google Scholar
van Zuylen, A., Williamson, D.P.: Deterministic algorithms for rank aggregation and other ranking and clustering problems. In: Kaklamanis, C., Skutella, M. (eds.) WAOA 2007. LNCS, vol. 4927, pp. 260–273. Springer, Heidelberg (2008)
Chapter Google Scholar

Download references

Author information

Authors and Affiliations

LRI (Laboratoire de Recherche en Informatique), CNRS UMR 8623, Université Paris-Sud, France
Sarah Cohen-Boulakia & Alain Denise
AMIB Group, INRIA Saclay Ile-de-France, France
Sarah Cohen-Boulakia & Alain Denise
IGM (Institut de Génétique et de Microbiologie), CNRS UMR 8621, Université Paris-Sud, France
Alain Denise
DIRO (Département d’Informatique et de Recherche Opérationnelle), Université de Montréal, QC, Canada
Sylvie Hamel

Authors

Sarah Cohen-Boulakia
View author publications
You can also search for this author in PubMed Google Scholar
Alain Denise
View author publications
You can also search for this author in PubMed Google Scholar
Sylvie Hamel
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

The Evergreen State College, 98505, Olympia, WA, USA
Judith Bayard Cushing
CNRI and University of Virginia, 22908, Charlottesville, VA, USA
James French
Gonzaga University, 99258, Spokane, WA, USA
Shawn Bowers

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Cohen-Boulakia, S., Denise, A., Hamel, S. (2011). Using Medians to Generate Consensus Rankings for Biological Data. In: Bayard Cushing, J., French, J., Bowers, S. (eds) Scientific and Statistical Database Management. SSDBM 2011. Lecture Notes in Computer Science, vol 6809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22351-8_5

Download citation

DOI: https://doi.org/10.1007/978-3-642-22351-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22350-1
Online ISBN: 978-3-642-22351-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Using Medians to Generate Consensus Rankings for Biological Data

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

ConQuR-Bio: Consensus Ranking with Query Reformulation for Biological Data

Confidence Intervals for Quantiles in Ranked Set Sampling

The linear ordering problem with clusters: a new partial ranking

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Using Medians to Generate Consensus Rankings for Biological Data

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

ConQuR-Bio: Consensus Ranking with Query Reformulation for Biological Data

Confidence Intervals for Quantiles in Ranked Set Sampling

The linear ordering problem with clusters: a new partial ranking

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation