Abstract
The rapid accumulation of whole-genome data has renewed interest in the study of genomic rearrangements. Comparative genomics, evolutionary biology, and cancer research all require models and algorithms to elucidate the mechanisms, history, and consequences of these rearrangements. However, even simple models lead to NP-hard problems, particularly in the area of phylogenetic analysis. Current approaches are limited to small collections of genomes and low-resolution data (typically a few hundred syntenic blocks). Moreover, whereas phylogenetic analyses from sequence data are deemed incomplete unless bootstrapping scores (a measure of confidence) are given for each tree edge, no equivalent to bootstrapping exists for rearrangement-based phylogenetic analysis.
We describe a fast and accurate algorithm for rearrangement analysis that scales up, in both time and accuracy, to modern high-resolution genomic data. We also describe a novel approach to estimate the robustness of results—an equivalent to the bootstrapping analysis used in sequence-based phylogenetic reconstruction. We present the results of extensive testing on both simulated and real data showing that our algorithm returns very accurate results, while scaling linearly with the size of the genomes and cubically with their number. We also present extensive experimental results showing that our approach to robustness testing provides excellent estimates of confidence, which, moreover, can be tuned to trade off thresholds between false positives and false negatives. Together, these two novel approaches enable us to attack heretofore intractable problems, such as phylogenetic inference for high-resolution vertebrate genomes, as we demonstrate on a set of six vertebrate genomes with 8,380 syntenic blocks.
Availability: a copy of the software is available on demand.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Alekseyev, M.A., Pevzner, P.A.: Breakpoint graphs and ancestral genome reconstructions. Genome Research 19(5), 943–957 (2009)
Amrine-Madsen, H., Koepfli, K.-P., Wayne, R.K., Springer, M.S.: A new phylogenetic marker, apolipoprotein b, provides compelling evidence for eutherian relationships. Molecular Phylogenetics and Evolution 28(2), 225–240 (2003)
Bergeron, A., Mixtacki, J., Stoye, J.: A unifying view of genome rearrangements. In: Bücher, P., Moret, B.M.E. (eds.) WABI 2006. LNCS (LNBI), vol. 4175, pp. 163–173. Springer, Heidelberg (2006)
Blanchette, M., Bourque, G., Sankoff, D.: Breakpoint phylogenies. In: Miyano, S., Takagi, T. (eds.) Genome Informatics, pp. 25–34. Univ. Academy Press, Tokyo (1997)
Bourque, G., Pevzner, P.: Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Res. 12, 26–36 (2002)
Cannarozzi, G., Schneider, A., Gonnet, G.: A phylogenomic study of human, dog, and mouse. PLoS Comput. Biol. 3, e2 (2007)
Day, W.H.E., Sankoff, D.: The computational complexity of inferring phylogenies from chromosome inversion data. J. Theor. Biol. 127, 213–218 (1987)
Felsenstein, J.: Confidence limits on phylogenies: an approach using the bootstrap. Evol. 39, 783–791 (1985)
Felsenstein, J., Kishino, H.: Is there something wrong with the bootstrap on phylogenies? A reply to Hillis and Bull. Syst. Biol. 42(2), 193–200 (1993)
Fertin, G., Labarre, A., Rusu, I., Tannier, E., Vialette, S.: Combinatorics of Genome Rearrangements. MIT Press, Cambridge (2009)
Hillis, D.M., Huelsenbeck, J.P.: Assessing molecular phylogenies. Science 267, 255–256 (1995)
Huttley, G.A., Wakefield, M.J., Easteal, S.: Rates of genome evolution and branching order from whole-genome analysis. Mol. Biol. Evol. 24(8), 1722–1730 (2007)
Lin, Y., Moret, B.M.E.: Estimating true evolutionary distances under the DCJ model. In: Proc. 16th Int’l Conf. on Intelligent Systems for Mol. Biol. (ISMB 2008). Bioinformatics, vol. 24(13), pp. i114–i122 (2008)
Lin, Y., Rajan, V., Swenson, K.M., Moret, B.M.E.: Estimating true evolutionary distances under rearrangements, duplications, and losses. In: Proc. 8th Asia Pacific Bioinf. Conf (APBC 2010). BMC Bioinformatics, vol. 11(suppl. 1), p. S54 (2010)
Madsen, O., Scally, M., Douady, C.J., Kao, D.J., DeBry, R.W., Adkins, R., Amrine, H.M., Stanhope, M.J., de Jong, W.W., Springer, M.S.: Parallel adaptive radiations in two major clades of placental mammals. Nature 409, 610–614 (2001)
Moret, B.M.E., Tang, J., Wang, L.-S., Warnow, T.: Steps toward accurate reconstructions of phylogenies from gene-order data. J. Comput. Syst. Sci. 65(3), 508–525 (2002)
Moret, B.M.E., Wyman, S.K., Bader, D.A., Warnow, T., Yan, M.: A new implementation and detailed study of breakpoint analysis. In: Proc. 6th Pacific Symp. on Biocomputing (PSB 2001), pp. 583–594. World Scientific Pub., Singapore (2001)
Murphy, W.J., Eizirik, E., Johnson, W.E., Zhang, Y.P., Ryder, O.A., O’Brien, S.J.: Molecular phylogenetics and the origins of placental mammals. Nature 409, 614–618 (2001)
R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2009), ISBN 3-900051-07-0
Robinson, D.R., Foulds, L.R.: Comparison of phylogenetic trees. Mathematical Biosciences 53, 131–147 (1981)
Rokas, A., Holland, P.W.H.: Rare genomic changes as a tool for phylogenetics. Trends in Ecol. and Evol. 15, 454–459 (2000)
Saitou, N., Nei, M.: The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987)
Shi, J., Zhang, Y., Luo, H., Tang, J.: Using jackknife to assess the quality of gene order phylogenies. BMC Bioinformatics 11(168) (2010)
Soltis, P.S., Soltis, D.E.: Applying the bootstrap in phylogeny reconstruction. Statist. Sci. 18(2), 256–267 (2003)
Sturtevant, A.H.: A crossover reducer in Drosophila melanogaster due to inversion of a section of the third chromosome. Biol. Zent. Bl. 46, 697–702 (1926)
Sturtevant, A.H., Dobzhansky, T.: Inversions in the third chromosome of wild races of drosophila pseudoobscura and their use in the study of the history of the species. Proc. Nat’l Acad. Sci., USA 22, 448–450 (1936)
Swofford, D.L., Olsen, G.J., Waddell, P.J., Hillis, D.M.: Phylogenetic inference. In: Hillis, D.M., Mable, B.K., Moritz, C. (eds.) Molecular Systematics, pp. 407–514. Sinauer Assoc., Sunderland (1996)
Tang, J., Moret, B.M.E.: Scaling up accurate phylogenetic reconstruction from gene-order data. In: Proc. 11th Int’l Conf. on Intelligent Systems for Mol. Biol (ISMB 2003). Bioinformatics, vol. 19, pp. i305–i312. Oxford U. Press, Oxford (2003)
Wang, L.-S.: Exact-IEBP: a new technique for estimating evolutionary distances between whole genomes. In: Proc. 33rd Ann. ACM Symp. Theory of Comput (STOC 2001), pp. 637–646. ACM Press, New York (2001)
Wang, L.-S., Warnow, T.: Estimating true evolutionary distances between genomes. In: Gascuel, O., Moret, B.M.E. (eds.) WABI 2001. LNCS, vol. 2149, pp. 176–190. Springer, Heidelberg (2001)
Wildman, D.E., Uddin, M., Opazo, J.C., Liu, G., Lefort, V., Guindon, S., Gascuel, O., Grossman, L.I., Romero, R., Goodman, M.: Genomics, biogeography, and the diversification of placental mammals. Proc. Nat’l Acad. Sci., USA 104(36), 14395–14400 (2007)
Yancopoulos, S., Attie, O., Friedberg, R.: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 21(16), 3340–3346 (2005)
Zwickl, D.J., Hillis, D.M.: Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. 51(4), 588–598 (2002)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lin, Y., Rajan, V., Moret, B.M.E. (2010). Fast and Accurate Phylogenetic Reconstruction from High-Resolution Whole-Genome Data and a Novel Robustness Estimator. In: Tannier, E. (eds) Comparative Genomics. RECOMB-CG 2010. Lecture Notes in Computer Science(), vol 6398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16181-0_12
Download citation
DOI: https://doi.org/10.1007/978-3-642-16181-0_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16180-3
Online ISBN: 978-3-642-16181-0
eBook Packages: Computer ScienceComputer Science (R0)