Abstract
Phylogenetic networks are increasingly used in evolutionary biology to represent the history of species that have undergone reticulate events such as horizontal gene transfer, hybrid speciation and recombination. One of the most fundamental questions that arise in this context is whether the evolution of a gene with one copy in all species can be explained by a given network. In mathematical terms, this is often translated as follows: is a given phylogenetic tree contained in a given phylogenetic network? Recently, this tree containment problem has been widely investigated from a computational perspective, but most studies have focused only on the topology of the phylogenies, ignoring a piece of information that, in the case of phylogenetic trees, is routinely inferred by evolutionary analyses: branch lengths. These measure the amount of change (e.g., nucleotide substitutions) that has occurred along each branch of the phylogeny. Here, we study a number of versions of the tree containment problem that explicitly account for branch lengths. We show that, although length information has the potential to locate a tree within a network more precisely, the problem is computationally hard in its most general form. On a positive note, for a number of special cases of biological relevance, we provide algorithms that solve this problem efficiently. This includes the case of networks of limited complexity, for which it is possible to recover, among the trees contained by the network with the same topology as the input tree, the closest one in terms of branch lengths.
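To make the question concrete, the following is a minimal brute-force sketch of tree containment with branch lengths. It is not one of the algorithms of the paper. It assumes that a tree is contained in a network when, for some choice of one incoming edge per reticulation node, deleting the other reticulation edges, pruning unlabeled dead ends and merging each path through degree-2 nodes into a single edge (summing the lengths) yields exactly the input tree. The edge-list encoding, all function names, the example network and the choice to ignore any length above the first branching node are assumptions made for illustration. The running time is exponential in the number of reticulations, so this is only suitable for tiny examples.

```python
from itertools import product

def leaf_set(edges):
    """Nodes with no outgoing edge, for edges given as (parent, child, length)."""
    tails = {p for p, _, _ in edges}
    return {c for _, c, _ in edges if c not in tails}

def canonical_form(edges, leaves):
    """Order-independent nested form of a rooted tree with branch lengths.

    Dead ends (childless nodes not in `leaves`) are pruned and nodes left
    with a single child are suppressed, adding the two edge lengths.
    Assumption: any length above the first branching node is discarded.
    """
    children = {}
    for p, c, w in edges:
        children.setdefault(p, []).append((c, w))
    heads = {c for _, c, _ in edges}
    root = next(p for p, _, _ in edges if p not in heads)

    def reduced(node):
        # Returns (subtree form, length to add to the edge above) or None.
        if node in leaves:
            return node, 0.0
        kept = []
        for child, w in children.get(node, []):
            sub = reduced(child)
            if sub is not None:
                kept.append((sub[0], w + sub[1]))
        if not kept:
            return None          # no labeled leaf below this node
        if len(kept) == 1:
            return kept[0]       # suppress this node, pass the length up
        pairs = sorted(((f, round(float(w), 9)) for f, w in kept), key=repr)
        return tuple(pairs), 0.0

    return reduced(root)[0]

def displayed_forms(network_edges):
    """Yield the canonical form of the tree obtained from every switching."""
    leaves = leaf_set(network_edges)
    incoming = {}
    for edge in network_edges:
        incoming.setdefault(edge[1], []).append(edge)
    fixed = [es[0] for es in incoming.values() if len(es) == 1]
    choices = [es for es in incoming.values() if len(es) > 1]  # reticulations
    for picked in product(*choices):
        yield canonical_form(fixed + list(picked), leaves)

def contains_with_lengths(network_edges, tree_edges):
    """True if some switching yields the tree with matching branch lengths."""
    target = canonical_form(tree_edges, leaf_set(tree_edges))
    return any(form == target for form in displayed_forms(network_edges))

# Hypothetical network: leaves a, b, c; node h is a reticulation (parents u, v).
NETWORK = [("r", "u", 1), ("r", "v", 1), ("u", "a", 1), ("u", "h", 1),
           ("v", "h", 2), ("v", "c", 1), ("h", "b", 1)]
T1 = [("x", "y", 1), ("y", "a", 1), ("y", "b", 2), ("x", "c", 2)]
T2 = [("x", "a", 2), ("x", "y", 1), ("y", "b", 3), ("y", "c", 1)]
T_BAD = [("x", "y", 1), ("y", "a", 1), ("y", "b", 3), ("x", "c", 2)]

print(contains_with_lengths(NETWORK, T1))
print(contains_with_lengths(NETWORK, T2))
print(contains_with_lengths(NETWORK, T_BAD))
```

In this toy network the single reticulation gives two switchings, one per tree topology. T1 and T2 match them exactly, while T_BAD has the topology of T1 but a different length on the branch to b, so the first two queries return True and the third returns False. This illustrates how length information can rule out a tree whose topology alone would be accepted.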
Acknowledgments
This work was partially funded by the CNRS “Projet international de coopération scientifique (PICS)” grant number 230310 (CoCoAlSeq). L. van Iersel was partly funded by the 4TU Applied Mathematics Institute and The Netherlands Organisation for Scientific Research (NWO). F. Pardi is a member of the VIROGENESIS project, which receives funding from the EU’s Horizon 2020 research and innovation programme under grant agreement No 634650.
Cite this article
Gambette, P., van Iersel, L., Kelk, S. et al. Do Branch Lengths Help to Locate a Tree in a Phylogenetic Network?. Bull Math Biol 78, 1773–1795 (2016). https://doi.org/10.1007/s11538-016-0199-4
DOI: https://doi.org/10.1007/s11538-016-0199-4