A New Phylogenetic Inference Based on Genetic Attribute Reduction for Morphological Data
Figure 1. An example of phylogenetic inference. Photographs (A–F) show examples of Cambrian Chengjiang Lagerstätte fossils. (G) is a morphological attribute matrix, where the rows represent species and the columns represent attributes. In the column labels of the matrix, the first row represents the attribute number and the second row corresponds to the attribute name. (H) is a phylogenetic tree for selected lobopodians and arthropods from the early Cambrian era [1].
Figure 2. The framework of phylogenetic inference based on the Concept Decision Tree.
Figure 3. The strategy of species grafting in a single decision node.
Figure 4. An example of the species grafting algorithm. The red dot indicates the final graft position of the species *G*.
Figure 5. An example of handling polymorphic trees.
Figure 6. Accuracies of phylogenetic analysis for different proportions of missing data.
Figure 7. The tree length for different proportions of missing data and for different methods.
Figure 8. A paleontological phylogenetic tree. The red solid dot is the node position and the position of the red square is the grafting position of the species.
Abstract
1. Introduction
2. Framework of the CDT Algorithm
3. Construction of Multiple Concept-Sample Templates
3.1. The Design of the Genetic Algorithm for Attribute Reduction
3.1.1. Encoding Method
3.1.2. Fitness Function
3.1.3. Selection Operator
3.1.4. Crossover Operator
3.1.5. Mutation Operator
3.1.6. Modification Operator
3.2. Algorithm Description
4. Species Grafting Algorithm (SGA)
4.1. Description of SGA
4.2. Detailed Example of SGA
5. Experimental Results
5.1. CDT Accuracy Analysis
5.2. CDT Reliability Analysis
5.3. Phylogenetic Inference on Cambrian Lobopodians
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
1. Liu, Y.-Y.; Slotine, J.-J.; Barabási, A.-L. Liu et al. reply. Nature 2011, 478, E4–E5.
2. Wiens, J.J. Does adding characters with missing data increase or decrease phylogenetic accuracy? Syst. Biol. 1998, 47, 625–640.
3. Wiens, J.J. Incomplete taxa, incomplete characters, and phylogenetic accuracy: Is there a missing data problem? J. Vertebr. Paleontol. 2003, 23, 297–310.
4. Livezey, B.C. Phylogenetic relationships and incipient flightlessness of the extinct Auckland Islands Merganser. Wilson Bull. 1989, 101, 410–435.
5. Hufford, L.; Dickison, W.C. A phylogenetic analysis of Cunoniaceae. Syst. Bot. 1992, 17, 181–200.
6. Smith, A.B.; Paterson, G.L.; Lafay, B. Ophiuroid phylogeny and higher taxonomy: Morphological, molecular and palaeontological perspectives. Zool. J. Linn. Soc. 1995, 114, 213–243.
7. Hillis, D.M.; Huelsenbeck, J.P.; Cunningham, C.W. Application and accuracy of molecular phylogenies. Science 1994, 264, 671–677.
8. Kearney, M.; Clark, J.M. Problems due to missing data in phylogenetic analyses including fossils: A critical review. J. Vertebr. Paleontol. 2003, 23, 263–274.
9. Wiens, J.J. Missing data, incomplete taxa, and phylogenetic accuracy. Syst. Biol. 2003, 52, 528–538.
10. Farris, J. Hennig86, Version 1.5; Distributed by the author; Port Jefferson Station: New York, NY, USA, 1988.
11. Swofford, D. PAUP*: Phylogenetic Analysis Using Parsimony and Other Methods (Software); Sinauer Associates: Sunderland, MA, USA, 2000.
12. Mallat, S.G. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 1989, 7, 674–693.
13. Guido, R.C.; Addison, P.S.; Walker, J. Introducing wavelets and time-frequency analysis. IEEE Eng. Med. Biol. Mag. 2009, 28, 13.
14. Daubechies, I. Ten Lectures on Wavelets; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1992.
15. Newland, D.E. Harmonic wavelet analysis. Proc. R. Soc. Lond. Ser. A Math. Phys. Sci. 1993, 443, 203–225.
16. Guariglia, E.; Silvestrov, S. Fractional-Wavelet Analysis of Positive definite Distributions and Wavelets on D′(C); Springer: Berlin, Germany, 2016.
17. Guariglia, E. Spectral analysis of the Weierstrass-Mandelbrot function. In Proceedings of the 2nd International Multidisciplinary Conference on Computer and Energy Science (SpliTech), Split, Croatia, 12–14 July 2017.
18. Fitch, W.M. Toward defining the course of evolution: Minimum change for a specific tree topology. Syst. Biol. 1971, 20, 406–416.
19. Felsenstein, J. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 1981, 17, 368–376.
20. Wiens, J.J. Missing data and the design of phylogenetic analyses. J. Biomed. Inf. 2006, 39, 34–42.
21. Guillerme, T.; Cooper, N. Effects of missing data on topological inference using a total evidence approach. Mol. Phylogenet. Evol. 2016, 94, 146–158.
22. Zuckerkandl, E.; Pauling, L. Molecules as documents of evolutionary history. J. Theor. Biol. 1965, 8, 357–366.
23. Foulds, L.R.; Graham, R.L. The Steiner problem in phylogeny is NP-complete. Adv. Appl. Math. 1982, 3, 43–49.
24. Blum, A.L.; Langley, P. Selection of relevant features and examples in machine learning. Artif. Intell. 1997, 97, 245–271.
25. Pawlak, Z. Rough sets. Int. J. Comput. Inf. Sci. 1982, 11, 341–356.
26. Ma, X.; Wang, G.; Yu, H. Heuristic method to attribute reduction for decision region distribution preservation. J. Softw. 2014, 8, 1761–1780.
27. Huelsenbeck, J.P.; Ronquist, F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 2001, 17, 754–755.
28. Goloboff, P.A.; Farris, J.S.; Nixon, K.C. TNT, a free program for phylogenetic analysis. Cladistics 2008, 24, 774–786.
29. Yang, Z.; Rannala, B. Bayesian phylogenetic inference using DNA sequences: A Markov Chain Monte Carlo method. Mol. Biol. Evol. 1997, 14, 717–724.
30. Tsujimura, Y.; Gen, M. Entropy-based genetic algorithm for solving TSP. In Proceedings of the Second International Conference. Knowledge-Based Intelligent Electronic Systems, Adelaide, SA, Australia, 21–23 April 1998; Volume 2, pp. 285–290.
31. Zhengjiang, W.; Jingmin, Z.; Yan, G. An attribute reduction algorithm based on genetic algorithm and discernibility matrix. J. Softw. 2012, 7, 2640–2648.
32. Arellano-Valle, R.B.; Contreras-Reyes, J.E.; Genton, M.G. Shannon entropy and mutual information for multivariate skew-elliptical distributions. Scand. J. Stat. 2013, 40, 42–62.
33. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley and Sons: Hoboken, NJ, USA, 2012.
34. Lipscomb, D. Basics of Cladistic Analysis; George Washington University: Washington, DC, USA, 1998.
35. Bouamer, S.; Morand, S. Phylogeny of Palaearctic Pharyngodonidae parasite species of Testudinidae: A morphological approach. Can. J. Zool. 2003, 81, 1885–1893.
36. Tang, L.D.; Yuan, M.M.; Yan, L.I.; Wang, X. Phylogenetic analysis of Hibiscus based on morphological characters. J. Henan Agric. Sci. 2014, 43, 105–111.
37. Lin, X.L.; Chen, Y.; Huang, M.; Yang, X.K. A new species of the genus Meligethes Stephens (Coleoptera: Nitidulidae: Meligethinae) from China. Zool. Syst. 2015, 40, 268–289.
38. Goloboff, P.A. A Revision of the South American Spiders of the Family Nemesiidae (Araneae, Mygalomorphae). Part 1, Species from Peru, Chile, Argentina, and Uruguay. Bulletin of the AMNH; no. 224; American Museum of Natural History: New York, NY, USA, 1995.
39. Reeder, T.W.; Wiens, J.J. Evolution of the lizard family Phrynosomatidae as inferred from diverse types of data. Herpetol. Monogr. 1996, 10, 43–84.
40. Liebherr, J.K.; Zimmerman, E.C. Cladistic analysis, phylogeny and biogeography of the Hawaiian Platynini (Coleoptera: Carabidae). Syst. Entomol. 1998, 23, 137–172.
41. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; CRC Press: Boca Raton, FL, USA, 1994.
42. Davison, A.C.; Hinkley, D.V. Bootstrap Methods and Their Application; Cambridge University Press: Cambridge, UK, 1997; Volume 1.
43. Huang, D.W. An Introduction to Cladistics; China Agriculture Press: Beijing, China, 1996.
No. Attributes | No. Values | Possible Value |
---|---|---|
1 | 2 | |
2 | | |
3 | | |
N | | |
Site | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Code | 1 | 3 | 0 | 4 | 5 | 7 | 10 | 9 | 9 | 8 |
Datasets | No. Species | No. Attributes | Reference |
---|---|---|---|
Pharyngodonidae | 25 | 30 | Bouamer and Morand (2003) [35] |
Hibiscus | 40 | 38 | Tang et al. (2014) [36] |
Meligethes | 42 | 60 | Lin et al. (2015) [37] |
Nemesiid spiders | 77 | 60 | Goloboff (1995) [38] |
Phrynosomatid lizards | 115 | 59 | Reeder and Wiens (1996) [39] |
Liebherr | 160 | 136 | Hawaiian Platynini (Carabidae), Liebherr and Zimmerman (1998) [40] |
Method | Pharyngodonidae | Hibiscus | Meligethes | Nemesiid Spiders | Phrynosomatid Lizards | Liebherr | Avg. |
---|---|---|---|---|---|---|---|
BI | 0.8919 | 0.8714 | 0.7828 | 0.8672 | 0.8567 | 0.8355 | 0.851 |
ML | 0.8400 | 0.8250 | 0.7461 | 0.8659 | 0.8501 | 0.8428 | 0.828 |
MP | 0.8990 | 0.8778 | 0.7905 | 0.8730 | 0.8618 | 0.8283 | 0.855 |
CDT | 0.8983 | 0.8811 | 0.7930 | 0.8811 | 0.8732 | 0.8613 | 0.865 |
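
As a quick consistency check on the accuracy table above, the following Python sketch recomputes each method's average over the six datasets and finds the most accurate method per dataset. The values are copied from the table; the names `accuracy`, `reported_avg`, `mean`, and `best_per_dataset` are illustrative assumptions and do not come from the paper.

```python
# Accuracy values copied from the table above.
# Column order: the six datasets listed in DATASETS.
DATASETS = [
    "Pharyngodonidae", "Hibiscus", "Meligethes",
    "Nemesiid Spiders", "Phrynosomatid Lizards", "Liebherr",
]

accuracy = {
    "BI":  [0.8919, 0.8714, 0.7828, 0.8672, 0.8567, 0.8355],
    "ML":  [0.8400, 0.8250, 0.7461, 0.8659, 0.8501, 0.8428],
    "MP":  [0.8990, 0.8778, 0.7905, 0.8730, 0.8618, 0.8283],
    "CDT": [0.8983, 0.8811, 0.7930, 0.8811, 0.8732, 0.8613],
}

# "Avg." column as printed in the table.
reported_avg = {"BI": 0.851, "ML": 0.828, "MP": 0.855, "CDT": 0.865}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Recompute the averages to three decimals, matching the table's precision.
computed_avg = {method: round(mean(vals), 3) for method, vals in accuracy.items()}

# For each dataset, pick the method with the highest accuracy.
best_per_dataset = {
    name: max(accuracy, key=lambda method: accuracy[method][i])
    for i, name in enumerate(DATASETS)
}

if __name__ == "__main__":
    for method in accuracy:
        print(method, computed_avg[method], reported_avg[method])
    for name, method in best_per_dataset.items():
        print(name, "->", method)
```

Under these values the recomputed averages agree with the printed "Avg." column, and CDT has the highest accuracy on every dataset except Pharyngodonidae, where MP is marginally ahead (0.8990 versus 0.8983).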
Comparison | Pharyngodonidae | Hibiscus | Meligethes | Nemesiid Spiders | Phrynosomatid Lizards | Liebherr | Avg. |
---|---|---|---|---|---|---|---|
CDT vs. BI | 0.0282 | 0.0800 | 0.4278 | 0.0282 | 0.0800 | 0.0500 | 0.1157 |
CDT vs. ML | 0.0153 | 0.0957 | 0.0488 | 0.0153 | 0.0957 | 0.0180 | 0.0481 |
CDT vs. MP | 0.1250 | 0.1128 | 0.0282 | 0.1250 | 0.1128 | 0.0821 | 0.0977 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Feng, J.; Liu, Z.; Feng, H.; Sutcliffe, R.F.E.; Liu, J.; Han, J. A New Phylogenetic Inference Based on Genetic Attribute Reduction for Morphological Data. Entropy 2019, 21, 313. https://doi.org/10.3390/e21030313