Abstract
Based on the concept of the dinucleotide, this paper proposes a dinucleotide-based graphical representation of sequences. The representation maps each sequence one-to-one and can easily be converted back into the original sequence. A cross-correlation function is then extracted from the dinucleotide representation to characterize the degree of similarity between sequences. Applying the approach to nine kinds of viruses yields conclusions that are largely consistent with previously reported data. An inter-class analysis further shows that the resulting data classify the different viruses well. Compared with previous methods, our approach extracts data more easily and distinguishes the different classes more clearly.
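As a rough illustration of the idea summarized above, the following Python sketch encodes a sequence as a curve of overlapping dinucleotides, recovers the sequence from that curve, and compares two curves with a normalized cross-correlation. The integer mapping `DINUC_TO_Y`, the function names, and the particular normalized form of the cross-correlation are assumptions made for illustration only; the abstract does not specify the construction actually used in the paper.

```python
from itertools import product
import math

BASES = "ACGU"

# Hypothetical mapping (not taken from the paper): each of the 16
# dinucleotides is assigned an integer height from 1 to 16.
DINUC_TO_Y = {a + b: i + 1 for i, (a, b) in enumerate(product(BASES, repeat=2))}
Y_TO_DINUC = {y: d for d, y in DINUC_TO_Y.items()}


def encode(seq):
    """Map the overlapping dinucleotides of seq (length >= 2) to a numeric curve."""
    return [DINUC_TO_Y[seq[i:i + 2]] for i in range(len(seq) - 1)]


def decode(curve):
    """Recover the original sequence from a curve produced by encode()."""
    if not curve:
        return ""
    # The first dinucleotide gives two bases; each later one adds its second base.
    return Y_TO_DINUC[curve[0]] + "".join(Y_TO_DINUC[y][1] for y in curve[1:])


def cross_correlation(x, y, lag=0):
    """Normalized cross-correlation of two mean-centred curves at a given lag."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    # Sum products over the positions where the shifted curves overlap.
    num = sum(xc[i] * yc[i + lag] for i in range(len(xc)) if 0 <= i + lag < len(yc))
    den = math.sqrt(sum(v * v for v in xc) * sum(v * v for v in yc))
    return num / den if den else 0.0


curve_a = encode("GGAUCCA")
curve_b = encode("GGAUCGA")
print(decode(curve_a))                      # round trip back to the sequence
print(cross_correlation(curve_a, curve_b))  # value in [-1, 1]
```

In this sketch a value near 1 at lag 0 indicates that two curves rise and fall together, which is one simple way to read "degree of similarity"; the paper's own similarity measure may be defined differently.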
Acknowledgement
This work is supported by the National Natural Science Foundation of China (Nos. 61425002, 61751203, 61772100, 61702070, 61672121, 61572093), the Program for Changjiang Scholars and Innovative Research Team in University (No. IRT_15R07), and the Program for Liaoning Innovative Research Team in University (No. LT2015002).
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
Cite this paper
Xing, S., Wang, B., Wei, X., Zhou, C., Zhang, Q., Zheng, Z. (2018). RNA Sequences Similarities Analysis by Cross-Correlation Function. In: Qiao, J., et al. Bio-inspired Computing: Theories and Applications. BIC-TA 2018. Communications in Computer and Information Science, vol 952. Springer, Singapore. https://doi.org/10.1007/978-981-13-2829-9_9
DOI: https://doi.org/10.1007/978-981-13-2829-9_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2828-2
Online ISBN: 978-981-13-2829-9
eBook Packages: Computer Science, Computer Science (R0)