Abstract
Cancer data analysis is important for detecting the codes responsible for cancer. It is therefore important to identify the coding regions in disease-infected biological data. Such data can help in designing appropriate drugs and can support laboratory assessments. Codes carry specific meaning about various features and symptoms of diseases. Coding of biological data is a key area for obtaining exact information on animals in order to discover the desired medicine. In the current work, four different machine learning approaches, namely support vector machine (SVM), principal component analysis (PCA), neural mapping skyline filtering (NMSF) and Fisher’s discriminant analysis (FDA), were applied for data reduction and coding area selection. The experimental analysis established that SVM outperforms PCA and FDA. However, owing to its mapping facility, NMSF outperforms SVM and thus achieved the best results among the four techniques. The four methods used to determine the coding area were evaluated in terms of Matthews’s correlation coefficient, accuracy, specificity, sensitivity, F-measure and error rate. The detailed experimental analysis includes a comparison study among the four classifiers on the deoxyribonucleic acid dataset.
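The abstract names its evaluation measures without giving formulas. As a minimal sketch, assuming the standard binary-classification definitions (coding versus non-coding), they can be computed from confusion-matrix counts as shown here. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the article.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard evaluation measures from binary confusion-matrix counts.

    tp/fp/tn/fn are true/false positive/negative counts. These are the
    textbook definitions, assumed here; the article may define them differently.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall, true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    pr_sum = precision + sensitivity
    f_measure = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient; by convention 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced.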
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Kamal, S., Dey, N., Nimmy, S.F. et al. Evolutionary framework for coding area selection from cancer data. Neural Comput & Applic 29, 1015–1037 (2018). https://doi.org/10.1007/s00521-016-2513-3
DOI: https://doi.org/10.1007/s00521-016-2513-3