Abstract
Multiomics integrates data from different molecular levels of the genome to study how interactions between omics molecules affect disease processes. Integrated analysis of different omics data allows a more comprehensive evaluation of their role in human health and complex diseases. Previous studies have used similarity network fusion (SNF) and SNF-CC for multiomics integration. Although these integrative algorithms markedly improve performance, they do not consider biologically meaningful correlations within and between omics. A large body of evidence shows that cancer arises from the interactions and synergistic effects of multiple genes. Correlations between genes can be captured through gene pathway and motif information. In this paper, we define IPMM (Integration Pathway and Motif information Model), which combines pathway and motif information with multiomics data to study their effect on cancer subtype classification. To make use of gene association information, we apply the Isomap method to reduce the dimensionality of the expression data of the genes in each pathway and motif. The K value in Isomap is selected so that the one-dimensional reduced representation best preserves the relationships among the genes in the pathway and motif data. SNF and SNF-CC are then used for integrative analysis of gene-expression data, methylation data, miRNA data, and the dimensionality-reduced pathway and motif data in two cancer datasets. Results show that clustering performance improves to varying degrees across methods after pathway and motif information is integrated.
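The Isomap step described in the abstract can be illustrated with a minimal sketch. It reduces the expression matrix of one pathway (rows are samples, columns are the pathway's genes) to a single score per sample and picks K from a small candidate list. The abstract does not state the selection criterion, so the residual variance between geodesic and embedded distances is used here purely as an assumption; the function names (`geodesic_distances`, `isomap_1d`, `pathway_score`) and the synthetic data are hypothetical and not taken from the paper.

```python
import numpy as np


def geodesic_distances(X, k):
    """Shortest-path distances over a symmetric k-nearest-neighbour graph."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(d, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    for i in range(n):
        graph[i, nearest[i]] = d[i, nearest[i]]
        graph[nearest[i], i] = d[i, nearest[i]]
    # Floyd-Warshall: relax every pair of paths through intermediate node m.
    for m in range(n):
        graph = np.minimum(graph, graph[:, m:m + 1] + graph[m:m + 1, :])
    return graph


def isomap_1d(X, k):
    """Return (one score per sample, residual variance) for a given k."""
    g = geodesic_distances(X, k)
    if not np.all(np.isfinite(g)):
        return None, np.inf  # neighbourhood graph is disconnected for this k
    n = g.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (g ** 2) @ centering  # classical MDS
    vals, vecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    score = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))
    embedded = np.abs(score[:, None] - score[None, :])
    upper = np.triu_indices(n, 1)
    r = np.corrcoef(g[upper], embedded[upper])[0, 1]
    return score, 1.0 - r ** 2


def pathway_score(X, k_values):
    """Assumed criterion: keep the k with the lowest residual variance."""
    best = None
    for k in k_values:
        score, resid = isomap_1d(X, k)
        if score is not None and (best is None or resid < best[2]):
            best = (k, score, resid)
    if best is None:
        raise ValueError("no candidate k gave a connected neighbourhood graph")
    return best


# Synthetic stand-in for one pathway: 60 samples, 3 genes on a noisy curve.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 60)
X = np.column_stack([np.cos(t), np.sin(t), t]) + rng.normal(0.0, 0.01, (60, 3))
k, score, resid = pathway_score(X, range(3, 11))
print(f"selected k={k}, residual variance={resid:.4f}")
```

Under these assumptions, each pathway or motif would contribute one such score vector, and the stacked vectors would form the additional data type passed to SNF or SNF-CC alongside the expression, methylation, and miRNA matrices.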
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Guo, X., Lu, Y., Yin, Z., Shang, X. (2020). IPMM: Cancer Subtype Clustering Model Based on Multiomics Data and Pathway and Motif Information. In: Yang, X., Wang, CD., Islam, M.S., Zhang, Z. (eds) Advanced Data Mining and Applications. ADMA 2020. Lecture Notes in Computer Science, vol 12447. Springer, Cham. https://doi.org/10.1007/978-3-030-65390-3_42
DOI: https://doi.org/10.1007/978-3-030-65390-3_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65389-7
Online ISBN: 978-3-030-65390-3
eBook Packages: Computer Science, Computer Science (R0)