Abstract
Understanding the underlying structure of clinical data is of growing importance. Topological data analysis (TDA) enables data scientists to uncover the “shape” of data by extracting its underlying topological structure, within which distinct regions can be identified. For example, certain regions may be associated with early-stage disease, whilst others may represent different advanced disease sub-types. Identifying these regions can help clinicians to better understand a specific patient’s symptoms based on where that patient lies in the disease topology, and therefore to make more targeted interventions. However, these topologies do not capture any sequential or temporal information. Pseudo-time series analysis can generate realistic trajectories through non-time-series data by combining graph theory with expert knowledge (e.g. disease staging information). In this paper, we explore the combination of pseudo-time series analysis and topological data analysis to build realistic trajectories over disease topologies. Using three datasets (simulated, diabetes and genomic data), we explore how the combined method can highlight distinct temporal phenotypes in each disease, based on the possible trajectories through the disease process.
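To make the idea of a pseudo-time trajectory concrete, the following is a minimal sketch and not the TDA-PTS method itself. It assumes cross-sectional samples are linked into a k-nearest-neighbour graph and that a trajectory is the shortest path (here via Dijkstra's algorithm) from a sample labelled as early-stage to one labelled as advanced. The toy data, the stage labels and the function names `knn_graph` and `shortest_path` are all hypothetical and chosen only for illustration.

```python
import heapq
import math


def knn_graph(points, k):
    """Link each point to its k nearest neighbours (undirected, Euclidean weights)."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def shortest_path(graph, start, end):
    """Dijkstra's algorithm; returns the node sequence from start to end, or None."""
    best = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == end:
            # Walk the predecessor links back to the start node.
            path = [end]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt, weight in graph[node].items():
            cand = dist + weight
            if cand < best.get(nxt, math.inf):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(queue, (cand, nxt))
    return None


# Hypothetical cross-sectional samples (two features each) with expert stage labels.
samples = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.1), (3.0, 0.3), (4.0, 0.0)]
stages = ["early", "early", "mid", "advanced", "advanced"]

graph = knn_graph(samples, k=2)
start = stages.index("early")                              # first early-stage sample
end = len(stages) - 1 - stages[::-1].index("advanced")     # last advanced sample
trajectory = shortest_path(graph, start, end)
print(trajectory)
```

The stage labels play the role of the expert knowledge: they fix where a trajectory may begin and end, while the graph constrains which samples can follow one another. In practice the choice of graph construction and path criterion would need to follow the method described in the paper.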
© 2020 Springer Nature Switzerland AG
Cite this paper
Sajjadi, S.E., Draghi, B., Sacchi, L., Dagliani, A., Holmes, J., Tucker, A. (2020). Building Trajectories Over Topology with TDA-PTS: An Application in Modelling Temporal Phenotypes of Disease. In: Koprinska, I., et al. ECML PKDD 2020 Workshops. ECML PKDD 2020. Communications in Computer and Information Science, vol 1323. Springer, Cham. https://doi.org/10.1007/978-3-030-65965-3_4
DOI: https://doi.org/10.1007/978-3-030-65965-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65964-6
Online ISBN: 978-3-030-65965-3
eBook Packages: Computer Science, Computer Science (R0)