DOI: 10.1145/3543873.3587562
short-paper
Open access

TAPP: Defining standard provenance information for clinical research data and workflows - Obstacles and opportunities

Published: 30 April 2023

Abstract

Data provenance has recently attracted attention across disciplines, as enriching data with provenance information has been shown to improve credibility and to make data more FAIR, which fosters data reuse. The biomedical domain has also recognised the potential of provenance capture. However, several obstacles, such as data heterogeneity, complexity, and sensitivity, prevent efficient, automated, and machine-interpretable enrichment of biomedical data with provenance information. Here, we explain how clinical data in Germany are transferred from hospital information systems into a data integration centre to enable secondary use of patient data, and how they can be reused as research data. Considering the complex data infrastructures in hospitals, we identify obstacles and opportunities that arise when collecting provenance information along heterogeneous data processing pipelines. To express provenance data, we point to the Fast Healthcare Interoperability Resources (FHIR) Provenance resource for healthcare data. In addition, we consider existing approaches from other research fields and standards communities. As a step towards high-quality, standardised clinical research data, we propose developing a 'MInimal Requirements for Automated Provenance Information Enrichment' (MIRAPIE) guideline. As a community project, MIRAPIE should generalise provenance information concepts so that they are applicable worldwide, possibly beyond the health care sector.

Cited By

  • (2023) Eleven strategies for making reproducible research and open science training the norm at research institutions. eLife 12. DOI: 10.7554/eLife.89736. Online publication date: 23-Nov-2023
  • (2023) Traceable Research Data Sharing in a German Medical Data Integration Center With FAIR (Findability, Accessibility, Interoperability, and Reusability)-Geared Provenance Implementation: Proof-of-Concept Study. JMIR Formative Research 7 (e50027). DOI: 10.2196/50027. Online publication date: 7-Dec-2023
  • (2023) The Status of Data Management Practices Across German Medical Data Integration Centers: Mixed Methods Study. Journal of Medical Internet Research 25 (e48809). DOI: 10.2196/48809. Online publication date: 8-Nov-2023

      Published In

      WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023
      April 2023
      1567 pages
      ISBN: 9781450394192
      DOI: 10.1145/3543873
      This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Data Integration Center
      2. Hospital Information System
      3. biomedical data
      4. provenance capture

      Qualifiers

      • Short-paper
      • Research
      • Refereed limited

      Conference

      WWW '23: The ACM Web Conference 2023
      April 30 - May 4, 2023
      Austin, TX, USA

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Article Metrics

      • Downloads (last 12 months): 185
      • Downloads (last 6 weeks): 15
      Reflects downloads up to 03 Jan 2025
