Computer Science > Computer Vision and Pattern Recognition

arXiv:2208.02529 (cs)

[Submitted on 4 Aug 2022 (v1), last revised 26 Jul 2024 (this version, v3)]

Title: Metadata-enhanced contrastive learning from retinal optical coherence tomography images

Authors: Robbie Holland, Oliver Leingang, Hrvoje Bogunović, Sophie Riedl, Lars Fritsche, Toby Prevost, Hendrik P. N. Scholl, Ursula Schmidt-Erfurth, Sobha Sivaprasad, Andrew J. Lotery, Daniel Rueckert, Martin J. Menten

Abstract: Deep learning has the potential to automate screening, monitoring and grading of disease in medical images. Pretraining with contrastive learning enables models to extract robust and generalisable features from natural image datasets, facilitating label-efficient downstream image analysis. However, the direct application of conventional contrastive methods to medical datasets introduces two domain-specific issues. Firstly, several image transformations which have been shown to be crucial for effective contrastive learning do not translate from the natural image to the medical image domain. Secondly, the assumption made by conventional methods, that any two images are dissimilar, is systematically misleading in medical datasets depicting the same anatomy and disease. This is exacerbated in longitudinal image datasets that repeatedly image the same patient cohort to monitor their disease progression over time. In this paper, we tackle these issues by extending conventional contrastive frameworks with a novel metadata-enhanced strategy. Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships. To this end, we employ records for patient identity, eye position (i.e. left or right) and time series information. In experiments using two large longitudinal datasets containing 170,427 retinal OCT images of 7,912 patients with age-related macular degeneration (AMD), we evaluate the utility of using metadata to incorporate the temporal dynamics of disease progression into pretraining. Our metadata-enhanced approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six image-level downstream tasks related to AMD. Due to its modularity, our method can be quickly and cost-effectively tested to establish the potential benefits of including available metadata in contrastive pretraining.
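
The abstract describes using patient identity, eye position and time series records to approximate which image pairs should be treated as related during contrastive pretraining. The following is only a minimal sketch of one way such a rule could be expressed, not the authors' method: the function name `metadata_positive_mask`, the rule "same patient, same eye, scans close in time" and the 90-day default threshold are all assumptions made for illustration.

```python
import numpy as np


def metadata_positive_mask(patient_ids, eyes, visit_days, max_gap_days=90):
    """Return a boolean matrix whose entry (i, j) is True when images i and j
    would be treated as a positive pair under this illustrative rule.

    Hypothetical rule (not taken from the paper): two images are positives
    if they share a patient and an eye and were acquired within
    `max_gap_days` of each other.
    """
    patient_ids = np.asarray(patient_ids)
    eyes = np.asarray(eyes)
    visit_days = np.asarray(visit_days, dtype=float)

    # Pairwise comparisons via broadcasting: each result has shape (n, n).
    same_patient = patient_ids[:, None] == patient_ids[None, :]
    same_eye = eyes[:, None] == eyes[None, :]
    close_in_time = np.abs(visit_days[:, None] - visit_days[None, :]) <= max_gap_days

    mask = same_patient & same_eye & close_in_time
    np.fill_diagonal(mask, False)  # an image is not paired with itself
    return mask


# Toy batch: three scans of patient "a" (two left eye, one right) and one of "b".
example_mask = metadata_positive_mask(
    patient_ids=["a", "a", "a", "b"],
    eyes=["L", "L", "R", "L"],
    visit_days=[0, 30, 10, 5],
)
```

In this toy batch only the two left-eye scans of patient "a" are marked as positives of each other. A mask of this kind could then replace the usual assumption that every other image in the batch is a negative.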

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2208.02529 [cs.CV]
	(or arXiv:2208.02529v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2208.02529

Submission history

From: Robbie Holland
[v1] Thu, 4 Aug 2022 08:53:15 UTC (42,079 KB)
[v2] Thu, 14 Dec 2023 09:54:38 UTC (23,240 KB)
[v3] Fri, 26 Jul 2024 15:07:08 UTC (19,073 KB)
