Colorectal cancer is among the most prevalent causes of cancer-related mortality worldwide. Detection and removal of polyps at an early stage can reduce mortality and prevent the cancer from spreading to adjacent organs. Early polyp detection could save the lives of millions of patients worldwide and reduce the clinical burden. However, the polyp detection rate varies significantly among endoscopists. Numerous deep learning-based methods have been proposed; however, most studies focus on improving accuracy alone. Here, we propose a novel architecture, Residual Upsampling Network (RUPNet), for colon polyp segmentation that can operate in real time while showing high recall and precision. The proposed architecture, RUPNet, is an encoder-decoder network that consists of three encoder blocks, three decoder blocks, and additional upsampling blocks at the end of the network. With an image size of 512 × 512, the proposed method achieves an excellent real-time operation speed of 152.60 frames per second with an average dice coefficient of 0.7658, mean intersection over union of 0.6553, sensitivity of 0.8049, precision of 0.7995, and F2-score of 0.9361. The results suggest that RUPNet can give real-time feedback while retaining high accuracy, indicating a good benchmark for early polyp detection.
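The abstract describes RUPNet only at a high level (three encoder blocks, three decoder blocks, extra upsampling at the end), so the following PyTorch sketch is an assumption-laden stand-in: the block internals, channel widths (32/64/128), skip connections, and sigmoid output are illustrative choices, not the published design.

```python
# Minimal sketch of a RUPNet-style residual encoder-decoder for 512x512 inputs.
# All widths and block internals are illustrative assumptions.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch))
        self.skip = nn.Conv2d(in_ch, out_ch, 1)
    def forward(self, x):
        return torch.relu(self.conv(x) + self.skip(x))

class RUPNetSketch(nn.Module):
    def __init__(self, channels=(32, 64, 128)):
        super().__init__()
        c1, c2, c3 = channels
        self.enc1, self.enc2, self.enc3 = ResidualBlock(3, c1), ResidualBlock(c1, c2), ResidualBlock(c2, c3)
        self.dec3, self.dec2, self.dec1 = ResidualBlock(c3, c2), ResidualBlock(c2 + c2, c1), ResidualBlock(c1 + c1, c1)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        # extra upsampling block at the end restores the full input resolution
        self.final_up = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False), ResidualBlock(c1, c1))
        self.head = nn.Conv2d(c1, 1, 1)

    def forward(self, x):
        e1 = self.enc1(self.pool(x))      # 256x256
        e2 = self.enc2(self.pool(e1))     # 128x128
        e3 = self.enc3(self.pool(e2))     # 64x64
        d3 = self.up(self.dec3(e3))                           # 128x128
        d2 = self.up(self.dec2(torch.cat([d3, e2], dim=1)))   # 256x256
        d1 = self.dec1(torch.cat([d2, e1], dim=1))            # 256x256
        return torch.sigmoid(self.head(self.final_up(d1)))    # 512x512 mask

mask = RUPNetSketch()(torch.randn(1, 3, 512, 512))  # -> (1, 1, 512, 512)
```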
Purpose: We perform anatomical landmarking for craniomaxillofacial (CMF) bones without explicitly segmenting them. Toward this, we propose a simple, yet efficient, deep network architecture, called relational reasoning network (RRN), to accurately learn the local and the global relations among the landmarks in CMF bones; specifically, the mandible, maxilla, and nasal bones.
Approach: The proposed RRN works in an end-to-end manner, utilizing learned relations of the landmarks based on dense-block units. Given a few landmarks as input, RRN treats the landmarking process as a data imputation problem where the landmarks to be predicted are considered missing.
Results: We applied RRN to cone-beam computed tomography scans obtained from 250 patients. With a fourfold cross-validation technique, we obtained an average root mean squared error of <2 mm per landmark. Our proposed RRN has revealed unique relationships among the landmarks that help us infer the informativeness of the landmark points. The proposed system identifies missing landmark locations accurately even when severe pathology or deformations are present in the bones.
Conclusions: Accurately identifying anatomical landmarks is a crucial step in deformation analysis and surgical planning for CMF surgeries. Achieving this goal without the need for explicit bone segmentation addresses a major limitation of segmentation-based approaches, where segmentation failure (as is often the case in bones with severe pathology or deformation) could easily lead to incorrect landmarking. To the best of our knowledge, this is the first-of-its-kind algorithm for finding anatomical relations of objects using deep learning.
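The landmarking-as-imputation idea can be illustrated with a small PyTorch sketch: a densely connected network maps the coordinates of a few given landmarks to those treated as missing. The 5-given/4-missing split, layer width, and depth below are hypothetical; the actual RRN relational units are not reproduced.

```python
# Sketch of predicting "missing" landmarks from a few given ones (data imputation view).
# Dense connectivity loosely mirrors the dense-block units mentioned in the abstract.
import torch
import torch.nn as nn

class DenseImputationNet(nn.Module):
    def __init__(self, n_given=5, n_missing=4, width=128, depth=3):
        super().__init__()
        self.n_missing = n_missing
        self.layers = nn.ModuleList()
        in_dim = n_given * 3                     # (x, y, z) per given landmark
        for _ in range(depth):
            self.layers.append(nn.Sequential(nn.Linear(in_dim, width), nn.ReLU()))
            in_dim += width                      # dense connectivity: concat all previous features
        self.out = nn.Linear(in_dim, n_missing * 3)

    def forward(self, given_landmarks):          # (batch, n_given, 3)
        feats = given_landmarks.flatten(1)
        for layer in self.layers:
            feats = torch.cat([feats, layer(feats)], dim=1)
        return self.out(feats).view(-1, self.n_missing, 3)

pred = DenseImputationNet()(torch.randn(2, 5, 3))  # -> (2, 4, 3) predicted landmarks
```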
Auscultation is an established technique in clinical assessment of symptoms for respiratory disorders. Auscultation is safe and inexpensive, but requires expertise to diagnose a disease using a stethoscope during hospital or office visits. However, some clinical scenarios require continuous monitoring and automated analysis of respiratory sounds to pre-screen and monitor diseases, such as the rapidly spreading COVID-19. Recent studies suggest that audio recordings of bodily sounds captured by mobile devices might carry features helpful to distinguish patients with COVID-19 from healthy controls. Here, we propose a novel deep learning technique to automatically detect COVID-19 patients based on brief audio recordings of their cough and breathing sounds. The proposed technique first extracts spectrogram features of respiratory recordings, and then classifies disease state via a hierarchical vision transformer architecture. Demonstrations are provided on a crowdsourced database of respiratory sounds from COVID-19 patients and healthy controls. The proposed transformer model is compared against alternative methods based on state-of-the-art convolutional and transformer architectures, as well as traditional machine-learning classifiers. Our results indicate that the proposed model achieves on par or superior performance to competing methods. In particular, the proposed technique can distinguish COVID-19 patients from healthy subjects with over 94% AUC.
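A minimal sketch of the described pipeline (spectrogram features followed by a hierarchical vision transformer) is given below. The torchvision Swin-T backbone, mel-spectrogram parameters, and resizing step are stand-in assumptions rather than the paper's exact model and hyperparameters.

```python
# Hedged sketch: log-mel spectrogram of a cough/breath recording classified
# by a hierarchical vision transformer (Swin-T used as a stand-in backbone).
import torch
import torch.nn.functional as F
import torchaudio
from torchvision.models import swin_t

def spectrogram_features(waveform, sample_rate=16000):
    mel = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate, n_mels=64)(waveform)
    logmel = torchaudio.transforms.AmplitudeToDB()(mel)                 # (1, 64, time)
    img = F.interpolate(logmel.unsqueeze(0), size=(224, 224),
                        mode="bilinear", align_corners=False)           # (1, 1, 224, 224)
    return img.repeat(1, 3, 1, 1)                                       # replicate to 3 channels

model = swin_t(weights=None)
model.head = torch.nn.Linear(model.head.in_features, 2)                 # COVID-19 vs. healthy

waveform = torch.randn(1, 16000 * 3)                                    # placeholder 3-second recording
logits = model(spectrogram_features(waveform))                          # -> (1, 2)
```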
Organ at risk (OAR) segmentation is a crucial step for treatment planning and outcome determination in radiotherapy treatments of cancer patients. Several deep learning-based segmentation algorithms have been developed in recent years; however, U-Net remains the de facto algorithm designed specifically for biomedical image segmentation and has spawned many variants with known weaknesses. In this study, our goal is to present simple architectural changes in U-Net to improve its accuracy and generalization properties. Unlike many other available studies evaluating their algorithms on single-center data, we thoroughly evaluate several variations of U-Net as well as our proposed enhanced architecture on multiple data sets for an extensive and reliable study of the OAR segmentation problem. Our enhanced segmentation model includes changes to (a) the loss function, (b) the optimization framework, and (c) the convolution type. Testing on three publicly available multi-object segmentation data sets, we achieved an average dice score of 80% compared to the baseline U-Net performance of 63%.
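The abstract names the categories of change (loss, optimizer, convolution type) without specifying them, so the snippet below uses placeholder choices of each kind (a Dice + cross-entropy loss, AdamW, and depthwise-separable convolutions) purely to illustrate what such modifications look like; these are not the paper's recipe.

```python
# Illustrative substitutes for the three kinds of U-Net modifications mentioned.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiceCELoss(nn.Module):                       # example "loss function" change
    def __init__(self, eps=1e-6):
        super().__init__()
        self.ce, self.eps = nn.CrossEntropyLoss(), eps
    def forward(self, logits, target):             # logits: (B, C, H, W), target: (B, H, W)
        probs = torch.softmax(logits, dim=1)
        onehot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
        inter = (probs * onehot).sum(dim=(2, 3))
        dice = (2 * inter + self.eps) / (probs.sum(dim=(2, 3)) + onehot.sum(dim=(2, 3)) + self.eps)
        return self.ce(logits, target) + (1 - dice.mean())

class SeparableConv(nn.Module):                    # example "convolution type" change
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)
    def forward(self, x):
        return self.pointwise(self.depthwise(x))

logits = torch.randn(2, 4, 64, 64)                 # 4 organ-at-risk classes
target = torch.randint(0, 4, (2, 64, 64))
loss = DiceCELoss()(logits, target)
optimizer = torch.optim.AdamW(SeparableConv(3, 8).parameters(), lr=1e-4)   # example "optimizer" change
```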
Purpose: Deep learning has achieved major breakthroughs during the past decade in almost every field. There are plenty of publicly available algorithms, each designed to address a different computer vision task. However, most of these algorithms cannot be directly applied to images in the medical domain. Herein, we focus on the preprocessing steps that should be applied to medical images before they are fed to deep neural networks.
Approach: To be able to employ the publicly available algorithms for clinical purposes, we must construct a meaningful pixel/voxel representation of medical images that facilitates the learning process. Based on the ultimate goal expected from an algorithm (classification, detection, or segmentation), one may infer the required pre-processing steps that can ideally improve the performance of that algorithm. The required pre-processing steps for computed tomography (CT) and magnetic resonance (MR) images, in their correct order, are discussed in detail. We further support our discussion with relevant experiments that investigate the efficiency of the listed preprocessing steps.
Results: Our experiments confirmed that applying appropriate image pre-processing in the right order improves the performance of deep neural networks in terms of classification and segmentation.
Conclusions: This work investigates the appropriate pre-processing steps for CT and MR images of prostate cancer patients, supported by several experiments that can be useful for educating those new to the field (https://github.com/NIH-MIP/Radiology_Image_Preprocessing_for_Deep_Learning).
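As a companion to the discussion above, the sketch below shows simplified versions of commonly used CT and MR preprocessing steps (intensity windowing/normalization and resampling). The window levels, normalization scheme, and target spacing are illustrative; the paper's exact ordering and parameters are not reproduced here.

```python
# Simplified CT/MR preprocessing examples; parameters are illustrative only.
import numpy as np
from scipy.ndimage import zoom

def preprocess_ct(volume_hu, window=(-100, 300)):
    """Clip CT intensities (HU) to a soft-tissue window and scale to [0, 1]."""
    lo, hi = window
    clipped = np.clip(volume_hu, lo, hi)
    return (clipped - lo) / (hi - lo)

def preprocess_mr(volume):
    """Z-score normalize MR intensities; raw MR units are scanner-dependent."""
    foreground = volume[volume > volume.mean() * 0.1]
    return (volume - foreground.mean()) / (foreground.std() + 1e-8)

def resample(volume, spacing, target_spacing=(1.0, 1.0, 1.0)):
    """Resample a volume to (approximately) isotropic voxel spacing."""
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    return zoom(volume, factors, order=1)

ct = preprocess_ct(np.random.randint(-1000, 1000, (40, 128, 128)).astype(np.float32))
ct = resample(ct, spacing=(3.0, 0.8, 0.8))
```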
Recognition of anatomical structures is an important step in model-based medical image segmentation. It provides pose estimates of objects and information about roughly "where" the objects are in the image, and it helps distinguish them from other object-like entities. In previous work,1 we presented a general method of model-based multi-object recognition to assist in segmentation (delineation) tasks. It exploits the pose relationship that can be encoded, via the concept of ball scale (b-scale), between the binary training objects and their associated grey images. The goal was to place the model, in a single shot, close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. Unlike position and scale parameters, we observe that orientation parameters require more attention when estimating the pose of the model, as even small differences in orientation can lead to inappropriate recognition. Motivated by the non-Euclidean nature of the pose information, we propose in this paper the use of non-Euclidean metrics to estimate the orientation of anatomical structures for more accurate recognition and segmentation. We statistically analyze and evaluate the following metrics for orientation estimation: Euclidean, Log-Euclidean, Root-Euclidean, Procrustes Size-and-Shape, and mean Hermitian metrics. The results show that the mean Hermitian and Cholesky decomposition metrics provide more accurate orientation estimates than the other Euclidean and non-Euclidean metrics.
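For reference, the sketch below computes three of the matrix distances compared above on symmetric positive-definite (e.g., orientation/covariance) matrices; the Procrustes size-and-shape and mean Hermitian metrics are omitted for brevity, and the 3×3 test matrices are synthetic.

```python
# Euclidean, Log-Euclidean, and Root-Euclidean distances between SPD matrices.
import numpy as np
from scipy.linalg import logm, sqrtm

def euclidean(A, B):
    return np.linalg.norm(A - B, ord="fro")

def log_euclidean(A, B):
    return np.linalg.norm(logm(A) - logm(B), ord="fro")

def root_euclidean(A, B):
    return np.linalg.norm(sqrtm(A) - sqrtm(B), ord="fro")

rng = np.random.default_rng(0)
X, Y = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
A, B = X @ X.T + np.eye(3), Y @ Y.T + np.eye(3)    # two SPD matrices
print(euclidean(A, B), log_euclidean(A, B), root_euclidean(A, B))
```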
Since MR image intensities do not possess a tissue-specific numeric meaning, even in images acquired for the same subject, on the same scanner, for the same body region, and with the same pulse sequence, it is important to transform the image scale into a standard intensity scale so that, for the same body region, intensities are similar. The lack of a standard image intensity scale in MRI leads to many difficulties in tissue characterizability, image display, and analysis, including image segmentation and registration. The influence of standardization on these tasks has been documented well; however, how intensity non-standardness may affect the automatic recognition of anatomical structures for image segmentation has not been studied. Motivated by the study that we previously presented at the SPIE Medical Imaging Conference 2010,1,2 in this study we analyze the effects of intensity standardization on anatomical object recognition. A set of 31 scenarios of multiple objects from the ankle complex included in the model, together with seven different realistic levels of introduced non-standardness, is considered for evaluation. The experimental results imply that intensity variation among scenes in an ensemble (a particular characteristic of the behavior of non-standardness) degrades object recognition performance.
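To make the notion of non-standardness concrete, the toy snippet below applies a random linear gray-scale transform per scene, so the same tissue maps to different gray values across acquisitions. The gain/offset ranges are arbitrary illustrations and do not correspond to the seven realistic levels evaluated in the study.

```python
# Toy simulation of MR intensity non-standardness via per-scene linear transforms.
import numpy as np

def apply_nonstandardness(image, rng):
    scale = rng.uniform(0.7, 1.3)          # acquisition-dependent gain (illustrative)
    shift = rng.uniform(-50, 50)           # acquisition-dependent offset (illustrative)
    return image * scale + shift

rng = np.random.default_rng(1)
scene = np.random.default_rng(2).normal(300, 60, size=(64, 64))    # synthetic MR-like slice
variants = [apply_nonstandardness(scene, rng) for _ in range(7)]
print([round(v.mean(), 1) for v in variants])   # same "tissue", different mean intensities
```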
This paper investigates, using prior shape models and the concept of ball scale (b-scale), ways of automatically recognizing objects in 3D images without performing elaborate searches or optimization. That is, the goal is to place the model in a single shot close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. This is achieved via the following set of key ideas: (a) a semi-automatic way of constructing a multi-object shape model assembly; (b) a novel strategy of encoding, via b-scale, the pose relationship between objects in the training images and their intensity patterns captured in b-scale images; (c) a hierarchical mechanism of positioning the model, in a one-shot way, in a given image from knowledge of the learnt pose relationship and the b-scale image of the given image to be segmented. The evaluation results on a set of 20 routine clinical abdominal female and male CT data sets indicate the following: (1) incorporating a large number of objects improves the recognition accuracy dramatically; (2) the recognition algorithm can be thought of as a hierarchical framework in which quick placement of the model assembly constitutes coarse recognition and delineation itself constitutes the finest level of recognition; (3) scale yields useful information about the relationship between the model assembly and any given image, such that the recognition results in a placement of the model close to the actual pose without any elaborate searches or optimization; (4) effective object recognition can make delineation more accurate.
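The final "single shot" placement step can be pictured as applying one similarity transform to the model boundary points once a pose has been estimated; the sketch below shows only that last step. The rotation, scale, and translation values are made up, and the b-scale-based pose estimation itself is not shown.

```python
# Apply an estimated pose (scale, rotation, translation) to model boundary points.
import numpy as np

def place_model(points, scale, rotation, translation):
    """Similarity transform of (N, 3) model boundary points into the image frame."""
    return scale * points @ rotation.T + translation

theta = np.deg2rad(10.0)                                  # illustrative small rotation about z
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
model_pts = np.random.default_rng(0).normal(size=(100, 3))
placed = place_model(model_pts, scale=1.1, rotation=R, translation=np.array([5.0, -3.0, 2.0]))
```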
We call the computerized assistive process of recognizing, delineating, and quantifying organs and tissue regions in medical imaging, occurring automatically during clinical image interpretation, automatic anatomy recognition (AAR). The AAR system we are developing includes five main parts: model building, object recognition, object delineation, pathology detection, and organ system quantification. In this paper, we focus on the delineation part. For the modeling part, we employ the active shape model (ASM) strategy. For recognition and delineation, we integrate several hybrid strategies that combine purely image-based methods with ASM. In this paper, an iterative Graph-Cut ASM (IGCASM) method is proposed for object delineation. An algorithm called GC-ASM was presented at this symposium last year for object delineation in 2D images, which attempted to synergistically combine ASM and GC. Here, we extend this method to 3D medical image delineation. The IGCASM method effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. We propose a new GC cost function, which effectively integrates the specific image information with the ASM shape model information. The proposed methods are tested on a clinical abdominal CT data set. The preliminary results show that: (a) it is feasible to explicitly bring prior 3D statistical shape information into the GC framework; (b) the 3D IGCASM delineation method improves on ASM and GC and can provide practical operational times on clinical images.
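The idea of a shape-informed graph-cut cost can be illustrated with a small sketch: the terminal cost at each voxel combines an image term with a term derived from the current ASM shape estimate (here a signed-distance-based object likelihood). The weights, sigmoid mapping, and intensity model below are illustrative assumptions; the actual IGCASM cost function is not reproduced.

```python
# Combine an image term with an ASM-shape-derived term into a per-voxel "object" cost.
import numpy as np
from scipy.ndimage import distance_transform_edt

def shape_prior(shape_mask, sigma=3.0):
    """Turn a binary ASM shape estimate into a soft object-likelihood map."""
    d_out = distance_transform_edt(1 - shape_mask)
    d_in = distance_transform_edt(shape_mask)
    signed = d_out - d_in                          # >0 outside, <0 inside the shape
    return 1.0 / (1.0 + np.exp(signed / sigma))    # high inside, low outside

def tlink_costs(image, shape_mask, mu_obj, mu_bkg, lam=0.5):
    image_term = (image - mu_obj) ** 2 / ((image - mu_obj) ** 2 + (image - mu_bkg) ** 2 + 1e-8)
    shape_term = 1.0 - shape_prior(shape_mask)
    return image_term + lam * shape_term           # cost of labeling a voxel as "object"

image = np.random.default_rng(0).normal(100, 20, (64, 64))
mask = np.zeros((64, 64))
mask[20:44, 20:44] = 1                             # placeholder current ASM shape estimate
costs = tlink_costs(image, mask, mu_obj=140.0, mu_bkg=80.0)
```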
Acquisition-to-acquisition signal intensity variations (non-standardness) are inherent in MR images. Standardization is a post-processing method for correcting inter-subject intensity variations by transforming all images from the given image gray scale into a standard gray scale in which similar intensities achieve similar tissue meanings. The lack of a standard image intensity scale in MRI leads to many difficulties in tissue characterizability, image display, and analysis, including image segmentation. This phenomenon has been documented well; however, the effects of standardization on medical image registration have not been studied yet. In this paper, we investigate the influence of intensity standardization in registration tasks with systematic and analytic evaluations involving clinical MR images. We conducted nearly 20,000 clinical MR image registration experiments and evaluated the quality of the registrations both quantitatively and qualitatively. The evaluations show that intensity variations between images degrade the accuracy of registration performance. The results imply that the accuracy of image registration depends not only on spatial and geometric similarity but also on the similarity of the intensity values for the same tissues in different images.
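To illustrate the standardization idea described above, the sketch below maps each image's low/high percentile landmarks onto a fixed standard scale with a linear transform, so that similar intensities carry similar tissue meaning across acquisitions. This is a two-landmark simplification with arbitrary percentiles and scale bounds, not the full standardization method.

```python
# Simplified percentile-landmark intensity standardization.
import numpy as np

def standardize(image, pc=(1, 99), standard_range=(0, 4095)):
    lo, hi = np.percentile(image, pc)
    s_lo, s_hi = standard_range
    mapped = (image - lo) / (hi - lo) * (s_hi - s_lo) + s_lo
    return np.clip(mapped, s_lo, s_hi)

scan_a = np.random.default_rng(0).normal(300, 80, (128, 128))
scan_b = scan_a * 1.4 - 120                      # same "anatomy", different gray scale
a_std, b_std = standardize(scan_a), standardize(scan_b)
print(abs(a_std.mean() - b_std.mean()))          # much closer after standardization
```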
In this paper, we propose three novel and important methods for the registration of histological images for 3D reconstruction. First, possible intensity variations and non-standardness in images are corrected by an intensity standardization process which maps the image scale into a standard scale where similar intensities correspond to similar tissue meanings. Second, 2D histological images are mapped into a feature space where continuous variables are used as high-confidence image features for accurate registration. Third, we propose an automatic best-reference-slice selection algorithm that improves reconstruction quality based on both the image entropy and the mean squared error of the registration process. We demonstrate that the choice of reference slice has a significant impact on registration error, standardization, feature space, and entropy information. After the 2D histological slices are registered through an affine transformation with respect to an automatically chosen reference, the 3D volume is reconstructed by co-registering the 2D slices elastically.
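One ingredient of the reference-slice selection, ranking slices by intensity entropy, can be sketched as below; the full criterion described above also uses the mean squared error of the registrations, which is not computed here, and the slice stack is a synthetic placeholder.

```python
# Rank 2D slices by Shannon entropy of their intensity histogram.
import numpy as np

def slice_entropy(img, bins=64):
    hist, _ = np.histogram(img, bins=bins, density=True)
    p = hist[hist > 0]
    p = p / p.sum()
    return -np.sum(p * np.log2(p))

stack = np.random.default_rng(0).normal(size=(50, 256, 256))   # placeholder slice stack
reference_index = int(np.argmax([slice_entropy(s) for s in stack]))
```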