J. Imaging, Volume 6, Issue 5 (May 2020) – 10 articles

Cover Story: In this study, we evaluate the velocity of the tongue tip with magnetic resonance imaging (MRI) using two independent approaches. The first is acquisition with a real-time technique in the mid-sagittal plane. Tracking the tongue tip manually and with a computer vision method allows its trajectory to be found and the velocity to be calculated as the derivative of the coordinate. We also propose the use of another approach, phase-contrast MRI, which enables velocities of the moving tissues to be measured directly. Simultaneous sound recording enabled us to find the relation between the movements and the sound. The results of both methods are in qualitative agreement and are consistent with other techniques used for evaluation of the tongue tip velocity.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open them.
13 pages, 7252 KiB  
Article
Comparative Study of Contact Repulsion in Control and Mutant Macrophages Using a Novel Interaction Detection
by José Alonso Solís-Lemus, Besaiz J Sánchez-Sánchez, Stefania Marcotti, Mubarik Burki, Brian Stramer and Constantino Carlos Reyes-Aldasoro
J. Imaging 2020, 6(5), 36; https://doi.org/10.3390/jimaging6050036 - 20 May 2020
Viewed by 4655
Abstract
In this paper, a novel method for interaction detection is presented to compare the contact dynamics of macrophages in the Drosophila embryo. The study is carried out with a framework called macrosight, which analyses the movement and interaction of migrating macrophages. The framework combines a segmentation and tracking algorithm with analysis of the motion characteristics of cells after contact. In this particular study, the interactions between cells are characterised for control embryos and Shot mutants; Shot is a candidate protein hypothesised to regulate contact dynamics between migrating cells. Statistical significance between control and mutant cells was found when comparing the direction of motion after contact in specific conditions. Such discoveries provide insights for future developments in combining biological experiments with computational analysis.
(This article belongs to the Special Issue MIUA2019)
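As a rough illustration of the kind of measurement described in the abstract, the minimal Python sketch below computes a change-of-direction angle for one tracked cell by rotating its post-contact displacement into a frame aligned with its pre-contact direction, in the spirit of the (x′, y′) frame of reference in Figure 9. This is not the macrosight implementation; the function name, the window length s, and the toy track are illustrative assumptions.

```python
import numpy as np

def change_of_direction(track, contact_idx, s=5):
    """Angle (degrees) between the pre- and post-contact motion of one cell.

    track: (N, 2) array of (x, y) centroid positions over time (hypothetical input).
    contact_idx: frame index at which the cell enters a clump.
    s: number of frames before/after contact used to define each direction.
    """
    before = track[contact_idx] - track[max(contact_idx - s, 0)]
    after = track[min(contact_idx + s, len(track) - 1)] - track[contact_idx]

    # Rotate into a frame whose x-axis is aligned with the pre-contact direction,
    # so the change of direction is the polar angle of the rotated post-contact vector.
    psi = np.arctan2(before[1], before[0])
    rot = np.array([[np.cos(-psi), -np.sin(-psi)],
                    [np.sin(-psi),  np.cos(-psi)]])
    after_rotated = rot @ after
    return np.degrees(np.arctan2(after_rotated[1], after_rotated[0]))

# Toy track: a cell moves right for five frames, contacts another cell, then moves up.
track = np.array([[i, 0.0] for i in range(6)] + [[5.0, j] for j in range(1, 6)])
print(round(change_of_direction(track, contact_idx=5), 1))  # 90.0
```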
Figure 1. Illustration of the main hypothesis in this work. Different movement patterns from control and mutant samples are expected as a result of the movement analysis performed. The diagram shows the two different types of cells, controls (a) and mutants (b), being processed with macrosight [33] (c). The output (d) consists of measurements of the cells' trajectories and the changes in direction upon interactions, represented by the different types of line and colours in the diagram.
Figure 2. Comparison between four frames of (a) the control against four frames of a (b) mutant dataset. These datasets were selected as they had a similar number of frames, and thus a similar spacing between the frames in both cases could be shown (≈95).
Figure 3. Illustration of a series of clumps in (a) control and (b) mutant experiments. Both datasets present overlapping events, i.e., clumps, which are highlighted with yellow boxes. It should be noted that although the microtubules are overlapping, the nuclei are still separated.
Figure 4. Illustration of the macrosight framework parts used in this work. (a) Illustration of a sequence of images with cells with red nuclei and green microtubules. The two fluorescent channels are segmented in (b) based on a hysteresis threshold where the levels are selected by the Otsu [41] algorithm. The segmentation of the red channel (b.i) provides the cell positions necessary to produce (c) the tracks of the cells using the keyhole tracking algorithm [40] (represented in cyan, magenta, and yellow). Finally, the tracks' information is combined with the clump information (d) from the segmented green channel (b.ii) to allow analysis of movement based on contact events (e), producing the change of direction chart per cell in the clump. In this case, two cells interact and form a clump (magenta and cyan), whilst the other cell (yellow) does not form a clump. The diagram illustrates the change of direction of those cells that interact in the clump.
Figure 5. Illustration of clump codes for the different time frames for a particular track T2. The horizontal axis represents time, and the detail of five frames is presented to illustrate the evolution of track T2 as it interacts with other cells. In (a,e), track T2 is not in contact with any other cell, thus no clump is present. (b,d) represent moments when T2 and T1 interact in clump 2001. Following, in (c), tracks T3 and T5 become present in the clump; thus, the clump code changes to 5003002001.
Figure 6. This figure shows three examples of the change of direction before and after a clump. Column (a) shows the cells that interact in three different clumps: 2001, 3002, and 22001. A red line (*−) shows the orientation of movement before the clump, and a green line (⋄−) represents the positions of movement after. A yellow arrow is superimposed on the image to show the trajectory of the cell inside the clump. (b) Simplified view of the cells' changes in orientation. The cells' path before the clump is represented in blue (−⋄−). The path of the cell after the clump is shown in orange (:*). The angle arc of orientation is shown in magenta. Notice that the movement of the two cells involved in clump 2001 is considerably smaller compared to the other cases.
Figure 7. Representation of the migration of two cells as they form a clump. The perimeters of the individual cells are highlighted by cyan and magenta lines, whilst the perimeter of the clump is highlighted with a yellow line. Red lines indicate the movement of the individual cells before the clump is created, and a green line indicates the positions of cells after they separate. To show the duration of the clump, the number of time frames is shown above the images. In this case, the cells overlap and form the clump for 18 frames, which is equivalent to 180 s.
Figure 8. Frames in different interactions overlapped to appreciate cell movement and clump formation. (a–c) Three frames are superimposed: the first, middle, and final frames in each experiment are shown, with corresponding segmentations and tracks. The full track in each experiment is presented, with changes of colour representing different moments: before (red), during (yellow), and after (green) the clump. (d) is a representation of the same cells forming different clumps at different time points.
Figure 9. Illustration of direction change (θx) measurement. Three markers represent different positions of a given track. The markers are as follows: (∘) represents S frames before contact; (⋄) represents the starting instant of the clump; and (*) represents the position where the experiment is finalised. Notice the translation and rotation into the new frame of reference (x′, y′).
Figure 10. Comparison of aligned tracks for (a) control and (b) mutant interactions. Each line represents the trajectory of one cell, and the marker (·) represents the position at a certain time frame. Each line can be read from the utmost left point and continuing, initially towards the right, along the line to the next time frame marker. The grey lines correspond to the cells before entering the clump, where the origin (0,0) corresponds to the clump formation. Red and blue lines correspond to five time frames of each cell after exiting the clump.
Figure 11. Comparison of relevant variables between control (blue) and mutant (red) interactions. (a) Change of direction angle, θx, coming from Figure 10. (b) Time in clump TC, in frames. Finally, (c) shows the distances to the centre or origin of the new frame of reference (x′, y′) (i.e., the length of the tracks after they leave the clump).
Figure 12. Change of direction differences between control (blue) and mutant (red) interactions for those tracks whose absolute value of the angle is in the range 0° < θx < 90°. These two populations present a statistically significant difference (p = 0.03).
12 pages, 1311 KiB  
Article
On a Method For Reconstructing Computed Tomography Datasets from an Unstable Source
by Nicholas Stull, Josh McCumber, Lawrence D'Aries, Michelle Espy, Cort Gautier and James Hunter
J. Imaging 2020, 6(5), 35; https://doi.org/10.3390/jimaging6050035 - 19 May 2020
Cited by 1 | Viewed by 3530
Abstract
As work continues in neutron computed tomography, at the Los Alamos Neutron Science Center (LANSCE) and other locations, source reliability over long imaging times is an issue of increasing importance. Moreover, given the time commitment involved in a single neutron image, it is impractical to simply discard a scan and restart in the event of beam instability. To mitigate the associated cost and time, strategies are presented in the current work to produce a successful reconstruction of computed tomography data from an unstable source. The present work uses a high-energy neutron tomography dataset from a simulated munition, collected at LANSCE, to demonstrate the method, which is general enough to be of use with unstable X-ray computed tomography sources as well.
(This article belongs to the Special Issue Neutron Imaging)
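The abstract does not reproduce the paper's filtering rule, but the general idea of discarding radiographs whose open-beam ("postage stamp") counts stray too far from a local median can be sketched as below. The window length and tolerance used here are simplified stand-ins for the parameters quoted in Figure 4, not the authors' exact criterion.

```python
import numpy as np

def keep_stable_frames(stamp_means, window=21, rel_tol=0.1):
    """Keep radiographs whose open-beam ('postage stamp') counts stay within a
    relative tolerance of a rolling median, rejecting beam dropouts and large
    fluctuations (window and rel_tol are illustrative, not the paper's values).
    """
    n = len(stamp_means)
    half = window // 2
    keep = np.zeros(n, dtype=bool)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        local_median = np.median(stamp_means[lo:hi])
        keep[i] = abs(stamp_means[i] - local_median) <= rel_tol * local_median
    return keep

# Toy example: a steady beam with a single dropout frame.
counts = np.array([100.0] * 10 + [5.0] + [100.0] * 10)
print(keep_stable_frames(counts, window=7, rel_tol=0.15).sum())  # 20 of 21 frames kept
```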
Figure 1. Full side profile of the reconstructed simulated munition, reassembled from 3 separate neutron radiography computed tomography experiments on overlapping segments of the part.
Figure 2. The above graph shows real light and dark postage stamp fluctuation over the time scale of a scan of the middle portion of the simulated munition. The light postage stamp has higher counts throughout, except when the beam drops out entirely. Note the long-term fluctuation is on the order of 15–20%.
Figure 3. Consecutive neutron radiographs taken during the experiment. Top Left: Radiograph #2323, in which the beam is performing as expected. Top Right: Radiograph #2324, in which nearly total beam dropout has occurred. Bottom Left: Histogram of Radiograph #2323, restricted to the range of [6000, 15,000] for clarity. Bottom Right: Histogram of Radiograph #2324, restricted to the range of [6000, 15,000] for clarity.
Figure 4. The above graph shows real light and dark postage stamp fluctuation over the time scale of a scan of the middle portion of the simulated munition after applying the filter prescribed in the Remark. For reference, the parameters used were a window length of 21, T_rel = 2, T_abs = 2, T_perc = 0.03125.
Figure 5. The number of radiographs per view is presented above. Note that even after filtering results for beam dropout and large beam fluctuation, the full set of 1080 views is adequately represented.
Figure 6. Left: A representative slice, exhibiting a hard hit from a neutron in the lower left (indicated by the blue arrow); Right: A side profile of the 3-dimensional reconstructed volume. Note the intensity streaks in the volume, which are partially due to hard hits and partially due to beam fluctuation. Visualizations created in Volume Graphics' VGStudio Max (for information on VGStudio Max, see [22]).
Figure 7. Left: No hard hits are observed in any slices in the corrected reconstruction. A sample slice is shown, with blue arrows indicating features examined via lineout (see Figure 8); Right: A side profile of the 3-dimensional reconstructed volume. Note that the intensity fluctuations (striping) in the side profile are no longer apparent. Visualizations created in Volume Graphics' VGStudio Max (for information on VGStudio Max, see [22]).
Figure 8. Lineout comparison done between the slice featuring a hard hit (Figure 6) and the slice post-processing (Figure 7).
17 pages, 1322 KiB  
Article
Multilevel Analysis of the Influence of Maternal Smoking and Alcohol Consumption on the Facial Shape of English Adolescents
by Jennifer Galloway, Damian J.J. Farnell, Stephen Richmond and Alexei I. Zhurov
J. Imaging 2020, 6(5), 34; https://doi.org/10.3390/jimaging6050034 - 18 May 2020
Cited by 4 | Viewed by 3473
Abstract
This cross-sectional study aims to assess the influence of maternal smoking and alcohol consumption during pregnancy on the facial shape of non-syndromic English adolescents and to demonstrate the potential benefits of using multilevel principal component analysis (mPCA). A cohort of 3755 non-syndromic 15-year-olds from the Avon Longitudinal Study of Parents and Children (ALSPAC), England, were included. Maternal smoking and alcohol consumption during the 1st and 2nd trimesters of pregnancy were determined via questionnaire at 18 weeks gestation. Twenty-one facial landmarks, used as a proxy for the main facial features, were manually plotted onto 3D facial scans of the participants. The effect of maternal smoking and maternal alcohol consumption (average 1–2 glasses per week) was minimal, explaining 0.66% and 0.48% of the variation in the 21 landmarks of non-syndromic offspring, respectively. This study provides a further example of mPCA being used effectively as a descriptive analysis in facial shape research. This is the first example of mPCA being extended to four levels to assess the influence of environmental factors. Further work on the influence of high/low levels of smoking and alcohol and on providing inferential evidence is required.
(This article belongs to the Special Issue MIUA2019)
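For readers unfamiliar with the baseline that mPCA extends, the sketch below runs an ordinary single-level PCA on flattened landmark coordinates and reports the variance explained per component, roughly what the eigenvalue plot in Figure 3 and the score scatter plots in Figures 4 and 5 summarise. The data are random stand-ins, and the paper's four-level multilevel decomposition is not reproduced here.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 500 subjects x 21 landmarks x 3 coordinates.
landmarks = rng.normal(size=(500, 21, 3))
X = landmarks.reshape(500, -1)            # flatten to a 63-dimensional shape vector
X = X - X.mean(axis=0)                    # centre (Procrustes alignment omitted here)

pca = PCA(n_components=10)
scores = pca.fit_transform(X)             # per-subject component scores (cf. Figures 4 and 5)
print(pca.explained_variance_ratio_[:3])  # fraction of landmark variation per component
```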
Figure 1. The 21 landmarks as described by Farkas [32]: (1) Glabella, (2) Nasion, (3) Endocanthion (Left), (4) Endocanthion (Right), (5) Exocanthion (Left), (6) Exocanthion (Right), (7) Palpebrale Superius (Left), (8) Palpebrale Superius (Right), (9) Palpebrale Inferius (Left), (10) Palpebrale Inferius (Right), (11) Pronasale, (12) Subnasale, (13) Alare (Left), (14) Alare (Right), (15) Labiale Superius, (16) Crista Philtri (Left), (17) Crista Philtri (Right), (18) Labiale Inferius, (19) Cheilion (Left), (20) Cheilion (Right), and (21) Pogonion.
Figure 2. A flow chart of participant inclusion.
Figure 3. Eigenvalue plot allowing visualisation of the eigenvalue magnitude for each principal component, at each level of the model.
Figure 4. Scatter plots of the standardised component scores at PC1 and PC2 for conventional PCA (a,b,d) and mPCA (c,e). No obvious pattern in the separation of the group means is evident in conventional PCA PC1/2 (a). There is a suggestion of a pattern in conventional PCA PC7/8 (b) and mPCA smoking level PC2 (c). Subjects whose mothers did not smoke during pregnancy are possibly separated from those whose mothers smoked during the 1st trimester or both the 1st and 2nd trimesters. There is no obvious pattern in the separation of the group means with alcohol consumption during pregnancy (d,e). Scatter plots for conventional PCA PC3–10 are available as supplementary material.
Figure 5. Scatter plots of the standardised component scores at PC1 and PC2 for conventional PCA (a), PC1 for sex level mPCA (b) and PC1 and PC2 for subject level mPCA (c). Clear separation of the group means of the biological sexes is evident for both conventional PCA and mPCA. At a subject level, the centroids seem to be centred around the origin, although there is some deviation from this, perhaps due to group sample sizes.
20 pages, 7387 KiB  
Article
Subpixel Localization of Isolated Edges and Streaks in Digital Images
by Devin T. Renshaw and John A. Christian
J. Imaging 2020, 6(5), 33; https://doi.org/10.3390/jimaging6050033 - 18 May 2020
Cited by 13 | Viewed by 4646
Abstract
Many modern sensing systems rely on the accurate extraction of measurement data from digital images. The localization of edges and streaks in digital images is an important example of this type of measurement, and these techniques appear in many image processing pipelines. Several approaches attempt to solve this problem at both the pixel level and the subpixel level. While subpixel methods are often necessary for applications requiring the best possible accuracy, they are frequently susceptible to noise, rely on iterative methods, or require pre-processing. This work investigates a unified framework for subpixel edge and streak localization using Zernike moments with ramp-based and wedge-based signal models. The method described here is found to outperform the current state of the art for digital images with common signal-to-noise ratios. Performance is demonstrated on both synthetic and real images.
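A minimal sketch of Zernike-moment edge localization is given below. It uses the classic step-edge relations (orientation from the phase of A11, subpixel offset from the ratio A20/A′11) rather than the ramp and wedge models developed in the paper, and the grid mapping, function name, and toy patch are illustrative assumptions.

```python
import numpy as np

def zernike_edge(patch):
    """Estimate edge orientation psi and offset l (in units of the patch radius)
    from a square patch, using unnormalised Zernike moments over the inscribed
    unit disk and the classic step-edge relation l = A20 / A11' (sign
    conventions for psi differ between references)."""
    n = patch.shape[0]
    c = np.linspace(-1 + 1 / n, 1 - 1 / n, n)     # pixel centres mapped to [-1, 1]
    x, y = np.meshgrid(c, c)
    r2 = x**2 + y**2
    disk = r2 <= 1.0
    dA = (2.0 / n) ** 2                            # area of one pixel on the unit disk

    v11 = (x + 1j * y) * disk                      # Zernike basis V11 = rho * exp(i*theta)
    v20 = (2 * r2 - 1) * disk                      # Zernike basis V20 = 2*rho^2 - 1
    a11 = np.sum(patch * np.conj(v11)) * dA
    a20 = np.sum(patch * v20) * dA

    psi = -np.angle(a11)                           # edge normal (towards the brighter side)
    a11_rot = np.real(a11 * np.exp(1j * psi))      # A11 rotated onto the real axis
    return psi, np.real(a20) / a11_rot

# Toy patch: a vertical step edge offset by 0.3 of the patch radius.
n = 101
c = np.linspace(-1 + 1 / n, 1 - 1 / n, n)
x, _ = np.meshgrid(c, c)
patch = (x >= 0.3).astype(float)
psi, l = zernike_edge(patch)
print(round(np.degrees(psi), 1), round(l, 1))      # approximately 0 degrees and 0.3
```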
Figure 1. Example geometry of a square image patch (N_p = 5, shown in dark red) centered about the pixel-level edge guess {ũ_i, ṽ_i} shown in bright red. The edge has a blur width of 2w and is offset from the pixel-level guess by a distance ℓ. The primed frame (rotated by an angle ψ relative to the unprimed image frame) is aligned with the edge, with v̄′ being parallel to the edge and ū′ being normal to the edge. Although this figure shows only an edge, these coordinate frame conventions are the same for both edges and streaks.
Figure 2. Graphical representation of a continuous edge signal (modeled as a linear ramp using Equation (34)) within the unit circle, including background intensity h, peak intensity of edge k, edge width w, and distance from the origin to the midpoint of the edge ℓ.
Figure 3. Graphical representation of a continuous streak signal (modeled as a wedge using Equation (44)) within the unit circle, including background intensity h, peak intensity of streak k, width of the streak w, and distance from the origin to the streak ℓ.
Figure 4. Contours of edge localization error for a continuous (not pixelated) ramp edge signal. Black contours show the error when using the approximation from Equation (41); red contours show the error when using the step function approximation from [11].
Figure 5. Contours of edge localization error (in pixels, assuming a 5 × 5 mask) in a digital image for our method from Equation (41) (black), the step function approximation using Zernike moments (red) [11], and the partial area effect (blue) [13] as a function of SNR and blur. Error statistics are computed from a Monte Carlo analysis consisting of 5000 randomized images at each SNR and blur combination.
Figure 6. Qualitative visualization of subpixel edge localization performance at varying levels of blur and SNR. The left column shows the full synthetically generated image and the right column shows a small area within that image. The rows represent different noise and blur levels (top: no noise or blur; middle: noise only (approximately 28.4 peak signal to noise ratio); bottom: noise and blur (2D Gaussian kernel with standard deviation 0.3 pixels)). The black line is the exact location of the true edge.
Figure 7. Contours of streak localization error when using Equation (57) for a continuous (not pixelated) wedge edge signal.
Figure 8. Contours of streak localization error (in pixels, assuming a 5 × 5 mask) in a digital image for our method as a function of SNR and blur. Error statistics are computed from a Monte Carlo analysis consisting of 5000 randomized images at each SNR and blur combination.
Figure 9. Qualitative visualization of subpixel streak localization performance at varying levels of blur and SNR. The left column shows the full synthetically generated image and the right column shows a small area within that image. The rows represent different noise and blur levels (top: no noise or blur; middle: noise only (approximately 28.5 peak signal to noise ratio); bottom: noise and blur (2D Gaussian kernel with standard deviation of 0.3 pixels)). The black line is the exact location of the true streak center.
Figure 10. Images of the Mississippi River taken by the Landsat-5 spacecraft, where we seek to localize the river banks. The top image (LM05_L1TP_025032_20120830_20180521_01_T2) was collected on 21 May 2018 by the Multispectral Scanner System (MSS) and shows the river during normal conditions. The bottom image (LT05_L1TP_025032_20110508_20160902_01_T1) was collected on 2 September 2011 by the thematic mapper (TM) and shows the river after a major flooding event. The red × symbols denote pixel-level edge estimates and green dots denote the refined subpixel localization estimates. Image data is available from the U.S. Geological Survey (USGS) [30].
Figure 11. Image of a pedestrian crosswalk in Watervliet, NY, where we seek to localize the edges of the painted surface markings. The red × symbols denote pixel-level edge estimates and green dots denote the refined subpixel localization estimates. Original image collected by the authors with a personal camera.
Figure 12. Image of a street in Watervliet, NY, where we seek to localize the edges of the painted yellow lane markings. The red × symbols denote pixel-level edge estimates and green dots denote the refined subpixel localization estimates. Original image collected by the authors with a personal camera.
Figure 13. Image of Rhea (a moon of Saturn) collected by the Cassini spacecraft's Narrow Angle Camera (NAC) on 13 October 2006 (raw image N1539252663 [31]), where we seek to localize the moon's lit limb. The red × symbols denote pixel-level edge estimates and green dots denote the refined subpixel localization estimates.
Figure 14. Inertially pointed star field image captured with the Omnidirectional Space Situational Awareness (OmniSSA) system. This example image has a 10 s exposure time and contains a satellite that appears as a streak within the image. The red × symbols denote pixel-level streak estimates and green dots denote the refined subpixel localization estimates. The original OmniSSA image is courtesy of Dr. Marcus Holzinger of the University of Colorado Boulder.
Figure 15. Image of Kuiper belt object Arrokoth (formerly called Ultima Thule) collected by the New Horizons spacecraft's Long Range Reconnaissance Imager (LORRI) during a flyby in early 2019 (credit for raw image: NASA/Johns Hopkins University Applied Physics Laboratory/Southwest Research Institute). The red × symbols denote pixel-level streak estimates and green dots denote the refined subpixel localization estimates.
Figure 16. Image of a retinal scan for a healthy eye, where we seek to localize blood vessels. The red × symbols denote pixel-level streak estimates and green dots denote the refined subpixel localization estimates. The original image is im00032 from the STARE database [32,33].
Figure 17. Microscope image from an in vitro tumor model embedded in a hydrogel. We seek to localize the edges of tumors to measure their growth over time [34,35]. The red × symbols denote pixel-level edge estimates and green dots denote the refined subpixel localization estimates. The original image is courtesy of Dr. Kristen Mills of Rensselaer Polytechnic Institute.
17 pages, 10112 KiB  
Article
CNN-Based Page Segmentation and Object Classification for Counting Population in Ottoman Archival Documentation
by Yekta Said Can and M. Erdem Kabadayı
J. Imaging 2020, 6(5), 32; https://doi.org/10.3390/jimaging6050032 - 14 May 2020
Cited by 12 | Viewed by 4273
Abstract
Historical document analysis systems gain importance with the increasing efforts in the digitalization of archives. Page segmentation and layout analysis are crucial steps for such systems. Errors in these steps will affect the outcome of handwritten text recognition and Optical Character Recognition (OCR) methods, which increases the importance of page segmentation and layout analysis. Degradation of documents, digitization errors, and varying layout styles are the issues that complicate the segmentation of historical documents. The properties of Arabic scripts, such as connected letters, ligatures, diacritics, and different writing styles, make it even more challenging to process Arabic script historical documents. In this study, we developed an automatic system for counting registered individuals and assigning them to populated places by using a CNN-based architecture. To evaluate the performance of our system, we created a labeled dataset of registers obtained from the first wave of population registers of the Ottoman Empire, held between the 1840s and 1860s. We achieved promising results for classifying different types of objects and for counting the individuals and assigning them to populated places.
(This article belongs to the Special Issue Recent Advances in Historical Document Processing)
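Once a page has been segmented into object classes, counting reduces to labelling connected components in the binarized prediction, as in Figure 8 below. A minimal SciPy sketch follows; the min_area threshold and the toy mask are assumptions, and, as Figure 9 illustrates, touching objects would still be merged into a single count.

```python
import numpy as np
from scipy import ndimage

def count_individuals(pred_mask, min_area=50):
    """Count 'individual' objects in a binarized prediction mask by labelling
    connected components and discarding tiny spurious regions (min_area is a
    hypothetical noise threshold)."""
    labels, n = ndimage.label(pred_mask)
    sizes = ndimage.sum(pred_mask, labels, index=range(1, n + 1))
    return int(np.sum(sizes >= min_area))

# Toy mask: two well-separated objects and one spurious pixel.
mask = np.zeros((100, 100), dtype=bool)
mask[10:30, 10:60] = True       # first individual record
mask[50:70, 10:60] = True       # second individual record
mask[90, 90] = True             # noise, removed by the min_area filter
print(count_individuals(mask))  # 2
```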
Figure 1. Three sample pages of the registers belonging to three different districts. The layout of pages can change between districts.
Figure 2. Start of the populated place (village or neighborhood) symbol and individual objects are demonstrated. When a new populated place is registered, its name is written at the top of a new page (populated place start symbol). Then, all men in this place are written one by one (individual objects). These objects include the name, age, appearance, and job of the individuals.
Figure 3. Example updates of registers are shown. Some of them can connect two individuals and can cause clustering errors. Green enclosed objects are individuals; red ones are populated place symbols; and blue ones are the updates connecting two other object types.
Figure 4. A sample register page and its labeled version are demonstrated. Different colors represent different object types. The background, which is the region between the objects and document borders, is marked with blue. The start of a populated place object is colored with red. The individual objects are marked with green.
Figure 5. Training metrics are demonstrated. In the top left, the learning rate, in the top right, the loss function, in the bottom left, regularized loss, and in the bottom right, global steps per second metrics are demonstrated. The subfigures are created with Tensorboard. The horizontal axis is the increasing iterations.
Figure 6. Flowchart of our populated place assigning algorithm.
Figure 7. Examples of intertwined rows and columns are shown. They are counted as one since there are not any empty pixels in between.
Figure 8. A sample prediction made by our system. In the left, a binarized prediction image for counting individuals, in the middle, a binarized image for counting populated place start, and in the right, the objects, enclosed with rectangular boxes. Green boxes for individual register counting and the red box for counting the populated place start object.
Figure 9. A sample counting mistake. All three individual registers are counted as one. This results in two missing records in our automatic counting system.
14 pages, 2857 KiB  
Article
Measurement of Tongue Tip Velocity from Real-Time MRI and Phase-Contrast Cine-MRI in Consonant Production
by Karyna Isaieva, Yves Laprie, Freddy Odille, Ioannis K. Douros, Jacques Felblinger and Pierre-André Vuissoz
J. Imaging 2020, 6(5), 31; https://doi.org/10.3390/jimaging6050031 - 13 May 2020
Cited by 4 | Viewed by 3928
Abstract
We evaluate the velocity of the tongue tip with magnetic resonance imaging (MRI) using two independent approaches. The first consists of acquisition with a real-time technique in the mid-sagittal plane. Tracking the tongue tip manually and with a computer vision method allows its trajectory to be found and the velocity to be calculated as the derivative of the coordinate. We also propose the use of another approach, phase-contrast MRI, which enables velocities of the moving tissues to be measured directly. We recorded the sound simultaneously with the MR acquisition, which enabled us to draw conclusions regarding the relation between the movements and the sound. We acquired the data from two French-speaking subjects articulating /tata/. The results of both methods are in qualitative agreement and are consistent with other techniques used for evaluation of the tongue tip velocity.
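The real-time-MRI branch of the method comes down to differentiating a tracked coordinate with respect to time. A minimal sketch is below; the frame rate, units, and toy displacement profile are assumptions rather than the acquisition parameters used in the paper.

```python
import numpy as np

def tip_velocity(displacement_mm, frame_rate_hz):
    """Tongue-tip velocity as the time derivative of the tracked displacement,
    using central finite differences (frame_rate_hz is a hypothetical value)."""
    dt = 1.0 / frame_rate_hz
    return np.gradient(displacement_mm, dt)          # signed: closure vs. opening

# Toy example: 2 mm of steady closure over 10 frames at 50 frames/s -> 10 mm/s.
d = np.linspace(0.0, 2.0, 11)
print(round(tip_velocity(d, frame_rate_hz=50.0)[5], 1))  # 10.0
```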
Figure 1. Illustration of the routine of the slice and region of interest positioning. (a) examples of frames of dynamic acquisition of the subject obtained with the real-time MRI sequence in the mid-sagittal plane: during production of /a/, of /t/ and transition between /a/ and /t/. The frame denotes the position of the slice touching the hard palate and perpendicular to the tongue motion direction; (b) example frame of the acquisition in the plane denoted by the frame in Figure 1a. Region of interest is shown by the rectangle. The inset presents the zoomed-in region containing the ROI.
Figure 2. Scheme of the tongue tip tracking procedure illustrated with fragments of images for S1 acquired in the mid-sagittal plane. The yellow arrows show the displacement fields and the numbers under the images denote frame numbers. The red circles are the resulting points, and the yellow circles show the manually selected first point given for the comparison.
Figure 3. Velocity v(time) (the left scale) and displacement curves d(time) (the right scale) are shown on the top, sound level L_I(time) (middle) and sound spectrogram s(time, ν) (bottom) extracted from the real-time MRI images of the subject S2 and corresponding sound recordings. The solid curve corresponds to velocities evaluated manually, the dashed curve corresponds to those evaluated automatically, and the dotted curve corresponds to the displacement between the lower boundary of the slice used for the phase-contrast sequences and the tongue tip. Positive velocity values correspond to closure of the mouth and negative to opening. Vertical dash-pointed lines denote time when velocity takes zero value.
Figure 4. Results for S1 articulating /tata/ calculated from real-time MRI. (a) the whole velocity v(time) (the left scale) and the displacement d(time) curves (the right scale). The solid curve denotes velocity found manually, the dashed curve corresponds to automatically evaluated velocities, and the dotted line corresponds to the tongue tip displacement. The solid bold vertical lines present the borders of Figure 4c. Positive velocity values correspond to the upward tongue tip movement to achieve the constriction; (b) part of an image taken in the mid-sagittal plane, with the region of interest, shown in Figure 4c (solid black rectangle) and the displacement axis (the inclined axis); (c) example of one repetition of /tata/. Vertical dash-dotted black lines indicate points t0–t5 where velocity becomes zero and are denoted as the x (t0), the circle (t1), the triangle (t2), the square (t3), the diamond (t4) and the asterisk (t5); (d) the whole trajectory of the tongue motion on an image in the mid-sagittal plane with the same markers as for Figure 4c corresponding to t0–t5.
Figure 5. Examples of phase-contrast images acquired in the oblique plane for the initial position and the extreme tongue trajectory points which are presented in Table 1 for both subjects. Black color corresponds to −π and motion downwards and white corresponds to π and upwards. The yellow rectangle denotes the region of interest.
Figure 6. Velocity curves v(Time) and spectrograms s(Time, ν) obtained from PC cine-MRI for subjects S1 and S2. Vertical dashed black lines on the velocity plots indicate time when velocity takes zero value, and are denoted in correspondence with Figure 4. The filled zones indicate (μ_i − σ_i, μ_i + σ_i) for onset and offset of both vowels, where μ is the corresponding mean, σ is the corresponding standard deviation, and i is the index of the zone. The spectrograms are obtained from the sound averaged on the 21 repetitions aligned with the MR images. The vertical solid lines denote averaged onsets and offsets for both vowels.
Figure 7. (a) peak velocity values evaluated with real-time MRI with manual labeling (white bars) and PC cine-MRI (black bars). Height of the bars represents mean absolute values of the first minimum, the maximum between two minima, and the second minimum. Error bars correspond to the standard deviation; (b) mean velocity values evaluated with real-time MRI with manual labeling (white bars) and PC cine-MRI (black bars). Height of the bars represents mean absolute values of the first opening taking place within the interval (t1, t2), the closure between two openings (within (t2, t3)), and the second closure (within (t3, t4)). Error bars correspond to the standard deviation.
22 pages, 5416 KiB  
Project Report
Influence of Image TIFF Format and JPEG Compression Level in the Accuracy of the 3D Model and Quality of the Orthophoto in UAV Photogrammetry
by Vincenzo Saverio Alfio, Domenica Costantino and Massimiliano Pepe
J. Imaging 2020, 6(5), 30; https://doi.org/10.3390/jimaging6050030 - 11 May 2020
Cited by 30 | Viewed by 7125
Abstract
The aim of this study is to evaluate the degradation of the accuracy and quality of the images in relation to the TIFF format and different compression levels of the JPEG format, compared to the raw images acquired by a UAV platform. Experiments were carried out using a DJI Mavic 2 Pro with the Hasselblad L1D-20c camera on three test sites. Post-processing of the images was performed using software based on structure-from-motion and multi-view stereo approaches. The results show a slight influence of image format and compression level on flat or nearly flat surfaces; in the case of a complex 3D model, instead, the choice of format becomes important. Across all tests, processing times were found to also play a key role, especially in point cloud generation. The qualitative and quantitative analysis carried out on the different orthophotos highlighted a modest impact of the TIFF format and a strong influence as the JPEG compression level increases.
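As a small, self-contained illustration of how compression level degrades image fidelity (only a rough proxy for the photogrammetric accuracy the paper measures on point clouds and orthophotos), the sketch below re-encodes an image at several JPEG quality settings with Pillow and reports the PSNR against the original. The quality values chosen as loose analogues of JPEG12/JPEG6/JPEG1, and the random stand-in image, are assumptions.

```python
import io
import numpy as np
from PIL import Image

def psnr(a, b):
    """Peak signal-to-noise ratio between two 8-bit images."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

# Hypothetical stand-in for a UAV frame; replace with a real image.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

for quality in (95, 75, 30):                 # loose analogues of JPEG12 / JPEG6 / JPEG1
    buf = io.BytesIO()
    Image.fromarray(original).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    decoded = np.asarray(Image.open(buf))
    print(quality, round(psnr(original, decoded), 1))
```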
Figure 1. Pipeline of the developed method of investigation.
Figure 2. Test sites under investigation: panorama image of the test site 1 (a), test site 2 (b) and test site 3 (c); flight planning on test site 1 (d), on test site 2 (e) and on test site 3 (f).
Figure 3. Orthophoto of the test site 3 using different image formats: digital negative (DNG) (a); TIFF (b); JPEG at low level of compression (JPEG12) (c); JPEG at medium level of compression (JPEG6) (d); and JPEG at maximum level of compression (JPEG1) (e).
Figure 4. Histograms of difference between the point cloud generated by DNG images and the other point clouds generated by several formats and levels of JPEG compression: test site 1: TIFF (a), JPEG12 (b), JPEG6 (c), JPEG1 (d); test site 2: TIFF (e), JPEG12 (f), JPEG6 (g), JPEG1 (h); test site 3: TIFF (i), JPEG12 (l), JPEG6 (m), JPEG1 (n).
Figure 5. Comparison of point clouds in a section of the temple: (a) 3D point cloud with the identification of the section, (b) sections indicated with different colors obtained from Agisoft Metashape, and (c) 3DF Zephyr.
Figure 6. Comparison of point clouds (JPEG12) elaborated by Agisoft Metashape and 3DF Zephyr: (a) low quality and (b) medium quality.
Figure 7. Detail of the several orthophotos and their color features: (a) DNG, (b) TIFF, (c) JPEG12, (d) JPEG6, (e) JPEG1.
Figure 8. Modulation transfer function (MTF) curves.
12 pages, 3099 KiB  
Article
Unsupervised Clustering of Hyperspectral Paper Data Using t-SNE
by Binu Melit Devassy, Sony George and Peter Nussbaum
J. Imaging 2020, 6(5), 29; https://doi.org/10.3390/jimaging6050029 - 5 May 2020
Cited by 51 | Viewed by 6728
Abstract
For a suspected forgery that involves the falsification of a document or its contents, the investigator will primarily analyze the document's paper and ink in order to establish the authenticity of the subject under investigation. As a non-destructive and contactless technique, Hyperspectral Imaging (HSI) is gaining popularity in the field of forensic document analysis. HSI returns more information than conventional three-channel imaging systems due to the vast number of narrowband images recorded across the electromagnetic spectrum. As a result, HSI can provide better classification results. In this publication, we present results of an approach known as the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm, which we have applied to HSI paper data analysis. Even though t-SNE has been widely accepted as a method for dimensionality reduction and visualization of high-dimensional data, its usefulness has not yet been evaluated for the classification of paper data. In this research, we present a hyperspectral dataset of paper samples and evaluate the clustering quality of the proposed method both visually and quantitatively. The t-SNE algorithm shows exceptional discrimination power when compared to traditional PCA with k-means clustering, in both visual and quantitative evaluations.
(This article belongs to the Special Issue Multispectral Imaging)
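The comparison described above can be mimicked on synthetic spectra with scikit-learn: embed the data with PCA and with t-SNE, cluster each embedding with k-means, and compare a cluster-separation index. The sketch below uses the silhouette score and Gaussian "paper classes" as stand-ins; the paper's dataset and its clustering-quality index may differ.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Synthetic stand-in: 40 paper classes x 100 spectra, 186 spectral bands.
n_classes, n_samples, n_bands = 40, 100, 186
centres = rng.normal(size=(n_classes, n_bands))
spectra = np.repeat(centres, n_samples, axis=0)
spectra += 0.05 * rng.normal(size=spectra.shape)

# Baseline: PCA to 2D, then k-means on the embedding.
pca_2d = PCA(n_components=2).fit_transform(spectra)
km_pca = KMeans(n_clusters=n_classes, n_init=10, random_state=0).fit_predict(pca_2d)

# t-SNE to 2D, then k-means on the embedding.
tsne_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(spectra)
km_tsne = KMeans(n_clusters=n_classes, n_init=10, random_state=0).fit_predict(tsne_2d)

# Compare cluster separation in the two embeddings.
print("PCA silhouette:  ", round(silhouette_score(pca_2d, km_pca), 3))
print("t-SNE silhouette:", round(silhouette_score(tsne_2d, km_tsne), 3))
```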
Figure 1. Hyperspectral image representation.
Figure 2. Hyperspectral acquisition setup.
Figure 3. Paper checker. The left-hand side describes the types of paper, with the corresponding sample on the right-hand side. Identification numbers are marked above each sample.
Figure 4. Data processing pipeline for the proposed method.
Figure 5. Average normalized reflectance spectrum of 40 paper samples.
Figure 6. Clustering results of 40 paper samples, with a sample size of 100 spectra. Left-hand plot is obtained using Principal Component Analysis (PCA), and the right-hand plot is obtained using t-Distributed Stochastic Neighbor Embedding (t-SNE).
Figure 7. Optimal perplexity relation against sample count.
Figure 8. Impact of sample size over the clustering index for the t-SNE algorithm.
Figure 9. Sample size impact on the processing time.
16 pages, 2892 KiB  
Article
Redesigned Skip-Network for Crowd Counting with Dilated Convolution and Backward Connection
by Sorn Sooksatra, Toshiaki Kondo, Pished Bunnun and Atsuo Yoshitaka
J. Imaging 2020, 6(5), 28; https://doi.org/10.3390/jimaging6050028 - 2 May 2020
Cited by 3 | Viewed by 4076
Abstract
Crowd counting is a challenging task dealing with variation in object scale and crowd density. Existing works have emphasized skip connections that integrate shallower layers with deeper layers, where each layer extracts features at a different object scale and crowd density. However, only high-level features are emphasized, while low-level features are ignored. This paper proposes an estimation network that passes high-level features to shallow layers and emphasizes their low-level features. Since the estimation network is hierarchical, a high-level feature is in turn emphasized by an improved low-level feature. Our estimation network consists of two identical networks, one for extracting a high-level feature and one for estimating the final result. To preserve semantic information, dilated convolution is employed without resizing the feature map. Our method was tested on three datasets for counting humans and vehicles in crowd images. The counting performance is evaluated by mean absolute error and root mean squared error, indicating the accuracy and robustness of the estimation network, respectively. The experimental results show that our network outperforms other related works at high crowd densities and is effective in reducing over-counting error in the overall case.
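The dilated-convolution ingredient mentioned above can be illustrated in a few lines of PyTorch: 3 × 3 convolutions with padding equal to their dilation enlarge the receptive field while leaving the feature map at full resolution, so the predicted density map can be summed directly into a count. This is only a sketch of that ingredient, not the paper's master/slave architecture with backward connections; the channel count and dilation rates are assumptions.

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Stack of 3x3 dilated convolutions that grows the receptive field while
    keeping the feature map at full resolution (padding == dilation)."""

    def __init__(self, channels=64, dilations=(1, 2, 4)):
        super().__init__()
        layers = []
        for d in dilations:
            layers += [nn.Conv2d(channels, channels, kernel_size=3,
                                 padding=d, dilation=d),
                       nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)
        self.head = nn.Conv2d(channels, 1, kernel_size=1)   # single-channel density map

    def forward(self, x):
        return self.head(self.body(x))

# Toy check: the spatial size is preserved, so the density map sums to a count.
x = torch.randn(1, 64, 96, 128)
density = DilatedBlock()(x)
print(density.shape)           # torch.Size([1, 1, 96, 128])
print(float(density.sum()))    # count estimate (untrained, so meaningless here)
```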
Figure 1. A general flowchart of crowd counting using a deep learning-based technique.
Figure 2. Illustrations of network architectures of (a) multi-column network, (b) skip-network, and (c) multi-scale network.
Figure 3. The backbone network architecture modified by (a) adding pooling and up-sampling layers and (b) constructing only convolutional layers (Conv layers).
Figure 4. The estimation network architectures of backbone networks with backward connections, consisting of master (right) and slave networks (left) with (a) K = 1 and (b) K = 2 backward connections.
Figure 5. Examples of regions located at the boundary of images (heat maps) after zero padding.
Figure 6. Learning curve between Euclidean distance and epoch while training the network.
Figure 7. Actual and predicted counts with various crowd densities in ShanghaiTech dataset Part A.
Figure 8. Actual and predicted counts with various crowd densities in ShanghaiTech dataset Part B.
Figure 9. Actual and predicted counts with various crowd densities in the UCF_CC_50 dataset.
Figure 10. Actual and predicted counts with various crowd densities in the TRANCOS dataset.
Figure 11. Examples of crowd images with different crowd densities consisting of (a) 23, (b) 45, (c) 131, and (d) 297 people, where their input images, actual density maps, and predicted density maps by the proposed method are located in the 1st, 2nd, and 3rd rows, respectively.
Figure 12. Examples of crowd images with counting errors caused by (a) small-sized crowd image, (b) object similarity, (c) dark illumination, and (d) occlusion by mist, where their input images, actual density maps, and predicted density maps by the proposed method are located in the 1st, 2nd, and 3rd rows, respectively.
Figure 13. Example of the effect of dilated convolution on vehicle counting: (a) input images and their predicted density maps generated by the backbone network in (b) Figure 3a and (c) Figure 3b with backward connections.
Figure 14. Example of the effect of dilated convolution on people counting: (a) input images and their predicted density maps generated by the backbone network in (b) Figure 3a and (c) Figure 3b with backward connections.
24 pages, 9281 KiB  
Article
Fusing Appearance and Spatio-Temporal Models for Person Re-Identification and Tracking
by Andrew Tzer-Yeu Chen, Morteza Biglari-Abhari and Kevin I-Kai Wang
J. Imaging 2020, 6(5), 27; https://doi.org/10.3390/jimaging6050027 - 1 May 2020
Cited by 4 | Viewed by 4343
Abstract
Knowing who is where is a common task for many computer vision applications. Most of the literature focuses on one of two approaches: determining who a detected person is (appearance-based re-identification) and collating positions into a list, or determining the motion of a person (spatio-temporal tracking) and assigning identity labels based on the tracks formed. This paper presents a model fusion approach that aims to combine both sources of information in order to increase the accuracy of determining identity classes for detected people using re-ranking. First, a Sequential k-Means re-identification approach is presented, followed by a Kalman filter-based spatio-temporal tracking approach. A linear weighting approach is used to fuse the outputs of these models, with modification of the weights using a decay function and a rule-based system to reflect the strengths and weaknesses of the models under different conditions. Preliminary experimental results with two different person detection algorithms on an indoor person tracking dataset show that fusing the appearance and spatio-temporal models significantly increases the overall accuracy of the classification operation.
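A minimal sketch of linearly weighted score fusion with a decaying spatio-temporal weight is shown below; the weight alpha, the decay constant tau, and the toy scores are illustrative assumptions, and the paper's rule-based weight adjustments are not reproduced.

```python
import numpy as np

def fuse_scores(appearance, spatio_temporal, frames_since_seen, alpha=0.6, tau=30.0):
    """Linearly fuse per-class scores from an appearance model and a
    spatio-temporal model, decaying trust in the spatio-temporal prediction the
    longer a class has gone unobserved (alpha and tau are illustrative values).
    """
    decay = np.exp(-np.asarray(frames_since_seen, dtype=float) / tau)
    w_st = (1.0 - alpha) * decay               # spatio-temporal weight shrinks over time
    fused = alpha * appearance + w_st * spatio_temporal
    return int(np.argmax(fused)), fused

# Toy example with three identity classes.
appearance      = np.array([0.40, 0.35, 0.30])   # re-identification similarities
spatio_temporal = np.array([0.10, 0.80, 0.20])   # proximity to Kalman-predicted positions
frames_since    = np.array([2, 3, 200])          # frames since each class was last seen
best, fused = fuse_scores(appearance, spatio_temporal, frames_since)
print(best, np.round(fused, 3))  # class 1 wins: its track prediction is strong and recent
```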
Figure 1. Samples from the UoA-Indoor re-identification dataset extracted with DPM for two classes, with the camera view number shown in the top left.
Figure 2. The feature extraction and identity classification processes in the appearance model.
Figure 3. The position estimation and identity classification process in the spatio-temporal model.
Figure 4. An example of the positions of a detected person being mapped over time. The top edge of the map corresponds to the wall with the door shown in the two camera views.
Figure 5. Tracks (drawn by a human) for two people being estimated over time (approx. ten seconds apart). Full circles indicate the measured points from the images, and the triangles at the end of each track indicate the next Kalman Filter estimated position, which is treated as the prediction for that identity class.
Figure 6. A flowchart of the combined appearance and spatio-temporal models in parallel, leading to the model fusion step, with the class updates feeding back into the classification step for the next detection.
Figure 7. An example of a group of three people being tracked and identified correctly. In each camera view, the thin red box indicates the area that is isolated during background estimation for further processing, the green boxes represent the detected people, and the red circles represent the estimated position for each of those people. They have consistent ID numbers between the cameras (shown in red above each person), where −99 indicates low certainty (and is ignored). The maps show the positions of the individuals, with the top of the map matching the wall with curtains shown in Camera 0 and 2. The top-right map shows the last fifty detections per class as circles and the estimated future position from the Kalman Filters as triangles. The bottom-right heatmap shows the most occupied areas of the room since system initialisation, quantised to 40 cm × 40 cm patches, where lighter/brighter colours indicate higher occupancy.
Figure 8. A group of people being tracked, with an erroneous classification due to the people standing closely together and the model fusion algorithm failing to separate the individuals. This figure follows the same annotations and mapping structure as Figure 7.