Abstract
Locating vessels during surgery is critical for avoiding inadvertent damage, yet vasculature can be difficult to identify. Video motion magnification can potentially highlight vessels by exaggerating subtle motion embedded within the video so that it becomes perceivable to the surgeon. In this paper, we explore a physiological model of artery distension and extend motion magnification to incorporate higher orders of motion, leveraging the rate of change of acceleration over time (jerk) in pulsatile motion to highlight the vascular pulse wave. Our method is compared to first and second order motion based Eulerian video magnification algorithms. Using data from a surgical video retrieved during a robotic prostatectomy, we show that our method can accentuate cardio-physiological features and produce a more succinct and clearer motion magnified video, with greater similarity to the source video in areas without motion at large magnifications. We validate the approach with a Structural Similarity (SSIM) and Peak Signal to Noise Ratio (PSNR) assessment of three videos at increasing working distances, using three different levels of optical magnification. Spatio-temporal cross sections are presented to show the effectiveness of our proposal, and video samples are provided to demonstrate our results qualitatively.
1 Introduction
One of the most common surgical complications is inadvertent damage to blood vessels. Avoiding vascular structures is particularly challenging in minimally invasive surgery (MIS) and robotic MIS (RMIS), where the tactile senses are inhibited and cannot be used to detect pulsatile motion. Vessels can be detected using interventional imaging modalities such as fluorescence or ultrasound (US), but these do not always produce a sufficient signal, or are difficult to use in practice [1]. Using video information directly is appealing because it is inherently available, but processing is required to reveal vessel information that is hidden within the video and not apparent to the surgeon, as can be seen in the right image of Fig. 1.
The cardiovascular system creates a pressure wave that propagates through the entire body and causes an equivalent distension-displacement profile in the arteries and veins [3]. This periodic motion has intricate characteristics, shown in Fig. 1 (left), that can be highlighted by differentiating the distension-displacement signal. The second order derivative outlines where the systolic uptake is located, whilst the third order derivative highlights the end diastolic phase and the dicrotic notch. This information can be present as spatio-temporal variation between image frames and amplified using Eulerian video magnification (EVM). EVM could be applied to endoscopic video for vessel localisation by adapting an EVM algorithm and showing the output video directly to the surgeon [4]. Similarly, EVM can aid vessel segmentation for registration and overlay of pre-operative data [5], as existing linear forms of the raw magnified video can be too abstract and noisy to use directly within a dynamic scene. Magnifying the underlying video motion can also exacerbate unwanted artifacts and unsought motions; in surgical video, these stem not from the blood vessels but from respiration, endoscope motion or other physiological movement within the scene.
In this paper, we propose to utilise features that are apparent in the cardiac pulse wave, particularly the non-linear motion components that are emphasised by the third order of displacement, known as jerk (green plot, Fig. 1, left). We devise a custom temporal filter and use an existing technique for spatial decomposition based on complex steerable pyramids [6]. The result is a more coherent magnified video compared to existing lower order of motion approaches [7, 8]: high magnitudes of jerk are prominently exclusive to the pulse wave in the surgical scene, so our method avoids amplification of residual motions due to respiration or other periodic scene activities. Quantitative evaluation is difficult for such approaches, but we report a comparison to previous work using Structural Similarity (SSIM) [9] and Peak Signal to Noise Ratio (PSNR) on three robotic assisted surgical videos at separate optical zooms. We provide a qualitative example of how our method achieves isolation of two cardio-physiological features over existing methods. A supplementary video of the magnifications is provided that further illustrates the results.
2 Methods
Building on previous work in video motion magnification [7, 8, 10] we set out to highlight the third order motion characteristics created by the cardiac cycle. In an Eulerian frame of reference, the input image signal function is taken as \(I(\mathbf {x},t)\) at position \(\mathbf {x}\) (\(\mathbf {x} = (x,y)\)) and at time t [10]. With the linear magnification methods, \(\delta (t)\) is taken as a displacement function with respect to time, giving the expression \(I(\mathbf {x},t) = f(\mathbf {x} + \delta (t))\), which is equivalent to the first-order term in the Taylor expansion:

\(I(\mathbf {x},t) \approx f(\mathbf {x}) + \delta (t)\frac{\partial f(\mathbf {x})}{\partial \mathbf {x}}\)
This Taylor series expansion can be continued into higher orders of motion, as shown in [8]. Taking it to the third order, where \(\hat{I}(\mathbf {x},t)\) is the magnified pixel at point \(\mathbf {x}\) and time t in the video:

\(\hat{I}(\mathbf {x},t) \approx f(\mathbf {x}) + (1+\beta )\delta (t)\frac{\partial f(\mathbf {x})}{\partial \mathbf {x}} + \frac{(1+\beta )^{2}\delta (t)^{2}}{2!}\frac{\partial ^{2} f(\mathbf {x})}{\partial \mathbf {x}^{2}} + \frac{(1+\beta )^{3}\delta (t)^{3}}{3!}\frac{\partial ^{3} f(\mathbf {x})}{\partial \mathbf {x}^{3}}\)
In a similar vein to [8], we equate each component of the expansion to an order of motion and isolate the third order by subtracting the lower orders, assuming \((1+\beta )^{3} = \alpha \), \(\alpha > 0\). This produces an approximation for the input signal and a term that can be attenuated in order to present an augmented reality (AR) view of the original video.
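As a numerical sanity check on the truncated expansion (not part of the original derivation; the test function \(\sin \) and the step sizes are illustrative assumptions), the third-order Taylor approximation of a displaced signal can be verified directly:

```python
import numpy as np

# Illustrative check: for f = sin, compare f(x + (1+beta)*delta) against its
# third-order Taylor expansion about x; the residual should be O(d^4).
beta, delta = 0.5, 0.05          # assumed magnification and displacement values
d = (1 + beta) * delta
x = np.linspace(0.0, 2.0 * np.pi, 200)

exact = np.sin(x + d)
taylor = (np.sin(x)                  # zeroth order
          + d * np.cos(x)            # first order (velocity term)
          - d**2 / 2.0 * np.sin(x)   # second order (acceleration term)
          - d**3 / 6.0 * np.cos(x))  # third order (jerk term)

err = np.max(np.abs(exact - taylor))  # bounded by d**4 / 24 for f = sin
```

For these values the residual is below \(10^{-5}\), confirming that the third order term captures essentially all of the remaining displacement information at small \(\delta (t)\).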
2.1 Temporal Filtering
As jerk is the third temporal derivative of the signal \(\hat{I}(\mathbf {x},t)\), a filter has to be derived to reflect this. To achieve acceleration magnification, the Difference of Gaussian (DoG) filter was used [8]. This allowed a temporal bandpass to be assigned by subtracting two Gaussian filters, using \(\sigma = \frac{r}{4\omega \sqrt{2}}\) [11] to calculate both standard deviations, where r is the frame rate of the video and \(\omega \) is the frequency under investigation. Taking the derivative of the second order DoG, we create an approximation of the third order, which follows the Hermitian polynomials [12]. Due to the linearity of the operators, the relationship between the jerk in the signal and the third order DoG can be expressed as:

\(\frac{\partial ^{3}}{\partial t^{3}}\left( G_{\sigma }(t) * I(\mathbf {x},t)\right) = \frac{\partial ^{3} G_{\sigma }(t)}{\partial t^{3}} * I(\mathbf {x},t)\)
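A minimal sketch of such a temporal filter is shown below. The function names are ours, and for brevity we use the closed-form third derivative of a single Gaussian, \(g'''(t) = (3t/\sigma ^{4} - t^{3}/\sigma ^{6})\,g(t)\), as a stand-in for differentiating the DoG:

```python
import numpy as np

def third_order_gaussian_kernel(rate, freq):
    """Third temporal derivative of a Gaussian with sigma = rate / (4*freq*sqrt(2)).

    rate: video frame rate in Hz; freq: pulse frequency under investigation.
    """
    sigma = rate / (4.0 * freq * np.sqrt(2.0))
    length = int(6.0 * sigma) | 1            # odd length spanning roughly +/-3 sigma
    t = np.arange(length) - length // 2
    g = np.exp(-t**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
    # Closed form for g'''(t): an odd, zero-sum kernel, so constant signals are rejected
    return (3.0 * t / sigma**4 - t**3 / sigma**6) * g

def temporal_jerk_filter(series, rate, freq):
    """Apply the jerk bandpass to a per-pixel intensity (or phase) time series."""
    return np.convolve(series, third_order_gaussian_kernel(rate, freq), mode="same")
```

For a 30 Hz video and a 1 Hz pulse this yields \(\sigma \approx 5.3\) frames; a static pixel produces no response, while a pulsatile one passes through the filter.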
2.2 Phase-Based Magnification
In the classical EVM approach, the intensity change over time is used in a pixel-wise manner [10], where a second order IIR filter detects the intensity change caused by the human pulse. An extension of this uses the difference in phase w.r.t. spatial frequency [7] for linear motion, as subtle differences in phase can be detected between frames where minute motion is present. Recently, phase-based acceleration magnification has been proposed [8]. It is this methodology we utilise and amend for jerk magnification. Describing motion as phase shift, a decomposition of the signal f(x) with displacement \(\delta (t)\) at time t can be shown as the sum over all frequencies (\(\omega \)):

\(f(\mathbf {x} + \delta (t)) = \sum _{\omega = -\infty }^{\infty } A_{\omega } e^{i\omega (\mathbf {x} + \delta (t))}\)
where the global phase for frequency \(\omega \) for displacement \(\delta (t)\) is \(\phi _{\omega } = \omega (\mathbf {x} + \delta (t))\).
It has been shown that spatially localised phase information of a series of images over time is related to local motion [13], and this has been leveraged for linear magnification [7]. This is performed by using complex steerable pyramids [14] to separate the image signal into multi-frequency bands and orientations. These pyramids contain a set of filters \(\varPsi _{\omega _{s},\theta }\) at multiple scales \(\omega _{s}\) and orientations \(\theta \). The local phase information of a single 2D image \(I(\mathbf {x})\) is

\((I * \varPsi _{\omega _{s},\theta })(\mathbf {x}) = A_{\omega _{s},\theta }(\mathbf {x})e^{i\phi _{\omega _{s},\theta }(\mathbf {x})}\)
where \(A_{\omega ,\theta }(\mathbf {x})\) is the amplitude at frequency \(\omega \) and orientation \(\theta \), and where \(\phi _{\omega _{s},\theta }\) is the corresponding phase at scale (pyramid level) \(\omega _{s}\). The phase information is extracted (\(\phi _{\omega _{s},\theta }(\mathbf {x},t)\)) at a given frequency \(\omega \), orientation \(\theta \) and frame t. The jerk constituent part of the motion is filtered out with our third order Gaussian filter and can then be magnified and reinstated into the video (\(\hat{\phi }_{\omega ,\theta }(\mathbf {x},t)\)) to accentuate the desired state changes in the cardiac cycle, such as the dicrotic notch and end diastolic point, shown in Fig. 1 (left).
Phase unwrapping is applied as with the acceleration methodology in order to create the full composite signal [8, 15].
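A one-dimensional analogue of this pipeline can be sketched with a single complex Gabor filter standing in for one scale/orientation band of the steerable pyramid. The filter frequency, \(\sigma \), and function names below are illustrative assumptions, and phase unwrapping is reduced to wrapping the frame-to-frame difference into \([-\pi , \pi )\):

```python
import numpy as np

def local_phase_1d(row, freq=0.1, sigma=8.0):
    """Local amplitude and phase of a 1-D intensity profile via a complex Gabor
    filter: a single-band stand-in for one level of a complex steerable pyramid."""
    n = int(6.0 * sigma) | 1
    t = np.arange(n) - n // 2
    gabor = np.exp(-t**2 / (2.0 * sigma**2)) * np.exp(1j * 2.0 * np.pi * freq * t)
    resp = np.convolve(row.astype(float), gabor, mode="same")
    return np.abs(resp), np.angle(resp)

def magnify_phase(prev_row, next_row, alpha=5.0, freq=0.1):
    """Amplify the inter-frame phase difference of the band and resynthesise it.

    In the full method the phase difference would first pass the jerk temporal
    filter; here it is amplified directly for brevity.
    """
    _, phase_prev = local_phase_1d(prev_row, freq)
    amp, phase_next = local_phase_1d(next_row, freq)
    dphi = np.angle(np.exp(1j * (phase_next - phase_prev)))  # wrap to [-pi, pi)
    return np.real(amp * np.exp(1j * (phase_next + alpha * dphi)))
```

With identical frames the phase difference is zero, so any \(\alpha \) returns the same band reconstruction; a sub-pixel shift of the profile changes the local phase and is exaggerated in the output.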
3 Results
To demonstrate the proposed approach, endoscopic video was captured from a robotic prostatectomy using the da Vinci surgical system (Intuitive Surgical Inc., CA), where a partially occluded obturator artery could be seen. Despite being identified by the surgical team, the vessel produced little perceivable motion in the video. The footage was captured at 1080p resolution at 30 Hz. For processing ease, the video was cropped to a third of the original width, which contained the motion of interest yet still retained the spatial resolution of the endoscope. For comparison, the video was motion magnified offline using the phase-based complex steerable pyramid technique described in [7] for first order motion and the video acceleration magnification described in [8]. Our method extends the video acceleration magnification method. All processes use a four level, half octave pyramid. For the temporal processing, a bandpass was set at 1 ± 0.1 Hz to account for a pulse between 54 and 66 bpm. From the patient's ECG reading, their pulse was stable at 60 bpm during video acquisition. This was done at three magnification factors (x2, x5, x10). Spatio-temporal slices were then taken of a site along the obturator artery for visual comparison of each temporal filter type. For a quantitative comparison, the Peak Signal to Noise Ratio (PSNR) and Structural Similarity (SSIM) index [9] were calculated on a hundred frame sample, comparing the magnified videos to their original equivalent frames.
Figure 2 shows an overview of our video magnification investigation. The pulse from the external iliac artery can be seen in the right corner and the obturator artery on the front face. Large distortion and blur can be observed in the linear magnification example, particularly in the front right corner, whereas this is not present in the non-linear example: only change in velocity is exaggerated, whereas any velocity is exaggerated in the linear case. Figure 3 displays a magnification comparison of spatio-temporal slices taken from the three aforementioned magnification methods. E and G in this figure demonstrate the improvement in pulse wave motion granularity that using jerk in the temporal processing brings, compared to the lower orders. The magenta in E shows a periodic saw wave, with no discerning features relating to the underlying pulse wave signal. The magenta in G, which depicts the use of acceleration, shows a more bipolar triangle wave. The green in both E and G shows a consistent periodic twin peak, with the second peak more diminished, which suggests that our hypothesis that a jerk temporal filter can detect the dicrotic notch is correct and comparable to our model analysis shown in Fig. 4. Table 1 shows a comparison of a surgical scene at three separate working distances, arranged to diminish the spatial resolution with the same objective in the endoscope. All three aforementioned magnification algorithms were applied to each at three different motion magnification (\(\alpha \)) factors (x2, x5, x10).
As comparative metrics, SSIM and PSNR are used, with PSNR being based on a mathematical model and SSIM taking into account characteristics of the human visual system [9]. SSIM and PSNR allow for objective comparisons of a processed image to a reference source; whilst a magnified video is expected to be altered, the residual noise generated by the process can be assessed by these metrics. PSNR is measured in decibels (dB), where the higher the number, the better the quality. SSIM produces a value between 0 and 1, with 1 being the best possible correspondence to the reference frame. For all surgical scenes, our proposed temporal process of using jerk outperforms the other lower order motion magnification methods across all magnifications for SSIM, and equals or outperforms the acceleration technique, particularly at \(\alpha = 10\).
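The two metrics can be computed as below; `psnr` follows the standard definition, while `ssim_global` is a simplified single-window variant (the full SSIM index of [9] averages this statistic over local windows, so this function is only an illustrative approximation):

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak Signal to Noise Ratio in decibels; higher means closer to the reference."""
    mse = np.mean((ref.astype(float) - img.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)

def ssim_global(ref, img, peak=255.0):
    """Single-window SSIM; equals 1 for identical images, at most 1 otherwise.

    Uses the standard stabilising constants c1 = (0.01*peak)^2, c2 = (0.03*peak)^2.
    """
    x, y = ref.astype(float), img.astype(float)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return (((2.0 * mx * my + c1) * (2.0 * cov + c2))
            / ((mx**2 + my**2 + c1) * (x.var() + y.var() + c2)))
```

Comparing each magnified frame against its unmagnified counterpart with these functions penalises residual noise introduced by the magnification while tolerating the intended motion changes.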
4 Conclusion
We have demonstrated that the use of higher order motion magnification can bring out subtle motion features that are exclusive to the pulse wave in arteries. This limits the amplification of residual signals present in surgical scenes. Our method particularly relies on the definitive cardiovascular signature characterized by the twin peaks of the end diastolic point and the dicrotic notch. Additionally, we have shown objective evidence that less noise is generated when our approach is used within laparoscopic surgery compared to other magnification techniques; however, a wider sample and case specific examples would be needed to verify this claim. Further work will look at a real-time implementation of this approach as well as methods of both ground truth validation and subjective comparison within a clinical setting. Practical clinical use cases are also needed to verify the validity of using such techniques in practice and to identify the bottlenecks to translation.
References
Sridhar, A.N., et al.: Image-guided robotic interventions for prostate cancer. Nat. Rev. Urol. 10(8), 452 (2013)
Willemet, M., Chowienczyk, P., Alastruey, J.: A database of virtual healthy subjects to assess the accuracy of foot-to-foot pulse wave velocities for estimation of aortic stiffness. Am. J. Physiol.-Heart Circ. Physiol. 309(4), H663–H675 (2015)
Alastruey, J., Parker, K.H., Sherwin, S.J., et al.: Arterial pulse wave haemodynamics. In: 11th International Conference on Pressure Surges, pp. 401–442. Virtual PiE Led t/a BHR Group. Lisbon (2012)
McLeod, A.J., Baxter, J.S.H., de Ribaupierre, S., Peters, T.M.: Motion magnification for endoscopic surgery, vol. 9036, p. 90360C (2014)
Amir-Khalili, A., Hamarneh, G., Peyrat, J.-M., Abinahed, J., Al-Alao, O., Al-Ansari, A., Abugharbieh, R.: Automatic segmentation of occluded vasculature via pulsatile motion analysis in endoscopic robot-assisted partial nephrectomy video. Med. Image Anal. 25(1), 103–110 (2015)
Simoncelli, E.P., Adelson, E.H.: Subband transforms. In: Woods, J.W. (ed.) Subband Image Coding. SECS, vol. 115, pp. 143–192. Springer, Boston (1991). https://doi.org/10.1007/978-1-4757-2119-5_4
Wadhwa, N., Rubinstein, M., Durand, F., Freeman, W.T.: Phase-based video motion processing. ACM Trans. Graph. (TOG) 32(4), 80 (2013)
Zhang, Y., Pintea, S.L., van Gemert, J.C.: Video acceleration magnification. arXiv preprint arXiv:1704.04186 (2017)
Wang, Z., Lu, L., Bovik, A.C.: Video quality assessment based on structural distortion measurement. Sig. Process. Image Commun. 19(2), 121–132 (2004)
Wu, H.-Y., Rubinstein, M., Shih, E., Guttag, J., Durand, F., Freeman, W.: Eulerian video magnification for revealing subtle changes in the world. ACM Trans. Graph. 31(4), 65 (2012)
Mikolajczyk, K., Schmid, C.: Indexing based on scale invariant interest points. In: 2001 Proceedings of the Eighth IEEE International Conference on Computer Vision, ICCV 2001, vol. 1, pp. 525–531. IEEE (2001)
Haar Romeny, B.M.: Front-End Vision and Multi-scale Image Analysis: Multi-scale Computer Vision Theory and Applications, Written in Mathematica, vol. 27. Springer Science & Business Media, Heidelberg (2003). https://doi.org/10.1007/978-1-4020-8840-7
Fleet, D.J., Jepson, A.D.: Computation of component image velocity from local phase information. Int. J. Comput. Vis. 5(1), 77–104 (1990)
Portilla, J., Simoncelli, E.P.: A parametric texture model based on joint statistics of complex wavelet coefficients. Int. J. Comput. Vis. 40(1), 49–70 (2000)
Kitahara, D., Yamada, I.: Algebraic phase unwrapping along the real axis: extensions and stabilizations. Multidimens. Syst. Sig. Process. 26(1), 3–45 (2015)
Acknowledgements
The work was supported by funding from the EPSRC (EP/N013220/1, EP/N027078/1, NS/A000027/1) and Wellcome (NS/A000050/1).
© 2018 Springer Nature Switzerland AG
Janatka, M., Sridhar, A., Kelly, J., Stoyanov, D. (2018). Higher Order of Motion Magnification for Vessel Localisation in Surgical Video. In: Frangi, A., Schnabel, J., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2018. MICCAI 2018. Lecture Notes in Computer Science(), vol 11073. Springer, Cham. https://doi.org/10.1007/978-3-030-00937-3_36