Abstract
Color Space-Time Interest Points (CSTIP) are low-level features that can be extracted from videos and that efficiently characterize moving objects. CSTIP are simple and can be used for video stabilization, camera motion estimation, and object tracking. In this paper, we show that the resulting features often reflect interesting events and can serve both as a compact representation of video data and as a basis for tracking. To make CSTIP extraction more robust, we propose a pre-processing step based on Color Video Decomposition, which decomposes the input images into dynamic color texture and structure components. We then compute the CSTIP associated with the dynamic color texture components using the proposed Color Space-Time Interest Point detection algorithm. A point tracker tracks the set of detected CSTIP using the robust Zero-Mean Normalized Cross-Correlation (ZNCC) feature-tracking algorithm. Experimental results are reported on very different types of video, namely sports videos and animated movies.
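The ZNCC score named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the tracker's implementation, so everything below is an assumption made for illustration: patches are treated as flat lists of gray-level intensities, a zero-variance patch is given a score of 0, and the function names `zncc` and `best_match` are hypothetical. The sketch only shows the generic measure: both patches are centered on their means, and their inner product is divided by the product of their norms, which makes the score insensitive to affine changes in brightness.

```python
import math


def zncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equally sized patches.

    Patches are flat sequences of intensities (an assumption of this sketch).
    The score lies in [-1, 1]; 0.0 is returned when either patch has zero
    variance, which is a convention chosen here, not one from the paper.
    """
    if len(patch_a) != len(patch_b) or len(patch_a) == 0:
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    # Center both patches on their own mean intensity.
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(
        sum(x * x for x in dev_a) * sum(y * y for y in dev_b)
    )
    return numerator / denominator if denominator > 0 else 0.0


def best_match(template, candidates):
    """Return (index, score) of the candidate patch with the highest ZNCC."""
    scores = [zncc(template, candidate) for candidate in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    template = [10, 20, 30, 40]
    candidates = [
        [40, 30, 20, 10],  # reversed pattern
        [25, 45, 65, 85],  # same pattern under a gain and offset change
        [7, 7, 7, 7],      # flat patch with no texture
    ]
    print(best_match(template, candidates))
```

In a tracker of this kind, the template would be the neighborhood of an interest point in one frame and the candidates would be neighborhoods sampled inside a search window in the next frame; the window size and acceptance threshold are design choices that the abstract does not state.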
Cite this article
Bellamine, I., Silkan, H. & Tmiri, A. Track color space-time interest points in video. Multimed Tools Appl 79, 24579–24593 (2020). https://doi.org/10.1007/s11042-020-09037-8
DOI: https://doi.org/10.1007/s11042-020-09037-8