Abstract
In the context of human action recognition from video sequences in a medical environment, a temporal belief-based Hidden Markov Model (HMM) is presented. It copes with the temporality of human actions and manages both data uncertainty and knowledge incompleteness. The activity recognition system is based on an HMM with explicit state duration. The global interpretation process uses the framework of the Transferable Belief Model (TBM), which enables us to model and manage the uncertainty in the video interpretation process. An application is proposed for human action analysis in medical video sequences provided by a patient monitoring system in a hospital cardiology unit. The proposed recognition method has been assessed on a database of 3000 video images of medical scenes, and its performance has been compared with that of probabilistic Hidden Markov Models.
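The TBM framework mentioned in the abstract rests on combining belief mass functions with the unnormalized conjunctive rule, which, unlike Dempster's rule, keeps the mass assigned to the empty set as an explicit measure of conflict between sources. The sketch below is not the authors' implementation; it is a minimal, generic illustration of that combination rule, with hypothetical action labels ("walk", "fall") and made-up mass values standing in for the evidence sources the paper would derive from video cues and state durations.

```python
def conjunctive_combination(m1, m2):
    """Unnormalized (TBM) conjunctive rule.

    m1, m2: mass functions over a frame of discernment, represented as
    dicts mapping frozensets (focal elements) to masses summing to 1.
    Mass may land on the empty set; the TBM keeps this conflict mass
    instead of renormalizing it away as Dempster's rule would.
    """
    m = {}
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b  # intersection of the two focal elements
            m[inter] = m.get(inter, 0.0) + wa * wb
    return m

# Toy frame with two hypothetical action states.
WALK, FALL = frozenset({"walk"}), frozenset({"fall"})
BOTH = WALK | FALL  # total ignorance: mass on the whole frame

m_appearance = {WALK: 0.6, BOTH: 0.4}  # evidence from one visual cue
m_duration = {FALL: 0.5, BOTH: 0.5}    # evidence from state duration

m = conjunctive_combination(m_appearance, m_duration)
# m[frozenset()] is the conflict mass between the two sources (here 0.3)
```

Keeping the empty-set mass is what lets a TBM-based recognizer flag situations where the observed duration of an action contradicts the appearance-based evidence, rather than silently redistributing that conflict.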
Additional information
The article is published in the original.
This paper is based on the report presented at the 11th International Conference “Pattern Recognition and Image Analysis: New Information Technologies,” Samara, Russia, September 23–28, 2013.
Arnaud S. R. M. Ahouandjinou received his BSc in Electrical Engineering and Industrial Computing and his MSc in Network Engineering and Computer Science from the Ecole Superieure de Genie Informatique (ESGI), Paris, France. He received a Master’s research degree in Mathematics Applied to Engineering Sciences, specializing in Ingenierie Numerique, Signal, Image and Informatique Industrielle, in 2010, and since then has been a PhD student in computer science, signal, and image processing at the University of Littoral Cote d’Opale, Calais, France. His current research concerns the automatic visual surveillance of wide-area scenes in medical settings using computational vision. His research interests focus on the design of multicamera systems for real-time human action recognition, with management of data uncertainty in the vision system by means of probabilistic graphical models and belief propagation (the TBM framework). He has recently become interested in machine learning approaches to human activity recognition through multisensor systems, in which the collaboration and cooperation between sensors improve the high-level video interpretation process.
Cina Motamed is an associate professor in Computer Science at the University of Littoral Cote d’Opale, Calais, France. He received his BSc in Mathematics and his MSc in Electrical Engineering and Computer Science from the University of Caen, France, and his PhD in Computer Science from the University of Compiegne, France, in 1987, 1989, and 1992, respectively. His current research concerns the automatic visual surveillance of wide-area scenes using computational vision. His research interests focus on the design of multicamera systems for real-time multiobject tracking and human action recognition. He has recently focused on uncertainty management in vision systems using graphical models and belief propagation. He is also interested in unsupervised learning approaches to human activity recognition.
Eugène C. Ezin received his PhD degree with the highest level of distinction in 2001 for research on neural and fuzzy systems for speech applications, carried out at the International Institute for Advanced Scientific Studies in Italy. Since July 2012, he has been an associate professor in computer science. He has supervised many master’s theses in this field. He is a reviewer for the Mexican International Conference on Artificial Intelligence and for several journals. His research interests include machine learning, neural networks and fuzzy systems, signal and image processing, high-performance computing, cryptography, modeling and simulation, information systems, and network security. He is also interested in human activity recognition through multisensor systems.
Cite this article
Ahouandjinou, A.S.R.M., Motamed, C. & Ezin, E.C. A temporal belief-based hidden markov model for human action recognition in medical videos. Pattern Recognit. Image Anal. 25, 389–401 (2015). https://doi.org/10.1134/S1054661815030025