Abstract
How to model spatiotemporal similarity between gestures in 3D motion capture data streams is a question of major significance in ongoing research on human communication. Co-speech gestures are manual gestures that emerge spontaneously and unconsciously during face-to-face conversation. While qualitative perceptual analyses of these gestures are feasible on a small-to-moderate scale, they are impractical for larger scenarios because efficient query processing techniques for spatiotemporal similarity search are lacking. To support qualitative analyses of co-speech gestures, we propose and investigate a simple yet effective distance-based similarity model that leverages the spatial and temporal characteristics of co-speech gestures and enables similarity search in 3D motion capture data streams in a query-by-example manner. Experiments on real conversational 3D motion capture data demonstrate the appropriateness of the proposal in terms of accuracy and efficiency.
Acknowledgment
This work is partially funded by the Excellence Initiative of the German federal and state governments and by DFG grant SE 1039/7-1.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Beecks, C. et al. (2015). Spatiotemporal Similarity Search in 3D Motion Capture Gesture Streams. In: Claramunt, C., et al. Advances in Spatial and Temporal Databases. SSTD 2015. Lecture Notes in Computer Science, vol 9239. Springer, Cham. https://doi.org/10.1007/978-3-319-22363-6_19
DOI: https://doi.org/10.1007/978-3-319-22363-6_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22362-9
Online ISBN: 978-3-319-22363-6
eBook Packages: Computer Science, Computer Science (R0)