Application of Audio and Video Processing Methods for Language Research and Documentation: The AVATecH Project

Przemyslaw Lenkiewicz⁶,
Sebastian Drude⁶,
Anna Lenkiewicz⁶,
Binyam Gebrekidan Gebre⁶,
Stefano Masneri⁸,
Oliver Schreer⁸,
Jochen Schwenninger⁷ &
Rolf Bardeli⁷

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8387)

Included in the following conference series:

Language and Technology Conference

Abstract

It is well known that all modern languages evolve and change. Recently, however, this change has reached dynamics never seen before, resulting in the loss of the vast amount of information encoded in every language. To preserve this rich heritage and to carry out linguistic research, properly annotated recordings of the world's languages are necessary. Creating these annotations is very laborious, taking up to 100 times the length of the annotated media, so innovative video processing algorithms are needed to improve the efficiency and quality of the annotation process. This is the scope of the AVATecH project presented in this article.

AVATecH is a joint project of the Max Planck and Fraunhofer Institutes, started in 2009 and funded by MPG and FhG. Some of the research leading to these results has received funding from the European Commission's 7th Framework Programme under grant agreement no. 238405 (CLARA).


Author information

Authors and Affiliations

Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD, Nijmegen, The Netherlands
Przemyslaw Lenkiewicz, Sebastian Drude, Anna Lenkiewicz & Binyam Gebrekidan Gebre
Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS, Schloss Birlinghoven, 53757, Sankt Augustin, Germany
Jochen Schwenninger & Rolf Bardeli
Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Einsteinufer 37, 10587, Berlin, Germany
Stefano Masneri & Oliver Schreer

Corresponding author

Correspondence to Przemyslaw Lenkiewicz.

Editor information

Editors and Affiliations

Adam Mickiewicz University, Poznań, Poland
Zygmunt Vetulani
IMMI-CNRS, Orsay, France
Joseph Mariani

About this paper

Cite this paper

Lenkiewicz, P. et al. (2014). Application of Audio and Video Processing Methods for Language Research and Documentation: The AVATecH Project. In: Vetulani, Z., Mariani, J. (eds) Human Language Technology Challenges for Computer Science and Linguistics. LTC 2011. Lecture Notes in Computer Science, vol 8387. Springer, Cham. https://doi.org/10.1007/978-3-319-08958-4_24

DOI: https://doi.org/10.1007/978-3-319-08958-4_24
Published: 26 July 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08957-7
Online ISBN: 978-3-319-08958-4
eBook Packages: Computer Science, Computer Science (R0)
