Classroom Lecture Recognition

Isabel Trancoso,
Ricardo Nunes &
Luís Neves

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3960)

Included in the following conference series:

International Workshop on Computational Processing of the Portuguese Language

Abstract

The main goal of this work is to provide automatic transcriptions of classroom lectures for e-learning and e-inclusion applications. The first experiments, using a recognition system trained for Broadcast News, resulted in word error rates near 60%, clearly confirming the need both for adaptation to the specific topic of the lectures and for better strategies for handling spontaneous speech. This paper describes the different domain adaptation steps that lowered the error rate to 45% with very little transcribed adaptation material. It also includes a qualitative analysis of the different types of error, focusing on those related to a very high rate of disfluencies.
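
To make the figures in the abstract concrete, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. It is not the scoring tool used by the authors; the function names `word_error_rate` and `relative_reduction` and the example sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Relative improvement when an error rate drops from before to after."""
    return (before - after) / before


if __name__ == "__main__":
    # Invented example: a filled pause is inserted and one word is misrecognized.
    wer = word_error_rate("the model is trained", "the uh model is train")
    print(f"WER: {wer:.2f}")
    # Error rates reported in the abstract: near 60% before adaptation, 45% after.
    print(f"Relative reduction: {relative_reduction(0.60, 0.45):.2f}")
```

Under this definition, the drop from roughly 60% to 45% reported in the abstract is a reduction of 15 percentage points, or about 25% in relative terms. The invented example also shows why disfluencies matter for this metric: a single inserted filled pause counts as a full word error.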

Author information

Authors and Affiliations

INESC ID / IST, R. Alves Redol, 9, 1000-029, Lisbon, Portugal
Isabel Trancoso, Ricardo Nunes & Luís Neves

Editor information

Editors and Affiliations

Pontifícia Universidade do Rio Grande do Sul, Porto Alegre, Brasil
Renata Vieira
Departamento de Informática, Universidade de Évora, Portugal
Paulo Quaresma
NILC-ICMC, University of São Paulo, CP 668P, 13560-970, São Carlos, SP, Brazil
Maria das Graças Volpe Nunes
L2F/INESC-ID Lisboa, Email: qa-clef@l2f.inesc-id.pt, Rua Alves Redol, 9, 1000-029, Lisboa, Portugal
Nuno J. Mamede
Instituto Militar de Engenharia, Praça General Tibúrcio, 80, Rio de Janeiro, Brazil
Cláudia Oliveira
Pontifícia Universidade Católica do Rio de Janeiro, Rua Marquês de São Vicente, 225, Rio de Janeiro, Brazil
Maria Carmelita Dias

About this paper

Cite this paper

Trancoso, I., Nunes, R., Neves, L. (2006). Classroom Lecture Recognition. In: Vieira, R., Quaresma, P., Nunes, M.d.G.V., Mamede, N.J., Oliveira, C., Dias, M.C. (eds) Computational Processing of the Portuguese Language. PROPOR 2006. Lecture Notes in Computer Science, vol 3960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751984_20

DOI: https://doi.org/10.1007/11751984_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34045-4
Online ISBN: 978-3-540-34046-1
eBook Packages: Computer Science, Computer Science (R0)