Classroom Lecture Recognition

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3960)

Abstract

The main goal of this work is to provide automatic transcriptions of classroom lectures for e-learning and e-inclusion applications. The first experiments, using a recognition system trained for Broadcast News, resulted in word error rates near 60%, clearly confirming the need both for adaptation to the specific topic of the lectures and for better strategies for handling spontaneous speech. This paper describes the domain adaptation steps that lowered the error rate to 45% with very little transcribed adaptation material. It also includes a qualitative analysis of the different types of error, focusing on those related to a very high rate of disfluencies.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Trancoso, I., Nunes, R., Neves, L. (2006). Classroom Lecture Recognition. In: Vieira, R., Quaresma, P., Nunes, M.d.G.V., Mamede, N.J., Oliveira, C., Dias, M.C. (eds) Computational Processing of the Portuguese Language. PROPOR 2006. Lecture Notes in Computer Science, vol 3960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751984_20

  • DOI: https://doi.org/10.1007/11751984_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34045-4

  • Online ISBN: 978-3-540-34046-1

  • eBook Packages: Computer Science, Computer Science (R0)
