DOI: 10.1145/1943628.1943629
research-article

Large vocabulary continuous speech recognition for Urdu

Published: 21 December 2010

Abstract

This paper presents the development of acoustic and language models for robust Urdu speech recognition using the CMU Sphinx Open Source Toolkit for speech recognition. Three models have been developed incrementally, adding speech data from up to two speakers per pass: one model using data from 40 female speakers only, one using data from 41 male speakers only, and one using data from both male and female speakers (81 speakers). The paper reports the current recognition results and discusses approaches for improving these recognition rates.

    Published In

    FIT '10: Proceedings of the 8th International Conference on Frontiers of Information Technology
    December 2010
    281 pages
    ISBN: 9781450303422
    DOI: 10.1145/1943628

    Sponsors

    • HEC: Higher Education Commission, Pakistan
    • COMSATS Institute of Information Technology

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Sphinx
    2. Urdu
    3. speech recognition
    4. spontaneous speech

    Qualifiers

    • Research-article

    Conference

    FIT '10
    Sponsor:
    • HEC

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 7
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 14 Jan 2025

    Cited By

    • (2024) Code-Mixed Street Address Recognition and Accent Adaptation for Voice-Activated Navigation Services. IEEE Access 12 (168393-168411). DOI: 10.1109/ACCESS.2024.3496617. Online publication date: 2024
    • (2021) Early Results from Automating Voice-based Question-Answering Services Among Low-income Populations in India. Proceedings of the 4th ACM SIGCAS Conference on Computing and Sustainable Societies (79-87). DOI: 10.1145/3460112.3471946. Online publication date: 28-Jun-2021
    • (2020) AutoSSR: an efficient approach for automatic spontaneous speech recognition model for the Punjabi Language. Soft Computing. DOI: 10.1007/s00500-020-05248-1. Online publication date: 10-Aug-2020
    • (2019) Long short-term memory recurrent neural network architectures for Urdu acoustic modeling. International Journal of Speech Technology 22:1 (21-30). DOI: 10.1007/s10772-018-09573-7. Online publication date: 1-Mar-2019
    • (2018) Sinhala Speech Recognition for Interactive Voice Response Systems Accessed Through Mobile Phones. 2018 Moratuwa Engineering Research Conference (MERCon) (241-246). DOI: 10.1109/MERCon.2018.8421888. Online publication date: May-2018
    • (2017) Continuous speech recognition system for chhattisgarhi. 2017 International Conference on Communication and Signal Processing (ICCSP) (0365-0369). DOI: 10.1109/ICCSP.2017.8286379. Online publication date: Apr-2017
    • (2016) Urdu speech recognition system for district names of Pakistan: Development, challenges and solutions. 2016 Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA) (28-32). DOI: 10.1109/ICSDA.2016.7918979. Online publication date: Oct-2016
    • (2015) District names speech corpus for Pakistani Languages. 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE) (207-211). DOI: 10.1109/ICSDA.2015.7357893. Online publication date: Oct-2015
    • (2014) DWT features performance analysis for automatic speech recognition of Urdu. SpringerPlus 3:1. DOI: 10.1186/2193-1801-3-204. Online publication date: 27-Apr-2014
    • (2014) Linear Discriminant Analysis Based Approach for Automatic Speech Recognition of Urdu Isolated Words. Communication Technologies, Information Security and Sustainable Development (24-34). DOI: 10.1007/978-3-319-10987-9_3. Online publication date: 11-Sep-2014
