research-article

On the importance of phase in human speech recognition

Published: 01 September 2006

Abstract

In this paper, we analyze the effects of uncertainty in the phase of speech signals on the word recognition error rate of human listeners. The motivating goal is to obtain a quantitative measure of the importance of phase in automatic speech recognition by studying the effects of phase uncertainty on human perception. Listening tests were conducted with 18 listeners under different phase uncertainty and signal-to-noise ratio (SNR) conditions. The results indicate that a small amount of phase error or uncertainty does not affect the recognition rate, but a large amount of phase uncertainty significantly affects it. The importance of phase also appears to depend on SNR: at lower SNRs the effects of phase uncertainty are more pronounced than at higher SNRs. For example, at an SNR of -10 dB, random phase at all frequencies results in a word error rate (WER) of 63%, compared to 24% when the phase is unaltered. At 0 dB, random phase results in a 25% WER, compared to 11% for the unaltered phase case. Listening tests were also conducted with phase reconstructed using a least square error estimation approach. The results indicate that the recognition rate for the reconstructed phase case is very close to that of the perfect phase case (a WER difference of 4% on average).
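
The abstract does not say how phase uncertainty was introduced into the test signals. As a rough illustration only, the sketch below perturbs the phase of a single frame while keeping its magnitude spectrum unchanged. It assumes uniform phase noise per FFT bin; the function name `perturb_phase` and the parameter `alpha` are hypothetical and are not taken from the paper.

```python
import numpy as np

def perturb_phase(frame, alpha, rng):
    """Add uniform phase noise in [-alpha*pi, alpha*pi] to each frequency bin.

    alpha = 0 leaves the frame unchanged; alpha = 1 gives fully random phase.
    This is an assumed noise model, not the procedure used in the paper.
    """
    n = len(frame)
    spectrum = np.fft.rfft(frame)
    noise = rng.uniform(-alpha * np.pi, alpha * np.pi, size=spectrum.shape)
    noise[0] = 0.0        # the DC bin must stay real for a real-valued signal
    if n % 2 == 0:
        noise[-1] = 0.0   # so must the Nyquist bin when n is even
    # Keep the magnitude, change only the phase, then return to the time domain.
    perturbed = np.abs(spectrum) * np.exp(1j * (np.angle(spectrum) + noise))
    return np.fft.irfft(perturbed, n=n)

# Example: a two-tone test frame with fully random phase.
rng = np.random.default_rng(0)
t = np.arange(256)
frame = np.sin(2 * np.pi * 5 * t / 256) + 0.5 * np.sin(2 * np.pi * 20 * t / 256)
randomized = perturb_phase(frame, 1.0, rng)
```

For real speech this would typically be applied frame by frame in a short-time Fourier transform with overlap-add resynthesis. Intermediate values of `alpha` correspond loosely to the "small" and "large" phase uncertainty conditions the abstract describes.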


    Published In

    IEEE Transactions on Audio, Speech, and Language Processing, Volume 14, Issue 5
    September 2006
    392 pages

    Publisher

    IEEE Press

    Author Tags

    1. phase analysis
    2. phase effect
    3. phase reconstruction
    4. speech recognition

    Cited By

    • (2024) Flask-based ASR for Automated Disorder Speech Recognition. Procedia Computer Science 233:C (623-637). doi:10.1016/j.procs.2024.03.252. Online publication date: 1-Jan-2024
    • (2022) Novel Complex AUTOMAP for Accelerated MRI. Proceedings of the Thirteenth Indian Conference on Computer Vision, Graphics and Image Processing (1-9). doi:10.1145/3571600.3571636. Online publication date: 8-Dec-2022
    • (2022) Complex-valued Neural Network-based Quantum Language Models. ACM Transactions on Information Systems 40:4 (1-31). doi:10.1145/3505138. Online publication date: 9-Mar-2022
    • (2022) Improving the Performance of ASR System by Building Acoustic Models using Spectro-Temporal and Phase-Based Features. Circuits, Systems, and Signal Processing 41:3 (1609-1632). doi:10.1007/s00034-021-01848-w. Online publication date: 1-Mar-2022
    • (2021) Extending the Beta divergence to complex values. Pattern Recognition Letters 144:C (105-111). doi:10.1016/j.patrec.2020.11.005. Online publication date: 1-Apr-2021
    • (2020) Support software for Automatic Speech Recognition systems targeted for non-native speech. Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services (55-61). doi:10.1145/3428757.3429971. Online publication date: 30-Nov-2020
    • (2020) A hybrid speech enhancement system with DNN based speech reconstruction and Kalman filtering. Multimedia Tools and Applications 79:43-44 (32643-32663). doi:10.1007/s11042-020-09563-5. Online publication date: 1-Nov-2020
    • (2019) Dual supervised learning for non-native speech recognition. EURASIP Journal on Audio, Speech, and Music Processing 2019:1 (1-10). doi:10.1186/s13636-018-0146-4. Online publication date: 1-Dec-2019
    • (2018) Determining the optimal conditions for signal reconstruction based on STFT magnitude. International Journal of Speech Technology 21:3 (619-632). doi:10.1007/s10772-018-9522-9. Online publication date: 1-Sep-2018
    • (2017) On relationships between amplitude and phase of short-time Fourier transform. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (676-680). doi:10.1109/ICASSP.2017.7952241. Online publication date: 5-Mar-2017
