A log-index weighted cepstral distance measure for speech recognition

Zheng Fang¹,
Wu Wenhu¹ &
Fang Ditang¹

Abstract

A log-index weighted cepstral distance measure is proposed and tested in speaker-independent and speaker-dependent isolated word recognition systems that use statistical techniques. In this measure, the weight for each cepstral coefficient equals the logarithm of that coefficient's index. Experimental results on three speech databases show that this measure outperforms the other weighted Euclidean cepstral distance measures tested. The error rate obtained with this measure is about 1.8% on average across the three databases, a 25% reduction relative to the other weighted measures and a 40% reduction relative to the Log Likelihood Ratio (LLR) measure. The results also show that the measure works well in both speaker-dependent and speaker-independent speech recognition systems.
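The weighting rule described in the abstract can be illustrated with a minimal Python sketch. Only the rule "weight equals the logarithm of the index" comes from the abstract. The rest is assumed for illustration: the natural logarithm, 1-based indexing, applying the weight to the coefficient difference before squaring, and the function names `log_index_weights` and `weighted_cepstral_distance`. The exact formulation used in the paper may differ.

```python
import math


def log_index_weights(order):
    """Return weights w_k = ln(k) for k = 1..order.

    Assumption: natural log and 1-based indices; the abstract only says
    the weights equal the logarithm of the corresponding indices.
    """
    return [math.log(k) for k in range(1, order + 1)]


def weighted_cepstral_distance(c_test, c_ref, weights):
    """Weighted squared Euclidean distance between two cepstral vectors.

    Assumption: d = sum_k (w_k * (c_test[k] - c_ref[k]))**2.
    """
    if not (len(c_test) == len(c_ref) == len(weights)):
        raise ValueError("cepstral vectors and weights must have equal length")
    return sum((w * (a - b)) ** 2 for w, a, b in zip(weights, c_test, c_ref))


# Hypothetical cepstral vectors, made up purely for illustration.
test_frame = [1.20, -0.45, 0.30, 0.10]
reference_frame = [1.00, -0.40, 0.25, 0.00]
weights = log_index_weights(len(test_frame))
print(weighted_cepstral_distance(test_frame, reference_frame, weights))
```

Under these assumptions the first weight is ln(1) = 0, so the first coefficient does not contribute to the distance, and higher-index coefficients are emphasized progressively but slowly.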

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing
Zheng Fang, Wu Wenhu & Fang Ditang

Corresponding author

Correspondence to Zheng Fang.

Additional information

Zheng Fang was born in Jiangsu Province, P.R. China, in 1967. He received the B.S. and M.S. degrees in computer science and technology from Tsinghua University, P.R. China, in 1990 and 1992, respectively. He is now a lecturer and, at the same time, a Ph.D. candidate at Tsinghua University. He is also the Executive Director of the Analog Devices Inc.-Tsinghua DSP Technology Research Center. Since 1988, he has been working on speech recognition at the Speech Lab., Dept. of Computer Science and Technology, Tsinghua University.

Wu Wenhu was born in Beijing, P.R. China, in 1936. He studied in the Department of Electrical Engineering, Tsinghua University, from 1955 to 1958, and then in the Department of Automation, Tsinghua University, from 1958 to 1961. Since then he has been at Tsinghua University, where he is now a Professor in the Department of Computer Science and Technology and the Director of the Speech Lab. His research is devoted to Chinese speech recognition and understanding, especially speaker-independent Chinese speech recognition, for which he has received several awards. He is also engaged in the popularization of computer education and is the Chairman of the Computer Spread Education Commission of CCF (Chinese Computer Federation). He led the China Team at the International Olympiad in Informatics from IOI'89 through IOI'95, where the team won many gold medals.

Fang Ditang was born in Shanghai, P.R. China, in 1930. He received the B.S. degree from Jiaotong University in 1953 and the M.S. degree from Tsinghua University in 1956, both in electrical engineering. Since then, he has been teaching at Tsinghua University, where he is now a Professor in the Department of Computer Science and Technology. In 1979, he founded the Laboratory for Human-Machine Speech Communications and served as its Director from 1979 to 1990. The laboratory won the National Scientific Research and Technology Progress Award twice, in 1987 and 1989, the National Scientific Invention Award in 1990, and three other awards. He is the Deputy Chief of the Artificial Intelligence and Pattern Recognition Committee of the Chinese Computer Science Society.

About this article

Cite this article

Zheng, F., Wu, W. & Fang, D. A log-index weighted cepstral distance measure for speech recognition. J. of Comput. Sci. & Technol. 12, 177–184 (1997). https://doi.org/10.1007/BF02951337

Received: 25 May 1996
Revised: 12 September 1996
Issue Date: March 1997
DOI: https://doi.org/10.1007/BF02951337
