Abstract
With the popularity of portable recording devices, playback speech has become one of the most important means of attack on speaker authentication systems. Comparison with the original speech data shows that playback speech differs from it in the high-frequency layer, and that it also differs in the low-frequency layer because of the different recording equipment. Based on this finding, a detection algorithm is presented to extract representative data. In the high-frequency layer, inverse-Mel filters (I-Mel) are used to extract speaker eigenvector sequences. In the low-frequency layer, linear filters (Linear) are combined with Mel filters (Mel) to avoid superposition of characteristic parameters. The layers are fused to obtain L-M-I filter banks, from which new cepstral features are formed. The experimental results show that the method detects playback speech effectively, with an equal error rate of 2.63%. Compared with the traditional feature extraction methods (MFCC, CQCC, LFCC, IMFCC), the equal error rate decreases by 12.79%, 9.61%, 4.45% and 3.28%, respectively.
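The L-M-I idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no band edges, filter counts, or frame settings, so the split frequencies `f1` and `f2`, the filter counts, the 16 kHz sampling rate, and all function names below are assumptions chosen only for illustration. The sketch also assumes three contiguous bands (linear, then Mel, then inverse-Mel), builds the inverse-Mel filters by mirroring Mel-spaced edges within the top band, and applies a DCT to log filter-bank energies to obtain cepstral coefficients.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_bank(edges_hz, n_fft, sr):
    """One triangular filter per consecutive (low, centre, high) edge triple."""
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((len(edges_hz) - 2, freqs.size))
    for i in range(len(edges_hz) - 2):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def lmi_filter_bank(n_fft=512, sr=16000, f1=1000.0, f2=4000.0,
                    n_lin=8, n_mel=12, n_imel=12):
    """Stack linear, Mel and inverse-Mel filters over three bands.

    f1, f2 and the filter counts are hypothetical values, not from the paper.
    """
    nyq = sr / 2.0
    lin_edges = np.linspace(0.0, f1, n_lin + 2)
    mel_edges = mel_to_hz(np.linspace(hz_to_mel(f1), hz_to_mel(f2), n_mel + 2))
    # Inverse-Mel: mirror Mel-spaced edges so filters are densest near nyq.
    top = mel_to_hz(np.linspace(hz_to_mel(f2), hz_to_mel(nyq), n_imel + 2))
    imel_edges = f2 + nyq - top[::-1]
    return np.vstack([triangular_bank(e, n_fft, sr)
                      for e in (lin_edges, mel_edges, imel_edges)])

def dct_ii(x, n_ceps):
    """Unnormalised DCT-II along the last axis, keeping n_ceps coefficients."""
    n = x.shape[-1]
    k = np.arange(n_ceps)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return x @ basis.T

def lmi_cepstrum(signal, sr=16000, n_fft=512, hop=160, n_ceps=20):
    """Frame the signal, apply the L-M-I bank, and return cepstral features."""
    bank = lmi_filter_bank(n_fft, sr)
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_energy = np.log(power @ bank.T + 1e-10)  # small floor avoids log(0)
    return dct_ii(log_energy, n_ceps)
```

Calling `lmi_cepstrum(x)` on a one-dimensional NumPy array `x` of audio samples returns one row of cepstral coefficients per frame. These rows could then be passed to whichever classifier is used to separate genuine from playback speech, which the abstract does not specify.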
Acknowledgements
This work was funded by the Natural Science Foundation of Jiangsu Province (Project No. BK20150987) and supported by the College of Information Engineering, Nanjing University of Finance & Economics. In addition, the authors are grateful for the database provided by the ASVspoof2017 challenge.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhou, J., Jiang, Y. (2020). Playback Speech Detection Application Based on Cepstrum Feature. In: He, J., et al. Data Science. ICDS 2019. Communications in Computer and Information Science, vol 1179. Springer, Singapore. https://doi.org/10.1007/978-981-15-2810-1_24
DOI: https://doi.org/10.1007/978-981-15-2810-1_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-2809-5
Online ISBN: 978-981-15-2810-1
eBook Packages: Computer Science, Computer Science (R0)