Abstract
Audio replay attacks pose a great threat to Automatic Speaker Verification (ASV) systems. In this paper, we propose a set of features based on the Teager Energy Operator and a slightly modified version of the x-vector system to detect replay attacks. The proposed methods are tested on the ASVspoof 2017 corpus. Using a GMM with the proposed features, our best system achieves an EER of 6.13% on the dev set and 15.53% on the eval set, while the baseline system (GMM with CQCC) has an EER of 30.60% on the eval set. When combined with the modified x-vector, the best EER further drops to 5.57% on the dev set and 14.21% on the eval set.
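For readers unfamiliar with the Teager Energy Operator named above, the following is a minimal sketch of its standard discrete form, psi[n] = x[n]^2 - x[n-1] * x[n+1], as introduced by Kaiser (1990). It is not the feature extraction pipeline of this paper, which the abstract does not describe; the function name `teager_energy` and the test-tone parameters are illustrative assumptions.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Illustrative helper (hypothetical name), not the paper's feature set.
    Returns values for interior samples only, so the output has
    len(x) - 2 elements.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        raise ValueError("need at least 3 samples")
    return x[1:-1] ** 2 - x[:-2] * x[2:]


# Sanity check on a pure tone A * cos(omega * n + phi): the operator
# yields the constant A^2 * sin(omega)^2 at every interior sample.
amplitude, omega = 0.5, 0.3  # assumed example values
n = np.arange(200)
tone = amplitude * np.cos(omega * n + 0.1)
psi = teager_energy(tone)
expected = amplitude ** 2 * np.sin(omega) ** 2
```

Because the output depends on both the amplitude and the frequency of the tone, the operator tracks the energy of the source that generated the signal rather than the squared amplitude alone, which is the usual motivation for building speech features on it.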
This work is supported by NSFC 61602404 and the National Basic Research Program of China (973 Program) (No. 2013CB329504).
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Z., Zhou, L., Yang, Y., Wu, Z. (2019). Teager Energy Operator Based Features with x-vector for Replay Attack Detection. In: Sun, Z., He, R., Feng, J., Shan, S., Guo, Z. (eds) Biometric Recognition. CCBR 2019. Lecture Notes in Computer Science, vol. 11818. Springer, Cham. https://doi.org/10.1007/978-3-030-31456-9_51
DOI: https://doi.org/10.1007/978-3-030-31456-9_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31455-2
Online ISBN: 978-3-030-31456-9
eBook Packages: Computer Science (R0)