Abstract
This paper proposes an efficient retrieval approach for encrypted speech based on biological hashing and spectral subtraction. The approach reduces the impact of noise on the robustness and discrimination of the speech hashing scheme, improves retrieval efficiency and accuracy and the security of the search digests, and supports authentication of the query result. The speech owner first secures the original speech file by encrypting it with a two-dimensional Arnold mapping and uploading it to the encrypted speech library in the cloud. Then the pre-processed speech signal is denoised by spectral subtraction, and the discrete wavelet transform (DWT) is applied to obtain the low-frequency wavelet coefficients and reconstruct the speech signal. The normalized autocorrelation function is calculated to obtain the matrix feature vector, the Chebyshev mapping algorithm is used to generate a pseudo-random matrix, and a pseudo-random Fourier matrix is generated by the fast Fourier transform (FFT). Finally, the matrix feature vector and the pseudo-random matrix are iterated; after thresholding, the hash sequence is constructed and uploaded to the system hash index table in the cloud. When a user retrieves speech, the Hamming distance algorithm is used both for the matching operation during the search and for integrity authentication of the query result. The experimental results show that the proposed approach effectively reduces speech noise, offers strong robustness and discrimination, and significantly improves retrieval efficiency, accuracy, and security.
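The hashing and matching stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: it omits the Arnold encryption, spectral subtraction, DWT, and FFT steps, and it assumes a simple scheme in which a feature vector is projected onto rows of a key-dependent matrix filled by the Chebyshev map x -> cos(k * arccos(x)), thresholded at zero, and matched by normalized Hamming distance. The function names, the key values `x0` and `k`, the hash length, and the 0.25 matching threshold are all hypothetical choices made for illustration.

```python
import math


def chebyshev_sequence(x0, k, n):
    """Return n iterates of the Chebyshev map x -> cos(k * arccos(x)) on [-1, 1]."""
    values = []
    x = x0
    for _ in range(n):
        # Clamp to guard against floating-point drift outside [-1, 1].
        x = math.cos(k * math.acos(max(-1.0, min(1.0, x))))
        values.append(x)
    return values


def bio_hash(features, x0=0.3, k=4, n_bits=32):
    """Project features onto key-dependent pseudo-random rows; threshold at 0.

    x0 and k act as the secret key (assumed values, not from the paper).
    """
    m = len(features)
    seq = chebyshev_sequence(x0, k, n_bits * m)
    bits = []
    for i in range(n_bits):
        row = seq[i * m:(i + 1) * m]
        projection = sum(r * f for r, f in zip(row, features))
        bits.append(1 if projection > 0 else 0)
    return bits


def normalized_hamming(a, b):
    """Fraction of positions at which two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("hash sequences must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def retrieve(query_hash, index, threshold=0.25):
    """Return the id of the closest stored hash, or None if it exceeds threshold."""
    best_id, best_dist = None, None
    for item_id, stored in index.items():
        dist = normalized_hamming(query_hash, stored)
        if best_dist is None or dist < best_dist:
            best_id, best_dist = item_id, dist
    if best_dist is not None and best_dist <= threshold:
        return best_id
    return None


# Usage: build a tiny hash index and query it with the same feature vector.
features = [0.12, -0.40, 0.33, 0.05, -0.27, 0.61, -0.08, 0.19]
index = {"clip-1": bio_hash(features)}
match = retrieve(bio_hash(features), index)
```

In this sketch the same Hamming distance serves both roles named in the abstract: the smallest distance selects the candidate, and the threshold decides whether that candidate is accepted as authentic. A different key (`x0`, `k`) yields a different projection matrix, which is what keeps the hash from being reproduced without the key.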
Acknowledgments
This work is supported by the National Natural Science Foundation of China (No. 61862041, 61363078). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
Cite this article
Zhang, Qy., Li, Gl. & Huang, Yb. An efficient retrieval approach for encrypted speech based on biological hashing and spectral subtraction. Multimed Tools Appl 79, 29775–29798 (2020). https://doi.org/10.1007/s11042-020-09446-9
DOI: https://doi.org/10.1007/s11042-020-09446-9