A retrieval method for encrypted speech based on improved power normalized cepstrum coefficients and perceptual hashing

Multimedia Tools and Applications

Abstract

To reduce the impact of noise on the robustness and discrimination of speech perceptual hashing schemes, improve retrieval efficiency and accuracy, and protect the privacy of cloud speech data, this paper proposes a retrieval method for encrypted speech based on improved power normalized cepstrum coefficients (PNCC) and perceptual hashing. First, the original speech is encrypted with a Henon chaotic map inter-frame scrambling algorithm before it is uploaded to the encrypted speech library on the cloud server. Second, the discrete wavelet transform (DWT) and first-order difference coefficients are used to improve the PNCC feature extraction algorithm, and principal component analysis (PCA) reduces the high-dimensional audio features to one dimension, forming frame features that represent the speech segment. Finally, hash functions convert the frame features into binary hashing sequences, which are uploaded to the system hashing index table in the cloud. At retrieval time, the hashing sequence of the query speech is extracted and matched against the encrypted speech features in the cloud system hashing index table using the normalized Hamming distance to obtain the retrieval result. Experimental results show that, compared with existing methods, the proposed method has good robustness and discrimination, improves retrieval efficiency and accuracy, and improves the security of cloud speech data. In addition, the proposed method has good recognition ability in a simulated real-noise environment.
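
The scrambling and matching steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the Henon map parameters, the hash function, or the matching threshold, so the values `a=1.4`, `b=0.3`, the initial state `(0.1, 0.1)`, the median-threshold hash, and the threshold `0.3` are all assumptions chosen for illustration. The DWT/PNCC/PCA feature extraction is omitted; the per-frame features are taken as a given list of floats, and every function name below is hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional, Sequence, Tuple


def henon_permutation(n_frames: int, x0: float = 0.1, y0: float = 0.1,
                      a: float = 1.4, b: float = 0.3,
                      burn_in: int = 100) -> List[int]:
    """Derive a frame permutation from a Henon map trajectory.

    Assumption: the initial state (x0, y0) acts as the secret key.
    """
    x, y = x0, y0
    xs: List[float] = []
    for step in range(burn_in + n_frames):
        # Henon map: x' = 1 - a*x^2 + y, y' = b*x
        x, y = 1.0 - a * x * x + y, b * x
        if step >= burn_in:
            xs.append(x)
    # Ranking the chaotic values yields a key-dependent ordering of indices.
    return sorted(range(n_frames), key=lambda i: xs[i])


def scramble(frames: Sequence, perm: Sequence[int]) -> List:
    """Inter-frame scrambling: output position i holds input frame perm[i]."""
    return [frames[p] for p in perm]


def unscramble(scrambled: Sequence, perm: Sequence[int]) -> List:
    """Invert scramble() given the same permutation."""
    restored = [None] * len(perm)
    for i, p in enumerate(perm):
        restored[p] = scrambled[i]
    return restored


def binary_hash(frame_features: Sequence[float]) -> List[int]:
    """Assumed hash function: 1 if a frame feature exceeds the median, else 0."""
    threshold = median(frame_features)
    return [1 if f > threshold else 0 for f in frame_features]


def normalized_hamming(h1: Sequence[int], h2: Sequence[int]) -> float:
    """Fraction of bit positions at which two equal-length hashes differ."""
    if len(h1) != len(h2) or not h1:
        raise ValueError("hashes must be non-empty and of equal length")
    return sum(b1 != b2 for b1, b2 in zip(h1, h2)) / len(h1)


def retrieve(query_hash: Sequence[int], index: Dict[str, Sequence[int]],
             threshold: float = 0.3) -> Tuple[Optional[str], float]:
    """Return the closest index entry, or None if it is not below threshold."""
    best_id, best_dist = None, 1.0
    for speech_id, stored_hash in index.items():
        dist = normalized_hamming(query_hash, stored_hash)
        if dist < best_dist:
            best_id, best_dist = speech_id, dist
    if best_dist < threshold:
        return best_id, best_dist
    return None, best_dist
```

In this sketch the hash is computed from the plaintext features before encryption, and only the scrambled speech and its hash reach the cloud, so a query can be matched without decrypting the stored speech. A lower threshold makes matching stricter at the cost of more missed matches under noise.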

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61862041, 61363078). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

Author information

Corresponding author

Correspondence to Qiu-yu Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Qy., Bai, J. & Xu, Fj. A retrieval method for encrypted speech based on improved power normalized cepstrum coefficients and perceptual hashing. Multimed Tools Appl 81, 15127–15151 (2022). https://doi.org/10.1007/s11042-022-12560-5

  • DOI: https://doi.org/10.1007/s11042-022-12560-5
