research-article

An Efficient Cascaded Filtering Retrieval Method for Big Audio Data

Published: 01 September 2015

Abstract

Fast audio retrieval is crucial for many important applications, yet it is demanding because of the high dimensionality and ever-growing volume of audio on the Internet. Although audio fingerprinting greatly reduces dimensionality while keeping audio identifiable, audio fingerprints are still too high-dimensional to scale to big audio data. The tradeoff between accuracy (measured by precision and recall) and efficiency (measured by retrieval time) prevents further reduction in fingerprint dimension. This paper shows that a multi-stage filtering strategy can achieve both speedup and high accuracy, with the early stages focusing on speed and the final stage emphasizing accuracy. Based on this strategy, an efficient cascaded filtering retrieval method is proposed: it filters with Fibonacci hashing, the middle fingerprint, and thresholds to quickly select candidate audios, then refines the candidates with an accurate and robust fingerprint. Experiments with 500 000 audios show that the proposed method achieves a speedup of more than 28 000 times over Fibonacci hashing retrieval. After MP3 conversion, resampling, white noise addition, and background noise addition, the recall rates of the method are all above 99.45%, and the precision matches that of the Philips audio fingerprint, which is close to 100%.
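The filter-then-refine idea in the abstract can be illustrated with a minimal two-stage sketch. This is not the authors' implementation: the abstract does not specify the stages in detail, so everything here is an assumption made for illustration. Fingerprints are assumed to be lists of 32-bit sub-fingerprints, the coarse stage uses Knuth-style multiplicative (Fibonacci) hashing as a stand-in, the middle-fingerprint stage is omitted, and the names `fibonacci_hash`, `build_index`, `cascaded_search`, `min_hits`, and `ber_threshold` are hypothetical.

```python
from collections import defaultdict

# floor(2**32 / golden ratio): the usual constant for 32-bit Fibonacci hashing.
GOLDEN_32 = 2654435769


def fibonacci_hash(key: int, bits: int = 16) -> int:
    """Map a 32-bit key to a `bits`-bit bucket by multiplicative hashing."""
    return ((key * GOLDEN_32) & 0xFFFFFFFF) >> (32 - bits)


def bit_error_rate(a, b):
    """Fraction of differing bits between two equal-length fingerprints."""
    diff = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return diff / (32 * len(a))


def build_index(database):
    """Inverted index: hash bucket -> ids of audios with a sub-fingerprint there."""
    index = defaultdict(set)
    for audio_id, fingerprint in database.items():
        for sub in fingerprint:
            index[fibonacci_hash(sub)].add(audio_id)
    return index


def cascaded_search(query, database, index, min_hits=2, ber_threshold=0.35):
    """Return (audio_id, ber) of the best match, or None if nothing passes."""
    # Stage 1 (fast, coarse): count bucket hits and keep audios above a threshold.
    hits = defaultdict(int)
    for sub in query:
        for audio_id in index.get(fibonacci_hash(sub), ()):
            hits[audio_id] += 1
    candidates = [a for a, h in hits.items() if h >= min_hits]

    # Stage 2 (slow, accurate): full bit-error-rate comparison on candidates only.
    best = None
    for audio_id in candidates:
        ber = bit_error_rate(query, database[audio_id])
        if ber <= ber_threshold and (best is None or ber < best[1]):
            best = (audio_id, ber)
    return best
```

The design point is that the expensive comparison runs only on the few audios that survive the cheap stage, so the thresholds in the first stage control speed while the last stage determines accuracy. Both threshold values above are illustrative, not taken from the paper.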




Published In

IEEE Transactions on Multimedia, Volume 17, Issue 9
Sept. 2015
270 pages

Publisher

IEEE Press


Qualifiers

  • Research-article

Cited By

  • (2023) Accuracy comparisons of fingerprint based song recognition approaches using very high granularity. Multimedia Tools and Applications 82:20 (31591-31606). DOI: 10.1007/s11042-023-14787-2. Online publication date: 21-Mar-2023
  • (2022) Encrypted speech retrieval based on long sequence Biohashing. Multimedia Tools and Applications 81:9 (13065-13085). DOI: 10.1007/s11042-022-12371-8. Online publication date: 1-Apr-2022
  • (2021) A Hierarchical Retrieval Method Based on Hash Table for Audio Fingerprinting. Intelligent Computing Theories and Application (160-174). DOI: 10.1007/978-3-030-84522-3_13. Online publication date: 12-Aug-2021
  • (2018) Multimedia Big Data Analytics. ACM Computing Surveys 51:1 (1-34). DOI: 10.1145/3150226. Online publication date: 10-Jan-2018
  • (2017) Audio Identification by Sampling Sub-fingerprints and Counting Matches. IEEE Transactions on Multimedia 19:9 (1984-1995). DOI: 10.1109/TMM.2017.2723846. Online publication date: 1-Sep-2017
  • (2017) Generalized Residual Vector Quantization and Aggregating Tree for Large Scale Search. IEEE Transactions on Multimedia 19:8 (1785-1797). DOI: 10.1109/TMM.2017.2692181. Online publication date: 1-Aug-2017
