Online audio background determination for complex audio environments

Authors:

Simon Moncrieff,

Svetha Venkatesh,

Geoff West

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 3, Issue 2

Pages 8 - es

https://doi.org/10.1145/1230812.1230814

Published: 01 May 2007

Abstract

We present a method for foreground/background separation of audio using a background modelling technique. The technique models the background in an online, unsupervised, and adaptive fashion, and is designed for application to long-term surveillance and monitoring problems. The background is determined using a statistical method to model the states of the audio over time. In addition, three methods are used to increase the accuracy of background modelling in complex audio environments. Such environments can cause the statistical model to fail to capture the background states accurately. An entropy-based approach is used to unify background representations fragmented over multiple states of the statistical model. The approach successfully unifies such background states, resulting in a more robust background model. We adaptively adjust the number of states considered background according to background complexity, resulting in more accurate classification of background models. Finally, we use an auxiliary model cache to retain potential background states in the system. This prevents the deletion of such states due to a rapid influx of observed states that can occur in highly dynamic sections of the audio signal. The separation algorithm was successfully applied to a number of audio environments representing monitoring applications.
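The abstract does not say which statistical model is used, so the following is only a minimal sketch of the general idea of online, unsupervised, adaptive background modelling. It assumes a one-dimensional audio feature (for example, frame energy) and an adaptive mixture of Gaussians in the style of Stauffer and Grimson's background mixture models. The class name `OnlineBackgroundModel` and all parameters (`alpha`, `match_sigmas`, `background_mass`, and so on) are hypothetical and chosen for illustration. The entropy-based unification, the adaptive number of background states, and the auxiliary model cache described in the abstract are not implemented here.

```python
import math
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality, so list.remove targets one object
class State:
    mean: float
    var: float
    weight: float


class OnlineBackgroundModel:
    """Hypothetical online mixture-of-Gaussians background model (1-D feature)."""

    def __init__(self, max_states=4, alpha=0.05, match_sigmas=2.5,
                 init_var=1.0, min_var=0.01, background_mass=0.7):
        self.max_states = max_states            # cap on the number of states kept
        self.alpha = alpha                      # learning rate (assumed value)
        self.match_sigmas = match_sigmas        # match radius in standard deviations
        self.init_var = init_var                # variance given to a new state
        self.min_var = min_var                  # floor that stops variance collapsing
        self.background_mass = background_mass  # weight mass treated as background
        self.states = []

    def _background_states(self):
        # Rank states by weight and keep the heaviest ones until their
        # cumulative weight reaches background_mass.
        ranked = sorted(self.states, key=lambda s: s.weight, reverse=True)
        chosen, mass = [], 0.0
        for state in ranked:
            chosen.append(state)
            mass += state.weight
            if mass >= self.background_mass:
                break
        return chosen

    def update(self, x):
        """Absorb one observation; return True if it is classified as background."""
        matched = None
        for state in self.states:
            if abs(x - state.mean) <= self.match_sigmas * math.sqrt(state.var):
                matched = state
                break

        if matched is None:
            # Unseen sound: evict the weakest state if the model is full,
            # then start a new low-weight state centred on the observation.
            if len(self.states) >= self.max_states:
                self.states.remove(min(self.states, key=lambda s: s.weight))
            matched = State(mean=x, var=self.init_var, weight=self.alpha)
            self.states.append(matched)
        else:
            # Decay every weight, reinforce the matched state, and move its
            # mean and variance towards the observation.
            for state in self.states:
                state.weight *= 1.0 - self.alpha
            matched.weight += self.alpha
            diff = x - matched.mean
            matched.mean += self.alpha * diff
            matched.var = max(self.min_var,
                              (1.0 - self.alpha) * matched.var
                              + self.alpha * diff * diff)

        total = sum(state.weight for state in self.states)
        for state in self.states:
            state.weight /= total

        return any(state is matched for state in self._background_states())
```

With these assumed settings, a steady low-level signal is absorbed as background, a sudden loud value is flagged as foreground, and the same loud value is reclassified as background once it persists long enough to accumulate weight. The eviction step in `update` is where a rapid influx of new states could delete a genuine background state, which is the failure the abstract's auxiliary model cache is meant to prevent.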

Index Terms

Online audio background determination for complex audio environments
1. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 3, Issue 2

May 2007

147 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/1230812

Copyright © 2007 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 2007

Published in TOMM Volume 3, Issue 2

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 13
Total Downloads: 609
Downloads (Last 12 months): 11
Downloads (Last 6 weeks): 3

Reflects downloads up to 14 Dec 2024

Citations

Cited By

Kumari P, Choudhary P, Kujur V, Atrey P, Saini M (2024). Concept drift challenge in multimedia anomaly detection: A case study with facial datasets. Signal Processing: Image Communication 123 (117100). Online publication date: Apr-2024.
https://doi.org/10.1016/j.image.2024.117100
Kumari P, Saini M (2022). Anomaly Detection in Audio With Concept Drift Using Dynamic Huffman Coding. IEEE Sensors Journal 22:17 (17126-17138). Online publication date: 1-Sep-2022.
https://doi.org/10.1109/JSEN.2022.3193969
Kumari P, Saini M (2022). An Adaptive Framework for Anomaly Detection in Time-Series Audio-Visual Data. IEEE Access 10 (36188-36199). Online publication date: 2022.
https://doi.org/10.1109/ACCESS.2022.3164439
Kumari P, Shen H, Zhuang Y, Smith J, Yang Y, Cesar P, Metze F, Prabhakaran B (2021). Situational Anomaly Detection in Multimedia Data under Concept Drift. Proceedings of the 29th ACM International Conference on Multimedia (2969-2973). Online publication date: 17-Oct-2021.
https://dl.acm.org/doi/10.1145/3474085.3481033
Bhagwat G, Jayaprakash S, Bhowmick A (2021). Rotating Acoustic Reflector Parameter Trade-Off for Near-Outdoor Audio Event Detection. Innovations in Electrical and Electronic Engineering (567-582). Online publication date: 25-May-2021.
https://doi.org/10.1007/978-981-16-0749-3_44
Crocco M, Cristani M, Trucco A, Murino V (2016). Audio Surveillance. ACM Computing Surveys 48:4 (1-46). Online publication date: 22-Feb-2016.
https://dl.acm.org/doi/10.1145/2871183
Thorogood M, Fan J, Pasquier P, Kalliris G, Dimoulas C (2015). BF-Classifier. Proceedings of the Audio Mostly 2015 on Interaction With Sound (1-6). Online publication date: 7-Oct-2015.
https://dl.acm.org/doi/10.1145/2814895.2814926
Li Q, Zhang M, Xu G (2013). A Novel Element Detection Method in Audio Sensor Networks. International Journal of Distributed Sensor Networks 9:2 (607187). Online publication date: Jan-2013.
https://doi.org/10.1155/2013/607187
Chu S, Narayanan S, Kuo C (2011). Unstructured Environmental Audio. Machine Audition (1-21). Online publication date: 2011.
https://doi.org/10.4018/978-1-61520-919-4.ch001
Cristani M, Farenzena M, Bloisi D, Murino V (2010). Background subtraction for automated multisensor surveillance. EURASIP Journal on Advances in Signal Processing 2010 (1-24). Online publication date: 1-Feb-2010.
https://dl.acm.org/doi/10.1155/2010/343057
