
Online audio background determination for complex audio environments

Published: 01 May 2007

Abstract

We present a method for foreground/background separation of audio using a background modelling technique. The technique models the background in an online, unsupervised, and adaptive fashion, and is designed for application to long-term surveillance and monitoring problems. The background is determined using a statistical method to model the states of the audio over time. In addition, three methods are used to increase the accuracy of background modelling in complex audio environments, which can cause the statistical model to fail to capture the background states accurately. First, an entropy-based approach is used to unify background representations fragmented over multiple states of the statistical model. The approach successfully unifies such background states, resulting in a more robust background model. Second, we adaptively adjust the number of states considered background according to background complexity, resulting in more accurate classification of background models. Finally, we use an auxiliary model cache to retain potential background states in the system. This prevents the deletion of such states due to the rapid influx of observed states that can occur in highly dynamic sections of the audio signal. The separation algorithm was successfully applied to a number of audio environments representing monitoring applications.
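The abstract does not give the details of the statistical model, so the following is only a minimal sketch of the general idea of online background modelling, not the authors' algorithm. It assumes a scalar audio feature per frame (for example, frame energy), a small set of Gaussian-like states updated with a fixed learning rate, and a cumulative-weight rule for deciding how many states count as background. All names and parameters (`OnlineBackgroundModel`, `alpha`, `bg_mass`, `min_var`, and so on) are hypothetical. The entropy-based unification and the auxiliary model cache described above are not implemented here.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare states by identity, not by field values
class State:
    mean: float
    var: float
    weight: float


class OnlineBackgroundModel:
    """Toy online background model over one scalar feature per audio frame.

    Assumption: this is an illustrative mixture-style model, not the method
    from the paper.
    """

    def __init__(self, max_states=4, alpha=0.05, match_sigmas=2.5,
                 init_var=1.0, min_var=0.01, bg_mass=0.7):
        self.max_states = max_states      # upper bound on stored states
        self.alpha = alpha                # learning rate
        self.match_sigmas = match_sigmas  # match radius in standard deviations
        self.init_var = init_var          # variance given to a new state
        self.min_var = min_var            # floor so a state never collapses
        self.bg_mass = bg_mass            # weight mass treated as background
        self.states = []

    def background_states(self):
        """Heaviest states whose cumulative weight first reaches bg_mass.

        The number of states returned varies: one dominant state for a simple
        background, several when the weight is spread over many states.
        """
        ranked = sorted(self.states, key=lambda s: s.weight, reverse=True)
        chosen, total = [], 0.0
        for s in ranked:
            chosen.append(s)
            total += s.weight
            if total >= self.bg_mass:
                break
        return chosen

    def update(self, x):
        """Absorb one observation; return True if it is labelled background."""
        matched = None
        for s in self.states:
            if (x - s.mean) ** 2 <= (self.match_sigmas ** 2) * s.var:
                matched = s
                break

        is_new = matched is None
        if is_new:
            # No state explains x: drop the weakest state if the model is
            # full, then start a new state centred on x.
            if len(self.states) >= self.max_states:
                self.states.remove(min(self.states, key=lambda s: s.weight))
            matched = State(mean=x, var=self.init_var, weight=0.0)
            self.states.append(matched)
        else:
            # Move the matched state towards the observation.
            d = x - matched.mean
            matched.mean += self.alpha * d
            matched.var = max(self.min_var,
                              matched.var + self.alpha * (d * d - matched.var))

        # Decay every weight, reinforce the matched state, renormalise.
        for s in self.states:
            s.weight *= 1.0 - self.alpha
        matched.weight += self.alpha
        total = sum(s.weight for s in self.states)
        for s in self.states:
            s.weight /= total

        # A state created on this frame is always reported as foreground.
        if is_new:
            return False
        return any(s is matched for s in self.background_states())
```

In this sketch, a steady signal quickly becomes the single background state and a sudden loud frame is reported as foreground. A signal that alternates between two levels ends up with both levels counted as background, because neither state alone reaches `bg_mass`.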



Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 3, Issue 2
May 2007
147 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/1230812

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Audio analysis
  2. online background modelling
  3. surveillance and monitoring

Qualifiers

  • Article

Cited By

  • (2024) Concept drift challenge in multimedia anomaly detection: A case study with facial datasets. Signal Processing: Image Communication 123 (117100). DOI: 10.1016/j.image.2024.117100. Online publication date: Apr-2024
  • (2022) Anomaly Detection in Audio With Concept Drift Using Dynamic Huffman Coding. IEEE Sensors Journal 22:17 (17126-17138). DOI: 10.1109/JSEN.2022.3193969. Online publication date: 1-Sep-2022
  • (2022) An Adaptive Framework for Anomaly Detection in Time-Series Audio-Visual Data. IEEE Access 10 (36188-36199). DOI: 10.1109/ACCESS.2022.3164439. Online publication date: 2022
  • (2021) Situational Anomaly Detection in Multimedia Data under Concept Drift. Proceedings of the 29th ACM International Conference on Multimedia (2969-2973). DOI: 10.1145/3474085.3481033. Online publication date: 17-Oct-2021
  • (2021) Rotating Acoustic Reflector Parameter Trade-Off for Near-Outdoor Audio Event Detection. Innovations in Electrical and Electronic Engineering (567-582). DOI: 10.1007/978-981-16-0749-3_44. Online publication date: 25-May-2021
  • (2016) Audio Surveillance. ACM Computing Surveys 48:4 (1-46). DOI: 10.1145/2871183. Online publication date: 22-Feb-2016
  • (2015) BF-Classifier. Proceedings of the Audio Mostly 2015 on Interaction With Sound (1-6). DOI: 10.1145/2814895.2814926. Online publication date: 7-Oct-2015
  • (2013) A Novel Element Detection Method in Audio Sensor Networks. International Journal of Distributed Sensor Networks 9:2 (607187). DOI: 10.1155/2013/607187. Online publication date: Jan-2013
  • (2011) Unstructured Environmental Audio. Machine Audition (1-21). DOI: 10.4018/978-1-61520-919-4.ch001. Online publication date: 2011
  • (2010) Background subtraction for automated multisensor surveillance. EURASIP Journal on Advances in Signal Processing 2010 (1-24). DOI: 10.1155/2010/343057. Online publication date: 1-Feb-2010
