
Classification of acoustic events using SVM-based clustering schemes

Published: 01 April 2006

Abstract

Acoustic events produced in controlled environments may carry information useful for perceptually aware interfaces. This paper addresses the classification of 16 types of meeting-room acoustic events. We first define the events and gather a sound database. We then develop several classifiers based on support vector machines (SVM), using clustering schemes based on the confusion matrix to handle the multi-class problem, and define several sets of acoustic features for the classification tests. In the experiments, the SVM-based classifiers are compared with a previously reported binary tree scheme and with their Gaussian mixture model (GMM) counterparts. The best results are obtained with a tree SVM-based classifier that may use a different feature set at each node: it achieves a 31.5% relative average error reduction with respect to the best result from a conventional binary tree scheme.
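The abstract does not spell out how the confusion matrix drives the clustering, so the following is only a minimal sketch of one plausible reading: classes that a baseline classifier confuses most often are merged first, and the merge order defines a binary tree whose internal nodes would each hold one two-class SVM. The function names, the symmetric averaging of confusion rates, and the 4-class toy matrix are all assumptions made for illustration; they are not taken from the paper, and no SVM is trained here.

```python
from itertools import combinations


def confusion_similarity(conf, group_a, group_b):
    """Average symmetric confusion rate between two groups of class indices.

    conf[i][j] counts samples of true class i labelled as class j.
    Assumes every class has at least one sample (non-zero row sums).
    """
    total = 0.0
    for i in group_a:
        for j in group_b:
            total += conf[i][j] / sum(conf[i]) + conf[j][i] / sum(conf[j])
    return total / (2 * len(group_a) * len(group_b))


def build_tree(conf):
    """Merge the most mutually confused groups until one binary tree remains.

    Leaves are class indices; internal nodes are (left, right) tuples.
    In a tree classifier, each internal node would get its own binary SVM
    (and possibly its own feature set) separating left from right.
    """
    # Each cluster is (subtree, tuple of member class indices).
    clusters = [(i, (i,)) for i in range(len(conf))]
    while len(clusters) > 1:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: confusion_similarity(
                conf, clusters[p[0]][1], clusters[p[1]][1]
            ),
        )
        merged = (
            (clusters[a][0], clusters[b][0]),
            clusters[a][1] + clusters[b][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical 4-class confusion matrix (illustrative values, not from the paper).
toy_conf = [
    [8, 2, 0, 0],
    [3, 7, 0, 0],
    [0, 0, 9, 1],
    [0, 1, 2, 7],
]
tree = build_tree(toy_conf)
print(tree)
```

With this toy matrix, classes 0 and 1 are merged first and classes 2 and 3 second, so the root node would separate the two pairs and the hardest distinctions are deferred to the lowest nodes of the tree.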

