Abstract
Automatic content extraction from multimedia files has recently been explored extensively. However, automatic content description of musical sounds has not been broadly investigated and still requires intensive research. In this paper, we investigate how to optimize sound representation for the purpose of musical instrument recognition. We propose to trace trends in the evolution over time of the values of MPEG-7 descriptors, as well as of their combinations. The described process is a typical example of a KDD application, consisting of data preparation, feature extraction, and decision model construction. A discussion of the efficiency of the applied classifiers illustrates the potential for further progress in optimizing sound representation. We believe that further research in this area would provide background for automatic multimedia content description.
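To make the idea of tracing trends in descriptor values concrete, the following is a minimal sketch and not the authors' method. It assumes that one descriptor has already been computed per analysis frame, splits the resulting sequence into a few equal parts, and summarizes each part by its mean and its least-squares slope. The function names, the three-segment split, and the sample values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def linear_trend(values):
    """Least-squares slope of the values against their frame index."""
    n = len(values)
    if n < 2:
        return 0.0  # a single frame has no trend
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def temporal_features(values, n_segments=3):
    """Return (mean, slope) for each of n_segments consecutive parts.

    The equal-length split is an assumption of this sketch, not a
    segmentation taken from the paper.
    """
    n = len(values)
    if n < n_segments:
        raise ValueError("need at least one frame per segment")
    bounds = [k * n // n_segments for k in range(n_segments + 1)]
    return [
        (mean(values[start:end]), linear_trend(values[start:end]))
        for start, end in zip(bounds, bounds[1:])
    ]


# Hypothetical per-frame values of one descriptor (made-up numbers, in Hz).
descriptor = [400.0, 500.0, 600.0, 600.0, 600.0, 600.0, 600.0, 550.0, 500.0]
features = temporal_features(descriptor)
for seg_mean, seg_slope in features:
    print(f"mean={seg_mean:.1f}  slope={seg_slope:.1f} per frame")
```

The resulting (mean, slope) pairs could then serve as attributes for a classifier; how the paper actually derives and combines its temporal descriptors is described in the full text, not in this sketch.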
Cite this article
Wieczorkowska, A.A., Wróblewski, J., Synak, P. et al. Application of Temporal Descriptors to Musical Instrument Sound Recognition. Journal of Intelligent Information Systems 21, 71–93 (2003). https://doi.org/10.1023/A:1023505917953
DOI: https://doi.org/10.1023/A:1023505917953