- Research Article
- Open access
A Supervised Classification Algorithm for Note Onset Detection
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 043745 (2006)
Abstract
This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or non-onsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments run on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
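The two post-classification steps named in the abstract, merging the outputs of several networks and peak-picking against a moving average, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how predictions are combined (a plain mean is assumed here), and the function names, the window width, and the threshold offset are all hypothetical.

```python
import numpy as np


def combine_predictions(traces):
    """Merge per-frame onset scores from several classifiers.

    Assumption: a plain mean across networks; the abstract does not
    specify the combination rule.
    """
    return np.mean(np.stack([np.asarray(t, dtype=float) for t in traces]), axis=0)


def moving_average(x, width):
    """Centered moving average that returns an array the same length as x."""
    if width % 2 == 0:
        raise ValueError("width must be odd so the window is centered")
    pad = width // 2
    padded = np.pad(x, pad, mode="edge")  # repeat edge values at both ends
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")


def pick_onsets(activation, width=11, offset=0.1):
    """Return frame indices that are local maxima above a moving-average threshold.

    `width` and `offset` are illustrative values, not taken from the paper.
    """
    activation = np.asarray(activation, dtype=float)
    threshold = moving_average(activation, width) + offset
    onsets = []
    for i in range(1, len(activation) - 1):
        is_local_max = (activation[i] > activation[i - 1]
                        and activation[i] >= activation[i + 1])
        if is_local_max and activation[i] > threshold[i]:
            onsets.append(i)
    return onsets


# Toy example: two hypothetical network outputs over 50 frames.
net_a = np.zeros(50)
net_b = np.zeros(50)
net_a[10], net_b[10] = 0.9, 0.7
net_a[30], net_b[30] = 0.8, 0.8
onset_frames = pick_onsets(combine_predictions([net_a, net_b]))
```

Using a threshold that follows the local average, rather than a fixed one, lets the picker adapt to passages where the classifier output is uniformly high or low; whether the paper's threshold behaves this way in detail is not stated in the abstract.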
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Lacoste, A., Eck, D. A Supervised Classification Algorithm for Note Onset Detection. EURASIP J. Adv. Signal Process. 2007, 043745 (2006). https://doi.org/10.1155/2007/43745
DOI: https://doi.org/10.1155/2007/43745