Pen-Based Music Document Transcription with Convolutional Neural Networks

Javier Sober-Mira, Jorge Calvo-Zaragoza, David Rizo & José M. Iñesta

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11009)

Included in the following conference series:

International Workshop on Graphics Recognition

Abstract

The transcription of music sources requires new ways of interacting with musical documents. Assuming that automatic technologies will never guarantee a perfect transcription, our intention is to develop an interactive system in which user and software collaborate to complete the task. Since traditional score-editing software can be tedious to use, our work studies interaction by means of an electronic pen (e-pen). In our framework, users trace symbols with an e-pen over a digital surface, which provides both the underlying image (offline data) and the drawing made (online data). Using both sources, the system reaches an error rate below 4% when recognizing the symbols with a Convolutional Neural Network.

Notes

1. The dataset is freely available at http://grfia.dlsi.ua.es/ (Bimodal music symbols from Early notation).

Acknowledgment

This work was supported by the Social Sciences and Humanities Research Council of Canada, and by the Spanish Ministerio de Ciencia, Innovación y Universidades through Project HISPAMUS (No. TIN2017-86576-R supported by EU FEDER funds).

Author information

Authors and Affiliations

Department of Software and Computing Systems, University of Alicante, Alicante, Spain
Javier Sober-Mira, David Rizo & José M. Iñesta
Schulich School of Music, McGill University, Montréal, Canada
Jorge Calvo-Zaragoza

Corresponding author

Correspondence to Jorge Calvo-Zaragoza.

Editor information

Editors and Affiliations

Computer Vision Center, Autonomous University of Barcelona, Bellaterra, Barcelona, Spain
Alicia Fornés
Université de Lorraine, Nancy, France
Bart Lamiroy

About this paper

Cite this paper

Sober-Mira, J., Calvo-Zaragoza, J., Rizo, D., Iñesta, J.M. (2018). Pen-Based Music Document Transcription with Convolutional Neural Networks. In: Fornés, A., Lamiroy, B. (eds) Graphics Recognition. Current Trends and Evolutions. GREC 2017. Lecture Notes in Computer Science, vol 11009. Springer, Cham. https://doi.org/10.1007/978-3-030-02284-6_6

DOI: https://doi.org/10.1007/978-3-030-02284-6_6
Published: 23 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02283-9
Online ISBN: 978-3-030-02284-6
eBook Packages: Computer Science, Computer Science (R0)
