
Pen-Based Music Document Transcription with Convolutional Neural Networks

  • Conference paper
Graphics Recognition. Current Trends and Evolutions (GREC 2017)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11009)


Abstract

The transcription of music sources requires new ways of interacting with musical documents. Assuming that automatic technologies will never guarantee a perfect transcription, our intention is to develop an interactive system in which user and software collaborate to complete the task. Since traditional score-editing software can be tedious to use, our work studies interaction by means of an electronic pen (e-pen). In our framework, users trace symbols with an e-pen over a digital surface, which provides both the underlying image (offline data) and the drawing made (online data). Using both sources, the system reaches an error rate below 4% when recognizing the symbols with a Convolutional Neural Network.
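
The abstract does not say how the offline and online sources are combined. As a minimal illustrative sketch only, one common option is late fusion: each modality is classified separately and the two class-probability vectors are averaged. The function names, the `weight` parameter, the symbol labels and the probability values below are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_posteriors(offline: Sequence[float],
                    online: Sequence[float],
                    weight: float = 0.5) -> List[float]:
    """Late fusion (an assumption, not the paper's stated method):
    weighted average of two class-probability vectors, renormalized."""
    if len(offline) != len(online):
        raise ValueError("both modalities must score the same set of classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # weight applies to the offline (image) scores, 1 - weight to the online (pen) scores
    fused = [weight * p_off + (1.0 - weight) * p_on
             for p_off, p_on in zip(offline, online)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    return [p / total for p in fused]


def predict(labels: Sequence[str],
            offline: Sequence[float],
            online: Sequence[float],
            weight: float = 0.5) -> str:
    """Return the label with the highest fused probability."""
    fused = fuse_posteriors(offline, online, weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Hypothetical scores for one traced symbol; in practice each vector would
# come from a classifier trained on the corresponding modality.
labels = ["clef", "note", "rest"]
offline_scores = [0.5, 0.4, 0.1]   # from the underlying image
online_scores = [0.2, 0.7, 0.1]    # from the pen trace
prediction = predict(labels, offline_scores, online_scores)
```

With equal weights, the online evidence for the second class outweighs the offline preference for the first, so the fused decision differs from what the image alone would give. Other combination schemes, such as feeding both inputs to a single network, are equally plausible readings of the abstract.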


Notes

  1. The dataset is freely available at http://grfia.dlsi.ua.es/ (Bimodal music symbols from Early notation).


Acknowledgment

This work was supported by the Social Sciences and Humanities Research Council of Canada, and by the Spanish Ministerio de Ciencia, Innovación y Universidades through Project HISPAMUS (No. TIN2017-86576-R supported by EU FEDER funds).

Author information

Corresponding author

Correspondence to Jorge Calvo-Zaragoza.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Sober-Mira, J., Calvo-Zaragoza, J., Rizo, D., Iñesta, J.M. (2018). Pen-Based Music Document Transcription with Convolutional Neural Networks. In: Fornés, A., Lamiroy, B. (eds) Graphics Recognition. Current Trends and Evolutions. GREC 2017. Lecture Notes in Computer Science, vol 11009. Springer, Cham. https://doi.org/10.1007/978-3-030-02284-6_6

  • DOI: https://doi.org/10.1007/978-3-030-02284-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02283-9

  • Online ISBN: 978-3-030-02284-6

  • eBook Packages: Computer Science, Computer Science (R0)
