DOI: 10.1145/1255047.1255085

A voice-to-MIDI system for singing melodies with lyrics

Published: 13 June 2007

Abstract

In this paper, we propose a robust Voice-to-MIDI (V to M) system with which a user can input MIDI sequence data by naturally singing melodies with lyrics. A Voice-to-MIDI system translates singing voices into digital musical data, i.e., MIDI sequence data. With such a system, users can input melodies intuitively, which frees them from manually translating memorized melodies into chromatic pitches. However, the translation quality of ordinary Voice-to-MIDI systems is insufficient. One of the most significant problems is the poor accuracy of note segmentation. We solve this problem by employing "rhythmic tapping" concurrently with singing. We evaluated the proposed method in terms of the accuracy of the number of segmented notes and of their pitches, and confirmed that our system outperformed ordinary Voice-to-MIDI systems. Thus, the system achieves both easy, intuitive composition of MIDI sequence data and highly accurate translation of sung data into MIDI sequence data.
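
To make the idea of tap-assisted note segmentation concrete, the following is a minimal Python sketch. It is not the authors' implementation: the abstract does not describe the algorithm, so everything here is an assumption made for illustration. It assumes a fundamental-frequency (F0) track is already available as (time, frequency) frames, treats each tap time as a note onset, and assigns each segment the median F0 rounded to the nearest equal-tempered MIDI note (A4 = 440 Hz = note 69). The names `hz_to_midi` and `segment_by_taps` are hypothetical.

```python
import math
from statistics import median


def hz_to_midi(freq_hz):
    """Round a frequency in Hz to the nearest MIDI note (A4 = 440 Hz = 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def segment_by_taps(f0_track, tap_times, end_time):
    """Split an F0 track into notes at tap onsets (illustrative assumption).

    f0_track:  list of (time_s, f0_hz) frames; f0_hz is None when unvoiced.
    tap_times: sorted tap onset times in seconds.
    end_time:  time in seconds at which the last note ends.
    Returns a list of (onset_s, duration_s, midi_note) tuples.
    """
    notes = []
    boundaries = list(tap_times) + [end_time]
    for onset, offset in zip(boundaries, boundaries[1:]):
        voiced = [f for t, f in f0_track
                  if onset <= t < offset and f is not None]
        if not voiced:
            continue  # a tap with no voiced frames yields no note
        # The median is robust to brief pitch glides within a segment.
        notes.append((onset, offset - onset, hz_to_midi(median(voiced))))
    return notes


# Usage: ten frames at 0.1 s spacing, roughly C4 then roughly E4,
# with taps at 0.0 s and 0.5 s.
track = [(i / 10, 262.0 if i < 5 else 330.0) for i in range(10)]
print(segment_by_taps(track, [0.0, 0.5], 1.0))
# prints [(0.0, 0.5, 60), (0.5, 0.5, 64)]
```

In this sketch the taps alone decide how many notes are produced, so the note count does not depend on detecting boundaries in the voice signal itself, which is the segmentation problem the abstract identifies.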


Published In

ACE '07: Proceedings of the international conference on Advances in computer entertainment technology
June 2007
324 pages
ISBN:9781595936400
DOI:10.1145/1255047

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. FFT
  2. lyrics
  3. melody input
  4. note segmentation
  5. pitch recognition
  6. tapping
  7. voice-to-MIDI

Qualifiers

  • Article

Conference

ACE2007
Acceptance Rates

Overall Acceptance Rate 36 of 90 submissions, 40%

Cited By

  • (2022) Electroglottography based voice-to-MIDI real time converter with AI voice act classification. 2022 IEEE International Symposium on Medical Measurements and Applications (MeMeA), pp. 1-6. DOI: 10.1109/MeMeA54994.2022.9856413. Online publication date: 22-Jun-2022
  • (2022) Electroglottography based real-time voice-to-MIDI controller. Neuroscience Informatics, 100041. DOI: 10.1016/j.neuri.2022.100041. Online publication date: Jan-2022
  • (2013) A Semantics-Driven Approach to Lyrics Segmentation. Proceedings of the 2013 8th International Workshop on Semantic and Social Media Adaptation and Personalization, pp. 73-79. DOI: 10.1109/SMAP.2013.15. Online publication date: 12-Dec-2013
  • (2012) Hummi-com. Proceedings of the 20th ACM international conference on Multimedia, pp. 1321-1322. DOI: 10.1145/2393347.2396463. Online publication date: 29-Oct-2012
  • (2011) An accompaniment system for healing emotions of patients with dementia who repeat stereotypical utterances. Proceedings of the 9th international conference on Toward useful services for elderly and people with disabilities: smart homes and health telematics, pp. 65-71. DOI: 10.5555/2026187.2026198. Online publication date: 20-Jun-2011
  • (2011) An Implementation Method for Converting the Erhu Music from Wav to Mid. Proceedings of the 2011 Seventh International Conference on Computational Intelligence and Security, pp. 1425-1429. DOI: 10.1109/CIS.2011.318. Online publication date: 3-Dec-2011
  • (2011) Notation-Support Method in Music Composition Based on Interval-Pitch Conversion. Intelligent Decision Technologies, pp. 547-556. DOI: 10.1007/978-3-642-22194-1_54. Online publication date: 2011
  • (2011) An Accompaniment System for Healing Emotions of Patients with Dementia Who Repeat Stereotypical Utterances. Toward Useful Services for Elderly and People with Disabilities, pp. 65-71. DOI: 10.1007/978-3-642-21535-3_9. Online publication date: 2011
  • (2009) A Real-Time Note Transcription Technique Using Static and Dynamic Window Sizes. Proceedings of the 2009 International Conference on Signal Acquisition and Processing, pp. 30-33. DOI: 10.1109/ICSAP.2009.20. Online publication date: 3-Apr-2009
  • (2009) An Approach for Heartbeat Sound Transcription. Proceedings of the 2009 International Conference on Computer Technology and Development - Volume 01, pp. 38-41. DOI: 10.1109/ICCTD.2009.94. Online publication date: 13-Nov-2009
