Abstract
In this paper, we propose a graph-based framework that organizes the low-level and high-level features of music objects in a unified way. This graph, called the power graph, is associated with operators that support a variety of music information retrieval applications, such as auto-tagging, link analysis, similarity measurement, and clustering. Among these operators, we identify node ranking by prestige value as one of the fundamental link analysis operators. For this operator, we propose two methods of computing prestige: the power method and the algebraic method. Although the algebraic method originates from symmetric graphs, it can be applied as an approximate but efficient alternative to the power method. To demonstrate the feasibility of our framework, we carried out an auto-tagging experiment and a music object clustering experiment. In the auto-tagging experiment, the algebraic method achieved almost the same results as the power method in only one-fifth of the elapsed time. In the experiments we conducted, accuracy reached up to 75 %.
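To make the idea of ranking nodes by prestige with a power method more concrete, the following is a minimal sketch in the style of PageRank power iteration. It is not the authors' implementation: the function name `prestige_power_method`, the damping factor of 0.85, the uniform handling of nodes without outgoing edges, and the toy adjacency matrix are all assumptions made for illustration, and the abstract does not specify how the power graph itself is constructed.

```python
import numpy as np


def prestige_power_method(adj, damping=0.85, tol=1e-10, max_iter=1000):
    """PageRank-style prestige scores via power iteration (illustrative sketch).

    adj[i, j] > 0 means there is an edge from node i to node j.
    The damping factor and the dangling-node rule are assumptions.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_weight = adj.sum(axis=1)

    # Row-normalize the adjacency matrix into transition probabilities.
    safe = np.where(out_weight > 0, out_weight, 1.0)
    transition = adj / safe[:, None]
    # Nodes with no outgoing edges spread their score uniformly (assumption).
    transition[out_weight == 0] = 1.0 / n

    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = (1.0 - damping) / n + damping * (transition.T @ scores)
        # Stop once the scores change by less than tol in the L1 norm.
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores


# Hypothetical toy graph: nodes 1, 2, and 3 all point to node 0,
# and node 0 points back to node 1.
example_adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
])
example_scores = prestige_power_method(example_adj)
```

In this toy graph, node 0 receives links from every other node and therefore ends up with the highest score, while nodes 2 and 3, which receive no links, share the lowest. Each iteration costs one matrix-vector product, which is why a cheaper approximation can be attractive on large graphs.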
Notes
MusicBrainz, available at http://musicbrainz.org/
EchoNest, available at http://the.echonest.com/
AllMusic, available at http://www.allmusic.com/
“Words and other instructions in musical scores used to define the speed and specify the manner of performance” [15].
References
Bailloeul T, Zhu C, Xu Y (2008) Automatic image tagging as a random walk with priors on the canonical correlation subspace. Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (MIR 2008) (pp. 75–82). ACM Press. doi:10.1145/1460096.1460110
Barbedo JGA, Lopes A (2007) Automatic genre classification of musical signals. EURASIP J Adv Signal Proc, 2007(Article ID 64960). doi:10.1155/2007/64960
Berenzweig A, Ellis D, Logan B, Whitman B (2004) A large scale evaluation of acoustic and subjective music similarity measures. In: Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR’04), Barcelona, Spain, October 2004
Bertin-Mahieux T, Ellis DPW, Whitman B, Lamere P (2011) The million song dataset. Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011), 591–596
Breyer L (2002) Markovian page ranking distributions: some theory and simulations. Technical report. Available at http://www.lbreyer.com/preprints.html
Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Sys 30(1–7):107–117. doi:10.1016/j.comnet.2012.10.007
Bryan NJ, Wang G (2011) Musical influence network analysis and rank of sample-based music. Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011), 329–334
Cano P, Celma O, Koppenberger M, Martin-Buldu J (2006) Topology of music recommendation networks. Chaos Interdiscip J Nonlinear Sci 16
Cano P, Koppenberger M (2004) The emergence of complex network patterns in music artist networks. Proceedings of the International Society for Music Information Retrieval Conference (ISMIR 2004), 466–469
Chen L, Wright P, Nejdl W (2009) Improving music genre classification using collaborative tagging data. Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM 2009)
Coscia M, Giannotti F, Pedreschi D (2011) A classification for community discovery methods in complex networks. Stat Anal Data Min 4(5):512–546. doi:10.1002/sam, Wiley Periodicals, Inc
Downie JS (2008) The music information retrieval evaluation exchange (2005–2007): a window into music information retrieval research. Acoust Sci Technol 29(4):247–255. doi:10.1250/ast.29.247
Downie JS, Ehmann AF, Bay M, Cameron Jones M (2010) The music information retrieval evaluation eXchange: some observations and insights. Adv Music Inf Retr 274:93–115
Easley D, Kleinberg J (2010) Networks, crowds, and markets: reasoning about a highly connected world. Cambridge University Press, New York
Fallows D (accessed September 28, 2013) “Tempo and expression marks.” Grove Music Online. Oxford Music Online. Oxford University Press. Available at http://www.oxfordmusiconline.com/subscriber/article/grove/music/27650
Fu Z, Lu G, Ting KM, Zhang D (2011) A survey of audio-based music classification and annotation. IEEE Trans Multimed 13(2):303–319. doi:10.1109/TMM.2010.2098858
Gersho A, Gray RM (1991) Vector quantization and signal compression. Kluwer Academic Publishers, Norwell
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. PNAS 99(12):7821–7826. doi:10.1073/pnas.122653799
Gouyon F, Dixon S, Pampalk E, Widmer G (2004) Evaluating rhythmic descriptors for musical genre classification. In: Proceedings of the 25th International AES Conference, London, UK, June 2004
Graphviz. Graph visualization software. Available at http://www.graphviz.org/Home.php
Hsu J-L, Li Y-F (2012) A cross-modal method of labeling music tags. Multimedia Tools Appl 58(3):521–541. doi:10.1007/s11042-011-0729-x
Jang R (2011) DCPR toolbox. Retrieved from http://neural.cs.nthu.edu.tw/jang/books/dcpr/
Lartillot O, Toiviainen P, Eerola T (2011) MIRtoolbox. Retrieved from https://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/mirtoolbox
Levy M, Sandler M (2009) Music information retrieval using social tags and audio. IEEE Trans Multimed 11(3):383–395. doi:10.1109/TMM.2009.2012913
Lew MS, Sebe N, Djeraba C, Jain R (2006) Content-based multimedia information retrieval: state of the art and challenges. ACM Trans Multimed Comput Commun Appl 2(1):1–19. doi:10.1145/1126004.1126005
Li Q, Myaeng SH, Kim BM (2007) A probabilistic music recommender considering user options and audio features. Inf Process Manag 43(2):473–487
Lidy T, Rauber A (2005) Evaluation of feature extractors and psycho-acoustic transformations for music genre classification. In: Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR’05), pp. 34–41, London, UK, September 2005
Liu B (2011) Web data mining: exploring hyperlinks, contents, and usage data, 2nd edn. Springer
Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval (p. 496). Cambridge University Press
Marcus SE, Moy M, Coffman T (2007) Social network analysis. In: Cook DJ, Holder LB (eds) Mining graph data (pp. 443–451). John Wiley & Sons, Inc
McFee B, Bertin-Mahieux T, Ellis D, Lanckriet G (2012) The million song dataset challenge. In: Proceedings of the 4th International Workshop on Advances in Music Information Research (AdMIRe ‘12)
McKay C, Burgoyne JA, Hockman J, Smith JBL, Vigliensoni G (2010) Evaluating the genre classification performance of lyrical features relative to audio, symbolic and cultural features. In: Proceedings of the 11th International Conference for Music Information Retrieval Conference
McKay C, Fujinaga I (2008) Combining features extracted from audio, symbolic, and cultural sources. In: Proceedings of International Conference on Music Information Retrieval
Miotto R, Orio N (2010) A probabilistic approach to merge context and content information for music retrieval. In: Downie JS, Veltkamp RC (eds) International Conference on Music Information Retrieval (pp. 15–20). International Society for Music Information Retrieval
Miotto R, Orio N (2012) A probabilistic model to combine tags and acoustic similarity for music retrieval. ACM Trans Inf Syst 30(2):8.1–8.29. doi:10.1145/2180868.2180870
Page L, Brin S, Motwani R, Winograd T (1998) The PageRank citation ranking: bringing order to the web. Proceedings of the World Wide Web Internet and Web Information Systems (pp. 1–17). Technical report, Stanford Digital Library Technologies Project, 1998. Retrieved from http://en.scientificcommons.org/42893894
Pan J-Y, Yang H-J, Faloutsos C, Duygulu P (2004) Automatic multimedia cross-modal correlation discovery. Proceedings of the tenth ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD 2004) (pp. 653–658). Seattle, WA, USA: ACM Press. doi:10.1145/1014052.1014135
Pohle T, Pampalk E, Widmer G (2005) Evaluation of frequently used audio features for classification of music into perceptual categories. In: Proceedings of the 4th International Workshop on Content-Based Multimedia Indexing (CBMI’05), Riga, Latvia, June 2005
Sayood K (2012) Introduction to data compression, 4th edn., p. 768. Morgan Kaufmann Publishers
Scaringella N, Zoia G, Mlynek D (2006) Automatic genre classification of music content: a survey. IEEE Signal Process Mag 23(2):133–141. doi:10.1109/MSP.2006.1598089
Sergios T, Konstantinos K (2006) Pattern recognition, 3rd edn. Academic Press
Song Y, Dixon S, Pearce P (2012) Evaluation of musical features for emotion classification. In: Proceedings of the 13th International Conference for Music Information Retrieval Conference
Stober S, Nürnberger A (2013) Adaptive music retrieval: a state of the art. Multimedia Tools Appl 65(3):467–494. doi:10.1007/s11042-012-1042-z
Tan S, Bu J, Chen C, Xu B, Wang C, He X (2011) Using rich social media information for music recommendation via hypergraph model. ACM Trans Multimed Comput Commun Appl 7S(1), 22:1–22:22. doi:10.1145/2037676.2037679
Tong H, Faloutsos C, Pan J-Y (2007) Random walk with restart: fast solutions and applications. Knowl Inf Syst 14(3):327–346. Springer-Verlag New York, Inc. doi:10.1007/s10115-007-0094-2
Tzanetakis G, Cook P (2002) Musical genre classification of audio signals. IEEE Trans Speech Audio Process 10(5):293–302. doi:10.1109/TSA.2002.800560
Wang C, Jing F, Zhang L, Zhang H-J (2006) Image annotation refinement using random walk with restarts. Proceedings of the 14th annual ACM International Conference on Multimedia (MULTIMEDIA 2006) (pp. 647–650). Santa Barbara, CA, USA: ACM press. doi:10.1145/1180639.1180774
Zsuzsanna M, Sacarea C (2011) Using conceptual graphs to represent modern music. In: Proceedings of the 2011 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), pp. 137–140, 25–27 Aug. 2011. doi:10.1109/ICCP.2011.6047857
Acknowledgments
The authors would like to thank Professor George Tzanetakis for his valuable guidance and advice on experiments using the Million Song Dataset [4].
Additional information
This research was supported by Fu Jen Catholic University with Project No. 410031044042 and sponsored by the National Science Council under Contract No. NSC-100-2221-E-030-021 and NSC-101-2221-E-030-008.
Cite this article
Hsu, JL., Huang, CC. Designing a graph-based framework to support a multi-modal approach for music information retrieval. Multimed Tools Appl 74, 5401–5427 (2015). https://doi.org/10.1007/s11042-014-1860-2
DOI: https://doi.org/10.1007/s11042-014-1860-2