Research article · DOI: 10.1145/2661829.2661918

Using Crowdsourcing to Investigate Perception of Narrative Similarity

Published: 03 November 2014

Abstract

For many applications, measuring the similarity between documents is essential. However, little is known about how users perceive similarity between documents. This paper presents the first large-scale empirical study that investigates perception of narrative similarity using crowdsourcing. As a dataset we use a large collection of Dutch folk narratives. We study the perception of narrative similarity by both experts and non-experts by analyzing their similarity ratings and their motivations for these ratings. While experts focus mostly on the plot, characters and themes of narratives, non-experts also pay attention to dimensions such as genre and style. Our results show that a more nuanced view of narrative similarity is needed than the one captured by story types, a concept used by scholars to group similar folk narratives. We also evaluate to what extent unsupervised and supervised models correspond with how humans perceive narrative similarity.

    Published In

    CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
    November 2014
    2152 pages
    ISBN:9781450325981
    DOI:10.1145/2661829
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. crowdsourcing
    2. folktales
    3. narratives
    4. similarity

    Qualifiers

    • Research-article

    Acceptance Rates

    CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
