research-article

How useful are your comments?: analyzing and predicting YouTube comments and comment ratings

Authors:

Stefan Siersdorfer,

Sergiu Chelaru,

Wolfgang Nejdl,

Jose San Pedro

WWW '10: Proceedings of the 19th international conference on World wide web

Pages 891 - 900

https://doi.org/10.1145/1772690.1772781

Published: 26 April 2010

Abstract

An analysis of the social video sharing platform YouTube reveals a large amount of community feedback, both through comments on published videos and through meta ratings of these comments. In this paper, we present an in-depth study of commenting and comment rating behavior on a sample of more than 6 million comments on 67,000 YouTube videos, for which we analyzed dependencies between comments, views, comment ratings, and topic categories. In addition, we studied the influence of the sentiment expressed in comments on the ratings of these comments using the SentiWordNet thesaurus, a lexical WordNet-based resource containing sentiment annotations. Finally, to predict community acceptance of comments not yet rated, we built different classifiers that estimate ratings for these comments. The results of our large-scale evaluations are promising and indicate that community feedback on already rated comments can help to filter new unrated comments or to suggest particularly useful but still unrated comments.
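To make the sentiment step more concrete, the following is a minimal sketch of lexicon-based sentiment features for a comment. It is not the authors' implementation: the lexicon `TOY_LEXICON`, its scores, and the helper names `tokenize` and `sentiment_features` are hypothetical and invented for illustration. The only property borrowed from SentiWordNet is that each entry carries a positivity and a negativity score, with objectivity as the remainder up to 1.

```python
import re
from statistics import mean

# Hypothetical toy lexicon: term -> (positivity, negativity), each in [0, 1].
# These values are invented for illustration and are NOT taken from SentiWordNet.
TOY_LEXICON = {
    "great": (0.75, 0.0),
    "love": (0.625, 0.0),
    "funny": (0.5, 0.125),
    "boring": (0.0, 0.625),
    "hate": (0.0, 0.75),
    "stupid": (0.0, 0.875),
}


def tokenize(comment):
    """Lowercase the comment and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", comment.lower())


def sentiment_features(comment, lexicon=TOY_LEXICON):
    """Average the positivity and negativity of all lexicon terms in a comment.

    Comments without any lexicon term are treated as fully objective.
    """
    scored = [lexicon[t] for t in tokenize(comment) if t in lexicon]
    if not scored:
        return {"pos": 0.0, "neg": 0.0, "obj": 1.0}
    pos = mean(p for p, _ in scored)
    neg = mean(n for _, n in scored)
    return {"pos": pos, "neg": neg, "obj": 1.0 - pos - neg}


if __name__ == "__main__":
    for text in ["I love this, great video!", "So boring and stupid", "hello world"]:
        print(text, "->", sentiment_features(text))
```

Features of this kind could then be combined with other comment attributes as input to a classifier. Which features and classifiers the paper actually uses is not stated in the abstract.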

Cited By

KAMA S (2024). Meditation as a Leisure Activity: A Content and Comment Level Analysis (Boş Zaman Etkinliği Olarak Meditasyon: İçerik ve Yorum Düzeyi Analizi). GSI Journals Serie A: Advancements in Tourism Recreation and Sports Sciences. Online publication date: 9-Feb-2024.
https://doi.org/10.53353/atrss.1412002

Zhu K, Khern-am-nuai W, Yu Y (2024). Negative Peer Feedback and User Content Generation: Evidence From a Restaurant Review Platform. Production and Operations Management. Online publication date: 8-Feb-2024.
https://doi.org/10.1177/10591478231224941

Yang T, Hasan R, Ayday E, Vaidya J (2024). Discovering Privacy Harms from Education Technology by Analyzing User Reviews. Proceedings of the 23rd Workshop on Privacy in the Electronic Society, pages 186-192. Online publication date: 20-Nov-2024.
https://dl.acm.org/doi/10.1145/3689943.3695050

Index Terms

How useful are your comments?: analyzing and predicting youtube comments and comment ratings
1. Information systems
  1. Information systems applications

Information & Contributors

Information

Published In

WWW '10: Proceedings of the 19th international conference on World wide web

April 2010

1407 pages

ISBN:9781605587998

DOI:10.1145/1772690

General Chairs:
Michael Rappa, North Carolina State University, USA
Paul Jones, University of North Carolina at Chapel Hill, USA

Program Chairs:
Juliana Freire, University of Utah, USA
Soumen Chakrabarti, Indian Institute of Technology, India

Copyright © 2010 International World Wide Web Conference Committee (IW3C2).

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WWW '10

WWW '10: The 19th International World Wide Web Conference

April 26 - 30, 2010

Raleigh, North Carolina, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics

Total Citations: 190

Total Downloads: 5,643

Downloads (last 12 months): 388

Downloads (last 6 weeks): 27

Reflects downloads up to 11 Jan 2025

Citations

Cited By

Meghana K (2024). Artificial Intelligence and Sentiment Analysis in YouTube Comments: A Comprehensive Overview. 2024 2nd International Conference on Intelligent Data Communication Technologies and Internet of Things (IDCIoT), pages 1565-1572. Online publication date: 4-Jan-2024.
https://doi.org/10.1109/IDCIoT59759.2024.10467782

Bindhumol M, Singh T, Patra P (2024). Sentiment Analysis using YouTube Comments. 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), pages 1-7. Online publication date: 24-Jun-2024.
https://doi.org/10.1109/ICCCNT61001.2024.10724166

Rathod M, Yadav A, Phutane S, Bhise S (2024). Comment Compass: An Approach to Analyze YouTube Comments through Web Scraping, Sentiment Analysis, and Generative AI for Actionable Insights. 2024 First International Conference on Pioneering Developments in Computer Science & Digital Technologies (IC2SDT), pages 1-5. Online publication date: 2-Aug-2024.
https://doi.org/10.1109/IC2SDT62152.2024.10696483

Miyazaki K, Uchiba T, Kwak H, An J, Sasahara K (2024). The impact of toxic trolling comments on anti-vaccine YouTube videos. Scientific Reports, 14:1. Online publication date: 1-Mar-2024.
https://doi.org/10.1038/s41598-024-54925-w

ÇILGIN C (2023). Emotion Analysis on Youtube Comments for 2023 Turkish Presidential Elections (2023 Türkiye Cumhurbaşkanlığı Seçimleri için Youtube Yorumlarında Duygu Analizi). Yeni Medya Dergisi. Online publication date: 15-Dec-2023.
https://doi.org/10.55609/yenimedya.1339272

Alafwan B, Siallagan M, Putro U (2023). Comments Analysis on Social Media: A Review. ICST Transactions on Scalable Information Systems. Online publication date: 6-Sep-2023.
https://doi.org/10.4108/eetsis.3843

Möller A, Vermeer S, Baumgartner S (2023). Cutting Through the Comment Chaos: A Supervised Machine Learning Approach to Identifying Relevant YouTube Comments. Social Science Computer Review, 42:1, pages 162-185. Online publication date: 16-May-2023.
https://doi.org/10.1177/08944393231173895
