Abstract
Social media platforms have greatly influenced the dissemination of scientific work and provide good coverage of scientific papers. However, little is known about the general process by which papers spread from academia to social media, or about how various factors affect dissemination at different stages. In this paper, we propose a two-stage dissemination process to profile the diffusion of scientific papers from academia to social media. We adopt a two-step simultaneous equation modeling–artificial neural network approach to predict the retweet scale of scientific papers on Twitter by combining source-related and content-related factors. An analysis in the field of oncology suggests that an artificial neural network (ANN) whose input units are generated from a simultaneous equation model estimated by three-stage least squares (3SLS) can predict the retweet scale of scientific papers on Twitter with an accuracy of 78.05%. According to the normalized importance obtained from the ANN, most factors related to the information source play critical roles in promoting the dissemination of scientific papers, and the number of first-generation tweets has the largest impact on subsequent dissemination. Among the content-related predictors, tweets that include more URLs can provide richer information for audiences, thereby increasing the retweet scale of scientific papers. In addition, the influence of research topics on dissemination varies across audiences. The findings of this study contribute to the literature on the dissemination of scientific papers beyond academia and provide practical implications for scholarly communication.
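To make the notion of normalized importance concrete, the following is a minimal sketch that assumes the common convention of dividing each predictor's raw importance by the largest raw importance and expressing the result as a percentage. The abstract does not state the exact procedure used, so the function `normalized_importance`, the predictor names, and the scores below are hypothetical and are not results from the paper.

```python
def normalized_importance(importance):
    """Scale raw importance scores so that the largest predictor is 100%.

    Assumed convention: normalized = 100 * raw / max(raw).
    """
    top = max(importance.values())
    if top <= 0:
        raise ValueError("at least one importance score must be positive")
    return {name: 100.0 * value / top for name, value in importance.items()}


# Hypothetical raw importance scores, for illustration only.
raw = {
    "first_generation_tweets": 0.40,
    "url_count": 0.10,
    "topic_share": 0.05,
}

# Print predictors from most to least important.
ranked = sorted(normalized_importance(raw).items(), key=lambda kv: -kv[1])
for name, pct in ranked:
    print(f"{name}: {pct:.1f}%")
```

Under this convention the top-ranked predictor always scores 100%, so the remaining values read as importance relative to the strongest factor.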
Acknowledgements
This work was jointly supported by the National Natural Science Foundation of China (nos. 72004094, 71921002, and 72074112). In addition, we thank the anonymous reviewers for their valuable comments and suggestions, which helped us reshape and improve this paper.
Author information
Contributions
Y.M.: Conceptualization, Methodology, Writing—Original Draft, Formal analysis. Z.B.: Data curation, Methodology, Writing—Review & Editing, Software. Y.C.Z.: Writing—Review & Editing, Supervision. Jin Mao: Conceptualization, Methodology. G.L.: Supervision, Funding acquisition.
About this article
Cite this article
Ma, Y., Ba, Z., Zhao, Y. et al. Understanding and predicting the dissemination of scientific papers on social media: a two-step simultaneous equation modeling–artificial neural network approach. Scientometrics 126, 7051–7085 (2021). https://doi.org/10.1007/s11192-021-04051-5