Abstract
In-text citation analysis is one of the most frequently used methods in research evaluation. We are seeing significant growth in citation analysis through bibliometric metadata, primarily due to the availability of citation databases such as the Web of Science, Scopus, Google Scholar, Microsoft Academic, and Dimensions. Due to better access to full-text publication corpora in recent years, information scientists have gone far beyond traditional bibliometrics by tapping into advancements in full-text data processing techniques to measure the impact of scientific publications in contextual terms. This has led to technical developments in citation classifications, citation sentiment analysis, citation summarisation, and citation-based recommendation. This article aims to narratively review the studies on these developments. Its primary focus is on publications that have used natural language processing and machine learning techniques to analyse citations.
Similar content being viewed by others
References
Abu-Jbara, A., Ezra, J., & Radev, D. (2013). Purpose and polarity of citation: Towards nlp-based bibliometrics. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 596–606.
Agarwal, S., Choubey, L., & Yu, H. (2010). Automatically classifying the role of citations in biomedical articles. AMIA Annual Symposium Proceedings, 2010, 11.
Ahmad, R., & Afzal, M. T. (2018). CAD: An algorithm for citation-anchors detection in research papers. Scientometrics, 117(3), 1405–1423.
Aljaber, B., Martinez, D., Stokes, N., & Bailey, J. (2011). Improving Mesh classification of biomedical articles using citation contexts. Journal of Biomedical Informatics, 44(5), 881–896.
Ananiadou, S., Thompson, P., & Nawaz, R. (2013, March). Enhancing search: Events and their discourse context. In International conference on intelligent text processing and computational linguistics (pp. 318–334). Berlin, Heidelberg: Springer.
Anderson, M. H. (2006). How can we know what we think until we see what we said?: A citation and citation context analysis of Karl Weick’s The Social Psychology of Organizing. Organization Studies, 27(11), 1675–1692.
Angrosh, M. A., Cranefield, S., & Stanger, N. (2010). Context identification of sentences in related work sections using a conditional random field: Towards intelligent digital libraries. Proceedings of the 10th Annual Joint Conference on Digital Libraries,: 293–302.
Angrosh, M. A., Cranefield, S., & Stanger, N. (2013). Context identification of sentences in research articles: Towards developing intelligent tools for the research community. Natural Language Engineering, 19(4), 481–515.
Athar, A., & Teufel, S. (2012a). Context-enhanced citation sentiment detection. In Proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: Human language technologies (pp. 597–601).
Athar, A., & Teufel, S. (2012b). Detection of implicit citations for sentiment detection. In Proceedings of the Workshop on Detecting Structure in Scholarly Discourse (pp. 18–26).
Bakhti, K., Niu, Z., & Nyamawe, A. S. (2018). Semi-automatic annotation for citation function classification. 2018 International Conference on Control, Artificial Intelligence, Robotics & Optimization (ICCAIRO), 43–47.
Balabantaray, R. C., Sarma, C., & Jha, M. (2015). Document clustering using k-means and k-medoids. ArXiv Preprint. https://arxiv.org/abs/1502.07938.
Barrera, A., & Verma, R. (2012). Combining syntax and semantics for automatic extractive single-document summarization. International Conference on Intelligent Text Processing and Computational Linguistics, 366–377.
Batista-Navarro, R. T., Kontonatsios, G., Mihăilă, C., Thompson, P., Rak, R., Nawaz, R., et al. (2013, March). Facilitating the analysis of discourse phenomena in an interoperable NLP platform (pp. 559–571). Berlin, Heidelberg: Springer.
Bertin, M., & Atanassova, I. (2014). A study of lexical distribution in citation contexts through the IMRaD standard. PLoS Neglected Tropical Diseases, 1(200 920), 83–402.
Bertin, M., & Atanassova, I. (2017). The context of multiple in-text references and their signification. International Journal on Digital Libraries, 19(2), 127–138.
Bertin, M., Atanassova, I., Gingras, Y., & Larivière, V. (2016). The invariant distribution of references in scientific articles. Journal of the Association for Information Science and Technology, 67(1), 164–177.
Bonzi, S. (1982). Characteristics of a literature as predictors of relatedness between cited and citing works. Journal of the American Society for Information Science, 33(4), 208–216.
Bornmann, L., & Daniel, H.-D. (2008). What do citation counts measure? A review of studies on citing behavior. Journal of Documentation, 64(1), 45–80.
Bornmann, L., Haunschild, R., & Hug, S. E. (2018). Visualizing the context of citations referencing papers published by Eugene Garfield: A new type of keyword co-occurrence analysis. Scientometrics, 114(2), 427–437.
Boyack, K. W., van Eck, N. J., Colavizza, G., & Waltman, L. (2018). Characterizing in-text citations in scientific articles: A large-scale analysis. Journal of Informetrics, 12(1), 59–73.
Cao, Z., Li, W., & Wu, D. (2016). Polyu at cl-scisumm 2016. Proceedings of the Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL), 132–138.
Chang, Y.-W. (2013). A comparison of citation contexts between natural sciences and social sciences and humanities. Scientometrics, 96(2), 535–553.
Chubin, D. E., & Moitra, S. D. (1975). Content analysis of references: Adjunct or alternative to citation counting? Social Studies of Science, 5(4), 423–441.
Cohan, A., & Goharian, N. (2018). Scientific document summarization via citation contextualization and scientific discourse. International Journal on Digital Libraries, 19(2–3), 287–303.
Cohen, A. M., Hersh, W. R., Peterson, K., & Yen, P.-Y. (2006). Reducing workload in systematic review preparation using automated citation classification. Journal of the American Medical Informatics Association, 13(2), 206–219.
Conroy, J., & Davis, S. T. (2015). Vector space models for scientific document summarization. Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, 186–191.
Councill, I. G., Giles, C. L., & Kan, M.-Y. (2008). ParsCit: An Open-source CRF Reference String Parsing Package. LREC, 8, 661–667.
Cronin, B. (1984). The citation process. The Role and Significance of Citations in Scientific Communication, 103.
Dreyer, M., & Marcu, D. (2012, June). Hyter: Meaning-equivalent semantics for translation evaluation. In Proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: Human language technologies (pp. 162–171).
Ding, Y., Liu, X., Guo, C., & Cronin, B. (2013). The distribution of references across texts: Some implications for citation analysis. Journal of Informetrics, 7(3), 583–592.
Ding, Y., Zhang, G., Chambers, T., Song, M., Wang, X., & Zhai, C. (2014). Content-based citation analysis: The next generation of citation analysis. Journal of the Association for Information Science and Technology, 65(9), 1820–1833.
Dong, C., & Schäfer, U. (2011). Ensemble-style self-training on citation classification. Proceedings of 5th International Joint Conference on Natural Language Processing, 623–631.
Doslu, M., & Bingol, H. O. (2016). Context sensitive article ranking with citation context analysis. Scientometrics, 108(2), 653–671.
Elkiss, A., Shen, S., Fader, A., Erkan, G., States, D., & Radev, D. (2008). Blind men and elephants: What do citation summaries tell us about a research article? Journal of the American Society for Information Science and Technology, 59(1), 51–62.
Erikson, M. G., & Erlandson, P. (2014). A taxonomy of motives to cite. Social Studies of Science, 44(4), 625–637.
Erkan, G., & Radev, D. R. (2004). Lexrank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research, 22, 457–479.
Fang, H. (2017). A theoretical model of scientific impact based on citations. Malaysian Journal of Library & Information Science, 20(3), 1–13.
Finney, B. (1979). The reference characteristics of scientific texts [Ph.D. Thesis]. City University (London, England).
Frost, C. O. (1979). The use of citations in literary research: A preliminary classification of citation functions. The Library Quarterly, 49(4), 399–414.
Fujiwara, T., & Yamamoto, Y. (2015). Colil: A database and search service for citation contexts in the life sciences domain. Journal of Biomedical Semantics, 6(1), 38.
Garfield, E. (1965) Can citation indexing be automated. Statistical Association Methods for Mechanized Documentation, Symposium Proceedings, 269, 189–192.
Galgani, F., Compton, P., & Hoffmann, A. (2015). Lexa: Building knowledge bases for automatic legal citation classification. Expert Systems with Applications, 42(17–18), 6391–6407.
Gambhir, M., & Gupta, V. (2017). Recent automatic text summarization techniques: A survey. Artificial Intelligence Review, 47(1), 1–66.
Garfield, E. (1956). Citation indexes: New paths to scientific knowledge. The Chemical Bulletin, 43(4), 11.
Garfield E, E. (1955). Citation indexes to the old testament. Am. Documentation Inst.
Gupta, S., & Varma, V. (2017). Scientific article recommendation by using distributed representations of text and graph. Proceedings of the 26th International Conference on World Wide Web Companion, 1267–1268.
Hassan, S.-U., Akram, A., & Haddawy, P. (2017). Identifying important citations using contextual information from full text. Proceedings of the 17th ACM/IEEE Joint Conference on Digital Libraries, 41–48.
Hassan, S.-U., Imran, M., Iqbal, S., Aljohani, N. R., & Nawaz, R. (2018). Deep context of citations using machine-learning models in scholarly full-text articles. Scientometrics, 117(3), 1645–1662.
Hassan, S.-U., Iqbal, S., Imran, M., Aljohani, N. R., & Nawaz, R. (2018). Mining the context of citations in scientific publications. International Conference on Asian Digital Libraries, 316–322.
Hatzivassiloglou, V., & McKeown, K. R. (1997). Predicting the semantic orientation of adjectives. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, 174–181.
He, Q., Pei, J., Kifer, D., Mitra, P., & Giles, L. (2010). Context-aware citation recommendation. Proceedings of the 19th International Conference on World Wide Web, 421–430.
Hernández, M., & Gómez, J. M. (2015). Sentiment, polarity and function analysis in bibliometrics: A review. Natural Language Processing and Cognitive Science, 10, 149–160.
Hernández-Alvarez, M., & Gomez, J. M. (2016). Survey about citation context analysis: Tasks, techniques, and resources. Natural Language Engineering, 22(3), 327–349.
Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences, 102(46), 16569–16572.
Hoffmann, A., & Pham, S. B. (2003). Towards topic-based summarization for interactive document viewing. Proceedings of the 2nd International Conference on Knowledge Capture, 28–35.
Hooten, P. A. (1991). Frequency and functional use of cited documents in information science. Journal of the American Society for Information Science, 42(6), 397–404.
Hu, Z., Chen, C., & Liu, Z. (2013). Where are citations located in the body of scientific articles? A study of the distributions of citation locations. Journal of Informetrics, 7(4), 887–896.
Hu, Z., Chen, C., & Liu, Z. (2015). The recurrence of citations within a scientific article. ISSI.
Hu, Z., Lin, G., Sun, T., & Hou, H. (2017). Understanding multiply mentioned references. Journal of Informetrics, 11(4), 948–958.
Huang, W., Kataria, S., Caragea, C., Mitra, P., Giles, C. L., & Rokach, L. (2012). Recommending citations: Translating papers into references. CIKM, 12, 1910–1914.
Huang, W., Wu, Z., Liang, C., Mitra, P., & Giles, C. L. (2015). A neural probabilistic model for context based citation recommendation. Twenty-Ninth AAAI Conference on Artificial Intelligence.
Hurt, C. D. (1987). Conceptual citation differences in science, technology, and social sciences literature. Information Processing & Management, 23(1), 1–6.
Ikram, M. T., & Afzal, M. T. (2019). Aspect based citation sentiment analysis using linguistic patterns for better comprehension of scientific knowledge. Scientometrics, 119(1), 73–95.
Ikram, M. T., Afzal, M. T., & Butt, N. A. (2018). Automated citation sentiment analysis using high order n-grams: A preliminary investigation. Turkish Journal of Electrical Engineering & Computer Sciences, 26(4), 1922–1932.
Jahangir, M., Afzal, H., Ahmed, M., Khurshid, K., & Nawaz, R. (2017, September). An expert system for diabetes prediction using auto tuned multi-layer perceptron. In 2017 Intelligent systems conference (IntelliSys) (pp. 722–728). IEEE.
Jebari, C., Cobo, M. J., & Herrera-Viedma, E. (2018). A new approach for implicit citation extraction. International Conference on Intelligent Data Engineering and Automated Learning, 121–129.
Jeong, Y. K., Song, M., & Ding, Y. (2014). Content-based author co-citation analysis. Journal of Informetrics, 8(1), 197–211.
Jha, R., Jbara, A.-A., Qazvinian, V., & Radev, D. R. (2017). NLP-driven citation analysis for scientometrics. Natural Language Engineering, 23(1), 93–130.
Jochim, C., & Schütze, H. (2012). Towards a generic and flexible citation classifier based on a faceted classification scheme. Proceedings of COLING 2012, 1343–1358.
Judge, T. A., Cable, D. M., Colbert, A. E., & Rynes, S. L. (2007). What causes a management article to be cited—Article, author, or journal? Academy of Management Journal, 50(3), 491–506.
Kaplan, D., Iida, R., & Tokunaga, T. (2009). Automatic extraction of citation contexts for research paper summarization: A coreference-chain based approach. Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries (NLPIR4DL), 88–95.
Karimi, S., Moraes, L., Das, A., Shakery, A., & Verma, R. (2018). Citance-based retrieval and summarization using IR and machine learning. Scientometrics, 116(2), 1331–1366.
Klampfl, S., Rexha, A., & Kern, R. (2016). Identifying referenced text in scientific publications by summarisation and classification techniques. Proceedings of the Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL), 122–131.
Lawrence, S., Giles, C. L., & Bollacker, K. (1999). Digital libraries and autonomous citation indexing. Computer, 32(6), 67–71.
Li, X., He, Y., Meyers, A., & Grishman, R. (2013). Towards fine-grained citation function classification. Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013, 402–407.
Li, L., Zhang, Y., Mao, L., Chi, J., Chen, M., & Huang, Z. (2017). CIST@ CLSciSumm-17: Multiple features based citation linkage, classification and summarization. BIRNDL@ SIGIR, 2, 43–54.
Lin, C. Y., & Hovy, E. (2003). Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of the 2003 human language technology conference of the North American chapter of the association for computational linguistics (pp. 150–157).
Liu, M. (1993). Progress in documentation the complexities of citation practice: A review of citation studies. Journal of Documentation, 49(4), 370–408.
Lopez, P. (2009). GROBID: Combining automatic bibliographic data recognition and term extraction for scholarship publications. International Conference on Theory and Practice of Digital Libraries, 473–474.
Lu, K., Mao, J., Li, G., & Xu, J. (2016). Recognizing reference spans and classifying their discourse facets. Proceedings of the Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL), 139–145.
Ma, S., Xu, J., & Zhang, C. (2018). Automatic identification of cited text spans: A multi-classifier approach over imbalanced dataset. Scientometrics, 116, 1303–1330.
Ma, S., Zhang, C., & Liu, X. (2020). A review of citation recommendation: From textual content to enriched context. Scientometrics, 122(3), 1445–1472.
MacRoberts, M. H., & MacRoberts, B. R. (1989). Problems of citation analysis: A critical review. Journal of the American Society for Information Science, 40(5), 342–349.
Mäntylä, M. V., Graziotin, D., & Kuutila, M. (2018). The evolution of sentiment analysis—A review of research topics, venues, and top cited papers. Computer Science Review, 27, 16–32.
McCain, K., & Turner, K. (1989). Citation context analysis and aging patterns of journal articles in molecular genetics. Scientometrics, 17(1–2), 127–163.
McCallum, A. K., Nigam, K., Rennie, J., & Seymore, K. (2000). Automating the construction of internet portals with machine learning. Information Retrieval, 3(2), 127–163.
Mei, Q., & Zhai, C. (2008). Generating impact-based summaries for scientific literature. Proceedings of ACL-08: HLT, 816–824.
Mercer, R. E., Di Marco, C., & Kroon, F. W. (2004). The frequency of hedging cues in citation contexts in scientific writing. Conference of the Canadian Society for Computational Studies of Intelligence, 75–88.
Mohammad, S., Dorr, B., Egan, M., Hassan, A., Muthukrishan, P., Qazvinian, V., Radev, D., & Zajic, D. (2009). Using citations to generate surveys of scientific paradigms. Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 584–592.
Moravcsik, M. J., & Murugesan, P. (1975). Some results on the function and quality of citations. Social Studies of Science, 5(1), 86–92.
Nallapati, R. M., Ahmed, A., Xing, E. P., & Cohen, W. W. (2008). Joint latent topic models for text and citations. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 542–550.
Nicholson, J. M., Mordaunt, M., Lopez, P., Uppala, A., Rosati, D., Rodrigues, N. P., & Rife, S. Scite: A smart citation index that displays the context of citations and classifies their intent using deep learning. bioRxiv, 2021.
Nomoto, T. (2016). NEAL: A neurally enhanced approach to linking citation and reference. Proceedings of the Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL), 168–174.
Oppenheim, C., & Renn, S. P. (1978). Highly cited old papers and the reasons why they continue to be cited. Journal of the American Society for Information Science, 29(5), 225–231.
Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval, 2(1–2), 1–135.
Piao, S., Ananiadou, S., Tsuruoka, Y., Sasaki, Y., & McNaught, J. (2007). Mining opinion polarity relations of citations. International Workshop on Computational Semantics (IWCS), 366–371.
Prabha, C. G. (1983). Some aspects of citation behavior: A pilot study in business administration. Journal of the American Society for Information Science, 34(3), 202–206.
Pride, D., & Knoth, P. (2017). Incidental or influential?–A decade of using text-mining for citation function classification. 16th International Society of Scientometrics and Informetrics Conference.
Qazvinian, V., & Radev, D. R. (2010). Identifying non-explicit citing sentences for citation-based summarization. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 555–564.
Qazvinian, V., Radev, D. R., Mohammad, S. M., Dorr, B., Zajic, D., Whidby, M., & Moon, T. (2013). Generating extractive summaries of scientific paradigms. Journal of Artificial Intelligence Research, 46, 165–201.
Radev, D. R., Allison, T., Blair-Goldensohn, S., Blitzer, J., Celebi, A., Dimitrov, S., Drabek, E., Hakim, A., Lam, W., & Liu, D. (2004). MEAD-a platform for multidocument multilingual text summarization. Lisbon, Portugal: LREC.
Ritchie, A., Robertson, S., & Teufel, S. (2008). Comparing citation contexts for information retrieval. Proceedings of the 17th ACM Conference on Information and Knowledge Management, 213–222.
Safder, I., & Hassan, S. U. (2019). Bibliometric-enhanced information retrieval: A novel deep feature engineering approach for algorithm searching from full-text publications. Scientometrics, 119(1), 257–277.
Safer, M. A., & Tang, R. (2009). The psychology of referencing in psychology journal articles. Perspectives on Psychological Science, 4(1), 51–53.
Salton, G. (1963). Associative document retrieval techniques using bibliographic information. Journal of the ACM (JACM), 10(4), 440–457.
See, A., Liu, P. J., & Manning, C. D. (2017). Get to the point: Summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368.
Shadish, W. R., Tolliver, D., Gray, M., & Sen Gupta, S. K. (1995). Author judgements about works they cite: Three studies from psychology journals. Social Studies of Science, 25(3), 477–498.
Shardlow, M., Batista-Navarro, R., Thompson, P., Nawaz, R., McNaught, J., & Ananiadou, S. (2018). Identification of research hypotheses and new knowledge from scientific literature. BMC Medical Informatics and Decision Making, 18(1), 1–13.
Siddharthan, A., & Teufel, S. (2007). Whose idea was this, and why does it matter? Attributing scientific work to citations. Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, 316–323.
Small, H. (1982). Citation context analysis. Progress in Communication Sciences, 287–310.
Small, H. (2004). On the shoulders of Robert Merton: Towards a normative theory of citation. Scientometrics, 60(1), 71–79.
Small, H. (2018). Characterizing highly cited method and non-method papers using citation contexts: The role of uncertainty. Journal of Informetrics, 12(2), 461–480.
Small, H., Tseng, H., & Patek, M. (2017). Discovering discoveries: Identifying biomedical discoveries using citation contexts. Journal of Informetrics, 11(1), 46–62.
Sugiyama, K., Kumar, T., Kan, M.-Y., & Tripathi, R. C. (2010). Identifying citing sentences in research papers using supervised learning. Information Retrieval & Knowledge Management,(CAMP), 2010 International Conference On, 67–72.
Sula, C. A., & Miller, M. (2014). Citations, contexts, and humanistic discourse: Toward automatic extraction and classification. Literary and Linguistic Computing, 29(3), 452–464.
Tahamtan, I., & Bornmann, L. (2019). What do citation counts measure? An updated review of studies on citations in scientific documents published between 2006 and 2018. Scientometrics, 121(3), 1635–1684.
Tandon, N., & Jain, A. (2012). Citation context sentiment analysis for structured summarization of research papers. 35th German Conference on Artificial Intelligence, 98.
Tang, J., & Zhang, J. (2009, April). A discriminative approach to topic-based citation recommendation. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 572-579).
Taşkın, Z., & Al, U. (2018). A content-based citation analysis study based on text categorization. Scientometrics, 114(1), 335–357.
Teufel, S., & Moens, M. (2002). Summarizing scientific articles: Experiments with relevance and rhetorical status. Computational Linguistics, 28(4), 409–445.
Teufel, S., Siddharthan, A., & Tidhar, D. (2006). Automatic classification of citation function. Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, 103–110.
Thompson, P., Nawaz, R., McNaught, J., & Ananiadou, S. (2017). Enriching news events with meta-knowledge information. Language Resources and Evaluation, 51(2), 409–438.
Tkaczyk, D., & Bolikowski, L. (2015). Extracting contextual information from scientific literature using CERMINE system. Semantic Web Evaluation. Challenges, 93–104.
Tuarob, S., Kang, S. W., Wettayakorn, P., Pornprasit, C., Sachati, T., Hassan, S.-U., & Haddawy, P. (2019). Automatic classification of algorithm citation functions in scientific literature. IEEE Transactions on Knowledge and Data Engineering, 32(10), 1881–1896.
Turney, P. D. (2002). Thumbs up or thumbs down?: Semantic orientation applied to unsupervised classification of reviews. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, 417–424.
Valenzuela, M., Ha, V., & Etzioni, O. (2015). Identifying meaningful citations. Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence.
Verma, R., & Lee, D. (2017). Extractive summarization: Limits, compression, generalized model and heuristics. Computación y Sistemas, 21(4), 787–798.
Voos, H., & Dagaev, K. S. (1976). Are all citations equal? Or did we op. cit. your idem? Journal of Academic Librarianship, 1(6), 19–21.
Wang, C., & Blei, D. M. (2011). Collaborative topic modeling for recommending scientific articles. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 448–456.
Wang, M., Leng, D., Ren, J., Zeng, Y., & Chen, G. (2019). Sentiment classification based on linguistic patterns in citation context. CURRENT SCIENCE, 117(4), 606.
Wang, W., Villavicencio, P., & Watanabe, T. (2012). Analysis of reference relationships among research papers, based on citation context. International Journal on Artificial Intelligence Tools, 21(02), 1240004.
White, H. D. (2004). Citation analysis and discourse analysis revisited. Applied Linguistics, 25(1), 89–116.
Yang, L., Zheng, Y., Cai, X., Dai, H., Mu, D., Guo, L., & Dai, T. (2018). A LSTM based model for personalized context-aware citation recommendation. IEEE Access, 6, 59618–59627.
Yasunaga, M., Kasai, J., Zhang, R., Fabbri, A. R., Li, I., Friedman, D., & Radev, D. R. (2019). ScisummNet: A large annotated corpus and content-impact models for scientific paper summarization with citation networks. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 7386–7393.
Yin, X., Huang, J. X., & Li, Z. (2011). Mining and modeling linkage information from citation context for improving biomedical literature retrieval. Information Processing & Management, 47(1), 53–67.
Yousif, A., Niu, Z., Chambua, J., & Khan, Z. Y. (2019). Multi-task learning model based on recurrent convolutional neural networks for citation sentiment and purpose classification. Neurocomputing, 335, 195–205.
Zafar, L., Ahmed, U., & Islam, M. A. (2019). Citation context analysis using word-graph. 2019 2nd International Conference on Communication, Computing and Digital Systems (C-CODE), 120–125.
Zarrinkalam, F., & Kahani, M. (2013). SemCiR: A citation recommendation system based on a novel semantic distance measure. Program, 47(1), 92–112.
Zhang, G., Ding, Y., & Milojević, S. (2013). Citation content analysis (CCA): A framework for syntactic and semantic analysis of citation content. Journal of the American Society for Information Science and Technology, 64(7), 1490–1503.
Zhu, X., Turney, P., Lemire, D., & Vellino, A. (2015). Measuring academic influence: Not all citations are equal. Journal of the Association for Information Science and Technology, 66(2), 408–427.
Acknowledgements
The authors (Salem Alelyani and Saeed-Ul Hassan) are grateful for the financial support received from King Khalid University for this research Under Grant No. R.G.P2/100/41.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Iqbal, S., Hassan, SU., Aljohani, N.R. et al. A decade of in-text citation analysis based on natural language processing and machine learning techniques: an overview of empirical studies. Scientometrics 126, 6551–6599 (2021). https://doi.org/10.1007/s11192-021-04055-1
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11192-021-04055-1