Abstract
The growing number of patents forces patent applicants and examiners to spend more time and money searching for and citing prior patents. Deep learning has shown outstanding performance in recommending movies, music, products, and paper citations. However, despite many attempts to apply deep learning models to the patent domain, patent citation recommendation has received little attention. Because a patent citation is determined by a complex technological context, not merely by finding semantically similar preceding documents, it is necessary to understand the context in which the citation occurs. We therefore propose a dataset named PatentNet that captures technological citation context based on textual information, metadata, and examiner citation information for about 110,000 patents. We also propose a strong benchmark model that considers both the similarity of patent text and the technological citation context, using cooperative patent classification (CPC) codes. The proposed model has a two-stage structure: it first selects candidates based on textual information and pre-trained CPC embedding values, and then re-ranks the candidates using a deep learning model trained with examiner citation information. The proposed model achieved an MRR of 0.2506 on the benchmarking dataset, outperforming existing methods. The results show that learning the descriptive citation context, rather than simple text similarity alone, has an important influence on citation recommendation. The proposed model and dataset can help researchers understand technological citation context and assist patent examiners and applicants in finding prior patents to cite effectively.
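To make the two-stage structure and the MRR metric concrete, the following is a minimal sketch and not the authors' implementation. It assumes each patent is already represented by a single vector (a stand-in for the text and CPC embeddings), uses cosine similarity for candidate selection, and takes an arbitrary scoring function in place of the trained re-ranking model. All function names, vectors, and scores here are hypothetical.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence, Set


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_candidates(
    query_vec: Sequence[float], corpus: Dict[str, Sequence[float]], k: int
) -> List[str]:
    """Stage 1 (assumed form): keep the k patents most similar to the query."""
    ranked = sorted(
        corpus, key=lambda pid: cosine(query_vec, corpus[pid]), reverse=True
    )
    return ranked[:k]


def rerank(candidates: List[str], score: Callable[[str], float]) -> List[str]:
    """Stage 2 (assumed form): reorder candidates by a relevance score.

    `score` stands in for a trained model; higher means more relevant.
    """
    return sorted(candidates, key=score, reverse=True)


def mean_reciprocal_rank(
    rankings: List[List[str]], relevant: List[Set[str]]
) -> float:
    """Mean over queries of 1/rank of the first relevant item (0 if none found)."""
    total = 0.0
    for ranking, gold in zip(rankings, relevant):
        for rank, pid in enumerate(ranking, start=1):
            if pid in gold:
                total += 1.0 / rank
                break
    return total / len(rankings) if rankings else 0.0


# Toy data: three patents with made-up 2-D vectors and made-up re-ranking scores.
corpus = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0]}
query = [1.0, 0.0]
toy_scores = {"A": 0.2, "B": 0.7}

candidates = select_candidates(query, corpus, k=2)
final = rerank(candidates, lambda pid: toy_scores[pid])
mrr = mean_reciprocal_rank([final], [{"A"}])
```

In this toy run the first stage keeps patents A and B, the second stage moves B ahead of A, and the single query's truly cited patent A ends up at rank 2, giving a reciprocal rank of 0.5. An MRR of 0.2506 would correspond, under this definition, to the first correct citation appearing around rank 4 on average in reciprocal terms.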
Cite this article
Choi, J., Lee, J., Yoon, J. et al. A two-stage deep learning-based system for patent citation recommendation. Scientometrics 127, 6615–6636 (2022). https://doi.org/10.1007/s11192-022-04301-0
DOI: https://doi.org/10.1007/s11192-022-04301-0