One-to-many comparative summarization for patents

Zheng Liu ORCID: orcid.org/0000-0001-5391-1105¹,
Jialing Zhang¹,
Tingting Qin¹,
Yanwen Qu² &
…
Yun Li¹

Abstract

Patents bring commercial value to technology companies in modern business operations. However, companies bear high costs when handling patent applications or infringement cases. A common yet expensive task in this work is analyzing relevant patent literature: patents are lengthy and technically complicated, so reviewing them requires substantial human effort. This paper focuses on automatically analyzing the content that a patent shares with its relevant literature, specifically relevant patents, to help experts review the similarities among these patents. We formulate this as a one-to-many document comparison problem, which we address by generating a comparative summary of a given patent and its relevant patents. We extract essential technical features from semantic dependency trees based on sentences in claims, and construct a multi-relational graph to model the relevance between features and patents. The key to generating the comparative summary is selecting comparative essential technical features, which we formulate as an optimization problem and solve with a fast greedy algorithm. Experiments on real-world datasets and case studies demonstrate the effectiveness and efficiency of the proposed methods.
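The abstract does not state the objective function that the greedy algorithm optimizes, so the following is only a minimal sketch of greedy feature selection in general, not the authors' method. It assumes, purely for illustration, that each technical feature is linked to the set of relevant patents it also appears in, and that a good comparative summary covers as many relevant patents as possible within a fixed number of features. The function name `greedy_select`, the `budget` parameter, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_select(feature_to_patents: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick at most `budget` features by greedy coverage.

    Assumption (not from the paper): the value of a feature is the number of
    relevant patents it covers that no previously chosen feature covers.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(feature_to_patents)  # shallow copy; input is not mutated

    while remaining and len(chosen) < budget:
        # Iterate in sorted order so ties are broken alphabetically.
        best = max(sorted(remaining), key=lambda f: len(remaining[f] - covered))
        if not remaining[best] - covered:
            break  # no remaining feature adds new coverage
        chosen.append(best)
        covered |= remaining.pop(best)

    return chosen


# Hypothetical data: feature -> relevant patents sharing that feature.
features = {
    "rotating shaft": {"P1", "P2", "P3"},
    "sealing ring": {"P3", "P4", "P6"},
    "control unit": {"P1", "P2"},
    "heat sink": {"P5"},
}

summary_features = greedy_select(features, budget=2)
print(summary_features)
```

With this sample data and a budget of two, the sketch picks "rotating shaft" first (three patents covered) and then "sealing ring" (two additional patents). Greedy selection of this kind is a common choice for coverage-style objectives because each step only needs one pass over the remaining candidates.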

Funding

Funding was provided by Nanjing University of Posts and Telecommunications (Grant No. NY219084).

Author information

Authors and Affiliations

School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
Zheng Liu, Jialing Zhang, Tingting Qin & Yun Li
School of Computer Information and Engineering, Jiangxi Normal University, Nanchang, China
Yanwen Qu

Corresponding author

Correspondence to Zheng Liu.

About this article

Cite this article

Liu, Z., Zhang, J., Qin, T. et al. One-to-many comparative summarization for patents. Scientometrics 127, 1969–1993 (2022). https://doi.org/10.1007/s11192-022-04307-8

Received: 12 August 2021
Accepted: 10 February 2022
Published: 02 March 2022
Issue Date: April 2022
DOI: https://doi.org/10.1007/s11192-022-04307-8
