A Framework for Word Embedding Based Automatic Text Summarization and Evaluation
Figure 2. <p>The general workflow architecture of the proposed framework for automatic text summarization and evaluation. Given original sentences O = O<sub>1</sub>, O<sub>2</sub>, O<sub>3</sub> … O<sub>m</sub> with their corresponding sequences of words o<sub>ij</sub> = o<sub>i1</sub>, o<sub>i2</sub>, o<sub>i3</sub> … o<sub>in</sub>, assign each word the highest cosine similarity value obtained by comparing it with the keywords. The keywords are the nonstop words of the first sentence of the original document plus the top-k most frequent words in the document (we experimented with values of k between 1 and 15 and found k = 6 to be optimal). Then sum all word weights (Σ) and divide by the number of words (n) in the corresponding sentence to obtain its ranking score. The top-y sentences are taken as the reference summary.</p>
Figure 3. <p>Performance comparison of the individual best scores among variants of the metrics.</p>
<p>Pearson's correlation heat map of the automatic evaluation metrics. Deep green indicates a strong positive correlation, light green a medium correlation, and white a weak correlation.</p>
Abstract
1. Introduction
- We propose an automatic evaluation metric called WEEM4TS for evaluating the performance of text summarization systems. WEEM4TS is designed to assess the quality of system summaries with regard to how well they preserve the meaning of the original document. Hence, we believe it is an appropriate evaluation metric for all types of system summaries: extractive, abstractive, and mixed.
- We propose a method called WETS for determining the most important sentences of the original document, which can then serve as a reference summary; this helps to address the lack of ground-truth summaries. The reference summary is produced carefully so that WEEM4TS can evaluate system summaries against it.
- By comparing against six baseline text summarization systems, we validate the utility of the summaries generated by WETS. We also evaluate the performance of WEEM4TS by correlating its scores with human judgments. Further, we compare WEEM4TS with automatic evaluation metrics commonly used in text summarization and machine translation. The experimental results demonstrate that both WETS and WEEM4TS achieve promising performance.
2. Related Work
2.1. Text Summarization
2.2. Evaluation of Text Summarization Systems
3. Methods
3.1. Word Embedding Based Text Summarization
Algorithm 1: Word Embedding based Text Summarization
Input: Original_Text, word_embedding_model, stop_words
Output: top-y salient sentences
1: sentences ← Original_Text.split('. ')
2: first_sentence ← tokenize(sentences[0])
3: first_keywords ← [w for w in first_sentence if w not in stop_words]
4: second_keywords ← frequent_words(Original_Text)
5: keywords ← first_keywords + second_keywords
6: for sentence in sentences:
7:     tokenized ← gensim.utils.simple_preprocess(str(sentence))
8:     nonstopwords ← remove_stopwords(tokenized)
9:     sentweight ← 0
10:    for word in nonstopwords:
11:        weight ← max(cosine_similarity(word, keywords))
12:        sentweight ← sentweight + weight
13:    relevancescore[sentence] ← sentweight / len(nonstopwords)
14: top-y ← put_sentences_in_order(relevancescore)
15: return top-y
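Algorithm 1 can be condensed into a small runnable sketch. The stop-word list and the `EMBEDDINGS` dictionary below are illustrative stand-ins (a pre-trained Word2Vec, GloVe, or fastText model would replace them); the sketch follows the scoring rule above but is not the authors' implementation.

```python
import math
from collections import Counter

STOP_WORDS = {"a", "the", "is", "of", "and", "in", "by"}  # toy list

# Hypothetical 2-d vectors standing in for a pre-trained embedding model.
EMBEDDINGS = {
    "loves": (0.9, 0.1), "adores": (0.85, 0.2),
    "people": (0.2, 0.9), "person": (0.25, 0.85),
    "hurting": (-0.7, 0.3),
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def word_weight(word, keywords):
    """Highest cosine similarity between `word` and any keyword (1.0 for a keyword)."""
    if word in keywords:
        return 1.0
    if word not in EMBEDDINGS:
        return 0.0
    return max((cosine(EMBEDDINGS[word], EMBEDDINGS[k])
                for k in keywords if k in EMBEDDINGS), default=0.0)

def wets(text, top_y=3, k=6):
    """Rank sentences by mean word weight; return the top-y as a reference summary."""
    sentences = [s.strip() for s in text.lower().split(".") if s.strip()]
    tokenized = [[w for w in s.split() if w not in STOP_WORDS] for s in sentences]
    # Keywords: nonstop words of the first sentence + top-k frequent words.
    keywords = set(tokenized[0])
    freq = Counter(w for sent in tokenized for w in sent)
    keywords |= {w for w, _ in freq.most_common(k)}
    # Relevance score: sum of word weights divided by sentence length (Equation (1)).
    scores = [sum(word_weight(w, keywords) for w in sent) / max(len(sent), 1)
              for sent in tokenized]
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return [sentences[i] for i in ranked[:top_y]]
```

Because every nonstop word of the first sentence is a keyword, the first sentence scores 1.0 by construction, matching the observation in Appendix A that it is favored as a top salient sentence.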
3.2. Word Embedding Based Automatic Evaluation Metric for Text Summarization
Algorithm 2: Word Embedding based Evaluation Metric for Text Summarization
Input: Reference_Text, System_Summary, word_embedding_model, stop_words
Output: metric_score
1: sentweight, unigramrecall, weight, countbigram ← 0.0
2: for i in range(len(Reference_Text)):
3:     for n in range(len(System_Summary[i])):
4:         if System_Summary[i][n] in Reference_Text[i]:
5:             weight ← 1.0
6:         else if System_Summary[i][n] in vocabulary(word_embedding_model):
7:             weight ← max(cosine_similarity(System_Summary[i][n], Reference_Text[i]))
8:         else:
9:             weight ← 0.0
10:        sentweight ← sentweight + weight
11:    countbigram ← count_matching_bigrams(System_Summary[i], Reference_Text[i])
12:    if len(Reference_Text[i]) > 0:
13:        unigramrecall ← sentweight / len(Reference_Text[i])
14:    else:
15:        unigramrecall ← 0.0
16:    bigramprecision ← (countbigram / (len(System_Summary[i]) − 1)) * 100
17:    unigramrecall ← unigramrecall * 100
18: WEEM4TSscore ← (α * unigramrecall) + (β * bigramprecision)
19: return WEEM4TSscore
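The scoring loop above can be sketched as a minimal Python version. The similarity function is passed in (a real system would derive it from a word embedding model), the α/β mixing weights are illustrative placeholders rather than the tuned values from the paper, and bigram matching here is surface-only for brevity, whereas the metric also matches bigrams semantically.

```python
def modified_unigram_recall(system, reference, sim):
    """Each system word scores 1.0 on an exact match, otherwise its best
    similarity to any reference word; the sum is normalized by the
    reference length (Equation (2))."""
    total = sum(1.0 if w in reference
                else max((sim(w, r) for r in reference), default=0.0)
                for w in system)
    return total / len(reference) if reference else 0.0

def modified_bigram_precision(system, reference):
    """Share of system bigrams that also occur in the reference (Equation (3))."""
    sys_bigrams = list(zip(system, system[1:]))
    if not sys_bigrams:
        return 0.0
    ref_bigrams = set(zip(reference, reference[1:]))
    return sum(1 for b in sys_bigrams if b in ref_bigrams) / len(sys_bigrams)

def weem4ts(system, reference, sim, alpha=0.8, beta=0.2):
    """Linear combination of the two components, each scaled to 0-100."""
    recall = modified_unigram_recall(system, reference, sim) * 100
    precision = modified_bigram_precision(system, reference) * 100
    return alpha * recall + beta * precision
```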
3.2.1. Pre-processing
3.2.2. Word Weighting
3.2.3. Computing Modified Unigram Recall
3.2.4. Computing Modified Bigram Precision
4. Experiments
4.1. Dataset
4.2. Pre-Trained Word Embedding Models
4.3. Baselines
4.3.1. Text Summarization Systems
4.3.2. Automatic Evaluation Metrics
4.4. Results and Discussion
4.4.1. Evaluation Results of Text Summarization Systems
4.4.2. Correlation Results of Automatic Evaluation Metrics
4.4.3. Discussion
5. Conclusion and Future Work
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Appendix A. Demonstration of how word embedding models are used in both WETS and WEEM4TS
- For demonstration purposes, we use an original text and a reference text, shown in the following box.
Original Text: A person who loves someone is surely loved in turn by the others {0}. Researchers show that the more a person loves people around him/her the better healthy life he/she has {1}. People who love others without any condition are mostly lead happy life {2}. Contrary, there are people who are ignorant and get satisfaction by hurting others {3}. Some of them develop this behavior from their childhood {4}. Adoring others will give you immense happiness and peace {5}.
Reference Text: A person who loves someone is unquestionably loved successively by the others. In fact, there are also some people who get satisfaction by hurting others. Adoring others will provide you with immense happiness and peace.
- Using the full stop as a delimiter, we split the original text into six sentences. We refer to each sentence by the index value enclosed in curly brackets.
- Use the nonstop words of the first sentence as keywords. Accordingly, ‘person’, ‘loves’, ‘someone’, ‘surely’, ‘loved’, ‘turn’, and ‘others’ are all considered keywords.
- Update the keywords identified in step 3 by adding relevant frequent words. In this particular example, all of the frequent words already appear in the first sentence except the word ‘people’, which occurs twice in the original text. So, we update the keywords by adding the word ‘people’ to the list.
- To compute the relevance score of each sentence, we sum the weight values of all words in that sentence and divide by the number of words in the sentence, Equation (1). Obviously, the first sentence is favored as the first top salient sentence. Before calculating the relevance scores of the other sentences, the weight value of each word is determined by the following rules. If the word also exists in the first sentence, we temporarily remove it rather than assigning it a weight value; this discourages redundancy. If the word does not exist in the first sentence but does appear in the keywords, we assign it a weight value of +1. If the word does not exist in the keywords but is in the vocabulary of the word embedding model, we compute the cosine similarity between the word and all words in the keywords and take the highest value as that word's weight. For instance, the first word of the second sentence {1}, ‘Researchers’, exists neither in the first sentence nor in the list of keywords. Hence, to determine its weight value, we compute the cosine similarity between ‘Researchers’ and all keywords via Word2Vec, for which ‘people’ is the most similar word, with a cosine similarity of 0.104. Following the same process, we assign a weight value to every word, which is subsequently used to compute the relevance score. Based on the relevance scores, from highest to lowest, the following sentences are identified as the top-3 salient sentences, in order: {0}, {5}, and {3}. Accordingly, the WETS summary is: A person who loves someone is surely loved in turn by the others. Adoring others will give you immense happiness and peace. Contrary, there are people who are ignorant and get satisfaction by hurting others.
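The three weighting rules in the step above can be sketched as a small function. The similarity lookup is a hypothetical stub standing in for a Word2Vec model, seeded only with the 0.104 value from the ‘Researchers’/‘people’ example.

```python
# Illustrative similarity table; a real Word2Vec model would supply these values.
SIM_TABLE = {("researchers", "people"): 0.104}

def max_sim(word, keywords):
    """Best available similarity between `word` and any keyword (stubbed)."""
    return max((SIM_TABLE.get((word, k), 0.0) for k in keywords), default=0.0)

def weigh_sentence(words, first_sentence, keywords):
    """Apply the per-word weighting rules from the step above."""
    weights = []
    for w in words:
        if w in first_sentence:   # rule 1: temporarily drop duplicates of sentence {0}
            continue
        elif w in keywords:       # rule 2: a keyword match scores +1
            weights.append(1.0)
        else:                     # rule 3: highest embedding similarity to a keyword
            weights.append(max_sim(w, keywords))
    return weights
```

Here ‘researchers’ receives 0.104 via rule 3, while words repeated from the first sentence contribute nothing, exactly as in the worked example.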
- Assume we are required to evaluate system generated summaries, in this case the WETS and Lede-3 summaries, against a human generated reference summary.
- We use WEEM4TS to evaluate both the WETS and Lede-3 summaries against the human generated reference summary. As described in Section 3.2, the WEEM4TS score is a linear combination of the modified unigram recall (Equation (2)) and the modified bigram precision (Equation (3)). In the modified unigram recall, each word in the system summary is assigned the highest cosine similarity value among the words in the reference summary; the sum of these values is then divided by the number of words in the reference summary. In the modified bigram precision, we count the number of bigram matches and divide by the number of bigrams in the system summary. It should be noted that matching is based not only on the surface form but also on semantic similarity. This distinguishes the recall and precision employed in this study from their standard definitions, which is why we use the term modified. Accordingly, we compute the WEEM4TS variant scores for the WETS summary and the Lede-3 summary:
Reference Text: A person who loves someone is unquestionably loved successively by the others. In fact, there are also some people who get satisfaction by hurting others. Adoring others will provide you with immense happiness and peace.
Lede-3 summary: A person who loves someone is surely loved in turn by the others. Researchers show that the more a person loves people around him/her the better healthy life he/she has. People who love others without any condition are mostly lead happy life.
WEEM4TS score: [WEEM4TSw = 53.81, WEEM4TSg = 67.08, and WEEM4TSf = 58.72]
WETS summary: A person who loves someone is surely loved in turn by the others. Adoring others will give you immense happiness and peace. Contrary, there are people who are ignorant and get satisfaction by hurting others.
WEEM4TS score: [WEEM4TSw = 84.19, WEEM4TSg = 85.03, and WEEM4TSf = 84.70]
Summarization Style | Data Size
---|---
Extractive | 346
Abstractive | 405
Mixed | 330
Algorithm | Corpus | Vocabulary Size
---|---|---
Word2Vec | Google News | 3 million
GloVe | Common Crawl | 2.2 million
FastText | Common Crawl and Wikipedia | 2 million
Original Text: NEWSROOM Article
Mortgage rates are still pretty cheap, even though they have risen a full percentage point since hitting record lows about a year ago. And with the stronger economy pulling housing along, "this is a good time to get into the market," Anika Khan, Wells Fargo Securities senior economist, told CNBC's "Squawk Box" on Friday. But many first-time homebuyers are being left on the sidelines, watching all that cheap money inch higher because lending requirements remain tight. The average rate on a 30-year loan ticked up to 4.41 percent from 4.40 percent last week. Fifteen-year mortgages increased to 3.47 percent from 3.42 percent. In this video, Khan gives three reasons why it is still so hard for would-be buyers to purchase their first home.
NEWSROOM Reference Summary
Many first-time homebuyers are being left on the sidelines watching all that cheap money inch higher because among other reasons lending requirements remain tight.
Lede-3
Mortgage rates are still pretty cheap, even though they have risen a full percentage point since hitting record lows about a year ago. And with the stronger economy pulling housing along, "this is a good time to get into the market," Anika Khan, Wells Fargo Securities senior economist, told CNBC's "Squawk Box" on Friday. But many first-time homebuyers are being left on the sidelines, watching all that cheap money inch higher because lending requirements remain tight.
Abs-N
device could help save lives
TextRank
Mortgage rates are still pretty cheap, even though they have risen a full percentage point since hitting record lows about a year ago. The average rate on a 30-year loan ticked up to 4.41 percent from 4.40 percent last week. Fifteen-year mortgages increased to 3.47 percent from 3.42 percent.
Pointer-C
Market, anika khan, wells fargo securities senior economist, told cnbc is `` squawk box on friday. But many first-time homebuyers are being left on the 30-year loan ticked up to 4.41 percent from 4.40 percent last week. Fifteen-year mortgages increased to 3.47 percent from 3.42 percent. In this video, khan gives three reasons why it’s still so hard for would-be buyers to purchase their first home. The average rate on a 30-year loan ticked up to 4.41 percent from 4.40 risen a full percentage point since hitting record lows about a year ago. and with the
Pointer-S
mortgage rates are still pretty cheap, even though they’ve risen a full percentage point since hitting record lows about a year ago many reasons why the stronger economy pulling housing along, “this is a good time to get into the market, anika khan, wells fargo securities senior economist, told cnbc is” squawk box on friday many first-time homebuyers are being left on the sidelines, watching all that cheap money inch higher because lending requirements remain tight.
Pointer-N
mortgage rates are still pretty cheap—even though they have risen a full percentage point since hitting record lows about a year ago. and with the stronger economy pulling housing along
WETS
Mortgage rates are still pretty cheap, even though they have risen a full percentage point since hitting record lows about a year ago.
System | Extractive R-1 | Extractive R-2 | Extractive R-L | Abstractive R-1 | Abstractive R-2 | Abstractive R-L | Mixed R-1 | Mixed R-2 | Mixed R-L
---|---|---|---|---|---|---|---|---|---
Lede-3 | 52.5 | 42.0 | 44.0 | 10.5 | 1.2 | 5.2 | 21.5 | 10.2 | 12.2
WETS | 49.0 | 37.8 | 47.0 | 10.9 | 1.3 | 8.7 | 24.9 | 12.0 | 21.8
System | R-1 | R-2 | R-L | WEEM4TSw | WEEM4TSg | WEEM4TSf
---|---|---|---|---|---|---
Lede-3 | 24.30 | 11.10 | 17.60 | 26.84 | 38.81 | 33.93
TextRank | 22.50 | 8.50 | 16.60 | 28.22 | 41.24 | 35.95
Abs-N | 6.70 | 0.20 | 5.60 | 11.28 | 21.09 | 18.10
Pointer-C | 13.90 | 3.50 | 9.90 | 24.34 | 39.20 | 32.66
Pointer-S | 18.80 | 7.20 | 13.80 | 26.91 | 40.80 | 34.88
Pointer-N | 20.10 | 8.50 | 16.30 | 28.87 | 42.13 | 36.56
WETS | 21.51 | 9.98 | 19.56 | 34.20 | 44.89 | 39.61
Systems | Lede-3 | Fragments | TextRank | Abs-N | Pointer-C | Pointer-S | Pointer-N
---|---|---|---|---|---|---|---
#Summaries | 60 | 60 | 60 | 60 | 60 | 60 | 60
Correlation | |r| | |r| | |r| | |r| | |r| | |r| | |r|
ROUGE-1 | 0.232 | 0.559 * | 0.217 | 0.115 | 0.023 | −0.162 | −0.038
ROUGE-2 | 0.210 | 0.513 * | 0.230 | −0.077 | −0.005 | −0.138 | −0.025
ROUGE-L | 0.216 | 0.543 * | 0.215 | 0.099 | 0.044 | −0.131 | −0.049
BLEU-1 | −0.234 | 0.104 | 0.086 | 0.172 | −0.227 | −0.101 | −0.020
BLEU-2 | −0.142 | 0.241 | 0.128 | 0.160 | −0.241 | −0.098 | −0.007
BLEU-3 | −0.132 | 0.304 * | 0.156 | 0.154 | −0.228 | −0.098 | 0.004
BLEU-4 | −0.128 | 0.337 * | 0.172 | 0.151 | −0.215 | −0.098 | 0.011
CHRF1 | 0.266 * | 0.630 * | 0.258 * | 0.018 | 0.077 | −0.081 | −0.015
CHRF2 | 0.280 * | 0.633 * | 0.263 * | 0.020 | 0.089 | −0.081 | −0.008
CHRF3 | 0.284 * | 0.633 * | 0.264 * | 0.020 | 0.093 | −0.081 | −0.006
WEEM4TSw | 0.348 * | 0.643 * | 0.288 * | 0.150 | 0.123 | −0.021 | 0.014
WEEM4TSg | 0.333 * | 0.633 * | 0.268 * | 0.022 | 0.118 | 0.002 | 0.027
WEEM4TSf | 0.331 * | 0.631 * | 0.267 * | −0.033 | 0.106 | −0.029 | 0.018
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hailu, T.T.; Yu, J.; Fantaye, T.G. A Framework for Word Embedding Based Automatic Text Summarization and Evaluation. Information 2020, 11, 78. https://doi.org/10.3390/info11020078