Abstract
With the development of natural language processing technology, machine reading comprehension (MRC) has been widely applied in fields such as question answering systems and intelligence engineering. However, although numerous models have been proposed for the general domain, appropriate datasets and models are still lacking for specific domains such as anti-terrorism and homeland security. Therefore, a Chinese reading comprehension dataset for the anti-terrorism domain (ATCMRC) is constructed, and a generative machine reading comprehension model (AT-MT5) is proposed. ATCMRC was constructed in a semi-automated manner, and a domain-specific vocabulary was created from the dataset to assist AT-MT5. The model uses a hybrid attention layer and a controlled answer generation layer to enhance text perception. Finally, the ATCMRC dataset and the AT-MT5 model are evaluated against existing approaches. The experimental results show that ATCMRC covers key issues in the domain and presents challenging MRC tasks for existing models, while AT-MT5 achieves better results on the domain-specific dataset than existing methods.
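For reference, the following is a minimal sketch of the text-to-text generative MRC setup that the paper builds on, using the base mT5 model via Hugging Face Transformers. It is not the authors' AT-MT5 implementation (which adds a hybrid attention layer and controlled answer generation); the checkpoint name and prompt format are assumptions for illustration only.

```python
# Minimal sketch: generative question answering with a base mT5 checkpoint.
# This is NOT AT-MT5; it only illustrates the underlying text-to-text MRC setup.
from transformers import MT5ForConditionalGeneration, AutoTokenizer

model_name = "google/mt5-small"  # small public checkpoint, chosen for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)

# Question and passage are concatenated into one prompt; the exact prompt
# format used for ATCMRC is an assumption here.
question = "..."  # a domain question, e.g. from the anti-terrorism corpus
context = "..."   # the supporting passage
prompt = f"question: {question} context: {context}"

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(answer)
```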
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gao, F., Yang, Z., Gu, J., Cheng, J. (2022). Machine Reading Comprehension Based on Hybrid Attention and Controlled Generation. In: Zhao, X., Yang, S., Wang, X., Li, J. (eds) Web Information Systems and Applications. WISA 2022. Lecture Notes in Computer Science, vol 13579. Springer, Cham. https://doi.org/10.1007/978-3-031-20309-1_30
DOI: https://doi.org/10.1007/978-3-031-20309-1_30
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20308-4
Online ISBN: 978-3-031-20309-1
eBook Packages: Computer Science, Computer Science (R0)