Research Article

A Fuzzy Multigranularity Convolutional Neural Network With Double Attention Mechanisms for Measuring Semantic Textual Similarity

Published: 01 October 2024

Abstract

Semantic textual similarity (STS) is a fundamental task in natural language processing (NLP). Recent advances demonstrate that deep-learning-based approaches can measure STS with impressive accuracy. However, existing studies do not capture the spatial location of important information with their attention mechanisms, fail to model sentences from a whole-sentence perspective, and neglect semantic fuzziness. In this article, we propose a novel double attentive fuzzy convolutional neural network (DAFCNN) that measures STS more accurately while accounting for semantic fuzziness. First, we introduce a spatial attention module and combine it with improved attentive convolutions to build a multigranularity convolutional neural network in DAFCNN, which not only extracts critical spatial location information but also models sentences from multiple perspectives at the word and sentence levels. Second, DAFCNN pioneers a fuzzy learning module (FLM) for extracting fuzzy semantic features. Using a fuzzy membership function, a fuzzy aggregation operator, and trainable parameters and weights, FLM maps sentence representations into a fuzzy space, yielding representations with richer and more accurate semantics. Third, compared with various state-of-the-art STS models, DAFCNN reduces mean-square error by 14.57% and increases Pearson's γ by 4.61% and Spearman's ρ by 8.57% on STS scoring datasets, and it improves accuracy by 3.39% and F1-score by 2.41% on a semantic classification dataset. An ablation experiment demonstrates the effectiveness of each module of DAFCNN. Finally, the experimental results also indicate that FLM is a promising new attempt to incorporate fuzzy set theory into NLP.
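The abstract describes the FLM only at a high level: a fuzzy membership function, a fuzzy aggregation operator, and trainable parameters and weights that map a crisp sentence representation into a fuzzy space. The following PyTorch sketch illustrates that general idea only; it is not the paper's implementation. The class name, tensor shapes, and the specific choices of Gaussian membership functions and a trainable weighted-average aggregation are assumptions made for illustration.

import torch
import torch.nn as nn

class FuzzyLearningModule(nn.Module):
    """Illustrative sketch (not the paper's FLM): maps a crisp sentence
    vector to fuzzy membership degrees via Gaussian membership functions
    with trainable centres/widths, then aggregates them with trainable
    weights and concatenates the result with the crisp representation."""

    def __init__(self, dim: int, n_fuzzy_sets: int = 5):
        super().__init__()
        # One trainable (centre, width) pair per fuzzy set and feature dimension.
        self.centres = nn.Parameter(torch.linspace(-1.0, 1.0, n_fuzzy_sets).repeat(dim, 1))
        self.log_widths = nn.Parameter(torch.zeros(dim, n_fuzzy_sets))
        # Trainable aggregation weights over the fuzzy sets.
        self.agg_weights = nn.Parameter(torch.ones(n_fuzzy_sets) / n_fuzzy_sets)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) crisp sentence representation.
        x = x.unsqueeze(-1)                                   # (batch, dim, 1)
        widths = self.log_widths.exp()                        # strictly positive widths
        # Gaussian membership degrees in [0, 1]: (batch, dim, n_fuzzy_sets).
        membership = torch.exp(-((x - self.centres) ** 2) / (2 * widths ** 2 + 1e-8))
        # Weighted-average aggregation over the fuzzy sets -> (batch, dim).
        weights = torch.softmax(self.agg_weights, dim=0)
        fuzzy_repr = (membership * weights).sum(dim=-1)
        # Concatenate crisp and fuzzy views into an enriched representation.
        return torch.cat([x.squeeze(-1), fuzzy_repr], dim=-1)

# Example usage with made-up dimensions:
# flm = FuzzyLearningModule(dim=300)
# sentence_vecs = torch.randn(8, 300)   # batch of 8 sentence embeddings
# enriched = flm(sentence_vecs)         # shape (8, 600): crisp + fuzzy views

In this sketch, the Gaussian functions play the role of the membership function and the softmax-weighted sum plays the role of the aggregation operator; both the centres/widths and the aggregation weights are learned end to end, which mirrors the "trainable parameters and weights" mentioned in the abstract.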



            Information & Contributors

            Information

            Published In

            cover image IEEE Transactions on Fuzzy Systems
            IEEE Transactions on Fuzzy Systems  Volume 32, Issue 10
            Oct. 2024
            597 pages

            Publisher

            IEEE Press

