Word Sense Disambiguation Combining Knowledge Graph and Text Hierarchical Structure
Current supervised word sense disambiguation models achieve strong results by using annotated information on different word senses together with pre-trained language models. However, the semantic data of the supervised word sense disambiguation ...
Knowledge Graph Guided Neural Machine Translation with Dynamic Reinforce-selected Triples
Previous methods that incorporate knowledge graphs (KGs) into neural machine translation (NMT) adopt a static knowledge utilization strategy, which introduces many useless knowledge triples and makes the useful triples difficult for NMT to exploit. To ...
EiAP-BC: A Novel Emoji Aware Inter-Attention Pair Model for Contextual Spam Comment Detection Based on Posting Text
Detecting spam comments on social media remains an active research topic, especially on public figure/celebrity accounts in Indonesia. However, previous studies focused only on the comments themselves, without considering ...
Improving Cross-lingual Aspect-based Sentiment Analysis with Sememe Bridge
Aspect-based Sentiment Analysis (ABSA) comprises numerous subtasks including aspect term extraction (AE), opinion term extraction (OE), opinion pair extraction (PE), and triplet extraction (TE). Current research in Chinese ABSA primarily concentrates on ...
RPEPL: Tibetan Sentiment Analysis Based on Relative Position Encoding and Prompt Learning
Sentiment analysis is a critical task in natural language processing. Much research has been done on high-resource languages such as English and Chinese. However, Tibetan is an extremely low-resource language with little reference information. According ...
Empowering Digital Civility with an NLP Approach for Detecting 𝕏 (Formerly Known as Twitter) Cyberbullying through Boosted Ensembles
As the number of social networking sites grows, so do cyber dangers. Cyberbullying is harmful behavior that uses technology to intimidate, harass, or harm someone, often on social media platforms like 𝕏 (formerly known as Twitter). Machine learning is ...
An Extended Pattern Based Comprehensive Stemmer for the Urdu Language
The Urdu language is used by approximately 200 million people for spoken and written communications on a daily basis. There is a substantial amount of unstructured Urdu textual data that is available worldwide. Data mining techniques can be used to ...
EADRE: Event-type Aware Dynamic Representation of Entities in Document-level Event Extraction
Document-level event extraction aims to identify event types and arguments from one document. However, existing methods fail to consider semantic distinctions between multiple mentions of one entity and ignore dynamic representation of entities across ...
Context-Aware Adversarial Graph-Based Learning for Multilingual Grammatical Error Correction
- Naresh Kumar
- Parveen Kumar
- Sushreeta Tripathy
- Neelamani Samal
- Debasis Gountia
- Praveen Gatla
- Teekam Singh
Correcting grammatical errors in various language contexts is a crucial and challenging task in the field of natural language processing, commonly referred to as Multilingual Grammatical Error Correction. This paper elaborates the Adversarial Temporal ...
Improving Tone Recognition Performance using Wav2vec 2.0-Based Learned Representation in Yoruba, a Low-Resourced Language
- Saint Germes B. Bengono Obiang
- Norbert Tsopze
- Paulin Melatagia Yonta
- Jean-Francois Bonastre
- Tania Jiménez
Many sub-Saharan African languages are tone languages, and most are classified as low-resource because few resources and tools are available to process them. Identifying the tone associated with a ...