
arXiv:2408.01394 (cs)

[Submitted on 2 Aug 2024]

Title: Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features

Abstract: Many-to-many multilingual neural machine translation can be regarded as the process of integrating semantic features from the source sentences and linguistic features from the target sentences. To enhance zero-shot translation, models need to share knowledge across languages, which can be achieved through auxiliary tasks for learning a universal representation or a cross-lingual mapping. To this end, we propose to exploit both semantic and linguistic features across multiple languages to enhance multilingual translation. On the encoder side, we introduce a disentangling learning task that aligns encoder representations by disentangling semantic and linguistic features, thus facilitating knowledge transfer while preserving complete information. On the decoder side, we leverage a linguistic encoder to integrate low-level linguistic features to assist target-language generation. Experimental results on multilingual datasets demonstrate significant improvements in zero-shot translation over the baseline system while maintaining performance in supervised translation. Further analysis validates the effectiveness of our method in leveraging both semantic and linguistic features. The code is available at this https URL.
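
Illustrative note: the encoder-side idea in the abstract (align the semantic part of parallel sentences while keeping all information recoverable) can be pictured with a toy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the fixed slice standing in for a learned disentangler, the mean-squared alignment loss, and the hypothetical names `split_features`, `alignment_loss`, and `reconstruct`.

```python
from typing import List, Tuple

Vector = List[float]


def split_features(h: Vector, semantic_dim: int) -> Tuple[Vector, Vector]:
    """Split an encoder vector into (semantic, linguistic) parts.

    Assumption: a fixed slice stands in for a learned disentangler.
    """
    return h[:semantic_dim], h[semantic_dim:]


def alignment_loss(h_src: Vector, h_tgt: Vector, semantic_dim: int) -> float:
    """Mean squared distance between the semantic parts of a parallel pair.

    Only the semantic slice is aligned; the linguistic slice is free to
    differ between languages.
    """
    sem_src, _ = split_features(h_src, semantic_dim)
    sem_tgt, _ = split_features(h_tgt, semantic_dim)
    return sum((a - b) ** 2 for a, b in zip(sem_src, sem_tgt)) / semantic_dim


def reconstruct(semantic: Vector, linguistic: Vector) -> Vector:
    """Recombine both parts; with a plain slice this is lossless."""
    return semantic + linguistic


# Toy parallel pair: same semantic slice, different linguistic slice.
h_src = [1.0, 2.0, 0.5, -0.5]
h_tgt = [1.0, 2.0, -3.0, 4.0]
loss = alignment_loss(h_src, h_tgt, semantic_dim=2)
restored = reconstruct(*split_features(h_src, semantic_dim=2))
```

In this toy pair the alignment loss is zero even though the two vectors differ, because only the semantic slice is compared, and recombining the two slices returns the original vector unchanged.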

Comments:	Accepted by ACL2024 Findings
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2408.01394 [cs.CL]
	(or arXiv:2408.01394v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2408.01394

Submission history

From: Mengyu Bu
[v1] Fri, 2 Aug 2024 17:10:12 UTC (8,543 KB)
