Recent advances of low-resource neural machine translation
In recent years, neural network-based machine translation (MT) approaches have steadily superseded statistical MT (SMT) methods and represent the current state of the art in MT research. Neural MT (NMT) is a data-driven end-to-end learning ...
Recent advances in Apertium, a free/open-source rule-based machine translation platform for low-resource languages
- Tanmai Khanna,
- Jonathan N. Washington,
- Francis M. Tyers,
- Sevilay Bayatlı,
- Daniel G. Swanson,
- Tommi A. Pirinen,
- Irene Tang,
- Hèctor Alòs i Font
This paper presents an overview of Apertium, a free and open-source rule-based machine translation platform. Translation in Apertium happens through a pipeline of modular tools, and the platform continues to be improved as more language pairs are ...
Improving bilingual word embeddings mapping with monolingual context information
Bilingual word embeddings (BWEs) play a very important role in many natural language processing (NLP) tasks, especially cross-lingual tasks such as machine translation (MT) and cross-language information retrieval. Most existing methods to train ...
Tag-less back-translation
An effective method to generate a large number of parallel sentences for training improved neural machine translation (NMT) systems is the use of the back-translations of the target-side monolingual data. The standard back-translation method has ...
Jointly learning bilingual word embeddings and alignments
Learning bilingual word embeddings can be much easier if the parallel corpora are available with their words well aligned explicitly. However, in most cases, the parallel corpora only provide a set of pairs that are semantically equivalent to each ...
Word reordering on multiple pivots for the Japanese and Indonesian language pair
We investigated multiple pivot approaches for the Japanese and Indonesian (Ja–Id) language pair in phrase-based statistical machine translation (SMT). We used four languages as pivots: viz., English, Malaysian, Filipino, and the Myanmar language. ...
Investigating the roles of sentiment in machine translation
Parallel corpora are central to translation studies and contrastive linguistics. However, training machine translation (MT) systems by barely using the semantic aspects of a parallel corpus leads to unsatisfactory results, as then the trained MT ...
Simple measures of bridging lexical divergence help unsupervised neural machine translation for low-resource languages
Unsupervised Neural Machine Translation (UNMT) approaches have gained widespread popularity in recent times. Though these approaches show impressive translation performance using only monolingual corpora of the languages involved, these approaches ...