
CN109446535A - A Mongolian-Chinese neural machine translation method based on a triangular architecture - Google Patents

A Mongolian-Chinese neural machine translation method based on a triangular architecture (Download PDF)

Info

Publication number
CN109446535A
CN109446535A (application CN201811231026.1A)
Authority
CN
China
Prior art keywords
chinese
translation
mongol
language
training
Prior art date
Legal status
Pending
Application number
CN201811231026.1A
Other languages
Chinese (zh)
Inventor
苏依拉
孙晓骞
王宇飞
高芬
张振
牛向华
赵亚平
Current Assignee
Inner Mongolia University of Technology
Original Assignee
Inner Mongolia University of Technology
Priority date
Filing date
Publication date
Application filed by Inner Mongolia University of Technology
Priority to CN201811231026.1A
Publication of CN109446535A
Legal status: Pending (current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/40: Processing or translation of natural language
    • G06F 40/58: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G06F 40/42: Data-driven translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

To change the relatively backward state of machine translation for low-resource languages, the invention discloses a Mongolian-Chinese neural machine translation method based on a triangular architecture. Compared with existing purely end-to-end neural machine translation methods, the invention fully accounts for the limited parallel corpora available for low-resource languages, in particular the scarcity of Mongolian-Chinese parallel corpora, and improves the quality of Mongolian-Chinese translation under this scarcity. Second, a unified bidirectional EM algorithm is used to jointly optimize the Mongolian translation models. Finally, the pseudo samples generated by model x→z or z→y are mixed with genuine bilingual samples at a 1:1 ratio within the same mini-batch to stabilize the training process.

Description

A Mongolian-Chinese neural machine translation method based on a triangular architecture
Technical field
The invention belongs to the technical field of machine translation, and in particular relates to a Mongolian-Chinese neural machine translation method based on a triangular architecture.
Background technique
Machine translation uses computers to automatically translate one language into another and is one of the most powerful means of overcoming language barriers. In recent years, many large search and service companies such as Google and Baidu have carried out large-scale research on machine translation and made significant contributions to obtaining high-quality machine translation. As a result, translation between major languages is already close to human level, and millions of people use online translation systems and mobile translation applications to communicate across language barriers. In the recent wave of deep learning, machine translation has become a top priority and an important component in promoting global communication.
As a data-driven method, the performance of neural machine translation depends heavily on the scale, quality, and domain coverage of parallel corpora. However, apart from resource-rich languages such as Chinese and English, most languages in the world lack large-scale, high-quality, wide-coverage parallel corpora, and Mongolian is a typical example. How to make full use of existing data to alleviate the resource-scarcity problem has therefore become an important research direction in neural machine translation.
At present, end-to-end neural machine translation has developed rapidly. Its translation quality has improved significantly over traditional machine translation methods, and it has become the core technology of commercial online machine translation systems. For low-resource languages with scarce parallel corpora, however, translation quality still lags considerably behind translation between major languages.
Summary of the invention
In order to overcome the shortcomings of the prior art described above, the purpose of the present invention is to provide a Mongolian-Chinese neural machine translation method based on a triangular architecture. The method mainly addresses the limited availability of parallel corpora for low-resource languages, in particular the scarcity of Mongolian-Chinese parallel corpora. Mongolian (z) is treated as an intermediate latent variable and introduced into the translation between English (x) and Chinese (y), so that translation between English and Chinese is decomposed into two steps via Mongolian.
To achieve the above goals, the technical solution adopted by the present invention is as follows:
A Mongolian-Chinese neural machine translation method based on a triangular architecture, characterized in that Mongolian is used as an intermediate latent variable and introduced into the translation between a major language x (such as English, French, or Japanese) and Chinese, so that translation between the major language x and Chinese is decomposed into two steps via Mongolian. Under the objective of maximizing the likelihood of translating between the major language x and Chinese, a unified bidirectional EM algorithm jointly optimizes the Mongolian translation models and improves Mongolian-Chinese translation quality, while translation between any two of the languages still uses an end-to-end encoder-decoder structure.
Mongolian is denoted by z and Chinese by y. The unified bidirectional EM algorithm proceeds as follows:
Direction x → y
E-step: optimize θ_{z|x}, where θ_{z|x} denotes the parameter values at which the accuracy of translating Mongolian z from the major language x reaches the set threshold; p(z|x) denotes the accuracy of translating Mongolian z from the major language x and is the true distribution; p(z|y) denotes the accuracy of translating Mongolian z from Chinese y and is the fitting distribution of p(z|x); KL(·) is the Kullback-Leibler divergence, and KL(p(z|x)||p(z|y)) denotes the information loss incurred when the true distribution p(z|x) is fitted by p(z|y).
M-step: optimize θ_{y|z}, where θ_{y|z} denotes the parameter values at which the accuracy of translating Chinese y from Mongolian z reaches the set threshold; E_{z~p(z|x)} denotes the expectation over z when translating Mongolian z from the major language x; p(y|z) denotes the accuracy of translating Chinese y from Mongolian z; and D denotes the entire training set.
Direction y → x
E-step: optimize θ_{z|y}, where θ_{z|y} denotes the parameter values at which the accuracy of translating Mongolian z from Chinese y reaches the set threshold.
M-step: optimize θ_{x|z}, where θ_{x|z} denotes the parameter values at which the accuracy of translating the major language x from Mongolian z reaches the set threshold; E_{z~p(z|y)} denotes the expectation over z when translating Mongolian z from Chinese y; and p(x|z) denotes the accuracy of translating the major language x from Mongolian z.
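The formula images that accompany the E-steps and M-steps above are not reproduced in this text. Based on the symbol definitions given above and the standard EM treatment used for the lower bound below, the four optimization objectives can plausibly be written as the following sketch (a reconstruction, not the original filing's equations):

$$\text{x}\to\text{y:}\qquad \hat{\theta}_{z|x}=\arg\min_{\theta_{z|x}}\;\mathrm{KL}\!\big(p(z\mid x)\,\|\,p(z\mid y)\big),\qquad \hat{\theta}_{y|z}=\arg\max_{\theta_{y|z}}\;\sum_{(x,y)\in D}\mathbb{E}_{z\sim p(z\mid x)}\big[\log p(y\mid z)\big]$$

$$\text{y}\to\text{x:}\qquad \hat{\theta}_{z|y}=\arg\min_{\theta_{z|y}}\;\mathrm{KL}\!\big(p(z\mid y)\,\|\,p(z\mid x)\big),\qquad \hat{\theta}_{x|z}=\arg\max_{\theta_{x|z}}\;\sum_{(x,y)\in D}\mathbb{E}_{z\sim p(z\mid y)}\big[\log p(x\mid z)\big]$$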
p(z|x) and p(y|z) are trained jointly with the help of p(z|y), while p(z|y) and p(x|z) are trained jointly with the help of p(z|x).
p(z|x), p(z|y), p(y|z) and p(x|z) are each trained on samples that they themselves generate.
The training of the x → y translation is decomposed into two stages that train two translation models: the first model, x → z, generates a potential Mongolian translation z from an input sentence in the major language x, and the second model, z → y, generates the final Chinese translation y from that potential translation. Following the steps of the standard EM algorithm and Jensen's inequality, a lower bound L(Q) of p(y|x) over the entire training data D is obtained, where L(Q) is the lower bound of L(θ; D), L(θ; D) is the likelihood function, θ denotes the model parameters of p(z|x) and p(y|z) at which translation accuracy reaches the set threshold, p(y|x) denotes the accuracy of translating Chinese y from the major language x, and Q(z) is an arbitrary posterior distribution over z with Q(z) = p(z|x).
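The lower-bound formula itself is an image in the original filing; with the stated choice of Q(z) and Jensen's inequality it can plausibly be reconstructed as the following sketch:

$$\log p(y\mid x)=\log\sum_{z}p(z\mid x)\,p(y\mid z)\;\ge\;\mathbb{E}_{z\sim Q(z)}\big[\log p(y\mid z)\big]-\mathrm{KL}\!\big(Q(z)\,\|\,p(z\mid x)\big)$$

$$L(\theta;D)=\sum_{(x,y)\in D}\log p(y\mid x)\;\ge\;\sum_{(x,y)\in D}\Big(\mathbb{E}_{z\sim Q(z)}\big[\log p(y\mid z)\big]-\mathrm{KL}\!\big(Q(z)\,\|\,p(z\mid x)\big)\Big)\;=\;L(Q)$$

With Q(z) = p(z|x) the KL term vanishes, and the bound reduces to the expected log-likelihood maximized in the M-step.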
The generated translations are weighted and evaluated with the IBM model: translation probabilities are computed from the given bilingual data, where the bilingual data refers to the low-resource pairs (x; z) or (y; z).
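The patent names only "the IBM model"; the sketch below assumes IBM Model 1 and illustrates how lexical translation probabilities estimated from the low-resource pairs (x; z) or (y; z) could be used to weight a generated translation. The function names and the EM training loop are assumptions of this sketch, not the patent's implementation.

```python
import math
from collections import defaultdict

def ibm1_train(bitext, iterations=5):
    """Estimate IBM Model 1 lexical translation probabilities t(f|e) by EM.
    bitext: list of (source_tokens, target_tokens) pairs from the low-resource data."""
    t = defaultdict(lambda: 1e-3)                 # near-uniform initialization
    for _ in range(iterations):
        count, total = defaultdict(float), defaultdict(float)
        for src, tgt in bitext:
            src = ["<NULL>"] + src                # IBM Model 1 NULL word
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    c = t[(f, e)] / norm          # expected alignment count
                    count[(f, e)] += c
                    total[e] += c
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

def ibm1_logprob(t, src, tgt):
    """log P(tgt | src) under IBM Model 1, used here to weight a generated translation."""
    src = ["<NULL>"] + src
    logp = -len(tgt) * math.log(len(src))         # uniform alignment prior
    for f in tgt:
        logp += math.log(sum(t[(f, e)] for e in src) + 1e-12)
    return logp
```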
The pseudo samples generated by model p(z|x) or p(z|y) and the genuine bilingual samples are mixed at a 1:1 ratio within the same mini-batch to stabilize the training process.
The entire training-process algorithm of the present invention is as follows:
Input: resource-rich bilingual data (x; y); low-resource bilingual data (x; z) and (y; z)
Output: parameters θ_{z|x}, θ_{y|z}, θ_{z|y} and θ_{x|z}
1: pre-train p(z|x), p(z|y), p(x|z), p(y|z)
2: while not converged do
3: sample a parallel corpus (x, y) ∈ D between the major language x and Chinese y, a parallel corpus (x*, z*) ∈ D between the major language x and Mongolian z, and a parallel corpus (y*, z*) ∈ D between Chinese y and Mongolian z
4: direction x → y: optimize θ_{z|x}, θ_{y|z}
5: generate z′ from p(z′|x) and build the training batches B1 = (x, z′) ∪ (x*, z*) and B2 = (y, z′) ∪ (y*, z*), where z′ denotes the newly generated Mongolian corpus, B1 denotes the (x; z) parallel corpus after adding the pseudo parallel corpus produced during training, and B2 denotes the (y; z) parallel corpus after adding the pseudo parallel corpus
6: E-step: update θ_{z|x} with B1
7: M-step: update θ_{y|z} with B2
8: direction y → x: optimize θ_{z|y}, θ_{x|z}
9: generate z′ from p(z′|y) and build the training batches B3 = (y, z′) ∪ (y*, z*) and B4 = (x, z′) ∪ (x*, z*)
10: E-step: update θ_{z|y} with B3
11: M-step: update θ_{x|z} with B4
12: end while
13: return θ_{z|x}, θ_{y|z}, θ_{z|y}, θ_{x|z}
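For illustration, the listing above can be sketched in Python as follows. The TranslationModel class, its generate and train_step methods, the data layout, and the assumption that each corpus holds at least k pairs are placeholders introduced for this sketch; they are not the patent's implementation.

```python
import random

class TranslationModel:
    """Stand-in for an end-to-end encoder-decoder NMT model p(target | source)."""
    def __init__(self, name):
        self.name = name
    def generate(self, sources):
        # Decode a pseudo translation for every source sentence.
        return [f"<pseudo-{self.name}>({s})" for s in sources]
    def train_step(self, pairs):
        # One gradient update on a list of (source, target) sentence pairs.
        pass

def train(p_zx, p_yz, p_zy, p_xz, rich_xy, low_xz, low_yz, steps=1000, k=32):
    """rich_xy: list of (x, y) pairs; low_xz: list of (x, z) pairs; low_yz: list of (y, z) pairs."""
    for _ in range(steps):                                       # stands in for "while not converged"
        xs, ys = zip(*random.sample(rich_xy, k))

        # direction x -> y: optimize theta_{z|x} (E-step) and theta_{y|z} (M-step)
        z_prime = p_zx.generate(xs)                              # z' ~ p(z'|x)
        b1 = list(zip(xs, z_prime)) + random.sample(low_xz, k)   # B1 = (x, z') U (x*, z*)
        b2 = list(zip(ys, z_prime)) + random.sample(low_yz, k)   # B2 = (y, z') U (y*, z*)
        p_zx.train_step(b1)                                      # E-step: update theta_{z|x} with B1
        p_yz.train_step([(z, y) for y, z in b2])                 # M-step: update theta_{y|z} with B2

        # direction y -> x: optimize theta_{z|y} (E-step) and theta_{x|z} (M-step)
        z_prime = p_zy.generate(ys)                              # z' ~ p(z'|y)
        b3 = list(zip(ys, z_prime)) + random.sample(low_yz, k)   # B3 = (y, z') U (y*, z*)
        b4 = list(zip(xs, z_prime)) + random.sample(low_xz, k)   # B4 = (x, z') U (x*, z*)
        p_zy.train_step(b3)                                      # E-step: update theta_{z|y} with B3
        p_xz.train_step([(z, x) for x, z in b4])                 # M-step: update theta_{x|z} with B4

    return p_zx, p_yz, p_zy, p_xz
```

Each batch here concatenates the pseudo-parallel pairs with an equal number of genuine low-resource pairs, which also realizes the 1:1 mixing used to stabilize training.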
Compared with existing end-to-end neural machine translation methods, the present invention fully considers the limited availability of parallel corpora for low-resource languages, in particular the scarcity of Mongolian-Chinese parallel corpora, and improves Mongolian-Chinese translation quality under the premise of scarce parallel corpora. Second, a unified bidirectional EM algorithm is used to jointly optimize the Mongolian translation models. Finally, the pseudo samples generated by model x → z or z → y and the genuine bilingual samples are mixed at a 1:1 ratio in the same mini-batch to stabilize the training process.
Detailed description of the invention
Fig. 1 is a diagram of the triangular learning architecture for low-resource NMT.
Fig. 2 shows the end-to-end encoder-decoder structure.
Specific embodiment
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings and examples.
Problem description: a Mongolian-Chinese neural machine translation method based on a triangular architecture, in which a unified bidirectional EM algorithm jointly optimizes the Mongolian translation models.
Mongolian is denoted by z, Chinese by y, and English by x. The unified bidirectional generalized EM process is as follows:
The training of the x → y translation is decomposed into two stages that train two translation models: the first model, x → z, generates a potential translation z from an input sentence in x, and the second model, z → y, generates the final translation in y from that potential translation; both processes use the end-to-end encoder-decoder structure. In addition, following the steps of the standard EM algorithm and Jensen's inequality, a lower bound L(Q) of p(y|x) over the entire training data D is obtained, where L(Q) is the lower bound of L(θ; D), L(θ; D) is the likelihood function, θ denotes the model parameters of p(z|x) and p(y|z) at which translation accuracy reaches the set threshold, p(z|x) denotes the accuracy of translating language z from language x, p(y|z) denotes the accuracy of translating language y from language z, p(y|x) denotes the accuracy of translating language y from language x, D denotes the entire training set, and Q(z) is an arbitrary posterior distribution over z with Q(z) = p(z|x).
Direction x → y
E-step: optimize θ_{z|x}. To minimize the error between L(Q) and L(θ; D), the Kullback-Leibler divergence between p(z|x) and p(z|y) is minimized, where θ_{z|x} denotes the parameter values at which the accuracy of translating Mongolian z from the major language x reaches the set threshold; p(z|x) denotes the accuracy of translating Mongolian z from the major language x and is the true distribution; p(z|y) denotes the accuracy of translating Mongolian z from Chinese y and is the fitting distribution of p(z|x); KL(·) is the Kullback-Leibler divergence, and KL(p(z|x)||p(z|y)) denotes the information loss incurred when the true distribution p(z|x) is fitted by p(z|y).
M-step: optimize θ_{y|z}, where θ_{y|z} denotes the parameter values at which the accuracy of translating language y from language z reaches the set threshold, and E_{z~p(z|x)} denotes the expectation over z when translating z from x.
Direction y → x
E-step: optimize θ_{z|y}, where θ_{z|y} denotes the parameter values at which the accuracy of translating language z from language y reaches the set threshold, and p(z|y) denotes the accuracy of translating language z from language y.
M-step: optimize θ_{x|z}, where θ_{x|z} denotes the parameter values at which the accuracy of translating language x from language z reaches the set threshold, E_{z~p(z|y)} denotes the expectation over z when translating z from y, p(x|z) denotes the accuracy of translating language x from language z, and p(x|y) denotes the accuracy of translating language x from language y.
On the basis of the above derivation, the overall system structure is analyzed, as shown in Fig. 1: the dashed arrows indicate the direction of p(y|x), in which p(z|x) and p(y|z) are trained jointly with the help of p(z|y), and the solid arrows indicate the direction of p(x|y), in which p(z|y) and p(x|z) are trained jointly with the help of p(z|x). Similar to reinforcement learning, the models p(z|x), p(z|y), p(y|z) and p(x|z) are trained on samples that they themselves generate.
In the above bidirectional training process, each E-step is performed by gradient-descent training; a sketch of the gradient for the x → y direction is given below.
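The gradient formula referenced here is an image in the original filing and is not reproduced. One plausible form, obtained by applying the log-derivative (score-function) identity to the E-step KL objective defined above, is the following sketch (η denotes a learning rate; both the estimator and the notation are assumptions of this sketch):

$$\nabla_{\theta_{z|x}}\,\mathrm{KL}\!\big(p(z\mid x)\,\|\,p(z\mid y)\big)=\mathbb{E}_{z\sim p(z\mid x)}\Big[\nabla_{\theta_{z|x}}\log p(z\mid x)\,\big(\log p(z\mid x)-\log p(z\mid y)\big)\Big],\qquad \theta_{z|x}\leftarrow\theta_{z|x}-\eta\,\nabla_{\theta_{z|x}}\mathrm{KL}$$

In practice the expectation would be approximated with the sampled pseudo translations z′ in batch B1.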
Training process algorithm is as follows:
Input: resource-rich bilingual data (x; y); low-resource bilingual data (x; z) and (y; z)
Output: parameters θ_{z|x}, θ_{y|z}, θ_{z|y} and θ_{x|z}
1: pre-train p(z|x), p(z|y), p(x|z), p(y|z)
2: while not converged do
3: sample a parallel corpus (x, y) ∈ D between language x and language y, a parallel corpus (x*, z*) ∈ D between language x and language z, and a parallel corpus (y*, z*) ∈ D between language y and language z
4: direction x → y: optimize θ_{z|x}, θ_{y|z}
5: generate z′ from p(z′|x) and build the training batches B1 = (x, z′) ∪ (x*, z*) and B2 = (y, z′) ∪ (y*, z*), where z′ denotes the newly generated corpus of language z, B1 denotes the (x; z) parallel corpus after adding the pseudo parallel corpus produced during training, and B2 denotes the (y; z) parallel corpus after adding the pseudo parallel corpus
6: E-step: update θ_{z|x} with B1
7: M-step: update θ_{y|z} with B2
8: direction y → x: optimize θ_{z|y}, θ_{x|z}
9: generate z′ from p(z′|y) and build the training batches B3 = (y, z′) ∪ (y*, z*) and B4 = (x, z′) ∪ (x*, z*)
10: E-step: update θ_{z|y} with B3
11: M-step: update θ_{x|z} with B4
12: end while
13: return θ_{z|x}, θ_{y|z}, θ_{z|y}, θ_{x|z}
Guarantee of training-process stability:
To guarantee the stability of the training process, the pseudo samples generated by model x → z or z → y and the genuine bilingual samples are mixed at a 1:1 ratio within the same mini-batch.
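A minimal sketch of this stabilization step, assuming sentence pairs stored as Python tuples; the helper name and the batch size are placeholders, not the patent's code.

```python
import random

def stabilized_minibatch(pseudo_pairs, real_pairs, batch_size=64):
    """Mix pseudo samples (built from generated z') and genuine bilingual samples 1:1."""
    half = batch_size // 2
    batch = random.sample(pseudo_pairs, half) + random.sample(real_pairs, half)
    random.shuffle(batch)      # interleave so every mini-batch contains both sources
    return batch
```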
The following is the process of translating, with the end-to-end encoder-decoder structure, between any of the bilingual pairs English-Mongolian, Mongolian-English, Mongolian-Chinese and Chinese-Mongolian:
Referring to Fig. 2: first (upper half of Fig. 2), the encoder encodes the source-language sentence and generates a group of context semantic vectors, which serve as the encoding of the user's intended meaning. During generation (lower half of Fig. 2), the decoder, combined with an attention mechanism, generates the target-language words one by one; while generating each word it considers the context semantic vectors corresponding to the input, so that the generated content is consistent with the meaning of the source language.
The specific translation steps are as follows:
1. The encoder reads the input source-language sentence;
2. The encoder encodes the read sentence into hidden-layer states using a recurrent neural network, forming a group of context semantic vectors;
3. The decoder, combined with the attention mechanism, generates the target-language words one by one, as in the sketch below.
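A minimal PyTorch sketch of steps 1 to 3: a recurrent encoder producing context vectors and a decoder that applies dot-product attention while generating each target word. The layer sizes, the GRU choice, and the attention form are assumptions for illustration, not the patent's specific architecture.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)

    def forward(self, src_ids):                       # src_ids: (batch, src_len) token ids
        states, _ = self.rnn(self.embed(src_ids))     # (batch, src_len, 2*hid_dim)
        return states                                 # the group of context semantic vectors

class AttnDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.GRUCell(emb_dim + 2 * hid_dim, hid_dim)
        self.query = nn.Linear(hid_dim, 2 * hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, prev_token, hidden, enc_states):
        # attention: score every source position against the current decoder state
        q = self.query(hidden).unsqueeze(1)                        # (batch, 1, 2*hid_dim)
        weights = torch.softmax((q * enc_states).sum(-1), dim=-1)  # (batch, src_len)
        context = (weights.unsqueeze(-1) * enc_states).sum(1)      # (batch, 2*hid_dim)
        # generate the next target word conditioned on the attended context
        hidden = self.cell(torch.cat([self.embed(prev_token), context], dim=-1), hidden)
        return self.out(hidden), hidden                            # logits over target vocab, new state
```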

Claims (9)

1. A Mongolian-Chinese neural machine translation method based on a triangular architecture, characterized in that Mongolian is used as an intermediate latent variable and introduced into the translation between a major language x and Chinese; the translation between the major language x and Chinese is decomposed into two steps via Mongolian; under the objective of maximizing the likelihood of translating between the major language x and Chinese, a unified bidirectional EM algorithm jointly optimizes the Mongolian translation models to improve Mongolian-Chinese translation quality; and the translation between any two of the languages still uses an end-to-end encoder-decoder structure.
2. The Mongolian-Chinese neural machine translation method based on a triangular architecture according to claim 1, characterized in that Mongolian is denoted by z and Chinese by y, and the unified bidirectional EM algorithm proceeds as follows:
Direction x → y
E-step: optimize θ_{z|x}, where θ_{z|x} denotes the parameter values at which the accuracy of translating Mongolian z from the major language x reaches the set threshold, p(z|x) denotes the accuracy of translating Mongolian z from the major language x and is the true distribution, p(z|y) denotes the accuracy of translating Mongolian z from Chinese y and is the fitting distribution of p(z|x), KL(·) is the Kullback-Leibler divergence, and KL(p(z|x)||p(z|y)) denotes the information loss incurred when the true distribution p(z|x) is fitted by p(z|y);
M-step: optimize θ_{y|z}, where θ_{y|z} denotes the parameter values at which the accuracy of translating Chinese y from Mongolian z reaches the set threshold, E_{z~p(z|x)} denotes the expectation over z when translating Mongolian z from the major language x, p(y|z) denotes the accuracy of translating Chinese y from Mongolian z, and D denotes the entire training set;
Direction y → x
E-step: optimize θ_{z|y}, where θ_{z|y} denotes the parameter values at which the accuracy of translating Mongolian z from Chinese y reaches the set threshold;
M-step: optimize θ_{x|z}, where θ_{x|z} denotes the parameter values at which the accuracy of translating the major language x from Mongolian z reaches the set threshold, E_{z~p(z|y)} denotes the expectation over z when translating Mongolian z from Chinese y, and p(x|z) denotes the accuracy of translating the major language x from Mongolian z.
3. The Mongolian-Chinese neural machine translation method based on a triangular architecture according to claim 2, characterized in that p(z|x) and p(y|z) are trained jointly with the help of p(z|y), and p(z|y) and p(x|z) are trained jointly with the help of p(z|x).
4. The Mongolian-Chinese neural machine translation method based on a triangular architecture according to claim 2, characterized in that p(z|x), p(z|y), p(y|z) and p(x|z) are each trained on samples that they themselves generate.
5. The Mongolian-Chinese neural machine translation method based on a triangular architecture according to claim 2, characterized in that the training of the x → y translation is decomposed into two stages that train two translation models: the first model, x → z, generates a potential Mongolian translation z from an input sentence in the major language x, and the second model, z → y, generates the final Chinese translation y from that potential translation; following the steps of the standard EM algorithm and Jensen's inequality, a lower bound L(Q) of p(y|x) over the entire training data D is obtained, where L(Q) is the lower bound of L(θ; D), L(θ; D) is the likelihood function, θ denotes the model parameters of p(z|x) and p(y|z) at which translation accuracy reaches the set threshold, p(y|x) denotes the accuracy of translating Chinese y from the major language x, and Q(z) is an arbitrary posterior distribution over z with Q(z) = p(z|x).
6. The Mongolian-Chinese neural machine translation method based on a triangular architecture according to claim 2, characterized in that the generated translations are weighted and evaluated with the IBM model, with translation probabilities computed from the given bilingual data, the bilingual data being the low-resource pairs (x; z) or (y; z).
7. The Mongolian-Chinese neural machine translation method based on a triangular architecture according to claim 5, characterized in that the pseudo samples generated by model p(z|x) or p(z|y) and the genuine bilingual samples are mixed at a 1:1 ratio within the same mini-batch to stabilize the training process.
8. The Mongolian-Chinese neural machine translation method based on a triangular architecture according to claim 6, characterized in that the entire training-process algorithm is as follows:
Input: resource-rich bilingual data (x; y); low-resource bilingual data (x; z) and (y; z)
Output: parameters θ_{z|x}, θ_{y|z}, θ_{z|y} and θ_{x|z}
1: pre-train p(z|x), p(z|y), p(x|z), p(y|z)
2: while not converged do
3: sample a parallel corpus (x, y) ∈ D between the major language x and Chinese y, a parallel corpus (x*, z*) ∈ D between the major language x and Mongolian z, and a parallel corpus (y*, z*) ∈ D between Chinese y and Mongolian z
4: direction x → y: optimize θ_{z|x}, θ_{y|z}
5: generate z′ from p(z′|x) and build the training batches B1 = (x, z′) ∪ (x*, z*) and B2 = (y, z′) ∪ (y*, z*), where z′ denotes the newly generated Mongolian corpus, B1 denotes the (x; z) parallel corpus after adding the pseudo parallel corpus produced during training, and B2 denotes the (y; z) parallel corpus after adding the pseudo parallel corpus
6: E-step: update θ_{z|x} with B1
7: M-step: update θ_{y|z} with B2
8: direction y → x: optimize θ_{z|y}, θ_{x|z}
9: generate z′ from p(z′|y) and build the training batches B3 = (y, z′) ∪ (y*, z*) and B4 = (x, z′) ∪ (x*, z*)
10: E-step: update θ_{z|y} with B3
11: M-step: update θ_{x|z} with B4
12: end while
13: return θ_{z|x}, θ_{y|z}, θ_{z|y}, θ_{x|z}
9. The Mongolian-Chinese neural machine translation method based on a triangular architecture according to claim 1, characterized in that the major language x is English.
CN201811231026.1A 2018-10-22 2018-10-22 A Mongolian-Chinese neural machine translation method based on a triangular architecture Pending CN109446535A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811231026.1A CN109446535A (en) 2018-10-22 2018-10-22 A Mongolian-Chinese neural machine translation method based on a triangular architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811231026.1A CN109446535A (en) 2018-10-22 2018-10-22 A Mongolian-Chinese neural machine translation method based on a triangular architecture

Publications (1)

Publication Number Publication Date
CN109446535A true CN109446535A (en) 2019-03-08

Family

ID=65547586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811231026.1A Pending CN109446535A (en) 2018-10-22 2018-10-22 A Mongolian-Chinese neural machine translation method based on a triangular architecture

Country Status (1)

Country Link
CN (1) CN109446535A (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170060854A1 (en) * 2015-08-25 2017-03-02 Alibaba Group Holding Limited Statistics-based machine translation method, apparatus and electronic device
CN107248409A (en) * 2017-05-23 2017-10-13 四川欣意迈科技有限公司 A kind of multi-language translation method of dialect linguistic context
CN107967262A (en) * 2017-11-02 2018-04-27 内蒙古工业大学 A kind of neutral net covers Chinese machine translation method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHUO REN ET AL.: "Triangular Architecture for Rare Language Translation", arXiv *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472252A (en) * 2019-08-15 2019-11-19 昆明理工大学 The method of the more neural machine translation of the Chinese based on transfer learning
CN110472252B (en) * 2019-08-15 2022-12-13 昆明理工大学 Method for translating Hanyue neural machine based on transfer learning
CN111382568A (en) * 2020-05-29 2020-07-07 腾讯科技(深圳)有限公司 Training method and device of word segmentation model, storage medium and electronic equipment
CN112101047A (en) * 2020-08-07 2020-12-18 江苏金陵科技集团有限公司 Machine translation method for matching language-oriented precise terms


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190308