CN110442880B - Translation method, device and storage medium for machine translation - Google Patents
- Publication number
- CN110442880B (application CN201910721252.6A)
- Authority
- CN
- China
- Prior art keywords
- translation
- word
- penalty
- beam search
- evaluation function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Machine Translation (AREA)
Abstract
The invention discloses a translation method, device and storage medium for machine translation, comprising: receiving a source sentence to be translated; performing word segmentation on the source sentence; obtaining the part of speech of each segmented word; integrating, according to a word vector model, the part of speech into the word vector corresponding to each word to obtain a fused word vector sequence; inputting the word vector sequence into an encoder-decoder model to obtain an encoding-decoding result; and evaluating that result with a beam search evaluation function, wherein the evaluation function includes a penalty term based on length comparison and a penalty term based on repeat detection; a translation is then obtained from the evaluation result. Applying the embodiments of the present invention alleviates repeated fragments and omitted source content in the translation, with wide applicability, strong specificity, and high translation quality.
Description
Technical Field
The present invention relates to the technical field of machine translation improvement, and in particular to a translation method, device and storage medium for machine translation.
Background
Language is the most important carrier of everyday human communication and has a profound influence on the development of society as a whole, so automatic machine translation has become an urgent need. Automating translation between different languages has enormous application potential.
At present, rule-based machine translation methods require professional linguists to formulate large numbers of rules, with high labor costs and poor scalability. Interlingua-based methods require designing a universal intermediate language, which is prohibitively difficult and lacks robustness. Statistical machine translation lowers labor costs and improves scalability, but its output quality remains poor. Neural machine translation is currently the most advanced approach, yet there is still room to improve the quality of its output.
Summary of the Invention
The purpose of the present invention is to provide a translation method, device and storage medium for machine translation, aiming to solve the problem that translations generated by existing machine translation models are of poor quality.
To achieve the above object, the present invention provides a translation method for machine translation, the method comprising:
receiving a source sentence to be translated;
performing word segmentation on the source sentence;
obtaining the part of speech of each segmented word;
integrating, according to a word vector model, the part of speech into the word vector corresponding to each word, to obtain a fused word vector sequence;
inputting the word vector sequence into an encoder-decoder model to obtain an encoding-decoding result;
evaluating the encoding-decoding result with a beam search evaluation function, wherein the beam search evaluation function includes a penalty term based on length comparison and a penalty term based on repeat detection;
obtaining a translation according to the evaluation result.
Further, the beam search evaluation function is expressed as:
s(Y, X) = log(P(Y|X)) + d(x) + l(x)
where s(Y, X) is the beam search evaluation function, log(P(Y|X)) is the log-probability of Y given X, d(x) is the penalty term based on repeat detection, l(x) is the penalty term based on length comparison, and P is the distribution function;
the penalty term based on the length ratio is added to the beam search evaluation function to address partial omissions in the translation;
the penalty term based on repeat detection is added to the beam search evaluation function to address repeated content in the translation.
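The combined score above can be sketched in a few lines; the penalty magnitudes below are illustrative assumptions, not values from the patent:

```python
import math

def beam_score(log_prob, d_penalty, l_penalty):
    """Evaluation score s(Y, X) = log P(Y|X) + d(x) + l(x).

    log_prob  : cumulative log-probability log P(Y|X) of candidate Y
    d_penalty : repeat-detection penalty d(x), zero or negative
    l_penalty : length-ratio penalty l(x), zero or negative
    """
    return log_prob + d_penalty + l_penalty

# Toy comparison: a slightly more probable but repetitive candidate
# is demoted below a clean one once the penalties are applied.
clean = beam_score(math.log(0.20), 0.0, -0.05)
loopy = beam_score(math.log(0.25), -0.80, -0.05)
```

Here the repetitive hypothesis `loopy` ends up with the lower score despite its higher raw probability, which is exactly the reranking effect the two penalty terms are meant to produce.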
Further, the repeat-detection penalty term d(x) is given by the following formula (rendered as an image in the original):
where c is the index of the word currently being translated, δ is the range of repeat detection, ε is the penalty coefficient, y is the matrix corresponding to the candidate translation, y_{c-j} and y_{c-i-j} are the two matrices compared during repeat detection, and i, j are traversal variables.
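Since the formula itself is only an image in the source, the following is a plausible sketch built from the named parameters c, δ (`delta`) and ε (`eps`): fragments of several sizes ending at the current position are compared against earlier fragments within the window, and the exact weighting by fragment size and distance is an assumption:

```python
def repeat_penalty(tokens, delta=4, eps=0.5):
    """Hedged sketch of the repeat-detection penalty d(x).

    tokens : the candidate translation produced so far
    delta  : range of repeat detection (fragment sizes and lookback)
    eps    : penalty coefficient
    """
    c = len(tokens)  # index of the word currently being emitted
    penalty = 0.0
    for i in range(1, delta + 1):        # fragment size i
        recent = tokens[c - i:c]         # fragment just produced
        for j in range(1, delta + 1):    # distance j looking back
            start = c - i - j
            if start < 0:
                break
            earlier = tokens[start:start + i]
            if recent == earlier:
                # assumed weighting: longer and nearer repeats cost more
                penalty -= eps * i / j
    return penalty
```

With this sketch, `repeat_penalty(["the", "cat", "the", "cat"])` is negative while a repetition-free sequence incurs no penalty at all.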
Further, the step of evaluating the encoding-decoding result based on the beam search evaluation function includes:
computing the ratio of the length of the source sentence to the length of the target translation;
fitting the length ratios by linear regression to obtain a cumulative distribution function;
when the beam search candidates contain both the end-of-sentence marker and ordinary words, adding the probability F_X(x) that the translation has already ended and the probability 1 - F_X(x) that it has not to the respective evaluation terms: l(x) = θF_X(x) for non-EOS candidates and l(x) = θ(1 - F_X(x)) for the EOS candidate, where EOS is the end-of-sentence marker and θ is a parameter;
when the candidate word is the end-of-sentence marker, multiplying the probability that translation is not yet complete by the penalty factor as the penalty term;
when the candidate word is not the end-of-sentence marker, multiplying the probability that translation is complete by the penalty factor as the penalty term;
adding the resulting length-ratio penalty term to the beam search evaluation function;
evaluating the results based on the beam search evaluation function.
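The l(x) selection in the steps above can be sketched as follows; the patent only calls θ a parameter, so its sign and magnitude here are assumptions:

```python
def length_penalty(candidate_is_eos, cdf_value, theta=-1.0):
    """Length-ratio penalty l(x).

    cdf_value : F_X(x), the fitted probability that a translation of
                the current length would already be finished
    theta     : the patent's parameter (negative weight assumed here)
    """
    if candidate_is_eos:
        # ending now: penalise by the probability translation is NOT done
        return theta * (1.0 - cdf_value)
    # continuing: penalise by the probability translation IS already done
    return theta * cdf_value
```

At a high F_X(x) (the translation is statistically "long enough") the EOS candidate is penalised less than continuing, nudging the search toward stopping; at a low F_X(x) the pressure reverses.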
Further, in the encoder-decoder model, both the encoder and the decoder use a bidirectional recurrent neural network.
Further, the step of inputting the word vector sequence into the encoder-decoder model to obtain the encoding-decoding result includes:
inputting the word vector sequence into the encoder-decoder model;
converting the word vector sequence into a sentence vector with the encoder of the deep learning framework;
converting the sentence vector back into a word vector sequence with the decoder.
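As a toy illustration of this encode-then-decode shape — a single-direction recurrent cell with random weights standing in for the patent's trained bidirectional networks, and dimensions chosen arbitrarily:

```python
import math
import random

random.seed(0)
D_WORD, D_SENT = 4, 6  # illustrative word/sentence vector sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

W_in = rand_matrix(D_SENT, D_WORD)   # input-to-hidden weights
W_rec = rand_matrix(D_SENT, D_SENT)  # recurrent weights
W_out = rand_matrix(D_WORD, D_SENT)  # hidden-to-output weights

def encode(word_vectors):
    """Encoder: fold the word-vector sequence into one sentence vector."""
    h = [0.0] * D_SENT
    for w in word_vectors:
        h = [math.tanh(a + b)
             for a, b in zip(matvec(W_in, w), matvec(W_rec, h))]
    return h

def decode(sentence_vector, steps):
    """Decoder: unfold the sentence vector back into word vectors."""
    h, out = sentence_vector, []
    for _ in range(steps):
        h = [math.tanh(x) for x in matvec(W_rec, h)]
        out.append(matvec(W_out, h))
    return out

src = [[random.uniform(-1, 1) for _ in range(D_WORD)] for _ in range(5)]
sent = encode(src)           # word vectors -> sentence vector
out = decode(sent, steps=5)  # sentence vector -> word vectors
```

The point of the sketch is only the data flow: a variable-length sequence is compressed into a fixed-size sentence vector and then expanded back into a sequence.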
In addition, the present invention also discloses a machine translation device. The device includes a processor and a memory connected to the processor through a communication bus, wherein
the memory stores a translation program for machine translation; and
the processor executes the translation program to implement the translation steps of any of the methods described above.
The invention further provides a computer storage medium storing one or more programs, executable by one or more processors, to cause the one or more processors to perform the translation steps of any of the methods described above.
By applying the translation method, device and storage medium provided by the embodiments of the present invention, word vectors are constructed that both establish semantic associations between different words and capture their meanings under different parts of speech, and the beam search evaluation function is revised to alleviate repeated fragments and omitted source content in the translation, offering wide applicability, strong specificity, and high translation quality.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of an embodiment of the present invention.
FIG. 2 is a schematic structural diagram of an embodiment of the present invention.
FIG. 3 is another schematic structural diagram of an embodiment of the present invention.
FIG. 4 is a schematic description of the repeat-detection penalty algorithm of an embodiment of the present invention.
FIG. 5 is a schematic description of the length-ratio penalty algorithm of an embodiment of the present invention.
FIG. 6 is a schematic diagram of an English-to-Chinese translation result of an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below through specific examples; those skilled in the art can readily understand other advantages and effects of the invention from the contents disclosed in this specification. The invention can also be implemented or applied through other specific embodiments, and the details in this specification may be modified or changed from different viewpoints and for different applications without departing from the spirit of the invention.
A language model is a simple, unified, abstract formal system. Once the objective facts of a language are described by a language model, they can be processed automatically by a computer; language models are therefore of great significance for natural language information processing and play an important role in research on part-of-speech tagging, syntactic analysis, and speech recognition.
In machine translation, both the input source-language sentence and the output target-language translation can be viewed as sequences, so machine translation can be treated as a sequence-to-sequence problem. The current mainstream approach to sequence-to-sequence problems is the encoder-decoder model: the encoder encodes the source sentence into a sentence vector, and the decoder decodes that vector into the target-language translation.
It should be noted that recurrent neural networks (RNNs) are typically used as the encoder and decoder. An RNN is a classic neural network architecture containing recurrent units, so it can process serializable data and persist information across time steps: at each step it combines the current input with its previous state to produce an output. A bidirectional recurrent neural network (Bi-RNN) is an improved structure based on the RNN. In some tasks the network's output depends not only on past inputs but also on subsequent ones, so the reverse sequence must be fed in as well as the forward sequence. A Bi-RNN consists of two RNN layers and accepts the forward and reverse sequences simultaneously, effectively improving performance.
Word vectors (word embeddings) are the collective name for a family of language-modeling and feature-learning techniques in natural language processing (NLP) in which words or phrases from a vocabulary are mapped to vectors of real numbers. Conceptually, this is a mathematical embedding from a space with one dimension per word to a continuous vector space of much lower dimension. The skip-gram model is a model structure for producing distributed word representations when training a neural language model: it takes the word vector of the current word as input and predicts that word's context. Beam search is a heuristic search algorithm that explores a graph by expanding the most promising nodes in a limited set.
Beam search is an optimization of best-first search that reduces its memory requirements. Best-first search is a graph search that ranks all partial solutions (states) by a heuristic estimating how close a partial solution is to a complete solution (the goal state). In beam search, however, only a predetermined number of the best partial solutions are kept as candidates.
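The pruning just described can be sketched generically; the per-step candidate distributions below are toy stand-ins for the decoder's conditional probabilities:

```python
import math

def beam_search(step_probs, beam_width=2):
    """Keep only the beam_width best partial hypotheses at each step,
    scored by cumulative log-probability.

    step_probs[t] maps each candidate token to its probability at
    step t (an illustrative stand-in for the decoder output).
    """
    beams = [([], 0.0)]  # (tokens so far, cumulative log P)
    for probs in step_probs:
        expanded = [
            (tokens + [tok], score + math.log(p))
            for tokens, score in beams
            for tok, p in probs.items()
        ]
        expanded.sort(key=lambda b: b[1], reverse=True)
        beams = expanded[:beam_width]  # prune to the best few
    return beams[0][0]                 # highest-scoring hypothesis

best = beam_search([
    {"I": 0.6, "We": 0.4},
    {"run": 0.7, "ran": 0.3},
])
```

In the patent's method, the `score` update would use the full evaluation function s(Y, X) — the log-probability plus the two penalty terms — rather than log-probability alone.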
Please refer to FIGS. 1-6. It should be noted that the drawings provided in this embodiment illustrate the basic concept of the invention only schematically, so they show only the components related to the invention rather than the actual number, shape, and size of components in a real implementation; in practice the type, quantity, and proportion of each component may vary freely, and the component layout may be more complex.
As shown in FIG. 1, the present invention provides a translation method for machine translation, the method comprising:
S110, receiving the source sentence to be translated.
S120, performing word segmentation on the source sentence.
It can be understood that word segmentation is performed on each sentence of the received source text.
S130, obtaining the part of speech of each segmented word.
It should be noted that each source sentence is first segmented; a part-of-speech tagging tool is then applied to each word to obtain its part of speech, and the corresponding abbreviation is looked up in a part-of-speech abbreviation table. Finally, the original word and its part-of-speech abbreviation are joined with an underscore ("_") to form a word/part-of-speech string, which replaces the original word in the source sentence.
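This joining step is simple to show concretely; the tag set and abbreviation table below are illustrative assumptions, not the table used in the patent:

```python
def attach_pos(tagged_words, abbrev):
    """Join each word with its POS abbreviation via '_', producing the
    word/part-of-speech strings described above."""
    return [f"{word}_{abbrev[tag]}" for word, tag in tagged_words]

tokens = attach_pos(
    [("dog", "noun"), ("runs", "verb")],
    {"noun": "NN", "verb": "VB"},  # assumed abbreviation table
)
# tokens == ["dog_NN", "runs_VB"]
```

Each resulting string then stands in for the original word, so "dog" used as a noun and "dog" used as a verb receive distinct vocabulary entries downstream.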
S140, integrating, according to the word vector model, the part of speech into the word vector corresponding to each word, to obtain the fused word vector sequence.
Word vectors (word embeddings) map words or phrases from a vocabulary to vectors of real numbers, embedding a space with one dimension per word into a continuous vector space of much lower dimension.
In one implementation of the present invention, all word/part-of-speech strings in the source sentences obtained in steps S120 and S130 are collected to build a dictionary, and each string in the dictionary is assigned an index and saved. The word/part-of-speech strings in each sentence are then converted into index values, and the index sequence representing each sentence is fed into the skip-gram model for training, yielding trained word vectors fused with part-of-speech features and hence the fused word vector sequence.
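The dictionary-building and indexing steps, plus the (centre, context) pairs that skip-gram training consumes, can be sketched as follows; the window size and the tiny corpus are illustrative:

```python
def build_index(sentences):
    """Index every word/POS string in the corpus (the dictionary step)."""
    vocab = {}
    for sent in sentences:
        for tok in sent:
            vocab.setdefault(tok, len(vocab))
    return vocab

def skipgram_pairs(indices, window=2):
    """(centre, context) index pairs fed to skip-gram training:
    the centre word predicts each neighbour within the window."""
    pairs = []
    for pos, centre in enumerate(indices):
        for off in range(-window, window + 1):
            ctx = pos + off
            if off != 0 and 0 <= ctx < len(indices):
                pairs.append((centre, indices[ctx]))
    return pairs

vocab = build_index([["dog_NN", "runs_VB", "fast_RB"]])
pairs = skipgram_pairs([vocab[t] for t in ["dog_NN", "runs_VB", "fast_RB"]])
```

The actual embedding training (learning the vectors from these pairs) is done by the skip-gram network itself and is omitted here.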
Exemplarily, as shown in FIG. 2, with input w(t), the trained skip-gram model outputs w(t-2), w(t-1), w(t+1), and w(t+2).
S150, inputting the word vector sequence into the encoder-decoder model to obtain the encoding-decoding result.
It should be noted that the trained word vectors replace the words of each sentence in the original corpus, converting the sentences of the corpus into word vector sequences. Each word vector sequence is then fed into the encoder-decoder model to obtain the encoding-decoding result. The structure of the encoder-decoder model is shown in FIG. 3.
S160, evaluating the encoding-decoding result with the beam search evaluation function, wherein the beam search evaluation function includes a penalty term based on length comparison and a penalty term based on repeat detection.
It can be understood that beam search is a heuristic search algorithm that explores a graph by expanding the most promising nodes in a limited set. It is an optimization of best-first search that reduces memory requirements: best-first search ranks all partial solutions (states) by a heuristic estimating their closeness to the complete solution (goal state), whereas beam search keeps only a predetermined number of the best partial solutions as candidates. In the embodiments of the present invention, the beam search evaluation function is improved by adding a penalty term based on repeat detection and a penalty term based on the length ratio.
S170, obtaining the translation according to the evaluation result.
The final translation is obtained through the encoder-decoder model and beam search.
In one implementation of the present invention, the beam search evaluation function is expressed as:
s(Y, X) = log(P(Y|X)) + d(x) + l(x)
where s(Y, X) is the beam search evaluation function, log(P(Y|X)) is the log-probability of Y given X, d(x) is the penalty term based on repeat detection, l(x) is the penalty term based on length comparison, and P is the distribution function;
the penalty term based on the length ratio is added to the beam search evaluation function to address partial omissions in the translation;
the penalty term based on repeat detection is added to the beam search evaluation function to address repeated content in the translation.
It should be noted that the embodiments of the present invention improve the beam search evaluation function by adding a penalty term based on the length ratio and a penalty term based on repeat detection. The length-ratio penalty addresses translations that are too long or too short: the ratio of source-sentence length to translation length is gathered statistically, and the resulting penalty is used in the beam search's evaluation of candidate words. The repeat-detection penalty divides the translation into fragments of different sizes for comparison, taking into account the distance between the position of a repeated word and the position currently being translated, and the resulting penalty is likewise used in the evaluation of candidate words. This alleviates repeated fragments and omitted source content in the translation, with wide applicability, strong specificity, and high translation quality.
Further, the repeat-detection penalty term d(x) is given by the following formula (rendered as an image in the original):
where c is the index of the word currently being translated, δ is the range of repeat detection, ε is the penalty coefficient, y is the matrix corresponding to the candidate translation, y_{c-j} and y_{c-i-j} are the two matrices compared during repeat detection, and i, j are traversal variables.
As shown in FIG. 4, the candidate sentences of the beam search together with the parameters δ and ε are taken as the algorithm's input; the candidates are divided into fragments of several sizes for comparison, the penalty for each is computed, and the results are accumulated with weights. In FIG. 5, the current candidate word, the value F_X(x) of the cumulative distribution function at the current length, and the parameter θ are taken as the algorithm's input: a vector operation first determines whether each candidate word is EOS (1 if so, 0 otherwise), and the value of l(x) is then obtained by a dot product.
In one implementation of the present invention, the step of evaluating the encoding-decoding result based on the beam search evaluation function includes:
computing the ratio of the length of the source sentence to the length of the target translation;
fitting the length ratios by linear regression to obtain a cumulative distribution function;
when the beam search candidates contain both the end-of-sentence marker and ordinary words, adding the probability F_X(x) that the translation has already ended and the probability 1 - F_X(x) that it has not to the respective evaluation terms: l(x) = θF_X(x) for non-EOS candidates and l(x) = θ(1 - F_X(x)) for the EOS candidate, where EOS is the end-of-sentence marker and θ is a parameter;
when the candidate word is the end-of-sentence marker, multiplying the probability that translation is not yet complete by the penalty factor as the penalty term;
when the candidate word is not the end-of-sentence marker, multiplying the probability that translation is complete by the penalty factor as the penalty term;
adding the resulting length-ratio penalty term to the beam search evaluation function;
evaluating the results based on the beam search evaluation function.
It can be understood that the lengths of the source sentences and the target translations are first counted separately and the length ratios are computed; the ratios are then fitted by linear regression to obtain the cumulative distribution function F_X(x) = P(X < x), where X is the ratio of the length of the target translation to the length of the source sentence. When the beam search candidates contain both EOS (the end-of-sentence marker) and ordinary words, the probability F_X(x) that the translation has already ended and the probability 1 - F_X(x) that it has not are added to their respective evaluation terms: l(x) = θF_X(x) for non-EOS candidates and l(x) = θ(1 - F_X(x)) for the EOS candidate. When the candidate word is the EOS marker, the probability that translation is not yet complete is multiplied by the penalty factor as the penalty term; when it is not, the probability that translation is already complete is multiplied by the penalty factor. Finally, the resulting length-ratio penalty term is added to the beam search evaluation function, as shown in FIG. 5.
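The patent fits F_X(x) with linear regression; as a simpler stand-in, the distribution can also be estimated empirically from observed target/source length ratios (the sample ratios below are illustrative):

```python
def empirical_cdf(ratios):
    """F_X(x) = P(X < x) estimated from observed target/source
    length ratios. The patent fits this curve by linear regression;
    this sketch uses the raw empirical distribution instead."""
    data = sorted(ratios)

    def cdf(x):
        return sum(r < x for r in data) / len(data)

    return cdf

# illustrative ratios gathered from a parallel corpus
F = empirical_cdf([0.8, 0.9, 1.0, 1.1, 1.2])
```

The value F(x) at the current candidate length is what feeds the l(x) terms above: a translation already longer than most observed ratios gets F close to 1, favouring the EOS candidate.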
The final optimal translation is obtained through the encoder-decoder model and beam search, as shown in FIG. 6.
It should be noted that the encoder-decoder deep learning framework, combined with the beam search evaluation function, yields the final optimal translation, solving the problems of repeated fragments and omitted source sentences, with wide applicability, strong specificity, and high translation quality.
Further, in the encoder-decoder model, both the encoder and the decoder use a bidirectional recurrent neural network.
It should be noted that a bidirectional recurrent neural network (Bi-RNN) is an improved structure based on the RNN. In some tasks the network's output depends not only on past inputs but also on subsequent ones, so the reverse sequence must be fed in as well as the forward sequence. A Bi-RNN consists of two RNN layers and accepts the forward and reverse sequences simultaneously, effectively improving performance.
In one implementation of the present invention, the step of inputting the word-vector sequence into the encoder-decoder model to obtain the encoding-decoding result includes:
inputting the word-vector sequence into the encoder-decoder model;
converting, by the encoder of the deep-learning encoder-decoder framework, the word-vector sequence into a sentence vector; and
converting, by the decoder, the sentence vector into a word-vector sequence.
It can be understood that the encoder-decoder is a deep-learning framework in which the encoder converts a word-vector sequence into a sentence vector, and the decoder converts a sentence vector back into a word-vector sequence.
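The encoder/decoder division of labor can be shown schematically. This is not the patent's actual networks: the mean-pooling encoder and the decay-based decoder step rule are purely illustrative assumptions meant only to show a sequence being folded into one fixed-size vector and unrolled back out.

```python
def encode(word_vectors):
    """Fold the word-vector sequence into one sentence vector (mean pooling)."""
    dim = len(word_vectors[0])
    return [sum(v[i] for v in word_vectors) / len(word_vectors) for i in range(dim)]

def decode(sentence_vector, length):
    """Unroll the sentence vector into `length` word vectors (toy step rule)."""
    out, state = [], list(sentence_vector)
    for _ in range(length):
        state = [0.9 * s for s in state]   # decay the state at each step
        out.append(list(state))
    return out
```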
The present invention also provides a translation device for machine translation. The device includes a processor and a memory connected to the processor through a communication bus, wherein:
the memory is configured to store a translation program for machine translation; and
the processor is configured to execute the translation program so as to implement the translation steps of any of the methods described above.
The present invention also provides a computer storage medium storing one or more programs, which are executable by one or more processors to cause the one or more processors to perform the translation steps of any of the methods described above.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Therefore, all equivalent modifications or changes made by those with ordinary knowledge in the technical field without departing from the spirit and technical ideas disclosed herein shall still be covered by the claims of the present invention.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910721252.6A CN110442880B (en) | 2019-08-06 | 2019-08-06 | Translation method, device and storage medium for machine translation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110442880A CN110442880A (en) | 2019-11-12 |
CN110442880B true CN110442880B (en) | 2022-09-30 |
Family
ID=68433418
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910721252.6A Active CN110442880B (en) | 2019-08-06 | 2019-08-06 | Translation method, device and storage medium for machine translation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110442880B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112541364A (en) * | 2020-12-03 | 2021-03-23 | 昆明理工大学 | Chinese-Vietnamese neural machine translation method fusing multilevel language feature knowledge |
CN112632996A (en) * | 2020-12-08 | 2021-04-09 | 浙江大学 | Entity relation triple extraction method based on contrastive learning |
CN113435215A (en) * | 2021-06-22 | 2021-09-24 | 北京捷通华声科技股份有限公司 | A machine translation method and device |
CN113191165B (en) * | 2021-07-01 | 2021-09-24 | 南京新一代人工智能研究院有限公司 | Method for avoiding duplication of machine translation fragments |
CN113836950B (en) * | 2021-09-22 | 2024-04-02 | 广州华多网络科技有限公司 | Commodity title text translation method and device, equipment and medium thereof |
CN114254630B (en) * | 2021-11-29 | 2025-01-10 | 北京捷通华声科技股份有限公司 | A translation method, device, electronic device and readable storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018058046A1 (en) * | 2016-09-26 | 2018-03-29 | Google Llc | Neural machine translation systems |
CN107967262A (en) * | 2017-11-02 | 2018-04-27 | 内蒙古工业大学 | A neural-network Mongolian-Chinese machine translation method |
Non-Patent Citations (1)
Title |
---|
A Mongolian-Chinese neural network machine translation model incorporating prior information; Fan Wenting et al.; Journal of Chinese Information Processing (《中文信息学报》); 2018-06-15 (No. 06); full text *
Also Published As
Publication number | Publication date |
---|---|
CN110442880A (en) | 2019-11-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
CB03 | Change of inventor or designer information | Inventor after: Lin Xinyue; Liu Jin; Song Junjie. Inventor before: Lin Xinyue; Liu Jin. |
GR01 | Patent grant ||