
CN110826334B - Chinese named entity recognition model based on reinforcement learning and training method thereof - Google Patents

Chinese named entity recognition model based on reinforcement learning and training method thereof Download PDF

Info

Publication number
CN110826334B
CN110826334B (application CN201911089295.3A)
Authority
CN
China
Prior art keywords
word
sentence
network
named entity
entity recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201911089295.3A
Other languages
Chinese (zh)
Other versions
CN110826334A (en)
Inventor
叶梅
卓汉逵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN201911089295.3A priority Critical patent/CN110826334B/en
Publication of CN110826334A publication Critical patent/CN110826334A/en
Application granted granted Critical
Publication of CN110826334B publication Critical patent/CN110826334B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a Chinese named entity recognition model based on reinforcement learning and a training method thereof. The model comprises a policy network module, a word segmentation and recombination network, and a named entity recognition network module. First, the policy network specifies an action sequence; the word segmentation and recombination network then executes the actions in that sequence one by one, obtaining a phrase at each 'terminate' action. Each phrase serves as auxiliary input information for lattice-LSTM modeling, which yields a hidden state sequence; the hidden states are fed into the named entity recognition network to obtain the sentence's label sequence, and the recognition result serves as a delayed reward that guides updates of the policy network module. By using reinforcement learning to segment sentences effectively, the invention avoids modeling the redundant, interfering words matched in a sentence, effectively mitigates dependence on external dictionaries and the impact of long texts, makes better use of correct word information, and thereby helps the Chinese named entity recognition model improve its recognition performance.

Description

A Chinese named entity recognition model based on reinforcement learning and a training method thereof

Technical Field

The present invention relates to the field of machine learning, and more specifically, to a Chinese named entity recognition model based on reinforcement learning and a training method thereof.

Background Art

Named entity recognition (NER) is a fundamental task in natural language processing. It refers to identifying named mentions in text, laying the groundwork for tasks such as relation extraction, question answering, syntactic analysis, and machine translation, and it plays an important role in making natural language processing technology practical. Generally speaking, the NER task is to identify named entities of three major categories (entities, times, and numbers) and seven subcategories (person names, organization names, place names, times, dates, currencies, and percentages) in the text to be processed.

An existing Chinese named entity recognition model is lattice-LSTM. In addition to each character of the sentence, this model also takes as input the cell vectors of all potential words ending at that character; the selection of these potential words depends on an external dictionary. A supplementary gate is added to control the selection between character-granularity and word-granularity information, so the input changes from (character information, previous hidden state vector, previous cell state vector) to (character information, previous hidden state vector, and the information of all words ending at that character). The advantage of this model is that it can exploit explicit word information in a character-sequence tagging model without suffering from word segmentation errors.

However, precisely because the lattice-LSTM model uses the information of all matched words in the sentence, any word formed by adjacent characters that appears in the external dictionary is fed into the model as in-vocabulary word-granularity information, even though that word is not necessarily a correct segmentation of the sentence. For example, for "南京市长江大桥" (Nanjing Yangtze River Bridge), the model takes as input, in order, every in-vocabulary word formed by its characters, an in-vocabulary word being one already included in the external dictionary: "南京" (Nanjing), "南京市" (Nanjing City), "市长" (mayor), "长江" (Yangtze River), "大桥" (bridge), and "长江大桥" (Yangtze River Bridge). Clearly, "市长" (mayor) is an interfering word in this sentence, and using its word information has a negative impact on entity recognition. In addition, the model usually requires an external dictionary constructed specifically for the experimental dataset, so it depends heavily on the external dictionary. Moreover, as text length increases, the number of potential words in a sentence also increases, which greatly raises the model's complexity.
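To make the interference concrete, the following sketch (plain Python; the toy lexicon stands in for the external dictionary) enumerates every dictionary word formed by consecutive characters, reproducing how lattice-LSTM collects candidate words for "南京市长江大桥":

```python
# Sketch: enumerate dictionary-matched substrings the way lattice-LSTM
# collects candidate words. LEXICON is a toy stand-in for the external
# dictionary the model depends on.
LEXICON = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}

def matched_words(sentence, lexicon):
    """Return (start, end, word) for every lexicon word of two or more
    consecutive characters in the sentence."""
    hits = []
    n = len(sentence)
    for b in range(n):
        for e in range(b + 2, n + 1):
            word = sentence[b:e]
            if word in lexicon:
                hits.append((b, e - 1, word))
    return hits

print(matched_words("南京市长江大桥", LEXICON))
# [(0, 1, '南京'), (0, 2, '南京市'), (2, 3, '市长'),
#  (3, 4, '长江'), (3, 6, '长江大桥'), (5, 6, '大桥')]
# '市长' (mayor) is matched even though it is a wrong segmentation here.
```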

Summary of the Invention

To overcome the above problems of the prior art, namely modeling the redundant interfering words matched in a sentence, depending on an external dictionary, and being affected by long texts, the present invention provides a Chinese named entity recognition model based on reinforcement learning and a training method thereof. By building a reinforcement learning model that learns the internal structure of a sentence, the invention effectively learns a sentence segmentation scheme relevant to the named entity recognition task, so that sentences can be cut into an effective segmentation. This avoids feeding in interfering words and using an external dictionary, and reduces the number of words per sentence as text length increases; using this correct word information helps the Chinese named entity recognition model improve its recognition accuracy.

To solve the above technical problems, the present invention adopts the following technical solution: a Chinese named entity recognition model based on reinforcement learning is provided, comprising a policy network module, a word segmentation and recombination network, and a named entity recognition network module;

the policy network module adopts a stochastic policy that samples one action for each character of the sentence under each state space, thereby obtaining an action sequence for the whole sentence, and receives a delayed reward derived from the recognition result of the Chinese named entity recognition network to guide the update of the policy network module;

the word segmentation and recombination network divides the sentence according to the action sequence output by the policy network module, cuts the sentence into phrases, and combines the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the sentence;

the named entity recognition network module inputs the hidden states of the sentence's lattice-LSTM representation into a CRF (conditional random field) layer and finally obtains the named entity recognition result; a loss value calculated from the recognition result is used to train the named entity recognition model, and the same loss value serves as a delayed reward guiding the update of the policy network module.

Preferably, the actions comprise 'inside' and 'terminate'.

Preferably, the stochastic policy is:

π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the policy network; and s_t is the state of the policy network at time t.

Preferably, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the policy network module, and encodes each phrase as an input to the cell state at the last character of the corresponding phrase, obtaining the lattice-LSTM representation of the sentence.
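As a minimal sketch of this segmentation step (plain Python; encoding the two actions as 'I' for 'inside' and 'T' for 'terminate' is an illustrative choice), each 'terminate' action closes the current phrase at that character:

```python
def actions_to_phrases(sentence, actions):
    """Turn a per-character action sequence into phrases: 'I' keeps the
    character inside the growing phrase, 'T' terminates the phrase there."""
    assert len(sentence) == len(actions)
    phrases, start = [], 0
    for i, action in enumerate(actions):
        if action == "T":                  # close phrase [start, i]
            phrases.append(sentence[start:i + 1])
            start = i + 1
    if start < len(sentence):              # flush a trailing open phrase
        phrases.append(sentence[start:])
    return phrases

# "美国的华盛顿" with actions I T T I I T -> ['美国', '的', '华盛顿']
print(actions_to_phrases("美国的华盛顿", list("ITTIIT")))
```

Each resulting phrase is then encoded and fed into the cell state at its last character, as described above.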

Preferably, the named entity recognition network module inputs the lattice-LSTM output of the word segmentation and recombination network into the CRF layer, uses the CRF layer's set of feature functions to score each label sequence of the sentence, exponentiates and normalizes the scores, computes over all possible label sequences with the first-order Viterbi algorithm, and takes the highest-scoring sequence as the final output. The value of the loss function is back-propagated for parameter training, and the same loss value serves as the delayed reward that updates the policy network module. The loss function is defined as the sentence-level negative log-likelihood with an L2 regularization term:

L(θ) = -Σ_{i=1}^{N} log P(y_i | s_i) + (λ/2)·‖θ‖²

where λ is the L2 regularization coefficient; θ denotes the parameter set; and s_i and y_i denote the i-th training sentence and its corresponding label sequence.

A training method for the above Chinese named entity recognition model based on reinforcement learning is also provided, comprising the following steps:

Step 1: input the sentence data used for training into the policy network module; the policy network module samples one action for each character of the sentence under each state space and outputs the action sequence of the whole sentence;

Step 2: the word segmentation and recombination network divides the sentence according to the action sequence output by the policy network module, cuts the sentence into phrases, and combines the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the characters;

Step 3: the named entity recognition network inputs the hidden states obtained from the word segmentation and recombination network into the CRF layer and finally obtains the named entity recognition result; a loss value calculated from the recognition result is used to train the named entity recognition model, and the same loss value serves as a delayed reward guiding the update of the policy network module.

Once the sentence has been represented by the lattice-LSTM model, the hidden state vector h_i of each character in the sentence is obtained, and the state vector sequence H = {h_1, h_2, …, h_n} is input into the CRF layer. Letting y = l_1, l_2, …, l_n denote the output labels of the CRF layer, the probability of an output label sequence is calculated by:

P(y | s) = exp( Σ_{i=1}^{n} ( W^{CRF}_{l_i}·h_i + b^{CRF}_{l_{i-1},l_i} ) ) / Σ_{y'} exp( Σ_{i=1}^{n} ( W^{CRF}_{l'_i}·h_i + b^{CRF}_{l'_{i-1},l'_i} ) )

where s denotes the sentence; W^{CRF}_{l_i} is the model parameter specific to label l_i; b^{CRF}_{l_{i-1},l_i} is the bias parameter specific to the label pair (l_{i-1}, l_i); and y' ranges over all possible output label sequences.

The loss value function is calculated as:

L(θ) = -Σ_{i=1}^{N} log P(y_i | s_i) + (λ/2)·‖θ‖²

where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and its correct label sequence, respectively; and P(y|s) denotes the probability that sentence s is labeled with sequence y, i.e., the probability of a correct labeling.
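As a concrete illustration of the CRF layer just described, the sketch below (numpy; dimensions and random values are placeholders for trained parameters) computes emission scores W^{CRF}_l·h_i plus transition scores b^{CRF}, and decodes the highest-scoring label sequence with first-order Viterbi:

```python
import numpy as np

rng = np.random.default_rng(0)
n, hidden, num_labels = 6, 8, 4                # illustrative sizes
H = rng.normal(size=(n, hidden))               # hidden states from lattice-LSTM
W = rng.normal(size=(num_labels, hidden))      # per-label emission parameters
b = rng.normal(size=(num_labels, num_labels))  # transition (bias) parameters

emit = H @ W.T                                 # emit[i, l] = W[l] . h_i

def viterbi(emit, trans):
    """First-order Viterbi decoding over emission and transition scores."""
    n, L = emit.shape
    score = emit[0].copy()                     # best score ending in label l at i=0
    back = np.zeros((n, L), dtype=int)
    for i in range(1, n):
        cand = score[:, None] + trans + emit[i][None, :]  # prev x cur
        back[i] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for i in range(n - 1, 0, -1):              # follow back-pointers
        path.append(int(back[i, path[-1]]))
    return path[::-1], float(score.max())

print(viterbi(emit, b))                        # highest-scoring label sequence
```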

Preferably, in Step 1 the actions comprise 'inside' and 'terminate', and the stochastic policy is given by:

π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the policy network; and s_t is the state of the policy network at time t.
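A minimal sketch of this stochastic policy (numpy; the dimensions, and building s_t by concatenating the current character embedding with a context vector, are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
state_dim = 16                              # illustrative state dimension
W = rng.normal(scale=0.1, size=state_dim)   # theta = {W, b}
b = 0.0

def sample_action(s_t):
    """Sample 'inside'/'terminate' from pi(a_t | s_t; theta) = sigmoid(W.s_t + b)."""
    p_terminate = 1.0 / (1.0 + np.exp(-(W @ s_t + b)))
    action = "T" if rng.random() < p_terminate else "I"
    return action, p_terminate

s_t = rng.normal(size=state_dim)            # stand-in for char embedding + context
print(sample_action(s_t))
```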

Preferably, in Step 2, characters are represented at the character level by an LSTM, with the update formula:

c_t, h_t = f_LSTM(x_t, c_{t-1}, h_{t-1})

where f_LSTM denotes the LSTM transition function; x_t denotes the encoding vector of the character input at time t of the sentence; and c_t and h_t denote the cell state and hidden state at time t, respectively.

After the division of the sentence is completed, the phrase information is integrated into the character-granularity LSTM model, which is the basic recurrent LSTM function:

[ i^c_j ; f^c_j ; o^c_j ; c̃^c_j ] = [ σ ; σ ; σ ; tanh ]( W^{cT}·[ x^c_j ; h^c_{j-1} ] + b^c )

c^c_j = f^c_j ⊙ c^c_{j-1} + i^c_j ⊙ c̃^c_j

h^c_j = o^c_j ⊙ tanh(c^c_j)

where x^c_j denotes the encoding vector of the j-th character of the sentence; h^c_{j-1} denotes the hidden state at character j-1; W^{cT} and b^c are model parameters; i^c_j, f^c_j, and o^c_j denote the input, forget, and output gates, respectively; c̃^c_j denotes the new candidate state; c^c_{j-1} denotes the cell state at character j-1 of the sentence; c^c_j denotes the updated cell state; h^c_j denotes the hidden state at character j of the sentence, determined by the output gate o^c_j and the current cell state c^c_j; σ(·) denotes the sigmoid function; and tanh(·) denotes the hyperbolic tangent activation function.

Phrase information is represented by an LSTM model without an output gate:

[ i^w_{b,e} ; f^w_{b,e} ; c̃^w_{b,e} ] = [ σ ; σ ; tanh ]( W^{wT}·[ x^w_{b,e} ; h^c_b ] + b^w )

c^w_{b,e} = f^w_{b,e} ⊙ c^c_b + i^w_{b,e} ⊙ c̃^w_{b,e}

where x^w_{b,e} denotes the encoding vector of the phrase that starts at the b-th character and ends at the e-th character of the sentence; h^c_b denotes the hidden state at the b-th character of the sentence, i.e., the hidden state of the phrase's first character; W^{wT} and b^w are model parameters; i^w_{b,e} and f^w_{b,e} denote the input and forget gates, respectively; c̃^w_{b,e} denotes the new candidate state; c^c_b denotes the cell state at the phrase's first character; c^w_{b,e} denotes the updated cell state; σ(·) denotes the sigmoid function; and tanh(·) denotes the hyperbolic tangent activation function.

An additional gate is further introduced to select between character-granularity and word-granularity information, taking as input the character's encoding vector and the cell state of the phrase ending at that character:

i^l_{b,e} = σ( W^{lT}·[ x^c_e ; c^w_{b,e} ] + b^l )

where x^c_e denotes the encoding vector of the e-th character of the sentence; c^w_{b,e} denotes the cell state of the phrase from the b-th to the e-th character, i.e., the cell state of the phrase ending at the e-th character of the sentence; W^{lT} and b^l are model parameters; i^l_{b,e} denotes the additional gate; and σ(·) denotes the sigmoid function.

The update of the cell state c^c_j thus changes, while the hidden-state update remains the same; the final lattice-LSTM representation is:

c^c_j = Σ_b α^w_{b,j} ⊙ c^w_{b,j} + α^c_j ⊙ c̃^c_j

where i^c_j is the input gate vector of the j-th character; i^l_{b,j} is the input gate vector of the phrase starting at b and ending at j; c^w_{b,j} is the phrase cell state; c̃^c_j is the character's new candidate cell state; α^w_{b,j} is the phrase information vector and α^c_j is the character information vector, obtained by normalizing the gates i^l_{b,j} and i^c_j over all phrases ending at j.
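The following sketch (numpy; sizes and random values are illustrative) shows this fusion at one character position j: the character's input gate logits and the additional-gate logits of the phrases ending at j are normalized elementwise into the weights α, which mix the phrase cell states with the character's candidate state:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8                                        # illustrative hidden size
i_c = rng.normal(size=d)                     # input gate logits of character j
c_tilde = np.tanh(rng.normal(size=d))        # candidate cell state of character j
gate_l = [rng.normal(size=d) for _ in range(2)]           # i^l_{b,j} per phrase
cell_w = [np.tanh(rng.normal(size=d)) for _ in range(2)]  # c^w_{b,j} per phrase

logits = np.stack([i_c] + gate_l)            # (1 + num_phrases, d)
alpha = np.exp(logits) / np.exp(logits).sum(axis=0)   # elementwise normalization

c_j = alpha[0] * c_tilde                     # character contribution
for a, c_w in zip(alpha[1:], cell_w):        # phrase contributions
    c_j = c_j + a * c_w                      # updated cell state c^c_j
print(c_j)
```

With the segmentation produced by the policy network, at most one phrase ends at each character, but the same fusion applies unchanged.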

Preferably, before Step 1 is performed, the named entity recognition network and its network parameters are pre-trained; at this stage, the words used by the named entity recognition network are obtained by segmenting the original sentences with a simple heuristic algorithm.

The pre-trained partial network parameters of the entity recognition network are temporarily fixed as the network parameters of the named entity recognition network, the policy network is then pre-trained, and finally the entire set of network parameters is trained jointly.

Compared with the prior art, the beneficial effects of the present invention are as follows: by using reinforcement learning to segment sentences effectively, the Chinese named entity recognition model and method based on reinforcement learning avoid modeling the redundant interfering words matched in sentences, and effectively avoid dependence on external dictionaries and the influence of long texts. The present invention can make better use of this correct word information and thus better help the Chinese named entity recognition model improve its recognition performance.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of the architecture of a Chinese named entity recognition model based on reinforcement learning according to the present invention;

FIG. 2 is a schematic diagram of the architecture of the policy network module of the Chinese named entity recognition model based on reinforcement learning according to the present invention;

FIG. 3 is a schematic diagram of the architecture of the named entity recognition network module of the Chinese named entity recognition model based on reinforcement learning according to the present invention;

FIG. 4 is a flow chart of the training method of the Chinese named entity recognition model based on reinforcement learning according to the present invention;

FIG. 5 is an example of sentence segmentation in the training method of the Chinese named entity recognition model based on reinforcement learning according to the present invention.

DETAILED DESCRIPTION

The drawings are for illustrative purposes only and shall not be construed as limiting this patent. To better illustrate the embodiments, some parts of the drawings may be omitted, enlarged, or reduced and do not represent the dimensions of the actual product; for those skilled in the art, it is understandable that some well-known structures and their descriptions may be omitted from the drawings. The positional relationships described in the drawings are for illustrative purposes only and shall not be construed as limiting this patent.

The same or similar reference numbers in the drawings of the embodiments of the present invention correspond to the same or similar parts. In the description of the present invention, it should be understood that terms such as "upper", "lower", "left", "right", "long", and "short" indicate orientations or positional relationships based on those shown in the drawings, are used only to facilitate and simplify the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation. Therefore, the terms describing positional relationships in the drawings are for illustrative purposes only and shall not be construed as limiting this patent; those of ordinary skill in the art can understand the specific meanings of the above terms according to the specific circumstances.

The technical solution of the present invention is further described in detail below through specific embodiments and in conjunction with the accompanying drawings:

Example 1

As shown in FIGS. 1-3, an embodiment of a Chinese named entity recognition model based on reinforcement learning comprises a policy network module, a word segmentation and recombination network, and a named entity recognition network module.

The policy network module adopts a stochastic policy that samples one action (either 'inside' or 'terminate') for each character of the sentence under each state space, thereby obtaining an action sequence for the whole sentence, and receives a delayed reward derived from the recognition result of the Chinese named entity recognition network to guide the update of the policy network module. The stochastic policy is:

π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the policy network; and s_t is the state of the policy network at time t.

The word segmentation and recombination network divides the sentence according to the action sequence output by the policy network module, cuts the sentence into phrases, and combines the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the sentence.

Specifically, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the policy network module, and encodes each phrase as an input to the cell state at the last character of the corresponding phrase, obtaining the lattice-LSTM representation of the sentence.

The named entity recognition network module inputs the hidden states of the sentence's lattice-LSTM representation into the conditional random field and finally obtains the named entity recognition result; a loss value calculated from the recognition result is used to train the named entity recognition model, and the same loss value serves as a delayed reward guiding the update of the policy network module. The loss value is calculated as:

L(θ) = -Σ_{i=1}^{N} log P(y_i | s_i) + (λ/2)·‖θ‖²

where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and its correct label sequence, respectively; and P denotes the probability that sentence s is labeled with sequence y, i.e., the probability of a correct labeling.

Working principle of this embodiment: the policy network first specifies an action sequence; the word segmentation and recombination network then executes the actions in that sequence one by one, obtaining a phrase at each 'terminate' action. Each phrase is used as input information at its last character, lattice-LSTM modeling is performed to obtain a hidden state sequence, and the hidden states are input into the named entity recognition network to obtain the sentence's label sequence; the recognition result serves as a delayed reward guiding the update of the policy network module.

Beneficial effects of this embodiment: this embodiment strengthens the neural LSTM-CRF model by combining it with a reinforcement learning framework to learn the internal structure of sentences and segment them efficiently; the resulting phrase information is integrated into the character-granularity lattice-LSTM model, so that character-granularity information and the word-granularity information associated with it are fully learned, achieving a better recognition effect.

Example 2

As shown in FIG. 4, an embodiment of a training method for the Chinese named entity recognition model based on reinforcement learning, used for training the model described in Example 1, comprises the following steps:

Preprocessing: pre-train the named entity recognition network and its network parameters; at this stage, the words used by the named entity recognition network are obtained by segmenting the original sentences with a simple heuristic algorithm.

The pre-trained partial network parameters of the entity recognition network are temporarily fixed as the network parameters of the named entity recognition network, the policy network is then pre-trained, and finally the entire set of network parameters is trained jointly.
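A structural sketch of this three-stage schedule (plain Python; `heuristic_segment` and the `model` methods are placeholder hooks for illustration, not an existing API):

```python
def train(model, data, epochs=(5, 5, 10)):
    # Stage 1: pre-train the NER network on phrases from a simple
    # heuristic segmentation of each sentence.
    for _ in range(epochs[0]):
        for sentence, labels in data:
            model.ner_step(sentence, heuristic_segment(sentence), labels)
    # Stage 2: hold the pre-trained NER parameters fixed and pre-train
    # the policy network against the delayed reward they provide.
    for _ in range(epochs[1]):
        for sentence, labels in data:
            actions = model.sample_actions(sentence)
            reward = -model.ner_loss(sentence, actions, labels)
            model.policy_step(sentence, actions, reward)
    # Stage 3: jointly train all network parameters.
    for _ in range(epochs[2]):
        for sentence, labels in data:
            model.joint_step(sentence, model.sample_actions(sentence), labels)
```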

Step 1: input the sentence data used for training into the policy network module; the policy network module samples one action for each character of the sentence under each state space and outputs the action sequence of the whole sentence.

In Step 1, the state, actions, and policy are defined as follows:

1. State: the encoding vector of the currently input character and the context vector of the characters before it;

2. Actions: two different operations are defined, 'inside' and 'terminate';

3. Policy: the stochastic policy is defined as:

π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the policy network; and s_t is the state of the policy network at time t.

Step 2: the word segmentation and recombination network divides the sentence according to the action sequence output by the policy network module, cuts the sentence into phrases, and combines the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the characters.

As shown in FIG. 5, "美国的华盛顿" (Washington, USA) is divided into "美国" (USA), "的" (of), and "华盛顿" (Washington). Characters are represented at the character level by an LSTM, with the update formula:

c_t, h_t = f_LSTM(x_t, c_{t-1}, h_{t-1})

where f_LSTM denotes the LSTM transition function; x_t denotes the encoding vector of the character input at time t of the sentence; and c_t and h_t denote the cell state and hidden state at time t, respectively.

After the division of the sentence is completed, the phrase information is integrated into the character-granularity LSTM model, which is the basic recurrent LSTM function:

[ i^c_j ; f^c_j ; o^c_j ; c̃^c_j ] = [ σ ; σ ; σ ; tanh ]( W^{cT}·[ x^c_j ; h^c_{j-1} ] + b^c )

c^c_j = f^c_j ⊙ c^c_{j-1} + i^c_j ⊙ c̃^c_j

h^c_j = o^c_j ⊙ tanh(c^c_j)

where x^c_j denotes the encoding vector of the j-th character of the sentence; h^c_{j-1} denotes the hidden state at character j-1; W^{cT} and b^c are model parameters; i^c_j, f^c_j, and o^c_j denote the input, forget, and output gates, respectively; c̃^c_j denotes the new candidate state; c^c_{j-1} denotes the cell state at character j-1 of the sentence; c^c_j denotes the updated cell state; h^c_j denotes the hidden state at character j of the sentence, determined by the output gate o^c_j and the current cell state c^c_j; σ(·) denotes the sigmoid function and tanh(·) denotes the hyperbolic tangent activation function.

Phrase information is represented by an LSTM model without an output gate:

[ i^w_{b,e} ; f^w_{b,e} ; c̃^w_{b,e} ] = [ σ ; σ ; tanh ]( W^{wT}·[ x^w_{b,e} ; h^c_b ] + b^w )

c^w_{b,e} = f^w_{b,e} ⊙ c^c_b + i^w_{b,e} ⊙ c̃^w_{b,e}

where x^w_{b,e} denotes the encoding vector of the phrase that starts at the b-th character and ends at the e-th character of the sentence; h^c_b denotes the hidden state at the b-th character of the sentence, i.e., the hidden state of the phrase's first character; W^{wT} and b^w are model parameters; i^w_{b,e} and f^w_{b,e} denote the input and forget gates, respectively; c̃^w_{b,e} denotes the new candidate state; c^c_b denotes the cell state at the phrase's first character; c^w_{b,e} denotes the updated cell state; σ(·) denotes the sigmoid function and tanh(·) denotes the hyperbolic tangent activation function.

An additional gate is further introduced to select between character-granularity and word-granularity information, taking as input the character's encoding vector and the cell state of the phrase ending at that character:

i^l_{b,e} = σ( W^{lT}·[ x^c_e ; c^w_{b,e} ] + b^l )

where x^c_e denotes the encoding vector of the e-th character of the sentence; c^w_{b,e} denotes the cell state of the phrase from the b-th to the e-th character, i.e., the cell state of the phrase ending at the e-th character of the sentence; W^{lT} and b^l are model parameters; i^l_{b,e} denotes the additional gate; and σ(·) denotes the sigmoid function.

The update of the cell state c^c_j thus changes, while the hidden-state update remains the same; the final lattice-LSTM representation is:

c^c_j = Σ_b α^w_{b,j} ⊙ c^w_{b,j} + α^c_j ⊙ c̃^c_j

where i^c_j is the input gate vector of the j-th character; i^l_{b,j} is the input gate vector of the phrase starting at b and ending at j; c^w_{b,j} is the phrase cell state; c̃^c_j is the character's new candidate cell state; α^w_{b,j} is the phrase information vector and α^c_j is the character information vector, obtained by normalizing the gates i^l_{b,j} and i^c_j over all phrases ending at j.

Step 3: the named entity recognition network inputs the hidden states obtained from the word segmentation and recombination network into the CRF layer and finally obtains the named entity recognition result; a loss value calculated from the recognition result is used to train the named entity recognition model, and the same loss value serves as a delayed reward guiding the update of the policy network module.

Once the sentence has been represented by the lattice-LSTM model, the hidden state vector h_i of each character in the sentence is obtained, and the state vector sequence H = {h_1, h_2, …, h_n} is input into the CRF layer. Letting y = l_1, l_2, …, l_n denote the output labels of the CRF layer, the probability of an output label sequence is calculated by:

P(y | s) = exp( Σ_{i=1}^{n} ( W^{CRF}_{l_i}·h_i + b^{CRF}_{l_{i-1},l_i} ) ) / Σ_{y'} exp( Σ_{i=1}^{n} ( W^{CRF}_{l'_i}·h_i + b^{CRF}_{l'_{i-1},l'_i} ) )

where s denotes the sentence; W^{CRF}_{l_i} is the model parameter specific to label l_i; b^{CRF}_{l_{i-1},l_i} is the bias parameter specific to the label pair (l_{i-1}, l_i); and y' ranges over all possible output label sequences.

The loss value function is calculated as:

L(θ) = -Σ_{i=1}^{N} log P(y_i | s_i) + (λ/2)·‖θ‖²

where λ is the L2 regularization coefficient, θ denotes the parameter set, and s and y denote a sentence and its correct label sequence, respectively.

The reward is defined as follows: once an action sequence has been sampled from the policy network, the segmentation of the sentence is obtained; the resulting phrases are added as word-granularity information to the character-granularity LSTM model, giving the lattice-LSTM representation, which is input into the named entity recognition network module; the entity label of each character is obtained through the CRF layer, the entity labels are decoded, and the reward value is calculated from the recognition result. Because the reward value can only be computed once the final recognition result is available, it is a delayed reward, and this delayed reward is used to guide the update of the policy network module.
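The patent states only that the loss value serves as the delayed reward; one standard way to turn such a delayed reward into a policy update (an assumption here) is a REINFORCE-style gradient, sketched below for the sigmoid policy defined earlier:

```python
import numpy as np

def policy_gradient_update(W, b, states, actions, probs, reward, lr=0.01):
    """One episode = one sentence. actions[t] is 1 for 'terminate', 0 for
    'inside'; probs[t] = pi(terminate | s_t). For a sigmoid policy the
    gradient of log pi at step t is (a_t - p_t) * s_t."""
    grad_W = np.zeros_like(W)
    grad_b = 0.0
    for s_t, a_t, p_t in zip(states, actions, probs):
        grad_W += (a_t - p_t) * s_t      # d log pi / dW
        grad_b += (a_t - p_t)            # d log pi / db
    W = W + lr * reward * grad_W         # ascend expected delayed reward
    b = b + lr * reward * grad_b
    return W, b
```

Using the negative NER loss as `reward` makes segmentations that lower the recognition loss more probable.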

Obviously, the above embodiments of the present invention are merely examples given to clearly illustrate the present invention and are not intended to limit its embodiments. For those skilled in the art, other changes or modifications in different forms can be made on the basis of the above description. It is neither necessary nor possible to list all embodiments here. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall be included within the protection scope of the claims of the present invention.

Claims (8)

1. A training method of a Chinese named entity recognition model based on reinforcement learning, characterized by comprising the following steps:
step one: inputting sentence data for training into a policy network module, wherein the policy network module samples one action for each character of a sentence under each state space and outputs an action sequence for the whole sentence;
step two: dividing the sentence by the word segmentation and recombination network according to the action sequence output by the policy network module, cutting the sentence into phrases, and combining the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the characters; the characters are represented at the character level by an LSTM, each phrase being obtained at a 'terminate' action, with the update formula:

c_t, h_t = f_LSTM(x_t, c_{t-1}, h_{t-1})

wherein f_LSTM denotes the LSTM transition function; x_t denotes the encoding vector of the character input at time t of the sentence; and c_t and h_t denote the cell state and hidden state at time t, respectively;
after the division of the sentence is completed, the phrase information is integrated into the character-granularity LSTM model, which is the basic recurrent LSTM function:

[ i^c_j ; f^c_j ; o^c_j ; c̃^c_j ] = [ σ ; σ ; σ ; tanh ]( W^{cT}·[ x^c_j ; h^c_{j-1} ] + b^c )

c^c_j = f^c_j ⊙ c^c_{j-1} + i^c_j ⊙ c̃^c_j

h^c_j = o^c_j ⊙ tanh(c^c_j)

wherein x^c_j denotes the encoding vector of the j-th character of the sentence; h^c_{j-1} denotes the hidden state at character j-1; W^{cT} and b^c are model parameters; i^c_j, f^c_j, and o^c_j denote the input, forget, and output gates, respectively; c̃^c_j denotes the new candidate state; c^c_{j-1} denotes the cell state at character j-1 of the sentence; c^c_j denotes the updated cell state; h^c_j denotes the hidden state at character j of the sentence, determined by the output gate o^c_j and the current cell state c^c_j; σ(·) denotes the sigmoid function and tanh(·) denotes the hyperbolic tangent activation function;
the phrase information is characterized by an LSTM model without an output gate:

[ i^w_{b,e} ; f^w_{b,e} ; c̃^w_{b,e} ] = [ σ ; σ ; tanh ]( W^{wT}·[ x^w_{b,e} ; h^c_b ] + b^w )

c^w_{b,e} = f^w_{b,e} ⊙ c^c_b + i^w_{b,e} ⊙ c̃^w_{b,e}

wherein x^w_{b,e} denotes the encoding vector of the phrase starting at the b-th character and ending at the e-th character of the sentence; h^c_b denotes the hidden state at the b-th character of the sentence, i.e., the hidden state of the phrase's first character; W^{wT} and b^w are model parameters; i^w_{b,e} and f^w_{b,e} denote the input and forget gates, respectively; c̃^w_{b,e} denotes the new candidate state; c^c_b denotes the cell state of the phrase's first character; c^w_{b,e} denotes the updated cell state; σ(·) denotes the sigmoid function; tanh(·) denotes the hyperbolic tangent activation function;
additionally, an additional gate is added to select between character-granularity and word-granularity information, taking as input the encoding vector of a character and the cell state of the phrase ending at that character:

i^l_{b,e} = σ( W^{lT}·[ x^c_e ; c^w_{b,e} ] + b^l )

wherein x^c_e denotes the encoding vector of the e-th character of the sentence; c^w_{b,e} denotes the cell state of the phrase starting at the b-th character and ending at the e-th character, i.e., the cell state of the phrase ending at the e-th character of the sentence; W^{lT} and b^l are model parameters; i^l_{b,e} denotes the additional gate; σ(·) denotes the sigmoid function;
the update of the cell state c^c_j thus changes while the update of the hidden state is unchanged, and the final lattice-LSTM representation is:

c^c_j = Σ_b α^w_{b,j} ⊙ c^w_{b,j} + α^c_j ⊙ c̃^c_j

wherein i^c_j is the input gate vector of the j-th character; i^l_{b,j} is the input gate vector of a phrase starting at b and ending at j; c^w_{b,j} is the phrase cell state; c̃^c_j is the character's new candidate cell state; α^w_{b,j} is the phrase information vector and α^c_j is the character information vector, obtained by normalizing the gates i^l_{b,j} and i^c_j over all phrases ending at j;
step three: inputting the hidden states obtained by the named entity recognition network from the word segmentation and recombination network into a conditional random field layer, finally obtaining a named entity recognition result, calculating a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as a delayed reward to guide the update of the policy network module;
after the sentence is represented by the lattice-LSTM model, the hidden state vector h_i of each character of the sentence is obtained, and the state vector sequence H = {h_1, h_2, …, h_n} is input into the conditional random field layer; letting y = l_1, l_2, …, l_n denote the output labels of the conditional random field layer, the output label sequence probability is calculated by:

P(y | s) = exp( Σ_{i=1}^{n} ( W^{CRF}_{l_i}·h_i + b^{CRF}_{l_{i-1},l_i} ) ) / Σ_{y'} exp( Σ_{i=1}^{n} ( W^{CRF}_{l'_i}·h_i + b^{CRF}_{l'_{i-1},l'_i} ) )

wherein s denotes the sentence; W^{CRF}_{l_i} is the model parameter specific to l_i; b^{CRF}_{l_{i-1},l_i} is the bias parameter specific to l_{i-1} and l_i; and y' ranges over all possible output label sequences;
the loss value function is calculated as:

L(θ) = -Σ_{i=1}^{N} log P(y_i | s_i) + (λ/2)·‖θ‖²

wherein λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and the correct label sequence corresponding to that sentence, respectively; and P denotes the probability that sentence s is labeled with sequence y, i.e., the probability of a correct labeling.
2. The training method of the Chinese named entity recognition model based on reinforcement learning according to claim 1, wherein in said step one the actions comprise 'inside' and 'terminate', and the stochastic policy is given by:

π(a_t | s_t; θ) = σ(W·s_t + b)

wherein π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the policy network; s_t is the state of the policy network at time t; σ(·) denotes the sigmoid function; and W and b denote network parameters.
3. The training method of the Chinese named entity recognition model based on reinforcement learning according to claim 1, wherein before said step one, the named entity recognition network and its network parameters are pre-trained, the words used by the named entity recognition network being obtained by segmenting the original sentences with a simple heuristic algorithm;
the pre-trained partial network parameters of the entity recognition network are temporarily fixed as the network parameters of the named entity recognition network, the policy network is then pre-trained, and finally the entire set of network parameters is trained jointly.
4. A Chinese named entity recognition model based on reinforcement learning, characterized by comprising a policy network module, a word segmentation and recombination network, and a named entity recognition network module, trained with the training method of any one of claims 1 to 3;
the policy network module is used for adopting a stochastic policy to sample one action for each character of a sentence under each state space, thereby obtaining an action sequence for the whole sentence;
the word segmentation and recombination network is used for dividing the sentence according to the action sequence output by the policy network module, cutting the sentence into phrases, and combining the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the sentence;
the named entity recognition network module is used for inputting the hidden states of the sentence's lattice-LSTM representation into the conditional random field, finally obtaining a named entity recognition result, calculating a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as a delayed reward to guide the update of the policy network module.
5. The Chinese named entity recognition model based on reinforcement learning according to claim 4, wherein said actions comprise 'inside' and 'terminate'.
6. The Chinese named entity recognition model based on reinforcement learning according to claim 4, wherein said stochastic policy is:

π(a_t | s_t; θ) = σ(W·s_t + b)

wherein π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the policy network; s_t is the state of the policy network at time t; σ(·) denotes the sigmoid function; and W and b denote network parameters.
7. The Chinese named entity recognition model based on reinforcement learning according to claim 6, wherein the word segmentation and recombination network cuts sentences into phrases according to the action sequence output by the policy network module, and encodes each phrase as the input to the cell state at the last character of the corresponding phrase, obtaining the lattice-LSTM representation of the sentence.
8. The Chinese named entity recognition model based on reinforcement learning according to claim 7, wherein the named entity recognition network module inputs the lattice-LSTM output of the word segmentation and recombination network into the conditional random field layer, scores each label sequence of the sentence using the feature function set of the conditional random field layer, exponentiates and normalizes the scores, computes over all possible label sequences with the first-order Viterbi algorithm, and takes the highest-scoring label sequence as the final output; a loss function is defined, its value is back-propagated for parameter training, and the loss value serves as the delayed reward updating the policy network module; the loss function is defined as the sentence-level negative log-likelihood with an L2 regularization term:

L(θ) = -Σ_{i=1}^{N} log P(y_i | s_i) + (λ/2)·‖θ‖²

wherein λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and its correct label sequence, respectively; and P denotes the probability that sentence s is labeled with sequence y, i.e., the probability of a correct labeling.
CN201911089295.3A 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof Expired - Fee Related CN110826334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911089295.3A CN110826334B (en) 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911089295.3A CN110826334B (en) 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof

Publications (2)

Publication Number Publication Date
CN110826334A CN110826334A (en) 2020-02-21
CN110826334B true CN110826334B (en) 2023-04-21

Family

ID=69553722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911089295.3A Expired - Fee Related CN110826334B (en) 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof

Country Status (1)

Country Link
CN (1) CN110826334B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476031A (en) * 2020-03-11 2020-07-31 重庆邮电大学 An Improved Chinese Named Entity Recognition Method Based on Lattice-LSTM
CN111539195B (en) * 2020-03-26 2025-04-11 中国平安人寿保险股份有限公司 Text matching training method and related equipment based on reinforcement learning
CN111666734B (en) * 2020-04-24 2021-08-10 北京大学 Sequence labeling method and device
CN111951959A (en) * 2020-08-23 2020-11-17 云知声智能科技股份有限公司 Dialogue guidance method, device and storage medium based on reinforcement learning
CN112151183B (en) * 2020-09-23 2024-11-22 上海海事大学 An entity recognition method for Chinese electronic medical records based on Lattice LSTM model
CN112163089B (en) * 2020-09-24 2023-06-23 中国电子科技集团公司第十五研究所 High-technology text classification method and system integrating named entity recognition
CN112699682B (en) * 2020-12-11 2022-05-17 山东大学 Named entity identification method and device based on combinable weak authenticator
CN113051921B (en) * 2021-03-17 2024-02-20 北京智慧星光信息技术有限公司 Internet text entity identification method, system, electronic equipment and storage medium
CN112966517B (en) * 2021-04-30 2022-02-18 平安科技(深圳)有限公司 Training method, device, equipment and medium for named entity recognition model
CN114386046A (en) * 2021-12-28 2022-04-22 绿盟科技集团股份有限公司 An unknown vulnerability detection method, device, electronic device and storage medium
CN114004233B (en) * 2021-12-30 2022-05-06 之江实验室 Remote supervision named entity recognition method based on semi-training and sentence selection
CN114692634B (en) * 2022-01-27 2024-12-31 清华大学 Chinese named entity recognition and classification method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117472A (en) * 2018-11-12 2019-01-01 新疆大学 A kind of Uighur name entity recognition method based on deep learning
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition
CN109597876A (en) * 2018-11-07 2019-04-09 中山大学 A kind of more wheels dialogue answer preference pattern and its method based on intensified learning
CN109657239A (en) * 2018-12-12 2019-04-19 电子科技大学 The Chinese name entity recognition method learnt based on attention mechanism and language model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition
CN109597876A (en) * 2018-11-07 2019-04-09 中山大学 A kind of more wheels dialogue answer preference pattern and its method based on intensified learning
CN109117472A (en) * 2018-11-12 2019-01-01 新疆大学 A kind of Uighur name entity recognition method based on deep learning
CN109657239A (en) * 2018-12-12 2019-04-19 电子科技大学 The Chinese name entity recognition method learnt based on attention mechanism and language model

Also Published As

Publication number Publication date
CN110826334A (en) 2020-02-21

Similar Documents

Publication Publication Date Title
CN110826334B (en) Chinese named entity recognition model based on reinforcement learning and training method thereof
Lin et al. ASRNN: A recurrent neural network with an attention model for sequence labeling
Yao et al. An improved LSTM structure for natural language processing
CN110489555B (en) Language model pre-training method combined with similar word information
Kim et al. Two-stage multi-intent detection for spoken language understanding
CN111062217B (en) Language information processing method and device, storage medium and electronic equipment
CN110866401A (en) Chinese electronic medical record named entity identification method and system based on attention mechanism
CN112712804A (en) Speech recognition method, system, medium, computer device, terminal and application
CN109871541B (en) A Named Entity Recognition Applicable to Multilingual and Multi-Domain
CN111767718B (en) Chinese grammar error correction method based on weakened grammar error feature representation
CN112151183A (en) An entity recognition method for Chinese electronic medical records based on Lattice LSTM model
CN111428490A (en) A Weakly Supervised Learning Method for Referential Resolution Using Language Models
CN112420191A (en) A TCM auxiliary decision-making system and method
CN118277573B (en) Pre-hospital emergency text classification labeling method based on ChatGLM model, electronic equipment, storage medium and computer program product
CN114386409B (en) Self-distillation Chinese word segmentation method based on attention mechanism, terminal and storage medium
CN110442860A (en) Name entity recognition method based on time convolutional network
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN112818698A (en) Fine-grained user comment sentiment analysis method based on dual-channel model
CN114781651A (en) Small sample learning robustness improving method based on contrast learning
CN117057350B (en) Chinese electronic medical record named entity recognition method and system
Han et al. MAF‐CNER: A Chinese Named Entity Recognition Model Based on Multifeature Adaptive Fusion
CN114021549A (en) Chinese Named Entity Recognition Method and Device Based on Vocabulary Enhancement and Multi-feature
CN115510242A (en) Chinese medicine text entity relation combined extraction method
CN113012685B (en) Audio recognition method, device, electronic device and storage medium
Qiu Construction of english speech recognition model by fusing cnn and random deep factorization tdnn

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20230421)