CN112861539A

CN112861539A - Nested named entity recognition method and device, electronic equipment and storage medium

Info

Publication number: CN112861539A
Application number: CN202110283633.8A
Authority: CN
Inventors: 曾祥荣; 刘升平; 梁家恩
Original assignee: Unisound Intelligent Technology Co Ltd; Xiamen Yunzhixin Intelligent Technology Co Ltd
Current assignee: Unisound Intelligent Technology Co Ltd; Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority date: 2021-03-16
Filing date: 2021-03-16
Publication date: 2021-05-28
Anticipated expiration: 2041-03-16
Also published as: CN112861539B

Abstract

The invention relates to a nested named entity recognition method and device, electronic equipment and a storage medium. The method comprises: obtaining a token sequence; determining a semantic representation from the token sequence; determining a feature map matrix from the token sequence and the semantic representation; predicting a word-level matrix from the feature map matrix; and identifying named entities from the values of the word-level matrix. The method is an entity recognition method based on image semantic segmentation. Treating recognition as semantic segmentation avoids the entity overlap problem; combining the semantic representation, the feature map matrix and the word-level matrix lets the model attend to both local and global information; and reading entities from the coordinates and categories of the word-level matrix improves named entity recognition.

Description

Nested named entity recognition method and device, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of computers, in particular to a nested named entity identification method, a nested named entity identification device, electronic equipment and a storage medium.

Background

The Named Entity Recognition (NER) task mainly aims to extract entities of specific types from text, where the entity types include person names, place names, times, organization names and the like. Nested named entities are a special form of named entity in which a recognized entity may itself contain other entities; for example, "Shandong University" contains "Shandong", which is a place name. Traditional named entity recognition models based on sequence labeling have difficulty handling the case in which one word corresponds to multiple labels, so researchers have proposed models designed specifically for nested named entity recognition.

At present, nested named entity recognition methods include methods based on sequence multi-label classification, methods based on Machine Reading Comprehension (MRC) and methods based on Seq2Seq sequence generation, but there is no named entity recognition method based on semantic segmentation.

Disclosure of Invention

The invention provides a nested named entity recognition method, a nested named entity recognition device, electronic equipment and a storage medium.

The technical scheme for solving the technical problems is as follows:

in a first aspect, an embodiment of the present invention provides a method for identifying a nested named entity, including:

obtaining a token sequence;

determining a semantic representation from the token sequence;

determining a feature map matrix from the token sequence and the semantic representation;

predicting a word-level matrix from the feature map matrix;

and identifying named entities from the values of the word-level matrix.

In some embodiments, the above method further comprises: regarding the feature map matrix as a d-channel image and using a segmentation layer to predict the word-level matrix, wherein the segmentation layer uses the UNet structure from image semantic segmentation.

In some embodiments, the UNet structure in the above method is formed by two down-sampling modules and two up-sampling modules connected across layers,

wherein each downsampling module comprises two convolution layers and a maximum pooling layer;

wherein each upsampling module includes two convolutional layers and one deconvolution (transposed convolution) layer.

In some embodiments, the predicting a word-level matrix according to the feature map matrix in the above method further includes:

a fully connected network performs single-label classification on each element of the matrix to obtain the word-level matrix;

the abscissa of each element in the word-level matrix corresponds to the start position of an entity in the sentence;

the ordinate of each element in the word-level matrix corresponds to the end position of the entity in the sentence.

In some embodiments, identifying named entities from word-level matrix values includes:

and determining the entity according to the category and the coordinate value of each element in the word level matrix.

In some embodiments, determining a semantic representation from the sequence of tokens comprises:

determining the corresponding word embedding, sentence embedding and position embedding from the token sequence;

adding the word embedding, the sentence embedding and the position embedding;

and inputting the summed embeddings of the token sequence into a BERT model to obtain the semantic representation.

In some embodiments, in the above method, the feature map matrix is determined from the token sequence and the semantic representation by similarity calculation.

In a second aspect, an embodiment of the present invention further provides a nested named entity recognition apparatus, including:

an acquisition module for obtaining a token sequence;

a first determination module for determining a semantic representation from the token sequence;

a second determination module for determining a feature map matrix from the token sequence and the semantic representation;

a prediction module for predicting a word-level matrix from the feature map matrix;

an identification module for identifying named entities from the values of the word-level matrix.

In some embodiments, in the above apparatus, the feature map matrix is regarded as a d-channel image and a segmentation layer is used to predict the word-level matrix, the segmentation layer using the UNet structure from image semantic segmentation.

In some embodiments, the UNet structure in the above apparatus is formed by two downsampling modules and two upsampling modules connected across layers.

In some embodiments, the apparatus for predicting a word-level matrix according to the feature map matrix further includes:

In some embodiments, the identification module in the above apparatus is further configured to determine the entity according to the category and the coordinate value of each element in the word-level matrix.

In some embodiments, the first determining module in the above apparatus is further configured to:

In some embodiments, in the above apparatus, the feature map matrix is determined from the token sequence and the semantic representation by similarity calculation.

In a third aspect, an embodiment of the present invention further provides an electronic device, including: a processor and a memory;

the processor is configured to execute any one of the nested named entity recognition methods described above by calling a program or instructions stored in the memory.

In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a program or instructions, and the program or instructions cause a computer to execute any one of the nested named entity identification methods described above.

The invention has the following beneficial effects. The invention relates to a nested named entity recognition method and device, electronic equipment and a storage medium, wherein the method comprises: obtaining a token sequence; determining a semantic representation from the token sequence; determining a feature map matrix from the token sequence and the semantic representation; predicting a word-level matrix from the feature map matrix; and identifying named entities from the values of the word-level matrix. The method is an entity recognition method based on image semantic segmentation. Treating recognition as semantic segmentation avoids the entity overlap problem; combining the semantic representation, the feature map matrix and the word-level matrix lets the model attend to both local and global information; and reading entities from the coordinates and categories of the word-level matrix improves named entity recognition.

Drawings

Fig. 1 is a diagram of a nested named entity recognition method according to an embodiment of the present invention;

Fig. 2 is an architecture diagram of a nested named entity recognition method according to an embodiment of the present invention;

Fig. 3 is a second diagram of a nested named entity recognition method according to an embodiment of the present invention;

Fig. 4 is a diagram of a nested named entity recognition apparatus according to an embodiment of the present invention;

Fig. 5 is a schematic block diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.

Fig. 1 is a diagram of a nested named entity recognition method according to an embodiment of the present invention.

With reference to fig. 1, in a first aspect, an embodiment of the present invention provides a method for identifying a nested named entity, including:

S101: Obtaining a token sequence.

Specifically, in the embodiment of the present application, the token sequence is denoted x = ([cls], x₁, x₂, x₃, x₄, ..., x_n, [sep]). The final hidden state corresponding to the token [cls] is typically used for classification tasks, and the token [sep] marks the end of a sentence; both tokens are symbols defined by the BERT model.

S102: Determining a semantic representation from the token sequence.

Specifically, in the embodiment of the present application, the token sequence is input into the BERT model to obtain the semantic representation e = (e_[cls], e₁, e₂, e₃, e₄, ..., e_n, e_[sep]).

S103: Determining a feature map matrix from the token sequence and the semantic representation.

Specifically, in the embodiment of the present application, the feature map matrix is determined by similarity calculation.

S104: Predicting a word-level matrix from the feature map matrix.

Specifically, in the embodiment of the present application, the feature map matrix is regarded as a d-channel image, and the segmentation layer is used to predict a word-level matrix, which is similar to a pixel mask.

S105: Identifying named entities from the values of the word-level matrix.

Specifically, in the embodiment of the present application, the entity is determined according to the category and the coordinate value of each element in the word level matrix.
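Returning to step S101, the sequence x can be illustrated with a minimal Python sketch that wraps a tokenized sentence in the [cls] and [sep] symbols. The function name `build_sequence` and the example tokens (pinyin syllables standing in for the four characters of "Shandong University") are illustrative assumptions and are not part of the disclosed embodiment.

```python
def build_sequence(tokens):
    """Return x = ([cls], x_1, ..., x_n, [sep]) for a list of tokens."""
    return ["[cls]"] + list(tokens) + ["[sep]"]


# Hypothetical tokens: one syllable per character of the example sentence.
x = build_sequence(["Shan", "dong", "Da", "xue"])
print(x)  # ['[cls]', 'Shan', 'dong', 'Da', 'xue', '[sep]']
```

With this indexing, position 0 is [cls] and the sentence tokens occupy positions 1 to n, so positions 1 and 2 cover "Shandong" and positions 1 to 4 cover the whole name.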

Each downsampling module comprises two convolutional layers and one max pooling layer.

Specifically, in the embodiment of the present application, the segmentation layer uses the UNet structure from image semantic segmentation. The structure resembles the letter U and is formed by two downsampling modules and two upsampling modules connected across layers. Each downsampling module comprises two convolutional layers and one max pooling layer; downsampling enlarges the receptive field of each element of the image and provides rich global information for the final classification. Each upsampling module comprises two convolutional layers and one deconvolution (transposed convolution) layer.
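To make the U shape concrete, the following sketch traces how the spatial size of an n x n feature map changes through two downsampling and two upsampling modules. It only tracks shapes and does not implement the convolutions. The padding scheme, the pooling and deconvolution strides, the channel widths and the function name `unet_shapes` are all assumptions chosen for illustration; the source does not specify them.

```python
def unet_shapes(n, channels=(64, 128)):
    """Trace (stage, height, width, channels) through a two-level UNet-like stack.

    Assumptions (not stated in the source): convolutions keep the spatial
    size ('same' padding), max pooling is 2x2 with stride 2, and each
    deconvolution doubles the spatial size. Channel widths are hypothetical.
    """
    if n % 4 != 0:
        raise ValueError("pad the input so that n is a multiple of 4")
    shapes = []
    size = n
    # Two downsampling modules: two convolutions, then max pooling.
    for c in channels:
        shapes.append(("down", size, size, c))  # after the two convolutions
        size //= 2                              # after max pooling
    # Two upsampling modules: two convolutions, then deconvolution.
    for c in reversed(channels):
        shapes.append(("up", size, size, c))    # after the two convolutions
        size *= 2                               # after the deconvolution
    shapes.append(("out", size, size, channels[0]))
    return shapes


for stage in unet_shapes(8):
    print(stage)
```

Under these assumptions the output has the same n x n size as the input, and each upsampled map matches the size of one downsampling stage, which is what allows the cross-layer connections.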

A fully connected network performs single-label classification on each element of the matrix to obtain the word-level matrix.

The abscissa of each element in the word-level matrix corresponds to the start position of an entity in the sentence.

Specifically, in the embodiment of the present application, after processing by the downsampling and upsampling modules, a fully connected network performs single-label classification on each element of the matrix to obtain the word-level matrix. The abscissa and ordinate of each element correspond to the start position and end position, respectively, of a potential entity in the sentence. Combining the BERT encoding layer with the segmentation layer captures both local and global information for each matrix element. The total number of categories is c + 1. For example, if the entity types are person name, place name, organization name and time, then c is 4; the one additional category indicates that the span is not an entity, similar to the background category in image semantic segmentation.
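As an illustration of the c + 1 categories, the sketch below builds a target word-level matrix from a list of entity spans. The label abbreviations, the choice of id 0 for the extra "not an entity" category, and the function name `build_label_matrix` are assumptions made for this example only.

```python
# Category ids: 0 is the extra "not an entity" class, 1..c are entity types.
LABELS = ["O", "PER", "LOC", "ORG", "TIME"]  # c = 4, so c + 1 = 5 categories


def build_label_matrix(n, entities):
    """Build an n x n matrix in which element [start][end] holds a category id."""
    matrix = [[0] * n for _ in range(n)]
    for start, end, label in entities:
        if not 0 <= start <= end < n:
            raise ValueError("a span must satisfy 0 <= start <= end < n")
        matrix[start][end] = LABELS.index(label)
    return matrix


# A sequence of length 6: [cls], four tokens, [sep].
# Tokens 1-2 form a place name and tokens 1-4 form an organization name.
gold = build_label_matrix(6, [(1, 2, "LOC"), (1, 4, "ORG")])
for row in gold:
    print(row)
```

Because the two nested entities occupy different elements of the matrix, each element still needs only a single label.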

Fig. 2 is an architecture diagram of a nested named entity recognition method according to an embodiment of the present invention.

Specifically, in the embodiment of the present application, entities are determined from the category and coordinates of each element in the matrix. For example, in Fig. 2, the matrix elements at coordinates (1,2) and (1,4) are a place name and an organization name, respectively. In the coordinates (1,2), the abscissa 1 is the position where the entity starts and the ordinate 2 is the position where it ends; locating these start and end positions in the sentence shows that the entity "Shandong" is a place name. The coordinates (1,4) are resolved in the same way. Note that because an entity cannot end before it starts, entity coordinates cannot appear in the lower triangular region of the matrix; excluding that region reduces false recognitions and also reduces the computation of the loss function.
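The decoding rule described above can be sketched as a scan over the upper triangle of a predicted matrix. The function name `decode_entities`, the label abbreviations, the use of id 0 for "not an entity" and the pinyin example tokens are illustrative assumptions.

```python
def decode_entities(matrix, labels, tokens):
    """Read entities from a word-level matrix.

    matrix[start][end] holds a category id, and id 0 means "not an entity".
    Only elements with start <= end are visited, so anything predicted in
    the lower triangle is ignored.
    """
    n = len(matrix)
    found = []
    for start in range(n):
        for end in range(start, n):
            category = matrix[start][end]
            if category != 0:
                text = "".join(tokens[start:end + 1])
                found.append((start, end, labels[category], text))
    return found


labels = ["O", "PER", "LOC", "ORG", "TIME"]
tokens = ["[cls]", "Shan", "dong", "Da", "xue", "[sep]"]
predicted = [[0] * 6 for _ in range(6)]
predicted[1][2] = 2  # place name at coordinates (1, 2)
predicted[1][4] = 3  # organization name at coordinates (1, 4)
predicted[3][1] = 1  # spurious lower-triangle value, never read

print(decode_entities(predicted, labels, tokens))
# [(1, 2, 'LOC', 'Shandong'), (1, 4, 'ORG', 'ShandongDaxue')]
```

For a sequence of length n, the scan visits n(n+1)/2 elements instead of n², which is the same saving that applies when the loss is computed only over the upper triangle.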

Fig. 3 is a second diagram of a method for identifying a nested named entity according to an embodiment of the present invention.

In some embodiments, with reference to Fig. 3, determining a semantic representation from the token sequence includes:

S301: Determining the corresponding word embedding, sentence embedding and position embedding from the token sequence.

S302: Adding the word embedding, the sentence embedding and the position embedding.

S303: Inputting the summed embeddings of the token sequence into the BERT model to obtain the semantic representation.
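The addition in the second of these steps is an element-wise sum of three vectors per token. The sketch below shows it with tiny hand-written vectors; the function name `add_embeddings` and the numeric values are made up for illustration, and the BERT encoder itself is not reproduced here.

```python
def add_embeddings(word, sentence, position):
    """Element-wise sum of three embedding sequences of equal shape."""
    if not (len(word) == len(sentence) == len(position)):
        raise ValueError("all three embedding sequences must have the same length")
    return [
        [w + s + p for w, s, p in zip(w_vec, s_vec, p_vec)]
        for w_vec, s_vec, p_vec in zip(word, sentence, position)
    ]


# Two tokens with embedding dimension 2 (toy values).
word = [[1.0, 2.0], [3.0, 4.0]]
sentence = [[0.5, 0.5], [0.5, 0.5]]
position = [[0.0, 1.0], [1.0, 0.0]]
summed = add_embeddings(word, sentence, position)
print(summed)  # [[1.5, 3.5], [4.5, 4.5]]
```

The result has one vector per token, which is the form in which the sequence is passed to the encoder.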

Specifically, the process of obtaining the semantic representation can be seen intuitively with reference to the figure.

Specifically, in the embodiment of the present application, several similarity measures are fused to encode the correlation between words.

The relation between the i-th word x_i and the j-th word x_j of the input sentence is a vector F(x_i, x_j) = [e_i W e_j; cos(e_i, e_j); MultiHead(e_i, e_j)], whose three components are the bilinear similarity, the cosine similarity and the multi-head attention, respectively, and the dimension of this vector is regarded as the number of channels in the image. Here W is a learnable parameter, h is the number of attention heads, and each head has a corresponding vector dimension.

MultiHead(e_i, e_j) = Concat(head₁, head₂, ..., head_h)
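The fused feature can be sketched in plain Python as follows. The bilinear and cosine terms follow the formula for F directly. The definition of an individual head is not recoverable from the text, so the `multi_head` function below uses a scaled dot product over equal slices of the two vectors purely as a stand-in assumption; all function names are hypothetical.

```python
import math


def bilinear(e_i, W, e_j):
    """Bilinear similarity e_i W e_j, with W a square matrix."""
    We_j = [sum(w * x for w, x in zip(row, e_j)) for row in W]
    return sum(a * b for a, b in zip(e_i, We_j))


def cosine(e_i, e_j):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(a * a for a in e_i)) * math.sqrt(sum(b * b for b in e_j))
    return sum(a * b for a, b in zip(e_i, e_j)) / norm if norm else 0.0


def multi_head(e_i, e_j, h):
    """Assumed head definition: scaled dot product on each of h equal slices."""
    d_head = len(e_i) // h
    heads = []
    for k in range(h):
        a = e_i[k * d_head:(k + 1) * d_head]
        b = e_j[k * d_head:(k + 1) * d_head]
        heads.append(sum(x * y for x, y in zip(a, b)) / math.sqrt(d_head))
    return heads  # Concat(head_1, ..., head_h)


def feature(e_i, e_j, W, h):
    """F(x_i, x_j) = [bilinear; cosine; MultiHead], a vector of length 2 + h."""
    return [bilinear(e_i, W, e_j), cosine(e_i, e_j)] + multi_head(e_i, e_j, h)


def feature_map(e, W, h):
    """n x n matrix whose element [i][j] is the vector F(x_i, x_j)."""
    return [[feature(e_i, e_j, W, h) for e_j in e] for e_i in e]


e = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
W = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]  # identity
fm = feature_map(e, W, h=2)
print(len(fm), len(fm[0]), len(fm[0][0]))  # 2 2 4
```

In this sketch the channel count d equals 2 + h, so the n x n feature map can be handed to the segmentation layer as a d-channel image.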

Fig. 4 is a diagram of a nested named entity recognition apparatus according to an embodiment of the present invention.

The acquisition module 401: for obtaining a token sequence.

Specifically, in this embodiment of the present application, the acquisition module obtains the token sequence, which is denoted x = ([cls], x₁, x₂, x₃, x₄, ..., x_n, [sep]). The final hidden state corresponding to the token [cls] is typically used for classification tasks, and the token [sep] marks the end of a sentence; both tokens are symbols defined by the BERT model.

The first determination module 402: for determining a semantic representation from the token sequence.

Specifically, in this embodiment of the present application, the first determination module inputs the token sequence into the BERT model to obtain the semantic representation e = (e_[cls], e₁, e₂, e₃, e₄, ..., e_n, e_[sep]).

The second determination module 403: for determining a feature map matrix from the token sequence and the semantic representation.

Specifically, in the embodiment of the present application, the second determination module 403 determines the feature map matrix by similarity calculation.

The prediction module 404: for predicting a word-level matrix from the feature map matrix.

The identification module 405: for identifying named entities based on word-level matrix values.

In some embodiments, the UNet structure in the above apparatus is formed by two down-sampling modules and two up-sampling modules connected across layers.

In some embodiments, the first determining module 402 of the apparatus is further configured to:

determine the corresponding word embedding, sentence embedding and position embedding from the token sequence;

add the word embedding, the sentence embedding and the position embedding.

In some embodiments, the second determination module 403 in the above apparatus is further configured to determine the feature map matrix from the token sequence and the semantic representation by similarity calculation.

Fig. 5 is a schematic block diagram of an electronic device provided by an embodiment of the present disclosure.

As shown in Fig. 5, the electronic device includes at least one processor 501, at least one memory 502 and at least one communication interface 503. The components of the electronic device are coupled together by a bus system 504. The communication interface 503 is used for information transmission with external devices. The bus system 504 enables communication among the components and, in addition to a data bus, includes a power bus, a control bus and a status signal bus. For clarity of illustration, the various buses are all labeled as bus system 504 in Fig. 5.

It will be appreciated that the memory 502 in this embodiment can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory.

In some embodiments, the memory 502 stores the following elements, executable units or data structures, or a subset or an extended set thereof: an operating system and application programs.

The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application programs, including various application programs such as a Media Player (Media Player), a Browser (Browser), etc., are used to implement various application services. The program for implementing any one of the nested named entity recognition methods provided by the embodiments of the present application may be included in an application program.

In this embodiment of the present application, by calling a program or instructions stored in the memory 502, specifically a program or instructions stored in an application program, the processor 501 executes the steps of the embodiments of the nested named entity recognition method provided in this application:

obtaining a token sequence;

determining a semantic representation from the token sequence;

determining a feature map matrix from the token sequence and the semantic representation;

predicting a word-level matrix from the feature map matrix;

and identifying named entities from the values of the word-level matrix.

Any one of the nested named entity recognition methods provided by the embodiments of the present application may be applied to the processor 501 or implemented by the processor 501. The processor 501 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 501 or by instructions in the form of software. The processor 501 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The steps of any of the nested named entity recognition methods provided by the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software units in the decoding processor. The software units may be located in storage media well known in the art, such as RAM, flash memory, ROM, PROM or EPROM, and registers. The storage medium is located in the memory 502; the processor 501 reads the information in the memory 502 and completes the steps of the method in combination with its hardware.

Those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments instead of others, combinations of features of different embodiments are meant to be within the scope of the application and form different embodiments.

Those skilled in the art will appreciate that the description of each embodiment has a respective emphasis, and reference may be made to the related description of other embodiments for those parts of an embodiment that are not described in detail.

Although the embodiments of the present application have been described with reference to the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the application, and such modifications and variations fall within the scope defined by the appended claims. The scope of the present invention is not limited to the specific embodiments above; any equivalent modification or substitution that a person skilled in the art can readily conceive within the technical scope of the present disclosure is intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. The nested named entity recognition method is characterized by comprising the following steps:

obtaining a token sequence;

determining a semantic representation from the token sequence;

determining a feature map matrix from the token sequence and the semantic representation;

predicting a word-level matrix from the feature map matrix;

and identifying named entities from the values of the word-level matrix.

2. The nested named entity recognition method of claim 1, further comprising: regarding the feature map matrix as a d-channel image and using a segmentation layer to predict the word-level matrix, wherein the segmentation layer uses the UNet structure from image semantic segmentation.

3. A nested named entity recognition method according to claim 2, characterized in that the UNet structure is formed by two down-sampling modules and two up-sampling modules connected across layers,

4. The method of claim 1, wherein predicting a word-level matrix from the feature map matrix further comprises:

5. The method of claim 1, wherein identifying a named entity according to word level matrix values comprises:

6. The method of claim 1, wherein determining a semantic representation from the token sequence comprises:

7. The method of claim 1, wherein the determining a feature map matrix from the token sequence and the semantic representation is determined from a similarity calculation.

8. A nested named entity recognition apparatus, comprising:

an acquisition module for obtaining a token sequence;

9. An electronic device, comprising: a processor and a memory;

the processor is configured to execute the nested named entity recognition method of any one of claims 1 to 7 by calling a program or instructions stored in the memory.

10. A computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores a program or instructions for causing a computer to execute the nested named entity recognition method according to any one of claims 1 to 7.