CN114154504A - Chinese named entity recognition algorithm based on multi-information enhancement - Google Patents
Chinese named entity recognition algorithm based on multi-information enhancement
- Publication number
- CN114154504A
- Authority
- CN
- China
- Prior art keywords
- information
- layer
- entity
- speech
- attention mechanism
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Character Discrimination (AREA)
- Machine Translation (AREA)
Abstract
Chinese named entity recognition methods that combine character information with word information currently achieve good results, and methods that further enhance the input with glyph information have brought some additional improvement. However, two problems remain unsolved: the input lacks semantic information, and nested entities cause entity recognition errors. To address these issues, the MIEM (Multi-Information Enhancement Method) model is proposed. MIEM first enhances the input features by adding part-of-speech information in the embedding layer, and adds a nested-entity position information matrix, based on binary tree structure coding, to the position information coding. It then encodes the embedded information with a self-attention mechanism. In addition, an MD layer (More Details layer) is designed to replace the traditional residual structure and enlarge the receptive field of the model so that more information is acquired. This design enhances both the expression of the input information and the entity boundary information, and addresses the problems that entity boundaries are unclear and that nested entities reduce recognition accuracy. Finally, a neural network model enhanced with the embedded information and the position coding information is constructed to solve the named entity recognition errors caused by nested entities in Chinese named entity recognition.
Description
Technical Field
The invention relates to the fields of deep learning and natural language processing, and in particular to a named entity recognition method based on multi-information enhancement.
Background
With the continuous development of artificial intelligence, natural language processing is increasingly widely used in practice. Named Entity Recognition (NER) is a basic technology of natural language processing, and its accuracy determines the effect of many downstream tasks, such as translation, question answering, search matching and semantic analysis. The entities recognized by NER mainly comprise 3 major categories (entity, time and numerical), 7 subcategories (person name, place name, organization name, time, date, currency and percentage) and proper nouns. NER is essentially a sequence tagging problem that aims to accurately identify the entities in a text and assign each to a class. At present, however, the recognition accuracy for named entities in social media such as microblogs is not high.
On the one hand, Chinese characters have more complicated semantics than English, and the same word can be expressed in more diverse ways. English words, by contrast, carry some natural part-of-speech information. For example, "action", "education" and "organization" share the suffix "-tion" and are all nouns; similarly, "adjustable", "respectable" and "reasonable" share the suffix "-able" and are all adjectives. English words have many similar characteristics, so an English word carries extra part-of-speech information that a Chinese word lacks. On the other hand, entity nesting is common in everyday usage: a shorter entity in the text is contained in another, longer entity. Many sentences contain nested entities. For example, "American Project Management Association" is an organization name, but the "American" (USA) it contains is a place name. Such cases make entity recognition difficult, and nested entities are an important factor affecting entity recognition accuracy.
In natural language processing, named entity recognition was first performed on the basis of word segmentation. The main problem with this approach is that inaccurate word segmentation often propagates errors. Character-based named entity recognition methods later overcame this problem, but they lack the underlying word information. Because Chinese named entity recognition based only on characters or only on word segmentation has these problems, (Yue Zhang, Jie Yang. Chinese NER Using Lattice LSTM [C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL, 2018: 1554-1564.) performs Chinese named entity recognition by combining character and word information. Recently, (Shuang Wu, Xiaoning Song, Zhenhua Feng. MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, ACL, 2021: 1529-1539.) added the radical information of characters to the input in the embedding layer and obtained a certain effect. Judging from the current research trend, the accuracy of Chinese named entity recognition still needs improvement in some fields, such as social media.
In summary, considering the problems of nested entities and low accuracy in deep-learning-based named entity recognition networks, a named entity recognition method based on multi-information enhancement is designed. By enhancing both the embedded information and the position information, the model can learn richer input features and learn the information of nested entities, which improves the accuracy of Chinese named entity recognition.
Disclosure of Invention
The invention aims to design a multi-information enhanced Chinese named entity recognition algorithm that accurately recognizes entities in text, and, based on this method, to fine-tune a pre-trained model for the specific field in which named entity recognition is applied so as to achieve the best effect.
The invention provides a Chinese named entity recognition method based on multi-information enhancement. An embedded information module processes the input sentence: it adds part-of-speech information to the input of Chinese named entity recognition, transfers the word-based part-of-speech information to the character level, and, in the embedding layer, fuses the character information, the word information and the part-of-speech information into the input features. The constructed nested-entity position information matrix, based on binary tree structure coding, is sent together with the input features into a self-attention mechanism, which models the input information of the embedding layer. A feedforward neural network and the proposed novel residual structure then capture details from the output of the self-attention mechanism to obtain a deep representation. Finally, a conditional random field learns the relations between labels to obtain the final entity prediction.
The invention mainly comprises two parts: an embedded information enhancement method and a position coding information enhancement method.
The method specifically comprises the following steps:
1. Acquire an input sentence, perform part-of-speech tagging on it, transfer the part-of-speech tags to the character level, and finally fuse the character information, word information and part-of-speech information into the final input features;
2. Construct a Chinese named entity recognition network based on multi-information enhancement, which mainly comprises part-of-speech information enhancement and nested entity matrix information enhancement;
3. Pre-train the network on an open source data set;
4. Fine-tune the pre-trained neural network by transfer learning, using a small self-made labeled Chinese named entity recognition data set;
5. Use the network obtained after transfer learning to predict on the named entity recognition data in the prepared test set and obtain the final detected entities.
The Chinese named entity recognition network based on multi-information enhancement in the above steps is the main content of the invention. It provides a dual information enhancement method: embedded information enhancement and position information coding enhancement.
In the embedding layer, the input is first preprocessed and the word information corresponding to each character is matched, while part-of-speech information is added using the natural language processing library spaCy. The input lemmas are then matched against pre-trained character vectors and word vectors, and the resulting embedding information is passed through a linear layer, whose output serves as the input of the model.
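The transfer of word-level part-of-speech tags to the character level can be illustrated with a minimal sketch. It assumes the tagger has already produced (word, tag) pairs; the function name `pos_to_char_level`, the example sentence and the tag names are illustrative assumptions and are not taken from the patent text.

```python
from typing import List, Tuple


def pos_to_char_level(tagged_words: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Copy each word's part-of-speech tag onto every character of that word."""
    char_tags: List[Tuple[str, str]] = []
    for word, pos in tagged_words:
        for ch in word:
            char_tags.append((ch, pos))
    return char_tags


# Hypothetical tagger output; in practice these pairs would come from a
# part-of-speech tagger such as spaCy.
tagged = [("重庆市", "PROPN"), ("长江", "PROPN"), ("大桥", "NOUN")]
char_level = pos_to_char_level(tagged)
```

Each character then carries the tag of the word it belongs to, so the tag sequence has the same length as the character sequence and can be fused with the character features.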
In the attention module, the Transformer-XL attention calculation method is adopted. For the position coding part, the binary-tree-based nested entity matrix position information coding is combined with the position information coding method of the FLAT network, so that the information of nested entities is retained and the information among other lemmas is not lost. The attention module is calculated as follows:
Att(A,V)=softmax(A)V
where i denotes the i-th lemma and ij denotes the relationship between the i-th lemma and the j-th lemma. Q, K and V are different linear transformations of the input matrix, which holds the character, word and part-of-speech information features fused in the embedding layer; u and v are learnable parameters. R_Binary and R_FLAT are the position information codings in the attention mechanism and model the position information between the lemmas of the input sentence. The coding of R_Binary is shown in the figures, and the complete position information coding is obtained by splicing (concatenating) R_Binary and R_FLAT.
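As a rough illustration of Att(A,V)=softmax(A)V, the sketch below computes single-head attention in plain Python. The score used for A_ij, (Q_i + u)·K_j + (Q_i + v)·R_ij, is an assumption modeled on Transformer-XL-style relative attention; the text above does not spell out the score formula, and the function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(row: Vector) -> Vector:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention(Q: Matrix, K: Matrix, V: Matrix,
              R: List[Matrix], u: Vector, v: Vector) -> Matrix:
    """Att(A, V) = softmax(A) V with an assumed relative-position score A."""
    n = len(Q)
    out: Matrix = []
    for i in range(n):
        q_u = [q + b for q, b in zip(Q[i], u)]
        q_v = [q + b for q, b in zip(Q[i], v)]
        # Assumed score: content term plus position term.
        scores = [dot(q_u, K[j]) + dot(q_v, R[i][j]) for j in range(n)]
        weights = softmax(scores)
        out.append([sum(weights[j] * V[j][c] for j in range(n))
                    for c in range(len(V[0]))])
    return out


# Toy input: with all-zero queries and biases every score is 0, so each
# output row is the plain average of the rows of V.
Q = [[0.0, 0.0], [0.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
R = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
result = attention(Q, K, V, R, u=[0.0, 0.0], v=[0.0, 0.0])
```

Here R[i][j] stands for the spliced position coding of the pair (i, j); a multi-head version would run the same computation once per head on separate projections.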
In the feedforward neural network module, the learned "distributed feature representation" is mapped to the sample label space through a linear layer. To learn more detailed features, the invention replaces the original residual structure with the proposed MD layer, which captures the detailed features and finally outputs a feature matrix. For the overall output structure of the network, the sum of the outputs of two parallel networks is used as the input of the CRF, which reduces errors and improves the robustness of the network.
Due to the adoption of the technical scheme, the invention has the following advantages:
1. English words carry some natural part-of-speech information; for example, the suffixes "-tion" and "-able" indicate nouns and adjectives respectively. Compared with English, Chinese characters have more complex semantics and the same word can be expressed in more diverse ways, but Chinese has no such feature. If part-of-speech information is added to the input of Chinese named entity recognition, the model can learn richer semantic information, which improves the performance of the entity recognition model. The invention therefore uses the natural language processing tool spaCy to tag part-of-speech information, adds it to the embedding layer, and transfers the tag information of words to the characters so as to give the input richer semantic features; the resulting form is shown as the embedding layer in the drawings. To embed the input, the invention matches the input characters and words against a pre-trained vocabulary to obtain their vectors, and randomly initializes the vector of any character or word that has none. The original character-based representation is extended with the words matched to each character, and finally the part-of-speech information obtained from the natural language processing tool is added, so that the total input comprises the character information, the matched word information and the part-of-speech information.
2. The invention provides a position information coding that carries entity nesting information. It combines the relative position information between lemmas with the positional relations between nested entities, and addresses the influence of nested entities on the accuracy of Chinese named entity recognition. In the self-attention module, the position information and the input information are fused, so that the model can attend to both the semantic relations and the positional relations among the lemmas.
3. For the residual part of the feedforward neural network, in order to obtain a larger receptive field, the invention provides a novel residual structure, the MD Layer (More Details Layer), to obtain more hidden information; the position of the MD layer in the model is shown in the drawings. The drawings also show how the MD layer is implemented: the input features are first expanded N times through a linear layer, the expanded features are then sliced, and the slices are finally added to obtain the output, which keeps the dimension unchanged.
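The vocabulary matching with random initialization mentioned in advantage 1 might look like the following sketch. The function name `lookup_embedding`, the uniform range of plus or minus 0.1 and the toy table are assumptions for illustration only; the text does not state how the random vectors are drawn.

```python
import random
from typing import Dict, List


def lookup_embedding(token: str, table: Dict[str, List[float]],
                     dim: int, rng: random.Random) -> List[float]:
    """Return the pre-trained vector for `token`.

    If the token is missing, a random vector is created, stored in the
    table, and returned, so later lookups of the same token agree.
    """
    if token not in table:
        table[token] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return table[token]


rng = random.Random(0)                      # fixed seed for repeatability
pretrained = {"重庆": [0.1, 0.2, 0.3]}       # toy stand-in for a vocabulary
known = lookup_embedding("重庆", pretrained, 3, rng)
unknown = lookup_embedding("大桥", pretrained, 3, rng)
```

Caching the random vector in the table is a design choice of this sketch: it keeps the representation of an unseen character or word stable across sentences, which a trainable embedding table would also do.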
Drawings
In order to make the purpose, technical scheme and beneficial effect of the invention more clear, the invention provides the following drawings for description:
FIG. 1 is a flow chart of the Chinese named entity recognition method based on multi-information enhancement according to the present invention;
FIG. 2 is a schematic diagram of a binary tree based position encoding structure according to the present invention;
FIG. 3 is a schematic diagram of a binary tree structure of a matrix form of position information encoding according to the present invention;
FIG. 4 is a calculation diagram of the attention mechanism of the present invention;
FIG. 5 is a schematic diagram of an MD layer implementation of the present invention.
Detailed description of the preferred embodiments
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
The invention provides a Chinese named entity recognition algorithm based on multi-information enhancement which, as shown in FIG. 1, specifically comprises the following steps:
Step 2: constructing a neural network that fuses part-of-speech information and nested entity position information codes, and inputting the lemmas into the network for learning.
Step 5: sending the output of the encoder to a CRF (Conditional Random Field) to obtain the final predicted entities.
Detailed Description
Step 1: Acquire an input sentence and preprocess it in the input preprocessing module, using operations such as vocabulary matching and part-of-speech matching, to enhance the expressive features of the input.
Step 2: Input the preprocessed sentence into the self-attention mechanism module, in which a position coding structure based on a binary tree is constructed. A solid circle indicates that the current node can form a word with the two characters of the next node of its left subtree. Taking the sentence "Chongqing Yangtze River Bridge" (重庆市长江大桥) in FIG. 2 as an example, the words formed by two consecutive characters are "Chongqing" (重庆), "mayor" (市长), "Yangtze River" (长江) and "bridge" (大桥); these words are circled with solid ovals. A dotted circle indicates that the current node can form a word with several nodes of its left subtree, shown in FIG. 2 as "Chongqing City" (重庆市) and "Yangtze River Bridge" (长江大桥). The binary tree structured position information coding of FIG. 2 is represented by the matrix of FIG. 3, in which the dotted diagonal represents the connections between the nodes of the left subtree of the binary tree, a downward solid arrow represents a word that the current node can form with the two characters of the next node of its left subtree, and a rightward solid arrow represents a word that the current node can form with several nodes of its left subtree. In this way the entity position coding based on the binary tree structure is mapped to a matrix representation. The lemmas of the sentence are encoded in this manner, and the specific matrix input of the binary tree structured position information coding is shown in FIG. 3. The feature extraction module is the encoder module of a Transformer network, using an attention mechanism whose position coding has been modified.
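The idea of mapping candidate word spans to a matrix can be sketched as follows. This is a simplified stand-in and not the exact encoding of FIG. 3, which is not reproduced in this text: the cell values 1 (two-character word) and 2 (longer word), the function name `span_matrix` and the toy lexicon are all assumptions made for illustration.

```python
from typing import List, Set


def span_matrix(sentence: str, lexicon: Set[str]) -> List[List[int]]:
    """Mark every (start, end) character span that forms a lexicon word.

    Cell value 1: a two-character word; cell value 2: a longer word.
    All other cells stay 0.
    """
    n = len(sentence)
    matrix = [[0] * n for _ in range(n)]
    for start in range(n):
        for end in range(start + 1, n):
            if sentence[start:end + 1] in lexicon:
                matrix[start][end] = 1 if end - start == 1 else 2
    return matrix


sentence = "重庆市长江大桥"
# Hypothetical lexicon containing the words named in the example sentence.
lexicon = {"重庆", "市长", "长江", "大桥", "重庆市", "长江大桥"}
matrix = span_matrix(sentence, lexicon)
```

Overlapping cells such as (0, 1) and (0, 2) record that a shorter word is nested inside a longer one, which is the kind of boundary information the position coding is meant to expose to the attention mechanism.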
Step 3: The Chinese named entity recognition network is built with the PyTorch framework. The position of the multi-head attention mechanism in the overall framework is shown in FIG. 1, its calculation diagram (Multi-Head Attention) is shown in FIG. 4, and the overall calculation formula is:
Att(A,V)=softmax(A)V
In the formula, Q, K and V are different linear transformations of the input vector, u and v are learnable parameters, and the position information is fused from R_Binary_ij and R_FLAT_ij.
R_FLAT_ij is calculated from the position embeddings of h_i-h_j and t_i-t_j, where h_i-h_j is the distance between the heads of the i-th and j-th lemmas and, in the same way, t_i-t_j is the distance between their tails. d_model is the dimension of the model, and the distances d are obtained as follows:
d_ij^(hh) = head[i] - head[j]
d_ij^(tt) = tail[i] - tail[j]
In the formulas, hh denotes the distance from head[i] to head[j] and tt denotes the distance from tail[i] to tail[j], where i denotes the i-th lemma and j denotes the j-th lemma.
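The head and tail distances can be computed directly from the (head, tail) span of each lemma, as in the sketch below. The `sinusoidal` function is an assumption: it uses the standard Transformer sine and cosine encoding with d_model as the embedding size, which the text above does not confirm, and it expects an even d_model. All names are hypothetical.

```python
import math
from typing import List, Tuple

Span = Tuple[int, int]  # (head, tail) character indices of one lemma


def head_tail_distances(spans: List[Span]):
    """Return the matrices hh[i][j] = head[i] - head[j] and tt[i][j] = tail[i] - tail[j]."""
    hh = [[hi - hj for (hj, _) in spans] for (hi, _) in spans]
    tt = [[ti - tj for (_, tj) in spans] for (_, ti) in spans]
    return hh, tt


def sinusoidal(d: int, d_model: int) -> List[float]:
    """Assumed Transformer-style encoding of a signed distance d (d_model even)."""
    vec: List[float] = []
    for k in range(d_model // 2):
        angle = d / (10000 ** (2 * k / d_model))
        vec.extend([math.sin(angle), math.cos(angle)])
    return vec


# Two characters (spans of length one) followed by the word they form.
spans = [(0, 0), (1, 1), (0, 1)]
hh, tt = head_tail_distances(spans)
p_zero = sinusoidal(0, 4)
```

In this toy input the word span (0, 1) starts with the first character (hh distance 0) and ends with the second (tt distance 0), which is how the encoding tells the model that the word covers exactly those two characters.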
Step 4: In the feedforward neural network part of the network, in order to obtain a larger receptive field, the invention provides a novel residual structure, the MD Layer (More Details Layer), to obtain more hidden information. The position of the MD layer in the model is shown in FIG. 1, and FIG. 5 shows how it is implemented: the input features are first expanded N times through a linear layer, the expanded features are then sliced, and the slices are finally added to obtain the output, which keeps the dimensionality unchanged. For the current Chinese named entity recognition task, the value of N in the MD layer was determined by experiment, and N = 2 gave the best experimental results. In addition, to prevent overfitting during training, layer normalization (LayerNorm) is added to the feedforward neural network part.
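The expand, slice and add procedure of the MD layer can be sketched for a single feature vector as follows. This is a minimal reading of the description, assuming a plain linear layer with weight and bias, N equal slices of the input width, and no activation or extra skip connection; none of these details is confirmed by the text, and the function name `md_layer` is hypothetical.

```python
from typing import List


def md_layer(x: List[float], weight: List[List[float]],
             bias: List[float], n: int) -> List[float]:
    """Expand x to n times its width, slice into n parts, and sum the parts."""
    d = len(x)
    # Linear layer: weight has n * d rows and d columns.
    expanded = [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weight, bias)]
    if len(expanded) != n * d:
        raise ValueError("weight and bias must have n * len(x) rows")
    slices = [expanded[k * d:(k + 1) * d] for k in range(n)]
    # Element-wise sum of the slices restores the original width d.
    return [sum(s[c] for s in slices) for c in range(d)]


x = [1.0, 2.0]
# Toy weights: the identity stacked twice, so each slice equals x.
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0, 0.0, 0.0]
output = md_layer(x, weight, bias, n=2)
```

With N = 2, the value reported as best in the experiments, the layer learns two different projections of the same input and adds them, so the output width matches the input width.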
Step 5: The output of the encoder is sent to the CRF layer, and the final predicted entities are obtained through the conditional random field's constraint learning on the label information.
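At prediction time a CRF layer typically selects the best label sequence with Viterbi decoding. The sketch below shows that decoding step only, under assumed inputs: per-token emission scores from the encoder and a learned transition score matrix. The tag set (O, B, I), the scores and the function name `viterbi` are illustrative assumptions, not values from the patent.

```python
from typing import List


def viterbi(emissions: List[List[float]],
            transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag sequence.

    emissions[t][k]   : score of tag k at position t
    transitions[p][k] : score of moving from tag p to tag k
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    back: List[List[int]] = []
    for emit in emissions[1:]:
        prev = score
        score, pointers = [], []
        for cur in range(n_tags):
            best_prev = max(range(n_tags),
                            key=lambda p: prev[p] + transitions[p][cur])
            pointers.append(best_prev)
            score.append(prev[best_prev] + transitions[best_prev][cur] + emit[cur])
        back.append(pointers)
    best = max(range(n_tags), key=lambda k: score[k])
    path = [best]
    for pointers in reversed(back):   # follow back-pointers to the start
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Tags: 0 = O, 1 = B, 2 = I. The large negative score forbids O -> I.
transitions = [[0.0, 0.0, -1e4],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
emissions = [[0.1, 2.0, 0.0],
             [0.0, 0.0, 1.5],
             [1.0, 0.0, 0.9]]
best_path = viterbi(emissions, transitions)
```

The transition scores are what let the CRF apply label constraints: a sequence such as O followed by I is ruled out even when the per-token scores alone would prefer it.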
Step 6: Train the constructed Chinese named entity recognition network. Using transfer learning, the network is first pre-trained on open source data from related fields, and the pre-trained network is then fine-tuned on a self-made labeled Chinese entity recognition data set.
Claims (6)
1. A Chinese named entity recognition method based on multi-information enhancement, characterized in that text content can be processed to obtain the required proper nouns, the method specifically comprising the following steps:
step 1, collecting the text sentences that a user needs recognized, adding part-of-speech tags to the input words with the natural language processing tool spaCy, transferring the part-of-speech information of the words to the character level, and fusing the character, word and part-of-speech information as the embedded information;
step 2, constructing a Chinese named entity recognition network based on multi-information enhancement, which mainly comprises a part-of-speech information embedding module, a position information coding module for the nested entity matrix, and a novel feedforward neural network module based on a detail capturing layer;
step 3, performing named entity recognition on the input sentence with the trained neural network to obtain the required entity types.
2. The Chinese named entity recognition method based on multi-information enhancement as claimed in claim 1, wherein the constructed network comprises an information embedding module, a self-attention mechanism module based on the position information of the nested entity matrix, a novel feedforward neural network module and a CRF label constraint module; the information embedding module obtains the embedded vector representations of characters and words by matching a pre-trained vocabulary, then adds part-of-speech tagging information and transfers the part-of-speech information to a character-level expression, and for an out-of-vocabulary word (Out-Of-Vocabulary), the part-of-speech tagging information is randomly initialized; the self-attention mechanism module sends the embedded information and the position information based on the nested entity matrix into the self-attention mechanism to obtain the final feature input, wherein the position information enhancement part adopts a nested entity position information coding based on a binary tree structure, fused with the position information coding of the FLAT network; in the feedforward neural network module, the detail layer (More Details Layer) provided by the invention is used instead of a common residual layer to capture deeper feature information and relearn the features obtained by the self-attention mechanism; the CRF (Conditional Random Field) label constraint module models the dependencies or constraints within the label sequence, learns the relations among the labels, and finally outputs the prediction result.
3. The method as claimed in claim 2, wherein the model has part-of-speech information in the embedding layer: the part-of-speech information is added through spaCy in the embedding layer and transferred to the characters, and is fused with the character information and the word information in the embedding layer, providing richer features for the network model.
4. The Chinese named entity recognition method based on multi-information enhancement as claimed in claim 2, wherein the self-attention mechanism module (Multi-Head Attention) encodes the embedded information with a multi-head attention mechanism and learns the long- and short-distance dependencies between the input lemmas, the attention mechanism being calculated as:
Att(A,V)=softmax(A)V
wherein i denotes the i-th lemma and ij denotes the relationship between the i-th lemma and the j-th lemma; Q, K and V are different linear transformations of the input matrix; u and v are learnable parameters; R_Binary and R_FLAT are the position information codings in the attention mechanism, used to model the position information between the lemmas of the input sentence; and the complete position information coding is formed by splicing R_Binary and R_FLAT.
5. The method as claimed in claim 2, wherein the feedforward neural network module performs feature mapping on the output of the attention mechanism with a linear layer, and more detailed feature information is obtained by replacing the common residual structure with the detail layer (More Details Layer) proposed by the invention.
6. The Chinese named entity recognition method based on multi-information enhancement as claimed in claim 1, wherein the Chinese named entity recognition operation mainly comprises: performing part-of-speech tagging on the input sentence; transferring the part-of-speech tagging information to a character-level expression; fusing the character information, word information and part-of-speech information as the output of the embedding layer; learning from the embedding layer information and the nested entity matrix information in the self-attention mechanism; performing feature mapping through the improved novel feedforward neural network to obtain an output sequence; and finally sending the output sequence into the CRF layer for label constraint learning to obtain the named entities.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111472663.XA CN114154504B (en) | 2021-12-06 | 2021-12-06 | Chinese named entity recognition algorithm based on multi-information enhancement |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111472663.XA CN114154504B (en) | 2021-12-06 | 2021-12-06 | Chinese named entity recognition algorithm based on multi-information enhancement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114154504A (en) | 2022-03-08 |
CN114154504B (en) | 2024-08-13 |
Family
ID=80452741
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111472663.XA Active CN114154504B (en) | 2021-12-06 | 2021-12-06 | Chinese named entity recognition algorithm based on multi-information enhancement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114154504B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818711A (en) * | 2022-04-27 | 2022-07-29 | 天津大学 | Neural network-based multi-information fusion named entity identification method |
CN114970532A (en) * | 2022-05-18 | 2022-08-30 | 重庆邮电大学 | Chinese named entity recognition method based on embedded distribution improvement |
CN115329766A (en) * | 2022-08-23 | 2022-11-11 | 中国人民解放军国防科技大学 | Named entity identification method based on dynamic word information fusion |
CN115688777A (en) * | 2022-09-28 | 2023-02-03 | 北京邮电大学 | Named entity recognition system for nested and discontinuous entities of Chinese financial text |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150178392A1 (en) * | 2013-12-20 | 2015-06-25 | Chacha Search, Inc. | Method and system of providing a search tool |
US20150199333A1 (en) * | 2014-01-15 | 2015-07-16 | Abbyy Infopoisk Llc | Automatic extraction of named entities from texts |
CN110188193A (en) * | 2019-04-19 | 2019-08-30 | 四川大学 | A kind of electronic health record entity relation extraction method based on most short interdependent subtree |
CN111651989A (en) * | 2020-04-13 | 2020-09-11 | 上海明略人工智能(集团)有限公司 | Named entity recognition method and device, storage medium and electronic device |
CN112270193A (en) * | 2020-11-02 | 2021-01-26 | 重庆邮电大学 | Chinese named entity identification method based on BERT-FLAT |
CN112446216A (en) * | 2021-02-01 | 2021-03-05 | 华东交通大学 | Method and device for identifying nested named entities fusing with core word information |
CN112487202A (en) * | 2020-11-27 | 2021-03-12 | 厦门理工学院 | Chinese medical named entity recognition method and device fusing knowledge map and BERT |
CN113449524A (en) * | 2021-04-01 | 2021-09-28 | 山东英信计算机技术有限公司 | Named entity identification method, system, equipment and medium |
Non-Patent Citations (2)
Title |
---|
GUANGYAO WANG et al.: "Nested Named Entity Recognition via an Independent-Layered Pretrained Model", 《IEEE ACCESS》, no. 9, 5 August 2021 (2021-08-05), pages 109693, XP011870980, DOI: 10.1109/ACCESS.2021.3102685 * |
任乐乐 (Ren Lele) et al.: "Named entity recognition in the nutrition and health domain using fused rules and the BERT-FLAT model", 《202011201643》, vol. 37, no. 20, 23 October 2021 (2021-10-23), pages 211 - 218 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818711A (en) * | 2022-04-27 | 2022-07-29 | 天津大学 | Neural network-based multi-information fusion named entity identification method |
CN114970532A (en) * | 2022-05-18 | 2022-08-30 | 重庆邮电大学 | Chinese named entity recognition method based on embedded distribution improvement |
CN115329766A (en) * | 2022-08-23 | 2022-11-11 | 中国人民解放军国防科技大学 | Named entity identification method based on dynamic word information fusion |
CN115688777A (en) * | 2022-09-28 | 2023-02-03 | 北京邮电大学 | Named entity recognition system for nested and discontinuous entities of Chinese financial text |
CN115688777B (en) * | 2022-09-28 | 2023-05-05 | 北京邮电大学 | Named entity recognition system for nested and discontinuous entities of Chinese financial text |
Also Published As
Publication number | Publication date |
---|---|
CN114154504B (en) | 2024-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111310471B (en) | Travel named entity identification method based on BBLC model | |
CN112989834B (en) | Named entity identification method and system based on flat grid enhanced linear converter | |
CN111738004A (en) | Training method of named entity recognition model and named entity recognition method | |
CN114154504B (en) | Chinese named entity recognition algorithm based on multi-information enhancement | |
CN117151220B (en) | Entity link and relationship based extraction industry knowledge base system and method | |
CN112818698B (en) | Fine-grained user comment sentiment analysis method based on dual-channel model | |
CN111309918A (en) | Multi-label text classification method based on label relevance | |
CN113657123A (en) | Mongolian aspect level emotion analysis method based on target template guidance and relation head coding | |
CN114153973A (en) | Mongolian multi-mode emotion analysis method based on T-M BERT pre-training model | |
CN115688784A (en) | Chinese named entity recognition method fusing character and word characteristics | |
CN117933258A (en) | Named entity identification method and system | |
CN116340513A (en) | Multi-label emotion classification method and system based on label and text interaction | |
Park et al. | Natural language generation using dependency tree decoding for spoken dialog systems | |
CN115169349A (en) | Chinese electronic resume named entity recognition method based on ALBERT | |
CN116384371A (en) | Combined entity and relation extraction method based on BERT and dependency syntax | |
CN114330350A (en) | Named entity identification method and device, electronic equipment and storage medium | |
CN113901813A (en) | Event extraction method based on topic features and implicit sentence structure | |
CN117390131B (en) | Text emotion classification method for multiple fields | |
CN116522165B (en) | Public opinion text matching system and method based on twin structure | |
CN112989839A (en) | Keyword feature-based intent recognition method and system embedded in language model | |
CN112733526B (en) | Extraction method for automatically identifying tax collection object in financial file | |
Yolchuyeva | Novel NLP Methods for Improved Text-To-Speech Synthesis | |
CN113434698A (en) | Relation extraction model establishing method based on full-hierarchy attention and application thereof | |
CN114021549B (en) | Chinese named entity recognition method and device based on vocabulary enhancement and multiple features | |
Zhang et al. | Social Media Named Entity Recognition Based On Graph Attention Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 20240717. Address after: Room 401, Building 13, Bailian Huigu, No. 118 Yinhe Road, Dianjun District, Yichang City, Hubei Province, China 443000. Applicant after: Yichang Jinhui Big Data Industry Development Co.,Ltd. Country or region after: China. Address before: 400065 No. 2, Chongwen Road, Nan'an District, Chongqing. Applicant before: CHONGQING University OF POSTS AND TELECOMMUNICATIONS. Country or region before: China |
GR01 | Patent grant | |