WO2022127124A1 - Method and apparatus for entity category recognition based on meta-learning, device, and storage medium - Google Patents
- Publication number
- WO2022127124A1 (application PCT/CN2021/109617)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sample
- data
- entity category
- identified
- query
- Prior art date
Classifications
- G06F40/289 — Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295 — Named entity recognition
- G06F40/211 — Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/284 — Lexical analysis, e.g. tokenisation or collocates
- G06N20/00 — Machine learning
Definitions
- these five datasets are not very large, and the number of entity categories is limited: fewer than 30 entity classes in total.
- the entity category tree in the real world is much larger than 30 categories.
- the traditional approach is to label data for every entity category, but this is unrealistic.
- in practice, when a new entity category appears there are often only 10 to 100 samples of it. Retraining the model on these samples is not realistic, because the model would inevitably suffer from class imbalance and overfitting.
- according to various embodiments disclosed in the present application, a meta-learning-based entity category recognition method, apparatus, device, and storage medium are provided.
- a meta-learning-based entity category recognition method including:
- a meta-learning-based entity category recognition device comprising:
- a newly added entity category acquisition module, configured to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
- a data-to-be-identified acquisition module, configured to acquire the data to be identified;
- an entity identification module, configured to input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained based on meta-learning.
- a computer device comprising a memory and one or more processors, the memory having computer-readable instructions stored therein which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
- One or more computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
- the above-mentioned meta-learning-based entity category recognition method, apparatus, device, and storage medium determine reference samples according to the newly added entity category, and input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified. This requires no manual intervention and no specialized knowledge in the field of artificial intelligence, which greatly reduces labor costs; when a new entity category appears, the model does not need to be retrained, and only a few reference samples are needed to determine whether the entity category is present in the data to be identified.
- FIG. 1 is a schematic flowchart of a meta-learning-based entity category identification method according to one or more embodiments.
- FIG. 2 is a schematic flowchart of a meta-learning-based entity category identification method according to one or more other embodiments.
- FIG. 3 is a structural block diagram of an apparatus for identifying entity categories based on meta-learning according to one or more embodiments.
- FIG. 4 is a block diagram of a computer device in accordance with one or more embodiments.
- a meta-learning-based entity category identification method is provided. This embodiment is illustrated by applying the method to a terminal. It can be understood that the method can also be applied to a server, or to a system including a terminal and a server, in which case it is realized through interaction between the terminal and the server.
- the method includes the following steps:
- the newly added entity category may be the name of a newly added entity, and there may be one or more newly added entity categories.
- the reference samples are samples belonging to the newly added entity category; the number of reference samples may be 10 or more, but should not be too large.
- the server may establish the correspondence between the newly added entity category and its reference samples, for example by grouping.
- the same reference sample may belong to multiple newly added entity categories, that is, a reference sample may be labeled with multiple entity categories.
- the data to be identified is the data on which entity category recognition is to be performed; it may be newly added data or existing data.
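The grouping and the multi-label relationship described above can be sketched with a simple index structure. The category names (COM, ORG), the sample texts, and the dictionary layout are illustrative, not specified by the application:

```python
# A reference sample may carry several newly added entity categories at once;
# one simple grouping keeps, per category, the indices of its reference samples.
reference_samples = ["sample text 0", "sample text 1", "sample text 2"]
sample_labels = {0: {"COM", "ORG"}, 1: {"COM"}, 2: {"ORG"}}  # multi-label

by_category = {}
for idx, cats in sample_labels.items():
    for c in cats:
        by_category.setdefault(c, []).append(idx)
# by_category maps each newly added category to its reference-sample indices,
# e.g. sample 0 appears under both COM and ORG.
```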
- S106: Input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, where the entity category recognition model is trained based on meta-learning.
- the entity category recognition model is trained based on meta-learning: multiple meta-training tasks are constructed from the sample data, and the entity category recognition model is then obtained by training on the constructed meta-training tasks.
- in a meta-training task, a small number of support samples and a large number of query samples are given; training on the support samples and query samples yields an entity category recognition model that can identify the entity category of the data to be identified from only a few samples of a new entity category.
- the server inputs the reference samples and the data to be identified into the pre-generated entity category recognition model, so that the model identifies the newly added entity category corresponding to the reference samples in each piece of data to be identified.
- the process by which the entity category recognition model identifies the newly added entity category corresponding to the reference samples in each piece of data to be identified may include: processing the reference samples and the data to be identified; calculating the high-level feature representation of the data to be identified using the processed reference samples; and processing the high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be identified.
- the process of processing the reference samples and the data to be recognized may include: serializing the words in the reference samples and the data to be recognized, performing high-level feature representation on the serialized words, and finally performing an average pooling operation on the high-level representations to obtain the vector representations corresponding to the reference samples and the data to be recognized.
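The average pooling step can be sketched as follows, assuming each sample has already been encoded into per-word high-level feature vectors. The array shapes and toy values are illustrative:

```python
import numpy as np

def mean_pool(token_features: np.ndarray) -> np.ndarray:
    """Average-pool per-word features (seq_len, dim) into one sample vector (dim,)."""
    return token_features.mean(axis=0)

# Toy example: 4 words, each with a 3-dimensional "high-level" feature.
tokens = np.array([[1.0, 2.0, 3.0],
                   [3.0, 2.0, 1.0],
                   [0.0, 0.0, 0.0],
                   [4.0, 4.0, 4.0]])
pooled = mean_pool(tokens)  # one vector representing the whole sample
```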
- the step in which the server calculates the high-level feature representation of the data to be identified using the processed reference samples may be performed according to the following formula:
- the step of processing the high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be recognized includes: inputting the high-level features into a predetermined fully-connected layer, which maps the feature vector dimension of each word to a preset dimension, such as 3 dimensions.
- the three dimensions represent the word labels O, B, and I: the word does not belong to this category, belongs to this category and is at the beginning of the entity, or belongs to this category and is inside the entity.
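A minimal illustration of the O/B/I labeling scheme for one target category. The sentence and the entity "John Smith" are hypothetical (the application's own examples elide names as AB, CD, EF):

```python
# One sentence, tagged for a single category (e.g. PER) in the O/B/I scheme:
# B = first word of an entity of this category, I = subsequent word, O = other.
words  = ["my", "name", "is", "John", "Smith", "."]
labels = ["O",  "O",    "O",  "B",    "I",     "O"]

def extract_entities(words, labels):
    """Collect the spans labeled B(I)* as entity strings."""
    entities, current = [], []
    for w, t in zip(words, labels):
        if t == "B":                      # a new entity starts here
            if current:
                entities.append(" ".join(current))
            current = [w]
        elif t == "I" and current:        # continue the current entity
            current.append(w)
        else:                             # O, or stray I: close any open entity
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities
```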
- the above-mentioned new entity category and the entity category of the data to be identified can also be stored in a node of a blockchain.
- the above-mentioned meta-learning-based entity category identification method determines reference samples according to the newly added entity category and inputs the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified. This requires no manual intervention and no specialized knowledge in the field of artificial intelligence, which greatly reduces labor costs; when a new entity category appears, the model does not need to be retrained, and only a few reference samples are needed to determine whether the entity category is present in the data to be identified.
- inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified, and performing high-level feature representation on the serialized words; performing an average pooling operation on the words after the high-level feature representation to obtain the vector representations of the reference samples and the data to be identified; processing the vectorized representation of the data to be identified through the vectorized representation of the reference samples to obtain the high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
- the high-level feature representation of each word of the reference sample and the data to be recognized can be obtained by the following formula:
- the average pooling operation is used to obtain a unified vector representation, which is used to represent the entire reference sample and the data to be recognized:
- the obtained s_rep represents the feature representation of the entire reference sample, and q_rep represents the feature representation of the entire data to be identified.
- the higher-order feature representation of the data to be identified can be obtained according to the reference sample:
- the atten function is used to calculate the contribution of each reference sample to the recognition of named entities in the data to be recognized.
- T is a real number that controls the sharpness of the distribution obtained by the atten function.
- k is the index of the reference sample; its maximum value is determined by the number of reference samples.
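Since the formula itself is not reproduced in this text, the following is only a plausible sketch of such an attention step: a temperature-scaled softmax over the k reference-sample representations, whose weighted summary is combined with the per-word features of the data to be identified. The similarity measure (dot product) and the concatenation used to form the higher-order features are assumptions, not taken from the application:

```python
import numpy as np

def atten(q_rep: np.ndarray, s_reps: np.ndarray, T: float = 1.0) -> np.ndarray:
    """Weight of each of the k reference samples for one query.

    q_rep:  (dim,)   pooled representation of the data to be identified
    s_reps: (k, dim) pooled representations of the k reference samples
    T:      temperature; smaller T gives a sharper (more peaked) distribution
    """
    scores = s_reps @ q_rep / T        # similarity of the query to each sample
    scores -= scores.max()             # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()     # softmax over the k samples

def higher_order_features(q_tokens, s_reps, q_rep, T=1.0):
    """Mix reference-sample information into each query-word feature (one simple
    choice: concatenate the attention-weighted summary onto every word)."""
    summary = atten(q_rep, s_reps, T) @ s_reps            # (dim,)
    return np.hstack([q_tokens, np.tile(summary, (q_tokens.shape[0], 1))])
```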
- the server obtains the final feature representation of each word in the query sample, and then passes it through a fully-connected layer to map the feature vector dimension of each word to 3 dimensions.
- these three dimensions represent the word labels O, B, and I: the word does not belong to this category, belongs to this category and is at the beginning of the entity, or belongs to this category and is inside the entity; this gives the newly added entity category corresponding to the reference samples in the data to be identified.
- the training method of the entity category recognition model includes: acquiring sample data, and constructing multiple groups of meta training samples according to the sample data; and obtaining the entity category recognition model by training according to the meta training samples.
- the sample data may be preset samples that have been classified.
- the meta-training samples are constructed from the sample data; each meta-training sample may include multiple support samples and multiple query samples. The support samples may include sample data from multiple groups, that is, sample data belonging to different categories, and the corresponding query samples are likewise drawn from the corresponding groups.
- the number of groups of meta-training samples can be set as required, such as 10,000; the entity category recognition model is then obtained by training on the meta-training samples in turn until its accuracy reaches the expected level. The accuracy of the entity category recognition model can be computed from the meta-training samples: for example, the support samples and query samples of a meta-training sample are input into the model to determine the entity categories of the query samples, and these are compared with the real entity categories of the query samples; when the accuracy is sufficient, model training is complete.
- acquiring sample data and constructing multiple groups of meta-training samples from the sample data includes: acquiring the sample data, grouping it according to entity categories, and randomly extracting at least one group from the groups; taking a first number of sample data in the extracted group(s) as support samples and a second number as query samples; obtaining one group of meta-training samples from the support samples and query samples; and repeatedly extracting at least one group at random to obtain multiple groups of meta-training samples.
- the server first obtains the original sample data, then processes the original sample data to obtain a data set corresponding to each category, and then starts to construct a training set.
- in order to train a meta-learning model, the server first needs to construct a series of meta-training samples.
- the construction rules are as follows:
- for each category, a first number of samples, such as 10, is randomly selected as support samples, and a second number, such as 100, is randomly selected as query samples; in total this yields 30 support samples and 3,000 query samples.
- the server converts the data set constructed in this way into a meta-training task, the purpose of which is to train the model to classify the query samples given the support samples.
- the server can build 10,000 such meta-training tasks.
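The construction rule above (randomly chosen categories, a first number of support samples and a second number of query samples per category) can be sketched as follows. The parameter defaults, the 3-category choice, and the data layout are illustrative, not fixed by the application:

```python
import random

def build_meta_task(groups: dict, n_support: int = 10, n_query: int = 100,
                    n_categories: int = 3, seed: int = 0):
    """Build one meta-training task: for each of n_categories randomly chosen
    entity categories, draw n_support support samples and n_query query samples.

    groups: maps each entity category to its list of samples.
    """
    rng = random.Random(seed)
    cats = rng.sample(sorted(groups), n_categories)
    support, query = [], []
    for c in cats:
        pool = rng.sample(groups[c], n_support + n_query)  # no overlap
        support += [(c, s) for s in pool[:n_support]]
        query   += [(c, s) for s in pool[n_support:]]
    return support, query

# e.g. 10,000 such tasks:
# tasks = [build_meta_task(groups, seed=i) for i in range(10000)]
```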
- obtaining sample data and grouping the sample data according to entity categories includes: obtaining sample data grouped according to initial entity categories, and grouping the sample data within the initial entity categories according to target entity categories; standardizing the sample data grouped by target entity category; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
- the initial entity categories come from the open-source MSRA, People's Daily, Weibo, CLUENER, and BOSON datasets collected from the Internet. Since the annotation formats of these datasets are not uniform, the data must be preprocessed and unified into the BIO annotation format.
- specifically, traditional named entity recognition datasets are labeled in different formats: some use the BIO format and some use the BIEO format; here the BIEO format is converted to the BIO format.
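A minimal sketch of the format unification, assuming the usual mapping in which the BIEO end tag E is relabeled as I (the application does not spell out the exact rule):

```python
def bieo_to_bio(tags):
    """Convert BIEO tags to BIO by relabeling E (entity end) as I;
    B, I, and O tags are already valid BIO and pass through unchanged."""
    return ["I" + t[1:] if t.startswith("E") else t for t in tags]

bieo_to_bio(["B-PER", "I-PER", "E-PER", "O", "B-LOC", "E-LOC"])
# -> ['B-PER', 'I-PER', 'I-PER', 'O', 'B-LOC', 'I-LOC']
```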
- the target entity categories marked within the initial entity categories include, for example, PER (person name), LOC (location), ORG (organization), TIM (time), COM (company name), ADD (specific address), GAME (game name), GOV (government department), SCENCE (scenic spots), BOOK (books), MOVIE (movies), and PRODUCT (products).
- each dataset is split by individual entity category into new single-category datasets.
- the server can thus obtain datasets such as CLUENER-PER, CLUENER-ADD, and so on.
- my name is AB
- I live in CD and I work in EF.
- EF is the ORG entity.
- the server can obtain MSRA-PER, People's Daily-PER, CLUENER-PER, Weibo-PER, and BOSON-PER, five PER-related datasets. In these datasets only PER entities are positive samples, and entities of all other categories are treated as negative samples; the five PER-related datasets are therefore mixed to form a new dataset, denoted the ZH-PER dataset.
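Restricting a dataset to a single positive category can be sketched as relabeling: tags of the target category are kept (with the category suffix stripped to plain B/I), and all other tags become O, so entities of other categories act as negative samples. The `B-PER`/`I-PER` tag format is an assumption about the BIO encoding used:

```python
def keep_only(tags, category):
    """Keep only the given category's B/I tags; everything else becomes O."""
    out = []
    for t in tags:
        if t == f"B-{category}" or t == f"I-{category}":
            out.append(t[0])   # strip the category suffix: plain 'B' or 'I'
        else:
            out.append("O")    # other categories become negative samples
    return out

keep_only(["B-PER", "I-PER", "O", "B-ORG", "I-ORG"], "PER")
# -> ['B', 'I', 'O', 'O', 'O']
```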
- the server can obtain a total of 12 such datasets, including ZH-LOC, ZH-ORG, ZH-TIM, ZH-ADD, ZH-COM, ZH-BOOK, and so on.
- FIG. 2 is a flow chart of the training process of the entity category recognition model in one embodiment.
- the entity category recognition model is obtained by training on the meta-training samples, which includes: serializing the words in the support samples and the query samples, performing high-level feature representation on the serialized words, and performing an average pooling operation on the words after the high-level feature representation to obtain the vector representations of the support samples and the query samples; processing the vectorized representation of the query samples through the vectorized representation of the support samples according to the entity category recognition model, to obtain the high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples; inputting the newly added entity category corresponding to the support samples in the query samples and the real entity category of the query samples into a conditional random field layer to calculate the loss function; and training the entity category recognition model through the loss function.
- the server starts to construct the model.
- the Chinese pre-trained language model BERT is used to encode the feature representation of the sentence.
- the main structure of the model is as follows:
- after inputting the support samples and query samples into BERT, the server obtains the high-level feature representation of each word in these samples by the following formula:
- the server uses an average pooling operation to obtain a unified vector representation that represents the entire sample:
- the obtained s_rep represents the feature representation of the entire support sample, and q_rep represents the feature representation of the entire query sample.
- after obtaining the feature representation of the entire sample, the server obtains the high-level feature representation of the query sample based on the support samples:
- T is a real number that controls the sharpness of the distribution obtained by atten.
- k is the index of the support sample; since 10 support samples are selected for each category, k is at most 10.
- the server obtains the final feature representation of each word in the query sample, and then passes it through a fully-connected layer to map the feature vector dimension of each word to 3 dimensions, which represent the word labels O, B, and I respectively: not in the category, in the category and at the beginning of the entity, or in the category and inside the entity.
- a conditional random field CRF layer is then used to calculate the final loss.
- the model is then trained by minimizing this loss function.
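The application only states that a conditional random field (CRF) layer computes the final loss. As a sketch, the negative log-likelihood of a linear-chain CRF over the 3 labels can be computed with the forward algorithm; the scoring form and the transition parameters below are standard CRF assumptions, not taken from the application:

```python
import numpy as np

LABELS = ["O", "B", "I"]  # the 3-dimensional per-word scores from the FC layer

def crf_nll(emissions: np.ndarray, transitions: np.ndarray, tags: list) -> float:
    """Negative log-likelihood of a tag sequence under a linear-chain CRF.

    emissions:   (seq_len, 3) per-word scores for O/B/I (FC-layer output)
    transitions: (3, 3) transition scores; transitions[i, j] = score of i -> j
    tags:        gold label indices, length seq_len
    """
    seq_len, _ = emissions.shape
    # Score of the gold path: emission scores plus transitions along the path.
    gold = emissions[0, tags[0]]
    for t in range(1, seq_len):
        gold += transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    # Log-partition over all paths via the forward algorithm (log-sum-exp).
    alpha = emissions[0].copy()
    for t in range(1, seq_len):
        scores = alpha[:, None] + transitions + emissions[t][None, :]
        m = scores.max(axis=0)
        alpha = m + np.log(np.exp(scores - m).sum(axis=0))
    m = alpha.max()
    log_z = m + np.log(np.exp(alpha - m).sum())
    return float(log_z - gold)  # minimized during training
```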
- serializing the words in the support samples and the query samples and performing high-level feature representation on the serialized words includes: processing the vectorized representation of the query samples through the vectorized representation of the support samples according to the following formula, to obtain the high-level features of the query samples:
- the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
- k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
- although the steps in FIGS. 1 and 2 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in FIGS. 1 and 2 may include multiple sub-steps or stages, which are not necessarily executed at the same time and may be executed at different times; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
- a meta-learning-based entity category identification device including: a newly added entity category acquisition module 100, a data acquisition module 200 to be identified, and an entity identification module 300, wherein:
- a newly added entity category acquisition module 100 is used to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
- the entity identification module 300 is used for inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained based on meta-learning.
- the above entity identification module 300 may include:
- the conversion unit is used to serialize the words in the reference sample and the data to be recognized, and perform high-level feature representation on the serialized words;
- the first vectorization unit is used to perform an average pooling operation on the words represented by the high-order features to obtain the vector representation of the reference sample and the data to be recognized;
- a first high-level feature representation unit configured to process the vectorized representation of the data to be identified by referring to the vectorized representation of the sample to obtain high-level features of the data to be identified;
- the identification unit is used to process the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
- the above-mentioned device for identifying entity categories based on meta-learning includes:
- the sample acquisition module is used to acquire sample data and construct multi-group training samples according to the sample data;
- the training module is used to train the entity category recognition model according to the meta-training samples.
- the above-mentioned sample acquisition module may include:
- a grouping unit used for acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one grouping from the grouping
- an extraction unit configured to determine that the first quantity of sample data in the extracted at least one grouping is a support sample, and the second quantity of sample data is a query sample;
- the combination unit is used to obtain a set of meta training samples according to the support samples and the query samples;
- the loop unit is used to repeatedly extract at least one group at random from the groups to obtain multiple groups of meta-training samples.
- the above-mentioned grouping unit may include:
- the grouping subunit is used to obtain sample data grouped according to the initial entity category, and group the sample data in the initial entity category according to the target entity category;
- the standardization subunit is used to standardize the sample data grouped according to the target entity category
- the merging subunit is used for merging the standardized target entity categories corresponding to each initial entity category to obtain groups corresponding to the target entity categories.
- the above-mentioned training module may include:
- the second vectorization unit is used to serialize the words in the support sample and the query sample, perform high-level feature representation on the serialized words, and perform an average pooling operation on the words after the high-level feature representation to obtain support vector representation of samples and query samples;
- the second high-level feature representation unit is used to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the entity category recognition model to obtain the high-level feature of the query sample;
- the category identification unit is used to process the high-level features of the query sample to obtain a new entity category corresponding to the support sample in the query sample;
- the loss function generation unit is used for inputting the obtained newly added entity category corresponding to the support samples in the query samples and the real entity category of the query samples into the conditional random field layer to calculate the loss function;
- the training unit is used to train the entity category recognition model through the loss function.
- the above-mentioned second vectorization unit is further configured to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the following formula to obtain high-level features of the query sample:
- the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
- k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
- Each module in the above-mentioned meta-learning-based entity category identification device may be implemented in whole or in part by software, hardware, and combinations thereof.
- the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
- a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 4 .
- the computer device includes a processor, memory, a network interface, and a database connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
- the memory of the computer device includes a non-volatile storage medium, an internal memory.
- the non-volatile storage medium stores an operating system, computer readable instructions and a database.
- the internal memory provides an environment for the execution of the operating system and computer-readable instructions in the non-volatile storage medium.
- the network interface of the computer device is used to communicate with an external terminal through a network connection.
- the computer-readable instructions when executed by a processor, implement a meta-learning-based entity class recognition method.
- FIG. 4 is only a block diagram of a partial structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
- a computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring a newly added entity category and querying the reference samples corresponding to the newly added entity category; acquiring the data to be identified; and inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, where the entity category recognition model is trained based on meta-learning.
- inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity categories corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified, and representing the serialized words with high-level features; performing an average pooling operation on the high-level word features to obtain vector representations of the reference samples and the data to be identified; processing the vector representation of the data to be identified with the vector representations of the reference samples to obtain high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
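As a minimal sketch of the vector-representation step above, the following assumes the serialization and high-level feature extraction have already produced a `[num_words, dim]` array of per-word features (the feature extractor itself is not disclosed here); `sentence_vector` is a hypothetical helper name.

```python
import numpy as np

def sentence_vector(word_features):
    """Average-pool per-word high-level features into one sentence vector.

    word_features: array-like of shape [num_words, dim], the high-level
    feature of each serialized word (the extractor producing it is assumed).
    Returns an array of shape [dim].
    """
    features = np.asarray(word_features, dtype=float)
    return features.mean(axis=0)  # average pooling over the word axis
```

For example, two words with features `[1.0, 2.0]` and `[3.0, 4.0]` pool to the sentence vector `[2.0, 3.0]`.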
- the training method of the entity category recognition model implemented when the processor executes the computer-readable instructions includes: acquiring sample data, and constructing multiple groups of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the entity category recognition model.
- acquiring sample data when the processor executes the computer-readable instructions, and constructing multiple groups of meta-training samples according to the sample data, includes: acquiring the sample data and grouping it according to entity categories; randomly extracting at least one group from the groups; determining a first quantity of sample data in the extracted group(s) as support samples and a second quantity as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeating the random extraction of at least one group from the groups to obtain multiple groups of meta-training samples.
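The episode-construction step above can be sketched as follows; `build_meta_training_samples` and the `(sentence, category)` data layout are assumptions for illustration, not the patent's exact implementation.

```python
import random
from collections import defaultdict

def build_meta_training_samples(sample_data, n_episodes, n_support, n_query):
    """Group labelled sentences by entity category, then repeatedly draw a
    random group and split a first quantity of its samples off as support
    samples and a second quantity as query samples (a hedged sketch)."""
    groups = defaultdict(list)
    for sentence, category in sample_data:
        groups[category].append((sentence, category))

    episodes = []
    for _ in range(n_episodes):
        category = random.choice(list(groups))            # extract one group at random
        drawn = random.sample(groups[category], n_support + n_query)
        episodes.append({"support": drawn[:n_support],    # first quantity
                         "query": drawn[n_support:]})     # second quantity
    return episodes
```

Repeating the draw yields the multiple groups of meta-training samples used for episodic training.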
- obtaining sample data and grouping the sample data according to entity categories when the processor executes the computer-readable instructions includes: obtaining sample data grouped according to initial entity categories; regrouping the sample data according to target entity categories; standardizing the sample data grouped according to the target entity categories; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
- training according to the meta-training samples to obtain the entity category recognition model includes: serializing the words in the support samples and the query samples, and representing the serialized words with high-level features; performing an average pooling operation on the high-level word features to obtain vector representations of the support samples and the query samples; and, in the entity category recognition model, processing the vector representations of the query samples with the vector representations of the support samples to obtain high-level features of the query samples.
- serializing the words in the support samples and the query samples, and representing the serialized words with high-level features, is performed according to the following formula:
- the vector representation of the query sample is processed with the vector representations of the support samples to obtain the high-level features of the query sample:
- the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
- k denotes the index of a support sample, and the range of k is determined by the number of support samples.
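Since the formula itself is not reproduced above, the following is a hedged sketch of one common reading of the atten function: dot-product similarity between the query vector and each support-sample vector, normalized with a softmax so that entry k is the contribution of support sample k. The function names and the choice of similarity are assumptions, not the disclosed formula.

```python
import numpy as np

def atten(query_vec, support_vecs):
    """Contribution of each support sample to recognising the query sample.

    Assumed form: softmax over dot-product similarities; weights[k] is the
    contribution of support sample k, and the weights sum to 1.
    """
    support = np.asarray(support_vecs, dtype=float)
    scores = support @ np.asarray(query_vec, dtype=float)
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    return weights / weights.sum()

def query_high_level_features(query_vec, support_vecs):
    """High-level features of the query sample as the attention-weighted
    combination of the support-sample vectors (a sketch)."""
    weights = atten(query_vec, support_vecs)
    return weights @ np.asarray(support_vecs, dtype=float)
```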
- One or more computer-readable storage media storing computer-readable instructions are provided; the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring a newly added entity category, and querying the reference sample corresponding to the newly added entity category; acquiring data to be identified; and inputting the reference sample and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference sample in each piece of data to be identified, where the entity category recognition model is trained based on meta-learning.
- the computer-readable storage medium may be non-volatile or volatile.
- inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity categories corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified, and representing the serialized words with high-level features; performing an average pooling operation on the high-level word features to obtain vector representations of the reference samples and the data to be identified; processing the vector representation of the data to be identified with the vector representations of the reference samples to obtain high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
- the training method of the entity category recognition model implemented when the computer-readable instructions are executed by the processor includes: acquiring sample data, and constructing multiple groups of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the entity category recognition model.
- acquiring sample data when the computer-readable instructions are executed by the processor, and constructing multiple groups of meta-training samples according to the sample data, includes: acquiring the sample data and grouping it according to entity categories; randomly extracting at least one group from the groups; determining a first quantity of sample data in the extracted group(s) as support samples and a second quantity as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeating the random extraction of at least one group from the groups to obtain multiple groups of meta-training samples.
- obtaining sample data and grouping the sample data according to entity categories when the computer-readable instructions are executed by the processor includes: obtaining sample data grouped according to initial entity categories; regrouping the sample data according to target entity categories; standardizing the sample data grouped according to the target entity categories; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
- training according to the meta-training samples to obtain the entity category recognition model includes: serializing the words in the support samples and the query samples, and representing the serialized words with high-level features; performing an average pooling operation on the high-level word features to obtain vector representations of the support samples and the query samples; in the entity category recognition model, processing the vector representations of the query samples with the vector representations of the support samples to obtain high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity categories corresponding to the support samples in the query samples; inputting the obtained newly added entity categories together with the real entity categories of the query samples into the random field layer to calculate a loss function; and training the entity category recognition model through the loss function.
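As a simplified stand-in for the loss computed by the random field layer, the sketch below scores each query token independently with a negative log-likelihood; a real conditional-random-field layer would additionally model label-transition scores, so this illustrates only the shape of the training signal, and `token_nll_loss` is a hypothetical name.

```python
import numpy as np

def token_nll_loss(logits, gold_labels):
    """Mean per-token negative log-likelihood of the real entity labels.

    logits: [num_tokens, num_labels] unnormalized label scores per token;
    gold_labels: the real entity category index of each token.
    """
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(gold_labels)), gold_labels].mean()
```

With uniform logits over three labels, the loss is ln 3 ≈ 1.0986, as expected for an uninformed model.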
- the words in the support samples and the query samples are serialized, and the serialized words are represented with high-level features, according to the following formula:
- the vector representation of the query sample is processed with the vector representations of the support samples to obtain the high-level features of the query sample:
- the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
- k denotes the index of a support sample, and the range of k is determined by the number of support samples.
- the blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms.
- A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
- the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
- Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
- Volatile memory may include random access memory (RAM) or external cache memory.
- RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a meta-learning-based entity category recognition method. The method comprises the steps of: acquiring a newly added entity category, and querying a reference sample corresponding to the newly added entity category; acquiring data to be identified; and inputting the reference sample and the data into a pre-generated entity category recognition model so as to identify the newly added entity category corresponding to the reference sample in each piece of the data, the entity category recognition model being obtained through training based on a meta-learning mode.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011472865.X | 2020-12-15 | ||
CN202011472865.XA CN112528662A (zh) | 2020-12-15 | 2020-12-15 | 基于元学习的实体类别识别方法、装置、设备和存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022127124A1 true WO2022127124A1 (fr) | 2022-06-23 |
Family
ID=74999881
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/109617 WO2022127124A1 (fr) | 2020-12-15 | 2021-07-30 | Procédé et appareil de reconnaissance de catégorie d'entités sur la base d'un méta-apprentissage, dispositif et support de stockage |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112528662A (fr) |
WO (1) | WO2022127124A1 (fr) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112528662A (zh) * | 2020-12-15 | 2021-03-19 | 深圳壹账通智能科技有限公司 | 基于元学习的实体类别识别方法、装置、设备和存储介质 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020143163A1 (fr) * | 2019-01-07 | 2020-07-16 | 平安科技(深圳)有限公司 | Procédé et appareil de reconnaissance d'entité nommée, basés sur un mécanisme d'attention, et dispositif informatique associé |
CN111767400A (zh) * | 2020-06-30 | 2020-10-13 | 平安国际智慧城市科技股份有限公司 | 文本分类模型的训练方法、装置、计算机设备和存储介质 |
CN111860580A (zh) * | 2020-06-09 | 2020-10-30 | 北京百度网讯科技有限公司 | 识别模型获取及类别识别方法、装置及存储介质 |
CN111859937A (zh) * | 2020-07-20 | 2020-10-30 | 上海汽车集团股份有限公司 | 一种实体识别方法及装置 |
CN112528662A (zh) * | 2020-12-15 | 2021-03-19 | 深圳壹账通智能科技有限公司 | 基于元学习的实体类别识别方法、装置、设备和存储介质 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101846824B1 (ko) * | 2017-12-11 | 2018-04-09 | 가천대학교 산학협력단 | 개체명 인식시스템, 방법, 및 컴퓨터 판독가능매체 |
CN109783604B (zh) * | 2018-12-14 | 2024-03-19 | 平安科技(深圳)有限公司 | 基于少量样本的信息提取方法、装置和计算机设备 |
CN110825875B (zh) * | 2019-11-01 | 2022-12-06 | 科大讯飞股份有限公司 | 文本实体类型识别方法、装置、电子设备和存储介质 |
CN111797394B (zh) * | 2020-06-24 | 2021-06-08 | 广州大学 | 基于stacking集成的APT组织识别方法、系统及存储介质 |
CN112001179A (zh) * | 2020-09-03 | 2020-11-27 | 平安科技(深圳)有限公司 | 命名实体识别方法、装置、电子设备及可读存储介质 |
CN112052684A (zh) * | 2020-09-07 | 2020-12-08 | 南方电网数字电网研究院有限公司 | 电力计量的命名实体识别方法、装置、设备和存储介质 |
- 2020-12-15: CN CN202011472865.XA patent/CN112528662A/zh active Pending
- 2021-07-30: WO PCT/CN2021/109617 patent/WO2022127124A1/fr active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020143163A1 (fr) * | 2019-01-07 | 2020-07-16 | 平安科技(深圳)有限公司 | Procédé et appareil de reconnaissance d'entité nommée, basés sur un mécanisme d'attention, et dispositif informatique associé |
CN111860580A (zh) * | 2020-06-09 | 2020-10-30 | 北京百度网讯科技有限公司 | 识别模型获取及类别识别方法、装置及存储介质 |
CN111767400A (zh) * | 2020-06-30 | 2020-10-13 | 平安国际智慧城市科技股份有限公司 | 文本分类模型的训练方法、装置、计算机设备和存储介质 |
CN111859937A (zh) * | 2020-07-20 | 2020-10-30 | 上海汽车集团股份有限公司 | 一种实体识别方法及装置 |
CN112528662A (zh) * | 2020-12-15 | 2021-03-19 | 深圳壹账通智能科技有限公司 | 基于元学习的实体类别识别方法、装置、设备和存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN112528662A (zh) | 2021-03-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021174774A1 (fr) | Procédé d'extraction de relations par réseau neuronal, dispositif informatique et support de stockage lisible | |
CN112434535B (zh) | 基于多模型的要素抽取方法、装置、设备及存储介质 | |
CN111324696B (zh) | 实体抽取方法、实体抽取模型的训练方法、装置及设备 | |
WO2022048363A1 (fr) | Procédé et appareil de classification de site web, dispositif informatique et support de stockage | |
WO2022134586A1 (fr) | Procédé et appareil de classification de cible basés sur un méta-apprentissage, dispositif et support de stockage | |
WO2022088671A1 (fr) | Procédé et appareil de réponse automatique à des questions, dispositif et support de mémoire | |
CN113254649B (zh) | 敏感内容识别模型的训练方法、文本识别方法及相关装置 | |
JP2022109836A (ja) | テキスト分類情報の半教師あり抽出のためのシステム及び方法 | |
CN112052684A (zh) | 电力计量的命名实体识别方法、装置、设备和存储介质 | |
WO2021179708A1 (fr) | Procédé et appareil de reconnaissance d'entités nommées, dispositif informatique et support d'enregistrement lisible | |
CN114357117A (zh) | 事务信息查询方法、装置、计算机设备及存储介质 | |
CN113051914A (zh) | 一种基于多特征动态画像的企业隐藏标签抽取方法及装置 | |
CN112507061A (zh) | 多关系医学知识提取方法、装置、设备及存储介质 | |
CN113821635A (zh) | 一种用于金融领域的文本摘要的生成方法及系统 | |
WO2022073341A1 (fr) | Procédé et appareil de mise en correspondance d'entités de maladie fondés sur la sémantique vocale, et dispositif informatique | |
CN116821373A (zh) | 基于图谱的prompt推荐方法、装置、设备及介质 | |
WO2022127124A1 (fr) | Procédé et appareil de reconnaissance de catégorie d'entités sur la base d'un méta-apprentissage, dispositif et support de stockage | |
Hardisty et al. | The specimen data refinery: a canonical workflow framework and FAIR digital object approach to speeding up digital mobilisation of natural history collections | |
WO2023168810A1 (fr) | Procédé et appareil de prédiction des propriétés d'une molécule de médicament, support d'enregistrement et dispositif informatique | |
CN111831624A (zh) | 数据表创建方法、装置、计算机设备及存储介质 | |
CN113469338A (zh) | 模型训练方法、模型训练装置、终端设备及存储介质 | |
CN116721713B (zh) | 一种面向化学结构式识别的数据集构建方法和装置 | |
CN116108144B (zh) | 信息抽取方法及装置 | |
CN114706927B (zh) | 基于人工智能的数据批量标注方法及相关设备 | |
CN117235257A (zh) | 基于人工智能的情感预测方法、装置、设备及存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21905057 Country of ref document: EP Kind code of ref document: A1 |
NENP | Non-entry into the national phase |
Ref country code: DE |
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 041023) |
122 | Ep: pct application non-entry in european phase |
Ref document number: 21905057 Country of ref document: EP Kind code of ref document: A1 |