
WO2022127124A1 - Meta learning-based entity category recognition method and apparatus, device and storage medium - Google Patents


Info

Publication number
WO2022127124A1
WO2022127124A1 (PCT/CN2021/109617; CN2021109617W)
Authority
WO
WIPO (PCT)
Prior art keywords
sample
data
entity category
identified
query
Application number
PCT/CN2021/109617
Other languages
French (fr)
Chinese (zh)
Inventor
刘玉
徐国强
Original Assignee
深圳壹账通智能科技有限公司
Application filed by 深圳壹账通智能科技有限公司 filed Critical 深圳壹账通智能科技有限公司
Publication of WO2022127124A1 publication Critical patent/WO2022127124A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Definitions

  • the sizes of these five datasets are not very large, and the number of entity categories is insufficient: all together they cover fewer than 30 entity classes.
  • the entity category tree in the real world is much larger than 30.
  • the traditional idea is to label as much data as there are entity categories, but this is unrealistic.
  • the usual situation is that when a new entity category appears there are often only 10 to 100 samples of the new category, and it is not realistic to retrain the model with these samples, because the model is bound to suffer from the effects of class imbalance and overfitting.
  • according to various embodiments disclosed in the present application, a meta-learning-based entity category identification method, apparatus, device, and storage medium are provided.
  • a meta-learning-based entity category recognition method including:
  • a meta-learning-based entity category recognition device comprising:
  • a newly added entity category acquisition module, used to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
  • a data-to-be-identified acquisition module for acquiring data to be identified
  • an entity identification module, configured to input the reference samples and the data to be identified into a pre-generated entity category identification model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained based on meta-learning.
  • a computer device comprising a memory and one or more processors, the memory having computer-readable instructions stored therein which, when executed by the processor, cause the one or more processors to execute the following steps:
  • One or more computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • the above-mentioned meta-learning-based entity category identification method, apparatus, device, and storage medium determine reference samples according to the newly added entity category and input the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified. No manual intervention and no specialized artificial-intelligence expertise are required, which greatly reduces labor costs; and when a new entity category appears, the model does not need to be retrained: only a few reference samples are needed to identify the data to be identified and determine whether the entity category is present.
  • FIG. 1 is a schematic flowchart of a meta-learning-based entity category identification method according to one or more embodiments.
  • FIG. 2 is a schematic flowchart of a meta-learning-based entity category identification method according to one or more further embodiments.
  • FIG. 3 is a structural block diagram of an apparatus for identifying entity categories based on meta-learning according to one or more embodiments.
  • FIG. 4 is a block diagram of a computer device in accordance with one or more embodiments.
  • a meta-learning-based entity category identification method is provided. This embodiment is illustrated by applying the method to a terminal. It can be understood that the method can also be applied to a server,
  • or to a system including a terminal and a server, where it is realized through the interaction between the terminal and the server.
  • the method includes the following steps:
  • the newly added entity category may be the name of a newly added entity, and there may be at least one newly added entity category.
  • the reference samples are samples belonging to the newly added entity category; the number of reference samples may be 10 or more, but it is typically not large.
  • the server may establish the correspondence between each newly added entity category and its reference samples, for example by grouping.
  • the same reference sample may belong to multiple newly added entity categories, that is, a reference sample may be labeled with multiple entity categories.
  • the data to be identified is data whose entity categories need to be determined; it may be newly added data or previous data.
  • S106: Input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained in a meta-learning manner.
  • the entity category recognition model is trained based on meta-learning: multiple meta-training tasks are constructed from the sample data, and the entity category recognition model is then obtained by training on the constructed meta-training tasks.
  • a meta-training task provides a small number of support samples and a large number of query samples; training on these support and query samples yields an entity category recognition model that can identify the entity categories of the data to be identified from only a few samples of a new entity category.
  • the server inputs the reference samples and the data to be identified into the pre-generated entity category identification model, so that the entity category identification model identifies the newly added entity category corresponding to the reference sample in each to-be-identified data.
  • the process by which the entity category identification model identifies the newly added entity category corresponding to the reference samples in each piece of data to be identified may include a step of processing the reference samples and the data to be identified, a step of computing a high-level feature representation of the data to be identified from the processed reference samples, and a step of processing that high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be identified.
  • processing the reference samples and the data to be identified may include: serializing the words in the reference samples and the data to be identified, performing a high-order representation of the serialized words, and finally applying an average pooling operation to the high-order-represented words to obtain the vector representations corresponding to the reference samples and the data to be identified.
  • the step of calculating the high-level feature representation of the data to be identified by the server through the processed reference samples may be performed according to the following formula:
  • the step of processing the high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: inputting the high-level features into a predetermined fully-connected layer, which maps the feature-vector dimension of each word to a preset dimension, for example 3; a minimal sketch of this flow is given below.
  • the three dimensions indicate that the label of the word is O, B, or I, that is, the word does not belong to this category, belongs to this category and is located at the beginning of the sentence, or belongs to this category and is located in the middle of the sentence.
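As an illustration of the encoding, average pooling and O/B/I mapping described above, the following is a minimal sketch assuming a PyTorch-style BERT encoder from the Hugging Face transformers library; all function names and the label ordering are illustrative assumptions rather than details taken from the application, and the attention step over the reference samples is sketched separately further below.

```python
# Hypothetical sketch only: encode sentences with BERT, average-pool them into
# sentence vectors, and map per-word features to the three labels O / B / I.
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

def encode(sentences):
    """Serialize the words and obtain high-order (contextual) representations."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (batch, seq_len, hidden)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (batch, seq_len, 1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)         # average pooling -> sentence vector
    return hidden, pooled

# Reference samples of the new entity category and one piece of data to be identified.
s_hidden, s_rep = encode(["reference sample 1", "reference sample 2"])
q_hidden, q_rep = encode(["data to be identified"])

# Fully connected layer mapping each word vector to the preset dimension 3 (O / B / I).
# For brevity this skips the attention step over the reference samples shown later.
fc = torch.nn.Linear(encoder.config.hidden_size, 3)
labels = fc(q_hidden).argmax(-1)                          # assumed ordering: 0=O, 1=B, 2=I
```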
  • the above-mentioned new entity category and the entity category of the data to be identified can also be stored in a node of a blockchain.
  • the above-mentioned meta-learning-based entity category identification method determines reference samples according to the newly added entity category and inputs the reference samples and the data to be identified into a pre-generated entity category identification model, so as to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified. It requires no manual intervention and no specialized knowledge of the artificial-intelligence field, which greatly reduces labor costs; and when there is a new entity category, the model does not need to be retrained: a few reference samples are enough to identify the data to be identified and determine whether the entity category is present.
  • inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified and performing a high-order feature representation of the serialized words; applying an average pooling operation to the high-order-represented words to obtain the vector representations of the reference samples and the data to be identified; processing the vectorized representation of the data to be identified with the vectorized representation of the reference samples to obtain the high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  • the high-level feature representation of each word of the reference sample and the data to be recognized can be obtained by the following formula:
  • the average pooling operation is used to obtain a unified vector representation, which is used to represent the entire reference sample and the data to be recognized:
  • the resulting s_rep is the feature representation of the entire reference sample, and q_rep is the feature representation of the entire piece of data to be identified.
  • the higher-order feature representation of the data to be identified can be obtained according to the reference sample:
  • the atten function is used to calculate the contribution of each reference sample to the recognition of named entities in the data to be recognized.
  • T is a real number that controls the sharpness of the distribution obtained by the atten function.
  • k represents the serial number of the reference sample, which is related to the number of samples of the reference sample.
  • the server obtains the final feature representation of each word in the query sample and then passes it through a fully connected layer that maps the feature-vector dimension of each word to 3 dimensions.
  • these three dimensions indicate that the label of the word is O, B, or I, that is, the word does not belong to this category, belongs to this category and is located at the beginning of the sentence, or belongs to this category and is located in the middle of the sentence; this yields the newly added entity category corresponding to the reference samples in the data to be identified.
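The atten function itself appears only as formula images in the original application. The sketch below is one plausible reading of the description (a temperature-scaled softmax over the concatenation of the query vector with each reference/support sample vector), offered as an assumption rather than the author's exact definition; `atten_pool` and `score_layer` are illustrative names.

```python
import torch
import torch.nn.functional as F

def atten_pool(q_rep, s_reps, score_layer, T=1.0):
    """Combine reference/support sample vectors into a higher-order feature for the query.

    q_rep:       (hidden,)    sentence vector of the data to be identified
    s_reps:      (k, hidden)  sentence vectors of the k reference samples
    score_layer: nn.Linear(2 * hidden, 1) scoring each concatenation q (+) s_k
    T:           temperature controlling the sharpness of the attention distribution
    """
    k = s_reps.size(0)
    pairs = torch.cat([q_rep.expand(k, -1), s_reps], dim=-1)  # concatenate q with every s_k
    scores = score_layer(pairs).squeeze(-1) / T               # (k,) contribution scores
    weights = F.softmax(scores, dim=0)                        # contribution of each sample
    return (weights.unsqueeze(-1) * s_reps).sum(0)            # weighted combination

# Example with k = 10 reference samples and a 768-dimensional BERT hidden size.
hidden = 768
score_layer = torch.nn.Linear(2 * hidden, 1)
q_tilde = atten_pool(torch.randn(hidden), torch.randn(10, hidden), score_layer, T=0.5)
```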
  • the training method of the entity category recognition model includes: acquiring sample data, and constructing multiple groups of meta training samples according to the sample data; and obtaining the entity category recognition model by training according to the meta training samples.
  • the sample data may be preset samples that have been classified.
  • the meta-training samples are constructed from the sample data; each meta-training sample may include multiple support samples and multiple query samples, where the support samples may include sample data from multiple groups, that is, sample data belonging to different categories, and the corresponding query samples are likewise drawn from the corresponding groups.
  • the number of groups of meta-training samples can be set as required, for example 10,000, and the target classification model is then obtained by training on the meta-training samples, for example by training on them in turn until the accuracy of the entity category recognition model reaches the expected level. The accuracy of the entity category recognition model can be computed from the meta-training samples, for example by inputting the support samples and query samples of a meta-training sample into the entity category recognition model, determining the entity categories predicted for the query samples, and comparing them with the real entity categories of the query samples; when the expected accuracy is reached, model training is complete.
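A hedged sketch of such an accuracy check on one meta-training task; `model.predict` is a hypothetical interface (not named in the application) that returns one O/B/I tag sequence per query sample, conditioned on the support samples.

```python
def episode_accuracy(model, support_samples, query_samples, gold_tags):
    """Tag-level accuracy of the entity category recognition model on one meta-training task."""
    correct = total = 0
    for sentence, gold in zip(query_samples, gold_tags):
        pred = model.predict(support_samples, sentence)   # hypothetical interface
        correct += sum(p == g for p, g in zip(pred, gold))
        total += len(gold)
    return correct / max(total, 1)
```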
  • acquiring sample data and constructing multiple groups of meta-training samples based on the sample data includes: acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; taking a first quantity of sample data in the extracted group(s) as support samples and a second quantity of sample data as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly extracting at least one group at random from the groups to obtain multiple groups of meta-training samples.
  • the server first obtains the original sample data, then processes the original sample data to obtain a data set corresponding to each category, and then starts to construct a training set.
  • in order to train a meta-learning model, the server first needs to construct a series of meta-training samples.
  • the construction rules are as follows:
  • for each selected category, a first number of samples, for example 10, is randomly chosen as support samples, and a second number, for example 100, is randomly chosen as query samples; with three categories drawn per task this yields a total of 30 support samples and 3,000 query samples.
  • the server converts the data set constructed in this way into a meta-training task, the purpose of which is to train the model to classify the query samples given the support samples.
  • the server can build 10,000 such meta-training tasks.
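A minimal sketch of this construction rule, assuming the grouped data is available as a dict from entity category to labelled sentences; the function name and the choice of three categories per task are illustrative assumptions consistent with the 30/3,000 figures above.

```python
import random

def build_meta_tasks(grouped_data, n_tasks=10_000, n_categories=3,
                     n_support=10, n_query=100):
    """grouped_data: dict mapping entity category -> list of labelled sentences.

    Each group is assumed to contain at least n_support + n_query sentences.
    """
    tasks = []
    for _ in range(n_tasks):
        categories = random.sample(list(grouped_data), n_categories)
        support, query = [], []
        for cat in categories:
            drawn = random.sample(grouped_data[cat], n_support + n_query)
            support += drawn[:n_support]      # e.g. 3 x 10  = 30 support samples
            query += drawn[n_support:]        # e.g. 3 x 100 = 3,000 query samples
        tasks.append({"support": support, "query": query})
    return tasks
```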
  • obtaining sample data and grouping the sample data according to entity categories includes: obtaining sample data grouped according to initial entity categories and regrouping the sample data within the initial entity categories according to target entity categories; standardizing the sample data grouped by target entity category; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
  • the initial entity categories come from the open-source MSRA, People's Daily, Weibo, CLUENER and BOSON datasets collected from the Internet. Since the annotation formats of these datasets are not uniform, the data must be preprocessed and unified into the BIO annotation format.
  • specifically, traditional named entity recognition datasets are labelled in different schemes: some use the BIO format and some use the BIEO format, so here the BIEO format is converted to the BIO format, as the sketch below illustrates.
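The exact tag inventory is not reproduced in this text, but assuming the usual B-/I-/E-/O prefixes, a minimal BIEO-to-BIO conversion could look like this:

```python
def bieo_to_bio(tags):
    """Convert a BIEO tag sequence to BIO by folding E- tags back into I- tags."""
    return ["I-" + tag[2:] if tag.startswith("E-") else tag for tag in tags]

# e.g. ["B-PER", "I-PER", "E-PER", "O"]  ->  ["B-PER", "I-PER", "I-PER", "O"]
```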
  • the target entity categories are the entity categories annotated in the initial datasets, such as PER (person name), LOC (location), ORG (organization), TIM (time), COM (company name), ADD (specific address), GAME (game name), GOV (government department), SCENCE (attractions), BOOK (books), MOVIE (movies) and PRODUCT (products), etc.
  • each dataset is split into new per-category datasets, one for each single entity category.
  • in this way the server can also obtain datasets such as CLUENER-PER, CLUENER-ADD...etc.
  • for example, given the sentences "my name is AB" and "I live in CD and I work in EF", EF is the ORG entity.
  • the server can thus obtain MSRA-PER, People's Daily-PER, CLUENER-PER, Weibo-PER and BOSON-PER, five PER-related datasets. Apart from entities of the PER category, entities of all other categories are treated as negative samples, so these five PER-related datasets are mixed to form a new dataset, denoted the ZH-PER dataset; a sketch of this relabelling follows below.
  • in the same way the server obtains a total of 12 datasets such as ZH-LOC, ZH-ORG, ZH-TIM, ZH-ADD, ZH-COM, ZH-BOOK, etc.
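A minimal sketch of this single-category relabelling, again assuming BIO-style tags of the form B-XXX / I-XXX; the helper name is illustrative.

```python
def keep_only(tags, category):
    """Keep only entities of `category`; entities of every other category become O (negative samples)."""
    return [tag if tag.endswith("-" + category) else "O" for tag in tags]

# Relabelling the PER portions of MSRA, People's Daily, CLUENER, Weibo and BOSON
# this way and mixing them yields the ZH-PER dataset described above.
sentence_tags = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG"]
print(keep_only(sentence_tags, "PER"))   # ['B-PER', 'I-PER', 'O', 'O', 'O']
```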
  • FIG. 2 is a flow chart of the training process of the entity category recognition model in one embodiment.
  • the entity category recognition model is obtained by training on the meta-training samples as follows: serializing the words in the support samples and the query samples, performing a high-order feature representation of the serialized words, and applying an average pooling operation to the high-order-represented words to obtain the vector representations of the support samples and the query samples; processing, in the entity category recognition model, the vectorized representation of the query samples with the vectorized representation of the support samples to obtain the high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples; inputting the newly added entity category corresponding to the support samples in the query samples together with the real entity category of the query samples into the random field layer to calculate the loss function; and training the entity category recognition model through the loss function.
  • the server starts to construct the model.
  • the Chinese pre-trained language model BERT is used to encode the feature representation of the sentence.
  • the main structure of the model is as follows:
  • after inputting the support samples and query samples into BERT, the server obtains the high-order feature representation of each word of these samples by the following formula:
  • the server uses an average pooling operation to obtain a unified vector representation that represents the entire sample:
  • the resulting s_rep is the feature representation of the entire support sample, and q_rep is the feature representation of the entire query sample.
  • after obtaining the feature representation of the entire sample, the server obtains the higher-order feature representation of the query sample based on the support samples:
  • T is a real number that controls the sharpness of the distribution obtained by atten.
  • k represents the serial number of the support sample; because 10 support samples are selected for each category, k is at most 10.
  • the server obtains the final feature representation of each word in the query sample and then passes it through a fully connected layer that maps the feature-vector dimension of each word to 3 dimensions, which represent the word labels O, B and I respectively, that is, not in the category, in the category and at the beginning of the sentence, or in the category and in the middle of the sentence.
  • a conditional random field CRF layer is then used to calculate the final loss.
  • the model is trained with a loss function.
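A hedged sketch of such a training step, using the third-party pytorch-crf package as the conditional random field layer (the application does not name a specific implementation); the feature dimension 768 and the optimizer choice are assumptions, and in a full model the BERT encoder parameters would be optimized as well.

```python
import torch
from torchcrf import CRF   # third-party pytorch-crf package (an assumption, not named in the application)

num_tags = 3                                  # O, B, I
fc = torch.nn.Linear(768, num_tags)
crf = CRF(num_tags, batch_first=True)
optimizer = torch.optim.Adam(list(fc.parameters()) + list(crf.parameters()), lr=1e-5)

def training_step(query_features, gold_tags, mask):
    """query_features: (batch, seq_len, 768) attention-enhanced word features of the query samples
    gold_tags:         (batch, seq_len)      gold O/B/I indices of the query samples
    mask:              (batch, seq_len)      bool mask marking real (non-padding) tokens
    """
    emissions = fc(query_features)
    loss = -crf(emissions, gold_tags, mask=mask, reduction="mean")  # negative log-likelihood
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```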
  • serializing the words in the support samples and the query samples and performing a high-order feature representation of the serialized words includes: processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the following formula to obtain the high-level features of the query samples:
  • the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
  • k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
  • although the steps in FIGS. 1 and 2 are shown in sequence according to the arrows, these steps are not necessarily executed in the sequence shown by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least a part of the steps in FIGS. 1 and 2 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but may be executed at different times, and their order of execution is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
  • a meta-learning-based entity category identification device is provided, including: a newly added entity category acquisition module 100, a to-be-identified data acquisition module 200, and an entity identification module 300, wherein:
  • a newly added entity category acquisition module 100 is used to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
  • the entity identification module 300 is used for inputting the reference samples and the data to be identified into the pre-generated entity category identification model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained in a meta-learning manner.
  • the above entity identification module 300 may include:
  • the conversion unit is used to serialize the words in the reference sample and the data to be recognized, and perform high-level feature representation on the serialized words;
  • the first vectorization unit is used to perform an average pooling operation on the words represented by the high-order features to obtain the vector representation of the reference sample and the data to be recognized;
  • a first high-level feature representation unit configured to process the vectorized representation of the data to be identified by referring to the vectorized representation of the sample to obtain high-level features of the data to be identified;
  • the identification unit is used to process the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
  • the above-mentioned device for identifying entity categories based on meta-learning includes:
  • the sample acquisition module is used to acquire sample data and construct multiple groups of meta-training samples according to the sample data;
  • the training module is used to train the entity category recognition model according to the meta-training samples.
  • the above-mentioned sample acquisition module may include:
  • a grouping unit, used for acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups;
  • an extraction unit configured to determine that the first quantity of sample data in the extracted at least one grouping is a support sample, and the second quantity of sample data is a query sample;
  • the combination unit is used to obtain a set of meta training samples according to the support samples and the query samples;
  • the loop unit is used to repeatedly extract at least one group at random from the groups to obtain multiple groups of meta-training samples.
  • the above-mentioned grouping unit may include:
  • the grouping subunit is used to obtain sample data grouped according to the initial entity category, and group the sample data in the initial entity category according to the target entity category;
  • the standardization subunit is used to standardize the sample data grouped according to the target entity category;
  • the merging subunit is used for merging the standardized target entity categories corresponding to each initial entity category to obtain groups corresponding to the target entity categories.
  • the above-mentioned training module may include:
  • the second vectorization unit is used to serialize the words in the support samples and the query samples, perform a high-order feature representation of the serialized words, and apply an average pooling operation to the high-order-represented words to obtain the vector representations of the support samples and the query samples;
  • the second high-level feature representation unit is used to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the entity category recognition model to obtain the high-level feature of the query sample;
  • the category identification unit is used to process the high-level features of the query sample to obtain a new entity category corresponding to the support sample in the query sample;
  • the loss function generation unit is used for inputting the newly added entity category of the obtained query sample corresponding to the support sample and the real entity category of the query sample into the random field layer to calculate the loss function;
  • the training unit is used to train the entity category recognition model through the loss function.
  • the above-mentioned second vectorization unit is further configured to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the following formula to obtain high-level features of the query sample:
  • the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
  • k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
  • Each module in the above-mentioned meta-learning-based entity category identification device may be implemented in whole or in part by software, hardware, and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 4 .
  • the computer device includes a processor, memory, a network interface, and a database connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions and a database.
  • the internal memory provides an environment for the execution of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions when executed by a processor, implement a meta-learning-based entity class recognition method.
  • FIG. 4 is only a block diagram of a partial structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the one or more processors to perform the following steps: acquiring a newly added entity category and querying the reference samples corresponding to the newly added entity category; acquiring the data to be identified; and inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, where the entity category recognition model is trained based on meta-learning.
  • inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified and performing a high-order feature representation of the serialized words; applying an average pooling operation to the high-order-represented words to obtain the vector representations of the reference samples and the data to be identified; processing the vectorized representation of the data to be identified with the vectorized representation of the reference samples to obtain the high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  • the training method of the entity category recognition model implemented when the processor executes the computer-readable instructions includes: acquiring sample data and constructing multiple groups of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the entity category recognition model.
  • acquiring sample data and constructing multiple groups of meta-training samples according to the sample data, when the processor executes the computer-readable instructions, includes: acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; taking a first quantity of sample data in the extracted group(s) as support samples and a second quantity as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly extracting at least one group at random from the groups to obtain multiple groups of meta-training samples.
  • obtaining sample data and grouping the sample data according to entity categories, when the processor executes the computer-readable instructions, includes: obtaining sample data grouped according to initial entity categories and regrouping the sample data within the initial entity categories according to target entity categories; standardizing the sample data grouped by target entity category; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
  • training according to the meta-training samples to obtain the entity category recognition model includes: serializing the words in the support samples and the query samples, performing a high-order feature representation of the serialized words, and applying an average pooling operation to the high-order-represented words to obtain the vector representations of the support samples and the query samples; processing, in the entity category recognition model, the vectorized representation of the query samples with the vectorized representation of the support samples to obtain the high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples; inputting the newly added entity category corresponding to the support samples in the query samples together with the real entity category of the query samples into the random field layer to calculate the loss function; and training the entity category recognition model through the loss function.
  • serializing the words in the support samples and the query samples and performing a high-order feature representation of the serialized words includes: processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the following formula to obtain the high-level features of the query samples:
  • the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
  • k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
  • One or more computer-readable storage media storing computer-readable instructions, when the computer-readable instructions are executed by one or more processors, cause the one or more processors to perform the following steps: obtain the newly added entity category, and query The reference sample corresponding to the newly added entity category; the data to be identified is obtained; the reference sample and the to-be-identified data are input into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference sample in each of the to-be-identified data, Among them, the entity category recognition model is trained based on meta-learning.
  • the computer-readable storage medium may be non-volatile or volatile.
  • inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified and performing a high-order feature representation of the serialized words; applying an average pooling operation to the high-order-represented words to obtain the vector representations of the reference samples and the data to be identified; processing the vectorized representation of the data to be identified with the vectorized representation of the reference samples to obtain the high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  • the training method of the entity category recognition model implemented when the computer-readable instructions are executed by the processor includes: acquiring sample data and constructing multiple groups of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the entity category recognition model.
  • acquiring sample data and constructing multiple groups of meta-training samples according to the sample data, when the computer-readable instructions are executed by the processor, includes: acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; taking a first quantity of sample data in the extracted group(s) as support samples and a second quantity as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly extracting at least one group at random from the groups to obtain multiple groups of meta-training samples.
  • obtaining sample data and grouping the sample data according to entity categories, when the computer-readable instructions are executed by the processor, includes: obtaining sample data grouped according to initial entity categories and regrouping the sample data within the initial entity categories according to target entity categories; standardizing the sample data grouped by target entity category; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
  • training according to the meta-training samples to obtain the entity category recognition model includes: serializing the words in the support samples and the query samples, performing a high-order feature representation of the serialized words, and applying an average pooling operation to the high-order-represented words to obtain the vector representations of the support samples and the query samples; processing, in the entity category recognition model, the vectorized representation of the query samples with the vectorized representation of the support samples to obtain the high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples; inputting the newly added entity category corresponding to the support samples in the query samples together with the real entity category of the query samples into the random field layer to calculate the loss function; and training the entity category recognition model through the loss function.
  • serializing the words in the support samples and the query samples and performing a high-order feature representation of the serialized words includes: processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the following formula to obtain the high-level features of the query samples:
  • the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
  • k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
  • the blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • a blockchain is essentially a decentralized database, a chain of data blocks linked by cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in various forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A meta learning-based entity category recognition method, comprising: acquiring a newly added entity category, and querying a reference sample corresponding to the newly added entity category; acquiring data to be recognized; inputting the reference sample and the data into a pre-generated entity category recognition model so as to recognize a newly added entity category corresponding to a reference sample in each piece of the data, wherein the entity category recognition model is obtained by training on the basis of a meta learning manner.

Description

Meta-Learning-Based Entity Category Recognition Method, Apparatus, Device and Storage Medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese patent application No. 202011472865.X, filed with the Chinese Patent Office on December 15, 2020 and entitled "Meta-Learning-Based Entity Category Recognition Method, Apparatus, Device and Storage Medium", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
At present there is a great deal of research on named entity recognition in the field of artificial intelligence, but there are few datasets related to named entity recognition, and Chinese named entity recognition datasets in particular are very scarce. Moreover, although a set of relatively mature named entity recognition models exists on the market, these models can usually only distinguish the three more common entity categories of person name, institution and address. When new entity categories appear, these models cannot handle them.
However, the inventor realized that the traditional open-source Chinese named entity recognition datasets mainly comprise the MSRA, People's Daily, Weibo, CLUENER and BOSON datasets. The sizes of these five datasets are not very large, and the number of entity categories is insufficient: all together they cover fewer than 30 entity classes. The entity category tree in the real world is, however, much larger than 30. The traditional idea is to label as much data as there are entity categories, but this is unrealistic. The usual situation is that when a new entity category appears there are often only 10 to 100 samples of the new category, and it is not realistic to retrain the model with these samples, because the model is bound to suffer from the effects of class imbalance and overfitting.
Therefore, there is an urgent need for a method that can accurately identify the corresponding entity category in data when a new entity category appears.
SUMMARY OF THE INVENTION
According to various embodiments disclosed in the present application, a meta-learning-based entity category recognition method, apparatus, device and storage medium are provided.
A meta-learning-based entity category recognition method includes:
acquiring a newly added entity category and querying the reference samples corresponding to the newly added entity category;
acquiring data to be identified; and
inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained in a meta-learning manner.
A meta-learning-based entity category recognition apparatus includes:
a newly added entity category acquisition module, used to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
a to-be-identified data acquisition module, used to acquire the data to be identified; and
an entity identification module, used to input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained in a meta-learning manner.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the one or more processors to perform the following steps:
acquiring a newly added entity category and querying the reference samples corresponding to the newly added entity category;
acquiring data to be identified; and
inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained in a meta-learning manner.
One or more computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
acquiring a newly added entity category and querying the reference samples corresponding to the newly added entity category;
acquiring data to be identified; and
inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained in a meta-learning manner.
The above meta-learning-based entity category recognition method, apparatus, device and storage medium determine reference samples according to the newly added entity category and input the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified. No manual intervention and no specialized artificial-intelligence expertise are required, which greatly reduces labor costs; and when a new entity category appears, the model does not need to be retrained: only a few reference samples are needed to identify the data to be identified and determine whether the entity category is present.
The details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will be apparent from the description, the drawings and the claims.
DESCRIPTION OF THE DRAWINGS
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings required for the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can also be obtained from these drawings without any creative effort.
FIG. 1 is a schematic flowchart of a meta-learning-based entity category recognition method according to one or more embodiments.
FIG. 2 is a schematic flowchart of a meta-learning-based entity category recognition method according to one or more further embodiments.
FIG. 3 is a structural block diagram of a meta-learning-based entity category recognition apparatus according to one or more embodiments.
FIG. 4 is a block diagram of a computer device according to one or more embodiments.
DETAILED DESCRIPTION
In order to make the technical solutions and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit it.
In one embodiment, as shown in FIG. 1, a meta-learning-based entity category recognition method is provided. This embodiment is illustrated by applying the method to a terminal; it can be understood that the method can also be applied to a server, or to a system including a terminal and a server, where it is realized through the interaction between the terminal and the server. In this embodiment, the method includes the following steps.
S102: Acquire the newly added entity category and query the reference samples corresponding to the newly added entity category.
Specifically, the newly added entity category may be the name of a newly added entity, and there may be at least one newly added entity category. The reference samples are samples belonging to the newly added entity category; the number of reference samples may be 10 or more, but it is typically not large. The server may establish the correspondence between each newly added entity category and its reference samples, for example by grouping. In addition, it should be noted that the same reference sample may belong to multiple newly added entity categories, that is, a reference sample may be labeled with multiple entity categories.
S104: Acquire the data to be identified.
Specifically, the data to be identified is data whose entity categories need to be determined; it may be newly added data or previous data.
S106: Input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained in a meta-learning manner.
Specifically, the entity category recognition model is trained based on meta-learning: multiple meta-training tasks are constructed from the sample data, and the entity category recognition model is then obtained by training on the constructed meta-training tasks. A meta-training task provides a small number of support samples and a large number of query samples; training on these support and query samples yields an entity category recognition model that can identify the entity categories of the data to be identified from only a few samples of a new entity category.
The server inputs the reference samples and the data to be identified into the pre-generated entity category recognition model, so that the entity category recognition model identifies the newly added entity category corresponding to the reference samples in each piece of data to be identified.
The process by which the entity category recognition model identifies the newly added entity category corresponding to the reference samples in each piece of data to be identified may include a step of processing the reference samples and the data to be identified, a step of computing a high-level feature representation of the data to be identified from the processed reference samples, and a step of processing that high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be identified.
Processing the reference samples and the data to be identified may include: serializing the words in the reference samples and the data to be identified, performing a high-order representation of the serialized words, and finally applying an average pooling operation to the high-order-represented words to obtain the vector representations corresponding to the reference samples and the data to be identified.
The step in which the server computes the high-level feature representation of the data to be identified from the processed reference samples may be performed according to the following formula:
Figure PCTCN2021109617-appb-000001
Figure PCTCN2021109617-appb-000001
Figure PCTCN2021109617-appb-000002
Figure PCTCN2021109617-appb-000002
其中,
Figure PCTCN2021109617-appb-000003
是q j在经过参照样本建模之后得到的待识别数据的高层特征,其在一定程度上建模了参照样本和待识别数据之间的关系。atten函数是用来计算每个参照样本对待识别数据中的命名实体识别的贡献度。
Figure PCTCN2021109617-appb-000004
代表两个向量拼接成更长的一个新向量,T是一个实数,用于控制atten得到的分布的尖锐程度。k代表参照样本的序号,其与参照样本的样本数量相关。
in,
Figure PCTCN2021109617-appb-000003
It is the high-level feature of the data to be identified obtained by q j after modeling the reference sample, which models the relationship between the reference sample and the data to be identified to a certain extent. The atten function is used to calculate the contribution of each reference sample to the recognition of named entities in the data to be recognized.
Figure PCTCN2021109617-appb-000004
Represents the splicing of two vectors into a new longer vector, T is a real number that controls the sharpness of the distribution obtained by atten. k represents the serial number of the reference sample, which is related to the number of samples of the reference sample.
最后，根据该高层特征表示进行处理以确定每一待识别数据中对应参照样本的新增实体类别的步骤，包括：将高层特征输入至预先确定的全连接层中，将每个单词的特征向量维度映射到预设维度，例如3维，这三维分别代表单词的label是O,B,I，即不属于该类别，属于该类别且位于句子开头，属于该类别且位于句子中间。Finally, the step of processing the high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: inputting the high-level features into a predetermined fully connected layer, which maps the feature vector of each word to a preset dimension, for example 3 dimensions; these three dimensions respectively indicate that the label of the word is O, B or I, that is, the word does not belong to the category, belongs to the category and is located at the beginning of the sentence, or belongs to the category and is located in the middle of the sentence.
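As a purely illustrative sketch of how the 3-dimensional O/B/I output described above can be turned into recognized entity mentions (the array and variable names below, such as logits and tokens, and the column order are assumptions for illustration and are not part of the original disclosure), one possible decoding routine in Python is:

    import numpy as np

    ID2LABEL = {0: "O", 1: "B", 2: "I"}

    def decode_spans(logits: np.ndarray, tokens: list) -> list:
        # logits: (seq_len, 3) scores from the fully connected layer, columns ordered [O, B, I]
        labels = [ID2LABEL[i] for i in logits.argmax(axis=-1)]
        spans, start = [], None
        for idx, label in enumerate(labels):
            if label == "B":                      # a new mention of the category starts here
                if start is not None:
                    spans.append((start, idx, "".join(tokens[start:idx])))
                start = idx
            elif label == "O":                    # the current mention (if any) has ended
                if start is not None:
                    spans.append((start, idx, "".join(tokens[start:idx])))
                    start = None
            # label == "I": the current mention continues
        if start is not None:
            spans.append((start, len(tokens), "".join(tokens[start:])))
        return spans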
需要强调的是,为进一步保证上述新增实体类别和待识别数据的实体类别的私密和安全性,上述新增实体类别和待识别数据的实体类别还可以存储于一区块链的节点中。It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned new entity category and the entity category of the data to be identified, the above-mentioned new entity category and the entity category of the data to be identified can also be stored in a node of a blockchain.
上述基于元学习的实体类别识别方法，根据新增实体类别确定了参照样本并将参照样本和待识别数据输入至预先生成的实体类别识别模型，以识别每一所述待识别数据中对应所述参照样本的新增实体类别，不需要人工干涉，不需要专门的人工智能领域的知识，大大减少了人力成本，且当有新增实体类别时，不需要重新训练模型，只需要少数几个参照样本即可以对待识别数据进行识别以确定是否存在实体类别。In the above meta-learning-based entity category identification method, reference samples are determined according to the newly added entity category, and the reference samples and the data to be identified are input into a pre-generated entity category recognition model to identify, in each piece of data to be identified, the newly added entity category corresponding to the reference samples. No manual intervention and no specialized artificial-intelligence expertise are required, which greatly reduces labour costs; and when a new entity category appears, the model does not need to be retrained, since only a few reference samples are needed to identify whether the entity category is present in the data to be identified.
在其中一个实施例中，将参照样本和待识别数据输入至预先生成的实体类别识别模型中，以识别每一待识别数据中对应参照样本的新增实体类别，包括：将参照样本和待识别数据中的单词进行序列化，并将序列化后的单词进行高阶特征表示；对高阶特征表示后的单词进行平均池化操作得到参照样本和待识别数据的向量表示；通过参照样本的向量化表示对待识别数据的向量化表示进行处理，得到待识别数据的高层特征；对高层特征进行处理得到待识别数据中对应参照样本的新增实体类别。In one embodiment, inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and in the data to be identified, and producing high-order feature representations of the serialized words; applying an average pooling operation to the words after the high-order feature representation to obtain the vector representations of the reference samples and of the data to be identified; processing the vectorized representation of the data to be identified through the vectorized representation of the reference samples to obtain the high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
具体地，假设参照样本的单词序列分别为（附图 appb-000005），则参照样本的输入为（附图 appb-000006）；待识别数据的单词序列为（附图 appb-000007），则待识别数据的输入为（附图 appb-000008）。Specifically, assume that the word sequences of the reference samples are as shown in figure appb-000005, so the input of the reference samples is as shown in figure appb-000006; the word sequence of the data to be identified is as shown in figure appb-000007, so the input of the data to be identified is as shown in figure appb-000008.
将参照样本和待识别数据输入BERT之后,可以通过以下式子得到参照样本和待识别数据的每个单词的高阶特征表示:After inputting the reference sample and the data to be recognized into BERT, the high-level feature representation of each word of the reference sample and the data to be recognized can be obtained by the following formula:
（单词高阶特征的计算公式见附图 appb-000009、appb-000010，分别对应 s_i 与 q_j 经过 BERT 得到的编码。The formulas for the word-level high-order features are shown in figure images appb-000009 and appb-000010, corresponding to the BERT encodings s_i and q_j respectively.）
其中，第i个和第j个单词（附图 appb-000011、appb-000012）分别来自参照样本和待识别数据，s_i和q_j分别是这两个单词在经过BERT之后得到的高阶特征表示。Here, the i-th and j-th words (figures appb-000011 and appb-000012) are the words of the reference sample and of the data to be identified respectively, and s_i and q_j are the high-order feature representations obtained for these two words after passing through BERT.
在得到这些单词的高阶特征表示之后,利用平均池化操作来得到一个统一的向量表示,用来代表整个参照样本和待识别数据:After obtaining the high-level feature representation of these words, the average pooling operation is used to obtain a unified vector representation, which is used to represent the entire reference sample and the data to be recognized:
s_rep = MEAN_POOLING_i(s_i)
q_rep = MEAN_POOLING_j(q_j)
这样得到的s_rep就代表整个参照样本的特征表示，q_rep就代表整个待识别数据的特征表示。The s_rep obtained in this way represents the feature representation of the entire reference sample, and q_rep represents the feature representation of the entire data to be identified.
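The encoding and average pooling steps above can be sketched as follows. This is only an illustration: the use of the HuggingFace Transformers library and the bert-base-chinese checkpoint are assumptions, since the text only specifies that a Chinese pre-trained BERT model is used.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    encoder = BertModel.from_pretrained("bert-base-chinese")

    def encode(sentence: str):
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = encoder(**inputs).last_hidden_state   # (1, seq_len, 768): per-word features s_i / q_j
        rep = hidden.mean(dim=1)                           # (1, 768): s_rep / q_rep via average pooling
        return hidden.squeeze(0), rep.squeeze(0)

    s_words, s_rep = encode("参照样本的示例句子")    # a reference / support sample
    q_words, q_rep = encode("待识别数据的示例句子")  # the data to be identified / a query sample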
在得到整个参照样本和待识别数据的特征表示之后,可以根据参照样本来得到待识别数据的更高阶特征表示:After obtaining the feature representation of the entire reference sample and the data to be identified, the higher-order feature representation of the data to be identified can be obtained according to the reference sample:
（根据参照样本计算待识别数据更高阶特征的公式见附图 appb-000013、appb-000014。The formulas for obtaining the higher-order feature representation of the data to be identified from the reference samples are shown in figure images appb-000013 and appb-000014.）

其中，待识别数据的高层特征（附图 appb-000015）是q_j在经过参照样本建模之后得到的，其在一定程度上建模了参照样本和待识别数据之间的关系。atten函数是用来计算每个参照样本对待识别数据中的命名实体识别的贡献度。拼接算子（附图 appb-000016）代表两个向量拼接成更长的一个新向量，T是一个实数，用于控制atten函数得到的分布的尖锐程度。k代表参照样本的序号，其与参照样本的样本数量相关。Here, the high-level feature of the data to be identified (figure appb-000015) is obtained from q_j after modelling against the reference samples, and it models, to a certain extent, the relationship between the reference samples and the data to be identified. The atten function is used to calculate the contribution of each reference sample to named entity recognition in the data to be identified. The concatenation operator (figure appb-000016) denotes splicing two vectors into one longer new vector, and T is a real number used to control the sharpness of the distribution produced by the atten function. k is the index of a reference sample, which is related to the number of reference samples.
服务器得到了查询样本中每个单词的最终特征表示，然后经过一个全连接层，将每个单词的特征向量维度映射到3维，这三维分别代表单词的label是O,B,I，即不属于该类别，属于该类别且位于句子开头，属于该类别且位于句子中间，也就是待识别数据中对应参照样本的新增实体类别。The server obtains the final feature representation of each word in the query sample and then passes it through a fully connected layer that maps the feature vector of each word to 3 dimensions; these three dimensions respectively indicate that the label of the word is O, B or I, that is, the word does not belong to the category, belongs to the category and is located at the beginning of the sentence, or belongs to the category and is located in the middle of the sentence, which gives the newly added entity category corresponding to the reference samples in the data to be identified.
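Because the two attention formulas are only available as figure images in this text, the following sketch shows one plausible reading of them: a learned atten score over the concatenation of each reference (support) representation with the query representation, sharpened by the temperature T, with the resulting context mixed into every query-word feature. The concrete scoring layer and the way the context is combined with q_j are assumptions, not the exact formulas of the original.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SupportAttention(nn.Module):
        def __init__(self, hidden: int = 768, temperature: float = 1.0):
            super().__init__()
            self.score = nn.Linear(2 * hidden, 1)   # atten over the concatenation [s_rep_k ; q_rep]
            self.temperature = temperature          # T: controls the sharpness of the distribution

        def forward(self, q_words, q_rep, s_reps):
            # q_words: (seq_len, hidden), q_rep: (hidden,), s_reps: (K, hidden) for K reference samples
            pairs = torch.cat([s_reps, q_rep.expand_as(s_reps)], dim=-1)               # (K, 2*hidden)
            weights = F.softmax(self.score(pairs).squeeze(-1) / self.temperature, 0)   # contribution of sample k
            context = (weights.unsqueeze(-1) * s_reps).sum(dim=0)                      # (hidden,)
            return q_words + context   # higher-order feature of every query word, conditioned on the references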
在其中一个实施例中,实体类别识别模型的训练方式包括:获取样本数据,并根据样本数据构建多组元训练样本;根据元训练样本进行训练得到实体类别识别模型。In one of the embodiments, the training method of the entity category recognition model includes: acquiring sample data, and constructing multiple groups of meta training samples according to the sample data; and obtaining the entity category recognition model by training according to the meta training samples.
具体地,样本数据可以是预先设置的已经分类完成的样本。元训练样本是根据样本数据进行处理得到的,其中每个元训练样本可以包括多个支撑样本和多个查询样本,其中支撑样本中可以包括多个分组的样本数据,即属于不同分类的样本数据,对应的查询样本也是相应的分组中的查询样本。其中元训练样本的组数可以根据需要进行设置,例如一万个,然后通过该元训练样本来进行训练得到目标分类模型,例如依次通过元训练样本进行训练直至实体类别识别模型的准确率达到预期,其中对于实体类别识别模型的准确率的计算可以根据元训练样本进行处理,例如将元训练样本中的支撑样本和查询样本输入至实体类别识别模型中,以确定查询样本对应的实体类别,若是与查询样本的真实实体类别相比较,达到预期,则模型训练完成。Specifically, the sample data may be preset samples that have been classified. The meta-training samples are processed according to the sample data, wherein each meta-training sample may include multiple support samples and multiple query samples, and the support samples may include multiple grouped sample data, that is, sample data belonging to different categories , the corresponding query sample is also the query sample in the corresponding group. The number of groups of meta-training samples can be set as required, such as 10,000, and then the target classification model is obtained by training the meta-training samples, for example, training is performed through the meta-training samples in turn until the accuracy rate of the entity category recognition model reaches the expectation , the calculation of the accuracy of the entity category recognition model can be processed according to the meta-training samples, for example, the support samples and query samples in the meta-training samples are input into the entity category recognition model to determine the entity category corresponding to the query sample. Compared with the real entity category of the query sample, the model training is completed.
在其中一个实施例中，获取样本数据，并根据样本数据构建多组元训练样本，包括：获取样本数据，对样本数据按照实体类别进行分组，并随机从分组中抽取至少一个分组；确定所抽取的至少一个分组中的第一数量样本数据为支撑样本，第二数量样本数据为查询样本；根据支撑样本和查询样本得到一组元训练样本；重复随机从分组中抽取至少一个分组以得到多组元训练样本。In one embodiment, acquiring sample data and constructing multiple groups of meta-training samples from the sample data includes: acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; determining a first number of sample data items in the extracted group(s) as support samples and a second number of sample data items as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeating the random extraction of at least one group from the groups to obtain multiple groups of meta-training samples.
具体地,服务器首先获取到原始样本数据,然后对原始样本数据进行处理得到每个类别所对应的数据集之后,开始构建训练集。为了训练元学习模型,首先需要构建一些列的元训练样本,构建规则如下:Specifically, the server first obtains the original sample data, then processes the original sample data to obtain a data set corresponding to each category, and then starts to construct a training set. In order to train a meta-learning model, it is first necessary to construct some columns of meta-training samples. The construction rules are as follows:
从处理得到的实体类（例如12个实体类别）中随机抽取若干分组，例如3个类别，不妨表示为l_1、l_2、l_3。从l_1、l_2、l_3这3个类别中，每个类别随机抽取第一数量（例如10个）样本作为支撑样本，每个类别随机抽取第二数量（例如100个）样本作为查询样本，因此一共会得到30个支撑样本，300个查询样本。From the entity classes obtained by the above processing, for example 12 entity categories, several groups are randomly selected, for example 3 categories, which may be denoted l_1, l_2, l_3. From these 3 categories l_1, l_2, l_3, a first number of samples, for example 10, is randomly drawn from each category as support samples, and a second number of samples, for example 100, is randomly drawn from each category as query samples, so a total of 30 support samples and 300 query samples are obtained.
服务器将这样一次构建的数据集成为一个元训练任务(meta-training task),该任务的目的是训使得模型能够在给定支撑样本的前提下,为查询样本进行分类。为了训练模型,服务器可以构建了10000个这样的元训练任务。The server converts the data set constructed in this way into a meta-training task, the purpose of which is to train the model to classify the query samples given the support samples. To train the model, the server can build 10,000 such meta-training tasks.
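A minimal sketch of this task-construction rule is given below. The dictionary layout of datasets (an entity class name mapped to a list of labelled sentences) is an assumed representation and not something fixed by the original text.

    import random

    def build_meta_task(datasets: dict, n_way: int = 3, k_support: int = 10, n_query: int = 100):
        # sample n_way entity classes, then k_support support and n_query query sentences per class
        classes = random.sample(list(datasets), n_way)        # e.g. ["ZH-PER", "ZH-LOC", "ZH-TIM"]
        support, query = [], []
        for cls in classes:
            picked = random.sample(datasets[cls], k_support + n_query)
            support += [(cls, s) for s in picked[:k_support]]
            query += [(cls, s) for s in picked[k_support:]]
        return support, query

    # e.g. build the 10,000 meta-training tasks mentioned above
    # meta_tasks = [build_meta_task(datasets) for _ in range(10000)]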
在其中一个实施例中,获取样本数据,对样本数据按照实体类别进行分组,包括:获取按照初始实体类别进行分组的样本数据,对初始实体类别中的样本数据按照目标实体类别进行分组;对按照目标实体类别进行分组的样本数据进行标准化处理;将各个初始实体类别对应的标准化处理后的目标实体类别进行合并,得到与目标实体类别对应的分组。In one embodiment, obtaining sample data and grouping the sample data according to entity categories includes: obtaining sample data grouped according to initial entity categories, and grouping sample data in the initial entity categories according to target entity categories; Standardize the sample data grouped by the target entity category; combine the standardized target entity categories corresponding to each initial entity category to obtain groups corresponding to the target entity category.
具体地，初始实体类别是从网上收集的开源的MSRA、人民日报、微博、CLUENER、BOSON数据集，由于这些数据集的标注格式不统一，因此要将这些数据进行预处理，统一成BIO标注格式的数据。具体地，传统的命名实体识别数据集在标注数据集的时候，有的用的是BIO格式，有的用的是BIEO格式，此处是将BIEO格式转换为BIO格式。Specifically, the initial entity categories come from the open-source MSRA, People's Daily, Weibo, CLUENER and BOSON datasets collected from the Internet. Since the annotation formats of these datasets are not uniform, the data must be preprocessed and unified into the BIO annotation format. Specifically, when traditional named entity recognition datasets are annotated, some use the BIO format and some use the BIEO format; here the BIEO format is converted to the BIO format.
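As an illustration of this unification step, the sketch below maps a BIEO tag sequence to BIO by rewriting every end tag E-X as I-X; the exact tag strings used in the source datasets are an assumption here.

    def bieo_to_bio(tags: list) -> list:
        # "E-PER" (last token of an entity) becomes "I-PER"; all other tags are kept unchanged
        return ["I-" + tag[2:] if tag.startswith("E-") else tag for tag in tags]

    # bieo_to_bio(["B-PER", "I-PER", "E-PER", "O"])  ->  ["B-PER", "I-PER", "I-PER", "O"]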
其中目标实体是初始类别实体中所标注的，例如PER人名、LOC地点、ORG组织、TIM时间、COM公司名、ADD具体地址、GAME游戏名、GOV政府部门、SCENCE景点、BOOK书籍、MOVIE电影以及PRODUCT产品等。服务器统计MSRA、人民日报、微博、CLUENER、BOSON数据集中每个数据集所标注的实体类别，其中MSRA标注了PER、LOC、ORG这三个实体类别，那么设L(MSRA)为MSRA所标注的实体类别集合，则L(MSRA)={PER、LOC、ORG}。类似的，服务器可以得到L(人民日报)={PER、LOC、ORG、TIM}，L(微博)={PER、ORG、LOC}，L(CLUENER)={PER、LOC、ORG、COM、ADD、GAME、GOV、SCENCE、BOOK、MOVIE}，L(BOSON)={PER、LOC、ORG、COM、TIM、PRODUCT}。The target entities are those annotated in the initial category datasets, for example PER (person name), LOC (location), ORG (organization), TIM (time), COM (company name), ADD (specific address), GAME (game name), GOV (government department), SCENCE (scenic spot), BOOK (book), MOVIE (movie) and PRODUCT (product). The server counts the entity categories annotated in each of the MSRA, People's Daily, Weibo, CLUENER and BOSON datasets. MSRA annotates the three entity categories PER, LOC and ORG, so, letting L(MSRA) denote the set of entity categories annotated in MSRA, L(MSRA)={PER, LOC, ORG}. Similarly, the server obtains L(People's Daily)={PER, LOC, ORG, TIM}, L(Weibo)={PER, ORG, LOC}, L(CLUENER)={PER, LOC, ORG, COM, ADD, GAME, GOV, SCENCE, BOOK, MOVIE} and L(BOSON)={PER, LOC, ORG, COM, TIM, PRODUCT}.
根据上一步得到的数据集标注实体类别集合，将每个数据集按照单个实体类别划分成新的数据集。例如对于MSRA数据集，其L(MSRA)={PER、LOC、ORG}，首先考虑PER类别，将MSRA数据集中的所有PER正样本保留，其他的诸如LOC、ORG这两个类别的正样本全部标注为负样本，MSRA中原本存在的负样本保持不变，则新得到的数据集中只包含了PER类别的正样本，其他类别的正样本全部变成了负样本，最后剔除掉整句话都是负样本的句子，记这样的数据集为MSRA-PER。类似地，服务器可以得到MSRA-ORG、MSRA-LOC数据集；针对另外四个数据集，服务器也可以得到CLUENER-PER、CLUENER-ADD...等数据集。具体地，例如“我叫AB，住在CD，在EF上班”，其中AB是PER实体，CD是LOC实体，EF是ORG实体，这些都算正样本，而“我叫”、“住在”、“在”、“上班”这些都算负样本。According to the set of annotated entity categories of each dataset obtained in the previous step, each dataset is split by single entity category into new datasets. For example, for the MSRA dataset, L(MSRA)={PER, LOC, ORG}. Consider the PER category first: all PER positive samples in the MSRA dataset are kept, the positive samples of the other categories such as LOC and ORG are all relabelled as negative samples, and the negative samples originally present in MSRA remain unchanged. The newly obtained dataset therefore contains only positive samples of the PER category, the positive samples of the other categories have all become negative samples, and finally sentences consisting entirely of negative samples are removed. Such a dataset is denoted MSRA-PER. Similarly, the server can obtain the MSRA-ORG and MSRA-LOC datasets, and for the other four datasets the server can likewise obtain datasets such as CLUENER-PER, CLUENER-ADD... and so on. As a concrete example, in “我叫AB，住在CD，在EF上班” (“My name is AB, I live in CD, and I work at EF”), AB is a PER entity, CD is a LOC entity and EF is an ORG entity, and these are positive samples, while “我叫” (my name is), “住在” (live in), “在” (at) and “上班” (work) are all negative samples.
之后，服务器可以得到MSRA-PER、人民日报-PER、CLUENER-PER、微博-PER、BOSON-PER这五个和PER相关的数据集，经过上面的分析可以知道，这五个数据集中都只包含PER类别的实体，其他类别的实体都是负样本，因此将这五个和PER相关的数据集混合起来，构成一个新的数据集，记为ZH-PER数据集。类似的，服务器可以得到ZH-LOC、ZH-ORG、ZH-TIM、ZH-ADD、ZH-COM、ZH-BOOK等共12个数据集。After that, the server obtains five PER-related datasets: MSRA-PER, People's Daily-PER, CLUENER-PER, Weibo-PER and BOSON-PER. From the above analysis, each of these five datasets contains only entities of the PER category, and entities of all other categories are negative samples, so these five PER-related datasets are mixed together to form a new dataset, denoted the ZH-PER dataset. Similarly, the server can obtain a total of 12 datasets such as ZH-LOC, ZH-ORG, ZH-TIM, ZH-ADD, ZH-COM and ZH-BOOK.
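The per-category splitting and merging described in the two preceding paragraphs can be sketched as follows; the (tokens, tags) pair layout of a labelled sentence is an assumed representation for illustration.

    def keep_only_class(dataset, target: str):
        # keep tags of the target class, relabel all other entity tags as "O",
        # and drop sentences whose tags are then all negative
        kept = []
        for tokens, tags in dataset:
            new_tags = [t if t.endswith("-" + target) else "O" for t in tags]
            if any(t != "O" for t in new_tags):
                kept.append((tokens, new_tags))
        return kept

    # msra_per = keep_only_class(msra, "PER")
    # zh_per = msra_per + renmin_per + cluener_per + weibo_per + boson_per   # the merged ZH-PER dataset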
在其中一个实施例中，参见图2所示，图2为一个实施例中实体类别识别模型训练过程的流程图，该根据元训练样本进行训练得到实体类别识别模型，包括：将支撑样本和查询样本中的单词进行序列化，并将序列化后的单词进行高阶特征表示，且对高阶特征表示后的单词进行平均池化操作得到支撑样本和查询样本的向量表示；根据实体类别识别模型通过支撑样本的向量化表示对查询样本的向量化表示进行处理，得到查询样本的高层特征；对查询样本的高层特征进行处理得到查询样本中对应支撑样本的新增实体类别；将所得到的查询样本中对应支撑样本的新增实体类别与查询样本的真实实体类别输入至随机场层中计算得到损失函数；通过损失函数对实体类别识别模型进行训练。In one embodiment, referring to FIG. 2, which is a flowchart of the training process of the entity category recognition model in one embodiment, the training according to the meta-training samples to obtain the entity category recognition model includes: serializing the words in the support samples and the query samples, producing high-order feature representations of the serialized words, and applying an average pooling operation to the words after the high-order feature representation to obtain the vector representations of the support samples and the query samples; processing, by the entity category recognition model, the vectorized representation of the query samples through the vectorized representation of the support samples to obtain the high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples; inputting the obtained newly added entity category corresponding to the support samples in the query samples and the real entity category of the query samples into a random field layer to calculate a loss function; and training the entity category recognition model through the loss function.
具体地,在构建好元训练任务之后,服务器开始构建模型。本文采用中文预训练语言模型BERT来编码句子的特征表示,模型主体架构如下:Specifically, after the meta-training task is constructed, the server starts to construct the model. In this paper, the Chinese pre-trained language model BERT is used to encode the feature representation of the sentence. The main structure of the model is as follows:
设支撑（support）样本的单词序列分别为（附图 appb-000017），则支撑样本的输入为（附图 appb-000018）；查询样本的单词序列为（附图 appb-000019），则查询样本的输入为（附图 appb-000020）。Let the word sequences of the support samples be as shown in figure appb-000017, so the input of the support samples is as shown in figure appb-000018; the word sequence of the query sample is as shown in figure appb-000019, so the input of the query sample is as shown in figure appb-000020.
将支撑样本和查询样本输入BERT之后,服务器通过以下式子得到这些样本的每个单词的高阶特征表示:After inputting support samples and query samples into BERT, the server obtains the high-order feature representation of each word of these samples by the following formula:
（对应公式见附图 appb-000021、appb-000022，分别对应 s_i 与 q_j 经过 BERT 得到的编码。The corresponding formulas are shown in figure images appb-000021 and appb-000022, corresponding to the BERT encodings s_i and q_j respectively.）
其中，第i个和第j个单词（附图 appb-000023、appb-000024）分别来自支撑样本和查询样本，s_i和q_j分别是这两个单词在经过BERT之后得到的高阶特征表示。Here, the i-th and j-th words (figures appb-000023 and appb-000024) are the words of the support sample and of the query sample respectively, and s_i and q_j are the high-order feature representations obtained for these two words after passing through BERT.
在得到这些单词的高阶特征表示之后,服务器利用平均池化操作来得到一个统一的向量表示,用来代表整个样本:After obtaining the high-level feature representations of these words, the server uses an average pooling operation to obtain a unified vector representation that represents the entire sample:
s_rep = MEAN_POOLING_i(s_i)
q_rep = MEAN_POOLING_j(q_j)
这样得到的s_rep就代表整个支撑样本的特征表示，q_rep就代表整个查询样本的特征表示。The s_rep obtained in this way represents the feature representation of the entire support sample, and q_rep represents the feature representation of the entire query sample.
在得到整个样本的特征表示之后,服务器根据支撑样本来得到查询样本的更高阶特征表示:After obtaining the feature representation of the entire sample, the server obtains the higher-order feature representation of the query sample based on the support sample:
（根据支撑样本计算查询样本更高阶特征的公式见附图 appb-000025、appb-000026。The formulas for obtaining the higher-order feature representation of the query sample from the support samples are shown in figure images appb-000025 and appb-000026.）

其中，查询样本的更高层特征表示（附图 appb-000027）是q_j在经过支撑样本建模之后得到的，其在一定程度上建模了支撑样本和查询样本之间的关系。atten函数是用来计算每个支撑样本对查询样本命名实体识别的贡献度。拼接算子（附图 appb-000028）代表两个向量拼接成更长的一个新向量，T是一个实数，用于控制atten得到的分布的尖锐程度。k代表支撑样本的序号，由于针对每个类别选取了10个支撑样本，因此k最大取10。Here, the higher-level feature representation of the query sample (figure appb-000027) is obtained from q_j after modelling against the support samples, and it models, to a certain extent, the relationship between the support samples and the query sample. The atten function is used to calculate the contribution of each support sample to named entity recognition in the query sample. The concatenation operator (figure appb-000028) denotes splicing two vectors into one longer new vector, and T is a real number used to control the sharpness of the distribution produced by atten. k is the index of a support sample; since 10 support samples are selected for each category, k is at most 10.
这样，服务器就得到了查询样本中每个单词的最终特征表示，然后经过一个全连接层，将每个单词的特征向量维度映射到3维，这三维分别代表单词的label是O,B,I，即不属于该类别，属于该类别且位于句子开头，属于该类别且位于句子中间。在经过全连接层将每个单词映射到3维之后，再利用一个条件随机场CRF层来计算最终的损失，通过损失函数来对模型进行训练。In this way, the server obtains the final feature representation of each word in the query sample, which then passes through a fully connected layer that maps the feature vector of each word to 3 dimensions; these three dimensions respectively indicate that the label of the word is O, B or I, that is, the word does not belong to the category, belongs to the category and is located at the beginning of the sentence, or belongs to the category and is located in the middle of the sentence. After each word has been mapped to 3 dimensions by the fully connected layer, a conditional random field (CRF) layer is used to calculate the final loss, and the model is trained through this loss function.
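A minimal sketch of the loss computation described above is shown below. The use of the third-party pytorch-crf package for the conditional random field layer and the choice of optimiser are assumptions; the text itself only specifies a fully connected layer that maps each word to 3 dimensions followed by a CRF layer that computes the final loss.

    import torch
    import torch.nn as nn
    from torchcrf import CRF

    NUM_LABELS = 3   # O, B, I

    projection = nn.Linear(768, NUM_LABELS)     # fully connected layer: word feature -> 3 label scores
    crf = CRF(NUM_LABELS, batch_first=True)     # conditional random field layer
    optimizer = torch.optim.Adam(list(projection.parameters()) + list(crf.parameters()), lr=1e-4)

    def training_step(query_word_features, gold_labels, mask):
        # query_word_features: (batch, seq_len, 768) final per-word features of the query samples
        # gold_labels:         (batch, seq_len) integers in {0: O, 1: B, 2: I}; mask marks real tokens
        emissions = projection(query_word_features)
        loss = -crf(emissions, gold_labels, mask=mask, reduction="mean")   # negative log-likelihood
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()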
在其中一个实施例中，将支撑样本和查询样本中的单词进行序列化，并将序列化后的单词进行高阶特征表示，包括：根据以下公式通过支撑样本的向量化表示对查询样本的向量化表示进行处理，得到查询样本的高层特征：In one embodiment, serializing the words in the support samples and the query samples and producing high-order feature representations of the serialized words includes: processing the vectorized representation of the query samples through the vectorized representation of the support samples according to the following formulas to obtain the high-level features of the query samples:
（对应公式见附图 appb-000029、appb-000030。The corresponding formulas are shown in figure images appb-000029 and appb-000030.）

其中，查询样本的高层特征（附图 appb-000031）是q_j在经过支撑样本建模之后得到的；atten函数是用来计算每个支撑样本对查询样本命名实体识别的贡献度；拼接算子（附图 appb-000032）代表两个向量拼接成一个新向量，T是一个实数，用于控制atten函数得到的分布的尖锐程度；k代表支撑样本的序号，k的取值与支撑样本的样本数量相关。Here, the high-level feature of the query sample (figure appb-000031) is obtained from q_j after modelling against the support samples; the atten function is used to calculate the contribution of each support sample to named entity recognition in the query sample; the concatenation operator (figure appb-000032) denotes splicing two vectors into a new vector, T is a real number used to control the sharpness of the distribution produced by the atten function, and the value of k, the index of a support sample, is related to the number of support samples.
应该理解的是,虽然图1和2的流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,图1和2中的至少一部分步骤可以包括多个子步骤或者多个阶段,这些子步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些子步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。It should be understood that although the steps in the flowcharts of FIGS. 1 and 2 are shown in sequence according to the arrows, these steps are not necessarily executed in the sequence shown by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited to the order, and these steps may be performed in other orders. Moreover, at least a part of the steps in FIGS. 1 and 2 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but may be executed at different times. These sub-steps or stages The order of execution of the steps is not necessarily sequential, but may be performed alternately or alternately with other steps or at least a part of sub-steps or stages of other steps.
在其中一个实施例中,如图3所示,提供了一种基于元学习的实体类别识别装置,包 括:新增实体类别获取模块100、待识别数据获取模块200和实体识别模块300,其中:In one embodiment, as shown in Figure 3, a meta-learning-based entity category identification device is provided, including: a newly added entity category acquisition module 100, a data acquisition module 200 to be identified, and an entity identification module 300, wherein:
新增实体类别获取模块100,用于获取新增实体类别,并查询与新增实体类别对应的参照样本;A newly added entity category acquisition module 100 is used to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
待识别数据获取模块200,用于获取待识别数据;an acquisition module 200 for data to be identified, configured to acquire data to be identified;
实体识别模块300，用于将参照样本和待识别数据输入至预先生成的实体类别识别模型中，以识别每一待识别数据中对应参照样本的新增实体类别，其中，实体类别识别模型是基于元学习的方式训练得到的。The entity identification module 300 is configured to input the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained in a meta-learning manner.
在其中一个实施例中,上述实体识别模块300可以包括:In one embodiment, the above entity identification module 300 may include:
转换单元,用于将参照样本和待识别数据中的单词进行序列化,并将序列化后的单词进行高阶特征表示;The conversion unit is used to serialize the words in the reference sample and the data to be recognized, and perform high-level feature representation on the serialized words;
第一向量化单元,用于对高阶特征表示后的单词进行平均池化操作得到参照样本和待识别数据的向量表示;The first vectorization unit is used to perform an average pooling operation on the words represented by the high-order features to obtain the vector representation of the reference sample and the data to be recognized;
第一高层特征表示单元,用于通过参照样本的向量化表示对待识别数据的向量化表示进行处理,得到待识别数据的高层特征;a first high-level feature representation unit, configured to process the vectorized representation of the data to be identified by referring to the vectorized representation of the sample to obtain high-level features of the data to be identified;
识别单元,用于对高层特征进行处理得到待识别数据中对应参照样本的新增实体类别。The identification unit is used to process the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
在其中一个实施例中,上述的基于元学习的实体类别识别装置包括:In one of the embodiments, the above-mentioned device for identifying entity categories based on meta-learning includes:
样本获取模块,用于获取样本数据,并根据样本数据构建多组元训练样本;The sample acquisition module is used to acquire sample data and construct multi-group training samples according to the sample data;
训练模块，用于根据元训练样本进行训练得到实体类别识别模型。A training module, configured to obtain the entity category recognition model by training according to the meta-training samples.
在其中一个实施例中,上述的样本获取模块可以包括:In one embodiment, the above-mentioned sample acquisition module may include:
分组单元,用于获取样本数据,对样本数据按照实体类别进行分组,并随机从分组中抽取至少一个分组;a grouping unit, used for acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one grouping from the grouping;
抽取单元,用于确定所抽取的至少一个分组中的第一数量样本数据为支撑样本,第二数量样本数据为查询样本;an extraction unit, configured to determine that the first quantity of sample data in the extracted at least one grouping is a support sample, and the second quantity of sample data is a query sample;
组合单元,用于根据支撑样本和查询样本得到一组元训练样本;The combination unit is used to obtain a set of meta training samples according to the support samples and the query samples;
循环单元,用于重复随机从分组中抽取至少一个分组以得到多组元训练样本。The loop unit is used to repeatedly randomly extract at least one group from the groups to obtain multiple groups of training samples.
在其中一个实施例中,上述的分组单元可以包括:In one embodiment, the above-mentioned grouping unit may include:
分组子单元,用于获取按照初始实体类别进行分组的样本数据,对初始实体类别中的样本数据按照目标实体类别进行分组;The grouping subunit is used to obtain sample data grouped according to the initial entity category, and group the sample data in the initial entity category according to the target entity category;
标准化子单元,用于对按照目标实体类别进行分组的样本数据进行标准化处理;The standardization subunit is used to standardize the sample data grouped according to the target entity category;
合并子单元,用于将各个初始实体类别对应的标准化处理后的目标实体类别进行合并,得到与目标实体类别对应的分组。The merging subunit is used for merging the standardized target entity categories corresponding to each initial entity category to obtain groups corresponding to the target entity categories.
在其中一个实施例中,上述训练模块可以包括:In one embodiment, the above-mentioned training module may include:
第二向量化单元,用于将支撑样本和查询样本中的单词进行序列化,并将序列化后的单词进行高阶特征表示,且对高阶特征表示后的单词进行平均池化操作得到支撑样本和查询样本的向量表示;The second vectorization unit is used to serialize the words in the support sample and the query sample, perform high-level feature representation on the serialized words, and perform an average pooling operation on the words after the high-level feature representation to obtain support vector representation of samples and query samples;
第二高层特征表示单元,用于根据实体类别识别模型通过支撑样本的向量化表示对查询样本的向量化表示进行处理,得到查询样本的高层特征;The second high-level feature representation unit is used to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the entity category recognition model to obtain the high-level feature of the query sample;
类别识别单元,用于对查询样本的高层特征进行处理得到查询样本中对应支撑样本的新增实体类别;The category identification unit is used to process the high-level features of the query sample to obtain a new entity category corresponding to the support sample in the query sample;
损失函数生成单元,用于将所得到的查询样本中对应支撑样本的新增实体类别与查询样本的真实实体类别输入至随机场层中计算得到损失函数;The loss function generation unit is used for inputting the newly added entity category of the obtained query sample corresponding to the support sample and the real entity category of the query sample into the random field layer to calculate the loss function;
训练单元,用于通过损失函数对实体类别识别模型进行训练。The training unit is used to train the entity category recognition model through the loss function.
在其中一个实施例中,上述第二向量化单元还用于根据以下公式通过支撑样本的向量化表示对查询样本的向量化表示进行处理,得到查询样本的高层特征:In one of the embodiments, the above-mentioned second vectorization unit is further configured to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the following formula to obtain high-level features of the query sample:
（对应公式见附图 appb-000033、appb-000034。The corresponding formulas are shown in figure images appb-000033 and appb-000034.）

其中，所述查询样本的高层特征（附图 appb-000035）是q_j在经过支撑样本建模之后得到的；atten函数是用来计算每个支撑样本对查询样本命名实体识别的贡献度；拼接算子（附图 appb-000036）代表两个向量拼接成一个新向量，T是一个实数，用于控制atten函数得到的分布的尖锐程度；k代表支撑样本的序号，k的取值与支撑样本的样本数量相关。Here, the high-level feature of the query sample (figure appb-000035) is obtained from q_j after modelling against the support samples; the atten function is used to calculate the contribution of each support sample to named entity recognition in the query sample; the concatenation operator (figure appb-000036) denotes splicing two vectors into a new vector, T is a real number used to control the sharpness of the distribution produced by the atten function, and the value of k, the index of a support sample, is related to the number of support samples.
关于基于元学习的实体类别识别装置的具体限定可以参见上文中对于基于元学习的实体类别识别方法的限定,在此不再赘述。上述基于元学习的实体类别识别装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。For the specific definition of the meta-learning-based entity category identification device, reference may be made to the above definition of the meta-learning-based entity category identification method, which will not be repeated here. Each module in the above-mentioned meta-learning-based entity category identification device may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
在其中一个实施例中,提供了一种计算机设备,该计算机设备可以是服务器,其内部结构图可以如图4所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络接口和数据库。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统、计算机可读指令和数据库。该内存储器为非易失性存储介质中的操作系统和计算机可读指令的运行提供环境。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机可读指令被处理器执行时以实现一种基于元学习的实体类别识别方法。In one of the embodiments, a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 4 . The computer device includes a processor, memory, a network interface, and a database connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium, an internal memory. The non-volatile storage medium stores an operating system, computer readable instructions and a database. The internal memory provides an environment for the execution of the operating system and computer-readable instructions in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions, when executed by a processor, implement a meta-learning-based entity class recognition method.
本领域技术人员可以理解，图4中示出的结构，仅仅是与本申请方案相关的部分结构的框图，并不构成对本申请方案所应用于其上的计算机设备的限定，具体的计算机设备可以包括比图中所示更多或更少的部件，或者组合某些部件，或者具有不同的部件布置。Those skilled in the art can understand that the structure shown in FIG. 4 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
一种计算机设备，包括存储器和一个或多个处理器，存储器中储存有计算机可读指令，计算机可读指令被处理器执行时，使得一个或多个处理器执行以下步骤：获取新增实体类别，并查询与新增实体类别对应的参照样本；获取待识别数据；将参照样本和待识别数据输入至预先生成的实体类别识别模型中，以识别每一待识别数据中对应参照样本的新增实体类别，其中，实体类别识别模型是基于元学习的方式训练得到的。A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring a newly added entity category and querying the reference samples corresponding to the newly added entity category; acquiring data to be identified; and inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained in a meta-learning manner.
在其中一个实施例中,处理器执行计算机可读指令时所实现的将参照样本和待识别数据输入至预先生成的实体类别识别模型中,以识别每一待识别数据中对应参照样本的新增实体类别,包括:将参照样本和待识别数据中的单词进行序列化,并将序列化后的单词进行高阶特征表示;对高阶特征表示后的单词进行平均池化操作得到参照样本和待识别数据的向量表示;通过参照样本的向量化表示对待识别数据的向量化表示进行处理,得到待识别数据的高层特征;对高层特征进行处理得到待识别数据中对应参照样本的新增实体类别。In one embodiment, when the processor executes the computer-readable instructions, the reference samples and the data to be identified are input into the pre-generated entity category recognition model, so as to identify new additions corresponding to the reference samples in each data to be identified. Entity categories, including: serializing the reference samples and words in the data to be recognized, and expressing the serialized words with high-level features; performing an average pooling operation on the words represented by the high-level features to obtain the reference samples and the words to be identified. Recognize the vector representation of the data; process the vectorized representation of the data to be recognized by referring to the vectorized representation of the sample to obtain high-level features of the data to be recognized; process the high-level features to obtain a new entity category corresponding to the reference sample in the data to be recognized.
在其中一个实施例中，处理器执行计算机可读指令时所实现的实体类别识别模型的训练方式包括：获取样本数据，并根据样本数据构建多组元训练样本；根据元训练样本进行训练得到实体类别识别模型。In one embodiment, the training of the entity category recognition model implemented when the processor executes the computer-readable instructions includes: acquiring sample data and constructing multiple groups of meta-training samples from the sample data; and obtaining the entity category recognition model by training according to the meta-training samples.
在其中一个实施例中,处理器执行计算机可读指令时所实现的获取样本数据,并根据样本数据构建多组元训练样本,包括:获取样本数据,对样本数据按照实体类别进行分组,并随机从分组中抽取至少一个分组;确定所抽取的至少一个分组中的第一数量样本数据为支撑样本,第二数量样本数据为查询样本;根据支撑样本和查询样本得到一组元训练样本;重复随机从分组中抽取至少一个分组以得到多组元训练样本。In one embodiment, acquiring sample data when the processor executes the computer-readable instructions, and constructing multiple groups of meta-training samples according to the sample data, includes: acquiring sample data, grouping the sample data according to entity categories, and randomly Extract at least one grouping from the grouping; determine that the first quantity of sample data in the at least one extracted grouping is a support sample, and the second quantity of sample data is a query sample; obtain a set of meta-training samples according to the support sample and the query sample; repeat random At least one group is drawn from the groups to obtain multi-group meta training samples.
在其中一个实施例中,处理器执行计算机可读指令时所实现的获取样本数据,对样本数据按照实体类别进行分组,包括:获取按照初始实体类别进行分组的样本数据,对初始实体类别中的样本数据按照目标实体类别进行分组;对按照目标实体类别进行分组的样本数据进行标准化处理;将各个初始实体类别对应的标准化处理后的目标实体类别进行合并,得到与目标实体类别对应的分组。In one of the embodiments, obtaining sample data and grouping the sample data according to entity categories when the processor executes the computer-readable instructions includes: obtaining sample data grouped according to initial entity categories, The sample data is grouped according to the target entity category; the sample data grouped according to the target entity category is standardized; the standardized target entity categories corresponding to each initial entity category are merged to obtain a group corresponding to the target entity category.
在其中一个实施例中,处理器执行计算机可读指令时所实现的根据元训练样本进行训练得到实体类别识别模型,包括:将支撑样本和查询样本中的单词进行序列化,并将序列化后的单词进行高阶特征表示,且对高阶特征表示后的单词进行平均池化操作得到支撑样本和查询样本的向量表示;根据实体类别识别模型通过支撑样本的向量化表示对查询样本的向量化表示进行处理,得到查询样本的高层特征;对查询样本的高层特征进行处理得到查询样本中对应支撑样本的新增实体类别;将所得到的查询样本中对应支撑样本的新增实体类别与查询样本的真实实体类别输入至随机场层中计算得到损失函数;通过损失函数对实体类别识别模型进行训练。In one embodiment, when the processor executes the computer-readable instructions, the entity category recognition model obtained by training according to the meta-training sample includes: serializing the words in the support sample and the query sample, and serializing the serialized The words represented by the high-order features are represented by high-level features, and the average pooling operation is performed on the words after the high-order feature representation to obtain the vector representation of the support samples and the query samples; according to the entity category recognition model, the vectorized representation of the support samples is used to vectorize the query samples. Indicates that processing is performed to obtain the high-level features of the query sample; the high-level features of the query sample are processed to obtain the new entity category corresponding to the support sample in the query sample; the new entity category corresponding to the support sample in the obtained query sample is compared with the query sample. The real entity category is input into the random field layer to calculate the loss function; the entity category recognition model is trained through the loss function.
在其中一个实施例中,处理器执行计算机可读指令时所实现的将支撑样本和查询样本中的单词进行序列化,并将序列化后的单词进行高阶特征表示,包括:根据以下公式通过支撑样本的向量化表示对查询样本的向量化表示进行处理,得到查询样本的高层特征:In one embodiment, when the processor executes the computer-readable instructions, serializing the words in the support sample and the query sample, and performing high-order feature representation on the serialized words, includes: according to the following formula: The vectorized representation of the support sample processes the vectorized representation of the query sample to obtain the high-level features of the query sample:
（对应公式见附图 appb-000037、appb-000038。The corresponding formulas are shown in figure images appb-000037 and appb-000038.）

其中，所述查询样本的高层特征（附图 appb-000039）是q_j在经过支撑样本建模之后得到的；atten函数是用来计算每个支撑样本对查询样本命名实体识别的贡献度；拼接算子（附图 appb-000040）代表两个向量拼接成一个新向量，T是一个实数，用于控制atten函数得到的分布的尖锐程度；k代表支撑样本的序号，k的取值与支撑样本的样本数量相关。Here, the high-level feature of the query sample (figure appb-000039) is obtained from q_j after modelling against the support samples; the atten function is used to calculate the contribution of each support sample to named entity recognition in the query sample; the concatenation operator (figure appb-000040) denotes splicing two vectors into a new vector, T is a real number used to control the sharpness of the distribution produced by the atten function, and the value of k, the index of a support sample, is related to the number of support samples.
一个或多个存储有计算机可读指令的计算机可读存储介质,计算机可读指令被一个或多个处理器执行时,使得一个或多个处理器执行以下步骤:获取新增实体类别,并查询与新增实体类别对应的参照样本;获取待识别数据;将参照样本和待识别数据输入至预先生成的实体类别识别模型中,以识别每一待识别数据中对应参照样本的新增实体类别,其中,实体类别识别模型是基于元学习的方式训练得到的。One or more computer-readable storage media storing computer-readable instructions, when the computer-readable instructions are executed by one or more processors, cause the one or more processors to perform the following steps: obtain the newly added entity category, and query The reference sample corresponding to the newly added entity category; the data to be identified is obtained; the reference sample and the to-be-identified data are input into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference sample in each of the to-be-identified data, Among them, the entity category recognition model is trained based on meta-learning.
其中,该计算机可读存储介质可以是非易失性,也可以是易失性的。Wherein, the computer-readable storage medium may be non-volatile or volatile.
在其中一个实施例中，计算机可读指令被处理器执行时所实现的将参照样本和待识别数据输入至预先生成的实体类别识别模型中，以识别每一待识别数据中对应参照样本的新增实体类别，包括：将参照样本和待识别数据中的单词进行序列化，并将序列化后的单词进行高阶特征表示；对高阶特征表示后的单词进行平均池化操作得到参照样本和待识别数据的向量表示；通过参照样本的向量化表示对待识别数据的向量化表示进行处理，得到待识别数据的高层特征；对高层特征进行处理得到待识别数据中对应参照样本的新增实体类别。In one embodiment, inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, as implemented when the computer-readable instructions are executed by the processor, includes: serializing the words in the reference samples and in the data to be identified, and producing high-order feature representations of the serialized words; applying an average pooling operation to the words after the high-order feature representation to obtain the vector representations of the reference samples and of the data to be identified; processing the vectorized representation of the data to be identified through the vectorized representation of the reference samples to obtain the high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
在其中一个实施例中,计算机可读指令被处理器执行时所实现的实体类别识别模型的训练方式包括:获取样本数据,并根据样本数据构建多组元训练样本;根据元训练样本进行训练得到实体类别识别模型。In one embodiment, the training method of the entity category recognition model implemented when the computer-readable instructions are executed by the processor includes: acquiring sample data, and constructing multiple groups of meta-training samples according to the sample data; training according to the meta-training samples to obtain Entity class recognition model.
在其中一个实施例中,计算机可读指令被处理器执行时所实现的获取样本数据,并根据样本数据构建多组元训练样本,包括:获取样本数据,对样本数据按照实体类别进行分组,并随机从分组中抽取至少一个分组;确定所抽取的至少一个分组中的第一数量样本数据为支撑样本,第二数量样本数据为查询样本;根据支撑样本和查询样本得到一组元训练样本;重复随机从分组中抽取至少一个分组以得到多组元训练样本。In one embodiment, acquiring sample data when the computer-readable instructions are executed by the processor, and constructing multiple groups of meta-training samples according to the sample data, includes: acquiring sample data, grouping the sample data according to entity categories, and Randomly extract at least one grouping from the groupings; determine that the first quantity of sample data in the at least one extracted grouping is a support sample, and the second quantity of sample data is a query sample; obtain a set of meta-training samples according to the support sample and the query sample; repeat At least one group is randomly selected from the groups to obtain multiple sets of meta training samples.
在其中一个实施例中,计算机可读指令被处理器执行时所实现的获取样本数据,对样本数据按照实体类别进行分组,包括:获取按照初始实体类别进行分组的样本数据,对初始实体类别中的样本数据按照目标实体类别进行分组;对按照目标实体类别进行分组的样本数据进行标准化处理;将各个初始实体类别对应的标准化处理后的目标实体类别进行合并,得到与目标实体类别对应的分组。In one embodiment, obtaining sample data and grouping the sample data according to entity categories when the computer-readable instructions are executed by the processor includes: obtaining sample data grouped according to the initial entity category, The sample data is grouped according to the target entity category; the sample data grouped according to the target entity category is standardized; the standardized target entity categories corresponding to each initial entity category are merged to obtain the grouping corresponding to the target entity category.
在其中一个实施例中,计算机可读指令被处理器执行时所实现的根据元训练样本进行训练得到实体类别识别模型,包括:将支撑样本和查询样本中的单词进行序列化,并将序列化后的单词进行高阶特征表示,且对高阶特征表示后的单词进行平均池化操作得到支撑样本和查询样本的向量表示;根据实体类别识别模型通过支撑样本的向量化表示对查询样本的向量化表示进行处理,得到查询样本的高层特征;对查询样本的高层特征进行处理得到查询样本中对应支撑样本的新增实体类别;将所得到的查询样本中对应支撑样本的新增 实体类别与查询样本的真实实体类别输入至随机场层中计算得到损失函数;通过损失函数对实体类别识别模型进行训练。In one embodiment, when the computer-readable instructions are executed by the processor, the entity category recognition model obtained by training according to the meta-training samples includes: serializing the words in the support samples and the query samples, and serializing the words in the query samples. The following words are represented by high-order features, and the average pooling operation is performed on the words after the high-order feature representation to obtain the vector representation of the support samples and the query samples; according to the entity category recognition model, the vector representation of the support samples is used to represent the vector of the query samples. processing the high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity categories corresponding to the supporting samples in the query samples; adding the new entity categories corresponding to the supporting samples in the obtained query samples with the query The real entity category of the sample is input into the random field layer to calculate the loss function; the entity category recognition model is trained through the loss function.
在其中一个实施例中,计算机可读指令被处理器执行时所实现的将支撑样本和查询样本中的单词进行序列化,并将序列化后的单词进行高阶特征表示,包括:根据以下公式通过支撑样本的向量化表示对查询样本的向量化表示进行处理,得到查询样本的高层特征:In one embodiment, when the computer-readable instructions are executed by the processor, the words in the support sample and the query sample are serialized, and the serialized words are represented by high-order features, including: according to the following formula The vectorized representation of the query sample is processed by the vectorized representation of the support sample to obtain the high-level features of the query sample:
（对应公式见附图 appb-000041、appb-000042。The corresponding formulas are shown in figure images appb-000041 and appb-000042.）

其中，所述查询样本的高层特征（附图 appb-000043）是q_j在经过支撑样本建模之后得到的；atten函数是用来计算每个支撑样本对查询样本命名实体识别的贡献度；拼接算子（附图 appb-000044）代表两个向量拼接成一个新向量，T是一个实数，用于控制atten函数得到的分布的尖锐程度；k代表支撑样本的序号，k的取值与支撑样本的样本数量相关。Here, the high-level feature of the query sample (figure appb-000043) is obtained from q_j after modelling against the support samples; the atten function is used to calculate the contribution of each support sample to named entity recognition in the query sample; the concatenation operator (figure appb-000044) denotes splicing two vectors into a new vector, T is a real number used to control the sharpness of the distribution produced by the atten function, and the value of k, the index of a support sample, is related to the number of support samples.
本发明所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链（Blockchain），本质上是一个去中心化的数据库，是一串使用密码学方法相关联产生的数据块，每一个数据块中包含了一批次网络交易的信息，用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。The blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association with one another using cryptographic methods; each data block contains a batch of network transaction information used to verify the validity of its information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer and the like.
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机可读指令来指令相关的硬件来完成,所述的计算机可读指令可存储于一计算机可读取存储介质中,该存储介质可以为非易失性或易失性的。该计算机可读指令在执行时,可包括如上述各方法的实施例的流程。其中,本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用,均可包括非易失性和/或易失性存储器。非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM)或者外部高速缓冲存储器。作为说明而非局限,RAM以多种形式可得,诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDRSDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)等。Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through computer-readable instructions, and the computer-readable instructions can be stored in a computer-readable storage medium, the storage medium may be non-volatile or volatile. When executed, the computer-readable instructions may include the processes of the above-described method embodiments. Wherein, any reference to memory, storage, database or other medium used in the various embodiments provided in this application may include non-volatile and/or volatile memory. Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain Road (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.
以上实施例的各技术特征可以进行任意的组合，为使描述简洁，未对上述实施例中的各个技术特征所有可能的组合都进行描述，然而，只要这些技术特征的组合不存在矛盾，都应当认为是本说明书记载的范围。The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as the combinations of these technical features are not contradictory, they should be considered to be within the scope described in this specification.
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对发明专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范 围。因此,本申请专利的保护范围应以所附权利要求为准。The above-mentioned embodiments only represent several embodiments of the present application, and the descriptions thereof are relatively specific and detailed, but should not be construed as a limitation on the scope of the invention patent. It should be pointed out that for those of ordinary skill in the art, without departing from the concept of the present application, several modifications and improvements can also be made, which all belong to the protection scope of the present application. Therefore, the scope of protection of the patent of the present application shall be subject to the appended claims.

Claims (20)

  1. 一种基于元学习的实体类别识别方法,包括:A meta-learning-based entity category recognition method, including:
    获取新增实体类别,并查询与所述新增实体类别对应的参照样本;Obtain the newly added entity category, and query the reference samples corresponding to the newly added entity category;
    获取待识别数据;及obtain data to be identified; and
    将所述参照样本和所述待识别数据输入至预先生成的实体类别识别模型中，以识别每一所述待识别数据中对应所述参照样本的新增实体类别，其中，所述实体类别识别模型是基于元学习的方式训练得到的。Inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, wherein the entity category recognition model is trained in a meta-learning manner.
  2. 根据权利要求1所述的方法，其中，所述将所述参照样本和所述待识别数据输入至预先生成的实体类别识别模型中，以识别每一所述待识别数据中对应所述参照样本的新增实体类别，包括：The method according to claim 1, wherein inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of the data to be identified comprises:
    将所述参照样本和所述待识别数据中的单词进行序列化,并将序列化后的单词进行高阶特征表示;Serialize the reference sample and the words in the data to be recognized, and perform high-level feature representation on the serialized words;
    对高阶特征表示后的单词进行平均池化操作得到所述参照样本和所述待识别数据的向量表示;Performing an average pooling operation on the words represented by the high-order features to obtain the vector representation of the reference sample and the data to be identified;
    通过所述参照样本的向量化表示对所述待识别数据的向量化表示进行处理,得到所述待识别数据的高层特征;及processing the vectorized representation of the data to be identified by the vectorized representation of the reference sample to obtain high-level features of the data to be identified; and
    对所述高层特征进行处理得到所述待识别数据中对应所述参照样本的新增实体类别。The high-level feature is processed to obtain a new entity category corresponding to the reference sample in the to-be-identified data.
  3. The method according to claim 1 or 2, wherein training the entity category recognition model comprises:
    acquiring sample data, and constructing multiple groups of meta-training samples from the sample data; and
    training on the meta-training samples to obtain the entity category recognition model.
  4. The method according to claim 3, wherein acquiring the sample data and constructing the multiple groups of meta-training samples from the sample data comprises:
    acquiring sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups;
    determining a first quantity of sample data in the drawn at least one group as support samples and a second quantity of sample data as query samples;
    obtaining one group of meta-training samples from the support samples and the query samples; and
    repeating the random drawing of at least one group from the groups to obtain multiple groups of meta-training samples.
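The episode construction of claims 3 and 4 can be pictured with the sketch below; the function name build_episodes and the particular group, support, and query counts are assumptions chosen only for illustration.

```python
# Sketch of meta-training episode construction (all counts are assumptions).
import random
from collections import defaultdict


def build_episodes(samples, n_groups=2, n_support=5, n_query=10, n_episodes=100, seed=42):
    """samples: list of (sentence, entity_category) pairs.
    Returns a list of (support, query) episodes as described in claim 4."""
    random.seed(seed)
    by_category = defaultdict(list)
    for sentence, category in samples:                  # group by entity category
        by_category[category].append((sentence, category))

    episodes = []
    categories = list(by_category)
    for _ in range(n_episodes):                         # repeat the random draw
        chosen = random.sample(categories, k=min(n_groups, len(categories)))
        pool = [s for c in chosen for s in by_category[c]]
        random.shuffle(pool)
        support = pool[:n_support]                      # first quantity -> support samples
        query = pool[n_support:n_support + n_query]     # second quantity -> query samples
        episodes.append((support, query))
    return episodes


toy = [(f"sentence {i}", f"CAT{i % 3}") for i in range(60)]
print(len(build_episodes(toy)))                          # 100 episodes
```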
  5. The method according to claim 4, wherein acquiring the sample data and grouping the sample data by entity category comprises:
    acquiring sample data grouped by initial entity category, and regrouping the sample data within each initial entity category by target entity category;
    standardizing the sample data grouped by target entity category; and
    merging the standardized target entity categories corresponding to the respective initial entity categories to obtain groups corresponding to the target entity categories.
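One possible reading of claim 5, sketched under the assumption that "standardization" amounts to mapping dataset-specific labels onto a shared label set, is the following; the mapping table INITIAL_TO_TARGET is invented for the example and is not taken from the application.

```python
# Hypothetical regrouping of initial entity categories into target categories.
from collections import defaultdict

# Assumed mapping from dataset-specific (initial) labels to shared (target) labels.
INITIAL_TO_TARGET = {"PER": "PERSON", "person": "PERSON", "LOC": "LOCATION", "GPE": "LOCATION"}


def regroup(grouped_by_initial):
    """grouped_by_initial: {initial_category: [samples]}.
    Returns {target_category: [samples]} after standardizing and merging."""
    merged = defaultdict(list)
    for initial, samples in grouped_by_initial.items():
        target = INITIAL_TO_TARGET.get(initial, initial.upper())  # standardize the label
        merged[target].extend(samples)                            # merge per target category
    return dict(merged)


print(regroup({"PER": ["s1"], "person": ["s2"], "LOC": ["s3"]}))
# {'PERSON': ['s1', 's2'], 'LOCATION': ['s3']}
```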
  6. The method according to claim 5, wherein training on the meta-training samples to obtain the entity category recognition model comprises:
    serializing the words in the support samples and the query samples, representing the serialized words as high-order features, and performing an average pooling operation on the words after high-order feature representation to obtain vector representations of the support samples and the query samples;
    processing, by the entity category recognition model, the vectorized representation of the query samples by means of the vectorized representation of the support samples to obtain high-level features of the query samples;
    processing the high-level features of the query samples to obtain the newly added entity categories in the query samples corresponding to the support samples;
    inputting the obtained newly added entity categories of the query samples corresponding to the support samples, together with the true entity categories of the query samples, into a random field layer to compute a loss function; and
    training the entity category recognition model with the loss function.
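A simplified training step in the spirit of claim 6 is sketched below in PyTorch; the encoder, the feature fusion, and especially the token-level cross-entropy stand-in for the random-field (CRF) loss are assumptions made only to keep the example self-contained, not the claimed implementation.

```python
# Simplified training step for the meta-learned recognizer (shapes, the encoder,
# and the cross-entropy stand-in for the random-field loss are assumptions).
import torch
import torch.nn as nn

NUM_LABELS, EMB_DIM = 5, 16
encoder = nn.GRU(EMB_DIM, EMB_DIM, batch_first=True)
classifier = nn.Linear(2 * EMB_DIM, NUM_LABELS)          # query feature ++ support summary
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()                          # stand-in for the random-field layer


def train_step(support_emb, query_emb, query_labels):
    """support_emb: (n_support, seq, EMB_DIM); query_emb: (1, seq, EMB_DIM);
    query_labels: (seq,) gold label ids for the query sentence."""
    support_vec = encoder(support_emb)[0].mean(dim=(0, 1))      # pooled support representation
    query_feats = encoder(query_emb)[0].squeeze(0)              # (seq, EMB_DIM)
    fused = torch.cat([query_feats,
                       support_vec.expand_as(query_feats)], dim=-1)  # condition query on support
    logits = classifier(fused)                                  # (seq, NUM_LABELS)
    loss = loss_fn(logits, query_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


loss = train_step(torch.randn(3, 7, EMB_DIM), torch.randn(1, 7, EMB_DIM),
                  torch.randint(0, NUM_LABELS, (7,)))
print(f"loss: {loss:.4f}")
```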
  7. The method according to claim 6, wherein serializing the words in the support samples and the query samples and representing the serialized words as high-order features comprises:
    processing the vectorized representation of the query samples by means of the vectorized representation of the support samples according to the following formulas to obtain the high-level features of the query samples (the formulas appear in the original publication as images PCTCN2021109617-appb-100001 and PCTCN2021109617-appb-100002 and are not reproduced here),
    wherein the symbol shown as image PCTCN2021109617-appb-100003 denotes the high-level feature of the query sample obtained for q_j after modeling against the support samples; the atten function computes the contribution of each support sample to named entity recognition of the query sample; the operator shown as image PCTCN2021109617-appb-100004 denotes concatenating two vectors into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of a support sample, whose range is determined by the number of support samples.
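The two formula images referenced above are not reproduced in this text. Based only on the symbol descriptions that survive (the atten contribution score, the vector-concatenation operator, the temperature T, and the support index k), one plausible reading is a temperature-scaled attention over the support samples; the sketch below implements that reading and should not be taken as the exact claimed formula.

```python
# Temperature-scaled attention over support samples (a reconstruction based on the
# symbol descriptions in claim 7, not the exact formula shown in the original images).
import numpy as np


def atten(query_vec, support_vec):
    """Assumed scoring function: dot-product compatibility between query and support."""
    return float(query_vec @ support_vec)


def query_high_level_feature(query_vec, support_vecs, T=1.0):
    """Weight each support sample k by softmax(atten/T), then concatenate (the
    assumed 'splicing' operation) the query vector with the attended support summary."""
    scores = np.array([atten(query_vec, s) for s in support_vecs]) / T
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                             # sharpness controlled by T
    attended = (weights[:, None] * np.stack(support_vecs)).sum(axis=0)
    return np.concatenate([query_vec, attended])         # high-level feature of the query


rng = np.random.default_rng(1)
q = rng.normal(size=4)
supports = [rng.normal(size=4) for _ in range(3)]
print(query_high_level_feature(q, supports, T=0.5).shape)   # (8,)
```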
  8. A meta-learning-based entity category recognition apparatus, comprising:
    a newly added entity category acquisition module, configured to acquire a newly added entity category and query reference samples corresponding to the newly added entity category;
    a data-to-be-identified acquisition module, configured to acquire data to be identified; and
    an entity recognition module, configured to input the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, wherein the entity category recognition model is trained in a meta-learning manner.
  9. The apparatus according to claim 8, wherein the entity recognition module comprises:
    a conversion unit, configured to serialize the words in the reference samples and in the data to be identified, and represent the serialized words as high-order features;
    a first vectorization unit, configured to perform an average pooling operation on the words after high-order feature representation to obtain vector representations of the reference samples and of the data to be identified;
    a first high-level feature representation unit, configured to process the vectorized representation of the data to be identified by means of the vectorized representation of the reference samples to obtain high-level features of the data to be identified; and
    a recognition unit, configured to process the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  10. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    acquiring a newly added entity category, and querying reference samples corresponding to the newly added entity category;
    acquiring data to be identified; and
    inputting the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, wherein the entity category recognition model is trained in a meta-learning manner.
  11. The computer device according to claim 10, wherein inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, as implemented when the processor executes the computer-readable instructions, comprises:
    serializing the words in the reference samples and in the data to be identified, and representing the serialized words as high-order features;
    performing an average pooling operation on the words after high-order feature representation to obtain vector representations of the reference samples and of the data to be identified;
    processing the vectorized representation of the data to be identified by means of the vectorized representation of the reference samples to obtain high-level features of the data to be identified; and
    processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  12. The computer device according to claim 10 or 11, wherein training the entity category recognition model involved when the processor executes the computer-readable instructions comprises:
    acquiring sample data, and constructing multiple groups of meta-training samples from the sample data; and
    training on the meta-training samples to obtain the entity category recognition model.
  13. The computer device according to claim 12, wherein acquiring the sample data and constructing the multiple groups of meta-training samples from the sample data, as implemented when the processor executes the computer-readable instructions, comprises:
    acquiring sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups;
    determining a first quantity of sample data in the drawn at least one group as support samples and a second quantity of sample data as query samples;
    obtaining one group of meta-training samples from the support samples and the query samples; and
    repeating the random drawing of at least one group from the groups to obtain multiple groups of meta-training samples.
  14. The computer device according to claim 13, wherein acquiring the sample data and grouping the sample data by entity category, as implemented when the processor executes the computer-readable instructions, comprises:
    acquiring sample data grouped by initial entity category, and regrouping the sample data within each initial entity category by target entity category;
    standardizing the sample data grouped by target entity category; and
    merging the standardized target entity categories corresponding to the respective initial entity categories to obtain groups corresponding to the target entity categories.
  15. The computer device according to claim 14, wherein training on the meta-training samples to obtain the entity category recognition model, as implemented when the processor executes the computer-readable instructions, comprises:
    serializing the words in the support samples and the query samples, representing the serialized words as high-order features, and performing an average pooling operation on the words after high-order feature representation to obtain vector representations of the support samples and the query samples;
    processing, by the entity category recognition model, the vectorized representation of the query samples by means of the vectorized representation of the support samples to obtain high-level features of the query samples;
    processing the high-level features of the query samples to obtain the newly added entity categories in the query samples corresponding to the support samples;
    inputting the obtained newly added entity categories of the query samples corresponding to the support samples, together with the true entity categories of the query samples, into a random field layer to compute a loss function; and
    training the entity category recognition model with the loss function.
  16. The computer device according to claim 15, wherein serializing the words in the support samples and the query samples and representing the serialized words as high-order features, as implemented when the processor executes the computer-readable instructions, comprises:
    processing the vectorized representation of the query samples by means of the vectorized representation of the support samples according to the following formulas to obtain the high-level features of the query samples (the formulas appear in the original publication as images PCTCN2021109617-appb-100005 and PCTCN2021109617-appb-100006 and are not reproduced here),
    wherein the symbol shown as image PCTCN2021109617-appb-100007 denotes the high-level feature of the query sample obtained for q_j after modeling against the support samples; the atten function computes the contribution of each support sample to named entity recognition of the query sample; the operator shown as image PCTCN2021109617-appb-100008 denotes concatenating two vectors into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of a support sample, whose range is determined by the number of support samples.
  17. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring a newly added entity category, and querying reference samples corresponding to the newly added entity category;
    acquiring data to be identified; and
    inputting the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, wherein the entity category recognition model is trained in a meta-learning manner.
  18. The storage medium according to claim 17, wherein inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, as implemented when the computer-readable instructions are executed by the processor, comprises:
    serializing the words in the reference samples and in the data to be identified, and representing the serialized words as high-order features;
    performing an average pooling operation on the words after high-order feature representation to obtain vector representations of the reference samples and of the data to be identified;
    processing the vectorized representation of the data to be identified by means of the vectorized representation of the reference samples to obtain high-level features of the data to be identified; and
    processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  19. The storage medium according to claim 17 or 18, wherein training the entity category recognition model involved when the computer-readable instructions are executed by the processor comprises:
    acquiring sample data, and constructing multiple groups of meta-training samples from the sample data; and
    training on the meta-training samples to obtain the entity category recognition model.
  20. The storage medium according to claim 19, wherein acquiring the sample data and constructing the multiple groups of meta-training samples from the sample data, as implemented when the computer-readable instructions are executed by the processor, comprises:
    acquiring sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups;
    determining a first quantity of sample data in the drawn at least one group as support samples and a second quantity of sample data as query samples;
    obtaining one group of meta-training samples from the support samples and the query samples; and
    repeating the random drawing of at least one group from the groups to obtain multiple groups of meta-training samples.
PCT/CN2021/109617 2020-12-15 2021-07-30 Meta learning-based entity category recognition method and apparatus, device and storage medium WO2022127124A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011472865.X 2020-12-15
CN202011472865.XA CN112528662A (en) 2020-12-15 2020-12-15 Entity category identification method, device, equipment and storage medium based on meta-learning

Publications (1)

Publication Number Publication Date
WO2022127124A1 true WO2022127124A1 (en) 2022-06-23

Family

ID=74999881

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109617 WO2022127124A1 (en) 2020-12-15 2021-07-30 Meta learning-based entity category recognition method and apparatus, device and storage medium

Country Status (2)

Country Link
CN (1) CN112528662A (en)
WO (1) WO2022127124A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528662A (en) * 2020-12-15 2021-03-19 深圳壹账通智能科技有限公司 Entity category identification method, device, equipment and storage medium based on meta-learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020143163A1 (en) * 2019-01-07 2020-07-16 平安科技(深圳)有限公司 Named entity recognition method and apparatus based on attention mechanism, and computer device
CN111767400A (en) * 2020-06-30 2020-10-13 平安国际智慧城市科技股份有限公司 Training method and device of text classification model, computer equipment and storage medium
CN111859937A (en) * 2020-07-20 2020-10-30 上海汽车集团股份有限公司 Entity identification method and device
CN111860580A (en) * 2020-06-09 2020-10-30 北京百度网讯科技有限公司 Recognition model obtaining and category recognition method, device and storage medium
CN112528662A (en) * 2020-12-15 2021-03-19 深圳壹账通智能科技有限公司 Entity category identification method, device, equipment and storage medium based on meta-learning

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101846824B1 (en) * 2017-12-11 2018-04-09 가천대학교 산학협력단 Automated Named-entity Recognizing Systems, Methods, and Computer-Readable Mediums
CN109783604B (en) * 2018-12-14 2024-03-19 平安科技(深圳)有限公司 Information extraction method and device based on small amount of samples and computer equipment
CN110825875B (en) * 2019-11-01 2022-12-06 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN111797394B (en) * 2020-06-24 2021-06-08 广州大学 APT organization identification method, system and storage medium based on stacking integration
CN112001179A (en) * 2020-09-03 2020-11-27 平安科技(深圳)有限公司 Named entity recognition method and device, electronic equipment and readable storage medium
CN112052684A (en) * 2020-09-07 2020-12-08 南方电网数字电网研究院有限公司 Named entity identification method, device, equipment and storage medium for power metering

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020143163A1 (en) * 2019-01-07 2020-07-16 平安科技(深圳)有限公司 Named entity recognition method and apparatus based on attention mechanism, and computer device
CN111860580A (en) * 2020-06-09 2020-10-30 北京百度网讯科技有限公司 Recognition model obtaining and category recognition method, device and storage medium
CN111767400A (en) * 2020-06-30 2020-10-13 平安国际智慧城市科技股份有限公司 Training method and device of text classification model, computer equipment and storage medium
CN111859937A (en) * 2020-07-20 2020-10-30 上海汽车集团股份有限公司 Entity identification method and device
CN112528662A (en) * 2020-12-15 2021-03-19 深圳壹账通智能科技有限公司 Entity category identification method, device, equipment and storage medium based on meta-learning

Also Published As

Publication number Publication date
CN112528662A (en) 2021-03-19

Similar Documents

Publication Publication Date Title
WO2021174774A1 (en) Neural network relationship extraction method, computer device, and readable storage medium
CN112434535B (en) Element extraction method, device, equipment and storage medium based on multiple models
CN111324696B (en) Entity extraction method, entity extraction model training method, device and equipment
WO2022048363A1 (en) Website classification method and apparatus, computer device, and storage medium
WO2022134586A1 (en) Meta-learning-based target classification method and apparatus, device and storage medium
WO2022088671A1 (en) Automated question answering method and apparatus, device, and storage medium
CN113254649B (en) Training method of sensitive content recognition model, text recognition method and related device
JP2022109836A (en) System and method for semi-supervised extraction of text classification information
CN112052684A (en) Named entity identification method, device, equipment and storage medium for power metering
WO2021179708A1 (en) Named-entity recognition method and apparatus, computer device and readable storage medium
CN114357117A (en) Transaction information query method and device, computer equipment and storage medium
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
JP7347179B2 (en) Methods, devices and computer programs for extracting web page content
CN112507061A (en) Multi-relation medical knowledge extraction method, device, equipment and storage medium
WO2022073341A1 (en) Disease entity matching method and apparatus based on voice semantics, and computer device
CN116821373A (en) Map-based prompt recommendation method, device, equipment and medium
WO2022127124A1 (en) Meta learning-based entity category recognition method and apparatus, device and storage medium
WO2023168810A1 (en) Method and apparatus for predicting properties of drug molecule, storage medium, and computer device
CN112529743A (en) Contract element extraction method, contract element extraction device, electronic equipment and medium
CN113469338A (en) Model training method, model training device, terminal device, and storage medium
CN116108144B (en) Information extraction method and device
CN114706927B (en) Data batch labeling method based on artificial intelligence and related equipment
CN117235257A (en) Emotion prediction method, device, equipment and storage medium based on artificial intelligence
CN116721713A (en) Data set construction method and device oriented to chemical structural formula identification
CN115982363A (en) Small sample relation classification method, system, medium and electronic device based on prompt learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21905057

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 041023)

122 Ep: pct application non-entry in european phase

Ref document number: 21905057

Country of ref document: EP

Kind code of ref document: A1