CN112712163B - Coverage rate-based neural network effective data enhancement method - Google Patents
- Publication number: CN112712163B (application CN202011562234.7A)
- Authority: CN (China)
- Prior art keywords: data set, neural network, coverage rate, network model, coverage
- Legal status: Active (the status listed is an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G — PHYSICS; G06 — COMPUTING, CALCULATING OR COUNTING; G06N — Computing arrangements based on specific computational models
  - G06N3/00 — Computing arrangements based on biological models
    - G06N3/02 — Neural networks
      - G06N3/04 — Architecture, e.g. interconnection topology; G06N3/045 — Combinations of networks
      - G06N3/08 — Learning methods
Abstract
The invention discloses a coverage rate-based neural network effective data enhancement method, which comprises the following steps: 1) selecting a training data set according to the neural network model to be trained, and selecting a plurality of coverage rate indexes for that model; 2) training the model on the training data set while counting the number of activated neurons corresponding to each coverage rate index; 3) calculating each coverage rate index value of the training data set from those neuron counts, then selecting the coverage rate index most correlated with the model's accuracy as the evaluation index; 4) expanding the training data set to obtain an expansion data set; 5) using the model trained in step 2), evaluating the evaluation index on both the training data set and the expansion data set, and thereby determining the effective data set.
Description
Technical Field
The invention relates to a coverage rate-based neural network effective data enhancement method, and belongs to the technical field of computer software.
Background
The best way to improve the generalization ability of a deep learning model (i.e., a neural network) is to train it on more data. In practice, however, the network to be trained is often large while the available data is insufficient, which can open a large gap between training and testing performance and leave the trained network impractical. The main current remedy for insufficient training data is data enhancement: expanding the data set and adding the new samples to the training set. For image data, enhancement methods include image flipping, rotation, translation, contrast change, saturation change, blurring, generative adversarial networks (GAN), and the like; for text data, they include text mutation, synonym replacement, and the like.
Data enhancement increases the number of training samples, effectively relieves overfitting, and gives the model stronger generalization ability. However, enlarging the data set introduces new problems: the generated data is uncertain and contains much noisy data, which may degrade model performance. Data enhancement therefore needs a data set screening and judgment mechanism to obtain a more effective data set.
Disclosure of Invention
A data set expanded by data enhancement usually contains noisy data, so a data set screening mechanism must be introduced to quickly obtain an effective data set; training the neural network on this enhanced effective data then improves the accuracy of the pre-trained model.
To this end, the invention organically combines coverage rate characteristics with a data enhancement mechanism: after the training data set is enhanced, effective data are screened out using a coverage rate index, improving the accuracy of the pre-trained model.
Neural network coverage rate is mainly used to evaluate how thoroughly a network has been exercised: at a given accuracy, the higher a model's coverage, the more fully the network has been verified. At the data-set level, coverage shows whether the nodes of the network are activated, and activated nodes indicate a more sufficient test. At the single-sample level, the more neurons a sample activates, the more features it contains and the more complex it is; such samples usually lie near the classification boundary and are very important for training neural networks. Coverage rate characteristics and a data enhancement mechanism can therefore be effectively combined into a coverage rate-based neural network effective data enhancement method.
The coverage rate-based neural network effective data enhancement method of the invention has a flow as shown in fig. 1, and comprises the following steps:
step 1: selecting a training data set according to a neural network model to be trained; training parameters of the neural network model (referred to as neural network 1) using a training data set; wherein, the neural network 1 is a pre-trained neural network model;
and 2, step: expanding the training data set to obtain an expanded data set;
and 3, step 3: screening a plurality of coverage rate indexes, selecting the coverage rate index most relevant to the accuracy of the neural network model as a judgment basis, comparing the coverage rate of a training data set with the coverage rate of an expansion data set, and screening the coverage rate indexes to obtain an effective data set, namely taking the data set with high coverage rate as the effective data set; preferably, the coverage rate index of the text data comprises coverage rate of a hidden unit, coverage rate of a positive sequence and coverage rate of a negative sequence;
and 4, step 4: the pre-trained model is retrained using the active data set, resulting in the neural network 2.
Further, expanding the data set in step 2 specifically comprises:
For image data, the expansion methods comprise geometric transformations (flipping, rotation, cropping, deformation, scaling, etc.), color transformations (noise addition, blurring, color change, erasing, filling, etc.), and generative methods (GAN, etc.); for text data, the expansion methods comprise text mutation, synonym substitution, and the like.
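As an illustration only (the function names and parameters below are our own, not part of the patent), a minimal numpy sketch of a few of the geometric and color transformations listed above:

```python
import numpy as np

rng = np.random.default_rng(0)

def flip_horizontal(img: np.ndarray) -> np.ndarray:
    # Geometric transformation: mirror the image left-right.
    return img[:, ::-1]

def rotate_90(img: np.ndarray) -> np.ndarray:
    # Geometric transformation: rotate the image by 90 degrees.
    return np.rot90(img)

def add_noise(img: np.ndarray, sigma: float = 0.05) -> np.ndarray:
    # Color transformation: add Gaussian noise, clipping back to [0, 1].
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

img = rng.random((4, 4))                       # stand-in for a real image
expanded = [flip_horizontal(img), rotate_90(img), add_noise(img)]
```

Each transformed array would be added to the expansion data set and later screened by coverage rate.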
Further, step 3 specifically comprises:
Step 3.1: preferably, the invention provides three coverage rate indexes for a pre-trained language model: positive sequence coverage rate, negative sequence coverage rate, and hidden unit coverage rate;
the hidden layer formula based on the pre-training language model is shown as formula (1), wherein the pre-training language model is based on a Transformer model (Transformer-XL);
wherein Attention () is a formal representation of the content stream in the dual stream self-Attention mechanism in the model, letIs X Z<t Where m is the number of encoder layers, Z t Represents a sequence [ 1.,. T., T of text length T]All possible orders of (i.e. Z) t Set of ordering methods for sequences of text length T), and Z is one of the ordering methods, Z belonging to Z t ,z t Is Z t The t-th element (i.e. Z) t T ordering method) in (1), z <t Including all elements preceding the t-th element.
Wherein h is Zt Is a representation of a hidden layer, h Zt (i) Indicating a hidden layer h Zt The ith constituent element of (1); hidden unit coverage refers to the observable state change of the hidden layer, when the change is larger than a threshold value, the neuron is considered to be activated, and the coverage is used for calculating the ratio of the activated neurons; the sequence coverage range reflects information about the state of the successive hidden layers, which is represented by the element z t Number of forward sequences consisting of neurons activated during forward propagationAnd the element z t Negative sequence number of neurons activated in reverse propagationAnd (4) forming.
The hidden unit coverage rate formula is formula (2):

$$\mathrm{HUC} = \frac{\mathrm{NUM\_hidden\_activated}}{\mathrm{NUM\_hidden}} \tag{2}$$

where NUM_hidden_activated is the number of activated hidden units and NUM_hidden is the total number of hidden units of the model.
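A minimal sketch of formula (2), assuming the per-unit state changes have already been collected into an array (variable names are ours):

```python
import numpy as np

def hidden_unit_coverage(state_changes: np.ndarray, threshold: float) -> float:
    # A hidden unit counts as activated when its observed state change
    # exceeds the threshold; the coverage rate is the activated ratio,
    # i.e. NUM_hidden_activated / NUM_hidden.
    activated = np.abs(state_changes) > threshold
    return float(activated.sum()) / state_changes.size

changes = np.array([0.9, 0.1, 0.7, 0.05])  # one value per hidden unit
print(hidden_unit_coverage(changes, threshold=0.5))  # 0.5 (2 of 4 units)
```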
The Positive Sequence Coverage (PSC) and Negative Sequence Coverage (NSC) obtained from formula (2) are:

$$\mathrm{PSC} = \frac{\sum_{t} \mathrm{pos}(z_t)}{\mathrm{NUM\_pos}} \tag{3}$$

$$\mathrm{NSC} = \frac{\sum_{t} \mathrm{neg}(z_t)}{\mathrm{NUM\_neg}} \tag{4}$$

where NUM_pos is the total number of forward sequences (obtained by summing over all $z_t$) and NUM_neg is the total number of negative sequences (likewise summed over all $z_t$).
Step 3.2: a comparison of coverage is shown in fig. 2;
Step 3.2.1: calculate the correlation coefficient between each coverage rate index of the training data set and the model accuracy rate; the Pearson correlation coefficient can be used.
The model accuracy rate is the proportion of samples the model predicts correctly among all samples.
Step 3.2.2: select the coverage rate index with the highest correlation coefficient with the model accuracy rate as the judgment condition for finding effective data in the next step;
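Steps 3.2.1 and 3.2.2 can be sketched as follows, assuming coverage index values and accuracies have been measured over several runs (all names and numbers here are illustrative, not from the patent):

```python
import numpy as np

def select_coverage_index(index_values: dict, accuracies: list) -> str:
    # Pick the coverage rate index whose measured values correlate most
    # strongly (by absolute Pearson coefficient) with model accuracy.
    acc = np.asarray(accuracies, dtype=float)
    best, best_r = None, -1.0
    for name, values in index_values.items():
        r = abs(np.corrcoef(np.asarray(values, dtype=float), acc)[0, 1])
        if r > best_r:
            best, best_r = name, r
    return best

measurements = {                          # illustrative numbers only
    "hidden_unit": [0.30, 0.45, 0.50, 0.62],   # tracks accuracy closely
    "positive_seq": [0.80, 0.20, 0.70, 0.10],  # weakly related
}
accs = [0.61, 0.70, 0.74, 0.82]
print(select_coverage_index(measurements, accs))  # hidden_unit
```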
Step 3.2.3: with the trained neural network 1, compare the coverage rate of the training data with that of the expansion data; if the expansion data's coverage rate is greater, keep the expansion data as effective data; otherwise it is failure data. The method thus filters the expansion data and selects more effective data for data enhancement. Experiments verified that the screened data yield a better enhancement effect than the raw expansion data.
Here the coverage rate index is the one selected in the screening step as most correlated with model accuracy; the training data are the data used to initially train neural network 1; and the expansion data are drawn from the expansion data set generated from the training data. For expansion data generated from a single training sample, such as by geometric transformation or color transformation of an image or by synonym replacement in text, the coverage rate comparison is between the coverage rate of that single training sample and the coverage rate of the expansion sample generated from it. For expansion data produced by a generative method such as a GAN, the comparison is between the average coverage rate of the training samples of the same class as the expansion data and the coverage rate of a single expansion sample.
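The per-sample screening rule can be sketched as below; `toy_coverage` is only a stand-in for the real coverage metric evaluated on trained neural network 1:

```python
def filter_expansion_data(coverage_fn, pairs):
    # Keep an expansion sample only when its coverage rate exceeds that of
    # the training sample it was generated from (the step 3.2.3 rule).
    return [aug for orig, aug in pairs if coverage_fn(aug) > coverage_fn(orig)]

def toy_coverage(sample: str) -> float:
    # Stand-in for the real metric evaluated on trained neural network 1:
    # here, samples with more distinct words score higher.
    return len(set(sample.split())) / 10.0

pairs = [
    ("a cat sat", "a big cat sat down"),  # richer expansion -> kept
    ("a dog ran", "a dog ran"),           # no coverage gain -> discarded
]
print(filter_expansion_data(toy_coverage, pairs))  # ['a big cat sat down']
```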
The invention has the following positive effects:
(1) Using coverage rate to screen the expansion data set quickly yields effective data for model training and improves the accuracy of the pre-trained model;
(2) Screening a plurality of coverage rate indexes by their correlation coefficient with model accuracy increases the relevance of the effective data to the model and the efficiency of data set screening;
(3) Three coverage rate indexes are proposed for pre-trained language models, which can improve the accuracy of such models.
Drawings
FIG. 1 is a flow chart of a coverage-based neural network effective data enhancement method;
fig. 2 is a flowchart of a comparison of coverage indicators.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The technical scheme of the invention organically combines coverage rate characteristics with a data enhancement mechanism: after the data set is enhanced, an effective data set is screened out using a coverage rate index, improving the accuracy of the pre-trained model.
The invention provides a coverage rate-based neural network effective data enhancement method, and the specific technical scheme is as shown in figure 1:
step 1: training parameters of the neural network 1 using a training data set; wherein the model of the neural network 1 is a pre-training model;
step 2: expanding the training data set to obtain an expanded data set;
and 3, step 3: screening a plurality of coverage rate indexes, selecting the coverage rate index most relevant to the accuracy of the model as a judgment basis, comparing the coverage rates of the training data set and the expansion data set, and obtaining an effective data set after screening the coverage rate indexes; preferably, the coverage rate index of the text data comprises coverage rate of a hidden unit, coverage rate of a positive sequence and coverage rate of a negative sequence;
and 4, step 4: the pre-trained model is retrained using the active data set, resulting in the neural network 2.
In one embodiment, the story-ending prediction task (Story Cloze Test, SCT) uses a pre-trained language model to predict the ending of a story: given the context of a four-sentence story, the model must select the correct ending from two candidates (one wrong, one correct). In the text field, text mutation and synonym replacement are commonly used to generate discrete data. The story-ending prediction task was proposed as an evaluation task for automatic event-chain construction: given a series of event chains with one segment deleted, the trained model selects a prediction from a candidate data set.
Further, the pre-training model in step 1 specifically includes:
the pre-training language model may be selected from a Long Short Term Memory (LSTM) model, a transformer (transformer) model, an XLNet model, etc., and in one embodiment, the XLNet model is selected during a model training phase. The biggest advantage of XLNET is that it can learn context information through various permutations of input sequences, the algorithm uses permutation language model to adjust model parameters, and uses a dual-stream auto-attention mechanism to achieve target-sensitive representation to obtain better results; some parameters may be saved to select the best coverage criteria and model in preparation for subsequent calculations after training.
Further, the data set and data expansion in step 2 specifically include:
For image data, the expansion methods comprise geometric transformations (flipping, rotation, cropping, deformation, scaling, etc.), color transformations (noise addition, blurring, color change, erasing, filling, etc.), and generative methods (GAN, etc.); for text data, the expansion methods comprise text mutation, synonym substitution, and the like.
In one embodiment, the method is illustrated with text data, using the story-ending prediction task data set (SCT v1.0) as the initial training data set. Each sample contains four short sentences describing a story, two candidate answers as story endings, and a label indicating the correct answer. The data format is shown in Table 1.
Table 1 story-ending prediction data example
| Context | Answer 1 | Answer 2 | Label |
|---|---|---|---|
| C1;C2;C3;C4. | A1 | B1 | 1 |
| C5;C6;C7;C8. | A2 | B2 | 2 |
A variety of data-generation techniques may be used in the data-generation phase. For discrete text, we use two techniques to generate enough data from the training data: text mutation and synonym substitution. Text mutation alters the original training data by random insertion, random swap, and random deletion, though in this way the meaning of the story may change. The other approach is synonym substitution, i.e., using Paragram-SL999 to generate close paraphrases of words. In total, the generated data is five times the size of the original training data set.
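The three text-mutation operations (random insertion, random swap, random deletion) can be sketched as follows; this is an illustrative sketch, not the patent's implementation:

```python
import random

def random_swap(words, rng):
    # Text mutation: exchange two randomly chosen word positions.
    w = list(words)
    i, j = rng.sample(range(len(w)), 2)
    w[i], w[j] = w[j], w[i]
    return w

def random_deletion(words, rng, p=0.2):
    # Text mutation: drop each word with probability p, keeping at least one.
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(list(words))]

def random_insertion(words, rng):
    # Text mutation: duplicate a random word at a random position.
    w = list(words)
    w.insert(rng.randrange(len(w) + 1), rng.choice(w))
    return w

rng = random.Random(42)
story = "the cat sat on the mat".split()
for mutate in (random_swap, random_deletion, random_insertion):
    print(" ".join(mutate(story, rng)))
```

As the description notes, such mutations may change the story's meaning, which is exactly why the coverage-based screening step is needed afterwards.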
Further, step 3 specifically comprises:
Step 3.1: preferably, the invention provides three coverage rate indexes for the pre-trained language model: positive sequence coverage rate, negative sequence coverage rate, and hidden unit coverage rate;
the hidden layer formula based on the pre-training model is shown as formula (1), wherein the pre-training model is based on a Transformer model (Transformer-XL);
order toIs X Z<t Where m is the number of encoder layers, Z t Representing a sequence [1,. Ang., T ] of text length T]And Z is one of the ordering methods, Z belongs to Z t ,z t Is the t-th element, z <t Including all previous tuples.
Hidden unit coverage refers to the observable state change of the hidden layer, when the change is larger than a threshold value, the neuron is considered to be activated, and the coverage is used for calculating the ratio of the activated neurons; the sequence coverage range reflects the information about the state of the successive hidden layers, which is covered by the forward sequence coverageAnd negative sequence coverageAnd (4) forming.
The hidden unit coverage rate formula is formula (2):

$$\mathrm{HUC} = \frac{\mathrm{NUM\_hidden\_activated}}{\mathrm{NUM\_hidden}} \tag{2}$$

where NUM_hidden_activated is the number of activated hidden units and NUM_hidden is the total number of hidden units of the model.
The Positive Sequence Coverage (PSC) and Negative Sequence Coverage (NSC) obtained from formula (2) are:

$$\mathrm{PSC} = \frac{\sum_{t} \mathrm{pos}(z_t)}{\mathrm{NUM\_pos}}, \qquad \mathrm{NSC} = \frac{\sum_{t} \mathrm{neg}(z_t)}{\mathrm{NUM\_neg}} \tag{3}$$
step 3.2: comparing the coverage rate;
Step 3.2.1: calculate the correlation coefficient between each coverage rate index of the training data set and the model accuracy rate; the Pearson correlation coefficient can be used.
The model accuracy rate is the proportion of samples the model predicts correctly among all samples:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP is the number of samples predicted positive that are actually positive, TN the number predicted negative that are actually negative, FP the number predicted positive that are actually negative, and FN the number predicted negative that are actually positive.
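The accuracy formula as a one-liner:

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    # Accuracy = (TP + TN) / (TP + TN + FP + FN)
    return (tp + tn) / (tp + tn + fp + fn)

print(accuracy(tp=40, tn=35, fp=15, fn=10))  # 0.75
```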
Step 3.2.2: selecting a coverage rate index with the highest coefficient related to the model accuracy rate as a judgment condition for searching effective data in the next step;
step 3.3.3: comparing the coverage rate of the training data and the coverage rate of the expansion data by using the trained neural network 1, and if the coverage rate of the expansion data is greater than that of the training data, keeping the expansion data as an effective data set; otherwise, the expansion data is failure data;
the coverage rate index is the coverage rate index which is selected by the coverage rate screening step and has the highest coefficient related to the model accuracy, the training data is the data of the initial training neural network 1, and the expansion data is the data selected from the expansion data set generated by the training data;
It is worth noting that for image expansion data generated by geometric or color transformation, or text expansion data generated by synonym replacement, the coverage rate comparison is between a single training sample and the corresponding generated expansion sample. Taking text data as an example: if the single training sample is a sentence C1 and C2 is the expansion data obtained from C1, the comparison is between the coverage rates of C1 and C2.
For data produced by a generative method, the comparison is between the average coverage rate of the training-set samples of the same class as the expansion data and the coverage rate of a single expansion sample. Taking a generated image of a cat as an example: all cat images in the training data set form a subset D1, whose average coverage rate is computed; the single expansion sample is a cat image I1 generated by the GAN; and the comparison is between the average coverage rate over D1 and the coverage rate of I1.
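This generative-class comparison can be sketched as below; the precomputed coverage values are illustrative stand-ins, not patent data:

```python
import numpy as np

def keep_generated_sample(coverage_of, same_class_samples, generated) -> bool:
    # GAN-style expansion: keep the generated sample when its coverage rate
    # exceeds the average coverage of same-class training samples (subset D1).
    avg = float(np.mean([coverage_of(s) for s in same_class_samples]))
    return bool(coverage_of(generated) > avg)

# Illustrative precomputed coverage values (assumed, not from the patent).
coverage = {"cat_1": 0.40, "cat_2": 0.50, "gan_cat": 0.52}.get
print(keep_generated_sample(coverage, ["cat_1", "cat_2"], "gan_cat"))  # True
```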
The above embodiments are only used for illustrating the technical solutions of the present invention and not for limiting the same, and those skilled in the art can make modifications or equivalent substitutions to the technical solutions of the present invention without departing from the principle and scope of the present invention, and the scope of the present invention should be determined by the claims.
Claims (3)
1. A coverage rate-based neural network effective data enhancement method comprises the following steps:
1) Selecting a training data set according to a neural network model to be trained and selecting a plurality of coverage rate indexes for the neural network model; the neural network model to be trained is a language model, the sample data in the training data set are text data, and the coverage rate indexes comprise hidden unit coverage rate, positive sequence coverage rate, and negative sequence coverage rate; the hidden unit coverage rate is $\mathrm{HUC} = \mathrm{NUM\_hidden\_activated} / \mathrm{NUM\_hidden}$, where NUM_hidden_activated represents the number of activated hidden units in the neural network model and NUM_hidden represents the total number of hidden units in the neural network model; the sequence coverage reflects state information of successive hidden layers and is composed of the number $\mathrm{pos}(z_t)$ of forward sequences of neurons activated during forward propagation for element $z_t$ and the number $\mathrm{neg}(z_t)$ of negative sequences of neurons activated during backward propagation, the element $z_t$ being the t-th element of one of all possible orders of the text of length T; the positive sequence coverage rate is $\mathrm{PSC} = \sum_t \mathrm{pos}(z_t) / \mathrm{NUM\_pos}$ and the negative sequence coverage rate is $\mathrm{NSC} = \sum_t \mathrm{neg}(z_t) / \mathrm{NUM\_neg}$, where NUM_pos represents the total number of forward sequences and NUM_neg represents the total number of negative sequences;
2) Training the neural network model to be trained by using the training data set, and counting the number of activated neurons corresponding to different coverage rate indexes in the neural network model during training;
3) Calculating each coverage index value of the training data set according to the number of activated neurons corresponding to each coverage index; then selecting a coverage rate index which is most related to the accuracy of the neural network model as an evaluation index according to each coverage rate index value;
4) Expanding the training data set to obtain an expanded data set;
5) Respectively testing the evaluation index value of the training data set and the evaluation index value of the expansion data set by using the neural network model trained in step 2); if the evaluation index value of the expansion data set is larger than that of the training data set, taking the expansion data set as an effective data set; otherwise, regarding the expansion data set as an invalid data set.
2. The method of claim 1, wherein the coverage index most correlated to the accuracy of the neural network model is selected as the evaluation index according to a pearson correlation coefficient between each coverage index and the accuracy of the neural network model.
3. A method for training a neural network model, wherein the effective data set obtained by the method of claim 1 is used to train the neural network model to be trained.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011562234.7A | 2020-12-25 | 2020-12-25 | Coverage rate-based neural network effective data enhancement method |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN112712163A | 2021-04-27 |
| CN112712163B | 2022-10-14 |
Family
ID=75546509
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102073586B (en) * | 2010-12-23 | 2012-05-16 | 北京航空航天大学 | Gray generalized regression neural network-based small sample software reliability prediction method |
US11568307B2 (en) * | 2019-05-20 | 2023-01-31 | International Business Machines Corporation | Data augmentation for text-based AI applications |
CN111753985B (en) * | 2020-06-28 | 2024-02-23 | 浙江工业大学 | Image deep learning model testing method and device based on neuron coverage rate |
Non-Patent Citations (1)
Title |
---|
Fuzz Testing based Data Augmentation to Improve Robustness of Deep Neural Networks; Xiang Gao et al.; 《ACM》; 2020-10-01; pp. 1-12 * |
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |
| | GR01 | Patent grant |