CN115497633A - Data processing method, device, equipment and storage medium - Google Patents
- Publication number
- CN115497633A (application number CN202211291571.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- similarity
- users
- health data
- processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Theoretical Computer Science (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- Epidemiology (AREA)
- Pathology (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Primary Health Care (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention discloses a data processing method, a data processing device, data processing equipment and a storage medium. The method comprises: receiving data to be processed, where the data to be processed comprises health data of two users and user basic information; inputting the health data of each user into a corresponding health data twin network model to obtain a first vector corresponding to each user's health data, and determining the similarity between the two first vectors; inputting the user basic information of the two users and the similarity into a pre-trained discriminant model, and determining the comprehensive similarity between the two users; and determining, based on the comprehensive similarity, whether to merge the data to be processed of the two users. According to the embodiments of the invention, a more broadly applicable user master index matching system is constructed, the degree of data association in the health big data center is improved, the information of similar users across systems is integrated, and subsequent retrieval and use of the information is made more convenient.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method, apparatus, device, and storage medium.
Background
As medical informatization advances, building a large medical and health data center requires integrating user health data stored across multiple medical institutions.
In the traditional data processing method, the basic information of users is compared, and a similarity evaluation model is trained based on manually set rules, so as to integrate the users' data.
This method considers only the basic information of the user. Because it relies on a single dimension, the accuracy of its results is low and it adapts poorly to different users.
Disclosure of Invention
The invention provides a data processing method, a device, equipment and a storage medium. By comparing and analyzing the comprehensive similarity of the basic information and the health data of two suspected matching users, the invention achieves organic integration of health data generated by different institutions, by different services and at different times, and improves the accuracy of data integration.
In a first aspect, an embodiment of the present invention provides a data processing method, where the method includes:
receiving data to be processed;
the data to be processed comprises health data of two users and user basic information;
inputting the health data of each user into a corresponding health data twin network model to obtain first vectors corresponding to each health data, and determining the similarity between the two first vectors;
inputting the user basic information and the similarity of the two users into a discriminant model obtained by pre-training, and determining the comprehensive similarity between the two users;
and determining whether to combine the data to be processed of the two users based on the comprehensive similarity.
In a second aspect, an embodiment of the present invention further provides a data processing apparatus, which is applied to data processing, and includes:
a data receiving module, configured to receive the two groups of data to be processed, where the data to be processed comprises health data of two users and user basic information;
a similarity calculation module, configured to input the health data of each user into the corresponding health data twin network model to obtain first vectors corresponding to the health data, and to determine the similarity between the two first vectors;
a comprehensive similarity calculation module, configured to input the user basic information and the similarity of the two users into a discriminant model obtained by pre-training, and to determine the comprehensive similarity between the two users; and
a decision module, configured to determine, based on the comprehensive similarity, whether to merge the data to be processed of the two users.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the data processing method according to any one of the embodiments of the present invention.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, where computer instructions are stored, and the computer instructions are used to enable a processor to implement any one of the data processing methods according to the embodiments of the present invention when the computer instructions are executed.
According to the technical solution of the embodiments of the invention, data to be processed is received; the health data of each user is input into a corresponding health data twin network model to obtain a first vector corresponding to each user's health data, and the similarity between the two first vectors is determined; the user basic information and the similarity of the two users are input into a discriminant model obtained by pre-training, and the comprehensive similarity between the two users is determined; and whether to merge the data to be processed of the two users is determined based on the comprehensive similarity. Through comprehensive processing and analysis of the users' basic information and health information, a more broadly applicable user master index matching system is constructed, the information of similar users across systems is integrated, and subsequent retrieval and use of the information is made more convenient.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a data processing method according to a second embodiment of the present invention;
Fig. 3 is a flowchart of a data processing method according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device implementing the data processing method according to the embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention, where the method is applicable to information merging between two suspected matching users, and the method may be executed by a data processing device, where the data processing device may be implemented in a form of hardware and/or software, and the device may be configured in a computer. As shown in fig. 1, the method includes:
S110, receiving data to be processed.
The data to be processed comprises health data of two users and basic information of the users.
The health data refers to data related to the health of the user, such as data related to previous diseases, previous operations, previous medicines and the like. The basic information may be basic information used to characterize the identity of the user, such as name, gender, address, etc. The information dimensions specifically included in the health data and the basic information are not limited in this embodiment, and may be selected according to actual needs.
Specifically, when it is necessary to determine whether two users are similar users and/or to integrate user information, a staff member may edit, retrieve and upload the data to be processed in the system so that the data can subsequently be received.
S120, the health data of each user are input into the corresponding health data twin network model, first vectors corresponding to the health data are obtained, and the similarity between the two first vectors is determined.
The health data twin network model can be obtained by training on user data whose associations are already known. The health data twin network model may be a DNN network with pre-trained model parameters. The twin network model performs feature engineering on the health data in the data to be processed to obtain a feature vector matching the user's health data. Optionally, the user features are reduced in dimension by a probabilistic DNN network to obtain the first vector corresponding to the user's health data. The first vector is a feature vector that captures the features of the user's health data and can be processed and analyzed by a computer. The similarity is a probability value representing how similar the health data of the two users are: the closer it is to 1, the more similar the two users are, and the closer it is to 0, the less similar they are.
Specifically, information such as past diseases, past operations and past medications in the health data of the two users is input into the corresponding health data twin network models. Through the models' preprocessing, feature selection, feature construction, feature dimension reduction and similar operations, two first vectors are obtained that respectively represent the features of the two users' health data. The similarity of the two first vectors is then determined. To do so, a suitable algorithm, for example a cosine similarity algorithm, may be used to calculate a distance value between the two first vectors, and the similarity is determined from that distance value. Optionally, a smaller distance value indicates a higher similarity between the users, and a larger distance value indicates a lower similarity.
It should be noted that the health data twin network model is two neural network models, the model structures of which are the same, and the health data of two users can be respectively processed based on a one-to-one correspondence relationship to obtain corresponding first vectors.
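As an illustration of step S120, the following is a minimal sketch, not the patented model. It assumes the health data has already been encoded as numeric flags, stands in a single shared linear layer with a tanh activation for the DNN, and rescales cosine similarity from [-1, 1] to [0, 1]. The names `encode`, `health_similarity` and the values in `SHARED_WEIGHTS` are hypothetical; a real model would learn its weights during training.

```python
import math

# Hypothetical shared weights (2 outputs x 4 inputs); both twin branches use
# the same matrix, which is what makes the two networks "twins".
SHARED_WEIGHTS = [
    [0.5, -0.2, 0.1, 0.3],
    [-0.1, 0.4, 0.2, -0.3],
]


def encode(flags, weights):
    """Map a health-flag vector to a lower-dimensional first vector."""
    return [math.tanh(sum(w * x for w, x in zip(row, flags))) for row in weights]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def health_similarity(flags_a, flags_b, weights=SHARED_WEIGHTS):
    """Encode both users with the same weights and return a score in [0, 1]."""
    vec_a = encode(flags_a, weights)
    vec_b = encode(flags_b, weights)
    # Rescaling (cos + 1) / 2 is an assumption made so the score reads like
    # the probability-style value described in the text.
    return (cosine_similarity(vec_a, vec_b) + 1.0) / 2.0
```

Identical health records produce identical first vectors and therefore a score at the top of the range, while records that activate different flags score lower.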
S130, inputting the basic user information and the similarity of the two users into a discriminant model obtained by pre-training, and determining the comprehensive similarity between the two users.
The discriminant model is obtained by pre-training, and optionally, the adopted model structure can be a linear model, a support vector machine model, a tree model, a deep network model and other common machine learning models. The comprehensive similarity is a probability value obtained by combining the basic information and the health data similarity of the two users and is used for representing the similarity of the comprehensive information of the users. The closer the comprehensive similarity is to 1, the higher the comprehensive similarity of the two users is, and the closer the comprehensive similarity is to 0, the lower the comprehensive similarity of the two users is.
Specifically, basic-information similarity features of the two users are obtained through feature engineering. The similarity of the two first vectors is then spliced with the basic-information similarity features to obtain a comprehensive similarity feature vector for the two users, and the comprehensive similarity is obtained after this vector is processed by the discriminant model.
S140, determining whether to combine the data to be processed of the two users based on the comprehensive similarity.
Specifically, a comprehensive similarity threshold value may be preset, and when the determined comprehensive similarity is within the comprehensive similarity threshold value range, it may be considered that the information of the two users may be merged; otherwise, the information merging processing is not performed.
According to the embodiment of the invention, data to be processed is received, where the data to be processed comprises health data of two users and user basic information; the health data of each user is input into a corresponding health data twin network model to obtain a first vector corresponding to each user's health data, and the similarity between the two first vectors is determined; the user basic information and the similarity of the two users are input into a discriminant model obtained by pre-training, and the comprehensive similarity between the two users is determined; and whether to merge the data to be processed of the two users is determined based on the comprehensive similarity. By comprehensively processing and analyzing both the basic information and the health data of the users, this technical solution addresses the problem in the prior art that only the basic information of the user is considered, so that the comparison relies on a single dimension. The information of similar users across systems is integrated, subsequent retrieval and use of the information is made more convenient, the accuracy of data integration is improved, and the adaptability to different users is higher.
Example two
Fig. 2 is a flowchart of a data processing method according to a second embodiment of the present invention. On the basis of the foregoing embodiment, corresponding training samples may first be obtained, a discriminant model may be trained on those samples, and the comprehensive similarity between two users may then be determined based on the discriminant model.
As shown in fig. 2, the method includes:
S210, determining a training sample set.
The training sample set comprises a plurality of training samples, and the training samples comprise positive samples and corresponding positive labels, negative samples and corresponding negative labels.
Wherein, the positive sample is the user characteristic data of two similar users determined manually, and the label of the positive sample is set as 1. The positive label is a label set for the positive sample. Negative examples are manually determined user characteristic data of two dissimilar users, whose labels are set to negative labels. The negative label is a label set for the negative example.
Exemplarily, comparing name features, gender features and age features of the user 1 and the user 2 manually, if the feature data are similar, considering the two users as similar users, taking the feature data of the two users as a positive sample, and setting the label of the positive sample as 1; if the feature data are not similar, the two users are considered to be non-similar users, the feature data of the two users are used as negative samples, and the label of the negative sample is set to be 0.
Specifically, the training sample set includes basic information data characteristics of a plurality of users, such as names, ages, addresses, and the like. And if the basic information of the two users can be determined to be similar users through manual review, the characteristic data of the two users is used as a positive sample, and the labels of the characteristic data are set to be positive labels. And if the basic information of the two users can be determined to be non-similar users through manual review, taking the data characteristics of the two users as negative samples, and setting the labels of the negative samples as negative labels.
S220, for each training sample, inputting the current training sample into the discrimination model to be trained to obtain a corresponding actual output similarity value.
Exemplarily, the feature data of the user a and the user b are input into the discriminant model to be trained, and the model processes the feature data of the two users to calculate the actual output similarity values of the two feature data.
It should be noted that the processing method of each training sample based on the discriminant model is the same, and in this embodiment, one of the training samples may be taken as an example for description. The currently introduced training sample may be used as the current training sample.
The model parameters in the to-be-trained discrimination model are default values, and because the to-be-trained discrimination model is still an untrained discrimination model, the content output after the to-be-trained discrimination model processes the current training sample may not be consistent with the label content in the current training sample, and the content output by the to-be-trained discrimination model can be used as an actual output similarity value.
S230, determining a loss value based on the actual output similarity value and the label of the current training sample, and correcting the model parameters in the discriminant model to be trained based on the loss value.
S240, taking the loss function convergence in the discriminant model to be trained as a training target to obtain the discriminant model.
Specifically, the training error of the loss function in the discriminant model to be trained, that is, the loss parameter, is used as the condition for detecting whether the loss function has converged, for example whether the training error is smaller than a preset error, whether the error trend has stabilized, or whether the current number of iterations equals a preset number. If the convergence condition is met, for example the training error of the loss function is smaller than the preset error or the error trend has stabilized, this indicates that training of the discriminant model to be trained is complete, and iterative training can be stopped. If the convergence condition is not met, further sample data can be obtained to continue training the model until the training error of the loss function falls within the preset range.
It can be understood that when the training error of the loss function reaches convergence, the trained discriminant model can be obtained. At the moment, more accurate similarity can be obtained after the health data and/or the basic information of the user are/is input into the model.
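Steps S220 to S240 can be sketched as follows. This is an illustrative example only: it assumes the discriminant model is a logistic regression (the text allows linear models among others), uses batch gradient descent with a cross-entropy loss, and implements the three stopping checks mentioned above (error below a preset value, error trend stabilized, preset iteration count reached). The function names and all hyperparameter values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Actual output similarity value for one sample."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_discriminant(samples, labels, lr=0.5, max_iters=5000,
                       target_loss=0.05, plateau_tol=1e-9):
    """Train until the loss is small, has stopped changing, or iterations run out."""
    n = len(samples)
    weights = [0.0] * len(samples[0])  # default (untrained) parameters
    bias = 0.0
    prev_loss = float("inf")
    loss = prev_loss
    for iteration in range(1, max_iters + 1):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            p = predict(weights, bias, x)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            for j, xj in enumerate(x):
                grad_w[j] += (p - y) * xj
            grad_b += p - y
        loss /= n
        # Convergence checks: preset error reached, or error trend stabilized.
        if loss < target_loss or abs(prev_loss - loss) < plateau_tol:
            break
        # Correct the model parameters based on the loss gradient.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        prev_loss = loss
    return weights, bias, loss, iteration
```

Each sample here would be a feature vector for one user pair, with label 1 for a manually confirmed similar pair and 0 for a dissimilar pair.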
S250, receiving data to be processed.
The data to be processed comprises health data of two users and basic information of the users.
S260, inputting the health data of one user into the first health data twin model to obtain a first vector to be processed, and inputting the health data of the other user into the second health data twin model to obtain a second vector to be processed.
The health data may include, among other things, whether a disease is present, whether the blood pressure is above normal, whether the level of white blood cells is above or below normal, and the like. If so, the corresponding identification bit is represented by 1, otherwise, the corresponding identification bit is represented by 0.
Illustratively, the health data of the user a is input into the first health data twin model, and a first vector to be processed is obtained. And inputting the health data of the user b into a second health data twin model to obtain a second vector to be processed. Wherein the first health data twin model and the second health data twin model have the same logic for analyzing and processing the data.
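The 0/1 identification bits described above could be produced as in the following sketch, which turns a user's health record into a fixed-order flag vector before it is passed to a twin model. The field names in `HEALTH_FIELDS` and the function `encode_health_flags` are hypothetical and chosen only for illustration.

```python
# Hypothetical indicator names; the fixed order guarantees that the same
# position means the same indicator for both users.
HEALTH_FIELDS = ("has_disease", "high_blood_pressure", "abnormal_white_cells")


def encode_health_flags(record, fields=HEALTH_FIELDS):
    """Return 1 for each indicator that is present and true, otherwise 0."""
    return [1 if record.get(name) else 0 for name in fields]


user_a_flags = encode_health_flags({"has_disease": True, "abnormal_white_cells": True})
user_b_flags = encode_health_flags({"high_blood_pressure": True})
```

Missing indicators are treated as 0 in this sketch; whether absent data should instead be handled separately is a design decision the text does not address.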
S270, determining the similarity between the first vector to be processed and the second vector to be processed based on a preset similarity algorithm.
The preset similarity algorithm may be a cosine similarity algorithm, a Euclidean distance algorithm, or the like. Optionally, in this embodiment, the cosine similarity algorithm may be used to calculate a distance value between the first vector to be processed and the second vector to be processed, where a smaller distance value indicates a higher user similarity and a larger distance value indicates a lower similarity.
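The two algorithms named above can be written as distance functions, as in this short sketch. Defining cosine distance as one minus cosine similarity, and converting a distance d into a similarity with 1 / (1 + d), are common conventions assumed here for illustration; the text does not specify either formula.

```python
import math


def cosine_distance(u, v):
    """1 - cos(u, v); returns 1.0 if either vector is all zeros (assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return 1.0
    return 1.0 - dot / norm


def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_to_similarity(distance):
    """Smaller distance gives a similarity closer to 1 (assumed mapping)."""
    return 1.0 / (1.0 + distance)
```

Cosine distance ignores vector length and compares direction only, whereas Euclidean distance also reflects differences in magnitude, so the two can rank the same user pairs differently.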
S280, inputting the basic user information and the similarity of the two users into a discriminant model obtained by pre-training, and determining the comprehensive similarity between the two users.
S290, determining whether to combine the data to be processed of the two users based on the comprehensive similarity.
According to the technical solution of this embodiment of the invention, a discriminant model to be trained is trained, and data to be processed is then received, where the data to be processed comprises health data of two users and user basic information; the health data of each user is input into a corresponding health data twin network model to obtain a first vector corresponding to each user's health data, and the similarity between the two first vectors is determined; the user basic information and the similarity of the two users are input into the discriminant model obtained by pre-training, and the comprehensive similarity between the two users is determined; and whether to merge the data to be processed of the two users is determined based on the comprehensive similarity. By comprehensively processing and analyzing both the basic information and the health data of the users, this technical solution addresses the problem in the prior art that only the basic information of the user is considered, so that the comparison relies on a single dimension. The information of similar users across systems is integrated, subsequent retrieval and use of the information is made more convenient, the accuracy of data integration is improved, and the adaptability to different users is higher.
Example three
Fig. 3 is a flowchart of a data processing method according to a third embodiment of the present invention. On the basis of the foregoing embodiments, this embodiment refines the step of inputting the user basic information and the similarity of the two users into the pre-trained discriminant model to determine the comprehensive similarity between the two users. For the specific implementation, reference may be made to the detailed description of this embodiment; technical terms that are the same as or correspond to those of the foregoing embodiments are not described again here. As shown in Fig. 3, the method includes:
S310, receiving data to be processed.
Wherein, the data to be processed comprises health data of two users and basic information of the users.
S320, inputting the health data of each user into the corresponding health data twin network model to obtain first vectors corresponding to the health data, and determining the similarity between the two first vectors.
S330, performing feature matching processing on the basic information of the two users to obtain feature matching vectors.
Feature matching compares the corresponding dimensions in the basic information of the two users. For example, suppose three dimensions a, b and c of the basic information are compared, the basic information of user a is [a1, b1, c1], and the basic information of user b is [a2, b2, c2]. Taking one dimension as an example, if feature matching of a1 and a2 yields an output value of 0.8, then 0.8 is entered at the corresponding identification bit of the comparison result; repeating this for every dimension yields the feature matching vector of the two users. The feature matching vector is a feature vector, obtained by comparing the user basic information, that represents how similar the basic information of the two users is.
Optionally, the user basic information includes field contents corresponding to a plurality of fields, and the feature matching processing is performed on the basic information of the two users to obtain a feature matching vector, including: matching the field contents corresponding to the same field to obtain the matching characteristics corresponding to the corresponding field; and determining a feature matching vector based on the matching features corresponding to the fields.
A field may be an item of the user's basic information, such as name or age.
Specifically, every field is handled in the same way. Taking the name field as an example, feature matching is performed on the name fields of the two users: for names that are identical or that differ only in tone, 1 or the name similarity is written to the name identification bit of the matching features. The same operation is performed on each field to finally obtain the feature matching vector.
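The field-by-field matching of S330 can be sketched as follows. This is only a minimal illustration, not the patented implementation: the field list, the sample users, the handling of missing values, and the use of `difflib.SequenceMatcher` as the per-field comparison are all assumptions, since the description does not specify how the matching feature of a field is computed.

```python
from difflib import SequenceMatcher


def match_field(value_a, value_b):
    """Return a matching feature in [0, 1] for one field of two users."""
    if value_a is None or value_b is None:
        return 0.0  # assumption: a missing value counts as no match
    if value_a == value_b:
        return 1.0  # identical field contents
    # Assumption: generic string similarity stands in for the
    # unspecified per-field comparison.
    return SequenceMatcher(None, str(value_a), str(value_b)).ratio()


def feature_matching_vector(user_a, user_b, fields):
    """Compare the same field of both users, one identification bit per field."""
    return [match_field(user_a.get(f), user_b.get(f)) for f in fields]


# Hypothetical basic information for two users.
FIELDS = ["name", "age"]
user_a = {"name": "Zhang Wei", "age": 42}
user_b = {"name": "Zhang Wai", "age": 42}

vector = feature_matching_vector(user_a, user_b, FIELDS)
print(vector)  # one value per field, in the order given by FIELDS
```

Keeping one position per field means the downstream model always sees the same field at the same index, whichever pair of users is compared.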
S340, obtaining a target vector by splicing the feature matching vector and the similarity.
The target vector is the feature vector obtained by appending the similarity to the feature matching vector.
S350, inputting the target vector into the discriminant model to obtain the comprehensive similarity between the two users.
Specifically, the target vector is input into the trained discriminant model, and after the model analyzes and processes the vector, the comprehensive similarity between the two users is output.
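Steps S340 and S350 can be illustrated with a short sketch. The description does not state the architecture of the discriminant model, so a single logistic unit is assumed here purely for illustration; the weights, bias, and input values are hypothetical.

```python
import math


def build_target_vector(feature_matching_vector, health_similarity):
    """S340: append the health-data similarity to the feature matching vector."""
    return list(feature_matching_vector) + [health_similarity]


def discriminate(target_vector, weights, bias):
    """S350: map the target vector to a comprehensive similarity in (0, 1).

    Assumption: a logistic unit stands in for the trained discriminant model.
    """
    if len(target_vector) != len(weights):
        raise ValueError("target vector and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, target_vector))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values: three matched fields plus one health-data similarity.
target = build_target_vector([0.8, 1.0, 1.0], 0.92)
WEIGHTS = [2.0, 1.5, 1.0, 3.0]  # hypothetical trained parameters
BIAS = -5.0

score = discriminate(target, WEIGHTS, BIAS)
print(target, round(score, 3))
```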
S360, determining whether to combine the data to be processed of the two users based on the comprehensive similarity.
In this embodiment, the determination may be made as follows: if the comprehensive similarity is higher than a first preset similarity threshold, merging the data to be processed of the two users; if the comprehensive similarity is smaller than a second preset similarity threshold, refusing to combine the data to be processed of the two users; and if the comprehensive similarity is greater than the second preset similarity threshold and smaller than the first preset similarity threshold, sending the data to be processed of the two users to the target equipment so that the auditing user corresponding to the target equipment audits the data to be processed.
The preset similarity thresholds are set according to actual conditions. Comparing the comprehensive similarity with these thresholds determines whether the health data and basic information of the two users are merged, refused, or forwarded to a manual auditing system.
Illustratively, when the first preset similarity threshold is 95% and the second preset similarity threshold is 60%:
If the comprehensive similarity is higher than 95%, the analysis of the discriminant model indicates that the two users can be regarded as similar users, and the health data and the basic information of the two users are merged; the advantage is that similar user information can be integrated accurately, which facilitates subsequent retrieval and use of the information.
If the comprehensive similarity is less than 60%, the two users can be regarded as non-similar users, and merging of the health data and the basic information of the two users is refused.
If the comprehensive similarity is between 60% and 95%, the data is forwarded to a manual review system, and whether to merge the information of the two users is finally determined according to the manual review result.
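The three-way decision of S360 can be sketched as a small function, using the illustrative thresholds of 95% and 60%. The names `Decision` and `decide` are hypothetical, and sending a score that exactly equals a threshold to manual review is an assumption, because the description only covers the strictly greater and strictly smaller cases.

```python
from enum import Enum


class Decision(Enum):
    MERGE = "merge"
    REJECT = "reject"
    MANUAL_REVIEW = "manual_review"


def decide(similarity, first_threshold=0.95, second_threshold=0.60):
    """Map a comprehensive similarity to merge, reject, or manual review."""
    if second_threshold > first_threshold:
        raise ValueError("second threshold must not exceed first threshold")
    if similarity > first_threshold:
        return Decision.MERGE
    if similarity < second_threshold:
        return Decision.REJECT
    # Assumption: boundary values fall through to manual review.
    return Decision.MANUAL_REVIEW


for value in (0.97, 0.80, 0.50):
    print(value, decide(value).value)
```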
In this embodiment, so that data is always processed with the most recently trained discriminant model and health data twin network model, the following measure may be taken: corresponding training samples are acquired periodically to update the model parameters of the discriminant model and of the health data twin network model respectively, so that data is processed based on the updated discriminant model and health data twin network model.
The period may be one day, one week or one month. The relevant models are periodically retrained with newly acquired user data, and the model parameters are updated.
Specifically, periodically updating the relevant models by online learning includes: periodically re-dividing the model data set into a training set and a test set; updating the health data twin network model; updating the discriminant model; and recalculating the preset similarity threshold for automatic matching according to a preset confidence level.
The update may be scheduled, for example, every hour, every day, or at 9 o'clock each day. Updating the relevant models periodically continuously adjusts the model parameters as the data changes. This meets the new requirements that current data aggregation and integration place on the data processing method, and enables flexible analysis and processing of the data.
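The description does not say how the automatic-matching threshold is recalculated from the preset confidence level. One possible reading, shown here strictly as an assumption, is to pick the lowest score on a labelled test set such that the proportion of true matches among all pairs scoring at or above it still meets the confidence level. The function name and the sample data are hypothetical.

```python
def recalculate_merge_threshold(scores, labels, confidence=0.95):
    """Return the lowest score t such that, among test pairs with score >= t,
    the fraction of true matches is at least `confidence`.

    Returns None if no threshold satisfies the confidence level.
    `labels` holds 1 for a true match and 0 otherwise.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    threshold = None
    matches = 0
    for count, (score, label) in enumerate(ranked, start=1):
        matches += label
        # Only evaluate once every pair tied at this score has been counted.
        last_of_tie = count == len(ranked) or ranked[count][0] != score
        if last_of_tie and matches / count >= confidence:
            threshold = score
    return threshold


# Hypothetical comprehensive similarities and ground-truth labels from a test set.
test_scores = [0.99, 0.97, 0.93, 0.90, 0.70, 0.40]
test_labels = [1, 1, 1, 0, 1, 0]

new_threshold = recalculate_merge_threshold(test_scores, test_labels, confidence=0.9)
print(new_threshold)
```

Recomputing the threshold after each model update keeps the automatic-merge rule tied to the behaviour of the current models rather than to scores produced by an older version.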
According to the technical scheme provided by this embodiment, the data to be processed is received, and the health data of the two users is input into the health data twin models to obtain a first vector to be processed and a second vector to be processed. The similarity between the first vector to be processed and the second vector to be processed is determined based on a preset similarity algorithm; feature matching processing is performed on the basic information of the two users to obtain a feature matching vector; a target vector is obtained by splicing the feature matching vector and the similarity; the target vector is input into the discriminant model to obtain the comprehensive similarity between the two users; and whether to combine the data to be processed of the two users is determined based on the comprehensive similarity. Through comprehensive processing and analysis of the users' health data and basic information, together with regular updating of the relevant models, a more widely applicable user master index matching system is constructed. The system can respond flexibly to data changes and integrate similar user information across systems, which makes subsequent retrieval and use of the information more convenient.
EXAMPLE IV
Fig. 4 is a schematic structural diagram of a data processing apparatus according to a fourth embodiment of the present invention. As shown in fig. 4, the apparatus includes:
a data receiving module 410, configured to receive two sets of data to be processed, the data to be processed comprising health data of two users and basic information of the users;
a similarity calculation module 420, configured to input the health data of each user into the corresponding health data twin network model, obtain first vectors corresponding to each health data, and determine a similarity between the two first vectors;
a comprehensive similarity calculation module 430, configured to input the user basic information and the similarity of the two users into a pre-trained discriminant model, and determine a comprehensive similarity between the two users;
and a decision module 440, configured to determine, based on the comprehensive similarity, whether to combine the data to be processed of the two users or to submit the data to a manual review system for processing.
On the basis of the above technical solution, the similarity calculation module includes:
a vector processing unit, configured to input the health data of one user into the first health data twin model to obtain a first vector to be processed, and input the health data of the other user into the second health data twin model to obtain a second vector to be processed;
and a similarity calculation unit, configured to determine the similarity between the first vector to be processed and the second vector to be processed based on a preset similarity algorithm.
The model structure of the first health data twin model and the second health data twin model is the same.
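A twin arrangement of this kind can be sketched with two branches that apply the same encoder, followed by a similarity measure on the two output vectors. The sketch below makes several assumptions that the description does not confirm: health data is represented as a numeric feature list, each branch is a single linear layer with tanh, both branches share one set of weights (the description only requires the same structure), and cosine similarity is used as the preset similarity algorithm.

```python
import math


def encode(health_record, weights):
    """One branch of the twin model: a linear layer followed by tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, health_record)))
        for row in weights
    ]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical parameters used by both branches (same structure, shared weights).
SHARED_WEIGHTS = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]

# Hypothetical numeric health features for three records.
record_a = [1.0, 0.5, 0.2]
record_b = [1.0, 0.5, 0.2]
record_c = [-1.0, 0.1, 0.9]

sim_ab = cosine_similarity(encode(record_a, SHARED_WEIGHTS), encode(record_b, SHARED_WEIGHTS))
sim_ac = cosine_similarity(encode(record_a, SHARED_WEIGHTS), encode(record_c, SHARED_WEIGHTS))
print(sim_ab, sim_ac)
```

Because both branches apply the same mapping, identical records always produce identical vectors, so their similarity depends only on the data and not on which branch processed which user.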
On the basis of the above technical solution, the comprehensive similarity calculation module further includes:
a feature matching vector calculation unit, configured to perform feature matching processing on the basic information of the two users to obtain a feature matching vector;
a target vector calculation unit, configured to obtain a target vector by splicing the feature matching vector and the similarity;
and a comprehensive similarity calculation unit, configured to input the target vector into the discriminant model to obtain the comprehensive similarity between the two users.
On the basis of the above technical solutions, the data processing apparatus in the embodiment of the present invention further includes a discriminant model training module. The discriminant model training module includes:
a training sample set determining unit, configured to determine the samples required for discriminant model training, wherein the training sample set comprises a plurality of training samples, and the training samples comprise positive samples with corresponding positive labels and negative samples with corresponding negative labels;
an actual output similarity value calculation unit, configured to input, for each training sample, the current training sample into the discriminant model to be trained to obtain a corresponding actual output similarity value;
a loss value determining unit, configured to determine a loss value based on the actual output similarity value and the label of the current training sample, and to correct the model parameters in the discriminant model to be trained based on the loss value;
and a discriminant model determining unit, configured to obtain the discriminant model by taking convergence of the loss function in the discriminant model to be trained as the training target.
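The training procedure carried out by these units can be sketched as a loop that predicts, computes a loss against the label, corrects the parameters, and stops once the loss has converged. The description does not specify the model type, loss function, or optimizer, so the logistic model, binary cross-entropy loss, per-sample gradient updates, and the sample values below are all assumptions made for illustration.

```python
import math


def predict(weights, bias, x):
    """Actual output similarity value for one training sample."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_discriminator(samples, labels, lr=0.5, tol=1e-6, max_epochs=5000):
    """Fit a logistic discriminator until the mean loss stops changing."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    previous_loss = float("inf")
    for _ in range(max_epochs):
        total_loss = 0.0
        for x, y in zip(samples, labels):
            p = predict(weights, bias, x)
            clipped = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
            total_loss -= y * math.log(clipped) + (1 - y) * math.log(1.0 - clipped)
            error = p - y  # gradient of the loss with respect to z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
        mean_loss = total_loss / len(samples)
        if abs(previous_loss - mean_loss) < tol:  # convergence as training target
            break
        previous_loss = mean_loss
    return weights, bias


# Hypothetical target vectors: positive samples (label 1) and negative samples (label 0).
train_x = [[0.9, 1.0, 0.95], [1.0, 0.8, 0.9], [0.2, 0.0, 0.3], [0.1, 0.3, 0.2]]
train_y = [1, 1, 0, 0]

trained_weights, trained_bias = train_discriminator(train_x, train_y)
print(predict(trained_weights, trained_bias, [0.9, 1.0, 0.95]))
print(predict(trained_weights, trained_bias, [0.2, 0.0, 0.3]))
```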
On the basis of the above technical solutions, the data processing apparatus in the embodiment of the present invention further includes a model updating module,
configured to periodically obtain corresponding training samples to update the model parameters of the discriminant model and of the health data twin network model respectively, so as to process data based on the updated discriminant model and health data twin network model.
According to this embodiment of the invention, the data to be processed is received, the data to be processed comprising health data and user basic information of two users. The health data of each user is input into the corresponding health data twin network model to obtain a first vector for each piece of health data, and the similarity between the two first vectors is determined. The user basic information of the two users and the similarity are input into a discriminant model obtained by pre-training to determine the comprehensive similarity between the two users, and whether to combine the data to be processed of the two users is determined based on the comprehensive similarity. Through comprehensive processing and analysis of the basic information and the health data of the users, this technical scheme integrates medical treatment information across multiple institutions, improves the accuracy of data integration, and adapts better to users.
The data processing device provided by the embodiment of the invention can execute the data processing method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
EXAMPLE V
FIG. 5 illustrates a schematic diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 executes the respective methods and processes described above, such as the data processing method in the present embodiment.
In some embodiments, the data processing method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the data processing method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured by any other suitable means (e.g., by means of firmware) to perform the data processing method in the present embodiment.
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A data processing method, comprising:
receiving data to be processed; the data to be processed comprises health data of two users and user basic information;
inputting the health data of each user into a corresponding health data twin network model to obtain first vectors corresponding to each health data, and determining the similarity between the two first vectors;
inputting the user basic information and the similarity of the two users into a discriminant model obtained by pre-training to determine the comprehensive similarity between the two users;
and determining whether to combine the data to be processed of the two users or not based on the comprehensive similarity.
2. The method of claim 1, wherein the inputting the health data of each user into the corresponding health data twin network model, obtaining a first vector corresponding to each health data, and determining a similarity between the two first vectors comprises:
inputting the health data of one user into a first health data twin model to obtain a first vector to be processed, and inputting the health data of the other user into a second health data twin model to obtain a second vector to be processed;
determining the similarity between the first vector to be processed and the second vector to be processed based on a preset similarity algorithm;
wherein the model structure of the first health data twin model and the second health data twin model is the same.
3. The method according to claim 1, wherein the inputting the user basic information and the similarity of each user into a discriminant model obtained by pre-training and determining the comprehensive similarity between the two users comprises:
carrying out feature matching processing on the basic information of the two users to obtain feature matching vectors;
obtaining a target vector by splicing the feature matching vector and the similarity;
and inputting the target vector into the discriminant model to obtain the comprehensive similarity between the two users.
4. The method according to claim 3, wherein the user basic information includes field contents corresponding to a plurality of fields, and the performing feature matching processing on the basic information of two users to obtain a feature matching vector includes:
matching the field contents corresponding to the same field to obtain the matching characteristics corresponding to the corresponding field;
and determining the feature matching vector based on the matching features corresponding to the fields.
5. The method of claim 1, further comprising:
determining a training sample set, wherein the training sample set comprises a plurality of training samples, and the training samples comprise positive samples and corresponding positive labels, negative samples and corresponding negative labels;
for each training sample, inputting the current training sample into a discrimination model to be trained to obtain a corresponding actual output similarity value;
determining a loss value based on the actual output similarity value and the label of the current training sample, and correcting the model parameters in the to-be-trained discrimination model based on the loss value;
and taking the loss function convergence in the discriminant model to be trained as a training target to obtain the discriminant model.
6. The method according to claim 5, wherein the determining whether to perform merging processing on the data to be processed of the two users based on the comprehensive similarity comprises:
if the comprehensive similarity is higher than a first preset similarity threshold, merging the data to be processed of the two users;
if the comprehensive similarity is smaller than a second preset similarity threshold, refusing to combine the data to be processed of the two users;
and if the comprehensive similarity is greater than the second preset similarity threshold and smaller than the first preset similarity threshold, sending the data to be processed of the two users to target equipment so that the data to be processed is audited by the auditing user corresponding to the target equipment.
7. The method of claim 1, further comprising:
and periodically obtaining corresponding training samples to respectively update model parameters in the discrimination model and the health data twin network model so as to process data based on the updated discrimination model and the health data twin network model.
8. A data processing apparatus, characterized in that the apparatus comprises:
a data receiving module, configured to receive two sets of data to be processed, wherein the data to be processed comprises health data of two users and user basic information;
a similarity calculation module, configured to input the health data of each user into the corresponding health data twin network model to obtain first vectors corresponding to each health data, and to determine the similarity between the two first vectors;
a comprehensive similarity calculation module, configured to input the user basic information and the similarity of the two users into a discriminant model obtained by pre-training, and to determine the comprehensive similarity between the two users; and
a decision module, configured to determine whether to combine the data to be processed of the two users based on the comprehensive similarity.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data processing method of any one of claims 1-7.
10. A computer-readable storage medium, characterized in that it stores computer instructions for causing a processor to implement the data processing method of any of claims 1-7 when executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211291571.6A CN115497633B (en) | 2022-10-19 | 2022-10-19 | Data processing method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115497633A true CN115497633A (en) | 2022-12-20 |
CN115497633B CN115497633B (en) | 2024-01-30 |
Family
ID=84473866
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211291571.6A Active CN115497633B (en) | 2022-10-19 | 2022-10-19 | Data processing method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115497633B (en) |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108596277A (en) * | 2018-05-10 | 2018-09-28 | 腾讯科技(深圳)有限公司 | A kind of testing vehicle register identification method, apparatus and storage medium |
WO2019015641A1 (en) * | 2017-07-19 | 2019-01-24 | 阿里巴巴集团控股有限公司 | Model training method and method, apparatus, and device for determining data similarity |
CN110413988A (en) * | 2019-06-17 | 2019-11-05 | 平安科技(深圳)有限公司 | Method, apparatus, server and the storage medium of text information matching measurement |
CN111143604A (en) * | 2019-12-25 | 2020-05-12 | 腾讯音乐娱乐科技(深圳)有限公司 | Audio similarity matching method and device and storage medium |
CN111859986A (en) * | 2020-07-27 | 2020-10-30 | 中国平安人寿保险股份有限公司 | Semantic matching method, device, equipment and medium based on multitask twin network |
CN112559578A (en) * | 2020-12-18 | 2021-03-26 | 深圳赛安特技术服务有限公司 | Data processing method and device, electronic equipment and storage medium |
CN113420847A (en) * | 2021-08-24 | 2021-09-21 | 平安科技(深圳)有限公司 | Target object matching method based on artificial intelligence and related equipment |
US20210342634A1 (en) * | 2020-05-01 | 2021-11-04 | EMC IP Holding Company LLC | Precomputed similarity index of files in data protection systems with neural network |
WO2021253686A1 (en) * | 2020-06-16 | 2021-12-23 | 北京迈格威科技有限公司 | Feature point tracking training and tracking methods, apparatus, electronic device, and storage medium |
CN114020906A (en) * | 2021-10-20 | 2022-02-08 | 杭州电子科技大学 | Chinese medical text information matching method and system based on twin neural network |
CN114490642A (en) * | 2021-12-31 | 2022-05-13 | 上海柯林布瑞信息技术有限公司 | Patient master index generation method, apparatus and medium |
CN114547307A (en) * | 2022-02-25 | 2022-05-27 | 北京沃东天骏信息技术有限公司 | Text vector model training method, text matching method, device and equipment |
CN114625406A (en) * | 2022-03-22 | 2022-06-14 | 深圳壹账通智能科技有限公司 | Application development control method, computer equipment and storage medium |
WO2022134728A1 (en) * | 2020-12-25 | 2022-06-30 | 苏州浪潮智能科技有限公司 | Image retrieval method and system, and device and medium |
CN114782714A (en) * | 2022-02-22 | 2022-07-22 | 北京深睿博联科技有限责任公司 | Image matching method and device based on context information fusion |
WO2022188584A1 (en) * | 2021-03-12 | 2022-09-15 | 京东科技控股股份有限公司 | Similar sentence generation method and apparatus based on pre-trained language model |
Non-Patent Citations (1)
Title |
---|
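Lu Jian; Ma Chengxian; Zhou Yanran; Li Zhe: "Image similarity learning under a two-branch network architecture", Bulletin of Surveying and Mapping (测绘通报), no. 12, pages 54-59 * |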
Also Published As
Publication number | Publication date |
---|---|
CN115497633B (en) | 2024-01-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115794916A (en) | Data processing method, device, equipment and storage medium for multi-source data fusion | |
CN113408280A (en) | Negative example construction method, device, equipment and storage medium | |
CN115497633B (en) | Data processing method, device, equipment and storage medium | |
CN117038099A (en) | Medical term standardization method and device | |
CN117593115A (en) | Feature value determining method, device, equipment and medium of credit risk assessment model | |
CN117273117A (en) | Language model training method, rewarding model training device and electronic equipment | |
CN117076610A (en) | Identification method and device of data sensitive table, electronic equipment and storage medium | |
CN116228301A (en) | Method, device, equipment and medium for determining target user | |
CN113032251B (en) | Method, device and storage medium for determining service quality of application program | |
CN116342164A (en) | Target user group positioning method and device, electronic equipment and storage medium | |
CN115439916A (en) | Face recognition method, apparatus, device and medium | |
CN114999665A (en) | Data processing method and device, electronic equipment and storage medium | |
CN113806541A (en) | Emotion classification method and emotion classification model training method and device | |
CN116089459B (en) | Data retrieval method, device, electronic equipment and storage medium | |
CN114661990B (en) | Method, device, equipment, medium and product for data prediction and model training | |
CN113420227B (en) | Training method of click rate estimation model, click rate estimation method and device | |
CN116361460A (en) | Data integration method and device, storage medium, electronic equipment and product | |
CN117911135A (en) | Data processing method, device, electronic equipment and storage medium | |
CN115935981A (en) | Word segmentation processing method and device, electronic equipment and storage medium | |
CN118364179A (en) | Resource recommendation method, training method and device of resource recommendation model, electronic equipment and medium | |
CN117493785A (en) | Data processing method and device and electronic equipment | |
CN116127948A (en) | Recommendation method and device for text data to be annotated and electronic equipment | |
CN118675638A (en) | Drug response prediction and model training method, device and equipment | |
CN116431809A (en) | Text labeling method, device and storage medium based on bank customer service scene | |
CN116452915A (en) | Image processing method, device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||