CN114004363A - Method, device and system for jointly updating model
- Publication number
- CN114004363A (application CN202111256451.8A)
- Authority
- CN
- China
- Prior art keywords
- model
- synchronized
- submodel
- parameter
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The embodiments of this specification provide a method, a device and a system for jointly updating a model. By the method, device and system provided in the embodiments, based on the composite way in which data is split during joint model updating, the data of the training members is partitioned so as to form a plurality of horizontally split subsystems, where a single subsystem may contain training members whose data is vertically split. A single subsystem whose data is vertically split iterates over the training samples distributed across its training members in order to update the parameters to be synchronized, and data synchronization is performed between the subsystems in synchronization periods triggered by a synchronization condition. The method fully takes into account the data composition of each training member, provides a solution for jointly updating a model over a complex data structure, and helps broaden the range of application of federal learning.
Description
Technical Field
One or more embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method, an apparatus, and a system for jointly updating a model by multiple data parties.
Background
With the development of artificial intelligence technology, machine learning models have gradually been applied in fields such as risk assessment, speech recognition, face recognition and natural language processing. Better model performance requires more training data. In fields such as healthcare and finance, different enterprises or institutions hold different data samples; if these data are trained jointly with a distributed machine learning algorithm, model accuracy improves greatly, bringing substantial economic benefit to the enterprises.
In conventional techniques, federal learning is typically used to jointly train better-performing models with data from multiple data parties. Depending on how the data is distributed across the data parties, federal learning can be divided into two broad categories: horizontally split data and vertically split data. In a horizontal-split scenario, the data parties own the same feature space but different sample spaces; in a vertical-split scenario, the data parties own the same sample space but different feature spaces. However, some multi-party joint machine learning cannot simply be regarded as horizontal splitting or vertical splitting. For example, when federal learning is performed between a financial platform and multiple banks, the relationship between the financial platform and the banks may be a vertical-split scenario, while the relationship among the banks themselves is better described as a horizontal-split scenario. In other words, more complex splitting scenarios exist in practice. How to achieve joint training of the relevant business models in a complex splitting scenario is a technical problem of significant importance in the field of federal learning and worthy of research.
Disclosure of Invention
One or more embodiments of the present specification describe a method and apparatus for jointly updating a model to address one or more of the problems identified in the background.
According to a first aspect, a system for jointly updating a model is provided. The system comprises a federal service side and a plurality of subsystems for jointly updating a model W. A single subsystem i among the plurality of subsystems comprises, among its training members, a first member C_i1 and a second member C_i2; the sample data held by the first member C_i1 and the second member C_i2 form a vertical split, and the sample data held by the respective subsystems form a horizontal split with one another. The single subsystem i corresponds to a local model W_i having the same structure as the model W, and the local model W_i comprises a first submodel W_ci1 arranged on the first member C_i1 and a second submodel W_ci2 arranged on the second member C_i2. The single subsystem i is configured to: perform joint training of the local model W_i in the vertical-split mode using the vertically split training samples on the first member C_i1 and the second member C_i2; provide the federal service side, when a synchronization condition is satisfied, with the updated values of the parameters to be synchronized of the corresponding local model W_i, which correspond one to one to its parameters to be determined; and synchronize the local parameters to be synchronized with those of the respective subsystems according to the synchronization values of the parameters to be synchronized fed back by the federal service side, so as to adjust the corresponding parameters to be determined. The federal service side is configured to securely synchronize the updated values of the parameters to be synchronized from the subsystems and to feed back the synchronized values.
According to a second aspect, a method of jointly updating a model is provided. The method is applicable to the process by which a system for jointly updating a model updates a model W, the system comprising a federal service side and a plurality of subsystems. A single subsystem i among the plurality of subsystems comprises, among its training members, a first member C_i1 and a second member C_i2; the sample data held by the first member C_i1 and the second member C_i2 form a vertical split, and the sample data held by the respective subsystems form a horizontal split with one another. The single subsystem i corresponds to a local model W_i having the same structure as the model W, and the local model W_i comprises a first submodel W_ci1 arranged on the first member C_i1 and a second submodel W_ci2 arranged on the second member C_i2. The method comprises: each subsystem performing joint training of its corresponding local model in the vertical-split mode using the training samples vertically split across its first member and second member, and each training member providing the federal service side, when a synchronization condition is satisfied, with the updated values of the parameters to be synchronized that correspond one to one to the parameters to be determined in its submodel; the federal service side securely synchronizing the updated values of the parameters to be synchronized from the plurality of subsystems and feeding back the synchronization value of each parameter to be synchronized; and each training member in each subsystem receiving the synchronization values of its local parameters to be synchronized so as to update its local parameters to be determined.
In one embodiment, the single subsystem i further comprises a sub-server S_i, and the joint training performed by the single subsystem i on the local model W_i comprises: for several samples of the current round, the first member C_i1 and the second member C_i2 respectively processing the corresponding local sample data through the first submodel W_ci1 and the second submodel W_ci2 to obtain a corresponding first intermediate result R_it1 and second intermediate result R_it2, which are sent to the sub-server S_i; the sub-server S_i processing the first intermediate result R_it1 and the second intermediate result R_it2 based on a third submodel W_si, and feeding back the gradient of the first intermediate result R_it1 to the first member C_i1 and the gradient of the second intermediate result R_it2 to the second member C_i2; and the first member C_i1 and the second member C_i2 respectively using the gradients of the first intermediate result R_it1 and the second intermediate result R_it2 to determine the gradients of the parameters to be determined in the first submodel W_ci1 and the second submodel W_ci2, and thereby respectively determining the updated values of the parameters to be synchronized of the first submodel W_ci1 and the second submodel W_ci2.
In one embodiment, the label holder of the several samples of the current round in the single subsystem i is the first member C_i1 or the second member C_i2. The sub-server S_i processing the first intermediate result R_it1 and the second intermediate result R_it2 based on the third submodel W_si, and feeding back the gradient of the first intermediate result R_it1 to the first member C_i1 and the gradient of the second intermediate result R_it2 to the second member C_i2, further comprises: the sub-server S_i processing the first intermediate result R_it1 and the second intermediate result R_it2 based on the third submodel W_si to obtain a prediction result, and sending the prediction result to the label holder; the label holder determining the corresponding model loss by comparing the label data of the several samples of the current round with the prediction result, and feeding the model loss back to the sub-server S_i; and the sub-server S_i determining the gradients of the first intermediate result R_it1 and the second intermediate result R_it2 from the model loss.
In one embodiment, in the case where the third submodel W_si contains parameters to be determined, the sub-server S_i further determines the gradients of the model loss with respect to the parameters to be determined contained in the third submodel W_si.
In one embodiment, the label holder of the several samples of the current round in the single subsystem i is the first member C_i1 or the second member C_i2, and the label holder is provided with a fourth submodel W_ci3. The sub-server S_i processing the first intermediate result R_it1 and the second intermediate result R_it2 based on the third submodel W_si, and feeding back the gradient of the first intermediate result R_it1 to the first member C_i1 and the gradient of the second intermediate result R_it2 to the second member C_i2, further comprises: the sub-server S_i processing the first intermediate result R_it1 and the second intermediate result R_it2 based on the third submodel W_si to obtain a third intermediate result R_it3, and sending the third intermediate result R_it3 to the label holder; the label holder processing the third intermediate result R_it3 through the fourth submodel W_ci3 to obtain the corresponding prediction result, determining the model loss of the current round based on a comparison of the label data of the several samples of the current round with the prediction result, determining the gradient of the model loss with respect to the third intermediate result R_it3, and feeding it back to the sub-server S_i; and the sub-server S_i determining the gradients of the first intermediate result R_it1 and the second intermediate result R_it2 from the gradient of the third intermediate result R_it3.
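For illustration only, the following Python sketch outlines this variant, in which the label holder owns the fourth submodel W_ci3. The helper names, layer sizes and loss function are assumptions introduced for the example and do not come from the specification; the sketch only shows how the third intermediate result R_it3 travels to the label holder and how its gradient travels back.

```python
# Minimal sketch (assumed names and toy sizes) of the W_ci3 variant: the sub-server
# produces R_it3 from R_it1 and R_it2, the label holder turns R_it3 into a
# prediction, computes the model loss, and returns the gradient of R_it3.
import torch
import torch.nn as nn

w_si = nn.Linear(16, 8)    # third submodel W_si on the sub-server S_i
w_ci3 = nn.Linear(8, 1)    # fourth submodel W_ci3 on the label holder

def sub_server_step(r_it1, r_it2):
    r_it3 = w_si(torch.cat([r_it1, r_it2], dim=1))
    # The detached copy is sent to the label holder; the original is kept so the
    # sub-server can back-propagate once the gradient of R_it3 comes back.
    return r_it3, r_it3.detach().requires_grad_(True)

def label_holder_step(r_it3_received, labels):
    pred = w_ci3(r_it3_received)
    loss = nn.functional.binary_cross_entropy_with_logits(pred, labels)
    loss.backward()                # also yields gradients for W_ci3's own parameters
    return r_it3_received.grad     # gradient of R_it3, fed back to the sub-server S_i
```

On receiving the returned gradient, the sub-server would back-propagate it through its kept copy of R_it3 to obtain the gradients of the first and second intermediate results for the two members.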
In one embodiment, the joint training performed by the subsystem i on the local model W_i comprises: the training members in the subsystem i performing multi-party secure computation, so that each training member determines the gradients of the model loss with respect to its local parameters to be determined; and each training member determining the updated values of its parameters to be synchronized based on the gradients of the parameters to be determined in its submodel, wherein the first member C_i1 and the second member C_i2 respectively determine the updated values of the parameters to be synchronized of the first submodel W_ci1 and the second submodel W_ci2.
In one embodiment, the synchronization condition includes: each local model is updated in a predetermined round or a predetermined time period.
In one embodiment, the single parameter to be synchronized is a single parameter to be determined, or a single gradient corresponding to the single parameter to be determined.
In one embodiment, the federal service side securely synchronizing the updated values of the parameters to be synchronized from the plurality of subsystems comprises: the federal service side receiving the parameters to be synchronized sent by the respective training members and encrypted in a predetermined encryption manner; and the federal service side fusing the respective updated values of the parameters to be synchronized by at least one of addition, weighted averaging and taking the median, to obtain the corresponding synchronization values.
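For illustration only, the following Python sketch shows the fusion step on plaintext values; the function name and the weighting scheme are assumptions, and in practice the inputs would arrive protected by the predetermined encryption manner described below.

```python
# Sketch (assumed names) of fusing the updated values of the parameters to be
# synchronized into synchronization values by summation, weighted averaging or
# taking the element-wise median.
import numpy as np

def fuse_updates(updates, weights=None, mode="weighted_average"):
    """updates: one 1-D array per sender, each listing the updated values of the
    parameters to be synchronized in an agreed order."""
    stacked = np.stack(updates)                  # shape: (num_senders, num_params)
    if mode == "sum":
        return stacked.sum(axis=0)
    if mode == "weighted_average":
        w = np.ones(len(updates)) if weights is None else np.asarray(weights, float)
        w = w / w.sum()                          # e.g. weights based on sample counts
        return (w[:, None] * stacked).sum(axis=0)
    if mode == "median":
        return np.median(stacked, axis=0)
    raise ValueError(f"unknown fusion mode: {mode}")
```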
In one embodiment, the predetermined encryption manner comprises one of the following: adding perturbation that satisfies differential privacy; homomorphic encryption; secret sharing.
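As a concrete illustration of the first option, the sketch below clips an update and adds Gaussian noise before it leaves a training member; the clipping bound and noise scale are illustrative assumptions, and the other two options would instead encrypt or secret-share the update.

```python
# Sketch (assumed names and parameters) of perturbing an update so that the value
# sent to the federal service side satisfies differential privacy.
import numpy as np

def perturb_update(update, clip_norm=1.0, noise_std=0.1, rng=None):
    rng = rng or np.random.default_rng()
    update = np.asarray(update, dtype=float)
    norm = np.linalg.norm(update)
    if norm > clip_norm:
        update = update * (clip_norm / norm)     # bound each member's contribution
    return update + rng.normal(0.0, noise_std * clip_norm, size=update.shape)
```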
According to a third aspect, a method of jointly updating a model is provided. The method is applicable to the process by which a system for jointly updating a model updates a model W, the system comprising a federal service side and a plurality of subsystems. A single subsystem i among the plurality of subsystems comprises, among its training members, a first member C_i1 and a second member C_i2; the sample data held by the first member C_i1 and the second member C_i2 form a vertical split, and the sample data held by the respective subsystems form a horizontal split with one another. The single subsystem i corresponds to a local model W_i having the same structure as the model W, and the local model W_i comprises a first submodel W_ci1 arranged on the first member C_i1 and a second submodel W_ci2 arranged on the second member C_i2. The method is performed by the federal service side and comprises: receiving from each subsystem, when a synchronization condition is satisfied, the updated values of the parameters to be synchronized that correspond one to one to the parameters to be determined in the corresponding submodel, wherein the updated values of the parameters to be synchronized provided by the single subsystem i are determined by the subsystem i through joint training of the corresponding local model W_i in the vertical-split mode; and securely synchronizing the updated values of the parameters to be synchronized from the subsystems and feeding back the synchronization value of each parameter to be synchronized, so that the corresponding training member or sub-server can update the parameters to be determined of its local submodel.
According to a fourth aspect, a method of jointly updating a model is provided. The method is applicable to the process by which a system for jointly updating a model updates a model W, the system comprising a federal service side and a plurality of subsystems. A single subsystem i among the plurality of subsystems comprises, among its training members, a first member C_i1 and a second member C_i2; the sample data held by the first member C_i1 and the second member C_i2 form a vertical split, and the sample data held by the respective subsystems form a horizontal split with one another. The single subsystem i corresponds to a local model W_i having the same structure as the model W, and the local model W_i comprises a first submodel W_ci1 arranged on the first member C_i1 and a second submodel W_ci2 arranged on the second member C_i2. The method is performed by the first member C_i1 and comprises: performing joint training of the corresponding local model W_i using the vertically split training samples formed by local data and data of the second member C_i2, to obtain updated values of the parameters to be synchronized that correspond one to one to the parameters to be determined of the first submodel W_ci1; when a synchronization condition is satisfied, sending the updated values of the parameters to be synchronized corresponding one to one to the parameters to be determined of the first submodel W_ci1 to the federal service side, so that the federal service side securely synchronizes the parameters to be synchronized based on the updated values of the parameters to be synchronized from the subsystems; and obtaining from the federal service side the securely synchronized synchronization values of the parameters to be synchronized of the first submodel W_ci1, so as to update the parameters to be determined in the first submodel W_ci1.
In one embodiment, the subsystem i further comprises a sub-server S_i, and the local model W_i of the single subsystem i further comprises a third submodel W_si arranged on the sub-server S_i. Performing joint training of the corresponding local model W_i using the vertically split training samples formed by local data and data of the second member C_i2 comprises: for several samples of the current round, processing the corresponding local sample data through the first submodel W_ci1 to obtain a corresponding first intermediate result R_it1, and sending it to the sub-server S_i, so that the sub-server S_i processes the first intermediate result R_it1 and a second intermediate result R_it2 based on the third submodel W_si and feeds back the gradient of the first intermediate result R_it1, wherein the second intermediate result R_it2 is obtained by the second member C_i2 by processing its corresponding local sample data through the second submodel W_ci2; and using the fed-back gradient of the first intermediate result R_it1 to determine the gradients of the parameters to be determined of the first submodel W_ci1, thereby determining the updated values of the parameters to be synchronized of the first submodel W_ci1.
In one embodiment, performing joint training of the corresponding local model W_i using the vertically split training samples formed by local data and data of the second member C_i2 comprises: performing multi-party secure computation with the other training members in the subsystem i to determine the gradients of the model loss with respect to the parameters to be determined of the first submodel W_ci1; and determining the corresponding updated values of the parameters to be synchronized based on the gradients of the parameters to be determined of the first submodel W_ci1.
According to a fifth aspect, an apparatus for jointly updating a model is provided. The apparatus is applicable to the federal service side of a system for jointly updating a model, the system comprising the federal service side and a plurality of subsystems for jointly updating a model W. A single subsystem i among the plurality of subsystems comprises, among its training members, a first member C_i1 and a second member C_i2; the sample data held by the first member C_i1 and the second member C_i2 form a vertical split, and the sample data held by the respective subsystems form a horizontal split with one another. The single subsystem i corresponds to a local model W_i having the same structure as the model W, and the local model W_i comprises a first submodel W_ci1 arranged on the first member C_i1 and a second submodel W_ci2 arranged on the second member C_i2. The apparatus comprises:
an obtaining unit configured to receive, from each subsystem when a synchronization condition is satisfied, the updated values of the parameters to be synchronized that correspond one to one to the parameters to be determined in the corresponding local model, wherein the updated values of the parameters to be synchronized provided by a single subsystem i are determined by the subsystem i through joint training of the corresponding local model W_i in the vertical-split mode;
and a synchronization unit configured to securely synchronize the updated values of the parameters to be synchronized received from the plurality of subsystems, and to feed back the synchronization value of each parameter to be synchronized so that the corresponding training member or sub-server can update the parameters to be determined of its local submodel.
According to a sixth aspect, an apparatus for jointly updating a model is provided. The apparatus is applicable to the process by which a system for jointly updating a model updates a model W, the system comprising a federal service side and a plurality of subsystems. A single subsystem i among the plurality of subsystems comprises, among its training members, a first member C_i1 and a second member C_i2; the sample data held by the first member C_i1 and the second member C_i2 form a vertical split, and the sample data held by the respective subsystems form a horizontal split with one another. The single subsystem i corresponds to a local model W_i having the same structure as the model W, and the local model W_i comprises a first submodel W_ci1 arranged on the first member C_i1 and a second submodel W_ci2 arranged on the second member C_i2. The apparatus is arranged on the first member C_i1 and comprises:
a training unit configured to perform joint training of the corresponding local model W_i using the vertically split training samples formed by local data and data of the second member C_i2, to obtain updated values of the parameters to be synchronized that correspond one to one to the parameters to be determined of the first submodel W_ci1;
a providing unit configured to send, when a synchronization condition is satisfied, the updated values of the parameters to be synchronized corresponding one to one to the parameters to be determined of the first submodel W_ci1 to the federal service side, so that the federal service side securely synchronizes the parameters to be synchronized based on the updated values of the parameters to be synchronized received from the subsystems;
and a synchronization unit configured to obtain from the federal service side the securely synchronized synchronization values of the parameters to be synchronized of the first submodel W_ci1, so as to update the parameters to be determined in the first submodel W_ci1.
According to a seventh aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed in a computer, it causes the computer to perform the method of the third aspect or the fourth aspect.
According to an eighth aspect, a computing device is provided, comprising a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method of the third aspect or the fourth aspect.
By means of the method, the apparatus and the system provided by the embodiments of this specification, based on the composite way in which data is split during joint model updating, the data of at least some of the training members is partitioned so as to form a plurality of horizontally split subsystems, where a single subsystem may contain training members whose data is vertically split. A single subsystem whose data is vertically split iterates over the training samples distributed across its training members in order to update the parameters to be synchronized, and data synchronization is performed between the subsystems in synchronization periods triggered by a synchronization condition. This approach fully takes into account the data composition of each training member, provides a solution for jointly updating a model over a complex data structure, and helps broaden the range of application of federal learning.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIGS. 1a and 1b are schematic diagrams of horizontal segmentation and vertical segmentation of data in conventional federated learning, respectively;
FIGS. 2a and 2b are schematic diagrams of the data composite-split scenarios of two specific examples;
FIG. 3a is a schematic diagram of one specific architecture of the system for jointly updating a model based on a data composite-split scenario under the technical concept of the present specification;
FIG. 3b is a schematic diagram of another specific architecture of the system for jointly updating a model based on a data composite-split scenario under the technical concept of the present specification;
FIG. 4a shows a schematic model architecture of a subsystem corresponding to FIG. 3 a;
FIG. 4b shows a schematic model architecture of a subsystem corresponding to FIG. 3 b;
FIG. 5 illustrates a flow diagram of a method of jointly updating a model, according to one embodiment;
FIG. 6 illustrates a timing flow diagram for a joint update model according to one embodiment;
FIG. 7 shows a schematic block diagram of an apparatus for jointly updating a model, according to one embodiment;
FIG. 8 shows a schematic block diagram of an apparatus for jointly updating a model, according to one embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
Federal learning, which may also be referred to as federal machine learning, joint learning or alliance learning, is a distributed machine learning framework that can effectively help multiple organizations use data and perform machine learning modeling while meeting the requirements of user privacy protection, data security and government regulation.
Specifically, suppose that enterprise A and enterprise B each want to build a task model, where an individual task may be classification or prediction, and the necessary user consent was obtained when the data was collected. However, the model at either end may be impossible to build or may perform poorly because the data is incomplete, for example because enterprise A lacks label data, enterprise B lacks user profile data, or the data volume and sample size are insufficient to build a good model. The problem federal learning aims to solve is how to build a high-quality model at each of A and B while the data owned by each enterprise is not disclosed to the other party, that is, how to build a common model without violating data privacy regulations. This common model performs just like an optimal model built by aggregating the parties' data together, while the built model serves only each party's own objective within its own domain.
Federal learning can involve multiple training members and, when necessary, a trusted third party acting as a service side to perform certain auxiliary operations. Each training member may correspond to different business data. The business data may be of various kinds, such as text, pictures, speech, animation and video. Generally, the business data of the training members are correlated.
For example, among a plurality of business parties relating to financial business, business party 1 is a bank that provides savings and loan services to users and may hold data such as users' age, sex, account balance, loan amount and deposit amount; business party 2 is a P2P platform that may hold data such as users' loan records, investment records and repayment due dates; and business party 3 is a shopping website that holds data such as users' shopping habits, payment habits and payment accounts. The business parties holding the bank data, the P2P platform data and the shopping-website data can then act as training members to perform federal learning of a financial risk prediction model. For another example, among a plurality of business parties related to medical services, each business party may be a hospital, a physical examination institution and so on: business party 1 is hospital A, whose local business data are diagnosis records covering users' age, sex, symptoms, diagnosis results, treatment plans, treatment results and the like, and business party 2 may be physical examination institution B, which holds physical examination records covering users' age, sex, symptoms, examination conclusions and the like. The business parties holding the hospital data, the physical examination institution data and the like can then act as training members to perform federal learning of a model such as a disease risk prediction model.
Federal learning generally has two data distribution architectures, a horizontal split architecture and a vertical split architecture. Fig. 1a and 1b show these two distribution architectures, respectively.
Fig. 1a shows a horizontal split architecture. Under the horizontal split architecture, a single sample is held entirely by a single data party, and samples held by different data parties are independent of each other. As in fig. 1a, data party 1 holds the label data of sample 1 (e.g., label 1) and all of its feature data (e.g., features A1 + B1), and data party 2 holds the label data of sample 2 (e.g., label 2) and all of its feature data (e.g., features A2 + B2). Features A and B can be regarded as two classes of features; in practice there may be more classes of features, and a single data party may also hold more sample data, which is not detailed here. As shown in fig. 1a, one row represents one sample; the samples of the respective data parties are completely independent and can be separated along a straight line in the horizontal direction, which is why this architecture is called a horizontal split (or horizontal partition) architecture. For example, each of several bank data parties holds basic feature data such as its users' age and sex (e.g., class-A data), asset-type feature data such as balance, transaction flow, loans and repayments (e.g., class-B data), and label data indicating whether a user is a financial-risk user. That is, the sample data held by different data parties have different sample spaces and the same feature space.
Fig. 1b shows a vertical split architecture. Under the vertical split architecture, the sample data of a single sample is held by multiple data parties, and a single data party holds only part of the data of each sample. As shown in fig. 1b, under the vertical split architecture, data party 1 holds the label data (e.g., label 1, label 2, etc.) and part of the feature data (e.g., class-A feature data, recorded as feature A1, feature A2, etc. for the respective samples), and data party 2 holds another part of the feature data (e.g., class-B feature data, recorded as feature B1, feature B2, etc. for the respective samples). As shown in fig. 1b, a single data party cannot hold a complete sample; one row represents one sample, and the data held by the data parties are combined horizontally to form the complete sample data, or equivalently can be separated along a straight line in the vertical direction, which is why this architecture is called a vertical split (or vertical partition) architecture. For example, data party 1 is a bank and the class-A feature data is asset-type feature data, while data party 2 is a shopping website and the class-B feature data is shopping-type feature data such as users' product browsing records, search records, purchase records and payment channels. That is, the sample data held by different data parties have the same sample space and different feature spaces. In practice, the sample data of a single sample may also include more feature data held by more data parties; fig. 1b is merely an example.
For the horizontal split and vertical split architectures shown in fig. 1a and 1b, conventional federal learning generally proceeds as follows. Under the horizontal split architecture, a parallel synchronous training method can be adopted: the training members have the same neural network structure and train with the assistance of a third party C; during training, the data parties provide data such as gradients or updated values of the parameters to be determined to the third party C, and synchronized gradients or parameters to be determined are computed with the assistance of the third party C and transmitted back to the training members. Under the vertical split scenario, MPC (multi-party secure computation) or split learning is generally adopted. MPC places the machine learning computation in the ciphertext domain. In split learning, the training members hold the first several layers of the overall neural network structure and a server holds the remaining layers; each training member trains locally with its private data to obtain the outputs of the first several layers of the network model, these outputs are transmitted to the server for forward propagation through the remaining layers, and gradient data is then back-propagated from the model loss to update the model.
However, in real business scenarios, not all federal learning can be partitioned according to the horizontal split or vertical split architectures shown in fig. 1a and 1b; the data may involve both horizontal splitting and vertical splitting.
Fig. 2a shows a simple example of a composite split data architecture. It comprises a plurality of horizontally split data parties and one data party that forms a vertical split with those horizontally split data parties. In fig. 2a, data party Ai (i = 1, 2 … n) denotes each data party holding class-X features and label data, such as a bank, where the class-X feature data is, for example, asset-type feature data; data party A1, data party A2, … form a horizontal split with one another. Data party B denotes a data party holding class-Z features, such as a shopping website, where feature Z denotes shopping-type feature data; data party B forms a vertical split architecture with each data party A.
In practice, the way data is split may be even more complicated, as shown in fig. 2b, which covers many possible situations. Fig. 2b shows the data architecture with samples arranged in rows, where each wire frame represents the data of one data party and vertical correspondence (i.e., by column) represents the same features. It can be seen that the data relationships between the data parties are complicated, with horizontal splits and vertical splits nested and crossed. For example, a portion of the data of data party 4 forms a vertical split with data party 6, another portion forms a vertical split with a portion of the data of data party 9, another portion of the data of data party 9 forms a vertical split with a portion of the data of data party 5, while that portion of the data of data party 5 forms a horizontal split with data party 4, and the above data forms a further horizontal split with another portion of the data of data party 5, and so on.
For such a complex data architecture, in which the respective data parties are the training members, conventional federal learning may not be able to train a model jointly. In view of this, the present specification proposes a novel distributed federal learning idea to handle sample data with this hybrid split architecture. Under the technical idea of this specification, the data parties of federal learning are first partitioned. In the situation shown in fig. 2a, sub-servers may correspond one to one to the data parties Ai (as first members), and data party B is divided into a plurality of second members, each containing the sample data of the training samples consistent with the corresponding data party Ai; a single data party Ai, together with the portion of data party B containing the other data of the corresponding sample bodies, can then be regarded as one subsystem. In the situation shown in fig. 2b, the data of the respective data parties is divided into batches along the dashed lines, such as dashed line 201. The data of the batches form horizontal splits with one another, while a single batch may internally present a vertical split, or a data party may individually hold a small amount of complete sample data (such as the sample data held by data parties 11 and 12 in fig. 2b), and so on. In short, the partitioned data contains at least one group of vertically split data parties. Where the data of one data party forms vertical splits with several different data parties, the intersections can be determined by means such as private set intersection (PSI), so as to align and partition the samples. The specific private set intersection method is determined by the business requirements and is not detailed here.
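For illustration only, the sketch below aligns the samples of two members by intersecting salted hashes of their sample identifiers; the names are assumptions, and an actual deployment would use a cryptographic private set intersection protocol so that identifiers outside the intersection are not revealed.

```python
# Sketch (assumed names) of aligning vertically split samples by sample identifier.
import hashlib

def hashed_ids(ids, salt=b"agreed-salt"):
    # Each party hashes its sample identifiers (e.g. ID-card or phone numbers)
    # with a salt agreed in advance before comparing them.
    return {hashlib.sha256(salt + i.encode()).hexdigest(): i for i in ids}

def align_samples(ids_a, ids_b):
    """Return the common sample identifiers in a deterministic order, so that the
    two members later process the same samples in the same sequence."""
    ha, hb = hashed_ids(ids_a), hashed_ids(ids_b)
    return [ha[h] for h in sorted(set(ha) & set(hb))]
```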
With this idea in mind, refer to fig. 3a and fig. 3b, which are schematic diagrams of two specific distributed federal learning architectures of the system for jointly updating a model according to the present disclosure. Divided by function, the system for jointly updating a model comprises: a federal service side, and the subsystems enclosed by dashed boxes such as 3011 and 3012. Each subsystem can be independent in terms of system function and model arrangement; in other words, the subsystems can be viewed as parallel, mutually split "training members". The federal service side can be used to synchronize the parameters to be synchronized of the global model W. A parameter to be synchronized can generally be a parameter to be determined of the model W, or the gradient of such a parameter; the parameters to be synchronized correspond one to one to the parameters to be determined. It is easy to understand that, assuming the number of subsystems is n (n is a positive integer greater than 1), any single subsystem i (1 ≤ i ≤ n) can correspond to a local model W_i consistent in structure with the global model W. With reference to fig. 2a and 2b, depending on the actual data distribution, at least one of the subsystems performs split federal learning on vertically split sample data, for example, in fig. 2b, a subsystem that performs split federal learning on the sample data composed of data party 5, data party 9 and data party 10.
In the architectures shown in fig. 3a and 3b, only the subsystem for performing the split federal learning on the vertically split sample data is shown, and in practice, the architecture may also include a case where the subsystems are separately formed like the data party 11 and the data party 12 in fig. 2b, which is not described herein again.
As can be seen from the schematic diagrams of fig. 2a and 2b, in the system architectures shown in fig. 3a and 3b, a single training member may be one data party or may be a part of a single data party. That is, one data party may be divided into multiple subsystems as respective training members according to the data provided. Thus, a single training member in a single subsystem shown in fig. 3a or fig. 3b may represent one data party, or may represent a portion of one data party, or multiple training members may be from the same data party. As shown in fig. 3a and 3b, the difference is that fig. 3a shows a single subsystem comprising at least two training members, whereas fig. 3b shows a single subsystem comprising a sub-server in addition to at least two training members.
Under the implementation architecture shown in fig. 3a, model training can be performed among the training members by means of multi-party secure computation (MPC). Suppose subsystem i (i = 1, 2 … n) is a subsystem that performs split federal learning on vertically split sample data and has two training members with vertically split data, denoted, for example, C_i1 and C_i2; the local model W_i is divided into several parts, for example comprising a submodel W_ci1 arranged on training member C_i1 and a submodel W_ci2 arranged on training member C_i2. Taking a neural network as an example of the jointly trained model, the distribution of submodels within a subsystem can be as shown in fig. 4a, where the features, weight parameters and neural network of the gray part are held by data party C_i1, and those of the black part are held by data party C_i2. Since the data on training member C_i1 and training member C_i2 form a vertical split and cannot be used for computation by a single data party alone, training member C_i1 and training member C_i2 can exchange data by means such as homomorphic encryption and secret sharing, and thereby jointly train the local model W_i corresponding to subsystem i without disclosing the privacy of their local data.
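For the secret-sharing interaction mentioned above, the following sketch shows the basic additive-sharing primitive with which two members can contribute private values to a joint computation without revealing them; the modulus and helper names are assumptions introduced for the example.

```python
# Sketch (assumed names) of two-party additive secret sharing over an agreed modulus.
import secrets

MOD = 2 ** 61 - 1  # agreed modulus, large enough for the integer-encoded values

def share(value):
    """Split an integer-encoded private value into two additive shares."""
    r = secrets.randbelow(MOD)
    return r, (value - r) % MOD

def reconstruct(share_a, share_b):
    return (share_a + share_b) % MOD

# Example: C_i1 and C_i2 jointly obtain the sum of their private values while each
# sees only one share of the other's value.
a1, a2 = share(123)              # C_i1's value; a2 is sent to C_i2
b1, b2 = share(456)              # C_i2's value; b1 is sent to C_i1
sum_share_1 = (a1 + b1) % MOD    # held by C_i1
sum_share_2 = (a2 + b2) % MOD    # held by C_i2
assert reconstruct(sum_share_1, sum_share_2) == 123 + 456
```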
Under the implementation architecture shown in fig. 3b, subsystem i may correspond to a sub-server, denoted for example S_i, and at least two training members with vertically split data, denoted for example C_i1 and C_i2. Accordingly, the local model W_i is divided into several parts, for example comprising a submodel W_ci1 arranged on training member C_i1, a submodel W_ci2 arranged on training member C_i2, and a submodel W_si arranged on the sub-server S_i. The architecture of the models within subsystem i can then be as shown in fig. 4b: submodels W_ci1 and W_ci2 are connected in parallel and then connected in series with submodel W_si. During split federal learning, submodels W_ci1 and W_ci2 respectively process the portions of the current batch of samples distributed on training member C_i1 and training member C_i2, and the resulting intermediate results are sent to sub-server S_i, where the submodel W_si on the sub-server processes the intermediate results to obtain the prediction result.
For the implementation architecture of fig. 3b, from the device and data attribution perspective: the federal service party and each sub-service party may belong to the same trusted third party, or may be provided in the same distributed device cluster, or may belong to different trusted third parties, which is not limited herein.
It should be noted that fig. 3a and 3b are examples, not exhaustive, of the system for jointly updating the model in the present specification. In practice, the system of jointly updating the models may also be arranged in other ways. For example, some subsystems may contain sub-servers as shown at 3012, some subsystems may not contain sub-servers as shown at 3011, and so on.
In addition, in fig. 3a and fig. 3b, among the n training members C_i1 (i = 1, 2 … n), some or all may be the same data party, or they may belong to different data parties; likewise, among the n training members C_i2 (i = 1, 2 … n), some or all may be the same data party, or they may belong to different data parties. This is not limited here. For example, in the scenario shown in fig. 2a, training members C_12, C_22, …, C_n2 may all denote data party B. It is worth noting that in this case data party B may arrange a plurality of submodels W_ci2, or may arrange a single submodel W_c2 as the submodel W_ci2 shared by the subsystems. For clarity of description and greater generality, the submodels are referred to individually as W_ci2 below, and in some alternative examples the respective submodels W_ci2 may denote the same submodel. Within a single subsystem, training member C_i1 may, for example, be referred to as the first member, and training member C_i2 as the second member.
The model W can be negotiated and determined by the training members. The structure of each local model W_i is consistent with that of W, for example in the number of neural network layers, the weight matrices and the like; each local model may have slight differences depending on its subsystem. Within a subsystem, the distribution of the submodels of the local model W_i over the members can be determined according to the feature dimensions held by each member, which is not detailed here. In an alternative embodiment, the local model W_i is completely consistent with the global model W to be jointly updated; the federal service side can initialize W and send it to each subsystem to obtain W_i. For example, under the architecture of fig. 3a, the federal service side can split W_i into multiple submodels for the respective data parties and assign them to those data parties. Under the architecture shown in fig. 3b, the federal service side initializes W and issues it to each sub-server to obtain W_i; the sub-server S_i splits the model W_i into the submodel W_ci of the training members and the submodel W_si on the sub-server side, and further splits W_ci into W_ci1, W_ci2 and the like according to the feature dimensions of the training members. According to one embodiment, the local model W_i can be modified on the basis of W according to actual business requirements, for example by adding constant-term parameters, and the sub-server S_i may also split W_i after such modification.
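For illustration only, the sketch below shows how a sub-server might carve the agreed local model W_i into the member-side submodels and the server-side submodel; the layer sizes and the column-wise division of the bottom layer by feature dimensions are assumptions for the example.

```python
# Sketch (assumed names and sizes) of splitting W_i into W_ci1, W_ci2 and W_si
# according to the feature dimensions held by C_i1 and C_i2.
import torch.nn as nn

def split_local_model(feature_dims=(6, 4), hidden=8, out=1):
    d1, d2 = feature_dims
    w_ci1 = nn.Linear(d1, hidden)        # bottom slice for C_i1's features
    w_ci2 = nn.Linear(d2, hidden)        # bottom slice for C_i2's features
    w_si = nn.Sequential(                # remaining layers kept on the sub-server
        nn.ReLU(),
        nn.Linear(2 * hidden, out),
    )
    return w_ci1, w_ci2, w_si
```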
Further, in terms of data interaction inside a subsystem: under the implementation architecture of fig. 3a, the data parties inside a single subsystem interact securely with one another; under the implementation architecture of fig. 3b, each member may interact with the sub-server while remaining independent of the other members, for example, after processing its local data with its local submodel, each member may provide the intermediate processing result to the sub-server, and the sub-server may feed back the gradient data of the intermediate result to each member of the subsystem. For the data to be synchronized: each training member of a subsystem in fig. 3a may interact with the federal service side, and each subsystem and each training member in fig. 3b may interact with the federal service side. As shown by the dashed two-way arrows in fig. 3a and fig. 3b, each sub-server and/or each training member may send the updated values of its local parameters to be synchronized to the federal service side, and the federal service side feeds the synchronized values of the parameters to be synchronized back to them.
It should be noted that, in fig. 3a or fig. 3b, if the architecture corresponds to fig. 2a and the plurality of submodels W_ci2 are arranged on a single data party B as a whole, or are the same submodel on data party B, then data party B can complete the related parameter synchronization locally, without going through the federal service side.
Through the distributed arrangement of service sides, this system architecture partitions the data parties of a hybrid architecture according to the training mode, forming a horizontal split architecture between subsystems and a vertical split architecture inside a subsystem, so that the overall federal learning system and the internal sub-federal-learning systems cooperate with one another. It combines the split learning algorithm with the parallel synchronous learning algorithm, realizes distributed training of a model in a composite vertical-and-horizontal scenario, and provides a corresponding solution for more complex federal learning scenarios.
The technical idea of the present specification will be described in detail below by taking as an example the operation performed by the system of jointly updating models shown in fig. 3a or fig. 3b within one parameter synchronization period.
It is understood that in a system for jointly updating the model, the executed process may include several small loops inside the subsystem and one large loop of the whole system in one parameter synchronization period. FIG. 5 illustrates a flow diagram of a joint update model according to an embodiment of the present description.
As shown in fig. 5, the process includes the following steps. Step 501: each subsystem performs joint training of its corresponding local model in the vertical-split mode using the training samples vertically split across its first member and second member, and, when the synchronization condition is satisfied, each training member provides the federal service side with the updated values of the parameters to be synchronized that correspond one to one to the parameters to be determined in its submodel. Step 502: the federal service side securely synchronizes the updated values of the parameters to be synchronized from the plurality of subsystems and feeds back the synchronization value of each parameter to be synchronized. Step 503: each training member in each subsystem receives the synchronization values of its local parameters to be synchronized so as to update its local parameters to be determined.
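For illustration only, the following pseudocode-style Python sketch restates these three steps as one synchronization period; every helper name (next_batches, vertical_split_round, secure_fuse and so on) is an assumption introduced for the example rather than an interface defined in this specification.

```python
# Sketch (assumed interfaces) of one synchronization period of the overall system.
def synchronization_period(subsystems, federal_server, local_rounds=10):
    # Step 501: each subsystem trains its local model W_i in the vertical-split
    # mode until the synchronization condition (here, a fixed round count) is met.
    updates = []
    for subsystem in subsystems:
        for batch in subsystem.next_batches(local_rounds):
            subsystem.vertical_split_round(batch)     # forward + backward inside i
        updates.append(subsystem.collect_sync_parameters())
    # Step 502: the federal service side securely fuses the updated values of the
    # parameters to be synchronized and feeds back the synchronization values.
    sync_values = federal_server.secure_fuse(updates)
    # Step 503: each training member applies the synchronization values of its own
    # local parameters to be synchronized, updating its parameters to be determined.
    for subsystem in subsystems:
        subsystem.apply_sync_parameters(sync_values)
```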
First, in step 501, each subsystem performs joint training of its corresponding local model in the vertical-split mode using the training samples vertically split across its first member and second member, and, when the synchronization condition is satisfied, each training member provides the federal service side with the updated values of the parameters to be synchronized that correspond one to one to the parameters to be determined in its submodel.
It can be understood that the federal learning in the vertical slicing mode is performed for the corresponding sub-models by using the vertically sliced training samples. The training architecture in this manner is shown in fig. 4a or fig. 4 b. In a training period of the subsystem, aiming at a batch of training samples, the undetermined parameters in the sub-model can be updated through forward transmission and backward propagation of data among training members or between the training members and the sub-server.
As shown in fig. 4a, under the framework shown in fig. 3a, since there is no assistance from the sub-server, forward data transfer and backward gradient propagation are completed between training members through multi-party secure computation, and each training member obtains the gradient of the undetermined parameter included in the corresponding sub-model.
To describe the technical solution of the present specification more clearly, fig. 6 shows a timing chart of operations performed by any vertically-sliced subsystem (such as the system shown in fig. 4 b) in the system of the joint update model corresponding to fig. 3b in one parameter synchronization period. The dotted line 601 part is a processing flow of federal learning in a vertical segmentation mode of the subsystem. The operation in this step 501 is described below in conjunction with the step shown in the dashed box S601 in fig. 6.
As shown in the dashed box S601 of fig. 6, in S6011, for several samples of the current round, the first member and the second member respectively process their local data through their local submodels and deliver the intermediate results to the sub-server. Taking subsystem i as an example, the first member C_i1 and the second member C_i2 respectively process their corresponding local sample data through the first submodel W_ci1 and the second submodel W_ci2, obtain the corresponding first intermediate result R_it1 and second intermediate result R_it2, and send them to the sub-server S_i.
The sample data processed by the first member C_i1 and the second member C_i2 correspond one to one, that is, the sample bodies are consistent, for example, local data is processed such that the same position in the sequence corresponds to the same sample identifier. The sample body is determined by the specific business scenario; for example, if the sample body is a user, the sample identifier may uniquely mark that sample body, such as an identity card number or a mobile phone number. When the first member C_i1 and the second member C_i2 are partitioned into the subsystem, the sample data can be determined by means such as private set intersection. During the current federal learning process, the order of the sample data of the current batch can be determined by the sub-server S_i, or agreed upon between the first member C_i1 and the second member C_i2 in another manner; the members then perform the corresponding data processing through the first submodel W_ci1 and the second submodel W_ci2 in that order, obtaining their respective intermediate results, so as to ensure that the first intermediate result and the second intermediate result of each training sample correspond to each other. The individual data in an intermediate result may be delivered to the sub-server S_i in a form with an agreed order, such as a vector or an array. As can be seen from fig. 6, the data processing and the sending of the intermediate results by the first member C_i1 and the second member C_i2 are independent of each other, which helps ensure that private data is not leaked. To strengthen privacy protection when sending the intermediate results, the first member C_i1 and the second member C_i2 may add perturbation to the intermediate results by means such as differential privacy, or encrypt the intermediate results by means such as homomorphic encryption and secret sharing.
Further, in S6012, the sub-server Si processes the first intermediate result Rit1 and the second intermediate result Rit2 based on the sub-model Wsi, and feeds back the gradient of the first intermediate result Rit1 to the first member Ci1 and the gradient of the second intermediate result Rit2 to the second member Ci2. In order to determine these gradients, the prediction result for the training samples needs to be determined first. The prediction result is the service result of the service to be predicted, and its accuracy can be checked against the sample labels. The sample labels may be held jointly by some of the training members in the current subsystem; it is also possible that the training members provide the sample labels to the sub-server Si, or that the sample labels are held by the sub-server Si itself.
In one embodiment, the sub-server Si can obtain the sample labels. In that case the prediction result can be determined by the sub-server, which then compares the sample labels with the prediction result to determine the model loss.
In another embodiment, the sample label may also be held by a portion of the training members.
In one case, as shown in fig. 6, the sub-server Si may send the prediction result to the training member that holds the sample labels (such as the first member Ci1), and that training member compares the sample labels with the prediction result, thereby determining the model loss and feeding it back to the sub-server Si. Fig. 6 shows only one of the possible ways of determining the model loss, so the corresponding timing is drawn with a dotted line. In the case where several training members each hold part of the sample labels, the sub-server Si may send the prediction results of the corresponding training samples to each such member for comparison, which is not described again here. The model loss is set according to the specific service scenario and may take various forms such as mean square error, cross entropy, or cosine similarity. Generally, the current model loss is the sum of the model losses caused by the current batch of training samples, and the number of training samples in the current batch may be 1, or several (e.g., 100), and so on. The sub-server Si can then determine the gradient of each intermediate result from the model loss, i.e., the partial derivative of the model loss with respect to that intermediate result. By the definition of partial derivatives, gradients are transitive; to determine the gradient of each pending parameter in the sub-model of each training member, the gradient of the intermediate result determined by that pending parameter can be obtained first. Thus, the sub-server Si can determine the gradients of the first intermediate result and the second intermediate result corresponding to the first member and the second member respectively, and return them to the corresponding training members. Optionally, when the sub-model Wsi itself contains pending parameters, the sub-server Si may also locally determine the gradients of the pending parameters of the sub-model Wsi, so as to update those local parameters.
In another case, the label holder may further be provided with a third submodel Wci3. The sub-server Si, based on the sub-model Wsi, processes the first intermediate result Rit1 and the second intermediate result Rit2 to obtain a third intermediate result Rit3, and sends the third intermediate result Rit3 to the label holder, which processes it through the third submodel Wci3 to determine the prediction result for the training samples. The label holder then calculates the model loss according to the prediction result, determines the gradient of the third intermediate result, and transmits the model loss and the gradient of the third intermediate result Rit3 back to the sub-server Si. The sub-server Si further determines the gradients of the model loss with respect to the first intermediate result Rit1 and the second intermediate result Rit2.
In other cases, the model may have other architectures and the gradients may be determined in other ways, which are not described again here. In summary, the gradient has the property of back propagation, and the gradients of the model loss with respect to the first intermediate result and the second intermediate result can be determined with the aid of the corresponding sub-server.
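As a minimal sketch of the sub-server step S6012 described above, assuming the sub-model Wsi is a single linear layer with a mean-square-error loss and that the labels are available at the sub-server, one possible computation of the gradients of the intermediate results is as follows (shapes and names are illustrative assumptions, not the prescribed implementation):

```python
import numpy as np

def server_forward_backward(R1, R2, Ws, y):
    """Sub-server step S6012, sketched for a linear sub-model with MSE loss.

    R1, R2 : intermediate results from the first / second member, shapes (n, d1), (n, d2)
    Ws     : pending parameters of the sub-server's sub-model, shape (d1 + d2, 1)
    y      : labels of the current batch (assumed here to be available to the
             sub-server), shape (n, 1)
    Returns the model loss, the gradients w.r.t. R1 and R2 (to be fed back to
    the members), and the gradient w.r.t. Ws (for the sub-server's own update).
    """
    Z = np.concatenate([R1, R2], axis=1)   # fuse the two intermediate results
    pred = Z @ Ws                          # prediction for the business target
    loss = np.mean((pred - y) ** 2)        # model loss over the current batch

    dpred = 2.0 * (pred - y) / len(y)      # dL/dpred
    dZ = dpred @ Ws.T                      # dL/dZ, split below per member
    dR1, dR2 = dZ[:, :R1.shape[1]], dZ[:, R1.shape[1]:]
    dWs = Z.T @ dpred                      # gradient of the sub-server's parameters
    return loss, dR1, dR2, dWs
```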
On the other hand, in S6013, the first member Ci1 and the second member Ci2 respectively use the gradients of the first intermediate result Rit1 and the second intermediate result Rit2 to determine the gradients of the pending parameters in the first submodel Wci1 and the second submodel Wci2, so as to locally update the parameters of the first submodel Wci1 and the second submodel Wci2 respectively.
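Continuing the same toy assumptions, the following sketch shows how a training member in step S6013 might turn the fed-back gradient of its intermediate result into gradients of its own pending parameters via the chain rule, for a linear member sub-model R = X_local @ Wc:

```python
import numpy as np

def member_backward(X_local, dR, Wc, lr=0.1):
    """Step S6013, sketched for a linear member sub-model R = X_local @ Wc.

    X_local : the member's private features for the batch, shape (n, f)
    dR      : gradient of the model loss w.r.t. this member's intermediate
              result, as fed back by the sub-server, shape (n, d)
    Returns the gradient of the pending parameters and a locally updated copy.
    """
    dWc = X_local.T @ dR        # chain rule: dL/dWc = X_local^T . dL/dR
    Wc_new = Wc - lr * dWc      # plain gradient-descent style local update
    return dWc, Wc_new
```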
It is worth noting that, in an update iteration of subsystem i, the quantities updated by each participant (each training member in the exemplary architecture of fig. 3a, or the sub-server and each member in the exemplary architecture of fig. 3b) may generally include the current gradients of the pending parameters, and the pending parameters updated according to those gradients. When the subsystem does not need to perform parameter synchronization with other subsystems, each participant can update the gradients and the pending parameters in sequence; when parameter synchronization with other subsystems is performed, each participant determines the update values of the parameters to be synchronized and sends them to the federal service side. A parameter to be synchronized is understood to be a parameter that needs to be synchronized between the subsystems; it may be either a gradient or a pending parameter.
In the process of jointly updating the model W, a condition for parameter synchronization may be preset, for example a predetermined number of iteration rounds (e.g., 5 rounds) or a predetermined time (e.g., 3 minutes). When the synchronization condition is satisfied, each subsystem may stop iterating and send the update values of its currently determined parameters to the federal service side. Under the architecture shown in fig. 3b, as shown in fig. 6, in the case where the sub-server also has parameters to be synchronized, the sub-server and the training members together send the update values of their local parameters to be synchronized to the federal service side. Since the sub-server may have no parameter to be synchronized, its upload flow in fig. 6 is drawn with a dotted line, representing an option that depends on the actual business situation. It will be appreciated that under the architecture shown in fig. 3a the parameter uploading process is similar to that of fig. 6, except that there is no sub-server, so only the training members upload the parameters to be synchronized to the federal service side. In the parameter uploading process, each participant sends only its local parameters, so data privacy is effectively protected. Optionally, each participant may further protect the local parameters by adding perturbation (e.g., perturbation data satisfying differential privacy) or by encrypting them, for example with homomorphic encryption, before sending them to the federal service side.
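Purely as an illustration of the optional perturbation mentioned above, the sketch below clips an update value and adds Gaussian noise before upload; the clipping threshold and noise scale are placeholder assumptions and would have to be chosen to meet an actual differential-privacy budget:

```python
import numpy as np

def perturb_update(update_value, clip_norm=1.0, sigma=0.5, rng=None):
    """Clip an update value and add Gaussian perturbation before uploading it.

    clip_norm and sigma are placeholder values; the pair that actually meets a
    chosen differential-privacy budget must be derived separately.
    """
    rng = rng or np.random.default_rng()
    update_value = np.asarray(update_value, dtype=float)
    norm = np.linalg.norm(update_value)
    clipped = update_value * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, sigma * clip_norm, size=update_value.shape)
    return clipped + noise
```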
In this way, the synchronization condition controls the parameter synchronization period of each subsystem in the process of jointly updating the model. Within a single synchronization period, one or more iterations may be performed inside a single subsystem, and the current parameters to be synchronized are fed back to the federal service side when the synchronization period arrives. In each single iteration, every training member processes only the data it holds, so data privacy is effectively protected.
Next, in step 502, the federal service side performs secure synchronization on the update values of the parameters to be synchronized received from the multiple subsystems, and feeds back the synchronization values of the parameters to be synchronized.
Referring to fig. 6, which includes the sub-server architecture, in S602 the federal service side fuses, for each pending parameter, the update values of the corresponding parameter to be synchronized received from the subsystems. For a single parameter to be synchronized there may be update values sent from several subsystems; the federal service side may take their mean, median, maximum, minimum, and so on, to obtain the synchronization value of that parameter (the synchronized parameter in fig. 6). When the parameter to be synchronized is a gradient, the synchronization value of the gradient may also be determined by summation. The federal service side then feeds back the synchronization value of each parameter to be synchronized to the corresponding participant. Taking the first member Ci1 as an example, for a first parameter among the parameters to be synchronized that correspond to the first member Ci1, the first member Ci1 sends the update value of the first parameter to the federal service side; after the federal service side synchronizes the first parameter using the update values of the first parameter fed back by the participants in other subsystems to obtain a first synchronization value, it feeds the first synchronization value back to the first member Ci1. In this way, each parameter to be synchronized is transmitted only between the related participant and the trusted federal service side, and data privacy is guaranteed.
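A minimal sketch of this fusion step at the federal service side, assuming the update values arrive in the clear and are fused per parameter by a mean, median, or sum:

```python
import numpy as np

def synchronize(update_values, mode="mean"):
    """Fuse the update values of one parameter to be synchronized.

    update_values : list of arrays, one update value per subsystem, all
                    referring to the same pending parameter.
    mode          : "mean", "median" or "sum" ("sum" being natural when the
                    parameter to be synchronized is a gradient).
    """
    stacked = np.stack(update_values)
    if mode == "mean":
        return stacked.mean(axis=0)
    if mode == "median":
        return np.median(stacked, axis=0)
    if mode == "sum":
        return stacked.sum(axis=0)
    raise ValueError(f"unknown fusion mode: {mode}")
```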
In addition, when the participants of a subsystem encrypt the parameters to be synchronized, for example with homomorphic encryption, the federal service side can also perform the secure synchronization of the data in the encrypted domain. The federal service side may feed back the synchronization values in the clear to the participants of each subsystem, or feed them back in the corresponding encrypted form, which is not limited in this specification.
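For the secret-sharing option in particular, the toy sketch below (over real numbers rather than a finite field, so it is an illustration only and not a secure protocol) shows the core idea: each participant splits its update into additive shares, and only share sums, from which the aggregate can be reconstructed, are ever revealed:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_shares(update, n_parties, rng):
    """Split one update value into n_parties additive shares that sum back to it."""
    shares = [rng.normal(size=update.shape) for _ in range(n_parties - 1)]
    shares.append(update - sum(shares))
    return shares

# Toy update values held by three participants for the same pending parameter.
updates = [np.array([1.0, 2.0]), np.array([3.0, -1.0]), np.array([0.5, 0.5])]

# Each participant splits its update; share j of every update is routed to party j.
all_shares = [make_shares(u, len(updates), rng) for u in updates]
masked_sums = [sum(all_shares[i][j] for i in range(len(updates)))
               for j in range(len(updates))]

# Each masked sum alone reveals nothing about an individual update,
# yet their total equals the exact sum of all updates.
aggregate = sum(masked_sums)
assert np.allclose(aggregate, sum(updates))
```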
Then, in step 503, the corresponding sub-server and each member in each subsystem receive the synchronization values of their local parameters to be synchronized, so as to update the local pending parameters. The participants corresponding to a single subsystem i include at least the first member Ci1 and the second member Ci2, and under some architectures may also include the sub-server Si. In this step 503, as shown by S603 in fig. 6, a single participant receives only the synchronization values of the parameters to be synchronized in its locally relevant sub-model. It can be understood that, in the implementation architecture of the present specification, a subsystem may or may not include a sub-server; even when a sub-server is included, the sub-server Si may or may not have corresponding parameters to be synchronized. Therefore, in S603 of fig. 6, the flow related to the sub-server Si is drawn with a dotted line, and it exists only when the subsystem includes the sub-server Si and the sub-server has corresponding parameters to be synchronized.
When the parameter to be synchronized is a pending parameter, a single participant may directly update the local pending parameter with the synchronization value. When the parameter to be synchronized is the gradient of a pending parameter, the synchronization value can be taken as the current gradient, and the corresponding pending parameter can be updated with a gradient-based method such as gradient descent or Newton's method.
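A small sketch of this local update rule, assuming plain gradient descent when the parameter to be synchronized is a gradient (any other gradient-based rule such as Newton's method could be substituted):

```python
import numpy as np

def apply_sync_value(pending_param, sync_value, is_gradient=True, lr=0.1):
    """Update a local pending parameter from the fed-back synchronization value.

    If the parameter to be synchronized was the pending parameter itself, the
    synchronization value can replace it directly; if it was the gradient, a
    gradient-descent step with a (hypothetical) learning rate lr is taken.
    """
    sync_value = np.asarray(sync_value)
    if is_gradient:
        return pending_param - lr * sync_value
    return sync_value.copy()
```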
It should be noted that the embodiments of fig. 5 and fig. 6 are described by taking only the first member and the second member included in a single subsystem as an example. Under more complex compound architectures there may be more members, for example a third member, a fourth member, and so on. In practice, a system for jointly updating a model over mix-segmented data only needs at least one subsystem that contains at least two training members whose sample data constitute a vertical segmentation; any system for jointly updating a model whose data can be segmented into at least one such subsystem can apply the technical solution provided in the present specification.
It is to be understood that the systems shown in fig. 3a and fig. 3b perform the joint model update through the flow described in fig. 5, and the flow described in fig. 5 is a flow using the system shown in fig. 3a or fig. 3b, so the related descriptions of the systems in fig. 3a and fig. 3b and of the flow in fig. 5 apply to each other. In particular, fig. 6 is a flow corresponding to the architecture of fig. 3b, so the related descriptions of fig. 3b and fig. 6 also apply to each other and are not repeated here.
In addition, from the perspective of the federal service side, the flow of jointly updating a model provided in an embodiment of the present specification may include: receiving, from each subsystem when the synchronization condition is satisfied, the update values of the parameters to be synchronized that correspond one to one to the pending parameters in the corresponding sub-models, wherein, in a single subsystem i, the update values of the parameters to be synchronized of the sub-model Wsi, the first submodel Wci1 and the second submodel Wci2 in the local model Wi are respectively provided by the corresponding participants (the first member Ci1, the second member Ci2 and, where present, the sub-server Si), and the update values of the parameters to be synchronized of the local model Wi are determined by the joint training performed by subsystem i in the vertical segmentation mode for the corresponding sub-models; and securely synchronizing the update values of the parameters to be synchronized received from the several subsystems, and feeding back the synchronization value of each parameter to be synchronized, so that the corresponding training member or sub-server completes the update of the pending parameters of its local sub-model.
In an implementation architecture including a sub-server, from the execution perspective of the sub-server Si, the flow of jointly updating a model according to one embodiment includes: together with the corresponding first member Ci1 and second member Ci2, using the training samples that constitute a vertical segmentation on the first member Ci1 and the second member Ci2 to perform federal learning in the vertical segmentation mode for the corresponding local model Wi; when the synchronization condition is satisfied, sending the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the sub-model Wsi to the federal service side, so that the federal service side securely synchronizes each parameter to be synchronized based on the update values received from the several subsystems; and obtaining from the federal service side the securely synchronized synchronization values of the parameters to be synchronized of the sub-model Wsi, so as to update the pending parameters of the sub-model Wsi.
From the execution perspective of the first member Ci1, the flow of jointly updating a model according to one embodiment includes: using the training samples that constitute a vertical segmentation locally and on the second member Ci2 to perform joint training for the corresponding local model Wi, obtaining the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the first submodel Wci1; when the synchronization condition is satisfied, sending the update values of the parameters to be synchronized corresponding to the pending parameters of the first submodel Wci1 to the federal service side, so that the federal service side securely synchronizes each parameter to be synchronized based on the update values received from the subsystems; and obtaining from the federal service side the securely synchronized synchronization values of the parameters to be synchronized of the first submodel Wci1, so as to update the pending parameters of the first submodel Wci1.
From the execution perspective of the second member Ci2, the flow of jointly updating a model according to one embodiment includes: using the training samples that constitute a vertical segmentation locally and on the first member Ci1 to perform joint training for the corresponding local model Wi, obtaining the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the second submodel Wci2; when the synchronization condition is satisfied, sending the update values of the parameters to be synchronized corresponding to the pending parameters of the second submodel Wci2 to the federal service side, so that the federal service side securely synchronizes each parameter to be synchronized based on the update values received from the subsystems; and obtaining from the federal service side the securely synchronized synchronization values of the parameters to be synchronized of the second submodel Wci2, so as to update the pending parameters of the second submodel Wci2.
It should be noted that, since the operations performed by the first member and the second member are of the same kind, the names "first member" and "second member" in this specification do not substantively distinguish the corresponding training members; they merely serve as identifiers, so that the operations described for one member also apply to the other.
Reviewing the above flow, the present specification provides a concept of jointly updating a model for scenarios where the data are compound-segmented. The compound segmentation of the data may comprise both horizontal segmentation and vertical segmentation, so that conventional federal learning cannot be applied directly. In view of this, the present specification conceives of grouping the data of the training members into a plurality of horizontally segmented subsystems, where a single subsystem may contain training members whose data are vertically segmented. In this way, a single subsystem whose data are vertically segmented iterates over the training samples distributed among its training members to update the parameters to be synchronized, and the subsystems synchronize data with each other in synchronization periods triggered by the synchronization condition. This scheme fully considers the data composition of each training member, provides a solution for jointly updating a model under a complex data structure, and is beneficial to expanding the application range of federal learning.
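To illustrate only the two-level structure just reviewed, inner local iterations per subsystem followed by periodic synchronization across subsystems, the following self-contained toy simulation treats each subsystem as a plain local learner (abstracting away the vertical interaction sketched earlier) and synchronizes through a simple mean; all data, hyper-parameters, and round counts are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0])             # hidden "ground truth" of model W

def make_subsystem(n):                     # each subsystem holds its own samples
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    return X, y

subsystems = [make_subsystem(50), make_subsystem(80)]
w = np.zeros(2)                            # pending parameters, shared structure

for sync_period in range(20):              # outer loop: one parameter synchronization period
    local_updates = []
    for X, y in subsystems:                # each subsystem iterates locally ...
        w_i = w.copy()
        for _ in range(5):                 # ... until the synchronization condition (5 rounds) is met
            grad = 2.0 * X.T @ (X @ w_i - y) / len(y)
            w_i -= 0.05 * grad
        local_updates.append(w_i)          # update value of the parameters to be synchronized
    w = np.mean(local_updates, axis=0)     # federal service side: fuse and feed back

print("synchronized parameters:", w)       # approaches true_w
```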
According to embodiments of another aspect, corresponding apparatuses adapted to the above system for jointly updating a model are also provided. The apparatuses for jointly updating a model may include an apparatus provided at the federal service side, an apparatus provided at a sub-server, and an apparatus provided at a first member or a second member. Fig. 7 shows a block diagram of the apparatus provided at the federal service side, and fig. 8 shows a block diagram of the apparatus common to a sub-server, a first member, or a second member.
As shown in fig. 7, the apparatus 700 includes an obtaining unit 71 and a synchronization unit 72. The obtaining unit 71 is configured to receive, from each subsystem when the synchronization condition is satisfied, the update values of the parameters to be synchronized that correspond one to one to the pending parameters in the corresponding local model, where the update values provided by a single subsystem i are determined by the joint training performed by subsystem i in the vertical segmentation mode for the corresponding sub-models. The synchronization unit 72 is configured to securely synchronize the update values of the parameters to be synchronized received from the several subsystems, and to feed back the synchronization value of each parameter to be synchronized so that the corresponding training member or sub-server completes the update of the pending parameters of its local sub-model.
As shown in fig. 8, the apparatus 800 includes a training unit 81, a providing unit 82, and a synchronizing unit 83.
In the case where the apparatus 800 is an apparatus for jointly updating a model provided at the sub-server Si: the training unit 81 is configured to, together with the corresponding first member Ci1 and second member Ci2, use the training samples that constitute a vertical segmentation on the first member Ci1 and the second member Ci2 to perform joint training (also called federal learning) in the vertical segmentation mode for the corresponding local model Wi; the providing unit 82 is configured to, when the synchronization condition is satisfied, send the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the sub-model Wsi to the federal service side, so that the federal service side securely synchronizes each parameter to be synchronized based on the update values received from the several subsystems; the synchronization unit 83 is configured to obtain from the federal service side the securely synchronized synchronization values of the parameters to be synchronized of the sub-model Wsi, so as to update the pending parameters of the sub-model Wsi.
In the case where the apparatus 800 is an apparatus for jointly updating a model provided at the first member Ci1 (or the second member Ci2): the training unit 81 is configured to use the training samples that constitute a vertical segmentation locally and on the second member Ci2 (or the first member Ci1) to perform federal learning in the vertical segmentation mode for the corresponding local model Wi; the providing unit 82 is configured to, when the synchronization condition is satisfied, send the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the first submodel Wci1 (or the second submodel Wci2) to the federal service side, so that the federal service side securely synchronizes each parameter to be synchronized based on the update values received from the subsystems; the synchronization unit 83 is configured to obtain from the federal service side the securely synchronized synchronization values of the parameters to be synchronized of the first submodel Wci1 (or the second submodel Wci2), so as to update the pending parameters of the first submodel Wci1 (or the second submodel Wci2).
It should be noted that the apparatus 700 shown in fig. 7 and the apparatus 800 shown in fig. 8 are apparatus embodiments corresponding, respectively, to the federal service side and a training member (or sub-server) in the method embodiment shown in fig. 5, and implement the functions of the corresponding business party. Therefore, the corresponding descriptions in the method embodiment of fig. 5 also apply to the apparatus 700 and the apparatus 800 and are not repeated here.
According to an embodiment of another aspect, a computer-readable storage medium is further provided, on which a computer program is stored, which, when executed in a computer, causes the computer to perform operations corresponding to any one of the participants of the method described in conjunction with fig. 5 and 6.
According to an embodiment of still another aspect, a computing device is further provided, which includes a memory and a processor, where the memory stores executable codes, and when the processor executes the executable codes, the processor implements operations corresponding to any one of the participants in the methods described in conjunction with fig. 5 and fig. 6.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in the embodiments of this specification may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above embodiments are only intended to be specific embodiments of the technical concept of the present disclosure, and should not be used to limit the scope of the technical concept of the present disclosure, and any modification, equivalent replacement, improvement, etc. made on the basis of the technical concept of the embodiments of the present disclosure should be included in the scope of the technical concept of the present disclosure.
Claims (19)
1. A system for jointly updating a model, comprising a federal service side and a plurality of subsystems, wherein the plurality of subsystems are used for jointly updating a model W; a single subsystem i among the plurality of subsystems comprises, among its training members, a first member Ci1 and a second member Ci2; the sample data held by the first member Ci1 and the second member Ci2 constitute a vertical segmentation, and the sample data held by the respective subsystems constitute a horizontal segmentation; the single subsystem i corresponds to a local model Wi having the same structure as the model W, and the local model Wi comprises a first submodel Wci1 provided at the first member Ci1 and a second submodel Wci2 provided at the second member Ci2; wherein:

the single subsystem i is used for performing joint training in the vertical segmentation mode for the local model Wi by using the training samples that constitute a vertical segmentation on the first member Ci1 and the second member Ci2, providing the federal service side, when a synchronization condition is satisfied, with the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the corresponding local model Wi, and synchronizing the local parameters to be synchronized with those of the other subsystems according to the synchronization values of the parameters to be synchronized fed back by the federal service side, so as to adjust the corresponding pending parameters;

and the federal service side is used for securely synchronizing the update values of the parameters to be synchronized from the respective subsystems and feeding back the synchronization values.
2. A method for jointly updating a model, applicable to a process of jointly updating a model W by a system for jointly updating a model, wherein the system comprises a federal service side and a plurality of subsystems; a single subsystem i among the plurality of subsystems comprises, among its training members, a first member Ci1 and a second member Ci2; the sample data held by the first member Ci1 and the second member Ci2 constitute a vertical segmentation, and the sample data held by the respective subsystems constitute a horizontal segmentation; the single subsystem i corresponds to a local model Wi having the same structure as the model W, and the local model Wi comprises a first submodel Wci1 provided at the first member Ci1 and a second submodel Wci2 provided at the second member Ci2; the method comprises:

each subsystem respectively uses the training samples that constitute a vertical segmentation on the corresponding first member and second member to perform joint training in the vertical segmentation mode for the corresponding local model, and each training member respectively provides the federal service side, when the synchronization condition is satisfied, with the update values of the parameters to be synchronized that correspond one to one to the pending parameters in the corresponding sub-model;
the federal service party carries out safe synchronization on the updated values of the parameters to be synchronized from a plurality of subsystems and feeds back the synchronization values of the parameters to be synchronized;
and each training member in each subsystem receives the synchronization value of the local parameter to be synchronized so as to update the local parameter to be determined.
3. The method of claim 2, wherein the single subsystem i further comprises a sub-server Si, and the joint training performed by the single subsystem i for the local model Wi comprises:

for the several samples of the current round, the first member Ci1 and the second member Ci2 respectively process the corresponding local sample data through the first submodel Wci1 and the second submodel Wci2 to obtain the corresponding first intermediate result Rit1 and second intermediate result Rit2, and send them to the sub-server Si;

the sub-server Si processes the first intermediate result Rit1 and the second intermediate result Rit2 based on a third submodel Wsi, and feeds back the gradient of the first intermediate result Rit1 to the first member Ci1 and the gradient of the second intermediate result Rit2 to the second member Ci2;

the first member Ci1 and the second member Ci2 respectively use the gradients of the first intermediate result Rit1 and the second intermediate result Rit2 to determine the gradients of the pending parameters in the first submodel Wci1 and the second submodel Wci2, so as to respectively determine the update values of the parameters to be synchronized of the first submodel Wci1 and the second submodel Wci2.
4. The method of claim 3, wherein the label holder for the several samples of the current round in the single subsystem i is the first member Ci1 or the second member Ci2; and the sub-server Si processing the first intermediate result Rit1 and the second intermediate result Rit2 based on the third submodel Wsi and feeding back the gradient of the first intermediate result Rit1 to the first member Ci1 and the gradient of the second intermediate result Rit2 to the second member Ci2 further comprises:

the sub-server Si processes the first intermediate result Rit1 and the second intermediate result Rit2 based on the third submodel Wsi to obtain a prediction result, and sends the prediction result to the label holder;

the label holder determines the corresponding model loss by comparing the label data of the several samples of the current round with the prediction result, and feeds the model loss back to the sub-server Si;

the sub-server Si determines, from the model loss, the gradients for the first intermediate result Rit1 and the second intermediate result Rit2.
5. The method of claim 4, wherein, in the case that the third submodel Wsi contains pending parameters, the sub-server Si further determines, from the model loss, the gradients of the pending parameters included in the third submodel Wsi.
6. The method of claim 3, wherein the label holder for the several samples of the current round of the single subsystem i is the first member Ci1 or the second member Ci2, and the label holder is provided with a fourth submodel Wci3; and the sub-server Si processing the first intermediate result Rit1 and the second intermediate result Rit2 based on the third submodel Wsi and feeding back the gradient of the first intermediate result Rit1 to the first member Ci1 and the gradient of the second intermediate result Rit2 to the second member Ci2 further comprises:

the sub-server Si processes the first intermediate result Rit1 and the second intermediate result Rit2 based on the third submodel Wsi to obtain a third intermediate result Rit3, and sends the third intermediate result Rit3 to the label holder;

the label holder processes the third intermediate result Rit3 through the fourth submodel Wci3 to obtain the corresponding prediction result, and determines, based on a comparison of the label data of the several samples of the current round with the prediction result, the gradient of the model loss with respect to the third intermediate result Rit3 of the current round, to be fed back to the sub-server Si;

the sub-server Si determines the gradients for the first intermediate result Rit1 and the second intermediate result Rit2 according to the gradient of the third intermediate result Rit3.
7. The method of claim 2, wherein the joint training performed by subsystem i for the local model Wi comprises:

the training members in subsystem i perform multi-party secure computation, so that each training member determines the gradients of the model loss with respect to its local pending parameters;

each training member determines the update values of the parameters to be synchronized based on the gradients of the pending parameters in the corresponding sub-model, wherein the first member Ci1 and the second member Ci2 respectively determine the update values of the parameters to be synchronized of the first submodel Wci1 and the second submodel Wci2.
8. The method of claim 2, wherein the synchronization condition comprises: each local model has been updated for a predetermined number of rounds or for a predetermined duration.
9. The method of claim 2, wherein the single parameter to be synchronized is a single parameter to be determined or a single gradient corresponding to the single parameter to be determined.
10. The method of claim 2, wherein the federated server securely synchronizing updated values for parameters to be synchronized from a plurality of subsystems comprises:
the federal service side receives all parameters to be synchronized which are respectively sent by all training members and encrypted in a preset encryption mode;
and the federal service side fuses the respective update values of each parameter to be synchronized by at least one of summation, weighted averaging, and taking the median, to obtain the corresponding synchronization value.
11. The method of claim 10, wherein the predetermined encryption scheme comprises one of: adding perturbations that satisfy differential privacy; homomorphic encryption; and (4) secret sharing.
12. A method for jointly updating a model, applicable to a process of jointly updating a model W by a system for jointly updating a model, wherein the system comprises a federal service side and a plurality of subsystems; a single subsystem i among the plurality of subsystems comprises, among its training members, a first member Ci1 and a second member Ci2; the sample data held by the first member Ci1 and the second member Ci2 constitute a vertical segmentation, and the sample data held by the respective subsystems constitute a horizontal segmentation; the single subsystem i corresponds to a local model Wi having the same structure as the model W, and the local model Wi comprises a first submodel Wci1 provided at the first member Ci1 and a second submodel Wci2 provided at the second member Ci2; the method is performed by the federal service side and comprises:

receiving, from each subsystem when the synchronization condition is satisfied, the update values of the parameters to be synchronized that correspond one to one to the pending parameters in the corresponding sub-model, wherein the update values of the parameters to be synchronized provided by the single subsystem i are determined by the joint training performed by subsystem i in the vertical segmentation mode for the corresponding local model Wi;

securely synchronizing the update values of the parameters to be synchronized from the respective subsystems, and feeding back the synchronization value of each parameter to be synchronized, so that the corresponding training member or sub-server completes the update of the pending parameters of its local sub-model.
13. A method for jointly updating a model, applicable to a process of jointly updating a model W by a system for jointly updating a model, wherein the system comprises a federal service side and a plurality of subsystems; a single subsystem i among the plurality of subsystems comprises, among its training members, a first member Ci1 and a second member Ci2; the sample data held by the first member Ci1 and the second member Ci2 constitute a vertical segmentation, and the sample data held by the respective subsystems constitute a horizontal segmentation; the single subsystem i corresponds to a local model Wi having the same structure as the model W, and the local model Wi comprises a first submodel Wci1 provided at the first member Ci1 and a second submodel Wci2 provided at the second member Ci2; the method is performed by the first member Ci1 and comprises:

using the training samples that constitute a vertical segmentation locally and on the second member Ci2 to perform joint training for the corresponding local model Wi, obtaining the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the first submodel Wci1;

when the synchronization condition is satisfied, sending the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the first submodel Wci1 to the federal service side, so that the federal service side securely synchronizes each parameter to be synchronized based on the update values of the parameters to be synchronized from the respective subsystems;

obtaining, from the federal service side, the securely synchronized synchronization values of the parameters to be synchronized of the first submodel Wci1, so as to update the pending parameters of the first submodel Wci1.
14. The method of claim 13, wherein subsystem i further comprises a sub-server Si, and the local model Wi of the single subsystem i further comprises a third submodel Wsi of the sub-server Si; and the using the training samples that constitute a vertical segmentation locally and on the second member Ci2 to perform joint training for the corresponding local model Wi comprises:

for the several samples of the current round, processing the corresponding local sample data through the first submodel Wci1 to obtain the corresponding first intermediate result Rit1 and sending it to the sub-server Si, for the sub-server Si to process the first intermediate result Rit1 and the second intermediate result Rit2 based on the third submodel Wsi and to feed back the gradient of the first intermediate result Rit1, wherein the second intermediate result Rit2 is obtained by the second member Ci2 processing the corresponding local sample data through the second submodel Wci2;

using the gradient fed back for the first intermediate result Rit1 to determine the gradients of the pending parameters of the first submodel Wci1, thereby determining the update values of the parameters to be synchronized of the first submodel Wci1.
15. The method of claim 13, wherein the using the training samples that constitute a vertical segmentation locally and on the second member Ci2 to perform joint training for the corresponding local model Wi comprises:

performing multi-party secure computation with the other training members in subsystem i, so as to determine the gradients of the model loss with respect to the pending parameters of the first submodel Wci1;

determining the update values of the corresponding parameters to be synchronized based on the gradients of the pending parameters of the first submodel Wci1.
16. An apparatus for jointly updating a model, the apparatus being adapted to the federal service side in a system for jointly updating a model, wherein the system comprises the federal service side and a plurality of subsystems; a single subsystem i among the plurality of subsystems comprises, among its training members, a first member Ci1 and a second member Ci2; the sample data held by the first member Ci1 and the second member Ci2 constitute a vertical segmentation, and the sample data held by the respective subsystems constitute a horizontal segmentation; the single subsystem i corresponds to a local model Wi having the same structure as the model W, and the local model Wi comprises a first submodel Wci1 provided at the first member Ci1 and a second submodel Wci2 provided at the second member Ci2; the apparatus comprises:

an obtaining unit configured to receive, from each subsystem when the synchronization condition is satisfied, the update values of the parameters to be synchronized that correspond one to one to the pending parameters in the corresponding local model, wherein the update values of the parameters to be synchronized provided by a single subsystem i are determined by the joint training performed by subsystem i in the vertical segmentation mode for the corresponding local model Wi;

a synchronization unit configured to securely synchronize the update values of the parameters to be synchronized received from the several subsystems, and to feed back the synchronization value of each parameter to be synchronized, so that the corresponding training member completes the update of the pending parameters of its local sub-model.
17. An apparatus for jointly updating a model, the apparatus being adapted to a process of jointly updating a model W by a system for jointly updating a model, wherein the system comprises a federal service side and a plurality of subsystems; a single subsystem i among the plurality of subsystems comprises, among its training members, a first member Ci1 and a second member Ci2; the sample data held by the first member Ci1 and the second member Ci2 constitute a vertical segmentation, and the sample data held by the respective subsystems constitute a horizontal segmentation; the single subsystem i corresponds to a local model Wi having the same structure as the model W, and the local model Wi comprises a first submodel Wci1 provided at the first member Ci1 and a second submodel Wci2 provided at the second member Ci2; the apparatus is provided at the first member Ci1 and comprises:

a training unit configured to use the training samples that constitute a vertical segmentation locally and on the second member Ci2 to perform joint training for the corresponding local model Wi, obtaining the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the first submodel Wci1;

a providing unit configured to, when the synchronization condition is satisfied, send the update values of the parameters to be synchronized that correspond one to one to the pending parameters of the first submodel Wci1 to the federal service side, so that the federal service side securely synchronizes each parameter to be synchronized based on the update values of the parameters to be synchronized received from the respective subsystems;

a synchronization unit configured to obtain, from the federal service side, the securely synchronized synchronization values of the parameters to be synchronized of the first submodel Wci1, so as to update the pending parameters of the first submodel Wci1.
18. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 10-15.
19. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, performs the method of any one of claims 10-15.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202111256451.8A (CN114004363B) | 2021-10-27 | 2021-10-27 | Method, device and system for jointly updating model
Publications (2)

Publication Number | Publication Date
---|---
CN114004363A | 2022-02-01
CN114004363B | 2024-05-31
Family
ID=79924225
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant