
WO2020215209A1 - Method for predicting operation results, electronic device and computer program product - Google Patents

Method for predicting operation results, electronic device and computer program product

Info

Publication number
WO2020215209A1
Authority
WO
WIPO (PCT)
Prior art keywords
probability
prediction model
observation
target object
result
Prior art date
Application number
PCT/CN2019/083905
Other languages
English (en)
French (fr)
Inventor
刘春辰
李伟
Original Assignee
日本电气株式会社
刘春辰
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本电气株式会社, 刘春辰 filed Critical 日本电气株式会社
Priority to PCT/CN2019/083905 priority Critical patent/WO2020215209A1/zh
Priority to JP2021562849A priority patent/JP7355115B2/ja
Priority to US17/605,652 priority patent/US20220222554A1/en
Publication of WO2020215209A1 publication Critical patent/WO2020215209A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • the embodiments of the present disclosure relate to the field of machine learning, and more specifically, to methods, electronic devices, and computer program products for predicting operation results.
  • causal inference questions are often referred to as counterfactual questions, such as "If this patient takes a new drug, will he survive longer?" or "If this student participates in an educational training program, will he get a higher score?"
  • the embodiments of the present disclosure provide a solution for predicting operation results.
  • a method of predicting the result of an operation includes determining a first prediction model based on a first set of observation data, the first set of observation data including a first set of observation results of performing a first operation on a first set of observation objects, the first prediction model being used to predict a first potential result of performing the first operation on a target object.
  • the method further includes determining a first probability model based on the first group of observation objects on which the first operation is performed and a second group of observation objects on which a second operation different from the first operation is performed, the first probability model being used to determine the corresponding probabilities of performing the first operation and the second operation on the target object.
  • the method further includes determining a second prediction model based on the first set of observation data and the first probability model.
  • the second prediction model predicts a second potential result of performing the first operation on the target object by estimating the result of performing the first operation on the second set of observation objects.
  • the method also includes determining a first combination of at least the first prediction model, the second prediction model, and the first probability model for predicting a first final result of performing the first operation on the target object.
  • a method of predicting the result of an operation includes predicting a first potential result of performing a first operation on the target object in response to determining that one of a set of operations will be performed on the target object.
  • the method further includes determining a first probability of performing a first operation on the target object and a second probability of performing a second operation in a set of operations, the second operation being different from the first operation.
  • the method further includes predicting a second potential result of performing the first operation on the target object by estimating the result of performing the first operation on the observation object on which the second operation is performed.
  • the method further includes predicting a first final result of performing the first operation on the target object based on at least the first potential result, the second potential result, the first probability, and the second probability.
  • in a third aspect of the present disclosure, an electronic device is provided, including a processor and a memory coupled with the processor.
  • the memory has instructions stored therein, and the instructions cause the device to perform actions when executed by the processor.
  • the actions include: determining a first prediction model based on a first set of observation data, the first set of observation data including a first set of observation results of performing a first operation on a first set of observation objects, the first prediction model being used to predict a first potential result of performing the first operation on a target object; determining a first probability model based on the first group of observation objects on which the first operation is performed and a second group of observation objects on which a second operation different from the first operation is performed, the first probability model being used to determine the corresponding probabilities of performing the first operation and the second operation on the target object; determining a second prediction model based on the first set of observation data and the first probability model, the second prediction model predicting a second potential result of performing the first operation on the target object by estimating the result of performing the first operation on the second set of observation objects; and determining a first combination of at least the first prediction model, the second prediction model, and the first probability model for predicting a first final result of performing the first operation on the target object.
  • in a fourth aspect of the present disclosure, an electronic device is provided, including a processor and a memory coupled with the processor.
  • the memory has instructions stored therein, and the instructions cause the device to perform actions when executed by the processor. The actions include: in response to determining that one of a set of operations will be performed on a target object, predicting a first potential result of performing a first operation on the target object; determining a first probability of performing the first operation on the target object and a second probability of performing a second operation in the set of operations, the second operation being different from the first operation; predicting a second potential result of performing the first operation on the target object by estimating the result of performing the first operation on the observation objects on which the second operation is performed; and predicting, based on at least the first potential result, the second potential result, the first probability, and the second probability, a first final result of performing the first operation on the target object.
  • a computer program product is provided.
  • the computer program product is tangibly stored on a computer-readable medium and includes machine-executable instructions which, when executed, cause a machine to perform the method according to the first aspect.
  • a computer program product is provided.
  • the computer program product is tangibly stored on a computer-readable medium and includes machine-executable instructions which, when executed, cause a machine to perform the method according to the second aspect.
  • FIG. 1 shows a schematic diagram of an example environment in which multiple embodiments of the present disclosure can be implemented.
  • FIG. 2 shows a flowchart of a process of acquiring a model according to an embodiment of the present disclosure.
  • FIG. 3 shows a flowchart of a process for determining a second prediction model according to an embodiment of the present disclosure.
  • FIG. 4 shows a flowchart of a process of predicting operation results according to an embodiment of the present disclosure.
  • FIG. 5 shows a block diagram of an example device that can be used to implement embodiments of the present disclosure.
  • treatment allocation mechanism refers to a potential strategy that depends on individual characteristics and determines the treatment level (for example, drug or placebo) allocated to each individual under investigation in the observation experiment.
  • the term "potential result" refers to the result for an individual in the case where the individual is assigned the designated treatment level t, and is represented by Y_t.
  • the term "individual causal effect" (ITE) is intended to measure how an individual (e.g., a patient) with characteristics (e.g., height, weight, age, etc., represented by X) responds to being allocated a specific treatment (e.g., a drug).
  • for two treatment levels, the ITE can be written as ITE(x) = E(Y_1 − Y_0 | X = x).
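  • to make the definition concrete, here is a minimal sketch in Python with synthetic potential outcomes (the data and the effect size are invented for illustration; real observational data only ever reveals one of the two potential outcomes per individual):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))   # features of 5 hypothetical individuals

# Hypothetical potential outcomes under treatment (Y_1) and control (Y_0).
# In real data only one of these is observed per individual.
Y1 = X[:, 0] + 1.0            # potential outcome if treated
Y0 = X[:, 0]                  # potential outcome if not treated

# The ITE is the difference of the two potential outcomes.
ite = Y1 - Y0
print(ite)                    # every individual's ITE is 1.0 here
```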
  • the term "covariate shift" refers to a change in the distribution of input features (covariates) between two different observation groups (such as the intervention group and the control group). This problem often occurs in observational research and leads to the so-called "treatment selection bias" problem.
  • by estimating individual causal effects (ITE), a computing device can automatically make decisions or assist people in making decisions, that is, determine whether to perform a certain treatment on an individual, or which of multiple treatments to perform on the individual. For example, it may be desirable to predict the possible impact of a certain drug or treatment on a certain patient's condition, thereby automatically formulating, or assisting doctors in formulating, treatment plans. It may also be desirable to predict the extent to which a training course will improve a student's performance, or to predict the impact of advertising on consumers' final purchase behavior. To make such predictions, counterfactual information needs to be known.
  • previous work on ITE estimation mainly focused on training regression models or performing sample re-weighting to estimate counterfactual results, involving nearest neighbor matching, propensity score matching, propensity score re-weighting, and some tree- and forest-based methods, such as Bayesian additive regression trees (BART) and causal forests.
  • Other recent work on ITE estimation includes representation learning methods and Gaussian processes.
  • in the representation learning method, a deep neural network is first constructed for representation learning to map the feature space to a representation space, transferring the distributions of the intervention group and the control group into the new representation space. Then, two further representation-based deep neural networks are used to predict the factual results across the intervention samples and the control samples, and their prediction error is taken as the factual result loss.
  • a metric of the distance between the intervention distribution and the control distribution in the representation space is obtained as an integral probability metric (IPM) distance. Finally, the weighted sum of the factual result loss and the IPM distance is minimized.
  • the above ITE estimation method has several problems.
  • Traditional outcome regression methods generally do not consider the "treatment selection bias" problem, and re-weighting methods may suffer from high variance with limited samples. The representation learning method is biased even with unlimited data and can only be applied to settings with exactly two treatment levels, which limits its use in practice, where there may be many treatment levels.
  • the Gaussian process has a poor O(N^3) complexity in the number of samples, so it is not easy to apply this method to large-scale observational research.
  • a solution for predicting the operation result is proposed.
  • the factual outcome model and the propensity scoring model are first established, then the counterfactual outcome model is determined based on the factual outcome model and the propensity scoring model, and finally the weighted average of the factual and counterfactual outcome models is used to predict treatment results and the ITE.
  • the solution of the present disclosure can not only correct the covariate shift in observational data and improve the accuracy of estimated treatment results and the ITE, but can also be extended to scenarios with multiple treatment levels and easily applied to large-scale observational research.
  • FIG. 1 shows a schematic diagram of an example environment 100 in which multiple embodiments of the present disclosure can be implemented.
  • a model 103 is generated by the computing device 102 based on the observation data set 101 and is used to predict the final result of performing one or more operations on the target object 107 and the individual causal effect (ITE).
  • the model 103 may include the prediction model and the probability model 104 described in detail below, and may define how to combine the outputs of these models to predict the operation result.
  • the model 103 may also include an ITE representation 105, which predicts the difference between the final result of performing one operation on the target object 107 and the final result of performing another operation by combining the output of the predictive model and the probability model 104. In this article, this difference can be considered as the ITE for the target object 107.
  • the first set of observation data 110 includes a first set of observation results 112 of performing a first operation on a first set of observation objects 111.
  • the second set of observation data 120 includes a second set of observation results 122 of performing a second operation on a second set of observation objects 121.
  • the observation data set 101 may further include a third set of observation data 130, which includes a third set of observation results 132 for performing a third operation on the third set of observation objects 131.
  • the first operation, the second operation, and the third operation herein are different from each other, and the first group of observation objects 111, the second group of observation objects 121, and the third group of observation objects 131 are different from each other.
  • the first operation, the second operation, and the optional third operation described here can be regarded as different treatments given to the subject, and the treatments and operations in the following description can be used interchangeably.
  • observation data set 101 may also include more sets of observation data, where different operations are performed on objects in each set of observation data.
  • the first group of observation results 112 and the second group of observation results 122 can be represented by Y, and Y can be discrete or continuous.
  • the first group of observation objects 111 and the second group of observation objects 121 can be represented by their features X, which can include many pre-treatment variables that can be discrete or continuous.
  • the observation data set 101 can come from many observational studies in various disciplines.
  • the computing device 106 can obtain the model 103 generated by the computing device 102 and the feature X of the target object 107, and provide a prediction result 108 based on the model 103.
  • the prediction result 108 may include a prediction for the result of performing a certain operation on the target object 107, and may also include a prediction for the difference between the results of performing two different operations on the target object 107, that is, a prediction for ITE.
  • the operation or treatment T indicates whether the patient receives aspirin
  • the result Y indicates whether the patient's headache disappears
  • the feature X may include information such as the patient's age, gender, and blood pressure.
  • the first set of observation data 110 may include the characteristics X of multiple patients who received aspirin and whether their headaches disappeared after receiving the drug; and the second set of observation data 120 may include the characteristics X of multiple patients who did not receive aspirin and whether they had headaches disappear.
  • the computing device 102 can generate a model 103, and the computing device 106 can use the model 103 to predict whether the headache will disappear when the target object 107 receives or does not receive aspirin, and can also predict the treatment effect of aspirin on the target object 107, so as to determine or assist in determining whether the target object 107 should receive aspirin treatment.
  • the operation or treatment T may indicate whether the product is recommended to the consumer, the result Y indicates whether the consumer purchases the product, and the characteristic X may include the consumer's income, purchase history, and other information.
  • the first set of observation data 110 may include the characteristics of consumers to whom the product was recommended and whether they purchased it; the second set of observation data 120 may include the characteristics of consumers to whom the product was not recommended and whether they purchased it.
  • the computing device 106 can use the generated model 103 to predict the effect of recommending the product on the target object 107, so as to determine or help determine whether to recommend the product to the target object 107.
  • FIG. 2 shows a flowchart of a process 200 of acquiring a model according to an embodiment of the present disclosure.
  • the process 200 may be implemented by the computing device 102 of FIG. 1.
  • the computing device 102 determines a first prediction model based on the first set of observation data 110.
  • the first set of observation data 110 includes a first set of observation results 112 of performing a first operation on the first set of observation objects 111, and the first prediction model is used to predict a first potential result of performing the first operation on the target object 107.
  • the first prediction model herein may also be referred to as the factual result prediction model for the first operation.
  • the computing device 102 may establish a first objective function for determining the first prediction model based on the first set of observation objects 111 and the first set of observation results 112, and determine the model parameters of the first prediction model by minimizing the first objective function.
  • the first prediction model estimates the conditional expectation E(Y | X, T = 1).
  • the established first prediction model, that is, the factual result model for the first operation, can be as shown in formula (1): Ê(Y_1 | X, T = 1) = f_1(X), (1) where f_1 denotes the learned network.
  • the computing device 102 may also determine the factual result prediction model for the second operation based on the second set of observation data 120, and this model is also referred to herein as the fourth prediction model.
  • the fourth prediction model is used to predict the potential result of performing the second operation on the target object 107, i.e., it estimates E(Y | X, T = 2).
  • the established fourth prediction model, that is, the factual result model for the second operation, can be as shown in formula (2): Ê(Y_2 | X, T = 2) = f_2(X). (2)
  • the number of hidden layers can be set to 1 or 2.
  • the dimensions of all hidden layers are the same and greater than the dimension of feature X; the learning rate is selected from the candidate set {1.0, 0.1, 0.01, 0.001, 0.0001}, the regularization parameter from the candidate set {0, 1e-4, 1e-2}, the batch size is 64, and the number of iterations is 10000.
  • the training/test splitting technique often used in machine learning algorithms can then be used to select the best parameters from the candidate set.
  • the factual result prediction model for any operation t can also be determined in other ways. For example, it is possible to learn the representation of feature X, and then use neural network learning in the representation space of feature X to determine the factual outcome prediction model for operation t.
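  • as an illustration of fitting a factual result model on one treatment group only, the following sketch trains a small scikit-learn network on synthetic data (the data-generating process, network size, and hyperparameters are invented stand-ins for the deep networks and candidate sets described above):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Synthetic "first set of observation data": features X1 and observed
# results Y1 for objects on which the first operation was performed.
X1 = rng.normal(size=(200, 3))
Y1 = X1 @ np.array([1.0, -0.5, 0.2]) + rng.normal(scale=0.1, size=200)

# Factual result model f1(X) approximating E(Y | X, T = 1): a small
# network trained only on the group that received the first operation.
f1 = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
f1.fit(X1, Y1)

# Predict the first potential result for a new target object.
x_target = rng.normal(size=(1, 3))
y1_hat = f1.predict(x_target)
print(y1_hat.shape)   # (1,)
```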
  • the computing device 102 determines a first probability model based on the first group of observation objects 111 on which the first operation is performed and the second group of observation objects 121 on which the second operation is performed different from the first operation.
  • the first probability model is used to determine the corresponding probability of performing the first operation and the second operation on the target object 107.
  • the computing device 102 may determine the second probability model based on the first group of observation objects 111, the second group of observation objects 121, and the third group of observation objects 131 on which the third operation is performed.
  • the second probability model is used to determine the probability of performing the third operation on the target object 107.
  • the first probability model, the second probability model, and possibly more probability models are collectively referred to as probability models.
  • that is, the probabilities P(T = 1 | X) and P(T = 2 | X).
  • the computing device 102 may use a deep neural network to model the treatment or operation T for the feature X (each group of observation objects in the observation data set 101), thereby obtaining a probability model.
  • the computing device 102 can train a deep neural network to model the corresponding operation T given the feature X of the observation objects, so as to obtain the conditional probability that the t-th operation is performed on an individual object with the given feature X, as the probability model, that is, formula (4): P(T = t | X) = g_t(X), (4) where g_t denotes the network's output for the t-th operation.
  • formula (4) represents the probability of performing the t-th operation on the target object 107 with feature X.
  • P(T = 1 | X) represents the probability of performing the first operation on the target object 107 with feature X, P(T = 2 | X) represents the probability of performing the second operation, and P(T = 3 | X) represents the probability of performing the third operation.
  • a probability model such as the formula (4) can be regarded as a propensity scoring model or a treatment allocation mechanism as mentioned above. It should be understood that all observation objects in the observation data set 101 can be used to train the neural network for the probability model.
  • the computing device 102 can also use the training/test splitting technique to select the best hyperparameters from the candidate set.
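  • a multinomial classifier can play the role of the probability model P(T = t | X). The sketch below substitutes logistic regression for the deep neural network of the text, on synthetic data (all values are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Features of all observation objects and the operation each received.
X = rng.normal(size=(300, 3))
T = rng.integers(1, 4, size=300)   # operations 1, 2 and 3

# Probability model: multinomial logistic regression as a stand-in for
# the deep neural network that models P(T = t | X).
g = LogisticRegression(max_iter=1000).fit(X, T)

probs = g.predict_proba(X[:5])
print(probs.shape)        # (5, 3): one probability per operation
```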
  • the computing device 102 determines a second prediction model based on the first set of observation data 110 and the first probability model determined in block 220.
  • the second prediction model predicts the potential result of performing the first operation on the target object 107 by estimating the result of performing the first operation on the second group of observation objects 121.
  • the computing device 102 may also determine a third prediction model based on the first set of observation data 110, the first probability model, and the second probability model.
  • the third prediction model predicts the third potential result of performing the first operation on the target object 107 by estimating the result of performing the first operation on the third group of observation objects 131.
  • the second prediction model and the third prediction model described in this article can be considered as counterfactual outcome prediction models for the first operation.
  • FIG. 3 shows a flowchart of a process 300 for determining a second prediction model according to an embodiment of the present disclosure.
  • the general process for determining the counterfactual outcome model will also be described in conjunction with Figure 3.
  • the process 300 may be regarded as an implementation of the block 230 in FIG. 2 and may be executed by the computing device 102 as shown in FIG. 1, for example. It should be understood that the process 300 may also include additional steps not shown and/or the steps shown may be omitted. The scope of the present disclosure is not limited in this respect.
  • the computing device 102 determines a sample weight based on the first probability model determined in block 220, the number of observation objects 111 in the first group, and the number of observation objects 121 in the second group.
  • the sample weight is used to correct the deviation of the distribution of the second group of observation objects 121 relative to the distribution of the first group of observation objects 111.
  • the sample weight determined here can be regarded as an importance sampling weight, which is used to correct the covariate shift problem mentioned above, and the sample weight will be used for subsequent counterfactual result prediction.
  • the sample weight can be written as formula (7): w_2(X) = (P(T = 2 | X) / P(T = 1 | X)) · (n_1 / n_2), (7) where the ratio n_1/n_2 can be estimated from the number of observation objects 111 in the first group and the number of observation objects 121 in the second group, and the conditional probabilities can be estimated by the probability model determined at block 220.
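  • assuming the weight has the importance-sampling form discussed above (a ratio of propensities times a ratio of group sizes), it can be computed as follows; the propensity values and group sizes are invented for illustration:

```python
import numpy as np

# Propensity scores P(T=1|x) and P(T=2|x) for three first-group samples
# (in the text these come from the probability model of block 220; the
# values here are invented).
p1 = np.array([0.7, 0.6, 0.8])
p2 = 1.0 - p1

n1, n2 = 3, 5   # sizes of the first and second observation groups

# Importance-sampling weight that re-targets first-group samples to the
# second group's covariate distribution:
#   w2(x) = (P(T=2|x) / P(T=1|x)) * (n1 / n2)
w2 = (p2 / p1) * (n1 / n2)

# The weight for the opposite direction is its reciprocal: w1 = 1 / w2.
w1 = (p1 / p2) * (n2 / n1)
print(np.round(w2, 3))
```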
  • the computing device 102 may determine a second prediction model based on the determined sample weight (for example, equation (7)) and the first set of observation data 110, that is, the counterfactual result prediction model for the first operation. For example, the computing device 102 may adopt a transfer learning technique to determine the second prediction model.
  • the computing device 102 establishes a second objective function for determining the second prediction model based on the sample weight, the first group of observation objects 111, and the first group of observation results 112.
  • the counterfactual result Y_1 of performing the first operation on the second group of observation objects 121 cannot be observed.
  • the counterfactual result for the second group of observation objects 121 can be predicted based on the transfer learning method, with the aid of the first group of observation objects 111 and the corresponding first group of observation results 112, for which the result Y_1 is observed.
  • the objective function shown in formula (8) can be established: min over h_2 of Σ_{i: T_i = 1} w_2(x_i) · (y_i − h_2(x_i))² + λ_2 · ‖h_2‖², (8)
  • where h_2(·) represents a deep neural network
  • and λ_2 represents a regularization parameter
  • the computing device 102 determines the model parameters of the second prediction model by minimizing the second objective function, thereby obtaining the second prediction model.
  • the computing device 102 can learn the neural network h_2(X) by minimizing the objective function shown in formula (8), and use h_2(X) to obtain the second prediction model, as shown in formula (9): Ê(Y_1 | X, T = 2) = h_2(X). (9)
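  • the weighted minimization can be sketched with a linear model in closed form, standing in for the deep network h_2 (synthetic data; a weighted ridge regression replaces the neural network of the text):

```python
import numpy as np

rng = np.random.default_rng(3)

# First-group data (first operation performed) and importance weights
# that re-target these samples to the second group's distribution.
X1 = rng.normal(size=(100, 3))
Y1 = X1 @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.1, size=100)
w2 = rng.uniform(0.5, 2.0, size=100)

lam = 1e-2   # regularization parameter (lambda_2 in the text)

# Weighted ridge regression, a linear stand-in for the network h2:
# minimize sum_i w2_i * (y_i - x_i @ beta)**2 + lam * ||beta||^2
W = np.diag(w2)
beta = np.linalg.solve(X1.T @ W @ X1 + lam * np.eye(3), X1.T @ W @ Y1)

# h2(x) = x @ beta then estimates the counterfactual E(Y_1 | X, T = 2).
x_new = rng.normal(size=3)
print(x_new @ beta)
```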
  • the computing device 102 may also similarly determine a counterfactual outcome prediction model for any t-th operation (for example, the second operation).
  • the counterfactual result prediction model for the second operation is referred to as the fifth prediction model in this article.
  • the computing device 102 may determine a fifth prediction model based on the second set of observation data 120 and the first probability model determined at block 220. The fifth prediction model predicts the potential result of performing the second operation on the target object 107 by estimating the result of performing the second operation on the first group of observation objects 111.
  • the fifth prediction model can be determined similarly to the process described with reference to FIG. 3.
  • the corresponding sample weight w_1 can likewise be estimated from the number of observation objects 111 in the first group and the number of observation objects 121 in the second group, together with the probability model determined at block 220.
  • w_1 = 1/w_2.
  • the computing device 102 next establishes an objective function for determining the fifth prediction model based on the sample weight, the second group of observation objects 121, and the second group of observation results 122.
  • the counterfactual result Y_2 of performing the second operation on the first group of observation objects 111 is unobservable.
  • the counterfactual result for the first group of observation objects 111 can be predicted based on the transfer learning method, with the aid of the second group of observation objects 121 and the corresponding second group of observation results 122, for which the result Y_2 is observed.
  • the objective function shown in formula (12) can be established: min over h_1 of Σ_{i: T_i = 2} w_1(x_i) · (y_i − h_1(x_i))² + λ_1 · ‖h_1‖², (12)
  • where h_1(·) represents a deep neural network
  • and λ_1 represents a regularization parameter
  • the computing device 102 can then learn the neural network h_1(X) by minimizing the objective function shown in formula (12), and use h_1(X) to obtain the fifth prediction model, as shown in formula (13): Ê(Y_2 | X, T = 1) = h_1(X). (13)
  • the counterfactual result prediction model for the second operation on the third group of observation objects 131, on which the third operation is performed, can also be determined in a similar manner, as shown in formula (13').
  • in a similar manner, the computing device 102 can determine the counterfactual result prediction model for any t-th operation, as shown in formula (14), which takes the same weighted form as formulas (9) and (13).
  • the instability problem in the traditional method can be avoided.
  • another advantage of this method is that the training/test splitting technique can still be used to select the optimal parameters. Specifically, although the counterfactual result is not observed, the weighted factual error can be used as the test error.
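  • the weighted factual error used as a proxy test error can be sketched as follows (all values are invented for illustration):

```python
import numpy as np

# Held-out factual samples from the first group, predictions from a
# candidate counterfactual model h2, and their importance weights.
y_true = np.array([1.2, 0.8, 1.5, 0.9])
y_pred = np.array([1.0, 1.0, 1.4, 1.1])
w2 = np.array([0.5, 1.5, 0.8, 1.2])

# Although counterfactual outcomes are never observed, the weighted
# factual error can serve as a proxy test error for model selection.
weighted_mse = np.sum(w2 * (y_true - y_pred) ** 2) / np.sum(w2)
print(round(weighted_mse, 4))   # 0.034
```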
  • the computing device 102 determines a first combination of at least a first prediction model, a second prediction model, and a first probability model for predicting the first final result of performing the first operation on the target object 107 .
  • the computing device 102 may use equations (1), (4), and (9) to determine the first combination.
  • the computing device 102 may combine the outputs of various prediction models and probability models according to a predetermined relationship.
  • the final result of performing the t-th operation on the target object 107 with feature X can be predicted by the following formula:
  • when there are only two treatment levels, the first combination can be embodied as:
  • when there are three treatment levels, the first combination can be embodied as:
  • the computing device 102 generates the first combination for predicting the final result of performing the first operation on the target object 107.
  • the computing device 102 may provide the above-described factual result prediction model, counterfactual result prediction model, and probability model as part of the model 103 for the computing device 106 to determine the prediction result and ITE.
  • the computing device 102 may generate a combination for predicting the final result of performing any t-th operation on the target object 107. For example, the computing device 102 may determine the second combination based on the fourth prediction model, the fifth prediction model, and the probability model mentioned above for predicting the second final result of performing the second operation on the target object 107.
  • the computing device 102 may, for example, use equations (2), (4), and (13) to determine the second combination. Similar to equation (17), in this case, the second combination can be embodied as:
  • the computing device 102 may provide a combination determined in the above manner for predicting the final result of the t-th operation performed on the target object 107, namely E(Y t|X), so that the result of performing a certain operation or treatment on the target object 107 can be predicted.
  • the computing device 102 may determine, based on the above-mentioned first combination and second combination, a difference representation for predicting the difference between the first final result of performing the first operation on the target object 107 and the second final result of performing the second operation on the target object 107.
  • the difference is, for example, the ITE indication 105 shown in FIG. 1.
  • the individual treatment effect ITE between any two operations t 1 and t 2 can be obtained from equation (15). It can be seen from equation (20) that the value of the ITE depends not only on the characteristic information (the value of X) of the individual (target object 107), but also on the treatment levels considered. This definition allows prediction of the ITE between any two treatment levels.
  • the computing device 102 may provide the model 103 including but not limited to the predictive model and probability model 104 and the ITE representation 105 to the computing device 106, for example. Based on the model 103, the computing device 106 can predict at least one of an operation or treatment result and an individual causal effect for the target object 107. For example, the computing device 106 may predict the difference between giving a certain treatment and not giving the treatment for the patient, thereby automatically or assisting the doctor or the patient in making a decision.
  • the model for obtaining the prediction operation result and ITE according to the present disclosure is described above.
  • This model can not only improve the accuracy of estimated treatment results and ITE, but can also be extended to scenarios with multiple treatment levels.
  • the scheme can be easily applied to large-scale observational research.
  • FIG. 4 shows a flowchart of a process 400 of predicting the result of an operation according to an embodiment of the present disclosure.
  • the process 400 may be implemented by the computing device 106 of FIG. 1. To facilitate discussion, the process 400 will be described in conjunction with FIG. 1.
  • the computing device 106 determines whether one of a set of operations will be performed on the target object 107. For example, if the computing device 106 receives an input about the characteristic X of the target object 107 (the value x of X), it can be considered that a certain operation is to be performed on the target object 107. It should be understood that the description of the first operation and the second operation may refer to any two operations in the case of having multiple treatment levels, and is not limited to the case of having only two treatment levels.
  • the process proceeds to block 420.
  • the computing device 106 predicts a first potential result of performing the first operation (t 1 ) on the target object 107.
  • the computing device 106 may obtain the first prediction model described above with respect to block 210, and predict the first potential outcome based on the first prediction model, for example, using equation (3) to obtain the first potential result E(Y|X=x, T=t 1).
  • the computing device 106 determines the first probability of performing the first operation (t 1 ) and the second probability of the second operation (t 2 ) on the target object 107.
  • the computing device 106 may obtain the probability model described above with respect to block 220, and determine the corresponding probability based on the probability model.
  • For example, the computing device 106 may use equation (4) to determine that the probabilities of performing the first operation (t 1) and the second operation (t 2) on the target object 107 are P(T=t 1|X=x) and P(T=t 2|X=x), respectively. In an embodiment where a third operation (t 3) exists, the computing device 106 may likewise use equation (4) to determine the probability P(T=t 3|X=x).
  • the computing device 106 predicts the second potential result of performing the first operation on the target object 107 by estimating the result of performing the first operation on the observation objects on which the second operation is performed (e.g., the second group of observation objects 121).
  • the computing device 106 may obtain the second prediction model described above with respect to block 230 and FIG. 3, and predict the second potential result based on the second prediction model.
  • the computing device 106 can use equation (14) to calculate the second potential result E(Y t1|X=x, T=t 2).
  • the computing device 106 may also obtain the third prediction model described above, and predict the third potential result based on the third prediction model.
  • the computing device 106 can use equation (14) to calculate the third potential result E(Y t1|X=x, T=t 3).
  • the computing device 106 predicts a first final result of performing the first operation on the target object 107 based on at least the first potential result, the second potential result, the first probability, and the second probability.
  • the computing device 106 may utilize the calculations at blocks 420, 430, and 440 to determine the first final result.
  • in the case of two treatment levels, the first final result can be expressed as:
  • the computing device 106 may predict the first final result based on the first potential result, the second potential result, the third potential result, the first probability, the second probability, and the third probability.
  • in the case of three treatment levels, the first final result can be expressed as:
  • the computing device 106 may also predict the fourth potential result of performing the second operation on the target object 107, for example, based on the fourth prediction model described above.
  • the computing device 106 may predict the fifth potential result of performing the second operation on the target object 107 by estimating the result of performing the second operation on the observation object on which the first operation is performed, for example, based on the fifth prediction model described above.
  • the computing device 106 may further predict the second final result of performing the second operation on the target object 107 based on the fourth potential result, the fifth potential result, and the corresponding probability.
  • in the case of two treatment levels, the second final result can be expressed as:
  • in the case of three treatment levels, the second final result can be expressed as:
  • the computing device 106 may predict the difference between the first final result of performing the first operation on the target object 107 and the second final result of performing the second operation on the target object 107, that is, the ITE of the first operation relative to the second operation.
  • For example, the computing device 106 may calculate, based on the ITE representation 105, the ITE of the first operation t 1 relative to the second operation t 2 for the target object 107:
  • the computing device 106 may provide (for example, to the user) at least one of the predicted first final result, the second final result, and the ITE (for example, the prediction result 108 shown in FIG. 1) to It helps the user to make a decision whether to perform the first operation or the second operation on the target object 107, for example.
  • the computing device 106 may determine the target operation to be performed on the target object 107 from the first operation and the second operation based on the determined difference (for example, ITE). The computing device 106 may simply select the target operation based on the determined difference. For example, if the determined difference indicates that the ITE is a positive effect or a positive effect, the computing device 106 may determine that the first operation is to be performed on the target object 107. The computing device 106 may further combine other factors to select the target operation, for example, combine the cost difference between the first operation and the second operation (for example, time cost, expense cost). For example, if the ratio of the ITE to the cost difference is greater than the threshold ratio, the computing device 106 may determine that the first operation is to be performed on the target object 107.
  • process 200 and the process 400 are described as being implemented at two computing devices, it should be understood that the processes described herein can be implemented at the same computing device or by a distributed computing system. The scope of the present disclosure is not limited in this respect.
  • the data set used here comes from a semi-simulated study based on the Infant Health and Development Program (IHDP).
  • The IHDP data contains treatments and features from a real randomized experiment that studied the impact of high-quality child care and home visits on future cognitive test scores. The experiment uses simulated outcomes so that the true causal effect is known.
  • the data set consists of 747 individuals (139 interventions, 608 controls), and each individual is represented by 25 covariates that measure the characteristics of the child and its mother.
  • The treatment indicates whether the premature infant received high-quality child care and home visits from a trained provider; the continuous outcome indicates the infant's future cognitive test score. The 25 features include measurements of the child (birth weight, head circumference, weeks born preterm, sex, twin status, etc.) and of the mother (her age at delivery, marital status, and education, etc.). There are 6 continuous covariates and 19 binary covariates in total.
  • a predicted future cognitive test score for a premature baby is obtained given the 25 characteristics of the baby.
  • a prediction about the potential probability of a premature baby receiving high-quality child care and home visits is obtained based on the 25 characteristics of the baby.
  • the above-mentioned IHDP data is learned by using the above-mentioned traditional method, the representation learning method, and the method of the present disclosure to predict ITE.
  • The results show that the method of the present disclosure outperforms the traditional methods and the representation learning method in terms of the estimation precision of heterogeneous effects (PEHE) and the root mean square error.
  • doctors or patients themselves may need to choose from a variety of treatments, for example, whether to administer a certain drug, which of multiple drugs to choose, or which of multiple physical therapies (such as infrared or magnetic therapy) to choose.
  • In the following, these treatments will be represented as A, B, C, and so on.
  • doctors choose treatment based on personal experience, but this relies heavily on subjective judgment and experience accumulation.
  • Medical institutions usually have the treatment status of patients with the same disease who received treatments A, B, and C (hereinafter referred to as observation data).
  • Observation data can include various characteristics of previous individual patients, such as age, weight, blood pressure, and various examination data related to the disease, and can include the treatment effect of these individual patients after receiving treatment, such as whether the disease disappeared and the physiological parameters after treatment. It is expected that such observational data can be used to predict the therapeutic effect of these treatments on the target patient currently undergoing treatment and the difference in effect between different treatments (for example, the ITE), so as to determine the treatment for the target patient.
  • observational data usually has the covariate shift problem mentioned above. For example, due to the cost difference between different treatment methods, the observation data may have deviations in the economic status of individual patients. If this covariate shift problem is not considered, the obtained prediction results will not be objective and accurate enough.
  • the solution provided in the present disclosure can solve the problem caused by the covariate shift, accurately predict the therapeutic effect of the target patient on the treatment methods A, B, and C, and the effect difference between any two treatment methods.
  • the predicted treatment effect and/or effect difference can be provided to doctors or patients to assist them in choosing treatment methods.
  • the computing device may also automatically determine the treatment means based on predetermined rules, as described above. Therefore, in this way, a more suitable and effective treatment can be selected for the patient.
  • Observation data can include the characteristics of students who have previously participated in courses D, E, F, etc., such as age, gender, whether they have participated in similar courses, family economics, and other information, as well as individual students' performance after participating in corresponding courses, such as test scores, awards, etc.
  • observational data can be used to help students who are currently choosing courses to make decisions, or to recommend more suitable courses to them. Similar to the previous article, the observed data usually has covariate shifts, which makes traditional prediction methods unable to accurately and objectively recommend courses to students.
  • the computing device may also recommend more suitable courses to the students based on predetermined rules, as described above. For example, if the ITE for courses D and E indicates a positive or positive effect, then course D can be recommended to target students. Therefore, in this way, more suitable and beneficial courses can be selected for students.
  • FIG. 5 shows a schematic block diagram of an example device 500 that can be used to implement embodiments of the present disclosure.
  • the device 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing based on computer program instructions stored in a read-only memory (ROM) 502 or computer program instructions loaded from a storage unit 508 into a random access memory (RAM) 503.
  • In the RAM 503, various programs and data required for the operation of the device 500 can also be stored.
  • the CPU 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504.
  • An input/output (I/O) interface 505 is also connected to the bus 504.
  • A plurality of components in the device 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard or a mouse; an output unit 507, such as various types of displays or speakers; a storage unit 508, such as a magnetic disk or an optical disk; and a communication unit 509, such as a network card, a modem, or a wireless communication transceiver.
  • the communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • the processing unit 501 executes the various methods and processes described above, such as any one of the processes 200, 300, and 400.
  • the processes 200, 300, and 400 may be implemented as a computer software program or a computer program product, which is tangibly contained in a machine-readable medium, such as the storage unit 508.
  • part or all of the computer program may be loaded and/or installed on the device 500 via the ROM 502 and/or the communication unit 509.
  • the CPU 501 may be configured to execute any of the processes 200, 300, and 400 in any other suitable manner (for example, by means of firmware).
  • a computer-readable medium having a computer program stored thereon, and when the program is executed by a processor, the method according to the present disclosure is implemented.
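The two-treatment-level prediction flow of blocks 420-450 summarized in the bullets above can be sketched in code. This is a minimal illustration, not the disclosed implementation: the helper name final_results_and_ite is an assumption, and it takes the already-computed propensity and potential results as plain numbers.

```python
def final_results_and_ite(p1, y1_fact, y1_cf, y2_fact, y2_cf):
    """Two-treatment-level instance of blocks 420-450:
      E(Y_t1|x) = y1_fact*p1 + (1-p1)*y1_cf
      E(Y_t2|x) = y2_fact*(1-p1) + p1*y2_cf
      ITE       = E(Y_t1|x) - E(Y_t2|x)
    where p1 = P(T=t1|X=x), y*_fact are the factual potential results and
    y*_cf the counterfactual potential results."""
    final1 = y1_fact * p1 + (1.0 - p1) * y1_cf
    final2 = y2_fact * (1.0 - p1) + p1 * y2_cf
    return final1, final2, final1 - final2
```

For example, with p1 = 0.5 and potential results (10, 8) for the first operation and (4, 6) for the second, the predicted final results are 9 and 5, giving an ITE of 4 in favor of the first operation.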


Abstract

Embodiments of the present disclosure provide a method, an electronic device and a computer program product for predicting operation results. A method of predicting an operation result includes determining a first prediction model based on a first set of observation data. The method also includes determining a first probability model based on a first group of observation objects on which a first operation is performed and a second group of observation objects on which a second operation different from the first operation is performed. The method further includes determining a second prediction model based on the first set of observation data and the first probability model. The method further includes determining a first combination of at least the first prediction model, the second prediction model and the first probability model for predicting a first final result of performing the first operation on a target object. With embodiments of the present disclosure, the accuracy of individual treatment effect estimation can be improved, and the scheme can be extended to application scenarios with multiple treatment levels.

Description

Method, Electronic Device and Computer Program Product for Predicting Operation Results

Technical Field

Embodiments of the present disclosure relate to the field of machine learning, and more specifically, to methods, electronic devices and computer program products for predicting operation results.

Background

With the rapid development of information technology, the scale of data has grown quickly. Against this background and trend, machine learning has received increasingly wide attention. Among its applications, inferring causal relationships is a fundamental problem in various fields such as healthcare, education, employment and ecology. Such causal inference problems are often referred to as counterfactual questions, for example, "Would this patient survive longer if he took the new drug?" or "Would this student get a higher score if he participated in the educational training program?"

The main challenge in predicting the treatment effect for each individual is that, for a specific assigned treatment, only the factual outcome can be observed, while the corresponding counterfactual outcome cannot; without the counterfactual outcome it is difficult to determine the true causal effect. Therefore, a method is needed for predicting causal effects more accurately from observed data.
Summary

Embodiments of the present disclosure provide a scheme for predicting operation results.

In a first aspect of the present disclosure, a method of predicting an operation result is provided. The method includes determining a first prediction model based on a first set of observation data, the first set of observation data including a first set of observation results of performing a first operation on a first group of observation objects, the first prediction model being used to predict a first potential result of performing the first operation on a target object. The method also includes determining a first probability model based on the first group of observation objects on which the first operation is performed and a second group of observation objects on which a second operation different from the first operation is performed, the first probability model being used to determine respective probabilities of performing the first operation and the second operation on the target object. The method also includes determining a second prediction model based on the first set of observation data and the first probability model, the second prediction model predicting a second potential result of performing the first operation on the target object by estimating results of performing the first operation on the second group of observation objects. The method also includes determining a first combination of at least the first prediction model, the second prediction model and the first probability model for predicting a first final result of performing the first operation on the target object.

In a second aspect of the present disclosure, a method of predicting an operation result is provided. The method includes, in response to determining that one operation in a set of operations is to be performed on a target object, predicting a first potential result of performing a first operation on the target object. The method also includes determining a first probability of performing the first operation on the target object and a second probability of performing a second operation in the set of operations, the second operation being different from the first operation. The method also includes predicting a second potential result of performing the first operation on the target object by estimating a result of performing the first operation on observation objects on which the second operation was performed. The method also includes predicting a first final result of performing the first operation on the target object based at least on the first potential result, the second potential result, the first probability and the second probability.

In a third aspect of the present disclosure, an electronic device is provided. The electronic device includes a processor and a memory coupled to the processor, the memory having instructions stored therein which, when executed by the processor, cause the device to perform actions. The actions include: determining a first prediction model based on a first set of observation data, the first set of observation data including a first set of observation results of performing a first operation on a first group of observation objects, the first prediction model being used to predict a first potential result of performing the first operation on a target object; determining a first probability model based on the first group of observation objects on which the first operation is performed and a second group of observation objects on which a second operation different from the first operation is performed, the first probability model being used to determine respective probabilities of performing the first operation and the second operation on the target object; determining a second prediction model based on the first set of observation data and the first probability model, the second prediction model predicting a second potential result of performing the first operation on the target object by estimating results of performing the first operation on the second group of observation objects; and determining a first combination of at least the first prediction model, the second prediction model and the first probability model for predicting a first final result of performing the first operation on the target object.

In a fourth aspect of the present disclosure, an electronic device is provided. The electronic device includes a processor and a memory coupled to the processor, the memory having instructions stored therein which, when executed by the processor, cause the device to perform actions. The actions include: in response to determining that one operation in a set of operations is to be performed on a target object, predicting a first potential result of performing a first operation on the target object; determining a first probability of performing the first operation on the target object and a second probability of performing a second operation in the set of operations, the second operation being different from the first operation; predicting a second potential result of performing the first operation on the target object by estimating a result of performing the first operation on observation objects on which the second operation was performed; and predicting a first final result of performing the first operation on the target object based at least on the first potential result, the second potential result, the first probability and the second probability.

In a fifth aspect of the present disclosure, a computer program product is provided. The computer program product is tangibly stored on a computer-readable medium and includes machine-executable instructions which, when executed, cause a machine to perform the method according to the first aspect.

In a sixth aspect of the present disclosure, a computer program product is provided. The computer program product is tangibly stored on a computer-readable medium and includes machine-executable instructions which, when executed, cause a machine to perform the method according to the second aspect.

The Summary is provided to introduce a selection of concepts in a simplified form, which are further described in the Detailed Description below. The Summary is not intended to identify key features or essential features of the present disclosure, nor is it intended to limit the scope of the present disclosure.
Brief Description of the Drawings

Through a more detailed description of exemplary embodiments of the present disclosure in conjunction with the accompanying drawings, the above and other objects, features and advantages of the present disclosure will become more apparent, wherein the same reference numerals generally represent the same components in the exemplary embodiments of the present disclosure. In the drawings:

FIG. 1 shows a schematic diagram of an example environment in which multiple embodiments of the present disclosure can be implemented;

FIG. 2 shows a flowchart of a process of obtaining models according to an embodiment of the present disclosure;

FIG. 3 shows a flowchart of a process for determining a second prediction model according to an embodiment of the present disclosure;

FIG. 4 shows a flowchart of a process of predicting an operation result according to an embodiment of the present disclosure; and

FIG. 5 shows a block diagram of an example device that can be used to implement embodiments of the present disclosure.
Detailed Description

The principles of the present disclosure will be described below with reference to several example embodiments shown in the drawings. Although preferred embodiments of the present disclosure are shown in the drawings, it should be understood that these embodiments are described only to enable those skilled in the art to better understand and thus implement the present disclosure, and are not intended to limit the scope of the present disclosure in any way.

As used herein, the term "include" and its variants denote open-ended inclusion, i.e., "including but not limited to". Unless specifically stated otherwise, the term "or" means "and/or". The term "based on" means "based at least in part on". The terms "one example embodiment" and "one embodiment" mean "at least one example embodiment". The term "another embodiment" means "at least one further embodiment". The terms "first", "second", and so on may refer to different or the same objects. Other explicit and implicit definitions may also be included below.
In embodiments of the present disclosure, the term "treatment assignment mechanism" refers to the underlying policy that depends on individual features and determines the treatment level (e.g., drug or placebo) assigned to each individual studied in an observational experiment. Herein, T is used to denote the treatment, taking value t; for example, t = 0, 1 in the case of only two treatment levels.

In embodiments of the present disclosure, the term "potential outcome" refers to the outcome of an individual if the individual is assigned to a specified treatment level t, denoted Y_t. For example, with two treatment levels t = 0, 1, each individual has two potential outcomes Y_0 and Y_1; with three treatment levels t = 0, 1, 2, each individual has three potential outcomes Y_0, Y_1 and Y_2.

In embodiments of the present disclosure, the term "counterfactual" refers to a hypothetical outcome constructed by negating and re-characterizing a fact that has already occurred. For example, for individuals in the treated group (T=1), the potential outcome Y_1 is the observed factual outcome while Y_0 is the unobserved counterfactual outcome; conversely, for individuals in the control group (T=0), Y_0 is the observed factual outcome and Y_1 is the unobserved counterfactual outcome.

In embodiments of the present disclosure, the term "individual treatment effect (ITE)" is intended to measure how an individual (e.g., a patient) responds to a specific treatment assignment (e.g., a drug) based on the individual's features (e.g., height, weight, age, etc., denoted X). Herein, for the case with multiple treatment levels, the ITE is defined as

ITE_t1,t2(x) = E(Y_t1 - Y_t2 | X = x)

In the case with two treatment levels, the ITE is defined as ITE(x) = E(Y_1 - Y_0 | X = x).

In embodiments of the present disclosure, the term "covariate shift" refers to a change in the distribution of input features (covariates) between two different observation groups (such as the treated group and the control group). This problem often occurs in observational studies and leads to the so-called "treatment selection bias" problem.
As mentioned above, in many practical scenarios it is desirable to predict an individual's outcome after receiving a certain treatment and to predict the difference in outcomes of the individual receiving different treatments, i.e., the individual treatment effect (ITE), so that a computing device can automatically make decisions or assist people in making decisions, i.e., determine whether to perform a certain treatment on an individual or determine which of multiple treatments to perform on the individual. For example, it may be desirable to predict the possible impact of a certain drug or therapy on a patient's condition, so as to formulate a treatment plan automatically or to assist a doctor in doing so. It may also be desirable to predict to what extent a training course can improve a student's grades, or to predict the impact of advertisement delivery on consumers' final purchase behavior, and so on. To make such predictions, counterfactual information needs to be known.

One possible way to estimate counterfactual information is to conduct randomized controlled experiments. However, these experiments are expensive and time-consuming, and are often unavailable. Therefore, predictions need to be made based on observational studies, in which the data distributions under the treated group and the control group are unknown and usually different. This leads to the "treatment selection bias" problem when estimating the ITE from observational data.

Traditionally, methods for estimating the ITE have mainly focused on training regression models or performing sample re-weighting to estimate counterfactual outcomes, involving nearest-neighbor matching, propensity score matching, propensity score re-weighting, and some tree- and forest-based methods such as Bayesian additive regression trees (BART) and causal forests. Other recent work on ITE estimation includes representation learning methods and Gaussian processes.

In the representation learning method, a deep neural network for representation learning first needs to be constructed to map the feature space to a representation space, and the distributions of the treated group and the control group are transferred into the new representation space. Then the error loss of predicting factual outcomes across treated and control samples using two additional representation-based deep neural networks is obtained as the factual outcome loss. A measure of the distance between the treated and control distributions induced by the representation is obtained as the integral probability metric (IPM) distance. Finally, a weighted sum of the factual outcome error loss and the IPM distance is minimized.

The above ITE estimation methods have several problems. Traditional outcome regression methods generally do not consider the "treatment selection bias" problem, while re-weighting methods may suffer from high variance with finite samples. Representation learning methods are biased even with infinite data, and can only be applied to settings with exactly two treatment levels, which limits their use in practical operations where there may be many treatment levels. Gaussian processes are known to have a poor complexity of O(N^3) with respect to the number of samples, so it is not easy to apply this method to large observational studies.

According to embodiments of the present disclosure, a scheme for predicting operation results is proposed. In this scheme, based on observational data, a factual outcome model and a propensity score model are first established; next, a counterfactual outcome model is determined based on the factual outcome model and the propensity score model; finally, the treatment result and the ITE are predicted by a weighted average of the factual and counterfactual outcome models. The scheme of the present disclosure can not only correct the covariate shift in observational data and improve the accuracy of estimating treatment results and the ITE, but can also be extended to scenarios with multiple treatment levels, and can be easily applied to large observational studies.

Embodiments of the present disclosure will be described in detail below with reference to the drawings.
FIG. 1 shows a schematic diagram of an example environment 100 in which multiple embodiments of the present disclosure can be implemented. In this example environment 100, a model 103 is generated by a computing device 102 based on an observation data set 101, and is used to predict the final results and the individual treatment effect ITE of performing one or more operations on a target object 107. The model 103 may include the prediction models and probability models 104 described in detail below, and may define how the outputs of these models are combined to predict operation results. The model 103 may also include an ITE representation 105, which predicts the difference between the final result of performing one operation on the target object 107 and the final result of performing another operation by combining the outputs of the prediction models and probability models 104. Herein, this difference can be regarded as the ITE for the target object 107.

The first set of observation data 110 includes a first set of observation results 112 of performing a first operation on a first group of observation objects 111, and the second set of observation data 120 includes a second set of observation results 122 of performing a second operation on a second group of observation objects 121. In some embodiments, the observation data set 101 may also include a third set of observation data 130, which includes a third set of observation results 132 of performing a third operation on a third group of observation objects 131. Herein, the first operation, the second operation and the third operation are different from one another, and the first group of observation objects 111, the second group of observation objects 121 and the third group of observation objects 131 are different from one another. The first operation, the second operation and the optional third operation described here can be regarded as different treatments given to objects; in the following description, treatment and operation may be used interchangeably.

Although only the first set of observation data 110, the second set of observation data 120 and the optional third set of observation data 130 are shown in FIG. 1, it should be understood that embodiments of the present disclosure can be applied to cases with more treatment levels, so that the observation data set 101 may also include more sets of observation data, where the objects in each set of observation data are subjected to a different operation.

The observation data set 101 includes the operation or treatment (T) for each studied object, the observation result (Y), and the feature information (X) of the object. In the case of two different treatments, T is binary. For ease of discussion, T=1 denotes the first operation (e.g., the treated group) and T=2 denotes the second operation (e.g., the control group). Where a third operation exists, T=3 denotes the third operation. The first set of observation results 112 and the second set of observation results 122 can be denoted by Y, and Y can be discrete or continuous. The first group of observation objects 111 and the second group of observation objects 121 can be represented by their features X, and the features X may include many pre-treatment variables that can be discrete or continuous. The observation data set 101 may come from many observational studies in various disciplines.

The computing device 106 can obtain the model 103 generated by the computing device 102 and the features X of the target object 107, and provide a prediction result 108 based on the model 103. The prediction result 108 may include a prediction of the result of performing a certain operation on the target object 107, and may also include a prediction of the difference between the results of performing two different operations on the target object 107, i.e., a prediction of the ITE.

For example, in medical research, the operation or treatment T indicates whether a patient receives aspirin, the result Y indicates whether the patient's headache disappears, and the features X may include information such as the patient's age, gender and blood pressure. The first set of observation data 110 may include the features X of multiple patients who received aspirin and whether their headaches disappeared after receiving the drug, while the second set of observation data 120 may include the features X of multiple patients who did not receive aspirin and whether their headaches disappeared. Based on such observation data, the computing device 102 can generate the model 103, and the computing device 106 can use the model 103 to predict whether the headache of the target object 107 can disappear with and without aspirin, and can also predict the treatment effect of aspirin on the target object 107, so as to determine, or assist in determining, whether the target object 107 should receive aspirin treatment.

As another example, the operation or treatment T may indicate whether a product is recommended to a consumer, the result Y indicates whether the consumer purchases the product, and the features X may include information such as the consumer's income and purchase history. In this scenario, the first set of observation data 110 may include the features of consumers to whom the product was recommended and whether they purchased it; the second set of observation data 120 may include the features of consumers to whom the product was not recommended and whether they purchased it. Similarly, the computing device 106 can use the generated model 103 to predict the effect of product recommendation on the target object 107, so as to determine, or help determine, whether to push the product to the target object 107.
To understand the scheme provided by embodiments of the present disclosure more clearly, embodiments of the present disclosure will be further described with reference to FIG. 2. FIG. 2 shows a flowchart of a process 200 of obtaining models according to an embodiment of the present disclosure. The process 200 may be implemented by the computing device 102 of FIG. 1. For ease of discussion, the process 200 will be described in conjunction with FIG. 1, with T=1 denoting the first operation, T=2 denoting the second operation, and T=t denoting an arbitrary operation.

At block 210, the computing device 102 determines a first prediction model based on the first set of observation data 110. The first set of observation data 110 includes the first set of observation results 112 of performing the first operation on the first group of observation objects 111, and the first prediction model is used to predict the first potential result of performing the first operation on the target object 107. The first prediction model herein may also be referred to as the factual outcome prediction model for the first operation.

Any appropriate method may be used to determine the first prediction model, for example using a neural network. In some embodiments, the computing device 102 may establish a first objective function for determining the first prediction model based on the first group of observation objects 111 and the first set of observation results 112, and determine the model parameters of the first prediction model by minimizing the first objective function.

For example, the computing device 102 may train a deep neural network to model the corresponding observation results Y against the features X of each observation object in the first group of observation objects 111 (e.g., the observation objects with T=1), so as to obtain the conditional expectation of Y given T=1 and features X, namely E(Y|X, T=1). After the neural network model is fully trained, the established first prediction model, i.e., the factual outcome model for the first operation, can be as shown in equation (1):

E(Y|X, T=1)  (1)

In some embodiments, the computing device 102 may also determine, based on the second set of observation data 120, a factual outcome prediction model for the second operation, which is also referred to herein as the fourth prediction model. The fourth prediction model is used to predict the potential result of performing the second operation on the target object 107.

Similar to the determination of the first prediction model, the computing device 102 may also use a neural network to determine the fourth prediction model. For example, the computing device 102 may train a deep neural network to model the corresponding observation results Y against the features X of each observation object in the second group of observation objects 121 (e.g., T=2), so as to obtain the conditional expectation of Y given T=2 and features X, namely E(Y|X, T=2). After the neural network model is fully trained, the established fourth prediction model, i.e., the factual outcome model for the second operation, can be as shown in equation (2):

E(Y|X, T=2)  (2)

In practice, to reduce the computational complexity of training the deep neural network, the number of hidden layers can be set to 1 or 2, with all hidden layers having the same dimension, larger than the dimension of the features X; the learning rate is taken from the candidate set {1.0, 0.1, 0.01, 0.001, 0.0001}, the regularization parameter from the candidate set {0, 1e-4, 1e-2}, the batch size is 64, and the number of iterations is 10000. The train/test splitting technique often used in machine learning algorithms can then be used to select the best parameters from the candidate sets.

It should be understood that the above training manner for the neural network is merely illustrative and is not intended to limit the scope of the present disclosure. It should also be understood that for any case of T=t, the factual outcome prediction model for the t-th operation can similarly be determined based on the observed data, as shown in equation (3):

E(Y|X, T=t)  (3)

The factual outcome prediction model for an arbitrary operation t can also be determined in other ways. For example, a representation of the features X can be learned, and the factual outcome prediction model for operation t can then be determined through neural network learning in the representation space of the features X.
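As a minimal sketch of the factual outcome models E(Y|X, T=t), the following fits one regressor per observed treatment level. A least-squares line over a one-dimensional feature stands in for the deep neural network described above; the helper name fit_factual_models and the scalar feature are assumptions made for this illustration.

```python
def fit_factual_models(X, T, Y):
    """Fit one outcome regressor per observed treatment level t, a stand-in
    for E(Y|X, T=t); each regressor is a least-squares line y ~ a*x + b
    over the group of objects that received treatment t."""
    models = {}
    for t in set(T):
        xs = [x for x, ti in zip(X, T) if ti == t]
        ys = [y for y, ti in zip(Y, T) if ti == t]
        n = len(xs)
        sx, sy = sum(xs), sum(ys)
        sxx = sum(x * x for x in xs)
        sxy = sum(x * y for x, y in zip(xs, ys))
        det = n * sxx - sx * sx
        a = (n * sxy - sx * sy) / det
        b = (sxx * sy - sx * sxy) / det
        models[t] = lambda x, a=a, b=b: a * x + b  # predicts E(Y|X=x, T=t)
    return models
```

Calling models[t](x) then plays the role of equation (3) for a new object with feature value x.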
At block 220, the computing device 102 determines a first probability model based on the first group of observation objects 111 on which the first operation is performed and the second group of observation objects 121 on which the second operation different from the first operation is performed. The first probability model is used to determine the respective probabilities of performing the first operation and the second operation on the target object 107.

In embodiments where a third operation exists, the computing device 102 may determine a second probability model based on the first group of observation objects 111, the second group of observation objects 121 and the third group of observation objects 131 on which the third operation is performed. The second probability model is used to determine the probability of performing the third operation on the target object 107.

Herein, the first probability model, the second probability model and possibly more probability models are collectively referred to as the probability models. It should be understood that the first probability model may include two models respectively used to determine the first probability of performing the first operation on the target object 107 and the second probability of performing the second operation, e.g., P(T=1|X) and P(T=2|X) described below.

One implementation of the probability model may be a propensity score model. The computing device 102 may use a deep neural network to model the treatment or operation T against the features X (of the groups of observation objects in the observation data set 101) to obtain the probability model. For example, the computing device 102 may train a deep neural network to model the corresponding operation T against the features X of the observation objects, so as to obtain, as the probability model, the conditional probability that an individual object is subjected to the t-th operation given the features X, namely equation (4):

P(T=t|X)  (4)

Equation (4) represents the probability of performing the t-th operation on the target object 107 with features X. For example, P(T=1|X) represents the probability of performing the first operation on the target object 107 with features X; P(T=2|X) represents the probability of performing the second operation on the target object 107 with features X; and P(T=3|X) represents the probability of performing the third operation on the target object 107 with features X. A probability model such as that shown in equation (4) can be regarded as a propensity score model or the treatment assignment mechanism mentioned above. It should be understood that all observation objects in the observation data set 101 can be used to train the neural network for the probability model.

When training this deep neural network in practical applications, since the values of T are discrete, the cross-entropy loss can be used as the loss function; the number of hidden layers is 1 or 2, all hidden layers have the same dimension, larger than the dimension of the features X, the learning rate is taken from the candidate set {0.8, 0.1, 0.05, 0.005, 0.001}, the regularization parameter is 0, the batch size is 64, and the number of iterations is 10000. Similar to the training of the deep neural networks for the factual outcome prediction models described above, the computing device 102 can also use the train/test splitting technique to select the best hyperparameters from the candidate sets.
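The propensity score model P(T=t|X) of equation (4) can be sketched as a small multinomial logistic regression trained with the cross-entropy loss, standing in for the deep network described above. The function name train_propensity, the gradient-descent hyperparameters and the list-based features are illustrative assumptions, not the disclosed implementation.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train_propensity(X, T, n_levels, lr=0.5, epochs=200):
    """Fit P(T=t|X) by multinomial logistic regression with the
    cross-entropy loss; T takes values 0..n_levels-1."""
    d = len(X[0])
    W = [[0.0] * (d + 1) for _ in range(n_levels)]  # per-level weights + bias
    for _ in range(epochs):
        for x, t in zip(X, T):
            xb = list(x) + [1.0]
            scores = [sum(wj * vj for wj, vj in zip(W[k], xb)) for k in range(n_levels)]
            p = softmax(scores)
            for k in range(n_levels):
                g = p[k] - (1.0 if k == t else 0.0)  # cross-entropy gradient w.r.t. score k
                for j in range(d + 1):
                    W[k][j] -= lr * g * xb[j] / len(X)
    def predict(x):
        xb = list(x) + [1.0]
        return softmax([sum(wj * vj for wj, vj in zip(W[k], xb)) for k in range(n_levels)])
    return predict
```

predict(x) returns a probability vector over the treatment levels, which plays the role of the treatment assignment mechanism.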
At block 230, the computing device 102 determines a second prediction model based on the first set of observation data 110 and the first probability model determined at block 220. The second prediction model predicts the potential result of performing the first operation on the target object 107 by estimating the results of performing the first operation on the second group of observation objects 121.

In embodiments where a third operation exists, the computing device 102 may also determine a third prediction model based on the first set of observation data 110, the first probability model and the second probability model. The third prediction model predicts a third potential result of performing the first operation on the target object 107 by estimating the results of performing the first operation on the third group of observation objects 131. The second prediction model and the third prediction model described herein can be regarded as counterfactual outcome prediction models for the first operation.

In this regard, FIG. 3 shows a flowchart of a process 300 for determining the second prediction model according to an embodiment of the present disclosure. The general process for determining a counterfactual outcome model will also be described in conjunction with FIG. 3. In some embodiments, the process 300 can be regarded as one implementation of block 230 in FIG. 2, and can be executed, for example, by the computing device 102 shown in FIG. 1. It should be understood that the process 300 may also include additional steps not shown and/or may omit steps shown. The scope of the present disclosure is not limited in this respect.
At block 310, the computing device 102 determines sample weights based on the first probability model determined at block 220, the number of the first group of observation objects 111 and the number of the second group of observation objects 121. The sample weights are used to correct the shift of the distribution of the second group of observation objects 121 relative to the distribution of the first group of observation objects 111. The sample weights determined here can be regarded as importance sampling weights, which are used to correct the covariate shift problem mentioned above, and will be used in the subsequent counterfactual outcome prediction.

In general, for an observation object on which the t'-th operation is performed, its weight w_t',t for the t-th operation is as shown in equation (5):

w_t',t(X) = p(X|T=t') / p(X|T=t)  (5)

where p(X|T=t) denotes the conditional density function of X given T=t.

Since what is to be determined here is the counterfactual outcome prediction model for the first operation, the sample weights for the observation objects on which the first operation was not performed are considered first. Taking the situation shown in FIG. 1 as an example, for the second group of observation objects 121 on which the second operation is performed, the sample weight for the first operation, denoted w_2,1 := w_2, is expressed as equation (6):

w_2(X) = p(X|T=2) / p(X|T=1)  (6)

and w_2 can be computed according to equation (7):

w_2(X) = (P(T=1) / P(T=2)) * (P(T=2|X) / P(T=1|X))  (7)

where the ratio of the marginal probabilities P(T=1)/P(T=2) can be estimated from the number of the first group of observation objects 111 and the number of the second group of observation objects 121, and the conditional probabilities P(T=1|X) and P(T=2|X) can be estimated by the probability model determined at block 220.
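The weight computation of equations (6)-(7) can be sketched as follows. The helper name sample_weight is an assumption; its inputs are the propensity estimates P(T=target|X=x) and P(T=source|X=x) from the probability model of block 220, and the group sizes that estimate the marginal probabilities.

```python
def sample_weight(p_target_given_x, p_source_given_x, n_target, n_source):
    """Importance weight w = p(X|T=target) / p(X|T=source), computed via
    Bayes' rule as (P(T=source)/P(T=target)) * (P(T=target|X)/P(T=source|X)).
    The marginal ratio P(T=source)/P(T=target) is estimated by the ratio
    of the group sizes n_source/n_target."""
    prior_ratio = n_source / n_target
    return prior_ratio * (p_target_given_x / p_source_given_x)
```

For instance, with equal group sizes and a propensity of 0.8 toward the target group at some x, the weight is 0.8/0.2 = 4, upweighting source samples that look like target samples.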
Next, the computing device 102 may determine the second prediction model, i.e., the counterfactual outcome prediction model for the first operation, based on the determined sample weights (e.g., equation (7)) and the first set of observation data 110. For example, the computing device 102 may employ transfer learning techniques to determine the second prediction model.

At block 320, the computing device 102 establishes a second objective function for determining the second prediction model based on the sample weights, the first group of observation objects 111 and the first set of observation results 112.

The counterfactual outcome Y_1 of performing the first operation on the second group of observation objects 121 cannot be observed. Based on a transfer learning approach, the counterfactual outcomes for the second group of observation objects 121 can be predicted with the aid of the first group of observation objects 111, whose outcomes Y_1 were observed, and the corresponding first set of observation results 112. For example, an objective function as shown in equation (8) can be established:

min over h_2 of: sum over i with T_i=1 of w_2(X_i) * (Y_i - h_2(X_i))^2 + a_2 * ||h_2||^2  (8)

where h_2(.) represents a deep neural network, a_2 represents a regularization parameter, and ||h_2||^2 represents the model complexity penalty term. The objective function represents a sum over each observation object i (T_i=1) in the first group of observation objects 111. It should be understood that for cases with more than two treatment levels, the objective function established here should also include the weight terms w_t,i for the other operations t.

At block 330, the computing device 102 determines the model parameters of the second prediction model by minimizing the second objective function, thereby obtaining the second prediction model. For example, the computing device 102 may learn the neural network h_2(X) by minimizing the objective function shown in equation (8), and use h_2(X) to obtain the second prediction model, as shown in equation (9):

E(Y_1|X, T=2)  (9)

In embodiments where a third operation exists, the computing device 102 may also consider the weight w_3,1 := w_3 of the third group of objects 131 on which the third operation is performed for the first operation, as can be derived from equation (5). The third prediction model mentioned above can thereby be obtained similarly, as shown in equation (9'):

E(Y_1|X, T=3)  (9')
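The weighted objective of equation (8) can be illustrated with a closed-form, one-dimensional weighted ridge regression in place of the deep network h_2. The helper name weighted_ridge_1d and the scalar feature are assumptions made for this sketch.

```python
def weighted_ridge_1d(xs, ys, ws, alpha=0.0):
    """Minimise sum_i w_i * (y_i - (a*x_i + b))**2 + alpha * (a*a + b*b),
    a one-dimensional analogue of the weighted objective (8)."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    # normal equations: [swxx+alpha, swx; swx, sw+alpha] @ [a; b] = [swxy; swy]
    det = (swxx + alpha) * (sw + alpha) - swx * swx
    a = ((sw + alpha) * swxy - swx * swy) / det
    b = ((swxx + alpha) * swy - swx * swxy) / det
    return a, b
```

With the importance weights of equation (7) supplied as ws over the factual samples, the fitted function plays the role of the counterfactual model E(Y_1|X, T=2) of equation (9); increasing alpha shrinks the coefficients, mirroring the complexity penalty.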
The process of determining the counterfactual outcome prediction models for the first operation, i.e., the second prediction model and possibly the third prediction model, has been described above. In some embodiments, the computing device 102 may similarly determine the counterfactual outcome prediction model for any t-th operation (e.g., the second operation). For ease of discussion, the counterfactual outcome prediction model for the second operation is referred to herein as the fifth prediction model. For example, the computing device 102 may determine the fifth prediction model based on the second set of observation data 120 and the first probability model determined at block 220. The fifth prediction model predicts the potential result of performing the second operation on the target object 107 by estimating the results of performing the second operation on the first group of observation objects 111.

The fifth prediction model can be determined similarly to the process described with reference to FIG. 3. Since what is to be determined here is the counterfactual outcome prediction model for the second operation, the sample weights for the observation objects on which the second operation was not performed are considered first. Similar to what was described with respect to block 310, for the first group of observation objects 111 on which the first operation is performed, the sample weight for the second operation, denoted w_1,2 := w_1, is expressed as equation (10):

w_1(X) = p(X|T=1) / p(X|T=2)  (10)

and w_1 can be computed according to equation (11):

w_1(X) = (P(T=2) / P(T=1)) * (P(T=1|X) / P(T=2|X))  (11)

where the ratio of the marginal probabilities P(T=2)/P(T=1) can be estimated from the number of the first group of observation objects 111 and the number of the second group of observation objects 121, and the conditional probabilities P(T=1|X) and P(T=2|X) can be estimated by the probability model determined at block 220. In particular, in the case of only two treatment levels, w_1 = 1/w_2.
Similar to what was described with respect to block 320, the computing device 102 next establishes an objective function for determining the fifth prediction model based on the sample weights, the second group of observation objects 121 and the second set of observation results 122.

The counterfactual outcome Y_2 of performing the second operation on the first group of observation objects 111 cannot be observed. Based on the transfer learning approach, the counterfactual outcomes for the first group of observation objects 111 can be predicted with the aid of the second group of observation objects 121, whose outcomes Y_2 were observed, and the corresponding second set of observation results 122. For example, an objective function as shown in equation (12) can be established:

min over h_1 of: sum over i with T_i=2 of w_1(X_i) * (Y_i - h_1(X_i))^2 + a_1 * ||h_1||^2  (12)

where h_1(.) represents a deep neural network, a_1 represents a regularization parameter, and ||h_1||^2 represents the model complexity penalty term. The objective function represents a sum over each observation object i (T_i=2) in the second group of observation objects 121.

Similar to what was described with respect to block 330, the computing device 102 can then learn the neural network h_1(X) by minimizing the objective function shown in equation (12), and use h_1(X) to obtain the fifth prediction model, as shown in equation (13):

E(Y_2|X, T=1)  (13)

In embodiments where a third operation exists, the counterfactual outcome prediction model for the second operation for the third group of observation objects 131 on which the third operation is performed can also be determined similarly, as shown in equation (13'):

E(Y_2|X, T=3)  (13')
The determination of the counterfactual outcome prediction models has been described above using two and three treatment levels as examples. For the case of multiple treatment levels, the computing device 102 can determine, for any operation t' different from t, the counterfactual outcome prediction model for the t-th operation, as shown in equation (14):

E(Y_t|X, T=t')  (14)

In such embodiments, by replacing the variables w_1i^gamma and w_0i^gamma in the traditional re-weighting methods with the sample weights described above, the instability problem of the traditional methods can be avoided. In addition, another advantage of this approach is that the train/test splitting technique can still be used to select the optimal parameters; specifically, although the counterfactual outcomes are not observed, the weighted error can be used as the test error.
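The weighted-test-error idea can be sketched as follows: with importance weights, the validation error approximates the error under the counterfactual (target) distribution, so the regularization parameter can still be chosen by train/test splitting even though counterfactual outcomes are unobserved. The helper names weighted_mse and select_alpha, and the fit callback, are illustrative assumptions.

```python
def weighted_mse(model, xs, ys, ws):
    """Weighted squared error; with importance weights this estimates the
    error on the counterfactual (target) distribution using only the
    observed factual outcomes."""
    return sum(w * (model(x) - y) ** 2 for w, x, y in zip(ws, xs, ys)) / sum(ws)

def select_alpha(fit, candidates, train, valid):
    """Pick the regularisation parameter minimising the weighted test error.
    fit(train, alpha) must return a callable model; valid is (xs, ys, ws)."""
    best, best_err = None, float("inf")
    for alpha in candidates:
        model = fit(train, alpha)
        err = weighted_mse(model, *valid)
        if err < best_err:
            best, best_err = alpha, err
    return best
```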
Continuing to refer to FIG. 2, at block 240 the computing device 102 determines a first combination of at least the first prediction model, the second prediction model and the first probability model for predicting the first final result of performing the first operation on the target object 107. For example, the computing device 102 may use equations (1), (4) and (9) to determine the first combination. For example, the computing device 102 may combine the outputs of the various prediction models and probability models according to a predetermined relationship.

In general, for any t-th operation, the final result of performing the t-th operation on the target object 107 with features X can be predicted by the following formula:

E(Y_t|X) = E(Y|T=t, X) * P(T=t|X) + sum over t' != t of P(T=t'|X) * E(Y_t|X, T=t')  (15)

where the term E(Y|T=t, X) is the factual outcome prediction model described with respect to block 210, P(T=t|X) and P(T=t'|X) are the probability models, i.e., the propensity score models, described with respect to block 220, and E(Y_t|X, T=t') is the counterfactual outcome prediction model described with respect to block 230 and FIG. 3.

For the first combination used to predict the final result of performing the first operation on the target object 107, equation (15) above can be embodied as:

E(Y_1|X) = E(Y|X, T=1) * P(T=1|X) + sum over t' != 1 of P(T=t'|X) * E(Y_1|X, T=t')  (16)

When there are only two treatment levels (e.g., 1 and 2), the first combination can further be embodied as:

E(Y_1|X) = E(Y|X, T=1) P(T=1|X) + P(T=2|X) E(Y_1|X, T=2)  (17)

When there are three treatment levels (e.g., 1, 2 and 3), the first combination can further be embodied as:

E(Y_1|X) = E(Y|X, T=1) P(T=1|X) + P(T=2|X) E(Y_1|X, T=2) + P(T=3|X) E(Y_1|X, T=3)  (17')
In this manner, the computing device 102 generates the first combination for predicting the final result of performing the first operation on the target object 107. The computing device 102 may provide the factual outcome prediction models, counterfactual outcome prediction models and probability models described above as part of the model 103 for the computing device 106 to determine prediction results and the ITE.

In some embodiments, the computing device 102 may generate a combination for predicting the final result of performing any t-th operation on the target object 107. For example, the computing device 102 may determine a second combination based on the fourth prediction model, the fifth prediction model and the probability models mentioned above, for predicting the second final result of performing the second operation on the target object 107.

For the second combination used to predict the final result of performing the second operation on the target object 107, equation (15) above can be embodied as:

E(Y_2|X) = E(Y|X, T=2) * P(T=2|X) + sum over t' != 2 of P(T=t'|X) * E(Y_2|X, T=t')  (18)

When there are only two treatment levels, the computing device 102 may determine the second combination using, for example, equations (2), (4) and (13). Similar to equation (17), in this case the second combination can be embodied as:

E(Y_2|X) = E(Y|X, T=2) P(T=2|X) + P(T=1|X) E(Y_2|X, T=1)  (19)
The computing device 102 may provide the combination determined in the above manner for predicting the final result of performing the t-th operation on the target object 107, namely E(Y_t|X), so that the result of performing a certain operation or treatment on the target object 107 can be predicted. In some cases, it is also desirable to predict the difference of performing different treatments or operations on the target object 107, i.e., to predict the ITE.

In some embodiments, the computing device 102 may determine a difference representation based on the first combination and the second combination described above, for predicting the difference between the first final result of performing the first operation on the target object 107 and the second final result of performing the second operation on the target object 107. The difference representation is, for example, the ITE representation 105 shown in FIG. 1.

In general, the individual treatment effect ITE for any two operations t_1 and t_2 can be expressed as:

ITE_t1,t2(x) = E(Y_t1|X=x) - E(Y_t2|X=x)  (20)

where E(Y_t1|X=x) and E(Y_t2|X=x) can be obtained from equation (15). It can be seen from equation (20) that the value of the ITE depends not only on the feature information of the individual (the value of X for the target object 107) but also on the treatment levels considered. This definition allows the ITE between any two treatment levels to be predicted.
计算设备102可以将包括但不限于预测模型和概率模型104以及ITE表示105的模型103提供给例如计算设备106。基于模型103,计算设备106可以针对目标对象107来预测操作或处置结果和个体因果效应中的至少一项。例如,计算设备106可以针对患者来预测给予某种治疗与不给予该治疗之间的差异,从而自动地或者辅助医生或患者做出决策。
The above describes the model according to the present disclosure for obtaining predictions of operation results and the ITE. This model not only improves the accuracy of estimating treatment results and the ITE, but also extends to scenarios with multiple treatment levels. In addition, the scheme can readily be applied to large observational studies.
Figure 4 shows a flowchart of a process 400 of predicting an operation result according to an embodiment of the present disclosure. The process 400 may be implemented by the computing device 106 of Figure 1. For ease of discussion, the process 400 will be described in conjunction with Figure 1.
At block 410, the computing device 106 determines whether one operation of a set of operations is to be performed on the target object 107. For example, if the computing device 106 receives an input concerning the features X of the target object 107 (a value x of X), it may be considered that some operation is about to be performed on the target object 107. It should be understood that the first operation and the second operation as described may refer to any two operations in a scenario with multiple treatment levels, and are not limited to the case of only two treatment levels.
If it is determined at block 410 that one operation of the set of operations is to be performed on the target object 107, the process proceeds to block 420. At block 420, the computing device 106 predicts a first potential result of performing the first operation (t_1) on the target object 107. The computing device 106 may obtain the first prediction model described above with respect to block 210 and predict the first potential result based on that model, for example obtaining the first potential result E(Y|X=x, T=t_1) using equation (3).
At block 430, the computing device 106 determines a first probability of performing the first operation (t_1) and a second probability of performing the second operation (t_2) on the target object 107. The computing device 106 may obtain the probability model described above with respect to block 220 and determine the corresponding probabilities based on that model. For example, the computing device 106 may use equation (4) to determine the probabilities of performing the first operation (t_1) and the second operation (t_2) on the target object 107 as P(T=t_1|X=x) and P(T=t_2|X=x), respectively. In embodiments where a third operation (t_3) exists, the computing device 106 may also use equation (4) to determine the probability of performing the third operation (t_3) on the target object 107 as P(T=t_3|X=x).
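One common way to realize a probability model over multiple treatment levels is a multinomial-logistic (softmax) form. The patent's equation (4) is not reproduced here, and the parameter values below are assumed for illustration only:

```python
import numpy as np

def propensity(x, W, b):
    """Return P(T=t | X=x) for every treatment level t as a probability vector."""
    scores = W @ x + b          # one linear score per treatment level
    scores -= scores.max()      # shift for numerical stability
    exp_s = np.exp(scores)
    return exp_s / exp_s.sum()

# Hypothetical parameters: 3 treatment levels, 2 features.
W = np.array([[0.5, -0.2],
              [0.1, 0.3],
              [-0.4, 0.1]])
b = np.array([0.0, 0.1, -0.1])
x = np.array([1.0, 2.0])

p = propensity(x, W, b)
print(p.sum())  # the probabilities over all levels sum to 1
```

Any model that outputs a valid probability vector over the treatment levels for a given x could play the same role at block 430.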
At block 440, the computing device 106 predicts a second potential result of performing the first operation on the target object 107 by estimating the result of performing the first operation on the observation objects on which the second operation was performed (e.g., the second group of observation objects 120). The computing device 106 may obtain the second prediction model described above with respect to block 230 and Figure 3 and predict the second potential result based on that model, for example computing the second potential result E(Y_{t_1}|X=x, T=t_2) using equation (14). In embodiments where a third operation (t_3) exists, the computing device 106 may also obtain the third prediction model described above and predict a third potential result based on it, for example computing the third potential result E(Y_{t_1}|X=x, T=t_3) using equation (14).
At block 450, the computing device 106 predicts a first final result of performing the first operation on the target object 107 based at least on the first potential result, the second potential result, the first probability, and the second probability. For example, the computing device 106 may use the computations at blocks 420, 430, and 440 to determine the first final result. In the case of two treatment levels, the first final result can be expressed as:

E(Y_{t_1}|X=x) = E(Y|X=x, T=t_1)P(T=t_1|X=x) + P(T=t_2|X=x)E(Y_{t_1}|X=x, T=t_2)
In embodiments where a third operation exists, the computing device 106 may predict the first final result based on the first potential result, the second potential result, the third potential result, the first probability, the second probability, and the third probability. In the case of three treatment levels, the first final result can be expressed as:

E(Y_{t_1}|X=x) = E(Y|X=x, T=t_1)P(T=t_1|X=x) + P(T=t_2|X=x)E(Y_{t_1}|X=x, T=t_2) + P(T=t_3|X=x)E(Y_{t_1}|X=x, T=t_3)
In some embodiments, the computing device 106 may also predict a fourth potential result of performing the second operation on the target object 107, for example based on the fourth prediction model described above. The computing device 106 may predict a fifth potential result of performing the second operation on the target object 107 by estimating the result of performing the second operation on the observation objects on which the first operation was performed, for example based on the fifth prediction model described above. The computing device 106 may then predict a second final result of performing the second operation on the target object 107 based on the fourth potential result, the fifth potential result, and the corresponding probabilities. In the case of two treatment levels, the second final result can be expressed as:

E(Y_{t_2}|X=x) = E(Y|X=x, T=t_2)P(T=t_2|X=x) + P(T=t_1|X=x)E(Y_{t_2}|X=x, T=t_1)
In the case of three treatment levels, the second final result can be expressed as:

E(Y_{t_2}|X=x) = E(Y|X=x, T=t_2)P(T=t_2|X=x) + P(T=t_1|X=x)E(Y_{t_2}|X=x, T=t_1) + P(T=t_3|X=x)E(Y_{t_2}|X=x, T=t_3)
In some embodiments, the computing device 106 may predict the difference between the first final result of performing the first operation on the target object 107 and the second final result of performing the second operation on the target object 107, i.e., the ITE of the first operation relative to the second operation. For example, the computing device 106 may compute, based on the ITE representation 105, the ITE of the first operation t_1 relative to the second operation t_2 for the target object 107:

ITE_{t_1,t_2}(x) = E(Y_{t_1}|X=x) − E(Y_{t_2}|X=x)
In some embodiments, the computing device 106 may provide (e.g., to a user) at least one of the predicted first final result, second final result, and ITE (e.g., the prediction result 108 shown in Figure 1), to help, for example, a user decide whether to perform the first operation or the second operation on the target object 107.
In some embodiments, the computing device 106 may determine, based on the determined difference (e.g., the ITE), a target operation to be performed on the target object 107 from among the first operation and the second operation. The computing device 106 may select the target operation simply based on the determined difference. For example, if the determined difference indicates that the ITE is a positive or beneficial effect, the computing device 106 may determine that the first operation is to be performed on the target object 107. The computing device 106 may also take further factors into account when selecting the target operation, for example the cost difference between the first operation and the second operation (e.g., time cost, monetary cost). For instance, if the ratio of the ITE to the cost difference is greater than a threshold ratio, the computing device 106 may determine that the first operation is to be performed on the target object 107.
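The decision step just described can be sketched as a small rule. The ITE value, costs, and threshold below are assumed for illustration only:

```python
# Choose the target operation from the predicted ITE (effect of operation 1
# relative to operation 2), optionally requiring the effect-to-cost ratio to
# exceed a threshold when operation 1 is the more expensive one.

def choose_operation(ite, cost_1, cost_2, threshold_ratio=None):
    """Return 1 or 2: the operation to perform on the target object."""
    if threshold_ratio is None:
        return 1 if ite > 0 else 2       # purely effect-based choice
    cost_diff = cost_1 - cost_2
    if cost_diff <= 0:                   # operation 1 is no more expensive
        return 1 if ite > 0 else 2
    return 1 if ite / cost_diff > threshold_ratio else 2

print(choose_operation(0.42, cost_1=0, cost_2=0))                            # 1
print(choose_operation(0.42, cost_1=100, cost_2=10, threshold_ratio=0.01))   # 2
```

In the second call the positive effect (0.42) does not justify the extra cost (90) at the given threshold, so the cheaper operation is chosen.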
Although the processes 200 and 400 are described as being implemented at two computing devices, it should be understood that the processes described herein may be implemented at the same computing device or by a distributed computing system. The scope of the present disclosure is not limited in this respect.
A specific example is described below. The dataset used here comes from a semi-simulated study based on the Infant Health and Development Program (IHDP). The IHDP data has treatments and features from a real randomized experiment studying the effect of high-quality child care and home visits on future cognitive test scores. The experiment uses simulated outcomes, so that the true causal effects are known. Starting from the experimental data, an observational study is created by removing a subset of the treated group. The dataset consists of 747 individuals (139 treated, 608 control), each represented by 25 covariates measuring characteristics of the child and their mother.
The treatment indicates whether a premature infant received high-quality child care and home visits from a trained provider; the continuous outcome represents the infant's future cognitive test score. The 25 features include measurements of the child — birth weight, head circumference, weeks born preterm, sex, twin status — and measurements of the mother — age at delivery, marital status, educational attainment. In total there are 6 continuous covariates and 19 binary covariates.
In the step described above with respect to block 210, a predicted future cognitive test score for a premature infant is obtained given the infant's 25 features.

In the step described above with respect to block 220, a prediction of the underlying probability that a premature infant receives high-quality child care and home visits is obtained based on the infant's 25 features.

In the step described above with respect to block 230, for infants who received high-quality child care and home visits, a predicted future cognitive test score is obtained under the assumption that they did not receive such child care services; for infants who did not receive high-quality child care and home visits, a predicted future cognitive test score is obtained under the assumption that they did.

Finally, based on the above steps, an estimate of the causal effect of high-quality child care and home visits from a trained provider on each premature infant's future cognitive test score can be obtained.
The above IHDP data were learned using the conventional method described above, the representation learning method, and the method of the present disclosure to predict the ITE. The results show that the method of the present disclosure outperforms both the conventional method and the representation learning method in terms of the precision in estimation of heterogeneous effects (PEHE) and the root mean square error.
An implementation of the scheme of the present disclosure is described below in conjunction with specific embodiments.
In the medical field, for a given condition, a doctor or the patient may need to choose among multiple treatment means or modalities, for example whether to administer a certain drug, which of several drugs to choose, or which of several physical therapies (such as infrared or magnetic therapy) to choose. For ease of description below, these will be denoted treatments A, B, C, and so on. Doctors usually choose a treatment based on personal experience, but this relies heavily on subjective judgment and accumulated experience. Medical institutions typically possess records of patients with the same condition who received treatments A, B, and C respectively (hereinafter referred to as observation data).
The observation data may include various features of previous individual patients, such as age, weight, blood pressure, and examination data related to the condition, as well as the treatment effects of these individual patients after receiving treatment, such as whether the condition disappeared and the physiological parameters after treatment. It is desirable to use such observation data to predict the treatment effects of these treatments on a target patient who is about to receive treatment, as well as the difference in effect between different treatments (e.g., the ITE), so as to determine a treatment for that target patient. However, such observation data typically suffers from the covariate shift problem mentioned above. For example, due to cost differences between treatments, the observation data may exhibit a shift in the economic status of the individual patients. If this covariate shift problem is not taken into account, the prediction results obtained will not be sufficiently objective and accurate.
The scheme provided in the present disclosure can solve the problems caused by covariate shift, accurately predicting the treatment effects of treatments A, B, and C on the target patient as well as the difference in effect between any two treatments. The predicted treatment effects and/or effect differences can be provided to doctors or patients to assist them in choosing a treatment. Additionally or alternatively, the computing device may automatically determine the treatment based on predetermined rules, as described above. In this way, a more suitable and more effective treatment can be selected for the patient.
In the field of education, a student may need to choose among multiple courses of the same type with different specific arrangements (e.g., English courses with different proportions of listening, speaking, reading, and writing), or an educational institution may need to recommend a more suitable course to a student. Educational institutions typically possess observation data in this respect. The observation data may include features of students who previously took courses D, E, F, and so on, such as age, sex, whether they have taken similar courses, and family economic situation, as well as the performance of individual students after taking the corresponding course, such as exam scores and awards.
It is desirable to use such observation data to help a student who is currently choosing courses make a decision, or to recommend a more suitable course to them. Similarly to what was mentioned above, the observation data typically exhibits covariate shift, which prevents conventional prediction methods from recommending courses to students accurately and objectively.
The above scheme provided in the present disclosure can solve the covariate shift problem, accurately predicting the learning effects of courses D, E, and F for a target student as well as the difference in effect between any two courses. The predicted learning effects and/or effect differences can be provided to students to assist them in course selection. Additionally or alternatively, the computing device may recommend a more suitable course to the student based on predetermined rules, as described above. For example, if the ITE for courses D and E indicates a positive or beneficial effect, course D may be recommended to the target student. In this way, a more suitable and more beneficial course can be selected for the student.
Figure 5 shows a schematic block diagram of an example device 500 that can be used to implement embodiments of the present disclosure. As shown, the device 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to computer program instructions stored in a read-only memory (ROM) 502 or loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 may also store various programs and data required for the operation of the device 500. The CPU 501, the ROM 502, and the RAM 503 are connected to one another via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in the device 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard or a mouse; an output unit 507, such as various types of displays and speakers; a storage unit 508, such as a magnetic disk or an optical disc; and a communication unit 509, such as a network card, a modem, or a wireless communication transceiver. The communication unit 509 allows the device 500 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
The processing unit 501 performs the various methods and processing described above, for example any of the processes 200, 300, and 400. For example, in some embodiments, the processes 200, 300, and 400 may be implemented as a computer software program or computer program product tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the CPU 501, one or more steps of any of the processes 200, 300, and 400 described above may be performed. Alternatively, in other embodiments, the CPU 501 may be configured to perform any of the processes 200, 300, and 400 by any other suitable means (e.g., by means of firmware).
According to some embodiments of the present disclosure, a computer-readable medium is provided, on which a computer program is stored that, when executed by a processor, implements the method according to the present disclosure.
Those skilled in the art should understand that the steps of the above method of the present disclosure may be implemented by a general-purpose computing apparatus; they may be concentrated on a single computing apparatus or distributed over a network composed of multiple computing apparatuses. Optionally, they may be implemented with program code executable by a computing apparatus, so that they can be stored in a storage apparatus and executed by the computing apparatus; or they may be made into individual integrated circuit modules, or multiple modules or steps among them may be made into a single integrated circuit module. Thus, the present disclosure is not limited to any specific combination of hardware and software.
It should be understood that although several means or sub-means of the device are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more means described above may be embodied in one means. Conversely, the features and functions of one means described above may be further divided so as to be embodied by multiple means.
The above are merely optional embodiments of the present disclosure and are not intended to limit it; for those skilled in the art, the present disclosure may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present disclosure shall be included within its scope of protection.

Claims (32)

  1. A method of predicting an operation result, comprising:
    determining a first prediction model based on a first set of observation data, the first set of observation data comprising a first set of observation results of performing a first operation on a first group of observation objects, the first prediction model being used to predict a first potential result of performing the first operation on a target object;
    determining a first probability model based on the first group of observation objects on which the first operation was performed and a second group of observation objects on which a second operation different from the first operation was performed, the first probability model being used to determine respective probabilities of performing the first operation and the second operation on the target object;
    determining a second prediction model based on the first set of observation data and the first probability model, the second prediction model predicting a second potential result of performing the first operation on the target object by estimating results of performing the first operation on the second group of observation objects; and
    determining a first combination of at least the first prediction model, the second prediction model, and the first probability model for predicting a first final result of performing the first operation on the target object.
  2. The method according to claim 1, further comprising:
    determining a second probability model based on the first group of observation objects, the second group of observation objects, and a third group of observation objects on which a third operation was performed, the third operation being different from the first operation and the second operation, the second probability model being used to determine a probability of performing the third operation on the target object; and
    determining a third prediction model based on the first set of observation data, the first probability model, and the second probability model, the third prediction model predicting a third potential result of performing the first operation on the target object by estimating results of performing the first operation on the third group of observation objects; and
    wherein determining the first combination comprises combining outputs of the first prediction model, the second prediction model, the third prediction model, the first probability model, and the second probability model.
  3. The method according to claim 1, wherein determining the first prediction model comprises:
    establishing, based on the first group of observation objects and the first set of observation results, a first objective function for determining the first prediction model; and
    determining model parameters of the first prediction model by minimizing the first objective function.
  4. The method according to claim 1, wherein determining the second prediction model comprises:
    determining a sample weight based on the first probability model, the number of the first group of observation objects, and the number of the second group of observation objects, the sample weight being used to correct a shift of the distribution of the second group of observation objects relative to the distribution of the first group of observation objects; and
    determining the second prediction model based on the sample weight and the first set of observation data.
  5. The method according to claim 4, wherein determining the second prediction model based on the sample weight and the first set of observation data comprises:
    establishing, based on the sample weight, the first group of observation objects, and the first set of observation results, a second objective function for determining the second prediction model; and
    determining model parameters of the second prediction model by minimizing the second objective function.
  6. The method according to claim 1, further comprising:
    determining a fourth prediction model based on a second set of observation data, the second set of observation data comprising a second set of observation results of performing the second operation on the second group of observation objects, the fourth prediction model being used to predict a fourth potential result of performing the second operation on the target object;
    determining a fifth prediction model based on the second set of observation data and the first probability model, the fifth prediction model predicting a fifth potential result of performing the second operation on the target object by estimating results of performing the second operation on the first group of observation objects; and
    determining a second combination of at least the fourth prediction model, the fifth prediction model, and the first probability model for predicting a second final result of performing the second operation on the target object.
  7. The method according to claim 6, further comprising:
    determining a difference representation based on the first combination and the second combination, the difference representation being used to predict a difference between the first final result of performing the first operation on the target object and the second final result of performing the second operation on the target object.
  8. A method of predicting an operation result, comprising:
    in response to determining that one operation of a set of operations is to be performed on a target object, predicting a first potential result of performing a first operation of the set of operations on the target object;
    determining a first probability of performing the first operation on the target object and a second probability of performing a second operation of the set of operations, the second operation being different from the first operation;
    predicting a second potential result of performing the first operation on the target object by estimating results of performing the first operation on observation objects on which the second operation was performed; and
    predicting a first final result of performing the first operation on the target object based at least on the first potential result, the second potential result, the first probability, and the second probability.
  9. The method according to claim 8, wherein the set of operations further comprises a third operation different from the first operation and the second operation, and the method further comprises:
    determining a third probability of performing the third operation on the target object; and
    predicting a third potential result of performing the first operation on the target object by estimating results of performing the first operation on observation objects on which the third operation was performed; and
    wherein predicting the first final result comprises predicting the first final result based on the first potential result, the second potential result, the third potential result, the first probability, the second probability, and the third probability.
  10. The method according to claim 8, wherein predicting the first potential result comprises:
    obtaining a first prediction model, the first prediction model being determined based on a first set of observation data, the first set of observation data comprising a first set of observation results of performing the first operation on a first group of observation objects; and
    predicting the first potential result based on the first prediction model.
  11. The method according to claim 10, wherein determining the first probability and the second probability comprises:
    obtaining a probability model, the probability model being determined based on the first group of observation objects on which the first operation was performed and a second group of observation objects on which the second operation was performed; and
    determining the first probability and the second probability based on the probability model.
  12. The method according to claim 11, wherein predicting the second potential result comprises:
    obtaining a second prediction model, the second prediction model being determined based on the first set of observation data and the probability model; and
    predicting the second potential result based on the second prediction model.
  13. The method according to claim 8, further comprising:
    predicting a fourth potential result of performing the second operation on the target object;
    predicting a fifth potential result of performing the second operation on the target object by estimating results of performing the second operation on observation objects on which the first operation was performed; and
    predicting a second final result of performing the second operation on the target object based at least on the fourth potential result, the fifth potential result, the first probability, and the second probability.
  14. The method according to claim 8, further comprising:
    predicting a difference between the first final result of performing the first operation on the target object and a second final result of performing the second operation on the target object.
  15. The method according to claim 14, further comprising:
    determining, based on the difference, a target operation to be performed on the target object from among the first operation and the second operation.
  16. An electronic device, comprising:
    a processor; and
    a memory coupled to the processor, the memory having instructions stored therein which, when executed by the processor, cause the device to perform actions comprising:
    determining a first prediction model based on a first set of observation data, the first set of observation data comprising a first set of observation results of performing a first operation on a first group of observation objects, the first prediction model being used to predict a first potential result of performing the first operation on a target object;
    determining a first probability model based on the first group of observation objects on which the first operation was performed and a second group of observation objects on which a second operation different from the first operation was performed, the first probability model being used to determine respective probabilities of performing the first operation and the second operation on the target object;
    determining a second prediction model based on the first set of observation data and the first probability model, the second prediction model predicting a second potential result of performing the first operation on the target object by estimating results of performing the first operation on the second group of observation objects; and
    determining a first combination of at least the first prediction model, the second prediction model, and the first probability model for predicting a first final result of performing the first operation on the target object.
  17. The electronic device according to claim 16, wherein the actions further comprise:
    determining a second probability model based on the first group of observation objects, the second group of observation objects, and a third group of observation objects on which a third operation was performed, the third operation being different from the first operation and the second operation, the second probability model being used to determine a probability of performing the third operation on the target object; and
    determining a third prediction model based on the first set of observation data, the first probability model, and the second probability model, the third prediction model predicting a third potential result of performing the first operation on the target object by estimating results of performing the first operation on the third group of observation objects; and
    wherein determining the first combination comprises combining outputs of the first prediction model, the second prediction model, the third prediction model, the first probability model, and the second probability model.
  18. The electronic device according to claim 16, wherein determining the first prediction model comprises:
    establishing, based on the first group of observation objects and the first set of observation results, a first objective function for determining the first prediction model; and
    determining model parameters of the first prediction model by minimizing the first objective function.
  19. The electronic device according to claim 16, wherein determining the second prediction model comprises:
    determining a sample weight based on the first probability model, the number of the first group of observation objects, and the number of the second group of observation objects, the sample weight being used to correct a shift of the distribution of the second group of observation objects relative to the distribution of the first group of observation objects; and
    determining the second prediction model based on the sample weight and the first set of observation data.
  20. The electronic device according to claim 19, wherein determining the second prediction model based on the sample weight and the first set of observation data comprises:
    establishing, based on the sample weight, the first group of observation objects, and the first set of observation results, a second objective function for determining the second prediction model; and
    determining model parameters of the second prediction model by minimizing the second objective function.
  21. The electronic device according to claim 16, wherein the actions further comprise:
    determining a fourth prediction model based on a second set of observation data, the second set of observation data comprising a second set of observation results of performing the second operation on the second group of observation objects, the fourth prediction model being used to predict a fourth potential result of performing the second operation on the target object;
    determining a fifth prediction model based on the second set of observation data and the first probability model, the fifth prediction model predicting a fifth potential result of performing the second operation on the target object by estimating results of performing the second operation on the first group of observation objects; and
    determining a second combination of at least the fourth prediction model, the fifth prediction model, and the first probability model for predicting a second final result of performing the second operation on the target object.
  22. The electronic device according to claim 21, wherein the actions further comprise:
    determining a difference representation based on the first combination and the second combination, the difference representation being used to predict a difference between the first final result of performing the first operation on the target object and the second final result of performing the second operation on the target object.
  23. An electronic device, comprising:
    a processor; and
    a memory coupled to the processor, the memory having instructions stored therein which, when executed by the processor, cause the device to perform actions comprising:
    in response to determining that one operation of a set of operations is to be performed on a target object, predicting a first potential result of performing a first operation of the set of operations on the target object;
    determining a first probability of performing the first operation on the target object and a second probability of performing a second operation of the set of operations, the second operation being different from the first operation;
    predicting a second potential result of performing the first operation on the target object by estimating results of performing the first operation on observation objects on which the second operation was performed; and
    predicting a first final result of performing the first operation on the target object based at least on the first potential result, the second potential result, the first probability, and the second probability.
  24. The electronic device according to claim 23, wherein the set of operations further comprises a third operation different from the first operation and the second operation, and the actions further comprise:
    determining a third probability of performing the third operation on the target object; and
    predicting a third potential result of performing the first operation on the target object by estimating results of performing the first operation on observation objects on which the third operation was performed; and
    wherein predicting the first final result comprises predicting the first final result based on the first potential result, the second potential result, the third potential result, the first probability, the second probability, and the third probability.
  25. The electronic device according to claim 23, wherein predicting the first potential result comprises:
    obtaining a first prediction model, the first prediction model being determined based on a first set of observation data, the first set of observation data comprising a first set of observation results of performing the first operation on a first group of observation objects; and
    predicting the first potential result based on the first prediction model.
  26. The electronic device according to claim 25, wherein determining the first probability and the second probability comprises:
    obtaining a probability model, the probability model being determined based on the first group of observation objects on which the first operation was performed and a second group of observation objects on which the second operation was performed; and
    determining the first probability and the second probability based on the probability model.
  27. The electronic device according to claim 26, wherein predicting the second potential result comprises:
    obtaining a second prediction model, the second prediction model being determined based on the first set of observation data and the probability model; and
    predicting the second potential result based on the second prediction model.
  28. The electronic device according to claim 23, wherein the actions further comprise:
    predicting a fourth potential result of performing the second operation on the target object;
    predicting a fifth potential result of performing the second operation on the target object by estimating results of performing the second operation on observation objects on which the first operation was performed; and
    predicting a second final result of performing the second operation on the target object based on the fourth potential result, the fifth potential result, the first probability, and the second probability.
  29. The electronic device according to claim 23, wherein the actions further comprise:
    predicting a difference between the first final result of performing the first operation on the target object and a second final result of performing the second operation on the target object.
  30. The electronic device according to claim 29, wherein the actions further comprise:
    determining, based on the difference, a target operation to be performed on the target object from among the first operation and the second operation.
  31. A computer program product tangibly stored on a computer-readable medium and comprising machine-executable instructions which, when executed, cause a machine to perform the method according to any one of claims 1 to 7.
  32. A computer program product tangibly stored on a computer-readable medium and comprising machine-executable instructions which, when executed, cause a machine to perform the method according to any one of claims 8 to 15.
PCT/CN2019/083905 2019-04-23 2019-04-23 WO2020215209A1 Method of predicting an operation result, electronic device, and computer program product

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/CN2019/083905 WO2020215209A1 (zh) 2019-04-23 2019-04-23 Method of predicting an operation result, electronic device, and computer program product
JP2021562849A JP7355115B2 (ja) 2019-04-23 2019-04-23 Method of predicting an operation result, electronic device, and computer program product
US17/605,652 US20220222554A1 (en) 2019-04-23 2019-04-23 Operation result predicting method, electronic device, and computer program product


Publications (1)

Publication Number Publication Date
WO2020215209A1 true WO2020215209A1 (zh) 2020-10-29


Also Published As

Publication number Publication date
JP7355115B2 (ja) 2023-10-03
JP2022536825A (ja) 2022-08-19
US20220222554A1 (en) 2022-07-14

