CN117275661A - Deep reinforcement learning-based lung cancer patient medication prediction method and device - Google Patents
Deep reinforcement learning-based lung cancer patient medication prediction method and device
- Publication number
- CN117275661A CN117275661A CN202311567874.0A CN202311567874A CN117275661A CN 117275661 A CN117275661 A CN 117275661A CN 202311567874 A CN202311567874 A CN 202311567874A CN 117275661 A CN117275661 A CN 117275661A
- Authority
- CN
- China
- Prior art keywords
- patient
- medication
- data
- lung cancer
- state
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/10—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Epidemiology (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Pathology (AREA)
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Medicinal Chemistry (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention provides a method and a device for predicting the medication of a lung cancer patient based on deep reinforcement learning, belonging to the technical field of medication prediction for lung cancer patients; the technical problem to be solved is: providing a method and a device for predicting the medication of a lung cancer patient based on deep reinforcement learning; the technical scheme adopted to solve this problem is: collecting lung cancer patient data information, extracting vital signs and related medical histories of a lung cancer patient within a period of time, and preprocessing them to construct a patient data set; constructing a patient-based environment model from the collected data, the model being used to simulate a reward mechanism of the drug effect on the patient's body and comprising a patient state, a drug action space, a reward function, a transition model and an initial state; constructing a network model comprising an online network and a target network for calculating the adjustment value of each possible drug regimen under the current state of the patient; the method is applied to medication prediction for lung cancer patients.
Description
Technical Field
The invention provides a method and a device for predicting the medication of a lung cancer patient based on deep reinforcement learning, and belongs to the technical field of the medication prediction of the lung cancer patient.
Background
Deep reinforcement learning is a technology combining deep learning with reinforcement learning. By simulating and learning from behaviors and their results in an environment, it can optimize the decision process of an agent, and in the field of personalized medicine it can be applied to medical diagnosis, treatment scheme design, health management and the like, thereby providing more accurate and effective medical services for patients.
Personalized medicine is a medical mode built on the individual differences of patients, based on each patient's unique genetic, physiological and psychological characteristics. Traditional medical modes, by contrast, usually consider only general rules and ignore individual differences among patients, so the prediction effect is poor: the disease pattern of a patient cannot be found and the optimal medicine adjustment scheme cannot be predicted. For lung cancer patients in particular, predicting the medication situation requires analyzing a large amount of medical data together with the lung cancer characteristics of the individual patient, and the traditional medical mode currently adopted cannot meet this prediction requirement.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and to solve the following technical problem: providing a method and a device for predicting the medication of a lung cancer patient based on deep reinforcement learning.
In order to solve the technical problems, the invention adopts the following technical scheme: a lung cancer patient medication prediction method based on deep reinforcement learning comprises the following medication prediction steps:
step S1: collecting lung cancer patient data information;
step S2: extracting vital signs and related medical histories of a patient suffering from lung cancer for a period of time and preprocessing the vital signs and the related medical histories to construct a patient data set;
step S3: constructing a patient-based environment model using the collected data for simulating a reward mechanism of the drug effect on the patient's body, comprising: a patient state, a medication action space, a reward function, a transition model, and an initial state;
step S4: setting up a network model comprising an online network and a target network, and calculating an adjustment value of each medication scheme under the current state of a patient;
step S5: taking the collected patient history treatment data as input, and outputting a predicted drug adjustment scheme;
step S6: updating network parameters by using a stochastic gradient descent method;
step S7: through constant interaction with the environment, the method performs multiple rounds of training and learning, achieves the goal of reward maximization, and predicts and outputs the medication type and medication dosage adjustment scheme suitable for the patient.
The specific method for collecting the data information of the lung cancer patient in the step S1 comprises the following steps:
step S11: collecting personal basic information, medical history, physiological data and drug treatment scheme data of a lung cancer patient;
step S12: collecting data on the relationship between the type of medication, the dosage and the treatment effect for lung cancer patients, comprising: data on the change of the patient's lung tumor size when the patient takes different drug types and different doses.
The specific method for constructing the patient data set in the step S2 is as follows:
step S21: screening lung cancer patient data of a set age group;
step S22: preprocessing the screened data, including: removing repeated data, processing missing values and processing abnormal values;
step S23: storing the data obtained in the previous step and then dividing it into a training set and a testing set at a ratio of 8:2.
The specific method for constructing the environment model based on the patient in the step S3 is as follows:
step S31: determining a patient state space S, including tumor size, pathological stage and physiological index data;
features defining the patient's visit status include: demographics, medical history, disease risk, historical medication, laboratory data and physical measurements; a state space is established from this information to obtain a multidimensional state vector;
step S32: determining a medicine action space A, including adjustment of medicine types and dosages thereof;
according to the type and dosage of the patient's historical medication, determining the medication adjustment scheme, wherein the action space A comprises a four-dimensional vector: 0 no prescription change, 1 increase the drug dose, 2 decrease the drug dose, 3 replace the drug;
step S33: determining a reward function R, designing a reasonable reward function according to the disease condition, the drug dosage and the treatment effect factors of a patient, and feeding back the change of various indexes of the body when the patient takes different drugs and the doses thereof to mark the improvement or the deterioration of the tumor;
step S34: based on the patient's historical medication data, building a probability model P(s_{t+1}|s_t, a_t) for transitioning under each medication action in the current patient state, i.e. calculating the probability P of transitioning from the current patient state s_t to the next state s_{t+1} after taking medication strategy a_t, and using an ε-greedy strategy to balance exploitation and exploration and maximize the expected benefit at the current moment;
step S35: for the state of the patient at the beginning of the treatment, an initial state is determined based on the patient's basic condition and medical history.
The specific method for constructing the network model to calculate the adjustment value of the medication scheme of the patient in the step S4 is as follows:
step S41: setting up an online network for calculating the adjustment value of each personalized medication scheme under the current physical state of the patient, and updating the optimal scheme according to the adjustment value;
the parameter weight of the online network is updated in the process of each iteration to minimize the difference between the predicted value and the target value in the current state;
step S42: calculating a target value according to the action in the current state and the maximum adjustment value in the next state;
step S43: constructing a neural network model to calculate an action-value function Q(s, a), used to estimate the cumulative reward of each adjustment scheme in the current treatment state, i.e. the expected return value Q caused by the change of the patient's body indexes after taking medication strategy a in state s;
step S44: training the DQN model using two structurally identical neural networks: an online network Q(s, a; θ) and a target network Q(s, a; θ⁻); the online network is used to obtain the optimal dosing action decision a* = argmax_a Q(s, a; θ) and is trained on the loss function L;
wherein s is the current state of the patient, a is the patient's current medication strategy, θ are the parameters of the online network Q, argmax_a Q(s, a; θ) computes the action a with the maximum Q value in the given state s, Q(s, a; θ) is the expected return value of the change in the patient's body indexes given by the online network, Q(s, a; θ⁻) is the expected return value of the change in the patient's body indexes given by the target network, and L is the loss function measuring the difference between the two, used for network training;
the expected action value y = r + λ·max_{a'} Q(s', a'; θ⁻) is estimated using the target network in order to calculate the loss function L; by tracking the online network parameters θ in each training iteration, the target network parameters θ⁻ are updated, finally yielding the optimal personalized medicine adjustment scheme suitable for the patient.
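For illustration, a minimal sketch of the target-value and loss computation in step S44 is given below; it assumes the standard DQN squared-error loss, since the exact expressions appear only as formula images in the original text, and the numeric inputs in the example are made up:

```python
import numpy as np

def dqn_target_and_loss(reward, q_next_target, q_current_online, discount=0.5):
    """Target value and squared-error loss for one transition in step S44.

    reward            r obtained after applying medication strategy a in state s
    q_next_target     Q(s', a'; theta_minus) for all actions, from the target network
    q_current_online  Q(s, a; theta) of the taken action, from the online network
    discount          discount factor (lambda = 0.5 in the embodiment)
    """
    y = reward + discount * np.max(q_next_target)   # expected return estimated via the target net
    loss = (y - q_current_online) ** 2               # difference driving the online-network update
    return y, loss

# Example: reward 0.8, four next-state target Q values, online Q value 0.9 of the taken action
y, loss = dqn_target_and_loss(0.8, np.array([0.2, 0.5, 0.1, 0.3]), 0.9)
```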
The specific method for outputting the predicted drug adjustment scheme in the step S5 is as follows:
step S51: the method comprises the steps that five kinds of information including patient illness state, physiological indexes, laboratory examination results, image examination results and medication conditions are formed into a high-dimensional vector to serve as input data of a model;
step S52: the output of the model is a drug adjustment regimen based on the patient's condition, chosen from four drug adjustment regimens: 0 indicates no prescription change, 1 indicates increasing the medication dose, 2 indicates decreasing the medication dose, and 3 indicates replacing the medication.
The device for realizing the lung cancer patient medication prediction method based on deep reinforcement learning comprises an acquisition computer for collecting lung cancer patient data information, a data server for collecting and storing the data information, and a prediction server for building a network model, training learning and outputting a prediction scheme.
Compared with the prior art, the invention has the following beneficial effects: the invention provides a method and a device for predicting medication for individual lung cancer patients, based mainly on deep reinforcement learning with a policy optimization algorithm; a personalized environment model is built from the collected historical medication data of patients, and the agent is trained to learn interactively with the environment, so that the future optimal medication scheme of a lung cancer patient is obtained through adjustment and prediction; the deep reinforcement learning method adopted by the invention combines the perception capability of deep learning with the decision capability of reinforcement learning, integrating perception, learning and decision-making into one framework to solve high-dimensional, time-series-based decision problems, so that the agent can be trained to learn an optimal drug adjustment scheme based on the treatment history of a cancer patient, providing more accurate and effective medical services for lung cancer patients.
It should be noted that, all actions for acquiring signals, information or data in the present application are performed under the condition of conforming to the corresponding data protection rule policy of the country of the location and obtaining the authorization given by the owner of the corresponding device.
Drawings
The invention is further described below with reference to the accompanying drawings:
FIG. 1 is a diagram of a drug prediction model structure employing a strategy-based optimization algorithm;
FIG. 2 is a flow chart of the steps of the medication prediction method of the present invention.
Detailed Description
As shown in fig. 1 and fig. 2, the method for predicting the medication of the lung cancer patient based on the deep reinforcement learning and strategy optimization algorithm adopted by the invention specifically comprises the following steps:
step S1: collecting lung cancer patient data information;
step S2: extracting vital signs and related medical histories of a patient suffering from lung cancer within 6 months, and preprocessing the vital signs and the related medical histories to construct a patient data set;
step S3: constructing a patient-based environment model by using the collected data, wherein the model is used for simulating a reward mechanism of the drug effect on the patient's body and comprises a patient state, a drug action space, a reward function, a transition model and an initial state;
step S4: constructing a network model comprising an online network and a target network, wherein the network model is used for calculating the Q value of each possible drug regimen adjustment under the current state of the patient;
step S5: taking the collected historical treatment data of the patient as input, and outputting a predicted drug adjustment scheme;
step S6: updating network parameters by using a stochastic gradient descent method;
step S7: the agent performs multiple rounds of training and learning and outputs the optimal personalized medicine adjustment scheme.
Further, the collecting lung cancer patient data information in step S1 includes:
step S11: collecting personal basic information, medical history, physiological data, drug treatment scheme and other relevant data of a lung cancer patient;
step S12: and collecting the relation data between the medicine types, the dosages and the treatment effects of the lung cancer patients.
Further, the personal basic information described in step S11 includes the patient' S age, sex, race, smoking, and others; medical history including complications, cancer complications, hospitalization history, emergency treatment history, etc.; physiological data includes systolic pressure (SBP), diastolic pressure (DBP), heart rate, weight, height, BMI, etc.; drug treatment regimens include chemotherapy, targeted therapy, immunotherapy, and anti-vascular therapy.
Further, the relationship between the drug type, the dose and the therapeutic effect in step S12 includes the change of the physical index of the patient when the patient takes different drug types and doses, which is specifically represented by the change of the lung tumor size of the lung cancer patient.
Further, extracting and preprocessing vital signs of a lung cancer patient within 6 months as described in step S2 includes:
step S21: screening data of lung cancer patients aged from 18 to 75 years;
step S22: preprocessing the screened data, including removing duplicate data, processing missing values (for missing physical measurements, replacing the missing data with the value of the nearest data point of the same patient; if the data is still missing, estimating it using the median of that variable's observations over all patients without missing data) and processing outliers, to ensure the integrity and accuracy of the data;
step S23: storing the data obtained in the previous step and then dividing it into a training set and a testing set at a ratio of 8:2, used for training the drug prediction model and evaluating its effectiveness.
Further, the constructing the patient-based environment model using the collected data in step S3 includes:
step S31: determining a patient state space S: the state space refers to the pathological condition of patients, including tumor size, pathological stage, physiological index and the like. Features defining patient visit status in the present invention include demographics, medical history, disease risk, historical medication, laboratory data, and physical measurements. Establishing a state space according to the information to obtain a 20-dimensional state vector;
step S32: determining a medicine action space A: the action space refers to the actions that the agent, acting as a virtual doctor, can take, i.e. adjustments of the kind of drug and its dosage. According to the type and dosage of the patient's historical medication, the invention determines the medication adjustment scheme, consisting of a four-dimensional action vector: 0 no prescription change, 1 increase the medication dosage, 2 decrease the medication dosage, 3 replace the medication;
step S33: determining a reward function R: the reward function refers to feedback rewards obtained after the intelligent agent makes corresponding actions according to the current state in the reinforcement learning algorithm. According to the factors such as the illness state, the medicine dosage and the treatment effect of the patient, a reasonable reward function is designed to feed back the changes of various indexes of the body when the patient takes different medicines and dosages thereof, and the tumor improvement or deterioration is marked to assist doctors in providing medicine adjustment schemes for the patient;
step S34: establishing a transition model P(s_{t+1} | s_t, a_t): the transition model refers to the transition probability between the state space and the action space, i.e. the probability of moving to the next state after taking some action in the current state. The invention establishes, according to the patient's historical medication data, a probability model for transitioning under each medication action in the current patient state, and uses an ε-greedy strategy to trade off exploitation and exploration, maximizing the expected benefit at the current moment; for example, at time step t the patient has k possible medication selection strategies a_1, ..., a_k, and the ε-greedy policy may be expressed as:
π(a) = 1 - ε + ε/k, if a = argmax_{a'} Q_t(a');  π(a) = ε/k, otherwise;
wherein Q_t(a) represents the cumulative reward, due to the change in the patient's body indexes, expected under medication strategy a; argmax_a Q_t(a) computes the medication strategy a* achieving the maximum Q_t; ε is the exploration rate parameter: with probability ε a medication strategy is selected at random, and with probability 1 - ε the greedy medication strategy with the maximum action value, a*, is selected;
Step S35: determining an initial state: the initial state refers to a state of the patient at the time of starting treatment, and is determined according to the basic condition of the patient and the medical history.
Further, the network model in step S4 is a deep reinforcement learning model based on a policy optimization algorithm, and is specifically used for solving the decision problem of the high-dimensional space, and the specific construction steps are as follows:
step S41: an online network is built for calculating the Q value of each personalized medicine adjustment scheme under the current physical state of the patient, and the optimal scheme is updated according to the Q value. Parameters (weights) of the online network are updated in the process of each iteration to minimize the gap between the predicted Q value and the target Q value in the current state;
step S42: the target network is used for estimating a target Q value, i.e. calculating the target Q value based on the action in the current state and the maximum Q value in the next state. The parameters of the target network are updated slowly compared with those of the online network to maintain the stability of the target Q value. During each iteration, the parameters of the target network are copied from the online network, but are not updated directly.
By constructing a neural network model, the action-value function Q(s, a), which estimates the cumulative reward of each adjustment scheme in the current visit status, is calculated. To train the model, two structurally identical neural networks are used: an online network Q(s, a; θ) and a target network Q(s, a; θ⁻). The online network is used to obtain the optimal drug administration action decision a* = argmax_a Q(s, a; θ) and is trained on the loss function L. The expected action value y = r + λ·max_{a'} Q(s', a'; θ⁻) is estimated using the target network in order to calculate the loss function L, and the target network parameters θ⁻ are updated by slowly tracking the online network parameters θ; finally, the optimal personalized medicine adjustment scheme suitable for the patient is obtained.
Further, the step S5 takes the collected treatment data of the patient as input, and outputs a predicted drug adjustment scheme, which specifically includes:
step S51: the method comprises the steps that five kinds of information including patient illness state, physiological indexes, laboratory examination results, image examination results and medication conditions are formed into a high-dimensional vector to serve as input data of a model;
step S52: the output of the model is a drug adjustment scheme based on the condition of the patient, and specifically comprises four drug adjustment schemes of 0 no-prescription change, 1 increase of drug dosage, 2 decrease of drug dosage and 3 replacement of drug.
Further, network parameters are updated using stochastic gradient descent as described in step S6; in addition, to alleviate the problems of correlated data and non-stationary distributions, an experience replay mechanism is introduced: the transition sample (s_t, a_t, r_t, s_{t+1}) obtained each time the agent interacts with the environment at a time step is stored in a buffer and randomly sampled from it; in this way, differences in the data distribution can be mitigated, thereby smoothing the training distribution over many past behaviors.
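A minimal sketch of such an experience replay buffer is given below; the patent specifies only that transition samples (s_t, a_t, r_t, s_{t+1}) are stored in a buffer and drawn at random, so the class structure and the capacity value are assumptions:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size buffer storing transition samples (s, a, r, s_next)."""

    def __init__(self, capacity=100_000):
        self.memory = deque(maxlen=capacity)   # oldest samples are discarded first

    def push(self, state, action, reward, next_state):
        # One interaction step of the agent with the patient environment.
        self.memory.append((state, action, reward, next_state))

    def sample(self, batch_size=256):
        # Uniform random sampling breaks the temporal correlation between visits.
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)
```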
Further, in step S7, the agent continuously interacts with the environment, and through learning the strategy, the goal of maximizing rewards is achieved, and the optimal medicine and the dosage adjustment scheme thereof suitable for the patient are predicted, so that the doctor can make more intelligent medicine adjustment for the patient, and the survival rate and life quality of the lung cancer patient can be improved.
In order to make the technical problems, technical schemes and beneficial effects to be solved more clear, the invention is further described in detail by describing exemplary embodiments of the application with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. The following describes the technical scheme of the present invention in detail with reference to examples and drawings, but the scope of protection is not limited thereto.
In this embodiment, specifically based on the drug prediction model structure shown in fig. 1, the personalized drug adjustment can be performed for the lung cancer patient according to the actual situation of the patient through the step flow of the technical scheme shown in fig. 2, and the processing steps include:
step S1: collecting lung cancer patient data information;
the data used in the experiment was collected from a hospital in a region covering 6124 lung cancer patients, including 632565 outpatient visits over a period of time, and the personal basic information, medical history, physiological data, drug treatment regimen and other relevant data of the patients visited during the period and the relationship data between the type of drug taken by the patients, the dosage and the treatment effect were recorded.
Step S2: vital signs and related medical history of lung cancer patients within 6 months are extracted and preprocessed to construct a patient dataset:
screening collected lung cancer patient data, such as patient data of 18 years old and 75 years old or less, and preprocessing the screened data, wherein the method comprises the following steps: removing duplicate data, processing missing values, for missing physical measured values, replacing missing data by using the value of the nearest data point of the same patient, if the data is still lost, estimating the lost data by using the median of the variable observed values of all patients without the lost data, processing abnormal values and the like so as to ensure the integrity and the accuracy of the data; normalizing vital sign data so that all features are in the same scale range, and avoiding the influence of overlarge weight difference among different features on the performance of a model; the preprocessed data are stored and then divided into a training set and a testing set according to the ratio of 8:2, and the training set and the testing set are used for training and evaluating the effectiveness of a medicine prediction model.
Step S3: constructing a patient-based environment model by using the collected data, wherein the model is used for simulating a reward mechanism of a drug effect on a patient body and comprises a patient state, a drug action space, a reward function, a transfer model and an initial state:
step S31: determining a patient state space S: the state refers to the machine's perception of the environment, i.e. the specific environment in which the agent is located; all possible states form the state space. In this embodiment the features defining the patient's visit status include: demographics, medical history, disease risk, historical medication, laboratory data, and physical measurements; continuous variables are normalized to a common scale, binary variables are expressed as 0 or 1, and other categorical variables are converted into multiple binary variables using one-hot encoding; finally, the state space is established from this information to obtain a 20-dimensional state vector;
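A minimal sketch of such a state encoding is given below; the concrete feature names, scaling constants and the 8 + 4 + 4 + 4 split into 20 dimensions are assumptions, since the embodiment only specifies the encoding rules and the total dimension:

```python
import numpy as np

STAGES = ["I", "II", "III", "IV"]                             # pathological stage -> 4 one-hot dims
DRUG_CLASSES = ["chemo", "targeted", "immuno", "anti_vegf"]   # historical medication -> 4 one-hot dims

def encode_state(visit: dict) -> np.ndarray:
    """Encode one visit record as a 20-dimensional state vector (hypothetical layout)."""
    continuous = np.array([
        visit["age"] / 100.0,
        visit["tumor_size_mm"] / 100.0,
        visit["sbp"] / 200.0,
        visit["dbp"] / 150.0,
        visit["heart_rate"] / 200.0,
        visit["bmi"] / 50.0,
        visit["smoking_years"] / 60.0,
        visit["dose_mg"] / 1000.0,
    ])                                                         # 8 normalised continuous dims
    binary = np.array([
        float(visit["sex_male"]),
        float(visit["has_comorbidity"]),
        float(visit["hospitalised_before"]),
        float(visit["high_risk"]),
    ])                                                         # 4 binary dims
    stage = np.eye(len(STAGES))[STAGES.index(visit["stage"])]             # 4 dims
    drug = np.eye(len(DRUG_CLASSES))[DRUG_CLASSES.index(visit["drug"])]   # 4 dims
    return np.concatenate([continuous, binary, stage, drug])              # 20 dims in total
```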
step S32: determining a medicine action space A: this refers to the specific actions the agent can take in the current environment; in the embodiment, the action space consists of a four-dimensional vector: 0 no prescription change, 1 increase the medicine dosage, 2 decrease the medicine dosage, 3 replace the medicine; no prescription change indicates that the same medication and dosage as in the previous prescription are used, and increasing or decreasing the medication dosage refers to adjusting the dose of the medication taken by the patient in the current input state;
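The four-dimensional action space can be written directly as an enumeration, for example:

```python
from enum import IntEnum

class DrugAction(IntEnum):
    """Medication action space of the agent (values as defined in the embodiment)."""
    NO_CHANGE = 0       # keep the same drug and dose as in the previous prescription
    INCREASE_DOSE = 1   # raise the dose of the drug in the current input state
    DECREASE_DOSE = 2   # lower the dose of the drug in the current input state
    REPLACE_DRUG = 3    # switch to a different drug
```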
step S33: determining a reward function R: the reinforcement learning reward function is used to evaluate the feedback reward obtained by the agent after executing a certain action; it is a function mapping environment feedback to a scalar value, and the goal in reinforcement learning is for the agent to learn an optimal strategy through interaction with the environment so as to maximize the accumulated reward;
in the present embodiment, the bonus function is set as follows:
;
wherein:is at presenttPatient status at moment->Is at presenttDosing action performed by the moment agent, +.>Is thattPatient status at +1, ∈1>Is that the agent is executing the drug administration action +.>Posterior slave state->Transition to State->The awards obtained;s_rrepresenting the patient's survival in the current state, < + >>Representing the toxic side effects of a drug on a patient in the current state, for guiding an agent to avoid selecting drugs harmful to the patient,/I>Representing the cost of the selected drug in the current state, for guiding the agent to avoid selecting too expensive drugs; />、/>、/>The specific values are respectively set to be 1, -0.5 and-0.5 for the weight coefficients; training DQN models to optimize accumulationA prize, the jackpot being equal to the current prize plus the desired jackpot for the next visit multiplied by the discount factor +.>The model is able to estimate the impact of current actions on short-term and long-term results;
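A minimal sketch of this reward and of the discounted cumulative reward is given below; the weight values 1, -0.5, -0.5 and the discount factor 0.5 follow the embodiment, while the assumption that each term lies in [0, 1] is illustrative only:

```python
W_SURVIVAL, W_TOXICITY, W_COST = 1.0, -0.5, -0.5
DISCOUNT = 0.5  # discount factor lambda of the embodiment

def step_reward(survival: float, toxicity: float, cost: float) -> float:
    """Immediate reward r_t obtained after the agent executes a dosing action."""
    return W_SURVIVAL * survival + W_TOXICITY * toxicity + W_COST * cost

def cumulative_reward(rewards) -> float:
    """R_t = r_t + lambda * R_{t+1}, evaluated backwards over a sequence of visits."""
    total = 0.0
    for r in reversed(rewards):
        total = r + DISCOUNT * total
    return total
```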
step S34: establishing a transition model P(s_{t+1} | s_t, a_t):
the transition model refers to the transition probability between a state and an action, namely the probability of transition to the next state after taking a certain action in the current state; usingThe strategy is used for balancing development and exploration, the development is correct for maximizing expected benefits at the current moment, and the exploration possibly brings about the maximization of total benefits in the long term; />Is a common strategy in reinforcement learning, which means that there is a very small positive number +.>Is selected randomly, leaving +.>The probability of selecting the action with the greatest action value among the existing actions, for example, at time step t, the agent has k possible actions, respectively denoted +.> Let->Representing actionsaIs>The policy may be expressed as:
;
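An ε-greedy action selection of this form can be sketched as follows; this is a minimal illustration (the default exploration rate 0.1 is an assumption), not the patented implementation:

```python
import numpy as np

def epsilon_greedy(q_values: np.ndarray, epsilon: float = 0.1) -> int:
    """Pick a medication action from k candidates by the epsilon-greedy rule.

    With probability epsilon a random action is drawn (exploration); otherwise
    the action with the largest estimated action value is taken (exploitation).
    """
    if np.random.rand() < epsilon:
        return int(np.random.randint(len(q_values)))   # random medication strategy
    return int(np.argmax(q_values))                    # greedy medication strategy
```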
step S35: determining an initial state: the initial state refers to the state of the patient at the start of treatment, and is determined according to the patient's basic condition and medical history.
Step S4: constructing a network model comprising an online network and a target network, wherein the network model is used for calculating the Q value adjusted by each possible drug scheme under the current state of a patient;
A fully connected neural network with 2 hidden layers is built, each hidden layer containing 64 neurons, with batch normalization and a Leaky-ReLU activation function; the input layer has 20 dimensions and the output layer has 4 dimensions, corresponding to the state vector and the action space respectively; the learning rate is set to 0.001, the batch size to 256, and the target network update parameter to 0.01; to control the stability of the model, the discount factor is set to λ = 0.5; the model is trained with the Adam optimizer for a maximum of 100,000 iterations; the online network and the target network have the same structure but different parameter values; an experience replay mechanism is used to store all experience and randomly draw a number of samples from it for training, reducing estimation errors and high-variance problems; the parameters of the online network are used to update the parameters of the target network at regular intervals, to improve stability and speed up model convergence.
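A minimal PyTorch sketch matching the stated architecture (20-dimensional input, two hidden layers of 64 units with batch normalization and Leaky-ReLU, 4-dimensional output, Adam with learning rate 0.001) is given below; the choice of PyTorch and the class and variable names are assumptions, as the embodiment does not name a framework:

```python
import torch
import torch.nn as nn

class DosingQNetwork(nn.Module):
    """Q-network mapping the 20-dim patient state to Q values of the 4 medication actions."""

    def __init__(self, state_dim: int = 20, action_dim: int = 4, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.BatchNorm1d(hidden), nn.LeakyReLU(),
            nn.Linear(hidden, hidden), nn.BatchNorm1d(hidden), nn.LeakyReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

online_net = DosingQNetwork()
target_net = DosingQNetwork()
target_net.load_state_dict(online_net.state_dict())   # same structure, synchronised weights
optimizer = torch.optim.Adam(online_net.parameters(), lr=1e-3)
```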
Step S5: the extracted clinical information of the patient is input into a network as a state space, and the network is subjected to continuous iterative updating to finally output actions, wherein the final output actions comprise four options of 0 no-prescription change, 1 increase of medicine dosage, 2 decrease of medicine dosage and 3 replacement of medicine.
Step S6: the random gradient descent method is used for updating network parameters, and the specific steps include:
step S61: calculating the return of each state-action pair and generating a transition sample (s_t, a_t, r_t, s_{t+1}), wherein r_t is the reward value obtained by executing action a_t in state s_t; these samples form an experience replay memory D of size N;
step S62: initializing the online network parameters θ and the target network parameters θ⁻;
step S63: extracting a batch of historical samples from the experience replay memory D;
step S64: selecting the optimal action for each state transition process;
step S65: calculating the expected action value y = r + λ·max_{a'} Q(s', a'; θ⁻) from the target network;
step S66: calculating the action value Q(s, a; θ) of the current drug adjustment with the online network;
step S67: calculating the Q loss value L, and repeating S64 to S67 for each sample in the batch;
step S68: updating the online network parameters θ by training the network with the loss value L;
step S69: updating the target network parameters θ⁻.
Step S7: the intelligent agent continuously optimizes during the period of training and learning for a plurality of times, adjusts the related weight parameters, gradually increases the accumulated rewards, finally predicts the future drug adjustment scheme of the patient according to different conditions of different patients, changes the drug or adjusts the drug dosage, and assists doctors in adjusting the drug scheme for the patient.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.
Claims (7)
1. A lung cancer patient medication prediction method based on deep reinforcement learning is characterized in that: the method comprises the following medicine use prediction steps:
step S1: collecting lung cancer patient data information;
step S2: extracting vital signs and related medical histories of a patient suffering from lung cancer for a period of time and preprocessing the vital signs and the related medical histories to construct a patient data set;
step S3: constructing a patient-based environment model using the collected data for simulating a reward mechanism of the drug effect on the patient's body, comprising: a patient state, a medication action space, a reward function, a transition model, and an initial state;
step S4: setting up a network model comprising an online network and a target network, and calculating an adjustment value of each medication scheme under the current state of a patient;
step S5: taking the collected patient history treatment data as input, and outputting a predicted drug adjustment scheme;
step S6: updating network parameters by using a stochastic gradient descent method;
step S7: through constant interaction with the environment, the method performs multiple rounds of training and learning, achieves the goal of reward maximization, and predicts and outputs the medication type and medication dosage adjustment scheme suitable for the patient.
2. The method for predicting medication for a lung cancer patient based on deep reinforcement learning of claim 1, wherein the method comprises the steps of: the specific method for collecting the data information of the lung cancer patient in the step S1 comprises the following steps:
step S11: collecting personal basic information, medical history, physiological data and drug treatment scheme data of a lung cancer patient;
step S12: collecting data on the relationship between the type of medication, the dosage and the treatment effect for lung cancer patients, comprising: data on the change of the patient's lung tumor size when the patient takes different drug types and different doses.
3. The method for predicting medication for a lung cancer patient based on deep reinforcement learning according to claim 2, wherein the method comprises the following steps: the specific method for constructing the patient data set in the step S2 is as follows:
step S21: screening lung cancer patient data of a set age group;
step S22: preprocessing the screened data, including: removing repeated data, processing missing values and processing abnormal values;
step S23: storing the data obtained in the previous step and then dividing it into a training set and a testing set at a ratio of 8:2.
4. The method for predicting medication for a lung cancer patient based on deep reinforcement learning according to claim 3, wherein the method comprises the steps of: the specific method for constructing the environment model based on the patient in the step S3 is as follows:
step S31: determining a patient state space S, including tumor size, pathological stage and physiological index data;
features defining the patient's visit status include: demographics, medical history, disease risk, historical medication, laboratory data and physical measurements; a state space is established from this information to obtain a multidimensional state vector;
step S32: determining a medicine action space A, including adjustment of medicine types and dosages thereof;
according to the type and dosage of the patient's historical medication, determining the medication adjustment scheme, wherein the action space A comprises a four-dimensional vector: 0 no prescription change, 1 increase the drug dose, 2 decrease the drug dose, 3 replace the drug;
step S33: determining a reward function R, designing a reasonable reward function according to the disease condition, the drug dosage and the treatment effect factors of a patient, and feeding back the change of various indexes of the body when the patient takes different drugs and the doses thereof to mark the improvement or the deterioration of the tumor;
step S34: based on the patient's historical medication data, building a probability model P(s_{t+1}|s_t, a_t) for transitioning under each medication action in the current patient state, i.e. calculating the probability P of transitioning from the current patient state s_t to the next state s_{t+1} after taking medication strategy a_t, and using an ε-greedy strategy to balance exploitation and exploration and maximize the expected benefit at the current moment;
step S35: for the state of the patient at the beginning of the treatment, an initial state is determined based on the patient's basic condition and medical history.
5. The method for predicting medication for a lung cancer patient based on deep reinforcement learning of claim 4, wherein the method comprises the steps of: the specific method for constructing the network model to calculate the adjustment value of the medication scheme of the patient in the step S4 is as follows:
step S41: setting up an online network for calculating the adjustment value of each personalized medication scheme under the current physical state of the patient, and updating the optimal scheme according to the adjustment value;
the parameter weight of the online network is updated in the process of each iteration to minimize the difference between the predicted value and the target value in the current state;
step S42: calculating a target value according to the action in the current state and the maximum adjustment value in the next state;
step S43: constructing a neural network model to calculate an action-value function Q(s, a), used to estimate the cumulative reward of each adjustment scheme in the current treatment state, i.e. the expected return value Q caused by the change of the patient's body indexes after taking medication strategy a in state s;
step S44: training the DQN model using two structurally identical neural networks: an online network Q(s, a; θ) and a target network Q(s, a; θ⁻); the online network is used to obtain the optimal dosing action decision a* = argmax_a Q(s, a; θ) and is trained on the loss function L;
wherein s is the current state of the patient, a is the patient's current medication strategy, θ are the parameters of the online network Q, argmax_a Q(s, a; θ) computes the action a with the maximum Q value in the given state s, Q(s, a; θ) is the expected return value of the change in the patient's body indexes given by the online network, Q(s, a; θ⁻) is the expected return value of the change in the patient's body indexes given by the target network, and L is the loss function measuring the difference between the two, used for network training;
the expected action value y = r + λ·max_{a'} Q(s', a'; θ⁻) is estimated using the target network in order to calculate the loss function L; by tracking the online network parameters θ in each training iteration, the target network parameters θ⁻ are updated, finally yielding the optimal personalized medicine adjustment scheme suitable for the patient.
6. The method for predicting medication for a lung cancer patient based on deep reinforcement learning of claim 5, wherein the method comprises the steps of: the specific method for outputting the predicted drug adjustment scheme in the step S5 is as follows:
step S51: the method comprises the steps that five kinds of information including patient illness state, physiological indexes, laboratory examination results, image examination results and medication conditions are formed into a high-dimensional vector to serve as input data of a model;
step S52: the output of the model is a drug adjustment regimen based on the patient's condition, chosen from four drug adjustment regimens: 0 indicates no prescription change, 1 indicates increasing the medication dose, 2 indicates decreasing the medication dose, and 3 indicates replacing the medication.
7. A device for use in implementing the deep reinforcement learning-based lung cancer patient medication prediction method of claim 1, characterized in that: the system comprises a collecting computer for collecting lung cancer patient data information, a data server for collecting and storing the data information, and a prediction server for building a network model, training and learning and outputting a prediction scheme.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311567874.0A CN117275661B (en) | 2023-11-23 | 2023-11-23 | Deep reinforcement learning-based lung cancer patient medication prediction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311567874.0A CN117275661B (en) | 2023-11-23 | 2023-11-23 | Deep reinforcement learning-based lung cancer patient medication prediction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117275661A true CN117275661A (en) | 2023-12-22 |
CN117275661B CN117275661B (en) | 2024-02-09 |
Family
ID=89220067
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311567874.0A Active CN117275661B (en) | 2023-11-23 | 2023-11-23 | Deep reinforcement learning-based lung cancer patient medication prediction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117275661B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117637188A (en) * | 2024-01-26 | 2024-03-01 | 四川省肿瘤医院 | Tumor chemotherapy response monitoring method, medium and system based on digital platform |
CN118039062A (en) * | 2024-04-12 | 2024-05-14 | 四川省肿瘤医院 | Individualized chemotherapy dose remote control method based on big data analysis |
CN118280512A (en) * | 2024-04-05 | 2024-07-02 | 泰昊乐生物科技有限公司 | Personalized treatment scheme recommendation method and system based on artificial intelligence |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112071388A (en) * | 2019-06-10 | 2020-12-11 | 郑州大学第一附属医院 | Intelligent medicine dispensing and preparing method based on deep learning |
CN112420154A (en) * | 2020-11-25 | 2021-02-26 | 深圳市华嘉生物智能科技有限公司 | New coronary medication suggestion method based on deep learning neural network |
CN113257416A (en) * | 2020-12-09 | 2021-08-13 | 浙江大学 | COPD patient personalized management and tuning method, device and equipment based on deep learning |
CN113255735A (en) * | 2021-04-29 | 2021-08-13 | 平安科技(深圳)有限公司 | Method and device for determining medication scheme of patient |
CN113270189A (en) * | 2021-05-19 | 2021-08-17 | 复旦大学附属肿瘤医院 | Tumor treatment aid decision-making method based on reinforcement learning |
WO2021226064A1 (en) * | 2020-05-04 | 2021-11-11 | University Of Louisville Research Foundation, Inc. | Artificial intelligence-based systems and methods for dosing of pharmacologic agents |
WO2022067189A1 (en) * | 2020-09-25 | 2022-03-31 | Linus Health, Inc. | Systems and methods for machine-learning-assisted cognitive evaluation and treatment |
CN114330566A (en) * | 2021-12-30 | 2022-04-12 | 中山大学 | Method and device for learning sepsis treatment strategy |
CN114388095A (en) * | 2021-12-22 | 2022-04-22 | 中山大学 | Sepsis treatment strategy optimization method, system, computer device and storage medium |
CN114783571A (en) * | 2022-04-06 | 2022-07-22 | 北京交通大学 | Traditional Chinese medicine dynamic diagnosis and treatment scheme optimization method and system based on deep reinforcement learning |
CN115050451A (en) * | 2022-08-17 | 2022-09-13 | 合肥工业大学 | Automatic generation system for clinical sepsis medication scheme |
CN115831340A (en) * | 2023-02-22 | 2023-03-21 | 安徽省立医院(中国科学技术大学附属第一医院) | ICU (intensive care unit) breathing machine and sedative management method and medium based on inverse reinforcement learning |
CN115985514A (en) * | 2023-01-09 | 2023-04-18 | 重庆大学 | Septicemia treatment system based on dual-channel reinforcement learning |
CN116453706A (en) * | 2023-06-14 | 2023-07-18 | 之江实验室 | Hemodialysis scheme making method and system based on reinforcement learning |
CN117010476A (en) * | 2023-08-11 | 2023-11-07 | 电子科技大学长三角研究院(衢州) | Multi-agent autonomous decision-making method based on deep reinforcement learning |
- 2023-11-23: application CN202311567874.0A granted as patent CN117275661B (status: Active)
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112071388A (en) * | 2019-06-10 | 2020-12-11 | 郑州大学第一附属医院 | Intelligent medicine dispensing and preparing method based on deep learning |
WO2021226064A1 (en) * | 2020-05-04 | 2021-11-11 | University Of Louisville Research Foundation, Inc. | Artificial intelligence-based systems and methods for dosing of pharmacologic agents |
WO2022067189A1 (en) * | 2020-09-25 | 2022-03-31 | Linus Health, Inc. | Systems and methods for machine-learning-assisted cognitive evaluation and treatment |
CN112420154A (en) * | 2020-11-25 | 2021-02-26 | 深圳市华嘉生物智能科技有限公司 | New coronary medication suggestion method based on deep learning neural network |
CN113257416A (en) * | 2020-12-09 | 2021-08-13 | 浙江大学 | COPD patient personalized management and tuning method, device and equipment based on deep learning |
CN113255735A (en) * | 2021-04-29 | 2021-08-13 | 平安科技(深圳)有限公司 | Method and device for determining medication scheme of patient |
CN113270189A (en) * | 2021-05-19 | 2021-08-17 | 复旦大学附属肿瘤医院 | Tumor treatment aid decision-making method based on reinforcement learning |
CN114388095A (en) * | 2021-12-22 | 2022-04-22 | 中山大学 | Sepsis treatment strategy optimization method, system, computer device and storage medium |
CN114330566A (en) * | 2021-12-30 | 2022-04-12 | 中山大学 | Method and device for learning sepsis treatment strategy |
CN114783571A (en) * | 2022-04-06 | 2022-07-22 | 北京交通大学 | Traditional Chinese medicine dynamic diagnosis and treatment scheme optimization method and system based on deep reinforcement learning |
CN115050451A (en) * | 2022-08-17 | 2022-09-13 | 合肥工业大学 | Automatic generation system for clinical sepsis medication scheme |
CN115985514A (en) * | 2023-01-09 | 2023-04-18 | 重庆大学 | Septicemia treatment system based on dual-channel reinforcement learning |
CN115831340A (en) * | 2023-02-22 | 2023-03-21 | 安徽省立医院(中国科学技术大学附属第一医院) | ICU (intensive care unit) breathing machine and sedative management method and medium based on inverse reinforcement learning |
CN116453706A (en) * | 2023-06-14 | 2023-07-18 | 之江实验室 | Hemodialysis scheme making method and system based on reinforcement learning |
CN117010476A (en) * | 2023-08-11 | 2023-11-07 | 电子科技大学长三角研究院(衢州) | Multi-agent autonomous decision-making method based on deep reinforcement learning |
Non-Patent Citations (4)
Title |
---|
PANAGIOTIS SYMEONIDIS et al.: "Deep Reinforcement Learning for Medicine Recommendation", 2022 IEEE 22nd International Conference on Bioinformatics and Bioengineering (BIBE), pages 85 - 90 *
FU QUN: "Research on Individualized Dosing of Tacrolimus in Kidney Transplant Recipients Based on Machine Learning Models", China Master's Theses Full-text Database, Medicine & Health Sciences, no. 02, pages 1 - 75 *
WU QING et al.: "Dose Recommendation of Levetiracetam for BECT Treatment Based on a Deep Q-Network", Chinese Journal of Modern Applied Pharmacy, vol. 39, no. 12, pages 1585 - 1590 *
DONG YUNYUN: "Research on Computer-Aided Diagnosis Methods for Lung Cancer Based on Medical Images and Gene Data", China Doctoral Dissertations Full-text Database, Medicine & Health Sciences, no. 1, pages 072 - 136 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117637188A (en) * | 2024-01-26 | 2024-03-01 | 四川省肿瘤医院 | Tumor chemotherapy response monitoring method, medium and system based on digital platform |
CN117637188B (en) * | 2024-01-26 | 2024-04-09 | 四川省肿瘤医院 | Tumor chemotherapy response monitoring method, medium and system based on digital platform |
CN118280512A (en) * | 2024-04-05 | 2024-07-02 | 泰昊乐生物科技有限公司 | Personalized treatment scheme recommendation method and system based on artificial intelligence |
CN118039062A (en) * | 2024-04-12 | 2024-05-14 | 四川省肿瘤医院 | Individualized chemotherapy dose remote control method based on big data analysis |
Also Published As
Publication number | Publication date |
---|---|
CN117275661B (en) | 2024-02-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117275661B (en) | Deep reinforcement learning-based lung cancer patient medication prediction method and device | |
CN109599177B (en) | Method for predicting medical treatment track through deep learning based on medical history | |
CN110880362B (en) | Large-scale medical data knowledge mining and treatment scheme recommending system | |
US9370689B2 (en) | System and methods for providing dynamic integrated wellness assessment | |
CN109087706B (en) | Human health assessment method and system based on sleep big data | |
JP7019127B2 (en) | Insulin assessment based on reinforcement learning | |
CN111105860A (en) | Intelligent prediction, analysis and optimization system for accurate motion big data for chronic disease rehabilitation | |
CN111798954A (en) | Drug combination recommendation method based on time attention mechanism and graph convolution network | |
Javad et al. | A reinforcement learning–based method for management of type 1 diabetes: exploratory study | |
CN116453706B (en) | Hemodialysis scheme making method and system based on reinforcement learning | |
US20200203020A1 (en) | Digital twin of a person | |
US20210089965A1 (en) | Data Conversion/Symptom Scoring | |
JP6962854B2 (en) | Water prescription system and water prescription program | |
CN114732402B (en) | Diabetes digital health management system based on big data | |
Wang et al. | Prediction models for glaucoma in a multicenter electronic health records consortium: the sight outcomes research collaborative | |
US11887736B1 (en) | Methods for evaluating clinical comparative efficacy using real-world health data and artificial intelligence | |
Oroojeni Mohammad Javad et al. | Reinforcement learning algorithm for blood glucose control in diabetic patients | |
CN116525117B (en) | Data distribution drift detection and self-adaption oriented clinical risk prediction system | |
Dogaru et al. | Big Data and Machine Learning Framework in Healthcare | |
US20230386656A1 (en) | Computerized system for the repeated determination of a set of at least one control parameters of a medical device | |
Mohanty et al. | A classification model based on an adaptive neuro-fuzzy inference system for disease prediction | |
CN118588226B (en) | Training method, optimizing method and device of antiepileptic medicinal strategy optimizing model | |
Rad et al. | Optimizing Blood Glucose Control through Reward Shaping in Reinforcement Learning | |
Rodriguez Leon et al. | Prediction of Blood Glucose Levels in Patients with Type 1 Diabetes via LSTM Neural Networks | |
Ranganathan et al. | Intelligent Inhalation Therapy for Cystic Fibrosis Using IoT and Machine Learning Solutions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |