CN118213087A - Disease transmission risk prediction method and system - Google Patents
Disease transmission risk prediction method and system
- Publication number
- CN118213087A (CN202410481857.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/80—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
Abstract
The invention relates to the technical field of disease transmission risk prediction, and in particular to a disease transmission risk prediction method and system. The method comprises the following steps: acquiring data for each user; associating and integrating personal data according to a personal identifier; comparing the relationships among users according to the user data to form a community network; analyzing the infectious diseases present in each user, overlaying risk sources through the community network, and analyzing each user's movement route and dwell time by means of a positioning module, so as to understand the user's activity at different places and evaluate the user's transmission risk; and grading the transmission risk of each place according to information on the places where users stay, the population of those places, and the data obtained from the preceding analysis. By combining data integration and association, community network analysis, and a positioning module, the invention can analyze and evaluate the transmission risk of users and places accurately and in real time, which helps to improve the level of disease prevention and control.
Description
Technical Field
The invention relates to the technical field of disease transmission risk prediction, and in particular to a disease transmission risk prediction method and system.
Background
During an epidemic of an infectious disease, understanding how the disease spreads is critical to developing targeted prevention and control strategies. Traditional health monitoring methods typically rely on patients actively reporting symptoms or on doctors diagnosing by observation. These methods suffer from lagging information and inaccurate infection data, which makes it difficult to predict the transmission risk of a disease accurately and in time. Developing a method and system that can predict disease transmission risk accurately and in real time therefore has important practical value.
Disclosure of Invention
To solve the above problems, the invention provides a disease transmission risk prediction method and system that use user data and community network information to analyze and evaluate the transmission risk of users and places accurately and in real time, which helps to improve the level of disease prevention and control.
To achieve the above purpose, the invention adopts the following technical solution:
A disease transmission risk prediction method, comprising the following steps:
acquiring data for each user;
associating and integrating personal data according to a personal identifier;
comparing the relationships among users according to the user data to form a community network;
analyzing the infectious diseases present in each user, overlaying risk sources through the community network, and analyzing each user's movement route and dwell time by means of a positioning module;
grading the transmission risk of each place according to information on the places where users stay, the population of those places, and the data obtained from the preceding analysis.
Further, the step of acquiring data for each user specifically includes:
obtaining user-related data from different data sources.
Further, the user-related data includes, but is not limited to, the user's demographic information, health status, and movement trajectory.
Further, the step of associating and integrating personal data according to the personal identifier includes:
using the personal identifier to associate and integrate the acquired user data, and building a personal information file for each user by matching data from different data sources; specifically:
Data preprocessing: cleaning and standardizing the data from each data source to ensure its consistency and accuracy;
Feature extraction: extracting key features from each data source, including name, mobile phone number, and email address;
Similarity calculation: comparing the degree of similarity between records using an edit distance algorithm;
User matching: based on the similarity results, treating records whose similarity exceeds a set threshold as belonging to the same user, and establishing the corresponding associations;
Information integration: merging the successfully matched user data into a personal information file that contains the information provided by each data source;
Data updating: updating and checking the user data regularly to keep the personal information file timely and accurate.
Further, the step of comparing the relationships among users according to the user data to form a community network specifically includes:
Step 1: extracting feature data from the user data, including user IDs, user attributes, and relationships between users; and cleaning and de-duplicating the extracted feature data;
Step 2: representing the cleaned and de-duplicated feature data as a graph structure in which each node represents a user and each edge represents a relationship between users; constructing a community network from the relationships among users, and identifying tightly connected user groups with the Louvain algorithm;
Step 3: dividing users into different communities or groups according to the result of the Louvain algorithm in Step 2; and using modularity to evaluate the quality of the communities and the degree of association between users, so as to further optimize the structure of the community network.
Further, in the step of comparing the relationships among users according to the user data to form a community network, a network containing the user relationships is formed by modeling social media and address book data.
Further, the step of analyzing the infectious diseases present in each user, overlaying risk sources through the community network, and analyzing each user's movement route and dwell time by means of the positioning module, so as to understand the user's activity at different places and evaluate the user's transmission risk, specifically includes:
analyzing each user's potential infectious disease risk according to the user's personal data and the community network, and further evaluating the user's transmission risk by overlaying the infectious disease risk sources of the community to which the user belongs; and, at the same time, analyzing the user's movement route and dwell time through the positioning module, so as to understand the user's activity at different places.
Further, the step of grading the transmission risk of each place according to information on the places where users stay, the population of those places, and the data obtained from the preceding analysis specifically includes:
Data collection: collecting user data, place information, and place population data, wherein the user data includes the user's location information and behavior data, the place information includes the type, size, and location of the place, and the place population data includes the number of people in the place and the composition of the crowd;
Data integration and cleaning: integrating the collected data and cleaning it to ensure its accuracy and completeness;
Model building: based on the collected data, building a transmission risk grade prediction model using a decision tree algorithm;
Feature engineering: transforming, combining, and screening the integrated and cleaned data to extract more meaningful features;
Model training and validation: training the model on historical data, and validating and optimizing it on validation data, to ensure the model's accuracy and generalization ability;
Transmission risk grade prediction: using the model to predict the transmission risk of different places, and assigning a corresponding transmission risk grade to each place.
A disease transmission risk prediction system, configured to perform the disease transmission risk prediction method described above.
Further, the system comprises a data acquisition module, an association and integration module, a community network modeling module, a transmission risk analysis module, and a risk grading module, wherein:
the data acquisition module is used for acquiring data for each user;
the association and integration module is used for associating and integrating personal data according to the personal identifier;
the community network modeling module is used for comparing the relationships among users according to the user data to form a community network;
the transmission risk analysis module is used for analyzing the infectious diseases present in each user, overlaying risk sources through the community network, and analyzing each user's movement route and dwell time by means of the positioning module, so as to understand the user's activity at different places and evaluate the user's transmission risk;
the risk grading module is used for grading the transmission risk of each place according to information on the places where users stay, the population of those places, and the data obtained from the preceding analysis.
The beneficial effects of the invention are as follows:
By combining data integration and association, community network analysis, and a positioning module, the invention can analyze and evaluate the transmission risk of users and places accurately and in real time, which helps to improve the level of disease prevention and control.
Through data cleaning, feature extraction, similarity calculation, user matching, information integration, and data updating, the invention achieves accurate association and integration of personal data and improves the usability and value of the data.
The community network is constructed from user data; through cleaning, graph representation, community identification, and quality evaluation, the relationships among users can be mined effectively, providing strong support for managing and analyzing user groups.
Drawings
Fig. 1 is a flowchart of a disease transmission risk prediction method provided by the present invention.
Detailed Description
Referring to Fig. 1, the disease transmission risk prediction method and system of the invention use user data and community network information to analyze and evaluate the transmission risk of users and places accurately and in real time, which helps to improve the level of disease prevention and control.
A disease transmission risk prediction method, comprising the following steps:
acquiring data for each user;
associating and integrating personal data according to a personal identifier;
comparing the relationships among users according to the user data to form a community network;
analyzing the infectious diseases present in each user, overlaying risk sources through the community network, and analyzing each user's movement route and dwell time by means of a positioning module, so as to understand the user's activity at different places and evaluate the user's transmission risk;
grading the transmission risk of each place according to information on the places where users stay, the population of those places, and the data obtained from the preceding analysis.
The disease transmission risk prediction method uses the association and integration of user data together with community network analysis, and combines them with a positioning module that analyzes each user's movement route and dwell time, so as to evaluate the user's activity and transmission risk at different places. The specific technical effects are as follows:
1. Data integration and association: by acquiring data for each user and associating and integrating it according to the personal identifier, user data of different sources and types can be processed and analyzed in a unified way, which improves the overall efficiency with which the data is used.
2. Community network analysis: by comparing user data, a network of relationships between users, that is, a community network, can be formed. This analysis can reveal close relationships and interactions between users and helps to discover potential pathways and risks of disease transmission.
3. Infectious disease analysis: by analyzing the infectious diseases present in each user, potential sources of transmission can be identified and overlaid onto the community network. The transmission risk of a community can thus be estimated more accurately, and timely measures can be taken for intervention, prevention, and control.
4. Movement route and dwell time analysis: the positioning module provides each user's movement route and dwell time, which shows the user's activity at different places. This helps to assess the user's transmission risk at a particular place and to understand the transmission potential of that place.
5. Place transmission risk grading: the transmission risk of each place is graded according to information on the places where users stay, the population of those places, and the data obtained from the analysis above. Risk assessment of different places provides guidance and decision support for disease prevention and control.
In general, by combining data integration and association, community network analysis, and a positioning module, the method can evaluate the transmission risk of individuals and communities more accurately and provide a scientific basis and decision support for disease prevention and control.
Further, the step of acquiring data for each user specifically includes:
obtaining user-related data from different data sources.
Further, the user-related data includes, but is not limited to, the user's demographic information, health status, and movement trajectory.
Further, the step of associating and integrating personal data according to the personal identifier includes:
using the personal identifier to associate and integrate the acquired user data, and building a personal information file for each user by matching data from different data sources; specifically:
Data preprocessing: cleaning and standardizing the data from each data source ensures its consistency and accuracy, improves the efficiency and accuracy of subsequent processing, and reduces mismatches caused by data quality problems.
Feature extraction: key features such as name, mobile phone number, and email address are extracted from each data source, so that different users can be identified and distinguished and an accurate personal information file can be built.
Similarity calculation: an edit distance algorithm is used to compare the degree of similarity between records, so that similar personal information can be identified effectively and the accuracy and reliability of data matching are improved.
User matching: based on the similarity results, records whose similarity exceeds a set threshold are treated as belonging to the same user and the corresponding associations are established, which integrates and associates personal information across data sources.
Information integration: the successfully matched user data is merged into a personal information file that contains the information provided by each data source, making the user information more complete and comprehensive.
Data updating: the user data is updated and checked regularly to keep the personal information file timely and accurate.
In summary, through data cleaning, feature extraction, similarity calculation, user matching, information integration, and data updating, this embodiment achieves accurate association and integration of personal data and improves the usability and value of the data.
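The matching steps above can be illustrated with a minimal sketch. The source names the edit distance algorithm and a similarity threshold but gives no concrete formula, so everything else here is an assumption made for illustration: the field names (`name`, `phone`, `email`), the normalisation rule, the averaging of per-field similarities, the threshold value 0.85, and the helper names `normalise`, `record_similarity`, and `link_records`.

```python
FIELDS = ("name", "phone", "email")  # assumed key features, following the text

def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Map edit distance to a 0..1 similarity (assumed normalisation)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest

def normalise(record):
    """Preprocessing: lower-case each key field and remove whitespace."""
    return {f: "".join(str(record.get(f, "")).lower().split()) for f in FIELDS}

def record_similarity(r1, r2):
    """Average the per-field similarities (an assumed aggregation rule)."""
    return sum(similarity(r1[f], r2[f]) for f in FIELDS) / len(FIELDS)

def link_records(records, threshold=0.85):
    """Group records into profiles; a record joins the first profile whose
    first record is at least `threshold` similar, else starts a new one."""
    profiles = []
    for rec in map(normalise, records):
        for profile in profiles:
            if record_similarity(rec, profile[0]) >= threshold:
                profile.append(rec)
                break
        else:
            profiles.append([rec])
    return profiles
```

Each returned profile plays the role of a personal information file; a production system would also keep the source of each record and re-run the linkage when data is updated.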
Further, the step of comparing the relationships among users according to the user data to form a community network specifically includes:
Step 1: extracting feature data from the user data, including user IDs, user attributes, and relationships between users; and cleaning and de-duplicating the extracted feature data;
Step 2: representing the cleaned and de-duplicated feature data as a graph structure in which each node represents a user and each edge represents a relationship between users; constructing a community network from the relationships among users, and identifying tightly connected user groups with the Louvain algorithm;
Step 3: dividing users into different communities or groups according to the result of the Louvain algorithm in Step 2; and using modularity to evaluate the quality of the communities and the degree of association between users, so as to further optimize the structure of the community network.
The technical effects of this embodiment are as follows:
Extracting feature data from the user data, including user IDs, user attributes, and relationships between users, helps to establish the connections between users. Cleaning and de-duplication remove noise and repeated information from the data and improve the accuracy and reliability of subsequent analysis.
Representing the cleaned and de-duplicated feature data as a graph structure displays the relationships among users effectively. Each node represents a user and each edge represents a relationship between users, so the relationships within and between user groups can be presented intuitively. Identifying tightly connected user groups with the Louvain algorithm effectively reveals the community structure and supports a finer division and grouping of users.
Dividing users into different communities or groups according to the result of the Louvain algorithm allows user groups to be managed and classified effectively. Evaluating the quality of the communities and the degree of association between users with modularity gives a quantitative measure of the community network structure and a basis for further optimization. This evaluation can reveal problems in the community network and guide optimization of its structure, improving the quality and efficiency of the community network.
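Steps 1 to 3 can be sketched as follows. This is a simplified illustration and not the full Louvain algorithm: it implements only Louvain's first phase (greedy local moving of single nodes) and omits the community aggregation phase, and it recomputes modularity by brute force rather than with the incremental gain formula. The function names and the ID cleaning rule (trim and lower-case) are assumptions.

```python
from collections import defaultdict

def build_graph(relations):
    """Steps 1-2: clean and de-duplicate (user, user) relationship records
    into an undirected adjacency map."""
    adj = defaultdict(set)
    for a, b in relations:
        a, b = a.strip().lower(), b.strip().lower()
        if a and b and a != b:        # drop empty IDs and self-loops
            adj[a].add(b)             # sets remove duplicate edges
            adj[b].add(a)
    return dict(adj)

def modularity(adj, community):
    """Modularity Q of a partition (community: node -> label)."""
    two_m = sum(len(nbrs) for nbrs in adj.values())  # 2m = sum of degrees
    if two_m == 0:
        return 0.0
    internal = defaultdict(int)  # intra-community edge ends (2 per edge)
    degree = defaultdict(int)    # total degree per community
    for u, nbrs in adj.items():
        degree[community[u]] += len(nbrs)
        for v in nbrs:
            if community[u] == community[v]:
                internal[community[u]] += 1
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)

def local_moving(adj):
    """Louvain phase one only: move each node to the neighbouring community
    that most increases modularity, until no move helps."""
    community = {u: u for u in adj}          # start with singletons
    improved = True
    while improved:
        improved = False
        for u in sorted(adj):
            original = community[u]
            best_label, best_q = original, modularity(adj, community)
            for label in sorted({community[v] for v in adj[u]}):
                if label == original:
                    continue
                community[u] = label         # try the move
                q = modularity(adj, community)
                if q > best_q + 1e-12:
                    best_label, best_q = label, q
            community[u] = best_label        # keep the best move, if any
            if best_label != original:
                improved = True
    return community
```

For two triangles of users joined by a single relationship, `local_moving` places each triangle in its own community. For real data, an established library implementation of Louvain would be the more practical choice.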
Modularity is an index for evaluating the quality of a partition of a network and is widely used in community detection and network analysis. It measures community quality by comparing the density of connections inside communities with the density that would be expected in a random network with the same node degrees. A high modularity value indicates that connections within communities are dense and connections between communities are sparse. Evaluating communities by modularity therefore helps to show how tight the community structure is and how closely users are associated. In a social network or online community, optimizing the structure of the community network can help improve user satisfaction and participation and promote information transmission and interaction. By continually adjusting the structure so that connections inside communities become tighter and connections between communities become sparser, interaction and communication among users can be strengthened, and the activity and cohesion of communities can be improved. Modularity-based community optimization can thus improve the structure of the community network, and with it the quality of the communities and the degree of association between users.
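For reference, modularity is commonly written in the following form (Newman's definition for an undirected, unweighted graph). The source gives no formula, so the symbols are introduced here for illustration: $A_{ij}$ is 1 if users $i$ and $j$ are connected and 0 otherwise, $k_i$ is the degree of user $i$, $m$ is the number of edges, and $c_i$ is the community of user $i$.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Here $\delta(c_i, c_j)$ is 1 when users $i$ and $j$ are in the same community and 0 otherwise, so $Q$ increases when communities contain more internal edges than a random graph with the same degrees would be expected to give them.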
In general, this embodiment uses user data to construct a community network. Through cleaning, graph representation, community identification, and quality evaluation, the relationships among users can be mined effectively, providing strong support for managing and analyzing user groups and thereby improving the efficiency of the community network and the user interaction experience.
Further, in the step of comparing the relationships among users according to the user data to form a community network, a network containing the user relationships is formed by modeling social media and address book data. This has the following effects:
1. Modeling social media and address book data: the relationship network among users is modeled by analyzing the users' data on social media platforms and in address books. This helps the system understand the connections and interactions between users more fully and provides basic data for the subsequent construction of the community network.
2. Building a personalized community network: by comparing the relationship data between users, the system can identify groups of users with similar or close relationships and thereby form a personalized community network. This refined construction of the community network can improve the user experience and strengthen the cohesion and activity of communities.
3. User relationship analysis and optimization: once the community network is formed, the system can analyze the relationships among users in greater depth and explore the patterns and trends in their social behavior. Based on these results, more intelligent and personalized services and recommendations can be provided to users, improving their satisfaction and participation.
Further, the step of analyzing the infectious diseases present in each user, overlaying risk sources through the community network, and analyzing each user's movement route and dwell time by means of the positioning module, so as to understand the user's activity at different places and evaluate the user's transmission risk, comprises the following steps:
analyzing each user's potential infectious disease risk according to the user's personal data and the community network, and further evaluating the user's transmission risk by overlaying the infectious disease risk sources of the community to which the user belongs; and, at the same time, analyzing the user's movement route and dwell time through the positioning module, so as to understand the user's activity at different places.
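The source does not specify how risk sources are overlaid or how dwell time is derived from positioning data, so the following is only one possible sketch under stated assumptions. `dwell_times` assumes the positioning module yields time-ordered `(timestamp_seconds, place_id)` fixes and credits each interval to the place of the earlier fix. `overlay_risk` assumes a noisy-OR combination of the user's own risk with the risks of community members, with a hypothetical `contact_weight` parameter.

```python
from collections import defaultdict

def dwell_times(fixes):
    """Total seconds spent at each place for one user.

    fixes: list of (timestamp_seconds, place_id), sorted by time.
    The interval between two consecutive fixes is credited to the place
    of the earlier fix; the last fix contributes no time.
    """
    totals = defaultdict(float)
    for (t0, place), (t1, _next_place) in zip(fixes, fixes[1:]):
        totals[place] += t1 - t0
    return dict(totals)

def overlay_risk(own_risk, member_risks, contact_weight=0.1):
    """Assumed noisy-OR overlay of community risk sources.

    own_risk and each entry of member_risks are probabilities in [0, 1];
    contact_weight scales how strongly each community member's risk
    carries over to the user (a made-up parameter).
    """
    p_safe = 1.0 - own_risk
    for risk in member_risks:
        p_safe *= 1.0 - contact_weight * risk
    return 1.0 - p_safe
```

With fixes at 0 s and 600 s in a mall, 900 s in a school, and 1500 s back in the mall, `dwell_times` attributes 900 s to the mall and 600 s to the school.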
Further, the step of grading the transmission risk of each place according to information on the places where users stay, the population of those places, and the data obtained from the preceding analysis comprises the following steps:
Data collection: collecting user data, place information, and place population data, wherein the user data includes the user's location information and behavior data, the place information includes the type, size, and location of the place, and the place population data includes the number of people in the place and the composition of the crowd;
Data integration and cleaning: integrating the collected data and cleaning it to ensure its accuracy and completeness;
Model building: based on the collected data, building a transmission risk grade prediction model using a decision tree algorithm;
Feature engineering: transforming, combining, and screening the integrated and cleaned data to extract more meaningful features;
Model training and validation: training the model on historical data, and validating and optimizing it on validation data, to ensure the model's accuracy and generalization ability;
Transmission risk grade prediction: using the model to predict the transmission risk of different places, and assigning a corresponding transmission risk grade to each place.
Some examples of the above steps are as follows:
and (3) data collection:
User data: the user's location information and behavioral data are collected by various data sources such as mobile applications, sensors, and the like. Such as the length of time, frequency, etc. that the user is staying at a location.
Location information: information about the type (e.g., mall, school, hospital, etc.), size, and specific location of the venue is collected for subsequent analysis.
Venue population data: the number of people recorded in the place and the crowd constitution comprise information of different age groups, sexes, professions and the like.
Data integration and cleaning:
Data from different sources is integrated and duplicate records are removed; cleaning operations, including the handling of missing values, outliers and noisy data, are performed to ensure high data quality and consistency.
The user data, venue information and venue population data are linked to one another, providing an accurate data basis for subsequent modeling.
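As an illustration of the integration and cleaning step, the following minimal Python sketch removes duplicate stay records, fills a missing dwell time with the median of the plausible values, drops outliers, and joins each record to its venue. The field names (`user_id`, `venue_id`, `start`, `minutes`), the median imputation and the 24-hour cap are assumptions made for this example and are not specified by the embodiment.

```python
from statistics import median

def clean_and_join(stays, venues, max_minutes=24 * 60):
    """Deduplicate stay records, impute missing dwell times, drop outliers,
    and attach venue attributes. All field names are hypothetical."""
    # 1. Remove exact duplicates while preserving the original order.
    seen, unique = set(), []
    for s in stays:
        key = (s["user_id"], s["venue_id"], s["start"], s["minutes"])
        if key not in seen:
            seen.add(key)
            unique.append(s)
    # 2. Median of the known, in-range dwell times, used to fill missing values.
    known = [s["minutes"] for s in unique
             if s["minutes"] is not None and 0 <= s["minutes"] <= max_minutes]
    fill = median(known) if known else 0
    # 3. Drop out-of-range records and records whose venue is unknown,
    #    then merge the venue attributes into each remaining record.
    cleaned = []
    for s in unique:
        minutes = fill if s["minutes"] is None else s["minutes"]
        if not 0 <= minutes <= max_minutes or s["venue_id"] not in venues:
            continue
        cleaned.append({**s, "minutes": minutes, **venues[s["venue_id"]]})
    return cleaned
```

Joining on a venue identifier is one simple way to link user data with venue information; a production system would also need to reconcile identifiers that differ between data sources.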
Model establishment:
A decision tree algorithm is used to build a transmission risk level prediction model from the integrated and cleaned data. A decision tree presents the relationships between data features visually, which makes the model results easier to understand and interpret.
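To make the decision tree step concrete, the following is a minimal sketch of a classification tree that splits on Gini impurity, written in plain Python. The two features (people per 100 square metres and average dwell time in minutes), the three risk levels and the training rows are made-up illustrative values, and the embodiment does not state which splitting criterion or tree depth it uses.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a sequence of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature index, threshold) that lowers weighted Gini most,
    or None when no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively build the tree; each leaf holds the majority risk level."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f, "threshold": t,
        "left": build_tree(*zip(*left), depth + 1, max_depth),
        "right": build_tree(*zip(*right), depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return its risk level."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical historical venues: (people per 100 m2, average dwell minutes).
ROWS = [(2, 10), (3, 15), (10, 30), (12, 45), (30, 60), (35, 120)]
LEVELS = ["low", "low", "medium", "medium", "high", "high"]
TREE = build_tree(ROWS, LEVELS)
print(predict(TREE, (11, 40)))  # medium
```

Because every internal node is a single threshold on one feature, the fitted tree can be read as a short list of rules, which is the interpretability property mentioned above.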
Feature engineering:
The integrated and cleaned data is processed further through feature transformation, combination and selection to extract more meaningful features. For example, feature scaling and one-hot encoding may be used to enhance the expressive power of the model.
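The feature scaling and one-hot encoding mentioned above can be sketched as follows. The combined feature (crowd density, computed as people divided by floor area) and the field names `type`, `people` and `area_m2` are assumptions chosen for illustration.

```python
def min_max_scale(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Encode categorical values as 0/1 vectors over the sorted categories."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]

def build_features(venues):
    """Combine people and area into a density feature, scale it,
    and append the one-hot encoded venue type."""
    density = [v["people"] / v["area_m2"] for v in venues]
    scaled = min_max_scale(density)
    _, encoded = one_hot([v["type"] for v in venues])
    return [[d] + e for d, e in zip(scaled, encoded)]

venues = [
    {"type": "mall", "people": 500, "area_m2": 2000},
    {"type": "school", "people": 500, "area_m2": 1000},
    {"type": "hospital", "people": 375, "area_m2": 500},
]
print(build_features(venues))
# [[0.0, 0, 1, 0], [0.5, 0, 0, 1], [1.0, 1, 0, 0]]
```

Tree-based models do not strictly require scaled inputs, so scaling matters mainly if the same features are also fed to distance-based or linear models.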
Model training and validation:
The established transmission risk level prediction model is trained with historical data, then validated and optimized with validation data. Repeated iterations of training and validation ensure the accuracy and generalization ability of the model, so that it can be applied to transmission risk prediction at different venues.
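The train-then-validate loop can be illustrated with a hold-out split. In the sketch below a one-threshold classifier (a depth-1 decision stump on crowd density) stands in for the full model so that the example stays short; the synthetic records, the 25% validation share and the fixed random seed are all assumptions.

```python
import random

def split(records, validation_ratio=0.25, seed=0):
    """Shuffle a copy of the records and split it into train and validation sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_ratio))
    return shuffled[:cut], shuffled[cut:]

def accuracy(threshold, records):
    """Share of records where 'density > threshold' matches the high-risk label."""
    hits = sum((density > threshold) == is_high for density, is_high in records)
    return hits / len(records)

def fit_stump(train):
    """Choose the density threshold with the highest training accuracy."""
    candidates = sorted({density for density, _ in train})
    return max(candidates, key=lambda t: accuracy(t, train))

# Synthetic history: (crowd density, was the venue high risk?).
history = [(d, d > 10) for d in range(1, 21)]
train, validation = split(history)
threshold = fit_stump(train)
print("train accuracy:", accuracy(threshold, train))
print("validation accuracy:", accuracy(threshold, validation))
```

A gap between training and validation accuracy is the signal that the model has overfitted the historical data and needs to be simplified or retrained.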
Predicting the transmission risk level:
Finally, the established model is used to predict the transmission risk of different venues, and a corresponding transmission risk level is assigned to each venue according to the prediction result. This helps decision makers take targeted prevention and control measures, reducing the transmission risk of infectious diseases and safeguarding public health.
In summary, this embodiment combines user data, venue information and venue population data with factors such as disease transmission potential and crowd density to assign a transmission risk level to each venue, thereby improving the efficiency and accuracy of disease prevention and control.
A disease transmission risk prediction system for performing the disease transmission risk prediction method described above.
Further, the system comprises a data acquisition module, an association integration module, a community network modeling module, a transmission risk analysis module and a risk grading module, wherein:
The data acquisition module is used for acquiring the data of each user;
The association integration module is used for performing association integration of personal data according to the personal identifier;
The community network modeling module is used for comparing the relationships among users according to the user data to form a community network;
The transmission risk analysis module is used for analyzing the infectious diseases of each user, superimposing risk sources through the community network, and analyzing the user's route of movement and dwell time by means of the positioning module, so as to understand the user's activity at different venues and evaluate the user's transmission risk;
The risk grading module is used for grading the transmission risk of venues according to the venues where users stay, the venue population and the data.
Embodiment 1 of the invention provides a disease transmission risk prediction method, which comprises the following steps:
Step one, acquiring user data. Data relating to the user is obtained from different data sources. Such data may include, but is not limited to, the user's demographic information, health status and movement trajectory.
Step two, performing association integration of personal data according to the personal identifier. The acquired user data is associated and integrated using the personal identifier, and a personal information file is established for each user by matching the data from the different data sources.
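The claims name an edit distance algorithm and a similarity threshold for this matching step. The following minimal sketch shows one way the two could fit together: Levenshtein distance is normalized into a similarity between 0 and 1, averaged over the name, phone and email fields, and compared with a threshold. The equal weighting of the fields, the 0.8 threshold and the sample records are assumptions, not values given in the patent.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Normalize the edit distance into a similarity between 0 and 1."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

def same_user(rec_a, rec_b, threshold=0.8):
    """Treat two records as one user when their mean field similarity
    reaches the threshold (hypothetical rule)."""
    keys = ("name", "phone", "email")
    score = sum(similarity(rec_a[k].lower(), rec_b[k].lower()) for k in keys) / len(keys)
    return score >= threshold

a = {"name": "Zhang Wei", "phone": "13800000000", "email": "zhangwei@example.com"}
b = {"name": "Zhang Wei ", "phone": "13800000000", "email": "zhang.wei@example.com"}
c = {"name": "Li Na", "phone": "13911112222", "email": "lina@example.org"}
print(same_user(a, b), same_user(a, c))  # True False
```

Records judged to belong to the same user would then be merged into a single personal information file.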
Step three, comparing the relationships among users according to the user data to form a community network. The community network is established by analyzing the contacts and interactions among users. Connections between users can be modeled from data such as social media and address books to form a network containing the relationships of every user.
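The claims name the Louvain algorithm for finding tightly connected user groups and modularity for judging the quality of the resulting communities. Louvain itself is too long to reproduce here, so the sketch below only shows the quantity it optimizes: the modularity Q of a given partition of an undirected, unweighted graph, computed as the sum over communities of L_c / m - (d_c / 2m)^2, where m is the number of edges, L_c the number of edges inside community c and d_c the total degree of its members. The example graph and partition are made up, and the edge list is assumed to contain no duplicates or self-loops.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: total degree of the nodes in c
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Two triangles of users joined by a single contact between c and d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
print(modularity(edges, two_groups) > modularity(edges, one_group))  # True
```

A higher Q means that more contacts fall inside communities than would be expected by chance, which is why the two-triangle partition scores above the single-group partition.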
Step four, analyzing the infectious diseases of each user, superimposing risk sources through the community network, and analyzing the user's route of movement and dwell time by means of the positioning module. The potential infectious disease risk of each user is analyzed from the user's personal data and the community network. The user's transmission risk is then further evaluated by superimposing the infectious disease risk sources of the community to which the user belongs. Meanwhile, the user's route of movement and dwell time are analyzed through the positioning module, so that the user's activity at different venues is known.
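The dwell-time part of this step can be sketched as follows, assuming the positioning module delivers timestamped pings that have already been resolved to a venue. The input format (ISO 8601 timestamp plus venue name) and the rule that the interval between two consecutive pings is credited to the venue of the earlier ping are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime

def dwell_minutes(pings):
    """Total minutes one user spent at each venue.

    pings: list of (ISO 8601 timestamp, venue) pairs in any order.
    The time between consecutive pings is credited to the earlier ping's venue.
    """
    ordered = sorted((datetime.fromisoformat(ts), venue) for ts, venue in pings)
    totals = defaultdict(float)
    for (t0, venue), (t1, _) in zip(ordered, ordered[1:]):
        totals[venue] += (t1 - t0).total_seconds() / 60
    return dict(totals)

pings = [
    ("2024-04-01T09:00:00", "mall"),
    ("2024-04-01T09:30:00", "mall"),
    ("2024-04-01T10:00:00", "school"),
    ("2024-04-01T10:45:00", "mall"),
    ("2024-04-01T11:00:00", "mall"),
]
print(dwell_minutes(pings))  # {'mall': 75.0, 'school': 45.0}
```

The ordered sequence of venues gives the route of movement, and the per-venue totals give the dwell times used in the subsequent venue grading.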
Step five, grading the transmission risk of venues according to the data, the venues where users stay, the venue population and the like. The transmission risk of different venues is assessed and graded based on factors such as user data, venue information and venue population. By considering factors such as the transmission potential of the disease at each venue and the crowd density of the venue, a corresponding transmission risk level is assigned to each venue.
Embodiment 2 of the present invention provides a disease transmission risk prediction system corresponding to the disease transmission risk prediction method described in Embodiment 1. The system comprises a data acquisition module, an association integration module, a community network modeling module, a transmission risk analysis module and a risk grading module, each of which implements the corresponding step of the method described above. The system uses computer technology and algorithms to predict and assess disease transmission risk efficiently.
The advantage of the method is that, by combining information from multiple data sources with the community network, disease transmission risk can be predicted more accurately. Compared with traditional health monitoring methods, the method is more timely, accurate and comprehensive, which helps improve the prevention and control of infectious diseases.
The above embodiments merely illustrate preferred embodiments of the present invention and are not intended to limit its scope. Various modifications and improvements made by those skilled in the art to the technical solution of the present invention, without departing from the spirit of its design, shall fall within the scope of protection defined by the claims.
Claims (10)
1. A disease transmission risk prediction method, comprising the following steps:
acquiring user data;
performing association integration of personal data according to a personal identifier;
comparing the relationships among users according to the user data to form a community network;
analyzing the infectious diseases of each user, superimposing risk sources through the community network, and analyzing the user's route of movement and dwell time by means of a positioning module;
and grading the transmission risk of venues according to the venues where users stay, the venue population and the data.
2. The disease transmission risk prediction method according to claim 1, wherein the step of acquiring user data specifically comprises:
obtaining data relating to the user from different data sources.
3. The disease transmission risk prediction method according to claim 2, wherein the data relating to the user includes, but is not limited to, the user's demographic information, health status and movement trajectory.
4. The disease transmission risk prediction method according to claim 1, wherein the step of performing association integration of personal data according to the personal identifier comprises:
associating and integrating the acquired user data using the personal identifier, and establishing a personal information file for each user by matching the data from different data sources; the step specifically comprises:
data preprocessing: cleaning and standardizing the data of each data source to ensure the consistency and accuracy of the data;
feature extraction: extracting key features from each data source, including name, mobile phone number and email address;
similarity calculation: comparing the degree of similarity between different data by means of an edit distance algorithm;
user matching: based on the similarity calculation result, matching data whose similarity is higher than a set threshold to the same user, and establishing the association between users;
information integration: integrating the successfully matched user data into a personal information file, wherein the personal information file comprises the information provided by each data source;
data updating: updating and checking the user data regularly to ensure the timeliness and accuracy of the personal information file.
5. The disease transmission risk prediction method according to claim 1, wherein the step of comparing the relationships among users according to the user data to form a community network comprises:
Step 1, extracting feature data from the user data, including user ID, user attributes and the relationships between users; and cleaning and de-duplicating the extracted feature data;
Step 2, representing the cleaned and de-duplicated feature data as a graph structure, wherein each node represents a user and each edge represents a relationship between users; constructing the community network according to the relationships among users, and identifying closely connected user groups using the Louvain algorithm;
Step 3, dividing users into different communities or groups according to the identification result of the Louvain algorithm in Step 2; and using modularity to evaluate the quality of the communities and the degree of association between users, so as to further optimize the structure of the community network.
6. The disease transmission risk prediction method according to claim 1, wherein the step of comparing the relationships among users according to the user data to form a community network comprises modeling a network containing the relationships of every user based on social media and address book data.
7. The disease transmission risk prediction method according to claim 1, wherein the step of analyzing the infectious diseases of each user, superimposing risk sources through the community network, and analyzing the user's route of movement and dwell time by means of the positioning module, so as to understand the user's activity at different venues and evaluate the user's transmission risk, specifically comprises:
analyzing the potential infectious disease risk of each user according to the user's personal data and the community network, and further evaluating the user's transmission risk by superimposing the infectious disease risk sources of the community to which the user belongs; and meanwhile analyzing the user's route of movement and dwell time through the positioning module, so that the user's activity at different venues is known.
8. The disease transmission risk prediction method according to claim 1, wherein the step of grading the transmission risk of venues according to the venues where users stay, the venue population and the data comprises:
data collection: collecting user data, venue information and venue population data, wherein the user data comprises location information and behavior data of the user, the venue information comprises the type, size and location of the venue, and the venue population data comprises the number of people in the venue and the composition of the crowd;
data integration and cleaning: integrating the collected data and cleaning it to ensure its accuracy and completeness;
model establishment: based on the collected data, establishing a transmission risk level prediction model using a decision tree algorithm;
feature engineering: transforming, combining and selecting the integrated and cleaned data to extract more meaningful features;
model training and validation: training the established model with historical data, and validating and optimizing it with validation data, so as to ensure the accuracy and generalization ability of the model;
predicting the transmission risk level: predicting the transmission risk of different venues using the established model, and assigning a corresponding transmission risk level to each venue.
9. A disease transmission risk prediction system for performing the disease transmission risk prediction method according to any one of claims 1 to 8.
10. The disease transmission risk prediction system according to claim 9, wherein the system comprises a data acquisition module, an association integration module, a community network modeling module, a transmission risk analysis module and a risk grading module, wherein:
the data acquisition module is used for acquiring the data of each user;
the association integration module is used for performing association integration of personal data according to the personal identifier;
the community network modeling module is used for comparing the relationships among users according to the user data to form a community network;
the transmission risk analysis module is used for analyzing the infectious diseases of each user, superimposing risk sources through the community network, and analyzing the user's route of movement and dwell time by means of the positioning module, so as to understand the user's activity at different venues and evaluate the user's transmission risk;
the risk grading module is used for grading the transmission risk of venues according to the venues where users stay, the venue population and the data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410481857.3A CN118213087A (en) | 2024-04-22 | 2024-04-22 | Disease transmission risk prediction method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118213087A (en) | 2024-06-18 |
Family
ID=91448449
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410481857.3A Pending CN118213087A (en) | 2024-04-22 | 2024-04-22 | Disease transmission risk prediction method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118213087A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111192153A (en) * | 2019-12-19 | 2020-05-22 | 浙江大搜车软件技术有限公司 | Crowd relation network construction method and device, computer equipment and storage medium |
CN113161006A (en) * | 2021-03-24 | 2021-07-23 | 南方科技大学 | Close contact person infection risk assessment method, close contact person infection risk assessment device, electronic equipment and storage medium |
CN115116621A (en) * | 2022-06-17 | 2022-09-27 | 北京航空航天大学 | Epidemic disease time-space risk modeling and predicting method based on mobile communication data |
CN115240869A (en) * | 2022-07-18 | 2022-10-25 | 石会文 | Intelligent infectious disease monitoring and early warning system |
CN117035388A (en) * | 2022-04-29 | 2023-11-10 | 医渡云(北京)技术有限公司 | Risk assessment method and system for target place, terminal equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||