CN116091253A - Medical insurance wind control data acquisition method and device - Google Patents
- Publication number
- CN116091253A (application number CN202310366286.4A)
- Authority
- CN
- China
- Prior art keywords
- medical data
- view information
- column
- original medical
- information table
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Business, Economics & Management (AREA)
- Medical Informatics (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Data Mining & Analysis (AREA)
- Primary Health Care (AREA)
- General Health & Medical Sciences (AREA)
- Epidemiology (AREA)
- Marketing (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Technology Law (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention discloses a medical insurance wind control data acquisition method and device. The method comprises the following steps: extracting content features from each column in at least one original medical data table obtained in advance; matching the content features of each column with weight configuration information of a preset view information table, and determining the weight values corresponding to the matched content features in the original medical data table; calculating a correlation value between each original medical data table and each target view information table to be generated according to the weight values corresponding to the matched content features; and displaying the original medical data table in association with the target view information table according to the correlation value corresponding to the view information table. Business personnel can then conveniently select useful information from the displayed original medical data tables to construct the target view information table, which improves the efficiency of constructing information views and provides a data source for subsequent business operations.
Description
Technical Field
The invention relates to the technical field of medical insurance wind control, in particular to a method and a device for acquiring medical insurance wind control data.
Background
Modern hospitals generally use a hospital information system (HIS) to store and manage the medical information of outpatients and inpatients. However, different hospitals use HIS products from different manufacturers, so the data formats and table information in these systems are inconsistent, and the headers and names of a considerable portion of the tables are so hard to read that the meaning of the tables cannot be understood.
Some business scenarios require subsequent analysis of data in the HIS, for example, acquiring medical insurance wind control data based on the original medical data tables. In such a scenario, the required information must be extracted from the original data tables of the different HIS systems to construct an information view, for example by generating a view information table, for subsequent data analysis and abnormal condition detection.
However, for the reasons above, the original medical data tables of the HIS systems in different hospitals are poorly readable, huge in number, and disordered in content, with no unified data standard. This makes constructing the information view difficult, and an efficient auxiliary means for constructing it is lacking.
Disclosure of Invention
The present invention has been made in view of the above-mentioned problems, and it is an object of the present invention to provide a method and apparatus for collecting medical insurance wind control data that overcomes or at least partially solves the above-mentioned problems.
In a first aspect, an embodiment of the present invention provides a method for collecting medical insurance wind control data, including:
extracting content characteristics from each column in at least one original medical data table obtained in advance;
matching the content features of each column with weight configuration information of a preset view information table, and determining weight values corresponding to the matched content features in the original medical data table; the view information table is a table obtained by extracting required information according to the original medical data table;
calculating the correlation value of each original medical data table and each target view information table to be generated according to the weight value corresponding to the matched content characteristics;
and according to the correlation value corresponding to the view information table, carrying out association display on the original medical data table and the target view information table.
In one embodiment, the extracting the content features from each column in the at least one pre-obtained original medical data table includes:
performing feature extraction on each cell data of at least one original medical data table obtained in advance by using a preset feature extraction model to obtain feature vectors;
inputting the feature vector into a medical data classification model obtained by training in advance to obtain a model classification label corresponding to each piece of cell data;
and according to the model classification labels, content characteristics corresponding to each column in the original medical data table are statistically determined.
In one embodiment, the statistically determining content features corresponding to each column in the original medical data table according to the model classification tags includes:
determining the model classification label with the largest occurrence number in the model classification labels of all the cell data of each column in the original medical data table as a mode label;
taking the mode label as a column classification label corresponding to the column;
and taking all the column classification labels of all columns in each original medical data table as the content characteristics of each original medical data table.
In one embodiment, before the step of using the mode label as the column classification label corresponding to the column, the method further includes:
judging whether the quantity proportion of the mode labels in the columns exceeds a threshold value or not;
if yes, executing the step of taking the mode label as a column classification label corresponding to the column;
if not, taking the preset default label as the column classification label corresponding to the column.
In one embodiment, the weight configuration information of the view information table includes: each medical data type and corresponding weight value contained in each view information table;
the matching of the content characteristics of each column of each original medical data table with the weight configuration information of the preset view information table, and the determination of the weight value corresponding to the matched content characteristics in the original medical data table comprise the following steps:
matching the content characteristics of each original medical data table with each medical data type of each view information table in the weight configuration information;
in the event that the content characteristics of any one of the original medical data tables partially or fully match the medical data types contained in any one of the view information tables, the weight value of the medical data type is determined as the weight value of the content characteristics matching the medical data type.
In one embodiment, calculating a relevance value of each original medical data table and each target view information table to be generated according to the weight value corresponding to the matched content feature includes:
calculating the weight value of the content characteristics of the original medical data table matched with the medical data type of the target view information table aiming at each original medical data table and each target view information table to obtain the correlation value of the original medical data table and the target view information table;
the calculating includes: averaging, summing or weighted averaging.
In one embodiment, the medical data classification model is pre-trained by:
determining a model classification label corresponding to cell data in a pre-generated view information table;
extracting feature vectors from cell data in a pre-generated view information table by using a preset feature extraction model to obtain feature vector data corresponding to each cell data;
generating sample data according to the feature vector data corresponding to each cell data and the model classification labels;
and training a preset classifier model by using the sample data to obtain a medical data classification model.
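The training steps above can be sketched as follows. This is only an illustration under stated assumptions: `toy_features` is a hypothetical stand-in for the preset feature extraction model (it hashes character counts rather than using a model such as BERT), the nearest-centroid classifier is a stand-in for the preset classifier model, and the labelled cells are invented examples rather than data from a real view information table.

```python
import math
from collections import defaultdict

def toy_features(text, dim=16):
    """Hypothetical stand-in for the feature extraction model:
    normalized character-count hashing into a fixed-length vector."""
    vec = [0.0] * dim
    for ch in text:
        vec[ord(ch) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def build_samples(labelled_cells):
    """Pair each cell's feature vector with its model classification label."""
    return [(toy_features(text), label) for text, label in labelled_cells]

def train_centroids(samples):
    """Stand-in classifier: store one mean feature vector per label."""
    sums, counts = {}, defaultdict(int)
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(model, text):
    """Return the label whose centroid has the largest dot product with the cell."""
    vec = toy_features(text)
    return max(model, key=lambda label: sum(a * b for a, b in zip(model[label], vec)))

# Invented (cell data, label) pairs standing in for a pre-generated view information table.
cells = [
    ("aspirin 100mg", "medicine"),
    ("ibuprofen 200mg", "medicine"),
    ("2023-01-05", "time date"),
    ("2022-11-30", "time date"),
]
samples = build_samples(cells)
model = train_centroids(samples)
print(classify(model, "2023-03-17"))
```

In practice the two stand-ins would be replaced by the actual feature extraction model and classifier; the sketch only shows how labels, feature vectors and sample data fit together.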
In a second aspect, an embodiment of the present invention provides a medical insurance wind control data acquisition device, including:
the extraction module is used for extracting content characteristics of each column in at least one original medical data table obtained in advance;
the weight value determining module is used for matching the content characteristics of each column with weight configuration information of a preset view information table and determining weight values corresponding to the matched content characteristics in the original medical data table;
the computing module is used for computing the correlation value of each original medical data table and each target view information table to be generated according to the weight value corresponding to the matched content characteristics;
and the data display module is used for carrying out association display on the original medical data table and the target view information table according to the correlation value corresponding to the view information table.
In a third aspect, an embodiment of the present invention provides a server, including: the system comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the medical insurance wind control data acquisition method when executing the program.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium storing a computer program which, when executed by a processor, implements a medical insurance wind-controlled data collection method as described above.
The technical scheme provided by the embodiment of the invention has the beneficial effects that at least:
According to the medical insurance wind control data acquisition method and device, the content features of the original medical data table are extracted, and the degree of matching between the original medical data table and the view information table to be generated can be accurately identified through the weight configuration information of the preset view information table. The original medical data table and the view information table to be generated are displayed in association according to this degree of matching, so that business personnel can conveniently select useful information from the displayed original medical data table to construct the target view information table. This improves the efficiency of constructing information views and provides a data source for subsequent business operations.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of a medical insurance wind control data acquisition method in an embodiment of the invention;
FIG. 2 is a flow chart of content feature extraction for each column in at least one pre-obtained raw medical data table in accordance with an embodiment of the present invention;
FIG. 3 is a diagram of an original medical data table in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a prediction result of a model classification label of an emergency prescription information table according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a portion of the content of a weight configuration file according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an AI-assisted acquisition interface of a medical insurance wind control platform in an embodiment of the invention;
FIG. 7 is a flowchart of a method of training a medical data classification model in accordance with an embodiment of the present invention;
FIG. 8 is a block diagram of a medical insurance wind control data acquisition device according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In order to realize collection and data display of medical insurance wind control data and better assist in constructing an information view, the embodiment of the invention provides a medical insurance wind control data collection method and device. The following description refers to the accompanying drawings.
The embodiment of the invention provides a medical insurance wind control data acquisition method, which is shown by referring to fig. 1 and comprises the following steps of:
S11, extracting content features from each column in at least one original medical data table obtained in advance;
The original medical data table refers to a table, obtained for example from a medical care data system, that originates from the HIS of a medical institution and covers aspects such as medical care, expense settlement, and diagnosis and treatment details. Because the HIS systems adopted by different medical institutions in the prior art are inconsistent, the original medical data tables may take many different formats.
In step S11, for the data content of each column in the original medical data table, a feature extraction model or algorithm is used to extract a feature vector for each cell of the column. The cells of one column generally hold information of the same item type, so extracting content features per column provides the data basis for the subsequent matching with the target view information table.
S12, matching the content features of each column with weight configuration information of a preset view information table, and determining weight values corresponding to the matched content features in an original medical data table;
The original medical data table can serve as a data source. The view information table refers to a table obtained by extracting the required information from the original medical data table; it can be generated from the original medical data table manually, semi-manually or automatically, and this generation is the process of constructing an information view.
The view information table may be, for example: outpatient tariffs, outpatient diagnosis tables, outpatient information tables, outpatient settlement tables, hospitalization tariffs, hospitalization diagnosis tables, hospitalization information tables, hospitalization settlement tables, assay result tables, and the like.
Because different view information tables serve different purposes, the medical data types they contain and the amount of information under each medical data type are not necessarily the same. In the embodiment of the invention, the weight configuration information of each view information table can therefore be preset, and it reflects the weight of each medical data type in each view information table.
The content features of the columns of the original medical data table are matched with the medical data types in the view information table. When a content feature matches a medical data type, the weight value corresponding to that content feature in the original medical data table is determined according to the weight of the matched medical data type in the weight configuration information.
S13, calculating a correlation value between each original medical data table and each target view information table to be generated according to the weight value corresponding to the matched content characteristics;
For a given target view information table, the correlation value between an original medical data table and the target view information table is calculated according to the weight values corresponding to the matched content features of the original medical data table; applying this calculation to every original medical data table yields the correlation values of all original medical data tables with the target view information table.
By adopting the method to process each target view information table, the correlation value between each original medical data table and each target view information table to be generated can be obtained.
S14, according to the correlation value corresponding to the target view information table, the original medical data table and the target view information table are displayed in a correlated mode.
After the correlation value between each original medical data table and each target view information table to be generated is obtained through the processing in step S13, in this step S14, any target view information table may be displayed in association with the original medical data table according to the correlation value, where the association display includes but is not limited to multiple manners such as association window display, paging comparison display, and the like.
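One way to drive such an association display is to order the original medical data tables by their correlation value for the selected target view information table. The sketch below assumes this ordering; the text does not prescribe it, and the function name `rank_tables_for_view`, the `top_n` parameter and the table names are hypothetical.

```python
def rank_tables_for_view(relevance_by_table, top_n=None):
    """Sort original tables by correlation value, highest first.

    relevance_by_table maps an original table name to its correlation
    value for one target view information table.
    """
    ranked = sorted(relevance_by_table.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]

# Hypothetical correlation values for one target view information table.
scores = {"his_table_a": 2.5, "his_table_b": 4.0, "his_table_c": 0.0}
for name, score in rank_tables_for_view(scores, top_n=2):
    print(f"{name}: {score}")
```

A display layer could then show the highest-ranked original tables next to the target view information table, for example in an associated window.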
In one embodiment, in the step S11, the content feature extraction is performed on each column in the at least one original medical data table obtained in advance, and referring to fig. 2, the method may be implemented as follows:
S21, performing feature extraction on each cell data of at least one original medical data table obtained in advance by using a preset feature extraction model to obtain feature vectors;
S22, inputting the feature vectors into a medical data classification model obtained through training in advance to obtain a model classification label corresponding to each cell data;
S23, statistically determining the content features corresponding to each column in the original medical data table according to the model classification labels.
Feature extraction is performed on each cell of the original medical data table by using the feature extraction model to obtain feature vectors; this keeps the extracted vectors consistent in content with the original cell data and makes them suitable as input to the subsequent classification model. The model classification labels are output by the pre-trained medical data classification model, that is, the class of each cell's feature vector is identified by artificial intelligence, which is efficient and accurate. Because the classification labels of different cells in a column may differ, the content feature of the column is obtained statistically, which simplifies the generation of the column's content features and makes the result more accurate.
In the medical insurance wind control data acquisition method provided by the embodiment of the invention, various original medical data tables with unknown meanings are first acquired from the databases or data warehouses of various hospitals. FIG. 3 shows a sample original medical data table (the data is desensitized). The data of each column in each original medical data table is read, and a preset feature extraction model, which can be a BERT feature extraction model, is then used to extract feature vectors from the data read for each column. The BERT feature extraction model yields feature vectors for the phrases and sentences in the cells that retain the features of the original data, so the extracted vectors characterize the data more accurately and the final classification result is more accurate.
After extraction, a feature vector is available for the data of each cell in each column. These feature vectors are input into the pre-trained medical data classification model, which outputs a model classification label for the data of each cell in each column. A probability score for each model classification label can be obtained at the same time (for example, the probability score ranges from 0 to 1; the closer it is to 1, the more likely the cell belongs to that model classification label; the score indicates how reliable the prediction is and is for reference).
Each column of each original medical data table contains multiple cells, and the predicted model classification labels of the cells may differ. The predictions for all cells in each column are therefore counted, the model classification label that occurs most often is taken as the mode label, and the mode label can be used as the column classification label of the corresponding column. FIG. 4 is a schematic diagram of the prediction result for the emergency prescription information table, one of the original medical data tables, after model classification labels are predicted. As can be seen from FIG. 4, in the itemname column (second column) of the his_emergency prescription information table, 73 cells are identified as medicines, 25 cells as charge items, and 1 cell as a disease, so the column classification label of this column should be "medicine", the label with the highest occurrence frequency. In this way the column classification label of every column in the original medical data table can be determined; integrating these column classification labels yields content features that better reflect the nature of the cell data in each column, so the content features of the original medical data table are more accurate.
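The mode-label statistic can be illustrated with a short sketch that uses the cell counts reported for the itemname column in FIG. 4. The function name `mode_label` is an assumption, and the handling of ties (the label encountered first wins) is a property of `Counter.most_common` rather than something the text specifies.

```python
from collections import Counter

# Per-cell model classification labels for one column, using the counts
# reported for the itemname column in FIG. 4.
cell_labels = ["medicine"] * 73 + ["charge item"] * 25 + ["disease"] * 1

def mode_label(labels):
    """Return the most frequent label and its number of occurrences."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count

label, count = mode_label(cell_labels)
print(label, count)  # medicine 73
```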
Referring to FIG. 4, each column of the emergency prescription information table obtains a corresponding column classification label, for example "hospital", "medicine", "fee type", "unit of measure", and so on; they are not enumerated here. The content features of the table are obtained by integrating all column classification labels of the table and can take the form of a list. Empty and "other" categories may be left out of the integration or may also participate in it.
In one embodiment, before the mode label is used as the column classification label corresponding to the column, the following steps may be further performed in the embodiment of the present invention:
judging whether the quantity proportion of the mode labels in the columns exceeds a threshold value or not;
if yes, executing the step of taking the mode label as a column classification label corresponding to the column;
if not, taking the preset default label as a column classification label corresponding to the column.
For example, if the proportion of the mode label among the classification labels of all cells in the column does not reach the preset threshold, the column classification label of the column is set to a default label, such as the "other" label in FIG. 4; the embodiment of the invention does not limit the type of the default label. Otherwise, if the proportion reaches the preset threshold, the mode label is used as the column classification label of the column.
Checking whether the proportion of the mode label exceeds the threshold before adopting it as the column classification label takes both conditions into account, the largest count and a sufficiently large proportion, and thus ensures that the adopted mode label represents the data content of the column's cells in a statistical sense.
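The threshold check can be sketched as below. The threshold values 0.5 and 0.8 are arbitrary illustrations (the text gives no value), the function name `column_label` is hypothetical, the strict comparison follows the "exceeds a threshold" wording, and returning the default label for an empty column is an added assumption.

```python
from collections import Counter

DEFAULT_LABEL = "other"  # default label, as in the "other" label of FIG. 4

def column_label(cell_labels, threshold, default=DEFAULT_LABEL):
    """Use the mode label only if its share of the column exceeds the threshold."""
    if not cell_labels:
        return default  # assumption: an empty column gets the default label
    label, count = Counter(cell_labels).most_common(1)[0]
    proportion = count / len(cell_labels)
    return label if proportion > threshold else default

labels = ["medicine"] * 73 + ["charge item"] * 25 + ["disease"]
print(column_label(labels, threshold=0.5))  # medicine (73/99 is about 0.74)
print(column_label(labels, threshold=0.8))  # other
```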
In one embodiment, the weight configuration information of the view information table may include: each view information table contains a respective medical data type and a corresponding weight value.
Correspondingly, in the step S12, the content features of each original medical data table may be matched with the respective medical data types of each view information table in the weight configuration information;
calculating the weight value of the content characteristics of the original medical data table matched with the medical data type of the target view information table aiming at each original medical data table and each target view information table to obtain the correlation value of the original medical data table and the target view information table;
the above calculation may be various, including, for example: averaging, summing, or weighted averaging, etc. The embodiment of the present invention is not limited thereto.
Specifically, the medical data types contained in the view information table may include, for example: hospitals, medicines, examination items, diseases, names, sexes, departments, treatment types, payment modes, disease codes, time dates, measurement units and the like, and various data can be included under each medical data type and can be used as a data source for subsequent medical insurance wind control analysis.
In the event that the content characteristics of any one of the original medical data tables partially or completely match the medical data types contained in any one of the view information tables, the weight value of the medical data type is determined as the weight value of the content characteristics matching the medical data type.
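A minimal sketch of this matching step follows. It assumes that the weight configuration can be represented as a mapping from view information table name to medical data types and weights, and that a "partial match" means only some of the table's content features appear among the view's medical data types. The weights shown for src_his_mz_diag are invented for illustration, and `matched_weights` is a hypothetical name.

```python
# Hypothetical weight configuration: view table name -> {medical data type: weight}.
# The weight values below are illustrative only.
WEIGHT_CONFIG = {
    "src_his_mz_diag": {"hospital": 0, "disease": 5, "disease code": 4},
}

def matched_weights(content_features, view_types):
    """Map each content feature that matches a medical data type to that type's weight.

    Repeated column labels are counted once, since the weight belongs to the
    medical data type rather than to an individual column.
    """
    unique_features = dict.fromkeys(content_features)  # de-duplicate, keep order
    return {f: view_types[f] for f in unique_features if f in view_types}

features = ["hospital", "disease", "unit of measure"]
print(matched_weights(features, WEIGHT_CONFIG["src_his_mz_diag"]))
# {'hospital': 0, 'disease': 5}
```

Here "unit of measure" has no counterpart in the view's medical data types, so the table matches the view only partially and contributes two weight values.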
Calculating a correlation value between each original medical data table and each preset view information table according to each column classification label in the original medical data table and a preset weight configuration file; the preset weight configuration file comprises weight values of medical data types contained in each preset view information table; in each view information table, the number of times and importance of each medical data type in each view information table are different because the medical data type composing each view information table and the amount of information contained under each medical data type are different, and the weight value of each medical data type in each view information table can be preconfigured according to the importance, the number of times of occurrence and other factors of different medical data types in different view information tables, and can be continuously debugged and modified in actual production.
For example, when the content features of any original medical data table partially or completely match the medical data types contained in any view information table, all weight values corresponding to the successfully matched (i.e., partially or completely matched) medical data types are added, the sum is divided by the number of successfully matched medical data types, and the result is taken as the correlation value between the original medical data table and the view information table.
Through this calculation process, a correlation value between each original medical data table and each view information table can be obtained.
Referring to fig. 5, which is a schematic view of part of the content of the weight configuration file, each medical data type may have different weight values under different view information tables; src_his_mz_charge_detail, src_his_mz_diag and src_his_mz_master_info in fig. 5 are names of view information tables.
The content features of each original medical data table are matched in turn against the medical data types contained in each view information table in the weight configuration file. When, during matching, the content features of any original medical data table partially or completely match the medical data types contained in any view information table, all weight values corresponding to the successfully matched medical data types are added, the sum is divided by the number of successfully matched medical data types, and the result is taken as the correlation value between the original medical data table and the view information table.
For example, suppose the content features of a certain original medical data table are matched against the medical data types contained in the src_his_mz_charge_detail view information table, and the matching result is that the hospital feature successfully matches the hospital type in src_his_mz_charge_detail and the charging item feature successfully matches the charging item type in src_his_mz_charge_detail. If, in the weight configuration file, the weight value corresponding to hospital is 0 and the weight value corresponding to charging item is 5, the correlation value is calculated as (0+5)/2 = 2.5, so the correlation value between the original medical data table and the view information table is 2.5.
This calculation process is executed for each original medical data table and each preset view information table to obtain the correlation value between each original medical data table and each view information table.
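The averaging variant of the correlation calculation can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `WEIGHT_CONFIG`, `is_match` and `correlation_value` are hypothetical, partial matching is modelled as simple substring containment (the text does not define it), and a table with no matched type is given the value 0.0. Only the weights 0 for hospital and 5 for charging item under src_his_mz_charge_detail come from the worked example; the medicine entry is invented.

```python
from typing import Dict, Iterable

# Weight configuration: view information table -> {medical data type: weight}.
# "hospital": 0 and "charging item": 5 follow the worked example in the text;
# "medicine": 3 is an invented value used only to show a non-matching type.
WEIGHT_CONFIG: Dict[str, Dict[str, float]] = {
    "src_his_mz_charge_detail": {"hospital": 0, "charging item": 5, "medicine": 3},
}


def is_match(feature: str, data_type: str) -> bool:
    """Complete match, or partial match modelled as substring containment (assumption)."""
    f, t = feature.strip().lower(), data_type.strip().lower()
    if not f or not t:
        return False
    return f == t or f in t or t in f


def correlation_value(content_features: Iterable[str],
                      type_weights: Dict[str, float]) -> float:
    """Average the weights of the medical data types matched by any content feature."""
    features = list(content_features)
    matched = [
        weight
        for data_type, weight in type_weights.items()
        if any(is_match(feature, data_type) for feature in features)
    ]
    # No matched type: treat the table as unrelated (assumption).
    return sum(matched) / len(matched) if matched else 0.0


features = ["hospital", "charging item"]
value = correlation_value(features, WEIGHT_CONFIG["src_his_mz_charge_detail"])
print(value)  # (0 + 5) / 2 = 2.5
```

Summing or weighted averaging, which the text also allows, would only change the final line of `correlation_value`.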
Because the content features of the original medical data table are not completely consistent with the medical data types of the view information table, partial or complete matching is used to match the original medical data table against the view information table. According to the matching result, the weight information in the weight configuration information of the preset view information table is then used to calculate the correlation value between each original medical data table and the corresponding target view information table, so that the degree of correlation between the two is reflected more accurately.
In one embodiment, in step S14, the original medical data table and the target view information table are displayed in association according to the correlation value corresponding to the view information table. In a specific implementation, for each view information table, the view information table and its associated original medical data tables are displayed in the same interface in descending order of correlation value. In this way the association between the target view information table and the associated original medical data tables is displayed more intuitively, and the user can conveniently use the original medical data tables as a data source, extracting the required data to construct the target view information table or performing any necessary data analysis on the associated original medical data tables.
For example, under each target view information table, the original medical data tables are sorted by correlation value from high to low, and a preset number of the highest-ranked original medical data tables are displayed on the relevant platform (across multiple pages when the data volume is large). Through this process, business personnel can more conveniently select the original medical data tables that are actually needed from those displayed, construct an information view, and generate the corresponding target view information table, thereby providing data sources for related follow-up business operations.
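The ranking and paging step can be illustrated with a short Python sketch. The function names `rank_tables` and `paginate`, the table names and the correlation values are all hypothetical and serve only to show the ordering; the actual display logic of the platform is not specified in the text.

```python
from typing import Dict, List


def rank_tables(correlations: Dict[str, float], top_n: int) -> List[str]:
    """Return the names of the top_n original tables, highest correlation first."""
    ranked = sorted(correlations.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top_n]]


def paginate(names: List[str], page_size: int) -> List[List[str]]:
    """Split the ranked list into pages for display."""
    return [names[i:i + page_size] for i in range(0, len(names), page_size)]


# Hypothetical correlation values of four original tables for one target view table.
correlations = {"t_charge": 2.5, "t_diag": 1.0, "t_patient": 4.0, "t_log": 0.0}
top = rank_tables(correlations, top_n=3)
pages = paginate(top, page_size=2)
print(top)    # ['t_patient', 't_charge', 't_diag']
print(pages)  # [['t_patient', 't_charge'], ['t_diag']]
```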
Referring to fig. 6, which is a schematic diagram of the AI-assisted acquisition interface of the medical insurance wind control platform, the original medical data tables are sorted by correlation value from high to low, and a preset number of the highest-ranked tables are displayed on the platform. The part of fig. 6 to the left of the dotted line shows the target views and the recommended database tables: the leftmost column lists the target views, that is, the target view information tables, and the second column from the left shows the recommended database tables, that is, the sorted original medical data tables, for the selected target view (for example, when the user selects the first target view information table, the outpatient and emergency charging table). The part to the right of the dotted line shows the cell information of the recommended original medical data table, to facilitate viewing and data selection.
In one embodiment, the medical data classification model may be trained by the following training method of the medical data classification model.
The training method of the medical data classification model, referring to fig. 7, may include the following steps:
S71, determining model classification labels corresponding to cell data in a pre-generated view information table;
S72, extracting feature vectors from cell data in the pre-generated view information table by using a preset feature extraction model to obtain feature vector data corresponding to each cell data;
S73, generating sample data according to the feature vector data and the model classification labels corresponding to each cell data;
S74, training a preset classifier model by using the sample data to obtain a medical data classification model.
The description of the view information table itself is as described above, and will not be repeated here.
The view information table used in training the medical data classification model is generated in advance, which can be done in various ways, for example by manually selecting a target data table and using SQL or related business logic to select the relevant data and generate a result table. In the process of generating the view information table, the acquired original medical data may be of many kinds, and part of it may have low correlation with the business scenario (for example, a business scenario related to medical insurance management). This low-correlation data is excluded, and the remaining original medical data is stored in a table to obtain a target data table that is highly correlated with the business scenario. The predefined business logic may be, for example, outpatient charging logic rules, outpatient diagnosis logic rules, outpatient information logic rules, and the like. Data related to the predefined business logic rules is acquired from the target data table, and the view information table is generated from the acquired data.
Of course, other than manual generation of the view information table, the view information table may be generated in other automatic or semi-automatic manners, and the embodiment of the present invention is not limited, and specific implementation may refer to the prior art.
The model classification labels may include, for example, hospital classification labels, drug classification labels, examination item classification labels, disease classification labels, and the like. The model classification labels may also be made in a hierarchical fashion with reference to the prior art.
In step S72, feature vectors may be extracted from the cell data in the pre-generated view information table by using the preset feature extraction model, so as to obtain feature vector data corresponding to each cell data; the preset feature extraction model may be, for example, a BERT feature extraction model. The specific process of extracting the feature vectors is as described above and is not repeated here.
To facilitate the subsequent training of the classifier model, the model classification labels may further be converted into a label data form that the classifier model can recognize, such as integer labels. The integer labels may be, for example, 0, 1, 2, 3, ..., 10, where 0 represents the hospital classification label, 1 the medicine classification label, 2 the examination item classification label, 3 the disease classification label, 4 the name classification label, 5 the gender classification label, 6 the department classification label, 7 the treatment type classification label, 8 the payment mode classification label, 9 the time and date classification label, 10 the measurement unit classification label, and so on. Other integer label schemes may also be used; the above is merely an example.
Sample data is generated from the feature vector data obtained by feature extraction and the corresponding integer labels, and the sample data can be further divided into training sample data and test sample data. The embodiment of the present invention does not limit how the training sample data and test sample data are split, or the proportion of data in each.
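Label encoding, sample generation and the train/test split can be sketched as follows. This is an illustrative sketch only: `LABEL_TO_ID`, `build_samples` and `split_samples` are hypothetical names, the label strings are English renderings chosen for this sketch (label 7 is written as "treatment type"), the two-dimensional vectors stand in for real feature vectors from the feature extraction model, and the 80/20 shuffled split is an assumption, since the text leaves the splitting manner and proportion open.

```python
import random
from typing import List, Sequence, Tuple

# Integer encoding following the example numbering 0..10 given in the text.
LABEL_TO_ID = {
    "hospital": 0, "medicine": 1, "examination item": 2, "disease": 3,
    "name": 4, "gender": 5, "department": 6, "treatment type": 7,
    "payment mode": 8, "time and date": 9, "measurement unit": 10,
}

Sample = Tuple[List[float], int]


def build_samples(vectors: Sequence[Sequence[float]],
                  labels: Sequence[str]) -> List[Sample]:
    """Pair each cell's feature vector with its integer label."""
    if len(vectors) != len(labels):
        raise ValueError("vectors and labels must have the same length")
    return [(list(vec), LABEL_TO_ID[lab]) for vec, lab in zip(vectors, labels)]


def split_samples(samples: Sequence[Sample], test_ratio: float = 0.2,
                  seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle reproducibly, then split into (training, test) samples."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder vectors; real ones would come from the feature extraction model.
vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
labels = ["hospital", "disease", "gender", "medicine", "name"]
samples = build_samples(vectors, labels)
train, test = split_samples(samples, test_ratio=0.2)
print(len(train), len(test))  # 4 1
```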
The preset classifier model may be, for example, an XGBoost classifier model, which is built on decision tree models (CART regression trees). The input feature vectors are classified according to different conditions: after each feature vector sample is input, it either proceeds to the next judgment or reaches a leaf node according to the judgment condition at the parent node of the decision tree, and the classification result is finally determined through the decision tree. By inputting the training data, namely the feature vectors carrying different semantics together with the integer labels that provide the supervision, the decision tree model can adjust its parameters toward the classification that is actually needed and achieve a better fit. The trained medical data classification model is finally obtained, and the accuracy of its output is improved.
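How a single feature vector travels from parent-node conditions to a leaf can be shown with a toy decision tree in Python. This sketch illustrates tree traversal only; it does not train anything and is not XGBoost itself, which combines many such trees. The `Node` class, the thresholds and the tree shape are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    feature_index: int = 0           # which vector component the node tests
    threshold: float = 0.0           # go left when the component is below this
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None      # set only on leaf nodes


def classify(node: Node, vector: Sequence[float]) -> int:
    """Follow the judgment conditions from the root until a leaf is reached."""
    while node.label is None:
        if vector[node.feature_index] < node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label


# Hand-built toy tree: leaves carry integer class labels.
tree = Node(
    feature_index=0, threshold=0.5,
    left=Node(label=0),
    right=Node(feature_index=1, threshold=0.2,
               left=Node(label=1), right=Node(label=3)),
)

print(classify(tree, [0.1, 0.9]))  # 0
print(classify(tree, [0.8, 0.1]))  # 1
print(classify(tree, [0.8, 0.7]))  # 3
```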
Based on the same inventive concept, the embodiment of the invention also provides a medical insurance wind control data acquisition device. Because the principle by which the device solves the problem is similar to that of the medical insurance wind control data acquisition method, the implementation of the device can refer to the implementation of the method, and repeated description is omitted.
The embodiment of the invention provides a medical insurance wind control data acquisition device which, as shown in fig. 8, comprises:
an extraction module 81, configured to perform content feature extraction on each column in at least one original medical data table obtained in advance;
the weight value determining module 82 is configured to match the content features of each column with weight configuration information of a preset view information table, and determine a weight value corresponding to the matched content feature in the original medical data table;
a calculating module 83, configured to calculate a correlation value between each original medical data table and each target view information table to be generated according to the weight value corresponding to the matched content feature;
the data display module 84 is configured to display the original medical data table and the target view information table in association according to the correlation value corresponding to the view information table.
With the medical insurance wind control data acquisition method and device described above, content features are extracted from the original medical data tables, and the degree of matching between each original medical data table and each view information table to be generated can be accurately identified using the weight configuration information of the preset view information table. The original medical data tables and the view information tables to be generated are displayed in association according to this degree of matching, so that business personnel can conveniently select useful information from the displayed original medical data tables to construct a target view information table. This greatly improves the efficiency of constructing information views and provides data sources for related follow-up business operations.
The embodiment of the invention provides a server, which comprises: a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the medical insurance wind control data acquisition method when executing the program.
The embodiment of the invention provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the medical insurance wind control data acquisition method when being executed by a processor.
The specific manner in which the various modules of the device in the above embodiments perform their operations has been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (10)
1. A medical insurance wind control data acquisition method, characterized in that the method comprises the following steps:
extracting content characteristics from each column in at least one original medical data table obtained in advance;
matching the content features of each column with weight configuration information of a preset view information table, and determining weight values corresponding to the matched content features in the original medical data table; the view information table is a table obtained by extracting required information according to the original medical data table;
calculating the correlation value of each original medical data table and each target view information table to be generated according to the weight value corresponding to the matched content characteristics;
and according to the correlation value corresponding to the target view information table, carrying out association display on the original medical data table and the target view information table.
2. The method of claim 1, wherein the extracting content features for each column in the at least one pre-obtained raw medical data table comprises:
performing feature extraction on each cell data of at least one original medical data table obtained in advance by using a preset feature extraction model to obtain feature vectors;
inputting the feature vector into a medical data classification model obtained by training in advance to obtain a model classification label corresponding to each piece of cell data;
and according to the model classification labels, content characteristics corresponding to each column in the original medical data table are statistically determined.
3. The method of claim 2, wherein statistically determining content features corresponding to columns in the original medical data table based on the model classification tags, comprises:
determining the model classification label with the largest occurrence number in the model classification labels of all the cell data of each column in the original medical data table as a mode label;
taking the mode label as a column classification label corresponding to the column;
and taking all the column classification labels of all columns in each original medical data table as the content characteristics of each original medical data table.
4. The method of claim 3, further comprising, prior to said assigning said mode label as a column classification label corresponding to said column:
judging whether the quantity proportion of the mode labels in the columns exceeds a threshold value or not;
if yes, executing the step of taking the mode label as a column classification label corresponding to the column;
if not, taking the preset default label as the column classification label corresponding to the column.
5. The method according to claim 1, wherein the weight configuration information of the view information table comprises: each medical data type and corresponding weight value contained in each view information table;
the matching of the content characteristics of each column of each original medical data table with the weight configuration information of the preset view information table, and the determination of the weight value corresponding to the matched content characteristics in the original medical data table comprise the following steps:
matching the content characteristics of each original medical data table with each medical data type of each view information table in the weight configuration information;
in the event that the content characteristics of any one of the original medical data tables partially or fully match the medical data types contained in any one of the view information tables, the weight value of the medical data type is determined as the weight value of the content characteristics matching the medical data type.
6. The method of claim 5, wherein calculating a relevance value of each original medical data table and each target view information table to be generated according to the weight value corresponding to the matched content feature, comprises:
for each original medical data table and each target view information table, performing a calculation on the weight values of those content features of the original medical data table that match the medical data types of the target view information table, to obtain the correlation value between the original medical data table and the target view information table;
the calculating includes: averaging, summing or weighted averaging.
7. The method of any one of claims 1-6, wherein the medical data classification model is pre-trained by:
determining model classification labels corresponding to cell data in a pre-generated view information table;
extracting feature vectors from cell data in the pre-generated view information table by using a preset feature extraction model to obtain feature vector data corresponding to each cell data;
generating sample data according to the feature vector data corresponding to each cell data and the model classification labels;
and training a preset classifier model by using the sample data to obtain a medical data classification model.
8. A medical insurance wind control data acquisition device, characterized by comprising:
the extraction module is used for extracting content characteristics of each column in at least one original medical data table obtained in advance;
the weight value determining module is used for matching the content characteristics of each column with weight configuration information of a preset view information table and determining weight values corresponding to the matched content characteristics in the original medical data table;
the computing module is used for computing the correlation value of each original medical data table and each target view information table to be generated according to the weight value corresponding to the matched content characteristics;
and the data display module is used for carrying out association display on the original medical data table and the target view information table according to the correlation value corresponding to the view information table.
9. A server, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the medical insurance wind control data acquisition method according to any one of claims 1 to 7 when executing the program.
10. A computer readable storage medium storing a computer program which, when executed by a processor, implements the medical insurance wind control data acquisition method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310366286.4A CN116091253B (en) | 2023-04-07 | 2023-04-07 | Medical insurance wind control data acquisition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116091253A (en) | 2023-05-09 |
CN116091253B (en) | 2023-08-08 |
Family
ID=86208626
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310366286.4A (Active), CN116091253B (en) | 2023-04-07 | 2023-04-07 | Medical insurance wind control data acquisition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116091253B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014201515A1 (en) * | 2013-06-18 | 2014-12-24 | Deakin University | Medical data processing for risk prediction |
CN108090068A (en) * | 2016-11-21 | 2018-05-29 | 医渡云(北京)技术有限公司 | The sorting technique and device of table in hospital database |
CN111178069A (en) * | 2019-12-25 | 2020-05-19 | 平安健康保险股份有限公司 | Data processing method and device, computer equipment and storage medium |
CN112015774A (en) * | 2020-09-25 | 2020-12-01 | 北京百度网讯科技有限公司 | Chart recommendation method and device, electronic equipment and storage medium |
CN113569042A (en) * | 2021-01-26 | 2021-10-29 | 腾讯科技(深圳)有限公司 | Text information classification method and device, computer equipment and storage medium |
CN113934895A (en) * | 2021-09-29 | 2022-01-14 | 浪潮云信息技术股份公司 | Method for assisting in establishing patient main index |
CN114936233A (en) * | 2022-05-24 | 2022-08-23 | 百果园技术(新加坡)有限公司 | List processing method, apparatus, system, device, storage medium and program product |
CN115186764A (en) * | 2022-08-03 | 2022-10-14 | 腾讯科技(北京)有限公司 | Data processing method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN116091253B (en) | 2023-08-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3985559A1 (en) | Entity semantics relationship classification | |
CN107644011B (en) | System and method for fine-grained medical entity extraction | |
US9911082B2 (en) | Question classification and feature mapping in a deep question answering system | |
Sarker et al. | Automatic evidence quality prediction to support evidence-based decision making | |
CN114547346B (en) | Knowledge graph construction method and device, electronic equipment and storage medium | |
Post et al. | Protempa: A method for specifying and identifying temporal sequences in retrospective data for patient selection | |
CN115954072A (en) | Intelligent clinical test scheme generation method and related device | |
Jiang et al. | Stroke risk prediction using artificial intelligence techniques through electronic health records | |
CN111755090A (en) | Medical record searching method, medical record searching device, storage medium and electronic equipment | |
CN115101160A (en) | Drug sales data mining and retrieving method and device | |
CN116091253B (en) | Medical insurance wind control data acquisition method and device | |
CN113505117A (en) | Data quality evaluation method, device, equipment and medium based on data indexes | |
CN116737945B (en) | Mapping method for EMR knowledge map of patient | |
CN109144999B (en) | Data positioning method, device, storage medium and program product | |
CN114882985B (en) | Medicine multimedia management system and method based on database and AI algorithm identification | |
CN117235218A (en) | Question answering method, medium, device and computing equipment | |
Ashish et al. | The GAAIN entity mapper: an active-learning system for medical data mapping | |
CN115221323A (en) | Cold start processing method, device, equipment and medium based on intention recognition model | |
CN113707302A (en) | Service recommendation method, device, equipment and storage medium based on associated information | |
CN113793677A (en) | Electronic medical record management method and device, storage medium and electronic equipment | |
CN113688854A (en) | Data processing method and device and computing equipment | |
CN115983237B (en) | Form type recognition model training, predicting and form data recommending method and device | |
CN118365459B (en) | Intelligent matching system, method, equipment and medium for business insurance claim rules | |
CN115392206B (en) | Method, device and equipment for quickly querying data based on WPS/EXCEL and storage medium | |
Ramya et al. | A Comparative Study on Aspects Level Drug Reviews Using Back Propagation Neural Networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||