CN117725961B - Medical intention recognition model training method, medical intention recognition method and equipment - Google Patents
- Publication number: CN117725961B (application number CN202410180194.1A)
- Authority: CN (China)
- Prior art keywords: relation, entity, graph, medical, model
- Legal status: Active (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Landscapes
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention discloses a medical intention recognition model training method, a medical intention recognition method, and equipment, comprising the following steps: inputting training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining the entity features of the text T = {t_1, t_2, ..., t_n} and a list of relations between entities, where t_1 to t_n are the entity features of each word and n is the number of words contained in the sentences of the training data; constructing a heterogeneous graph based on the entity features of the text and the relation list between entities; and inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model. The intention recognition model built on the heterogeneous relation graph enhances the understanding and fusion of entity-relation features, thereby improving the effect of intention recognition.
Description
Technical Field
The invention relates to the field of natural language processing, in particular to a medical intention recognition model training method, a medical intention recognition method and medical intention recognition equipment.
Background
With the development of artificial intelligence technology, natural language recognition is increasingly widely applied, especially in medical fine-grained text classification scenarios. Current natural language recognition has many practices based on deep learning classification models, such as text classification models based on the pretrained model BERT, text classification models based on LSTM/CNN, or related variants thereof. Such single-task schemes do not introduce external information such as entities, so a joint scheme based on a multi-task model of classification and entity recognition has been proposed, in which the model performs the entity recognition and classification tasks simultaneously to improve the capability of both; another approach is a pipeline mode in which features such as entity relations are fed in as external features. The joint scheme alleviates the exposure-bias problem, but issues remain, such as setting the loss weights of multiple tasks and the difficulty of fitting both tasks during learning; the pipeline mode allows the model to be simply decoupled and adds more features from the bottom layer, but suffers from the exposure-bias problem. Therefore, natural language classification performed in the existing manner is not accurate enough for text classification in medical fine-grained scenarios and can hardly meet practical application requirements.
Disclosure of Invention
The embodiments of the invention provide a medical intention recognition model training method, a medical intention recognition method, corresponding apparatuses, computer equipment, and a storage medium, so as to improve the accuracy of medical intention recognition and classification.
In order to solve the above technical problems, an embodiment of the present application provides a medical intention recognition model training method, including:
Inputting training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining the entity features of the text T = {t_1, t_2, ..., t_n} and a list of relations between the entities, where t_1 to t_n are the entity features of each word and n is the number of words contained in the sentence of the training data;
Constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities;
and inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model.
Optionally, constructing the heterogeneous graph based on the entity features of the text and the relation list between the entities includes:

constructing an adjacency matrix A with n, the sentence length, as the number of nodes of the current sample's heterogeneous graph;

for the n nodes, constructing edges between pairs of nodes, where an entity edge connects only the head and tail nodes of the entity and is labeled with the entity-type label;

when filling in the adjacency values for the relations between entities, marking the relation label r_g at the head and tail positions of the corresponding entities, representing node pairs without a relation by 0, and obtaining an adjacency matrix A_h with heterogeneous relations; the adjacency matrix A_h with heterogeneous relations serves as the heterogeneous graph.
Optionally, the heterogeneous graph construction is performed for each sample in the training data.
Optionally, inputting the heterogeneous graph into the graph attention model for training to obtain the trained graph attention model includes:

using a BERT model as the encoding layer of the graph attention model, and passing the training data and the heterogeneous graph through the encoding layer to obtain the input node vectors X ∈ R^{b×n×d}, where b is the batch size (batch_size) and n is the text length;

at each feature extraction layer of the graph attention network, mapping the input node vectors to two matrices Q and K, and performing feature extraction with an attention mechanism to obtain the extracted features;

computing the intention recognition loss from the extracted features and performing iterative training according to the loss results to obtain the trained graph attention model.
Optionally, performing feature extraction with the attention mechanism to obtain the extracted features includes:

expanding and tiling Q and K to four dimensions:

Q' ∈ R^{b×n×n×d}, K' ∈ R^{b×n×n×d};

concatenating Q' and K' along the fourth dimension to obtain K_c ∈ R^{b×n×n×2d};

after mapping the heterogeneous relation graph A_h to the edge-embedding tensor G, computing dot products row by row to obtain the attention scores S ∈ R^{b×n×n} of K_c, where a value of 0 in A_h indicates that the two nodes are not connected;

constructing a mask matrix M from A_h: where A_h has the value 0, the corresponding position of M takes a large negative value, otherwise 0;

computing the final attention score S' = S + M;

performing pooling and aggregation operations to obtain the output vector:

O = softmax(LeakyReLU(S')) · K;

weighting the output vector O to obtain the extracted features.
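As a minimal illustration of the masking step above (the arrays and score values here are hypothetical, not from the patent), positions where the heterogeneous adjacency matrix is 0 receive a large negative score so that softmax assigns them near-zero attention weight:

```python
import numpy as np

A = np.array([[1, 3, 0],
              [3, 2, 0],
              [0, 0, 2]])           # heterogeneous adjacency values (0 = no edge)
S = np.ones((3, 3))                 # raw attention scores (illustrative)

M = np.where(A == 0, -1e9, 0.0)     # mask matrix: large negative where no edge
S_final = S + M

# row-wise softmax: masked positions get (near-)zero probability
e = np.exp(S_final - S_final.max(axis=-1, keepdims=True))
P = e / e.sum(axis=-1, keepdims=True)
```

With this toy input, the first row attends equally to its two connected nodes and not at all to the unconnected one.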
In order to solve the above technical problems, an embodiment of the present application provides a medical intention recognition method, including:
Receiving a sentence to be recognized;

Constructing a heterogeneous graph for the sentence to be recognized, and inputting it into the trained medical intention recognition model to obtain an intention recognition result.
In order to solve the above technical problem, an embodiment of the present application further provides a medical intention recognition model training apparatus, including:
The entity extraction module is configured to input training data into the trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining the entity features of the text T = {t_1, ..., t_n} and a list of relations between entities, where t_1 to t_n are the entity features of each word and n is the number of words contained in the sentences of the training data;
The heterogeneous graph construction module is configured to construct a heterogeneous graph based on the entity features of the text and the relation list between entities;
and the model training module is used for inputting the heterogeneous graph into the graph attention model for training to obtain a trained graph attention model.
Optionally, the heterogeneous graph construction module includes:
A matrix construction unit, configured to construct an adjacency matrix A with n, the sentence length, as the number of nodes of the current sample's heterogeneous graph;

An edge construction unit, configured to construct edges between pairs of the n nodes, where an entity edge connects only the head and tail nodes of the entity and is labeled with the entity type;

A heterogeneous graph generation unit, configured to mark the relation label r_g at the head and tail positions of the corresponding entities when filling in the adjacency values for the relations between entities, to represent node pairs without a relation by 0, and to obtain an adjacency matrix A_h with heterogeneous relations, which serves as the heterogeneous graph.
Optionally, the model training module comprises:
The encoding module is configured to use a BERT model as the encoding layer of the graph attention model and to pass the training data and the heterogeneous graph through the encoding layer to obtain the input node vectors X ∈ R^{b×n×d}, where b is the batch size and n is the text length;

The feature extraction module is configured to map the input node vectors to two matrices Q and K at each feature extraction layer of the graph attention network, and to perform feature extraction with an attention mechanism to obtain the extracted features;
And the iterative training unit is used for carrying out intention recognition loss calculation according to the extracted features and carrying out iterative training according to the loss result to obtain a trained graph attention model.
In order to solve the above technical problem, an embodiment of the present application further provides a medical intention recognition device, including:
The receiving module is configured to receive a sentence to be recognized;

The recognition module is configured to construct a heterogeneous graph for the sentence to be recognized and to input it into the trained medical intention recognition model to obtain an intention recognition result.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the medical intention recognition model training method or implements the steps of the medical intention recognition method when executing the computer program.
To solve the above-mentioned technical problem, an embodiment of the present application further provides a computer-readable storage medium storing a computer program that implements the steps of the above-mentioned medical intent recognition model training method when executed by a processor, or implements the steps of the above-mentioned medical intent recognition method when executed by a processor.
The medical intention recognition model training method, the medical intention recognition method and apparatus, the computer equipment, and the storage medium provided by the embodiments of the invention input training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining the entity features of the text T = {t_1, ..., t_n} and a list of relations between entities, where t_1 to t_n are the entity features of each word and n is the number of words contained in the sentences of the training data; construct a heterogeneous graph based on the entity features of the text and the relation list between entities; and input the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model. The intention recognition model built on the heterogeneous relation graph enhances the understanding and fusion of entity-relation features, thereby improving the effect of intention recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a medical intent recognition model training method of the present application;
FIG. 3 is a flow chart of one embodiment of a medical intent recognition method of the present application;
FIG. 4 is a schematic structural view of one embodiment of a medical intent recognition model training apparatus in accordance with the present application;
FIG. 5 is a schematic structural view of one embodiment of a medical intent recognition device according to the present application;
FIG. 6 is a schematic structural diagram of one embodiment of a computer device in accordance with the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the description of the drawings above are intended to cover a non-exclusive inclusion. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, as shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the medical intention recognition model training method and the medical intention recognition method provided by the embodiments of the present application are executed by a server, and accordingly, the medical intention recognition model training device and the medical intention recognition device are disposed in the server.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation requirements, and the terminal devices 101, 102, 103 in the embodiment of the present application may specifically correspond to application systems in actual production.
Referring to fig. 2, fig. 2 shows a training method for a medical intention recognition model according to an embodiment of the present invention, and the method is applied to the server in fig. 1 for illustration, and is described in detail as follows:
S201: inputting training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction to obtain entity features of the text And a list of relationships between the entities,To the point ofFor each word's physical characteristics, n is the number of words contained in the sentence of the training data.
Specifically, at the input layer of the model, word-granularity input is used; the input sentence is S = {x_1, x_2, ..., x_n}, where n is the sentence length and x_i is the corresponding character feature. The sentence is then passed through the trained entity recognition and relation extraction model to obtain the entity features of the text T = {t_1, ..., t_n}, where each word's entity feature is represented by a BIOES entity label, for example the body part "arm" = {hand: B-Body, arm: E-Body}; a list of relations between entities is obtained at the same time, for example the relation "site attribute" in <arm - site attribute - scratch>. After the initial input features are obtained, the heterogeneous graph is constructed.
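The decoding of BIOES entity labels into entity spans, together with the relation-triple format used in the "arm / scratch" example above, can be sketched as follows (the tag names, span encoding, and relation name are illustrative assumptions, not the patent's exact format):

```python
def decode_bioes(tags):
    """Decode per-character BIOES tags into (start, end, type) entity spans."""
    entities, start = [], None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            start = i
        elif tag.startswith("E-") and start is not None:
            entities.append((start, i, tag[2:]))
            start = None
        elif tag.startswith("S-"):          # single-character entity
            entities.append((i, i, tag[2:]))
    return entities

# characters "hand, arm" form the body part, "scratch" the symptom
tags = ["B-Body", "E-Body", "B-Sym", "E-Sym"]
entities = decode_bioes(tags)               # [(0, 1, 'Body'), (2, 3, 'Sym')]
# relation list as triples of (head entity span, relation, tail entity span)
relations = [((0, 1), "site_attribute", (2, 3))]
```

These spans and triples are the inputs the heterogeneous graph construction step consumes.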
S202: and constructing an heterogram based on the entity characteristics of the text and the relation list among the entities.
In a specific alternative embodiment, constructing the heterogeneous graph based on the entity features of the text and the relation list between entities includes:

constructing an adjacency matrix A with n, the sentence length, as the number of nodes of the current sample's heterogeneous graph;

for the n nodes, constructing edges between pairs of nodes, where an entity edge connects only the head and tail nodes of the entity and is labeled with the entity-type label;

when filling in the adjacency values for the relations between entities, marking the relation label r_g at the head and tail positions of the corresponding entities, representing node pairs without a relation by 0, and obtaining an adjacency matrix A_h with heterogeneous relations; the adjacency matrix A_h with heterogeneous relations serves as the heterogeneous graph.
Further, the heterogeneous graph construction is performed for each sample in the training data.
Specifically, in this embodiment, the sentence length n is used as the number of nodes of the current sample's heterogeneous graph, and an adjacency matrix A ∈ R^{n×n} is first constructed. Because this embodiment mainly fuses entity and relation information and does not consider vocabulary-enhancement features for entity recognition, no word nodes are constructed; all nodes are character nodes. First, the number of entity types e_l and the number of relation types r_g in the whole data set are counted, where the subscripts l and g denote the overall counts of entity and relation types, and each type is mapped to an integer id, i.e. id ∈ {1, 2, ..., e_l + r_g}. Examples of entity types are body part, disease, clinical manifestation, and physical examination item; relations include clinical attribute, site attribute, drug attribute, and so on. The heterogeneous graph is then constructed based on the entity and relation features extracted in Step 1: for the n nodes, edges are constructed between pairs of nodes, where an entity edge connects only the head and tail nodes of the entity and is labeled with the entity-type id. Relations between entities are constructed in a similar way, except that, considering the nesting of entities, multiple entities may share the same beginning word but have different ending words; therefore, when filling in a relation adjacency value, the relation label r_g must be marked at the head and tail positions of the corresponding entities. Node pairs without a relation are represented by 0, and finally an adjacency matrix A_h with heterogeneous relations is obtained, where different values in the matrix represent different relations (different entity types or different relation types), so that it carries more information than an ordinary 0/1 adjacency matrix; if simplification is desired, the three values 0/1/2 can instead be used to represent no connection, entity connection, and relation connection, respectively.
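The adjacency-matrix construction described above can be sketched as follows; the shared id space for entity and relation types and the symmetric filling of the matrix are assumptions made for this example, not details fixed by the patent:

```python
import numpy as np

def build_hetero_adjacency(n, entities, relations, type2id):
    """Fill an n x n matrix with entity-type ids on each entity's
    (head, tail) node pair and relation-type ids on the head/tail
    pairs of related entities; 0 means no connection."""
    A = np.zeros((n, n), dtype=np.int64)
    for start, end, etype in entities:
        A[start, end] = A[end, start] = type2id[etype]
    for (h1, t1), rtype, (h2, t2) in relations:
        rid = type2id[rtype]
        # mark the relation on both head and tail characters, which keeps
        # nested entities with a shared start but different ends distinguishable
        A[h1, h2] = A[h2, h1] = rid
        A[t1, t2] = A[t2, t1] = rid
    return A

type2id = {"Body": 1, "Sym": 2, "site_attribute": 3}   # one shared id space
A = build_hetero_adjacency(
    4,
    [(0, 1, "Body"), (2, 3, "Sym")],
    [((0, 1), "site_attribute", (2, 3))],
    type2id,
)
```

Different integer values in `A` then stand for different entity or relation types, as in the description above.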
S203: and inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model.
In a specific alternative embodiment, inputting the heterogeneous graph into the graph attention model for training to obtain the trained graph attention model includes:

using a BERT model as the encoding layer of the graph attention model, and passing the training data and the heterogeneous graph through the encoding layer to obtain the input node vectors X ∈ R^{b×n×d}, where b is the batch size and n is the text length;

at each feature extraction layer of the graph attention network, mapping the input node vectors to two matrices Q and K, and performing feature extraction with an attention mechanism to obtain the extracted features;

computing the intention recognition loss from the extracted features and performing iterative training according to the loss results to obtain the trained graph attention model.
In another specific optional implementation of this embodiment, performing feature extraction with the attention mechanism to obtain the extracted features includes:

expanding and tiling Q and K to four dimensions:

Q' ∈ R^{b×n×n×d}, K' ∈ R^{b×n×n×d};

concatenating Q' and K' along the fourth dimension to obtain K_c ∈ R^{b×n×n×2d};

after mapping the heterogeneous relation graph A_h to the edge-embedding tensor G, computing dot products row by row to obtain the attention scores S ∈ R^{b×n×n} of K_c, where a value of 0 in A_h indicates that the two nodes are not connected;

constructing a mask matrix M from A_h: where A_h has the value 0, the corresponding position of M takes a large negative value, otherwise 0;

computing the final attention score S' = S + M;

performing pooling and aggregation operations to obtain the output vector:

O = softmax(LeakyReLU(S')) · K;

weighting the output vector O to obtain the extracted features.
Specifically, in this embodiment, the graph is constructed for each sample; since the graph within a sample is therefore static, an inductive learning strategy such as the GraphSAGE model is not required, and transductive modeling with a GCN or GAT (graph attention model) is adopted instead. This scheme uses a GAT model. The input data of the whole model are the text and the adjacency graph A_h. The text encoder is BERT, so the text passes through the token embedding to obtain the input vector X of the first transformer layer, where X ∈ R^{b×n×d}, b is batch_size, and n is the text length. This is fed into the GAT to extract the heterogeneous graph information. Specifically, the number of GAT layers is set to L, and the following operations are performed in each layer.

Without considering multiple heads, this embodiment maps the input node vectors to two matrices Q and K, and then performs the following attention operations:

Q and K are expanded and tiled to four dimensions:

Q' ∈ R^{b×n×n×d}, K' ∈ R^{b×n×n×d};

Q' and K' are then concatenated (concat) along the fourth dimension to obtain K_c ∈ R^{b×n×n×2d}. In a standard GAT, a learnable vector a would then convert K_c into the scores S ∈ R^{b×n×n}, followed by softmax. In the heterogeneous relation graph, however, A_h is a three-dimensional matrix, i.e. A_h ∈ R^{b×n×n}, and each element of every sample's two-dimensional matrix is obtained by mapping the adjacency matrix constructed in Step 2. Specifically, a global entity-relation embedding matrix E ∈ R^{(e_l + r_g + 1)×2d} is first constructed, where each row vector represents the embedding of the corresponding entity or relation id, similar to the token word-embedding matrix; all layers in this scheme share one E, where the +1 is the id for edges with neither entity nor relation, so the E used in every GAT operation is consistent. Mapping A_h through E yields G, and computing dot products row by row yields the attention scores S of K_c. Because a value of 0 in the adjacency matrix A_h means that the two nodes (two words) are not connected, a mask matrix M is also constructed from A_h: where A_h has the value 0, the corresponding position of M takes a large negative value, otherwise 0,

and the final attention score is S' = S + M. The output vector is then obtained through LeakyReLU, softmax, and aggregation operations:

O = softmax(LeakyReLU(S')) · K;

This yields the whole process of the heterogeneous-relation GAT of this scheme, from input X to output O. The multi-head case is consistent with the standard GAT: the dimensions are split, the attention operation is performed, and finally a concat operation is applied. Multiple GAT operations can be performed before each BERT layer; in this scheme the number is 2.
After the output O of the GAT is obtained, the two output vectors are added to serve as the input of the current transformer block, yielding the transformer block's output vector X' (note that instead of adding, the two can also be concatenated along the sentence-length dimension before being input into the transformer block), i.e.

X' = Transformer-Block(X + O),

and the final output vector of the whole BERT+GAT model is then used to construct the intention recognition loss through MeanPool and cross-entropy operations.
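A compact, unbatched, single-head NumPy sketch of the heterogeneous-relation attention step described above; the aggregation choice (multiplying the attention weights into K), the LeakyReLU slope, and all concrete shapes are illustrative assumptions, not the patent's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def hetero_gat_layer(H, A, E, Wq, Wk, neg_slope=0.2):
    """One heterogeneous-relation attention step, single head, no batch.
    H: (n, d) node features; A: (n, n) integer adjacency (0 = no edge);
    E: ((num_types + 1), 2d) shared entity/relation edge-embedding table."""
    n, d = H.shape
    Q, K = H @ Wq, H @ Wk                                # (n, d) each
    pair = np.concatenate(                               # (n, n, 2d): [q_i ; k_j]
        [np.broadcast_to(Q[:, None, :], (n, n, d)),
         np.broadcast_to(K[None, :, :], (n, n, d))], axis=-1)
    G = E[A]                                             # (n, n, 2d) edge vectors
    S = (pair * G).sum(axis=-1)                          # row-wise dot products
    S = S + np.where(A == 0, -1e9, 0.0)                  # mask unconnected pairs
    attn = softmax(np.where(S < 0, neg_slope * S, S))    # LeakyReLU, then softmax
    return attn @ K                                      # aggregate neighbours

n, d, num_types = 4, 8, 3
H = rng.standard_normal((n, d))
A = np.array([[1, 3, 0, 0],
              [3, 2, 0, 0],
              [0, 0, 1, 2],
              [0, 0, 2, 1]])
E = rng.standard_normal((num_types + 1, 2 * d))
Wq = rng.standard_normal((d, d))
Wk = rng.standard_normal((d, d))
O = hetero_gat_layer(H, A, E, Wq, Wk)                    # (n, d) output vectors
```

In the full model, this output would be added to (or concatenated with) the transformer block's input, as described above.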
In this embodiment, training data are input into the trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining the entity features of the text T = {t_1, ..., t_n} and a list of relations between entities, where t_1 to t_n are the entity features of each word and n is the number of words contained in the sentences of the training data; a heterogeneous graph is constructed based on the entity features of the text and the relation list between entities; and the heterogeneous graph is input into a graph attention model for training to obtain a trained graph attention model. The intention recognition model built on the heterogeneous relation graph enhances the understanding and fusion of entity-relation features, thereby improving the effect of intention recognition.
Referring to fig. 3, fig. 3 shows a medical intention recognition method according to an embodiment of the present invention, and the method is applied to the server in fig. 1 for illustration, and is described in detail as follows:
s204: receiving a statement to be identified;
S205: constructing an abnormal pattern of the statement to be identified, and inputting the constructed statement to a trained medical intention identification model to obtain an intention identification result.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic and should not constitute any limitation on the implementation of the embodiments of the present invention.
Fig. 4 shows a schematic block diagram of a medical intent recognition model training apparatus in one-to-one correspondence with the medical intent recognition model training method of the above embodiment. As shown in fig. 4, the medical intention recognition model training apparatus includes an entity extraction module 31, a heterogeneous map construction module 32, and a model training module 33. The functional modules are described in detail as follows:
the entity extraction module 31 is configured to input training data into the trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, thereby obtaining the entity features of the text T = {t_1, t_2, ..., t_n} and a relation list between entities, where t_1 to t_n are the entity features of each word and n is the number of words contained in the sentence of the training data;
The heterogeneous graph construction module 32 is configured to construct a heterogeneous graph based on the entity features of the text and the relation list between the entities;
the model training module 33 is configured to input the heterogeneous graph to the graph attention model for training, and obtain a trained graph attention model.
Optionally, the heterogeneous map construction module 32 includes:
A matrix construction unit, configured to construct an adjacency matrix A ∈ R^(n×n) with n as the number of nodes of the current sample's heterogeneous graph;
An edge construction unit, configured to construct edges between every two of the n nodes, wherein entity edges connect only the head and tail nodes and are labeled with the entity type tag ids_i;
The heterogeneous graph generating unit is configured to, when constructing the relation adjacency values for a relation between entities, label the head and the tail of the corresponding entities with the relation label ids_i, represent nodes without a relation by 0, and obtain an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, which is used as the heterogeneous graph.
Optionally, the model training module 33 includes:
The coding module is configured to use a Bert model as the coding layer of the graph attention model and pass the training data and the heterogeneous graph through the coding layer to obtain an input node vector H^0 ∈ R^(b×n×d), where b is the batch_size and n is the text length;
A feature extraction module, configured to, at each feature extraction layer of the graph attention network, map the input node vector to two matrices and extract features using an attention mechanism to obtain the extracted features;
And the iterative training unit is used for carrying out intention recognition loss calculation according to the extracted features and carrying out iterative training according to the loss result to obtain a trained graph attention model.
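A minimal sketch of the attention step these modules describe, under the assumption that the "two matrices" are query/key projections Q and K and that masking follows the A_mask construction of the claims (-1e9 at positions where the adjacency A is 0); the projection weights `W_q`/`W_k` and the scaling are assumptions:

```python
import numpy as np

def masked_attention_scores(H, W_q, W_k, A):
    """Sketch of the masked-attention step: map node vectors H (n x d) into
    two matrices Q and K, score by row-wise dot product, then add a mask
    of -1e9 where the heterogeneous adjacency A is 0 before the softmax."""
    Q, K = H @ W_q, H @ W_k
    scores = Q @ K.T / np.sqrt(K.shape[1])       # attention score A_QK
    mask = np.where(A == 0, -1e9, 0.0)           # A_mask: -1e9 where no edge
    scores = scores + mask                        # final score A_QK + A_mask
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)      # row-wise softmax
```

Because masked positions receive -1e9, their softmax weight is effectively zero, so each node aggregates only from nodes it is connected to in the heterogeneous graph.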
Fig. 5 shows a schematic block diagram of a medical intent recognition device in one-to-one correspondence with the medical intent recognition method of the above embodiment. As shown in fig. 5, the medical intention recognition apparatus includes a receiving module 34 and a recognition module 35. The functional modules are described in detail as follows:
a receiving module 34, configured to receive a sentence to be identified;
The recognition module 35 is configured to construct a heterogeneous graph from the sentence to be recognized and input the constructed heterogeneous graph into a trained medical intention recognition model to obtain an intention recognition result.
For specific limitations on the medical intent recognition model training apparatus, reference may be made to the above limitations on the medical intent recognition model training method, which are not repeated here. The modules in the medical intent recognition model training apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in the form of hardware in, or independent of, a processor in the computer device, or may be stored in the form of software in a memory in the computer device, so that the processor can call and execute the operations corresponding to the above modules.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 6, fig. 6 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42, and a network interface 43 communicatively connected to each other via a system bus. It is noted that only a computer device 4 having the components memory 41, processor 42, and network interface 43 is shown in the figure, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (e.g., an SD memory card), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or Flash Card provided on the computer device 4. Of course, the memory 41 may also comprise both an internal storage unit of the computer device 4 and an external storage device. In this embodiment, the memory 41 is typically used to store the operating system and various types of application software installed on the computer device 4, such as the program code for training the medical intention recognition model. Furthermore, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the program code stored in the memory 41 or process data, such as the program code for training the medical intention recognition model.
The network interface 43 may comprise a wireless network interface or a wired network interface, which network interface 43 is typically used for establishing a communication connection between the computer device 4 and other electronic devices.
The present application also provides another embodiment, namely, a computer-readable storage medium storing a computer program executable by at least one processor to cause the at least one processor to perform the steps of the medical intent recognition model training method as described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general hardware platform, or by hardware, though in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods of the embodiments of the present application.
It is apparent that the above-described embodiments are only some, not all, of the embodiments of the present application; the preferred embodiments are shown in the drawings, which do not limit the scope of the claims. The application may be embodied in many different forms; these embodiments are provided so that the disclosure will be thorough and complete. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. All equivalent structures made using the contents of the specification and drawings of the application, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of the application.
Claims (9)
1. A medical intent recognition model training method, comprising:
Inputting training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction to obtain the entity features T = {t_1, t_2, ..., t_n} of a text and a relation list between entities, wherein t_1 to t_n are the entity features of each word, and n is the number of words contained in the sentences of the training data;
Constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities;
Inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model;
Wherein, the constructing the heterogeneous graph based on the entity characteristics of the text and the relation list between the entities includes:
Taking n as the number of nodes of the current sample's heterogeneous graph, and constructing an adjacency matrix A ∈ R^(n×n);
For the n nodes, constructing edges between every two nodes, wherein entity edges connect only the head and tail nodes and are labeled with the entity type tag ids_i;
When constructing the relation adjacency values for a relation between entities, labeling the head and the tail of the corresponding entities with the relation label ids_i, representing nodes without a relation by 0, obtaining an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, and using the adjacency matrix A ∈ R^(n×n) with heterogeneous relations as the heterogeneous graph.
2. The medical intent recognition model training method according to claim 1, wherein the construction of the heterogeneous graph is performed for each sample in the training data.
3. The medical intent recognition model training method according to claim 1, wherein inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model comprises:
The coding layer of the graph attention model is a Bert model; passing the training data and the heterogeneous graph through the coding layer to obtain an input node vector H^0 ∈ R^(b×n×d), where b is the batch_size and n is the text length;
At each feature extraction layer of the graph attention network, mapping the input node vector to two matrices and extracting features using an attention mechanism to obtain the extracted features;
and carrying out intention recognition loss calculation according to the extracted features, and carrying out iterative training according to loss results to obtain the trained graph attention model.
4. The medical intent recognition model training method according to claim 3, wherein extracting features using an attention mechanism to obtain the extracted features comprises:
Performing dimension expansion and padding on the two matrices so that they reach four dimensions including the batch size;
Concatenating them along the fourth dimension to obtain the heterogeneous relation graph W_QK;
After the heterogeneous relation graph W_QK is obtained, calculating the dot product row by row to obtain the attention score A_QK, wherein a value of 0 in A ∈ R^(n×n) indicates that the corresponding nodes are not associated;
Building a mask matrix A_mask ∈ R^(n×n) from A ∈ R^(n×n), wherein when a position value in A is 0, the value at the corresponding position of A_mask is -1e9, and 0 otherwise;
Calculating the final attention score A_QK = A_QK + A_mask;
Performing pooling and aggregation operations to obtain the output vector H_0';
Weighting the output vector H_0' to obtain the extracted features.
5. A medical intent recognition method, comprising:
Receiving a statement to be identified;
Constructing a heterogeneous graph for the statement to be recognized, and inputting the constructed heterogeneous graph into a trained medical intention recognition model to obtain an intention recognition result, wherein the trained medical intention recognition model is trained according to the medical intention recognition model training method of any one of claims 1 to 4.
6. A medical intent recognition model training apparatus, characterized in that the medical intent recognition model training apparatus comprises:
The entity extraction module is configured to input training data into the trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction to obtain the entity features T = {t_1, t_2, ..., t_n} of a text and a relation list between entities, wherein t_1 to t_n are the entity features of each word, and n is the number of words contained in the sentences of the training data;
The heterogeneous graph construction module is configured to construct a heterogeneous graph based on the entity features of the text and the relation list between the entities;
The model training module is used for inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model;
Wherein, the constructing the heterogeneous graph based on the entity characteristics of the text and the relation list between the entities includes:
Taking n as the number of nodes of the current sample's heterogeneous graph, and constructing an adjacency matrix A ∈ R^(n×n);
For the n nodes, constructing edges between every two nodes, wherein entity edges connect only the head and tail nodes and are labeled with the entity type tag ids_i;
When constructing the relation adjacency values for a relation between entities, labeling the head and the tail of the corresponding entities with the relation label ids_i, representing nodes without a relation by 0, obtaining an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, and using the adjacency matrix A ∈ R^(n×n) with heterogeneous relations as the heterogeneous graph.
7. A medical intent recognition device, comprising:
The receiving module is used for receiving the statement to be identified;
The recognition module is configured to construct a heterogeneous graph for the statement to be recognized and input the constructed heterogeneous graph into a trained medical intention recognition model to obtain an intention recognition result, wherein the trained medical intention recognition model is trained according to the medical intention recognition model training method of any one of claims 1 to 4.
8. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the medical intent recognition model training method according to any one of claims 1 to 4 when executing the computer program or the processor implements the medical intent recognition method according to claim 5 when executing the computer program.
9. A computer-readable storage medium storing a computer program, wherein the computer program implements the medical intent recognition model training method according to any one of claims 1 to 4 when executed by a processor, or the medical intent recognition method according to claim 5 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410180194.1A CN117725961B (en) | 2024-02-18 | 2024-02-18 | Medical intention recognition model training method, medical intention recognition method and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117725961A CN117725961A (en) | 2024-03-19 |
CN117725961B true CN117725961B (en) | 2024-07-30 |
Family
ID=90211108
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410180194.1A Active CN117725961B (en) | 2024-02-18 | 2024-02-18 | Medical intention recognition model training method, medical intention recognition method and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117725961B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113010683A (en) * | 2020-08-26 | 2021-06-22 | 齐鲁工业大学 | Entity relationship identification method and system based on improved graph attention network |
CN116167382A (en) * | 2023-01-05 | 2023-05-26 | 中国电信股份有限公司 | Intention event extraction method and device, electronic equipment and storage medium |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111046671A (en) * | 2019-12-12 | 2020-04-21 | 中国科学院自动化研究所 | Chinese named entity recognition method based on graph network and merged into dictionary |
CN112035637A (en) * | 2020-08-28 | 2020-12-04 | 康键信息技术(深圳)有限公司 | Medical field intention recognition method, device, equipment and storage medium |
CN112201359B (en) * | 2020-09-30 | 2024-05-03 | 平安科技(深圳)有限公司 | Method and device for identifying severe inquiry data based on artificial intelligence |
CN114036308A (en) * | 2021-09-28 | 2022-02-11 | 西安电子科技大学 | Knowledge graph representation method based on graph attention neural network |
CN114419304B (en) * | 2022-01-18 | 2024-11-08 | 深圳前海环融联易信息科技服务有限公司 | Multi-mode document information extraction method based on graphic neural network |
CN114443846B (en) * | 2022-01-24 | 2024-07-16 | 重庆邮电大学 | Classification method and device based on multi-level text different composition and electronic equipment |
CN115687934A (en) * | 2022-12-30 | 2023-02-03 | 智慧眼科技股份有限公司 | Intention recognition method and device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN117725961A (en) | 2024-03-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||