
CN117725961B - Medical intention recognition model training method, medical intention recognition method and equipment - Google Patents

Medical intention recognition model training method, medical intention recognition method and equipment

Info

Publication number
CN117725961B
CN117725961B
Authority
CN
China
Prior art keywords
relation
entity
graph
medical
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410180194.1A
Other languages
Chinese (zh)
Other versions
CN117725961A (en)
Inventor
吴俊江
杨峻
王晓龙
李文昊
马源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Athena Eyes Co Ltd
Original Assignee
Athena Eyes Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Athena Eyes Co Ltd filed Critical Athena Eyes Co Ltd
Priority to CN202410180194.1A priority Critical patent/CN117725961B/en
Publication of CN117725961A publication Critical patent/CN117725961A/en
Application granted granted Critical
Publication of CN117725961B publication Critical patent/CN117725961B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a medical intention recognition model training method, a medical intention recognition method and equipment, comprising the following steps: inputting training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining entity features of the text T = {t1, t2, ..., tn} and a relation list between entities, where t1 to tn are the entity features of each word and n is the number of words contained in the sentences of the training data; constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities; and inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model. By building the intention recognition model on the heterogeneous relation graph, the understanding and fusion of entity-relation features are enhanced, thereby improving the effect of intention recognition.

Description

Medical intention recognition model training method, medical intention recognition method and equipment
Technical Field
The invention relates to the field of natural language processing, in particular to a medical intention recognition model training method, a medical intention recognition method and medical intention recognition equipment.
Background
With the development of artificial intelligence technology, natural language recognition is increasingly widely applied, especially in medical fine-grained text classification scenarios. Current natural language recognition has many practices based on deep-learning classification models, such as text classification models built on the pre-trained model BERT, text classification models built on LSTM/CNN, or related variants thereof. Such single-task schemes do not introduce external information such as entities, so a joint multi-task approach combining classification and entity recognition has been proposed, in which the model performs the entity recognition and classification tasks simultaneously to improve both; another approach is a pipeline mode in which features such as entity relations are fed in as external features. The joint approach alleviates the exposure-bias problem, but constructing the loss weights of the multiple tasks and the difficulty of fitting the different tasks remain open issues; the pipeline mode allows the model to be simply decoupled and adds more features from the bottom layer, but it suffers from the exposure-bias problem. Therefore, natural language classification performed in the existing ways is not accurate enough for text classification in medical fine-grained text classification scenarios and can hardly meet practical application requirements.
Disclosure of Invention
The embodiments of the invention provide a medical intention recognition model training method, a medical intention recognition method, a medical intention recognition device, computer equipment and a storage medium, so as to improve the accuracy of medical intention recognition and classification.
In order to solve the above technical problems, an embodiment of the present application provides a medical intention recognition model training method, including:
Inputting training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining entity features of the text T = {t1, t2, ..., tn} and a list of relationships between the entities, where t1 to tn are the entity features of each word and n is the number of words contained in the sentence of the training data;
constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities;
and inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model.
Optionally, constructing the heterogeneous graph based on the entity features of the text and the relation list between the entities includes:
taking n as the number of nodes of the heterogeneous graph of the current sample, constructing an adjacency matrix A ∈ R^(n×n);
for the n nodes, constructing edges between every two nodes, where an entity edge is connected only on the head and tail nodes of the entity and is marked with the entity type label id;
when constructing the relation adjacency values between entities, marking the head and tail of the corresponding entities with the relation label id respectively, representing nodes without a relation by 0, and obtaining an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, which is taken as the heterogeneous graph.
Optionally, the construction of the heterogeneous graph is performed for each sample in the training data.
Optionally, inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model includes:
taking a BERT model as the coding layer of the graph attention model, and passing the training data and the heterogeneous graph through the coding layer to obtain an input node vector H0, where H0 ∈ R^(b×n×d), b is batch_size and n is the text length;
at each feature extraction layer of the graph attention network, mapping the input node vector to two matrices Q and K, and extracting features with an attention mechanism to obtain the extracted features;
and carrying out intention recognition loss calculation according to the extracted features, and carrying out iterative training according to the loss results to obtain the trained graph attention model.
Optionally, extracting features with an attention mechanism to obtain the extracted features includes:
performing dimension expansion and padding on Q and K so that they reach a four-dimensional size;
concatenating Q and K along the fourth dimension to obtain QK;
after obtaining the heterogeneous relation graph W_QK, computing dot products row by row to obtain the attention score A_QK of QK, where a value of 0 in A ∈ R^(n×n) indicates that the corresponding nodes are not associated;
constructing a mask matrix A_mask ∈ R^(n×n) from A ∈ R^(n×n): when a position value in A is 0, the value of the corresponding position of A_mask is -1e9, otherwise it is 0;
calculating the final attention score A_QK = A_QK + A_mask;
performing pooling and aggregation operations to obtain an output vector H0';
and weighting the output vector H0' to obtain the extracted features.
In order to solve the above technical problems, an embodiment of the present application provides a medical intention recognition method, including:
Receiving a sentence to be recognized;
constructing a heterogeneous graph of the sentence to be recognized, and inputting the constructed heterogeneous graph into a trained medical intention recognition model to obtain an intention recognition result.
In order to solve the above technical problem, an embodiment of the present application further provides a medical intention recognition model training apparatus, including:
the entity extraction module is used for inputting training data into the trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining entity features of the text T = {t1, t2, ..., tn} and a list of relationships between the entities, where t1 to tn are the entity features of each word and n is the number of words contained in the sentence of the training data;
the heterogeneous graph construction module is used for constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities;
and the model training module is used for inputting the heterogeneous graph into the graph attention model for training to obtain a trained graph attention model.
Optionally, the heterogeneous graph construction module includes:
A matrix construction unit, used for taking n as the number of nodes of the heterogeneous graph of the current sample and constructing an adjacency matrix A ∈ R^(n×n);
an edge construction unit, used for constructing edges between every two of the n nodes, where an entity edge is connected only on the head and tail nodes of the entity and is marked with the entity type label id;
a heterogeneous graph generating unit, used for marking the head and tail of the corresponding entities with the relation label id respectively when constructing the relation adjacency values between entities, representing nodes without a relation by 0, and obtaining an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, which is taken as the heterogeneous graph.
Optionally, the model training module comprises:
The coding module is used for taking a BERT model as the coding layer of the graph attention model and passing the training data and the heterogeneous graph through the coding layer to obtain an input node vector H0, where H0 ∈ R^(b×n×d), b is batch_size and n is the text length;
the feature extraction module is used for mapping, at each feature extraction layer of the graph attention network, the input node vector to two matrices Q and K, and extracting features with an attention mechanism to obtain the extracted features;
And the iterative training unit is used for carrying out intention recognition loss calculation according to the extracted features and carrying out iterative training according to the loss result to obtain a trained graph attention model.
In order to solve the above technical problem, an embodiment of the present application further provides a medical intention recognition device, including:
The receiving module is used for receiving the sentence to be recognized;
the recognition module is used for constructing a heterogeneous graph of the sentence to be recognized, and inputting the constructed heterogeneous graph into the trained medical intention recognition model to obtain an intention recognition result.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the medical intention recognition model training method or implements the steps of the medical intention recognition method when executing the computer program.
To solve the above-mentioned technical problem, an embodiment of the present application further provides a computer-readable storage medium storing a computer program that implements the steps of the above-mentioned medical intent recognition model training method when executed by a processor, or implements the steps of the above-mentioned medical intent recognition method when executed by a processor.
The medical intention recognition model training method, the medical intention recognition method and device, the computer equipment and the storage medium provided by the embodiments of the invention input training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining entity features of the text T = {t1, t2, ..., tn} and a list of relationships between the entities, where t1 to tn are the entity features of each word and n is the number of words contained in the sentence of the training data; construct a heterogeneous graph based on the entity features of the text and the relation list between the entities; and input the heterogeneous graph into a graph attention model
for training to obtain a trained graph attention model. The intention recognition model based on the heterogeneous relation graph enhances the understanding and fusion of entity-relation features, thereby improving the effect of intention recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a medical intent recognition model training method of the present application;
FIG. 3 is a flow chart of one embodiment of a medical intent recognition method of the present application;
FIG. 4 is a schematic structural view of one embodiment of a medical intent recognition model training apparatus in accordance with the present application;
FIG. 5 is a schematic structural view of one embodiment of a medical intent recognition device according to the present application;
FIG. 6 is a schematic structural diagram of one embodiment of a computer device in accordance with the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the description of the drawings above are intended to cover a non-exclusive inclusion. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, as shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the medical intention recognition model training method and the medical intention recognition method provided by the embodiments of the present application are executed by a server, and accordingly, the medical intention recognition model training device and the medical intention recognition device are disposed in the server.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation requirements, and the terminal devices 101, 102, 103 in the embodiment of the present application may specifically correspond to application systems in actual production.
Referring to fig. 2, fig. 2 shows a training method for a medical intention recognition model according to an embodiment of the present invention, and the method is applied to the server in fig. 1 for illustration, and is described in detail as follows:
S201: inputting training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining entity features of the text T = {t1, t2, ..., tn} and a list of relationships between the entities, where t1 to tn are the entity features of each word and n is the number of words contained in the sentence of the training data.
Specifically, the input layer of the model adopts word (character) granularity; an input sentence is represented as {x1, x2, ..., xn}, where n is the sentence length and each xi is the corresponding character feature. The sentence is then passed through the trained entity recognition and relation extraction model to obtain the entity features of the text T = {t1, t2, ..., tn}, in which the entity feature of each word is represented by a BEIOS entity label, for example the body part "arm" = {hand: B-B, arm: B-E}; at the same time a list of relationships between entities is obtained, for example the relation "site attribute" <arm - site attribute - scratch>. After the initial input features are obtained, the heterogeneous graph is constructed.
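For illustration only, the following Python snippet sketches the kind of data structures step S201 might produce for one sentence; the character sequence, label names and relation name are hypothetical examples rather than values fixed by the patent.

```python
# A hypothetical character-level sample meaning "arm scratch" (n = 4 characters).
sentence = list("手臂擦伤")
# BEIOS entity labels t1 ... tn, one per character: a body-part entity followed by a symptom entity.
entity_features = ["B-Body", "E-Body", "B-Sym", "E-Sym"]
# Relation list as (head entity span, relation type, tail entity span) triples,
# e.g. the "site attribute" relation <arm - site attribute - scratch>.
relations = [((0, 1), "site_attribute", (2, 3))]

n = len(sentence)
assert len(entity_features) == n
```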
S202: constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities.
In a specific optional embodiment, constructing the heterogeneous graph based on the entity features of the text and the relation list between the entities includes:
taking n as the number of nodes of the heterogeneous graph of the current sample, constructing an adjacency matrix A ∈ R^(n×n);
for the n nodes, constructing edges between every two nodes, where an entity edge is connected only on the head and tail nodes of the entity and is marked with the entity type label id;
when constructing the relation adjacency values between entities, marking the head and tail of the corresponding entities with the relation label id respectively, representing nodes without a relation by 0, and obtaining an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, which is taken as the heterogeneous graph.
Further, the construction of the heterogeneous graph is performed for each sample in the training data.
Specifically, in this embodiment, the sentence length n is taken as the number of nodes of the heterogeneous graph of the current sample, and an adjacency matrix A ∈ R^(n×n) is first constructed. Because this embodiment mainly integrates entity and relation information and does not consider vocabulary-enhancement features for entity recognition, no word nodes are constructed: all nodes are character nodes. First, the number of entity types and the number of relation types in the whole data set are counted (the subscripts l and g denote these overall counts), and each type is mapped to an integer id. Examples of entity types are body parts, diseases, clinical manifestations, physical examination items, etc., and relations include clinical attributes, site attributes, drug attributes, etc. The heterogeneous graph is then constructed from the entity and relation features extracted in step S201: for the n nodes, an edge is constructed between every two nodes, where an entity edge is connected only on the head and tail nodes of the entity and is marked with the entity type label id. Relations between entities are constructed in a similar way, except that, because entities can be nested, multiple entities may share the same beginning character but have different ending characters; therefore, when constructing the relation adjacency values, the head and tail of the corresponding entities are each marked with the relation label id, and nodes without a relation are represented by 0. Finally an adjacency matrix A ∈ R^(n×n) with heterogeneous relations is obtained, in which different values represent different entity types or different relation types, so it carries more information than an ordinary 0/1 adjacency matrix; if simplification is preferred, the three values 0/1/2 can instead be used to represent no connection / entity connection / relation connection, respectively.
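A minimal sketch of this adjacency-matrix construction is given below in Python with NumPy; the function name, the exact cells chosen for the relation edges, and the toy type ids are assumptions made for the example and not details fixed by the patent.

```python
import numpy as np

def build_hetero_adjacency(n, entities, relations):
    """entities: list of (start, end, entity_type_id);
    relations: list of ((h_start, h_end), relation_type_id, (t_start, t_end))."""
    A = np.zeros((n, n), dtype=np.int64)
    # Entity edges: connect only the head and tail characters of each entity,
    # marked with the entity type id.
    for start, end, type_id in entities:
        A[start, end] = type_id
        A[end, start] = type_id
    # Relation edges: mark the head characters and the tail characters of the
    # two related entities with the relation type id; 0 means "no relation".
    for (h_start, h_end), rel_id, (t_start, t_end) in relations:
        A[h_start, t_start] = rel_id
        A[h_end, t_end] = rel_id
    return A

# Toy usage: 4 characters, a body-part entity (type id 1) on characters 0-1,
# a symptom entity (type id 2) on characters 2-3, linked by a "site attribute"
# relation (relation id 3 in a shared entity/relation id space).
A = build_hetero_adjacency(4, entities=[(0, 1, 1), (2, 3, 2)],
                           relations=[((0, 1), 3, (2, 3))])
print(A)
```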
S203: inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model.
In a specific optional embodiment, inputting the heterogeneous graph into the graph attention model for training to obtain the trained graph attention model includes:
taking a BERT model as the coding layer of the graph attention model, and passing the training data and the heterogeneous graph through the coding layer to obtain an input node vector H0, where H0 ∈ R^(b×n×d), b is batch_size and n is the text length;
at each feature extraction layer of the graph attention network, mapping the input node vector to two matrices Q and K, and extracting features with an attention mechanism to obtain the extracted features;
and carrying out intention recognition loss calculation according to the extracted features, and carrying out iterative training according to the loss results to obtain the trained graph attention model.
In another specific optional implementation of this embodiment, extracting features with an attention mechanism to obtain the extracted features includes:
performing dimension expansion and padding on Q and K so that they reach a four-dimensional size;
concatenating Q and K along the fourth dimension to obtain QK;
after obtaining the heterogeneous relation graph W_QK, computing dot products row by row to obtain the attention score A_QK of QK, where a value of 0 in A ∈ R^(n×n) indicates that the corresponding nodes are not associated;
constructing a mask matrix A_mask ∈ R^(n×n) from A ∈ R^(n×n): when a position value in A is 0, the value of the corresponding position of A_mask is -1e9, otherwise it is 0;
calculating the final attention score A_QK = A_QK + A_mask;
performing pooling and aggregation operations to obtain an output vector H0';
and weighting the output vector H0' to obtain the extracted features.
Specifically, in this embodiment, graph construction is performed for each sample, and since the graph of each sample is static, an inductive-learning strategy such as the GraphSAGE model is not required; instead, a transductive approach is adopted to build a GCN or GAT (graph attention model), and this scheme uses a GAT. The input data of the whole model are the text and the adjacency graph A. The text encoder is BERT, so the text is passed through the token embedding layer to obtain the input vector H0 of the first transformer layer, where H0 ∈ R^(b×n×d), b is batch_size and n is the text length. H0 is then fed into the GAT to extract the heterogeneous graph information. With the number of GAT layers fixed, the following operations are performed in each layer:
Without considering multi-head attention, this embodiment maps the input node vector to two matrices Q and K, and then performs the following attention operations:
Q and K are dimension-expanded and padded to a four-dimensional size;
Q and K are then concatenated along the fourth dimension to obtain QK. In a standard GAT, a learned weight vector would convert QK into attention scores followed by a softmax; in the heterogeneous relation graph of this scheme, however, W_QK is a three-dimensional matrix whose two-dimensional element matrices are all obtained by mapping the adjacency matrix constructed in step S202. Specifically, a global entity-relation embedding matrix is first constructed, in which each row vector represents the embedding of the corresponding entity or relation id, similar to a token word-embedding matrix; all layers of this scheme share this one matrix, and the extra "+1" row is the id of edges that carry neither an entity nor a relation, so W_QK is consistent across all GAT operations. The dot product is then computed row by row to obtain the attention score A_QK of QK. Since a value of 0 in the adjacency matrix A means that the two nodes (characters) are not related, a mask matrix A_mask is further constructed from A: when a position value in A is 0, the value of the corresponding position in A_mask is -1e9, otherwise it is 0.
The final attention score is A_QK = A_QK + A_mask, and the output vector is then obtained through LeakyReLU, softmax and aggregation operations.
This gives the whole process of the heterogeneous-relation GAT of this scheme from the input H0 to the output H0'. The multi-head case is handled as in the standard GAT: the dimensions are split, the attention operation is performed, and finally a concat operation is applied. Several GAT operations can be performed before each BERT layer; in this scheme the number is 2.
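The following PyTorch sketch illustrates one such masked heterogeneous attention operation for a single head. It is an interpretation of the steps above, not the patent's implementation: the function and argument names, the use of nn.Embedding for the shared entity-relation matrix, and the exact tensor shapes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def hetero_gat_attention(H, A, W_q, W_k, edge_emb):
    """One single-head masked attention step over the heterogeneous graph.
    H: (b, n, d) node vectors from the BERT encoder;
    A: (b, n, n) long tensor of entity/relation ids (0 = no edge);
    W_q, W_k: (d, d) projection matrices;
    edge_emb: nn.Embedding(num_entity_types + num_relation_types + 1, 2 * d),
              the shared global entity-relation matrix (row 0 = "no edge")."""
    Q, K = H @ W_q, H @ W_k                                      # (b, n, d) each
    b, n, d = Q.shape
    # Expand/pad and concatenate along a fourth dimension -> pairwise QK features.
    QK = torch.cat([Q.unsqueeze(2).expand(b, n, n, d),
                    K.unsqueeze(1).expand(b, n, n, d)], dim=-1)  # (b, n, n, 2d)
    # W_QK: per-edge vectors looked up from the shared entity/relation table.
    W_QK = edge_emb(A)                                           # (b, n, n, 2d)
    # Row-wise dot product gives the raw attention score A_QK.
    A_QK = (QK * W_QK).sum(dim=-1)                               # (b, n, n)
    # Mask: positions with no edge (A == 0) receive -1e9 so softmax ignores them.
    A_mask = torch.where(A == 0, torch.full_like(A_QK, -1e9),
                         torch.zeros_like(A_QK))
    attn = F.softmax(F.leaky_relu(A_QK + A_mask), dim=-1)        # final scores
    return attn @ H                                              # aggregated output, (b, n, d)
```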
After the output of the GAT is obtained, the GAT output and the original input are added together and used as the input of the current transformer block, yielding the output vector of that transformer block (note that instead of adding, the two vectors could also be concatenated along the sentence-length dimension as the transformer-block input).
The final output vector of the whole BERT+GAT model is then used to construct the intention recognition loss through mean-pooling (MeanPool) and cross-entropy operations.
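As a final illustrative sketch, the loss can be computed from the last transformer-block output as follows; the names and the plain nn.Linear classification head are assumptions, since the patent does not spell out the head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def intent_loss(H_final, labels, classifier):
    """H_final: (b, n, d) output of the last BERT+GAT transformer block;
    labels: (b,) gold intent ids; classifier: nn.Linear(d, num_intents)."""
    pooled = H_final.mean(dim=1)            # MeanPool over the n tokens -> (b, d)
    logits = classifier(pooled)             # (b, num_intents)
    return F.cross_entropy(logits, labels)  # intention recognition loss
```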
In this embodiment, training data is input into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining entity features of the text T = {t1, t2, ..., tn} and a list of relationships between the entities, where t1 to tn are the entity features of each word and n is the number of words contained in the sentence of the training data; a heterogeneous graph is constructed based on the entity features of the text and the relation list between the entities; and the heterogeneous graph is input into a graph attention model for training to obtain a trained graph attention model. The intention recognition model based on the heterogeneous relation graph enhances the understanding and fusion of entity-relation features, thereby improving the effect of intention recognition.
Referring to fig. 3, fig. 3 shows a medical intention recognition method according to an embodiment of the present invention, and the method is applied to the server in fig. 1 for illustration, and is described in detail as follows:
S204: receiving a sentence to be recognized;
S205: constructing a heterogeneous graph of the sentence to be recognized, and inputting the constructed heterogeneous graph into a trained medical intention recognition model to obtain an intention recognition result.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
Fig. 4 shows a schematic block diagram of a medical intent recognition model training apparatus in one-to-one correspondence with the medical intent recognition model training method of the above embodiment. As shown in fig. 4, the medical intention recognition model training apparatus includes an entity extraction module 31, a heterogeneous map construction module 32, and a model training module 33. The functional modules are described in detail as follows:
the entity extraction module 31 is configured to input training data into the trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction, obtaining entity features of the text T = {t1, t2, ..., tn} and a relation list between entities, where t1 to tn are the entity features of each word and n is the number of words contained in the sentences of the training data;
the heterogeneous graph construction module 32 is configured to construct a heterogeneous graph based on the entity features of the text and the relation list between the entities;
the model training module 33 is configured to input the heterogeneous graph to the graph attention model for training, and obtain a trained graph attention model.
Optionally, the heterogeneous map construction module 32 includes:
A matrix construction unit, configured to take n as the number of nodes of the heterogeneous graph of the current sample and construct an adjacency matrix A ∈ R^(n×n);
an edge construction unit, configured to construct edges between every two of the n nodes, where an entity edge is connected only on the head and tail nodes of the entity and is marked with the entity type label id;
a heterogeneous graph generating unit, configured to mark the head and tail of the corresponding entities with the relation label id respectively when constructing the relation adjacency values between entities, represent nodes without a relation by 0, and obtain an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, which is taken as the heterogeneous graph.
Optionally, the model training module 33 includes:
The coding module is configured to take a BERT model as the coding layer of the graph attention model and pass the training data and the heterogeneous graph through the coding layer to obtain an input node vector H0, where H0 ∈ R^(b×n×d), b is batch_size and n is the text length;
the feature extraction module is configured to map, at each feature extraction layer of the graph attention network, the input node vector to two matrices Q and K, and extract features with an attention mechanism to obtain the extracted features;
And the iterative training unit is used for carrying out intention recognition loss calculation according to the extracted features and carrying out iterative training according to the loss result to obtain a trained graph attention model.
Fig. 5 shows a schematic block diagram of a medical intent recognition device in one-to-one correspondence with the medical intent recognition method of the above embodiment. As shown in fig. 5, the medical intention recognition apparatus includes a receiving module 34 and a recognition module 35. The functional modules are described in detail as follows:
a receiving module 34, configured to receive a sentence to be identified;
The recognition module 35 is configured to construct a heterogeneous graph of the sentence to be recognized, and input the constructed heterogeneous graph into a trained medical intention recognition model to obtain an intention recognition result.
For specific limitations on the medical intent recognition model training apparatus, reference may be made to the above limitations on the medical intent recognition model training method, and no further description is given here. The respective modules in the medical intention recognition model training apparatus described above may be implemented in whole or in part by software, hardware, and a combination thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 6, fig. 6 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42 and a network interface 43 communicatively connected to each other via a system bus. It is noted that only a computer device 4 having the memory 41, the processor 42 and the network interface 43 is shown in the figure, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card provided on the computer device 4. Of course, the memory 41 may also comprise both an internal storage unit of the computer device 4 and an external storage device. In this embodiment, the memory 41 is typically used to store an operating system and various types of application software installed on the computer device 4, such as program code for training a medical intention recognition model. Further, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the program code stored in the memory 41 or process data, such as the program code for training the medical intention recognition model.
The network interface 43 may comprise a wireless network interface or a wired network interface, which network interface 43 is typically used for establishing a communication connection between the computer device 4 and other electronic devices.
The present application also provides another embodiment, namely, a computer-readable storage medium storing an interface display program executable by at least one processor to cause the at least one processor to perform the steps of the medical intent recognition model training method as described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
It is apparent that the above-described embodiments are only some, but not all, embodiments of the present application, and the preferred embodiments shown in the drawings do not limit the scope of the claims. This application may be embodied in many different forms; these embodiments are provided so that the present disclosure will be thorough and complete. Although the application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the technical solutions described in the foregoing embodiments, or equivalents may be substituted for some of their features. All equivalent structures made using the content of the specification and the drawings of the application, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of protection of the application.

Claims (9)

1. A medical intent recognition model training method, comprising:
Inputting training data into a trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction to obtain entity features of a text T = {t1, t2, ..., tn} and a relation list between entities, wherein t1 to tn are the entity features of each word and n is the number of words contained in the sentences of the training data;
constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities;
inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model;
wherein the constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities includes:
taking n as the number of nodes of the heterogeneous graph of the current sample, and constructing an adjacency matrix A ∈ R^(n×n);
for the n nodes, constructing edges between every two nodes, wherein the entity edges are connected only on the head and tail nodes and are marked with the entity type label ids_i;
when constructing the relation adjacency values between the entities, marking the head and the tail of the corresponding entities with the relation label ids_i respectively, representing nodes without a relation by 0, obtaining an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, and taking the adjacency matrix A ∈ R^(n×n) with heterogeneous relations as the heterogeneous graph.
2. The medical intent recognition model training method of claim 1, wherein the construction of the heterogeneous graph is performed for each sample in the training data.
3. The medical intent recognition model training method of claim 1, wherein said inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model comprises:
the coding layer of the graph attention model is a BERT model, and the training data and the heterogeneous graph are passed through the coding layer to obtain an input node vector H0, wherein H0 ∈ R^(b×n×d), b is batch_size, and n is the text length;
at each feature extraction layer of the graph attention network, the input node vector is mapped to two matrices Q and K, and features are extracted with an attention mechanism to obtain extracted features;
and carrying out intention recognition loss calculation according to the extracted features, and carrying out iterative training according to loss results to obtain the trained graph attention model.
4. The medical intent recognition model training method of claim 3, wherein said feature extraction using an attention mechanism results in extracted features, comprising:
performing dimension expansion and padding on Q and K so that they reach a four-dimensional size;
concatenating Q and K along the fourth dimension to obtain QK;
after the heterogeneous relation graph W_QK is obtained, the dot product is calculated row by row to obtain the attention score A_QK of QK, wherein a value of 0 in A ∈ R^(n×n) indicates that the corresponding nodes are not associated;
building a mask matrix A_mask ∈ R^(n×n) from A ∈ R^(n×n), wherein when a position value in A is 0, the value of the corresponding position of A_mask is -1e9, otherwise the value is 0;
calculating a final attention score A_QK = A_QK + A_mask;
performing pooling and aggregation operations to obtain an output vector H0';
and weighting the output vector H0' to obtain the extracted features.
5. A medical intent recognition method, comprising:
Receiving a sentence to be recognized;
constructing a heterogeneous graph of the sentence to be recognized, and inputting the constructed heterogeneous graph into a trained medical intention recognition model to obtain an intention recognition result, wherein the trained medical intention recognition model is trained according to the medical intention recognition model training method of any one of claims 1 to 4.
6. A medical intent recognition model training apparatus, characterized in that the medical intent recognition model training apparatus comprises:
The entity extraction module is used for inputting training data into the trained feature recognition and relation extraction model to perform entity feature extraction and relation extraction to obtain entity features of a text T = {t1, t2, ..., tn} and a relation list between entities, wherein t1 to tn are the entity features of each word and n is the number of words contained in the sentences of the training data;
the heterogeneous graph construction module is used for constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities;
the model training module is used for inputting the heterogeneous graph into a graph attention model for training to obtain a trained graph attention model;
wherein the constructing a heterogeneous graph based on the entity features of the text and the relation list between the entities includes:
taking n as the number of nodes of the heterogeneous graph of the current sample, and constructing an adjacency matrix A ∈ R^(n×n);
for the n nodes, constructing edges between every two nodes, wherein the entity edges are connected only on the head and tail nodes and are marked with the entity type label ids_i;
when constructing the relation adjacency values between the entities, marking the head and the tail of the corresponding entities with the relation label ids_i respectively, representing nodes without a relation by 0, obtaining an adjacency matrix A ∈ R^(n×n) with heterogeneous relations, and taking the adjacency matrix A ∈ R^(n×n) with heterogeneous relations as the heterogeneous graph.
7. A medical intent recognition device, comprising:
The receiving module is used for receiving the sentence to be recognized;
the recognition module is used for constructing a heterogeneous graph of the sentence to be recognized and inputting the constructed heterogeneous graph into a trained medical intention recognition model to obtain an intention recognition result, wherein the trained medical intention recognition model is trained according to the medical intention recognition model training method of any one of claims 1 to 4.
8. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the medical intent recognition model training method according to any one of claims 1 to 4 when executing the computer program or the processor implements the medical intent recognition method according to claim 5 when executing the computer program.
9. A computer-readable storage medium storing a computer program, wherein the computer program implements the medical intent recognition model training method according to any one of claims 1 to 4 when executed by a processor, or the medical intent recognition method according to claim 5 when executed by a processor.
CN202410180194.1A 2024-02-18 2024-02-18 Medical intention recognition model training method, medical intention recognition method and equipment Active CN117725961B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410180194.1A CN117725961B (en) 2024-02-18 2024-02-18 Medical intention recognition model training method, medical intention recognition method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410180194.1A CN117725961B (en) 2024-02-18 2024-02-18 Medical intention recognition model training method, medical intention recognition method and equipment

Publications (2)

Publication Number Publication Date
CN117725961A CN117725961A (en) 2024-03-19
CN117725961B true CN117725961B (en) 2024-07-30

Family

ID=90211108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410180194.1A Active CN117725961B (en) 2024-02-18 2024-02-18 Medical intention recognition model training method, medical intention recognition method and equipment

Country Status (1)

Country Link
CN (1) CN117725961B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010683A (en) * 2020-08-26 2021-06-22 齐鲁工业大学 Entity relationship identification method and system based on improved graph attention network
CN116167382A (en) * 2023-01-05 2023-05-26 中国电信股份有限公司 Intention event extraction method and device, electronic equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046671A (en) * 2019-12-12 2020-04-21 中国科学院自动化研究所 Chinese named entity recognition method based on graph network and merged into dictionary
CN112035637A (en) * 2020-08-28 2020-12-04 康键信息技术(深圳)有限公司 Medical field intention recognition method, device, equipment and storage medium
CN112201359B (en) * 2020-09-30 2024-05-03 平安科技(深圳)有限公司 Method and device for identifying severe inquiry data based on artificial intelligence
CN114036308A (en) * 2021-09-28 2022-02-11 西安电子科技大学 Knowledge graph representation method based on graph attention neural network
CN114419304B (en) * 2022-01-18 2024-11-08 深圳前海环融联易信息科技服务有限公司 Multi-mode document information extraction method based on graphic neural network
CN114443846B (en) * 2022-01-24 2024-07-16 重庆邮电大学 Classification method and device based on multi-level text different composition and electronic equipment
CN115687934A (en) * 2022-12-30 2023-02-03 智慧眼科技股份有限公司 Intention recognition method and device, computer equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010683A (en) * 2020-08-26 2021-06-22 齐鲁工业大学 Entity relationship identification method and system based on improved graph attention network
CN116167382A (en) * 2023-01-05 2023-05-26 中国电信股份有限公司 Intention event extraction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN117725961A (en) 2024-03-19

Similar Documents

Publication Publication Date Title
WO2022142014A1 (en) Multi-modal information fusion-based text classification method, and related device thereof
US11036996B2 (en) Method and apparatus for determining (raw) video materials for news
CN117765132A (en) Image generation method, device, equipment and storage medium
CN113360654A (en) Text classification method and device, electronic equipment and readable storage medium
CN113657105A (en) Medical entity extraction method, device, equipment and medium based on vocabulary enhancement
CN112084752A (en) Statement marking method, device, equipment and storage medium based on natural language
CN116245097A (en) Method for training entity recognition model, entity recognition method and corresponding device
CN115757731A (en) Dialogue question rewriting method, device, computer equipment and storage medium
CN112232052B (en) Text splicing method, text splicing device, computer equipment and storage medium
CN110019952B (en) Video description method, system and device
CN113688232A (en) Method and device for classifying bidding texts, storage medium and terminal
CN112199954A (en) Disease entity matching method and device based on voice semantics and computer equipment
CN112434746B (en) Pre-labeling method based on hierarchical migration learning and related equipment thereof
CN117725961B (en) Medical intention recognition model training method, medical intention recognition method and equipment
CN117874234A (en) Text classification method and device based on semantics, computer equipment and storage medium
CN113312568A (en) Web information extraction method and system based on HTML source code and webpage snapshot
CN116186295B (en) Attention-based knowledge graph link prediction method, attention-based knowledge graph link prediction device, attention-based knowledge graph link prediction equipment and attention-based knowledge graph link prediction medium
CN112182157A (en) Training method of online sequence labeling model, online labeling method and related equipment
CN117992569A (en) Method, device, equipment and medium for generating document based on generation type large model
CN115510203B (en) Method, device, equipment, storage medium and program product for determining answers to questions
CN113657092B (en) Method, device, equipment and medium for identifying tag
CN115982363A (en) Small sample relation classification method, system, medium and electronic device based on prompt learning
CN114091451A (en) Text classification method, device, equipment and storage medium
CN113657104A (en) Text extraction method and device, computer equipment and storage medium
CN117688193B (en) Picture and text unified coding method, device, computer equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant