WO2024098682A1 - XAI model evaluation method and apparatus, device, and medium
- Publication number
- WO2024098682A1 (PCT/CN2023/091751)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node sequence
- explanation
- subgraph
- pair
- score
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- Embodiments of the present application relate to the field of artificial intelligence technology, for example, to an Explainable Artificial Intelligence (XAI) model evaluation method, device, equipment and medium.
- XAI: Explainable Artificial Intelligence
- XAI is an important branch of trusted AI technology. Similar to traditional machine learning models, XAI algorithms and models also need to be optimized and selected from a group of models. However, the traditional machine learning model selection method is not applicable to XAI models, because the evaluation indicators of XAI models not only include the evaluation of model accuracy, but also include the evaluation of the expression form and comprehensibility of the interpretation results. There are also multiple interpretation methods to evaluate the ability to interpret the same result. The evaluation content often involves the cognitive ability of the evaluator, so it is difficult to form a unified standard or universal method.
- The XAI evaluation methods in related technologies have major defects: on the one hand, standard quantitative measurement methods are lacking, and manual evaluation consumes a lot of resources while offering low accuracy and efficiency; on the other hand, users' evaluation of a model is easily affected by many factors (such as subjective factors), so the validity of the evaluation data cannot be guaranteed.
- The embodiments of the present application provide an XAI model evaluation method, apparatus, device and storage medium, which solve the problems that traditional XAI evaluation and selection methods are time-consuming and labor-intensive, involve lengthy processes, and have low evaluation effectiveness.
- the present application provides an XAI model evaluation method, including: obtaining an explanation result pair corresponding to the XAI model to be evaluated; obtaining a node sequence pair matching the explanation result pair in a knowledge graph; determining a subgraph set corresponding to each node sequence in the node sequence pair based on the node sequence pair and the knowledge graph; and determining a scoring pair corresponding to the explanation result pair based on the subgraph set corresponding to each node sequence.
- the present application provides an XAI model evaluation device, which includes: an explanation result pair acquisition module, configured to obtain an explanation result pair corresponding to the XAI model to be evaluated; a node sequence pair acquisition module, configured to obtain a node sequence pair matching the explanation result pair in a knowledge graph; a subgraph set determination module, configured to determine a subgraph set corresponding to each node sequence in the node sequence pair based on the node sequence pair and the knowledge graph; and a scoring pair determination module, configured to determine a scoring pair corresponding to the explanation result pair based on the subgraph set corresponding to each node sequence.
- the present application provides an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can execute the XAI model evaluation method described in any embodiment of the present application.
- the present application provides a computer-readable storage medium, which stores computer instructions, and the computer instructions are used to enable a processor to implement the XAI model evaluation method described in any embodiment of the present application when executed.
- FIG1 is a flow chart of an XAI model evaluation method in Embodiment 1 of the present application.
- FIG2 is a schematic diagram of a subgraph set G_a[] obtained from a matching result in Embodiment 1 of the present application.
- FIG3 is a schematic diagram of a subgraph set G_b[] obtained from a matching result in Embodiment 1 of the present application.
- FIG4 is a schematic diagram of the structure of an XAI model evaluation device in Embodiment 2 of the present application.
- FIG5 is a schematic diagram of the structure of an electronic device in Embodiment 3 of the present application.
- Figure 1 is a flow chart of an XAI model evaluation method in Example 1 of the present application. This embodiment is applicable to the situation of comprehensive ranking evaluation of the interpretable performance of the XAI model in multiple random comparison tests.
- the method can be performed by the XAI model evaluation device in the embodiment of the present application, which can be implemented in software and/or hardware. As shown in Figure 1, the method includes the following steps.
- the explanation result pairs corresponding to the XAI model to be evaluated may be explanation result pairs of the same XAI model under different training batches, or may be explanation result pairs of different XAI models.
- the method for obtaining the explanation result pair corresponding to the XAI model to be evaluated may be: randomly selecting two explanation results to form an explanation result pair according to the explanation results of the same XAI model in different training batches.
- the method for obtaining the explanation result pair corresponding to the XAI model to be evaluated may also be: randomly selecting two explanation results to form an explanation result pair according to the explanation results of different XAI models.
- Knowledge Graph can be referred to as KG for short.
- the matching method can be based on the nodes in KG or the attributes of KG, which is not limited here.
- The node sequence pair is the result obtained after the explanation result pair is matched against the KG.
- the method of obtaining the node sequence pair matching the explanation result pair in the knowledge graph can be: obtaining the explanation result pair, matching the explanation result pair with the node sequence and attribute sequence of the KG, and obtaining the node sequence pair matching the explanation result pair through KG semantic retrieval matching processing.
- the nodes of KG can be denoted as V, the edges as E, the node attributes as A, the node sequence of KG is V[], and the attribute sequence is A[].
- The factors in E_a[] and E_b[] are represented without semantic information.
- Using the metadata of the data, the factors are replaced by field names, so that the sequences E_a[] and E_b[] become the sequences E′_a[] and E′_b[] with semantic information, where the metadata of the data is the basic information in the data processing system and the data set management system.
- Matching uses a reverse traversal search algorithm that runs from the final result back to the original cause. Its implementation is as follows: first, call the KG semantic retrieval interface to match the factor i_n in the KG, and denote the matched node in the KG as V_n; second, starting from V_n, match the factor i_(n-1) within the range of path depth D. If the match is successful, repeat this substep starting from i_(n-1).
- the search depth can be adjusted according to actual needs when executing the reverse traversal search algorithm for node matching.
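The reverse traversal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the KG is a plain undirected adjacency dictionary, exact name lookup stands in for the KG semantic retrieval interface, and a factor that is not found within depth D is still matched by a global lookup and recorded as a break in the path (the text does not specify what happens on a failed match). The names `nodes_within_depth` and `match_reverse` are hypothetical.

```python
from collections import deque


def nodes_within_depth(adj, start, depth):
    """Return all nodes reachable from `start` in at most `depth` hops (BFS)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue  # do not expand beyond the depth limit
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return set(seen) - {start}


def match_reverse(adj, factors, depth=1):
    """Match factors i_1..i_n to KG nodes, walking from the result i_n back to the causes.

    Returns (matched, breaks): the matched nodes in factor order, and the factors
    that were NOT found within `depth` hops of the previously matched node.
    """
    matched, breaks = [], []
    anchor = None
    for name in reversed(factors):
        if name not in adj:  # stand-in for the KG semantic retrieval call
            continue         # this factor has no node in the KG
        if anchor is not None and name not in nodes_within_depth(adj, anchor, depth):
            breaks.append(name)  # assumption: fall back to a global match
        matched.append(name)
        anchor = name
    matched.reverse()
    return matched, breaks


# Toy KG: A - B - C is one chain, X - Y is a separate one.
kg = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "X": ["Y"], "Y": ["X"]}
print(match_reverse(kg, ["X", "B", "C"], depth=1))
```

Raising `depth` lets the search bridge intermediate nodes that are not themselves factors, which is one way to read the remark that the search depth can be adjusted to actual needs.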
- A subgraph set consists of one or more subgraphs and is the path in the KG corresponding to a node sequence. If a subgraph set contains multiple subgraphs, the path is broken.
- the method of determining the subgraph set corresponding to each node sequence according to the node sequence pair and the knowledge graph is as follows: obtaining the node sequence pair, and obtaining the subgraph set of each node sequence through the path search function of the knowledge graph.
- G_a[] is the path corresponding to V_a[] and consists of one or more subgraphs; that is, the subgraph set corresponding to V_a[] is determined to be G_a[].
- G_b[] is the path corresponding to V_b[]; that is, the subgraph set corresponding to V_b[] is determined to be G_b[].
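One way to picture how a node sequence turns into a subgraph set is to split the matched nodes into connected components. The sketch below assumes the KG is given as a list of weighted edges `(u, v, w)` and that only edges running directly between matched nodes count (in effect a search depth of 1). The function name `subgraph_set` is hypothetical, and the KG's own path search function may behave differently.

```python
def subgraph_set(kg_edges, node_sequence):
    """Split the matched nodes into connected subgraphs using KG edges between them."""
    nodes = set(node_sequence)
    adj = {n: set() for n in nodes}
    for u, v, _w in kg_edges:
        if u in nodes and v in nodes:  # keep only edges inside the node sequence
            adj[u].add(v)
            adj[v].add(u)

    components, seen = [], set()
    for start in node_sequence:  # iterate the sequence for a deterministic order
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first flood fill of one component
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        components.append(comp)
    return components


edges = [("A", "B", 0.5), ("B", "Y", 0.8), ("C", "D", 0.3)]
print(subgraph_set(edges, ["A", "B", "Y", "C"]))  # two subgraphs: the path is broken
```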
- the score pair corresponding to the explanation result pair is a total score pair calculated based on the explanation coherence score, explanation complexity score and explanation credibility score of the subgraph set corresponding to each node sequence.
- the method of determining the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence can be: obtaining the subgraph set corresponding to each node sequence, determining the score of the coherence of the explanation, the complexity of the explanation, and the credibility of the explanation of each node sequence according to the subgraph set corresponding to each node sequence, and based on the score of the coherence of the explanation, the complexity of the explanation, and the credibility of the explanation of each node sequence, calculating the total score of the subgraph set corresponding to each node sequence, and then determining the score pair corresponding to the explanation result pair.
- a score pair corresponding to an explanation result pair is determined according to a subgraph set corresponding to each node sequence, including: determining an explanation coherence score, an explanation complexity score and an explanation credibility score corresponding to each node sequence according to the subgraph set corresponding to each node sequence; determining a target score corresponding to each node sequence according to the explanation coherence score, the explanation complexity score and the explanation credibility score corresponding to each node sequence; and determining a score pair corresponding to an explanation result pair according to the target score corresponding to each node sequence.
- the target score is the total score calculated based on the explanation coherence score, explanation complexity score, and explanation credibility score corresponding to each node sequence.
- the method of determining the explanation coherence score, explanation complexity score and explanation credibility score corresponding to each node sequence according to the subgraph set corresponding to each node sequence can be: obtaining the subgraph set corresponding to each node sequence, obtaining the explanation coherence score by measuring the number of subgraphs in the subgraph set corresponding to each node sequence, obtaining the explanation complexity score by measuring the number of nodes and edges of each subgraph in the subgraph set corresponding to each node sequence, and obtaining the explanation credibility score by calculating the sum of the edge weights of all subgraphs in the subgraph set corresponding to each node sequence.
- the method of determining the target score corresponding to each node sequence according to the explanation coherence score, explanation complexity score and explanation credibility score corresponding to each node sequence can be: based on the explanation coherence score, explanation complexity score and explanation credibility score corresponding to each node sequence, the total score of each node sequence is calculated.
- the total score of each node sequence can be calculated by taking Ga [] as an example, determining the explanation coherence score S split according to the number of subgraphs in Ga [], determining the explanation complexity score S complexity_a_total corresponding to Ga [] according to the explanation complexity of Ga [], determining the explanation credibility score S credit according to the multiple edge weights of the subgraph Ga_target containing the result factor, and calculating the total score as follows:
- The method of determining the score pair corresponding to the explanation result pair according to the target score corresponding to each node sequence can be: determining the total score of each node sequence from its explanation coherence score, explanation complexity score and explanation credibility score, and forming the total score pair from the two total scores, which is the score pair corresponding to the explanation result pair.
- the result set Rsc[i] is used to store the scoring results.
- the representation of one comparison result can be an ordered positive integer pair (a, b), where a and b are the labels of the selected models to be evaluated, and i is the number of the current comparison round. All comparison results are stored in the result set Rsc[], whose size is the hyperparameter N, that is, the total number of comparison rounds. Repeat steps S110 to S140 until the number of comparison rounds reaches the hyperparameter N.
- determining the explanation coherence score corresponding to each node sequence according to the subgraph set corresponding to each node sequence includes: obtaining the number of subgraphs in the subgraph set corresponding to each node sequence; and determining the explanation coherence score corresponding to each node sequence according to the number of subgraphs in the subgraph set.
- the scoring strategy for explanation coherence is: for a logically coherent explanation result, its reasoning process should be coherent, and the corresponding nodes of its factors in the KG form a connected subgraph. The more disconnected subgraphs there are, the less coherent the explanation result is.
- the method for obtaining the number of subgraphs in the subgraph set corresponding to each node sequence may be: obtaining the subgraph set corresponding to each node sequence, and then obtaining the number of subgraphs in the subgraph set.
- The method of determining the explanation coherence score corresponding to each node sequence according to the number of subgraphs in the subgraph set may be: obtaining the number of subgraphs in the subgraph set, and taking the reciprocal of that number as the explanation coherence score corresponding to the node sequence. For example, for G_a[], the explanation coherence score S_split is the reciprocal of the number of subgraphs in G_a[].
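As a small worked illustration, reading "inverse" as the reciprocal, the coherence score can be computed directly from the size of the subgraph set. The function name `coherence_score` is hypothetical; the subgraph counts 2 and 3 are those stated for FIG2 and FIG3.

```python
def coherence_score(subgraphs):
    """Explanation coherence: reciprocal of the number of subgraphs in the set."""
    if not subgraphs:
        raise ValueError("a subgraph set needs at least one subgraph")
    return 1.0 / len(subgraphs)


# Placeholder subgraph sets with the counts given for FIG2 (2) and FIG3 (3).
g_a = ["sub1", "sub2"]
g_b = ["sub1", "sub2", "sub3"]
print(coherence_score(g_a), coherence_score(g_b))
```

A fully connected explanation (one subgraph) scores 1.0, and each additional break lowers the score.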
- the explanation complexity score corresponding to each node sequence is determined based on the subgraph set corresponding to each node sequence, including: determining the complexity of each subgraph based on the number of nodes and the number of edges in each subgraph in the subgraph set corresponding to each node sequence; and determining the sum of the complexities of all subgraphs in the subgraph set corresponding to each node sequence as the explanation complexity score corresponding to each node sequence.
- the scoring strategy for explanation complexity is: the smaller the ratio of nodes to edges in a subgraph, the more effective, concise, and convincing the explanation is.
- An edge can be understood as the connection between two nodes in each subgraph.
- The method for determining the complexity of each subgraph based on the number of nodes and the number of edges of each subgraph in the subgraph set corresponding to each node sequence can be: obtain each subgraph in the subgraph set corresponding to each node sequence, obtain the number of nodes and the number of edges of each subgraph, and take the ratio of the number of nodes to the number of edges as the complexity of that subgraph.
- The method of determining the sum of the complexities of all subgraphs in the subgraph set corresponding to each node sequence as the explanation complexity score corresponding to each node sequence can be: obtaining the complexity of each subgraph in the subgraph set, and determining the sum of the complexities of all subgraphs in the subgraph set as the explanation complexity score corresponding to each node sequence.
- Each subgraph in G_a[] is represented by a triple consisting of a node set V, an edge set E, and an edge weight set. For the i-th subgraph G_a_i of G_a[], with node set V_i and edge set E_i, the measure of the complexity of this subgraph is the ratio of the number of nodes to the number of edges: $|V_i| / |E_i|$.
- S complexity_a_total is the total score of the explanation complexity of G a [].
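The complexity score described above (nodes divided by edges per subgraph, summed over the subgraph set) can be sketched as follows. The function names are hypothetical. The handling of a subgraph with no edges, such as an isolated node, is an assumption, since that case is not defined in the text.

```python
def subgraph_complexity(nodes, edges):
    """Complexity of one subgraph: number of nodes divided by number of edges."""
    if not edges:
        # Assumption: the edge-free case (e.g. an isolated node) is not defined
        # in the text; here such a subgraph contributes its node count.
        return float(len(nodes))
    return len(nodes) / len(edges)


def complexity_total(subgraphs):
    """Sum the complexities of all (nodes, edges) subgraphs in a subgraph set."""
    return sum(subgraph_complexity(nodes, edges) for nodes, edges in subgraphs)


g_a = [
    ({"A", "B", "C"}, [("A", "B"), ("B", "C")]),  # 3 nodes / 2 edges = 1.5
    ({"D", "E"}, [("D", "E")]),                   # 2 nodes / 1 edge  = 2.0
]
print(complexity_total(g_a))
```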
- the explanation credibility score corresponding to each node sequence is determined based on the subgraph set corresponding to each node sequence, including: obtaining a target subgraph in the subgraph set corresponding to each node sequence, wherein the target subgraph is a subgraph including a result factor; and determining the sum of the weights of all edges of the target subgraph as the explanation credibility score corresponding to each node sequence.
- The scoring strategy for explanation credibility is: if the edge weight reflects the strength of the connection and logical association between node entities, then the sum of all edge weights of a subgraph can simply be used to represent the credibility of that subgraph. The edge weights in the KG come from the KG construction process, using the relationship weights in the Bayesian class structure construction method.
- The target subgraph, that is, the subgraph including the result factor, may be obtained from the subgraph set corresponding to each node sequence by: obtaining the subgraph set corresponding to each node sequence, and selecting from it the subgraph that includes the result factor.
- w e_i represents the weight of the i-th edge in the subgraph Ga_target .
- the explanation credibility score corresponding to each node sequence is determined based on the subgraph set corresponding to each node sequence, including: obtaining a target subgraph in the subgraph set corresponding to each node sequence, wherein the target subgraph is a subgraph including a result factor; and determining the sum of the weights of the edges in the target subgraph connected to the result factor as the explanation credibility score corresponding to each node sequence.
- the method of determining the sum of the weights of the edges connected to the result factor in the target subgraph as the explanation credibility score corresponding to each node sequence can be: selecting the edges connected to the result factor in the target subgraph, calculating the sum of the weights of all the edges connected to the result factor in the target subgraph, and determining the sum of the weights of all the edges connected to the result factor as the explanation credibility score corresponding to each node sequence.
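The two credibility variants described above differ only in which edges of the target subgraph are summed. The sketch below assumes each subgraph is a dictionary with a `nodes` set and an `edges` list of `(u, v, weight)` tuples. This representation and the function names are illustrative, not taken from the application.

```python
def target_subgraph(subgraphs, result_factor):
    """Return the subgraph that contains the result factor."""
    for sg in subgraphs:
        if result_factor in sg["nodes"]:
            return sg
    raise ValueError("no subgraph contains the result factor")


def credibility_all_edges(subgraphs, result_factor):
    """Variant 1: sum of the weights of all edges of the target subgraph."""
    target = target_subgraph(subgraphs, result_factor)
    return sum(w for _u, _v, w in target["edges"])


def credibility_result_edges(subgraphs, result_factor):
    """Variant 2: sum of the weights of edges connected to the result factor."""
    target = target_subgraph(subgraphs, result_factor)
    return sum(w for u, v, w in target["edges"] if result_factor in (u, v))


g_a = [
    {"nodes": {"C", "D"}, "edges": [("C", "D", 0.125)]},
    {"nodes": {"A", "B", "Y"}, "edges": [("A", "B", 0.5), ("B", "Y", 0.25)]},
]
print(credibility_all_edges(g_a, "Y"), credibility_result_edges(g_a, "Y"))
```

The second variant ignores edges further up the causal chain, so it rewards only the strength of the final step leading to the result.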
- The explanation result pair corresponding to the XAI model to be evaluated includes the explanation results of different XAI models; after determining the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence, the method also includes: inputting the score pair into the target model to obtain the ranking order of the explanation ability of each XAI model among the different XAI models. Alternatively, the explanation result pair corresponding to the XAI model to be evaluated includes the explanation results of the same XAI model under different training batches; after determining the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence, the method also includes: inputting the score pair into the target model to obtain the ranking order of the explanation ability of each explanation result in the explanation result pair.
- the target model can be a model that takes the score pair as input and outputs the ranking result of the explanation ability of the XAI model to be evaluated. It should be noted that in order to improve accuracy and reduce errors, the explanation results are compared in a random manner, so the number of comparisons of different XAI models is not the same. The target model is established to obtain the ranking order based on the unbalanced comparison results.
- Inputting the score pairs into the target model to obtain the ranking order may proceed as follows: if the XAI models to be evaluated are two different XAI models, the score pairs of the explanation result pairs corresponding to the two models are input into the target model to obtain the ranking order of the explanatory power of each XAI model; if they are the same XAI model in different training batches, the score pairs of the explanation result pairs from the different training batches are input into the target model to obtain the ranking order of the explanatory power of each explanation result.
- the output is the ranking result of the explanatory power of different XAI models, recorded as ⁇ ... ⁇ i ... ⁇ j ... ⁇ , where ⁇ i is the ranking order of the explanatory power of the i-th XAI model, and its order in the ranking result is the ranking order of its explanatory power.
- The implementation method is: first, the comparison results stored in the result set array Rsc[] are converted into an n × n matrix A storing the comparison results, where the element a_ij at position (i, j) is the number of times the i-th XAI model outperforms the j-th XAI model; second, the value of the ranking order ⁇ i of the explanatory power of each model is calculated by arbitrarily specifying one ⁇ i as the explanatory power benchmark value 1 and then solving ArgMax(L), where the calculation expression of L is as follows:
- ⁇ j is the ranking order of the explanation ability of the jth XAI model.
- the value of the explanatory power of each model can be obtained and sorted, and the sorted models are stored as the model's explanatory power sequence ⁇ i ... ⁇ j ... ⁇ , where ⁇ i corresponds to the i-th explanatory model MODEL i ranked first in the explanatory power ranking.
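The steps above (a win-count matrix, one model pinned to the benchmark value 1, maximizing a likelihood L) match the shape of a Bradley-Terry fit. The sketch below assumes that L is the Bradley-Terry log-likelihood, which is an assumption, since the expression for L is not given above. It also assumes that each stored pair (a, b) uses 1-based model labels with the winner first. The function names are hypothetical, and the solver is the standard minorization-maximization iteration for Bradley-Terry models.

```python
def win_matrix(results, n):
    """Build A where A[i][j] counts how often model i beat model j.

    Assumption: each pair (a, b) holds 1-based labels with the winner first.
    """
    A = [[0] * n for _ in range(n)]
    for a, b in results:
        A[a - 1][b - 1] += 1
    return A


def bradley_terry(A, iterations=200):
    """Estimate strengths by MM iteration, pinning model 0 to the benchmark value 1."""
    n = len(A)
    if sum(A[0]) == 0:
        raise ValueError("the benchmark model needs at least one win")
    p = [1.0] * n
    for _ in range(iterations):
        new_p = []
        for i in range(n):
            wins = sum(A[i])
            # Only pairs that were actually compared contribute to the update.
            denom = sum(
                (A[i][j] + A[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i and A[i][j] + A[j][i]
            )
            new_p.append(wins / denom if denom else p[i])
        p = [v / new_p[0] for v in new_p]  # rescale so the benchmark stays at 1
    return p


rsc = [(1, 2), (1, 2), (1, 2), (2, 1)]  # model 1 wins 3 of 4 rounds
strengths = bradley_terry(win_matrix(rsc, 2))
ranking = sorted(range(len(strengths)), key=lambda i: -strengths[i])
print(strengths, ranking)
```

Because the fit works on win counts per pair rather than on a fixed schedule, it tolerates models having been compared an unequal number of times, which is the situation random pairing produces.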
- the ranking order of the explanatory power of each XAI model, or the ranking order of the explanatory power of each explanation result can be obtained.
- the sample library of the recommendation business department of an Internet company H stores 10,000
- the data of a user is Data[].
- the department wants to add explanations to the original recommendation system to improve the user's acceptance of the recommended products.
- FIG2 is a schematic diagram of a subgraph set Ga [] obtained from a matching result in Example 1 of the present application. As shown in FIG2 , there are 2 subgraphs in Ga [].
- FIG3 is a schematic diagram of a subgraph set G b [] obtained from a matching result in the first embodiment of the present application. As shown in FIG3 , there are 3 subgraphs in G b [].
- the calculated result is accurate to five decimal places and is used to measure the explanation complexity score of G a [].
- the subgraph Ga_2 [] in Figure 2 is the subgraph where the result factor Y is located.
- The total score is calculated from the explanation coherence score, explanation complexity score, and explanation credibility score of G_a[], with the standard() function taken to be the logistic function; the total score of G_b[] is calculated in the same way. With the results accurate to five decimal places, the comparison shows that MODEL 1 outperforms MODEL 2 in this round.
- the technical solution of this embodiment obtains the explanation result pair corresponding to the XAI model to be evaluated; obtains the node sequence pair matching the explanation result pair in the knowledge graph; determines the subgraph set corresponding to each node sequence according to the node sequence pair and the knowledge graph; and determines the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence. It solves the problems of traditional XAI evaluation and selection methods being time-consuming and labor-intensive, lengthy processes, and low evaluation effectiveness, and can improve the accuracy and efficiency of the evaluation and selection methods and ensure the effectiveness of the evaluation data.
- the XAI model evaluation device includes: an explanation result pair acquisition module 210, a node sequence pair acquisition module 220, a subgraph set determination module 230, and a score pair determination module 240.
- the explanation result pair acquisition module 210 is configured to obtain the explanation result pair corresponding to the XAI model to be evaluated;
- the node sequence pair acquisition module 220 is configured to obtain the node sequence pair matching the explanation result pair in the knowledge graph;
- the subgraph set determination module 230 is configured to determine the subgraph set corresponding to each node sequence based on the node sequence pair and the knowledge graph;
- the scoring pair determination module 240 is configured to determine the scoring pair corresponding to the explanation result pair based on the subgraph set corresponding to each node sequence.
- the scoring pair determination module is configured to: Determine the explanation coherence score, explanation complexity score and explanation credibility score corresponding to each node sequence; determine the target score corresponding to each node sequence according to the explanation coherence score, explanation complexity score and explanation credibility score corresponding to each node sequence; determine the score pair corresponding to the explanation result pair according to the target score corresponding to each node sequence.
- the score pair determination module is configured to: obtain the number of subgraphs in a subgraph set corresponding to each node sequence; and determine an explanation coherence score corresponding to each node sequence according to the number of subgraphs in the subgraph set.
- the score pair determination module is configured to: determine the complexity of each subgraph based on the number of nodes in each subgraph in the subgraph set corresponding to each node sequence and the number of edges in each subgraph; and determine the sum of the complexities of all subgraphs in the subgraph set corresponding to each node sequence as the explanation complexity score corresponding to each node sequence.
- the score pair determination module is configured to: obtain a target subgraph from a subgraph set corresponding to each node sequence, wherein the target subgraph is a subgraph including a result factor; and determine the sum of the weights of all edges of the target subgraph as the explanation credibility score corresponding to each node sequence.
- the score pair determination module is configured to: obtain a target subgraph from a subgraph set corresponding to each node sequence, wherein the target subgraph is a subgraph including a result factor; and determine the sum of the weights of the edges in the target subgraph connected to the result factor as the explanation credibility score corresponding to each node sequence.
- the device also includes: a ranking acquisition module, configured to input the score pair into the target model to obtain the ranking order of the explanatory power of each XAI model to be evaluated among the different XAI models, or to input the score pair into the target model to obtain the ranking order of the explanatory power of each explanation result in the explanation result pair.
- the above-mentioned product can execute the method provided by any embodiment of the present application and has the corresponding functional modules for executing the method.
- the technical solution of this embodiment obtains the explanation result pair corresponding to the XAI model to be evaluated; obtains the node sequence pair matching the explanation result pair in the knowledge graph; determines the subgraph set corresponding to each node sequence according to the node sequence pair and the knowledge graph; and determines the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence.
- FIG5 is a schematic diagram of the structure of an electronic device in the third embodiment of the present application.
- the electronic device 10 is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
- the electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (such as helmets, glasses, watches, etc.) and other similar computing devices.
- the components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementation of the present application described and/or claimed herein.
- the electronic device 10 includes at least one processor 11, and a memory connected to the at least one processor 11, such as a read-only memory (ROM) 12, a random access memory (RAM) 13, etc., wherein the memory stores a computer program that can be executed by at least one processor, and the processor 11 can perform a variety of appropriate actions and processes according to the computer program stored in the ROM 12 or the computer program loaded from the storage unit 18 to the RAM 13.
- the RAM 13 can also store a variety of programs and data required for the operation of the electronic device 10.
- the processor 11, the ROM 12, and the RAM 13 are connected to each other through a bus 14.
- An input/output (I/O) interface 15 is also connected to the bus 14.
- a number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16, such as a keyboard, a mouse, etc.; an output unit 17, such as various types of displays, speakers, etc.; a storage unit 18, such as a disk, an optical disk, etc.; and a communication unit 19, such as a network card, a modem, a wireless communication transceiver, etc.
- the communication unit 19 allows the electronic device 10 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
- the processor 11 may be a variety of general and/or special processing components with processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), a variety of dedicated artificial intelligence (AI) computing chips, a variety of processors running machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, etc.
- the processor 11 executes the multiple methods and processes described above, such as the XAI model evaluation method.
- the XAI model evaluation method may be implemented as a computer program, which is tangibly contained in a computer-readable storage medium, such as a storage unit 18.
- part or all of the computer program may be loaded and/or installed on the electronic device 10 via the ROM 12 and/or the communication unit 19.
- the processor 11 may be configured to perform the XAI model evaluation method in any other appropriate manner (e.g., by means of firmware).
- Various implementations of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard parts (ASSPs), systems on chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or a combination thereof.
- These various implementations may include: being implemented in one or more computer programs, which may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general programmable processor, which may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
- the computer programs for implementing the methods of the present application may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, so that when the computer programs are executed by the processor, the functions/operations specified in the flow charts and/or block diagrams are implemented.
- the computer programs may be executed entirely on the machine, partially on the machine, as a stand-alone software package partially on the machine and partially on a remote machine, or entirely on a remote machine or server.
- a computer readable storage medium may be a tangible medium that may contain or store a computer program for use by or in conjunction with an instruction execution system, device, or apparatus.
- a computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
- a computer readable storage medium may be a machine readable signal medium.
- machine readable storage media may include an electrical connection based on one or more wires, a portable computer disk, a hard disk, RAM, ROM, an erasable programmable read-only memory (EPROM) or flash memory, optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
- the systems and techniques described herein may be implemented on an electronic device having: a display device (e.g., a cathode ray tube (CRT) or a liquid crystal display (LCD) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) through which the user can provide input to the electronic device.
- Other types of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form (including acoustic input, voice input, or tactile input).
- the systems and techniques described herein can be implemented in a computing system that includes a backend component (e.g., as a data server), or a computing system that includes a middleware component (e.g., an application server), or a computing system that includes a frontend component (e.g., a user computer with a graphical user interface or a web browser through which a user can interact with embodiments of the systems and techniques described herein), or a computing system including any combination of such backend components, middleware components, or frontend components.
- the components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: a Local Area Network (LAN
- a computing system may include a client and a server.
- the client and the server are generally remote from each other and usually interact through a communication network.
- the client and server relationship is generated by computer programs running on the respective computers and having a client-server relationship with each other.
- the server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the difficult management and weak business scalability of traditional physical hosts and virtual private server (VPS) services.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present application discloses an XAI model evaluation method and apparatus, a device and a storage medium. The method comprises: obtaining an explanation result pair corresponding to an XAI model to be evaluated; obtaining a node sequence pair matching the explanation result pair in a knowledge graph; according to the node sequence pair and the knowledge graph, determining a sub-graph set corresponding to each node sequence in the node sequence pair; and according to the sub-graph set corresponding to each node sequence, determining a corresponding score pair corresponding to the explanation result pair.
Description
This application claims priority to Chinese patent application No. 202211404037.1, filed with the China Patent Office on November 10, 2022, the entire contents of which are incorporated herein by reference.
Embodiments of the present application relate to the field of artificial intelligence technology, for example, to an explainable artificial intelligence (XAI) model evaluation method, apparatus, device, and medium.
With the large-scale application of artificial intelligence (AI), trusted AI technology has received increasing attention, and XAI is an important branch of it. As with traditional machine learning models, XAI algorithms and models must be selected from a set of candidates. However, traditional model selection methods do not carry over to XAI models: evaluating an XAI model covers not only model accuracy but also the form of expression and comprehensibility of its explanation results, as well as the comparative explanatory power of different explanation methods applied to the same result. Because such evaluation often depends on the cognitive ability of the evaluator, a unified standard or general method is difficult to establish. Among studies that evaluate the effect on users, most rely on subjective measurement, in which people manually score the explanations of an XAI model against given indicators. Some studies measure both the subjective ease of use of an explanation and the ability of participants to draw correct inferences from it, which makes it possible to distinguish the behavioral effect of an explanation from its self-perceived effect and, to some extent, underlines the value of objective measurement.
Therefore, XAI evaluation methods in the related art have significant shortcomings. On the one hand, they lack a standard quantitative metric, and manual evaluation consumes substantial resources while offering low accuracy and efficiency. On the other hand, user evaluations of a model are easily influenced by many factors (for example, subjective factors), so the validity of the evaluation data cannot be guaranteed.
Summary of the invention
The embodiments of the present application provide an XAI model evaluation method, apparatus, device and storage medium, which address the problems that traditional XAI evaluation and selection methods are time-consuming, labor-intensive, lengthy, and of low evaluation validity.
The present application provides an XAI model evaluation method, including: obtaining an explanation result pair corresponding to the XAI model to be evaluated; obtaining a node sequence pair in a knowledge graph that matches the explanation result pair; determining, based on the node sequence pair and the knowledge graph, a subgraph set corresponding to each node sequence in the node sequence pair; and determining a score pair corresponding to the explanation result pair based on the subgraph set corresponding to each node sequence.
The present application provides an XAI model evaluation apparatus, which includes: an explanation result pair acquisition module, configured to obtain an explanation result pair corresponding to the XAI model to be evaluated; a node sequence pair acquisition module, configured to obtain a node sequence pair in a knowledge graph that matches the explanation result pair; a subgraph set determination module, configured to determine, based on the node sequence pair and the knowledge graph, a subgraph set corresponding to each node sequence in the node sequence pair; and a score pair determination module, configured to determine a score pair corresponding to the explanation result pair based on the subgraph set corresponding to each node sequence.
The present application provides an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can execute the XAI model evaluation method described in any embodiment of the present application.
The present application provides a computer-readable storage medium storing computer instructions which, when executed by a processor, cause the processor to implement the XAI model evaluation method described in any embodiment of the present application.
The drawings used in the embodiments are briefly introduced below.
FIG. 1 is a flowchart of an XAI model evaluation method in Embodiment 1 of the present application;
FIG. 2 is a schematic diagram of a subgraph set G_a[] obtained from a matching result in Embodiment 1 of the present application;
FIG. 3 is a schematic diagram of a subgraph set G_b[] obtained from a matching result in Embodiment 1 of the present application;
FIG. 4 is a schematic diagram of the structure of an XAI model evaluation apparatus in Embodiment 2 of the present application;
FIG. 5 is a schematic diagram of the structure of an electronic device in Embodiment 3 of the present application.
To help those skilled in the art understand the solution of the present application, the technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present application are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present application described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.
Embodiment 1
FIG. 1 is a flowchart of an XAI model evaluation method in Embodiment 1 of the present application. This embodiment is applicable to a comprehensive, ranking-based evaluation of the explainability performance of XAI models over multiple random comparison tests. The method may be performed by the XAI model evaluation apparatus in the embodiments of the present application, which may be implemented in software and/or hardware. As shown in FIG. 1, the method includes the following steps.
S110, obtaining an explanation result pair corresponding to the XAI model to be evaluated.
The explanation result pair corresponding to the XAI model to be evaluated may be a pair of explanation results of the same XAI model under different training batches, or a pair of explanation results of different XAI models.
The explanation result pair may be obtained by randomly selecting two explanation results from the explanation results of the same XAI model under different training batches. Alternatively, it may be obtained by randomly selecting two explanation results from the explanation results of different XAI models.
The atomic representation of an explanation result may be E = [X_1 ... X_n → Y], referred to as an atomic explanation result, where X is a cause factor, Y is the result factor, and n is the number of cause factors. An explanation result pair can therefore be expressed as E_pair = {E_a[], E_b[]}, where E_a[] and E_b[] are the atomic explanation result sequences with subscripts a and b, respectively. It should be noted that, based on the definition and combination of atomic explanation results and atomic explanation result sequences, a variety of complex explanations can be expressed flexibly, such as one result with one cause, one result with multiple causes, multiple results with one cause, and multiple results with multiple causes.
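The representation above can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the class name `AtomicExplanation`, the helper `pick_pair`, the model labels, and the example factors are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AtomicExplanation:
    """One atomic explanation result E = [X_1 ... X_n -> Y] (hypothetical type)."""
    causes: Tuple[str, ...]  # cause factors X_1 ... X_n
    result: str              # result factor Y


# An explanation result is a sequence of atomic explanation results.
Explanation = List[AtomicExplanation]


def pick_pair(candidates: Dict[str, Explanation], rng: random.Random):
    """Randomly select two labelled explanation results to form E_pair."""
    a, b = rng.sample(sorted(candidates), 2)
    return (a, candidates[a]), (b, candidates[b])


# Made-up explanation results for three hypothetical models.
candidates = {
    "model_1": [AtomicExplanation(("age", "income"), "loan_default")],
    "model_2": [
        AtomicExplanation(("income",), "credit_score"),
        AtomicExplanation(("credit_score",), "loan_default"),
    ],
    "model_3": [AtomicExplanation(("zip_code",), "loan_default")],
}
(label_a, e_a), (label_b, e_b) = pick_pair(candidates, random.Random(0))
```

A sequence with two atomic results, as for `model_2`, expresses a chained explanation, which is one way the "multiple results, multiple causes" case can be composed.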
S120, obtaining a node sequence pair in the knowledge graph that matches the explanation result pair.
The knowledge graph is abbreviated as KG. Matching may be performed against the nodes of the KG or against the attributes of the KG, which is not limited here. The node sequence pair is the result obtained by matching the explanation result pair against the KG.
The node sequence pair may be obtained as follows: obtain the explanation result pair, match it against the node sequence and attribute sequence of the KG, and, through KG semantic retrieval and matching, obtain the node sequence pair that matches the explanation result pair.
The nodes of the KG are denoted V, the edges E, and the node attributes A; the node sequence of the KG is V[] and the attribute sequence is A[]. First, if the factors in E_a[] and E_b[] carry no semantic information, the factors are replaced with field names according to the metadata of the data, so that E_a[] and E_b[] become sequences with semantic information, E'_a[] and E'_b[]. Here, the metadata of the data is the basic information held in the data processing system or data set management system. Next, the semantic retrieval interface of the KG is called to query and match the semantic factors of E'_a[] and E'_b[], which yields the node sequence pair corresponding to E_pair in the KG, V_pair = {V_a[], V_b[]}. To improve matching efficiency, a reverse traversal search from the final result toward the original cause is used. For a result sequence I[] with factors i_1 ... i_n, the matching proceeds as follows: (1) call the KG semantic retrieval interface to match factor i_n in the KG, the matched node being V_n; (2) starting from V_n, match factor i_{n-1} within a path depth of D; if the match succeeds, repeat this sub-step starting from the node matched for i_{n-1}; if the match fails, match factor i_{n-2} within a depth of (number of consecutive failed matches + 1) * D, that is, 2 * D; continue by this rule until all factors of the result sequence I[] have been processed; (3) output the matching result V[] = {... V_i ...} of the result sequence I[].
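The reverse traversal can be sketched as below. This sketch rests on assumptions the source does not state: the KG is stored as an undirected adjacency dictionary, the semantic retrieval interface is replaced by exact label equality, and the names `nodes_within` and `match_reverse` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def nodes_within(kg: Dict[str, Set[str]], start: str, depth: int) -> Set[str]:
    """Return every node reachable from `start` in at most `depth` hops (BFS)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue
        for nxt in kg.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen


def match_reverse(kg: Dict[str, Set[str]], factors: List[str],
                  depth: int) -> List[Optional[str]]:
    """Match factors from the final result back toward the original cause.

    Returns one entry per factor: the matched node, or None if unmatched.
    """
    matched: List[Optional[str]] = [None] * len(factors)
    last = factors[-1]
    if last not in kg:  # stand-in for the KG semantic retrieval call (assumption)
        return matched
    matched[-1] = last
    anchor, misses = last, 0
    for idx in range(len(factors) - 2, -1, -1):
        # The search radius grows with the number of consecutive failed matches.
        reach = nodes_within(kg, anchor, (misses + 1) * depth)
        if factors[idx] in reach:
            matched[idx] = factors[idx]
            anchor, misses = factors[idx], 0
        else:
            misses += 1
    return matched


# Made-up example graph.
kg = {
    "smoking": {"tar"},
    "tar": {"smoking", "lung_damage"},
    "lung_damage": {"tar", "cancer"},
    "cancer": {"lung_damage"},
}
v = match_reverse(kg, ["smoking", "unknown", "lung_damage", "cancer"], depth=2)
```

In this example the factor "unknown" has no node in the graph, so the search for "smoking" runs with radius 2 * D from the last matched node, as described in step (2).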
It should be noted that, because the structure and size of the KG may differ between practical applications, the search depth can be adjusted as needed when executing the reverse traversal search for node matching.
S130, determining, according to the node sequence pair and the knowledge graph, a subgraph set corresponding to each node sequence in the node sequence pair.
A subgraph set may contain one or more subgraphs and represents the path corresponding to a node sequence in the KG. A subgraph set with more than one subgraph indicates that the path is broken.
The subgraph set corresponding to each node sequence may be determined as follows: obtain the node sequence pair and, through the path search function of the knowledge graph, obtain the subgraph set of each node sequence.
The node sequence pair is V_pair = {V_a[], V_b[]}. Through the path search function of the KG, the path corresponding to V_pair is obtained and recorded as G_pair = {G_a[], G_b[]}. G_a[] is the path corresponding to V_a[] and consists of one or more subgraphs; that is, the subgraph set corresponding to V_a[] is G_a[]. Similarly, G_b[] is the path corresponding to V_b[]; that is, the subgraph set corresponding to V_b[] is G_b[].
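One possible reading of this step is sketched below. It assumes, beyond what the source states, that the KG path search returns a shortest path between consecutive matched nodes within a hop limit, and that the subgraph set is the set of connected components of the union of those paths. The weighted adjacency format and the names `shortest_path` and `subgraph_set` are hypothetical.

```python
from collections import deque


def shortest_path(kg, src, dst, max_hops):
    """BFS shortest path from src to dst using at most max_hops edges, else None."""
    parents = {src: None}
    frontier = deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        if d == max_hops:
            continue
        for nxt in kg.get(node, {}):
            if nxt not in parents:
                parents[nxt] = node
                frontier.append((nxt, d + 1))
    return None


def subgraph_set(kg, nodes, max_hops):
    """Split the paths linking consecutive matched nodes into connected subgraphs.

    Each subgraph is (node_set, edge_dict); edge weights are copied from the KG.
    """
    adj = {n: set() for n in nodes}
    for u, v in zip(nodes, nodes[1:]):
        path = shortest_path(kg, u, v, max_hops)
        if path is None:  # no path found: the chain is broken here
            continue
        for x, y in zip(path, path[1:]):
            adj.setdefault(x, set()).add(y)
            adj.setdefault(y, set()).add(x)
    result, seen = [], set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:  # depth-first walk of one connected component
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        # Keep each undirected edge once, keyed by the sorted node pair.
        edges = {(x, y): kg[x][y] for x in comp for y in adj[x] if x < y}
        result.append((comp, edges))
    return result


# Made-up weighted, undirected KG: kg[u][v] is the weight of edge u-v.
kg = {
    "smoking": {"tar": 0.9},
    "tar": {"smoking": 0.9, "lung_damage": 0.8},
    "lung_damage": {"tar": 0.8, "cancer": 0.7},
    "cancer": {"lung_damage": 0.7},
    "diet": {},
}
g_a = subgraph_set(kg, ["diet", "smoking", "cancer"], max_hops=3)
```

Here "diet" has no path to "smoking", so the set contains two subgraphs, which corresponds to the broken-path case described above. The sketch assumes the node list contains only matched nodes (no `None` entries).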
S140, determining a score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence.
The score pair corresponding to the explanation result pair is the pair of total scores calculated from the explanation coherence score, explanation complexity score and explanation credibility score of the subgraph set corresponding to each node sequence.
The score pair may be determined as follows: obtain the subgraph set corresponding to each node sequence; from that subgraph set, determine the scores for the coherence, complexity and credibility of the explanation of each node sequence; from those three scores, calculate the total score of the subgraph set corresponding to each node sequence; and thereby determine the score pair corresponding to the explanation result pair.
Optionally, determining the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence includes: determining an explanation coherence score, an explanation complexity score and an explanation credibility score corresponding to each node sequence according to the subgraph set corresponding to each node sequence; determining a target score corresponding to each node sequence according to the explanation coherence score, the explanation complexity score and the explanation credibility score corresponding to each node sequence; and determining the score pair corresponding to the explanation result pair according to the target score corresponding to each node sequence.
The target score is the total score calculated from the explanation coherence score, explanation complexity score and explanation credibility score corresponding to each node sequence.
The three scores may be determined as follows: obtain the subgraph set corresponding to each node sequence; obtain the explanation coherence score by measuring the number of subgraphs in the subgraph set; obtain the explanation complexity score by measuring the number of nodes and edges of each subgraph in the subgraph set; and obtain the explanation credibility score by calculating the sum of the edge weights of all subgraphs in the subgraph set.
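The raw measurements behind the three scores can be sketched as follows. Several choices here are assumptions rather than the source's definitions: per-subgraph complexity is taken as nodes plus edges (the source only says it is based on both counts), credibility follows the variant that sums the edge weights of the subgraph containing the result factor, and how the raw values are standardized and combined into the target score is left out because the text does not give it. The function names are hypothetical.

```python
def coherence_raw(subgraphs):
    """Number of subgraphs in the set; one subgraph means an unbroken path."""
    return len(subgraphs)


def complexity_raw(subgraphs):
    """Sum over subgraphs of (node count + edge count); the sum form is assumed."""
    return sum(len(nodes) + len(edges) for nodes, edges in subgraphs)


def credibility_raw(subgraphs, result_factor):
    """Sum of edge weights in the target subgraph containing the result factor."""
    for nodes, edges in subgraphs:
        if result_factor in nodes:
            return sum(edges.values())
    return 0.0  # result factor not present in any subgraph


# Made-up subgraph set: each subgraph is (node_set, {edge: weight}).
g_a = [
    ({"diet"}, {}),
    (
        {"smoking", "tar", "lung_damage", "cancer"},
        {
            ("smoking", "tar"): 0.9,
            ("lung_damage", "tar"): 0.8,
            ("cancer", "lung_damage"): 0.7,
        },
    ),
]
raw = (coherence_raw(g_a), complexity_raw(g_a), credibility_raw(g_a, "cancer"))
```

For this set the coherence measurement is 2 subgraphs and the complexity measurement is 8 (five nodes plus three edges); the credibility measurement is the sum of the three edge weights in the subgraph containing "cancer".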
The target score corresponding to each node sequence may be determined by calculating the total score of each node sequence from its explanation coherence score, explanation complexity score and explanation credibility score.
计算每个节点序列的总评分的方式可以为:以Ga[]为例,根据Ga[]中子图的个数确定解释连贯性评分Ssplit,根据Ga[]的解释复杂性确定Ga[]对应的解释复杂性评分Scomplexity_a_total,根据包含结果因子的子图Ga_target的多个边权重确定解释可信度评分Scredit,计算总评分为:
The total score of each node sequence can be calculated by taking Ga [] as an example, determining the explanation coherence score S split according to the number of subgraphs in Ga [], determining the explanation complexity score S complexity_a_total corresponding to Ga [] according to the explanation complexity of Ga [], determining the explanation credibility score S credit according to the multiple edge weights of the subgraph Ga_target containing the result factor, and calculating the total score as follows:
The total score of each node sequence can be calculated by taking Ga [] as an example, determining the explanation coherence score S split according to the number of subgraphs in Ga [], determining the explanation complexity score S complexity_a_total corresponding to Ga [] according to the explanation complexity of Ga [], determining the explanation credibility score S credit according to the multiple edge weights of the subgraph Ga_target containing the result factor, and calculating the total score as follows:
Here standard() is a data standardization function, which is not described further.
The score pair corresponding to the explanation result pair may be determined from the target score of each node sequence as follows: the total score corresponding to each node sequence is determined according to its explanation coherence score, explanation complexity score, and explanation credibility score, and the two total scores together form the total score pair, which is the score pair corresponding to the explanation result pair.
It should be noted that, after the score pair is obtained, the result of the current round is stored in Rsc[i]. A single comparison result may be represented as an ordered pair of positive integers (a, b), where a and b are the labels of the models selected for evaluation and i is the index of the current comparison round. All comparison results are stored in the result set Rsc[], whose size is the hyperparameter N, the total number of comparison rounds. Steps S110 to S140 are repeated until the number of comparison rounds reaches N.
Optionally, determining the explanation coherence score corresponding to each node sequence according to the subgraph set corresponding to each node sequence includes: obtaining the number of subgraphs in the subgraph set corresponding to each node sequence; and determining the explanation coherence score corresponding to each node sequence according to the number of subgraphs in the subgraph set.
The scoring strategy for explanation coherence is as follows: the reasoning behind a logically sound explanation result should be coherent, so the nodes in the KG that correspond to its cause factors should form a single connected subgraph. The more mutually disconnected subgraphs there are, the less coherent the explanation result is considered to be.
The number of subgraphs in the subgraph set corresponding to each node sequence may be obtained by first obtaining the subgraph set corresponding to each node sequence and then counting the subgraphs in that set.
The explanation coherence score corresponding to each node sequence may be determined from the number of subgraphs in the subgraph set as follows: obtain the number of subgraphs in the subgraph set; the explanation coherence score corresponding to the node sequence is the reciprocal of that number. For example, taking G_a[] as an example, the number of subgraphs in G_a[] is |G_a[]|, and the explanation coherence score is:

$$S_{split} = \frac{1}{|G_a[]|}$$
That is, the explanation coherence score is the reciprocal of the number of subgraphs.
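The reciprocal rule above can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: the function name `coherence_score` and the representation of a subgraph set as a list of node-name sets are assumptions made for the example. The two sets are the matching results G_a[] and G_b[] given in the worked example of this embodiment.

```python
def coherence_score(subgraph_set):
    """Return 1 / (number of subgraphs) for one node sequence's subgraph set."""
    if not subgraph_set:
        raise ValueError("subgraph set must contain at least one subgraph")
    return 1.0 / len(subgraph_set)


# Subgraph sets written as lists of node-name sets (assumed representation).
G_a = [{"X1", "X2"}, {"X3", "X4", "X5", "X6", "Y"}]
G_b = [{"X1", "X2"}, {"X3", "X4"}, {"X5", "Y"}]

s_split_a = coherence_score(G_a)  # two subgraphs
s_split_b = coherence_score(G_b)  # three subgraphs
```

Fewer disconnected subgraphs give a higher score, so G_a[] is rated as more coherent than G_b[] under this rule.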
Optionally, determining the explanation complexity score corresponding to each node sequence according to the subgraph set corresponding to each node sequence includes: determining the complexity of each subgraph according to the number of nodes and the number of edges of each subgraph in the subgraph set corresponding to each node sequence; and determining the sum of the complexities of all subgraphs in the subgraph set corresponding to each node sequence as the explanation complexity score corresponding to that node sequence.
The scoring strategy for explanation complexity is as follows: the smaller the ratio of the number of nodes to the number of edges in a subgraph, the more effective, concise, and convincing the explanation. An edge is the connection between two nodes of a subgraph.
The complexity of each subgraph may be determined from its number of nodes and number of edges as follows: obtain each subgraph in the subgraph set corresponding to each node sequence, obtain the number of nodes and the number of edges of each subgraph, and obtain the complexity of each subgraph from the ratio of its number of nodes to its number of edges.
The sum of the complexities of all subgraphs in the subgraph set corresponding to each node sequence may be determined as the explanation complexity score of that node sequence as follows: obtain the complexity of each subgraph in the subgraph set, and take the sum of these complexities as the explanation complexity score. For example, taking G_a[] as an example, each subgraph in G_a[] is represented by a triple consisting of a node set V, an edge set E, and a set of edge weights. For the i-th subgraph G_a_i of G_a[], with its corresponding triple, the complexity of the subgraph is measured as:
Here |V_a_i| is the number of nodes in subgraph G_a_i, |E_a_i| is the number of edges in G_a_i (an edge shared by multiple paths is counted multiple times), and S_complexity_a_i is the complexity score of subgraph G_a_i. The total score of the subgraph set G_a[] is obtained by adding the scores of its subgraphs:

$$S_{complexity\_a\_total} = \sum_{i} S_{complexity\_a\_i}$$
S_complexity_a_total is the total explanation complexity score of G_a[].
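The summation over subgraphs can be sketched as follows. The per-subgraph formula is not reproduced in this text, so the sketch assumes, based only on the stated strategy, that the complexity of a subgraph is the plain ratio of its node count to its edge count; the actual expression may differ. The function names and the sample subgraphs are hypothetical.

```python
def subgraph_complexity(num_nodes, num_edges):
    """Assumed per-subgraph measure: number of nodes divided by number of edges."""
    if num_edges <= 0:
        raise ValueError("a subgraph needs at least one edge for this ratio")
    return num_nodes / num_edges


def complexity_score(subgraphs):
    """Sum the per-subgraph complexities of one subgraph set.

    Each subgraph is a (nodes, edges) pair. Edges are kept in a list so that
    an edge shared by several paths can appear, and be counted, more than once.
    """
    return sum(subgraph_complexity(len(nodes), len(edges))
               for nodes, edges in subgraphs)


# Hypothetical subgraph set: the edge ("C", "D") is shared by two paths.
example = [
    ({"A", "B"}, [("A", "B")]),
    ({"C", "D", "E"}, [("C", "D"), ("D", "E"), ("C", "D")]),
]
total = complexity_score(example)  # 2/1 + 3/3
```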
Optionally, determining the explanation credibility score corresponding to each node sequence according to the subgraph set corresponding to each node sequence includes: obtaining a target subgraph in the subgraph set corresponding to each node sequence, where the target subgraph is the subgraph that includes the result factor; and determining the sum of the weights of all edges of the target subgraph as the explanation credibility score corresponding to that node sequence.
The scoring strategy for explanation credibility is as follows: if edge weights reflect the strength of the connection and of the logical association between node entities, the sum of all edge weights of a subgraph can simply be used to represent the credibility of that subgraph. The edge weights in the KG come from the KG construction process and are the relationship weights of a Bayesian-type structure construction method.
The target subgraph, that is, the subgraph including the result factor, may be obtained as follows: obtain the subgraph set corresponding to each node sequence, and select from it the subgraph that includes the result factor.
The sum of the weights of all edges of the target subgraph may be determined as the explanation credibility score of each node sequence as follows: select the subgraph including the result factor from the subgraph set corresponding to the node sequence, and take the sum of the weights of all edges in the selected subgraph as the explanation credibility score. For example, taking G_a[] as an example, the subgraph containing the result factor is selected from G_a[] and denoted G_a_target, and the sum of the weights of all edges in G_a_target is calculated:

$$S_{credit} = \sum_{i} w_{e\_i}$$
Here w_e_i is the weight of the i-th edge in subgraph G_a_target.
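A minimal sketch of this credibility score is given below. The dictionary layout for a subgraph (a `nodes` set plus a `weights` mapping from edge to weight), the function name, and the weight values are all assumptions chosen for illustration; they are not taken from the application.

```python
def credibility_score(subgraphs, result_factor):
    """Sum all edge weights of the subgraph that contains the result factor."""
    for subgraph in subgraphs:
        if result_factor in subgraph["nodes"]:
            return sum(subgraph["weights"].values())
    raise ValueError("no subgraph contains the result factor")


# Hypothetical subgraph set with made-up edge weights.
subgraph_set = [
    {"nodes": {"X1", "X2"}, "weights": {("X1", "X2"): 4.0}},
    {"nodes": {"X5", "X6", "Y"},
     "weights": {("X5", "X6"): 0.5, ("X6", "Y"): 1.25, ("X5", "Y"): 2.0}},
]
s_credit = credibility_score(subgraph_set, "Y")  # 0.5 + 1.25 + 2.0
```

Only the target subgraph contributes; the weight of the edge between X1 and X2 is ignored because that subgraph does not contain the result factor.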
Optionally, determining the explanation credibility score corresponding to each node sequence according to the subgraph set corresponding to each node sequence includes: obtaining a target subgraph in the subgraph set corresponding to each node sequence, where the target subgraph is the subgraph that includes the result factor; and determining the sum of the weights of the edges in the target subgraph that are connected to the result factor as the explanation credibility score corresponding to that node sequence.
In this case the explanation credibility score may be determined as follows: select the edges in the target subgraph that are connected to the result factor, calculate the sum of the weights of all such edges, and take that sum as the explanation credibility score corresponding to the node sequence.
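This variant differs from the previous one only in which edges are summed. The sketch below assumes the target subgraph has already been selected and is given as a mapping from undirected edges to weights; the function name and the weights are hypothetical.

```python
def credibility_score_incident(weights, result_factor):
    """Sum the weights of the edges that touch the result factor."""
    return sum(weight
               for (u, v), weight in weights.items()
               if result_factor in (u, v))


# Hypothetical target subgraph: two of its three edges touch Y.
target_weights = {("X5", "Y"): 1.5, ("X6", "Y"): 2.0, ("X5", "X6"): 0.25}
s_credit_incident = credibility_score_incident(target_weights, "Y")  # 1.5 + 2.0
```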
It should be noted that, when determining the score pair corresponding to the explanation result pair, a person skilled in the art may adjust the calculation and combination methods of the scoring strategy according to the evaluation scenario and the practical utility of the scores.
Optionally, the explanation results corresponding to the XAI models to be evaluated include explanation results of different XAI models, and after the score pair corresponding to the explanation result pair is determined according to the subgraph set corresponding to each node sequence, the method further includes: inputting the score pair into a target model to obtain the ranking of the explanatory power of each of the different XAI models. Alternatively, the explanation results corresponding to the XAI model to be evaluated include explanation results of the same XAI model under different training batches, and after the score pair corresponding to the explanation result pair is determined according to the subgraph set corresponding to each node sequence, the method further includes: inputting the score pair into the target model to obtain the ranking of the explanatory power of each explanation result in the explanation result pair.
The target model may be a model whose input is the score pairs and whose output is the ranking of the explanatory power of the XAI models to be evaluated. It should be noted that, to improve accuracy and reduce error, the explanation result pairs are compared in random order, so different XAI models take part in different numbers of comparisons. The target model is established to derive a ranking from these unbalanced comparison results.
The score pairs may be input into the target model to obtain the ranking of the explanatory power of each XAI model to be evaluated, or of each explanation result, as follows: if the XAI models initially selected for evaluation are two different XAI models, the score pairs of the explanation result pairs corresponding to the two models are input into the target model to obtain the ranking of the explanatory power of each XAI model; if the models initially selected are the same XAI model in different training batches, the score pairs of the explanation result pairs from the different training batches are input into the target model to obtain the ranking of the explanatory power of each explanation result.
For example, the input of the target model may be the result set Rsc[] in which the score pairs are stored, and the output may be the ranking of the explanatory power of the different XAI models, written as {...α_i...α_j...}, where α_i characterizes the explanatory power of the i-th XAI model and its position in the sorted result is that model's rank. The implementation is as follows. First, the comparison results stored in the result set array Rsc[] are converted into an n×n matrix A of comparison results, where the element a_ij at position (i, j) is the number of times the i-th XAI model outperforms the j-th XAI model. Second, the value α_i of each model is calculated by arbitrarily fixing one α_i to the benchmark value 1 and then solving ArgMax(L), where L is given by:
Here α_j is the corresponding value for the j-th XAI model.
It should be noted that, with equal weights, for any j the sum over i of the terms of this loss function is a constant minus the rank-sum statistic. The rank-sum test is a non-parametric test that achieves the highest statistical power and is proportional to the common area under the curve (AUC) metric. Optimizing this loss function can therefore be understood as optimizing a weighted sum of AUCs, which makes the approach highly extensible and practical. After ArgMax(L) is solved, the explanatory power value of each model is obtained and sorted, and the sorted models are stored as the explanatory power sequence {α_i...α_j...}, where α_i corresponds to the i-th explanation model MODEL_i, which ranks first in explanatory power. In this way the ranking of the explanatory power of each XAI model, or of each explanation result, can be obtained.
It should be noted that a person skilled in the art may adjust how the model comparison results are stored according to actual needs, and storage is not limited to matrix form.
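The conversion of Rsc[] into the win-count matrix A can be sketched as below. The text states only that each comparison result is an ordered pair (a, b) of 1-based model labels; the sketch assumes the first label is the winner of the round, which is an interpretation rather than something the application states. The function name and the sample rounds are hypothetical.

```python
def wins_matrix(rsc, n_models):
    """Build A where A[i][j] counts how often model i+1 beat model j+1.

    Assumption: each entry of rsc is (winner_label, loser_label), 1-based.
    """
    a = [[0] * n_models for _ in range(n_models)]
    for winner, loser in rsc:
        a[winner - 1][loser - 1] += 1
    return a


# Five hypothetical comparison rounds among three models.
rsc = [(1, 2), (1, 3), (2, 3), (2, 1), (1, 2)]
A = wins_matrix(rsc, 3)
```

Because pairs are drawn at random, the row and column totals of A are generally unequal, which is why a ranking model is applied to A instead of a simple win count.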
In one example, the sample library of the recommendation business department of an Internet company H stores the data Data[] of 10,000 users. The data of one user is represented as P = {Pur[], Foo[], rec}, where Pur[] is the set of the user's past purchase records, represented as strings (such as "umbrella", "banana"); Foo[] is the set of records of items browsed but not purchased, represented as strings; and rec is the product that the model recommends to the user based on these two arrays, represented as a string. The department wants to add explanations to its existing recommendation system to increase user acceptance of the recommended products. It has selected several XAI explanation models to be evaluated, MODEL_1, MODEL_2, and MODEL_3, to give reasonable explanations of the recommendation results: the product the user is currently most likely to want is inferred from the products the user has purchased and browsed, the path explanation of this reasoning process is evaluated, and the model with the strongest explanatory power is selected for use. The system administrator may set the number of comparison rounds to N = 20. First, a random algorithm selects two XAI models to compare, MODEL_1 and MODEL_2 (denoted a and b below). Then the data P of one user is selected at random from the sample library Data[] and fed to a and b to generate explanation results. The atomic representation of an output explanation result is E = [X_1 ... X_n → Y], where X_i is a cause factor selected by the explanation model to take part in explaining the result. The explanation result pair can therefore be written as E_pair = {E_a[], E_b[]}, where E_a[] and E_b[] are the atomic explanation result sequences with subscripts a and b respectively.
Suppose the data of user P is Pur[] = {"diapers", "Microeconomics", "crib", ...}, Foo[] = {"SK-II", "kettle", "maternity clothes", ...}, and rec = "milk powder". The data of E_a[] is as follows:
The data of E_b[] is as follows:
After the explanation result sequence pair is obtained, it is matched against the KG. To improve matching efficiency, the matching proceeds as follows, taking E_a[] as an example:
1. Call the Sophon KG semantic retrieval interface to match the factor corresponding to Y in the KG; the matched node in the KG is V_y.
2. Starting from V_y, match the node V_6 corresponding to X_6 within a path depth of D. If the match succeeds, repeat this step starting from V_6. If the match fails, match the node corresponding to factor X_5 within a depth of (number of consecutive failed matches + 1) × D, and regard the subgraph containing the previous node and the subgraph containing the next node as disconnected. Continue by this rule until all nodes have been matched.
The final matching result is output: the node sequence pair corresponding to E_pair in the KG is V_pair = {V_a[], V_b[]}.
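The depth-bounded lookup used in step 2 can be illustrated with a plain breadth-first search. This sketch covers only the primitive "find a node within D hops of a starting node" and does not call or imitate the Sophon KG interface; the adjacency-dictionary form of the KG, the function name, and the small graph are assumptions made for the example.

```python
from collections import deque


def find_within_depth(adjacency, start, target, max_depth):
    """Return the hop count from start to target if it is <= max_depth, else None."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the allowed depth
        for neighbour in adjacency.get(node, ()):
            if neighbour in seen:
                continue
            if neighbour == target:
                return depth + 1
            seen.add(neighbour)
            queue.append((neighbour, depth + 1))
    return None


# Hypothetical KG fragment: Vy - V6 - V5 - V4 in a chain.
kg = {"Vy": ["V6"], "V6": ["Vy", "V5"], "V5": ["V6", "V4"], "V4": ["V5"]}
D = 2
near = find_within_depth(kg, "Vy", "V6", D)        # found at 1 hop
missed = find_within_depth(kg, "Vy", "V4", D)      # 3 hops away, outside depth D
widened = find_within_depth(kg, "Vy", "V4", 2 * D) # found once the depth is widened
```

The last call mirrors the widening rule in step 2, where the search depth grows to (number of consecutive failed matches + 1) × D after a miss.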
Using the path search function of Sophon KG, the subgraph set pair G_pair = {G_a[], G_b[]} corresponding to the node sequence pair V_pair = {V_a[], V_b[]} can be obtained. The specific matching result is:
G_a[] = {{X_1, X_2}, {X_3, X_4, X_5, X_6, Y}}
G_b[] = {{X_1, X_2}, {X_3, X_4}, {X_5, Y}}
FIG. 2 is a schematic diagram of the subgraph set G_a[] obtained from a matching result in Embodiment 1 of the present application. As shown in FIG. 2, G_a[] contains 2 subgraphs.
FIG. 3 is a schematic diagram of the subgraph set G_b[] obtained from a matching result in Embodiment 1 of the present application. As shown in FIG. 3, G_b[] contains 3 subgraphs.
Each subgraph set in G_pair = {G_a[], G_b[]} is scored. Taking G_a[] in FIG. 2 as an example:
The explanation coherence score of G_a[] is S_split = 1/|G_a[]| = 1/2, since G_a[] contains 2 subgraphs.
The explanation complexity score of G_a[] is calculated to five decimal places.
Subgraph G_a_2 in FIG. 2 is the subgraph containing the result factor Y. From the edge weights marked in that subgraph, S_credit = 7.623 is calculated as the explanation credibility score of G_a[].
The total score is calculated from the explanation coherence score, explanation complexity score, and explanation credibility score of G_a[], here with the logistic function used as standard(). The total score of G_b[] is calculated in the same way, and both results are accurate to five decimal places. After comparison, MODEL_1 outperforms MODEL_2 in this round.
After N = 20 rounds, a result array Rsc[] containing 20 comparison results is obtained. The data in Rsc[] is converted into a 3×3 matrix A, in which the element a_ij is the number of times the i-th model outperforms the j-th model over all comparison rounds. The data of A is as follows:
Let α_1, α_2, and α_3 be the explanatory power values of MODEL_1, MODEL_2, and MODEL_3 respectively, and fix the benchmark value α_1 = 1.00. The values α_i are obtained by solving ArgMax(L), where L is given by:
Using an optimization algorithm such as, but not limited to, the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, it is found that L reaches its maximum at α_1 = 1.00, α_2 = 0.89, α_3 = 0.66. The explanatory power sequence of the three models is therefore {α_1 = 1.00, α_2 = 0.89, α_3 = 0.66}, and MODEL_1, the XAI model with the strongest explanatory power, is selected for use. It should be noted that only 3 XAI models are compared in this example; in practice the number of XAI models to be evaluated is not limited.
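To make the step from a win-count matrix to strength values concrete, the following sketch fits a Bradley-Terry model, a standard model for paired comparisons, with one α fixed to 1. This is a stand-in and not the application's method: the expression for L is not reproduced in this text, so the Bradley-Terry likelihood is an assumption, and the sketch uses the classical minorization-maximization iteration instead of BFGS. The win counts are hypothetical, since the data of A is likewise not reproduced, and the function name is invented for the example.

```python
def bradley_terry(wins, ref=0, iters=500):
    """Estimate Bradley-Terry strengths from wins[i][j] = times i beat j.

    The strength of model `ref` is fixed to 1. Every model needs at least
    one win and one loss for the estimates to be finite and positive.
    """
    n = len(wins)
    alpha = [1.0] * n
    for _ in range(iters):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (alpha[i] + alpha[j])
                        for j in range(n) if j != i)
            updated.append(total_wins / denom)
        # The likelihood is scale-invariant, so rescale to pin the reference.
        alpha = [value / updated[ref] for value in updated]
    return alpha


# Hypothetical win counts for three models (not the data of the example).
wins = [[0, 3, 4],
        [2, 0, 3],
        [1, 2, 0]]
alpha = bradley_terry(wins)
ranking = sorted(range(len(alpha)), key=lambda i: alpha[i], reverse=True)
```

With these counts the first model has the largest strength and is ranked first, which is the same kind of selection the example makes when it picks MODEL_1.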
In the technical solution of this embodiment, an explanation result pair corresponding to the XAI models to be evaluated is obtained; a node sequence pair in the knowledge graph that matches the explanation result pair is obtained; the subgraph set corresponding to each node sequence is determined according to the node sequence pair and the knowledge graph; and the score pair corresponding to the explanation result pair is determined according to the subgraph set corresponding to each node sequence. This addresses the problems that traditional XAI evaluation and selection methods are time-consuming, labor-intensive, lengthy, and of low evaluation validity, improves the accuracy and efficiency of evaluation and selection, and ensures the validity of the evaluation data.
Embodiment 2
FIG. 4 is a schematic structural diagram of an XAI model evaluation apparatus in Embodiment 2 of the present application. This embodiment is applicable to comprehensive ranking evaluation of the explainability performance of XAI models across multiple randomized comparison tests. The apparatus may be implemented in software and/or hardware and may be integrated into any device that provides an XAI model evaluation function. As shown in FIG. 4, the XAI model evaluation apparatus includes: an explanation result pair acquisition module 210, a node sequence pair acquisition module 220, a subgraph set determination module 230, and a score pair determination module 240.
The explanation result pair acquisition module 210 is configured to obtain the explanation result pair corresponding to the XAI models to be evaluated. The node sequence pair acquisition module 220 is configured to obtain the node sequence pair in the knowledge graph that matches the explanation result pair. The subgraph set determination module 230 is configured to determine the subgraph set corresponding to each node sequence according to the node sequence pair and the knowledge graph. The score pair determination module 240 is configured to determine the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence.
Optionally, the score pair determination module is configured to: determine the explanation coherence score, explanation complexity score, and explanation credibility score corresponding to each node sequence according to the subgraph set corresponding to each node sequence; determine the target score corresponding to each node sequence according to its explanation coherence score, explanation complexity score, and explanation credibility score; and determine the score pair corresponding to the explanation result pair according to the target score corresponding to each node sequence.
Optionally, the score pair determination module is configured to: obtain the number of subgraphs in the subgraph set corresponding to each node sequence; and determine the explanation coherence score corresponding to each node sequence according to the number of subgraphs in the subgraph set.
Optionally, the score pair determination module is configured to: determine the complexity of each subgraph according to the number of nodes and the number of edges of each subgraph in the subgraph set corresponding to each node sequence; and determine the sum of the complexities of all subgraphs in the subgraph set corresponding to each node sequence as the explanation complexity score corresponding to that node sequence.
Optionally, the score pair determination module is configured to: obtain the target subgraph in the subgraph set corresponding to each node sequence, where the target subgraph is the subgraph that includes the result factor; and determine the sum of the weights of all edges of the target subgraph as the explanation credibility score corresponding to each node sequence.
Optionally, the score pair determination module is configured to: obtain the target subgraph in the subgraph set corresponding to each node sequence, where the target subgraph is the subgraph that includes the result factor; and determine the sum of the weights of the edges in the target subgraph that are connected to the result factor as the explanation credibility score corresponding to each node sequence.
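The two optional credibility computations can be sketched side by side. The data layout (each subgraph as a `(nodes, edges)` pair with `(u, v, weight)` edges), the function names, and the choice to return 0.0 when no subgraph contains the result factor are assumptions for this example.

```python
def find_target_subgraph(subgraphs, result_factor):
    """Return the first (nodes, edges) subgraph containing the result factor."""
    for nodes, edges in subgraphs:
        if result_factor in nodes:
            return nodes, edges
    return None


def credibility_all_edges(subgraphs, result_factor):
    """First variant: sum the weights of all edges of the target subgraph."""
    target = find_target_subgraph(subgraphs, result_factor)
    if target is None:
        return 0.0  # assumption: no target subgraph means no credibility
    _nodes, edges = target
    return sum(weight for _u, _v, weight in edges)


def credibility_incident_edges(subgraphs, result_factor):
    """Second variant: sum only the weights of edges touching the result factor."""
    target = find_target_subgraph(subgraphs, result_factor)
    if target is None:
        return 0.0  # assumption: no target subgraph means no credibility
    _nodes, edges = target
    return sum(weight for u, v, weight in edges if result_factor in (u, v))
```

With a target subgraph holding the edges `("income", "credit", 0.5)` and `("credit", "default", 0.25)` and the result factor `"default"`, the first variant gives 0.75 and the second gives 0.25, which shows how the second variant ignores edges that are not directly connected to the result factor.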
Optionally, the apparatus further includes a ranking module configured to: input the score pair into a target model to obtain a ranking order of the explanatory power of each XAI model to be evaluated among the different XAI models; or input the score pair into the target model to obtain a ranking order of the explanatory power of each explanation result in the explanation result pair.
The above product can execute the method provided by any embodiment of the present application and has the functional modules corresponding to the executed method.
In the technical solution of this embodiment, an explanation result pair corresponding to the XAI model to be evaluated is acquired; a node sequence pair in the knowledge graph that matches the explanation result pair is acquired; the subgraph set corresponding to each node sequence is determined according to the node sequence pair and the knowledge graph; and the score pair corresponding to the explanation result pair is determined according to the subgraph set corresponding to each node sequence. This addresses the problems that traditional XAI evaluation and selection methods are time-consuming and labor-intensive, involve lengthy processes, and have low evaluation effectiveness; it can improve the accuracy and efficiency of evaluation and selection and ensure the validity of the evaluation data.
Embodiment 3
FIG. 5 is a schematic structural diagram of an electronic device in Embodiment 3 of the present application. The electronic device 10 is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smartphones, wearable devices (such as helmets, glasses, and watches), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementations of the present application described and/or claimed herein.
As shown in FIG. 5, the electronic device 10 includes at least one processor 11 and a memory communicatively connected to the at least one processor 11, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13. The memory stores a computer program executable by the at least one processor, and the processor 11 can perform a variety of appropriate actions and processing according to the computer program stored in the ROM 12 or the computer program loaded from the storage unit 18 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the electronic device 10. The processor 11, the ROM 12, and the RAM 13 are connected to one another through a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.
Multiple components of the electronic device 10 are connected to the I/O interface 15, including: an input unit 16, such as a keyboard or a mouse; an output unit 17, such as various types of displays and speakers; a storage unit 18, such as a magnetic disk or an optical disk; and a communication unit 19, such as a network card, a modem, or a wireless communication transceiver. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Examples of the processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various processors that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The processor 11 executes the methods and processing described above, such as the XAI model evaluation method.
In some embodiments, the XAI model evaluation method may be implemented as a computer program that is tangibly contained in a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the XAI model evaluation method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the XAI model evaluation method in any other appropriate manner (for example, by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), application-specific standard parts (ASSP), systems on chip (SoC), complex programmable logic devices (CPLD), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
Computer programs for implementing the methods of the present application may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that, when executed by the processor, the computer programs cause the functions/operations specified in the flowcharts and/or block diagrams to be implemented. A computer program may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present application, a computer-readable storage medium may be a tangible medium that can contain or store a computer program for use by, or in combination with, an instruction execution system, apparatus, or device. The computer-readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer-readable storage medium may be a machine-readable signal medium. Examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described herein may be implemented on an electronic device having: a display apparatus (for example, a cathode ray tube (CRT) or liquid crystal display (LCD) monitor) for displaying information to the user; and a keyboard and a pointing apparatus (for example, a mouse or a trackball) through which the user can provide input to the electronic device. Other kinds of apparatuses may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).
The systems and techniques described herein may be implemented in a computing system that includes a back-end component (for example, a data server), or a computing system that includes a middleware component (for example, an application server), or a computing system that includes a front-end component (for example, a user computer with a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), a blockchain network, and the Internet.
A computing system may include a client and a server. The client and the server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that overcomes the difficult management and weak business scalability of traditional physical hosts and virtual private server (VPS) services.
It should be understood that steps may be reordered, added, or deleted using the various forms of the processes shown above. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the result expected from the technical solution of the present application can be achieved; no limitation is imposed herein.
Claims (10)
- An explainable artificial intelligence (XAI) model evaluation method, comprising: acquiring an explanation result pair corresponding to an XAI model to be evaluated; acquiring a node sequence pair in a knowledge graph that matches the explanation result pair; determining, according to the node sequence pair and the knowledge graph, a subgraph set corresponding to each node sequence in the node sequence pair; and determining, according to the subgraph set corresponding to each node sequence, a score pair corresponding to the explanation result pair.
- The method according to claim 1, wherein determining, according to the subgraph set corresponding to each node sequence, the score pair corresponding to the explanation result pair comprises: determining, according to the subgraph set corresponding to each node sequence, an explanation coherence score, an explanation complexity score, and an explanation credibility score corresponding to each node sequence; determining a target score corresponding to each node sequence according to the explanation coherence score, the explanation complexity score, and the explanation credibility score corresponding to that node sequence; and determining the score pair corresponding to the explanation result pair according to the target score corresponding to each node sequence.
- The method according to claim 2, wherein determining, according to the subgraph set corresponding to each node sequence, the explanation coherence score corresponding to each node sequence comprises: obtaining the number of subgraphs in the subgraph set corresponding to each node sequence; and determining the explanation coherence score corresponding to each node sequence according to the number of subgraphs in the subgraph set.
- The method according to claim 2, wherein determining, according to the subgraph set corresponding to each node sequence, the explanation complexity score corresponding to each node sequence comprises: determining the complexity of each subgraph in the subgraph set corresponding to each node sequence according to the number of nodes and the number of edges of that subgraph; and determining the sum of the complexities of all subgraphs in the subgraph set corresponding to each node sequence as the explanation complexity score corresponding to that node sequence.
- The method according to claim 2, wherein determining, according to the subgraph set corresponding to each node sequence, the explanation credibility score corresponding to each node sequence comprises: obtaining a target subgraph in the subgraph set corresponding to each node sequence, wherein the target subgraph is a subgraph that includes a result factor; and determining the sum of the weights of all edges of the target subgraph as the explanation credibility score corresponding to each node sequence.
- The method according to claim 2, wherein determining, according to the subgraph set corresponding to each node sequence, the explanation credibility score corresponding to each node sequence comprises: obtaining a target subgraph in the subgraph set corresponding to each node sequence, wherein the target subgraph is a subgraph that includes a result factor; and determining the sum of the weights of the edges in the target subgraph that are connected to the result factor as the explanation credibility score corresponding to each node sequence.
- The method according to claim 1, wherein the explanation results corresponding to the XAI model to be evaluated comprise explanation results of different XAI models, and, after determining the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence, the method further comprises: inputting the score pair into a target model to obtain a ranking order of the explanatory power of each XAI model among the different XAI models; or the explanation results corresponding to the XAI model to be evaluated comprise explanation results of the same XAI model under different training batches, and, after determining the score pair corresponding to the explanation result pair according to the subgraph set corresponding to each node sequence, the method further comprises: inputting the score pair into the target model to obtain a ranking order of the explanatory power of each explanation result in the explanation result pair.
- An explainable artificial intelligence (XAI) model evaluation apparatus, comprising: an explanation result pair acquisition module, configured to acquire an explanation result pair corresponding to an XAI model to be evaluated; a node sequence pair acquisition module, configured to acquire a node sequence pair in a knowledge graph that matches the explanation result pair; a subgraph set determination module, configured to determine, according to the node sequence pair and the knowledge graph, a subgraph set corresponding to each node sequence in the node sequence pair; and a score pair determination module, configured to determine, according to the subgraph set corresponding to each node sequence, a score pair corresponding to the explanation result pair.
- An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor to enable the at least one processor to perform the explainable artificial intelligence (XAI) model evaluation method according to any one of claims 1 to 7.
- A computer-readable storage medium storing computer instructions, wherein the computer instructions, when executed by a processor, cause the processor to implement the explainable artificial intelligence (XAI) model evaluation method according to any one of claims 1 to 7.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211404037.1A CN115905558A (en) | 2022-11-10 | 2022-11-10 | Knowledge graph-based XAI model evaluation method, device, equipment and medium |
CN202211404037.1 | 2022-11-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024098682A1 (en) | 2024-05-16 |
Family
ID=86485068
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/091751 WO2024098682A1 (en) | 2022-11-10 | 2023-04-28 | Xai model evaluation method and apparatus, device, and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115905558A (en) |
WO (1) | WO2024098682A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115905558A (en) * | 2022-11-10 | 2023-04-04 | 南京星环智能科技有限公司 | Knowledge graph-based XAI model evaluation method, device, equipment and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112508688A (en) * | 2020-12-22 | 2021-03-16 | 四川新网银行股份有限公司 | XAI-based machine learning credit model interpretation method and storage medium |
CN115062791A (en) * | 2022-06-28 | 2022-09-16 | 星环信息科技(上海)股份有限公司 | Artificial intelligence interpretation method, device, equipment and storage medium |
WO2022200624A2 (en) * | 2021-03-26 | 2022-09-29 | Datawalk Spolka Akcyjna | Systems and methods for end-to-end machine learning with automated machine learning explainable artificial intelligence |
CN115905558A (en) * | 2022-11-10 | 2023-04-04 | 南京星环智能科技有限公司 | Knowledge graph-based XAI model evaluation method, device, equipment and medium |
- 2022-11-10 CN CN202211404037.1A patent/CN115905558A/en active Pending
- 2023-04-28 WO PCT/CN2023/091751 patent/WO2024098682A1/en unknown
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112508688A (en) * | 2020-12-22 | 2021-03-16 | 四川新网银行股份有限公司 | XAI-based machine learning credit model interpretation method and storage medium |
WO2022200624A2 (en) * | 2021-03-26 | 2022-09-29 | Datawalk Spolka Akcyjna | Systems and methods for end-to-end machine learning with automated machine learning explainable artificial intelligence |
CN115062791A (en) * | 2022-06-28 | 2022-09-16 | 星环信息科技(上海)股份有限公司 | Artificial intelligence interpretation method, device, equipment and storage medium |
CN115905558A (en) * | 2022-11-10 | 2023-04-04 | 南京星环智能科技有限公司 | Knowledge graph-based XAI model evaluation method, device, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN115905558A (en) | 2023-04-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023024259A1 (en) | Digital twin-based partial discharge monitoring system, method and apparatus | |
WO2023097929A1 (en) | Knowledge graph recommendation method and system based on improved kgat model | |
CN113535984B (en) | Knowledge graph relation prediction method and device based on attention mechanism | |
US20210311751A1 (en) | Machine-learning models applied to interaction data for determining interaction goals and facilitating experience-based modifications to interface elements in online environments | |
WO2019047790A1 (en) | Method and system for generating combined features of machine learning samples | |
WO2022179384A1 (en) | Social group division method and division system, and related apparatuses | |
US20210157819A1 (en) | Determining a collection of data visualizations | |
CN112183881A (en) | Public opinion event prediction method and device based on social network and storage medium | |
CN113792115B (en) | Entity correlation determination method, device, electronic equipment and storage medium | |
CN111695024A (en) | Object evaluation value prediction method and system, and recommendation method and system | |
CN112784591B (en) | Data processing method and device, electronic equipment and storage medium | |
WO2024098682A1 (en) | Xai model evaluation method and apparatus, device, and medium | |
CN105740434B (en) | Network information methods of marking and device | |
US9020962B2 (en) | Interest expansion using a taxonomy | |
CN111198905B (en) | Visual analysis framework for understanding missing links in a two-way network | |
US20220027722A1 (en) | Deep Relational Factorization Machine Techniques for Content Usage Prediction via Multiple Interaction Types | |
US10552428B2 (en) | First pass ranker calibration for news feed ranking | |
CN115659985A (en) | Electric power knowledge graph entity alignment method and device and computer equipment | |
CN115248890A (en) | User interest portrait generation method and device, electronic equipment and storage medium | |
Wu et al. | l0-Norm variable adaptive selection for geographically weighted regression model | |
CN117370326A (en) | Data evaluation method, device, electronic equipment and medium | |
CN115878761A (en) | Event context generation method, apparatus, and medium | |
CN112818221B (en) | Entity heat determining method and device, electronic equipment and storage medium | |
US10430831B2 (en) | Prioritizing companies for people search | |
CN115758271A (en) | Data processing method, data processing device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23887383 Country of ref document: EP Kind code of ref document: A1 |