CN109697511A - Data reasoning method, apparatus and computer equipment - Google Patents
Data reasoning method, apparatus and computer equipment
- Publication number: CN109697511A
- Application number: CN201710999833.7A
- Authority: CN (China)
- Prior art keywords: inference, reasoning, layer, unit, model
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N5/04—Inference or reasoning models (G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS; G06N5/00—Computing arrangements using knowledge-based models)
- G06N3/08—Learning methods (G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks)
Abstract
This application discloses a data inference method, apparatus and computer device, belonging to the field of computers. The method includes: acquiring an inference condition; acquiring an inference model, the model including at least two inference units that share the same first parameter matrix, where an inference unit processes the data it receives; the first parameter matrix used by each inference unit indicates the degree of association between that inference unit and its associated inference units, an associated inference unit being an inference unit in the previous layer that has an association relationship with it; and inputting the inference condition into the inference model to obtain a model inference result. This addresses the problem that a probabilistic graphical model has too many model parameters, which makes data inference highly complex: because different inference units can share the same first parameter matrix, the model parameters of the inference model are reduced, the complexity of data inference is lowered, and the efficiency of data inference is improved.
Description
Technical Field
Embodiments of this application relate to the field of computers, and in particular to a data inference method, a data inference apparatus and a computer device.
Background
Data inference is the process by which a computer device automatically derives an inference result from an inference condition and an inference model. For example: inferring a film's director, release region and release date from two inference conditions, the film's name and genre; or inferring the consumer, the consumer type and so on from three inference conditions: product name, product type and product price.
Currently, a probabilistic graphical model is generally used as the inference model for data inference. Before performing data inference, the computer device trains an initial probabilistic graphical model on the feature variables corresponding to at least two feature classifications in a database, updating the model parameters to obtain a probabilistic graphical model G = (ν, ε) for data inference, where ν is the set of feature variables corresponding to the at least two feature classifications and ε is the set of association relationships between the feature classifications. Once G = (ν, ε) is obtained, the input inference condition is propagated through it to produce an inference result.
Because a probabilistic graphical model has many model parameters and a structurally complex form, data inference through such a model is inefficient.
Disclosure of Invention
This application provides a data inference method, a data inference apparatus and a computer device, which address the problems that a probabilistic graphical model has too many model parameters and that data inference through such a model is inefficient.
In a first aspect, an embodiment of this application provides a data inference method, including: acquiring an inference condition; acquiring an inference model; and inputting the inference condition into the inference model to obtain a model inference result. The inference model comprises an input layer, n inference layers, an output layer and at least m first parameter matrices, where the input layer, each of the n inference layers and the output layer each contain at least one inference unit, and n and m are positive integers. The inference model contains at least two inference units that share the same first parameter matrix. The first parameter matrix used by each inference unit indicates the degree of association between that inference unit and its associated inference units; an associated inference unit of an inference unit is either an inference unit in the input layer that has an association relationship with it, or an inference unit in the preceding inference layer that has an association relationship with it.
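As a rough, non-authoritative sketch (not the patented implementation), the structure described above resembles a feed-forward network whose inference layers all reuse one weight matrix. The sizes, the tanh nonlinearity and every name below are illustrative assumptions:

```python
import numpy as np

def build_shared_model(x, seed=0):
    """One shared first parameter matrix reused by all n inference layers
    and the output layer (illustrative case where x == y, so m == x)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((x, x)) * 0.1  # one matrix instead of n + 1

def infer(conditions, W_shared, n):
    """Forward pass: each layer's inference units combine the previous
    layer's outputs through the shared matrix, then a nonlinearity."""
    h = np.asarray(conditions, dtype=float)  # input layer just forwards
    for _ in range(n + 1):                   # n inference layers + output layer
        h = np.tanh(W_shared @ h)
    return h

x, n = 4, 3
W = build_shared_model(x)
result = infer([1.0, 0.5, 0.0, 0.0], W, n)
# Sharing cuts parameter count from (n + 1) * x * x to x * x.
```

The point of the sketch is only the parameter sharing: four layers of processing are driven by a single x-by-x matrix.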
Performing data inference with a model in which at least two inference units share the same first parameter matrix avoids the high complexity caused by the excessive model parameters of a probability map: because different inference units share one first parameter matrix, the model parameters of the inference model are simplified, the complexity of data inference is lowered, and the efficiency of data inference is improved.
Optionally, the inference units in adjacent layers are fully connected, meaning that each inference unit in the i-th layer has an association relationship with every inference unit in the (i-1)-th layer, where i is an integer and 2 ≤ i ≤ n+2. The input layer and the output layer each contain x inference units, and each of the n inference layers contains y inference units. Within the n inference layers, the b-th inference unit of the a-th inference layer has the same type as the b-th inference unit of the c-th inference layer, where a, b and c are integers, 1 ≤ a ≤ n, 1 ≤ b ≤ y and 1 ≤ c ≤ n. The at least m first parameter matrices include a first parameter matrix shared by inference units of the same type in all inference layers except the first inference layer, the first inference layer being the first of the n inference layers when they are ordered by the sequence of data inference; m is the maximum of x and y.
Setting the number of inference units in every inference layer to y, and sharing a first parameter matrix among same-type inference units in all inference layers except the first, reduces the model parameters as far as possible, lowering the complexity of data inference and improving its efficiency.
Optionally, when x equals y, the inference model contains m first parameter matrices: one first parameter matrix shared by inference units of the same type across the n inference layers and the output layer, where the e-th inference unit of the d-th inference layer has the same type as the e-th inference unit of the output layer, d and e being integers with 1 ≤ d ≤ n and 1 ≤ e ≤ y. When x and y differ, the inference model contains at least m first parameter matrices: a first parameter matrix for each inference unit in the first inference layer, a first parameter matrix for each inference unit in the output layer, and a first parameter matrix shared by same-type inference units in the inference layers other than the first.
Making the number of inference units in the input layer, the output layer and every inference layer equal lets same-type inference units in the n inference layers and the output layer share the same first parameter matrix, reducing the model parameters as far as possible, lowering the complexity of data inference and improving its efficiency.
Optionally, the association relationships between inference units in adjacent layers are determined by a probability map or by random association relationships, either of which indicates the association relationships between inference units of different types; the probability map is determined from the feature classifications and feature variables in a database, while the random association relationships are generated randomly. The input layer, each of the n inference layers and the output layer each contain m inference units; the g-th inference unit of the f-th layer has the same type as the g-th inference unit of the h-th layer, where f, g and h are integers, 1 ≤ f ≤ n+2, 1 ≤ g ≤ m and 1 ≤ h ≤ n+2. The inference model contains m first parameter matrices, with same-type inference units in the n inference layers and the output layer sharing the same first parameter matrix.
Sharing the same first parameter matrix among same-type inference units in every inference layer and the output layer reduces the model parameters as far as possible, lowering the complexity of data inference and improving its efficiency.
Optionally, the j-th inference unit of the input layer has an association relationship with the j-th inference unit of every inference layer, where j is an integer and 1 ≤ j ≤ m. The inference model further contains m second parameter matrices; same-type inference units in the n inference layers share the same second parameter matrix, and the second parameter matrix used by each inference unit indicates the degree of association between that unit and the same-type inference unit in the input layer.
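A minimal sketch of how such a second parameter matrix might behave, under two assumptions not taken from the patent: the per-unit input associations collapse to a diagonal matrix (one scalar per unit), and the combination uses a tanh nonlinearity:

```python
import numpy as np

def layer_step(h_prev, x_input, W1, w2):
    """One inference layer: combine the previous layer's output (through the
    shared first parameter matrix W1) with the same-type input-layer values
    (through shared per-unit weights w2, a diagonal 'second matrix')."""
    return np.tanh(W1 @ h_prev + w2 * x_input)

rng = np.random.default_rng(1)
m, n = 4, 3
W1 = rng.standard_normal((m, m)) * 0.1   # shared first parameter matrix
w2 = rng.standard_normal(m) * 0.1        # shared second parameter matrix

x_input = np.array([1.0, 0.0, 0.5, 0.0])
h = x_input
for _ in range(n):
    h = layer_step(h, x_input, W1, w2)   # input re-injected at every layer
```

Re-injecting `x_input` at every layer is what keeps the input's influence from fading as depth grows, which is the motivation stated below.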
When the inference model also contains second parameter matrices, having same-type inference units in each inference layer share the same second parameter matrix simplifies the model, lowering the complexity of data inference and improving its efficiency.
Optionally, the inference model is obtained by training from a database and an initial inference model, where the database contains x feature classifications and the feature variables corresponding to each. The initial inference model comprises an input layer, n inference layers, an output layer and at least m initial first parameter matrices, and the number x of feature classifications equals both the number of inference units in the input layer and the number of inference units in the output layer.
Optionally, the inference model is obtained by self-supervised training of the initial inference model on input-output pairs. The input-output pairs are built from the database; each pair contains input feature variables, which are fed into the initial inference model during the training stage, and output feature variables, which indicate the expected inference result for those inputs. Self-supervised training updates the at least m initial first parameter matrices according to the inference loss between the expected and actual inference results, yielding the inference model.
Training the initial inference model in a self-supervised manner drives the training inference result as close as possible to the expected result, which improves the accuracy of the trained model and hence the accuracy of the inference results it produces.
Optionally, inputting the inference condition into the inference model to obtain a model inference result includes: feeding x inference conditions into the x inference units of the input layer, one per unit; each input-layer unit forwarding its inference condition to every inference unit in the first inference layer; each inference unit in each inference layer computing its layer output from the previous layer's output and its first parameter matrix, and sending that output to every inference unit in the next layer; and each inference unit in the output layer determining the model inference result from the n-th inference layer's output and its first parameter matrix.
Optionally, inputting the inference condition into the inference model to obtain a model inference result includes: feeding m inference conditions into the m inference units of the input layer, one per unit; each input-layer unit forwarding its inference condition to the inference units associated with it; each inference unit in the first inference layer computing its layer inference result from the received inference condition, its second parameter matrix and its first parameter matrix, and sending that result to the k inference units of the second inference layer for which it is an associated inference unit, k being a positive integer no greater than m; each inference unit in every inference layer after the first computing its layer inference result from the received inference condition, its second parameter matrix, the layer results output by its associated inference units and its first parameter matrix, and sending that result to the q inference units of the next layer for which it is an associated inference unit, q being a positive integer no greater than m; and each inference unit in the output layer determining the model inference result from the received layer inference result and its first parameter matrix.
Optionally, the inference condition includes observed inference conditions and default inference conditions standing in for unobserved ones; the model inference result then includes results, inferred from the observed conditions, that correspond to the default conditions, where the inference unit outputting the result for a default condition and the inference unit into which that default condition was input are inference units of the same type.
In a second aspect, an embodiment of this application provides a method for training an inference model, including:
acquiring a database; creating an initial inference model; and inputting the feature variables corresponding to i feature classifications in the database into the initial inference model and training it to obtain an inference model, where the inference model produces a model inference result from an input inference condition and i is a positive integer smaller than x. The database contains x feature classifications and at least one feature variable for each. The initial inference model comprises an input layer, n inference layers, an output layer and at least m initial first parameter matrices, where the input layer, each inference layer and the output layer each contain at least one inference unit, and n and m are positive integers. The initial inference model contains at least two inference units that share the same initial first parameter matrix, and an inference unit processes the data it receives. The initial first parameter matrix used by each inference unit indicates the degree of association between that unit and its associated inference units; an associated inference unit of an inference unit is either an inference unit in the input layer that has an association relationship with it, or an inference unit in the preceding inference layer that has an association relationship with it.
Creating an initial inference model in which at least two inference units share the same initial first parameter matrix avoids the high training complexity caused by the excessive model parameters of a probabilistic graphical model: because different inference units share one initial first parameter matrix, the number of initial first parameter matrices is reduced, lowering the difficulty of training the inference model and improving training efficiency.
Optionally, training the initial inference model on the feature variables of the i feature classifications includes: inputting the feature variables of the i feature classifications, all collected at the same moment, into the initial inference model to obtain a training inference result; comparing the training inference result with m-i feature variables to obtain an inference loss, where the m-i feature variables correspond to at least one feature classification other than the i classifications and were collected at that same moment, and the inference loss indicates the error between the training inference result and the expected inference result; and updating the at least m initial first parameter matrices according to the inference loss, then continuing to feed in the feature variables of the i feature classifications until the change in the at least m initial first parameter matrices falls within a preset range, or stopping when the number of training iterations reaches a preset count, to obtain the inference model.
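The update described above can be sketched for a drastically simplified model: a single linear layer with one shared matrix, i = 2 observed features, and m-i = 2 withheld features serving as the expected result. The MSE loss, learning rate and sizes are all assumptions, not the patent's choices:

```python
import numpy as np

def train_step(W, x_in, y_target, out_idx, lr=0.1):
    """One self-supervised update: feed the observed features, compare the
    outputs at the held-out positions with the withheld features, and
    descend the (half) squared-error inference loss."""
    pred = W @ x_in
    err = np.zeros_like(pred)
    err[out_idx] = pred[out_idx] - y_target   # inference-loss terms
    grad = np.outer(err, x_in)                # dL/dW for L = 0.5 * sum(err^2)
    return W - lr * grad, 0.5 * np.sum(err ** 2)

rng = np.random.default_rng(3)
m = 4
W = rng.standard_normal((m, m)) * 0.1
x_in = np.array([1.0, 0.5, 0.0, 0.0])   # i = 2 observed features, rest default
y_target = np.array([0.3, 0.7])          # the m - i = 2 withheld features
out_idx = [2, 3]

losses = []
for _ in range(50):
    W, loss = train_step(W, x_in, y_target, out_idx)
    losses.append(loss)
# losses shrink as W fits the input-output pair
```

A real implementation would iterate over many pairs and backpropagate through all shared layers; the gradient of a matrix reused across layers is the sum of its per-layer gradients.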
Training the initial inference model in this self-supervised manner drives the training inference result as close as possible to the expected result, improving the accuracy of the trained model and hence the accuracy of the inference results it produces.
Optionally, before the feature variables of the i feature classifications are input into the initial inference model to obtain the training inference result, the method further includes creating at least one group of input-output pairs from the database, each group containing input feature variables and output feature variables; the feature variables of the i feature classifications that are input into the initial inference model are drawn from the input feature variables of a pair, and the m-i comparison feature variables include the output feature variables of that same pair.
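Constructing such pairs might look like the following sketch, where the database rows and the movie-style columns are purely hypothetical:

```python
# Each database row holds feature variables collected at the same moment.
# An input-output pair withholds some columns as the expected result.
def make_pairs(rows, input_cols, output_cols):
    """Split each row into (input variables, expected output variables)."""
    return [([row[c] for c in input_cols],
             [row[c] for c in output_cols]) for row in rows]

# Hypothetical encoded rows: (movie_name_id, movie_type_id, director_id)
rows = [(1, 10, 100), (2, 20, 200), (3, 10, 100)]
pairs = make_pairs(rows, input_cols=[0, 1], output_cols=[2])
# pairs[0] == ([1, 10], [100]): infer the director from name and type
```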
Optionally, creating an initial inference model includes: determining from the database the number x of inference units in the input layer and the output layer, the x units corresponding one-to-one with the x feature classifications; determining the number y of inference units in each of the n inference layers, y being a positive integer, where the s-th inference unit of the r-th inference layer has the same type as the s-th inference unit of the t-th inference layer, with r, s and t positive integers, 1 ≤ r ≤ n, 1 ≤ s ≤ y and 1 ≤ t ≤ n; establishing an association relationship between each inference unit in each inference layer, and each inference unit in the output layer, and every inference unit in the preceding layer; generating the initial first parameter matrix used by each inference unit to obtain at least m initial first parameter matrices, which include an initial first parameter matrix shared by same-type inference units in the inference layers other than the first, with m the maximum of x and y; and creating the initial inference model from the inference units of the input layer, each of the n inference layers and the output layer together with the at least m initial first parameter matrices.
Creating the initial inference model with full connections spares the computer device from having to determine in advance the association relationships between inference units in adjacent layers, saving the resources it spends building the model.
Optionally, when x equals y, the initial inference model contains m initial first parameter matrices: one shared by same-type inference units across the n inference layers and the output layer. When x and y differ, it contains at least m initial first parameter matrices: one used by each inference unit of the first inference layer, one shared by same-type inference units in the inference layers other than the first, and one used by each inference unit of the output layer.
Optionally, creating an initial inference model includes: determining a probability map from the database, and determining the number x of inference units in the input layer, each inference layer and the output layer, where the probability map indicates the association relationships among the x feature classifications and the x inference units of each layer correspond one-to-one with those classifications; determining the associated inference units of the v-th inference unit of the u-th layer according to the probability map, u being an integer greater than 1 and smaller than n+2 and v a positive integer no greater than x; generating the initial first parameter matrix used by each inference unit to obtain m initial first parameter matrices, which include an initial first parameter matrix shared by same-type inference units in the n inference layers and the output layer; and creating the initial inference model from the m inference units of the input layer, of each of the n inference layers and of the output layer, together with the m initial first parameter matrices.
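A sketch of how a probability map could constrain the shared first parameter matrix, assuming the map reduces to a 0/1 adjacency mask between feature classifications (an illustrative simplification):

```python
import numpy as np

# Hypothetical probability-map adjacency: adj[v][w] = 1 if classification w
# is associated with classification v, so only those connections get weights.
adj = np.array([[1, 1, 0],
                [1, 1, 1],
                [0, 1, 1]])

rng = np.random.default_rng(4)
W_dense = rng.standard_normal((3, 3)) * 0.1
W = W_dense * adj           # shared first matrix restricted by the map

h = np.array([1.0, 0.5, 0.2])
for _ in range(3):          # n inference layers + output, all sharing W
    h = np.tanh(W @ h)
# W[0, 2] stays 0: unit 0 never receives from the unrelated unit 2
```

Masking the shared matrix this way encodes the map's associations once, rather than per layer, which is consistent with the sharing scheme described above.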
Building the initial inference model from a probability map lets the computer device determine in advance the association relationships between inference units in adjacent layers, improving the accuracy of the initial model, reducing the complexity of training it and improving training efficiency.
Optionally, creating an initial inference model includes: determining from the database the number x of inference units in the input layer, each inference layer and the output layer, the x inference units of each layer corresponding one-to-one with the x feature classifications; acquiring random association relationships among the different feature classifications; determining the associated inference units of the v-th inference unit of the u-th layer according to those random association relationships, u being an integer greater than 1 and smaller than n+2 and v a positive integer no greater than x; generating the initial first parameter matrix used by each inference unit to obtain m initial first parameter matrices, which include an initial first parameter matrix shared by same-type inference units in the n inference layers and the output layer; and creating the initial inference model from the m inference units of the input layer, of each of the n inference layers and of the output layer, together with the m initial first parameter matrices.
Building the initial inference model from random association relationships lets the computer device assign the associations between inference units in adjacent layers at random, without first deriving a probability map from the feature variables in the database, saving the resources spent building the model.
Optionally, creating an initial inference model further includes: for each inference unit in the input layer, establishing an association relationship with the same-type inference unit in each inference layer; and generating the initial second parameter matrix used by each inference unit to obtain m initial second parameter matrices, where the initial second parameter matrix used by each inference unit indicates the degree of association between that unit and the same-type inference unit in the input layer. Creating the initial inference model then uses the m inference units of the input layer, of each of the n inference layers and of the output layer, together with the m initial first parameter matrices and the m initial second parameter matrices.
Associating each input-layer inference unit with the same-type inference unit in every inference layer prevents the influence of the input feature variables from weakening as the number of inference layers grows, improving the accuracy of data inference.
In addition, the reasoning units of the same type in different reasoning layers share the same initial second parameter matrix, which reduces the complexity of the initial reasoning model, thereby reducing the complexity of training the initial reasoning model and improving the efficiency of training it.
In a third aspect, an embodiment of the present application provides a data inference apparatus, where the data inference apparatus includes at least one unit, and the at least one unit is configured to implement the data inference method provided in the first aspect.
In a fourth aspect, the embodiment of the present application provides a training apparatus for an inference model, where the training apparatus for an inference model includes at least one unit, and the at least one unit is used to implement the method for training an inference model provided in the second aspect.
In a fifth aspect, an embodiment of the present application provides a computer device, including: one or more processors, and a memory storing one or more programs configured for execution by the one or more processors, the one or more programs comprising instructions for implementing the data inference method as described in the first aspect; or, comprising instructions for implementing a training method of an inference model as described in the second aspect.
In a sixth aspect, the present application provides a computer-readable storage medium, in which one or more programs are stored, and when the one or more programs are executed, the data inference method provided in the first aspect is implemented; or, a training method for implementing the inference model provided by the second aspect.
Drawings
FIG. 1 is a schematic diagram of a fully-connected neural network provided by one embodiment of the present application;
FIG. 2 is a schematic diagram of a recurrent neural network provided by one embodiment of the present application;
FIG. 3 is a flow diagram of a data inference method provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a database provided by one embodiment of the present application;
FIG. 5 is a schematic diagram of a database provided by one embodiment of the present application;
FIG. 6 is a schematic diagram of an inference model provided by one embodiment of the present application;
FIG. 7 is a schematic diagram of an inference model provided by one embodiment of the present application;
FIG. 8 is a schematic illustration of a probability map provided by one embodiment of the present application;
FIG. 9 is a schematic diagram of an inference model built from a probability map, as provided by an embodiment of the present application;
FIG. 10 is a flow diagram of a method for training an inference model provided by an embodiment of the present application;
FIG. 11 is a block diagram of a data inference apparatus provided in one embodiment of the present application;
FIG. 12 is a block diagram of an apparatus for training inference models provided in one embodiment of the present application;
FIG. 13 is a block diagram of a computer device provided by one embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application more clear, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
First, several terms referred to in the present application will be described.
Neural Network (NN) model: refers to a mathematical model that simulates a real biological neural network. The neural network model includes a plurality of neural layers, and each neural layer includes a plurality of neurons.
Optionally, in the present application, there is no connection between neurons in the same neural layer, and there is a connection between neurons in two adjacent neural layers. Connections between different neurons are used to indicate associative relationships between different neurons.
Optionally, each connection between two neurons corresponds to a weight value. The magnitude of the weight is used to indicate the degree of association between the two connected neurons.
The neuron is a unit for data processing, and is used for processing input data.
Optionally, in this application, each neural layer of the neural network model sequentially includes, from front to back according to the data processing sequence: an input layer, at least one hidden layer (or intermediate layer), and an output layer.
The neurons in the input layer are used for receiving data input into the neural network model and outputting the data to the hidden layer.
And the neurons in the hidden layer are used for processing the received data to obtain a processing result.
And the neurons in the output layer are used for processing the received processing result to obtain an output result.
Optionally, in the present application, the neural network model includes, but is not limited to, the following types: a fully-connected neural network model and a recurrent neural network model.
1. The fully-connected neural network model is as follows: for a neuron in each neural layer, the neuron is connected to all neurons in the adjacent neural layers.
Referring to the fully-connected neural network model shown in fig. 1, the neural network model includes an input layer 110, a first layer hidden layer 120, a second layer hidden layer 130, and an output layer 140.
Wherein, the neurons 111, 112 and 113 in the input layer 110 are all connected with the neurons 121, 122 and 123 in the first layer hidden layer 120, and there is a corresponding weight between the two connected neurons. The neurons 121, 122 and 123 in the first layer of hidden layer 120 are connected with the neurons 131, 132 and 133 in the second layer of hidden layer 130, and corresponding weights exist between the two connected neurons. The neurons 131, 132, and 133 in the second layer hidden layer 130 are all connected to the neurons 141, 142, and 143 in the output layer 140, and there is a corresponding weight between the two connected neurons.
Alternatively, in fig. 1 above, assuming that each neuron is represented by a function g (x), the individual neurons in the first layer hidden layer 120 can be represented by the following formula:
z121=g(a1*w11+a2*w12+a3*w13)
z122=g(a1*w21+a2*w22+a3*w23)
z123=g(a1*w31+a2*w32+a3*w33)
Optionally, a1, a2, and a3 are variables received by each neuron in the input layer. If these three variables are represented by a vector a, the left side of the equations, [z121, z122, z123], is represented by a vector z, and the weights are represented by a matrix w, the formula z = g(a * w) is obtained.
Wherein,
w = [w11 w21 w31
     w12 w22 w32
     w13 w23 w33]
Optionally, a1, a2 and a3 are variables received by each neuron in the input layer. If the three variables are represented by a vector a, the quantities z121, z122 and z123 on the left side of the equations can each be obtained through a weight vector w1, w2 and w3, respectively, giving the formulas z121 = g(a * w1); z122 = g(a * w2); z123 = g(a * w3).
Wherein, w1 = [w11 w12 w13]; w2 = [w21 w22 w23]; w3 = [w31 w32 w33].
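As a minimal sketch (not part of the patent; the sigmoid form of g and the weight values below are assumptions for illustration), the hidden-layer computation z121 = g(a1*w11 + a2*w12 + a3*w13) of fig. 1 can be written as:

```python
import math

def g(x):
    # processing function of each neuron; a sigmoid is assumed here
    return 1.0 / (1.0 + math.exp(-x))

def fully_connected_layer(a, w):
    # z_j = g(a1*wj1 + a2*wj2 + a3*wj3), i.e. z_j = g(a * w_j)
    return [g(sum(ai * wji for ai, wji in zip(a, wj))) for wj in w]

# illustrative inputs a1, a2, a3 and weight vectors w1, w2, w3
a = [1.0, 2.0, 3.0]
w = [[0.1, 0.2, 0.3],   # w1 = [w11 w12 w13]
     [0.4, 0.5, 0.6],   # w2 = [w21 w22 w23]
     [0.7, 0.8, 0.9]]   # w3 = [w31 w32 w33]
z = fully_connected_layer(a, w)  # [z121, z122, z123]
```

Each row of w plays the role of one weight vector w1, w2 or w3, so the three per-neuron formulas and the matrix form z = g(a * w) compute the same values.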
Optionally, the weight corresponding to each neuron in the second hidden layer 130 is equal to the weight corresponding to the correspondingly positioned neuron in the first hidden layer 120; that is, the neurons in the second hidden layer 130 and the neurons in the first hidden layer 120 share the same matrix w.
2. The recurrent neural network model refers to a model in which neurons that are located in adjacent neural layers and occupy the same position within their respective layers have the same association relationship.
Referring to the recurrent neural network model shown in fig. 2, the recurrent neural network model includes an input layer 210, a first layer hidden layer 220, a second layer hidden layer 230, and an output layer 240.
Wherein, the neuron 211 in the input layer 210 is connected with the neuron 222 in the first hidden layer 220, with weight w21; the neuron 212 in the input layer 210 is connected to the neurons 221 and 223 in the first hidden layer 220, with weights w12 and w32, respectively; and the neuron 213 in the input layer 210 is connected to the neuron 222 in the first hidden layer 220, with weight w23.
The connection relationship between the neurons in the first layer hidden layer 220 and the neurons in the second layer hidden layer 230 is the same as the connection relationship between the neurons in the input layer 210 and the neurons in the first layer hidden layer 220.
The connection relationship between the neurons in the second layer hidden layer 230 and the neurons in the output layer 240 is the same as the connection relationship between the neurons in the first layer hidden layer 220 and the neurons in the second layer hidden layer 230.
Alternatively, in fig. 2 above, assuming that each neuron is represented by a function g (x), the individual neurons in the first layer hidden layer 220 can be represented by the following formula:
Z221=g(a2*w12)
Z222=g(a1*w21+a3*w23)
Z223=g(a2*w32)
Optionally, a1, a2, and a3 are variables received by each neuron in the input layer. If these three variables are represented by a vector a, the left side of the equations, [z221, z222, z223], is represented by the vector z, and the weights corresponding to the neurons in the first hidden layer are represented by a matrix w, the formula z = g(a * w) is obtained.
Wherein,
w = [0   w21 0
     w12 0   w32
     0   w23 0]
Optionally, a1, a2 and a3 are variables received by each neuron in the input layer. If the three variables are represented by a vector a, the formulas z221 = g(a * w1); z222 = g(a * w2); z223 = g(a * w3) are obtained.
Wherein, w1 = [0 w12 0]; w2 = [w21 0 w23]; w3 = [0 w32 0].
Optionally, the neurons in the second hidden layer 230 and in the output layer 240 share the same matrix w with the correspondingly positioned neurons in the preceding neural layers.
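A corresponding sketch for the recurrent-style model of fig. 2 (again an illustration with assumed weight values, not the patent's implementation): absent connections are stored as zeros, and every hidden layer reuses the same matrix w:

```python
import math

def g(x):
    # assumed sigmoid processing function
    return 1.0 / (1.0 + math.exp(-x))

# sparse weight matrix of fig. 2; a 0 entry means "no connection",
# so z221 depends only on a2, z222 on a1 and a3, and z223 on a2
w = [[0.0, 0.5, 0.0],   # weights leaving neuron 211 (w21)
     [0.3, 0.0, 0.7],   # weights leaving neuron 212 (w12, w32)
     [0.0, 0.2, 0.0]]   # weights leaving neuron 213 (w23)

def layer(a):
    # z_j = g(sum_i a_i * w[i][j]); the same w is shared by every layer
    return [g(sum(a[i] * w[i][j] for i in range(3))) for j in range(3)]

a = [1.0, -1.0, 0.5]
hidden1 = layer(a)        # first hidden layer
hidden2 = layer(hidden1)  # second hidden layer reuses the same w
```

Because all layers share w, stacking more layers is just applying the same function again, which is what lets the model get by with a single parameter matrix.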
Optionally, in this application, when the neural network model is applied to a data inference scenario, a hidden layer in the neural network model is taken as an inference layer in the inference model, and neurons in the neural layer are taken as inference units in the inference model for illustration, and in actual implementation, the hidden layer and the neurons may also be referred to by other names, which is not limited in this application.
In the related art, a computer device determines a probabilistic graphical model according to a preset database, and because there are many model parameters in the probabilistic graphical model, the efficiency of training the probabilistic graphical model and performing data inference according to the probabilistic graphical model is low.
Based on the technical problem, the inference model is established according to the neural network, and the inference model comprises at least two inference units sharing the same first parameter matrix, so that model parameters in the inference model are simplified, and the efficiency of model training and data inference is improved.
Referring to fig. 3, a flow chart of a data inference method provided in an embodiment of the present application is shown. The present embodiment is exemplified by the application of the data inference method to a computer device having data inference capabilities. The data reasoning method comprises the following steps:
Step 301, obtaining inference conditions.
The inference conditions include observed inference conditions and default inference conditions corresponding to unobserved inference conditions.
The observed inference condition is used for indicating the sampled characteristic variables in the database; the unobserved inference conditions are used to indicate the feature variables to be inferred, and the unobserved inference conditions belong to different feature classes than the observed inference conditions.
The sum of the number of observed inference conditions and the number of unobserved inference conditions is equal to the number of feature classes in the database.
Wherein the database is stored in the computer device. The database includes x feature classes and at least one feature variable corresponding to each feature class. Optionally, the feature variables stored in the database are obtained by sampling data having attributes of corresponding feature classes.
Optionally, each feature classification is used to indicate the attributes that different feature variables have.
Referring to the database 400 shown in fig. 4, the database 400 includes 6 feature classifications, which are: movie title, movie genre, director, actors, show time, and show area. Each feature classification corresponds to two feature variables. The characteristic variables located in the same row are collected at the same time.
Referring to the database 500 shown in fig. 5, the database 500 includes 3 feature classifications, which are: product type, consumer type, and time of purchase. Each feature classification corresponds to a feature variable. The characteristic variables located in the same row are collected at the same time.
Optionally, the computer device stores at least one database, each database corresponds to a different inference scenario, and each database corresponds to an inference model. For example: the database 400 corresponds to a movie inference scenario; the database 500 corresponds to a product inference scenario.
Optionally, the databases shown in fig. 4 and fig. 5 are only schematic. In actual implementation, the number of feature variables corresponding to each feature classification is usually large, such as hundreds or thousands of feature variables per feature classification, which is not limited in this application.
Optionally, the number of observed inference conditions is at least one.
Optionally, the default inference condition is 0; alternatively, the default inference condition is that no inference condition is input.
Schematically, in the database 500 shown in fig. 5, if the feature variable "mobile phone" is input into the inference model as an observed inference condition, all the other inference conditions are unobserved inference conditions; therefore, 0 is input to each corresponding inference unit, or the inference units corresponding to the unobserved inference conditions receive no data input.
Optionally, the computer device obtains observed inference conditions input by the user, and automatically generates default inference conditions corresponding to the unobserved inference conditions.
Step 302, obtaining an inference model.
The reasoning model is used for carrying out data reasoning on the input reasoning conditions to obtain a reasoning result.
Optionally, the computer device trains a pre-stored initial reasoning model according to the database to obtain a reasoning model; alternatively, the computer device obtains a pre-stored inference model.
The inference model comprises an input layer, n inference layers, an output layer and at least m first parameter matrixes, wherein each inference layer of the input layer and the n inference layers and the output layer comprise at least one inference unit; n and m are positive integers.
Optionally, the inference model includes, in order from front to back according to the data inference order: the system comprises an input layer, n layers of inference layers and an output layer, wherein n is larger than 3.
In this embodiment, the inference model includes at least two inference units sharing the same first parameter matrix; the first parameter matrix used by each inference unit is used for indicating the degree of association between the inference unit and the associated inference unit.
Optionally, for an inference unit located in a first layer of inference layers of the n layers of inference layers, the associated inference unit of the inference unit refers to an inference unit located in the input layer and having an association relationship with the inference unit.
Optionally, for inference units in inference layers other than the first layer of inference layer in the n layers of inference layers; or, for the inference unit located in the output layer, the associated inference unit of the inference unit refers to the inference unit located in the previous layer of inference layer and having an association relation with the inference unit.
Because the reasoning model comprises at least two reasoning units sharing the same first parameter matrix, the first parameter matrix does not need to be configured for each reasoning unit, the number of the parameters of the model in the reasoning model is simplified, and the complexity of data reasoning is simplified.
Step 303, inputting the inference condition into the inference model to obtain a model inference result.
Inputting the observed inference conditions into the corresponding inference unit in the input layer, and performing data inference through an inference model to obtain a model inference result.
Referring to fig. 6, if the inference conditions received by the input layer are: a1, a2 and a3, wherein a1 is a default inference condition corresponding to an unobserved inference condition, and a1 is input into the inference unit 611; a2 is the observed inference condition, a2 is input into the inference unit 612; a3 is the observed inference condition, a3 is input to the inference unit 613. After the data inference is performed by the inference model, the inference unit 641 in the output layer 640 outputs an inference result corresponding to the default inference condition a 1; the inference result is inferred from a 2.
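The handling of observed and default inference conditions in the fig. 6 example can be sketched as follows (a hypothetical helper with illustrative values; the embodiment only specifies that unobserved conditions default to 0):

```python
# one inference unit per feature classification in the input layer
feature_classes = ["a1", "a2", "a3"]

def build_inference_conditions(observed):
    """Map each feature classification to its observed inference condition,
    or to the default condition 0 if it was not observed."""
    return [observed.get(c, 0) for c in feature_classes]

# a2 and a3 were observed, a1 was not (as in the fig. 6 example);
# the numeric values 0.4 and 0.9 are invented for illustration
conditions = build_inference_conditions({"a2": 0.4, "a3": 0.9})
```

The resulting list can be fed to the input-layer inference units in order, with the default 0 occupying the slot of every unobserved feature classification.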
In summary, in the data inference method provided in this embodiment, the inference model including at least two inference units sharing the same first parameter matrix is used for data inference, so that the problem that when a probability graph is used for data inference, model parameters in the probability graph are too many, which results in higher complexity of data inference can be solved; different inference units can share the same first parameter matrix, so that model parameters of an inference model are simplified, complexity of data inference is simplified, and efficiency of data inference is improved.
Optionally, in this application, since the input layer is configured to receive inference conditions, and the types of the inference conditions are in one-to-one correspondence with the feature classifications in the database, in order to ensure that each type of inference condition can be input to a corresponding inference unit, the number of inference units in the input layer is equal to the number of feature classifications in the database. In order to ensure that the inference result corresponding to each type of inference condition can be output from the corresponding inference unit, the number of inference units in the output layer is equal to the number of feature classifications in the database. That is, the number of inference units in the input layer is equal to the number of inference units in the output layer, and both are equal to the number of feature classifications in the database.
Optionally, three inference models are provided in the present application, which are: establishing a reasoning model according to the full connection relation; a reasoning model established according to the probability graph and a reasoning model established according to the random incidence relation. The corresponding data reasoning process is different according to different reasoning models.
The fully connected relation means that each inference unit in the i-th layer has an association relation with all inference units in the (i-1)-th layer, where i is an integer and 2 ≤ i ≤ n + 2.
The probability map is determined by the computer device according to the feature classification and the feature variable in the database; the probability map includes associations between feature classifications. In the application, each inference unit in each inference layer or each inference unit in each output layer corresponds to one feature classification in the database, so the probability map can indicate the association relationship between inference units of different types.
The random incidence relation is generated by the computer equipment through a random algorithm; or, it is set randomly by the user. The random association includes associations between feature classifications. In the application, each inference unit in each inference layer or each inference unit in each output layer corresponds to one feature classification in the database, so that the random association relationship can indicate the association relationship between inference units of different types.
Each inference model and the data inference mode corresponding to each inference model are introduced below.
Firstly, the method comprises the following steps: and establishing the reasoning model according to the full connection relation. At this time, in the inference model, the inference units in the adjacent layers are in a full connection relationship.
For example, in fig. 7, each inference unit in the first layer of inference hierarchy 720 has an association relationship with all inference units in the input hierarchy 710; each inference unit in the second layer of inference layer 730 has an association relation with all inference units in the first layer of inference layer 720; each inference unit in the third layer of inference layer 740 has an association relation with all inference units in the second layer of inference layer 730; each inference unit in the output layer 750 has an association relationship with all inference units in the third layer of inference layer 740.
The number of the inference units in the input layer and the number of the inference units in the output layer are both x.
The number of inference units in each of the n inference layers is y. In the n inference layers, the type of the b-th inference unit in the a-th inference layer is the same as that of the b-th inference unit in the c-th inference layer, where a, b and c are integers, 1 ≤ a ≤ n, 1 ≤ b ≤ y, and 1 ≤ c ≤ n.
In this embodiment, on the premise of ensuring a high probability that the inference result of the data inference is correct, the number of inference units in each of the n inference layers is set to be equal in order to increase the number of inference units that share the same first parameter matrix. This ensures that the first parameter matrices used by the b-th inference units in different inference layers have the same dimensions, so that, when those first parameter matrices are equal, the b-th inference units in different inference layers can share the same first parameter matrix, thereby reducing the parameters in the inference model and improving the efficiency of data inference.
When the b-th inference units in different inference layers share the same first parameter matrix, at least m first parameter matrices in the inference model comprise: the first parameter matrix is shared by the reasoning units of the same type in the reasoning layers except the first layer reasoning layer; the first layer of inference layer is the first layer of n layers of inference layers when the inference layers are sorted according to the data inference sequence, and the value of m is the maximum value of x and y.
Optionally, when x and y are equal, i.e., x = y = m, the inference model includes m first parameter matrices, and the m first parameter matrices include: the first parameter matrices shared by inference units of the same type in the n inference layers and the output layer, wherein the type of the e-th inference unit in the d-th inference layer is the same as that of the e-th inference unit in the output layer, d and e are integers, 1 ≤ d ≤ n, and 1 ≤ e ≤ y. Referring to fig. 1, the inference model includes: a first parameter matrix w1 shared by the inference units 121, 131 and 141; a first parameter matrix w2 shared by the inference units 122, 132 and 142; and a first parameter matrix w3 shared by the inference units 123, 133 and 143.
Optionally, when x and y are not equal, the inference model includes at least m first parameter matrices, including: the first parameter matrix corresponding to each inference unit in the first layer of inference layer, the first parameter matrix corresponding to each inference unit in the output layer, and the first parameter matrix shared by inference units of the same type in the inference layers except the first layer of inference layer. Such as: in fig. 7, the inference model includes: a first parameter matrix used by the inference unit 721; a first parameter matrix used by the inference unit 722; a first parameter matrix used by the inference unit 723; a first parameter matrix common to inference units 731 and 741; a first parameter matrix common to inference units 732 and 742; a first parameter matrix used by the inference unit 751; a first parameter matrix used by the inference unit 752; the first parameter matrix used by the inference unit 753.
Optionally, in the n inference layers, some of the inference units of the same type located in different inference layers share the same first parameter matrix. For example: in fig. 1, the inference units 121, 131 and 141 share a first parameter matrix w1, and the inference units 122, 132 and 142 share a first parameter matrix w2, while the inference units 123, 133 and 143 each use a different first parameter matrix. For another example: in fig. 7, the inference units 731 and 741 share the same first parameter matrix, while the inference units 732 and 742 each use a different first parameter matrix.
When the inference model is built according to the full-connection relationship, the inference condition is input into the inference model to obtain the model inference result, which includes but is not limited to the following steps:
1) Inputting the x inference conditions corresponding to the x inference units in the input layer.
When the inference model is established according to the full connection relationship, m is the maximum value of x and y, and x may be equal to m or different from m.
2) Each inference unit in the input layer outputs its inference condition to all inference units in the first layer of inference layers.
Referring to the inference model shown in fig. 7, each inference unit in the input layer 710 has an association relationship with all inference units in the first layer inference layer 720, and according to the association relationship, each inference unit in the input layer 710 outputs inference conditions to all inference units in the first layer inference layer 720.
3) For each inference unit in each inference layer, determining the output result of the current layer according to the output result of the previous layer and the first parameter matrix used, and sending the output result of the current layer to all inference units in the next layer.
Wherein the previous layer is an input layer; or, the inference level is one of the n inference levels. The next layer is one of the n inference layers; or an output layer.
Referring to the inference model shown in fig. 7, for each inference unit in the inference layer 720 of the first layer, the output result of the current layer is determined according to the output results of all inference units in the input layer 710 and the first parameter matrix of the inference unit, and the output result of the current layer is output to the inference layer 720 of the second layer. For each inference unit in the second layer of inference layer 730, determining the output result of the layer according to the output results of all inference units in the first layer of inference layer 720 and the first parameter matrix of the inference unit, and outputting the output result of the layer to all inference units in the third layer of inference layer 740. For each inference unit in the third layer of inference layer 740, determining the output result of the layer according to the output results of all inference units in the second layer of inference layer 730 and the first parameter matrix of the inference unit, and outputting the output result of the layer to all inference units in the output layer 750.
Optionally, for each inference unit in the first layer of inference layers, the output result of the previous layer is the inference condition output by the input layer.
Schematically, this step is represented by the following formula:
z(k, v) = f( Σu z(k-1, u) * w(v) )

wherein z(k, v) represents the output result of the v-th inference unit in the k-th inference layer; f(x) represents the processing function of the inference units in the k-th inference layer; u indexes the inference units in the previous layer that have an association relation with the v-th inference unit; z(k-1, u) represents the output result of the u-th inference unit in the previous layer; and w(v) represents the first parameter matrix used by the v-th inference unit. Wherein k is greater than 1 and less than n, and v is a positive integer less than y.
Alternatively, in this embodiment, f (x) may be a non-linear perturbation function, and the processing function represented by f (x) is represented by the following formula.
f(x) = 1 / (1 + e^(-x))
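The layer-by-layer computation of step 3), together with the function f(x) above, can be sketched as follows (the weight values are illustrative only, and a single shared first parameter matrix for all layers is assumed, as in the fully-connected case with x = y):

```python
import math

def f(x):
    # nonlinear processing function of the inference units: f(x) = 1/(1+e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def inference_layer(prev, w):
    # each unit v combines the outputs of all units u in the previous layer
    # with its first parameter matrix entries w[u][v]
    return [f(sum(prev[u] * w[u][v] for u in range(len(prev))))
            for v in range(len(w[0]))]

# illustrative shared first parameter matrix, reused by every inference layer
w_shared = [[0.2, -0.1, 0.4],
            [0.5, 0.3, -0.2],
            [-0.3, 0.6, 0.1]]

h = [0, 0.4, 0.9]        # inference conditions received from the input layer
for _ in range(3):       # n = 3 inference layers sharing w_shared
    h = inference_layer(h, w_shared)
```

Because every layer reuses w_shared, only one matrix of parameters has to be stored and trained regardless of the number of inference layers, which is the parameter saving the embodiment describes.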
4) Each inference unit in the output layer determines the model inference result according to the output result of the n-th inference layer and the first parameter matrix used.
Optionally, the processing function of the inference unit in the output layer is different from the processing function of the inference unit in the n-layer inference layer.
Schematically, this step is represented by the following formula:
z(out, v) = softmax( Σu z(n, u) * w(v) )

wherein z(out, v) represents the output result of the v-th inference unit in the output layer; softmax(x) represents the processing function of the inference units in the output layer; u indexes the inference units in the n-th inference layer that have an association relation with the v-th inference unit; w(v) represents the first parameter matrix used by the v-th inference unit; and z(n, u) represents the output result of the u-th inference unit in the n-th inference layer. Wherein v is a positive integer less than y.
Optionally, in this embodiment, softmax (x) refers to an operation of unitizing a vector, and the processing function represented by softmax (x) is represented by the following formula, where i is the ith element in the vector, and j is the sum of all elements in the vector.
In summary, in the data inference method provided in this embodiment, the number of inference units in each layer of inference layer of the inference model is set to be y, and the first parameter matrix shared by inference units of the same type in inference layers other than the first layer of inference layer is set, so that model parameters in the inference model are reduced as much as possible, the complexity of data inference is simplified, and the efficiency of data inference is improved.
In addition, the number of the reasoning units in the input layer, the number of the reasoning units in the output layer and the number of the reasoning units in each reasoning layer are set to be equal, so that the reasoning units of the same type in the n layers of reasoning layers and the output layer can share the same first parameter matrix, the model parameters in the reasoning model are reduced as much as possible, the complexity of data reasoning is simplified, and the efficiency of data reasoning is improved.
Optionally, in this application, when the first parameter matrices of different inference units in the same layer inference layer are the same, the different inference units in the same layer inference layer may also share the same first parameter matrix. When the first parameter matrices of different inference units in the output layer are the same, different inference units in the output layer may also share the same first parameter matrix.
The second way: the inference model is built according to a probability map or a random association relation. Such as: in fig. 8, the probability map includes the association relations among the feature classifications 81, 82 and 83. Then, in the inference model built based on this probability map, shown in fig. 9: the inference units 811, 821, 831 and 841 correspond to feature classification 81; the inference units 812, 822, 832 and 842 correspond to feature classification 82; and the inference units 813, 823, 833 and 843 correspond to feature classification 83. According to the probability map, the inference unit 811 is connected with the inference units 822 and 823, the inference unit 821 with the inference units 832 and 833, and the inference unit 831 with the inference units 842 and 843; the inference unit 812 is connected with the inference units 821 and 823, the inference unit 822 with the inference units 831 and 833, and the inference unit 832 with the inference units 841 and 843; the inference unit 813 is connected with the inference units 821 and 822, the inference unit 823 with the inference units 831 and 832, and the inference unit 833 with the inference units 841 and 842.
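A minimal sketch of how the fig. 8 / fig. 9 connection pattern above could be generated programmatically; the helper names and the unit-numbering convention (800 + 10·layer + classification index) are assumptions read off the figure references:

```python
# Associations among the three feature classifications of fig. 8,
# by classification index (1 -> 81, 2 -> 82, 3 -> 83).
associations = {1: [2, 3], 2: [1, 3], 3: [1, 2]}

def unit_id(layer, cls):
    # Unit numbering of fig. 9: e.g. layer 1, classification index 1 -> 811.
    return 800 + 10 * layer + cls

def connect(layer_from, layer_to):
    # The unit for classification c connects to the next-layer units for
    # every classification associated with c in the probability map.
    return [(unit_id(layer_from, c), unit_id(layer_to, c2))
            for c, assoc in associations.items() for c2 in assoc]

edges = connect(1, 2)  # e.g. 811 -> 822 and 811 -> 823, as in fig. 9
```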
As the inference units in the inference model correspond to the feature classifications, in order to ensure that the association relations between inference units in different layers can be established according to the probability map or the random association relation, the number of inference units in the input layer, in each of the n inference layers, and in the output layer is m in each case; the type of the g-th inference unit in the f-th layer is the same as that of the g-th inference unit in the h-th layer, where f, g and h are integers, 1 ≤ f ≤ n + 2, 1 ≤ g ≤ m, and 1 ≤ h ≤ n + 2.
Optionally, when the inference model is built according to a probability map or a random association relationship, the inference model includes m first parameter matrices, and inference units of the same type in the n layers of inference layers and the output layer share the same first parameter matrix.
Such as: in fig. 9, the m first parameter matrices in the inference model include: a first parameter matrix common to the inference units 821, 831 and 841; a first parameter matrix common to the inference units 822, 832 and 842; and a first parameter matrix common to the inference units 823, 833 and 843.
Optionally, the j-th inference unit in the input layer and the j-th inference unit in each layer of inference layer have an association relationship, j is an integer, and j is greater than or equal to 1 and less than or equal to m; the reasoning model also comprises m second parameter matrixes, the reasoning units of the same type in the n layers of reasoning layers share the same second parameter matrix, and the second parameter matrix used by each reasoning unit is used for indicating the degree of association between the reasoning unit and the reasoning unit of the same type in the input layer.
In this embodiment, for each inference unit in the input layer, by establishing an association relationship between the inference unit and the inference unit of the same type in each inference layer, the inference model does not weaken the influence of inference conditions when performing data inference according to the inference conditions, and can improve the accuracy of data inference. In addition, the same type of reasoning units in different reasoning layers are arranged to share the second parameter matrix, so that the parameters in the reasoning model are reduced, and the data reasoning efficiency can be improved.
When the inference model is built according to the probability map or the random association relation, inputting the inference conditions into the inference model to obtain the model inference result includes, but is not limited to, the following steps:
1) Inputting the m inference conditions into the m inference units in the input layer correspondingly.
As the inference model is obtained based on the probability map or the random association relation, the feature classifications of the m inference conditions correspond one-to-one to the m inference units, and each inference layer, the input layer and the output layer each include m inference units.
2) Each inference unit in the input layer outputs inference conditions to an inference unit having an association relation with the inference unit.
The inference units having an association relation with an inference unit in the input layer include: the inference units located in the first inference layer. Such as: in fig. 6, for the inference unit 611 in the input layer 610, the inference units having an association relation with the inference unit 611 include the inference units 622 and 623 in the first layer inference layer 620.
Optionally, the inference units having an association relation with an inference unit in the input layer further include: the inference unit in each of the n inference layers that is of the same type as that inference unit in the input layer. Such as: in fig. 6, for the inference unit 611 in the input layer 610, the inference units having an association relation with the inference unit 611 further include: the inference unit 621 in the first layer inference layer 620 and the inference unit 631 in the second layer inference layer 630.
3) For each inference unit in the first inference layer, determining a layer inference result according to the received inference condition, the second parameter matrix used by the inference unit and the first parameter matrix used by the inference unit, and sending the layer inference result to k inference units in the second inference layer, wherein the inference unit sending the layer inference result is an associated inference unit of the k inference units, and k is a positive integer less than or equal to m.
Illustratively, the layer inference result determined by each inference unit in the first inference layer can be represented by the following formula:

$$y_v^{(1)} = f\Big(W_v\, x_v + \sum_{u \in N(v)} A_v\, x_u\Big)$$

wherein $y_v^{(1)}$ is the layer inference result of the v-th inference unit in the first inference layer; $f(x)$ is the processing function of the inference units in the inference layers; $W_v$ is the second parameter matrix used by the v-th inference unit and $x_v$ is the inference condition received by the v-th inference unit in the input layer; $N(v)$ is the set of associated inference units having an association relation with the v-th inference unit; $A_v$ is the first parameter matrix used by the v-th inference unit; and $x_u$ is the inference condition received by the associated inference unit u in the input layer.
The related description of f (x) is described in detail in the first data inference method, and this embodiment is not repeated herein.
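The per-unit computation in step 3) can be sketched as follows: each first-layer unit combines its own input (weighted by its second parameter matrix) with the inputs of its associated input-layer units (weighted by its first parameter matrix). The symbol names (W_v for the second parameter matrix, A_v for the first) and the choice f = tanh are assumptions for illustration, not fixed by the patent:

```python
import numpy as np

def first_layer_unit(W_v, A_v, x_v, assoc_xs, f=np.tanh):
    # W_v: second parameter matrix of unit v (link to the same-type input unit)
    # A_v: first parameter matrix of unit v (links to associated input units)
    # x_v: inference condition received by the same-type input-layer unit
    # assoc_xs: inference conditions from the associated input-layer units
    z = W_v @ x_v + sum(A_v @ x_u for x_u in assoc_xs)
    return f(z)

# Toy call with 2-dimensional inference conditions.
y = first_layer_unit(np.eye(2), np.eye(2),
                     np.array([0.0, 0.0]), [np.array([0.0, 0.0])])
```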
4) For each inference unit in each inference layer other than the first inference layer, determining the layer inference result according to the received inference condition, the second parameter matrix used, the layer inference results output by the associated inference units and the first parameter matrix used, and sending the layer inference result to q inference units in the next layer, wherein the inference unit sending the layer inference result is an associated inference unit of the q inference units, and q is a positive integer less than or equal to m.

Illustratively, the layer inference result determined by each inference unit in each inference layer other than the first inference layer can be represented by the following formula:

$$y_v^{(k)} = f\Big(W_v\, x_v + \sum_{u \in N(v)} A_v\, y_u^{(k-1)}\Big)$$

wherein $y_v^{(k)}$ is the layer inference result of the v-th inference unit in the k-th inference layer, where k is an integer greater than 1 and not greater than n; $f(x)$ is the processing function of the inference units in the inference layers; $W_v$ is the second parameter matrix used by the v-th inference unit and $x_v$ is the inference condition received by the v-th inference unit in the input layer; $N(v)$ is the set of associated inference units having an association relation with the v-th inference unit; and $y_u^{(k-1)}$ is the layer inference result output by the inference unit u in the previous inference layer that has an association relation with the v-th inference unit.

5) The inference units in the output layer determine the model inference result according to the received layer inference results and the first parameter matrices used.
The model inference result determined by an inference unit in the output layer can be represented by the following formula:

$$y_v^{o} = \mathrm{softmax}\Big(\sum_{u \in N(v)} A_v^{\mathsf{T}}\, y_u^{(n)}\Big)$$

wherein $y_v^{o}$ represents the output result of the v-th inference unit in the output layer; $\mathrm{softmax}(x)$ represents the processing function of the inference units in the output layer; $N(v)$ is the set of inference units u in the previous (n-th) inference layer that have an association relation with the v-th inference unit; $A_v^{\mathsf{T}}$ is the transpose of the first parameter matrix used by the v-th inference unit; and $y_u^{(n)}$ is the output result of the u-th inference unit in the n-th inference layer. Here v is a positive integer not greater than m.
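A sketch of this output-layer step: the v-th unit aggregates the layer results of its associated units through the transpose of its first parameter matrix and applies softmax. Names and shapes are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def output_unit(A_v, assoc_ys):
    # A_v.T: transpose of the first parameter matrix used by output unit v
    # assoc_ys: layer inference results from the associated units in layer n
    z = sum(A_v.T @ y_u for y_u in assoc_ys)
    return softmax(z)

out = output_unit(np.eye(3), [np.array([1.0, 2.0, 3.0])])
```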
In summary, when the inference model is built according to the probability map or the random association relationship, the number of inference units in the input layer, each layer of inference layer, and the output layer of the inference model is set to be equal, and inference units of the same type in different layers are set to share the same first parameter matrix, so that model parameters in the inference model are reduced as much as possible, the complexity of data inference is simplified, and the efficiency of data inference is improved.
In addition, when the reasoning model also comprises a second parameter matrix, the complexity of the reasoning model is simplified by setting the same type of reasoning units in different reasoning layers to share the same second parameter matrix, thereby simplifying the complexity of data reasoning and improving the efficiency of data reasoning.
Optionally, in this application, when the first parameter matrices of different inference units in the same layer inference layer are the same, the different inference units in the same layer inference layer may also share the same first parameter matrix. When the first parameter matrices of different inference units in the output layer are the same, different inference units in the output layer may also share the same first parameter matrix.
Optionally, in the present application, the inference model is obtained by training according to a database and an initial inference model, where the database includes x feature classifications and a feature variable corresponding to each feature classification; the initial reasoning model comprises an input layer, n layers of reasoning layers, an output layer and at least m initial first parameter matrixes, the number x of the feature classifications is equal to the number of the reasoning units in the input layer, and the number x of the feature classifications is equal to the number of the reasoning units in the output layer.
The first parameter matrix in the inference model is obtained by training the initial first parameter matrix in the initial inference model.
Optionally, in the present application, the inference model is obtained by performing self-supervision inference model training on the initial inference model according to the input-output pair; the input and output pairs are established according to a database, each group of input and output pairs comprises input characteristic variables and output characteristic variables, the input characteristic variables are used for inputting the initial reasoning model in the training stage of the model, and the output characteristic variables are used for indicating expected reasoning results of the input characteristic variables in the training stage of the model; and the self-supervision reasoning model training is used for updating at least m initial first parameter matrixes according to the reasoning loss between the expected reasoning result and the actual reasoning result to obtain the reasoning model.
The following describes the training process of the inference model.
Referring to fig. 10, a flowchart of a method for training an inference model provided in an embodiment of the present application is shown. The present embodiment is exemplified by the application of the training method to a computer device having the training capability of an inference model. The training method comprises the following steps:
step 1001, a database is obtained, where the database includes x feature classifications and at least one feature variable corresponding to each feature classification.
The database is stored in the computer device. Referring to step 301 in the embodiment described in fig. 3, related description of the database is omitted here for brevity.
Step 1002, an initial inference model is created.
The initial reasoning model comprises an input layer, n layers of reasoning layers, an output layer and at least m initial first parameter matrixes, wherein the input layer, each layer of reasoning layer and the output layer comprise at least one reasoning unit, and n and m are positive integers.
Optionally, the number n of inference layers of the initial inference model is set by a developer; alternatively, it is set by default by the computer device.
The reasoning unit is used for processing the received data.
The number of the inference units in the input layer and the output layer is equal, and the number of the inference units in the input layer and the number of the inference units in the output layer are equal to the number of the feature classifications in the database. Each inference unit corresponds to a feature classification in the database. Such as: in the initial inference model corresponding to the database 400, the input layer and the output layer each include 6 inference units, and each inference unit corresponds to a feature classification.
Alternatively, the number of inference units in the inference layer may be equal to the number of feature classifications in the database; or may be different from the number of feature classes in the database.
Optionally, the initial first parameter matrix used by each inference unit is formed by weights between the inference unit and the associated inference unit; the initial first parameter matrix used by each inference unit is used to indicate the degree of association between the inference unit and the associated inference unit. The correlation degree is in positive correlation with the weight in the first parameter matrix.
The associated reasoning unit is a reasoning unit which is positioned in the previous layer of reasoning layer and has an associated relation with the reasoning unit; or, the associated reasoning unit is a reasoning unit which is positioned in the input layer and has an associated relation with the reasoning unit.
Schematically, for the matrix w of the fully-connected neural network model shown in FIG. 11 (the initial first parameter matrix corresponding to inference unit z121): the element w11 is used for indicating the degree of association between the inference unit z121 and the associated inference unit z111; w12 is used for indicating the degree of association between z121 and the associated inference unit z112; and w13 is used for indicating the degree of association between z121 and the associated inference unit z113.
Assuming that each inference unit uses its own initial first parameter matrix, the initial inference model shown in fig. 1 includes 3 × 4 = 12 initial first parameter matrices. As the number of inference layers and/or the number of inference units increases, the number of initial first parameter matrices increases. Since a larger number of initial first parameter matrices means a higher complexity of training the initial inference model, in this embodiment the initial inference model includes at least two inference units sharing the same initial first parameter matrix. Because different inference units use the same initial first parameter matrix, the number of initial first parameter matrices in the initial inference model is reduced, and the complexity of training the initial inference model is reduced. In addition, the more inference units share an initial first parameter matrix, the greater the reduction in training complexity.
Such as: in fig. 1, the inference units 121, 131 and 141 share the same initial first parameter matrix w1.
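Sharing can be realized simply by having the units reference one and the same matrix object, so that a single in-place update is visible to all of them. The dict layout below is purely illustrative:

```python
import numpy as np

w1 = np.zeros((3, 3))  # one initial first parameter matrix, shared

# Units 121, 131 and 141 hold references to the same object, not copies.
unit_121 = {"W": w1}
unit_131 = {"W": w1}
unit_141 = {"W": w1}

w1 += 0.5  # a single in-place training update reaches every sharing unit
```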
The manner of creating the initial inference model in this step is detailed in three creating manners described later.
Step 1003, inputting the characteristic variables corresponding to the i characteristic classifications in the database into an initial reasoning model, and training the initial reasoning model to obtain a reasoning model.
The reasoning model is used for obtaining a model reasoning result according to input reasoning conditions in the data reasoning process, and i is a positive integer smaller than x.
Optionally, in this embodiment, the initial inference model is trained in a self-supervised model training manner.
Illustratively, the computer device training the initial inference model includes, but is not limited to, the following steps:
1) at least one set of input-output pairs is created from the database.
Wherein each set of input-output pairs comprises an input characteristic variable and an output characteristic variable.
Optionally, the input feature variable and the output feature variable in the same group of input-output pairs are acquired at the same time and have the attribute of the corresponding feature classification. Different feature variables (including input feature variables and output feature variables) in the same input and output pair group correspond to different feature classifications.
Illustratively, the input-output pairs created by the computer device from the database 500 include: (abcd, martial arts, LA) -> ZRF; LA -> China; WJW -> love; WJW -> Hong Kong. The feature variables before the arrow are input feature variables, and the feature variables after the arrow are output feature variables.
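A sketch of building one such input-output pair from feature variables collected at the same time; the record keys are hypothetical placeholders for feature classifications:

```python
# Feature variables collected at the same time, keyed by feature
# classification (key names are hypothetical).
record = {"name": "abcd", "genre": "martial arts",
          "place": "LA", "person": "ZRF"}

def make_pair(record, input_keys, output_key):
    # Input feature variables -> expected output feature variable.
    return tuple(record[k] for k in input_keys), record[output_key]

pair = make_pair(record, ["name", "genre", "place"], "person")
```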
2) Inputting the feature variables corresponding to the i feature classifications into the initial inference model to obtain a training inference result.
The feature variables corresponding to the i feature classifications are input feature variables in the input-output pairs, and the i feature classifications are acquired at the same time. Such as: the feature variables corresponding to the i feature classes are (abcd, martial arts, LA).
3) Comparing the training inference result with the m - i feature variables to obtain the inference loss.
The m-i characteristic variables comprise characteristic variables corresponding to at least one characteristic classification in the database except the i characteristic classifications, and the characteristic variables corresponding to the at least one characteristic classification are collected at the same time. In this embodiment, the m-i feature variables include output feature variables in the input-output pair to which the feature variables corresponding to the i feature classifications belong. Such as: the m-i feature variables include (abcd, martial arts, LA) corresponding (ZRF).
Optionally, the inference penalty is used to indicate an error between the trained inference result and the expected inference result. And the expected inference result corresponding to the input characteristic variable in the input and output pair is an output characteristic variable.
Optionally, in this embodiment, the inference loss is represented by cross entropy. Schematically, this step is represented by the following formula:

$$H(p, q) = -\sum_{x} p(x) \log q(x)$$

wherein H(p, q) represents the inference loss; p(x) and q(x) are discrete distribution vectors of equal length; p(x) represents the training inference result; q(x) represents the expected inference result; and x indexes the elements of the training inference result or the expected inference result.
Optionally, when the output layer includes a plurality of inference units, determining a sum of inference losses corresponding to each inference unit as a total inference loss of the initial inference model.
Optionally, in this embodiment, the data input into the initial inference model and the data output from the initial inference model are both represented by a vector, and the vector includes at least one element.
4) Updating at least m initial first parameter matrixes according to the inference loss, and continuing to execute the step 2) until the change of the at least m initial first parameter matrixes is within a preset range; or stopping training when the training times reach the preset times to obtain the inference model.
Optionally, in this embodiment, the gradient directions of the at least m initial first parameter matrices are determined according to the inference loss through a back propagation algorithm, and the at least m initial first parameter matrices are updated layer by layer from the output layer.
Optionally, when a difference value between at least m initial first parameter matrices obtained by the training and at least m initial first parameter matrices obtained by the last training is within a preset range, stopping the training to obtain a reasoning model, where the reasoning model includes at least m first parameter matrices.
Optionally, when the number of executions of steps 2) to 4) reaches a preset number, that is, when the number of training passes of the initial inference model reaches the preset number, the training is stopped to obtain the inference model, where the inference model includes at least m first parameter matrices.
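The two stopping criteria in step 4) above — parameter change within a preset range, or a preset number of training passes — can be sketched as follows; the gradient function, learning rate and tolerance are assumptions for illustration:

```python
import numpy as np

def train(params, grad_fn, lr=0.1, tol=1e-4, max_passes=1000):
    # Update the shared parameter matrices until either the change falls
    # within the preset range (tol) or the preset pass count is reached.
    for _ in range(max_passes):
        old = [w.copy() for w in params]
        for w, g in zip(params, grad_fn(params)):
            w -= lr * g  # in-place, so shared references stay shared
        if max(np.abs(w - o).max() for w, o in zip(params, old)) < tol:
            break
    return params

# Toy example: minimize w^2 (gradient 2w) for a single 1x1 "matrix".
params = train([np.array([1.0])], lambda ps: [2.0 * ps[0]])
```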
In summary, in the training method of the inference model provided in this embodiment, an initial inference model is created, where the initial inference model includes at least m initial first parameter matrices, and inference units in at least two inference layers share the same initial first parameter matrix; this can solve the problem that the complexity of training a probabilistic graphical model is high because the probabilistic graphical model has too many model parameters. Since different inference units can share the same initial first parameter matrix, the number of initial first parameter matrices in the inference model is reduced, the difficulty of training the inference model is reduced, and the efficiency of training the inference model is improved.
In addition, the initial inference model is trained in a self-supervised model training manner, so that the training inference result approaches the expected inference result as closely as possible, which can improve the accuracy of the trained inference model and thus the accuracy of the inference results obtained when data inference is performed through the inference model.
Optionally, in step 1002 above, the computer device creates the initial inference model in three ways including, but not limited to:
The first: creating an initial inference model according to the full connection relation;
The second: creating an initial inference model according to the probability map;
The third: creating an initial inference model according to the random association relation.
The three ways of creating the initial inference model are described below.
The first: an initial inference model is created according to the full connection relation, including, but not limited to, the following steps.
1) Determining the number x of inference units in the input layer and the output layer according to the database.
Because the input layer is used for receiving the feature variables corresponding to each feature classification in the database, all the inference units in the input layer correspond to the feature classifications in the database one by one, that is, the computer device determines that the number of the inference units in the input layer is x according to x feature classifications in the database.
Because the output layer is used for outputting the inference result of the feature variable corresponding to each feature classification, each inference unit in the output layer corresponds to a feature classification in the database one by one, that is, the computer device determines that the number of the inference units of the output layer is x according to x feature classifications in the database.
Such as: in fig. 1, the input layer 110 and the output layer 140 include three inference units, each inference unit corresponding to a feature class in the database 500, such as: the inference units 111 and 141 correspond to the feature classification product types, the inference units 112 and 142 correspond to the feature classification consumer types, and the inference units 113 and 143 correspond to the feature classification purchase times.
2) Determining the number y of inference units in each of the n inference layers, where y is a positive integer.
The type of the s-th inference unit in the r-th inference layer is the same as that of the s-th inference unit in the t-th inference layer, where r, s and t are positive integers, 1 ≤ r ≤ n, 1 ≤ t ≤ n, and 1 ≤ s ≤ y.
Alternatively, in order to reduce the number of initial first parameter matrices in the initial inference model as much as possible, inference units of the same type in different inference layers may share the same initial first parameter matrix.
Such as: referring to FIG. 1, the inference units 121, 131 and 141 share the same initial first parameter matrix w1; the inference units 122, 132 and 142 share the same initial first parameter matrix w2; and the inference units 123, 133 and 143 share the same initial first parameter matrix w3.
Optionally, in this embodiment, y may be equal to x, or may be different from x.
3) Establishing an association relation between each inference unit in each inference layer (or each inference unit in the output layer) and all inference units in the previous layer.
Such as: establishing an association relation between each inference unit in the first inference layer and all inference units in the input layer; establishing an association relation between each inference unit in the inference layers other than the first inference layer and all inference units in the previous inference layer; and, for each inference unit in the output layer, establishing an association relation between that inference unit and all inference units in the previous inference layer.
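A sketch of establishing the full-connection associations described above, layer by layer; the list-of-layer-sizes representation is an assumption for illustration:

```python
def fully_connect(layer_sizes):
    # Associate every unit of each layer with all units of the previous
    # layer: input -> first inference layer -> ... -> output layer.
    edges = []
    for li in range(1, len(layer_sizes)):
        for v in range(layer_sizes[li]):
            for u in range(layer_sizes[li - 1]):
                edges.append(((li - 1, u), (li, v)))
    return edges

# Input layer, two inference layers and output layer with 3 units each.
edges = fully_connect([3, 3, 3, 3])
```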
In this embodiment, the inference units in the two adjacent inference layers are in a full connection relationship, so that the computer device does not need to determine the association relationship between the inference units in different inference layers in advance, and resources of the computer device are saved.
4) Generating the initial first parameter matrix used by each inference unit to obtain at least m initial first parameter matrices.
The at least m initial first parameter matrices are the matrices used by the inference units, where m is the maximum value of x and y.
Optionally, each inference layer includes y inference units, and inference units at the same position in different inference layers are of the same type. Such as: in fig. 1, the inference units 121 and 131 are inference units of the same type. At this time, since inference units of the same type share the same initial first parameter matrix in the different inference layers other than the first inference layer, the at least m initial first parameter matrices include at least: y initial first parameter matrices corresponding to the y inference units.
Optionally, when x and y are equal, the inference model includes m initial first parameter matrices, and inference units of the same type in the n inference layers and the output layer share the same initial first parameter matrix. Referring to FIG. 1, the inference units 121, 131 and 141 share the same initial first parameter matrix w1; the inference units 122, 132 and 142 share the same initial first parameter matrix w2; and the inference units 123, 133 and 143 share the same initial first parameter matrix w3.
Optionally, when x and y are equal, only some inference units of the same type in the n inference layers and the output layer may share the same initial first parameter matrix. Such as: in fig. 1, the inference units 121, 131 and 141 share the same initial first parameter matrix w1; the inference units 122, 132 and 142 each use a different initial first parameter matrix; and the inference units 123, 133 and 143 share the same initial first parameter matrix w3.
When x and y are not equal, the initial inference model includes at least m initial first parameter matrices, including: the initial first parameter matrix used by each inference unit in the first inference layer; the initial first parameter matrices shared by inference units of the same type in the inference layers other than the first inference layer; and the initial first parameter matrix used by each inference unit in the output layer.
Optionally, in this embodiment, n is greater than 2.
Referring to the initial inference model shown in fig. 7, the initial inference model includes five layers in total: an input layer 710, a first inference layer 720, a second inference layer 730, a third inference layer 740, and an output layer 750. The input layer 710 and the output layer 750 each include 3 inference units, and the inference layers 720, 730 and 740 each include 2 inference units; that is, x is 3, y is 2, and x is not equal to y. Since the connection relation of the inference units in the first inference layer 720 differs from that of the inference units in the second inference layer 730 and the third inference layer 740, the initial first parameter matrices used by the inference units in the first inference layer 720 differ from those used by the inference units in the second inference layer 730 and the third inference layer 740. At this time, the initial inference model includes: 2 initial first parameter matrices used respectively by the 2 inference units in the first inference layer 720, 2 initial first parameter matrices shared by inference units of the same type in the inference layers 730 and 740, and 3 initial first parameter matrices used respectively by the 3 inference units in the output layer 750.
Optionally, in different inference layers except the first layer of inference layer, some inference units of the same type share the same initial first parameter matrix.
It should be added that the initial inference model shown in fig. 7 is only schematic; in practical implementation, the number y of inference units in each of the n inference layers may also be greater than x, such as: when x is 3, the value of y may be 5, 6, 7, etc., which is not limited in this embodiment.
5) And creating an initial reasoning model according to the input layer, each reasoning layer in the n reasoning layers and at least one reasoning unit in the output layer and at least m initial first parameter matrixes.
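The matrix layout described in the steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the per-unit parameters are represented as random vectors, the function name and the dictionary keys are assumptions, and only the count and sharing pattern of the matrices follow the text (m shared matrices when x equals y; per-unit matrices for the first inference layer and the output layer, plus shared matrices for layers 2..n, when x and y differ).

```python
import numpy as np

def create_initial_matrices(x, y, n, seed=0):
    """Generate the initial first parameter matrices (illustrative).

    When x == y, same-type units of all n inference layers and the
    output layer share one matrix (m = x matrices in total); when
    x != y, the first inference layer and the output layer keep
    per-unit matrices while inference layers 2..n share by unit type.
    """
    rng = np.random.default_rng(seed)
    mats = {}
    if x == y:
        for b in range(x):                    # shared by all non-input layers
            mats[("shared", b)] = rng.standard_normal(x)
    else:
        for b in range(y):                    # first inference layer, per unit
            mats[("first", b)] = rng.standard_normal(x)
        for b in range(y):                    # inference layers 2..n, shared
            mats[("shared", b)] = rng.standard_normal(y)
        for b in range(x):                    # output layer, per unit
            mats[("output", b)] = rng.standard_normal(y)
    return mats
```

For the model of fig. 7 (x = 3, y = 2, n = 3), this yields 2 + 2 + 3 = 7 matrices, consistent with "at least m" since m = max(3, 2) = 3.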
In summary, according to the training method of the inference model provided in this embodiment, the initial inference model is established according to the full-connection relationship, so that the computer device does not need to determine the association relationship between the inference units in the adjacent inference layers in advance, and resources consumed by the computer device to create the initial inference model are saved.
The second method: creating an initial inference model from a probability map, including but not limited to the following steps:
1) Determine a probability map according to the database, and determine the number x of inference units in the input layer, each inference layer, and the output layer, wherein the probability map is used for indicating the association relation among the x feature classifications, and the x inference units in the same layer correspond one-to-one to the x feature classifications.
In this embodiment, an association relationship between inference units in adjacent layers is established according to a probability map. The probability map includes x feature classifications in the database and associations between the x feature classifications.
Optionally, the computer device selects some feature variables from the database in advance and calculates a log-likelihood function of those feature variables; calculates an approximate gradient through a contrastive divergence algorithm; and determines model parameters of the probability map model according to the approximate gradient to obtain the probability map.
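As an illustrative stand-in for "approximating the log-likelihood gradient by contrastive divergence", the following shows one CD-1 step on a small binary model with weight matrix W. Everything here is an assumption for illustration: the patent does not fix the model family, the sampling scheme, or the mean-field reconstruction used below.

```python
import numpy as np

def cd1_step(W, v0, rng):
    """One contrastive-divergence (CD-1) step: sample hidden units from
    the data, reconstruct the visible units, and take the difference of
    correlations as the approximate log-likelihood gradient."""
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))
    h0 = (rng.random(W.shape[0]) < sigmoid(W @ v0)).astype(float)
    v1 = sigmoid(W.T @ h0)          # mean-field reconstruction
    h1 = sigmoid(W @ v1)
    return np.outer(h0, v0) - np.outer(h1, v1)   # approximate gradient
```

The returned matrix would then be used to update W, from which the associations in the probability map are read off.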
Schematically, the probability map obtained by the computer device is shown in fig. 8, and fig. 8 includes 3 feature classifications, namely, a product type 81, a consumer type 82, and a purchase time 83. The product type 81 is linked to the consumer type 82 and the purchase time 83, and the consumer type 82 is linked to the purchase time 83.
Alternatively, in this embodiment, the probability map is described only as an undirected graph; in actual implementation, the probability map may also be a directed graph.
Optionally, in order to ensure that the computer device can determine the association relationship between the inference units in adjacent layers according to the probability map, the inference units in each layer correspond to the feature classes in the probability map one to one, and the number of the inference units in each layer is equal to the number of the feature classes in the probability map, and is x.
Alternatively, the number of inference units per layer may also be a multiple of the number of feature classes in the probability map, i.e. a positive integer multiple of x, such as: 2 x, 3 x, etc., which are not limited in this embodiment.
2) Determine the associated inference unit of the v-th inference unit in the u-th layer according to the probability map, wherein u is an integer greater than 1 and less than n+2, and v is a positive integer less than or equal to x.
Schematically, when determining the associated inference unit of the vth inference unit in the u-th layer inference layer according to the probability map, determining the feature classification corresponding to the vth inference unit in the probability map, determining other feature classifications connected with the feature classification, and determining the inference unit corresponding to the other feature classifications as the associated inference unit of the vth inference unit.
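This lookup can be sketched directly from the probability map of fig. 8, written as an undirected adjacency table. The representation (name strings, a Python dict of sets) is an assumption for illustration; only the rule — a unit's associated units are those whose feature classifications are connected to its own — follows the text.

```python
# The probability map of fig. 8 as an undirected adjacency table.
probability_map = {
    "product type":  {"consumer type", "purchase time"},
    "consumer type": {"product type", "purchase time"},
    "purchase time": {"product type", "consumer type"},
}
features = ["product type", "consumer type", "purchase time"]

def associated_units(v):
    """Indices of the associated inference units of the v-th unit in
    any layer u > 1: the units whose feature classification is
    connected to unit v's classification in the probability map."""
    return sorted(features.index(c) for c in probability_map[features[v]])
```

For example, the unit for the product type (index 0) is associated with the units for the consumer type and the purchase time (indices 1 and 2), as in fig. 9.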
3) And generating an initial first parameter matrix used by each inference unit to obtain m initial first parameter matrices.
In this embodiment, m is equal to x.
The m initial first parameter matrices include: an initial first parameter matrix shared by inference units of the same type in the n inference layers and the output layer.
In this embodiment, the inference units of the same type in the other layers (including the n inference layers and the output layer) except the input layer share the same initial first parameter matrix, and since the inference units in all the layers are in one-to-one correspondence with the feature classifications and are equal in number, the number of the initial first parameter matrices corresponding to the m types of inference units is m.
4) And creating an initial reasoning model according to the m reasoning units in the input layer, the m reasoning units in each reasoning layer in the n reasoning layers, the m reasoning units in the output layer and the m initial first parameter matrixes.
Assume that the initial inference model created from the probability map shown in fig. 8 is as shown in fig. 9. The initial inference model includes an input layer 810, a first inference layer 820, a second inference layer 830, and an output layer 840. Each layer includes 3 inference units. The product type 81 corresponds to inference units 811, 821, 831, and 841; the consumer type 82 corresponds to inference units 812, 822, 832, and 842; and the purchase time 83 corresponds to inference units 813, 823, 833, and 843.
The inference unit corresponding to the product type 81 is connected to the inference units corresponding to the consumer type 82 and the purchase time 83 in the next layer, and the inference unit corresponding to the consumer type 82 is connected to the inference unit corresponding to the purchase time 83 in the next layer.
The inference units 821, 831, 841 corresponding to the product type 81 share an initial first parameter matrix, the inference units 822, 832, 842 corresponding to the consumer type 82 share an initial first parameter matrix, and the inference units 823, 833, 843 corresponding to the purchase time 83 share an initial first parameter matrix.
At this time, since inference units located in adjacent layers at the same position within their respective layers have the same association relations, the initial inference model belongs to the recurrent neural network model.
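The recurrent character can be made concrete: because same-type units reuse one shared matrix in every inference layer, propagating through the n inference layers amounts to applying the same weights repeatedly, exactly the parameter tying of a recurrent network unrolled over layers. The sigmoid weighted-sum unit below is an assumption; the patent does not fix the unit's activation.

```python
import numpy as np

def propagate(h, shared_W, n_layers):
    """Apply the same shared weight matrix at every inference layer.
    Reusing one matrix across layers is the parameter tying that makes
    the initial inference model recurrent-like."""
    for _ in range(n_layers):
        h = 1.0 / (1.0 + np.exp(-(shared_W @ h)))   # assumed sigmoid unit
    return h
```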
In summary, according to the training method of the inference model provided in this embodiment, the initial inference model is established according to the probability map, so that the computer device can preliminarily determine the association relationship between the inference units in the adjacent inference layers according to the probability map, and the accuracy of the established initial inference model is improved, thereby reducing the complexity of training the initial inference model and improving the efficiency of training the initial inference model.
Optionally, the probability map obtained by the computer device is used to indicate a full connection relationship between the m feature classifications, and at this time, in the obtained initial inference model, the inference units in the adjacent inference layers are also in a full connection relationship.
The third method: creating an initial inference model from random associations, including but not limited to the following steps.
1) Determine the number x of inference units in the input layer, each inference layer, and the output layer according to the database, wherein the x inference units in the same layer correspond one-to-one to the x feature classifications.
Optionally, in order to ensure that the computer device can determine the association relationship between the inference units in adjacent layers according to the random association relationship, the inference units in each layer correspond to the feature classes in the random association relationship one to one, and the number of the inference units in each layer is equal to the number of the feature classes in the random association relationship, and is x.
Alternatively, the number of inference units in each layer may also be a multiple of the number of feature classifications in the random association relationship, i.e., a positive integer multiple of x, such as: 2 x, 3 x, etc., which are not limited in this embodiment.
2) And acquiring random association relations among different feature classifications.
The random association relationship includes x feature classifications in the database and an association between the x feature classifications.
Optionally, the random association relationship is generated randomly by the computer device according to a random algorithm; alternatively, the random association relationship is user-determined. Assume that the random association relationship generated by the computer device is as shown in fig. 8.
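One possible random generation rule is sketched below: independently keep each pair of feature classifications as associated with probability p. The rule, the parameter p, and the function name are assumptions; the embodiment only requires that the relation be produced by a random algorithm (or supplied by the user).

```python
import random

def random_association(features, p=0.5, seed=42):
    """Randomly decide, for each pair of feature classifications,
    whether they are associated (hypothetical generation rule)."""
    rng = random.Random(seed)
    return {(a, b)
            for i, a in enumerate(features)
            for b in features[i + 1:]
            if rng.random() < p}
```

Fixing the seed makes the relation reproducible, so the computer device can rebuild the same associations when recreating the model.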
Alternatively, step 2) may be performed after step 1), before step 1), or simultaneously with step 1), which is not limited in this embodiment.
3) Determine the associated inference unit of the v-th inference unit in the u-th inference layer according to the random association relation, wherein u is an integer greater than 1 and less than n+2, and v is a positive integer less than or equal to x.
The description of this step is the same as that of step 2) in the second method for creating the initial inference model, except that the probability map is replaced by the random association relation, which is not described herein again in this embodiment.
4) And generating an initial first parameter matrix used by each inference unit to obtain m initial first parameter matrices.
In this embodiment, m is equal to x.
The m initial first parameter matrices include: an initial first parameter matrix shared by inference units of the same type in the n inference layers and the output layer.
The description of this step is the same as that of step 3) in the second method for creating an initial inference model, and this embodiment is not described herein again.
5) And creating an initial reasoning model according to the m reasoning units in the input layer, the m reasoning units in each reasoning layer in the n reasoning layers, the m reasoning units in the output layer and the m initial first parameter matrixes.
The related description of this step is the same as that of step 4) in the second method for creating an initial inference model, and this embodiment is not described herein again.
In summary, according to the training method for the inference model provided in this embodiment, the initial inference model is established according to the random association relationship, so that the computer device can randomly determine the association relationship between the inference units in the adjacent inference layers, and the probability map does not need to be determined according to the feature variables in the database, thereby saving resources consumed by the computer device for establishing the initial inference model.
Alternatively, in the second and third methods for creating an initial inference model, in order to prevent the influence of the input feature variables from weakening as the number of inference layers increases, the computer device may, before creating the initial inference model, establish an association relation between each inference unit in the input layer and the inference unit of the same type in each inference layer.
Such as: in fig. 6, the inference unit 611 also establishes association relations with the inference units 621 and 631; the inference unit 612 also establishes association relations with the inference units 622 and 632; and the inference unit 613 also establishes association relations with the inference units 623 and 633.
In order to determine the degree of association between each inference unit in the inference layer and the inference unit of the same type in the input layer, the computer device further generates an initial second parameter matrix corresponding to each type of inference unit to obtain m initial second parameter matrices, where the initial second parameter matrix corresponding to each inference unit is used to indicate the degree of association between the inference unit and the corresponding inference unit in the input layer.
The m initial second parameter matrices include: an initial second parameter matrix shared by inference units of the same type in the n inference layers.
At this time, an initial inference model is created according to the m inference units in the input layer, the m inference units in each inference layer in the n inference layers, the m inference units in the output layer, and the m initial first parameter matrices, including:
and creating an initial reasoning model according to the m reasoning units in the input layer, the m reasoning units in each reasoning layer in the n reasoning layers, the m reasoning units in the output layer, the m initial first parameter matrixes and the m initial second parameter matrixes.
Schematically, in fig. 6 above, the inference units 621 and 631 correspond to the same initial second parameter matrix; inference units 622 and 632 correspond to the same initial second parameter matrix; the inference units 623 and 633 correspond to the same initial second parameter matrix.
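The role of the second parameter matrix can be sketched as follows, for a single inference unit. The additive combination and the weighted-sum form are assumptions used only to illustrate why the input layer's influence does not fade with depth; the patent specifies only that the second matrix indicates the degree of association with the same-type input unit.

```python
import numpy as np

def unit_output(prev_layer, input_layer, W1_row, w2, j):
    """One inference unit combining its associated units in the
    previous layer (via its row of the shared first parameter matrix,
    W1_row) with the same-type unit j of the input layer (via its
    entry of the shared second parameter matrix, w2)."""
    return float(W1_row @ prev_layer + w2 * input_layer[j])
```

Because w2 is shared by same-type units across all n inference layers, every layer receives the input feature with the same learned weight.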
In summary, in the method for training the inference model provided in this embodiment, the inference units of the same type in different inference layers share the same initial second parameter matrix, so that the complexity of the initial inference model is simplified, the complexity of training the initial inference model is reduced, and the efficiency of training the initial inference model is improved.
It should be added that, when the initial inference model further includes the initial second parameter matrix, the initial second parameter matrix needs to be trained when the initial inference model is trained, and the training process refers to the process of training the initial first parameter matrix in step 303, which is not described herein again.
Referring to fig. 11, a block diagram of a data inference apparatus according to an embodiment of the present application is shown. The device comprises the following units: a condition acquisition unit 1110, a model acquisition unit 1120, and a data inference unit 1130.
A condition acquisition unit 1110 for acquiring inference conditions;
the model obtaining unit 1120 is configured to obtain a reasoning model, where the reasoning model includes an input layer, n inference layers, an output layer, and at least m first parameter matrices, and each of the input layer and the n inference layers includes at least one reasoning unit; the reasoning model comprises at least two reasoning units sharing the same first parameter matrix; the first parameter matrix used by each inference unit is used for indicating the association degree between the inference unit and the associated inference unit; the associated reasoning unit of the reasoning unit is a reasoning unit which is positioned in the input layer and has an associated relation with the reasoning unit; or the associated reasoning unit of the reasoning unit is a reasoning unit which is positioned at the previous layer of reasoning layer and has an associated relation with the reasoning unit; n and m are positive integers;
and the data reasoning unit 1130 is configured to input the reasoning conditions into the reasoning model to obtain a model reasoning result.
Optionally, the inference units in adjacent layers are in a full-connection relationship, where the full-connection relationship means that each inference unit in the ith layer has an association relationship with all inference units in the (i-1) th layer, i is an integer, and i is greater than or equal to 2 and less than or equal to n + 2;
the number of the reasoning units in the input layer and the number of the reasoning units in the output layer are both x;
the number of the reasoning units in each reasoning layer in the n reasoning layers is y; in the n layers of reasoning layers, the type of the b-th reasoning unit in the a-th layer reasoning layer is the same as that of the b-th reasoning unit in the c-th layer reasoning layer, a, b and c are integers, a is more than or equal to 1 and less than or equal to n, b is more than or equal to 1 and less than or equal to y, and c is more than or equal to 1 and less than or equal to n;
the at least m first parameter matrices include: the first parameter matrix is shared by the reasoning units of the same type in the reasoning layers except the first layer reasoning layer; the first layer of inference layer is the first layer of n layers of inference layers when the inference layers are sorted according to the data inference sequence, and the value of m is the maximum value of x and y.
Optionally, when x and y are equal, the inference model includes m first parameter matrices, the m first parameter matrices include: the first parameter matrix is shared by inference units of the same type in the n layers of inference layers and the output layer, wherein the type of the e-th inference unit in the d-th layer of inference layers is the same as that of the e-th inference unit in the output layer, d and e are integers, d is more than or equal to 1 and less than or equal to n, and e is more than or equal to 1 and less than or equal to y;
when x and y are not equal, the inference model comprises at least m first parameter matrices, and the at least m first parameter matrices comprise: the first parameter matrix corresponding to each inference unit in the first layer of inference layer, the first parameter matrix corresponding to each inference unit in the output layer, and the first parameter matrix shared by inference units of the same type in the inference layers except the first layer of inference layer.
Optionally, the association relationship between the inference units in adjacent layers is determined according to a probability map or a random association relationship, the probability map or the random association relationship is used for indicating the association relationship between the inference units of different types, the probability map is determined according to the feature classification and the feature variable in the database, and the random association relationship is generated randomly;
the number of the inference units in each layer of the input layer, the n layers of the inference layers and the output layer is m; the type of the g-th reasoning unit in the f-th layer is the same as that of the g-th reasoning unit in the h-th layer, f, g and h are integers, f is more than or equal to 1 and less than or equal to n +2, g is more than or equal to 1 and less than or equal to m, and h is more than or equal to 1 and less than or equal to n + 2;
the inference model comprises m first parameter matrixes, and inference units of the same type in the n layers of inference layers and the output layer share the same first parameter matrix.
Optionally, the j-th inference unit in the input layer and the j-th inference unit in each layer of inference layer have an association relationship, j is an integer, and j is greater than or equal to 1 and less than or equal to m;
the reasoning model also comprises m second parameter matrixes, the reasoning units of the same type in the n layers of reasoning layers share the same second parameter matrix, and the second parameter matrix used by each reasoning unit is used for indicating the degree of association between the reasoning unit and the reasoning unit of the same type in the input layer.
Optionally, the inference model is obtained by training according to a database and an initial inference model, where the database includes x feature classifications and feature variables corresponding to each feature classification; the initial reasoning model comprises an input layer, n layers of reasoning layers, an output layer and at least m initial first parameter matrixes, the number x of the feature classifications is equal to the number of the reasoning units in the input layer, and the number x of the feature classifications is equal to the number of the reasoning units in the output layer.
Optionally, the inference model is obtained by performing self-supervision inference model training on the initial inference model according to the input-output pair; the input and output pairs are established according to a database, each group of input and output pairs comprises input characteristic variables and output characteristic variables, the input characteristic variables are used for inputting the initial reasoning model in the training stage of the model, and the output characteristic variables are used for indicating expected reasoning results of the input characteristic variables in the training stage of the model; and the self-supervision reasoning model training is used for updating at least m initial first parameter matrixes according to the reasoning loss between the expected reasoning result and the actual reasoning result to obtain the reasoning model.
Reference is made to the embodiment shown in figure 3.
Referring to fig. 12, a block diagram of a training apparatus for a reasoning model according to an embodiment of the present application is shown. The apparatus comprises the following units: an acquisition unit 1210, a model creation unit 1220, and a model training unit 1230.
An obtaining unit 1210, configured to obtain a database, where the database includes x feature classifications and at least one feature variable corresponding to each feature classification;
a model creating unit 1220, configured to create an initial inference model, where the initial inference model includes: the system comprises an input layer, n layers of reasoning layers, an output layer and at least m initial first parameter matrixes, wherein the input layer, each layer of reasoning layer and the output layer respectively comprise at least one reasoning unit, and n and m are positive integers; the initial reasoning model comprises at least two reasoning units sharing the same initial first parameter matrix, and the reasoning units are used for processing the received data; the initial first parameter matrix used by each inference unit is used for indicating the association degree between the inference unit and the associated inference unit, and the associated inference unit of the inference unit is the inference unit which is positioned in the input layer and has an association relation with the inference unit; or the associated reasoning unit of the reasoning unit is a reasoning unit which is positioned in the previous layer of reasoning layer and has an associated relation with the reasoning unit; n and m are positive integers;
the model training unit 1230 is configured to input the feature variables corresponding to the i feature classifications in the database into the initial inference model, train the initial inference model to obtain an inference model, where the inference model is configured to obtain a model inference result according to an input inference condition, and i is a positive integer smaller than x.
Optionally, the model training unit 1230 is configured to:
inputting the characteristic variables corresponding to the i characteristic classes into an initial reasoning model to obtain a training reasoning result, wherein the characteristic variables corresponding to the i characteristic classes are acquired at the same moment;
comparing the training reasoning result with m-i characteristic variables to obtain the reasoning loss, wherein the m-i characteristic variables comprise characteristic variables corresponding to at least one characteristic classification except the i characteristic classifications in the database, and the at least one characteristic variable is acquired at the same time; the inference loss is used for indicating an error between the training inference result and the expected inference result;
updating at least m initial first parameter matrixes according to the inference loss, and continuing to input the characteristic variables corresponding to the i characteristic classifications into the initial inference model until the change of the at least m initial first parameter matrixes is within a preset range; or stopping training when the training times reach the preset times to obtain the inference model.
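The loop described above can be sketched in miniature. This is an illustrative stand-in, not the patent's implementation: the model is reduced to a single linear map, the loss to a squared error, and the update to a plain gradient step; only the loop structure — predict, compare with the held-out m-i features, update, stop when the parameter change is within a preset range or the iteration cap is reached — follows the text.

```python
import numpy as np

def train(params, inputs, targets, lr=0.01, tol=1e-4, max_iters=1000):
    """Sketch of the training loop: run the model on the i input
    features, compare with the held-out features to obtain the
    inference loss, update the parameters, and stop once the change
    falls within a preset range or the iteration cap is reached."""
    for _ in range(max_iters):
        pred = params @ inputs
        update = lr * np.outer(pred - targets, inputs)  # grad of squared error
        params = params - update
        if np.abs(update).max() < tol:   # change within preset range
            break
    return params
```

On this toy linear model the loop converges geometrically, so either stopping criterion terminates it; the real model would substitute the shared first (and second) parameter matrices and their gradients.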
Optionally, the apparatus further comprises: an input-output pair creation unit.
The input-output pair creating unit is used for creating at least one group of input-output pairs according to the database before inputting the characteristic variables corresponding to the i characteristic classifications into the initial reasoning model to obtain a training reasoning result, wherein each group of input-output pairs comprises an input characteristic variable and an output characteristic variable, and the characteristic variables corresponding to the i characteristic classifications input into the initial reasoning model are determined from the input characteristic variables in the input-output pairs; the m-i characteristic variables comprise output characteristic variables in input and output pairs to which the characteristic variables corresponding to the i characteristic classifications belong.
Optionally, the model creating unit 1220 is configured to:
determining the number x of inference units in an input layer and an output layer according to a database, wherein the x inference units correspond to the x characteristic classifications one by one;
determining the number y of inference units in each of the n inference layers, wherein y is a positive integer, the type of the s-th inference unit in the r-th inference layer is the same as that of the s-th inference unit in the t-th inference layer, r, s and t are positive integers, r is greater than or equal to 1 and less than or equal to n, s is greater than or equal to 1 and less than or equal to y, and t is greater than or equal to 1 and less than or equal to n;
establishing an association relation between each inference unit in the first inference layer and all inference units in the input layer;
establishing an association relation between each inference unit in each subsequent inference layer or in the output layer and all inference units in the previous layer;
generating an initial first parameter matrix used by each inference unit to obtain at least m initial first parameter matrices, wherein the at least m initial first parameter matrices include: the method comprises the following steps that an initial first parameter matrix shared by reasoning units of the same type in reasoning layers except a first layer of reasoning layer is provided, and m is the maximum value of x and y;
and creating an initial reasoning model according to the input layer, each reasoning layer in the n reasoning layers and at least one reasoning unit in the output layer and at least m initial first parameter matrixes.
Optionally, when x and y are equal, the initial inference model comprises m initial first parameter matrices, the m initial first parameter matrices comprising: the same initial first parameter matrix shared by the inference units of the same type in the n layers of inference layers and the output layer;
when x and y are not equal, the initial inference model comprises at least m initial first parameter matrices, the at least m initial first parameter matrices comprising: an initial first parameter matrix used by each inference unit in the first inference layer; an initial first parameter matrix shared by inference units of the same type in the inference layers other than the first inference layer; and an initial first parameter matrix used by each inference unit in the output layer.
Optionally, the model creating unit 1220 is configured to:
determining a probability map according to the database, determining the number x of inference units in an input layer, each layer of inference layer and an output layer, wherein the probability map is used for indicating the incidence relation among x feature classifications, and the x inference units in the same layer are in one-to-one correspondence with the x feature classifications;
determining the associated inference unit of the v-th inference unit in the u-th inference layer according to the probability map, wherein u is an integer greater than 1 and less than n+2, and v is a positive integer less than or equal to x;
generating an initial first parameter matrix used by each inference unit to obtain m initial first parameter matrices, wherein the m initial first parameter matrices include: the initial first parameter matrix is shared by inference units of the same type in the n layers of inference layers and the output layer;
and creating an initial reasoning model according to the m reasoning units in the input layer, the m reasoning units in each reasoning layer in the n reasoning layers, the m reasoning units in the output layer and the m initial first parameter matrixes.
Optionally, the model creating unit 1220 is configured to:
determining the number x of inference units in the input layer, each inference layer, and the output layer according to the database, wherein the x inference units in the same layer correspond one-to-one to the x feature classifications;
acquiring a random association relation among different feature classifications;
determining the associated inference unit of the v-th inference unit in the u-th inference layer according to the random association relation, wherein u is an integer greater than 1 and less than n+2, and v is a positive integer less than or equal to x;
generating an initial first parameter matrix used by each inference unit to obtain m initial first parameter matrices, wherein the m initial first parameter matrices include: the initial first parameter matrix is shared by inference units of the same type in the n layers of inference layers and the output layer;
and creating an initial reasoning model according to the m reasoning units in the input layer, the m reasoning units in each reasoning layer in the n reasoning layers, the m reasoning units in the output layer and the m initial first parameter matrixes.
Optionally, the model creating unit 1220 is configured to:
for each inference unit in the input layer, establishing an incidence relation between the inference unit and the inference units of the same type in each layer of inference layer; generating an initial second parameter matrix used by each inference unit to obtain m initial second parameter matrices, wherein the m initial second parameter matrices include: the initial second parameter matrix used by each inference unit is used for indicating the degree of association between the inference unit and the inference unit of the same type in the input layer; and creating an initial reasoning model according to the m reasoning units in the input layer, the m reasoning units in each reasoning layer in the n reasoning layers, the m reasoning units in the output layer, the m initial first parameter matrixes and the m initial second parameter matrixes.
Reference is made to the embodiment shown in fig. 10.
It should be noted that the above embodiments illustrate only one division of the functional modules. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments provided above belong to the same concept; their specific implementation processes are described in detail in the method embodiments and are not repeated here.
An exemplary embodiment of the present application also provides a computer-readable storage medium storing one or more programs which, when executed, implement the above data inference method or the above training method for the inference model.
An exemplary embodiment of the present application further provides a computer device, which includes the apparatus provided in the embodiment shown in fig. 3 or the alternative embodiment provided based on the embodiment shown in fig. 10.
Referring to fig. 13, a schematic structural diagram of a computer device according to an embodiment of the present application is shown. For example, the computer device may be a server for implementing the functionality of the above-described method examples. Computer device 1300 may include: a processor 1301.
Processor 1301 is used to implement the functions of computer device 1300. The processor 1301 is further configured to perform each step in the foregoing method embodiments, or other steps of the technical solutions described in this application.
Further, the computer device 1300 may also include a memory 1302, the memory 1302 for storing program codes and data of the computer device 1300.
Further, computer device 1300 may also include a bus 1303. The memory 1302 is coupled to the processor 1301 via a bus 1303.
The memory 1302 stores one or more programs configured to be executed by the one or more processors, the one or more programs containing instructions for implementing the data inference method described above; or, instructions for implementing the training method of the inference model described above.
It will be appreciated that fig. 13 merely illustrates a simplified design of a computer device 1300. In practical applications, the computer device 1300 may comprise any number of processors, memories, etc., and all computer devices that can implement the embodiments of the present application are within the scope of the embodiments of the present application.
The foregoing describes the solution provided by an embodiment of the present application, primarily from the perspective of a computer device. It will be appreciated that the computer device, in order to implement the above-described functions, comprises corresponding hardware structures and/or software modules for performing the respective functions. The various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is performed as hardware or computer software drives hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present teachings.
The steps of a method or algorithm described in connection with the disclosure of the embodiments of the application may be embodied in hardware or in software instructions executed by a processor. The software instructions may be composed of corresponding software modules that may be stored in Random Access Memory (RAM), flash memory, Read Only Memory (ROM), Erasable Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), registers, a hard disk, a removable disk, a compact disc Read Only Memory (CD-ROM), or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. Alternatively, the processor and the storage medium may reside as discrete components in a computer device.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in the embodiments of the present application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
In the embodiments of the present application, the terms "first," "second," "third," and the like (if any) are used to distinguish between objects and do not necessarily describe a particular sequence or chronological order. It should be understood that objects so described may be interchanged under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.
Claims (16)
1. A method of data inference, the method comprising:
acquiring inference conditions;
acquiring an inference model, wherein the inference model comprises an input layer, n inference layers, an output layer, and at least m first parameter matrices; the input layer, each of the n inference layers, and the output layer each comprise at least one inference unit; the inference model comprises at least two inference units sharing the same first parameter matrix; the first parameter matrix used by each inference unit indicates the degree of association between the inference unit and its associated inference units; an associated inference unit of an inference unit is an inference unit that is located in the input layer and has an association relation with the inference unit, or an inference unit that is located in the previous inference layer and has an association relation with the inference unit; and n and m are both positive integers;
and inputting the reasoning conditions into the reasoning model to obtain a model reasoning result.
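For illustration only, the forward pass of claim 1 can be sketched as follows. This is a minimal sketch under assumptions: per-unit state vectors, the `tanh` activation, and the function name `run_inference` are not part of the claims.

```python
import numpy as np

def run_inference(conditions, first_mats, n_layers):
    """Propagate the inference conditions through n inference layers and
    the output layer. Unit type k reuses the same shared first parameter
    matrix first_mats[k] in every layer (the weight sharing of claim 1)."""
    state = [np.asarray(c, dtype=float) for c in conditions]
    for _ in range(n_layers + 1):  # n inference layers + the output layer
        state = [np.tanh(W @ h) for W, h in zip(first_mats, state)]
    return state
```

Note that, unlike a conventional feedforward network with one weight matrix per layer, only m matrices exist no matter how large n is.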
2. The method according to claim 1, wherein the inference units in adjacent layers are in a fully-connected relationship, the fully-connected relationship means that each inference unit in the ith layer has an association relationship with all inference units in the (i-1) th layer, i is an integer, and i is greater than or equal to 2 and less than or equal to n + 2;
the number of the reasoning units in the input layer and the number of the reasoning units in the output layer are both x;
the number of the reasoning units in each reasoning layer in the n reasoning layers is y; in the n layers of reasoning layers, the type of the b-th reasoning unit in the a-th layer reasoning layer is the same as that of the b-th reasoning unit in the c-th layer reasoning layer, a, b and c are integers, a is more than or equal to 1 and less than or equal to n, b is more than or equal to 1 and less than or equal to y, and c is more than or equal to 1 and less than or equal to n;
the at least m first parameter matrices include: the first parameter matrix is shared by the reasoning units of the same type in the reasoning layers except the first layer reasoning layer; the first layer of inference layer is the first layer of the n layers of inference layers when the inference layers are sorted according to a data inference sequence, and the value of m is the maximum value of x and y.
3. The method of claim 2,
when x is equal to y, the inference model comprises m first parameter matrices comprising: the n-layer reasoning layer and the output layer share a first parameter matrix, wherein the type of the e-th reasoning unit in the d-layer reasoning layer is the same as that of the e-th reasoning unit in the output layer, d and e are integers, d is more than or equal to 1 and less than or equal to n, and e is more than or equal to 1 and less than or equal to y;
when x is not equal to y, the inference model comprises at least m first parameter matrices, including: the first parameter matrix corresponding to each inference unit in the first layer of inference layer, the first parameter matrix corresponding to each inference unit in the output layer, and the first parameter matrix shared by inference units of the same type in the inference layers except the first layer of inference layer.
4. The method according to claim 1, characterized in that the association between the inference units in adjacent layers is determined according to a probability map or a random association, the probability map or the random association being used to indicate the association between the inference units of different types, the probability map being determined according to the feature classification and the feature variables in the database, the random association being generated randomly;
the number of the inference units in each layer of the input layer, the n layers of inference layers and the output layer is m; the type of the g-th reasoning unit in the f-th layer is the same as that of the g-th reasoning unit in the h-th layer, wherein f, g and h are integers, f is more than or equal to 1 and less than or equal to n +2, g is more than or equal to 1 and less than or equal to m, and h is more than or equal to 1 and less than or equal to n + 2;
the inference model comprises m first parameter matrixes, and the inference units of the same type in the n layers of inference layers and the output layer share the same first parameter matrix.
5. The method according to claim 1 or 4,
the j-th inference unit in the input layer has an association relation with the j-th inference unit in each inference layer, wherein j is an integer, and j is greater than or equal to 1 and less than or equal to m;
the reasoning model further comprises m second parameter matrixes, the reasoning units of the same type in the n layers of reasoning layers share the same second parameter matrix, and the second parameter matrix used by each reasoning unit is used for indicating the degree of association between the reasoning unit and the reasoning unit of the same type in the input layer.
6. The method according to any one of claims 1 to 4, wherein the inference model is trained from a database and an initial inference model, wherein the database comprises x feature classes and feature variables corresponding to each feature class; the initial reasoning model comprises the input layer, the n layers of reasoning layers, the output layer and at least m initial first parameter matrixes, the number x of the feature classifications is equal to the number of reasoning units in the input layer, and the number x of the feature classifications is equal to the number of reasoning units in the output layer.
7. The method of claim 6, wherein the inference model is obtained by performing an auto-supervised inference model training on the initial inference model according to the input-output pairs; wherein the input-output pairs are created from the database, each set of input-output pairs comprising an input feature variable for inputting the initial inference model in a training phase of the model and an output feature variable for indicating a desired inference result of the input feature variable in the training phase of the model; and the self-supervision reasoning model training is used for updating the at least m initial first parameter matrixes according to the reasoning loss between the expected reasoning result and the actual reasoning result to obtain the reasoning model.
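For illustration only, one self-supervised update of the shared first parameter matrices in claim 7 can be sketched as follows. The single shared linear layer, squared inference loss, and function name `train_step` are simplifying assumptions, not the claimed training procedure.

```python
import numpy as np

def train_step(first_mats, x_in, y_expected, lr=0.01):
    """One self-supervised update: run the model on the input feature
    variables, compare the actual inference result with the expected
    result, and update the shared first parameter matrices by gradient
    descent on the squared inference loss."""
    y_actual = [W @ h for W, h in zip(first_mats, x_in)]
    loss = sum(float(np.sum((ya - ye) ** 2))
               for ya, ye in zip(y_actual, y_expected))
    for W, h, ya, ye in zip(first_mats, x_in, y_actual, y_expected):
        # d(loss)/dW = 2 * (y_actual - y_expected) outer h; the matrices
        # are updated in place, so every layer sharing W sees the change.
        W -= lr * 2.0 * np.outer(ya - ye, h)
    return loss
```

Because the matrices are shared, a single gradient step simultaneously adjusts the behavior of every layer that reuses them.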
8. A data inference apparatus, characterized in that the apparatus comprises:
a condition acquisition unit for acquiring inference conditions;
the model acquisition unit is configured to acquire an inference model, wherein the inference model comprises an input layer, n inference layers, an output layer, and at least m first parameter matrices; the input layer, each of the n inference layers, and the output layer each comprise at least one inference unit; the inference model comprises at least two inference units sharing the same first parameter matrix; the first parameter matrix used by each inference unit indicates the degree of association between the inference unit and its associated inference units; an associated inference unit of an inference unit is an inference unit that is located in the input layer and has an association relation with the inference unit, or an inference unit that is located in the previous inference layer and has an association relation with the inference unit; and n and m are both positive integers;
and the data reasoning unit is used for inputting the reasoning conditions into the reasoning model to obtain a model reasoning result.
9. The device according to claim 8, wherein the inference units in adjacent layers are in a fully-connected relationship, which means that each inference unit in the ith layer has an association relationship with all inference units in the (i-1) th layer, i is an integer, and i is greater than or equal to 2 and less than or equal to n + 2;
the number of the reasoning units in the input layer and the number of the reasoning units in the output layer are both x;
the number of the reasoning units in each reasoning layer in the n reasoning layers is y; in the n layers of reasoning layers, the type of the b-th reasoning unit in the a-th layer reasoning layer is the same as that of the b-th reasoning unit in the c-th layer reasoning layer, a, b and c are integers, a is more than or equal to 1 and less than or equal to n, b is more than or equal to 1 and less than or equal to y, and c is more than or equal to 1 and less than or equal to n;
the at least m first parameter matrices include: the first parameter matrix is shared by the reasoning units of the same type in the reasoning layers except the first layer reasoning layer; the first layer of inference layer is the first layer of the n layers of inference layers when the inference layers are sorted according to a data inference sequence, and the value of m is the maximum value of x and y.
10. The apparatus of claim 9,
when x is equal to y, the inference model comprises m first parameter matrices comprising: the n-layer reasoning layer and the output layer share a first parameter matrix, wherein the type of the e-th reasoning unit in the d-layer reasoning layer is the same as that of the e-th reasoning unit in the output layer, d and e are integers, d is more than or equal to 1 and less than or equal to n, and e is more than or equal to 1 and less than or equal to y;
when x is not equal to y, the inference model comprises at least m first parameter matrices, including: the first parameter matrix corresponding to each inference unit in the first layer of inference layer, the first parameter matrix corresponding to each inference unit in the output layer, and the first parameter matrix shared by inference units of the same type in the inference layers except the first layer of inference layer.
11. The apparatus according to claim 8, wherein the association relationship between the inference units in adjacent layers is determined according to a probability map or a random association relationship, the probability map or the random association relationship is used for indicating the association relationship between the inference units of different types, the probability map is determined according to the feature classification and the feature variable in the database, and the random association relationship is randomly generated;
the number of the inference units in each layer of the input layer, the n layers of inference layers and the output layer is m; the type of the g-th reasoning unit in the f-th layer is the same as that of the g-th reasoning unit in the h-th layer, wherein f, g and h are integers, f is more than or equal to 1 and less than or equal to n +2, g is more than or equal to 1 and less than or equal to m, and h is more than or equal to 1 and less than or equal to n + 2;
the inference model comprises m first parameter matrixes, and the inference units of the same type in the n layers of inference layers and the output layer share the same first parameter matrix.
12. The apparatus according to claim 8 or 11,
the j-th inference unit in the input layer has an association relation with the j-th inference unit in each inference layer, wherein j is an integer, and j is greater than or equal to 1 and less than or equal to m;
the reasoning model further comprises m second parameter matrixes, the reasoning units of the same type in the n layers of reasoning layers share the same second parameter matrix, and the second parameter matrix used by each reasoning unit is used for indicating the degree of association between the reasoning unit and the reasoning unit of the same type in the input layer.
13. The apparatus according to any one of claims 8 to 12, wherein the inference model is trained from a database and an initial inference model, wherein the database comprises x feature classes and feature variables corresponding to each feature class; the initial reasoning model comprises the input layer, the n layers of reasoning layers, the output layer and at least m initial first parameter matrixes, the number x of the feature classifications is equal to the number of reasoning units in the input layer, and the number x of the feature classifications is equal to the number of reasoning units in the output layer.
14. The apparatus according to claim 13, wherein the inference model is obtained by performing an auto-supervised inference model training on the initial inference model according to the input-output pairs; wherein the input-output pairs are created from the database, each set of input-output pairs comprising an input feature variable for inputting the initial inference model in a training phase of the model and an output feature variable for indicating a desired inference result of the input feature variable in the training phase of the model; and the self-supervision reasoning model training is used for updating the at least m initial first parameter matrixes according to the reasoning loss between the expected reasoning result and the actual reasoning result to obtain the reasoning model.
15. A computer device, characterized in that the computer device comprises:
one or more processors; and
a memory;
the memory stores one or more programs configured for execution by the one or more processors, the one or more programs including instructions for implementing the data inference method as claimed in any one of claims 1 to 7.
16. A computer-readable storage medium, storing one or more programs which, when executed, implement the data inference method of any of claims 1-7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710999833.7A CN109697511B (en) | 2017-10-24 | 2017-10-24 | Data reasoning method and device and computer equipment |
PCT/CN2018/111504 WO2019080844A1 (en) | 2017-10-24 | 2018-10-23 | Data reasoning method and apparatus, and computer device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710999833.7A CN109697511B (en) | 2017-10-24 | 2017-10-24 | Data reasoning method and device and computer equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109697511A true CN109697511A (en) | 2019-04-30 |
CN109697511B CN109697511B (en) | 2022-04-05 |
Family
ID=66227614
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710999833.7A Active CN109697511B (en) | 2017-10-24 | 2017-10-24 | Data reasoning method and device and computer equipment |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109697511B (en) |
WO (1) | WO2019080844A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020243973A1 (en) * | 2019-06-06 | 2020-12-10 | 华为技术有限公司 | Model-based signal inference method and apparatus |
CN113344208A (en) * | 2021-06-25 | 2021-09-03 | 中国电信股份有限公司 | Data reasoning method, device and system |
CN114297404A (en) * | 2021-12-29 | 2022-04-08 | 北京信息科技大学 | Knowledge graph construction method for field evaluation expert behavior track |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102262690A (en) * | 2011-06-07 | 2011-11-30 | 中国石油大学(北京) | Modeling method of early warning model of mixed failures and early warning model of mixed failures |
CN103971133A (en) * | 2014-04-13 | 2014-08-06 | 北京工业大学 | Automatic steel plate surface defect recognition method based on case-based reasoning |
CN104933722A (en) * | 2015-06-29 | 2015-09-23 | 电子科技大学 | Image edge detection method based on Spiking-convolution network model |
WO2016074247A1 (en) * | 2014-11-15 | 2016-05-19 | Beijing Kuangshi Technology Co., Ltd. | Face detection using machine learning |
CN106250899A (en) * | 2016-07-29 | 2016-12-21 | 华东交通大学 | A kind of orange disease and insect pests monitoring and pre-alarming method based on distributed compression perception WSN |
US20170116520A1 (en) * | 2015-10-23 | 2017-04-27 | Nec Laboratories America, Inc. | Memory Efficient Scalable Deep Learning with Model Parallelization |
CN106650786A (en) * | 2016-11-14 | 2017-05-10 | 沈阳工业大学 | Image recognition method based on multi-column convolutional neural network fuzzy evaluation |
CN107203753A (en) * | 2017-05-25 | 2017-09-26 | 西安工业大学 | A kind of action identification method based on fuzzy neural network and graph model reasoning |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106610584A (en) * | 2015-10-27 | 2017-05-03 | 沈阳工业大学 | Remanufacturing process quality control method based on neural network and expert system |
CN106844407B (en) * | 2015-12-07 | 2020-03-10 | 华为技术有限公司 | Tag network generation method and system based on data set correlation |
CN105528638B (en) * | 2016-01-22 | 2018-04-24 | 沈阳工业大学 | The method that gray relative analysis method determines convolutional neural networks hidden layer characteristic pattern number |
- 2017-10-24: CN application CN201710999833.7A filed (granted as CN109697511B, status Active)
- 2018-10-23: PCT application PCT/CN2018/111504 filed (published as WO2019080844A1)
Non-Patent Citations (2)
Title |
---|
BAOTIAN HU: "Convolutional Neural Network Architectures for Matching Natural Language Sentences", 《ARXIV》 * |
杲绍风: "机器人弧焊专家系统Virtual Arc", 《上海交通大学学报》 * |
Also Published As
Publication number | Publication date |
---|---|
CN109697511B (en) | 2022-04-05 |
WO2019080844A1 (en) | 2019-05-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||