CN109241291A - Knowledge graph optimal path query system and method based on deep reinforcement learning - Google Patents
Knowledge graph optimal path query system and method based on deep reinforcement learning
- Publication number
- CN109241291A (application CN201810791353.6A)
- Authority
- CN
- China
- Prior art keywords
- layer
- entity
- network
- value
- optimal path
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention proposes a knowledge graph optimal path query method based on deep reinforcement learning, comprising two modules, module one and module two. Module one is a knowledge graph optimal path model offline training module, and module two is a knowledge graph optimal path model online application module. The offline training module is equipped with a deep reinforcement learning component, which performs deep reinforcement training and learning on the current entity to obtain the next entity, then repeats the training with the next entity as the current entity, yielding an optimal path model. A start entity and a target entity are then input into the optimal path model obtained by module one, and the optimal path is finally obtained. The invention increases the generalization ability of the model and improves computational accuracy; its logical structure is clear and its computation is flexible, and in particular the reinforcement learning and deep learning can be computed in a distributed manner, improving operational efficiency.
Description
Technical field
The present invention relates to the field of computing, and in particular to a knowledge graph optimal path query system and method based on deep reinforcement learning.
Background art
A knowledge graph (Knowledge Graph) is intended to describe and portray the various entities (Entity) that exist in the real world and the relations (Relation) between them. It is usually organized and represented as a directed graph: the nodes in the graph represent entities, and the edges are formed by relations. A relation connects two entities and expresses whether the association it describes holds between them: an edge between two entities indicates that they are associated, and the absence of an edge indicates no association. In practical applications, a numerical value between 0 and 1 is attached to each entity relation (i.e., each edge of the graph) in the knowledge graph, reflecting the degree of correlation between the entities. Depending on the application demands, this value can represent confidence, tightness, distance, cost, and so on; such a knowledge graph is therefore called a probabilistic knowledge graph.
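For illustration, a probabilistic knowledge graph of this kind can be held in memory as a confidence-weighted adjacency structure, as in the minimal Python sketch below; all entity and relation names in it are hypothetical examples, not taken from the patent.

```python
# A minimal sketch of a probabilistic knowledge graph as a weighted digraph.
# Each outgoing edge carries a relation name and a confidence value in (0, 1].
from typing import Dict, List, Tuple

# entity -> list of (relation, neighbor entity, confidence)
ProbKG = Dict[str, List[Tuple[str, str, float]]]

graph: ProbKG = {
    "Alice":    [("works_at", "AcmeCorp", 0.9), ("knows", "Bob", 0.7)],
    "Bob":      [("lives_in", "Paris", 0.8)],
    "AcmeCorp": [("located_in", "Paris", 0.95)],
    "Paris":    [],
}

def neighbors(kg: ProbKG, entity: str) -> List[Tuple[str, str, float]]:
    """Return the outgoing (relation, entity, confidence) triples of an entity."""
    return kg.get(entity, [])

print(neighbors(graph, "Alice"))
```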
Optimal path querying between the entities of a probabilistic knowledge graph, which retrieves the relationship between two entities, is extremely important in the knowledge graph field. It is one of the core technologies of applications such as knowledge extraction, entity retrieval, knowledge graph network optimization, and relation analysis between knowledge graph entities. For data queries and retrievals of this complexity, an effective data organization and an efficient query processing method are needed to compute the results required by the user accurately and effectively. Improving query efficiency and reducing processing cost is therefore highly desirable, and also extremely challenging. The topological structure of a probabilistic knowledge graph is a weighted directed graph.
Currently, the mainstream graph optimal path query methods include Dijkstra's algorithm, the Floyd algorithm, and the Bellman-Ford algorithm. However, with the arrival of the big data era, the query efficiency of these methods can no longer meet the time range acceptable to users or fit within the memory space machines can accommodate; they are powerless for optimal path queries over very large data volumes.
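For reference, a minimal sketch of the classical Dijkstra baseline over the confidence-weighted structure shown earlier; turning a confidence in (0, 1] into an additive cost via -log(confidence) is an assumption made here for illustration, not something the patent specifies.

```python
import heapq
import math
from typing import Dict, List, Optional, Tuple

# Same layout as the previous sketch:
# entity -> list of (relation, neighbor entity, confidence)
ProbKG = Dict[str, List[Tuple[str, str, float]]]

def dijkstra_most_confident(kg: ProbKG, start: str, goal: str) -> Optional[List[str]]:
    """Classical Dijkstra baseline: maximizes the product of edge confidences
    by minimizing the sum of -log(confidence) costs (an assumed transform)."""
    dist = {start: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:                      # reconstruct the path back to start
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue                       # stale heap entry
        for _rel, v, conf in kg.get(u, []):
            nd = d - math.log(conf)        # smaller cost == higher confidence
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return None                            # goal unreachable
```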
It has also been found that, for a large-scale data network such as a probabilistic knowledge graph, a space-for-time strategy is often adopted to reduce query time, storing the query results with the highest query frequency. The Landmarks-BFS method sorts entities by how frequently users query them in the probabilistic knowledge graph, prunes the optimal paths between common entities, and stores the optimal paths between entities in a set. This method reduces the search space, but it ignores the dispersion of nodes in the network, so its query accuracy is not high. In addition, acceleration techniques applied to query data preprocessing have been used, such as parallel query methods based on bidirectional search, goal-directed query methods, and hierarchical query methods. These techniques meet the requirements on query efficiency; however, since pruning discards some intermediate points, query accuracy declines. Improper pruning may cause the query to miss the shortest path, while too little pruning between two points easily degenerates into breadth-first search, with low time efficiency and poor scalability. Accurately querying the shortest path in a probabilistic knowledge graph is difficult: a balance must be reached between time and space, guaranteeing both that the query time meets users' requirements and that the query quality is maintained.
Summary of the invention
To overcome at least one of the drawbacks (deficiencies) of the prior art described above, the present invention provides an optimal path query method between probabilistic knowledge graph entities that has high accuracy and strong generalization ability, is fast, and is easy to extend.
In order to solve the above technical problems, the technical scheme of the present invention is as follows:
A knowledge graph optimal path query system based on deep reinforcement learning comprises two modules, module one and module two. Module one is a knowledge graph optimal path model offline training module; module two is a knowledge graph optimal path model online application module. The offline training module is equipped with a deep reinforcement learning component, which performs deep reinforcement training and learning on the current entity to obtain the next entity, and then repeats the training with the next entity as the current entity to obtain an optimal path model. A start entity and a target entity are then input into the optimal path model obtained by module one, and the optimal path is finally obtained. Through the cooperation between the two modules, the purposes of high accuracy, strong generalization ability, high speed, and easy extension are achieved.
Further, the deep reinforcement learning component consists of an encoder, network components, and a logistic regression component. The network components include conversion components and a training component; the conversion components include a CNN neural network and an FC neural network, and the training component includes a reinforcement learning Policy (strategy) network and a reinforcement learning Value network.
Further, the reinforcement learning Policy network is composed of a five-layer fully connected neural network. The number of nodes in the first four layers of the Policy network decreases layer by layer, and the fifth layer has k neurons. Dropout is applied between the first and second layers and between the second and third layers of the Policy network to prevent overfitting, with the tanh activation function. Batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the sigmoid activation function. The fourth layer is fully connected to the fifth layer to obtain the probabilities of the k predicted relations, which serve as the action selection for the next entity;
The reinforcement learning Value network is likewise composed of a five-layer fully connected neural network. From the first layer to the fourth layer, the Value network uses fully connected layers whose widths decrease layer by layer, and the fifth layer has only one neuron. Dropout is applied between the first and second layers and between the second and third layers of the Value network to prevent overfitting; the activation functions of the first and second layers are both tanh, and the activation function of the third layer is sigmoid. Batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the relu activation function. The fourth layer is fully connected to the fifth layer, and the output is the cumulative reward, predicted by the Value network, of going from the current state to the target state.
The present invention also proposes a knowledge graph optimal path query method based on deep reinforcement learning, which specifically comprises the following steps:
S1. First, sort the entity relations in the probabilistic knowledge graph from largest to smallest by user visit frequency within a unit time, choose n relations, and generate the required data sample set;
S2. Input the data sample set into the deep reinforcement learning component for training and learning;
S3. Carry out, within the deep reinforcement learning component, the training and learning of three stages: stage 1, stage 2, and stage 3;
Stage 1: Convert entities into initial word vectors using the encoder, and then further process the encoded initial word vectors through 1-10 layers of CNN convolutional neural networks into the word vectors needed by the deep reinforcement learning component;
Stage 2: Based on the reinforcement learning Policy network, predict the relation the current entity should traverse next;
Stage 3: Based on the reinforcement learning Value network, perform value calculation on the selected strategy;
S4. After the training and learning of step S3, obtain the optimal path model for queries;
S5. Input the start entity and the target entity, convert each into a word vector, then merge the two word vectors and input them to the optimal path model of step S4 until the target entity is found, finally obtaining an optimal query path whose starting point is the start entity and whose end point is the target entity.
Further, n relations are chosen in step S1, with n not less than 1/10 of the total number of entity relations in the probabilistic knowledge graph; γ = n/2 relations are randomly selected from these n relations, and these γ relations in the probabilistic knowledge graph, together with the two entities connected by each relation, form the data sample set needed for model training.
Further, stage 1 of step S3 converts the input entities e_1 and e_2, through the encoder and the network components, into two word vectors G_θ(e_1) and G_θ(e_2), where θ is the set of network parameters to be optimized. A similarity calculation is performed on the two word vectors G_θ(e_1) and G_θ(e_2) obtained in stage 1 to find their cosine distance, as shown in the following formula:
D_θ(e_1, e_2) = ||G_θ(e_1) − G_θ(e_2)||_cos,
During training, each received data sample can be represented as {(F, e_1, e_2)}, where F is the label of each data sample, from which the training loss function L(θ) is constructed, where n is the total number of training samples.
Further, the loss function L(θ) needs to be minimized, and L(θ) can be refined as follows: L_s denotes the loss between identical entities, and L_u denotes the loss between different entities; L_u must be made as small as possible, and L_s as large as possible.
Further, stage 2 and stage 3 of step S3 are carried out in the training component of the deep reinforcement learning component. The training component includes the policy network and the value network: stage 2 performs strategy training and stage 3 performs value training, optimizing the parameter sets of the two networks, i.e., the parameter θ_p of the Policy network and the parameter θ_v of the Value network. In both trainings, a four-tuple <state, reward, action, model> is used, where the state is represented by an entity in the probabilistic knowledge graph.
Further, the strategy function and the value function are obtained through the target-driven deep reinforcement learning of the policy network and the value network: the strategy function is fitted by a neural network acting as a nonlinear function estimator, giving the strategy function f(e_t, g | θ_p); the value function is likewise fitted by a neural network acting as a nonlinear function estimator of the reward from the current node to the target node, giving the value function h(e_t, g | θ_v).
Further, the reward obtained from the value function is multiplied by the estimate of the strategy given by the strategy function to express the loss function of the policy network, as shown in the following formula:
L_f = log f(e_t, g | θ_p) × (r_t + γ h(e_{t+1}, g | θ_v) − h(e_t, g | θ_v)),
where γ ∈ (0, 1) denotes the discount factor. L_f is differentiated with respect to the parameter θ_p, and the parameter θ_p of the Policy network is updated by gradient ascent, giving the following formula:
θ_p ← θ_p + β ∇_{θ_p}(L_f + H(f)),
where ∇ denotes the derivative operation, H(f) denotes the entropy term of the strategy function f(e_t, g | θ_p), and β ∈ (0, 1) is the learning rate;
If the product of the current strategy and the reward brought by choosing that strategy is positive, the parameter θ_p of the Policy network is updated in the positive direction, so that the likelihood of predicting that state next time increases; if the product is negative, the parameter θ_p is updated in the reverse direction, so that the probability of predicting that state next time is as small as possible, until the strategy predicted by the current network no longer fluctuates.
Further, the absolute value of the difference between the obtained value function h(e_t, g | θ_v) and the actual reward of the current entity, r_t + γ h(e_{t+1}, g | θ_v), is calculated to obtain the loss function of the value network, as shown in the following formula:
L_h = |(r_t + γ × h(e_{t+1}, g | θ_v)) − h(e_t, g | θ_v)|,
where γ ∈ (0, 1) denotes the discount factor. L_h is differentiated with respect to the parameter θ_v, and the parameter θ_v of the Value network is updated by gradient descent, giving the following formula:
θ_v ← θ_v − β ∇_{θ_v} L_h,
where ∇ denotes the derivative operation. If the error between the predicted reward h(e_t, g | θ_v) and the computed reward r_t + γ h(e_{t+1}, g | θ_v) is greater than a user-given threshold l, the parameter θ_v of the Value network is updated so that the prediction error is as small as possible, until the error between the predicted reward h(e_t, g | θ_v) and the computed reward r_t + γ h(e_{t+1}, g | θ_v) no longer fluctuates outside the user-given threshold range [−l, l].
Compared with the prior art, the beneficial effects of the technical solution of the present invention are:
(1) The invention proposes probabilistic knowledge graphs, applying a 0-1 randomization treatment to entity relations, so that optimal path queries on the knowledge graph better match actual application demands.
(2) Since the present invention trains in the reinforcement learning manner, on the one hand it reduces the problem in existing deep learning methods where unreasonable label design leads to poor final computation results; on the other hand, this approach saves, in each iteration, the shortest path from the current entity to a certain entity, reducing the search space, so that the model's adaptability is stronger and its accuracy is higher.
(3) The present invention is based on deep learning technology and merges the start word vector and the target word vector through two pre-trained convolutional neural networks with identical structures and shared weights, avoiding the need to restart training when the target entity changes, increasing the generalization ability of the model, and improving computational accuracy.
(4) The logical structure inside each module of the present invention is clear and the computation is flexible, with good loose coupling; the network structure can be flexibly configured to meet computation needs, without being limited by specific development tools and programming software, and it can be quickly extended into distributed and parallelized development environments; in particular, the reinforcement learning and deep learning can be computed in a distributed manner, improving operational efficiency.
Detailed description of the invention
Fig. 1 is the technical framework diagram of a knowledge graph optimal path query method based on deep reinforcement learning.
Fig. 2 is the logical structure diagram of the deep reinforcement learning component.
Specific embodiment
The attached figures are only used for illustrative purposes and shall not be understood as limiting the patent;
for those skilled in the art, it should be understood that certain known structures and their explanations may be omitted in the drawings.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Embodiment 1
The invention proposes a knowledge graph optimal path query system based on deep reinforcement learning, as shown in Fig. 1, comprising two modules, module one and module two. Module one is the knowledge graph optimal path model offline training module, and module two is the knowledge graph optimal path model online application module. The offline training module is equipped with a deep reinforcement learning component that performs deep reinforcement training and learning on the current entity: module one converts the data and trains on it to obtain the next entity that is optimal from the current entity toward the target entity, then repeats the training and learning with the next entity, and thereby obtains a trained optimal path model. In module two, the target entity and the start entity are converted and input into the optimal path model generated by module one, reinforcement is applied again, and the optimal query path is finally obtained. Through the cooperation between the two modules, the purposes of high accuracy, strong generalization ability, high speed, and easy extension are achieved.
Module one first constructs the data sample set for offline training of the optimal path model, as follows: the entity relations in the probabilistic knowledge graph are first sorted from largest to smallest by user visit frequency within the most recent m unit times, and the top n relations are chosen, with n not less than 1/8 of the total number of entity relations in the probabilistic knowledge graph; then γ = n/2 relations are randomly selected from these n relations, so that these γ relations in the probabilistic knowledge graph, together with the two entities connected by each relation, form the data sample set needed for model training.
On this basis, module one inputs each constructed data sample into the deep reinforcement learning component shown in Fig. 2 for training and learning, searching for the highest-probability relation associated with the current entity and, after obtaining and merging the reward value of the next entity corresponding to the selected relation, updating the parameters of the deep reinforcement learning component. Module one iterates this process and continuously updates the component parameters until the current entity is the target entity or the number of iterations exceeds the user-given maximum iteration threshold, at which point a candidate path from the start entity to the target entity has been obtained. Module one then calculates the total reward of the current candidate path and compares it with the total rewards of the complete paths queried before; if the reward of the current path is higher than previously queried paths, it is taken as the optimal path of the query, yielding the optimal path model. Otherwise the above process is executed repeatedly, until the parameters of the deep reinforcement learning component converge.
The deep reinforcement learning component of module one, as shown in Fig. 2, consists of a word2vec (word embedding) encoder, a CNN (Convolutional Neural Network), an FC (fully connected) neural network, a reinforcement learning Policy (strategy) network, a reinforcement learning Value network, and a logistic regression component. The training process of the deep reinforcement learning component is broadly divided into three stages. Stage 1 converts entities into initial word vectors using the word2vec encoder, then further processes the encoded initial word vectors through a multi-layer CNN into the word vectors needed by the deep reinforcement learning component. Stage 2 predicts, based on the reinforcement learning Policy network, the relation the current entity should traverse next. Stage 3 performs value calculation on the selected strategy based on the reinforcement learning Value network.
In stage 1, the present invention first inputs c entities and converts these c entities into c corresponding word vectors of identical dimension through the word2vec word embedding encoder. Then, 2 word vectors are randomly selected from the c entity word vectors and input into the multi-layer CNN, which has 8 layers in total: the first layer applies convolution to each of the 2 input entity word vectors; the second layer applies max pooling to the convolutions of the first layer; the third and fourth layers continue to apply convolution to the data obtained from the second (pooling) layer; then, after the max pooling of the fifth layer, the data passes in turn through the sixth and seventh layers for convolution, and the average pooling of the eighth layer finally yields the two final word vectors. In particular, after the max pooling operations of the second and fifth layers, batch normalization is applied to the output. The word vectors obtained at the eighth layer are thus the output of stage 1. The training task of the multi-layer CNN is to compute the distance between the two word vectors obtained at the eighth layer, making the word vector distance obtained for positive samples as small as possible and that for negative samples as large as possible. In addition, the two multi-layer CNN structures are completely identical, and the network weights are shared.
Stage 2 mainly trains the reinforcement learning Policy (strategy) network. The present invention first takes the word vector of the current entity and the word vector of the target entity as input, and the output vector obtained through the fully connected layer serves as the input word vector of the Policy network. The Policy network is composed of a five-layer fully connected neural network; the number of nodes in the first four layers decreases layer by layer, and the fifth layer has k neurons. Dropout is applied between the first and second layers and between the second and third layers to prevent overfitting, with the tanh activation function. Batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the sigmoid activation function. The fourth layer is fully connected to the fifth layer to obtain the probabilities of the k predicted relations, which serve as the action selection for the next entity. The output of the Policy network is the relation with the highest probability, which is taken as the behavior (Action) obtained by the Policy network. The k relations are selected as follows: first select the k_1 relations with the highest confidence, then randomly choose k − k_1 from the remaining relations, and sort them from largest to smallest by confidence, obtaining the k highest-confidence relations output by the Policy network. The training task of the Policy network is to select the best possible strategy, so that the reward brought by the next entity reached by the selected relation is maximized.
Stage 3 mainly trains the reinforcement learning Value network. The input of the Value network is the same as that of the Policy network, i.e., the output vector obtained by passing the word vectors of the current entity and the target entity through the fully connected layer. The Value network is composed of a five-layer fully connected neural network; the first through fourth layers use fully connected layers whose widths decrease layer by layer, and the fifth layer has only one neuron. Dropout is applied between the first and second layers and between the second and third layers to prevent overfitting; the activation functions of the first and second layers are both tanh, and the activation function of the third layer is sigmoid. Batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the relu activation function. The fourth layer is fully connected to the fifth layer, and the output is the cumulative reward, predicted by the Value network, of going from the current state to the target state. The training task of the Value network is to make the error between the reward predicted in the current state and the sum of the confidence of the relation given by the Policy network and the reward predicted in the next state as small as possible.
Module two takes a start entity and a target entity in the probabilistic knowledge graph as input, converts each of them into a one-dimensional word vector through the word2vec word embedding encoder and the 8-layer CNN in turn, and then merges the two one-dimensional word vectors as the input of the reinforcement learning Policy network and Value network. The Policy network and the Value network alternate with each other and, starting from the start entity, give at each step the next entity that is optimal from the current entity toward the target entity, until the target entity is found. Finally, an optimal query path is obtained whose starting point is the start entity and whose end point is the target entity.
The present invention also proposes a knowledge graph optimal path query method based on deep reinforcement learning, specifically comprising the following steps:
S1. First, sort the entity relations in the probabilistic knowledge graph from largest to smallest by user visit frequency within the most recent m unit times, and then choose the top n relations, with n not less than 1/8 of the total number of entity relations in the probabilistic knowledge graph; randomly select γ = n/2 relations from these n relations, so that these γ relations in the probabilistic knowledge graph, together with the two entities connected by each relation, form the data sample set needed for model training.
S2. Then use Google's word2vec word embedding encoder to convert the input current entity and target entity into two one-dimensional word vectors of length 512.
S3. Then carry out, within the deep reinforcement learning component, the training and learning of the three stages: stage 1, stage 2, and stage 3.
Stage 1: Construct two CNN convolutional neural networks with identical structures and shared weights, as follows:
The first layer of the CNN contains 512 neurons and uses 2 convolution kernels of size 2×1 with the sliding stride fixed at 2; this layer mainly applies convolution to the one-dimensional word vector (of length 512) produced by the preceding word2vec word embedding encoder, obtaining 2 one-dimensional vectors of length 256. Then, the second layer of the CNN applies max pooling to the 2 one-dimensional word vectors output by the first layer, using 2 convolution kernels of size 2×1 with stride 1, obtaining 2 one-dimensional vectors of length 256; on this basis, batch normalization is applied to these 2 one-dimensional vectors. Next, the third layer of the CNN applies convolution, with 4 kernels of size 4×1 and stride fixed at 4, to the 2 batch-normalized one-dimensional vectors output by the second layer, obtaining 8 one-dimensional vectors of length 64. Then, the fourth layer of the CNN applies convolution again to the 8 one-dimensional vectors output by the third layer, using 1 kernel of size 4×1 with stride 1, likewise obtaining 8 one-dimensional vectors of length 64. Then, the fifth layer of the CNN applies max pooling again to the 8 one-dimensional vectors of the fourth layer, with kernel size 2×1, 4 kernels, and stride 2, obtaining 32 one-dimensional vectors of length 32; on this basis, batch normalization is applied to these 32 one-dimensional vectors. Then, the sixth layer of the network applies convolution, with 2 kernels of size 4×1 and stride fixed at 2, to the 32 batch-normalized one-dimensional vectors output by the fifth layer, obtaining 64 one-dimensional vectors of length 16. Then, the seventh layer of the network applies convolution to the 64 one-dimensional vectors output by the sixth layer, using 4 kernels of size 4×1 with stride 4, obtaining 256 one-dimensional vectors of length 4. Finally, the eighth layer of the network applies average pooling and finally obtains 256 one-dimensional vectors of 4 dimensions; these 256 one-dimensional vectors are then fully connected to 512 neurons, thereby obtaining a one-dimensional vector of length 512.
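A simplified PyTorch sketch of one branch of this shared-weight 1-D convolutional encoder; PyTorch itself is an assumption, and the kernel/stride bookkeeping below follows the spirit of the stack (conv, max pool, batch norm, average pool, final FC to 512) rather than reproducing every kernel count above.

```python
import torch
import torch.nn as nn

class CNNEncoder(nn.Module):
    """One branch of the weight-shared encoder: conv / max-pool / batch-norm
    stages over a length-512 word vector, average pooling, then an FC layer
    to length 512. Channel counts are illustrative, not the patent's exact figures."""
    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 2, kernel_size=2, stride=2),     # 512 -> 256, 2 ch
            nn.MaxPool1d(kernel_size=2, stride=2),        # 256 -> 128
            nn.BatchNorm1d(2),
            nn.Conv1d(2, 8, kernel_size=4, stride=4),     # 128 -> 32, 8 ch
            nn.MaxPool1d(kernel_size=2, stride=2),        # 32 -> 16
            nn.BatchNorm1d(8),
            nn.Conv1d(8, 64, kernel_size=4, stride=2),    # 16 -> 7, 64 ch
            nn.Conv1d(64, 256, kernel_size=4, stride=1),  # 7 -> 4, 256 ch
            nn.AdaptiveAvgPool1d(4),                      # average pool, len 4
        )
        self.fc = nn.Linear(256 * 4, 512)                 # flatten -> 512

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x.unsqueeze(1))   # (batch, 512) -> (batch, 1, 512)
        return self.fc(h.flatten(1))

# Weight sharing: the SAME module instance encodes both entities.
encoder = CNNEncoder().eval()   # eval() so BatchNorm accepts batch size 1
e1, e2 = encoder(torch.randn(1, 512)), encoder(torch.randn(1, 512))
print(e1.shape, e2.shape)       # torch.Size([1, 512]) twice
```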
After the construction of the two CNN convolutional neural networks with identical structures and shared weights is finished, the present invention trains them and optimizes their parameters using the entities and relations in the probabilistic knowledge graph, as follows:
The inputs of the two CNNs are two entities e_1 and e_2 respectively, and the outputs are two one-dimensional vectors of length 512, G_θ(e_1) and G_θ(e_2), where θ is the set of network parameters to be optimized. A similarity calculation is then performed on the two one-dimensional vectors to find their cosine distance: D_θ(e_1, e_2) = ||G_θ(e_1) − G_θ(e_2)||_cos. If the two entities e_1 and e_2 differ greatly, D_θ(e_1, e_2) is large, and if e_1 and e_2 are the same or similar, D_θ(e_1, e_2) is small.
Therefore, during training, a data sample received by the two CNNs can be represented as (F, e_1, e_2), where F is the label of each data sample: if e_1 and e_2 denote the same entity, then F = 1, otherwise F = 0. The training loss function L(θ) is constructed from these samples, where n is the total number of training samples.
On this basis, L_s denotes the loss between identical entities, and L_u denotes the loss between different entities. To minimize the loss function L(θ), L_u must be made as small as possible and L_s as large as possible, and the training loss function L(θ) can be refined accordingly.
During training, minimizing the loss function L(θ) ultimately makes the distance between identical entities as small as possible and the distance between different entities as large as possible, increasing the discrimination between samples. In addition, during training, 1,000,000 sample entities are chosen; from these, 250,000 pairs of identical entities are randomly selected as positive sample pairs and 250,000 pairs of different entities are randomly selected as negative sample pairs, which are mixed and input into the network for training.
After computation by the two CNN convolutional neural networks, the one-dimensional vectors of length 512 corresponding to the current entity and the target entity are obtained. The two one-dimensional vectors are then joined by a full connection operation: the two length-512 one-dimensional vectors are directly concatenated into a one-dimensional vector of length 1024, which is then connected to a fully connected layer of 512 neurons, finally obtaining a one-dimensional vector of length 512. We use it to represent the fused current entity and target entity;
Stage 2 and stage 3 mainly train the Policy strategy network and the Value network in the deep reinforcement learning component, and optimize the parameter sets of the two networks, i.e., the parameter θ_p of the Policy network and the parameter θ_v of the Value network. The above two stages are iterated continuously, searching for the next optimal strategy and dynamically updating the parameters θ_p and θ_v, until the globally optimal strategy is obtained. Each round of iteration finds a target entity within a finite number of steps and updates the parameters θ_p and θ_v. In particular, module one sets a maximum number of iterations c_max; if the current number of iterations exceeds it, the iteration stops.
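A minimal sketch of this outer training loop; only the c_max cap and the keep-the-best-path comparison come from the text above, and the `step` callable is a hypothetical stand-in for the Policy/Value pair choosing the next entity (parameter updates are left as a comment).

```python
from typing import Callable, List, Tuple

def train_optimal_path_model(
    step: Callable[[str, str], Tuple[str, float]],  # (current, goal) -> (next, reward)
    start: str, goal: str,
    c_max: int = 1000, max_steps: int = 20,
) -> Tuple[List[str], float]:
    """Outer loop of module one: roll out candidate paths, keep the one
    with the highest total reward, and stop after c_max rounds."""
    best_path: List[str] = []
    best_reward = float("-inf")
    for _ in range(c_max):                  # c_max iteration threshold
        path, total = [start], 0.0
        current = start
        for _ in range(max_steps):
            current, reward = step(current, goal)
            path.append(current)
            total += reward
            if current == goal:
                break
        if current == goal and total > best_reward:  # keep higher-reward path
            best_path, best_reward = path, total
        # (gradient updates of theta_p / theta_v would happen here)
    return best_path, best_reward
```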
To this end, the present invention first defines, based on the probabilistic knowledge graph, the four-tuple <state, reward, action, model> required in the training of the two networks. The state is represented by an entity in the probabilistic knowledge graph, such as the current entity e_t, the target entity g, and the start entity s. The reward from the current entity e_t to the next entity e_{t+1} is denoted r_t, and r_t equals the confidence of the relation between e_t and e_{t+1}. The action, denoted m, is the action selection of the agent, corresponding to the relation between the current entity and the next entity in the probabilistic knowledge graph. Finally, the model denotes the strategy function or value function of the target-driven deep reinforcement learning in the Policy network or the Value network: the strategy function is fitted by a neural network acting as a nonlinear function estimator, i.e., the strategy function is f(e_t, g | θ_p); and the value function is likewise fitted by a neural network acting as a nonlinear function estimator of the reward from the current node to the target node, i.e., the value function is h(e_t, g | θ_v).
Stage 2: First, randomly initialize the parameter set θ_p of the Policy network. Then the Policy network receives the one-dimensional vector corresponding to the current entity and the target entity as input. The first layer of the Policy network has 256 neurons and is fully connected to the one-dimensional vector (of length 512) corresponding to the current entity and target entity; the second layer has 64 neurons; the third layer has 32 neurons; the fourth layer has 16 neurons; and the fifth layer has 10 neurons, representing the output values of 10 entities and the probabilities of selecting these 10 entities. These 10 entities are formed by the 7 highest-confidence entities reachable from the current entity in the next layer, together with 3 entities randomly selected from the remaining entities; if the number of next-layer entities is less than 10, the remaining elements are filled with 0. The first, second, and third layers all use the tanh activation function, and the fourth and fifth layers use the sigmoid activation function. Meanwhile, dropout and batch normalization are applied between layers to improve prediction accuracy. Finally, the 10 neurons of the fifth layer output the probabilities of the 10 relations selected by the Policy network, and the relation with the highest probability is then obtained through the softmax function as the chosen behavior.
In the training process of stage 2, the loss function of the Policy network is expressed by multiplying the reward based on the value function by the estimate of the strategy given by the current strategy function, as shown in the following formula:
L_f = log f(e_t, g | θ_p) × (r_t + γ h(e_{t+1}, g | θ_v) − h(e_t, g | θ_v)),
where γ ∈ (0, 1) denotes the discount factor. Then L_f is differentiated with respect to the parameter θ_p, and θ_p is updated by gradient ascent:
θ_p ← θ_p + β ∇_{θ_p}(L_f + H(f)),
where ∇ denotes the derivative operation, H(f) denotes the entropy term of the strategy function f(e_t, g | θ_p), and β ∈ (0, 1) is the learning rate. The purpose of adding the entropy term is to prevent the Policy network from settling on a suboptimal strategy too early and falling into a local optimum. If the product of the current strategy and the reward brought by choosing that strategy is positive, θ_p is updated in the positive direction, so that the likelihood of predicting that state next time increases; if the product is negative, θ_p is updated in the reverse direction, so that the probability of predicting that state next time is as small as possible, until the strategy predicted by the current network no longer fluctuates.
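A sketch of one actor update step implementing L_f with the entropy term; the advantage uses the reward and the value network's estimates as defined above, with PyTorch, the discount, and the entropy weight as assumed illustrative values. It expects a batch of transitions.

```python
import torch

def policy_update(policy, value, optimizer,
                  fused_t, fused_t1, actions, rewards,
                  gamma: float = 0.9, entropy_weight: float = 0.01) -> None:
    """One gradient-ascent step on L_f = log f(a) * advantage, plus an
    entropy bonus H(f). gamma and entropy_weight are illustrative values."""
    probs = policy(fused_t)                            # (batch, k)
    with torch.no_grad():                              # critic supplies the
        adv = (rewards + gamma * value(fused_t1).squeeze(1)
               - value(fused_t).squeeze(1))            # advantage estimate
    log_prob = torch.log(
        probs.gather(1, actions.unsqueeze(1)).squeeze(1) + 1e-8)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1)
    objective = (log_prob * adv + entropy_weight * entropy).mean()
    optimizer.zero_grad()
    (-objective).backward()                            # ascend = minimize -L_f
    optimizer.step()
```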
Stage 3: First, randomly initialize the parameter set θ_v of the Value network. Then, like the Policy network, the Value network receives the one-dimensional vector corresponding to the current entity and the target entity as input. The first layer of the Value network has 256 neurons and is fully connected to the one-dimensional vector (of length 512) corresponding to the current entity and target entity; the second layer has 128 neurons; the third layer has 64 neurons; the fourth layer has 32 neurons; and the fifth layer has one neuron representing the value of the current state. Dropout is applied between the first and second layers and between the second and third layers to prevent overfitting. The first and second layers both use the tanh activation function, and the third and fourth layers both use the sigmoid activation function. Batch normalization is applied between the third and fourth layers to enhance the generalization ability of the model. A fully connected layer between the fourth and fifth layers finally yields the predicted value.
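The matching PyTorch sketch of the 256-128-64-32-1 value head, under the same assumptions.

```python
import torch
import torch.nn as nn

class ValueNetwork(nn.Module):
    """Five-layer value head: 256-128-64-32-1 over the fused length-512
    vector; the single output is the predicted cumulative reward."""
    def __init__(self, in_dim: int = 512) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.Tanh(), nn.Dropout(0.5),
            nn.Linear(256, 128), nn.Tanh(), nn.Dropout(0.5),
            nn.Linear(128, 64), nn.Sigmoid(), nn.BatchNorm1d(64),
            nn.Linear(64, 32), nn.Sigmoid(),
            nn.Linear(32, 1),                 # value of the current state
        )

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        return self.net(fused)                # (batch, 1)

value = ValueNetwork()
print(value(torch.randn(4, 512)).shape)       # torch.Size([4, 1])
```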
In the training process of stage 3, the absolute value of the difference between the actual reward of the current entity, r_t + γ h(e_{t+1}, g | θ_v), and the predicted reward h(e_t, g | θ_v) is calculated and taken as the loss function of the Value network, as shown in the following formula:
L_h = |(r_t + γ × h(e_{t+1}, g | θ_v)) − h(e_t, g | θ_v)|,
where γ ∈ (0, 1) denotes the discount factor. Then L_h is differentiated with respect to the parameter θ_v, and θ_v is updated by gradient descent:
θ_v ← θ_v − β ∇_{θ_v} L_h,
where ∇ denotes the derivative operation. If the error between the predicted reward h(e_t, g | θ_v) and the computed reward r_t + γ h(e_{t+1}, g | θ_v) is greater than the user-given threshold l, θ_v is updated so that the prediction error is as small as possible, until the error between the predicted reward h(e_t, g | θ_v) and the computed reward r_t + γ h(e_{t+1}, g | θ_v) no longer fluctuates outside the user-given threshold range [−l, l].
S4. The parameters of the deep reinforcement learning component are continuously updated during the iterative process, until the current entity is the target entity or the number of iterations exceeds the user-given maximum iteration threshold, at which point a candidate path from the start entity to the target entity has been obtained. Module one then calculates the total reward of the current candidate path and compares it with the total rewards of the complete paths queried before; if the reward of the current path is higher than previously queried paths, it is taken as the optimal path model of the query. The above process is executed repeatedly until the parameters of the deep reinforcement learning component converge.
S5. Input two entities of the probabilistic knowledge graph, i.e., a start entity s and a target entity g, and convert each of them into a one-dimensional vector of length 512 through the trained word2vec word embedding encoder. Then merge the two vectors into a one-dimensional vector of length 1024 and use it as the input of the trained multi-layer CNN, obtaining the one-dimensional vectors of length 512 corresponding to the start entity and the target entity respectively. On this basis, the two one-dimensional vectors are then passed through a fully connected layer to generate a new vector of length 1024, which serves as the input of the trained reinforcement learning Policy network and Value network. The Policy network and the Value network alternate with each other and, starting from the start entity, give at each step the next entity that is optimal from the current entity toward the target entity, until the target entity is found. Finally, an optimal query path Path(s, g) is obtained whose starting point is the start entity s and whose end point is the target entity g.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical solution of the invention may be modified or equivalently replaced without departing from the spirit and scope of the technical solution of the invention, all of which should be covered by the scope of the claims of the invention.
Claims (10)
1. A knowledge graph optimal path query system based on deep reinforcement learning, characterized in that it comprises two modules, module one and module two, wherein module one is a knowledge graph optimal path model offline training module and module two is a knowledge graph optimal path model online application module; the knowledge graph optimal path model offline training module is equipped with a deep reinforcement learning component that performs deep reinforcement training and learning on the current entity to obtain the next entity, then repeats the training and learning with the next entity as the current entity to obtain an optimal path model; a start entity and a target entity are then input into the optimal path model obtained by module one, finally obtaining the optimal path.
2. The knowledge graph optimal path query system based on deep reinforcement learning according to claim 1, characterized in that the deep reinforcement learning component consists of an encoder, network components, and a logistic regression component; the network components include conversion components and a training component; the conversion components include a CNN neural network and an FC neural network, and the training component includes a reinforcement learning Policy (strategy) network and a reinforcement learning Value network.
3. The knowledge graph optimal path query system based on deep reinforcement learning according to claim 2, characterized in that the reinforcement learning Policy network is composed of a five-layer fully connected neural network; the number of nodes in the first four layers of the Policy network decreases layer by layer, and the fifth layer has k neurons; dropout is applied between the first and second layers and between the second and third layers of the Policy network to prevent overfitting, with the tanh activation function; batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the sigmoid activation function; the fourth layer is fully connected to the fifth layer to obtain the probabilities of the k predicted relations, which serve as the action selection for the next entity;
the reinforcement learning Value network is composed of a five-layer fully connected neural network; from the first layer to the fourth layer, the Value network uses fully connected layers whose widths decrease layer by layer, and the fifth layer has only one neuron; dropout is applied between the first and second layers and between the second and third layers of the Value network to prevent overfitting; the activation functions of the first and second layers are both tanh, and the activation function of the third layer is sigmoid; batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the relu activation function; the fourth layer is fully connected to the fifth layer, and the output is the cumulative reward, predicted by the Value network, of going from the current state to the target state.
4. A knowledge graph optimal path query method based on deep reinforcement learning, characterized in that it comprises the following steps:
S1. First, sort the entity relations in the probabilistic knowledge graph from largest to smallest by user visit frequency within a unit time, choose n relations, and generate the required data sample set;
S2. Input the data sample set into the deep reinforcement learning component for training and learning;
S3. Carry out, within the deep reinforcement learning component, the training and learning of three stages: stage 1, stage 2, and stage 3;
Stage 1: Convert entities into initial word vectors using the encoder, and then further process the encoded initial word vectors through 1-10 layers of CNN convolutional neural networks into the word vectors needed by the deep reinforcement learning component;
Stage 2: Based on the reinforcement learning Policy network, predict the relation the current entity should traverse next;
Stage 3: Based on the reinforcement learning Value network, perform value calculation on the selected strategy;
S4. After the training and learning of step S3, obtain the optimal path model for queries;
S5. Input the start entity and the target entity, convert each into a word vector, then merge the two word vectors and input them to the optimal path model of step S4 until the target entity is found, finally obtaining an optimal query path whose starting point is the start entity and whose end point is the target entity.
5. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 4, characterized in that n relations are chosen in step S1, with n not less than 1/10 of the total number of entity relations in the probabilistic knowledge graph; γ = n/2 relations are randomly selected from these n relations, and these γ relations in the probabilistic knowledge graph, together with the two entities connected by each relation, form the data sample set needed for model training.
6. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 4, characterized in that stage 1 of step S3 converts the input entities e_1 and e_2, through the encoder and the network components, into two word vectors G_θ(e_1) and G_θ(e_2), where θ is the set of network parameters to be optimized; a similarity calculation is performed on the two word vectors G_θ(e_1) and G_θ(e_2) obtained in stage 1 to find their cosine distance, as shown in the following formula:
D_θ(e_1, e_2) = ||G_θ(e_1) − G_θ(e_2)||_cos,
during training, each received data sample can be represented as {(F, e_1, e_2)}, where F is the label of each data sample, from which the training loss function L(θ) is constructed, where n is the total number of training samples;
stage 2 and stage 3 of step S3 are carried out in the training component of the deep reinforcement learning component: stage 2 performs strategy training and stage 3 performs value training, optimizing during training the parameter sets of the two networks, i.e., the parameter θ_p of the Policy network and the parameter θ_v of the Value network, and a four-tuple <state, reward, action, model> is used, where the state is represented by an entity in the probabilistic knowledge graph.
7. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 6, characterized in that the loss function L(θ) needs to be minimized and can be refined as follows: L_s denotes the loss between identical entities, and L_u denotes the loss between different entities; L_u must be made as small as possible, and L_s as large as possible.
8. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 6, characterized in that the strategy function and the value function are obtained through the target-driven deep reinforcement learning of the policy network and the value network: the strategy function is fitted by a neural network acting as a nonlinear function estimator, giving the strategy function f(e_t, g | θ_p); the value function is likewise fitted by a neural network acting as a nonlinear function estimator of the reward from the current node to the target node, giving the value function h(e_t, g | θ_v).
9. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 8, characterized in that the reward obtained from the value function is multiplied by the estimate of the strategy given by the strategy function to express the loss function of the policy network, as shown in the following formula:
L_f = log f(e_t, g | θ_p) × (r_t + γ h(e_{t+1}, g | θ_v) − h(e_t, g | θ_v)),
where γ ∈ (0, 1) denotes the discount factor; L_f is differentiated with respect to the parameter θ_p, and the parameter θ_p of the Policy network is updated by gradient ascent, giving the following formula:
θ_p ← θ_p + β ∇_{θ_p}(L_f + H(f)),
where ∇ denotes the derivative operation, H(f) denotes the entropy term of the strategy function f(e_t, g | θ_p), and β ∈ (0, 1) is the learning rate;
if the product of the current strategy and the reward brought by choosing that strategy is positive, the parameter θ_p of the Policy network is updated in the positive direction, so that the likelihood of predicting that state next time increases; if the product is negative, the parameter θ_p is updated in the reverse direction, so that the probability of predicting that state next time is as small as possible, until the strategy predicted by the current network no longer fluctuates.
10. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 8, characterized in that the absolute value of the difference between the obtained value function h(e_t, g | θ_v) and the actual reward of the current entity, r_t + γ h(e_{t+1}, g | θ_v), is calculated to obtain the loss function of the value network, as shown in the following formula:
L_h = |(r_t + γ × h(e_{t+1}, g | θ_v)) − h(e_t, g | θ_v)|,
where γ ∈ (0, 1) denotes the discount factor; L_h is differentiated with respect to the parameter θ_v, and the parameter θ_v of the Value network is updated by gradient descent, giving the following formula:
θ_v ← θ_v − β ∇_{θ_v} L_h,
where ∇ denotes the derivative operation; if the error between the predicted reward h(e_t, g | θ_v) and the computed reward r_t + γ h(e_{t+1}, g | θ_v) is greater than the user-given threshold l, the parameter θ_v of the Value network is updated so that the prediction error is as small as possible, until the error between the predicted reward h(e_t, g | θ_v) and the computed reward r_t + γ h(e_{t+1}, g | θ_v) no longer fluctuates outside the user-given threshold range [−l, l].
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810791353.6A CN109241291B (en) | 2018-07-18 | 2018-07-18 | Knowledge graph optimal path query system and method based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109241291A true CN109241291A (en) | 2019-01-18 |
CN109241291B CN109241291B (en) | 2022-02-15 |
Family
ID=65072112
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810791353.6A Active CN109241291B (en) | 2018-07-18 | 2018-07-18 | Knowledge graph optimal path query system and method based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109241291B (en) |
Application Events
- 2018-07-18: Application CN201810791353.6A filed in China (CN); granted as CN109241291B, status Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170124497A1 (en) * | 2015-10-28 | 2017-05-04 | Fractal Industries, Inc. | System for automated capture and analysis of business information for reliable business venture outcome prediction |
CN106776729A (en) * | 2016-11-18 | 2017-05-31 | 同济大学 | Method for constructing a large-scale knowledge graph path query predictor |
CN106598856A (en) * | 2016-12-14 | 2017-04-26 | 广东威创视讯科技股份有限公司 | Path detection method and path detection device |
CN106934012A (en) * | 2017-03-10 | 2017-07-07 | 上海数眼科技发展有限公司 | Knowledge graph-based natural language question answering method and system |
CN107577805A (en) * | 2017-09-26 | 2018-01-12 | 华南理工大学 | Business service system for log big data analysis |
CN107944025A (en) * | 2017-12-12 | 2018-04-20 | 北京百度网讯科技有限公司 | Information pushing method and device |
CN108073711A (en) * | 2017-12-21 | 2018-05-25 | 北京大学深圳研究生院 | Knowledge graph-based relation extraction method and system |
Cited By (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109818786B (en) * | 2019-01-20 | 2021-11-26 | 北京工业大学 | Method for optimally selecting distributed multi-resource combined path capable of sensing application of cloud data center |
CN109818786A (en) * | 2019-01-20 | 2019-05-28 | 北京工业大学 | Method for optimally selecting distributed multi-resource combined path capable of sensing application of cloud data center |
CN109829579A (en) * | 2019-01-22 | 2019-05-31 | 平安科技(深圳)有限公司 | Shortest path calculation method, device, computer equipment and storage medium |
CN111563209A (en) * | 2019-01-29 | 2020-08-21 | 株式会社理光 | Intention identification method and device and computer readable storage medium |
CN111611339A (en) * | 2019-02-22 | 2020-09-01 | 北京搜狗科技发展有限公司 | Recommendation method and device for inputting related users |
CN109947098A (en) * | 2019-03-06 | 2019-06-28 | 天津理工大学 | Distance-priority optimal route selection method based on a machine learning strategy |
CN110347857A (en) * | 2019-06-06 | 2019-10-18 | 武汉理工大学 | Semantic annotation method for remote sensing images based on reinforcement learning |
CN110391843A (en) * | 2019-06-19 | 2019-10-29 | 北京邮电大学 | Transmission quality prediction and path selection method and system for multi-domain optical network |
CN110391843B (en) * | 2019-06-19 | 2021-01-05 | 北京邮电大学 | Transmission quality prediction and path selection method and system for multi-domain optical network |
CN110288878A (en) * | 2019-07-01 | 2019-09-27 | 科大讯飞股份有限公司 | Adaptive learning method and device |
CN110288878B (en) * | 2019-07-01 | 2021-10-08 | 科大讯飞股份有限公司 | Self-adaptive learning method and device |
CN110825821B (en) * | 2019-09-30 | 2022-11-22 | 深圳云天励飞技术有限公司 | Personnel relationship query method and device, electronic equipment and storage medium |
CN110825821A (en) * | 2019-09-30 | 2020-02-21 | 深圳云天励飞技术有限公司 | Personnel relationship query method and device, electronic equipment and storage medium |
CN110956254A (en) * | 2019-11-12 | 2020-04-03 | 浙江工业大学 | Case reasoning method based on dynamic knowledge representation learning |
CN110990548A (en) * | 2019-11-29 | 2020-04-10 | 支付宝(杭州)信息技术有限公司 | Updating method and device of reinforcement learning model |
CN110990548B (en) * | 2019-11-29 | 2023-04-25 | 支付宝(杭州)信息技术有限公司 | Method and device for updating reinforcement learning model |
CN110825890A (en) * | 2020-01-13 | 2020-02-21 | 成都四方伟业软件股份有限公司 | Knowledge graph entity relation extraction method and device based on a pre-trained model |
CN113255347B (en) * | 2020-02-10 | 2022-11-15 | 阿里巴巴集团控股有限公司 | Method and equipment for realizing data fusion and method for realizing identification of unmanned equipment |
CN113255347A (en) * | 2020-02-10 | 2021-08-13 | 阿里巴巴集团控股有限公司 | Method and equipment for realizing data fusion and method for realizing identification of unmanned equipment |
CN111382359B (en) * | 2020-03-09 | 2024-01-12 | 北京京东振世信息技术有限公司 | Service policy recommendation method and device based on reinforcement learning, and electronic equipment |
CN111382359A (en) * | 2020-03-09 | 2020-07-07 | 北京京东振世信息技术有限公司 | Service strategy recommendation method and device based on reinforcement learning and electronic equipment |
CN111581343A (en) * | 2020-04-24 | 2020-08-25 | 北京航空航天大学 | Reinforcement learning knowledge graph reasoning method and device based on graph convolutional neural network |
CN111581343B (en) * | 2020-04-24 | 2022-08-30 | 北京航空航天大学 | Reinforcement learning knowledge graph reasoning method and device based on graph convolutional neural network |
CN111597209A (en) * | 2020-04-30 | 2020-08-28 | 清华大学 | Database materialized view construction system, method and system creation method |
CN111597209B (en) * | 2020-04-30 | 2023-11-14 | 清华大学 | Database materialized view construction system, method and system creation method |
CN111401557B (en) * | 2020-06-03 | 2020-09-18 | 超参数科技(深圳)有限公司 | Agent decision making method, AI model training method, server and medium |
CN111401557A (en) * | 2020-06-03 | 2020-07-10 | 超参数科技(深圳)有限公司 | Agent decision making method, AI model training method, server and medium |
CN114248265A (en) * | 2020-09-25 | 2022-03-29 | 广州中国科学院先进技术研究所 | Multi-task intelligent robot learning method and device based on meta-simulation learning |
CN114248265B (en) * | 2020-09-25 | 2023-07-07 | 广州中国科学院先进技术研究所 | Method and device for learning multi-task intelligent robot based on meta-simulation learning |
CN112801731A (en) * | 2021-01-06 | 2021-05-14 | 广东工业大学 | Federal reinforcement learning method for order taking auxiliary decision |
CN112966591B (en) * | 2021-03-03 | 2023-01-20 | 河北工业职业技术学院 | Knowledge graph deep reinforcement learning transfer system for mechanical arm grasping tasks |
CN112966591A (en) * | 2021-03-03 | 2021-06-15 | 河北工业职业技术学院 | Knowledge graph deep reinforcement learning transfer system for mechanical arm grasping tasks |
CN114626530A (en) * | 2022-03-14 | 2022-06-14 | 电子科技大学 | Reinforcement learning knowledge graph reasoning method based on bilateral path quality assessment |
CN115099401A (en) * | 2022-05-13 | 2022-09-23 | 清华大学 | Learning method, device and equipment of continuous learning framework based on world modeling |
CN115099401B (en) * | 2022-05-13 | 2024-04-26 | 清华大学 | Learning method, device and equipment of continuous learning framework based on world modeling |
CN115936091A (en) * | 2022-11-24 | 2023-04-07 | 北京百度网讯科技有限公司 | Deep learning model training method and device, electronic equipment and storage medium |
CN115936091B (en) * | 2022-11-24 | 2024-03-08 | 北京百度网讯科技有限公司 | Training method and device for deep learning model, electronic equipment and storage medium |
CN117009548A (en) * | 2023-08-02 | 2023-11-07 | 广东立升科技有限公司 | Knowledge graph supervision system for classified equipment maintenance |
CN117009548B (en) * | 2023-08-02 | 2023-12-26 | 广东立升科技有限公司 | Knowledge graph supervision system for classified equipment maintenance |
Also Published As
Publication number | Publication date |
---|---|
CN109241291B (en) | 2022-02-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109241291A (en) | Knowledge graph optimal path query system and method based on deep reinforcement learning | |
Han et al. | A survey on metaheuristic optimization for random single-hidden layer feedforward neural network | |
Leng et al. | Design for self-organizing fuzzy neural networks based on genetic algorithms | |
Nagib et al. | Path planning for a mobile robot using genetic algorithms | |
CN108537366B (en) | Reservoir scheduling method based on optimal convolution bidimensionalization | |
CN113239897B (en) | Human body action evaluation method based on space-time characteristic combination regression | |
Raiaan et al. | A systematic review of hyperparameter optimization techniques in Convolutional Neural Networks | |
CN104504442A (en) | Neural network optimization method | |
Chouikhi et al. | Single-and multi-objective particle swarm optimization of reservoir structure in echo state network | |
Zhang et al. | Evolving neural network classifiers and feature subset using artificial fish swarm | |
CN104732067A (en) | Flow-object-oriented industrial process modeling and forecasting method | |
Fofanah et al. | Experimental Exploration of Evolutionary Algorithms and their Applications in Complex Problems: Genetic Algorithm and Particle Swarm Optimization Algorithm | |
WO2022147583A2 (en) | System and method for optimal placement of interacting objects on continuous (or discretized or mixed) domains | |
Zuo et al. | Domain selection of transfer learning in fuzzy prediction models | |
Parsa et al. | Multi-objective hyperparameter optimization for spiking neural network neuroevolution | |
CN115620046A (en) | Multi-target neural architecture searching method based on semi-supervised performance predictor | |
Park et al. | DAG-GCN: Directed Acyclic Causal Graph Discovery from Real World Data using Graph Convolutional Networks | |
CN116611504A (en) | Neural architecture searching method based on evolution | |
Chien et al. | Stochastic curiosity maximizing exploration | |
de Oliveira et al. | An evolutionary extreme learning machine based on fuzzy fish swarms | |
Phatai et al. | Cultural algorithm initializes weights of neural network model for annual electricity consumption prediction | |
Ikushima et al. | Differential evolution neural network optimization with individual dependent mechanism | |
Zhang et al. | Bandit neural architecture search based on performance evaluation for operation selection | |
Abramowitz et al. | Towards run-time efficient hierarchical reinforcement learning | |
Srinivasan et al. | Electricity price forecasting using evolved neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |