CN109657792A - Method, apparatus and computer-readable medium for constructing a neural network - Google Patents
Method, apparatus and computer-readable medium for constructing a neural network
- Publication number: CN109657792A
- Application number: CN201811559229.3A
- Authority: CN (China)
- Prior art keywords: sample, correlation, degree, positive, neural network
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Neurology (AREA)
- Image Analysis (AREA)
Abstract
Embodiments of the present disclosure relate to a method, apparatus and computer-readable medium for constructing a neural network. The method comprises: selecting a sample group from a sample set for training the neural network, the sample group comprising at least two positive samples and one negative sample, the two positive samples comprising a first sample; processing the respective samples in the sample group using current values of a parameter set of the neural network to obtain corresponding feature representations; determining respective degrees of correlation between the first feature representation corresponding to the first sample and the feature representations corresponding to the remaining at least one positive sample and the one negative sample in the sample group; and determining updated values of the parameter set based on the respective degrees of correlation. With such a method, a better neural network model can be obtained from a limited number of training samples, and better feature representations can thereby be obtained.
Description
Technical field
Embodiments of the present disclosure relate to the field of artificial intelligence and, more particularly, to a method, apparatus and computer-readable medium for constructing a neural network.
Background technique
In recent years, machine learning techniques have developed continuously and achieved major breakthroughs in various fields such as computer vision, speech processing and artificial intelligence. They have significantly improved the performance of machine algorithms in tasks such as image classification, object detection, speech recognition, machine translation and information filtering, and have been widely applied in industries such as the Internet and video surveillance.
A neural network model processes input data based on a parameter set. During the training of a neural network model, the quality and quantity of the training samples directly determine whether the trained parameter set is accurate, and thus affect the final output of the model. In practical applications, labeling samples manually or through crowdsourcing is costly, and the accuracy of the labels obtained in this way is difficult to guarantee. How to adequately train a better neural network model on a limited number of samples has therefore become a focus of attention.
Summary of the invention
Embodiments of the present disclosure provide a scheme for constructing a neural network.
According to a first aspect of the present disclosure, a method for constructing a neural network is proposed. The method comprises: selecting a sample group from a sample set for training the neural network, the sample group comprising at least two positive samples and one negative sample, the two positive samples comprising a first sample; processing the respective samples in the sample group using current values of a parameter set of the neural network to obtain corresponding feature representations; determining respective degrees of correlation between the first feature representation corresponding to the first sample and the feature representations corresponding to the remaining at least one positive sample and the one negative sample in the sample group; and determining updated values of the parameter set based on the respective degrees of correlation.
According to a second aspect of the present disclosure, an apparatus for constructing a neural network is proposed. The apparatus comprises: a selection module configured to select a sample group from a sample set for training the neural network, the sample group comprising at least two positive samples and one negative sample, the two positive samples comprising a first sample; a processing module configured to process the respective samples in the sample group using current values of a parameter set of the neural network to obtain corresponding feature representations; a correlation determining module configured to determine respective degrees of correlation between the first feature representation corresponding to the first sample and the feature representations corresponding to the remaining at least one positive sample and the one negative sample in the sample group; and an updated value determining module configured to determine updated values of the parameter set based on the respective degrees of correlation.
In a third aspect of the present disclosure, a computer-readable storage medium is provided. The computer-readable storage medium has computer-readable program instructions stored thereon for performing the method according to the first aspect.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the present disclosure, nor is it intended to limit the scope of the present disclosure.
Detailed description of the invention
The above and other objects, features and advantages of the present disclosure will become more apparent from the following more detailed description of exemplary embodiments of the present disclosure with reference to the accompanying drawings, in which the same reference numerals generally represent the same components.
Fig. 1 illustrates a schematic diagram of an environment in which embodiments of the present disclosure can be implemented;
Fig. 2 illustrates a flowchart of a method of constructing a neural network according to an embodiment of the present disclosure;
Fig. 3 illustrates a flowchart of a method of obtaining a sample set according to an embodiment of the present disclosure;
Fig. 4 illustrates a schematic diagram of a system for constructing a neural network according to an embodiment of the present disclosure;
Fig. 5 illustrates a flowchart of a method of updating a parameter set according to some embodiments of the present disclosure;
Fig. 6 illustrates a flowchart of a method of updating a parameter set according to further embodiments of the present disclosure;
Fig. 7 illustrates a schematic block diagram of an apparatus for constructing a neural network according to an embodiment of the present disclosure; and
Fig. 8 illustrates a schematic block diagram of an example device that can be used to implement embodiments of the present disclosure.
Specific embodiment
Preferred embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the present disclosure to those skilled in the art.
As used herein, the term "comprising" and its variants indicate open inclusion, i.e. "including but not limited to". Unless otherwise stated, the term "or" means "and/or". The term "based on" means "based at least in part on". The terms "an example embodiment" and "an embodiment" mean "at least one example embodiment". The term "another embodiment" means "at least one further embodiment". The terms "first", "second" and the like may refer to different or identical objects. Other explicit and implicit definitions may also be included below.
As discussed above, the quality and quantity of training samples directly determine whether the trained parameter set of a model is accurate, and thus affect the final output of the neural network model. In practical applications, labeling samples manually or through crowdsourcing is costly, and the accuracy of the labels obtained in this way is difficult to guarantee. For example, in the case of crowdsourced labeling, the number of samples obtained is often limited. In addition, when the labeling process has no absolute standard, different labelers may make different judgments about the same sample. For example, in the field of online education, one may wish to label whether a student in a video answers a question fluently. Different labelers may then have different judgment standards and thus give different labeling results for the same video clip. Consequently, the samples obtained through crowdsourced labeling may contain a large number of inconsistent labels, making them difficult to use for effective neural network training.
According to embodiments of the present disclosure, a scheme for training a neural network is provided. In this scheme, a sample group is selected from a sample set for training the neural network, the sample group comprising at least two positive samples and one negative sample, the two positive samples comprising a first sample. The respective samples in the sample group are then processed using current values of a parameter set of the neural network to obtain corresponding feature representations. Next, respective degrees of correlation are determined between the first feature representation corresponding to the first sample and the feature representations corresponding to the remaining at least one positive sample and the one negative sample in the sample group. Based on the respective degrees of correlation, updated values of the parameter set are determined. With such a method, a better neural network model can be obtained from a limited number of training samples, and better feature representations can thereby be obtained.
Fig. 1 shows a schematic diagram of an environment 100 in which the method of the present disclosure can be implemented. As shown in Fig. 1, the environment 100 includes a computing device 130, which can be used to implement the training of a neural network in various implementations of the present disclosure. The computing device 130 can receive a neural network 110 to be trained and a sample set 122 from a sample database 120, in order to perform training of the network 110 on that sample set. The computing device 130 can output a trained parameter set 140, which in turn can be provided to the neural network 110 or to any other suitable neural network not shown.
The neural network 110 can learn certain knowledge and capabilities from existing data for processing new data. The neural network 110 can be designed to perform various tasks, such as image classification, object detection, speech recognition, machine translation, information filtering and the like. Examples of the neural network 110 include, but are not limited to, various kinds of deep neural networks (DNN), support vector machines (SVM), decision trees, random forest models and the like. In implementations of the present disclosure, a neural network may also be referred to as a "machine learning model". Hereinafter, the terms "neural network", "learning model", "learning network", "model" and "network" are used interchangeably.
In Fig. 1, the neural network 110 is shown as a deep neural network. A deep neural network has a layered architecture, in which each network layer has one or more processing nodes (referred to as neurons or filters) that process input based on corresponding parameters. In a deep neural network, the processed output of a preceding layer is the input of the next layer, where the first layer in the architecture receives the network input for processing and the output of the last layer is provided as the network output. As shown in Fig. 1, the neural network 110 includes network layers 112-1, 112-2 and so on, where the network layer 112-1 receives the network input and the network layer 112-2 provides the network output. The parameters used by all processing nodes of the neural network 110 constitute the parameter set of the neural network 110. The specific values of such a parameter set need to be determined through a training process.
It should be understood that the architecture of the neural network shown in Fig. 1, as well as the number of network layers and processing nodes therein, are illustrative. In different applications, the neural network can be designed with other architectures as needed.
A scheme for training a neural network according to embodiments of the present disclosure is described below with reference to Figs. 2-6. Fig. 2 illustrates a flowchart of a method 200 of training a neural network according to an embodiment of the present disclosure. The method 200 can be implemented by the computing device 130 in Fig. 1. The actions involved in the method 200 are described below with reference to the environment 100 described in Fig. 1.
At block 202, the computing device 130 selects a sample group from the sample set 120 for training the neural network 110, the sample group comprising at least two positive samples and one negative sample, the two positive samples comprising a first sample. In some embodiments, a positive sample indicates a sample labeled as "positive" with respect to a particular feature. For example, when judging whether a student in a video clip answers fluently, a positive sample indicates that the student answers fluently, and a negative sample indicates that the student does not answer fluently.
In some embodiments, the sample set 120 can be obtained through crowdsourced labeling. The process of a method 300 of obtaining a sample set according to an implementation of the present disclosure is described below with reference to Fig. 3.
At block 302, the computing device 130 obtains multiple crowdsourcing labels of a candidate sample, where the multiple crowdsourcing labels are labels provided by multiple users that mark the candidate sample as positive or negative. For example, five labelers may simultaneously label the same sample x. The crowdsourcing labels of sample x can then be expressed as (1, 1, 1, 0, 0), where 1 indicates that the labeler marked the sample as a positive sample and 0 indicates that the labeler marked the sample as a negative sample. Herein, a sample labeled as a positive sample may also be referred to simply as a positive sample. Similarly, a sample labeled as a negative sample may also be referred to simply as a negative sample.
At block 304, the computing device 130 determines whether the proportion of the multiple crowdsourcing labels that mark the candidate sample as positive is higher than a proportion threshold. Continuing the previous example, when the crowdsourcing labels of sample x are (1, 1, 1, 0, 0), the proportion labeled as positive is 3/5. In some embodiments, the proportion threshold can be preset. For example, the proportion threshold can be set to 0.5, i.e. when more than half of the labelers mark the sample as positive, the sample can be added to the sample set as a positive sample.
In response to determining at block 304 that the proportion is higher than the proportion threshold, the method 300 proceeds to block 306, i.e. the candidate sample is added as a positive sample in the sample set 120. For example, for a sample x1 with crowdsourcing labels (1, 1, 1, 0, 0), the proportion labeled as positive is higher than the predetermined threshold, so sample x1 will be determined as a positive sample in the sample set 120. Otherwise, the method 300 proceeds to block 308, where the candidate sample is added as a negative sample in the sample set 120. For example, for a sample x2 with crowdsourcing labels (1, 0, 0, 0, 0), the proportion labeled as positive is lower than the predetermined threshold, so sample x2 will be determined as a negative sample in the sample set 120. Through the crowdsourced labeling method shown in Fig. 3, each sample in the sample set 120 can have multiple corresponding crowdsourcing labels.
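The aggregation in blocks 302-308 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation; the function name and the default threshold of 0.5 are assumptions, and each candidate is kept together with its crowdsourcing labels so the labels remain available for the confidence weighting described later.

```python
def build_sample_set(candidates, proportion_threshold=0.5):
    """candidates: iterable of (sample, labels) pairs, labels like (1, 1, 1, 0, 0)."""
    positives, negatives = [], []
    for sample, labels in candidates:
        positive_proportion = sum(labels) / len(labels)  # share of "positive" votes
        if positive_proportion > proportion_threshold:   # block 304 -> block 306
            positives.append((sample, labels))
        else:                                            # block 304 -> block 308
            negatives.append((sample, labels))
    return positives, negatives
```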
In some embodiments, the sample set 120 may include multiple positive samples and multiple negative samples. When selecting a sample group, the computing device 130 can choose a pair of positive samples from the multiple positive samples. The computing device 130 can also randomly select one or more negative samples from the multiple negative samples.
With continued reference to Fig. 2, at block 204 the computing device 130 processes the respective samples in the sample group using the current values of the parameter set of the neural network 110 to obtain corresponding feature representations. At the initial stage of training, the parameter set of the neural network 110 can be set to initial values. The initial values can be determined randomly or in other ways, such as through a pre-training process. During training, the values of the parameter set can be updated iteratively using the method 200. In an iteration, the current values of the parameter set are the updated values of the parameter set obtained upon completion of the previous iteration.
Fig. 4 shows a schematic diagram of an architecture 400 for training the neural network 110 according to an embodiment of the present disclosure. As shown in Fig. 4, the computing device 130 selects a first sample 412-1 labeled as positive, a sample 412-2 different from the first sample 412-1 (hereinafter referred to as the second sample for convenience of description), and multiple samples 414-1, 414-2 to 414-k labeled as negative (hereinafter referred to individually or collectively as "third samples 414" for convenience of description). For example, the sample group 410 can be represented as $g_i = \langle x_i^+, x_j^+, x_1^-, x_2^-, \dots, x_k^- \rangle$, where $x_i^+$ denotes the first sample 412-1, $x_j^+$ denotes the second sample 412-2, and $x_1^-, x_2^-, \dots, x_k^-$ denote the k third samples. In this way, the computing device 130 can obtain a large number of different sample groups 410 from a sample set 120 of limited size, thereby overcoming the limitation of having few training samples.
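A minimal sketch of this group construction, assuming positives and negatives come from the sample set built above; the function name and the choice to draw negatives uniformly at random are illustrative:

```python
import random

def select_sample_group(positives, negatives, k):
    """Build g_i = <x_i+, x_j+, x_1-, ..., x_k-> from labeled samples."""
    x_i, x_j = random.sample(positives, 2)  # two distinct positive samples
    negs = random.sample(negatives, k)      # k negative samples drawn at random
    return [x_i, x_j] + negs
```

Because the two positives and the k negatives can be recombined freely, even a small sample set yields many distinct groups.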
Continuing the example of Fig. 4, the neural network 420 receives the sample group 410 and obtains the feature representations (432-1, 432-2, 434-1, 434-2 to 434-k) corresponding to the respective samples (412-1, 412-2, 414-1, 414-2 to 414-k) in the sample group. In Fig. 4, the neural network 420 is shown as a deep neural network. As shown in Fig. 4, the neural network 420 may include multiple instances of the neural network 110 shown in Fig. 1, each of which can receive a single sample in the sample group and process that sample using the parameter set to obtain a corresponding feature vector. It should be understood that the architecture of the neural network shown in Fig. 4, as well as the number of network layers and processing nodes therein, are illustrative. In different applications, the neural network model can be designed with other architectures as needed.
With continued reference to Fig. 2, at block 206 the computing device 130 determines respective degrees of correlation between the first feature representation corresponding to the first sample and the feature representations corresponding to the remaining at least one positive sample and the one negative sample in the sample group.
Continuing the example of Fig. 4, the feature representation 432-1 of the first sample 412-1 can be represented as $f_i^+$, the feature representation 432-2 of the second sample 412-2 can be represented as $f_j^+$, and similarly the feature representations (434-1, 434-2 to 434-k) of the third samples 414 (414-1, 414-2 to 414-k) can be represented as $f_1^-, f_2^-, \dots, f_k^-$. In some embodiments, the degree of correlation $r(x_i^+, x_*)$ between two feature representations can be calculated as the cosine of the angle between them: $r(x_i^+, x_*) = \text{cosine}(f_i^+, f_*)$. In some embodiments, the correlation between two feature representations can also be characterized by the exponential of the cosine value: $\exp(\eta \cdot r(x_i^+, x_*))$, where $\eta$ denotes a preset smoothing parameter. Thus, when updating the parameter set, the computing device 130 considers not only the degree of correlation between the two positive samples, but also the degrees of correlation between the positive sample and the negative samples. By computing these degrees of correlation, the computing device 130 can update the parameter set so that the degree of correlation between the first sample and the other positive sample becomes greater than the degrees of correlation between the first sample and the negative samples. Some specific implementations of block 206 are described in detail below with reference to Figs. 5 and 6.
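The two correlation measures can be written compactly as below. This is an illustrative sketch assuming the feature representations are vectors; PyTorch is used for convenience and is not mandated by the disclosure.

```python
import torch
import torch.nn.functional as F

def correlation(f_a, f_b):
    # r(x, x*) = cosine of the angle between the two feature representations
    return F.cosine_similarity(f_a, f_b, dim=-1)

def exp_correlation(f_a, f_b, eta=1.0):
    # exp(eta * r(x, x*)), with eta a preset smoothing parameter
    return torch.exp(eta * correlation(f_a, f_b))
```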
As mentioned above, the method 200 can be performed iteratively, continuously updating and optimizing the parameter set of the neural network 110 until a convergence condition is reached. In each iteration of the method 200, the sample group selected at block 202 can be the same or different. The current values of the parameter set in each update are the updated values obtained in the previous update.
Fig. 5 shows a flowchart of a process 500 of updating parameter values according to some embodiments of the present disclosure. At block 502, the computing device 130 determines a correlation sum by summing the respective degrees of correlation. Continuing the example of Fig. 4, the computing device 130 can sum the degree of correlation 440-1 between the feature representation 432-2 of the second sample and the feature representation 432-1 of the first sample and the degrees of correlation 440-2 to 440-M between the feature representations (434-1, 434-2 to 434-k) of the third samples and the feature representation 432-1 of the first sample, to obtain the correlation sum. In some embodiments, the correlation sum can also be expressed in exponential-sum form, for example as $\exp(\eta \cdot r(x_i^+, x_j^+)) + \sum_{m=1}^{k} \exp(\eta \cdot r(x_i^+, x_m^-))$, where $\eta$ denotes a preset smoothing parameter.
At block 504, the computing device 130 determines a positive sample correlation, the positive sample correlation indicating the degree of correlation between the first feature representation and the feature representation corresponding to the remaining at least one positive sample in the sample group. Continuing the example of Fig. 4, the positive sample correlation 440-1 can be represented as $r(x_i^+, x_j^+)$. In some embodiments, the positive sample correlation can also be expressed as $\exp(\eta \cdot r(x_i^+, x_j^+))$, where $\eta$ denotes the preset smoothing parameter.
At block 506, the computing device 130 determines the updated values of the parameter set based on the ratio of the positive sample correlation to the correlation sum. For example, continuing the example of Fig. 4, the computing device 130 can set the objective function 460 to the ratio of the positive sample correlation to the correlation sum, which can for example be represented as:

$$L = \frac{\exp(\eta \cdot r(x_i^+, x_j^+))}{\exp(\eta \cdot r(x_i^+, x_j^+)) + \sum_{m=1}^{k} \exp(\eta \cdot r(x_i^+, x_m^-))} \qquad (1)$$

In this equation, $x_i^+$ denotes the first positive sample 412-1 (with feature representation 432-1), $x_j^+$ denotes the second positive sample 412-2 (with feature representation 432-2), and $\eta$ denotes the preset smoothing parameter. In some embodiments, based on this objective function, the computing device 130 can use gradient descent (for example, on the negative logarithm of the ratio, so that reducing the loss increases the ratio) to determine the updated values of the parameter set.
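A sketch of objective (1) as a trainable loss, under the assumptions noted for the earlier snippets (PyTorch, vector features); minimizing the negative log of the ratio drives the positive-pair correlation above the negative correlations:

```python
def group_objective(f_i, f_j, f_negs, eta=1.0):
    """f_i, f_j: features of the two positives; f_negs: (k, dim) negative features."""
    pos = torch.exp(eta * F.cosine_similarity(f_i, f_j, dim=-1))
    neg = torch.exp(eta * F.cosine_similarity(f_i.unsqueeze(0), f_negs, dim=-1)).sum()
    ratio = pos / (pos + neg)  # objective (1): positive correlation over correlation sum
    return -torch.log(ratio)   # loss whose value gradient descent reduces
```

Summing this loss over n sample groups yields the batched objective (2) described below.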
In some embodiments, the computing device 130 can determine, based on the ratio, whether the training of the neural network 420 has converged. In some embodiments, the computing device 130 can compare the ratio with a predetermined ratio threshold. If it is determined that the ratio is greater than the predetermined ratio threshold, the computing device 130 determines that the training of the neural network 420 has converged. In some embodiments, the computing device 130 can also calculate whether the difference between the ratio calculated in the current iteration and the ratio calculated in the previous iteration is less than a predetermined difference threshold. If it is determined that the difference is less than the predetermined difference threshold, the computing device 130 determines that the training of the neural network 420 has converged. In some embodiments, the computing device 130 can also calculate the average of the ratio changes determined over multiple iterations, and determine whether the training of the neural network 420 has converged based on whether the change in this average across iterations is less than a predetermined threshold. After determining that the training of the neural network 420 has converged, the computing device 130 can output the current values of the parameter set.
In some embodiments, the computing device 130 can also pass n sample groups to the neural network at a time, and can set the objective function 460 as:

$$\mathcal{L}(\Omega) = -\sum_{m=1}^{n} \log L_m \qquad (2)$$

where $\Omega$ denotes the parameter set to be trained, n denotes the number of input sample groups, and $L_m$ denotes the ratio in formula (1) computed for the m-th sample group. Specifically, the computing device 130 can use gradient descent so that the value of the objective function (2) decreases over the iterative process, and output the current values of the parameter set when the difference between the value of the objective function (2) in a new iteration and its value in the previous iteration is less than a predetermined threshold, thereby completing the training of the neural network.
In some embodiments, the computing device 130 can also stop iterating after the number of iterations exceeds a threshold number, and output the updated parameter set so that it can be used as the trained parameter set of the neural network.
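A minimal training-loop sketch combining the pieces above. The encoder, the SGD optimizer, the group count n, the negative count k, and both stopping thresholds are illustrative assumptions; `group_objective` and `select_sample_group` are the sketches given earlier:

```python
optimizer = torch.optim.SGD(encoder.parameters(), lr=0.01)
previous_value = float("inf")
for iteration in range(max_iterations):       # cap on the number of iterations
    loss = 0.0
    for _ in range(n):                        # n sample groups per update
        # each group element is a (sample, labels) pair from build_sample_set
        x_i, x_j, *negs = select_sample_group(positives, negatives, k)
        f_i, f_j = encoder(x_i[0]), encoder(x_j[0])
        f_negs = torch.stack([encoder(x[0]) for x in negs])
        loss = loss + group_objective(f_i, f_j, f_negs)  # objective (2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if abs(previous_value - loss.item()) < difference_threshold:
        break                                 # training has converged
    previous_value = loss.item()
trained_parameter_set = encoder.state_dict()  # current values of the parameter set
```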
In some embodiments, the computing device 130 can also use the crowdsourcing labels of each sample to improve the accuracy of the parameters. Specifically, Fig. 6 shows a flowchart of a method 600 of updating the parameter set according to further embodiments of the present disclosure.
At block 602, the computing device 130 determines, based on multiple crowdsourcing labels associated with the remaining at least one positive sample and the negative sample in the sample group, confidences associated with the at least one positive sample and the at least one negative sample, the multiple crowdsourcing labels being labels provided by multiple users that mark the candidate samples as positive or negative. As indicated above, the sample set 120 can be obtained through crowdsourcing, so that each sample in the sample set 120 can have multiple crowdsourcing labels. In some embodiments, the computing device 130 can determine the confidence based on the proportion of the crowdsourcing labels of a given sample that mark it as positive or negative. For example, for a positive sample with crowdsourcing labels (1, 1, 1, 0, 0), the computing device 130 can calculate a maximum likelihood estimate as its confidence:

$$\delta_i = \frac{1}{d} \sum_{j=1}^{d} y_{i,j} \qquad (3)$$

where d denotes the number of labelers and $y_{i,j}$ denotes the j-th crowdsourcing label of the i-th sample.
In some embodiments, the computing device 130 can also use a Bayesian confidence as the confidence of each sample:

$$\delta_i = \frac{\sum_{j=1}^{d} y_{i,j} + \alpha}{d + \alpha + \beta} \qquad (4)$$

where $\alpha$ and $\beta$ represent prior knowledge about the distribution of positive and negative samples in the sample set; for example, $\alpha$ can be the known percentage of positive samples in the sample set multiplied by d, and $\beta$ can be the known percentage of negative samples in the sample set multiplied by d.
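Both confidence estimates, sketched for a sample whose aggregated label is positive (treating alpha and beta as prior pseudo-counts, and using the proportion of negative votes analogously for a sample aggregated as negative, are conventions assumed here for illustration):

```python
def mle_confidence(labels):
    # formula (3): fraction of labelers voting "positive"
    return sum(labels) / len(labels)

def bayesian_confidence(labels, alpha, beta):
    # formula (4): vote count smoothed by prior pseudo-counts alpha and beta
    d = len(labels)
    return (sum(labels) + alpha) / (d + alpha + beta)
```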
At block 604, the computing device 130 weights, based on the confidences, the degrees of correlation between the first feature representation and the feature representations corresponding to the remaining at least one positive sample and the negative sample in the sample group. As shown in Fig. 4, the computing device 130 can weight the degrees of correlation 440-1 to 440-M to obtain weighted degrees of correlation 450-1 to 450-M. In some embodiments, the computing device 130 can determine whether to weight the degree of correlation corresponding to a sample based on whether that sample has crowdsourcing labels. For example, when a sample has been calibrated by an expert and has no crowdsourcing labels, its confidence can be set to 1 by default.
At block 606, the computing device 130 determines the updated values of the parameter set based on the weighted degrees of correlation. Continuing the example of Fig. 4, after the confidences are taken into account, formula (1) can be updated to a new objective function:

$$L = \frac{\delta_j \exp(\eta \cdot r(x_i^+, x_j^+))}{\delta_j \exp(\eta \cdot r(x_i^+, x_j^+)) + \sum_{m=1}^{k} \delta_m \exp(\eta \cdot r(x_i^+, x_m^-))} \qquad (5)$$

where $\delta$ denotes the corresponding confidence. It should be understood that formula (2) can be updated accordingly. In some embodiments, based on the new objective function, the computing device 130 can use gradient descent to reduce the value of the objective (for example, its negative logarithm), thereby determining the updated values of the parameter set. By introducing confidences calculated in the manner described above, the computing device 130 can fully take the information of the crowdsourcing labels into account during training, increasing the weight of samples whose crowdsourcing labels vote more consistently and reducing the weight of samples whose crowdsourcing labels vote less consistently, so that the parameters obtained by the neural network training are more accurate and, in turn, more accurate feature representations can be obtained through the neural network.
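A sketch of the confidence-weighted objective (5), extending `group_objective` above; `delta_j` and `delta_negs` are the confidences of the second positive sample and the k negative samples (set to 1 for expert-calibrated samples without crowdsourcing labels):

```python
def weighted_group_objective(f_i, f_j, f_negs, delta_j, delta_negs, eta=1.0):
    """delta_negs: tensor of shape (k,) holding per-negative confidences."""
    pos = delta_j * torch.exp(eta * F.cosine_similarity(f_i, f_j, dim=-1))
    neg = (delta_negs * torch.exp(
        eta * F.cosine_similarity(f_i.unsqueeze(0), f_negs, dim=-1))).sum()
    return -torch.log(pos / (pos + neg))  # weighted form of the loss for (1)
```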
In some embodiments, the computing device 130 can determine, based on the objective function (5), whether the training of the neural network 420 has converged. In some embodiments, the computing device 130 can compare the weighted ratio with a predetermined ratio threshold. If it is determined that the ratio is greater than the predetermined ratio threshold, the computing device 130 determines that the training of the neural network 420 has converged. In some embodiments, the computing device 130 can also calculate whether the difference between the ratio calculated in the current iteration and the ratio calculated in the previous iteration is less than a predetermined difference threshold. If it is determined that the difference is less than the predetermined difference threshold, the computing device 130 determines that the training of the neural network 420 has converged. In some embodiments, the computing device 130 can also calculate the average of the ratio changes determined over multiple iterations, and determine whether the training of the neural network 420 has converged based on whether the change in this average across iterations is less than a predetermined threshold. After determining that the training of the neural network 420 has converged, the computing device 130 can output the current values of the parameter set.
In some embodiments, after completing the training of the neural network 110, the computing device 130 can output the trained parameter set 140 to the neural network 110 or to another neural network to obtain a trained neural network. The trained neural network can use the trained parameter set 140 to convert a model input (for example, speech, video, text, a picture, etc.) into a feature representation, and can in turn determine, through a logistic regression model, whether the model input is positive.
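An inference sketch under the same assumptions as the earlier snippets; `encoder`, `classifier` (the logistic regression head) and `model_input` are illustrative names, not identifiers from the disclosure:

```python
encoder.load_state_dict(trained_parameter_set)  # apply the trained parameter set 140
encoder.eval()
with torch.no_grad():
    feature = encoder(model_input)                    # model input -> feature representation
    probability = torch.sigmoid(classifier(feature))  # logistic regression on the feature
    is_positive = probability > 0.5                   # conclusion: positive or not
```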
With the scheme for training a neural network described herein, training can be performed by combining positive and negative samples into sample groups, thereby alleviating the problem of limited training samples in the sample set. In addition, during training, not only the degree of correlation between positive samples is considered, but the degrees of correlation between positive and negative samples are further considered; the parameters are updated by making the degree of correlation between positive samples sufficiently large, so that the parameters of the neural network are more accurate and, in turn, more accurate feature representations can be obtained through the neural network.
Fig. 7 shows a block diagram of an apparatus 700 for constructing a neural network according to an embodiment of the present disclosure. The apparatus 700 can be included in the computing device 130 of Fig. 1 or implemented as the computing device 130. As shown in Fig. 7, the apparatus 700 includes a selection module 710 configured to select a sample group from a sample set for training a neural network, the sample group comprising at least two positive samples and one negative sample, the two positive samples comprising a first sample. The apparatus 700 further includes a processing module 720 configured to process the respective samples in the sample group using current values of a parameter set of the neural network to obtain corresponding feature representations. The apparatus 700 further includes a correlation determining module 730 configured to determine respective degrees of correlation between the first feature representation corresponding to the first sample and the feature representations corresponding to the remaining at least one positive sample and the one negative sample in the sample group. In addition, the apparatus 700 further includes an updated value determining module 740 configured to determine updated values of the parameter set based on the respective degrees of correlation.
In some embodiments, the apparatus 700 further includes a sample set determining module, which includes: an obtaining module configured to obtain multiple crowdsourcing labels of a candidate sample, the multiple crowdsourcing labels being labels provided by multiple users that mark the candidate sample as positive or negative; a proportion determining module configured to determine whether the proportion of the multiple crowdsourcing labels of the candidate sample that mark the candidate sample as positive is higher than a proportion threshold; a positive sample adding module configured to add the candidate sample as a positive sample in the sample set in response to the proportion being higher than the proportion threshold; and a negative sample adding module configured to add the candidate sample as a negative sample in the sample set in response to the proportion being lower than or equal to the proportion threshold.
In some embodiments, the updated value determining module 740 includes: a sum determining module configured to determine a correlation sum by summing the respective degrees of correlation; a positive sample correlation determining module configured to determine a positive sample correlation, the positive sample correlation indicating the degree of correlation between the first feature representation and the feature representation corresponding to the remaining at least one positive sample in the sample group; and a first updated value determining module configured to determine the updated values of the parameter set based on the ratio of the positive sample correlation to the correlation sum.
In some embodiments, the apparatus 700 further includes: a first convergence determining module configured to determine, based on the ratio of the positive sample correlation to the correlation sum, whether the training of the neural network has converged; and a first output module configured to output the current values of the parameter set in response to determining that the training of the neural network has converged.
In some embodiments, the updated value determining module 740 includes: a confidence determining module configured to determine, based on multiple crowdsourcing labels associated with the remaining at least one positive sample and the negative sample in the sample group, confidences associated with the at least one positive sample and the at least one negative sample, the multiple crowdsourcing labels being labels provided by multiple users that mark the candidate samples as positive or negative; a weighting module configured to weight, based on the confidences, the degrees of correlation between the first feature representation and the feature representations corresponding to the remaining at least one positive sample and the negative sample in the sample group; and a second updated value determining module configured to determine the updated values of the parameter set based on the weighted degrees of correlation.
In some embodiments, the confidence includes a Bayesian confidence.
In some embodiments, the apparatus 700 further includes: a second convergence determining module configured to determine, based on the weighted degrees of correlation, whether the training of the neural network has converged; and a second output module configured to output the current values of the parameter set in response to determining that the training of the neural network has converged.
Fig. 8 shows a schematic block diagram of an example device 800 that can be used to implement embodiments of the present disclosure. For example, the computing device 130 shown in Fig. 1 can be implemented by the device 800. As shown, the device 800 includes a central processing unit (CPU) 801, which can perform various appropriate actions and processing according to computer program instructions stored in a read-only memory (ROM) 802 or computer program instructions loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The CPU 801, the ROM 802 and the RAM 803 are connected to one another via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Multiple components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse and the like; an output unit 807, such as various types of displays, loudspeakers and the like; a storage unit 808, such as a magnetic disk, an optical disc and the like; and a communication unit 809, such as a network card, a modem, a wireless communication transceiver and the like. The communication unit 809 allows the device 800 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
The processes and processing described above, such as the method 200, the method 300, the method 500 and/or the method 600, can be executed by the processing unit 801. For example, in some embodiments, the method 200, the method 300, the method 500 and/or the method 600 can be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the CPU 801, one or more actions of the method 200, the method 300, the method 500 and/or the method 600 described above can be performed.
The present disclosure can be a method, an apparatus, a system and/or a computer program product. The computer program product can include a computer-readable storage medium having computer-readable program instructions thereon for carrying out various aspects of the present disclosure.
The computer-readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable storage medium includes: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described herein can be downloaded to respective computing/processing devices from the computer-readable storage medium, or downloaded to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ or the like, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the scenario involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a field-programmable gate array (FPGA) or a programmable logic array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions in order to implement various aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It should be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processing unit of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause a computer, a programmable data processing apparatus and/or other devices to function in a particular manner, such that the computer-readable medium having the instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus or other devices, so as to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus or other devices to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus or other devices implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may, in fact, be executed substantially in parallel, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by dedicated hardware-based systems that perform the specified functions or actions, or by combinations of dedicated hardware and computer instructions.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The terms used herein were chosen in order to best explain the principles of the embodiments, the practical application or the improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (15)
1. a kind of method for constructing neural network, comprising:
From for training the sample set of neural network to select sample group, the sample group includes at least two positive samples and a negative sample
This, includes first sample in two positive sample;
The respective sample in the sample group is handled respectively using the current value of the parameter set of the neural network, with acquisition pair
The character representation answered;
Determine the corresponding fisrt feature of the first sample indicate respectively with an at least positive sample remaining in the sample group
The corresponding degree of correlation of character representation corresponding with a negative sample;And
Based on the corresponding degree of correlation, the updated value of the parameter set is determined.
2. according to the method described in claim 1, further including determining the sample set by following:
Obtain multiple crowdsourcing labels of candidate samples, the multiple crowdsourcing label is provided by multiple users by the candidate sample
This is labeled as positive or negative label;
Determine whether is the accounting of label that candidate samples label is positive in multiple crowdsourcing labels of the candidate samples
Higher than accounting threshold value;
It is higher than the accounting threshold value, the positive sample that the candidate samples are added in the sample set in response to the accounting
This;And
It is less than or equal to the accounting threshold value in response to the accounting, the candidate samples is added to one in the sample set
Negative sample.
3. according to the method described in claim 1, wherein determining that the updated value of the parameter set includes:
Degree of correlation summation is determined by summing it up the corresponding degree of correlation;
Determine the positive sample degree of correlation, the positive sample degree of correlation indicate the fisrt feature indicate with it is remaining in the sample group
The corresponding degree of correlation of the corresponding character representation of an at least positive sample;And
Ratio based on the positive sample degree of correlation Yu the degree of correlation summation, determines the updated value of the parameter set.
4. according to the method described in claim 3, further include:
The ratio based on the positive sample degree of correlation Yu the degree of correlation, determines whether the training of the neural network receives
It holds back;And
In response to the training convergence of the determination neural network, the current value of the parameter set is exported.
5. according to the method described in claim 1, wherein determining that the updated value of the parameter set includes:
Based on multiple crowdsourcing labels associated with an at least positive sample remaining in the sample group and negative sample, determining and institute
An at least positive sample and the associated confidence level of an at least negative sample are stated, the multiple crowdsourcing label is mentioned by multiple users
What is supplied is labeled as positive or negative label for the candidate samples;
Based on the confidence level, to an at least positive sample remaining in fisrt feature expression and the sample group and bear
The degree of correlation of the corresponding character representation of sample is weighted;And
Based on the weighted degree of correlation, the updated value of the parameter set is determined.
6. according to the method described in claim 5, wherein the confidence level includes Bayesian confidence.
7. according to the method described in claim 5, further include:
Based on the weighted degree of correlation, determine whether the training of the neural network restrains;And
In response to the training convergence of the determination neural network, the current value of the parameter set is exported.
8. a kind of for constructing the device of neural network, comprising:
Selecting module is configured as from for training the sample set of neural network to select sample group, and the sample group includes at least
Two positive samples and a negative sample include first sample in two positive sample;
Processing module is configured as handling respectively in the sample group using the current value of the parameter set of the neural network
Respective sample, to obtain corresponding character representation;
Degree of correlation determining module, be configured to determine that the corresponding fisrt feature of the first sample indicate respectively with the sample
The corresponding degree of correlation of remaining an at least positive sample and the corresponding character representation of a negative sample in group;And
Updated value determining module is configured as determining the updated value of the parameter set based on the corresponding degree of correlation.
9. device according to claim 8 further includes sample set determining module, the sample set determining module includes:
Module is obtained, is configured as obtaining multiple crowdsourcing labels of candidate samples, the multiple crowdsourcing label is by multiple users
What is provided is labeled as positive or negative label for the candidate samples;
Accounting determining module is configured to determine that marking the candidate samples in multiple crowdsourcing labels of the candidate samples
Whether the accounting for the label being positive is higher than accounting threshold value;
Positive sample adding module is configured to respond to the accounting higher than the accounting threshold value, the candidate samples is added
For the positive sample in the sample set;And
Negative sample adding module is configured to respond to the accounting less than or equal to the accounting threshold value, by the candidate sample
Originally the negative sample being added in the sample set.
10. device according to claim 8, wherein the updated value determining module includes:
Summation determining module is configured as determining degree of correlation summation by summing it up the corresponding degree of correlation;
Positive sample degree of correlation determining module, is configured to determine that the positive sample degree of correlation, the positive sample degree of correlation instruction described the
The corresponding degree of correlation of one character representation and the corresponding character representation of a remaining at least positive sample in the sample group;And
First updated value determining module is configured as the ratio based on the positive sample degree of correlation Yu the degree of correlation summation, really
The updated value of the fixed parameter set.
11. device according to claim 10, further includes:
First convergence determining module, is configured as the ratio based on the positive sample degree of correlation Yu the degree of correlation, determines
Whether the training of the neural network restrains;And
First output module is configured to respond to determine the training convergence of the neural network, exports working as the parameter set
Preceding value.
12. The device according to claim 8, wherein the updated value determining module comprises:
A confidence determining module, configured to determine, based on a plurality of crowdsourced labels associated with the remaining at least one positive sample and the negative sample in the sample group, confidence levels associated with the at least one positive sample and the negative sample, the plurality of crowdsourced labels being labels, provided by a plurality of users, that mark the samples as positive or negative;
A weighting module, configured to weight, based on the confidence levels, the degrees of correlation between the first feature representation and the feature representations corresponding to the remaining at least one positive sample and the negative sample in the sample group; and
A second updated value determining module, configured to determine the updated value of the parameter set based on the weighted degrees of correlation.
13. The device according to claim 12, wherein the confidence levels comprise Bayesian confidence levels.
14. The device according to claim 12, further comprising:
A second convergence determining module, configured to determine, based on the weighted degrees of correlation, whether the training of the neural network has converged; and
A second output module, configured to output the current value of the parameter set in response to determining that the training of the neural network has converged.
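Claims 12 to 14 weight each degree of correlation by a confidence derived from the crowdsourced labels before forming the update and the convergence test, and claim 13 names Bayesian confidence. The following sketch assumes a Beta-Bernoulli posterior mean as that confidence; the uniform prior, the function names, and the per-sample vote counts are illustrative assumptions, not part of the claims.

```python
def bayesian_confidence(n_positive: int, n_total: int,
                        alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior mean of a Beta-Bernoulli model over crowdsourced votes.

    One plausible instantiation of the 'Bayesian confidence' of claim 13;
    the uniform Beta(1, 1) prior is an assumption.
    """
    return (n_positive + alpha) / (n_total + alpha + beta)

def weighted_correlations(correlations, vote_counts):
    """Weight each degree of correlation by its label confidence (claim 12).

    correlations: degrees of correlation between the first feature
        representation and each remaining sample's representation.
    vote_counts: (votes agreeing with the sample's label, total votes)
        per remaining sample.
    """
    weights = [bayesian_confidence(p, t) for p, t in vote_counts]
    return [w * c for w, c in zip(weights, correlations)]

# Example: the second sample has weaker annotator agreement (6/10 vs 9/10),
# so its correlation contributes less to the parameter update.
print(weighted_correlations([0.8, 0.6], [(9, 10), (6, 10)]))
```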
15. A computer-readable storage medium having computer-readable program instructions stored thereon, the computer-readable program instructions being for performing the method according to any one of claims 1-7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811559229.3A CN109657792A (en) | 2018-12-19 | 2018-12-19 | Construct the method, apparatus and computer-readable medium of neural network |
PCT/CN2019/122677 WO2020125404A1 (en) | 2018-12-19 | 2019-12-03 | Method and apparatus for constructing neural network and computer-readable medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811559229.3A CN109657792A (en) | 2018-12-19 | 2018-12-19 | Construct the method, apparatus and computer-readable medium of neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109657792A (en) | 2019-04-19 |
Family
ID=66114940
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811559229.3A Pending CN109657792A (en) | 2018-12-19 | 2018-12-19 | Construct the method, apparatus and computer-readable medium of neural network |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109657792A (en) |
WO (1) | WO2020125404A1 (en) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9967211B2 (en) * | 2015-05-31 | 2018-05-08 | Microsoft Technology Licensing, Llc | Metric for automatic assessment of conversational responses |
CN106021364B (en) * | 2016-05-10 | 2017-12-12 | 百度在线网络技术(北京)有限公司 | Establishment of image search relevance prediction model, image search method and apparatus |
CN108009528B (en) * | 2017-12-26 | 2020-04-07 | 广州广电运通金融电子股份有限公司 | Triple Loss-based face authentication method and device, computer equipment and storage medium |
CN108764065B (en) * | 2018-05-04 | 2020-12-08 | 华中科技大学 | Pedestrian re-recognition feature fusion aided learning method |
CN109657792A (en) * | 2018-12-19 | 2019-04-19 | 北京世纪好未来教育科技有限公司 | Construct the method, apparatus and computer-readable medium of neural network |
2018
- 2018-12-19: CN application CN201811559229.3A (CN109657792A) filed, status: Pending
2019
- 2019-12-03: WO application PCT/CN2019/122677 (WO2020125404A1) filed, status: Application Filing
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020125404A1 (en) * | 2018-12-19 | 2020-06-25 | 北京世纪好未来教育科技有限公司 | Method and apparatus for constructing neural network and computer-readable medium |
CN112443019A (en) * | 2019-09-05 | 2021-03-05 | 梅州市青塘实业有限公司 | Closestool and flushing control method and device thereof |
CN112529029A (en) * | 2019-09-18 | 2021-03-19 | 华为技术有限公司 | Information processing method, neural network training method, device and storage medium |
CN110765943A (en) * | 2019-10-23 | 2020-02-07 | 深圳市商汤科技有限公司 | Network training and recognition method and device, electronic equipment and storage medium |
CN111860769A (en) * | 2020-06-16 | 2020-10-30 | 北京百度网讯科技有限公司 | Method and device for pre-training neural network |
CN112766320A (en) * | 2020-12-31 | 2021-05-07 | 平安科技(深圳)有限公司 | Classification model training method and computer equipment |
CN112766320B (en) * | 2020-12-31 | 2023-12-22 | 平安科技(深圳)有限公司 | Classification model training method and computer equipment |
Also Published As
Publication number | Publication date |
---|---|
WO2020125404A1 (en) | 2020-06-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109657792A (en) | Construct the method, apparatus and computer-readable medium of neural network | |
Wang et al. | Learning to Represent Student Knowledge on Programming Exercises Using Deep Learning. | |
CN110390108A (en) | Task interaction method and system based on deep reinforcement learning | |
CN113344053B (en) | Knowledge tracking method based on examination question different composition representation and learner embedding | |
US20150254556A1 (en) | Systems and Methods for Allocating Capital to Trading Strategies for Big Data Trading in Financial Markets | |
CN111328407A (en) | Mechanical learning method, apparatus and computer program for providing customized personal educational content based on learning efficiency | |
CN110288007A (en) | Method, apparatus and electronic equipment for data annotation | |
Frauenberger et al. | Ways of thinking in informatics | |
JP7222274B2 (en) | Model learning device, label estimation device, methods thereof, and program | |
CN111241992A (en) | Face recognition model construction method, recognition method, device, equipment and storage medium | |
CN107437111A (en) | Data processing method, medium, device and computing device based on neural network | |
CN113591988B (en) | Knowledge cognitive structure analysis method, system, computer equipment, medium and terminal | |
CN115187772A (en) | Training method, device and equipment of target detection network and target detection method, device and equipment | |
CN110377733A (en) | Text-based emotion recognition method, terminal device and medium | |
CN114861754A (en) | Knowledge tracking method and system based on external attention mechanism | |
CN110222838A (en) | Deep neural network and its training method, device, electronic equipment and storage medium | |
CN105989438A (en) | Task relation management method, apparatus and system thereof, and electronic equipment | |
CN111159241A (en) | Click conversion estimation method and device | |
CN113705159A (en) | Merchant name labeling method, device, equipment and storage medium | |
CN114117033B (en) | Knowledge tracking method and system | |
CN113312445B (en) | Data processing method, model construction method, classification method and computing equipment | |
Krishnan et al. | Incorporating Wide Context Information for Deep Knowledge Tracing using Attentional Bi-interaction. | |
CN114170484A (en) | Picture attribute prediction method and device, electronic equipment and storage medium | |
CN113919979A (en) | Knowledge tracking method and device, nonvolatile storage medium and electronic device | |
CN113283584A (en) | Knowledge tracking method and system based on twin network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190419 |