
CN109657792A - Method, apparatus and computer-readable medium for constructing a neural network - Google Patents

Method, apparatus and computer-readable medium for constructing a neural network

Info

Publication number
CN109657792A
CN109657792A
Authority
CN
China
Prior art keywords: sample, correlation degree, positive, neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811559229.3A
Other languages
Chinese (zh)
Inventor
许国伟
丁文彪
杨松帆
刘子韬
张邦鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Century TAL Education Technology Co Ltd
Original Assignee
Beijing Century TAL Education Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Century TAL Education Technology Co Ltd filed Critical Beijing Century TAL Education Technology Co Ltd
Priority to CN201811559229.3A priority Critical patent/CN109657792A/en
Publication of CN109657792A publication Critical patent/CN109657792A/en
Priority to PCT/CN2019/122677 priority patent/WO2020125404A1/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure relate to a training method, an apparatus and a computer-readable medium for constructing a neural network. The method comprises: selecting a sample group from a sample set for training the neural network, the sample group including at least two positive samples and one negative sample, the two positive samples including a first sample; processing each sample in the sample group using current values of a parameter set of the neural network to obtain corresponding feature representations; determining correlation degrees between a first feature representation corresponding to the first sample and the feature representations corresponding to at least one remaining positive sample and one negative sample in the sample group; and determining updated values of the parameter set based on the correlation degrees. With this method, a better neural network model can be obtained from a limited number of training samples, and thus better feature representations.

Description

Method, apparatus and computer-readable medium for constructing a neural network
Technical field
Embodiments of the present disclosure relate to the field of artificial intelligence and, more particularly, to a method, an apparatus and a computer-readable medium for constructing a neural network.
Background technique
In recent years, machine learning techniques have advanced continuously and achieved major breakthroughs in fields such as computer vision, speech processing and artificial intelligence, markedly improving the performance of machine algorithms on tasks such as image classification, object detection, speech recognition, machine translation and information filtering. These techniques have been widely applied in industries such as the Internet and video surveillance.
A neural network model processes input data based on a parameter set. When a neural network model is trained, the quality and quantity of the training samples directly determine whether the trained parameter set is accurate, which in turn affects the final output of the model. In practice, labeling samples manually or by crowdsourcing is costly, and the accuracy of the resulting labels is hard to guarantee. How to adequately train a better neural network model from limited samples has therefore become a focus of current attention.
Summary of the invention
Embodiments of the present disclosure provide a scheme for constructing a neural network.
According to a first aspect of the present disclosure, a method for constructing a neural network is proposed. The method comprises: selecting a sample group from a sample set for training the neural network, the sample group including at least two positive samples and one negative sample, the two positive samples including a first sample; processing each sample in the sample group using current values of a parameter set of the neural network to obtain corresponding feature representations; determining correlation degrees between a first feature representation corresponding to the first sample and the feature representations corresponding to at least one remaining positive sample and one negative sample in the sample group; and determining updated values of the parameter set based on the correlation degrees.
According to a second aspect of the present disclosure, an apparatus for constructing a neural network is proposed. The apparatus comprises: a selection module configured to select a sample group from a sample set for training the neural network, the sample group including at least two positive samples and one negative sample, the two positive samples including a first sample; a processing module configured to process each sample in the sample group using current values of a parameter set of the neural network to obtain corresponding feature representations; a correlation degree determining module configured to determine correlation degrees between a first feature representation corresponding to the first sample and the feature representations corresponding to at least one remaining positive sample and one negative sample in the sample group; and an updated value determining module configured to determine updated values of the parameter set based on the correlation degrees.
In a third aspect of the present disclosure, a computer-readable storage medium is provided. The computer-readable storage medium has computer-readable program instructions stored thereon, the computer-readable program instructions being used to perform the method according to the first aspect.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. It is not intended to identify key features or essential features of the disclosure, nor is it intended to limit the scope of the disclosure.
Brief description of the drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following more detailed description of exemplary embodiments of the disclosure, taken in conjunction with the accompanying drawings, in which like reference numerals generally denote like parts.
Fig. 1 illustrates a schematic diagram of an environment in which embodiments of the present disclosure can be implemented;
Fig. 2 illustrates a flowchart of a method of constructing a neural network according to an embodiment of the present disclosure;
Fig. 3 illustrates a flowchart of a method of obtaining a sample set according to an embodiment of the present disclosure;
Fig. 4 illustrates a schematic diagram of a system for constructing a neural network according to an embodiment of the present disclosure;
Fig. 5 illustrates a flowchart of a method of updating a parameter set according to some embodiments of the present disclosure;
Fig. 6 illustrates a flowchart of a method of updating a parameter set according to further embodiments of the present disclosure;
Fig. 7 illustrates a schematic block diagram of an apparatus for constructing a neural network according to an embodiment of the present disclosure; and
Fig. 8 illustrates a schematic block diagram of an example device that can be used to implement embodiments of the present disclosure.
Detailed description
Preferred embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As used herein, the term "comprising" and its variants denote open-ended inclusion, that is, "including but not limited to". Unless otherwise stated, the term "or" means "and/or". The term "based on" means "based at least in part on". The terms "an example embodiment" and "an embodiment" mean "at least one example embodiment". The term "another embodiment" means "at least one further embodiment". The terms "first", "second" and so on may refer to different or identical objects. Other explicit and implicit definitions may also be included below.
As discussed above, the quality and quantity of training samples directly determine whether the trained parameter set of a model is accurate, and in turn affect the final output of the neural network model. In practice, labeling samples manually or by crowdsourcing is costly, and the accuracy of the labels obtained is also hard to guarantee. For example, in the case of crowdsourced labeling, the number of samples obtained is often limited. In addition, when the labeling process has no absolute standard, different annotators may make different judgments about the same sample. For example, in the field of online education, one may wish to label whether a student in a video answers a question fluently. Different annotators may apply different judging standards and therefore give different labels to the same video clip. Samples obtained by crowdsourced labeling may thus contain a large number of inconsistent labels, which makes them hard to use for effective neural network training.
According to embodiments of the present disclosure, a scheme for training a neural network is provided. In this scheme, a sample group is selected from a sample set for training the neural network, the sample group including at least two positive samples and one negative sample, the two positive samples including a first sample. Each sample in the sample group is then processed using current values of a parameter set of the neural network to obtain corresponding feature representations. Next, correlation degrees between the first feature representation corresponding to the first sample and the feature representations corresponding to at least one remaining positive sample and one negative sample in the sample group are determined. Based on these correlation degrees, updated values of the parameter set are determined. With such a method, a better neural network model can be obtained from limited training samples, and thus better feature representations.
Fig. 1 shows a schematic diagram of an environment 100 in which the method of the present disclosure can be implemented. As shown in Fig. 1, the environment 100 includes a computing device 130, which can be used to implement the training of a neural network in various implementations of the disclosure. The computing device 130 may receive the neural network 110 to be trained and the sample set 120 from a sample library, in order to perform training of the network 110 based on the sample set 120. The computing device 130 may output a trained parameter set 140, which may in turn be provided to the neural network 110 or to any other suitable neural network not shown.
The neural network 110 can learn certain knowledge and capabilities from existing data for processing new data. The neural network 110 may be designed to perform various tasks, such as image classification, object detection, speech recognition, machine translation and information filtering. Examples of the neural network 110 include, but are not limited to, various deep neural networks (DNNs), support vector machines (SVMs), decision trees and random forest models. In implementations of the disclosure, a neural network may also be referred to as a "machine learning model". Hereinafter, the terms "neural network", "learning model", "learning network", "model" and "network" are used interchangeably.
In Fig. 1, the neural network 110 is shown as a deep neural network. A deep neural network has a layered architecture, and each network layer has one or more processing nodes (called neurons or filters) that process input based on corresponding parameters. In a deep neural network, the processed output of a preceding layer is the input of the next layer, where the first layer in the architecture receives the network input for processing, and the output of the last layer is provided as the network output. As shown in Fig. 1, the neural network 110 includes network layers 112-1, 112-2 and so on, where network layer 112-1 receives the network input and network layer 112-2 provides the network output. The parameters used by all processing nodes of the neural network 110 constitute the parameter set of the neural network 110. The specific values of such a parameter set need to be determined through a training process.
It should be understood that the architecture of the neural network shown in Fig. 1, as well as the number of network layers and processing nodes therein, are illustrative. In different applications, the neural network may be designed with other architectures as needed.
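To make the layered processing concrete, the following minimal PyTorch sketch stands in for the neural network 110; the layer types and sizes are illustrative assumptions only, since the disclosure leaves the concrete architecture open.

```python
import torch
import torch.nn as nn

# A minimal stand-in for neural network 110: layer 112-1 receives the network
# input and layer 112-2 produces the network output (the feature representation).
# The dimensions 64, 32 and 16 are illustrative placeholders.
network_110 = nn.Sequential(
    nn.Linear(64, 32),  # layer 112-1: receives the network input
    nn.ReLU(),
    nn.Linear(32, 16),  # layer 112-2: output used as the feature representation
)

features = network_110(torch.randn(8, 64))  # batch of 8 inputs -> 8 feature vectors
```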
A scheme for training a neural network according to embodiments of the present disclosure is described below with reference to Figs. 2-6. Fig. 2 illustrates a flowchart of a method 200 of training a neural network according to an embodiment of the present disclosure. The method 200 can be implemented by the computing device 130 of Fig. 1. The actions involved in the method 200 are described below with reference to the environment 100 of Fig. 1.
At block 202, the computing device 130 selects a sample group from the sample set 120 for training the neural network 110, the sample group including at least two positive samples and one negative sample, the two positive samples including a first sample. In some embodiments, a positive sample denotes a sample labeled "positive" with respect to a particular feature. For example, for judging whether a student in a video clip answers fluently, a positive sample indicates that the student answers fluently, and a negative sample indicates that the student does not answer fluently.
In some embodiments, the sample set 120 may be obtained by crowdsourced labeling. The process of a method 300 of obtaining a sample set implemented according to the disclosure is described below with reference to Fig. 3.
At block 302, the computing device 130 may obtain multiple crowdsourcing labels of a candidate sample, where the multiple crowdsourcing labels are labels provided by multiple users that mark the candidate sample as positive or negative. For example, five annotators may simultaneously label the same sample x. The crowdsourcing labels of sample x may then be expressed as (1, 1, 1, 0, 0), where 1 indicates that an annotator labeled the sample as a positive sample and 0 indicates that an annotator labeled the sample as a negative sample. Herein, a sample labeled as a positive sample may also simply be called a positive sample. Likewise, a sample labeled as a negative sample may also simply be called a negative sample.
At block 304, the computing device 130 determines whether, among the multiple crowdsourcing labels, the proportion of labels marking the candidate sample as positive exceeds a proportion threshold. Continuing the previous example, when the crowdsourcing labels of sample x are (1, 1, 1, 0, 0), the proportion labeled positive is 3/5. In some embodiments, the proportion threshold may be preset. For example, the proportion threshold may be set to 0.5, i.e., when more than half of the annotators label the sample as positive, the sample may be added to the sample set as a positive sample.
If it is determined at block 304 that the proportion exceeds the proportion threshold, the method 300 proceeds to block 306, where the candidate sample is added as a positive sample in the sample set 120. For example, for a sample x1 whose crowdsourcing labels are (1, 1, 1, 0, 0), the proportion labeled positive exceeds the predetermined threshold, so sample x1 will be determined to be a positive sample in the sample set 120. Otherwise, the method 300 proceeds to block 308, where the candidate sample is added as a negative sample in the sample set 120. For example, for a sample x2 whose crowdsourcing labels are (1, 0, 0, 0, 0), the proportion labeled positive is below the predetermined threshold, so sample x2 will be determined to be a negative sample in the sample set 120. Through the crowdsourced labeling method shown in Fig. 3, each sample in the sample set 120 can carry corresponding multiple crowdsourcing labels.
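As a concrete reading of blocks 302-308, the following Python sketch aggregates crowdsourcing labels by majority vote; the function name and the default threshold of 0.5 are illustrative assumptions taken from the example above, not part of the patent text.

```python
from typing import List, Tuple

def build_sample_set(
    candidates: List[Tuple[object, List[int]]],
    ratio_threshold: float = 0.5,  # the proportion threshold; 0.5 follows the example above
) -> Tuple[list, list]:
    """Split candidate samples into positives and negatives by crowdsourced majority vote."""
    positives, negatives = [], []
    for sample, crowd_labels in candidates:
        # Fraction of annotators who labeled the sample positive, e.g. (1,1,1,0,0) -> 3/5.
        positive_ratio = sum(crowd_labels) / len(crowd_labels)
        if positive_ratio > ratio_threshold:
            positives.append((sample, crowd_labels))   # block 306
        else:
            negatives.append((sample, crowd_labels))   # block 308
    return positives, negatives

# Example mirroring the text: x1 -> positive (3/5 > 0.5), x2 -> negative (1/5 <= 0.5).
pos, neg = build_sample_set([("x1", [1, 1, 1, 0, 0]), ("x2", [1, 0, 0, 0, 0])])
```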
In some embodiments, the sample set 120 may include multiple positive samples and multiple negative samples. When selecting a sample group, the computing device 130 may choose a pair of positive samples from the multiple positive samples. The computing device 130 may also randomly select one or more negative samples from the multiple negative samples.
Continuing with Fig. 2, at block 204 the computing device 130 processes each sample in the sample group using current values of the parameter set of the neural network 110 to obtain corresponding feature representations. At the initial stage of training, the parameter set of the neural network 110 may be set to initial values. The initial values may be determined randomly or in other ways, such as by a pre-training process. During training, the method 200 can be used to iteratively update the values of the parameter set. In each iteration, the current values of the parameter set are the updated values obtained at the completion of the previous iteration.
Fig. 4 shows a schematic diagram of a framework 400 for training the neural network 110 according to an embodiment of the present disclosure. As shown in Fig. 4, the computing device 130 selects a first sample 412-1 labeled as a positive sample, a sample 412-2 different from the first sample 412-1 (hereinafter called the second sample for convenience of description), and multiple samples 414-1, 414-2, ..., 414-k labeled as negative samples (hereinafter individually or collectively called the "third samples 414" for convenience of description). For example, the sample group 410 may also be represented as g_i = <x_i^+, x_j^+, x_1^-, x_2^-, ..., x_k^->, where x_i^+ denotes the first sample 412-1, x_j^+ denotes the second sample 412-2, and x_1^-, x_2^-, ..., x_k^- denote the k third samples. In this way, the computing device 130 can obtain a large number of different sample groups 410 from a sample set 120 of limited size, overcoming the drawback of limited training samples.
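One possible way to draw such groups is sketched below; the exhaustive pairing of positives and the uniform random choice of k negatives are assumptions, since the disclosure only requires that a pair of positives be chosen and that negatives be selected at random.

```python
import itertools
import random
from typing import List, Sequence

def sample_groups(positives: Sequence, negatives: Sequence,
                  k: int, n_groups: int) -> List[tuple]:
    """Form groups g_i = (x_i+, x_j+, x_1-, ..., x_k-) from a limited sample set."""
    groups = []
    # Every ordered pair of distinct positives can anchor a group, which is how a
    # limited sample set yields a large number of distinct sample groups 410.
    pairs = list(itertools.permutations(positives, 2))
    for xi_pos, xj_pos in random.sample(pairs, min(n_groups, len(pairs))):
        negs = random.sample(list(negatives), k)  # k negatives drawn at random (k <= len(negatives))
        groups.append((xi_pos, xj_pos, *negs))
    return groups
```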
Continuing the example of Fig. 4, the neural network 420 receives the sample group 410 and obtains feature representations (432-1, 432-2, 434-1, 434-2, ..., 434-k) corresponding to the samples (412-1, 412-2, 414-1, 414-2, ..., 414-k) in the sample group. In Fig. 4, the neural network 420 is shown as a deep neural network. As shown in Fig. 4, the neural network 420 may include multiple instances of the neural network 110 shown in Fig. 1, each of which can receive a single sample in the sample group and process that single sample using the parameter set to obtain the corresponding feature vector. It should be understood that the architecture of the neural network shown in Fig. 4, as well as the number of network layers and processing nodes therein, are illustrative. In different applications, the neural network model may be designed with other architectures as needed.
Continuing with Fig. 2, at block 206 the computing device 130 determines correlation degrees between the first feature representation corresponding to the first sample and the feature representations corresponding to at least one remaining positive sample and one negative sample in the sample group.
Continuing the example of Fig. 4, the feature representation 432-1 of the first sample 412-1 may be denoted f_i^+, the feature representation 432-2 of the second sample 412-2 may be denoted f_j^+, and similarly the feature representations (434-1, 434-2, ..., 434-k) of the third samples 414 may be denoted f_1^-, f_2^-, ..., f_k^-. In some embodiments, the correlation degree r(x_i^+, x_*) between two feature representations may be computed as the cosine of the angle between them: r(x_i^+, x_*) = cosine(f_i^+, f_*). In some embodiments, the correlation between two feature representations may also be characterized by the exponential of the cosine value: exp(η·r(x_i^+, x_*)), where η denotes a preset smoothing parameter. Thus, during the updating of the parameter set, the computing device 130 considers not only the correlation degree between the two positive samples but also the correlation degrees between the positive sample and the negative samples. By computing these correlation degrees, the computing device 130 can update the parameter set so that the correlation degree between the first sample and the other positive sample is greater than the correlation degrees between the first sample and the negative samples. Some specific implementations are described in detail below with reference to Figs. 5 and 6.
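As a concrete reading of the correlation degree r, the sketch below computes the cosine form and its smoothed exponential; the function names and the default η = 1.0 are placeholders, not values given in the disclosure.

```python
import torch
import torch.nn.functional as F

def correlation(f_anchor: torch.Tensor, f_other: torch.Tensor) -> torch.Tensor:
    """r(x_i+, x_*) = cosine(f_i+, f_*) between two feature representations."""
    return F.cosine_similarity(f_anchor, f_other, dim=-1)

def smoothed_correlation(f_anchor: torch.Tensor, f_other: torch.Tensor,
                         eta: float = 1.0) -> torch.Tensor:
    """exp(eta * r(...)), the exponential form used when summing correlation degrees."""
    return torch.exp(eta * correlation(f_anchor, f_other))
```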
As mentioned above, the method 200 can be performed iteratively, with the parameter set of the neural network 110 being continually updated and optimized by this method until a convergence condition is reached. In each iteration of the method 200, the sample group selected at block 202 may be the same or different. The current values of the parameter set in each update are the updated values obtained in the previous update.
Fig. 5 shows a flowchart of a process 500 of updating parameter values according to some embodiments of the present disclosure. At block 502, the computing device 130 determines a correlation degree sum by adding up the correlation degrees. Continuing the example of Fig. 4, the computing device 130 may add the correlation degree 440-1 between the feature representation 432-2 of the second sample and the feature representation 432-1 of the first sample to the correlation degrees 440-2 to 440-M between the feature representations (434-1, 434-2, ..., 434-k) of the third samples and the feature representation 432-1 of the first sample, so as to obtain the correlation degree sum. In some embodiments, the correlation degree sum may also be expressed in exponential form, for example as:

S = exp(η·r(x_i^+, x_j^+)) + Σ_{m=1}^{k} exp(η·r(x_i^+, x_m^-)),

where η denotes a preset smoothing parameter.
At block 504, the computing device 130 determines a positive sample correlation degree, which indicates the correlation degree between the first feature representation and the feature representation corresponding to at least one remaining positive sample in the sample group. Continuing the example of Fig. 4, the positive sample correlation degree 440-1 may be denoted r(x_i^+, x_j^+). In some embodiments, the positive sample correlation degree may also be expressed as exp(η·r(x_i^+, x_j^+)), where η denotes the preset smoothing parameter.
At block 506, the computing device 130 determines the updated values of the parameter set based on a ratio of the positive sample correlation degree to the correlation degree sum. For example, continuing the example of Fig. 4, the computing device 130 may set the objective function 460 to the ratio of the positive sample correlation degree to the correlation degree sum, which may be expressed as:

L = exp(η·r(x_i^+, x_j^+)) / ( exp(η·r(x_i^+, x_j^+)) + Σ_{m=1}^{k} exp(η·r(x_i^+, x_m^-)) )    (1)
In equation (1), x_i^+ denotes the first positive sample 412-1 (with feature representation 432-1), x_j^+ denotes the second positive sample 412-2 (with feature representation 432-2), x_m^- denotes the m-th negative sample, and η denotes the preset smoothing parameter. In some embodiments, based on this objective function, the computing device 130 may use gradient descent to optimize the objective, for example by minimizing its negative logarithm, thereby determining the updated values of the parameter set.
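Read as a trainable loss, objective (1) can be optimized by minimizing its negative logarithm, which the following PyTorch sketch illustrates for a single sample group; this is one plausible formulation consistent with the description, not the patent's verbatim computation.

```python
import torch
import torch.nn.functional as F

def group_objective(f_i: torch.Tensor, f_j: torch.Tensor,
                    f_neg: torch.Tensor, eta: float = 1.0) -> torch.Tensor:
    """Negative log of objective (1) for one sample group.

    f_i, f_j: feature representations of the two positive samples, shape (dim,).
    f_neg:    feature representations of the k negative samples, shape (k, dim).
    """
    pos = eta * F.cosine_similarity(f_i, f_j, dim=-1)                   # eta * r(x_i+, x_j+)
    negs = eta * F.cosine_similarity(f_i.unsqueeze(0), f_neg, dim=-1)   # eta * r(x_i+, x_m-)
    logits = torch.cat([pos.unsqueeze(0), negs])
    # -log( exp(pos) / sum(exp(all)) ): minimizing this by gradient descent
    # increases the ratio in objective (1).
    return torch.logsumexp(logits, dim=0) - pos
```

Summing this quantity over n sample groups and driving it down by gradient descent corresponds to the multi-group objective (2) introduced below.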
In some embodiments, the computing device 130 may determine, based on this ratio, whether the training of the neural network 420 has converged. In some embodiments, the computing device 130 may compare the ratio with a predetermined ratio threshold. If the ratio is determined to be greater than the predetermined ratio threshold, the computing device 130 determines that the training of the neural network 420 has converged. In some embodiments, the computing device 130 may also compute whether the difference between the ratio computed in the current iteration and the ratio computed in the previous iteration is less than a predetermined difference threshold. If the difference is determined to be less than the predetermined difference threshold, the computing device 130 determines that the training of the neural network 420 has converged. In some embodiments, the computing device 130 may also compute the average of the ratios determined over multiple iterations, and determine whether the training of the neural network 420 has converged based on whether the change of this average from iteration to iteration is less than a predetermined threshold. After determining that the training of the neural network 420 has converged, the computing device 130 may output the current values of the parameter set.
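The three convergence tests described above (ratio threshold, per-iteration difference, and moving-average change) might be combined as in the sketch below; all thresholds and the window size are illustrative assumptions.

```python
def converged(history, ratio_threshold=0.99, diff_threshold=1e-4, window=5):
    """history: the objective ratios recorded at each iteration, most recent last."""
    if history and history[-1] > ratio_threshold:         # ratio exceeds a preset threshold
        return True
    if len(history) >= 2 and abs(history[-1] - history[-2]) < diff_threshold:
        return True                                       # change since previous iteration is small
    if len(history) >= 2 * window:                        # moving average has stopped changing
        prev = sum(history[-2 * window:-window]) / window
        curr = sum(history[-window:]) / window
        return abs(curr - prev) < diff_threshold
    return False
```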
In some embodiments, the computing device 130 may also feed n sample groups to the neural network at a time, and may set the objective function 460 as:

Ω* = argmin_Ω ( - Σ_{i=1}^{n} log L_i(Ω) )    (2)

where Ω denotes the parameter set to be trained, n denotes the number of input sample groups, and L_i(Ω) denotes the ratio in equation (1) computed for the i-th sample group under the parameter set Ω. Specifically, the computing device 130 may use gradient descent to reduce the value of objective function (2) over the iterations, and when the difference between the value of objective function (2) in a new iteration and its value in the previous iteration is less than a predetermined threshold, output the current values of the parameter set, thereby completing the training of the neural network.
In some embodiments, the computing device 130 may also stop iterating once the number of iterations exceeds a threshold, and output the updated parameter set to be used as the trained parameter set of the neural network.
In some embodiments, the computing device 130 may also use the crowdsourcing labels of each sample to improve the accuracy of the parameters. Specifically, Fig. 6 shows a flowchart of a method 600 of updating the parameter set according to further embodiments of the present disclosure.
At block 602, the computing device 130 determines, based on multiple crowdsourcing labels associated with at least one remaining positive sample and the negative sample in the sample group, confidences associated with the at least one positive sample and the at least one negative sample, the multiple crowdsourcing labels being labels provided by multiple users that mark the candidate sample as positive or negative. As indicated above, the sample set 120 can be obtained by crowdsourcing, so that the samples in the sample set 120 can have multiple crowdsourcing labels. In some embodiments, the computing device 130 may determine the confidence based on the proportion of a given sample's crowdsourcing labels marking it as positive or negative. For example, for a positive sample whose crowdsourcing labels are (1, 1, 1, 0, 0), the computing device 130 may compute the maximum-likelihood estimate as its confidence:

δ_i = (1/d) Σ_{j=1}^{d} y_{i,j}    (3)

where d denotes the number of annotators and y_{i,j} denotes the j-th crowdsourcing label of the i-th sample.
In some embodiments, the computing device 130 may also use a Bayesian confidence as the confidence of each sample:

δ_i = ( Σ_{j=1}^{d} y_{i,j} + α ) / ( d + α + β )    (4)

where α and β encode prior knowledge of the distribution of positive and negative samples in the sample set; for example, α may be the known percentage of positive samples in the sample set multiplied by d, and β may be the known percentage of negative samples in the sample set multiplied by d.
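Both confidence estimates follow directly from a sample's crowdsourcing labels. In the sketch below, p_pos is the assumed known fraction of positive samples used to set the priors α = p_pos·d and β = (1 - p_pos)·d as described above; the Beta-prior form of equation (4) is a standard reconstruction consistent with that description.

```python
def ml_confidence(labels):
    """Maximum-likelihood estimate (3): fraction of annotators voting positive."""
    return sum(labels) / len(labels)

def bayesian_confidence(labels, p_pos):
    """Bayesian estimate (4) with priors alpha = p_pos * d, beta = (1 - p_pos) * d."""
    d = len(labels)
    alpha, beta = p_pos * d, (1.0 - p_pos) * d
    return (sum(labels) + alpha) / (d + alpha + beta)

# (1,1,1,0,0) with a 50/50 prior: ML gives 0.6; the Bayesian estimate shrinks toward 0.5.
print(ml_confidence([1, 1, 1, 0, 0]))             # 0.6
print(bayesian_confidence([1, 1, 1, 0, 0], 0.5))  # 0.55
```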
At block 604, the computing device 130 weights, based on the confidences, the correlation degrees between the first feature representation and the feature representations corresponding to at least one remaining positive sample and the negative sample in the sample group. As shown in Fig. 4, the computing device 130 may weight the correlation degrees 440-1 to 440-M to obtain weighted correlation degrees 450-1 to 450-M. In some embodiments, the computing device 130 may determine whether to weight the correlation degree corresponding to a sample based on whether that sample has crowdsourcing labels. For example, when a sample is expert-calibrated and has no crowdsourcing labels, its confidence may be set to 1 by default.
At block 606, the computing device 130 determines the updated values of the parameter set based on the weighted correlation degrees. Continuing the example of Fig. 4, after the confidences are taken into account, equation (1) can be updated to a new objective function:

L = exp(η·δ_j·r(x_i^+, x_j^+)) / ( exp(η·δ_j·r(x_i^+, x_j^+)) + Σ_{m=1}^{k} exp(η·δ_m·r(x_i^+, x_m^-)) )    (5)

where δ denotes the corresponding confidence of each sample. It should be understood that equation (2) can be updated accordingly. In some embodiments, based on the new objective function, the computing device 130 may use gradient descent to optimize the value of the objective function, thereby determining the updated values of the parameter set. By introducing confidences computed as above, the computing device 130 can fully account for the information in the crowdsourcing labels during training, increasing the weight of samples whose crowdsourced votes are more consistent and decreasing the weight of samples whose votes are less consistent, so that the trained parameters of the neural network are more accurate and, in turn, more accurate feature representations can be obtained through the neural network.
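Extending the single-group sketch after equation (1), the confidences scale each correlation degree before the exponential, as in the weighted correlation degrees 450 of Fig. 4; the exact placement of δ inside the objective is one reading consistent with the description.

```python
import torch
import torch.nn.functional as F

def weighted_group_objective(f_i: torch.Tensor, f_j: torch.Tensor,
                             f_neg: torch.Tensor, delta_j: float,
                             delta_neg: torch.Tensor, eta: float = 1.0) -> torch.Tensor:
    """Negative log of weighted objective (5); delta_* are per-sample confidences.

    delta_j:   confidence of the second positive sample (scalar).
    delta_neg: confidences of the k negative samples, shape (k,).
    """
    pos = eta * delta_j * F.cosine_similarity(f_i, f_j, dim=-1)
    negs = eta * delta_neg * F.cosine_similarity(f_i.unsqueeze(0), f_neg, dim=-1)
    logits = torch.cat([pos.unsqueeze(0), negs])
    return torch.logsumexp(logits, dim=0) - pos
```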
In some embodiments, the computing device 130 may determine whether the training of the neural network 420 has converged based on the weighted objective function (5). In some embodiments, the computing device 130 may compare the ratio with a predetermined ratio threshold. If the ratio is determined to be greater than the predetermined ratio threshold, the computing device 130 determines that the training of the neural network 420 has converged. In some embodiments, the computing device 130 may also compute whether the difference between the ratio computed in the current iteration and the ratio computed in the previous iteration is less than a predetermined difference threshold. If the difference is determined to be less than the predetermined difference threshold, the computing device 130 determines that the training of the neural network 420 has converged. In some embodiments, the computing device 130 may also compute the average of the ratios determined over multiple iterations, and determine whether the training of the neural network 420 has converged based on whether the change of this average from iteration to iteration is less than a predetermined threshold. After determining that the training of the neural network 420 has converged, the computing device 130 may output the current values of the parameter set.
In some embodiments, after the training of the neural network 110 is completed, the computing device 130 may output the trained parameter set 140 to the neural network 110 or to another neural network, thereby obtaining a trained neural network. The trained neural network can use the trained parameter set 140 to convert a model input (for example, voice, video, text or pictures) into a feature representation, and can in turn determine, through a logistic regression model, a conclusion as to whether the model input is positive.
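A minimal sketch of such an inference head is shown below; the feature dimension and the 0.5 decision threshold are hypothetical, and the head would be fitted separately from the feature network.

```python
import torch
import torch.nn as nn

# Hypothetical inference head: a logistic regression on top of the trained
# feature representation decides whether a model input is positive.
feature_dim = 16  # must match the output dimension of the trained feature network
logistic_head = nn.Sequential(nn.Linear(feature_dim, 1), nn.Sigmoid())

def is_positive(features: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Return one boolean per input: probability of 'positive' above threshold."""
    return logistic_head(features).squeeze(-1) > threshold
```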
With the scheme for training a neural network described herein, positive and negative samples can be combined into sample groups for training, which alleviates the problem of limited training samples in the sample set. In addition, during training the scheme considers not only the correlation degree between positive samples, but also the correlation degrees between positive and negative samples, and updates the parameters so that the correlation degree between positive samples is sufficiently large, so that the parameters of the neural network are more accurate and, in turn, more accurate feature representations can be obtained through the neural network.
Fig. 7 is a block diagram of an apparatus 700 for constructing a neural network according to an embodiment of the present disclosure. The apparatus 700 may be included in the computing device 130 of Fig. 1 or implemented as the computing device 130. As shown in Fig. 7, the apparatus 700 includes a selection module 710 configured to select a sample group from a sample set for training the neural network, the sample group including at least two positive samples and one negative sample, the two positive samples including a first sample. The apparatus 700 further includes a processing module 720 configured to process each sample in the sample group using current values of a parameter set of the neural network to obtain corresponding feature representations. The apparatus 700 further includes a correlation degree determining module 730 configured to determine correlation degrees between the first feature representation corresponding to the first sample and the feature representations corresponding to at least one remaining positive sample and one negative sample in the sample group. In addition, the apparatus 700 further includes an updated value determining module 740 configured to determine updated values of the parameter set based on the correlation degrees.
In some embodiments, the apparatus 700 further includes a sample set determining module, which includes: an obtaining module configured to obtain multiple crowdsourcing labels of a candidate sample, the multiple crowdsourcing labels being labels provided by multiple users that mark the candidate sample as positive or negative; a proportion determining module configured to determine whether, among the multiple crowdsourcing labels of the candidate sample, the proportion of labels marking the candidate sample as positive exceeds a proportion threshold; a positive sample adding module configured to add the candidate sample as a positive sample in the sample set in response to the proportion exceeding the proportion threshold; and a negative sample adding module configured to add the candidate sample as a negative sample in the sample set in response to the proportion being less than or equal to the proportion threshold.
In some embodiments, the updated value determining module 740 includes: a sum determining module configured to determine a correlation degree sum by adding up the correlation degrees; a positive sample correlation degree determining module configured to determine a positive sample correlation degree, the positive sample correlation degree indicating the correlation degree between the first feature representation and the feature representation corresponding to at least one remaining positive sample in the sample group; and a first updated value determining module configured to determine the updated values of the parameter set based on a ratio of the positive sample correlation degree to the correlation degree sum.
In some embodiments, the apparatus 700 further includes: a first convergence determining module configured to determine whether the training of the neural network has converged based on the ratio of the positive sample correlation degree to the correlation degree sum; and a first output module configured to output the current values of the parameter set in response to determining that the training of the neural network has converged.
In some embodiments, the updated value determining module 740 includes: a confidence determining module configured to determine, based on multiple crowdsourcing labels associated with at least one remaining positive sample and the negative sample in the sample group, confidences associated with the at least one positive sample and the at least one negative sample, the multiple crowdsourcing labels being labels provided by multiple users that mark the candidate sample as positive or negative; a weighting module configured to weight, based on the confidences, the correlation degrees between the first feature representation and the feature representations corresponding to at least one remaining positive sample and the negative sample in the sample group; and a second updated value determining module configured to determine the updated values of the parameter set based on the weighted correlation degrees.
In some embodiments, the confidence includes a Bayesian confidence.
In some embodiments, the apparatus 700 further includes: a second convergence determining module configured to determine whether the training of the neural network has converged based on the weighted correlation degrees; and a second output module configured to output the current values of the parameter set in response to determining that the training of the neural network has converged.
Fig. 8 shows a schematic block diagram of an example device 800 that can be used to implement embodiments of the present disclosure. For example, the computing device 130 shown in Fig. 1 can be implemented by the device 800. As shown, the device 800 includes a central processing unit (CPU) 801 that can perform various appropriate actions and processing according to computer program instructions stored in a read-only memory (ROM) 802 or computer program instructions loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The CPU 801, the ROM 802 and the RAM 803 are connected to each other through a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Multiple components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard or a mouse; an output unit 807, such as various types of displays and loudspeakers; a storage unit 808, such as a magnetic disk or an optical disc; and a communication unit 809, such as a network card, a modem or a wireless communication transceiver. The communication unit 809 allows the device 800 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunication networks.
Each of the processes and processing described above, such as the method 200, the method 300, the method 500 and/or the method 600, can be performed by the processing unit 801. For example, in some embodiments, the method 200, the method 300, the method 500 and/or the method 600 can be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the CPU 801, one or more actions of the method 200, the method 300, the method 500 and/or the method 600 described above can be performed.
The present disclosure may be a method, an apparatus, a system and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for performing various aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can retain and store instructions used by an instruction-executing device. The computer-readable storage medium may be, for example, but is not limited to, an electric storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions stored thereon, and any suitable combination of the above. As used herein, the computer-readable storage medium is not to be construed as a transient signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse passing through a fiber-optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein can be downloaded from the computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on a user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA) or a programmable logic array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions so as to implement various aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processing unit of a general-purpose computer, a special-purpose computer or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium, the instructions causing a computer, a programmable data processing apparatus and/or other devices to work in a specific manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus or other devices, such that a series of operational steps are performed on the computer, other programmable data processing apparatus or other devices to produce a computer-implemented process, so that the instructions executed on the computer, other programmable data processing apparatus or other devices implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate possible architectures, functions and operations of systems, methods and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or a portion of instructions, the module, program segment or portion of instructions comprising one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
The embodiments of the present disclosure have been described above. The above description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or the improvement over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (15)

1. a kind of method for constructing neural network, comprising:
From for training the sample set of neural network to select sample group, the sample group includes at least two positive samples and a negative sample This, includes first sample in two positive sample;
The respective sample in the sample group is handled respectively using the current value of the parameter set of the neural network, with acquisition pair The character representation answered;
Determine the corresponding fisrt feature of the first sample indicate respectively with an at least positive sample remaining in the sample group The corresponding degree of correlation of character representation corresponding with a negative sample;And
Based on the corresponding degree of correlation, the updated value of the parameter set is determined.
2. according to the method described in claim 1, further including determining the sample set by following:
Obtain multiple crowdsourcing labels of candidate samples, the multiple crowdsourcing label is provided by multiple users by the candidate sample This is labeled as positive or negative label;
Determine whether is the accounting of label that candidate samples label is positive in multiple crowdsourcing labels of the candidate samples Higher than accounting threshold value;
It is higher than the accounting threshold value, the positive sample that the candidate samples are added in the sample set in response to the accounting This;And
It is less than or equal to the accounting threshold value in response to the accounting, the candidate samples is added to one in the sample set Negative sample.
3. according to the method described in claim 1, wherein determining that the updated value of the parameter set includes:
Degree of correlation summation is determined by summing it up the corresponding degree of correlation;
Determine the positive sample degree of correlation, the positive sample degree of correlation indicate the fisrt feature indicate with it is remaining in the sample group The corresponding degree of correlation of the corresponding character representation of an at least positive sample;And
Ratio based on the positive sample degree of correlation Yu the degree of correlation summation, determines the updated value of the parameter set.
4. according to the method described in claim 3, further include:
The ratio based on the positive sample degree of correlation Yu the degree of correlation, determines whether the training of the neural network receives It holds back;And
In response to the training convergence of the determination neural network, the current value of the parameter set is exported.
5. according to the method described in claim 1, wherein determining that the updated value of the parameter set includes:
Based on multiple crowdsourcing labels associated with an at least positive sample remaining in the sample group and negative sample, determining and institute An at least positive sample and the associated confidence level of an at least negative sample are stated, the multiple crowdsourcing label is mentioned by multiple users What is supplied is labeled as positive or negative label for the candidate samples;
Based on the confidence level, to an at least positive sample remaining in fisrt feature expression and the sample group and bear The degree of correlation of the corresponding character representation of sample is weighted;And
Based on the weighted degree of correlation, the updated value of the parameter set is determined.
6. according to the method described in claim 5, wherein the confidence level includes Bayesian confidence.
7. according to the method described in claim 5, further include:
Based on the weighted degree of correlation, determine whether the training of the neural network restrains;And
In response to the training convergence of the determination neural network, the current value of the parameter set is exported.
8. a kind of for constructing the device of neural network, comprising:
Selecting module is configured as from for training the sample set of neural network to select sample group, and the sample group includes at least Two positive samples and a negative sample include first sample in two positive sample;
Processing module is configured as handling respectively in the sample group using the current value of the parameter set of the neural network Respective sample, to obtain corresponding character representation;
Degree of correlation determining module, be configured to determine that the corresponding fisrt feature of the first sample indicate respectively with the sample The corresponding degree of correlation of remaining an at least positive sample and the corresponding character representation of a negative sample in group;And
Updated value determining module is configured as determining the updated value of the parameter set based on the corresponding degree of correlation.
9. device according to claim 8 further includes sample set determining module, the sample set determining module includes:
Module is obtained, is configured as obtaining multiple crowdsourcing labels of candidate samples, the multiple crowdsourcing label is by multiple users What is provided is labeled as positive or negative label for the candidate samples;
Accounting determining module is configured to determine that marking the candidate samples in multiple crowdsourcing labels of the candidate samples Whether the accounting for the label being positive is higher than accounting threshold value;
Positive sample adding module is configured to respond to the accounting higher than the accounting threshold value, the candidate samples is added For the positive sample in the sample set;And
Negative sample adding module is configured to respond to the accounting less than or equal to the accounting threshold value, by the candidate sample Originally the negative sample being added in the sample set.
10. device according to claim 8, wherein the updated value determining module includes:
Summation determining module is configured as determining degree of correlation summation by summing it up the corresponding degree of correlation;
Positive sample degree of correlation determining module, is configured to determine that the positive sample degree of correlation, the positive sample degree of correlation instruction described the The corresponding degree of correlation of one character representation and the corresponding character representation of a remaining at least positive sample in the sample group;And
First updated value determining module is configured as the ratio based on the positive sample degree of correlation Yu the degree of correlation summation, really The updated value of the fixed parameter set.
11. The apparatus according to claim 10, further comprising:
a first convergence determining module configured to determine, based on the ratio of the positive-sample degree of correlation to the degree-of-correlation sum, whether training of the neural network has converged; and
a first output module configured to output the current value of the parameter set in response to determining that the training of the neural network has converged.
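Claims 10 and 11 amount to a contrastive objective: the share of the total degree of correlation captured by the positive samples. On that reading, a natural loss (an assumption here, not something the claims mandate) is the negative log of the ratio, computed below with the exponentiated-dot-product correlations from the earlier sketch.

    import numpy as np

    def positive_ratio_and_loss(first_feat, positive_feats, negative_feats):
        # Ratio of the positive-sample degree of correlation to the
        # degree-of-correlation sum, plus the corresponding loss.
        pos = np.array([np.exp(np.dot(first_feat, f)) for f in positive_feats])
        neg = np.array([np.exp(np.dot(first_feat, f)) for f in negative_feats])
        ratio = pos.sum() / (pos.sum() + neg.sum())
        return ratio, -np.log(ratio)

A ratio near 1 means the first sample's representation sits much closer to the remaining positives than to the negative, which also makes the ratio a usable convergence signal for claim 11.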
12. The apparatus according to claim 8, wherein the updated-value determining module comprises:
a confidence determining module configured to determine, based on a plurality of crowdsourcing labels associated with the remaining at least one positive sample and the one negative sample in the sample group, confidence levels associated with the at least one positive sample and the negative sample, the plurality of crowdsourcing labels being labels, provided by a plurality of users, that mark the respective samples as positive or negative;
a weighting module configured to weight, based on the confidence levels, the degrees of correlation between the first feature representation and the feature representations corresponding to the remaining at least one positive sample and the one negative sample in the sample group; and
a second updated-value determining module configured to determine the updated value of the parameter set based on the weighted degrees of correlation.
13. The apparatus according to claim 12, wherein the confidence level comprises a Bayesian confidence.
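Claims 6 and 13 name a Bayesian confidence without defining it. One standard construction, offered only as an assumption, is the posterior mean of a Beta-Bernoulli model over the crowdsourced votes: with a uniform Beta(1, 1) prior, k positive votes out of n yield a confidence of (k + 1) / (n + 2).

    def bayesian_confidence(positive_votes, total_votes, alpha=1.0, beta=1.0):
        # Posterior mean of a Beta-Bernoulli model; with the uniform
        # prior alpha = beta = 1 this is Laplace's rule of succession.
        return (positive_votes + alpha) / (total_votes + alpha + beta)

    # 4 positive votes out of 5 gives (4 + 1) / (5 + 2), roughly 0.714.
    print(bayesian_confidence(4, 5))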
14. The apparatus according to claim 12, further comprising:
a second convergence determining module configured to determine, based on the weighted degrees of correlation, whether training of the neural network has converged; and
a second output module configured to output the current value of the parameter set in response to determining that the training of the neural network has converged.
15. A computer-readable storage medium having computer-readable program instructions stored thereon, the computer-readable program instructions being for performing the method according to any one of claims 1-7.
CN201811559229.3A 2018-12-19 2018-12-19 Method, apparatus and computer-readable medium for constructing a neural network Pending CN109657792A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811559229.3A CN109657792A (en) 2018-12-19 2018-12-19 Method, apparatus and computer-readable medium for constructing a neural network
PCT/CN2019/122677 WO2020125404A1 (en) 2018-12-19 2019-12-03 Method and apparatus for constructing neural network and computer-readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811559229.3A CN109657792A (en) 2018-12-19 2018-12-19 Method, apparatus and computer-readable medium for constructing a neural network

Publications (1)

Publication Number Publication Date
CN109657792A (en) 2019-04-19

Family

ID=66114940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811559229.3A Pending CN109657792A (en) 2018-12-19 Method, apparatus and computer-readable medium for constructing a neural network

Country Status (2)

Country Link
CN (1) CN109657792A (en)
WO (1) WO2020125404A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9967211B2 (en) * 2015-05-31 2018-05-08 Microsoft Technology Licensing, Llc Metric for automatic assessment of conversational responses
CN106021364B (en) * 2016-05-10 2017-12-12 百度在线网络技术(北京)有限公司 Method for establishing an image search relevance prediction model, image search method and device
CN108009528B (en) * 2017-12-26 2020-04-07 广州广电运通金融电子股份有限公司 Triple Loss-based face authentication method and device, computer equipment and storage medium
CN108764065B (en) * 2018-05-04 2020-12-08 华中科技大学 Pedestrian re-identification feature fusion aided learning method
CN109657792A (en) * 2018-12-19 2019-04-19 北京世纪好未来教育科技有限公司 Method, apparatus and computer-readable medium for constructing a neural network

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020125404A1 (en) * 2018-12-19 2020-06-25 北京世纪好未来教育科技有限公司 Method and apparatus for constructing neural network and computer-readable medium
CN112443019A (en) * 2019-09-05 2021-03-05 梅州市青塘实业有限公司 Toilet and flushing control method and device therefor
CN112529029A (en) * 2019-09-18 2021-03-19 华为技术有限公司 Information processing method, neural network training method, device and storage medium
CN110765943A (en) * 2019-10-23 2020-02-07 深圳市商汤科技有限公司 Network training and recognition method and device, electronic equipment and storage medium
CN111860769A (en) * 2020-06-16 2020-10-30 北京百度网讯科技有限公司 Method and device for pre-training neural network
CN112766320A (en) * 2020-12-31 2021-05-07 平安科技(深圳)有限公司 Classification model training method and computer equipment
CN112766320B (en) * 2020-12-31 2023-12-22 平安科技(深圳)有限公司 Classification model training method and computer equipment

Also Published As

Publication number Publication date
WO2020125404A1 (en) 2020-06-25

Similar Documents

Publication Publication Date Title
CN109657792A (en) Method, apparatus and computer-readable medium for constructing a neural network
Wang et al. Learning to Represent Student Knowledge on Programming Exercises Using Deep Learning.
CN110390108A (en) Task interaction method and system based on deep reinforcement learning
CN113344053B (en) Knowledge tracing method based on heterogeneous graph representation of exam questions and learner embedding
US20150254556A1 (en) Systems and Methods for Allocating Capital to Trading Strategies for Big Data Trading in Financial Markets
CN111328407A (en) Machine learning method, apparatus and computer program for providing customized personal educational content based on learning efficiency
CN110288007A (en) Data labeling method, apparatus and electronic device
Frauenberger et al. Ways of thinking in informatics
JP7222274B2 (en) Model learning device, label estimation device, methods thereof, and program
CN111241992A (en) Face recognition model construction method, recognition method, device, equipment and storage medium
CN107437111A (en) Data processing method, medium, device and computing device based on neural network
CN113591988B (en) Knowledge cognitive structure analysis method, system, computer equipment, medium and terminal
CN115187772A (en) Training method, device and equipment of target detection network and target detection method, device and equipment
CN110377733A (en) Text-based emotion recognition method, terminal device and medium
CN114861754A (en) Knowledge tracing method and system based on external attention mechanism
CN110222838A (en) Deep neural network and its training method, device, electronic equipment and storage medium
CN105989438A (en) Task relation management method, apparatus and system thereof, and electronic equipment
CN111159241A (en) Click conversion estimation method and device
CN113705159A (en) Merchant name labeling method, device, equipment and storage medium
CN114117033B (en) Knowledge tracing method and system
CN113312445B (en) Data processing method, model construction method, classification method and computing equipment
Krishnan et al. Incorporating Wide Context Information for Deep Knowledge Tracing using Attentional Bi-interaction.
CN114170484A (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN113919979A (en) Knowledge tracing method and device, nonvolatile storage medium and electronic device
CN113283584A (en) Knowledge tracing method and system based on Siamese network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20190419