CN114595800A - Correlation model training method, ranking method, device, electronic equipment and medium
- Publication number: CN114595800A (application CN202011402049.1A)
- Authority: CN (China)
- Prior art keywords: training data, word, query, training, website
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045 — Computing arrangements based on biological models; neural networks; architecture; combinations of networks
- G06N3/08 — Neural networks; learning methods
- G06F16/953 — Information retrieval; retrieval from the web; querying, e.g. by the use of web search engines
- G06F16/9538 — Presentation of query results
- G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/2411 — Classification techniques based on the proximity to a decision surface, e.g. support vector machines
- G06F40/284 — Handling natural language data; lexical analysis, e.g. tokenisation or collocates
Abstract
The embodiment of the invention discloses a relevance model training method. The method acquires a training data set from a user's historical query data; for each training data in the set, maps the word vector of the query word onto an embedding layer in a convolutional neural network to obtain a mapped word vector of the query word; acquires a site feature vector of the website in the training data; acquires a correlation value between the query word and the website according to the mapped word vector and the site feature vector; and obtains the correlation model based on the correlation values between the query words and the websites in the training data. The relevance model training method provided by the embodiment of the invention can improve the accuracy of the websites retrieved for a query word.
Description
Technical Field
The embodiment of the invention relates to the technical field of the internet, and in particular to a correlation model training method, a ranking method, a training device, electronic equipment, and a medium.
Background
With the rapid development of internet technology, search engine technology has become mature, and a network site related to a query word can be obtained by inputting the query word into a search engine.
In the prior art, when related websites are obtained for a query word, a graph model is usually used to compute the correlation between the query word and a website. However, queries often contain words that do not exist in the graph model, i.e. out-of-vocabulary (unregistered) words. In that case the correlation between the query word and the website can be obtained only after secondary processing of the unregistered word; because the graph model was never trained on the unregistered word, the accuracy of matching the query word to its corresponding website is low.
Disclosure of Invention
The embodiment of the invention provides a relevance model training method, a ranking method, a device, electronic equipment and a medium, which can improve the accuracy of acquiring a website corresponding to a query word.
The first aspect of the embodiments of the present invention provides a correlation model training method, including:
acquiring a training data set according to historical query data of a user, wherein the training data set comprises a positive sample data set and a negative sample data set, each training data in the training data set comprises a query word and a website, and the website is extracted from a result webpage corresponding to the query word;
for each training data in the training data set, mapping the word vector of the query word in the training data onto an embedding layer in a convolutional neural network to obtain a mapped word vector of the query word; acquiring a site feature vector of the website in the training data; and acquiring a correlation value between the query word and the website in the training data according to the mapped word vector and the site feature vector;
and obtaining the correlation model based on the correlation values between the query words and the websites in the training data, wherein the correlation model is an end-to-end model.
Optionally, the obtaining a training data set according to the historical query data of the user includes:
searching a positive sample pair from historical query data of a user, and acquiring the positive sample data set based on the searched positive sample pair;
carrying out negative sampling on the historical query data of the user by using a word2vec negative sampling mode to obtain a negative sample pair; acquiring the negative sample data set based on the acquired negative sample pair;
and acquiring the training data set based on the positive sample data set and the negative sample data set.
Optionally, the negative sampling of the historical query data of the user by using a word2vec negative sampling mode to obtain a negative sample pair includes:
and carrying out negative sampling on the historical query data of the user by using the negative sampling probability formula in word2vec to obtain negative sample pairs, wherein the value of the hyperparameter in the negative sampling probability formula is within a set value range.
Optionally, the mapping of the word vector of the query word in the training data onto the embedding layer in the convolutional neural network to obtain the mapped word vector of the query word includes:
and mapping the word vector of the query word to a word embedding layer by adopting a pre-training model and a convolutional neural network to obtain a mapping word vector of the query word.
Optionally, the process of obtaining the correlation model based on the correlation value between the query word in each training data and the website includes:
parameters of the correlation model are adjusted using a log-loss function and a separation distance method.
A second aspect of the present invention provides a method for sequencing network stations, including:
acquiring a current query word input by a user in real time;
inputting the word vector of the current query word into a correlation model provided by the first aspect, and obtaining a correlation value between the current query word and each website in a website set;
and sequencing the network sites corresponding to the current query term based on the correlation value of the current query term and each network site.
Optionally, the set of network sites is composed of all network sites included in the correlation model.
A third aspect of an embodiment of the present invention provides a correlation model training apparatus, including:
the training data acquisition unit is used for acquiring a training data set according to historical query data of a user, wherein the training data set comprises a positive sample data set and a negative sample data set, each training data in the training data set comprises a query word and a website, and the website is extracted from a result webpage corresponding to the query word;
the training unit is used for mapping, for each training data in the training data set, the word vector of the query word in the training data onto an embedding layer in a convolutional neural network to obtain a mapped word vector of the query word; acquiring a site feature vector of the website in the training data; and acquiring a correlation value between the query word and the website in the training data according to the mapped word vector and the site feature vector;
and the model obtaining unit is used for obtaining the correlation model based on the correlation values between the query words and the websites in the training data, wherein the correlation model is an end-to-end model.
Optionally, the training data obtaining unit includes:
the positive sample data set acquisition module is used for searching a positive sample pair from historical query data of a user and acquiring the positive sample data set based on the searched positive sample pair;
the negative sample data set acquisition module is used for carrying out negative sampling on the historical query data of the user by using a word2vec negative sampling mode to acquire a negative sample pair; acquiring the negative sample data set based on the acquired negative sample pair;
and the training data acquisition module is used for acquiring the training data set based on the positive sample data set and the negative sample data set.
Optionally, the negative sample data set obtaining module is configured to perform negative sampling on the historical query data of the user by using a negative sampling probability formula in the word2vec, and obtain the negative sample pair, where a value of a hyperparameter in the negative sampling probability formula is within a set value range.
Optionally, the training unit includes:
and the word vector mapping module is used for mapping the word vectors of the query words to the word embedding layer by adopting a pre-training model and a convolutional neural network to obtain the mapping word vectors of the query words.
Optionally, the model obtaining unit is configured to adjust parameters of the correlation model by using a logarithmic loss function and a separation distance method in the process of obtaining the correlation model based on the correlation values between the query words and the websites in the training data.
A fourth aspect of the present invention provides a network station ranking device, including:
the query word acquisition unit is used for acquiring the current query word input by the user in real time;
a correlation value obtaining unit, configured to input the word vector of the current query word into a correlation model provided in the first aspect, and obtain a correlation value between the current query word and each website in a website set;
and the sequencing unit is used for sequencing the network sites corresponding to the current query term based on the correlation value of the current query term and each network site.
Optionally, the set of network sites is composed of all network sites included in the correlation model.
A fifth aspect of the embodiments of the present invention provides an apparatus for data processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing the training method or the ranking method described above.
A sixth aspect of embodiments of the present invention provides a machine-readable medium having stored thereon instructions, which, when executed by one or more processors, cause an apparatus to perform a training method or a ranking method as described above.
The embodiment of the invention has the following beneficial effects:
based on the above technical solution, a training data set is acquired from the user's historical query data. For each training data in the set, the word vector of the query word is mapped onto an embedding layer in a convolutional neural network to obtain a mapped word vector of the query word; a site feature vector of the website in the training data is acquired; and a correlation value between the query word and the website is acquired from the mapped word vector and the site feature vector. The correlation model is then obtained based on the correlation values between the query words and the websites in the training data. Because the correlation model is an end-to-end model, the trained model converts the relation between a query word and a website into a classification problem; correlation values between a query word and a plurality of websites can therefore be obtained through the model, and the websites can be ranked by these values. This solves the prior-art problem that out-of-vocabulary (unregistered) words make website recommendation by a graph model inefficient and inaccurate, and improves the recommendation accuracy for websites.
Drawings
FIG. 1 is a flowchart of a method for training a correlation model according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for ranking network stations according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a training apparatus for a correlation model according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a network station sequencing apparatus according to an embodiment of the present invention;
fig. 5 is a block diagram of a training apparatus or a website sequencing apparatus for a correlation model according to an embodiment of the present invention;
fig. 6 is a block diagram of a server in some embodiments of the invention.
Detailed Description
For a better understanding of the technical solutions, the technical solutions of the embodiments of the present invention are described in detail below with reference to the drawings and specific embodiments. It should be understood that the specific features of the embodiments are detailed descriptions of the technical solutions, not limitations of them, and that the technical features of the embodiments may be combined with each other as long as they do not conflict.
To address the technical problem that websites recommended for query words have low accuracy, the embodiment of the invention provides a training scheme for a correlation model. A training data set is acquired from the user's historical query data; the training data set comprises a positive sample data set and a negative sample data set, each training data in the set comprises a query word and a website, and the website is extracted from the result webpage corresponding to the query word. For each training data in the set, the word vector of the query word is mapped onto an embedding layer in a convolutional neural network to obtain a mapped word vector of the query word; a site feature vector of the website in the training data is acquired; and a correlation value between the query word and the website is acquired from the mapped word vector and the site feature vector. The correlation model is obtained based on the correlation values between the query words and the websites in the training data; the correlation model is an end-to-end model.
Thus, a training data set is acquired from the user's historical query data, and for each training data in the set the word vector of the query word is mapped onto an embedding layer in a convolutional neural network to obtain a mapped word vector; a site feature vector of the website is acquired; a correlation value between the query word and the website is acquired from the mapped word vector and the site feature vector; and the correlation model is obtained based on these correlation values. Because the correlation model is an end-to-end model, the trained model converts the relation between a query word and a website into a classification problem, so correlation values between a query word and a plurality of websites can be obtained and the websites ranked accordingly. This solves the prior-art problem that out-of-vocabulary (unregistered) words make website recommendation by a graph model inefficient and inaccurate, and improves the recommendation accuracy for websites.
The correlation model used in the embodiment of the invention is an end-to-end model. To train it, training data must first be constructed, and the constructed training data is then used for model training to obtain the correlation model. The correlation model predicts the relevance between query words and websites; however, when the user searches with a query word, result webpages are retrieved. Each result webpage corresponds to a website, such as www.jianbihua.cc or www.xxxjd.com, and the result webpage may be the home page or an inner page of that website. Therefore, in the process of constructing the training data, the website corresponding to each result webpage needs to be obtained from the result webpage.
Specifically, each result webpage corresponds to a webpage Uniform Resource Locator (URL), so the website corresponding to a result webpage can be determined from the webpage URL of that result webpage. Specifically, the host name field in the webpage URL may be used as the unique identifier of the website corresponding to the result webpage.
For example, if the webpage URL of a result webpage is "http://www.nipic.com/show/10810369.html", then "http://" indicates the HyperText Transfer Protocol (HTTP), "www.nipic.com" is the host or site name (hostname), "www." is the subdomain, "nipic.com" is the domain name, "com" is the top-level domain, and "show" is a path in the webpage URL. Here, "www.nipic.com" is used as the unique identifier of the website for this webpage URL; that is, "www.nipic.com" identifies the corresponding website, Nipic.
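The host-name extraction just described can be sketched in a few lines of Python; the function name `site_id` is illustrative, and the standard `urllib.parse` module is used:

```python
from urllib.parse import urlparse

def site_id(result_url: str) -> str:
    """Use the host name field of a result page's URL as the
    unique identifier of the website it belongs to."""
    return urlparse(result_url).hostname or ""

# The example URL from the text:
print(site_id("http://www.nipic.com/show/10810369.html"))  # www.nipic.com
```

`urlparse` already isolates the hostname, so no manual string splitting is needed.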
As shown in fig. 1, an embodiment of the present invention provides a method for training a correlation model, including the following steps:
s101, acquiring a training data set according to historical query data of a user, wherein the training data set comprises a positive sample data set and a negative sample data set, each training data in the training data set comprises a query word and a website, and the website is extracted from a result webpage corresponding to the query word;
s102, for each training data in the training data set, mapping the word vector of the query word in the training data onto an embedding layer in a convolutional neural network to obtain a mapped word vector of the query word; acquiring a site feature vector of the website in the training data; and acquiring a correlation value between the query word and the website in the training data according to the mapped word vector and the site feature vector;
s103, obtaining the correlation model based on the correlation value of the query word in each training data and the website, wherein the correlation model is an end-to-end model.
In step S101, when the correlation model is obtained through training, training data needs to be constructed, and constructing the training data mainly requires finding positive correlations between query words and websites. Based on this, after the user's historical query data is obtained, available positive sample pairs are found from it. A positive sample pair can be denoted <query, doc+>, where query denotes a query word and doc+ denotes the website corresponding to a result webpage. The positive sample data set is then acquired based on the found positive sample pairs.
When the positive sample data set is acquired based on the found positive sample pairs, all or some of the found pairs may be combined into the positive sample data set. Of course, after the <query, doc+> pairs are found in the user's historical query data, they may also be filtered and cleaned, and all or some of the cleaned positive sample pairs then form the positive sample data set.
When the negative sample data set is acquired from the user's historical query data, negative sampling is performed on the historical query data to obtain negative sample pairs. A negative sample pair can be denoted <query, doc->, where query denotes a query word and doc- denotes the website corresponding to a result webpage. The negative sample data set is then acquired based on the obtained negative sample pairs; all or some of the obtained pairs may be collected into the negative sample data set. Of course, after all the <query, doc-> pairs are obtained from the historical query data by negative sampling, they may also be filtered and cleaned, and all or some of the cleaned negative sample pairs then form the negative sample data set.
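As a rough illustration of the pair construction described above — the data layout (a list of (query, clicked-site) records), the `neg_per_pos` ratio, and the uniform choice of negatives are assumptions made for this sketch, not the patent's exact procedure:

```python
import random

def build_training_set(history, candidate_sites, neg_per_pos=4):
    """Construct (query, site, label) examples: label 1 for positive
    pairs <query, doc+> mined from the query history, label 0 for
    negatively sampled pairs <query, doc->."""
    data = []
    for query, pos_site in history:
        data.append((query, pos_site, 1))           # positive pair
        for _ in range(neg_per_pos):
            neg_site = random.choice(candidate_sites)
            if neg_site != pos_site:                # crude filtering/cleaning
                data.append((query, neg_site, 0))   # negative pair
    random.shuffle(data)
    return data
```

The patent also describes frequency-weighted negative sampling (Formula 1); `random.choice` here corresponds to the simpler uniform option.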
Specifically, when negative sample pairs are obtained by negative sampling, uniform sampling over websites may be used, so that every website has the same probability of becoming a negative sample. Alternatively, negative sampling may be weighted by the number of result webpages corresponding to each website.
Specifically, when negative sample pairs are obtained by negative sampling, the word2vec negative sampling scheme can also be used. In that case negative samples are drawn from a unigram distribution: the probability that a website is selected as a negative sample is related to its frequency of occurrence, and more frequent websites are more likely to be selected. The negative sampling probability formula is:

P(w_i) = f(w_i)^t / Σ_{j=1}^{n} f(w_j)^t    (Formula 1)

where P(w_i) denotes the probability that website w_i is negatively sampled, f(w_i) denotes the frequency of occurrence of w_i, and t is a hyperparameter, usually taken as t = 0.75.
In practice, to avoid Formula 1 assigning too high a negative sampling probability to websites that occur frequently, extensive experiments were carried out on the value of t. The experiments show that this problem is effectively avoided when t lies within a set value range, so the hyperparameter in the negative sampling probability formula is kept within that range. The set value range is 0.45 to 0.65; t may take any value in this range, for example 0.48, 0.5, or 0.52, which effectively avoids over-sampling frequently occurring websites as negatives.
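A minimal sketch of Formula 1 with the hyperparameter t kept in the 0.45–0.65 range the text recommends (function and argument names are illustrative):

```python
def neg_sampling_probs(site_counts, t=0.5):
    """Formula 1: P(w_i) = f(w_i)**t / sum_j f(w_j)**t, where
    f(w_i) is how often website w_i occurs. t = 0.5 lies in the
    0.45-0.65 range from the text; word2vec's usual value is 0.75."""
    weights = {site: count ** t for site, count in site_counts.items()}
    total = sum(weights.values())
    return {site: w / total for site, w in weights.items()}

counts = {"www.autohome.com.cn": 900, "nev.ofweek.com": 9}
probs = neg_sampling_probs(counts, t=0.5)
# With t = 0.5 the 100x frequency gap shrinks to a 10x probability gap.
```

Lowering t below 0.75 flattens the distribution further, which is exactly the effect the experiments on t are after.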
Therefore, after the positive sample data set and the negative sample data set are mined from the user's historical query data, the training data set is formed from them; once the training data set is obtained, the correlation model converts the relevance between query words and websites into a classification problem.
In the embodiment of the present invention, the word vector of the query word may be obtained through a word vector model, such as the CBOW or Skip-gram model of word2vec, or the GloVe model; the present invention is not specifically limited in this respect.
After the training data set is obtained, step S102 is executed. For each training data, the word vector of the query word is mapped into the word embedding layer of the correlation model to obtain a mapped word vector of the query word; similarly, the site features of the website are mapped into a site embedding layer to obtain the site feature vector of the website. The mapped word vector and the site feature vector are input into the correlation model to obtain the correlation value between the query word and the website. In this way, each training data in the training data set is fed into the correlation model for training, and the parameters of the model are adjusted until the model satisfies the constraint condition, yielding the final correlation model.
Specifically, when the correlation model is trained, the positive and negative sample pairs in the training data set push the trained model towards more accurate classification. For example, some positive samples may be websites with high correlation values with cars, such as "car, www.autohome.com.cn" and "car, mall.bydauto.com.cn"; some negative samples may be websites with low relevance to cars, such as "car, nev.ofweek.com" and "car, field.10jqka.com.cn". This enables the trained model to accurately identify websites with high correlation values with "car". Training on a large number of positive and negative samples thus makes the trained model predict websites with higher correlation values with the query word, improving prediction accuracy.
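The claims mention adjusting the model's parameters with a logarithmic loss; a minimal sketch of that loss over labelled pairs (label 1 for <query, doc+>, 0 for <query, doc->) might look like the following — the function name and the probability-score interface are assumptions:

```python
import math

def log_loss(labels, scores, eps=1e-12):
    """Average binary log-loss: labels are 1 for positive pairs and
    0 for negative pairs; scores are the model's predicted
    probabilities that each pair is relevant."""
    total = 0.0
    for y, p in zip(labels, scores):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)
```

Confident correct predictions (positives near 1, negatives near 0) drive the loss toward zero, which is what gradient updates on the model's parameters aim for.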
In the embodiment of the present invention, the vector dimensions of the mapping word vector and the site feature vector are the same, and may be, for example, 200-dimensional vectors, 256-dimensional vectors, 512-dimensional vectors, and the like; of course, the vector dimensions of the mapping word vector and the site feature vector may also be different, for example, the mapping word vector may be a 200-dimensional vector, the site feature vector may be a 256-dimensional vector, the mapping word vector may be a 512-dimensional vector, and the site feature vector may be a 300-dimensional vector, which is not limited in the present invention.
In the embodiment of the invention, during the training of the correlation model, when the word vector of the query word is mapped to the word embedding layer, a pre-training model and a Convolutional Neural Network (CNN) are used to obtain the mapped word vector of the query word. Specifically, a pre-training model may be used to obtain the word vector of the query word; the pre-training model is obtained by collecting a large corpus, such as Sogou Baike, news corpora, other encyclopedia corpora, and Wikipedia, and performing word vector training on it. The query word thus obtains a first training result through the pre-trained word vector model; the CNN is then used to train on the word vector of the query word to obtain a second training result. During the training of the correlation model, the parameters of the CNN model are trained based on the first and second training results while the pre-training model is fine-tuned, and training stops when performance on the test data is optimal, yielding the fine-tuned pre-training model and the CNN model. The query word's word vector from the fine-tuned pre-training model is then input into the CNN model to obtain the mapped word vector of the query word.
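A toy numpy sketch of the word-embedding layer just described: pre-trained token vectors are passed through a 1-D convolution and max-pooled over time into one fixed-size mapped word vector. All shapes, the kernel count (32), and the embedding dimension (200) are illustrative assumptions, not the patent's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_map_query(word_vecs, kernels):
    """Map a query's sequence of pre-trained word vectors to a single
    mapped word vector via 1-D convolution plus max-over-time pooling."""
    seq = np.asarray(word_vecs)                  # (seq_len, emb_dim)
    outputs = []
    for k in kernels:                            # k: (width, emb_dim)
        width = k.shape[0]
        acts = [np.sum(seq[i:i + width] * k)     # valid convolution
                for i in range(len(seq) - width + 1)]
        outputs.append(max(acts))                # max over time
    return np.array(outputs)                     # (num_kernels,)

# e.g. 32 kernels of width 2 over 200-dim embeddings, 5-token query
kernels = [rng.normal(size=(2, 200)) for _ in range(32)]
query_vecs = rng.normal(size=(5, 200))
print(cnn_map_query(query_vecs, kernels).shape)  # (32,)
```

The output dimension equals the number of kernels regardless of query length, which is what lets variable-length queries be compared against fixed-size site feature vectors.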
Specifically, the model of the word embedding layer may be configured for the length of the query word. The number of convolution kernels may be set according to the actual situation: according to device performance and real-time requirements, according to only one of the two, or manually or by the system; for example, the number of convolution kernels may be 32, 40, 52, or 64. The number of fully connected layers in the correlation model may also be set, for example to 3, 4, or 5 layers; the present invention is not specifically limited in this respect.
Specifically, a word vector of the query word is input into a model of a word embedding layer, and finally, the output vector is used as a mapping word vector of the query word through the processing of a convolution layer and a full connection layer of the model of the word embedding layer.
In the embodiment of the invention, during training of the correlation model, the features of a website may be initialized by the site embedding layer to obtain the initialization features of the website; then, the correlation value between the website and the query word is calculated from the initialization features of the website and the mapping word vector of the query word. When calculating the correlation value between the website and the query word, cosine similarity may be used, specifically the following formula:

sim(X, Y) = (Σ_i x_i · y_i) / (sqrt(Σ_i x_i²) · sqrt(Σ_i y_i²))    (formula 2)

wherein, in formula 2, X represents the website, Y represents the query word, sim(X, Y) represents the correlation value between the website and the query word, x_i represents the i-th dimension of the website vector, and y_i represents the i-th dimension of the query word vector.
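A direct NumPy rendering of formula 2, with illustrative two-dimensional vectors:

```python
import numpy as np

def cosine_sim(x, y):
    """Formula 2: cosine similarity between a website vector x and a
    query-word vector y."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

# vectors pointing in the same direction have similarity 1.0
sim = cosine_sim(np.array([1.0, 0.0]), np.array([1.0, 0.0]))
```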
Specifically, in order to make the obtained site feature vector of the website more accurate, the site embedding layer corresponding to the website may be fine-tuned during training based on a fine-tuning network to obtain an adjusted site embedding layer, which is then used to obtain the site feature vector of the website.
In the embodiment of the invention, when calculating the correlation value between the website and the query word, similarity measures such as the Euclidean distance, Manhattan distance, Chebyshev distance, Minkowski distance, Pearson correlation coefficient, and Mahalanobis distance may also be adopted. After the correlation value between the query word in each training data and the website is obtained, the correlation model is obtained based on these correlation values; the correlation model is an end-to-end model.
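For reference, several of the alternative distance measures listed above can be computed as in the following NumPy sketch (the Pearson and Mahalanobis variants are omitted for brevity; the example vectors are illustrative):

```python
import numpy as np

def euclidean(x, y):
    return float(np.sqrt(np.sum((x - y) ** 2)))

def manhattan(x, y):
    return float(np.sum(np.abs(x - y)))

def chebyshev(x, y):
    return float(np.max(np.abs(x - y)))

def minkowski(x, y, p=3):
    # p = 2 recovers the Euclidean distance, p = 1 the Manhattan distance
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])
d_euc, d_man, d_che = euclidean(x, y), manhattan(x, y), chebyshev(x, y)
```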
In the embodiment of the invention, the classifier in the correlation model calculates the correlation values between a query word and a plurality of websites through formula 2, and the website with the largest correlation value is taken as the recommended website for the query word; alternatively, at least one website whose correlation value is greater than a preset threshold may be selected from the plurality of websites as a recommended website for the query word, so that the recommended websites match the query word more closely. In this way, the prediction accuracy of the correlation model is higher. Of course, the recommended websites may also be ranked according to their correlation values with the query word, so that websites with high correlation values are ranked before websites with low correlation values, further improving the ranking accuracy of the recommended websites.
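The selection and ranking logic described above (largest correlation value, threshold filtering, descending order) can be sketched as follows; the helper name and the site/score data are illustrative, not from the patent:

```python
def recommend(site_scores, threshold=None, top_k=None):
    """Rank (site, relevance) pairs, highest relevance first; optionally
    keep only scores above `threshold` and/or the first `top_k` entries."""
    ranked = sorted(site_scores, key=lambda p: p[1], reverse=True)
    if threshold is not None:
        ranked = [p for p in ranked if p[1] > threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked

scores = [("a.com", 0.31), ("b.com", 0.87), ("c.com", 0.54)]
best = recommend(scores, top_k=1)[0][0]   # site with the largest correlation value
```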
After the correlation value between the query word in the training data and the website is obtained in step S102, step S103 is performed to obtain a correlation model based on the correlation value between the query word in each training data and the website.
Specifically, in order to make the prediction of the correlation model more accurate, the output value of the classifier in the correlation model may be evaluated using a loss function to obtain an evaluation result; the parameters of the correlation model are then adjusted according to the evaluation result until the evaluation result of the output value predicted by the adjusted correlation model satisfies the evaluation constraint condition, and the correlation model satisfying the evaluation constraint condition is taken as the final correlation model. Adjusting the parameters of the correlation model through a loss function improves both the convergence efficiency of the model and the accuracy of its predictions. The classifier may specifically be softmax, an SVM, or the like.
In the embodiment of the present invention, when the parameters of the correlation model are adjusted based on a loss function, the loss function may specifically be a logarithmic loss function, a quadratic loss function, an exponential loss function, and the like; the logarithmic loss function is taken as an example below.
Specifically, when the parameters of the correlation model are adjusted using the logarithmic loss function, adopting a margin distance method can improve the convergence efficiency of the correlation model and the accuracy of its predictions. The logarithmic loss function with the added margin is specifically:

L = -(1/k) Σ log( e^(s·cos(θ_j + m)) / ( e^(s·cos(θ_j + m)) + Σ_{i=1, i≠j}^{n} e^(s·cos θ_i) ) )    (formula 3)

In formula 3, k represents the number of training samples input at one time during model training, and generally takes values such as 1, 2, 4, 8, 16, or 32; n represents the number of website samples corresponding to one query word (if the ratio of positive to negative samples is 1:100, n is 101); j represents the position of the positive sample among the n websites; m and s are hyper-parameters representing the margin distance and the scaling factor, respectively, where m may be π/40, π/30, π/45, and the like, and s may be 3.5, 4, 5.5, and the like; θ_j represents the angle between the query word and the positive-sample website; θ_i represents the angle between the query word and a negative-sample website.
Therefore, the site embedding layer is fine-tuned through formula 3, so that the site feature vector of the website acquired by the adjusted site embedding layer is more accurate; on this basis, the correlation value calculated from the site feature vector and the mapping word vector becomes more accurate, and the prediction accuracy of the correlation model improves accordingly.
Of course, a logarithmic loss function without the margin method may also be used to fine-tune the site embedding layer. This logarithmic loss function is specifically:

L = -log( e^(m·ŷ_j) / Σ_{i=1}^{n} e^(m·ŷ_i) )    (formula 4)

wherein, in formula 4, ŷ_i represents the prediction result of the i-th category, and m is a hyper-parameter.
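The margin-free loss can be sketched as a scaled softmax log loss. Treating the hyper-parameter m as a scaling factor on the predictions is an assumption made for this illustration:

```python
import numpy as np

def log_loss(preds, j, m=1.0):
    """Plain log loss without margin: softmax cross-entropy of the scaled
    prediction scores, with j the index of the positive category and m an
    assumed scaling hyper-parameter."""
    z = m * preds
    z = z - z.max()                          # numerical stability
    return float(-(z[j] - np.log(np.exp(z).sum())))

plain = log_loss(np.array([2.0, 0.5, -1.0]), j=0)
```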
In the prior art, when the correlation between a query word and a website is calculated, if website recommendation is performed according to the quality and authority of the website rather than according to the query word, the correlation between the recommended website and the query word varies greatly; if a graph model is used to calculate the correlation between the query word and the website, out-of-vocabulary words appearing when the query word is segmented must be specially processed before calculation, so that both the recommendation efficiency and the recommendation accuracy are low.
In the embodiment of the invention, the correlation model converts the relation between the query word and the website into a classification problem. Sample pairs of query words and websites are obtained from the user's historical query data and by negative sampling, and the correlation model is obtained by training on the sample data set formed by these sample pairs. The trained correlation model can then obtain the correlation values between a query word and the websites, and the obtained websites are ranked based on those correlation values. In this way, the low recommendation efficiency and accuracy caused by out-of-vocabulary words in the graph model are avoided; moreover, because negative sampling is used, the websites recommended by the correlation model also take the quality and authority of the websites into account, further improving the recommendation accuracy. Compared with the prior art, the correlation model in the embodiment of the invention improves both the recommendation efficiency and the recommendation accuracy for websites.
As shown in fig. 2, an embodiment of the present invention further provides a method for sorting network stations, including:
s201, acquiring a current query word input by a user in real time;
in step S201, a current query term input by the user may be obtained in real time; for example, if the user a inputs "cartoon" in the search engine, the "cartoon" input by the user a is acquired as the current query word.
S202, inputting a word vector of a current query word into a correlation model, and acquiring a correlation value between the current query word and each website in a website set;
In step S202, the training of the correlation model is described in steps S101 to S103, through which the correlation model is obtained. After the current query word is acquired, its word vector can be obtained through the word vector model; the word vector of the current query word is input into the word embedding layer of the correlation model to obtain the mapping word vector of the current query word; the mapping word vector then serves as the input of the correlation model, which, after calculation, outputs the correlation value between the current query word and each website in the website set.
Specifically, the network site set is composed of all or part of the network sites included in the correlation model, and the present invention is not particularly limited.
S203, based on the correlation value of the current query word and each network station, ranking the network stations corresponding to the current query word.
Specifically, when the websites corresponding to the current query word are ranked based on their correlation values with the current query word, the websites are ranked in descending order of correlation value. Of course, the websites may also be ranked in ascending order of correlation value, or only part of the list may be ordered by correlation value; other ranking methods are also possible, and the present invention is not particularly limited.
For example, taking the query words "science classic lens pictures" and "decoration effect graph" as examples, the correlation values between "science classic lens pictures" and the websites obtained by the correlation model are shown in Table 1 below:
TABLE 1
The website ranking for the "science classic lens pictures" is "side. Similarly, the websites for the "decoration effect graph" are ranked in order as www.17house.com, home.fang.com, zixun.jia.com, and www.xiujukoo.com; in each case the websites are ranked from largest to smallest correlation value.
Device embodiment
Referring to fig. 3, a block diagram of a correlation model training apparatus according to an embodiment of the present invention is shown, which may specifically include:
a training data obtaining unit 301, configured to obtain a training data set according to historical query data of a user, where the training data set includes a positive sample data set and a negative sample data set, and each training data in the training data set includes a query word and a website, where the website is extracted from a result webpage corresponding to the query word;
a training unit 302, configured to map, for each training data in the training data set, a word vector of a query word in the training data to an embedded layer in a convolutional neural network, so as to obtain a mapped word vector of the query word in the training data; acquiring site feature vectors of network sites in training data; acquiring a correlation value of a query word and a website in training data according to the mapping word vector and the website feature vector;
a model obtaining unit 303, configured to obtain the correlation model based on a correlation value between the query term in each piece of training data and the website, where the correlation model is an end-to-end model.
In an alternative embodiment, the training data obtaining unit 301 includes:
the positive sample data set acquisition module is used for searching a positive sample pair from historical query data of a user and acquiring the positive sample data set based on the searched positive sample pair;
the negative sample data set acquisition module is used for carrying out negative sampling on the historical query data of the user by using a word2vec negative sampling mode to acquire a negative sample pair; acquiring the negative sample data set based on the acquired negative sample pair;
and the training data acquisition module is used for acquiring the training data set based on the positive sample data set and the negative sample data set.
In an optional implementation manner, the negative sample data set obtaining module is configured to perform negative sampling on the user's historical query data by using the negative sampling probability formula in word2vec to obtain the negative sample pairs, where the value of the hyper-parameter in the negative sampling probability formula lies within a set value range.
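The word2vec negative-sampling probability formula referred to here raises each word's frequency to a power and renormalizes, so that frequent items are down-weighted relative to uniform-by-frequency sampling. A sketch follows, with the exponent `alpha` standing in for the hyper-parameter whose value is kept within a set range (0.75 is the value word2vec itself uses):

```python
import numpy as np

def neg_sampling_probs(freqs, alpha=0.75):
    """word2vec-style negative-sampling distribution:
    P(w) = f(w)^alpha / sum_v f(v)^alpha."""
    f = np.asarray(freqs, dtype=float) ** alpha
    return f / f.sum()

# frequent items keep higher probability, but less dominantly than raw frequency
p = neg_sampling_probs([100, 10, 1])
```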
In an alternative embodiment, the training unit 302 includes:
and the word vector mapping module is used for mapping the word vectors of the query words to the word embedding layer by adopting a pre-training model and a convolutional neural network to obtain the mapped word vectors of the query words.
In an optional implementation manner, the model obtaining unit 303 is configured to adjust the parameters of the correlation model by using a logarithmic loss function and a margin distance method in the process of obtaining the correlation model based on the correlation value between the query word in each training data and the website.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments of the present invention are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Referring to fig. 4, a block diagram of a network station sequencing apparatus according to an embodiment of the present invention is shown, which may specifically include:
a query term obtaining unit 401, configured to obtain a current query term input by a user in real time;
a correlation value obtaining unit 402, configured to input the word vector of the current query word into a correlation model, and obtain a correlation value between the current query word and each website in a website set;
a sorting unit 403, configured to sort, based on the correlation value between the current query term and each website, the websites corresponding to the current query term.
In an alternative embodiment, the set of network sites consists of all network sites comprised by the relevance model.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments of the present invention are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 5 is a block diagram illustrating a structure of a correlation model training apparatus or a website ranking apparatus as a device according to an exemplary embodiment. For example, the apparatus 900 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 5, apparatus 900 may include one or more of the following components: processing component 902, memory 904, power component 906, multimedia component 908, audio component 910, input/output (I/O) interface 912, sensor component 914, and communication component 916.
The processing component 902 generally controls overall operation of the device 900, such as operations associated with display, incoming calls, data communications, camera operations, and recording operations. Processing element 902 may include one or more processors 920 to execute instructions to perform all or a portion of the steps of the methods described above. Further, processing component 902 can include one or more modules that facilitate interaction between processing component 902 and other components. For example, the processing component 902 can include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operation at the device 900. Examples of such data include instructions for any application or method operating on device 900, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 904 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 906 provides power to the various components of the device 900. The power components 906 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 900.
The multimedia component 908 comprises a screen providing an output interface between the device 900 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gesture actions on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front-facing camera and/or a rear-facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 900 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, audio component 910 includes a Microphone (MIC) configured to receive external audio signals when apparatus 900 is in an operating mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 904 or transmitted via the communication component 916. In some embodiments, audio component 910 also includes a speaker for outputting audio signals.
I/O interface 912 provides an interface between processing component 902 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 914 includes one or more sensors for providing status assessment of various aspects of the apparatus 900. For example, the sensor assembly 914 may detect the open/closed state of the device 900 and the relative positioning of components such as the display and keypad of the apparatus 900; the sensor assembly 914 may also detect a change in the position of the apparatus 900 or a component of the apparatus 900, the presence or absence of user contact with the apparatus 900, the orientation or acceleration/deceleration of the apparatus 900, and a change in the temperature of the apparatus 900. The sensor assembly 914 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor assembly 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate communications between the apparatus 900 and other devices in a wired or wireless manner. The apparatus 900 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 916 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 904 comprising instructions, executable by the processor 920 of the apparatus 900 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Fig. 6 is a block diagram of a server in some embodiments of the invention. The server 1900 may vary widely by configuration or performance and may include one or more Central Processing Units (CPUs) 1922 (e.g., one or more processors) and memory 1932, one or more storage media 1930 (e.g., one or more mass storage devices) storing applications 1942 or data 1944. Memory 1932 and storage medium 1930 can be, among other things, transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown), each of which may include a series of instructions operating on a server. Still further, a central processor 1922 may be provided in communication with the storage medium 1930 to execute a series of instruction operations in the storage medium 1930 on the server 1900.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input-output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
A non-transitory computer readable storage medium in which instructions, when executed by a processor of an apparatus (device or server), enable the apparatus to perform a correlation model training method, the method comprising: acquiring a training data set according to historical query data of a user, wherein the training data set comprises a positive sample data set and a negative sample data set, each training data in the training data set comprises a query word and a website, and the website is extracted from a result webpage corresponding to the query word; for each training data in the training data set, mapping the word vector of the query word in the training data to an embedded layer in a convolutional neural network to obtain a mapping word vector of the query word in the training data; acquiring site feature vectors of network sites in training data; acquiring a correlation value of a query word and a website in training data according to the mapping word vector and the website feature vector; and obtaining the correlation model based on the correlation value of the query word in each training data and the network station, wherein the correlation model is an end-to-end model.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (10)
1. A method for training a correlation model, comprising:
acquiring a training data set according to historical query data of a user, wherein the training data set comprises a positive sample data set and a negative sample data set, each training data in the training data set comprises a query word and a website, and the website is extracted from a result webpage corresponding to the query word;
for each training data in the training data set, mapping the word vector of the query word in the training data to an embedded layer in a convolutional neural network to obtain a mapping word vector of the query word in the training data; acquiring site feature vectors of network sites in training data; acquiring a correlation value of a query word and a website in training data according to the mapping word vector and the website feature vector;
and obtaining the correlation model based on the correlation value of the query word in each training data and the network station, wherein the correlation model is an end-to-end model.
2. The training method of claim 1, wherein the obtaining a training data set based on the user historical query data comprises:
acquiring a positive sample pair from historical query data of a user, and acquiring the positive sample data set based on the acquired positive sample pair;
carrying out negative sampling on the historical query data of the user by using a word2vec negative sampling mode to obtain a negative sample pair;
acquiring the negative sample data set based on the acquired negative sample pair;
and acquiring the training data set based on the positive sample data set and the negative sample data set.
3. The training method of claim 2, wherein the using word2vec negative sampling to negatively sample the user historical query data to obtain negative sample pairs comprises:
and carrying out negative sampling on the historical query data of the user by using a negative sampling probability formula in word2vec to obtain a negative sample pair, wherein the value of the hyper-parameter in the negative sampling probability formula is in a set value range.
4. The training method of claim 1, wherein the mapping the word vector of the query word in the training data to an embedded layer in a convolutional neural network to obtain a mapped word vector of the query word in the training data comprises:
and mapping the word vector of the query word to a word embedding layer by adopting a pre-training model and a convolutional neural network to obtain a mapping word vector of the query word.
5. The training method of claim 1, wherein the obtaining of the correlation model based on the correlation value of the query word and the website in each training data comprises:
parameters of the correlation model are adjusted using a log-loss function and a separation distance method.
6. A method for sequencing web sites, comprising:
acquiring a current query word input by a user in real time;
inputting the word vector of the current query word into a correlation model trained by the method according to any one of claims 1 to 5, and obtaining a correlation value between the current query word and each website in a website set;
and sequencing the network sites corresponding to the current query term based on the correlation value of the current query term and each network site.
7. A correlation model training apparatus, comprising:
a training data acquisition unit, configured to acquire a training data set from a user's historical query data, wherein the training data set comprises a positive sample data set and a negative sample data set, each training data in the training data set comprises a query word and a website, and the website is extracted from a result web page corresponding to the query word;
a training unit, configured to: for each training data in the training data set, map the word vector of the query word in the training data to an embedding layer in a convolutional neural network to obtain a mapped word vector of the query word in the training data; acquire a site feature vector of the website in the training data; and acquire a correlation value between the query word and the website in the training data from the mapped word vector and the site feature vector;
and a model obtaining unit, configured to obtain the correlation model based on the correlation value between the query word and the website in each training data, wherein the correlation model is an end-to-end model.
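The training unit's final step — computing a correlation value from the mapped word vector and the site feature vector — is left open by the claim; cosine similarity is one plausible choice (an assumption, not the patent's stated function):

```python
import math

def correlation_value(mapped_word_vec, site_feature_vec):
    """Cosine similarity between the mapped query word vector and the
    site feature vector; returns 0.0 for a zero-length vector."""
    dot = sum(a * b for a, b in zip(mapped_word_vec, site_feature_vec))
    norm_q = math.sqrt(sum(a * a for a in mapped_word_vec))
    norm_s = math.sqrt(sum(b * b for b in site_feature_vec))
    return dot / (norm_q * norm_s) if norm_q and norm_s else 0.0
```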
8. A website ranking apparatus, comprising:
a query word acquisition unit, configured to acquire, in real time, a current query word input by a user;
a correlation value obtaining unit, configured to input the word vector of the current query word into the correlation model according to any one of claims 1 to 5 to obtain a correlation value between the current query word and each website in a set of websites;
and a ranking unit, configured to rank the websites corresponding to the current query word based on the correlation value between the current query word and each website.
9. An apparatus for data processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors to perform the method steps of any one of claims 1 to 6.
10. A machine-readable medium having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011402049.1A CN114595800B (en) | 2020-12-02 | 2020-12-02 | Correlation model training method, sequencing method, device, electronic equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114595800A true CN114595800A (en) | 2022-06-07 |
CN114595800B CN114595800B (en) | 2024-09-17 |
Family
ID=81812205
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011402049.1A Active CN114595800B (en) | 2020-12-02 | 2020-12-02 | Correlation model training method, sequencing method, device, electronic equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114595800B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106021364A (en) * | 2016-05-10 | 2016-10-12 | 百度在线网络技术(北京)有限公司 | Method and device for establishing picture search correlation prediction model, and picture search method and device |
CN106777248A (en) * | 2016-12-27 | 2017-05-31 | 努比亚技术有限公司 | A kind of search engine test evaluation method and apparatus |
US20180121798A1 (en) * | 2016-10-31 | 2018-05-03 | Microsoft Technology Licensing, Llc | Recommender system |
CN108595636A (en) * | 2018-04-25 | 2018-09-28 | 复旦大学 | The image search method of cartographical sketching based on depth cross-module state correlation study |
CN110019889A (en) * | 2017-12-01 | 2019-07-16 | 北京搜狗科技发展有限公司 | Training characteristics extract model and calculate the method and relevant apparatus of picture and query word relative coefficient |
CN110046298A (en) * | 2019-04-24 | 2019-07-23 | 中国人民解放军国防科技大学 | Query word recommendation method, apparatus, terminal device, and computer-readable medium |
CN110750617A (en) * | 2018-07-06 | 2020-02-04 | 北京嘀嘀无限科技发展有限公司 | Method and system for determining relevance between input text and interest points |
US10733383B1 (en) * | 2018-05-24 | 2020-08-04 | Workday, Inc. | Fast entity linking in noisy text environments |
US20220164600A1 (en) * | 2020-11-20 | 2022-05-26 | Nec Laboratories America, Inc. | Unsupervised document representation learning via contrastive augmentation |
Non-Patent Citations (4)
Title |
---|
HUGO CASELLES-DUPRÉ ET AL.: "Word2vec applied to Recommendation: Hyperparameters Matter", Proceedings of the 12th ACM Conference on Recommender Systems, 7 October 2018 (2018-10-07) *
TOSIN ADEWUMI ET AL.: "Word2Vec: Optimal hyperparameters and their impact on natural language processing downstream tasks", Open Computer Science, 31 December 2022 (2022-12-31), page 134 *
XU SHOUKUN; ZHOU JIA; LI NING; SHI LIN: "Text topics based on word2vec and LDA" (基于word2vec和LDA的文本主题), Computer Engineering and Design, no. 09, 16 September 2018 (2018-09-16) *
CHEN WEIPENG ET AL.: "Query abbreviation based on word association" (基于词语关联度的查询缩略), Journal of Chinese Information Processing, vol. 28, no. 4, 31 July 2014 (2014-07-31), pages 104-110 *
Also Published As
Publication number | Publication date |
---|---|
CN114595800B (en) | 2024-09-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109800325B (en) | Video recommendation method and device and computer-readable storage medium | |
US10783206B2 (en) | Method and system for recommending text content, and storage medium | |
CN108256555B (en) | Image content identification method and device and terminal | |
CN107526744B (en) | Information display method and device based on search | |
CN109684510B (en) | Video sequencing method and device, electronic equipment and storage medium | |
CN110782034A (en) | Neural network training method, device and storage medium | |
CN108010060B (en) | Target detection method and device | |
CN112131410A (en) | Multimedia resource display method, device, system and storage medium | |
CN110309357B (en) | Application data recommendation method, model training method, device and storage medium | |
CN111291069A (en) | Data processing method and device and electronic equipment | |
CN108874827B (en) | Searching method and related device | |
CN111160448A (en) | An image classification model training method and device | |
CN110929176A (en) | Information recommendation method and device and electronic equipment | |
CN111666485B (en) | Information recommendation method, device and terminal | |
CN108021669B (en) | Image classification method and apparatus, electronic device, computer-readable storage medium | |
CN112825076B (en) | Information recommendation method and device and electronic equipment | |
CN111368161B (en) | Search intention recognition method, intention recognition model training method and device | |
CN112148923A (en) | Search result sorting method, sorting model generation method, device and equipment | |
CN112308588A (en) | Advertisement putting method and device and storage medium | |
CN114595800B (en) | Correlation model training method, sequencing method, device, electronic equipment and medium | |
CN112052395A (en) | Data processing method and device | |
CN119001452A (en) | Battery thermal runaway prediction method, device and system of multi-modal neural network | |
CN114117058A (en) | Determination method, device, electronic device and storage medium of account information | |
CN113177162B (en) | Search result sorting method and device, electronic equipment and storage medium | |
CN107870941B (en) | Webpage sorting method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||