CN110827045B - Method and device for distinguishing commodity relationship - Google Patents
- Publication number
- CN110827045B (related identifiers: CN201810892271.0A, CN110827045A)
- Authority
- CN
- China
- Prior art keywords
- commodity
- relation
- objective function
- constraint
- distinguishing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0282—Rating or review of business operators or products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Abstract
The invention discloses a method and a device for distinguishing commodity relations, and relates to the technical field of computers. One embodiment of the method comprises the following steps: screening out a set with directed edges based on a target loss function; establishing, by introducing a mapping vector, an objective function for distinguishing complementary relations from substitution relations, and then establishing a joint objective function through category constraints and multi-step path constraints; inputting the training set in the set into the joint objective function and optimizing it, so that a joint objective model is obtained through training and a likelihood score is obtained through calculation; and distinguishing the relationships between commodities based on the likelihood score. This embodiment can solve the technical problems of data sparsity and of ignoring the interdependence among commodity relationships.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for distinguishing a commodity relationship.
Background
Personalized recommendation systems have become an integral part of connecting users with commodities and are widely used in multimedia recommendation, news recommendation, commodity recommendation, and other fields. Personalized recommendation usually comprises two processes: retrieval and ranking. The system can therefore be optimized in two ways: 1. retrieving relevant commodities more accurately from billions of commodities; 2. ranking the relevance of the commodities more accurately.
The Sceptre model is a recently proposed model for distinguishing complements from substitutes. It collects two training datasets, "also viewed" and "bought together", and treats them as substitutes and complements, respectively. On this basis, the Sceptre model takes purchase reviews, prices, and other commodity characteristics as information sources, trains a model that combines a text generation model with a supervised link prediction method, and uses it to identify complements and substitutes.
In the process of implementing the present invention, the inventor finds that at least the following problems exist in the prior art:
1) The problem of data sparsity is not well solved;
2) Only the direct relationships between commodities are considered; the interdependence among relationships is not considered.
Disclosure of Invention
In view of this, the embodiment of the invention provides a method and a device for distinguishing commodity relations, so as to solve the technical problems of data sparsity and of ignoring the interdependence among commodity relationships.
To achieve the above object, according to one aspect of the embodiments of the present invention, there is provided a method of distinguishing commodity relations, including:
screening out a set with directed edges based on a target loss function;
By introducing a mapping vector, an objective function for distinguishing complementary relations or alternative relations is established, and then a combined objective function is established by category constraint and multi-step path constraint;
Inputting the training set in the set into the combined objective function, and carrying out optimization solution on the training set so as to train to obtain a combined objective model and calculate to obtain a likelihood score;
based on the likelihood scores, relationships between the items are distinguished.
Optionally, inputting the training set in the set into the joint objective function, and performing optimization solution on the training set, so that a joint objective model is obtained through training, and a likelihood score is obtained through calculation, including:
extracting multi-step path constraints from the training set by breadth-first search to obtain all multi-step path constraints, and constructing a corresponding training set from them;
and inputting the constructed training set into the joint objective function and optimizing the joint objective function by stochastic gradient descent, so that a joint objective model is obtained through training and a likelihood score is obtained through calculation.
Optionally, distinguishing the relationship between the goods based on the likelihood score includes:
inputting the verification set in the set into the trained joint objective model, and determining, from the output, the likelihood score thresholds corresponding to the substitution relation and the complementary relation respectively;
based on the likelihood score threshold and the likelihood score, relationships between the items are distinguished.
Optionally, establishing the joint objective function by category constraint and multi-step path constraint includes:
and combining the possibility of establishing category constraint and the possibility of establishing multi-step path constraint with the objective function for distinguishing the complementary relation or the alternative relation to obtain the combined objective function.
Optionally, the likelihood that the category constraint holds is described as:
$I(f_1) = I(i, r, j) \cdot I(i, r, k) - I(i, r, j) + 1$
Wherein commodity j and commodity k are in the same commodity category, $(i, r, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation r, and $(i, r, k)$ indicates that a directed edge exists from commodity i to commodity k in the space of relation r;
the likelihood that the multi-step path constraint holds is described as:
$I(f_2) = I(i, r_1, j) \cdot I(j, r_2, k) \cdot I(i, r_3, k) - I(i, r_1, j) \cdot I(j, r_2, k) + 1$
Wherein $(i, r_1, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation $r_1$, $(j, r_2, k)$ indicates that a directed edge exists from commodity j to commodity k in the space of relation $r_2$, and $(i, r_3, k)$ indicates that a directed edge exists from commodity i to commodity k in the space of relation $r_3$;
The objective function for distinguishing the complementary relation or the substitution relation is described as follows:
Wherein $\beta_r$ is the mapping vector for relation r, $R$ is the set of all relations r, $\varepsilon_E$ is the set of all directed edges, $P_n(w_0) \sim 1/|\{(i, r, j)\}|^{3/4}$, and $(i, r, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation r; the joint objective function is described as:
Where $f_c$ denotes a directly connected directed edge when c=0, a commodity category constraint when c=1, and a multi-step path constraint when c=2.
Optionally, screening out the set of existing directed edges based on the target loss function includes:
and obtaining a target loss function by using logistic regression:
Where N is the number of negative samples, $P_n(w)$ is the unigram distribution $U(w) \sim 1/|\{(i, j) \in \varepsilon_P\}|$ raised to the three-quarter power, and $\varepsilon_P$ is the set of all directed edges;
inputting commodity data into the target loss function, and screening out a commodity set with directed edges.
In addition, according to another aspect of the embodiment of the present invention, there is provided an apparatus for distinguishing a commodity relationship, including:
the screening module is used for screening out a set with directed edges based on the target loss function;
the constraint module is used for establishing an objective function for distinguishing complementary relations or alternative relations by introducing mapping vectors, and further establishing a combined objective function by category constraint and multi-step path constraint;
The solving module is used for inputting the training set in the set into the joint objective function, and carrying out optimization solving on the training set so as to train to obtain a joint objective model and calculate to obtain a likelihood score;
And the distinguishing module is used for distinguishing the relation between commodities based on the likelihood score.
Optionally, the solving module is configured to:
extracting multi-step path constraints from the training set by breadth-first search to obtain all multi-step path constraints, and constructing a corresponding training set from them;
and inputting the constructed training set into the joint objective function and optimizing the joint objective function by stochastic gradient descent, so that a joint objective model is obtained through training and a likelihood score is obtained through calculation.
Optionally, the differentiating module is configured to:
inputting the verification set in the set into the trained joint objective model, and determining, from the output, the likelihood score thresholds corresponding to the substitution relation and the complementary relation respectively;
based on the likelihood score threshold and the likelihood score, relationships between the items are distinguished.
Optionally, establishing the joint objective function by category constraint and multi-step path constraint includes:
and combining the possibility of establishing category constraint and the possibility of establishing multi-step path constraint with the objective function for distinguishing the complementary relation or the alternative relation to obtain the combined objective function.
Optionally, the likelihood that the category constraint holds is described as:
$I(f_1) = I(i, r, j) \cdot I(i, r, k) - I(i, r, j) + 1$
Wherein commodity j and commodity k are in the same commodity category, $(i, r, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation r, and $(i, r, k)$ indicates that a directed edge exists from commodity i to commodity k in the space of relation r;
the likelihood that the multi-step path constraint holds is described as:
$I(f_2) = I(i, r_1, j) \cdot I(j, r_2, k) \cdot I(i, r_3, k) - I(i, r_1, j) \cdot I(j, r_2, k) + 1$
Wherein $(i, r_1, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation $r_1$, $(j, r_2, k)$ indicates that a directed edge exists from commodity j to commodity k in the space of relation $r_2$, and $(i, r_3, k)$ indicates that a directed edge exists from commodity i to commodity k in the space of relation $r_3$;
The objective function for distinguishing the complementary relation or the substitution relation is described as follows:
Wherein $\beta_r$ is the mapping vector for relation r, $R$ is the set of all relations r, $\varepsilon_E$ is the set of all directed edges, $P_n(w_0) \sim 1/|\{(i, r, j)\}|^{3/4}$, and $(i, r, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation r; the joint objective function is described as:
Where $f_c$ denotes a directly connected directed edge when c=0, a commodity category constraint when c=1, and a multi-step path constraint when c=2.
Optionally, the screening module is configured to:
and obtaining a target loss function by using logistic regression:
Where N is the number of negative samples, $P_n(w)$ is the unigram distribution $U(w) \sim 1/|\{(i, j) \in \varepsilon_P\}|$ raised to the three-quarter power, and $\varepsilon_P$ is the set of all directed edges;
inputting commodity data into the target loss function, and screening out a commodity set with directed edges.
According to another aspect of an embodiment of the present invention, there is also provided an electronic device including:
one or more processors;
storage means for storing one or more programs,
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods of any of the embodiments described above.
According to another aspect of an embodiment of the present invention, there is also provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the method according to any of the embodiments described above.
One embodiment of the above invention has the following advantages or benefits: because an objective function for distinguishing complementary relations from substitution relations is established by introducing a mapping vector, and a joint objective function is then established through category constraints and multi-step path constraints, the technical problems of data sparsity and of ignoring the interdependence among commodity relationships are overcome. The invention takes commodity category information and commodity multi-step path information into account, establishes an objective function for distinguishing complementary relations from substitution relations by introducing a mapping vector, establishes a joint objective function through category constraints and multi-step path constraints, and obtains likelihood scores through optimization, thereby distinguishing the complementary and substitution relationships of commodities. By combining the category constraint and the multi-step path constraint in one model, the embodiment of the invention effectively alleviates the data sparsity problem and accounts for the interdependence among commodity relationships, so that commodity relationships are distinguished more accurately.
Further effects of the above non-conventional optional implementations are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a method of distinguishing merchandise relationships according to one embodiment of the invention;
FIG. 2 is a schematic diagram of the main flow of a method of distinguishing merchandise relationships according to one referenceable embodiment of the invention;
FIG. 3 is a schematic diagram of the major modules of an apparatus for distinguishing merchandise relationships according to an embodiment of the invention;
FIG. 4 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
fig. 5 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 1 is a schematic diagram of the main flow of a method of distinguishing merchandise relationships according to one embodiment of the invention. As an embodiment of the present invention, as shown in fig. 1, the method for distinguishing commodity relations may include:
step 101, screening out the set with directed edges based on the target loss function.
In this step, the commodity collection with the directed edge (i.e., the co-click data and the co-purchase data of the user) is screened from the massive commodity data based on the objective loss function. The co-click data represents the substitution relationship between the commodities, and the co-purchase data represents the complementary relationship between the commodities, so that the relationship between the commodities i and the commodities j in the data needs to be further distinguished through the method provided by the embodiment of the invention.
In practice, neither the complementary relationship nor the substitution relationship between commodities is symmetric. Thus, the semantics of each commodity need to be modeled with both a target vector and an environment vector. For a directed edge from commodity i to commodity j, the likelihood that this edge is present ($y_{i,j}=1$) or absent ($y_{i,j}=0$) can be calculated using the following formula:
Where $v_i$ is the target vector of commodity i, $v'_j$ is the environment vector of commodity j, and $\sigma(\cdot)$ denotes the sigmoid function.
From the above formula, the larger the value of $p(y_{i,j} \mid V, V')$, the more likely it is that the directed edge from commodity i to commodity j exists.
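The formula itself is not reproduced in this text. As a minimal sketch, assuming the common form in which the presence probability is the sigmoid of the inner product of the target vector of commodity i and the environment vector of commodity j, the calculation could look like the following. The function names and the toy vectors are hypothetical and only illustrate why the score is direction dependent.

```python
import math

def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def edge_probability(v_i, v_ctx_j, y=1):
    """Assumed form: p(y=1) = sigmoid(v_i . v'_j) and p(y=0) = 1 - p(y=1)."""
    score = sum(a * b for a, b in zip(v_i, v_ctx_j))
    p = sigmoid(score)
    return p if y == 1 else 1.0 - p

# Toy vectors (hypothetical). The edge i -> j uses the target vector of i
# and the environment vector of j, so the two directions can differ.
target = {"phone": [0.9, 0.1], "case": [0.2, 0.8]}
environment = {"phone": [0.1, 0.3], "case": [0.8, 0.2]}
p_phone_to_case = edge_probability(target["phone"], environment["case"])
p_case_to_phone = edge_probability(target["case"], environment["phone"])
```

Keeping two vectors per commodity is what lets the model assign different likelihoods to the edge from i to j and the edge from j to i.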
Then, logistic regression is used to obtain the following target loss function:
Where N is the number of negative samples, $P_n(w)$ is the unigram distribution $U(w) \sim 1/|\{(i, j) \in \varepsilon_P\}|$ raised to the three-quarter power, and $\varepsilon_P$ is the set of all directed edges. Here w is a random variable and n denotes the power applied to the unigram distribution; in an embodiment of the invention, n is three quarters.
Therefore, in this step, the vectors may be trained against the target loss function, and a commodity set with directed edges is then screened out using the trained result; any two commodities joined by such an edge have either a complementary relationship or a substitution relationship.
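The loss formula is likewise missing from this text. The sketch below assumes a word2vec-style negative-sampling loss: one positive log-sigmoid term for the observed edge plus N negative terms whose items are drawn from unigram counts raised to the three-quarter power. Counting an item once per incoming edge, and every name used here, are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def noise_distribution(edges, power=0.75):
    """Unigram counts of edge tails raised to `power`, then normalized."""
    counts = Counter(j for _, j in edges)
    weights = {item: count ** power for item, count in counts.items()}
    total = sum(weights.values())
    return {item: w / total for item, w in weights.items()}

def log_sigmoid(x):
    """log(sigmoid(x)) computed without overflow."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def edge_loss(i, j, target, environment, noise, num_negatives, rng):
    """Negative-sampling loss for one observed edge i -> j (assumed form)."""
    loss = -log_sigmoid(dot(target[i], environment[j]))
    items = list(noise)
    weights = [noise[item] for item in items]
    # Negatives are sampled with replacement and may occasionally hit j itself.
    for w in rng.choices(items, weights=weights, k=num_negatives):
        loss -= log_sigmoid(-dot(target[i], environment[w]))
    return loss

edges = [("a", "b"), ("c", "b"), ("a", "c")]
noise = noise_distribution(edges)
target = {"a": [0.5, 0.1], "c": [0.2, 0.4]}
environment = {"b": [0.3, 0.7], "c": [0.6, 0.2]}
loss = edge_loss("a", "b", target, environment, noise,
                 num_negatives=2, rng=random.Random(0))
```

Raising the counts to the three-quarter power flattens the distribution, so frequently linked commodities are sampled as negatives somewhat less often than their raw frequency would suggest.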
Step 102, by introducing a mapping vector, an objective function for distinguishing complementary relations or alternative relations is established, and then a joint objective function is established by category constraint and multi-step path constraint.
In fact, a commodity carries several kinds of semantic information (e.g., its target vector and environment vector), and it expresses different semantics under different relations. By introducing a mapping vector, the feature vectors of a commodity can be mapped into different relation spaces to express these semantics, as shown in the following formulas:
$v_{r,i} = v_i + \beta_r \odot v_i$,
$v'_{r,i} = v'_i + \beta_r \odot v'_i$
Where $v_{r,i}$ is the target vector of commodity i mapped to relation r, $v'_{r,i}$ is the environment vector of commodity i mapped to relation r, and $\beta_r$ is the mapping vector of relation r.
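The mapping above is an element-wise operation and can be sketched in a few lines. The function name `map_to_relation` and the sample values are hypothetical.

```python
def map_to_relation(v, beta_r):
    """Return v + beta_r * v element-wise, i.e. v mapped into relation space r."""
    if len(v) != len(beta_r):
        raise ValueError("vector and mapping vector must have the same length")
    return [x + b * x for x, b in zip(v, beta_r)]

v_i = [1.0, 2.0]
beta_substitute = [0.5, -1.0]
mapped = map_to_relation(v_i, beta_substitute)
unchanged = map_to_relation(v_i, [0.0, 0.0])
```

Because the same commodity vector is reused in every relation space and only the mapping vector differs, the number of extra parameters per relation stays small.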
For a directed edge from commodity i to commodity j, the likelihood that the directed edge $(i, r, j)$ is present ($z_{i,j,r}=1$) or absent ($z_{i,j,r}=0$) can be calculated as follows:
Then, an objective function that distinguishes the complementary relationship from the substitution relationship can be obtained as follows:
Where $R$ is the set of all relations r, $\varepsilon_E$ is the set of all directed edges, $P_n(w_0) \sim 1/|\{(i, r, j)\}|^{3/4}$, and $(i, r, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation r.
As noted above, the prior art 1) does not solve the data sparsity problem well and 2) considers only the direct relationships between commodities, ignoring the interdependence among relationships. For this reason, on the basis of the established objective function, the embodiment of the invention additionally models two kinds of path constraints: commodity category similarity and multi-step paths.
In an embodiment of the present invention, two path constraints are considered, commodity category constraints and multi-step path constraints, respectively.
Most existing methods have difficulty handling sparsity in the data; in embodiments of the invention, the data sparsity problem is alleviated by adding commodity category information. Specifically, the following commodity category constraints are mainly considered:
That is, if commodity A and commodity B are in a substitution relationship and commodity B and commodity C are in the same commodity category, then commodity A and commodity C are in a substitution relationship; if commodity A and commodity B are in a complementary relationship and commodity B and commodity C are in the same commodity category, then commodity A and commodity C are in a complementary relationship.
The likelihood that an instance of these two constraints holds can be calculated according to the following formula:
$I(f_1) = I(i, r, j) \cdot I(i, r, k) - I(i, r, j) + 1$
Wherein commodity j and commodity k are in the same commodity category, $(i, r, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation r, and $(i, r, k)$ indicates that a directed edge exists from commodity i to commodity k in the space of relation r.
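The category-constraint formula can be evaluated directly once each triple has a truth value in [0, 1]. The sketch below is illustrative only; the dictionary `truth` and the commodity names are hypothetical.

```python
def implication_truth(premise: float, conclusion: float) -> float:
    """I(f1) = premise * conclusion - premise + 1."""
    return premise * conclusion - premise + 1.0

def category_constraint_truth(truth, i, r, j, k):
    """truth maps a triple (head, relation, tail) to a value in [0, 1];
    commodities j and k are assumed to share a commodity category."""
    return implication_truth(truth[(i, r, j)], truth[(i, r, k)])

truth = {
    ("phone", "complement", "case_a"): 0.9,
    ("phone", "complement", "case_b"): 0.5,
}
value = category_constraint_truth(truth, "phone", "complement", "case_a", "case_b")
```

With these values the constraint holds to a degree of about 0.55. The expression equals 1 whenever the premise is 0 and equals the conclusion whenever the premise is 1, which is the behavior expected of a soft "if ... then ..." rule.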
In addition to being constrained by commodity categories, multi-step paths can reflect more complex relationship dependencies. Specifically, the following multi-step path constraints are mainly considered:
That is, if commodity A and commodity B are in a substitution relationship and commodity B and commodity C are in a substitution relationship, then commodity A and commodity C are in a substitution relationship; if commodity A and commodity B are in a complementary relationship and commodity B and commodity C are in a substitution relationship, then commodity A and commodity C are in a complementary relationship; if commodity A and commodity B are in a substitution relationship and commodity B and commodity C are in a complementary relationship, then commodity A and commodity C are in a complementary relationship; if commodity A and commodity B are in a complementary relationship and commodity B and commodity C are in a complementary relationship, then commodity A and commodity C are in a complementary relationship.
The likelihood that an instance of these constraints holds can be calculated according to the following formula:
$I(f_2) = I(i, r_1, j) \cdot I(j, r_2, k) \cdot I(i, r_3, k) - I(i, r_1, j) \cdot I(j, r_2, k) + 1$
Wherein $(i, r_1, j)$ indicates that a directed edge exists from commodity i to commodity j in the space of relation $r_1$, $(j, r_2, k)$ indicates that a directed edge exists from commodity j to commodity k in the space of relation $r_2$, and $(i, r_3, k)$ indicates that a directed edge exists from commodity i to commodity k in the space of relation $r_3$.
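The four composition rules and the formula for $I(f_2)$ can be written down as a lookup table plus one arithmetic expression. The names `SUB`, `COMP`, `COMPOSE`, and `path_constraint_truth` are hypothetical labels chosen for this sketch.

```python
SUB, COMP = "substitute", "complement"

# (relation of A->B, relation of B->C) -> implied relation of A->C,
# transcribed from the four rules stated above.
COMPOSE = {
    (SUB, SUB): SUB,
    (COMP, SUB): COMP,
    (SUB, COMP): COMP,
    (COMP, COMP): COMP,
}

def path_constraint_truth(t_ij: float, t_jk: float, t_ik: float) -> float:
    """I(f2) = t_ij * t_jk * t_ik - t_ij * t_jk + 1."""
    return t_ij * t_jk * t_ik - t_ij * t_jk + 1.0

implied = COMPOSE[(SUB, COMP)]
value = path_constraint_truth(0.9, 0.8, 0.5)
```

Only the all-substitution path implies a substitution relation; every path that contains a complementary step implies a complementary relation.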
Then, combining the two path constraints with an objective function for distinguishing complementary relations or alternative relations to obtain a combined objective function as follows:
Where $f_c$ denotes a directly connected directed edge when c=0, a commodity category constraint when c=1, and a multi-step path constraint when c=2.
According to the embodiment of the invention, the commodity category constraint and the multi-step path constraint are added under a unified model frame in a fuzzy logic modeling mode, so that the constraints are met as much as possible while the commodity target vector fits the training set, the quality of the commodity target vector is improved, and the relationship between commodities can be distinguished more accurately.
And step 103, inputting the training set in the set into the joint objective function, and carrying out optimization solution on the training set so as to train to obtain a joint objective model and calculate to obtain a likelihood score.
In this step, the set obtained by screening in step 101 is randomly divided into a training set and a verification set. The training set is input into the joint objective function of step 102, which is then optimized, so that a joint objective model is obtained through training and the likelihood scores of commodity relationships are calculated.
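A random division of the screened edges could be sketched as follows. The 80/20 ratio, the fixed seed, and the function name `split_edges` are assumptions; the text only states that the division is random.

```python
import random

def split_edges(edges, validation_fraction=0.2, seed=0):
    """Shuffle the screened edges and split them into (training, verification)."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * validation_fraction)
    return shuffled[:cut], shuffled[cut:]

edges = [
    ("A", "substitute", "B"), ("A", "complement", "C"),
    ("B", "complement", "C"), ("C", "substitute", "D"),
    ("D", "complement", "A"),
]
training_set, verification_set = split_edges(edges)
```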
Specifically, $L(y \mid V, V')$ is added to all $L(F_c \mid V, V', \beta)$ (c=0, c=1, c=2) to form the optimization target, that is:
s.t. $\|v_i\|_2 \le 1$, $\|v'_i\|_2 \le 1$ and $\|r\|_2 \le 1$
Wherein $\alpha_c$ is a weight.
The trained commodity target vector and the relationship mapping vector can meet all targets (namely, the directed edges, commodity category constraint and multi-step path constraint can be accurately fitted).
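The sketch below shows one way such an optimization target could be assembled and how the norm constraints might be enforced. It assumes a weighted sum of the constraint losses with weights $\alpha_c$ and assumes that the constraints are maintained by projecting each vector back onto the unit ball after an update; the text does not say how the constraints are enforced, and all names are hypothetical.

```python
import math

def joint_objective(edge_loss, constraint_losses, alphas):
    """Edge loss plus the weighted sum of L(F_c) terms for c = 0, 1, 2."""
    return edge_loss + sum(a * l for a, l in zip(alphas, constraint_losses))

def project_to_unit_ball(v):
    """Rescale v so that its Euclidean norm is at most 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm <= 1.0 else [x / norm for x in v]

def sgd_step(v, grad, learning_rate):
    """One gradient step followed by projection (an assumed enforcement scheme)."""
    return project_to_unit_ball([x - learning_rate * g for x, g in zip(v, grad)])

total = joint_objective(1.0, [2.0, 3.0, 4.0], [1.0, 0.5, 0.25])
projected = project_to_unit_ball([3.0, 4.0])
updated = sgd_step([0.5, 0.5], [-10.0, 0.0], learning_rate=0.1)
```

Here `total` evaluates to 5.5 and `projected` is the unit vector in the direction of (3, 4).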
Optionally, step 103 specifically includes: extracting multi-step path constraints from the training set by breadth-first search to obtain all multi-step path constraints, and constructing a corresponding training set from them; and inputting the constructed training set into the joint objective function and optimizing the joint objective function by stochastic gradient descent, so that a joint objective model is obtained through training and a likelihood score is obtained through calculation.
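A breadth-first extraction of two-step paths could be sketched as below. It assumes that a multi-step path has exactly two steps, as in the rules given earlier, and that each extracted instance pairs the two observed edges with the implied edge; the data layout and names are hypothetical.

```python
from collections import defaultdict, deque

SUB, COMP = "substitute", "complement"
COMPOSE = {(SUB, SUB): SUB, (COMP, SUB): COMP,
           (SUB, COMP): COMP, (COMP, COMP): COMP}

def extract_path_constraints(triples):
    """Breadth-first search to depth 2 from every head commodity.

    Returns tuples ((i, r1, j), (j, r2, k), (i, r3, k)), where r3 is the
    relation implied by the composition rules."""
    adjacency = defaultdict(list)
    for i, r, j in triples:
        adjacency[i].append((r, j))
    instances = []
    for start in list(adjacency):
        queue = deque([(start, [])])  # (current node, steps taken so far)
        while queue:
            node, path = queue.popleft()
            if len(path) == 2:
                (r1, j), (r2, k) = path
                if k != start:  # skip paths that return to the start
                    instances.append(((start, r1, j), (j, r2, k),
                                      (start, COMPOSE[(r1, r2)], k)))
                continue
            for r, nxt in adjacency.get(node, []):
                queue.append((nxt, path + [(r, nxt)]))
    return instances

triples = [("A", SUB, "B"), ("B", COMP, "C")]
instances = extract_path_constraints(triples)
```

For the two sample triples, the only instance found is the path from A through B to C, which implies a complementary relation from A to C.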
Thus, after the joint objective function is optimized, the target vector and the environment vector of each commodity and the mapping vector of each relation r are obtained, and the likelihood score that relation r holds between commodity i and commodity j (i.e., that the triplet (i, r, j) holds) is then calculated.
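The scoring formula is not reproduced in this text. A plausible sketch, assuming the score is the sigmoid of the inner product of the relation-mapped target vector of commodity i and the relation-mapped environment vector of commodity j, is shown below with hypothetical names and values.

```python
import math

def likelihood_score(v_i, v_ctx_j, beta_r):
    """Assumed score of triple (i, r, j): sigmoid(v_{r,i} . v'_{r,j})."""
    mapped_i = [x + b * x for x, b in zip(v_i, beta_r)]
    mapped_j = [x + b * x for x, b in zip(v_ctx_j, beta_r)]
    s = sum(a * c for a, c in zip(mapped_i, mapped_j))
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

v_i, v_ctx_j = [0.5, 0.5], [0.5, 0.5]
betas = {"substitute": [1.0, 1.0], "complement": [-1.0, -1.0]}
scores = {r: likelihood_score(v_i, v_ctx_j, beta) for r, beta in betas.items()}
```

The same pair of commodities receives a different score under each relation because only the mapping vector changes.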
Step 104, distinguishing the relation between commodities based on the likelihood score.
The likelihood score threshold may be preset, and the relationship between commodities is then determined to be complementary or substitutive by comparing the likelihood score calculated in step 103 with the threshold.
Specifically: the verification set in the set is input into the trained joint objective model, and the likelihood score thresholds corresponding to the substitution relation and the complementary relation are determined respectively from the output; based on these thresholds and the likelihood score calculated in step 103, the relationship between the commodities, i.e., a complementary relationship or a substitution relationship, is distinguished.
Because the appropriate likelihood score threshold differs from one training set to another, adjusting the threshold according to the output on the verification set improves the accuracy of the judgment.
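One simple way to pick a threshold from verification-set output is to try each observed score as a candidate and keep the one with the highest accuracy. The selection criterion is an assumption, since the text does not state how the threshold is derived, and the function names are hypothetical.

```python
def choose_threshold(scores, labels):
    """Return (threshold, accuracy) maximizing accuracy on verification data.

    scores: likelihood scores of triples for one relation;
    labels: 1 if the triple truly holds, otherwise 0."""
    best_threshold, best_accuracy = None, -1.0
    for t in sorted(set(scores)):
        correct = sum((s >= t) == bool(y) for s, y in zip(scores, labels))
        accuracy = correct / len(scores)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = t, accuracy
    return best_threshold, best_accuracy

def classify(score_by_relation, thresholds):
    """Relations whose score reaches the threshold chosen for that relation."""
    return [r for r, s in score_by_relation.items() if s >= thresholds[r]]

threshold, accuracy = choose_threshold([0.2, 0.4, 0.6, 0.9], [0, 0, 1, 1])
predicted = classify({"substitute": 0.7, "complement": 0.3},
                     {"substitute": threshold, "complement": 0.5})
```

Running the selection once per relation yields the separate thresholds for the substitution relation and the complementary relation described above.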
According to the various embodiments described above, it can be seen that the present invention establishes an objective function for distinguishing complementary relations from substitution relations by introducing a mapping vector, and then establishes a joint objective function through category constraints and multi-step path constraints, thereby solving the problems of data sparsity and of ignoring the interdependence among commodity relationships. That is, the prior art 1) does not solve the data sparsity problem well, and 2) considers only the direct relationships between commodities, not the interdependence among relationships. The invention takes commodity category information and commodity multi-step path information into account, establishes an objective function for distinguishing complementary relations from substitution relations by introducing a mapping vector, establishes a joint objective function through category constraints and multi-step path constraints, and obtains likelihood scores through optimization, thereby distinguishing the complementary and substitution relationships of commodities. By combining the category constraint and the multi-step path constraint in one model, the embodiment of the invention effectively alleviates the data sparsity problem and accounts for the interdependence among commodity relationships, so that commodity relationships are distinguished more accurately.
FIG. 2 is a schematic diagram of the main flow of a method of distinguishing merchandise relationships according to one referenceable embodiment of the invention, which may include:
Step 201, screening out a set with directed edges based on a target loss function;
Step 202, establishing an objective function for distinguishing complementary relations or substitution relations by introducing mapping vectors;
Step 203, combining the likelihood that the category constraint holds and the likelihood that the multi-step path constraint holds with the objective function for distinguishing complementary relations or substitution relations, to obtain a joint objective function;
Step 204, extracting multi-step path constraints from the training set using breadth-first search to obtain all multi-step path constraints, thereby constructing a corresponding training set;
Step 205, inputting the constructed training set into the joint objective function, and optimizing the joint objective function by stochastic gradient descent, so as to train a joint objective model and calculate a likelihood score;
Step 206, inputting the verification set in the set into the trained joint objective model, and determining, according to the output result, the likelihood score thresholds corresponding to the substitution relation and the complementary relation, respectively;
Step 207, distinguishing the relation between commodities based on the likelihood score thresholds and the likelihood score.
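The breadth-first extraction of multi-step paths in step 204 can be sketched as follows. This is an illustrative sketch only: the edge representation (head, relation, tail), the function name multi_step_paths, the restriction to two-edge paths, and the sample commodities are assumptions not taken from the embodiment.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (head commodity, relation, tail commodity)


def multi_step_paths(edges: List[Edge], max_depth: int = 2) -> Set[Tuple[Edge, ...]]:
    """Breadth-first enumeration of directed paths with exactly `max_depth` edges."""
    out_edges: Dict[str, List[Edge]] = defaultdict(list)
    for head, rel, tail in edges:
        out_edges[head].append((head, rel, tail))

    found: Set[Tuple[Edge, ...]] = set()
    for start in list(out_edges):
        queue = deque([(start, ())])  # (current commodity, path so far)
        while queue:
            node, path = queue.popleft()
            if len(path) == max_depth:
                found.add(path)
                continue
            # Commodities already on the path; revisits are skipped so that
            # i, j and k stay distinct.
            visited = {start} | {e[2] for e in path}
            for edge in out_edges.get(node, []):
                if edge[2] not in visited:
                    queue.append((edge[2], path + (edge,)))
    return found


# Hypothetical directed edges between commodities.
edges = [
    ("phone", "complement", "case"),
    ("case", "substitute", "cover"),
    ("phone", "complement", "cover"),
]
paths = multi_step_paths(edges)
```

Each extracted pair of edges (i, r1, j), (j, r2, k) can then be paired with a candidate edge (i, r3, k) to form one multi-step path constraint in the constructed training set.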
In addition, in the embodiment of the present invention, the specific implementation of the method for distinguishing the commodity relationship is described in detail in the above method for distinguishing the commodity relationship, so the description is not repeated here.
Fig. 3 is a schematic diagram of main modules of an apparatus for distinguishing commodity relations according to an embodiment of the present invention, and as shown in fig. 3, the apparatus 300 for distinguishing commodity relations includes a screening module 301, a constraint module 302, a solving module 303, and a distinguishing module 304. Wherein, the screening module 301 screens out the set with directed edges based on the objective loss function; the constraint module 302 establishes an objective function for distinguishing complementary relations or alternative relations by introducing mapping vectors, and further establishes a combined objective function by category constraint and multi-step path constraint; the solving module 303 inputs the training set in the set to the joint objective function, and performs optimization solving on the training set, so that a joint objective model is obtained through training and a likelihood score is obtained through calculation; the differentiating module 304 differentiates relationships between items based on the likelihood scores.
Optionally, the solving module 303 extracts multi-step path constraints from the training set using breadth-first search to obtain all multi-step path constraints, thereby constructing a corresponding training set; and inputs the constructed training set into the joint objective function and optimizes the joint objective function by stochastic gradient descent, so that a joint objective model is obtained through training and a likelihood score is obtained through calculation.
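The stochastic gradient descent optimization can be illustrated with the following minimal sketch. The scoring form (a sigmoid of the mapping-vector-weighted elementwise product of two commodity vectors), the logistic loss, the hyperparameters, and all names are assumptions for illustration; the sketch trains only on directly connected edges and omits the category and multi-step path terms of the joint objective function.

```python
import math
import random
from typing import Dict, List, Tuple

Sample = Tuple[str, str, str, int]  # (commodity i, relation r, commodity j, label)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def likelihood(x_i: List[float], x_j: List[float], beta: List[float]) -> float:
    """Assumed scoring form: sigmoid of the beta-weighted elementwise product."""
    return sigmoid(sum(b * a * c for b, a, c in zip(beta, x_i, x_j)))


def sgd_train(samples: List[Sample], items: List[str], relations: List[str],
              dim: int = 2, lr: float = 0.5, epochs: int = 300, seed: int = 0):
    rng = random.Random(seed)
    # Constant initial values keep this sketch deterministic apart from the
    # shuffle; real training would normally use random initialization.
    emb: Dict[str, List[float]] = {item: [0.1] * dim for item in items}
    beta: Dict[str, List[float]] = {rel: [0.1] * dim for rel in relations}
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in random order each epoch
        for i, r, j, label in order:
            x_i, x_j, b = emb[i], emb[j], beta[r]
            g = likelihood(x_i, x_j, b) - label  # d(logistic loss)/d(logit)
            for d in range(dim):
                # Compute all three gradients before applying any update.
                grad_b = g * x_i[d] * x_j[d]
                grad_i = g * b[d] * x_j[d]
                grad_j = g * b[d] * x_i[d]
                b[d] -= lr * grad_b
                x_i[d] -= lr * grad_i
                x_j[d] -= lr * grad_j
    return emb, beta


# Hypothetical samples: one observed edge (label 1) and one negative sample (label 0).
samples = [("phone", "complement", "case", 1), ("phone", "complement", "laptop", 0)]
emb, beta = sgd_train(samples, ["phone", "case", "laptop"], ["complement"])
pos = likelihood(emb["phone"], emb["case"], beta["complement"])
neg = likelihood(emb["phone"], emb["laptop"], beta["complement"])
```

After training, the observed edge receives a higher likelihood score than the negative sample, which is the property the subsequent thresholding relies on.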
Optionally, the differentiating module 304 inputs the verification set in the set into the trained joint target model, and determines the likelihood score threshold corresponding to the replacement relationship and the complementary relationship according to the output result respectively; based on the likelihood score threshold and the likelihood score, relationships between the items are distinguished.
Optionally, establishing the joint objective function through category constraints and multi-step path constraints includes: combining the likelihood that the category constraint holds and the likelihood that the multi-step path constraint holds with the objective function for distinguishing complementary relations or substitution relations, to obtain the joint objective function.
Optionally, the likelihood that the category constraint holds is described as:
I(f1)=I(i,r1,j)·I(i,r1,k)-I(i,r1,j)+1
Wherein commodity j and commodity k are in the same commodity category, (i, r1, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r1, and (i, r1, k) indicates that a directed edge exists between commodity i and commodity k in the space of relation r1;
the likelihood that the multi-step path constraint holds is described as:
I(f2)=I(i,r1,j)·I(j,r2,k)·I(i,r3,k)-I(i,r1,j)·I(j,r2,k)+1
Wherein (i, r1, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r1, (j, r2, k) indicates that a directed edge exists between commodity j and commodity k in the space of relation r2, and (i, r3, k) indicates that a directed edge exists between commodity i and commodity k in the space of relation r3;
The objective function for distinguishing the complementary relation or the alternative relation is described as follows:
Wherein β is the mapping vector for a relation r, R is the set of all relations r, ε_E is the set of all directed edges, P_n(w0) ~ 1/|{(i, r, j)}|^(3/4), and (i, r, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r;
the joint objective function is described as:
Where f_c denotes a directly connected directed edge when c=0, a commodity category constraint when c=1, and a multi-step path constraint when c=2.
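The two constraint likelihoods can be evaluated directly from edge likelihoods, as in the following minimal sketch. Both expressions have the form premise·conclusion − premise + 1, which equals 1 when the premise is false and equals the conclusion when the premise is fully true. The multi-step path form follows the expression for I(f2) recited in claim 1; the function names and input values are illustrative assumptions.

```python
def soft_implication(premise: float, conclusion: float) -> float:
    """Value of premise * conclusion - premise + 1 for likelihoods in [0, 1]."""
    return premise * conclusion - premise + 1.0


def category_constraint(p_ij: float, p_ik: float) -> float:
    """I(f1): if (i, r1, j) holds and j, k share a category, (i, r1, k) should hold."""
    return soft_implication(p_ij, p_ik)


def path_constraint(p_ij: float, p_jk: float, p_ik: float) -> float:
    """I(f2): if (i, r1, j) and (j, r2, k) both hold, (i, r3, k) should hold."""
    return soft_implication(p_ij * p_jk, p_ik)


# Hypothetical edge likelihoods.
satisfied = category_constraint(1.0, 1.0)    # premise and conclusion both hold
violated = path_constraint(1.0, 1.0, 0.0)    # path exists but (i, r3, k) does not
partial = path_constraint(0.5, 0.5, 0.5)     # uncertain edges
```

A constraint that is fully satisfied evaluates to 1 and a fully violated one to 0, so these values can enter the joint objective function alongside the likelihoods of directly connected edges.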
Optionally, the screening module 301 uses logistic regression to obtain the target loss function:
Where N is the number of negative samples, P_n(w) is the unigram distribution U(w) ~ 1/|{(i, j) ∈ ε_P}| raised to the 3/4 power, and ε_P is the set of all directed edges;
inputting commodity data into the target loss function, and screening out a commodity set with directed edges.
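The negative-sampling distribution raised to the 3/4 power can be sketched as follows. Counting how often each commodity appears as the tail of a directed edge is an assumption of this sketch, as are the function names and sample data; the embodiment does not state which frequency the unigram distribution is built from.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def noise_distribution(edges: List[Tuple[str, str]], power: float = 0.75) -> Dict[str, float]:
    """Normalize tail-commodity counts raised to `power` into a distribution."""
    counts = Counter(tail for _, tail in edges)
    weights = {item: count ** power for item, count in counts.items()}
    total = sum(weights.values())
    return {item: weight / total for item, weight in weights.items()}


def sample_negatives(dist: Dict[str, float], n: int, rng: random.Random) -> List[str]:
    """Draw n negative commodities according to the noise distribution."""
    items = sorted(dist)
    return rng.choices(items, weights=[dist[item] for item in items], k=n)


# Hypothetical edge set: "b" is a tail 16 times, "c" once.
edges = [("a", "b")] * 16 + [("a", "c")]
dist = noise_distribution(edges)
negatives = sample_negatives(dist, 5, random.Random(0))
```

Raising the counts to the 3/4 power flattens the distribution: here "b" is 16 times as frequent as "c" but only 8 times as likely to be drawn as a negative sample.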
According to the various embodiments described above, the present invention establishes an objective function for distinguishing complementary relations or substitution relations by introducing a mapping vector, and then establishes a joint objective function through category constraints and multi-step path constraints, thereby addressing the problems of data sparsity and of failing to consider interdependencies between commodities. That is, the prior art 1) does not solve the data sparsity problem well; and 2) considers only the direct relationship between commodities, without considering the interdependence between relationships. The invention takes commodity category information and commodity multi-step path information into account, establishes an objective function for distinguishing complementary relations or substitution relations by introducing a mapping vector, establishes a joint objective function through category constraints and multi-step path constraints, and obtains a likelihood score through optimization, thereby distinguishing the complementary relations and substitution relations of commodities. By incorporating the category constraints and the multi-step path constraints into the model, the embodiment of the invention effectively alleviates the data sparsity problem and accounts for the interdependence between commodity relationships, thereby distinguishing commodity relationships more accurately.
The specific implementation of the device for distinguishing commodity relationships according to the present invention is described in detail in the method for distinguishing commodity relationships above, and thus the description is not repeated here.
Fig. 4 illustrates an exemplary system architecture 400 to which the method of distinguishing merchandise relationships or the apparatus of distinguishing merchandise relationships of embodiments of the present invention may be applied.
As shown in fig. 4, the system architecture 400 may include terminal devices 401, 402, 403, a network 404, and a server 405. The network 404 is used as a medium to provide communication links between the terminal devices 401, 402, 403 and the server 405. The network 404 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 405 via the network 404 using the terminal devices 401, 402, 403 to receive or send messages or the like. Various communication client applications, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 401, 402, 403.
The terminal devices 401, 402, 403 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 405 may be a server providing various services, such as a background management server (by way of example only) that supports shopping websites browsed by users on the terminal devices 401, 402, 403. The background management server may analyze and otherwise process received data such as a product information query request, and feed the processing result (e.g., target push information or product information, by way of example only) back to the terminal device.
It should be noted that, the method for distinguishing the commodity relationship provided in the embodiment of the present invention is generally executed on the server 405, and accordingly, the device for distinguishing the commodity relationship is generally disposed in the server 405.
It should be understood that the number of terminal devices, networks and servers in fig. 4 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 5, there is illustrated a schematic diagram of a computer system 500 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 5 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the system 500 are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. The drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as needed so that a computer program read therefrom is installed into the storage section 508 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 509, and/or installed from the removable media 511. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 501.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, which may be described, for example, as: a processor including a screening module, a constraint module, a solving module, and a distinguishing module, wherein the names of these modules do not, in some cases, constitute a limitation on the modules themselves.
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform operations including: screening out a set with directed edges based on a target loss function; establishing an objective function for distinguishing complementary relations or substitution relations by introducing a mapping vector, and then establishing a joint objective function through category constraints and multi-step path constraints; inputting the training set in the set into the joint objective function and optimizing it, so as to train a joint objective model and calculate a likelihood score; and distinguishing the relation between commodities based on the likelihood score.
According to the technical scheme provided by the embodiment of the invention, an objective function for distinguishing complementary relations or substitution relations is established by introducing a mapping vector, and a joint objective function is then established through category constraints and multi-step path constraints, thereby addressing the technical problems of data sparsity and of failing to consider interdependencies between commodities. The invention takes commodity category information and commodity multi-step path information into account, establishes an objective function for distinguishing complementary relations or substitution relations by introducing a mapping vector, establishes a joint objective function through category constraints and multi-step path constraints, and obtains a likelihood score through optimization, thereby distinguishing the complementary relations and substitution relations of commodities. By incorporating the category constraints and the multi-step path constraints into the model, the embodiment of the invention effectively alleviates the data sparsity problem and accounts for the interdependence between commodity relationships, thereby distinguishing commodity relationships more accurately.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.
Claims (10)
1. A method of differentiating between merchandise relationships comprising:
screening out a set with directed edges based on a target loss function;
By introducing a mapping vector, an objective function for distinguishing complementary relations or alternative relations is established, and then a combined objective function is established by category constraint and multi-step path constraint;
Inputting the training set in the set into the combined objective function, and carrying out optimization solution on the training set so as to train to obtain a combined objective model and calculate to obtain a likelihood score;
distinguishing relationships between goods based on the likelihood scores;
Establishing a joint objective function through category constraint and multi-step path constraint, including:
combining the possibility of establishment of category constraint and the possibility of establishment of multi-step path constraint with an objective function for distinguishing complementary relation or alternative relation to obtain a combined objective function;
The likelihood that the category constraint holds is described as:
I(f1)=I(i,r1,j)·I(i,r1,k)-I(i,r1,j)+1
Wherein commodity j and commodity k are in the same commodity category, (i, r1, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r1, and (i, r1, k) indicates that a directed edge exists between commodity i and commodity k in the space of relation r1;
the likelihood that the multi-step path constraint holds is described as:
I(f2)=I(i,r1,j)·I(j,r2,k)·I(i,r3,k)-I(i,r1,j)·I(j,r2,k)+1
Wherein (i, r1, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r1, (j, r2, k) indicates that a directed edge exists between commodity j and commodity k in the space of relation r2, and (i, r3, k) indicates that a directed edge exists between commodity i and commodity k in the space of relation r3;
The objective function for distinguishing the complementary relation or the alternative relation is described as follows:
Wherein β is the mapping vector for a relation r, R is the set of all relations r, ε_E is the set of all directed edges, P_n(w0) ~ 1/|{(i, r, j)}|^(3/4), and (i, r, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r;
the joint objective function is described as:
Where f_c denotes a directly connected directed edge when c=0, a commodity category constraint when c=1, and a multi-step path constraint when c=2.
2. The method of claim 1, wherein inputting the training set of the set into the joint objective function, performing an optimization solution on the training set to obtain a joint objective model, and calculating to obtain a likelihood score, comprises:
Extracting multi-step path constraints from the training set using breadth-first search to obtain all multi-step path constraints, thereby constructing a corresponding training set;
and inputting the constructed training set into the joint objective function, and optimizing the joint objective function by stochastic gradient descent, so that a joint objective model is obtained through training and a likelihood score is obtained through calculation.
3. The method of claim 1, wherein distinguishing relationships between items of merchandise based on the likelihood score comprises:
Inputting the verification set in the set into the trained joint objective model, and determining, according to an output result, the likelihood score thresholds corresponding to the substitution relation and the complementary relation, respectively;
based on the likelihood score threshold and the likelihood score, relationships between the items are distinguished.
4. The method of claim 1, wherein screening out the set of existing directed edges based on the objective loss function comprises:
and obtaining a target loss function by using logistic regression:
Where N is the number of negative samples, P_n(w) is the unigram distribution U(w) ~ 1/|{(i, j) ∈ ε_P}| raised to the 3/4 power, and ε_P is the set of all directed edges;
inputting commodity data into the target loss function, and screening out a commodity set with directed edges.
5. An apparatus for differentiating between merchandise items, comprising:
the screening module is used for screening out a set with directed edges based on the target loss function;
the constraint module is used for establishing an objective function for distinguishing complementary relations or alternative relations by introducing mapping vectors, and further establishing a combined objective function by category constraint and multi-step path constraint;
The solving module is used for inputting the training set in the set into the joint objective function, and carrying out optimization solving on the training set so as to train to obtain a joint objective model and calculate to obtain a likelihood score;
a distinguishing module for distinguishing a relationship between goods based on the likelihood score;
Establishing a joint objective function through category constraint and multi-step path constraint, including:
combining the possibility of establishment of category constraint and the possibility of establishment of multi-step path constraint with an objective function for distinguishing complementary relation or alternative relation to obtain a combined objective function;
The likelihood that the category constraint holds is described as:
I(f1)=I(i,r1,j)·I(i,r1,k)-I(i,r1,j)+1
Wherein commodity j and commodity k are in the same commodity category, (i, r1, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r1, and (i, r1, k) indicates that a directed edge exists between commodity i and commodity k in the space of relation r1;
the likelihood that the multi-step path constraint holds is described as:
I(f2)=I(i,r1,j)·I(j,r2,k)·I(i,r3,k)-I(i,r1,j)·I(j,r2,k)+1
Wherein (i, r1, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r1, (j, r2, k) indicates that a directed edge exists between commodity j and commodity k in the space of relation r2, and (i, r3, k) indicates that a directed edge exists between commodity i and commodity k in the space of relation r3;
The objective function for distinguishing the complementary relation or the alternative relation is described as follows:
Wherein β is the mapping vector for a relation r, R is the set of all relations r, ε_E is the set of all directed edges, P_n(w0) ~ 1/|{(i, r, j)}|^(3/4), and (i, r, j) indicates that a directed edge exists between commodity i and commodity j in the space of relation r;
the joint objective function is described as:
Where f_c denotes a directly connected directed edge when c=0, a commodity category constraint when c=1, and a multi-step path constraint when c=2.
6. The apparatus of claim 5, wherein the solution module is configured to:
Extracting multi-step path constraints from the training set using breadth-first search to obtain all multi-step path constraints, thereby constructing a corresponding training set;
and inputting the constructed training set into the joint objective function, and optimizing the joint objective function by stochastic gradient descent, so that a joint objective model is obtained through training and a likelihood score is obtained through calculation.
7. The apparatus of claim 5, wherein the differentiating module is configured to:
Inputting the verification set in the set into the trained joint objective model, and determining, according to an output result, the likelihood score thresholds corresponding to the substitution relation and the complementary relation, respectively;
based on the likelihood score threshold and the likelihood score, relationships between the items are distinguished.
8. The apparatus of claim 5, wherein the screening module is to:
and obtaining a target loss function by using logistic regression:
Where N is the number of negative samples, P_n(w) is the unigram distribution U(w) ~ 1/|{(i, j) ∈ ε_P}| raised to the 3/4 power, and ε_P is the set of all directed edges;
inputting commodity data into the target loss function, and screening out a commodity set with directed edges.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.
10. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810892271.0A CN110827045B (en) | 2018-08-07 | 2018-08-07 | Method and device for distinguishing commodity relationship |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110827045A CN110827045A (en) | 2020-02-21 |
CN110827045B CN110827045B (en) | 2024-08-20 |
Family
ID=69533855
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810892271.0A Active CN110827045B (en) | 2018-08-07 | 2018-08-07 | Method and device for distinguishing commodity relationship |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110827045B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||