CN105138572A - Method and device for obtaining correlation weight of user tag - Google Patents

Info

Publication number
CN105138572A
Authority
CN
China
Prior art keywords
described user
user
users
user behavior
media content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510446007.0A
Other languages
Chinese (zh)
Other versions
CN105138572B (en)
Inventor
杨帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201510446007.0A
Publication of CN105138572A
Application granted
Publication of CN105138572B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for obtaining the correlation weight of a user tag. The method comprises: building a tag database that contains tags and the category information corresponding to each tag; tallying user behavior and extracting user behavior events, where each user behavior event includes a user tag, that is, a tag in the tag database that is associated with the user behavior; and determining the correlation weight of the user tag on the basis of the user behavior events. With this scheme, the user tags associated with user behavior, and the weight corresponding to each user tag, can be obtained accurately.

Description

Method and apparatus for obtaining the correlation weight of a user tag
Technical field
The present disclosure relates generally to data analysis technology, specifically to data analysis based on user behavior, and in particular to a method and apparatus for obtaining the correlation weight of a user tag.
Background
On the internet, a user's behavior is often associated with content the user is interested in. For example, a user who is interested in particular brands, institutions, or merchants may follow their microblog accounts, forward the content they publish to friends, or comment on their microblog posts when they are published.
At present, there are generally three approaches to accurately obtaining such behavioral preferences of a user:
a) Machine learning: samples of users or user groups are collected, feature extraction and machine learning are performed on the sampled users' behavior, and the resulting model is used to obtain user behavioral interest tags.
b) Text-based keyword extraction: user behavioral interest tags are built through keyword extraction, a user relationship graph is built from user interactions, and a method similar to PageRank is then used to discover user behavioral interests.
c) Topic models: an algorithm based on a topic model (Latent Dirichlet Allocation, LDA) is used to discover user behavioral interests by mining user relationship information and user tag information.
However, the prior art described above has the following shortcomings:
For approach a), the limitation is that it is difficult to collect samples of users that carry commercial tags.
For approach b), although no samples need to be collected and the accuracy is acceptable when mining users' broad interests, the accuracy is low for tags that do not themselves propagate strongly, which easily leads to misjudgment.
For approach c), when the user behavior has weak topicality, it is difficult to classify seed words by topic.
Summary of the invention
In view of the above defects or deficiencies in the prior art, it is desirable to provide a method and apparatus for obtaining the correlation weight of a user tag that can accurately obtain the user tags associated with user behavior by tallying that behavior.
In a first aspect, embodiments of the present application provide a method for obtaining the correlation weight of a user tag, comprising: building a tag database, the tag database comprising tags and category information corresponding to the tags; tallying user behavior and extracting user behavior events, wherein each user behavior event comprises a user tag, and the user tag is a tag in the tag database that is associated with the user behavior; and determining the correlation weight of the user tag based on the user behavior events.
In a second aspect, embodiments of the present application further provide an apparatus for obtaining the correlation weight of a user tag, comprising: a creation module configured to build a tag database, the tag database comprising tags and category information corresponding to the tags; an extraction module configured to tally user behavior and extract user behavior events, wherein each user behavior event comprises a user tag, and the user tag is a tag in the tag database that is associated with the user behavior; and a determination module configured to determine the correlation weight of the user tag based on the user behavior events.
The scheme provided by the embodiments of the present application can accurately obtain the user tags associated with user behavior and the weight corresponding to each user tag.
In some implementations of the present application, the weight of the corresponding user tag can also be calculated separately for each category of user behavior, and the weights for the individual user behaviors are then superposed to obtain the weight of the user tag.
In some implementations of the present application, the obtained user behavior weights can also be corrected on the basis of prior data, so that the final user tag weights agree more closely with the user's actual preferences.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 shows a schematic flowchart of a method for obtaining the correlation weight of a user tag according to an embodiment of the present application;
Fig. 2 shows a schematic diagram of determining the correlation weight of the user tag based on user behavior events in Fig. 1;
Fig. 3 shows a schematic diagram of an apparatus for obtaining the correlation weight of a user tag according to an embodiment of the present application.
Detailed description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments in the present application and the features in those embodiments may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows a schematic flowchart of a method for obtaining the correlation weight of a user tag according to an embodiment of the present application.
Specifically, in step 110, a tag database is built; the tag database contains tags and the category information corresponding to each tag.
In some implementations, tags that meet preset conditions can be obtained from the relevant self-media platforms and added to the tag database. Self-media platforms may include, for example, microblogs, WeChat official accounts, WeChat Moments, and the forums of various websites. When obtaining tags, for example, the verified official accounts on these self-media platforms that represent brands, institutions, or merchants can be used as tags and added to the tag database.
In addition, the names of tags may coincide. For example, the tag "golf" may denote a car model or the ball game of golf. Therefore, in some implementations, corresponding category information can be added for each tag when the tag database is built, which makes the category of the tag explicit, removes tag ambiguity, and avoids confusion between tags.
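The role of category information in step 110 can be illustrated with a minimal sketch. The `Tag` and `TagDatabase` names and the in-memory set are hypothetical and chosen only for illustration; the patent does not specify how the tag database is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """A tag plus the category information that disambiguates it."""
    name: str
    category: str


class TagDatabase:
    """Hypothetical in-memory tag database keyed by (name, category)."""

    def __init__(self):
        self._tags = set()

    def add(self, name, category):
        """Store a tag together with its category and return it."""
        tag = Tag(name, category)
        self._tags.add(tag)
        return tag

    def lookup(self, name):
        """Return every tag sharing this name, sorted by category."""
        return sorted((t for t in self._tags if t.name == name),
                      key=lambda t: t.category)


db = TagDatabase()
db.add("golf", "car model")
db.add("golf", "ball game")
```

Because the category is part of the key, the two "golf" tags stay distinct, and a lookup by name returns both candidates so that a later step can pick the one matching the behavior's context.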
Next, in step 120, user behavior is tallied and user behavior events are extracted.
In some implementations, a user behavior event may include, for example, a user tag, where the user tag is a tag in the tag database that is associated with the user behavior.
In other implementations, in addition to the user tag, a user behavior event may also include at least one of a user name, the time at which the behavior occurred, and the user behavior category.
For example, if user U performs, at time T1, a behavior C1 that is associated with tag A in the tag database, the user behavior event for this behavior can be described by the four-tuple (U, T1, C1, A).
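The four-tuple above maps naturally onto a small record type. The following is a sketch only: the field names, the YYYYMMDD integer encoding of the time, and the category label "follow" are assumptions made for illustration.

```python
from typing import NamedTuple


class BehaviorEvent(NamedTuple):
    """One user behavior event, mirroring the four-tuple (U, T1, C1, A)."""
    user: str       # U: user name
    time: int       # T1: time of occurrence (assumed YYYYMMDD encoding)
    category: str   # C1: user behavior category (label is hypothetical)
    tag: str        # A: associated tag from the tag database


event = BehaviorEvent(user="U", time=20140508, category="follow", tag="A")
```

A named tuple keeps the event comparable to the plain tuple form while making each field addressable by name, for example `event.tag`.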
Next, in step 130, the correlation weight of the user tag is determined based on the user behavior events.
For example, the correlation weight of the user tag can be determined from at least one of the category of the user behavior, the time at which the user behavior occurred, and the number of times it occurred.
In some implementations, the user behavior categories may include, for example, at least one of the following:
The user follows another user of a self-media platform. For example, if the other user that user U follows is the official account of a certain brand, U's act of following can be regarded as creating an association with the tag corresponding to that brand, so that tag becomes a user tag of user U.
A user that the user follows publishes self-media content. For example, if self-media content published by another user K, whom user U follows, contains content related to some tag g in the tag database, user U can also be regarded as associated with that tag, so the tag becomes a user tag of user U.
The user publishes self-media content. For example, if self-media content published by user U mentions a certain brand, user U can also be regarded as associated with the tag corresponding to that brand, so the tag becomes a user tag of user U.
The user comments on self-media content published by another user of a self-media platform. For example, if user U mentions a certain brand when commenting on self-media content published by another user, user U can also be regarded as associated with the tag corresponding to that brand, so the tag becomes a user tag of user U.
The user forwards self-media content published by another user of a self-media platform. For example, if user U mentions a certain brand when forwarding self-media content published by another user, user U can also be regarded as associated with the tag corresponding to that brand, so the tag becomes a user tag of user U.
In some implementations, determining the correlation weight of the user tag based on the user behavior events may include, for example, step 131: taking the superposition of the weights of the individual user behaviors associated with the user tag as the correlation weight of the user tag.
Fig. 2 shows a schematic diagram 200 of step 130 in Fig. 1, that is, determining the correlation weight of the user tag based on user behavior events.
Fig. 2 illustrates taking the superposition of the weights of the individual user behaviors associated with the user tag as the correlation weight of the user tag.
For example, in 201, the other self-media platform user that the user follows is the official account of a certain brand, and the correlation $f_1$ (211) between the user and the tag corresponding to this brand can be calculated from this behavior.
Similarly, in 202, self-media content published by another user whom user U follows contains content related to some tag g in the tag database, and the correlation $f_2$ (212) between user U and tag g can be calculated from this behavior.
Similarly, in 203, self-media content published by the user mentions a certain brand, and the correlation $f_3$ (213) between the user and the tag corresponding to this brand can be calculated from this behavior.
Similarly, in 204, the user's comment on self-media content published by another self-media platform user mentions a certain brand, and the correlation $f_4$ (214) between the user and the tag corresponding to this brand can be calculated from this behavior.
Similarly, in 205, self-media content that the user forwards from another self-media platform user mentions a certain brand, and the correlation $f_5$ (215) between the user and the tag corresponding to this brand can be calculated from this behavior.
After the correlations $f_1$ to $f_5$ of all categories of user behavior associated with a given user tag have been obtained, they can be accumulated (220) to obtain the final correlation weight (230) between this user tag and the user.
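The accumulation step (220) is a plain sum over the per-category values. A minimal sketch follows; the dictionary keys and the numeric values are made up for illustration and are not taken from the patent.

```python
def correlation_weight(behavior_weights):
    """Superpose the per-category weights f1..f5 into one tag weight."""
    return sum(behavior_weights.values())


# Illustrative per-category weights for one (user, tag) pair.
weights = {"f1": 0.5, "f2": 0.25, "f3": 0.75, "f4": -0.5, "f5": 0.0}
total = correlation_weight(weights)
```

Note that an individual term can be negative, since the comment and forward weights are multiplied by a sentiment flag that may be -1, so the sum can be pulled down by negative behavior toward the tag.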
In some implementations, when the user behavior includes the user following another user of a self-media platform, the weight $f_1$ of this user behavior can be calculated by formula (1):
$$f_1(g) = (C_g + \mathrm{TFIDF}_g) \times T_g / 2 \tag{1}$$
Here, $C_g$ is the multi-level weight. Its value is a non-negative real number, and it is larger when the user has set category information for the followed self-media platform user than when the user has not.
For example, when user U has set category information for the official account of a brand that U follows, $C_g$ can be set to 1; if no category information has been set for this official account, $C_g$ can be 0.
No specific restriction is placed on the category information here. As long as user U has set category information for this official account, U's attention to it can be considered higher than U's attention to followed accounts without category information, and the correlation weight between the tag corresponding to this official account and the user is correspondingly higher.
$\mathrm{TFIDF}_g$ characterizes the follow relationship of all users in the sample to tag g. In some implementations, it can be calculated by formula (2):
$$\mathrm{TFIDF}_g = \mathrm{countUser}(g) / \lg(\mathrm{totalUser} + 0.01) \tag{2}$$
Here, totalUser is the number of users in the sample, countUser(g) is the number of users in the sample associated with user tag g, and lg denotes the base-10 logarithm.
$T_g$ is the time weight. In some implementations, it can be calculated by formula (3):
$$T_g = (T_c - T_0) / (T_{now} - T_0) \tag{3}$$
Here, $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date.
For example, suppose the user followed the official account corresponding to tag g on May 8, 2014, the preset start date is January 1, 2009, and the current date is July 20, 2015. Then, in some implementations, $T_0 = 20090101$, $T_c = 20140508$, and $T_{now} = 20150720$.
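Formulas (1) to (3) can be sketched directly in code. The function names are hypothetical, and the sketch assumes, following the example above, that dates are encoded as YYYYMMDD integers and subtracted as plain numbers; an implementation could equally use true day differences, which the text does not rule out.

```python
import math


def tfidf(count_user, total_user):
    """Formula (2): countUser(g) / lg(totalUser + 0.01)."""
    return count_user / math.log10(total_user + 0.01)


def time_weight(t_c, t_0, t_now):
    """Formula (3), with dates given as YYYYMMDD integers (an assumption)."""
    return (t_c - t_0) / (t_now - t_0)


def f1(c_g, tfidf_g, t_g):
    """Formula (1): (C_g + TFIDF_g) * T_g / 2."""
    return (c_g + tfidf_g) * t_g / 2


# Dates from the example: followed on 2014-05-08, start date 2009-01-01,
# current date 2015-07-20.
t_g = time_weight(20140508, 20090101, 20150720)
```

With these inputs the time weight is 50407 / 60619, a little over 0.83, so a more recent follow yields a value closer to 1 and an older one a value closer to 0.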
In some implementations, when the user behavior includes a user followed by the user publishing self-media content, the weight $f_2$ of the user behavior can be calculated by formula (4):
$$f_2(g) = \mathrm{close\_f_2}(g) \times \mathrm{credibility\_f_2}(g) \tag{4}$$
Here, $\mathrm{close\_f_2}(g)$ is the closeness between the user and the other user that the user follows. For example, if the number and/or frequency of interactions between user A and user B exceeds a preset value, user A and user B can be considered close. In that case, when one of them (for example, user B) publishes self-media content containing tag g, user A can also be considered to have produced a user behavior associated with tag g.
For example, if the number and/or frequency of interactions between user A and user B exceeds a preset value, user B can be considered a close friend of user A, that is, user B belongs to user A's close-friend set CO. The value of $\mathrm{close\_f_2}(g)$ can then be obtained from formula (5):
[Formula (5) is an image in the original publication and is not reproduced in this text.]
$\mathrm{credibility\_f_2}(g)$ is the credibility of the other user. For example, when one or more specific parameters of user B's behavior exceed a threshold, user B's credibility can be considered high. In some implementations, for example, user B's credibility can be considered high when B's behavior meets at least one of the following conditions: the amount of self-media content published by user B exceeds a predetermined amount; the number of other self-media platform users that user B follows exceeds a predetermined number of users; the number of other self-media platform users who follow user B exceeds a predetermined number of users; the number of other self-media platform users who mutually follow user B exceeds a predetermined number of users; and so on.
In these implementations, $\mathrm{credibility\_f_2}(g)$ can be calculated by formula (6):
[Formula (6) is an image in the original publication and is not reproduced in this text.]
Here, b can be a preset behavioral parameter of user B's behavior, and max is the predetermined threshold corresponding to that parameter.
In some implementations, when the user behavior includes the user publishing self-media content, the weight of the user behavior can be calculated by formula (7):
$$f_3(g) = (L_g \times T_g + At_g) / 2 \tag{7}$$
Here, $L_g$ is the sentiment-orientation flag. For example, $L_g$ is 1 when the self-media content published by the user has a positive sentiment orientation and -1 when it has a negative one. In some implementations, an existing sentiment-mining algorithm can be used, for example, to obtain the sentiment orientation of the content the user publishes.
$T_g$ is the time weight, whose value can be calculated with formula (3) above.
$At_g$ is the key-attention flag: it is 1 when the self-media content published by the user gives key attention to the user tag, and 0 otherwise. In some implementations, for example, $At_g$ can be set to 1 when the content contains a symbol indicating key attention (for example, content the user posts on a microblog contains the "@" symbol immediately followed by the official account corresponding to tag g); conversely, when the content contains no such symbol, $At_g$ can be set to 0.
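Formula (7) and the key-attention flag can be sketched as follows. The function names are hypothetical, and the detection rule is an assumption: it treats an "@" immediately followed by the official account name as the key-attention symbol, which is the usual microblog mention convention. Sentiment detection is left out and its result is passed in as $L_g$.

```python
def key_attention_flag(post, official_account):
    """At_g: 1 if the post @-mentions the tag's official account, else 0.

    Assumes the key-attention symbol is "@" followed by the account name.
    """
    return 1 if "@" + official_account in post else 0


def f3(l_g, t_g, at_g):
    """Formula (7): (L_g * T_g + At_g) / 2, with L_g in {1, -1}."""
    return (l_g * t_g + at_g) / 2


# Hypothetical post and account name, for illustration only.
at_g = key_attention_flag("Loving the new release from @BrandG", "BrandG")
positive = f3(1, 0.5, at_g)
negative = f3(-1, 0.5, 0)
```

A positive post with a mention and a time weight of 0.5 gives 0.75, while a negative post without a mention gives -0.25, so the sign of the sentiment flag decides whether the time weight adds to or subtracts from the result.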
In some implementations, when the user behavior includes the user commenting on self-media content published by another user of a self-media platform, the weight of the user behavior can be calculated by formula (8):
$$f_4(g) = L_g \times IR_g \times (T_g + At_g) / 2 \tag{8}$$
$L_g$ is the sentiment-orientation flag: it is 1 when the self-media content published by the user has a positive sentiment orientation and -1 when it has a negative one.
$IR_g$ is the original-post pointer flag. It is 1 when the user's comment on, and/or forward of, self-media content published by another self-media platform user points directly at that content, and 0 otherwise. In some implementations, for example, if the original post contains a synonym of a word that appears in the comment and/or forward, the comment and/or forward can be considered to point at the original post, in which case $IR_g = 1$.
$T_g$ is the time weight, which can be calculated with formula (3) above.
$At_g$ is the key-attention flag: it is 1 when the self-media content published by the user gives key attention to the user tag, and 0 otherwise.
In some implementations, when the user behavior includes the user forwarding self-media content published by another user of a self-media platform, the weight $f_5(g)$ of the user behavior can be calculated with formula (8) above, that is: $f_5(g) = L_g \times IR_g \times (T_g + At_g) / 2$.
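Since the comment and forward weights share one expression, a single function can serve both. This is a sketch with hypothetical function names; the flags are passed in as already-computed values.

```python
def f4(l_g, ir_g, t_g, at_g):
    """Formula (8): L_g * IR_g * (T_g + At_g) / 2.

    l_g  -- sentiment flag, 1 or -1
    ir_g -- original-post pointer flag, 1 or 0
    t_g  -- time weight from formula (3)
    at_g -- key-attention flag, 1 or 0
    """
    return l_g * ir_g * (t_g + at_g) / 2


# The forward weight f5 uses the same expression as the comment weight f4.
f5 = f4
```

When the comment or forward does not point at the original post, $IR_g = 0$ zeroes the whole term regardless of sentiment or recency.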
In some implementations, determining the correlation weight of the user tag based on the user behavior events may further include step 132: correcting the weights of the user behavior based on a predetermined confidence factor (CF). Here, the predetermined confidence factor can be associated with the credibility of the user behavior. In some implementations, different confidence factors can be configured for different categories of user behavior. For example, the value of the confidence factor A can be determined by formula (9):
$$A = \begin{cases} 0.4, & \text{the user follows another user, a followed user publishes content, or the user publishes content} \\ 0.6, & \text{the user comments on and/or forwards another user's content} \end{cases} \tag{9}$$
In other words, when the user behavior is the user following another user of a self-media platform, a user followed by the user publishing self-media content, or the user publishing self-media content, the value of the confidence factor A is 0.4; when the user behavior is the user commenting on and/or forwarding self-media content published by another user of a self-media platform, the value of A is 0.6.
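The per-category values of A can be kept in a lookup table. In the sketch below the category labels are hypothetical, and applying the factor by multiplication is an assumption: the text says only that the weight is corrected based on the factor, not how.

```python
# Values of A as stated in the text; the category labels are hypothetical.
CONFIDENCE_FACTOR = {
    "follow": 0.4,
    "followed_user_content": 0.4,
    "post": 0.4,
    "comment": 0.6,
    "forward": 0.6,
}


def corrected_weight(category, weight):
    """Scale a behavior weight by its confidence factor (assumed multiplicative)."""
    return CONFIDENCE_FACTOR[category] * weight
```

Under this assumption, comments and forwards, which carry the higher factor, retain more of their raw weight than follows or the user's own posts.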
In some implementations, determining the correlation weight of the user tag based on the user behavior events may further include step 133: correcting the weights of the user behavior based on a predetermined accuracy factor.
For example, a subset of the users in the sample can be selected, and the accuracy of each category of user behavior and of the weights calculated for it can be obtained statistically and then used to determine the weight of each category of user behavior.
In some implementations, the correlation weight of user U for user tag g can be expressed, for example, by formula (10):
[Formula (10) is an image in the original publication and is not reproduced in this text.]
Here, $f_i$ is the weight of each category of user behavior, $A_i$ is the confidence factor corresponding to the category of user behavior, and $Z_i$ is the accuracy factor corresponding to the category of user behavior.
Fig. 3 shows a schematic diagram 300 of an apparatus for obtaining the correlation weight of a user tag according to an embodiment of the present application.
As shown in Fig. 3, the apparatus for obtaining the correlation weight of a user tag comprises a creation module 310, an extraction module 320, and a determination module 330.
The creation module 310 may be configured to build a tag database; the tag database contains tags and the category information corresponding to each tag.
The extraction module 320 may be configured to tally user behavior and extract user behavior events. In some implementations, a user behavior event may include, for example, a user tag, where the user tag is a tag in the tag database that is associated with the user behavior. Alternatively, in other implementations, in addition to the user tag, a user behavior event may also include at least one of a user name, the time at which the behavior occurred, and the user behavior category.
The determination module 330 may be configured to determine the correlation weight of the user tag based on the user behavior events.
In some implementations, the user behavior categories may include, for example, at least one of: the user following another user of a self-media platform; a user followed by the user publishing self-media content; the user publishing self-media content; the user commenting on self-media content published by another user of a self-media platform; and the user forwarding self-media content published by another user of a self-media platform.
In some implementations, the determination module 330 may be further configured to take the superposition of the weights of the individual user behaviors associated with the user tag as the correlation weight of the user tag.
In some implementations, when the user behavior includes the user following another user of a self-media platform, the determination module 330 may be further configured to determine the weight of the user behavior based on $f_1(g) = (C_g + \mathrm{TFIDF}_g) \times T_g / 2$.
Here, $C_g$ is the multi-level weight. Its value is a non-negative real number, and it is larger when the user has set category information for the followed self-media platform user than when the user has not.
$\mathrm{TFIDF}_g = \mathrm{countUser}(g) / \lg(\mathrm{totalUser} + 0.01)$, where countUser(g) is the number of users in the sample associated with the user tag and totalUser is the number of users in the sample.
$T_g$ is the time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, where $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date.
In some implementations, when the user behavior includes a user followed by the user publishing self-media content, the determination module 330 may be further configured to determine the weight of the user behavior based on $f_2(g) = \mathrm{close\_f_2}(g) \times \mathrm{credibility\_f_2}(g)$.
Here, $\mathrm{close\_f_2}(g)$ is the closeness between the user and the other user that the user follows, and $\mathrm{credibility\_f_2}(g)$ is the credibility of the other user.
In some implementations, when the user behavior includes the user publishing self-media content, the determination module 330 may be further configured to determine the weight of the user behavior based on $f_3(g) = (L_g \times T_g + At_g) / 2$.
Here, $L_g$ is the sentiment-orientation flag: it is 1 when the self-media content published by the user has a positive sentiment orientation and -1 when it has a negative one.
$T_g$ is the time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, where $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date.
$At_g$ is the key-attention flag: it is 1 when the self-media content published by the user gives key attention to the user tag, and 0 otherwise.
In some implementations, when the user behavior includes the user commenting on, and/or forwarding, self-media content published by another user of a self-media platform, the determination module 330 may be further configured to determine the weight of the user behavior based on $f_4(g) = L_g \times IR_g \times (T_g + At_g) / 2$.
Here, $L_g$ is the sentiment-orientation flag: it is 1 when the self-media content published by the user has a positive sentiment orientation and -1 when it has a negative one.
$IR_g$ is the original-post pointer flag: it is 1 when the user's comment on, and/or forward of, self-media content published by another self-media platform user points directly at that content, and 0 otherwise.
$T_g$ is the time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, where $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date.
$At_g$ is the key-attention flag: it is 1 when the self-media content published by the user gives key attention to the user tag, and 0 otherwise.
In some implementations, the determination module 330 may be further configured to correct the weights of the user behavior based on a predetermined confidence factor (CF). Here, the predetermined confidence factor can be associated, for example, with the credibility of the user behavior.
In some implementations, when the user behavior includes the user forwarding self-media content published by another user of a self-media platform, the weight $f_5(g)$ of the user behavior can be calculated in a manner similar to $f_4(g)$, that is: $f_5(g) = L_g \times IR_g \times (T_g + At_g) / 2$.
In some implementations, the determination module 330 may be further configured to correct the weights of the user behavior based on a predetermined accuracy factor.
Process flow diagram in accompanying drawing and block diagram, illustrate according to the architectural framework in the cards of the system of various embodiments of the invention, method and computer program product, function and operation.In this, each square frame in process flow diagram or block diagram can represent a part for module, program segment or a code, and a part for described module, program segment or code comprises one or more executable instruction for realizing the logic function specified.Also it should be noted that at some as in the realization of replacing, the function marked in square frame also can be different from occurring in sequence of marking in accompanying drawing.Such as, in fact the square frame that two adjoining lands represent can perform substantially concurrently, and they also can perform by contrary order sometimes, and this determines according to involved function.Also it should be noted that, the combination of the square frame in each square frame in block diagram and/or process flow diagram and block diagram and/or process flow diagram, can realize by the special hardware based system of the function put rules into practice or operation, or can realize with the combination of specialized hardware and computer instruction.
Unit involved by being described in the embodiment of the present application or module can be realized by the mode of software, also can be realized by the mode of hardware.Described unit or module also can be arranged within a processor, such as, can be described as: a kind of processor comprises creation module, extraction module and determination module.Wherein, the title of these unit or module does not form the restriction to this unit or module itself under certain conditions, and such as, creation module can also be described to " for setting up the module of tag database ".
In another aspect, the present application further provides a computer-readable storage medium. This computer-readable storage medium may be the one included in the device described in the above embodiments, or it may be a stand-alone computer-readable storage medium that is not assembled into a device. The computer-readable storage medium stores one or more programs, which are used by one or more processors to perform the method described in the present application.
The above description covers only the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the particular combination of the above technical features. It should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.

Claims (20)

1. A method for obtaining a relevance weight of a user tag, characterized by comprising:
establishing a tag database, the tag database comprising tags and classification information corresponding to the tags;
collecting statistics on user behavior and extracting user behavior events, wherein each user behavior event comprises a user tag, the user tag being a tag in the tag database that is associated with the user behavior; and
determining the relevance weight of the user tag based on the user behavior events.
2. The method according to claim 1, characterized in that the user behavior event further comprises at least one of the following:
a user name, a behavior occurrence time, and a user behavior category.
3. The method according to claim 2, characterized in that the user behavior category comprises at least one of the following:
the user follows other users of a self-media platform;
other users followed by the user publish self-media content;
the user publishes self-media content;
the user comments on self-media content published by other users of the self-media platform; and
the user forwards self-media content published by other users of the self-media platform.
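The event fields of claim 2 and the behavior categories of claim 3 can be pictured as a small data model. The following Python sketch is illustrative only: the class names, field names, and enum values are assumptions introduced here and do not appear in the application.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class BehaviorCategory(Enum):
    """The five user behavior categories listed in claim 3 (names are hypothetical)."""
    FOLLOW_USER = "user follows another self-media platform user"
    FOLLOWED_USER_PUBLISHES = "a followed user publishes self-media content"
    PUBLISH = "user publishes self-media content"
    COMMENT = "user comments on content published by another user"
    FORWARD = "user forwards content published by another user"


@dataclass(frozen=True)
class UserBehaviorEvent:
    """One extracted user behavior event.

    The user tag is always present (claim 1); the remaining fields are the
    optional items of claim 2, so they default to None.
    """
    user_tag: str
    user_name: Optional[str] = None
    occurred_on: Optional[date] = None
    category: Optional[BehaviorCategory] = None
```

For example, `UserBehaviorEvent("basketball", "alice", date(2015, 7, 1), BehaviorCategory.FORWARD)` would record that a user named alice forwarded content associated with the tag "basketball".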
4. The method according to claim 1, characterized in that determining the relevance weight of the user tag based on the user behavior events comprises:
taking the superposition of the weights of each user behavior associated with the user tag as the relevance weight of the user tag.
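A minimal sketch of the superposition in claim 4, assuming that "superposition" means a plain sum of the per-behavior weights for each tag. The function name and the `(tag, weight)` pair representation are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tag_relevance_weights(behavior_weights: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum the weights of all behaviors associated with each user tag.

    behavior_weights: (user_tag, behavior_weight) pairs for one user.
    Interpreting "superposition" as addition is an assumption.
    """
    totals: Dict[str, float] = defaultdict(float)
    for tag, weight in behavior_weights:
        totals[tag] += weight
    return dict(totals)
```

Because some behavior weights can be negative (for example, content with a negative sentiment orientation), the summed relevance weight of a tag may also be negative under this reading.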
5. The method according to claim 3, characterized in that, when the user behavior comprises the user following other users of the self-media platform, the weight of the user behavior is:
$f_1(g) = (C_g + \mathrm{TFIDF}_g) \times T_g / 2$;
wherein $C_g$ is a multi-level weight whose value is a non-negative real number, and the value of $C_g$ when the user has set classification information for the other self-media platform users that the user follows is greater than the value of $C_g$ when the user has not set such classification information;
$\mathrm{TFIDF}_g$ represents the follow relationship of all users in the sample to tag $g$, $\mathrm{TFIDF}_g = \mathrm{countUser}(g) / \lg(\mathrm{totalUser} + 0.01)$, wherein totalUser is the number of users in the sample, and countUser is the number of users in the sample associated with the user tag;
$T_g$ is a time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, wherein $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date.
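The follow-behavior weight f1(g) of claim 5 can be sketched in Python as follows. This is an illustration under stated assumptions: dates are differenced in whole days, lg is taken as the base-10 logarithm, and all function and parameter names are hypothetical.

```python
import math
from datetime import date


def time_weight(occurred: date, start: date, now: date) -> float:
    """T_g = (T_c - T_0) / (T_now - T_0), measured in days (an assumption)."""
    span = (now - start).days
    if span <= 0:
        raise ValueError("the current date must be later than the preset start date")
    return (occurred - start).days / span


def tfidf_g(count_user: int, total_user: int) -> float:
    """TFIDF_g = countUser(g) / lg(totalUser + 0.01), with lg as log base 10."""
    return count_user / math.log10(total_user + 0.01)


def follow_weight(c_g: float, count_user: int, total_user: int,
                  occurred: date, start: date, now: date) -> float:
    """f1(g) = (C_g + TFIDF_g) * T_g / 2."""
    return (c_g + tfidf_g(count_user, total_user)) * time_weight(occurred, start, now) / 2
```

With this time weight, a behavior that occurred on the preset start date contributes 0, and a behavior that occurred on the current date contributes the full (C_g + TFIDF_g) / 2, so more recent follow actions count for more.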
6. The method according to claim 3, characterized in that, when the user behavior comprises other users followed by the user publishing self-media content, the weight of the user behavior is:
$f_2(g) = \mathrm{close\_f}_2(g) \times \mathrm{credibility\_f}_2(g)$;
wherein $\mathrm{close\_f}_2(g)$ is the closeness between the user and the other users that the user follows;
$\mathrm{credibility\_f}_2(g)$ is the credibility of the other users.
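The product in claim 6 is simple enough to state directly in code. The claim does not say how the closeness and credibility scores are computed, so the sketch below takes both as precomputed inputs; the function and parameter names are hypothetical.

```python
def followed_user_publish_weight(closeness: float, credibility: float) -> float:
    """f2(g) = close_f2(g) * credibility_f2(g).

    closeness:   closeness between the user and the followed user (given as input)
    credibility: credibility of the followed user (given as input)
    """
    return closeness * credibility
```

Because the two scores are multiplied, content from a followed user only receives a high weight when that user is both close to the user and credible; a low value for either score pulls the weight down.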
7. The method according to claim 3, characterized in that, when the user behavior comprises the user publishing self-media content, the weight of the user behavior is:
$f_3(g) = (L_g \times T_g + At_g) / 2$;
wherein $L_g$ is a sentiment orientation identifier: when the self-media content published by the user has a positive sentiment orientation, $L_g$ is 1, and when the self-media content published by the user has a negative sentiment orientation, $L_g$ is -1;
$T_g$ is a time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, wherein $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date;
$At_g$ is a key-attention identifier: when the self-media content published by the user pays key attention to the user tag, $At_g$ is 1; otherwise $At_g$ is 0.
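The publishing weight f3(g) of claim 7 can be sketched as below. The sketch assumes the sentiment orientation and the key-attention flag have already been decided elsewhere and are passed in as booleans, and that the time weight T_g has already been computed; the names are hypothetical. The claim defines L_g only for positive and negative content, so neutral content is outside this sketch.

```python
def publish_weight(positive_sentiment: bool, t_g: float, key_attention: bool) -> float:
    """f3(g) = (L_g * T_g + At_g) / 2.

    positive_sentiment: True -> L_g = 1, False -> L_g = -1
    t_g:                time weight T_g, computed beforehand
    key_attention:      True -> At_g = 1, False -> At_g = 0
    """
    l_g = 1 if positive_sentiment else -1
    at_g = 1 if key_attention else 0
    return (l_g * t_g + at_g) / 2
```

For a time weight of 0.5, positive content that pays key attention to the tag yields (0.5 + 1) / 2 = 0.75, while negative content without key attention yields (-0.5 + 0) / 2 = -0.25.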
8. The method according to claim 3, characterized in that, when the user behavior comprises the user commenting on self-media content published by other users of the self-media platform and/or the user forwarding self-media content published by other users of the self-media platform, the weight of the user behavior is:
$f_4(g) = L_g \times IR_g \times (T_g + At_g) / 2$;
wherein $L_g$ is a sentiment orientation identifier: when the self-media content published by the user has a positive sentiment orientation, $L_g$ is 1, and when the self-media content published by the user has a negative sentiment orientation, $L_g$ is -1;
$IR_g$ is an original-text pointing identifier: when the content of the user's comment on, and/or forward of, self-media content published by other users of the self-media platform points directly to the self-media content published by those other users, $IR_g$ is 1; otherwise it is 0;
$T_g$ is a time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, wherein $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date;
$At_g$ is a key-attention identifier: when the self-media content published by the user pays key attention to the user tag, $At_g$ is 1; otherwise $At_g$ is 0.
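A comparable sketch for the comment/forward weight f4(g) of claim 8, again assuming the three indicators are supplied as booleans and the time weight T_g is precomputed. The function and parameter names are hypothetical.

```python
def comment_or_forward_weight(positive_sentiment: bool, points_to_original: bool,
                              t_g: float, key_attention: bool) -> float:
    """f4(g) = L_g * IR_g * (T_g + At_g) / 2.

    positive_sentiment: True -> L_g = 1, False -> L_g = -1
    points_to_original: True -> IR_g = 1, False -> IR_g = 0
    t_g:                time weight T_g, computed beforehand
    key_attention:      True -> At_g = 1, False -> At_g = 0
    """
    l_g = 1 if positive_sentiment else -1
    ir_g = 1 if points_to_original else 0
    at_g = 1 if key_attention else 0
    return l_g * ir_g * (t_g + at_g) / 2
```

Unlike f3(g), the sentiment identifier here multiplies the whole sum, and the original-text identifier acts as a gate: a comment or forward that does not point directly to the other user's content contributes a weight of zero.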
9. The method according to any one of claims 1-8, characterized in that determining the relevance weight of the user tag based on the user behavior events further comprises:
correcting the weight of the user behavior based on a predetermined confidence factor (CF);
wherein the predetermined confidence factor is associated with the confidence level of the user behavior.
10. The method according to claim 9, characterized in that determining the relevance weight of the user tag based on the user behavior events further comprises:
correcting the weight of the user behavior based on a predetermined accuracy factor.
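Claims 9 and 10 state that the behavior weight is corrected by a confidence factor and an accuracy factor, but the claims themselves do not give the form of the correction. The sketch below assumes, purely for illustration, that both corrections are multiplicative scaling factors; the function name, parameter names, and default values are hypothetical.

```python
def corrected_weight(raw_weight: float,
                     confidence_factor: float = 1.0,
                     accuracy_factor: float = 1.0) -> float:
    """Scale a behavior weight by a confidence factor and an accuracy factor.

    ASSUMPTION: multiplication is used here only as one plausible form of
    "correction"; the claims do not specify the operation.
    """
    return raw_weight * confidence_factor * accuracy_factor
```

Under this assumption, a factor of 1.0 leaves the weight unchanged, and a factor below 1.0 discounts behaviors whose tag association is considered less reliable.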
11. A device for obtaining a relevance weight of a user tag, characterized by comprising:
a creation module, configured to establish a tag database, the tag database comprising tags and classification information corresponding to the tags;
an extraction module, configured to collect statistics on user behavior and extract user behavior events, wherein each user behavior event comprises a user tag, the user tag being a tag in the tag database that is associated with the user behavior; and
a determination module, configured to determine the relevance weight of the user tag based on the user behavior events.
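The three-module structure of claim 11 can be sketched as three small Python classes. Everything below is a hypothetical illustration: the class and method names are invented here, the rule that associates a behavior with a tag (the tag string appearing in the behavior text) is a stand-in assumption, and the per-event weighting function is injected rather than fixed.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class BehaviorEvent:
    user_name: str
    user_tag: str
    text: str


class CreationModule:
    """Establishes the tag database: tag -> classification information."""

    def build(self, entries: Dict[str, str]) -> Dict[str, str]:
        return dict(entries)


class ExtractionModule:
    """Extracts behavior events that carry a tag from the tag database."""

    def extract(self, records: Iterable[Tuple[str, str]],
                tag_db: Dict[str, str]) -> List[BehaviorEvent]:
        events = []
        for user_name, text in records:
            for tag in tag_db:
                # Stand-in association rule (assumption): the tag occurs in the text.
                if tag in text:
                    events.append(BehaviorEvent(user_name, tag, text))
        return events


class DeterminationModule:
    """Sums per-event weights into a relevance weight per user and tag."""

    def __init__(self, weigh: Callable[[BehaviorEvent], float]):
        self._weigh = weigh

    def determine(self, events: Iterable[BehaviorEvent]) -> Dict[str, Dict[str, float]]:
        totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
        for event in events:
            totals[event.user_name][event.user_tag] += self._weigh(event)
        return {user: dict(tags) for user, tags in totals.items()}
```

A usage example: build a database with `CreationModule().build({"nba": "sports"})`, pass `(user_name, text)` records to `ExtractionModule().extract(...)`, and hand the resulting events to `DeterminationModule(weigh).determine(...)`, where `weigh` is whichever behavior-weight function applies.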
12. The device according to claim 11, characterized in that the user behavior event further comprises at least one of the following:
a user name, a behavior occurrence time, and a user behavior category.
13. The device according to claim 12, characterized in that the user behavior category comprises at least one of the following:
the user follows other users of a self-media platform;
other users followed by the user publish self-media content;
the user publishes self-media content;
the user comments on self-media content published by other users of the self-media platform; and
the user forwards self-media content published by other users of the self-media platform.
14. The device according to claim 11, characterized in that the determination module is further configured to:
take the superposition of the weights of each user behavior associated with the user tag as the relevance weight of the user tag.
15. The device according to claim 13, characterized in that the determination module is further configured to, when the user behavior comprises the user following other users of the self-media platform, determine the weight of the user behavior based on $f_1(g) = (C_g + \mathrm{TFIDF}_g) \times T_g / 2$;
wherein $C_g$ is a multi-level weight whose value is a non-negative real number, and the value of $C_g$ when the user has set classification information for the other self-media platform users that the user follows is greater than the value of $C_g$ when the user has not set such classification information;
$\mathrm{TFIDF}_g$ represents the follow relationship of all users in the sample to tag $g$, $\mathrm{TFIDF}_g = \mathrm{countUser}(g) / \lg(\mathrm{totalUser} + 0.01)$, wherein countUser is the number of users in the sample associated with the user tag, and totalUser is the number of users in the sample;
$T_g$ is a time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, wherein $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date.
16. The device according to claim 13, characterized in that the determination module is further configured to, when the user behavior comprises other users followed by the user publishing self-media content, determine the weight of the user behavior based on $f_2(g) = \mathrm{close\_f}_2(g) \times \mathrm{credibility\_f}_2(g)$;
wherein $\mathrm{close\_f}_2(g)$ is the closeness between the user and the other users that the user follows;
$\mathrm{credibility\_f}_2(g)$ is the credibility of the other users.
17. The device according to claim 13, characterized in that the determination module is further configured to, when the user behavior comprises the user publishing self-media content, determine the weight of the user behavior based on $f_3(g) = (L_g \times T_g + At_g) / 2$;
wherein $L_g$ is a sentiment orientation identifier: when the self-media content published by the user has a positive sentiment orientation, $L_g$ is 1, and when the self-media content published by the user has a negative sentiment orientation, $L_g$ is -1;
$T_g$ is a time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, wherein $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date;
$At_g$ is a key-attention identifier: when the self-media content published by the user pays key attention to the user tag, $At_g$ is 1; otherwise $At_g$ is 0.
18. The device according to claim 13, characterized in that the determination module is further configured to, when the user behavior comprises the user commenting on self-media content published by other users of the self-media platform and/or the user forwarding self-media content published by other users of the self-media platform, determine the weight of the user behavior based on $f_4(g) = L_g \times IR_g \times (T_g + At_g) / 2$;
wherein $L_g$ is a sentiment orientation identifier: when the self-media content published by the user has a positive sentiment orientation, $L_g$ is 1, and when the self-media content published by the user has a negative sentiment orientation, $L_g$ is -1;
$IR_g$ is an original-text pointing identifier: when the content of the user's comment on, and/or forward of, self-media content published by other users of the self-media platform points directly to the self-media content published by those other users, $IR_g$ is 1; otherwise it is 0;
$T_g$ is a time weight, $T_g = (T_c - T_0) / (T_{now} - T_0)$, wherein $T_c$ is the date on which the user behavior occurred, $T_0$ is a preset start date, and $T_{now}$ is the current date;
$At_g$ is a key-attention identifier: when the self-media content published by the user pays key attention to the user tag, $At_g$ is 1; otherwise $At_g$ is 0.
19. The device according to any one of claims 11-18, characterized in that the determination module is further configured to:
correct the weight of the user behavior based on a predetermined confidence factor (CF);
wherein the predetermined confidence factor is associated with the confidence level of the user behavior.
20. The device according to claim 19, characterized in that the determination module is further configured to:
correct the weight of the user behavior based on a predetermined accuracy factor.
CN201510446007.0A 2015-07-27 2015-07-27 Method and device for acquiring relevance weight of user tag Active CN105138572B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510446007.0A CN105138572B (en) 2015-07-27 2015-07-27 Method and device for acquiring relevance weight of user tag

Publications (2)

Publication Number Publication Date
CN105138572A true CN105138572A (en) 2015-12-09
CN105138572B CN105138572B (en) 2019-12-10

Family

ID=54723921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510446007.0A Active CN105138572B (en) 2015-07-27 2015-07-27 Method and device for acquiring relevance weight of user tag

Country Status (1)

Country Link
CN (1) CN105138572B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105872593A (en) * 2016-03-21 2016-08-17 乐视网信息技术(北京)股份有限公司 Barrage pushing method and device
CN108512674A (en) * 2017-02-24 2018-09-07 百度在线网络技术(北京)有限公司 Method, apparatus and equipment for output information
CN111768213A (en) * 2020-09-03 2020-10-13 耀方信息技术(上海)有限公司 User label weight evaluation method
CN112650931A (en) * 2021-01-04 2021-04-13 杭州情咖网络技术有限公司 Content recommendation method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102654860A (en) * 2011-03-01 2012-09-05 北京彩云在线技术开发有限公司 Personalized music recommendation method and system
CN102760163A (en) * 2012-06-12 2012-10-31 奇智软件(北京)有限公司 Personalized recommendation method and device of characteristic information
CN102867016A (en) * 2012-07-18 2013-01-09 北京开心人信息技术有限公司 Label-based social network user interest mining method and device
CN103279533A (en) * 2013-05-31 2013-09-04 北京华悦博智科技有限责任公司 Method and system for social relationship recommendation
CN104035957A (en) * 2014-04-14 2014-09-10 百度在线网络技术(北京)有限公司 Search method and device
WO2015021937A1 (en) * 2013-08-14 2015-02-19 腾讯科技(深圳)有限公司 Method and device for user recommendation
CN104750789A (en) * 2015-03-12 2015-07-01 百度在线网络技术(北京)有限公司 Label recommendation method and device

Also Published As

Publication number Publication date
CN105138572B (en) 2019-12-10

Similar Documents

Publication Publication Date Title
CN109145216B (en) Network public opinion monitoring method, device and storage medium
CN109325165B (en) Network public opinion analysis method, device and storage medium
CN103744981B (en) System for automatic classification analysis for website based on website content
CN103294800B (en) A kind of information-pushing method and device
McKenzie et al. Weighted multi-attribute matching of user-generated points of interest
CN105045901B (en) The method for pushing and device of search key
WO2019153604A1 (en) Device and method for creating human/machine identification model, and computer readable storage medium
CN107704503A (en) User's keyword extracting device, method and computer-readable recording medium
CN109145215A (en) Internet public opinion analysis method, apparatus and storage medium
US9798820B1 (en) Classification of keywords
US20220405801A1 (en) Expert Search Thread Invitation Engine
CN105653562B (en) The calculation method and device of correlation between a kind of content of text and inquiry request
CN106802915A (en) A kind of academic resources based on user behavior recommend method
CN106250513A (en) A kind of event personalization sorting technique based on event modeling and system
CN105095187A (en) Search intention identification method and device
CN102033880A (en) Marking method and device based on structured data acquisition
CN104978665A (en) Brand evaluation method and brand evaluation device
CN106204156A (en) A kind of advertisement placement method for network forum and device
CN103577462A (en) Document classification method and document classification device
CN104268230B (en) A kind of Chinese micro-blog viewpoint detection method based on heterogeneous figure random walk
CN101770482A (en) Method and system for issuing advertisements
CN104077417A (en) Figure tag recommendation method and system in social network
Bing et al. Using query log and social tagging to refine queries based on latent topics
CN105138572A (en) Method and device for obtaining correlation weight of user tag
CN108363784A (en) A kind of public sentiment trend estimate method based on text machine learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant