Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As described in the background art, current methods for calculating information popularity (hereinafter, news is taken as an example) fall mainly into two types. 1. Based on user behavior data: a user's clicks, likes, comments, forwards, and similar operations on news material are collectively called user behaviors. Different behaviors reflect different degrees of user attention to the material, so counting the number of clicks, likes, forwards, and comments on a news item is the simplest and most intuitive way to calculate its popularity. 2. Based on news content: hot words and their heat values (which can be regarded as keywords and their weights) are mined by analyzing the news content within a given time period; for a given news item, the heat values of the hot words it matches are added to obtain the overall heat value of the news.
The inventor finds that both existing heat evaluation methods are inaccurate. Specifically: 1. The method based on user behavior data considers neither the quality of the news content nor the influence of time factors on the news, and hot news is not surfaced promptly enough. 2. The method based on news content depends entirely on the content: similar news items do not necessarily reflect what users are interested in or concerned about, the users' feedback on the news is ignored, and the influence of time factors on the news is likewise not considered.
Based on this, according to an aspect of the embodiments of the present application, there is provided an information heat evaluation method. Optionally, in this embodiment, the information heat evaluation method may be applied to a hardware environment formed by the terminal 102 and the server 104 as shown in fig. 1. As shown in fig. 1, the server 104 is connected to the terminal 102 through a network and may be used to provide services for the terminal or for a client installed on the terminal. A database may be provided on the server or independently of the server to provide data storage services for the server 104, and the server may also handle cloud services. The network includes but is not limited to a wide area network, a metropolitan area network, or a local area network, and the terminal 102 is not limited to a PC, a mobile phone, a tablet computer, and the like. The information heat evaluation method according to the embodiment of the present application may be executed by the server 104, by the terminal 102, or by both the server 104 and the terminal 102. The terminal 102 may execute the information heat evaluation method according to the embodiment of the present application through a client installed thereon.
Taking the terminal 102 and/or the server 104 to execute the information popularity assessment method in this embodiment as an example, fig. 2 is a schematic flow chart of an optional information popularity assessment method according to an embodiment of the present application, and as shown in fig. 2, the flow chart of the method may include the following steps:
step S202, content information, user behavior information and release time of the target information are obtained;
step S204, evaluating the content information, the user behavior information and the release time respectively to obtain a multi-dimensional evaluation result;
step S206, determining the heat degree of the target information based on the multi-dimensional evaluation result.
Through the steps S202 to S206, after the content information, the user behavior information, and the release time of the target information are acquired, the content information, the user behavior information, and the release time are evaluated to obtain a multi-dimensional evaluation result, and the heat degree of the target information is determined based on the multi-dimensional evaluation result. The method can thus evaluate the heat of the information in multiple dimensions, jointly considering the information content dimension, the release time dimension, and the user behavior dimension, so that the heat of the information is obtained more truly and accurately and a more accurate heat ranking is provided.
For the technical solution in step S202, the information may be in text, video, picture, or voice form. In this embodiment, the information may be news, event information, and the like, and news is taken as the example below. After the news is published, the news content, the release time, and the user behavior may be acquired. For example, the user behavior may include the number of user clicks, likes, comments, forwards, click-fails, and so on.
For the technical solution in step S204, after the content information, the user behavior information, and the release time are obtained, each dimension may be evaluated in turn. The content information may be evaluated in the existing manner: for example, hot words and their heat values (which may be regarded as keywords and their weights) are mined by analyzing the news content within a given time period, and for a given news item the heat values of its matched hot words are added to obtain the corresponding evaluation result. In this embodiment, the quality of the content may also be evaluated; for example, the keywords of the content information may be analyzed and the content quality evaluated based on the number of keywords. The user behavior may be evaluated from the specific operations of users on the target news, such as the number of clicks, likes, comments, and forwards and the attenuation number, which are combined into a user behavior quality evaluation result. As for the release time, news content is highly time-sensitive, and a user may already have learned through another channel of an event that occurred earlier. The earlier an event occurred, the less attractive it is to the user, so the heat value of published news needs to decay over time, and at an increasing rate. The time decay of the information can therefore be evaluated from the release time. In this way, the multi-dimensional evaluation result of the target information is obtained.
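By way of illustration only, the hot-word based content evaluation described above can be sketched in Python as follows. The hot-word table, its weights, and the case-insensitive substring matching are assumptions made for this sketch; the embodiment does not prescribe how hot words are mined or matched.

```python
def content_heat(text: str, hotwords: dict[str, float]) -> float:
    """Add up the heat values of every hot word that occurs in the text."""
    lowered = text.lower()
    # Assumption: a hot word "matches" when it appears as a substring.
    return sum(weight for word, weight in hotwords.items() if word.lower() in lowered)


# Hypothetical hot words and heat values mined for some time period.
hotwords = {"election": 3.0, "earthquake": 5.0, "budget": 1.5}

print(content_heat("Budget talks stall ahead of the election", hotwords))
```

A production system would more likely tokenize the text before matching, so that a hot word is not matched inside an unrelated longer word.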
For the technical solution in step S206, the heat of the news material is calculated by jointly considering time decay, user behavior, and material content. Specifically, weights may be assigned to the content quality evaluation result, the user behavior quality evaluation result, and the time decay result according to the actual situation, so that the evaluation can be tailored to different scenarios or situations.
As an exemplary embodiment, the click rate of the target information may also be used as an important dimension for evaluating its heat. However, the raw click rate can bias the evaluation result: a news item exposed once and clicked once has a click rate of 1, while a news item exposed ten times and clicked nine times has a click rate of 0.9, so a news item with a high click rate is not necessarily hot. Likewise, a news item with a high number of exposures is not necessarily hot. In this embodiment, a Wilson score may be used to evaluate the click-rate dimension of the target content: for example, the number of exposures of the target information is obtained, and the Wilson score of the target information is determined based on the number of exposures and the user behavior information. The Wilson score considers the number of exposures and the click rate together; the higher the score, the higher the quality.
Specifically, see the following equation:
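$n = u + v$
$p = u / n$
$S = \dfrac{p + \frac{z^2}{2n} - z\sqrt{\frac{p(1-p)}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}}$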
where u denotes the number of positive instances (clicks), v denotes the number of negative instances (exposures without a click), and n denotes the total number of instances; p represents the click rate, z is the quantile (parameter) of the normal distribution, and S represents the final Wilson score.
The Wilson score lies in [0, 1). When the number of clicks is 0, p = 0 and S = 0; when every exposure is clicked, p = 1 and S tends to 1. For the same p, as n grows the numerator shrinks more slowly than the denominator, so S becomes larger: at the same click rate, a larger number of clicks makes the result more reliable. Misclicks may occur when a user browses news; the larger the sample, the lower the proportion of misclicks and the more credible the result. When a news item first appears, its number of clicks is 0, so S = 0 and the item is unlikely to be shown to users. To solve this cold-start problem of zero heat, the release of target information is monitored in real time when the Wilson score is calculated, and when the target information is first released, a random value is assigned as its initial Wilson score. For example, an initial score in (0, 0.1) may be assigned at random.
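By way of illustration only, a minimal Python sketch of the score follows. It assumes that S is the lower bound of the Wilson score interval, which agrees with the properties stated above (S = 0 when p = 0, and S tends to 1 when p = 1). The default z = 1.96 (roughly a 95% confidence level), the function names, and the use of a uniform draw for the cold-start value are assumptions of this sketch.

```python
import math
import random


def wilson_score(clicks: int, exposures: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the click rate."""
    if exposures <= 0:
        return 0.0
    if not 0 <= clicks <= exposures:
        raise ValueError("clicks must lie between 0 and exposures")
    n = exposures                      # n = u + v
    p = clicks / n                     # p = u / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    # Clamp tiny negative rounding errors to zero.
    return max(0.0, (centre - margin) / (1 + z * z / n))


def initial_score(rng: random.Random) -> float:
    """Cold start: small random score for information released for the first time."""
    return rng.uniform(0.0, 0.1)


# One exposure with one click versus ten exposures with nine clicks.
print(wilson_score(1, 1), wilson_score(9, 10))
print(initial_score(random.Random(0)))
```

Under these assumptions, the item clicked nine times out of ten exposures scores higher than the item clicked once out of one exposure, even though its raw click rate is lower.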
As an exemplary embodiment, after the heat of the target information is obtained, since that heat may shift over time, the heat of the target information may be updated in real time based on the user behavior information.
As an exemplary embodiment, news content is highly time-sensitive, and a user may already have learned through another channel of an event that occurred earlier. The earlier an event occurred, the less attractive it is to the user, so the heat value of published news needs to decay over time, and at an increasing rate. Therefore, the current time and the release time of the target information can be obtained, and a time decay result can be calculated from them according to a preset time coefficient or the law of cooling. Specifically, the following time decay formula using the time coefficient may be used:
$T = \dfrac{1}{1 + \alpha \times (T_1 - T_0)}$
where $T_1$ is the current time, $T_0$ is the release time of the material, $\alpha$ is a time coefficient used to control the speed of the time decay, and $T$ represents the final time decay score.
When $T_1 = T_0$, $T = 1$: the news has just been released and has not decayed. To satisfy the timeliness requirement, news recommended to the user by the platform must fall within a certain time range. Taking 24 h as an example: when $T_1 - T_0 = 24$, $T$ approaches 0, taken as $e^{-5}$; when $\alpha = 6$, $T \approx 0$.
It is also possible to use a decay method similar to an exponential function, for example using Newton's law of cooling:
$T = e^{k \times (T_1 - T_0)}$
where $T$ represents the final time decay score and $k$ is a coefficient, taken negative so that $T$ decreases as $T_1 - T_0$ grows.
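By way of illustration only, the two time decay variants can be sketched in Python as follows. Times are assumed to be expressed in hours, consistent with the 24 h example; the function names and the value k = -0.2 are hypothetical, and clamping a negative elapsed time to zero is a choice of this sketch.

```python
import math


def reciprocal_decay(current_h: float, release_h: float, alpha: float = 6.0) -> float:
    """T = 1 / (1 + alpha * (T1 - T0)), with times expressed in hours."""
    elapsed = max(0.0, current_h - release_h)  # guard against clock skew
    return 1.0 / (1.0 + alpha * elapsed)


def cooling_decay(current_h: float, release_h: float, k: float = -0.2) -> float:
    """T = exp(k * (T1 - T0)); k must be negative for the score to decay."""
    elapsed = max(0.0, current_h - release_h)
    return math.exp(k * elapsed)


# Compare both variants at a few ages (in hours) of a news item released at 0.
for hours in (0, 1, 6, 24):
    print(hours, round(reciprocal_decay(hours, 0), 4), round(cooling_decay(hours, 0), 4))
```

With alpha = 6, the reciprocal variant gives 1/145, about 0.0069, for a news item that is 24 hours old, so such an item contributes almost nothing to the final heat.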
As an exemplary embodiment, user behaviors such as clicking, liking, commenting, and forwarding represent the degree of attention paid to the material: the richer the behaviors and the more often they occur, the more attention the material receives. In this embodiment, the user behavior may be evaluated by classifying the user behavior information according to a preset rule, assigning corresponding weights to the different types of user behavior information, and calculating a user behavior evaluation result based on the weights and the types of the user behavior information. For example, different user behaviors represent different degrees of attention to the material. Clicking is a relatively cheap operation: the user may merely have been attracted by the title or the main image, and some clicks are misclicks, so a low weight is given to the click behavior. Liking and forwarding indicate that the user approves of the material and wants to share it with others, so higher weights are given to these two behaviors. A comment may be favorable or unfavorable, which cannot be judged from the behavior itself, so a medium weight is given to the comment behavior. In summary, the user behavior quality evaluation formula is as follows:
$Q_1 = \log\left(N_{click}/100 + N_{like} + N_{share} + N_{comment}/10\right)$
where $N_{click}$, $N_{like}$, $N_{share}$, and $N_{comment}$ respectively represent the numbers of clicks, likes, shares, and comments.
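By way of illustration only, the user behavior quality formula can be sketched in Python as follows. The base of the logarithm is not stated above, so the natural logarithm is assumed. The formula is undefined when all counts are zero and negative when the weighted sum is below 1, so this sketch clamps the result at 0; that guard is an assumption of the sketch and not part of the formula.

```python
import math


def behavior_quality(clicks: int, likes: int, shares: int, comments: int) -> float:
    """Q1 = log(N_click/100 + N_like + N_share + N_comment/10), clamped at 0."""
    weighted = clicks / 100 + likes + shares + comments / 10
    if weighted <= 0:
        return 0.0  # assumption: no behavior yet means Q1 = 0
    # Assumption: natural logarithm; values below 0 are clamped to 0.
    return max(0.0, math.log(weighted))


# 200 clicks, 5 likes, 2 shares, 10 comments: weighted sum = 2 + 5 + 2 + 1 = 10.
print(behavior_quality(200, 5, 2, 10))
```

Under these weights, one like counts as much as one hundred clicks or ten comments, which reflects the low weight given to clicks and the medium weight given to comments.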
After the Wilson score, the content quality evaluation result, the user behavior quality evaluation result, and the time decay result are obtained, the heat of the news material can be calculated by jointly considering the Wilson score, the time decay, the user behavior, and the material content. Specifically, the following formula can be adopted:
$H = S \times T \times (Q_1 + Q_2)$
where $H$ is the heat of the target information, $S$ is the Wilson score, $T$ is the time decay result, $Q_1$ is the user behavior evaluation result, and $Q_2$ is the content quality evaluation result. Just after new information is released, user behavior is lacking and $Q_1$ tends to 0, so the heat of the information is evaluated mainly from its content. After the information has been released for a period of time, user behavior increases and $Q_1$ grows, so the heat of the information comes to be evaluated mainly from user behavior.
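By way of illustration only, the combination of the four evaluation results and the resulting heat ranking can be sketched in Python as follows. The `Evaluation` class, the `rank` helper, and all numeric values are hypothetical and serve only to show how the formula behaves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    title: str
    wilson: float      # S, Wilson score
    decay: float       # T, time decay result
    behavior_q: float  # Q1, user behavior evaluation result
    content_q: float   # Q2, content quality evaluation result

    @property
    def heat(self) -> float:
        """H = S * T * (Q1 + Q2)."""
        return self.wilson * self.decay * (self.behavior_q + self.content_q)


def rank(items: list[Evaluation]) -> list[Evaluation]:
    """Order items from hottest to coldest."""
    return sorted(items, key=lambda e: e.heat, reverse=True)


# Hypothetical values: a fresh item with no behavior yet and an older, widely shared item.
items = [
    Evaluation("fresh, no behavior yet", 0.05, 1.0, 0.0, 4.0),
    Evaluation("older, widely shared", 0.60, 0.5, 3.0, 2.0),
]
for e in rank(items):
    print(e.title, round(e.heat, 3))
```

For the fresh item the heat comes entirely from the content term, whereas for the older item the behavior term dominates the sum, as described above.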
As an exemplary embodiment, the user behavior information may be obtained in real time, the user behavior evaluation result may be recomputed according to the formula $Q_1 = \log(N_{click}/100 + N_{like} + N_{share} + N_{comment}/10)$, and the heat may be re-evaluated according to the formula $H = S \times T \times (Q_1 + Q_2)$, so that the evaluation result is updated in real time and a more accurate heat evaluation result is obtained.
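By way of illustration only, the real-time update can be sketched in Python as a small tracker that accumulates behavior counts and recomputes the heat on demand. The `HeatTracker` class, its method names, and the behavior labels are hypothetical; the natural logarithm and the clamping of the behavior term at 0 are assumptions of this sketch.

```python
import math
from collections import Counter

# Per-behavior weights taken from the user behavior quality formula.
WEIGHTS = {"click": 1 / 100, "like": 1.0, "share": 1.0, "comment": 1 / 10}


class HeatTracker:
    """Accumulates user behavior for one item and recomputes its heat."""

    def __init__(self, content_q: float) -> None:
        self.content_q = content_q  # Q2, fixed once the content is evaluated
        self.counts = Counter()     # running count of each behavior type

    def record(self, behavior: str, times: int = 1) -> None:
        if behavior not in WEIGHTS:
            raise ValueError(f"unknown behavior: {behavior}")
        self.counts[behavior] += times

    def behavior_q(self) -> float:
        weighted = sum(WEIGHTS[b] * n for b, n in self.counts.items())
        # Assumption: natural logarithm, clamped at 0 when behavior is sparse.
        return max(0.0, math.log(weighted)) if weighted > 0 else 0.0

    def heat(self, wilson: float, decay: float) -> float:
        """H = S * T * (Q1 + Q2), using the latest behavior counts."""
        return wilson * decay * (self.behavior_q() + self.content_q)


tracker = HeatTracker(content_q=2.0)
print(tracker.heat(wilson=0.5, decay=1.0))  # before any user behavior
tracker.record("like", 10)
print(tracker.heat(wilson=0.5, decay=1.0))  # after ten likes
```

In this sketch the Wilson score and the time decay result are passed in at each call, since both also change as exposures accumulate and time passes.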
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., a ROM (Read-Only Memory)/RAM (Random Access Memory), a magnetic disk, an optical disk) and includes several instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the methods according to the embodiments of the present application.
According to another aspect of the embodiment of the application, an information popularity assessment device for implementing the information popularity assessment method is also provided. Fig. 3 is a schematic diagram of an alternative information heat evaluation apparatus according to an embodiment of the present application, and as shown in fig. 3, the apparatus may include:
an obtaining module 302, configured to obtain content information, user behavior information, and release time of the target information;
the evaluation module 304 is configured to evaluate the content information, the user behavior information, and the release time to obtain a multi-dimensional evaluation result;
a heat determination module 306, configured to determine a heat of the target information based on the multi-dimensional evaluation result.
It should be noted that the obtaining module 302 in this embodiment may be configured to execute the step S202, the evaluating module 304 in this embodiment may be configured to execute the step S204, and the heat determining module 306 in this embodiment may be configured to execute the step S206.
It should be noted here that the modules described above are the same as the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the above embodiments. It should be noted that the modules described above as a part of the apparatus may be operated in a hardware environment as shown in fig. 1, and may be implemented by software, or may be implemented by hardware, where the hardware environment includes a network environment.
According to another aspect of the embodiments of the present application, there is also provided an electronic device for implementing the above information heat evaluation method, where the electronic device may be a server, a terminal, or a combination thereof.
Fig. 4 is a block diagram of an alternative electronic device according to an embodiment of the present application, as shown in fig. 4, including a processor 402, a communication interface 404, a memory 406, and a communication bus 408, where the processor 402, the communication interface 404, and the memory 406 communicate with each other via the communication bus 408, where,
a memory 406 for storing a computer program;
the processor 402, when executing the computer program stored in the memory 406, performs the following steps:
acquiring content information, user behavior information and release time of target information;
evaluating the content information, the user behavior information and the release time respectively to obtain a multi-dimensional evaluation result;
and determining the heat degree of the target information based on the multi-dimensional evaluation result.
Alternatively, in this embodiment, the communication bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 4, but this does not indicate only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The memory may include RAM, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
As an example, as shown in fig. 4, the memory 406 may include, but is not limited to, the obtaining module 302, the evaluation module 304, and the heat determination module 306 of the information heat evaluation apparatus. In addition, the memory may also include, but is not limited to, other module units of the above information heat evaluation apparatus, which are not described in detail in this example.
The processor may be a general-purpose processor, which may include but is not limited to: a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments, and this embodiment is not described herein again.
It can be understood by those skilled in the art that the structure shown in fig. 4 is only an illustration, and the device implementing the above information heat evaluation method may be a terminal device, and the terminal device may be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palm computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 4 is a diagram illustrating the structure of the electronic device. For example, the terminal device may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in FIG. 4, or have a different configuration than shown in FIG. 4.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disk, ROM, RAM, magnetic or optical disk, and the like.
According to still another aspect of an embodiment of the present application, there is also provided a storage medium. Optionally, in the present embodiment, the storage medium may be used to store program code for executing the information heat evaluation method.
Optionally, in this embodiment, the storage medium may be located on at least one of a plurality of network devices in a network shown in the above embodiment.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
acquiring content information, user behavior information and release time of target information;
evaluating the content information, the user behavior information and the release time respectively to obtain a multi-dimensional evaluation result;
and determining the heat degree of the target information based on the multi-dimensional evaluation result.
Optionally, the specific example in this embodiment may refer to the example described in the above embodiment, which is not described again in this embodiment.
Optionally, in this embodiment, the storage medium may include, but is not limited to: various media capable of storing program codes, such as a U disk, a ROM, a RAM, a removable hard disk, a magnetic disk, or an optical disk.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to execute all or part of the steps of the method described in the embodiments of the present application.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, and may also be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution provided in the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present application. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as falling within the protection scope of the present application.