CN107562939B - Vertical domain news recommendation method and device and readable storage medium - Google Patents
- Publication number
- CN107562939B (application number CN201710862705.8A)
- Authority
- CN
- China
- Prior art keywords
- news
- user
- module
- website
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a website news recommendation method, a website news recommendation device and a readable storage medium. The website news recommendation method comprises the following steps: establishing a user interest module according to user characteristic data acquired by a portal website; establishing a news module according to news characteristic data stored in the portal website; and performing vertical-domain news recommendation by combining the user interest module and the news module according to a preset recommendation rule, and displaying the recommended news. By establishing the user interest module and the news module, the invention can accurately obtain the user's interests and the attributes of the news, which improves the accuracy and professionalism of news recommendation in the vertical domain, increases the user's stickiness to the portal website, and improves the user experience.
Description
Technical Field
The invention relates to the field of data mining and machine learning, in particular to a website news recommending method and device and a readable storage medium.
Background
At present, most portal websites provide a news recommendation function, and current news recommendation methods generally recommend news according to information such as the news a user has clicked or the keywords the user has searched. Although such methods can quickly recommend a large amount of keyword-related news, they cannot deeply mine the user's points of interest. The user receives many recommendations but cannot reliably find the content they actually care about among them, so the user loses trust in the website, pays less attention to it, and other adverse effects follow.
Disclosure of Invention
The invention mainly aims to provide a website news recommendation method, aiming at solving the problem that a portal website cannot accurately recommend news in the vertical field to a user.
In order to achieve the above object, the present invention provides a website news recommendation method, including the steps of:
establishing a user interest module according to user characteristic data acquired by a website;
establishing a news module according to news characteristic data stored in a website;
and generating the recommended news in the vertical field by combining the user interest module and the news module according to the preset recommendation rule, and sending the recommended news to the user side for displaying.
The step of establishing the user interest module according to the user characteristic data acquired by the website comprises the following steps:
acquiring the user's basic information data from the user characteristic data, and establishing a first interest module from the basic information data when the user side browses the website in a cold start state, wherein the user interest module comprises the first interest module.
The step of establishing the user interest module at the cold start through the basic information data comprises the following steps:
and acquiring user classification business expansion information based on the browsing history of the user side, and analyzing the user classification business expansion information to obtain a second interest module related to the short-term preference of the user, wherein the user interest module further comprises the second interest module.
The step of analyzing the user classification business expansion information to obtain a second interest module comprises the following steps:
and acquiring the behavior data of the user, and analyzing the behavior data of the user to obtain a third interest module related to the long-term preference of the user, wherein the user interest module further comprises the third interest module.
Optionally, the step of establishing a news module according to the news feature data stored in the website includes:
acquiring news data in a text form stored in a website, and performing data structure processing on the news data to generate digitalized news data;
and establishing a digitalized news module according to the processed digitalized news data.
The step of performing data structure processing on the news data to generate digitalized news data comprises the following steps:
and converting the news data in the text form into corresponding keyword vectors, and obtaining the digitalized news data according to the keyword vectors.
Optionally, the step of generating the recommended news of the vertical domain by combining the preset recommendation rule with the user interest module and the news module includes:
and generating primary recommended news after recommending according to the recommendation rule, the user interest module and the news module, further screening the primary recommended news according to the expert opinions, and finally generating the recommended news.
Optionally, the step of sending the recommended news to the user side for display includes:
and comprehensively arranging the acquired recommended news and displaying the recommended news on a client.
In addition, to achieve the above object, the present invention also provides a news recommendation apparatus, including: the system comprises a memory, a processor and a news recommending program stored on the memory and capable of running on the processor, wherein the news recommending program realizes the steps of the website news recommending method when being executed by the processor.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a news recommendation program, which when executed by a processor, implements the steps of the news recommendation method as described above.
According to the website news recommendation method, a user interest module is established to collect statistics on the user's reading habits, interests and behaviors and to analyze the types of news the user wants. A news module is then established to obtain news of the relevant types, and the news to recommend is obtained by matching it against the user's interests under preset rules. The invention can track the user's short-, medium- and long-term news reading as a whole, accurately identify the user's news reading needs, and accurately recommend news in the relevant field while ensuring accuracy and professionalism, so that the user obtains high-quality news recommendations on the portal website and the user's experience of the portal website is improved.
Drawings
FIG. 1 is a schematic diagram of a terminal/device structure of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a website news recommendation method according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a detailed process of step S10 in another embodiment of the method for recommending news on websites of the present invention;
FIG. 4 is a schematic diagram of a module structure of a website news recommendation method of the present invention;
FIG. 5 is a schematic view of a workflow of the website news recommendation method according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present invention.
The terminal of the embodiment of the invention can be a PC, or a mobile terminal device with a display function, such as a smart phone, a tablet computer, an electronic book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a portable computer, and the like.
As shown in fig. 1, the terminal may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, a communication bus 1002. Wherein a communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a Display screen (Display), an input unit such as a Keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface, a wireless interface. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Optionally, the terminal may further include a camera, a Radio Frequency (RF) circuit, sensors, an audio circuit, a WiFi module, and the like. The sensors include, for example, light sensors, motion sensors, and other sensors. Specifically, the light sensor may include an ambient light sensor, which adjusts the brightness of the display screen according to the ambient light, and a proximity sensor, which turns off the display screen and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, the gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and the magnitude and direction of gravity when the mobile terminal is stationary; it can be used for applications that recognize the attitude of the mobile terminal (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer and tapping). The mobile terminal may also be configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described here.
Those skilled in the art will appreciate that the terminal structure shown in fig. 1 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a website news recommendation program.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be configured to call the website news recommender stored in the memory 1005 and perform the following operations:
establishing a user interest module according to user characteristic data acquired by a website;
establishing a news module according to news characteristic data stored in a website;
and generating the recommended news in the vertical field by combining the user interest module and the news module according to the preset recommendation rule, and sending the recommended news to the user side for displaying.
Further, the processor 1001 may call the website news recommender stored in the memory 1005, and further perform the following operations:
acquiring the user's basic information data from the user characteristic data, and establishing a first interest module from the basic information data when the user side browses the website in a cold start state, wherein the user interest module comprises the first interest module.
Referring to fig. 2, a first embodiment of the present invention provides a news recommendation method, where the website news recommendation method includes:
step S10, establishing a user interest module according to the user characteristic data;
step S20, establishing a news module according to the news characteristic data;
and step S30, generating the recommended news of the vertical field according to the preset recommendation rule by combining the user interest module and the news module, and sending the recommended news to the user side for displaying.
Specifically, the news recommendation system first establishes the user interest module from the user feature data, wherein the user feature data comprises basic information, user behavior features and classified service extension features. Then, according to the feature data of the news, the news stored in text form on the Internet is converted into structured data to facilitate subsequent calculation. Finally, recommended news is generated from the data of the user interest module and the news module according to a preset news recommendation rule and sent to the user side (a browser or a client).
Most current web portals recommend news according to users' usage habits: the usual method recommends news related to the user's click or search records (for example, if a user searches for Liu De Wai to find news about him, the portal increases its recommendations of news about Liu De Wai, and even of entertainment news in general). This may make information easier to obtain for a short time, but it is not truly intelligent recommendation. A user may search for or click on a piece of news because of a short-term work or personal need rather than a personal interest, so once that need has passed, the related news the portal keeps recommending is like spam to the user: it has no positive effect and may even annoy the user.
To increase user stickiness, the user's points of demand must be found accurately. The invention models the news through techniques such as keyword extraction and topic discovery, converting unstructured data into storable structured data. It then constructs and updates a user module by analyzing the user's basic information and behaviors. A hybrid recommendation algorithm, combined with expert knowledge of the industry, provides the user with needed or interesting information. Finally, the composition of the recommendation list is adjusted by analyzing the user's feedback on it, making the list more personalized and intelligent.
First, a user module is established, and user characteristic data is acquired to handle the different situations that arise in the recommendation process. The user characteristic data comprises basic information, user behavior features and classified service extension features. The basic information is the information filled in by a user during website registration and is used to make simple news recommendations at cold start for a new user with no reading history. The user behavior features are the operations the user performs while reading news; the system raises the weight of the keywords or named entities in the related news and stores them in the user's long-term interest module. Some user operation behaviors and their descriptions are presented in Table 1.
TABLE 1
And the classified service extension information is user characteristics which can express user preference and are extracted from service categories (referred to as news read by a user) used by the user. As the news reading amount of the user increases, the extended information also increases, so that the extended information needs to be updated in time so as to accurately capture the short-term interest of the user.
Finally, the three kinds of user features are integrated to model the user as a whole. The module of user i is $U_i = \{I_i, RH_i, P_i\}$, where $I_i$ represents the basic information of user i, e.g. $I_i = \{\text{gender} = 0, \text{age} = 4, \ldots\}$; $RH_i$ represents the IDs of the 20 most valuable articles read by user i, $RH_i = \{NT_1, NT_2, \ldots, NT_{20}\}$; and $P_i$ represents the keywords that user i is interested in during a certain period, obtained jointly from the behavior features of user i and the classified service extension information, $P_i = \{k_{i1}, k_{i2}, \ldots, k_{i20}\}$.
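The user module described above can be sketched as a small data structure. This is only an illustration: the class name `UserModule`, the field names, and the `record_read` helper are hypothetical, and because the text does not say how the "most valuable" articles are chosen, the sketch simply keeps the most recently read IDs as a stand-in.

```python
from dataclasses import dataclass, field

MAX_ITEMS = 20  # the description keeps 20 article IDs and 20 keywords


@dataclass
class UserModule:
    """Illustrative container for U_i = {I_i, RH_i, P_i}."""
    basic_info: dict[str, int]                                   # I_i, e.g. {"gender": 0, "age": 4}
    read_history: list[str] = field(default_factory=list)       # RH_i, article IDs
    interest_keywords: list[str] = field(default_factory=list)  # P_i, keywords

    def record_read(self, article_id: str) -> None:
        """Move the article to the front and keep at most MAX_ITEMS IDs.

        Assumption: recency stands in for "most valuable".
        """
        if article_id in self.read_history:
            self.read_history.remove(article_id)
        self.read_history.insert(0, article_id)
        del self.read_history[MAX_ITEMS:]
```

A caller would create one `UserModule` per registered user and call `record_read` whenever the user opens an article.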
Besides the user interest module, the invention also needs to establish a news module, so that news can be recommended according to the user's interests. News is stored in text form on the internet and is unstructured data. Modeling news therefore requires a series of data structuring steps that convert the text data into storable, computable structured data. News features are divided here into three categories: basic features, text features, and named entities. Through the news modeling process, a piece of news with ID j can be represented as $N_j = \{NI_j, K_j\}$, where $NI_j$ represents the basic features of news j, and $K_j$ represents the set of text features and named entities extracted from news j that can stand in for the news.
The basic features $NI_j$ record the most basic features of the news, most of which are attributes the news carries when it is released. They are captured when the news is crawled and stored in $NI_j$ in the news module, and can be used to solve the cold start problem. The list of basic news features is shown in Table 2.
TABLE 2
In addition to the basic features, news has text features, which are a series of keywords (or key terms) extracted from the body of the news that indicate its subject. Named entities are proper names, abbreviations and other unique identifiers in text, typically in 7 categories: people, institutions, places, dates, times, money and percentages. Each web portal often builds its own Chinese named-entity library with its domain-specific vocabulary, as needed, to identify the named entities in the news quickly and accurately. The weight of a keyword in the current news is then determined from factors such as its frequency and its position in the news (a keyword in a title or subtitle carries more importance in expressing the news content).
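The news representation built from basic features plus weighted keywords and named entities can be sketched as follows. The class name `NewsModule`, its field names, and the example basic-feature keys are hypothetical; the actual list of basic features belongs to Table 2, which is not reproduced in the text.

```python
from dataclasses import dataclass, field


@dataclass
class NewsModule:
    """Illustrative container for a news item: basic features plus weighted terms."""
    news_id: str
    basic_features: dict[str, str]                             # hypothetical keys, e.g. "category"
    keywords: dict[str, float] = field(default_factory=dict)  # keyword or named entity -> weight

    def top_keywords(self, n: int = 10) -> list[str]:
        """Return the n highest-weighted terms, ties broken alphabetically."""
        ranked = sorted(self.keywords.items(), key=lambda kv: (-kv[1], kv[0]))
        return [term for term, _ in ranked[:n]]


news = NewsModule(
    news_id="j",
    basic_features={"category": "sports", "published": "2017-09-20"},
    keywords={"world cup": 0.9, "football": 0.7, "stadium": 0.2},
)
```

Storing weights in a dictionary keeps the structure easy to serialize and makes it simple to truncate to the highest-weighted terms.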
The recommendation algorithm adopts a hybrid recommendation algorithm, namely, results obtained by three recommendation algorithms (an AJS recommendation algorithm, an AK-means recommendation algorithm and an ABC-BC recommendation algorithm which are common algorithms of data mining at present) are fused according to the reading habits of users. Each recommendation algorithm is improved according to the characteristics and requirements of the vertical portal to obtain a more accurate recommendation result (the system work flow chart of the invention is shown in fig. 5). The three recommendation methods are respectively content-based recommendation, association rules and collaborative filtering, and all the recommendation methods adopt classical algorithms. This patent focuses on the design of the modules for the vertical portal, and the relationship and structure between the modules (a schematic diagram of the module structure is shown in fig. 4).
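One simple way to fuse the outputs of several recommenders is a weighted sum of their scores. The sketch below is an assumption, not the fusion rule of the patent: the function name `fuse_recommendations`, the score dictionaries, and the per-user weights are all hypothetical, and the text only says that the results are fused according to the user's reading habits.

```python
def fuse_recommendations(score_lists, weights):
    """Combine per-algorithm scores into one ranked list of news IDs.

    score_lists: {algorithm name: {news_id: score}}
    weights:     {algorithm name: weight}, assumed to reflect the user's habits
    """
    fused = {}
    for name, scores in score_lists.items():
        w = weights.get(name, 0.0)  # algorithms without a weight contribute nothing
        for news_id, score in scores.items():
            fused[news_id] = fused.get(news_id, 0.0) + w * score
    # Highest fused score first; ties broken by news ID for a stable order.
    return sorted(fused, key=lambda nid: (-fused[nid], nid))


ranking = fuse_recommendations(
    {
        "content": {"n1": 0.9, "n2": 0.2},
        "association": {"n2": 0.8},
        "collaborative": {"n3": 0.5, "n1": 0.1},
    },
    {"content": 0.5, "association": 0.3, "collaborative": 0.2},
)
```

Here the three keys stand for content-based recommendation, association rules and collaborative filtering; any scoring functions could be plugged in behind them.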
To make the recommendation list more reasonable and user-friendly, the website news recommendation method can sort the news items selected by the news recommendation system by release time. With this ordering, timely news has more chances of being read, and the recommendation list is more reasonable and readable. In addition, multi-dimensional news recommendation can be performed through manual recommendation by an editor (background manager), by taking the news with the highest click frequency within a certain time (hot news), and so on, to ensure the user obtains the most accurate news in time.
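The arrangement step can be sketched as a sort by release time followed by a merge with editor picks and hot news. The function name `arrange_recommendations`, the item layout, and the choice to place editor picks and hot news ahead of the time-sorted list are assumptions made for illustration.

```python
from datetime import datetime


def arrange_recommendations(recommended, editor_picks=(), hot_news=()):
    """Order the final list: editor picks, then hot news, then newest-first recommendations.

    Each item is a dict with an "id" and a "published" datetime (hypothetical layout).
    Duplicates are dropped, keeping the first occurrence.
    """
    by_time = sorted(recommended, key=lambda item: item["published"], reverse=True)
    ordered, seen = [], set()
    for item in [*editor_picks, *hot_news, *by_time]:
        if item["id"] not in seen:
            seen.add(item["id"])
            ordered.append(item)
    return ordered


a = {"id": "a", "published": datetime(2017, 9, 1)}
b = {"id": "b", "published": datetime(2017, 9, 20)}
c = {"id": "c", "published": datetime(2017, 9, 10)}
final_list = arrange_recommendations([a, b, c], hot_news=[c])
```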
Further, referring to fig. 3, the step of establishing the user interest module according to the user feature data acquired by the website includes:
step S11, acquiring the basic information data of the user in the user characteristic data, and establishing a first interest module through the basic information data when the user terminal browses the website and is in cold start, wherein the user interest module comprises a first interest module.
Specifically, basic information registered by a user on a website is obtained, and a first interest module of the user is established according to the basic information, wherein the first interest module is mainly used for news recommendation when the user is in cold start.
When using the portal website, a user must first register an account and fill in some basic information to complete the registration. The invention uses this registration information to solve the problem of news recommendation at cold start (that is, when the user browses the portal website for the first time or has no reading history). At cold start there is no reading history to analyze, so news related to the data in the user's basic information is recommended; at the same time, other users with interests similar to the user's are found, and news is recommended according to their preferences. The registration pages of different web portals differ and are usually tailored to each portal's content: for example, a campus network may ask for a student number (from which information such as the user's major and graduation year can be obtained), and a sports portal may ask for the sports the user is good at or interested in. Therefore, at cold start (when no data other than the basic information, such as reading history, exists), the user interest module can be established from the basic information given at registration in order to recommend related news.
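A cold-start recommendation based on registration data can be sketched by finding existing users with similar basic information and borrowing from their reading histories. The similarity measure (share of matching fields), the function names, and the record layout are assumptions; the text does not specify how similar users are found.

```python
def profile_similarity(a, b):
    """Fraction of basic-information fields on which two profiles agree."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    matches = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return matches / len(keys)


def cold_start_recommend(new_user_info, known_users, top_users=2, limit=5):
    """Recommend news read by the most similar existing users.

    known_users: list of {"info": dict, "history": list of news IDs} (hypothetical layout)
    """
    ranked = sorted(
        known_users,
        key=lambda u: profile_similarity(new_user_info, u["info"]),
        reverse=True,
    )
    picks = []
    for user in ranked[:top_users]:
        for news_id in user["history"]:
            if news_id not in picks:
                picks.append(news_id)
    return picks[:limit]


known = [
    {"info": {"major": "cs", "year": 2017}, "history": ["n1", "n2"]},
    {"info": {"major": "art", "year": 2017}, "history": ["n2", "n3"]},
    {"info": {"major": "art", "year": 2010}, "history": ["n9"]},
]
suggestions = cold_start_recommend({"major": "cs", "year": 2017}, known)
```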
Further, the step of establishing the user interest module at the cold start by the basic information data comprises the following steps:
step S12, obtaining the user classification business expansion information based on the browsing history record of the user terminal, and analyzing the user classification business expansion information to obtain a second interest module related to the short-term preference of the user, wherein the user interest module further comprises a second interest module.
Specifically, by acquiring and analyzing the classified service extension information data of the user, a second interest module about the short-term preference of the user is established.
The classification business expansion information is obtained by extracting a certain number of recently read news items from the user's news reading history; analyzing it yields the second interest module, which accurately reflects the user's short-term preference. To keep the short-term interest accurate, the samples extracted for the classification business expansion information must be updated in time, and the expansion information must grow as the user's news reading increases, so that the user features stay accurate and the user's short-term interest module can be established accurately.
News is time-sensitive, and a user may pay more attention to news in one or several fields for a short period because of work or personal reasons. For example, during a World Cup the user may read football-related news much more often, which indicates a strong interest in the World Cup; the system can then push more football and World Cup news for a short period to meet that demand. The interest module for the user's short-term news reading is established from the user's classification business expansion information, so that the user's short-term interest is captured more accurately.
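The short-term interest step can be sketched as a sliding window over the most recently read news. The function name `short_term_interest`, the window size, and the use of raw keyword counts are assumptions; the text only says that a certain number of recently read items are sampled and kept up to date.

```python
from collections import Counter


def short_term_interest(read_history, window=20, top_n=20):
    """Return the most frequent keywords among the last `window` articles read.

    read_history: keyword lists of the articles read, oldest first (hypothetical layout).
    """
    recent = read_history[-window:]
    counts = Counter(keyword for article in recent for keyword in article)
    return [keyword for keyword, _ in counts.most_common(top_n)]


history = [
    ["election"],
    ["football", "world cup"],
    ["football"],
    ["football", "world cup"],
]
current_interest = short_term_interest(history, window=3)
```

Because only the tail of the history is counted, older topics such as "election" drop out once the user's reading shifts.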
Further, the step of analyzing the user classification business expansion information to obtain the second interest module includes:
step S13, acquiring the behavior data of the user, and analyzing the behavior data of the user to obtain a third interest module related to the long-term preference of the user, wherein the user interest module further includes the third interest module.
Specifically, the third interest module of the long-term preference of the user is obtained according to the obtained behavior data of the user in the reading history and the reading operation behavior and according to the analysis of the behavior data of the user.
When reading news, a user performs different reading operations according to different reading purposes and interests, and the user's reading preference and purpose can be analyzed and judged from these operations. A reading operation behavior is any operation other than reading itself that the user performs on a news item. Different operation behaviors carry different meanings, and news on which the user has performed such operations is given a higher weight among the news the user has read (generally, a user operates more on news of greater interest and less on news they only need to be aware of), so that a more detailed third interest module for long-term preference can be established from the number of times the user performs each operation.
Establishing the user's long-term interest module allows the user's preferences to be analyzed more accurately and helps the user obtain in-depth, high-quality news in the field or fields of interest, so that the user develops strong stickiness to the portal website and user loyalty improves.
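Weighting keywords by reading operations can be sketched as an accumulation over behavior events. The operation names and weight values in `ACTION_WEIGHTS` are hypothetical, since the operation list of Table 1 is not reproduced in the text, and the function name `long_term_interest` is likewise illustrative.

```python
from collections import defaultdict

# Hypothetical operations and weights; the real list belongs to Table 1.
ACTION_WEIGHTS = {"read": 1.0, "comment": 2.0, "share": 2.5, "favorite": 3.0}


def long_term_interest(events):
    """Accumulate keyword scores from (action, keywords) behavior events."""
    scores = defaultdict(float)
    for action, keywords in events:
        weight = ACTION_WEIGHTS.get(action, 1.0)  # unknown actions count as a plain read
        for keyword in keywords:
            scores[keyword] += weight
    return dict(scores)


profile = long_term_interest([
    ("read", ["tennis"]),
    ("favorite", ["football"]),
    ("share", ["football", "transfer"]),
])
```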
Further, the step of establishing a news module according to the news characteristic data stored in the website includes:
step S21, obtaining news data in text form stored in the website, and performing data structure processing on the news data to generate digitalized news data;
step S22, creating a digitized news module according to the processed digitized news data.
Specifically, news is stored in text form on the internet, and the text form cannot be used for building a news module, so that a process of converting news content in text form into a data structure is required, and a digitalized news module is built according to the obtained data structure.
News read by a user on a portal website is stored on the website server in text form. The text must go through data structuring to yield structured data, which is easier to store and compute with (text storage takes a large amount of server space and is difficult to use directly in calculation), so that a digitized news module can be better established. Once the digitized news module is established, the recommendation rule matches it against the user interest module, so that news related to the user's interests can be obtained and recommended.
Further, the step of performing data structure processing on the news data to generate the digitalized news data includes:
and step S211, converting the news data in the text form into corresponding keyword vectors, and obtaining digitalized news data according to the keyword vectors.
Specifically, news of a website is converted into a data structure in a keyword vector form by methods of extracting basic features and text features and the like from data in a text form.
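Once news is expressed as weighted keywords, it can be laid out as a numeric vector over a shared vocabulary and compared with a user's interest keywords. The sketch below uses cosine similarity for the comparison, which is an assumption: the text does not name a similarity measure, and the function names and vocabulary are hypothetical.

```python
import math


def to_vector(weights, vocabulary):
    """Lay out keyword weights as a dense vector following the vocabulary order."""
    return [weights.get(term, 0.0) for term in vocabulary]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


vocabulary = ["cup", "goal", "stock"]
news_vector = to_vector({"cup": 0.75, "goal": 0.5}, vocabulary)
user_vector = to_vector({"cup": 1.0}, vocabulary)
match = cosine(news_vector, user_vector)
```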
The news of the portal is stored in a text form, but the news in the text form cannot be used when the news module is built, so structural transformation is needed to be carried out so as to better build the news module. The data structuring process of the news in the text form is divided into two steps, and firstly, the basic characteristics of the news are obtained. The basic news characteristics record the most basic characteristics of news, and most of the basic news characteristics are self-contained attributes when the news is published. The news basic characteristics are shown in table 3.
TABLE 3
Basic features give a rough indication of information such as the type of news, so that news can be recommended quickly during cold start. Beyond the basic features, the text features of the news must also be identified and extracted. The text features of news are a series of words extracted from the body of the news that indicate its subject. Named entities are proper names, abbreviations and other unique identifiers in the text, typically falling into seven categories: people, organizations, places, dates, times, money, and percentages. Different web portals often build their own library of Chinese named entities with site-specific vocabulary as needed, so that named entities in the news can be identified quickly and accurately. To make the selected keywords better represent the gist of the document, the TF×IDF method is generally adopted, and the formula is as follows:
The weight of each keyword is calculated, and the keywords with the highest weights are stored. To make the selected keywords represent the news content more accurately, the TF×IDF formula is improved, and the formula is as follows:
After the keyword weights have been calculated, the 10 keywords with the highest weights are stored in the news module K_j. Here W_jk represents the weight of keyword k in the news item with ID j; tf_jk represents the number of occurrences of keyword k_jk in news j; tdf_k represents the number of occurrences of keyword k_jk in all documents; and W represents a weighting factor. The system stipulates that W is greater than 1 when the keyword appears in the title and the subtitle and is a named entity, and that W is less than or equal to 1 otherwise. The improvement rests on the observation that words in the title and named entities carry more importance in expressing the news content, so they receive a larger weight; the specific values of W are determined by each system. Through news modeling, the news vector module with ID j is obtained as N_j = {NI_j, K_j}, where NI_j represents the basic features of news j, and K_j represents the set of text features and named entities extracted from news j that can stand in for the news.
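The keyword weighting and news modeling described above can be sketched as follows. This is a minimal illustration, not the patented formula: the exact TF×IDF expression is not reproduced in this text, so the sketch assumes the common form W × tf × log(N / df), treats tdf_k as the number of documents containing the keyword, and uses a hypothetical boost of 1.5 for title words that are named entities. The names `NewsVector`, `keyword_weights`, `boost` and the fields inside `basic` are illustrative assumptions.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class NewsVector:
    """N_j = {NI_j, K_j}: basic features plus the top-weighted keywords."""
    basic: dict      # NI_j; the field names used here are hypothetical
    keywords: dict   # K_j; maps keyword -> weight


def keyword_weights(body_tokens, title_tokens, named_entities,
                    doc_freq, n_docs, boost=1.5, top_n=10):
    """Score each keyword as W * tf * log(n_docs / df) and keep the top_n.

    Assumptions: the log-IDF form and boost=1.5 are illustrative. The text
    only requires W > 1 for title words that are named entities and
    W <= 1 otherwise.
    """
    tf = Counter(body_tokens)
    title = set(title_tokens)
    weights = {}
    for word, count in tf.items():
        df = doc_freq.get(word, 1)  # unseen words are treated as rare
        w = boost if (word in title and word in named_entities) else 1.0
        weights[word] = w * count * math.log(n_docs / df)
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:top_n])


# Toy example with made-up tokens and document frequencies.
doc_freq = {"tesla": 2, "battery": 5, "the": 100}
k_j = keyword_weights(["tesla", "battery", "tesla", "the"],
                      ["tesla"], {"tesla"}, doc_freq, n_docs=100)
n_j = NewsVector(basic={"category": "auto"}, keywords=k_j)
```

In this toy input, the title word that is also a named entity ranks first, and a word that appears in every document receives a weight of zero, which is the behavior the title and named-entity weighting is meant to produce.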
Further, the step of generating the recommended news of the vertical field by combining the user interest module and the news module through the preset recommendation rule includes:
and step S31, generating primary recommended news according to the recommendation rule, the user interest module and the news module, further screening the primary recommended news according to expert opinions, and finally generating the recommended news.
Specifically, primary recommended news can be generated according to the recommendation rule, the user interest module and the news module, and in order to improve the accuracy and the content depth of the recommended news, the primary recommended news is screened according to the opinions of professionals in related industries, and the recommended news is obtained after screening.
Since a vertical portal mainly provides news and other content for a specific field (or region), the news it recommends generally involves some professional knowledge. To ensure the professionalism of the recommended news, potential relations between keywords are obtained by an association rule algorithm in accordance with association rules formulated by experts. These potential relations are mined from the complete reading histories of all users, which improves the accuracy of the algorithm, particularly of the content-based recommendation algorithm. Using expert knowledge of the specific field as background knowledge for the recommendation system helps solve the cold start problem, improves recommendation accuracy, and increases the user's trust in the system.
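The mining of potential relations between keywords from reading histories can be illustrated with a simple pairwise support and confidence calculation. This is only a sketch under stated assumptions: the text does not name a specific association rule algorithm, so the function `keyword_associations`, its input shape (one keyword set per user), and the threshold values are hypothetical stand-ins for rules that would in practice be set by domain experts.

```python
from collections import Counter
from itertools import combinations


def keyword_associations(histories, min_support=0.5, min_confidence=0.6):
    """Return {(a, b): confidence} for rules a -> b mined from histories.

    histories: list of keyword collections, one per user (assumed shape).
    min_support and min_confidence are illustrative thresholds.
    """
    n = len(histories)
    single = Counter()
    pair = Counter()
    for keywords in histories:
        items = sorted(set(keywords))
        single.update(items)
        pair.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair.items():
        if count / n < min_support:
            continue  # the pair co-occurs too rarely to be trusted
        for head, tail in ((a, b), (b, a)):
            confidence = count / single[head]
            if confidence >= min_confidence:
                rules[(head, tail)] = confidence
    return rules


# Toy reading histories, one keyword set per user.
histories = [
    {"ev", "battery"},
    {"ev", "battery", "charging"},
    {"ev"},
    {"battery", "ev"},
]
rules = keyword_associations(histories)
```

A rule such as ("battery", "ev") with high confidence suggests that a user who reads about one keyword may be shown news about the other, even before the user has any history with it.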
Further, the step of sending the recommended news to the user side for display includes:
and step S32, comprehensively arranging the acquired recommended news and displaying the recommended news on a user side.
Specifically, the number of the news acquired through the recommendation rule is large, so that the news needs to be combined and arranged according to a certain sequence, and the news is displayed at the user side after the arrangement is finished.
The news display list draws on three main recommendation modes. In the first, the system uses a hybrid recommendation algorithm to generate recommended news from the user module and the news module. In the second, editors (back-end administrators) generate recommended news according to simple mapping rules. In the third, the system selects the several most-clicked news items within a certain period to generate recommended news. Together, the three modes present the news content the user wants comprehensively, and the combination of manual and system recommendation avoids bias in the recommended news caused by human emotion or by system calculation alone. The current hottest news is recommended independently of the individual user, so the user does not miss it whether or not the user follows the related field. News produced by the different recommendation rules has different emphases, and the user's own emphasis can be judged from the user module, so that the recommended news can be better combined and arranged.
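One simple way to combine the three recommendation sources into a single display list is a round-robin merge with duplicate removal, sketched below. The interleaving order and the `limit` parameter are assumptions made for illustration; the text only states that the lists are combined and arranged according to the user's emphasis, without fixing a specific ordering scheme. The function name `merge_recommendations` is hypothetical.

```python
from itertools import zip_longest


def merge_recommendations(hybrid, editorial, hottest, limit=10):
    """Interleave three ranked lists of news IDs, dropping duplicates.

    hybrid:    IDs from the hybrid algorithm (user module + news module)
    editorial: IDs picked by editors via simple mapping rules
    hottest:   IDs ranked by click rate within a recent period
    """
    merged, seen = [], set()
    for group in zip_longest(hybrid, editorial, hottest):
        for news_id in group:
            if news_id is None or news_id in seen:
                continue  # skip padding from shorter lists and repeats
            seen.add(news_id)
            merged.append(news_id)
    return merged[:limit]


# Toy example with made-up news IDs.
display = merge_recommendations(["n1", "n2", "n3"], ["n2", "n4"], ["n5", "n1"])
```

A weighted variant could take more items from whichever source matches the emphasis inferred from the user module; the round-robin form is used here only because it is the simplest arrangement that keeps all three sources visible.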
The invention also provides a device based on the website news recommendation method.
The device based on the website news recommendation method comprises: a memory, a processor, and a news recommendation program stored in the memory and executable on the processor, wherein the news recommendation program, when executed by the processor, implements the steps of the news recommendation method described above.
The method implemented when the news recommendation program running on the processor is executed may refer to each embodiment of the website news recommendation method of the present invention, and details are not repeated here.
In addition, the embodiment of the invention also provides a computer readable storage medium.
A news recommendation program is stored on the computer readable storage medium of the present invention, and the news recommendation program, when executed by a processor, implements the steps of the news recommendation method described above.
The method implemented when the news recommendation program running on the processor is executed may refer to each embodiment of the website news recommendation method of the present invention, and details are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (8)
1. A website news recommendation method is characterized by comprising the following steps:
establishing a user interest module according to user characteristic data acquired by a website;
establishing a news module according to news characteristic data stored in a website;
combining a user interest module and a news module according to a preset recommendation rule to generate recommended news of a vertical field, and sending the recommended news to a user side for displaying;
the step of establishing the news module according to the news characteristic data stored in the website comprises the following steps:
acquiring news data in a text form stored in a website, and performing data structure processing on the news data to generate digitalized news data;
establishing a digitalized news module according to the processed digitalized news data;
the step of performing data structure processing on the news data to generate digitalized news data comprises the following steps:
converting the news data in the text form into corresponding keyword vectors, and obtaining digitalized news data according to the keyword vectors;
the method further comprises the following steps:
calculating the weight of each keyword through a TF×IDF formula, and storing a preset number of the keywords with the highest weights into a news module K_j, wherein the TF×IDF formula is as follows:
wherein W_jk represents the weight of keyword k in the news item with the ID of j; tf_jk represents the number of occurrences of keyword k_jk in news j; tdf_k represents the number of occurrences of keyword k_jk in all documents; W represents a weight, and when the keyword is a named entity, W is larger than 1, otherwise W is smaller than or equal to 1;
performing news modeling according to the keywords to obtain a news vector module N_j = {NI_j, K_j} with the ID of j, wherein NI_j represents basic characteristics of news j, and the news module K_j represents a set of text characteristics and named entities which are extracted from news j and can replace the news.
2. A website news recommendation method according to claim 1, wherein the step of establishing a user interest module based on the user characteristic data acquired from the website comprises:
acquiring basic information data when a user side browses the website, and establishing a first interest module at cold start through the basic information data, wherein the user interest module comprises the first interest module.
3. A website news recommendation method as claimed in claim 2, wherein said step of building a user interest module at a cold start through the basic information data is followed by:
and acquiring user classification business expansion information based on the browsing history of the user side, and analyzing the user classification business expansion information to obtain a second interest module related to the short-term preference of the user, wherein the user interest module further comprises the second interest module.
4. A website news recommendation method according to claim 3, wherein said step of analyzing the user-categorized business development information to obtain the second interest module is followed by:
and acquiring the behavior data of the user, and analyzing the behavior data of the user to obtain a third interest module related to the long-term preference of the user, wherein the user interest module further comprises the third interest module.
5. The website news recommendation method of claim 1, wherein the step of generating recommended news of the vertical domain by combining the user interest module and the news module through the preset recommendation rule comprises:
and generating primary recommended news after recommending according to the recommendation rule, the user interest module and the news module, further screening the primary recommended news according to the expert opinions, and finally generating the recommended news.
6. The website news recommendation method of claim 1, wherein the step of sending the recommended news to a user side for presentation comprises:
and comprehensively arranging the acquired recommended news and displaying the recommended news on a user side.
7. A news recommender, the apparatus comprising: a memory, a processor and a web site news recommender stored on the memory and operable on the processor, the web site news recommender when executed by the processor implementing the steps of the web site news recommendation method as claimed in any one of claims 1 to 6.
8. A computer-readable storage medium, having stored thereon a website news recommender, which, when executed by a processor, performs the steps of the news recommendation method as claimed in any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710862705.8A CN107562939B (en) | 2017-09-21 | 2017-09-21 | Vertical domain news recommendation method and device and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107562939A | 2018-01-09 |
CN107562939B | 2021-03-23 |
Family
ID=60982119
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710862705.8A Active CN107562939B (en) | 2017-09-21 | 2017-09-21 | Vertical domain news recommendation method and device and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107562939B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||