CN114186099A

CN114186099A - Data storage method and device

Info

Publication number: CN114186099A
Application number: CN202111509697.1A
Authority: CN
Inventors: 王得贤; 李长亮
Original assignee: Beijing Kingsoft Digital Entertainment Co Ltd
Current assignee: Beijing Kingsoft Digital Entertainment Co Ltd
Priority date: 2021-07-13
Filing date: 2021-12-10
Publication date: 2022-03-15

Abstract

The application provides a data storage method and a data storage device. The data storage method comprises: acquiring current hotspot data and historical query data of a user; determining at least one piece of target hotspot information according to the current hotspot data and the historical query data of the user; and extracting a target feature of each piece of target hotspot information, and storing each piece of target hotspot information in association with its target feature in a one-to-one correspondence. In this way, frequently accessed information can be stored and the data content of the storage medium can be adjusted dynamically, which improves access efficiency, reduces how often data is transferred between different storage media, addresses data caching at the level of data association, and prolongs the service life of the storage medium.

Description

Data storage method and device

Technical Field

The application relates to the field of artificial intelligence within computer technology, and in particular to a data storage method. The application also relates to a data storage device, a computing device, and a computer-readable storage medium.

Background

Artificial Intelligence (AI) refers to the ability of an engineered (i.e., designed and manufactured) system to perceive its environment and to acquire, process, apply, and represent knowledge. Key technologies in the field of artificial intelligence include machine learning, knowledge graphs, natural language processing, computer vision, human-computer interaction, biometric recognition, and virtual reality/augmented reality. A knowledge graph (Knowledge Graph) describes the concepts, entities, and relations of the objective world in a structured form, expresses information on the internet in a form closer to human cognition, and provides a better way to organize, manage, and understand the massive information on the internet. With the development of computer technology, new access technologies keep emerging. To store and access a large-scale knowledge graph, a graph database or a relatively stable distributed big-data platform is generally used. That is, in large-scale data storage, a small part of the data is kept in memory for efficient reading, while most of the data is kept on hard disk. However, user retrieval and access involve frequent data reads, which strongly affects both the efficiency of data access and the service life of the storage medium.

In the prior art, the sizes of the cache and the memory are generally used as efficiently as possible, and a garbage collector moves frequently accessed data in or out within the period in which the data is active, or a memory management system is designed to manage the data and keep frequently accessed data in the cache, so as to achieve efficient data access. However, when these approaches are used to store knowledge graph data, the user experience is poor, which reduces user stickiness. An effective solution to the above problems is therefore needed.

Disclosure of Invention

In view of this, embodiments of the present application provide a data storage method to solve technical defects in the prior art. The embodiment of the application also provides a data storage device, a computing device and a computer readable storage medium.

According to a first aspect of embodiments of the present application, there is provided a data storage method, including:

acquiring current hotspot data and historical query data of a user;

determining at least one piece of target hotspot information according to the current hotspot data and the historical query data of the user;

and extracting a target feature of each piece of target hotspot information, and storing each piece of target hotspot information in association with its target feature in a one-to-one correspondence.

According to a second aspect of embodiments of the present application, there is provided a data storage device comprising:

the acquisition module is configured to acquire current hotspot data and historical query data of a user;

the determining module is configured to determine at least one piece of target hotspot information according to the current hotspot data and the historical query data of the user;

and the storage module is configured to extract a target feature of each piece of target hotspot information and to store each piece of target hotspot information in association with its target feature in a one-to-one correspondence.

According to a third aspect of embodiments herein, there is provided a computing device comprising:

a memory and a processor;

the memory is used for storing computer-executable instructions, and the processor realizes the steps of the data storage method when executing the computer-executable instructions.

According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the data storage method.

According to a fifth aspect of embodiments of the present application, there is provided a chip storing computer instructions which, when executed by the chip, implement the steps of the data storage method.

According to the data storage method provided by the application, current hotspot data and historical query data of a user are acquired; at least one piece of target hotspot information is determined according to the current hotspot data and the historical query data of the user; and a target feature of each piece of target hotspot information is extracted, and each piece of target hotspot information is stored in association with its target feature in a one-to-one correspondence. In this way, frequently accessed information can be stored and the data content of the storage medium can be adjusted dynamically, which improves access efficiency, reduces how often data is transferred between different storage media, addresses data caching at the level of data association, and prolongs the service life of the storage medium.

Drawings

FIG. 1 is a schematic flowchart of a data storage method according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of data storage according to an embodiment of the present application;

FIG. 3 is a schematic flowchart of another data storage method according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a data storage device according to an embodiment of the present application;

FIG. 5 is a block diagram of a computing device according to an embodiment of the present application.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described here, and those skilled in the art can make similar extensions without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.

The terminology used in the one or more embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the present application. As used in one or more embodiments of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present application refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It will be understood that, although the terms first, second, etc. may be used in one or more embodiments of the present application to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first aspect may be termed a second aspect, and, similarly, a second aspect may be termed a first aspect, without departing from the scope of one or more embodiments of the present application.

First, the terms used in one or more embodiments of the present application are explained.

Graph database (Graph Database): a non-relational database that applies graph theory to store entities and the relationships between them. Common graph databases include Neo4j (a network-oriented graph database), OrientDB (a deeply scalable document-graph database that combines the flexibility of a document database with the link-management capability of a graph database), and Titan (a client library that depends on a storage engine). Graph databases are important carriers of knowledge graph data.

Graph embedding: the process of mapping graph data into low-dimensional dense vectors, that is, converting an attribute graph into a vector or a set of vectors. The embedding should capture the topology of the graph, vertex-to-vertex relationships, and other relevant information about the graph, its subgraphs, and its vertices. A whole-graph embedding represents the entire graph with a single vector; such an embedding is used to make graph-level predictions or to compare and visualize entire graphs.

Dynamic monitoring of network media: monitoring of media news reports, reposts, comments, and similar information. It is an important decision-making reference that helps enterprises and governments at all levels track media developments and the latest public-opinion hotspots, so that they can understand their online reputation in depth, maintain and improve their image, and prevent losses.

In the present application, a data storage method is provided. The present application is also directed to a data storage device, a computing device, and a computer-readable storage medium, each of which is described in detail in the following embodiments.

Fig. 1 shows a flowchart of a data storage method according to an embodiment of the present application, which specifically includes the following steps:

Step 102: acquiring current hotspot data and historical query data of a user.

Specifically, a current hotspot is an event or item that currently draws broad public attention and discussion in the media and is often controversial. The current hotspot data may be at least one of text, pictures, sound, video, and the like relating to fields that are currently widely discussed, such as current hot topics and focuses. The current hotspot data is acquired as a set; for example, text such as "flash sale", "price-cut promotion", "popular movie tickets", and "new album released by a star", together with the description corresponding to each, is collected item by item as the current hotspot data. The historical query data of the user is data generated when the user queries and browses through carriers such as multimedia, and may likewise be at least one of text, pictures, sound, video, and the like. The historical query data of the user is also acquired as a set; for example, if user U1 queries "2021 college entrance examination" through a web page, the content related to the "2021 college entrance examination" that user U1 browses on the web page is the historical query data of the user.

In practical application, the current hotspot data, that is, hotspot data on the network, can be acquired in real time by using a network media dynamic monitoring technology; for example, a web crawler automatically captures current hotspot data on the network according to certain rules. The current hotspot data can also be obtained by searching for "hotspots", "hot searches", "hot discussions", and the like on a search website. In addition, user query behavior needs to be recorded, and the historical query data of the user is then acquired from the recorded query behavior. For example, hotspot data on the network is acquired in real time by a web crawler as the current hotspot data, and the historical query data of the user is acquired from the recorded query behavior of the user.

It should be noted that the current hotspot data may be acquired first and the historical query data of the user afterwards; the historical query data of the user may be acquired first and the current hotspot data afterwards; or the two may be acquired at the same time. The application does not limit the order in which the current hotspot data and the historical query data of the user are acquired.

By acquiring the current hotspot data and the historical query data of the user, the application prepares the data needed to determine the target hotspot information, which ensures that the target hotspot information is accurate and comprehensive, further improves data storage efficiency, and can prolong the service life of the storage medium to some extent.

Step 104: determining at least one piece of target hotspot information according to the current hotspot data and the historical query data of the user.

Further, once the current hotspot data and the historical query data of the user have been acquired, the target hotspot information is determined according to them.

Specifically, the target hotspot information may be current hotspot information, such as "today's hot news", or hotspot information predicted from the historical query data of users; for example, if most users search for "food of the month" on the internet, the predicted hotspot information may be information related to "food of the month".

In practical application, once the current hotspot data is acquired, associated data within an N-degree relationship of the current hotspot data can be determined in the knowledge graph corresponding to the current hotspot data, where N is a positive integer whose value can be set as required. An N-degree relationship is the relationship between two nodes in the knowledge graph that are indirectly associated through (N-1) intermediate nodes. For example, if node A and node B are connected through node C, the relationship between node A and node B is a two-degree relationship, and the relationships between node A and node C and between node B and node C are one-degree relationships; that is, the data corresponding to node A is two-degree associated data of the data corresponding to node B, and the data corresponding to node A and to node B are each one-degree associated data of the data corresponding to node C. The current hotspot data and each item of its associated data each correspond to one node in the knowledge graph.

Meanwhile, a user query behavior structure graph can be constructed from the historical query data of the user: each item of historical query data is taken as a node, and the nodes are connected in the order in which the items were queried, so that the graph represents the connection relationships among the data. Then, an embedded representation of each item of historical query data is calculated from the connection relationships of the nodes in the user query behavior structure graph, and the embedded representations are input into a preset prediction model to obtain predicted hotspot data. Further, the information corresponding to the current hotspot data, to the associated data within the N-degree relationship of the current hotspot data, and to the predicted hotspot data is determined, and at least one piece of target hotspot information is then determined from this information.
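The N-degree lookup described above can be illustrated with a breadth-first search. The following is only a minimal sketch: it assumes the knowledge graph is available as an in-memory adjacency dictionary, and the function name `associated_within_n_degrees` and the toy graph are hypothetical, not part of the application.

```python
from collections import deque


def associated_within_n_degrees(graph, start, n):
    """Return {node: degree} for every node within n hops of start.

    graph is an adjacency dict (an assumed in-memory stand-in for the
    knowledge graph); the start node itself is not included in the result.
    """
    degrees = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if degrees[node] == n:
            # Nodes already at degree n are not expanded any further.
            continue
        for neighbor in graph.get(node, ()):
            if neighbor not in degrees:
                degrees[neighbor] = degrees[node] + 1
                queue.append(neighbor)
    del degrees[start]
    return degrees


# Toy graph from the text: node A and node B are connected through node C.
toy_graph = {"A": ["C"], "B": ["C"], "C": ["A", "B"]}
within_two = associated_within_n_degrees(toy_graph, "A", 2)
```

With this toy graph, node C is found at one degree from node A and node B at two degrees, matching the example in the text. A production system would more likely issue the equivalent traversal as a query against the graph database.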

For example, when N is 2 and the current hotspot data is D1, the associated data within the two-degree relationship of D1, namely data D12 and data D13, is determined according to D1. Meanwhile, a user query behavior structure graph is constructed from the historical query data D2, D3, and D4 of the user, and the embedded representations of D2, D3, and D4 are determined respectively, yielding predicted hotspot data D3' and D4'. The predicted hotspot data D3' may be the same as or different from the historical query data D3; for example, if D3 is an article about Mars, D3' may be another article about Mars, all articles about Mars, or the very article about Mars in D3. Similarly, the predicted hotspot data D4' may or may not be the same as the historical query data D4. Further, the information corresponding to the current hotspot data D1, data D12, data D13, and the predicted hotspot data D3' and D4' is determined, and the target hotspot information is determined from this information.

It should be noted that, to further ensure that the target hotspot information is complete and free of duplicates, a target hotspot information set may be determined first, and the target hotspot information may then be determined from that set. In an optional implementation of this embodiment, determining at least one piece of target hotspot information according to the current hotspot data and the historical query data of the user may proceed as follows:
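As an illustration of how a user query behavior structure graph might be built from an ordered query history, the sketch below links each queried item to the item queried immediately after it. The directed-edge representation, the function name `build_query_behavior_graph`, and the sample history are assumptions made for illustration; the embedding step and the preset prediction model are not specified in enough detail here to sketch and are left out.

```python
def build_query_behavior_graph(query_sequence):
    """Build a directed graph whose nodes are queried items.

    An edge item -> next_item is added whenever next_item was queried
    right after item (assumed reading of "connected in query order").
    """
    # Register every item as a node, even if it has no outgoing edge.
    edges = {item: set() for item in query_sequence}
    for current_item, next_item in zip(query_sequence, query_sequence[1:]):
        if current_item != next_item:
            edges[current_item].add(next_item)
    # Sorted lists make the result deterministic and easy to inspect.
    return {node: sorted(targets) for node, targets in edges.items()}


# Hypothetical history: the user queried D2, then D3, then D4, then D3 again.
history = ["D2", "D3", "D4", "D3"]
query_graph = build_query_behavior_graph(history)
```

The resulting adjacency structure is the kind of input a graph embedding method could consume before the embedded representations are passed to a prediction model.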

determining a target hotspot information set according to the current hotspot data and historical query data of a user;

and determining the information in the target hotspot information set as target hotspot information to obtain at least one piece of target hotspot information.

Specifically, the target hotspot information set is a set containing one or more pieces of target hotspot information.

In practical application, the associated data of the current hotspot data can first be matched according to the current hotspot data. A user query behavior structure graph is then constructed from the historical query data of the user, the embedded representation of the historical query data is determined from that graph, and hotspot data is predicted to obtain the predicted hotspot data. On this basis, the information corresponding to the current hotspot data, the information corresponding to its associated data, and the information corresponding to the predicted hotspot data are merged to generate the target hotspot information set. Identical information in the set is then deduplicated. At this point the target hotspot information set contains at least one piece of target hotspot information; that is, each piece of information in the set is target hotspot information.

Continuing the above example, the associated data of the current hotspot data D1 is determined, according to D1, to comprise data D12 and data D13; a user query behavior structure graph is constructed from the historical query data D2, D3, and D4, the embedded representations of D2, D3, and D4 are determined respectively, and predicted hotspot data D3' and D4' are obtained. The information corresponding to D1, D12, D13, D3', and D4' is then merged to obtain the target hotspot information set. If these five pieces of information all differ from one another, all of them are target hotspot information. If the information corresponding to the current hotspot data D1 is the same as the information corresponding to the predicted hotspot data D3', then, to avoid duplication, the information corresponding to D1 (or the information corresponding to D3') may be deleted from the set; the remaining information, namely the information corresponding to D12, to D13, to D3' (or to D1), and to D4', is all target hotspot information.

For another example, the information corresponding to the current hotspot data includes "exam postponed" and "last snow of 2021"; the information corresponding to the predicted hotspot data includes "Xiao Ming wins the championship match" and "exam postponed". Merging this information and removing the duplicate yields the target hotspot information: "exam postponed", "last snow of 2021", and "Xiao Ming wins the championship match".

In an optional implementation of this embodiment, the information corresponding to the current hotspot data and the information corresponding to the associated data of the current hotspot data are both current hotspot information, the information corresponding to hotspot data predicted from the historical query data of the user is predicted hotspot information, and the current hotspot information and the predicted hotspot information are merged to obtain the target hotspot information set. That is, the target hotspot information set includes current hotspot information and predicted hotspot information. In this case, determining the target hotspot information set according to the current hotspot data and the historical query data of the user may proceed as follows:

determining current hotspot information in a knowledge graph according to the current hotspot data;

determining predicted hotspot information according to historical query data of a user;

and combining the current hotspot information and the predicted hotspot information to generate a target hotspot information set.

Specifically, a knowledge graph (Knowledge Graph), known in library and information science as knowledge domain visualization or knowledge domain mapping, is a family of graphs that display the development process and structural relationships of knowledge. It uses visualization technology to describe knowledge resources and their carriers, and to mine, analyze, construct, draw, and display knowledge and the interrelations among them. By combining the theories and methods of applied mathematics, graphics, information visualization, information science, and other disciplines with methods such as bibliometric citation analysis and co-occurrence analysis, a knowledge graph uses visual graphs to vividly display the core structure, development history, frontier fields, and overall knowledge framework of a discipline, achieving a modern theory of multidisciplinary fusion. The current hotspot information is information determined according to the current hotspot data and/or the associated data of the current hotspot data; the predicted hotspot information is information determined from the historical query data of the user.

In practical application, the acquired current hotspot data can be taken as the current node in the knowledge graph, and the one-degree and two-degree relationship nodes of the current node are obtained. The information corresponding to the current node, to the one-degree relationship nodes, and to the two-degree relationship nodes is then determined; this information is the current hotspot information. That is, the current hotspot information is determined in the knowledge graph according to the current hotspot data. In addition, the historical query data of the user is acquired by recording user query behavior, a user query behavior structure graph is constructed from it, the embedded representation of each item of historical query data is determined, and hotspot data is predicted; the information corresponding to the predicted hotspot data is the predicted hotspot information. The current hotspot information and the predicted hotspot information are then merged, that is, their union is taken, to obtain the target hotspot information set.

For example, the current hotspot information {N1, N2, N3, N4} is determined according to the current hotspot data in combination with the knowledge graph. If the predicted hotspot information determined from the historical query data of the user is {N3, N5}, the union of {N1, N2, N3, N4} and {N3, N5} is taken to obtain the target hotspot information set {N1, N2, N3, N4, N5}.
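The merge-and-deduplicate step in this example can be sketched in a few lines. This is an illustrative sketch only: the function name `merge_hotspot_information` is hypothetical, and keeping the first-seen order (rather than using an unordered set) is a design choice made here so the result is reproducible.

```python
def merge_hotspot_information(current_info, predicted_info):
    """Take the union of two collections, dropping duplicates.

    The order of first appearance is preserved so that repeated runs
    give the same sequence.
    """
    merged = []
    seen = set()
    for item in [*current_info, *predicted_info]:
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


# Values taken from the example in the text.
current_info = ["N1", "N2", "N3", "N4"]
predicted_info = ["N3", "N5"]
target_set = merge_hotspot_information(current_info, predicted_info)
```

Here N3 appears in both inputs but only once in `target_set`, which contains N1 through N5.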

After the target hotspot information is determined, different pieces of target hotspot information differ in content and therefore in importance, so their weights also differ. To reflect the importance of each piece, the target hotspot information can be sorted by weight: after at least one piece of target hotspot information is determined, the weight of each piece is determined, and the pieces are sorted according to their weights. This makes it convenient to store the target hotspot information according to its weight.

Because the target hotspot information is determined from the current hotspot information and the predicted hotspot information, the weight of each piece of target hotspot information is necessarily influenced by the weights of the current hotspot information and of the predicted hotspot information. The weight of each piece of target hotspot information can therefore be determined as follows:

acquiring a first weight and a second weight of each piece of target hotspot information, wherein the first weight is the weight of that piece in the current hotspot information, and the second weight is the weight of that piece in the predicted hotspot information;

and determining the weight of first target hotspot information according to the first weight and the second weight of the first target hotspot information, wherein the first target hotspot information is any one piece of the target hotspot information.

Specifically, the first weight is the weight of a piece of target hotspot information relative to the current hotspot information. For example, if target hotspot information X1 corresponds to hotspot information Y1 in the current hotspot information and the weight of Y1 is 0.1, the first weight of X1 is 0.1; if target hotspot information X2 corresponds to none of the hotspot information in the current hotspot information, the first weight of X2 is 0. The second weight is the weight of a piece of target hotspot information relative to the predicted hotspot information. For example, if target hotspot information X3 corresponds to hotspot information Y2 in the predicted hotspot information and the weight of Y2 is 0.7, the second weight of X3 is 0.7; if target hotspot information X4 corresponds to none of the hotspot information in the predicted hotspot information, the second weight of X4 is 0.

In practical application, because the target hotspot information is derived from the union of the current hotspot information and the predicted hotspot information, the weight of each piece of target hotspot information can be made more accurate by determining it from both sources. First, the weight of each piece of target hotspot information in the current hotspot information is determined, that is, the weight of the current hotspot information corresponding to that piece, which gives its first weight. Likewise, the weight of each piece in the predicted hotspot information is determined, that is, the weight of the predicted hotspot information corresponding to that piece, which gives its second weight. The first weight may be acquired before the second weight, the second before the first, or both at the same time; the application does not limit the order. Once the first weight and the second weight of each piece of target hotspot information have been acquired, the weight of each piece is determined according to its first weight and second weight.

For example, the target hotspot information includes N1, N2, and N3, and the current hotspot information, the predicted hotspot information, and their corresponding weights are shown in Table 1. Target hotspot information N1 corresponds to current hotspot information N1, whose weight is 0.6, so the first weight of N1 is 0.6. Target hotspot information N2 corresponds to current hotspot information N2, whose weight is 0.4, so the first weight of N2 is 0.4. Target hotspot information N3 corresponds to neither current hotspot information N1 nor N2, so the first weight of N3 is 0. Target hotspot information N1 corresponds to predicted hotspot information N1, whose weight is 0.55, so the second weight of N1 is 0.55. Target hotspot information N2 corresponds to neither predicted hotspot information N1 nor N3, so the second weight of N2 is 0. Target hotspot information N3 corresponds to predicted hotspot information N3, whose weight is 0.45, so the second weight of N3 is 0.45. Further, the weight of N1 is determined according to its first weight 0.6 and second weight 0.55; the weight of N2 according to its first weight 0.4 and second weight 0; and the weight of N3 according to its first weight 0 and second weight 0.45.

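TABLE 1 Weights of current hotspot information and predicted hotspot information

Current hotspot information    Weight    Predicted hotspot information    Weight
N1                             0.6       N1                               0.55
N2                             0.4       N3                               0.45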

In addition, when the current hotspot information and the predicted hotspot information contain the same hotspot information, the two weights of that hotspot information may differ, because its weight in the current hotspot information is determined according to the current hotspot data, whereas its weight in the predicted hotspot information is determined according to the historical query data of the user.
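The lookup of the first and second weights described above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the dictionary names and the helper `first_and_second_weights` are hypothetical, and the values are the example values from table 1. A target that is absent from one of the two sources is given a weight of 0 on that side.

```python
# Hypothetical weight tables, filled with the example values from table 1.
current_weights = {"N1": 0.6, "N2": 0.4}      # current hotspot information
predicted_weights = {"N1": 0.55, "N3": 0.45}  # predicted hotspot information


def first_and_second_weights(targets, current, predicted):
    """Return {target: (first_weight, second_weight)}.

    A target missing from one table gets weight 0 on that side.
    """
    return {t: (current.get(t, 0.0), predicted.get(t, 0.0)) for t in targets}


# The targets are the union of both sets; sorting only gives a stable order.
targets = sorted(set(current_weights) | set(predicted_weights))
pairs = first_and_second_weights(targets, current_weights, predicted_weights)
```

Note that N1 appears in both tables with different weights (0.6 and 0.55), which is the situation the preceding paragraph describes.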

It should be noted that, because the pieces of current hotspot information and the pieces of predicted hotspot information differ in content and importance, a weight may be set in advance for each piece of current hotspot information and each piece of predicted hotspot information. Among the current hotspot information, the current hotspot information corresponding to the current hotspot data itself has the highest weight, the current hotspot information corresponding to the first-degree associated data of the current hotspot data has the second highest weight, the current hotspot information corresponding to the second-degree associated data of the current hotspot data has the third highest weight, and so on. Among the predicted hotspot information, the weight of each piece of predicted hotspot information is determined according to the number of times the user has searched for and browsed it: the more searches and views, the higher the weight of the corresponding predicted hotspot information; the fewer searches and views, the lower the weight.

For example, suppose the current hotspot information includes first current hotspot information corresponding to the current hotspot data, second current hotspot information corresponding to the first-degree associated data of the current hotspot data, and third current hotspot information corresponding to the second-degree associated data of the current hotspot data. Then the weight of the first current hotspot information is greater than that of the second current hotspot information, which in turn is greater than that of the third current hotspot information; the weight of the first current hotspot information may be set to 0.5, the weight of the second to 0.3, and the weight of the third to 0.2. Suppose further that the predicted hotspot information includes first predicted hotspot information, which has been searched for and browsed 600 times, and second predicted hotspot information, which has been searched for and browsed 400 times. Then the weight of the first predicted hotspot information may be set to 0.6 and the weight of the second predicted hotspot information to 0.4.
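One way to assign these preset weights is sketched below. The function names and the item labels C1 to C3 and P1 to P2 are hypothetical. The source only states that more searches and views give a higher weight; using each item's share of the total count is an assumption that happens to reproduce the 0.6 and 0.4 of the example.

```python
def current_weights_by_degree(items_by_degree, degree_weights=(0.5, 0.3, 0.2)):
    """Assign a preset weight to each item based on its association degree.

    items_by_degree[0] holds items matching the current hotspot data itself,
    items_by_degree[1] the first-degree associated items, and so on.
    """
    weights = {}
    for degree, items in enumerate(items_by_degree):
        for item in items:
            weights[item] = degree_weights[degree]
    return weights


def predicted_weights_by_views(view_counts):
    """Weight each predicted item by its share of all searches and views.

    Proportional share is an assumed rule, not one stated in the source.
    """
    total = sum(view_counts.values())
    return {item: count / total for item, count in view_counts.items()}


current = current_weights_by_degree([["C1"], ["C2"], ["C3"]])
predicted = predicted_weights_by_views({"P1": 600, "P2": 400})
```

The preset tuple covers three degrees only; a deeper association chain would need a longer tuple or a decay rule.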

In one or more implementations of this embodiment, in order to improve the efficiency of determining the weight of each piece of target hotspot information and to reflect that the current hotspot information and the predicted hotspot information influence the target hotspot information to different degrees, a specific implementation process of determining the weight of the first target hotspot information according to the first weight and the second weight of the first target hotspot information may be as follows:

calculating a weighted sum of the first weight and the second weight of the first target hotspot information, and determining the weighted sum as the weight of the first target hotspot information.

Specifically, the weighted sum is obtained by assigning a coefficient to each of the first weight and the second weight, multiplying each weight by its coefficient, and adding the two products.

In practical application, when calculating the weight of each piece of target hotspot information, the product of the first weight of the target hotspot information and the coefficient of the first weight can be calculated first to obtain a first product; then the product of the second weight of the target hotspot information and the coefficient of the second weight is calculated to obtain a second product; finally the first product and the second product are added, and the resulting sum is the weight of the target hotspot information. The specific calculation process is shown in formula 1.

y = a*x₁ + b*x₂ (formula 1)

Wherein y represents the weight of the target hotspot information, a represents the coefficient of the first weight, x₁ represents the first weight of the target hotspot information, b represents the coefficient of the second weight, and x₂ represents the second weight of the target hotspot information.

For example, the coefficient a of the first weight is 0.6, the coefficient b of the second weight is 0.4, and the target hotspot information is N1, N2 and N3. When the first weight of the target hotspot information N1 is 0.6 and the second weight is 0.55, the weight of the target hotspot information N1 is 0.6 × 0.6 + 0.4 × 0.55 = 0.58. When the first weight of the target hotspot information N2 is 0.4 and the second weight is 0, the weight of the target hotspot information N2 is 0.6 × 0.4 + 0.4 × 0 = 0.24. When the first weight of the target hotspot information N3 is 0 and the second weight is 0.45, the weight of the target hotspot information N3 is 0.6 × 0 + 0.4 × 0.45 = 0.18.
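The calculation of formula 1 on the three example targets can be reproduced with a short sketch. The function name `combined_weight` and the dictionary layout are hypothetical; the coefficients 0.6 and 0.4 and the input weights are those of the example.

```python
def combined_weight(first, second, a=0.6, b=0.4):
    """Formula 1: y = a * x1 + b * x2."""
    return a * first + b * second


# (first weight, second weight) of each target, taken from the example.
pairs = {"N1": (0.6, 0.55), "N2": (0.4, 0.0), "N3": (0.0, 0.45)}

weights = {name: combined_weight(x1, x2) for name, (x1, x2) in pairs.items()}
rounded = {name: round(y, 2) for name, y in weights.items()}
```

Rounding to two decimals is only for display, since binary floating point does not represent values such as 0.22 exactly.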

It should be noted that, because the first weight and the second weight are determined in different manners, in order to improve the accuracy of the weight of the target hotspot information, the first weight and the second weight may first be normalized; a weighted sum of the normalized first weight and the normalized second weight of the first target hotspot information is then calculated, and the weighted sum is determined as the weight of the first target hotspot information.
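The normalization step might look like the sketch below. The source does not name a normalization method, so dividing each score by the sum of its group is an assumption, as are the function names and the raw view counts used for the second weights.

```python
def normalize(scores):
    """Scale scores so that they sum to 1 (sum normalization, an assumed choice)."""
    total = sum(scores.values())
    if total == 0:
        return {key: 0.0 for key in scores}
    return {key: value / total for key, value in scores.items()}


def normalized_weighted_sum(first, second, a=0.6, b=0.4):
    """Normalize both groups, then apply y = a * x1 + b * x2 per target."""
    f, s = normalize(first), normalize(second)
    return {k: a * f.get(k, 0.0) + b * s.get(k, 0.0) for k in set(f) | set(s)}


first = {"N1": 0.6, "N2": 0.4, "N3": 0.0}   # preset weights, already on a 0..1 scale
second = {"N1": 550, "N2": 0, "N3": 450}    # hypothetical raw search/view counts
weights = normalized_weighted_sum(first, second)
```

Without normalization the raw counts would dominate the sum; after it, both groups lie on the same scale and the coefficients a and b alone control their relative influence.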

According to the method and the device, under the condition that the current hotspot data and the historical query data of the user are obtained, at least one piece of target hotspot information is further determined according to the current hotspot data and the historical query data of the user, a foundation is laid for storing the target hotspot data subsequently, and meanwhile the data storage speed is improved.

Step 106: extracting the target feature of each piece of target hotspot information, and storing each piece of target hotspot information in association with its target feature in one-to-one correspondence.

On the basis of determining at least one target hotspot message according to the current hotspot data and the historical query data of the user, further, target features of each target hotspot message need to be extracted, and then each target hotspot message corresponds to the corresponding target feature one to one and is stored in a correlation manner.

Specifically, the target feature refers to a feature that can represent the target hotspot information; the one-to-one correspondence refers to the correspondence between each piece of target hotspot information and its target feature, for example, the target hotspot information M1 corresponds to the target feature m1, the target hotspot information M2 corresponds to the target feature m2, and the target hotspot information M3 corresponds to the target feature m3; the associated storage means that the target hotspot information and its target feature are stored as a structure pair, for example, when the target hotspot information is M1 and its target feature is m1, they are stored in a structure of "M1-m1" or "m1-M1".

In practical application, before the target hotspot information is stored, the target feature of each piece of target hotspot information needs to be extracted. There are many ways of extracting the target feature, such as a word bank table representation, which is not limited in the present application. The target hotspot information and its target feature are then stored in association in one-to-one correspondence. The associated storage may take place in a memory or a cache, which is where frequently accessed data is kept. Because the target hotspot information is high-frequency access data, storing it in the memory or the cache increases the speed at which the user reads it and can improve user stickiness.

In an optional implementation manner of this embodiment, after the weight of each piece of target hotspot information is determined, the pieces of target hotspot information are sorted according to their weights. On this basis, the target hotspot information and its target features can be stored in association sequentially according to the sorting result. The higher the weight of a piece of target hotspot information, the more likely it is to be accessed with high frequency; conversely, the lower the weight, the less likely it is to be accessed with high frequency. Therefore, storing target hotspot information with a high weight at the front of the memory or the cache makes it more convenient for the user to read; that is, storing the target hotspot information and its target features in association according to the sorting result gives a better effect.
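Sorted associated storage can be sketched as follows. This is an illustrative sketch only: an ordinary Python dict stands in for the memory or cache, `store_sorted` is a hypothetical helper, and `extract_feature` is a placeholder (a set of lower-cased tokens), since the source leaves the feature extraction method open.

```python
def extract_feature(info):
    """Placeholder feature extractor: the set of lower-cased tokens."""
    return frozenset(info.lower().split())


def store_sorted(weights, contents, cache):
    """Store (information, feature) pairs in descending order of weight."""
    for name in sorted(weights, key=weights.get, reverse=True):
        info = contents[name]
        cache[name] = {"info": info, "feature": extract_feature(info)}
    return cache


weights = {"N3": 0.18, "N1": 0.58, "N2": 0.24}
contents = {"N1": "Match final", "N2": "Team roster", "N3": "Ticket sales"}

# Python dicts keep insertion order, so the heaviest entry comes first.
cache = store_sorted(weights, contents, {})
```

Each entry holds the information together with its feature, which corresponds to the structure pair described in step 106.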

In addition, the memory or the cache may still hold data that is no longer used; such data is low-frequency access data, namely historical hotspot information. Since most historical hotspot information is no longer accessed, keeping it in the memory or the cache occupies a large amount of storage space and slows down the reading of data from the memory or the cache. To avoid this, the stored historical hotspot information may be compressed and the original historical hotspot information overwritten. The specific implementation process may be as follows:

determining stored historical hotspot information;

compressing the stored historical hotspot information to obtain compressed historical hotspot information;

and replacing the stored historical hotspot information with the compressed historical hotspot information.

Specifically, compression is a process of reducing the size of the stored historical hotspot information by a specific algorithm. For example, a file whose original size is 8 GB may be only 1 GB after compression.

In practical application, the stored historical hotspot information is first identified and then compressed so that it becomes smaller, and the stored historical hotspot information is replaced with the compressed historical hotspot information, which reduces the storage space occupied. For example, the stored historical hotspot information is H; the stored historical hotspot information H is compressed to obtain compressed historical hotspot information h, and the historical hotspot information H is then replaced with the compressed historical hotspot information h. At this time, the memory or the cache contains the compressed historical hotspot information h but no longer contains the historical hotspot information H.
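The compress-and-replace step can be illustrated with Python's standard `zlib` module. The source does not name a compression algorithm, so zlib is an assumption, and the dict `cache` and the helper `compress_in_place` are hypothetical stand-ins for the memory or cache and its update routine.

```python
import zlib


def compress_in_place(cache, key):
    """Replace the stored text under `key` with its zlib-compressed bytes.

    Returns the sizes in bytes before and after compression.
    """
    raw = cache[key].encode("utf-8")
    cache[key] = zlib.compress(raw)
    return len(raw), len(cache[key])


original = "historical hotspot text " * 200
cache = {"H": original}
before, after = compress_in_place(cache, "H")

# The original text is gone from the cache but can be recovered on demand.
restored = zlib.decompress(cache["H"]).decode("utf-8")
```

The achievable ratio depends on the data; highly repetitive text such as this example compresses far better than typical content.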

Because the compressed historical hotspot information replaces the stored historical hotspot information, a user who later tries to read the historical hotspot data may be unable to find it in the memory or the cache. For this reason, when the stored historical hotspot information is compressed, identification information of the stored historical hotspot information can also be extracted and stored in association with the compressed historical hotspot information, so that the user can conveniently find the compressed historical hotspot information according to the identification information. The specific implementation process is as follows:

extracting identification information of the historical hotspot information, wherein the identification information comprises an abstract and/or a label;

storing the identification information of the historical hotspot information in association with the compressed historical hotspot information, in place of the stored historical hotspot information.

In practical application, before the stored historical hotspot information is compressed, its identification information (abstract and/or label) is extracted; the stored historical hotspot information is then compressed, and the compressed historical hotspot information is stored in association with the corresponding identification information (abstract and/or label), replacing the stored historical hotspot information. In this way, when the user accesses the historical hotspot information, the access is matched against the label or abstract, which triggers decompression of the compressed historical hotspot information.
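A minimal sketch of storing identification information alongside the compressed data is given below. All names (`archive`, `read_by_tag`, `index`) are hypothetical, zlib is an assumed compression choice, and taking the first characters of the text as the abstract is only a placeholder for a real summarization step.

```python
import zlib


def archive(cache, index, key, tags, summary_len=40):
    """Extract identification info, compress the text, and replace the entry."""
    text = cache[key]
    cache[key] = {
        "summary": text[:summary_len],           # placeholder abstract
        "tags": set(tags),                       # labels
        "blob": zlib.compress(text.encode("utf-8")),
    }
    for tag in tags:
        index.setdefault(tag, set()).add(key)    # label -> keys lookup


def read_by_tag(cache, index, tag):
    """Decompress and return every archived text carrying the given label."""
    keys = sorted(index.get(tag, ()))
    return [zlib.decompress(cache[k]["blob"]).decode("utf-8") for k in keys]


text = "Coverage of last season's championship final. " * 50
cache, index = {"H": text}, {}
archive(cache, index, "H", tags=["sports", "final"])
found = read_by_tag(cache, index, "sports")
```

Decompression happens only when a label lookup hits, so the full text is not held uncompressed between accesses.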

For further explanation of the data storage method provided in the present application, refer to fig. 2. Fig. 2 shows a schematic flow chart of data storage according to an embodiment of the present application. First, the current hotspot information is determined: current hotspot data is acquired from a network by using a crawler technology, and current hotspot information in a knowledge graph is determined according to the current hotspot data, wherein the current hotspot information comprises D1, F and C. Then, the predicted hotspot information is determined: user historical query data is obtained, including two pieces of user historical query data A → B → C → D2 and A → C → E → F; a user query behavior structure diagram is constructed from the two pieces of user historical query data (the specific user query behavior structure diagram is shown in fig. 2); the graph embeddings of A, B, C, D2, E and F are respectively determined based on the user query behavior structure diagram, as shown in fig. 2; and hotspot information prediction is performed to obtain the predicted hotspot information D2, wherein the current hotspot information D1 is the same as the predicted hotspot information D2. Next, the current hotspot information D1, F, C and the predicted hotspot information D2 are merged to determine three pieces of target hotspot information D3, F and C, wherein the target hotspot information D3 may be either of the current hotspot information D1 and the predicted hotspot information D2. The weights are then determined and sorted, namely the weights of the target hotspot information D3, F and C are determined and sorted, and each piece of target hotspot information is stored in association with its target feature according to the weight sorting result. After new target hotspot information is determined, the previously determined target hotspot information is treated as historical hotspot information, namely the stored historical hotspot information is determined; the identification information of the stored historical hotspot information, comprising a label and an abstract, is extracted; the stored historical hotspot information is compressed to obtain compressed historical hotspot information; and the compressed historical hotspot information is stored in association with the identification information, replacing the stored historical hotspot information.
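Two of the steps in this walk-through, building the user query behavior structure diagram and merging the two sets of hotspot information, can be sketched as follows. The graph embedding and the prediction itself are not shown. The function names are hypothetical, and D1, D2 and D3 are represented by the single key "D" because the text states that they are the same information.

```python
from collections import Counter


def build_query_graph(paths):
    """Count directed edges between consecutive queries in each path."""
    edges = Counter()
    for path in paths:
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return edges


def merge_hotspots(current, predicted):
    """Union of both lists, keeping first-seen order and dropping duplicates."""
    return list(dict.fromkeys(list(current) + list(predicted)))


# The two historical query sequences from the example.
paths = [["A", "B", "C", "D"], ["A", "C", "E", "F"]]
graph = build_query_graph(paths)

# Current hotspot information D, F, C merged with predicted hotspot information D.
targets = merge_hotspots(["D", "F", "C"], ["D"])
```

The edge counts could serve as input to whichever graph embedding method is used; that choice is outside this sketch.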

Fig. 3 is a schematic flow chart diagram illustrating another data storage method according to an embodiment of the present application, which specifically includes the following steps:

Step 302: acquiring current hotspot data and historical query data of a user.

Step 304: determining current hotspot information in the knowledge graph according to the current hotspot data.

Step 306: determining predicted hotspot information according to the historical query data of the user.

It should be noted that step 304 and step 306 may be performed simultaneously; step 304 may be performed first, followed by step 306; or step 306 may be performed first, followed by step 304, which is not limited in this application. This embodiment is explained taking the simultaneous execution of step 304 and step 306 as an example.

Step 308: merging the current hotspot information and the predicted hotspot information to generate a target hotspot information set.

Step 310: determining the information in the target hotspot information set as target hotspot information to obtain at least one piece of target hotspot information.

Step 312: acquiring a first weight and a second weight of each piece of target hotspot information.

The first weight is the weight of the target hotspot information in the current hotspot information, and the second weight is the weight of the target hotspot information in the predicted hotspot information.

Step 314: calculating a weighted sum of the first weight and the second weight of first target hotspot information, and determining the weighted sum as the weight of the first target hotspot information, wherein the first target hotspot information is any one piece of target hotspot information.

Step 316: sorting the target hotspot information according to the weight of each piece of target hotspot information.

Step 318: sequentially storing each piece of target hotspot information in association with its target feature according to the sorting result.

Step 320: determining the stored historical hotspot information.

Step 322: extracting identification information of the historical hotspot information, wherein the identification information comprises an abstract and/or a label.

Step 324: compressing the stored historical hotspot information to obtain compressed historical hotspot information.

Step 326: storing the identification information of the historical hotspot information in association with the compressed historical hotspot information, in place of the stored historical hotspot information.

Corresponding to the above method embodiment, the present application further provides a data storage device embodiment, and fig. 4 shows a schematic structural diagram of a data storage device provided by an embodiment of the present application. As shown in fig. 4, the apparatus includes:

an obtaining module 402 configured to obtain current hotspot data and user historical query data;

a determining module 404 configured to determine at least one target hotspot message according to the current hotspot data and the user historical query data;

a storage module 406 configured to extract the target feature of each piece of target hotspot information, and to store each piece of target hotspot information in association with its target feature in one-to-one correspondence.
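The division into three functional modules can be pictured with the sketch below. It is an illustration under stated assumptions, not the claimed device: the class `DataStorageDevice` and its parameters are hypothetical, each module is injected as a plain callable, and the lambdas in the usage lines are trivial stand-ins for the real acquisition, determination, and feature extraction logic.

```python
class DataStorageDevice:
    """Wires the three functional modules together as injected callables."""

    def __init__(self, obtain, determine, extract_feature):
        self.obtain = obtain                    # role of obtaining module 402
        self.determine = determine              # role of determining module 404
        self.extract_feature = extract_feature  # used by storage module 406
        self.store = {}                         # stands in for memory or cache

    def run(self):
        current, history = self.obtain()
        for info in self.determine(current, history):
            # One-to-one associated storage: information -> its feature.
            self.store[info] = self.extract_feature(info)
        return self.store


device = DataStorageDevice(
    obtain=lambda: (["C", "F"], [["A", "B", "C", "D"]]),
    # Stand-in rule: current items plus the last query of each history path.
    determine=lambda cur, hist: list(dict.fromkeys(cur + [p[-1] for p in hist])),
    extract_feature=str.lower,
)
result = device.run()
```

Passing the modules in as callables keeps the sketch independent of how each step is actually implemented, which matches the description of the modules as functional blocks rather than physical components.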

In one or more implementations of this embodiment, the determining module 404 is further configured to:

determining the weight of each target hotspot message;

sorting the target hotspot information according to the weight of the target hotspot information;

further, the storage module 406 is further configured to:

and sequentially performing associated storage on the target hotspot information and the target characteristics of the target hotspot information according to the sorting result.

determining the weight of the first target hotspot information according to the first weight and the second weight of the first target hotspot information, wherein the first target hotspot information is any one target hotspot information.

and calculating a weighted sum of a first weight and a second weight of the first target hotspot information, and determining the weighted sum as the weight of the first target hotspot information.

determining a target hotspot information set according to the current hotspot data and the historical query data of the user;

In one or more implementations of this embodiment, the target hotspot information set includes current hotspot information and predicted hotspot information;

further, the determining module 404 is further configured to:

determining predicted hotspot information according to the historical query data of the user;

and merging the current hotspot information and the predicted hotspot information to generate a target hotspot information set.

In one or more implementations of this embodiment, the apparatus further includes a historical hotspot information processing module configured to:

determining stored historical hotspot information;

In one or more implementations of this embodiment, the historical hotspot information processing module is configured to:

extracting identification information of historical hotspot information, wherein the identification information comprises an abstract and/or a label;

replacing the stored historical hotspot information with the compressed historical hotspot information, comprising:

and performing associated storage on the identification information of the historical hotspot information and the compressed historical hotspot information to replace the stored historical hotspot information.

According to the data storage device provided by the present application, the acquisition module acquires the current hotspot data and the historical query data of the user; the determining module determines at least one piece of target hotspot information according to the current hotspot data and the historical query data of the user; further, the storage module extracts the target feature of each piece of target hotspot information and stores each piece of target hotspot information in association with its target feature in one-to-one correspondence. Therefore, high-frequency access information can be stored, the data content in the storage medium can be dynamically adjusted, access efficiency is improved, the frequency of transferring data between different storage media is reduced, the data caching problem is addressed at the level of data association, and the service life of the storage medium is prolonged.

The above is a schematic scheme of a data storage device of the present embodiment. It should be noted that the technical solution of the data storage device and the technical solution of the data storage method belong to the same concept, and details that are not described in detail in the technical solution of the data storage device can be referred to the description of the technical solution of the data storage method. Further, the components in the device embodiment should be understood as functional blocks that must be created to implement the steps of the program flow or the steps of the method, and each functional block is not actually divided or separately defined. The device claims defined by such a set of functional modules are to be understood as a functional module framework for implementing the solution mainly by means of a computer program as described in the specification, and not as a physical device for implementing the solution mainly by means of hardware.

Fig. 5 illustrates a block diagram of a computing device 500 provided according to an embodiment of the present application. The components of the computing device 500 include, but are not limited to, a memory 510 and a processor 520. Processor 520 is coupled to memory 510 via bus 530, and database 550 is used to store data.

Computing device 500 also includes access device 540, which enables computing device 500 to communicate via one or more networks 560. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. The access device 540 may include one or more of any type of network interface, wired or wireless, e.g., a Network Interface Card (NIC), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.

In one embodiment of the application, the above-described components of computing device 500 and other components not shown in FIG. 5 may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 5 is for purposes of example only and is not limiting as to the scope of the present application. Those skilled in the art may add or replace other components as desired.

Computing device 500 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 500 may also be a mobile or stationary server.

Wherein processor 520 is configured to execute the computer-executable instructions of the data storage method.

The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the data storage method belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the data storage method.

An embodiment of the present application further provides a computer readable storage medium, which stores computer instructions, and the computer instructions, when executed by a processor, implement the steps of the data storage method as described above.

The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the data storage method, and details that are not described in detail in the technical solution of the storage medium can be referred to the description of the technical solution of the data storage method.

An embodiment of the present application further discloses a chip, which stores computer instructions, and the computer instructions, when executed by a processor, implement the steps of the data storage method as described above.

The foregoing description of specific embodiments of the present application has been presented. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately expanded or restricted according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.

It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and its practical applications, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims

1. A method of storing data, comprising:

acquiring current hotspot data and historical query data of a user;

2. The method of claim 1, wherein after determining at least one target hotspot information, further comprising:

determining the weight of each target hotspot message;

the associating and storing the target hotspot information and the target characteristics of the target hotspot information comprises:

3. The method of claim 2, wherein the determining the weight of each target hotspot message comprises:

4. The method of claim 3, wherein determining the weight of the first target hotspot information according to the first weight and the second weight of the first target hotspot information comprises:

5. The method according to claim 1 or 2, wherein the determining at least one target hotspot information according to the current hotspot data and the user historical query data comprises:

6. The method of claim 5, wherein the target hotspot information set comprises current hotspot information and predicted hotspot information;

determining a target hotspot information set according to the current hotspot data and the historical user query data, wherein the determining comprises the following steps:

7. The method of claim 1, further comprising:

determining stored historical hotspot information;

8. The method of claim 7, wherein before compressing the stored historical hotspot information, further comprising:

9. A data storage device, comprising:

10. A computing device, comprising:

a memory and a processor;

the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions to implement the steps of the data storage method of any one of claims 1 to 8.

11. A computer-readable storage medium storing computer instructions, which when executed by a processor, perform the steps of the data storage method of any one of claims 1 to 8.