CN108536812B - Method, device and equipment for clearing invalid data resources and computer readable medium - Google Patents

Info

Publication number
CN108536812B
Authority
CN
China
Prior art keywords
invalid
data
query
invalid data
data resources
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810302134.7A
Other languages
Chinese (zh)
Other versions
CN108536812A (en
Inventor
李德禹
卫科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810302134.7A
Publication of CN108536812A
Application granted
Publication of CN108536812B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for clearing invalid data resources. Various types of data resources are searched according to an invalid mining rule to determine whether invalid data resources exist among them; if so, the invalid data resources are cleared according to the clearing mode corresponding to the invalid data resources, wherein the invalid data resources comprise at least one of invalid query operation, invalid calculation operation, invalid data storage and invalid data calling. Invalid data resources are thereby effectively eliminated, which avoids the resource waste caused by repeated calculation operations, invalid calculation operations, data storage and data calling occupying resources during data analysis. The invention also provides a device for clearing invalid data resources, equipment for clearing invalid data resources and a computer readable storage medium, which achieve the same technical effects.

Description

Method, device and equipment for clearing invalid data resources and computer readable medium
Technical Field
The invention relates to the field of computers, in particular to a method for clearing invalid data resources, a device for clearing invalid data resources, equipment for clearing invalid data resources and a computer readable storage medium.
Background
Data analysis is increasingly used in the internet industry and many others. In large companies with many business types in particular, every department and every business has scenarios that require data calculation, storage, calling and the like, and data is also analyzed across departments. Without effective management, these calculation, storage and calling applications waste resources, for example through invalid computation, repeated computation, invalid storage, repeated storage and invalid invocation.
At present, the conventional management method is manual sorting after reporting, or processing through an internal billing mechanism. Its main problems are: (1) low efficiency: manual reporting, sorting and judgment make the process slow and untimely; (2) reliance on self-reporting: computing jobs, data storage, data calls and related applications are reported by each service line on its own initiative, and reasonableness cannot be strictly ensured by relying on such self-discipline; (3) limitation: the reporting and internal billing mechanisms are somewhat subjective, and the investigation period cannot be adjusted flexibly.
Therefore, how to avoid resource waste caused by invalid calculation, storage, calling and the like in the data analysis application process is a technical problem which needs to be solved urgently.
Disclosure of Invention
An embodiment of the present invention provides a method for clearing an invalid data resource, and further relates to a device for clearing an invalid data resource, an apparatus for clearing an invalid data resource, and a computer-readable storage medium, so as to solve at least one of the above technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a method for clearing an invalid data resource, including:
searching various types of data resources according to an invalid mining rule, and determining whether invalid data resources exist in the various types of data resources;
and if so, clearing the invalid data resources according to a clearing mode corresponding to the invalid data resources, wherein the invalid data resources comprise at least one of invalid query operation, invalid calculation operation, invalid data storage and invalid data calling.
With reference to the first aspect, in a first implementation manner of the first aspect, the searching for the various types of data resources according to the invalid mining rule to determine whether there is an invalid data resource in the various types of data resources includes:
when the data resource is query operation, calculating the similarity between the query operations;
and selecting one query job from the query jobs with the similarity larger than the threshold value, and determining the unselected query job as an invalid query job.
With reference to the first implementation manner of the first aspect, in a second implementation manner of the first aspect, the clearing the invalid data resources according to a clearing manner corresponding to the invalid data resources includes:
executing the selected query operation to obtain a query result;
establishing a public data mart table according to the query result;
and deleting the invalid query operation.
With reference to the first aspect, in a third implementation manner of the first aspect, the searching is performed on various types of data resources according to an invalid mining rule, and determining whether an invalid data resource exists in the various types of data resources includes:
when the data resources are data storage, scanning each temporary data storage table;
judging whether the scanned temporary data storage tables are repeated or not, selecting one temporary data storage table from the repeated temporary data storage tables, and determining the unselected temporary data storage table as invalid data storage.
With reference to the third implementation manner of the first aspect, in a fourth implementation manner of the first aspect, the clearing the invalid data resources according to a clearing manner corresponding to the invalid data resources includes:
taking the invalid data storage offline.
With reference to the first aspect, in a fifth implementation manner of the first aspect, the searching for the various types of data resources according to the invalid mining rule to determine whether there is an invalid data resource in the various types of data resources includes:
when the data resource is a calculation operation, judging whether the calculation operation is abnormally interrupted or not according to an output log of the calculation operation; and if so, determining the abnormally interrupted computing operation as an invalid computing operation.
With reference to the first aspect, in a sixth implementation manner of the first aspect, the searching is performed on various types of data resources according to an invalid mining rule, and determining whether an invalid data resource exists in the various types of data resources includes:
and determining whether invalid data resources of the employee exist according to the binding relationship between the various types of data resources and the human resource system, wherein the invalid data resources of the employee comprise at least one of invalid calculation jobs, invalid data storage and invalid data calling of the employee.
With reference to the first aspect, in a seventh implementation manner of the first aspect, the searching for the various types of data resources according to the invalid mining rule to determine whether the invalid data resources exist in the various types of data resources includes:
and determining whether invalid data resources of the closed service exist according to the binding relationship between various types of data resources and the service management system, wherein the invalid data resources of the closed service comprise at least one of invalid operation, invalid data storage and invalid data calling of the closed service.
With reference to any one of the fifth, sixth, and seventh implementation manners of the first aspect, in an eighth implementation manner of the first aspect, the clearing the invalid data resources according to a clearing manner corresponding to the invalid data resources includes:
taking the invalid computing job offline and closing it, and cleaning up the garbage generated by the invalid computing job during historical computation; or
taking the invalid data storage offline; or
taking the invalid data call offline.
In a second aspect, an embodiment of the present invention provides an apparatus for clearing an invalid data resource, including:
the invalid data resource searching module is used for searching various types of data resources according to an invalid mining rule;
the invalid data resource confirming module is used for determining whether invalid data resources exist in various types of data resources;
and the invalid data resource clearing module is used for clearing the invalid data resources according to a clearing mode corresponding to the invalid data resources if the invalid data resources exist in various types of data resources, wherein the invalid data resources comprise at least one of invalid query operation, invalid calculation operation, invalid data storage and invalid data calling.
With reference to the second aspect, in a first implementation manner of the second aspect, the invalid data resource confirming module includes:
the similarity calculation unit is used for calculating the similarity between the query jobs when the data resources are the query jobs;
and the repeated query confirming unit is used for selecting one query job from the query jobs with the similarity larger than the threshold value and determining the unselected query job as an invalid query job.
With reference to the first implementation manner of the second aspect, in a second implementation manner of the second aspect, the invalid data resource clearing module includes:
and the invalid query job clearing unit is used for executing the selected query job to obtain a query result, establishing a public data mart table according to the query result and deleting the invalid query job.
With reference to the second aspect, in a third implementation manner of the second aspect, the invalid data resource confirming module further includes:
the storage table scanning unit is used for scanning each temporary data storage table when the data resource is used for data storage;
and an invalid data storage confirming unit, configured to select one temporary data storage table from the repeated temporary data storage tables, and determine an unselected temporary data storage table as an invalid data storage.
With reference to the third implementation manner of the second aspect, in a fourth implementation manner of the second aspect, the invalid data resource clearing module further includes:
and the invalid data storage offline unit is used for taking the invalid data storage offline.
With reference to the second aspect, in a fifth implementation manner of the second aspect, the invalid data resource confirming module further includes:
an invalid calculation confirming unit, configured to, when the data resource is a calculation job, determine whether the calculation job is abnormally interrupted according to an output log of the calculation job; and if so, determining the abnormally interrupted computing operation as an invalid computing operation.
With reference to the second aspect, in a sixth implementation manner of the second aspect, the invalid data resource confirming module further includes:
and the data resource confirming unit of the employees is used for determining whether invalid data resources of the employees exist or not according to the binding relationship between various types of data resources and the human resource system, wherein the invalid data resources of the employees include at least one of invalid calculation operation, invalid data storage and invalid data calling of the employees.
With reference to the second aspect, in a seventh implementation manner of the second aspect, the invalid data resource confirming module further includes:
and the closing service data resource confirming unit is used for determining whether invalid data resources of the closed service exist according to the binding relationship between various types of data resources and the service management system, wherein the invalid data resources of the closed service comprise at least one of invalid operation, invalid data storage and invalid data calling of the closed service.
With reference to any one of the fifth embodiment, the sixth embodiment and the seventh embodiment of the second aspect, in an eighth embodiment of the second aspect, the invalid data resource clearing module further includes:
and the invalid data resource offline unit is used for taking the invalid computing job offline and closing it and cleaning up the garbage generated by the invalid computing job during historical computation, or taking the invalid data storage offline, or taking the invalid data call offline.
In a third aspect, an embodiment of the present invention provides a device for clearing an invalid data resource, where functions of the device may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above-described functions.
In one possible design, the structure of the apparatus for clearing invalid data resources includes a processor and a storage device, the storage device is used for storing a program for supporting the clearing device for invalid data resources to execute the method in the first aspect, and the processor is configured to execute the program stored in the storage device. The clearing device of invalid data resources may further comprise a communication interface for communicating the clearing device of invalid data resources with other devices or a communication network.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer software instructions used by the apparatus for clearing an invalid data resource, where the computer software instructions include a program that causes the apparatus to execute the method for clearing an invalid data resource in the first aspect.
One of the above technical solutions has the following advantages or beneficial effects: in the scheme, various types of data resources are searched according to an invalid mining rule, whether invalid data resources exist in the various types of data resources is determined, if yes, the invalid data resources are cleared according to a clearing mode corresponding to the invalid data resources, and the invalid data resources comprise at least one of invalid query operation, invalid calculation operation, invalid data storage and invalid data calling. The scheme avoids resource waste caused by occupation of resources by repeated calculation, invalid calculation, storage, calling and the like in the data analysis application process.
The foregoing summary is provided for the purpose of description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will be readily apparent by reference to the drawings and following detailed description.
Drawings
In the drawings, like reference numerals refer to the same or similar parts or elements throughout the several views unless otherwise specified. The figures are not necessarily to scale. It is appreciated that these drawings depict only some embodiments in accordance with the disclosure and are therefore not to be considered limiting of its scope.
Fig. 1 is a flowchart of a method for clearing an invalid data resource according to an embodiment of the present invention;
Fig. 2 is a diagram illustrating a method for clearing invalid data resources according to an embodiment of the present invention;
Fig. 3 is a block diagram of an apparatus for clearing invalid data resources according to an embodiment of the present invention;
Fig. 4 is a structural diagram of an apparatus for clearing invalid data resources according to an embodiment of the present invention.
Detailed Description
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
In the existing data analysis, different business departments select the same data operation according to common data analysis requirements, so that the phenomena of data operation repetition and resource waste are caused. In order to avoid resource waste, the present application provides the following embodiments to solve the existing problems.
Example one
In a specific embodiment, as shown in fig. 1, a method for clearing an invalid data resource is provided, including:
step S100: searching in various types of data resources according to an invalid mining rule, and determining whether invalid data resources exist in the various types of data resources.
Organizations such as companies, schools, governments and markets all include various types of departments, such as finance, public relations, human resources and business departments. Each type of department has its own data resources, so different types of data resources are formed, and these include some invalid data resources. Invalid mining is performed on each type of data resource separately: for example, all data is acquired by scanning the data resources, each type of data resource to be mined is searched according to its own invalid mining rule, and it is determined whether invalid data resources exist in the data resources to be mined.
Step S200: and if so, clearing the invalid data resources according to a clearing mode corresponding to the invalid data resources, wherein the invalid data resources comprise at least one of invalid query operation, invalid calculation operation, invalid data storage and invalid data calling.
As shown in fig. 2, in a first aspect, repeated data resources, such as repeated query jobs and repeated data storage, occupy resources through repeated execution or storage; they are integrated and unified, and the redundant copies are taken offline, so that waste is avoided. In a second aspect, data resources whose output is frequently interrupted are detected, and such data jobs are set to invalid and cleared. In a third aspect, data resources associated with employees who have left the company or with closed businesses, such as computing jobs, data storage and data calls, are set to invalid and cleared.
In this embodiment, the provided method for clearing invalid data resources avoids the resource waste caused by repeated calculation, invalid calculation, storage, invocation and the like occupying resources during data analysis.
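The two steps S100 and S200 can be pictured as a small dispatch loop. The following is only a minimal sketch under stated assumptions: the `Resource` type, the resource kind names, and the idea of passing mining rules and clearing actions as dictionaries of callables are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Resource:
    """Hypothetical record for one data resource."""
    name: str
    kind: str  # assumed kinds: "query_job", "compute_job", "storage", "call"


# Hypothetical rule type: receives all resources of one kind, returns the invalid ones.
MiningRule = Callable[[List[Resource]], List[Resource]]
# Hypothetical clearing action: clears one invalid resource and returns a log entry.
ClearAction = Callable[[Resource], str]


def clear_invalid_resources(
    resources: List[Resource],
    rules: Dict[str, MiningRule],
    actions: Dict[str, ClearAction],
) -> List[str]:
    """Apply each kind's mining rule (S100), then its clearing action (S200)."""
    log: List[str] = []
    for kind, rule in rules.items():
        candidates = [r for r in resources if r.kind == kind]
        for invalid in rule(candidates):          # step S100: find invalid resources
            log.append(actions[kind](invalid))    # step S200: clear in the matching mode
    return log


resources = [
    Resource("tmp_dup_a", "storage"),
    Resource("sales", "storage"),
    Resource("q1", "query_job"),
]
# Illustrative rule and action only; real rules are described in the implementations below.
rules = {"storage": lambda rs: [r for r in rs if r.name.startswith("tmp_dup")]}
actions = {"storage": lambda r: f"offline:{r.name}"}
cleared = clear_invalid_resources(resources, rules, actions)
```

Keeping one rule and one action per resource kind mirrors the text's pairing of an invalid mining rule with a corresponding clearing manner.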
In one implementation, searching for various types of data resources according to an invalid mining rule, and determining whether invalid data resources exist in the various types of data resources includes:
when the data resource is query operation, calculating the similarity between the query operations;
and selecting one query job from the query jobs with the similarity larger than the threshold value, and determining the unselected query job as an invalid query job.
In this embodiment, referring to fig. 2, a query job such as an HQL (object query language) statement may be created according to one or more parameters such as the query type, query table, query field and query amount, in preparation for finding repeatedly executed query jobs (such as HQL) in the next step. After the query jobs are established, the query jobs of all departments are compared to determine whether they are consistent. The comparison may be performed by calculating the similarity between the query jobs of the departments. If the similarity is greater than a threshold (for example, a threshold of 1), the query jobs are considered close; for example, two HQL statements whose query results are consistent may be considered similar. One of these query jobs is then selected for sharing, and the unselected query jobs are determined to be invalid query jobs.
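As an illustration of the selection step, the sketch below compares query text with a token-set Jaccard similarity. This measure is an assumption chosen for brevity; the patent does not specify how similarity is computed, and the function names, job names and the 0.8 threshold are hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple


def tokens(query: str) -> Set[str]:
    """Lower-case the query and split it into identifier-like tokens."""
    return set(re.findall(r"[a-z0-9_*]+", query.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two token sets (an assumed measure)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def find_invalid_queries(
    jobs: Dict[str, str], threshold: float
) -> Tuple[List[str], List[str]]:
    """Keep one job per group of similar jobs; mark the rest as invalid."""
    kept: List[str] = []
    invalid: List[str] = []
    for name in sorted(jobs):
        # A job is redundant if it is similar enough to a job already kept.
        if any(similarity(jobs[name], jobs[k]) > threshold for k in kept):
            invalid.append(name)
        else:
            kept.append(name)
    return kept, invalid


jobs = {
    "finance": "SELECT region, SUM(amount) FROM orders GROUP BY region",
    "sales": "select region, sum(amount) from orders group by region",
    "hr": "SELECT name FROM staff",
}
kept, invalid = find_invalid_queries(jobs, threshold=0.8)
```

Here the `finance` and `sales` jobs differ only in letter case, so one is kept and the other is reported as invalid, while `hr` is unrelated and kept.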
In one implementation, clearing the invalid data resource in a clearing manner corresponding to the invalid data resource includes:
executing the selected query job to obtain a query result, establishing a public data mart table according to the query result, deleting the invalid query jobs, and keeping the public data mart table with the best timeliness.
For example, the data resources of each department are queried using the selected query job with the highest similarity. Because the query results can be shared, the common data obtained by the query forms the public data mart table (data mart, referred to as a MART table for short) with the best timeliness. The next time a department needs the same query, the query job does not have to be executed again, and the data is extracted directly from the public data MART table. For example, a common MART table is built for HQL statements whose similarity is greater than 1, the repeated HQL statements are taken offline, and the production MART table with the best timeliness is retained. Departments no longer need to calculate the same data repeatedly and instead obtain it from that MART table. In addition, the repeated data obtained by the queries can be cleared.
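The idea of materializing one shared result can be sketched with Python's built-in `sqlite3` module. SQLite is used here only as a stand-in for the big data platform, and the table names, column names and the helper `build_public_mart` are hypothetical.

```python
import sqlite3


def build_public_mart(conn: sqlite3.Connection, selected_query: str, mart_name: str) -> None:
    """Run the selected query once and store its result as a shared table.

    mart_name is interpolated into SQL, so it must be a trusted identifier.
    """
    conn.execute(f"DROP TABLE IF EXISTS {mart_name}")
    conn.execute(f"CREATE TABLE {mart_name} AS {selected_query}")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 10), ("north", 5), ("south", 7)],
)

# The query selected from the group of similar queries is executed a single time.
build_public_mart(
    conn,
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region",
    "mart_region_totals",
)

# Departments then read the shared table instead of re-running their own copy.
rows = conn.execute(
    "SELECT region, total FROM mart_region_totals ORDER BY region"
).fetchall()
```

Rebuilding the table with `DROP TABLE IF EXISTS` before each refresh is one simple way to keep only the most recent result; the patent does not say how timeliness is maintained.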
In one implementation, searching for various types of data resources according to an invalid mining rule, and determining whether invalid data resources exist in the various types of data resources includes:
when the data resources are data storage, scanning each temporary data storage table;
judging whether the scanned temporary data storage tables are repeated or not, selecting one temporary data storage table from the repeated temporary data storage tables, and determining the unselected temporary data storage table as invalid data storage.
In one implementation, clearing the invalid data resource in a clearing manner corresponding to the invalid data resource includes:
taking the invalid data storage offline, while retaining the temporary data storage table with the best timeliness.
Referring to fig. 2, different service lines or departments often analyze the same data along similar dimensions, so different service lines may store the same temporary data, resulting in repeated data storage and wasted storage resources. In this embodiment, repeated data storage is found in time by scanning the temporary data storage tables of each business department. Among the repeated temporary data storage tables, the one with the best timeliness is selected and retained for common use, and the unselected temporary data storage tables are taken offline, which solves the problem of resource waste.
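One possible way to detect repeated temporary tables is to fingerprint their contents and keep the most recently updated copy in each group. The sketch below assumes that "repeated" means identical rows regardless of order and that "best timeliness" means the latest update time; both are interpretations, and the `TempTable` type and its fields are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TempTable:
    """Hypothetical snapshot of one temporary data storage table."""
    name: str
    rows: Tuple[Tuple[str, ...], ...]
    updated_at: int  # assumed last-refresh timestamp


def fingerprint(table: TempTable) -> str:
    """Hash the rows in sorted order so that row order does not matter."""
    digest = hashlib.sha256()
    for row in sorted(table.rows):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def find_invalid_storage(tables: List[TempTable]) -> List[str]:
    """Group tables by content; keep the freshest per group, flag the rest."""
    groups: Dict[str, List[TempTable]] = {}
    for table in tables:
        groups.setdefault(fingerprint(table), []).append(table)
    invalid: List[str] = []
    for duplicates in groups.values():
        keep = max(duplicates, key=lambda t: t.updated_at)
        invalid.extend(t.name for t in duplicates if t is not keep)
    return sorted(invalid)


tables = [
    TempTable("tmp_fin", (("n", "1"), ("s", "2")), updated_at=100),
    TempTable("tmp_sales", (("s", "2"), ("n", "1")), updated_at=200),
    TempTable("tmp_hr", (("x", "9"),), updated_at=50),
]
invalid_tables = find_invalid_storage(tables)
```

In this example `tmp_fin` and `tmp_sales` hold the same rows, so the older `tmp_fin` is flagged for being taken offline.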
In one implementation, searching for various types of data resources according to an invalid mining rule, and determining whether invalid data resources exist in the various types of data resources includes:
when the data resource is a calculation operation, judging whether the calculation operation is abnormally interrupted or not according to an output log of the calculation operation; and if so, determining the abnormally interrupted computing operation as an invalid computing operation.
Typically, an invalid computing job is a computing job that no longer has value. There are several types, such as computing jobs that fail frequently, computing jobs of employees who have left the company, and computing jobs of closed businesses; see fig. 2. Frequently failing computing jobs, unattended computing jobs and the like occupy computing resources while producing no value. Mining invalid computing jobs and taking such valueless jobs offline in time therefore avoids resource waste. Specifically, whether a computing job is invalid is determined from the output log obtained after the job is executed. For example, a threshold on the number of failures may be set, and if the number of output failures recorded in the output log is greater than the threshold, the computing job is determined to be an invalid computing job. Invalid computing jobs are then eliminated in time: for example, the invalid jobs are taken offline and closed, and the garbage generated by their historical computation is also cleared in time.
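The failure-count check described above can be sketched as follows. The log line format (`<job_name> <STATUS>`) and the status keyword `FAILED` are assumptions made for illustration; the patent does not define the format of the output log.

```python
from collections import Counter
from typing import Iterable, List


def find_invalid_jobs(log_lines: Iterable[str], max_failures: int) -> List[str]:
    """Return jobs whose recorded failure count exceeds max_failures."""
    failures: Counter = Counter()
    for line in log_lines:
        parts = line.split()
        # Assumed format: "<job_name> <STATUS>"; other lines are ignored.
        if len(parts) == 2 and parts[1] == "FAILED":
            failures[parts[0]] += 1
    return sorted(job for job, count in failures.items() if count > max_failures)


log_lines = [
    "etl_a FAILED",
    "etl_a FAILED",
    "etl_a FAILED",
    "etl_b OK",
    "etl_b FAILED",
]
invalid_jobs = find_invalid_jobs(log_lines, max_failures=2)
```

With a threshold of 2, `etl_a` (three failures) is flagged while `etl_b` (one failure) is not.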
In one implementation, searching for various types of data resources according to an invalid mining rule, and determining whether invalid data resources exist in the various types of data resources includes:
and determining whether invalid data resources of the employee exist according to the binding relationship between the various types of data resources and the human resource system, wherein the invalid data resources of the employee comprise at least one of invalid calculation jobs, invalid data storage and invalid data calling of the employee.
Employees leave many business departments, and the computing jobs, data storage and data calls associated with them are often not cleaned up. Temporary data storage is not deleted in time after an employee leaves and continues to occupy storage resources, while invalid data jobs keep running and consume computing resources. In this embodiment, a mechanism for binding and verifying data storage against the human resource system and a mechanism for binding and verifying data jobs against the human resource system are set up, so that the invalidation of the temporary data storage and data calling tasks of departed employees is sensed in time and these resources are taken offline promptly after they become invalid.
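The binding check against the human resource system can be pictured as a lookup of each resource's owner in the set of current employees. This is a minimal sketch: the owner mapping, the employee identifiers and the function name are hypothetical, and a real human resource system would be queried through its own interface.

```python
from typing import Dict, List, Set


def find_departed_employee_resources(
    owner_of: Dict[str, str], active_employees: Set[str]
) -> List[str]:
    """Return resources whose bound owner is no longer an active employee."""
    return sorted(
        resource
        for resource, owner in owner_of.items()
        if owner not in active_employees
    )


# Hypothetical binding between data resources and employee accounts.
owner_of = {
    "job_daily_report": "alice",
    "tmp_campaign_2017": "bob",
    "call_user_profile": "bob",
}
active_employees = {"alice"}  # assumed to come from the human resource system
orphaned = find_departed_employee_resources(owner_of, active_employees)
```

Running the check whenever the employee list changes is one way to sense invalidation promptly, as the text requires.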
In one implementation, searching for various types of data resources according to an invalid mining rule, and determining whether invalid data resources exist in the various types of data resources includes:
and determining whether invalid data resources of the closed service exist according to the binding relationship between various types of data resources and the service management system, wherein the invalid data resources of the closed service comprise at least one of invalid operation, invalid data storage and invalid data calling of the closed service.
Each business department has closed businesses, and data storage and data jobs all run on a public big data platform, so when a business is closed, the data storage and data jobs related to it may not be taken offline in time. In this embodiment, a mechanism for binding and verifying computing jobs, data storage and data calls against the service management system may be provided, so that the invalidation of a closed service's data calling tasks is sensed in time and the tasks are taken offline promptly after they become invalid.
Referring to fig. 2, when an employee leaves or a service is closed, the invalid computing jobs, data storage and data calls related to the departed employee or the closed service are mined, and the mined invalid data and data jobs are removed, so that resource waste is avoided.
Clearing the invalid data resources according to the clearing mode corresponding to the invalid data resources, comprising the following steps:
taking the invalid computing job offline and closing it, and cleaning up the garbage generated by the invalid computing job during historical computation; or taking the invalid data storage offline; or taking the invalid data call offline.
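Combining the service binding with the per-type clearing manners gives a simple cleanup plan. The sketch below is an assumption-laden illustration: the binding layout, the status values `"closed"` and `"open"`, the resource kind names and the wording of each clearing mode are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed mapping from resource kind to its clearing manner.
CLEARING_MODE: Dict[str, str] = {
    "compute_job": "close job and clean historical garbage",
    "storage": "take storage offline",
    "call": "take call offline",
}


def plan_closed_service_cleanup(
    bindings: Dict[str, Tuple[str, str]],
    service_status: Dict[str, str],
) -> List[Tuple[str, str]]:
    """List (resource, clearing mode) for resources bound to closed services."""
    plan: List[Tuple[str, str]] = []
    for resource, (kind, service) in sorted(bindings.items()):
        if service_status.get(service) == "closed":
            plan.append((resource, CLEARING_MODE[kind]))
    return plan


# Hypothetical binding: resource name -> (resource kind, owning service).
bindings = {
    "job_ads": ("compute_job", "ads"),
    "tmp_ads": ("storage", "ads"),
    "job_maps": ("compute_job", "maps"),
}
service_status = {"ads": "closed", "maps": "open"}  # assumed service management data
plan = plan_closed_service_cleanup(bindings, service_status)
```

Producing a plan rather than deleting immediately leaves room for a review step before resources are actually taken offline; that separation is a design choice of this sketch, not something the patent prescribes.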
Example two
In another specific embodiment, as shown in fig. 3, there is provided an apparatus for clearing an invalid data resource, including:
the invalid data resource searching module 10 is used for searching various types of data resources according to an invalid mining rule;
an invalid data resource confirming module 20, configured to determine whether an invalid data resource exists in the various types of data resources;
and the invalid data resource clearing module 30 is configured to clear the invalid data resources in a clearing manner corresponding to the invalid data resources if the invalid data resources exist in the various types of data resources, where the invalid data resources include at least one of an invalid query job, an invalid computation job, an invalid data store, and an invalid data call.
Further, in the above apparatus, the invalid data resource confirming module includes:
the similarity calculation unit is used for calculating the similarity between the query jobs when the data resources are the query jobs;
and the repeated query confirming unit is used for determining the query operation with the similarity larger than the threshold value as a repeated query operation and determining the repeated query operation as an invalid query operation.
Further, in the above apparatus, the invalid data resource clearing module includes:
and the data mart table establishing unit is used for executing the query job to obtain a query result, establishing a public data mart table according to the query result, deleting the repeated query jobs and retaining the public data mart table with the best timeliness.
Further, in the above apparatus, the invalid data resource confirming module further includes:
a storage table scanning unit, configured to scan each temporary data storage table when the data resource is data storage;
and an invalid data storage confirming unit, configured to judge whether the scanned temporary data storage tables are repeated, and to determine the repeated temporary data storage tables as invalid data storage.
Further, in the above apparatus, the invalid data resource clearing module further includes:
an invalid storage table offline unit, configured to take the repeated temporary data storage tables offline and retain the most up-to-date temporary data storage table.
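The scan-and-retain behavior for temporary tables can be sketched as a grouping step. The patent does not say how two tables are judged to be repeated; the sketch below assumes each table carries a content fingerprint (for example a hash of its schema and rows) and a last-update timestamp, both of which are hypothetical fields introduced for this example.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def tables_to_offline(tables: Iterable[Tuple[str, str, int]]) -> List[str]:
    """tables holds (name, fingerprint, updated_at) records.

    Returns the sorted names of duplicate tables to take offline, keeping
    the most recently updated table in each group of identical fingerprints.
    """
    groups = defaultdict(list)
    for name, fingerprint, updated_at in tables:
        groups[fingerprint].append((updated_at, name))

    stale = []
    for members in groups.values():
        members.sort(reverse=True)                      # newest first
        stale.extend(name for _, name in members[1:])   # everything but the newest
    return sorted(stale)
```

A table with a unique fingerprint forms a group of one and is never selected for removal.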
Further, in the above apparatus, the invalid data resource confirming module further includes:
an invalid computing job confirming unit, configured to judge, when the data resource is a computing job, whether the computing job was abnormally interrupted according to the output log of the computing job, and if so, to determine the abnormally interrupted computing job as an invalid computing job.
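Claim 1 refines this log check to a count of output failures that exceeds a second threshold. A minimal sketch of that test follows; the log format and the `"OUTPUT FAILED"` marker string are assumptions made for illustration, since the patent does not describe what the output log looks like.

```python
from typing import Iterable


def is_abnormally_interrupted(
    log_lines: Iterable[str],
    max_failures: int,
    marker: str = "OUTPUT FAILED",  # hypothetical failure marker
) -> bool:
    """Return True when the log records more output failures than max_failures."""
    failures = sum(1 for line in log_lines if marker in line)
    return failures > max_failures
```

The comparison is strict, matching the claim wording "larger than a second threshold": a job whose failure count equals the threshold is not flagged.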
Further, in the above apparatus, the invalid data resource confirming module further includes:
a departed-employee data resource confirming unit, configured to determine whether invalid data resources of departed employees exist according to the binding relationship between the various types of data resources and the human resource system, wherein the invalid data resources of a departed employee include at least one of invalid computing jobs, invalid data storage, and invalid data calls of the departed employee.
Further, in the above apparatus, the invalid data resource confirming module further includes:
a closed-service data resource confirming unit, configured to determine whether invalid data resources of a closed service exist according to the binding relationship between the various types of data resources and the service management system, wherein the invalid data resources of the closed service include at least one of invalid computing jobs, invalid data storage, and invalid data calls of the closed service.
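Both binding-based checks (against the human resource system and against the service management system) reduce to the same lookup: a resource is a clearing candidate when the employee or service it is bound to is no longer active. The sketch below assumes the binding relationship is available as a simple mapping from resource name to owner identifier; that representation and the function name are hypothetical.

```python
from typing import Dict, List, Set


def resources_without_active_owner(bindings: Dict[str, str], active_ids: Set[str]) -> List[str]:
    """bindings maps a resource name to the id of the employee or service it is bound to.

    Returns the sorted names of resources whose owner is not in active_ids,
    i.e. the owner has left the company or the service has been closed.
    """
    return sorted(name for name, owner in bindings.items() if owner not in active_ids)
```

In use, `active_ids` would be refreshed from the human resource system or the service management system before each scan, so that departures and closures are picked up promptly.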
Further, in the above apparatus, the invalid data resource clearing module further includes:
an invalid data resource offline unit, configured to take the invalid computing job offline and clean up the garbage that the invalid computing job generated during historical computing, to take the invalid data storage offline, or to take the invalid data call offline.
Example three
An embodiment of the present invention provides a device for clearing an invalid data resource, as shown in fig. 4, including:
a memory 400 and a processor 500, the memory 400 storing a computer program executable on the processor 500. When executing the computer program, the processor 500 implements the method for clearing invalid data resources in the above embodiments. There may be one or more memories 400 and one or more processors 500.
A communication interface 600, through which the memory 400 and the processor 500 communicate with external devices.
The memory 400 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
If the memory 400, the processor 500, and the communication interface 600 are implemented independently, they may be connected to one another through a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 4, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 400, the processor 500, and the communication interface 600 are integrated on a single chip, the memory 400, the processor 500, and the communication interface 600 may complete communication with each other through an internal interface.
Example four
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method for clearing invalid data resources of any one of the embodiments described herein.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, provided they do not contradict one another, the various embodiments or examples described in this specification, and the features of different embodiments or examples, can be combined by one skilled in the art.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various changes or substitutions within the technical scope of the present invention, and these should be covered by the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (13)

1. A method for clearing invalid data resources, comprising:
searching various types of data resources according to an invalid mining rule, and determining whether invalid data resources exist in the various types of data resources;
if so, determining a corresponding clearing mode according to the invalid data resources, wherein the invalid data resources comprise at least one of invalid query operation, invalid calculation operation, invalid data storage and invalid data calling;
under the condition that the data resources are query jobs, the query jobs comprise one or more parameters of query types, query tables, query fields, query quantity and the like, the similarity among the query jobs corresponding to each department is calculated, one query job is selected from the query jobs with the similarity larger than a first threshold value, unselected query jobs are deleted as invalid query jobs, the selected query jobs are executed to obtain query results, and a public data mart table is established according to the query results;
if the data resources are calculation jobs, judging that the calculation jobs are abnormally interrupted calculation jobs if the output failure times recorded in the output logs of the calculation jobs are larger than a second threshold value, and deleting the abnormally interrupted calculation jobs as invalid calculation jobs;
under the condition that the data resources are data storage, selecting a temporary data storage table from repeated temporary data storage tables, and deleting unselected temporary data storage tables as invalid data storage;
when an employee leaves or a business is closed, mining and clearing the invalid calculation jobs, data storage and data calls associated with the departed employee or the closed business.
2. The method of claim 1, wherein deleting an abnormally interrupted computing job as an invalid computing job if the data resource is a computing job, comprises:
judging whether the computing operation is abnormally interrupted or not according to the output log of the computing operation; and if so, determining the abnormally interrupted computing operation as an invalid computing operation, and deleting the invalid computing operation.
3. The method of claim 1, wherein searching for invalid data resources in the various types of data resources according to the invalid mining rule to determine whether invalid data resources exist in the various types of data resources comprises:
determining whether invalid data resources of a departed employee exist according to the binding relationship between the various types of data resources and the human resource system, wherein the invalid data resources of the departed employee comprise at least one of invalid calculation jobs, invalid data storage and invalid data calls of the departed employee.
4. The method of claim 1, wherein searching for invalid data resources in the various types of data resources according to the invalid mining rule to determine whether invalid data resources exist in the various types of data resources comprises:
determining whether invalid data resources of a closed service exist according to the binding relationship between the various types of data resources and the service management system, wherein the invalid data resources of the closed service comprise at least one of invalid calculation jobs, invalid data storage and invalid data calls of the closed service.
5. The method according to any one of claims 2 to 4, wherein clearing the invalid data resources in a clearing manner corresponding to the invalid data resources comprises:
taking the invalid computing job offline, and cleaning up the garbage that the invalid computing job generated during historical computing; or
taking the invalid data storage offline; or
taking the invalid data call offline.
6. An apparatus for clearing invalid data resources, comprising:
the invalid data resource searching module is used for searching various types of data resources according to an invalid mining rule;
the invalid data resource confirming module is used for determining whether invalid data resources exist in various types of data resources;
the clearing mode determining module is used for determining a corresponding clearing mode according to the invalid data resources, wherein the invalid data resources comprise at least one of invalid query operation, invalid calculation operation, invalid data storage and invalid data calling;
an invalid query job deleting module, configured to, when the data resource is a query job, calculate similarity between query jobs corresponding to each department, select a query job from the query jobs having similarity greater than a first threshold, delete unselected query jobs as the invalid query job, execute the selected query job to obtain a query result, and establish a public data mart table according to the query result, where the query job includes one or more parameters of a query type, a query table, a query field, a query amount, and the like;
an invalid calculation job deleting module, configured to, if the data resource is a calculation job and the number of output failures recorded in the output log of the calculation job is greater than a second threshold, determine that the calculation job is an abnormally interrupted calculation job, and delete the abnormally interrupted calculation job as an invalid calculation job;
an invalid data deleting module, configured to select a temporary data storage table from the repeated temporary data storage tables when the data resource is data storage, and delete the unselected temporary data storage tables as invalid data storage; and, when an employee leaves or a business is closed, to mine and clear the invalid calculation jobs, data storage and data calls associated with the departed employee or the closed business.
7. The apparatus of claim 6, wherein the invalid data store deletion module comprises:
a temporary data storage table scanning unit, configured to scan each temporary data storage table when the data resource is data storage;
an invalid data storage confirming unit, configured to select one temporary data storage table from the repeated temporary data storage tables, and determine the unselected temporary data storage tables as invalid data storage;
and an invalid data storage deleting unit, configured to delete the invalid data storage.
8. The apparatus of claim 6, wherein the invalid computing job deletion module comprises:
an invalid calculation job confirmation unit, configured to, when the data resource is a calculation job, determine whether the calculation job is abnormally interrupted according to an output log of the calculation job, and if so, determine the abnormally interrupted calculation job as an invalid calculation job;
an invalid calculation job deletion unit configured to delete the invalid calculation job.
9. The apparatus of claim 6, further comprising:
a departed-employee data resource confirming module, configured to determine whether invalid data resources of a departed employee exist according to the binding relationship between the various types of data resources and the human resource system, wherein the invalid data resources of the departed employee comprise at least one of invalid calculation jobs, invalid data storage and invalid data calls of the departed employee.
10. The apparatus of claim 6, further comprising:
a closed-service data resource confirming module, configured to determine whether invalid data resources of a closed service exist according to the binding relationship between the various types of data resources and the service management system, wherein the invalid data resources of the closed service comprise at least one of invalid calculation jobs, invalid data storage and invalid data calls of the closed service.
11. The apparatus of any of claims 7 to 10, further comprising:
an invalid data resource offline module, configured to take the invalid computing job offline and clean up the garbage that the invalid computing job generated during historical computing, to take the invalid data storage offline, or to take the invalid data call offline.
12. An apparatus for clearing invalid data resources, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method recited in any of claims 1-5.
13. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN201810302134.7A 2018-04-04 2018-04-04 Method, device and equipment for clearing invalid data resources and computer readable medium Active CN108536812B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810302134.7A CN108536812B (en) 2018-04-04 2018-04-04 Method, device and equipment for clearing invalid data resources and computer readable medium


Publications (2)

Publication Number Publication Date
CN108536812A CN108536812A (en) 2018-09-14
CN108536812B true CN108536812B (en) 2020-05-08

Family

ID=63483234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810302134.7A Active CN108536812B (en) 2018-04-04 2018-04-04 Method, device and equipment for clearing invalid data resources and computer readable medium

Country Status (1)

Country Link
CN (1) CN108536812B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112116173A (en) * 2019-06-19 2020-12-22 中国石油化工股份有限公司 Invalid operation reduction system
CN112116118A (en) * 2019-06-19 2020-12-22 中国石油化工股份有限公司 Operation reduction early warning system based on data mining
CN112116303A (en) * 2019-06-19 2020-12-22 中国石油化工股份有限公司 Operation ticket early warning system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6185552B1 (en) * 1998-03-19 2001-02-06 3Com Corporation Method and apparatus using a binary search engine for searching and maintaining a distributed data structure
US8612669B1 (en) * 2010-06-28 2013-12-17 Western Digital Technologies, Inc. System and method for performing data retention in solid-state memory using copy commands and validity and usage data
US9633094B2 (en) * 2014-04-25 2017-04-25 Bank Of America Corporation Data load process
CN104317534A (en) * 2014-10-29 2015-01-28 小米科技有限责任公司 Method and device for cleaning data storage space
CN104317627B (en) * 2014-11-13 2017-06-06 北京奇虎科技有限公司 The cleaning data one-key scan method and device of memory space
CN104317628A (en) * 2014-11-13 2015-01-28 北京奇虎科技有限公司 Mobile terminal and storage space cleaning method thereof



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant