CN112395270A - Data management method, device, equipment and storage medium
- Publication number
- CN112395270A, CN202011360310.6A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data management method, apparatus, device, and storage medium. The data management method comprises: after a data acquisition instruction is received, collecting device data in parallel through a thread pool at intervals of a first preset time to obtain short-term data, and storing the short-term data in a first database; before the short-term data expires, sampling the short-term data from the first database to obtain long-term data, storing the long-term data in a second database, and, when the data volume of a single table exceeds a preset threshold, storing the data in separate tables by device. In this technical scheme, short-term data collected in parallel through a thread pool is stored in the first database, the short-term data is then sampled to obtain long-term data that is stored in the second database, and per-device table splitting prevents any single table from growing too large, so that data is collected faster and retrieval efficiency is maintained even when the overall data volume is very large.
Description
Technical Field
Embodiments of the present invention relate to data processing technologies, and in particular, to a data management method, apparatus, device, and storage medium.
Background
In the field of software-defined networking, the controller, as the logical and control center, needs to hold global network information in order to manage and configure the network. As the number of forwarding devices managed by the controller grows, the time needed to collect data from all devices increases, and the volume of device data grows linearly with the number of devices. Once the number of devices reaches a certain scale, collecting and storing device data becomes a serious challenge, and the storage pressure in turn becomes performance pressure when retrieving the data. Therefore, when a controller manages a large number of devices, the collection, storage, and retrieval of device data all become difficult.
In the prior art, device data is commonly collected in one of two ways: the controller actively sends a query command to the device, or the device actively pushes device data to the controller via telemetry. Active querying by the controller is more widely applicable, because not all devices support telemetry. However, it is time consuming, and with a large number of devices it is practically infeasible to query each device serially, one after another. For storing device data, traditional relational databases are mostly used at present. Without degrading query performance, a single table in such a database can hold on the order of millions of rows, which greatly limits both how long device data can be retained and how fine its time granularity can be. An oversized single table also makes device data inconvenient to retrieve.
Disclosure of Invention
The invention provides a data management method, apparatus, device, and storage medium for collecting, storing, and retrieving data at the scale of thousands of devices and hundreds of millions of data records.
In a first aspect, an embodiment of the present invention provides a data management method, including:
after a data acquisition instruction is received, acquiring equipment data in parallel in a thread pool mode at intervals of first preset time to obtain short-term data, and storing the short-term data to a first database;
before the short-term data is monitored to be out of date, sampling and extracting the short-term data from the first database to obtain long-term data, and storing the long-term data to a second database.
Further, collecting device data in parallel in a thread pool mode at intervals of a first preset time to obtain short-term data, and storing the short-term data in a first database, wherein the method comprises the following steps:
determining the number of the core threads according to the number of the devices in a management range, wherein the number of the core threads is the maximum number of threads which run simultaneously;
and acquiring the equipment data of all equipment in a management range in batches at intervals of first preset time to obtain the short-term data, and storing the short-term data to the first database.
Further, collecting the device data of all devices in a management range in batches at intervals of a first preset time to obtain the short-term data, and storing the short-term data to the first database, including:
and after the equipment data collection of the equipment in any batch is monitored to be completed, supplementing the candidate equipment in the candidate queue to the core thread for equipment data collection.
Further, sampling and extracting the short-term data from the first database to obtain long-term data, and storing the long-term data to a second database, including:
and if the data volume of the long-term data is larger than a preset threshold value, respectively storing the equipment data in different data tables according to equipment in the second database.
Further, the names of the data tables in the second database include original table names and device names.
Further, the first database corresponds to a first expiration policy,
after collecting device data in parallel through a thread pool at intervals of the first preset time to obtain short-term data and storing the short-term data in the first database, the method further comprises:
clearing the short-term data with storage duration longer than the first expiration policy.
Further, the second database corresponds to a second expiration policy, the first expiration policy being less than the second expiration policy,
after sampling and extracting the short-term data from the first database to obtain long-term data and storing the long-term data to a second database, the method further comprises the following steps:
clearing the long-term data with storage duration longer than the second expiration policy.
In a second aspect, an embodiment of the present invention further provides a data management apparatus, including:
the acquisition and storage module is used for acquiring equipment data in parallel in a thread pool mode at intervals of first preset time after receiving a data acquisition instruction to obtain short-term data and storing the short-term data to a first database;
and the extraction and storage module is used for sampling and extracting the short-term data from the first database to obtain long-term data and storing the long-term data to a second database before monitoring that the short-term data is out of date.
In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the data management method according to any one of the first aspect when executing the program.
In a fourth aspect, embodiments of the present invention also provide a storage medium containing computer-executable instructions for performing the data management method according to any one of the first aspect when executed by a computer processor.
After a data acquisition instruction is received, equipment data are acquired in parallel in a thread pool mode at intervals of first preset time to obtain short-term data, and the short-term data are stored in a first database; before the short-term data is monitored to be out of date, sampling and extracting the short-term data from the first database to obtain long-term data, and storing the long-term data to a second database. According to the technical scheme, the short-term data acquired in parallel in a thread pool mode are stored in the first database, the short-term data in the first database are extracted to obtain the long-term data, and the long-term data are stored in the second database, so that the data are acquired more rapidly, and the data are stored more conveniently for equipment data retrieval.
Drawings
Fig. 1 is a flowchart of a data management method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a data management method according to a second embodiment of the present invention;
Fig. 3 is a schematic diagram of the relationships among acquisition, storage, and retrieval in a data management method according to the second embodiment of the present invention;
Fig. 4 is a structural diagram of a data management apparatus according to a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like. In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
Embodiment One
Fig. 1 is a flowchart of a data management method according to a first embodiment of the present invention. This embodiment is applicable to collecting device data from multiple devices. The method may be executed by a computer and specifically includes the following steps:
Step 110: after a data acquisition instruction is received, collect device data in parallel through a thread pool at intervals of a first preset time to obtain short-term data, and store the short-term data in a first database.
The device may include a network device such as a gateway, a router, or a forwarding device, or may also include a factory device such as a cutting machine, a welding machine, or a press, and each device may include at least one device interface, and the device interface may transmit a large amount of device data. In practical applications, the device may comprise any device that needs to collect interface data.
The first preset time may be set according to the number of devices and the data amount that need to be collected actually, for example, device data may be collected from the device interface once every 10 seconds.
In thread pool technology, the number of core threads may serve as the upper limit on how many devices are collected from at one time, and it may be determined according to the number of devices whose data currently needs to be collected. In the data management method provided in this embodiment, the number of devices collected from at one time may equal the number of core threads. The number of core threads may be set to a value that evenly divides the number of devices, so that few threads sit idle while the last batch is collected and the thread pool resources are used more efficiently.
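As a minimal sketch of the sizing rule above, the following hypothetical helper picks the largest core-thread count that evenly divides the number of devices without exceeding a cap. The function name and the `max_threads` cap are assumptions made for illustration; the description only states that the count should divide the device count evenly and that it depends on controller performance.

```python
def pick_core_threads(num_devices: int, max_threads: int) -> int:
    """Return the largest divisor of num_devices that is <= max_threads.

    Hypothetical helper: max_threads stands in for whatever limit the
    controller's performance imposes.
    """
    if num_devices < 1 or max_threads < 1:
        raise ValueError("num_devices and max_threads must be positive")
    for candidate in range(min(num_devices, max_threads), 0, -1):
        if num_devices % candidate == 0:
            return candidate
    return 1  # not reached: 1 divides every positive integer


if __name__ == "__main__":
    print(pick_core_threads(1000, 300))
```

With 1000 devices and a cap of 300 threads, the helper settles on 250, the value used in the second embodiment.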
The first database may be the time-series database InfluxDB. InfluxDB can automatically clear data older than a specified time through an expiration policy, which removes the need for the scheduled deletion jobs required with conventional relational databases such as MySQL and PostgreSQL. In addition, while maintaining query efficiency, InfluxDB can store data volumes on the order of ten million records, so more device data can be kept. In practical applications, the first database may also be any database that supports storing data across multiple tables.
The short-term data stored in the first database has a fine time granularity and a short retention time, and can be used for real-time queries. The expiration policy of the first database may be one hour, in which case the first database automatically clears device data stored for longer than one hour, so that it always holds recent data that is easy to extract.
Specifically, after receiving the data acquisition instruction, the device data may be acquired from the device interfaces of all the devices within the management range at intervals of 10 seconds to obtain short-term data, and then the short-term data may be stored in the first database.
It should be noted that, while as many devices as there are core threads are being collected from, once data collection for any one device completes, another device that has not yet been collected from may take over the freed core thread so that collection continues. For example, the controller may currently need to collect device data from 1000 devices with 250 core threads; device data of 250 devices is then collected at the same time, and whenever collection for the device in any core thread completes, one of the remaining 750 devices fills the freed slot.
In addition, the number of core threads is also related to the performance of the current controller, and the stronger the performance of the controller is, the more the number of devices capable of simultaneously performing device data acquisition is, the larger the number of corresponding core threads may be.
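The pooled collection described above might be sketched as follows, using Python's standard `concurrent.futures.ThreadPoolExecutor` as a stand-in for the thread pool. `collect_device` is a placeholder (an assumption, not the patent's query mechanism) that merely sleeps and returns a dummy record.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def collect_device(device_name: str) -> dict:
    """Placeholder for one device query; a real controller would poll the device here."""
    time.sleep(0.01)  # stands in for the per-device collection latency
    return {"device": device_name, "timestamp": time.time(), "status": "UP"}


def collect_all(device_names: list[str], core_threads: int) -> list[dict]:
    """Collect from every device using at most core_threads concurrent workers.

    Devices beyond the first core_threads wait in the executor's internal
    queue; each one is picked up as soon as any worker becomes free, which
    mirrors the slot-filling behaviour described in the text.
    """
    with ThreadPoolExecutor(max_workers=core_threads) as pool:
        # map() returns results in the same order as device_names
        return list(pool.map(collect_device, device_names))


if __name__ == "__main__":
    devices = [f"device{i}" for i in range(1, 41)]
    short_term = collect_all(devices, core_threads=10)
    print(len(short_term), "records collected")
```

In a real deployment the returned records would be written to the first database rather than kept in a list.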
To address the excessive data volume and low query efficiency of single-table storage, InfluxDB can store device data in separate tables, which keeps the data volume of each table below the ten-million level and thus preserves retrieval efficiency. In the first database, the device data collected from multiple devices can be stored in different data tables by device, so that device data can be retrieved table by table. Retrieval requirements may include: retrieving device data of a specified device within a specified time period; retrieving device data of a specified interface of a specified device within a specified time period; retrieving device data whose interface state was DOWN within a specified time period; or retrieving device data whose interface state was UP within a specified time period.
After the device data are stored in the first database according to the device sub-tables, the data amount difference of each data table is not large, so that the situation that the data amount of part of the data tables is large and the data amount of other data tables is small is not easy to occur. Furthermore, the data amount of each data table can be estimated according to the number of interfaces of the device and the storage time.
In addition, at the time of device data retrieval, a data table to be queried may be determined based on the input device name. The device name is equivalent to creating an index for all the device data stored in the database, so that the device data query is facilitated.
Furthermore, when the number of the devices to be managed is increased from 1000 to 1500, 500 data tables can be automatically created according to the names of the 500 new devices, and the 500 data tables are used for storing the device data collected from the new devices.
Step 120: before the short-term data expires, sample the short-term data from the first database to obtain long-term data, and store the long-term data in a second database.
The way of sampling and extracting short-term data from the first database may include: extracting equipment data of a preset time point in a preset time period every other preset time period; extracting an average value or a median value of the equipment data in a preset time period; and extracting the equipment data of the last time point or the first time point in the preset time period every other preset time period. In practical application, other modes of sampling and extracting the device data can be included, and the setting can be carried out according to actual requirements.
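The sampling options listed above (first or last point in a window, or the mean or median over a window) can be illustrated with a small in-memory sketch. The function name, the `(timestamp, value)` tuple layout, and the window alignment are assumptions for illustration only.

```python
from statistics import mean, median


def downsample(points, window_seconds, method="last"):
    """Reduce (timestamp, value) points to one value per time window.

    points: iterable of (timestamp_in_seconds, numeric_value) pairs.
    method: "first", "last", "mean", or "median".
    Returns a list of (window_start, value) pairs sorted by window_start.
    """
    reducers = {
        "first": lambda vals: vals[0],
        "last": lambda vals: vals[-1],
        "mean": mean,
        "median": median,
    }
    if method not in reducers:
        raise ValueError(f"unknown method: {method}")
    windows = {}
    for ts, value in sorted(points):
        # Align each point to the start of the window that contains it
        window_start = int(ts // window_seconds) * window_seconds
        windows.setdefault(window_start, []).append(value)
    return [(start, reducers[method](vals)) for start, vals in sorted(windows.items())]
```

Calling `downsample(points, 600, "last")` keeps one record per ten-minute window, matching the ten-minute sampling interval used as an example in this embodiment.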
The time granularity of the long-term data stored in the second database is sparse, the storage time is long, and the long-term data can be used for historical query. The collected equipment data can be stored in a first database, and then the short-term data is periodically sampled from the first database and stored in a second database. The expiration policy of the second database may be one year, the second database may automatically clear the device data stored for more than one year, and the data size of each data table in the second database may be controlled to not exceed ten million levels, so as to facilitate the retrieval of the device data.
Specifically, before the expiration of the short-term data is monitored, one piece of equipment data can be sampled from the short-term data in ten minutes in the first database every ten minutes and stored in the second database.
It should be noted that, if so many devices are connected to the controller that the volume of device data exceeds the preset threshold below which query efficiency is assured, the device data may be stored in separate tables; the preset threshold may be on the order of ten million records. Tables may be split by device, and each table may be named in the form "original table name_device name", so that the table for a given device can be determined quickly during storage and retrieval. This naming scheme requires that the name of each device connected to the controller be unique.
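The table-splitting rule can be sketched as a small decision function. The name `table_for`, the `expected_rows` parameter, and the exact threshold of 10,000,000 are assumptions for illustration; the description only gives "ten million-level" as an example threshold.

```python
TEN_MILLION = 10_000_000  # example threshold magnitude taken from the text


def table_for(original_table: str, device_name: str, expected_rows: int,
              threshold: int = TEN_MILLION) -> str:
    """Return the name of the table a device's records should be written to.

    Below the threshold everything stays in the original table; above it,
    each device gets its own table named "<original table>_<device name>".
    Device names are assumed to be unique.
    """
    if expected_rows <= threshold:
        return original_table
    return f"{original_table}_{device_name}"
```

For example, `table_for("traffic", "device1", 525_600_000)` yields `traffic_device1`, while a volume of 3,600,000 rows leaves the data in the single `traffic` table.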
It will be appreciated that, in retrieving device data, immediate data may be retrieved from a first database storing short-term data and historical data may be retrieved from a second database storing long-term data. If the sub-table storage is carried out according to the equipment during the storage, the sub-table query is carried out according to the equipment during the retrieval, and then the result is returned after being summarized. The retrieval performance is ensured by controlling the data amount of each table not to exceed ten million levels when storing.
It will also be appreciated that "first" and "second" merely distinguish the two databases and do not imply any order or position. The first database and the second database have different expiration policies, and when device data is read or written, the two can be distinguished by explicitly specifying the database name and the expiration policy name.
When several databases need to be used in the same data management process, a database may be specified explicitly as follows: select * from monitor_history.policy_history.traffic_device1. Here, monitor_history is the database name, policy_history is the expiration policy name, and traffic_device1 is the data table name; the three parts are joined by periods (".").
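As a hedged illustration of the dotted naming scheme, the helpers below assemble a fully qualified table name and a matching query string. Both function names are hypothetical, the `select *` form is an assumption about the intended statement, and no quoting or escaping is attempted, which a real query layer would need.

```python
def qualified_table(database: str, policy: str, table: str) -> str:
    """Join database, expiration policy, and table names with periods."""
    return ".".join((database, policy, table))


def select_all_statement(database: str, policy: str, table: str) -> str:
    """Build a 'select *' statement against a fully qualified table name."""
    return f"select * from {qualified_table(database, policy, table)}"


if __name__ == "__main__":
    print(select_all_statement("monitor_history", "policy_history", "traffic_device1"))
```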
The embodiment of the invention provides a data management method, which comprises the following steps: after a data acquisition instruction is received, acquiring equipment data in parallel in a thread pool mode at intervals of first preset time to obtain short-term data, and storing the short-term data to a first database; before the short-term data is monitored to be out of date, sampling and extracting the short-term data from the first database to obtain long-term data, and storing the long-term data to a second database. According to the technical scheme, the short-term data acquired in parallel in a thread pool mode are stored in the first database, the short-term data in the first database are extracted to obtain the long-term data, and the long-term data are stored in the second database, so that the data are acquired more rapidly, and the data are stored more conveniently for equipment data retrieval.
Embodiment Two
Fig. 2 is a flowchart of a data management method according to a second embodiment of the present invention, which is elaborated on the basis of the foregoing embodiment. In this embodiment, the method may further include:
Fig. 3 is a schematic diagram of the relationships among acquisition, storage, and retrieval in the data management method according to the second embodiment of the present invention. As shown in Fig. 3, device data of all devices is collected in parallel through a thread pool, and the collected device data is stored short-term in a first database; device data is then sampled from the first database and stored in a second database, which may hold long-term data. To preserve the retrieval performance of the second database, the device data may be split into tables by device, with the device data of each device stored in its own data table.
In addition, the first database may also store device data in separate tables. During retrieval, the device data of each requested device may be retrieved from its data table, and the results for all requested devices may then be combined.
In this embodiment, the number of the devices in the management range of the controller may be 1000, the controller needs to monitor the device data of 1000 devices, each device has 10 interfaces on average, and the average time for collecting one device data is 800 milliseconds.
To collect device data from 1000 devices within 10 seconds, multi-threaded concurrency may be used, with thread resources managed by a thread pool. The thread pool parameters in this embodiment may be: 250 core threads, a maximum of 300 threads, and a queue capacity of 1000. Device data of 250 devices can then be collected in parallel in each batch, so that collection from all 1000 devices completes in four batches.
The number of devices, 1000, is evenly divisible by the number of core threads, 250, which keeps the number of idle threads in the last batch as small as possible.
The average time for collecting the equipment data in each batch can be 1.1 seconds, and the time for collecting the equipment data in four batches can be less than 5 seconds. And the total time can be between 5 seconds and 7 seconds by adding the time of data preprocessing and data storage, so that the acquisition of the equipment data of 1000 equipment can be ensured to be completed within 10 seconds.
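The timing estimate above amounts to multiplying the number of batches by the average batch time. A minimal sketch, with a hypothetical function name and the figures from this embodiment as inputs:

```python
import math


def estimated_collection_seconds(num_devices: int, core_threads: int,
                                 batch_seconds: float) -> float:
    """Estimate total collection time as (number of batches) x (time per batch)."""
    batches = math.ceil(num_devices / core_threads)
    return batches * batch_seconds


if __name__ == "__main__":
    total = estimated_collection_seconds(1000, 250, 1.1)
    print(f"{total:.1f} s for collection alone")
```

With 1000 devices, 250 core threads, and 1.1 seconds per batch, the estimate is four batches and roughly 4.4 seconds, consistent with the "less than 5 seconds" stated above.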
In this embodiment, after the data acquisition instruction is received, the number of core threads may be determined according to the number of devices the controller needs to manage, such that the number of core threads evenly divides the number of devices. The number of core threads also depends on the performance of the controller: when the controller is powerful enough, the number of core threads may be set to 1000, so that device data of all 1000 devices is collected in a single batch.
Step 220: collect the device data of all devices in the management range in batches at intervals of the first preset time to obtain short-term data, and store the short-term data in the first database.
The time granularity of the first database may be 10 seconds, and correspondingly, the device data of all the devices may be collected every 10 seconds. Of course, the first preset time can also be determined according to actual requirements.
With 1000 devices averaging 10 interfaces each, and device data collected every 10 seconds (six times per minute, 60 minutes per hour), the maximum amount of short-term data stored in the first database within one hour may be 1000 × 10 × 6 × 60 = 3,600,000 records.
From the above calculation, it can be known that the data amount stored in the first database in a short period is in the million level, which is less than ten million level, and the sub-table storage is not required.
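The short-term capacity check can be reproduced with a few lines of arithmetic. The helper name and parameter names are assumptions; the numbers are those of this embodiment, and 10,000,000 is used as the example threshold.

```python
def max_records(devices: int, interfaces_per_device: int,
                samples_per_hour: int, retention_hours: int) -> int:
    """Upper bound on stored records: one record per interface per sample."""
    return devices * interfaces_per_device * samples_per_hour * retention_hours


TEN_MILLION = 10_000_000  # example threshold magnitude from the text

# One sample every 10 seconds is 6 per minute, i.e. 360 per hour; retention is 1 hour.
short_term = max_records(1000, 10, samples_per_hour=6 * 60, retention_hours=1)
print(short_term, short_term < TEN_MILLION)
```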
In addition, after the completion of the device data acquisition of the devices in any batch is monitored, the candidate devices in the candidate queue are relocated to the core thread for device data acquisition.
When acquiring the device data of the devices with the number of the core threads, the other devices may wait for acquisition in the candidate queue, and the device that does not perform device data acquisition may be the candidate device. In order to enable the core thread to collect the device data all the time, namely to ensure that the core thread is in operation all the time, and to minimize the time spent on collecting the device data, after the device data collection of any device in the core thread is completed, any candidate device is extracted from the candidate queue to the core thread, and the device data collection is continued.
The average time for collecting the device data of one device is 800 milliseconds, and the time for collecting each device is not necessarily equal, so the average time for collecting the device data of 250 devices can be 1.1 seconds, and after the device data collection of any device is completed, the candidate devices in the candidate queue are replaced to the core thread for device data collection. Further ensuring that the acquisition of the device data of 1000 devices is completed within 10 seconds.
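To illustrate why refilling a freed core thread immediately can shorten the overall collection time when per-device times differ, the sketch below compares two simple scheduling models. Both functions and the sample durations are illustrative assumptions, not measurements from the source.

```python
import heapq


def makespan_with_backfill(durations, core_threads):
    """Finish time when a free worker immediately takes the next queued device."""
    workers = [0.0] * core_threads  # min-heap of times at which workers become free
    heapq.heapify(workers)
    finish = 0.0
    for d in durations:
        start = heapq.heappop(workers)  # earliest-free worker takes the device
        end = start + d
        finish = max(finish, end)
        heapq.heappush(workers, end)
    return finish


def makespan_strict_batches(durations, core_threads):
    """Finish time when each batch must fully complete before the next starts."""
    total = 0.0
    for i in range(0, len(durations), core_threads):
        total += max(durations[i:i + core_threads])
    return total


if __name__ == "__main__":
    sample = [1.0, 0.2, 0.2, 1.0]  # made-up per-device collection times, in seconds
    print(makespan_with_backfill(sample, 2), makespan_strict_batches(sample, 2))
```

In this made-up example with two workers, strict batching takes 2.0 seconds because each batch waits for its slowest device, while backfilling finishes in about 1.4 seconds.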
In this embodiment, the short-term data may be stored in the first database. When the data volume of the short-term data is smaller than a preset threshold value, the short-term data does not need to be stored in a sub-table mode; of course, when the data amount of the short-term data is greater than the preset threshold, the short-term data also needs to be stored in a sub-table according to the devices, and each device may correspond to a corresponding data table.
The expiration policy may be an inherent attribute of the database, the first database may correspond to the first expiration policy, the first expiration policy may be 1 hour because the first database is a database storing short-term data, and the first database may automatically clear device data stored in the first database for a period of time longer than 1 hour.
In this embodiment, the first database is used to store the short-term data, so that the short-term data stored in the first database for a time longer than the first expiration policy can be cleared in time, and the instant device data stored in the first database can be ensured as much as possible, thereby facilitating the retrieval of the instant device data.
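The effect of an expiration policy can be pictured with an in-memory sketch. In the described system the database clears expired data automatically; the function below is only a hypothetical illustration of the rule, with records assumed to carry a `ts` field in seconds.

```python
ONE_HOUR = 3600  # example first expiration policy, in seconds


def purge_expired(records, now, retention_seconds=ONE_HOUR):
    """Keep only records whose age does not exceed the retention period."""
    return [r for r in records if now - r["ts"] <= retention_seconds]
```

Applying the same function with a retention of one year would model the second expiration policy.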
Step 240, sampling and extracting the short-term data from the first database to obtain long-term data, and storing the long-term data in a second database.
Specifically, the time granularity of the second database may be 10 minutes and its expiration policy may be 1 year. Correspondingly, device data may be sampled and extracted from the first database every 10 minutes, and device data stored in the second database for more than 1 year may be cleared automatically.
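One possible way to downsample to a 10-minute granularity is sketched below. The exact sampling rule is described in the first embodiment and is not reproduced here; keeping the first record per device in each 10-minute bucket is an assumption made only for illustration, and `sample_long_term` is a hypothetical name.

```python
def sample_long_term(short_term, granularity_seconds=600):
    """Keep the first record per device in each `granularity_seconds` bucket.

    `short_term` is an iterable of (timestamp_seconds, device, value) tuples.
    """
    seen = set()
    sampled = []
    for ts, device, value in sorted(short_term, key=lambda r: r[0]):
        bucket = (device, ts // granularity_seconds)
        if bucket not in seen:
            seen.add(bucket)
            sampled.append((ts, device, value))
    return sampled
```

For two devices polled every 10 seconds over 20 minutes, this keeps two records per device (at 0 and 600 seconds), so the long-term store grows far more slowly than the short-term one.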
Accordingly, device data from within the last hour can be retrieved directly from the short-term database (the first database), while device data older than 1 hour can be retrieved from the long-term database (the second database). If a query involves multiple devices, each device's table can be queried separately, and the results are aggregated before being returned.
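The retrieval rule can be sketched as two small helpers. The names `choose_database` and `query_devices` are hypothetical, and the per-device tables are modelled as plain lists rather than real database tables.

```python
from datetime import datetime, timedelta

SHORT_TERM_WINDOW = timedelta(hours=1)  # the 1-hour first expiration policy


def choose_database(query_time, now):
    """Return which store to search for data recorded at `query_time`."""
    return "short_term" if now - query_time <= SHORT_TERM_WINDOW else "long_term"


def query_devices(tables, devices, start, end):
    """Query each device's table separately, then merge the results by timestamp.

    `tables` maps a device name to a list of (timestamp, value) rows.
    """
    merged = []
    for device in devices:
        for ts, value in tables.get(device, []):
            if start <= ts <= end:
                merged.append((ts, device, value))
    merged.sort(key=lambda row: (row[0], row[1]))
    return merged
```

A query for data 20 minutes old is routed to the short-term store, one for data 3 hours old to the long-term store; a multi-device query touches one table per device and returns a single merged list.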
In addition, the specific manner of sampling and extracting the short-term data from the first database to obtain the long-term data has already been described in detail in the first embodiment, and is not described herein again.
In one embodiment, step 240 may specifically include:
if the short-term data comprises the device data of at least two devices, the device data of the at least two devices are respectively stored in different data tables in the second database.
For example, with 1000 devices averaging 10 interfaces each, running 24 hours a day and 365 days a year, and with device data sampled from the first database every ten minutes (six times an hour), the maximum number of long-term records stored in the second database in one year is 1000 × 10 × 6 × 24 × 365 = 525,600,000.
From the above calculation, the long-term data volume is on the order of hundreds of millions of records, which is greater than the threshold, so storage in separate tables is required. In this embodiment, the tables may be split by device: the second database has 1000 tables, and each table holds at most 525,600 records, which is less than ten million, so retrieval performance can be ensured. As shown in fig. 3, there is one table for short-term storage and 1000 tables for long-term storage, corresponding to the 1000 devices respectively; the dotted lines in the figure indicate the one-to-one correspondence between the devices and the data tables.
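The arithmetic above can be checked directly. In this sketch the ten-million figure is treated as the preset threshold, which is an assumption based on the "less than ten million" remark; the function names are hypothetical.

```python
ASSUMED_THRESHOLD = 10_000_000  # assumption: the "ten million" figure is the preset threshold


def yearly_records(devices=1000, interfaces=10, samples_per_hour=6):
    """Records written to the long-term store in one year (24 h/day, 365 days)."""
    return devices * interfaces * samples_per_hour * 24 * 365


def needs_separate_tables(record_count, threshold=ASSUMED_THRESHOLD):
    """True when a single table would exceed the threshold."""
    return record_count > threshold


total = yearly_records()   # all devices in one table
per_table = total // 1000  # one table per device
```

Here `total` is 525,600,000 and exceeds the assumed threshold, while `per_table` is 525,600 and does not, which is why splitting by device keeps every table small.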
In this embodiment, for convenience of retrieval, if the controller needs to acquire the device data of at least two devices at the same time, the acquired device data may be stored in separate tables.
In one embodiment, the names of the data tables in the second database include original table names and device names.
For example, the name of the data table may be "traffic_device1", where "traffic" is the original table name of the data table and "device1" is the name of a device; the device name is unique among the devices.
In this embodiment, because the name of a data table includes the original table name and the device name, the data table that stores the device data of a given device can be identified quickly at retrieval time.
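A minimal sketch of per-device tables named from the original table name and the device name follows. It uses an in-memory SQLite database purely for illustration; the actual database, the column layout (`ts`, `value`), and the helper names `table_name` and `store` are assumptions. Because table names cannot be bound as SQL parameters, the sketch validates them before building the statement.

```python
import re
import sqlite3

_NAME = re.compile(r"[A-Za-z0-9_]+")


def table_name(original, device):
    """Build '<original table name>_<device name>', rejecting unsafe identifiers."""
    if not (_NAME.fullmatch(original) and _NAME.fullmatch(device)):
        raise ValueError("table and device names must be alphanumeric or underscore")
    return f"{original}_{device}"


def store(conn, original, device, ts, value):
    """Insert one record into the device's own table, creating it on first use."""
    name = table_name(original, device)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {name} (ts INTEGER, value REAL)")
    conn.execute(f"INSERT INTO {name} (ts, value) VALUES (?, ?)", (ts, value))
    return name
```

Storing a record for `device1` under the original table name `traffic` creates and fills the table `traffic_device1`, so a later lookup for that device can go straight to its table.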
The second database may correspond to a second expiration policy, and because the second database is a database storing long-term data, the second expiration policy may be 1 year, and the second database may automatically clear device data stored in the second database for a period of time longer than 1 year.
In this embodiment, the second database is used to store the long-term data, so that the long-term data stored in the second database for a time longer than the second expiration policy can be removed in time, and the data stored in the second database can be ensured as much as possible to be historical device data, thereby facilitating the retrieval of the historical device data.
According to the technical scheme of this embodiment, after a data acquisition instruction is received, the number of core threads is determined according to the number of devices in the management range, the device data of all devices in the management range is collected in batches at intervals of the first preset time to obtain short-term data, and the short-term data is stored in a first database. The first database corresponds to a first expiration policy, and short-term data whose storage duration is longer than the first expiration policy is cleared. The short-term data is sampled and extracted from the first database to obtain long-term data, and the long-term data is stored in a second database. The second database corresponds to a second expiration policy, the first expiration policy being shorter than the second expiration policy, and long-term data whose storage duration is longer than the second expiration policy is cleared. Because the short-term data collected in parallel by the thread pool is stored in the first database, and the long-term data extracted from the short-term data is stored in the second database, data is collected more quickly and is stored in a way that makes device data easier to retrieve.
Moreover, the first database can automatically clear short-term data whose storage duration is longer than the first expiration policy, and the second database can automatically clear long-term data whose storage duration is longer than the second expiration policy, so no scheduled deletion operation is needed.
In addition, when the data volume of the device data stored in the first database or the second database is greater than the preset threshold, the device data may be stored in separate tables, so that the data volume in each data table is less than the preset threshold, which is convenient for querying the device data and managing the device data of the newly added device.
Example three
Fig. 4 is a structural diagram of a data management apparatus according to a third embodiment of the present invention, where the apparatus is suitable for a situation where a controller needs to collect device data of multiple devices, so as to improve efficiency of device data collection and efficiency of retrieval. The apparatus may be implemented by software and/or hardware and is typically integrated in a computer.
As shown in fig. 4, the apparatus includes:
the acquisition and storage module 410 is configured to, after receiving a data acquisition instruction, collect device data in parallel using a thread pool at intervals of a first preset time to obtain short-term data, and store the short-term data in a first database;
and the extraction and storage module 420 is configured to, before the short-term data is detected to have expired, sample and extract the short-term data from the first database to obtain long-term data, and store the long-term data in a second database.
The data management device provided by this embodiment, after receiving a data acquisition instruction, collects device data in parallel using a thread pool at intervals of a first preset time to obtain short-term data, and stores the short-term data in a first database; before the short-term data is detected to have expired, it samples and extracts the short-term data from the first database to obtain long-term data, and stores the long-term data in a second database. Because the short-term data is collected in parallel by a thread pool and stored in the first database, data is collected more quickly. Because the long-term data is extracted from the short-term data in the first database and stored in the second database at a coarser time granularity, the total data volume in the long-term database is reduced. This in turn makes the stored device data easier to retrieve.
On the basis of the foregoing embodiment, the acquisition and storage module 410 is specifically configured to:
determining the number of the core threads according to the number of the devices in the management range;
and acquiring the equipment data of all equipment in a management range in batches at intervals of first preset time to obtain the short-term data, and storing the short-term data to the first database.
On the basis of the above embodiment, acquiring the device data of all devices within a management range in batches at intervals of a first preset time to obtain the short-term data, and storing the short-term data in the first database includes:
and after the equipment data collection of the equipment in any batch is monitored to be completed, supplementing the candidate equipment in the candidate queue to the core thread for equipment data collection.
On the basis of the foregoing embodiment, the extraction storage module 420 is specifically configured to:
and if the data volume of the long-term data is larger than a preset threshold value, respectively storing the equipment data in different data tables according to equipment in the second database.
In one embodiment, the names of the data tables in the second database include original table names and device names.
In one embodiment, the first database corresponds to a first expiration policy, and the apparatus further comprises:
and the first clearing module is used for clearing the short-term data with the storage duration longer than the first expiration policy.
In one embodiment, the second database corresponds to a second expiration policy, and the first expiration policy is smaller than the second expiration policy, the apparatus further comprising:
and the second clearing module is used for clearing the long-term data with the storage duration longer than the second expiration policy.
The data management device provided by the embodiment of the invention can execute the data management method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 5 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention, as shown in fig. 5, the computer device includes a processor 510, a memory 520, and a computer program stored in the memory and executable on the processor; the number of the processors 510 in the computer device may be one or more, and one processor 510 is taken as an example in fig. 5; the processor 510 and the memory 520 in the computer device may be connected by a bus or other means, as exemplified by the bus connection in fig. 5.
The memory 520 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the data management method in the embodiment of the present invention (for example, the collection storage module 410 and the extraction storage module 420 in the data management apparatus). The processor 510 executes various functional applications of the computer device and data processing by executing software programs, instructions, and modules stored in the memory 520, that is, implements the data management method described above.
The memory 520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 520 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 520 may further include memory located remotely from processor 510, which may be connected to a computer device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The computer device provided by the embodiment of the invention can execute the data management method provided by the embodiment of the invention, and has corresponding functions and beneficial effects.
Example five
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a data management method, including:
after a data acquisition instruction is received, acquiring equipment data in parallel in a thread pool mode at intervals of first preset time to obtain short-term data, and storing the short-term data to a first database;
before the short-term data is monitored to be out of date, sampling and extracting the short-term data from the first database to obtain long-term data, and storing the long-term data to a second database.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also perform related operations in the data management method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the data management apparatus, the included units and modules are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
Claims (10)
1. A method for managing data, comprising:
after a data acquisition instruction is received, acquiring equipment data in parallel in a thread pool mode at intervals of first preset time to obtain short-term data, and storing the short-term data to a first database;
before the short-term data is monitored to be out of date, sampling and extracting the short-term data from the first database to obtain long-term data, and storing the long-term data to a second database.
2. The data management method of claim 1, wherein collecting device data in parallel in a thread pool manner at intervals of a first preset time to obtain short-term data, and storing the short-term data in a first database comprises:
determining the number of the core threads according to the number of the devices in a management range, wherein the number of the core threads is the maximum number of threads which run simultaneously;
and acquiring the equipment data of all equipment in a management range in batches at intervals of first preset time to obtain the short-term data, and storing the short-term data to the first database.
3. The data management method of claim 1, wherein the collecting the device data of all devices within a management range in batches at intervals of a first preset time to obtain the short-term data and storing the short-term data in the first database comprises:
and after the equipment data collection of the equipment in any batch is monitored to be completed, supplementing the candidate equipment in the candidate queue to the core thread for equipment data collection.
4. The data management method of claim 1, wherein sampling and extracting the short-term data from the first database to obtain long-term data, and storing the long-term data to a second database comprises:
and if the data volume of the long-term data is larger than a preset threshold value, respectively storing the equipment data in different data tables according to equipment in the second database.
5. The data management method of claim 4, wherein the names of the data tables in the second database comprise original table names and device names.
6. The data management method of claim 1, wherein the first database corresponds to a first expiration policy,
and after collecting device data in parallel in a thread pool manner at intervals of the first preset time to obtain short-term data and storing the short-term data in the first database, the method further comprises:
clearing the short-term data with storage duration longer than the first expiration policy.
7. The data management method of claim 6, wherein the second database corresponds to a second expiration policy, wherein the first expiration policy is less than the second expiration policy,
after sampling and extracting the short-term data from the first database to obtain long-term data and storing the long-term data to a second database, the method further comprises the following steps:
clearing the long-term data with storage duration longer than the second expiration policy.
8. A data management apparatus, comprising:
the acquisition and storage module is used for acquiring equipment data in parallel in a thread pool mode at intervals of first preset time after receiving a data acquisition instruction to obtain short-term data and storing the short-term data to a first database;
and the extraction and storage module is used for sampling and extracting the short-term data from the first database to obtain long-term data and storing the long-term data to a second database before monitoring that the short-term data is out of date.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the data management method according to any of claims 1-7 when executing the program.
10. A storage medium containing computer-executable instructions for performing the data management method of any one of claims 1-7 when executed by a computer processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011360310.6A CN112395270A (en) | 2020-11-27 | 2020-11-27 | Data management method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011360310.6A CN112395270A (en) | 2020-11-27 | 2020-11-27 | Data management method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112395270A (en) | 2021-02-23 |
Family
ID=74605472
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011360310.6A Pending CN112395270A (en) | 2020-11-27 | 2020-11-27 | Data management method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112395270A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105912703A (en) * | 2016-04-26 | 2016-08-31 | 北京百度网讯科技有限公司 | Data storage method and data query method and device |
CN106933836A (en) * | 2015-12-29 | 2017-07-07 | 航天信息股份有限公司 | A kind of date storage method and system based on point table |
CN108241717A (en) * | 2016-12-27 | 2018-07-03 | 中国移动通信集团公司 | A kind of data processing method, apparatus and system |
CN109460438A (en) * | 2018-09-26 | 2019-03-12 | 中国平安人寿保险股份有限公司 | Message data storage method, device, computer equipment and storage medium |
CN111552687A (en) * | 2020-03-10 | 2020-08-18 | 远景智能国际私人投资有限公司 | Time sequence data storage method, query method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103942210B (en) | Processing method, device and the system of massive logs information | |
CN101673192B (en) | Method for time-sequence data processing, device and system therefor | |
CN111258978A (en) | Data storage method | |
CN105069134A (en) | Method for automatically collecting Oracle statistical information | |
CN109271435A (en) | A kind of data pick-up method and system for supporting breakpoint transmission | |
CN103514277A (en) | Task parallel processing method for electricity utilization information collection system | |
CN108241717A (en) | A kind of data processing method, apparatus and system | |
CN112084016A (en) | Flow calculation performance optimization system and method based on flink | |
CN114238388A (en) | Heterogeneous data collection and retrieval system based on multiple protocols | |
CN110083600A (en) | A kind of method, apparatus, calculating equipment and the storage medium of log collection processing | |
CN117633116A (en) | Data synchronization method, device, electronic equipment and storage medium | |
CN110515938B (en) | Data aggregation storage method, equipment and storage medium based on KAFKA message bus | |
CN111523004A (en) | Storage method and system for edge computing gateway data | |
CN112395270A (en) | Data management method, device, equipment and storage medium | |
CN115291806A (en) | Processing method, processing device, electronic equipment and storage medium | |
CN110704442A (en) | Real-time acquisition method and device for big data | |
CN113360576A (en) | Power grid mass data real-time processing method and device based on Flink Streaming | |
CN112925811A (en) | Data processing method, device, equipment, storage medium and program product | |
CN116737829A (en) | Data synchronization method and device, storage medium and electronic equipment | |
CN109739883A (en) | Promote the method, apparatus and electronic equipment of data query performance | |
CN111324513B (en) | Monitoring management method and system for artificial intelligence development platform | |
CN116010340A (en) | Data table management method and device | |
CN103268353A (en) | Power grid alarming automatic response system and power grid alarming automatic response method | |
CN114328526A (en) | Data processing method and device, electronic equipment and computer readable storage medium | |
CN104199930A (en) | System and method for acquiring and processing data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Country or region after: China; Address after: No. 9 Mozhou East Road, Nanjing City, Jiangsu Province, 211111; Applicant after: Zijinshan Laboratory; Address before: No. 9 Mozhou East Road, Jiangning Economic Development Zone, Jiangning District, Nanjing City, Jiangsu Province; Applicant before: Purple Mountain Laboratories; Country or region before: China |