
CN118394772B - Method for updating data asset in real time under change of database table - Google Patents

Info

Publication number
CN118394772B
Authority
CN
China
Prior art keywords
database
data
partition
updating
update
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410866175.4A
Other languages
Chinese (zh)
Other versions
CN118394772A (en)
Inventor
高海玲
高经郡
郭亚奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kejie Technology Co ltd
Original Assignee
Beijing Kejie Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kejie Technology Co ltd
Priority to CN202410866175.4A
Publication of CN118394772A
Application granted
Publication of CN118394772B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of databases, and in particular to a method for updating data assets in real time when database tables change. The method comprises the following steps: setting a plurality of partitions of a database; acquiring the use frequency of each partition in a single update period; setting a corresponding update priority according to the use frequency; setting a temporary storage area according to the update priority and associating it with the corresponding partition; having the temporary storage area respond to data table change instructions and update the corresponding partition when the single update period ends; and resetting each partition according to the use frequency. By partitioning the database, assigning a priority to each partition, and setting a corresponding update mode, the method improves the response speed of the database and reduces the resource consumption of database updates.

Description

Method for updating data asset in real time under change of database table
Technical Field
The invention relates to the technical field of databases, in particular to a method for updating data assets in real time under the change of a database table.
Background
In the big data era, data keeps emerging and growing exponentially, making it a core asset for organizations and enterprises. Database management systems (DBMS) play a critical role in storing, managing, and providing access to this massive volume of data. However, existing database management systems are designed mainly for static or fixed table structures and cannot adapt to dynamic changes of database tables.
Traditional data acquisition methods are generally based on a fixed acquisition strategy and structure and cannot adapt in real time to metadata changes of a database table. The acquired metadata therefore becomes inconsistent with the actual database, which reduces the accuracy and timeliness of the data. In a big data environment in particular, such late and inaccurate data acquisition can seriously affect business decisions and operations.
Chinese patent application publication No. CN117971848A discloses a database updating method and device, a computer-readable storage medium, and an electronic device. The database updating method comprises the following steps: dividing database data into a plurality of data groups according to an operation period, wherein no data dependency exists among the data groups; parsing the operation information for updating the database into single-field operations, and determining the processing priority of each field operation according to the operation type; aggregating the field operations corresponding to the same data group according to the processing priority, so that each data group obtains a queue of data operations with different processing priorities; and executing the queues concurrently so as to update the database data according to the execution results. That disclosure can improve the efficiency of database updating and maintenance.
Chinese patent grant publication No. CN114048269B discloses a method and apparatus for synchronously updating metadata in a distributed database. The distributed database comprises a plurality of database nodes, wherein a first node stores a first primary replica of a target replication table and a plurality of other second nodes store first backup replicas of the target replication table, with a consistency protocol adopted between the first primary replica and the first backup replicas. The first node may receive an update request related to first metadata of a first data table in the distributed database and, according to the update request, write update information related to the first metadata to the first primary replica. The update information is then synchronized to the first backup replicas of the second nodes based on the consistency protocol, so that the second nodes can obtain the updated first metadata from the update information when they use the first metadata.
The prior art therefore cannot monitor database table changes in real time, dynamically and efficiently, and update the data in real time. The present invention aims to solve these problems and provides a real-time data updating method for dynamic changes of database tables, so that the data and the database tables remain synchronized.
Traditional data acquisition methods cannot monitor dynamic changes of a database table in real time or update the corresponding data promptly after the table changes. Existing database management systems and data acquisition technologies are mainly oriented to fixed table structures and are difficult to adapt to dynamically changing database tables.
This highlights the need, in the big data era, for a method that updates data in real time under database table changes.
Disclosure of Invention
Therefore, the invention provides a method for updating data assets in real time under database table changes. It addresses the problem in the prior art that traditional data acquisition methods can neither monitor dynamic changes of a database table in real time nor update the corresponding data in real time, so that the accuracy and timeliness of the data cannot be ensured and database update resources are wasted.
To achieve the above object, the present invention provides a method for updating data assets in real time under database table changes, comprising:
setting a plurality of partitions of a database according to data sources;
acquiring the use frequency of the corresponding data source in each partition in a single update period;
setting an update priority for the corresponding partition according to the use frequency, and generating a corresponding ordering;
setting a temporary storage area according to the ordering and associating the temporary storage area with the corresponding partition;
the temporary storage area responding to a data table change instruction, updating the corresponding partition according to the partition association when the single update period ends, and determining the new use frequency of the corresponding data source in each partition in the update period;
resetting the ordering of each partition according to the new use frequency;
wherein the partitions are related to the type of the database, and for a single partition, the corresponding data can be read completely from the database;
the ordering is an ascending sort by update priority;
the data table change instruction includes several operations on assets in the database.
Further, the process of setting the partitions includes:
determining the data sources of the database;
dividing the database into a plurality of regions corresponding to the data sources;
wherein each region corresponds to a single data source.
Further, when the use frequency is acquired, the items pointing to the corresponding partition in the update period are recorded, and a counting strategy is set outside the database;
the counting strategy responds to each single-shot item by incrementing the cumulative count of the partition that the item corresponds to;
a single-shot item is an item that the database identifies as executable.
Further, for a single partition, the database establishes the temporary storage area according to the update priority, and stores the asset changes in the partition in the corresponding temporary storage area;
the temporary storage area applies the asset changes to the corresponding partition in response to the update period.
Further, at the end of the single update period, the partitions that have changed are set up independently in the database, and
when an update time threshold is reached, the partitions that have not changed are set to read-only.
Further, when the update priority is set for each partition, a detection strategy for collecting changes to database tables or to the database is also set according to the database tables;
the detection strategy responds to several operations on a database table, and the results of those operations are used to update the partition, and
responds to several operations on the database, and the results of those operations are used at the end of an update period to set the update priority of the corresponding partition in the next update period.
Further, for the single update period, the detection strategy records the accumulated count and the final change result of each partition in the current update period, outputs the final change result to the corresponding partition when the current update period ends, and sets the update priority according to the accumulated count.
Further, the resetting process of each partition includes:
sorting the partitions according to the accumulated counts;
marking each partition that has changed;
pushing the partitions onto a stack in the sorted order;
popping the changed partitions in order, and reordering the partitions.
Further, for a single partition in a single update period, the detection strategy is also provided with a temporary storage area for the partition;
wherein the capacity of the temporary storage area is proportional to the corresponding partition.
Further, after resetting each partition according to the new use frequency, the method further includes:
monitoring the activation state of each partition in a preset check period, and polling the inactive partitions in sequence.
Compared with the prior art, the method has the beneficial effect that, by partitioning the database, assigning a priority to each partition, and setting a corresponding update mode, the response speed of the database is improved and the resource consumption of database updates is reduced.
Further, recording the items keeps a record of database activity, which saves the resources consumed by database changes and the computing power spent on data calls, further reducing the resource consumption of database updates.
Further, solidifying data assets that have not changed for a long time improves the read speed of those assets, reduces the volume of readable and writable data assets, and saves read-write space, further reducing the resource consumption of database updates.
Further, setting the update priority allows each partition in the database to be recorded and adjusted accordingly, which improves the utilization of the data assets, avoids data loss caused by data table changes, and further reduces the resource consumption of database updates.
Furthermore, applying changes to the database centrally through temporary storage avoids the risk to data assets posed by changing the database in real time, while further reducing the resource consumption of database updates.
In particular, the invention also provides:
(1) Real-time updating: the invention can monitor changes of database tables in real time and update the corresponding data in real time, ensuring the accuracy and timeliness of the data.
(2) Dynamic adaptability: through real-time monitoring and dynamic updating, the method adapts to dynamic changes of database table metadata and keeps the data synchronized with the database tables.
(3) Reduced cost: the automatic data updating process reduces the cost of manual intervention, improves efficiency, and saves time and resources for enterprises.
Drawings
FIG. 1 is a flow chart of a method for updating data assets in real time under database table change of the present invention;
FIG. 2 is a flow chart of setting update priority according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a real-time data update structure under database table change according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a flow chart of updating data in real time under database table change according to an embodiment of the invention.
Detailed Description
In order that the objects and advantages of the invention will become more apparent, the invention will be further described with reference to the following examples; it should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that, in the description of the present invention, terms such as "upper," "lower," "left," "right," "inner," "outer," and the like indicate directions or positional relationships based on the directions or positional relationships shown in the drawings, which are merely for convenience of description, and do not indicate or imply that the apparatus or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, it should be noted that, in the description of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "coupled," and "connected" are to be construed broadly. A connection may, for example, be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or a communication between two elements. The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Referring to FIG. 1, a flow chart of the method for updating data assets in real time under database table changes according to the present invention is shown. The method includes:
Step S1, setting a plurality of partitions of a database according to data sources;
Step S2, acquiring the use frequency of the corresponding data source in each partition in a single update period;
Step S3, setting an update priority for the corresponding partition according to the use frequency, and generating a corresponding ordering;
Step S4, setting a temporary storage area according to the ordering and associating the temporary storage area with the corresponding partition;
Step S5, the temporary storage area responding to a data table change instruction, updating the corresponding partition according to the partition association when the single update period ends, and determining the new use frequency for the update period;
Step S6, resetting each partition according to the new use frequency;
wherein the partitions are related to the type of the database, and for a single partition, the corresponding data can be read completely from the database;
the ordering is an ascending sort by update priority;
the data table change instruction includes several operations on assets in the database.
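As a minimal, non-limiting sketch, steps S1 to S6 can be illustrated in Python as follows. The class name `UpdateCycle`, its method names, the in-memory dictionaries standing in for partitions, and the assumption that the update priority equals the raw use count are all hypothetical and are not taken from the embodiment.

```python
from collections import defaultdict


class UpdateCycle:
    """One update period over partitions keyed by data source (hypothetical sketch)."""

    def __init__(self, partitions):
        # S1: partitions maps a partition name to a dict that stands in for its stored data.
        self.partitions = partitions
        self.usage = defaultdict(int)                 # S2: use count per partition
        self.order = list(partitions)                 # S3: current ordering
        self.staging = {n: {} for n in partitions}    # S4: one temporary storage area each

    def record_use(self, name):
        # S2: count one use of the partition in the current period.
        self.usage[name] += 1

    def rank(self):
        # S3: ascending sort by update priority (assumed here to equal the use count).
        self.order = sorted(self.partitions, key=lambda n: self.usage[n])
        # S4: re-create the temporary storage areas in the ranked order.
        self.staging = {n: {} for n in self.order}
        return self.order

    def change(self, name, key, value):
        # S5, first half: the temporary storage area absorbs the change instruction.
        self.staging[name][key] = value

    def end_period(self):
        # S5, second half: apply staged changes to the associated partitions.
        for name, staged in self.staging.items():
            self.partitions[name].update(staged)
        # S6: reset the ordering from the use frequency observed in this period.
        order = self.rank()
        self.usage.clear()
        return order
```

In this sketch a change is invisible in the partition until `end_period()` is called, which mirrors the centralized update at the end of the period; a real implementation would replace the dictionaries with database partitions.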
Specifically, the process of setting the plurality of partitions includes:
determining the data sources of the database;
dividing the database into a plurality of regions corresponding to the data sources;
wherein each region corresponds to a single data source.
Specifically, when the use frequency is acquired, the items pointing to the corresponding partition in the update period are recorded, and a counting strategy is set outside the database;
the counting strategy responds to each single-shot item by incrementing the cumulative count of the partition that the item corresponds to;
a single-shot item is an item that the database identifies as executable.
Recording the items in this way keeps a record of database activity, which saves the resources consumed by database changes and the computing power spent on data calls, further reducing the resource consumption of database updates.
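The counting strategy described above can be sketched as a small counter that lives outside the database. This is only an illustration: the class `UsageCounter`, the `is_executable` callback that stands in for the database's own check of whether an item can be executed, and the dictionary shape of an item are assumptions.

```python
from collections import Counter


class UsageCounter:
    """Counts single-shot items per partition outside the database (hypothetical sketch)."""

    def __init__(self, is_executable):
        # is_executable stands in for the database identifying an item as executable.
        self._is_executable = is_executable
        self._counts = Counter()

    def observe(self, partition, item):
        # Only items the database would accept contribute to the cumulative count.
        if self._is_executable(item):
            self._counts[partition] += 1
            return True
        return False

    def snapshot_and_reset(self):
        # Called at the end of an update period: return the counts and start a new period.
        counts = dict(self._counts)
        self._counts.clear()
        return counts
```

Keeping the counter outside the database means that counting a use does not itself issue a database write.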
Specifically, for a single partition, the database establishes the temporary storage area according to the update priority, and stores the asset changes in that partition in the corresponding temporary storage area;
the temporary storage area applies the asset changes to the corresponding partition in response to the update period.
Specifically, at the end of a single update period, the partitions that have changed are set up independently in the database, and
when an update time threshold is reached, the partitions that have not changed are set to read-only.
Solidifying data assets that have not changed for a long time improves the read speed of those assets, reduces the volume of readable and writable data assets, and saves read-write space, further reducing the resource consumption of database updates.
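The read-only rule can be sketched as follows, under the assumption that each partition carries the timestamp of its most recent change and that the time threshold is expressed in seconds. The names `PartitionState` and `freeze_unchanged` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    name: str
    last_changed: float      # time of the most recent change, in seconds
    read_only: bool = False


def freeze_unchanged(partitions, now, threshold_seconds):
    """Mark partitions unchanged for at least threshold_seconds as read-only.

    Returns the names of the partitions frozen by this call.
    """
    frozen = []
    for p in partitions:
        if not p.read_only and now - p.last_changed >= threshold_seconds:
            p.read_only = True
            frozen.append(p.name)
    return frozen
```

For example, with a 60-second threshold at time 100, a partition last changed at time 0 is frozen while one last changed at time 90 stays writable.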
Specifically, when the update priority is set for each partition, a detection strategy for collecting changes to database tables or to the database is also set according to the database tables;
the detection strategy responds to several operations on a database table, and the results of those operations are used to update the partition, and
responds to several operations on the database, and the results of those operations are used at the end of an update period to set the update priority of the corresponding partition in the next update period.
Specifically, for a single update period, the detection strategy records the accumulated count and the final change result of each partition in the update period, and outputs the final change result to the corresponding partition at the end of the update period, and
the update priority is set according to the accumulated count.
Referring to FIG. 2, a flow chart of setting the update priority according to an embodiment of the invention is shown, which includes:
Step St1, sorting the partitions according to the accumulated counts;
Step St2, marking each partition that has changed;
Step St3, pushing the partitions onto a stack in the sorted order.
Setting the update priority allows each partition in the database to be recorded and adjusted accordingly, which improves the utilization of the data assets, avoids data loss caused by data table changes, and further reduces the resource consumption of database updates.
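Steps St1 to St3, together with the popping step named in the disclosure, admit more than one reading. The sketch below shows one possible interpretation only: partitions are sorted in ascending order of accumulated count, marked, pushed onto a stack, and then popped so that changed partitions come first in the new ordering. The function name `reset_order` and the choice to place unchanged partitions last are assumptions.

```python
def reset_order(counts, changed):
    """Reorder partitions from accumulated counts and a set of changed partition names."""
    # St1: sort ascending by accumulated count.
    ranked = sorted(counts, key=lambda name: counts[name])
    # St2 and St3: mark each partition and push it onto the stack in sorted order.
    stack = [(name, name in changed) for name in ranked]
    popped_changed, unchanged = [], []
    # Pop from the top of the stack, so the highest count is visited first.
    while stack:
        name, is_changed = stack.pop()
        (popped_changed if is_changed else unchanged).append(name)
    # Assumed reordering: changed partitions first, then the rest.
    return popped_changed + unchanged
```

With counts `{"products": 1, "orders": 5, "users": 3}` and changed partitions `{"orders", "products"}`, this reading yields the order `orders`, `products`, `users`.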
Specifically, for a single partition in a single update period, the detection strategy is also provided with a temporary storage area for the partition;
wherein the capacity of the temporary storage area is proportional to the corresponding partition.
Applying changes to the database centrally through temporary storage avoids the risk to data assets posed by changing the database in real time, while further reducing the resource consumption of database updates.
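The proportional sizing of the temporary storage area can be illustrated with the short sketch below. The 10 percent ratio, the use of row counts as the measure of partition size, and the names `staging_capacity` and `StagingArea` are assumptions chosen for illustration; the embodiment does not specify them.

```python
def staging_capacity(partition_rows, percent=10, minimum=1):
    """Capacity proportional to partition size, rounded up, with a floor of `minimum`."""
    return max(minimum, (partition_rows * percent + 99) // 100)


class StagingArea:
    """Bounded temporary storage area for one partition (hypothetical sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.changes = {}

    def stage(self, key, value):
        # A new key is refused when the area is full; the caller would flush first.
        if key not in self.changes and len(self.changes) >= self.capacity:
            return False
        # Re-staging an existing key overwrites it, keeping only the final change result.
        self.changes[key] = value
        return True
```

Overwriting an already staged key keeps only the final change result for that key, which matches the idea of outputting the final change at the end of the period.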
Specifically, the method further comprises:
Step S7, monitoring the activation state of each partition in a preset check period, and polling the inactive partitions in sequence.
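Step S7 can be sketched as a single pass over the current ordering. The callbacks `is_active` and `probe` are hypothetical stand-ins for however a deployment decides that a partition is active and how it checks an inactive one; the embodiment does not define them.

```python
def poll_inactive(order, is_active, probe):
    """Probe every inactive partition in the given order; return probe results by name."""
    results = {}
    for name in order:
        if not is_active(name):
            # Inactive partitions are polled one after another, in sequence.
            results[name] = probe(name)
    return results
```

A scheduler would call this function once per check period with the partition ordering produced at the end of the last update period.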
The invention provides a detection middleware that realizes real-time monitoring and dynamic updating of database table data through a data source identification module, a monitoring module, a data and metadata analysis module, and a change notification module.
With the above method, in practice:
1. The data source identification module identifies the database tables to be monitored and establishes connections according to the type of the database.
2. The monitoring module monitors the structure and data of the database tables in real time and detects changes. It can record the timestamp of the last monitored state of a database table and compare it with the current state; if the two timestamps differ, the database table has changed. Whether the structure or the data changed is then determined by analyzing the log content. Both approaches are suitable for a variety of databases.
3. The data and metadata analysis module analyzes the data and metadata of the database table and obtains the change information. Once a change of the database table is detected, a notification is sent through the change notification module, which triggers the real-time update of the data.
4. Intelligent data source identification and monitoring: a machine learning algorithm is introduced to implement an intelligent data source identification and monitoring module. By learning from historical data and recognizing patterns, the system can automatically identify the database tables to be monitored, improving the intelligence and adaptability of monitoring.
5. Monitoring module based on machine learning: machine learning is used to build a prediction model of database table changes, enabling intelligent analysis and early warning of change behavior. The monitoring module can predict the likelihood of data changes, which reduces false alarms and missed alarms and improves the accuracy and efficiency of monitoring.
6. Event-driven real-time change notification: an event-driven architecture is introduced, and the change notification module is designed as an asynchronous event processing system. When the monitoring module detects a change of a database table, the corresponding event is triggered immediately, and real-time change notification and processing are realized through the event-driven mechanism, improving the response speed and real-time performance of the system.
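The timestamp comparison in item 2 and the asynchronous notification in item 6 can be combined in a small sketch using Python's `asyncio`. The `read_timestamp` callback is a hypothetical stand-in for querying a table's last-modified state, and the event dictionary shape, the `None` sentinel, and the function names are assumptions.

```python
import asyncio


async def monitor(read_timestamp, events, tables, rounds, interval=0.0):
    """Compare each table's timestamp with the last one seen and emit change events."""
    last_seen = {t: read_timestamp(t) for t in tables}   # baseline state
    for _ in range(rounds):
        await asyncio.sleep(interval)
        for t in tables:
            current = read_timestamp(t)
            if current != last_seen[t]:                  # timestamps differ: table changed
                last_seen[t] = current
                await events.put({"table": t, "changed_at": current})
    await events.put(None)                               # sentinel: monitoring finished


async def notifier(events, handle):
    """Consume change events asynchronously and hand each one to a handler."""
    while (event := await events.get()) is not None:
        handle(event)


async def run(read_timestamp, tables, rounds):
    events = asyncio.Queue()
    received = []
    await asyncio.gather(
        monitor(read_timestamp, events, tables, rounds),
        notifier(events, received.append),
    )
    return received
```

The queue decouples detection from handling, so a slow handler does not block the comparison loop from enqueuing further events.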
FIG. 3 and FIG. 4 show, respectively, the structure and the flow chart of real-time data updating under database table changes according to the embodiment of the invention.
The invention can be adapted to a variety of data sources:
First, the data source identification module identifies the database tables that need to be monitored. It checks whether monitoring conditions are configured in the conf configuration file, and tables with such configuration are added to monitoring. These database tables may come from different data sources and may have different data formats and structures. Through data source adaptation, database tables in the various data sources can be identified and monitored. After the data sources are adapted, the system prints log records, keeps long-lived connections to the monitored databases, and outputs the monitoring flow.
Example 1:
This embodiment applies to data management for an online retail business.
In an online retail business, the data includes product information, order data, user information, and so on, which may be stored in different data sources. Suppose that the system needs to monitor three key data tables: product information, order data, and user information.
1. Data sources:
Product information table: stored in a relational database of the enterprise (e.g., MySQL).
Order data table: stored in a NoSQL database (e.g., MongoDB) of a cloud service provider.
User information table: stored in the custom file storage system of the enterprise.
2. Data source adaptation:
For the product information table, the database connection information and the monitoring conditions are specified in the configuration file, so that the system can adapt to the MySQL database and monitor the product information table in real time.
For the order data table, the corresponding connection information and monitoring conditions are set in the configuration file, and the system adapts to the MongoDB database so that the order data table is monitored.
For the user information table, the connection information is specified in the configuration file, and the system adapts to the custom file storage system so as to monitor the user information table.
3. Monitoring flow:
The configuration file contains the monitoring conditions for the three tables. After the system starts, the data source identification module establishes connections according to the configuration and starts to monitor the database tables in the three data sources.
The system periodically checks whether each table meets the monitoring conditions, maintains long-lived connections to the database tables, and records the relevant logs for debugging and monitoring.
Example 2:
This embodiment describes data source adaptation.
1. Relational database (e.g., MySQL):
Adaptation steps:
Configuration file setting: database connection information is specified in the configuration file, including database type, host address, port, username, password, etc.
[MySQLDataSource]
type = MySQL
host = localhost
port = 3306
username = username
password = password
database = database
Data source adaptation module: write a data source adaptation module that establishes a connection with the MySQL database according to the information in the configuration file and executes the related monitoring and data acquisition operations.
2. NoSQL database (e.g., MongoDB):
Adaptation steps:
Configuration file setting: MongoDB connection information is specified in the configuration file, including host address, port, username, password, etc.
[MongoDBDataSource]
type = MongoDB
host = localhost
port = 27017
username = username
password = password
database = database
Data source adaptation module: write a data source adaptation module that establishes a connection with the MongoDB database according to the information in the configuration file and executes the related monitoring and data acquisition operations.
3. File system:
Adaptation steps:
Configuration file setting: information such as the path and format of the files is specified in the configuration file.
[FileSystemDataSource]
type = FileSystem
path = /path/to/your/files
format = CSV
Data source adaptation module: write a data source adaptation module that reads data from the file system according to the information in the configuration file and implements the monitoring and data acquisition operations.
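The three configuration sections above can be read with Python's standard `configparser` and dispatched by their `type` key, as in the sketch below. The adapter functions only build a description string and open no real connection; their names, the `ADAPTERS` registry, and the URL-like output format are assumptions for illustration, not the adaptation module of the embodiment.

```python
import configparser

CONFIG_TEXT = """
[MySQLDataSource]
type = MySQL
host = localhost
port = 3306
username = username
password = password
database = database

[MongoDBDataSource]
type = MongoDB
host = localhost
port = 27017
username = username
password = password
database = database

[FileSystemDataSource]
type = FileSystem
path = /path/to/your/files
format = CSV
"""


def describe_mysql(section):
    # A real adapter would open a MySQL connection here; this sketch only formats a target.
    return f"mysql://{section['host']}:{section.getint('port')}/{section['database']}"


def describe_mongodb(section):
    return f"mongodb://{section['host']}:{section.getint('port')}/{section['database']}"


def describe_filesystem(section):
    return f"{section['path']} ({section['format']})"


# Hypothetical registry mapping the `type` key to an adapter.
ADAPTERS = {
    "MySQL": describe_mysql,
    "MongoDB": describe_mongodb,
    "FileSystem": describe_filesystem,
}


def load_sources(text):
    """Parse the configuration text and resolve one adapter result per section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    targets = {}
    for name in parser.sections():
        section = parser[name]
        adapter = ADAPTERS.get(section["type"])
        if adapter is None:
            raise ValueError(f"unsupported data source type: {section['type']}")
        targets[name] = adapter(section)
    return targets
```

Adding a new kind of data source then only requires a new section in the configuration file and one more entry in the registry.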
Log analysis: the monitoring module uses log analysis to monitor database table changes in real time. Database systems usually record operation logs, which include changes to table structures. By parsing these logs, changes to database tables can be detected promptly. The parsing flow is as follows:
1. The system establishes a connection with the server database and captures the database operation log. Database management systems typically record the operations performed on the database, including table structure changes and the insertion, deletion, and modification of data. The log is typically stored in a particular format, such as a binary log file or a text file.
2. Once the database operation log is obtained, the parsing module begins parsing the log records. This includes identifying the type of each log record, e.g., DDL (data definition language) records, DML (data manipulation language) records, etc.
3. The parsing module extracts the information related to database table changes. For DDL records, this may include changes to the table structure, such as adding, deleting, or modifying columns and changing indexes. For DML records, it may include the insertion, deletion, and update of data.
4. The parsing module compares the states before and after the change to determine its specific content. This may involve comparing the table structure, row counts, field values, timestamps, etc. If there is a difference, it is confirmed that a change has occurred.
5. Upon confirming that a change has occurred, the parsing module records the corresponding event, including the timestamp, the change type (DDL or DML), and the specific content of the change.
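The classification and extraction steps of this flow can be sketched as follows. The log line format (a timestamp, a tab, then one SQL statement) is an assumption made for the example; real operation logs such as binary logs need a dedicated reader. The names `ChangeEvent`, `parse_log_line`, and `parse_log` are hypothetical.

```python
import re
from dataclasses import dataclass

DDL_KEYWORDS = ("CREATE", "ALTER", "DROP", "TRUNCATE", "RENAME")
DML_KEYWORDS = ("INSERT", "UPDATE", "DELETE")

# Extracts the table name from the most common statement forms.
TABLE_PATTERN = re.compile(
    r"^(?:ALTER\s+TABLE|CREATE\s+TABLE|DROP\s+TABLE|TRUNCATE\s+TABLE"
    r"|INSERT\s+INTO|UPDATE|DELETE\s+FROM)\s+`?(\w+)`?",
    re.IGNORECASE,
)


@dataclass
class ChangeEvent:
    timestamp: str
    change_type: str  # "DDL" or "DML"
    table: str
    statement: str


def parse_log_line(line):
    """Return a ChangeEvent for a DDL/DML line, or None for anything else."""
    # Assumed line format: "<timestamp>\t<SQL statement>".
    timestamp, _, statement = line.partition("\t")
    statement = statement.strip()
    if not statement:
        return None
    keyword = statement.split(None, 1)[0].upper()
    if keyword in DDL_KEYWORDS:
        change_type = "DDL"
    elif keyword in DML_KEYWORDS:
        change_type = "DML"
    else:
        return None  # e.g. SELECT: not a change to the table
    match = TABLE_PATTERN.match(statement)
    table = match.group(1) if match else "unknown"
    return ChangeEvent(timestamp, change_type, table, statement)


def parse_log(lines):
    """Parse many lines and keep only the recognized change events."""
    events = (parse_log_line(line) for line in lines)
    return [event for event in events if event is not None]
```

The sketch covers record identification and extraction only; the before/after comparison in step 4 would additionally need a stored snapshot of the previous table state.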
Comparative example 1: database table change monitoring for an online retail business.
Based on example 1, in the online retail business described above, the database tables need to be monitored for structural and data changes.
1. Database operation log:
For the product information table, the database system records DDL logs, including changes to the table structure (e.g., adding new product attribute fields).
For the order data table, the database system records DML logs, including the operations of adding, deleting, and modifying order data.
The user information table is stored in the file system, so it may not have a direct database operation log and may need to be monitored by other means.
2. Monitoring and parsing process:
After the system establishes a connection with the database, it captures the database operation log at regular intervals.
The log parsing module extracts the relevant information according to the type of log record: for DDL records, it extracts the table structure changes; for DML records, it extracts the data changes.
The parsing module compares the states before and after the change, such as the table structure, row counts, and field values, to determine the specific content of the change.
Upon confirming that a change has occurred, the parsing module records the corresponding event, including the timestamp, the change type, and the specific content of the change.
Metadata update: the metadata parsing module parses the metadata of the database table to obtain change information, which includes changes to the table structure (adding, deleting, or modifying columns, index changes, etc.) and changes to the data (insertion, deletion, update, etc.). When a change to a database table is detected, the change notification module sends a notification and triggers the real-time update of the data. During the update, the data is updated according to the parsed metadata information so that it stays synchronized with the database table.
In practice, the present invention also employs the following algorithms and techniques for processing large-scale data and improving monitoring efficiency:
1. Incremental monitoring:
The machine-learning-based anomaly detection method trains a machine learning model on the data and then monitors the data by detecting abnormal behavior, thereby improving monitoring efficiency and accuracy.
Working principle:
The machine learning model is trained on historical data, for example using supervised or unsupervised learning algorithms.
The trained model can identify the patterns of normal data and detect abnormal data or events that do not conform to those patterns.
Example 3:
a) Unsupervised learning model:
The historical data is clustered using a clustering algorithm such as K-means, dividing the data points into clusters.
For new data, whether it belongs to an abnormal cluster is judged by calculating its distance from, or similarity to, each cluster.
b) Supervised learning model:
An anomaly detection model is trained using a supervised learning algorithm (e.g., random forest, support vector machine).
The historical data is labeled as normal or abnormal samples, and the model is trained on these samples to identify anomalies in new data.
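The unsupervised variant in Example 3 can be sketched in plain Python as below. This is a simplified illustration, not the method of the invention: it runs a fixed number of Lloyd iterations with a deterministic initialization and flags a new point when its distance to the nearest centroid exceeds a threshold. The threshold, the initialization, and the function names are assumptions made for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iterations=20):
    """Cluster `points` into k groups and return the k centroids."""
    # Deterministic initialization (assumption): k points spread through the input.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [
            mean(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


def is_anomalous(point, centroids, threshold):
    """Flag a point that lies farther than `threshold` from every centroid."""
    return min(distance(point, c) for c in centroids) > threshold
```

For example, after clustering historical points grouped around (1, 1) and (10, 10), a new point near (5, 5) lies far from both centroids and is flagged, while a point near (1, 1) is not.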
Advantages and application scenarios:
Improved monitoring efficiency: a machine learning model can learn complex data patterns and can therefore detect anomalies more accurately, avoiding the limitations of traditional incremental monitoring.
Suitability for complex environments: the method suits monitoring needs in large-scale or complex data environments, such as network traffic and database operations.
2. Efficient data structures and indexes:
Adaptive index optimization is used to adjust data structures and indexes dynamically: database indexes are optimized according to the actual data access patterns and workload, which improves query performance and reduces maintenance cost. Compared with traditional static indexes, this strategy is more flexible and adaptable and can adjust indexes automatically as the data environment changes.
This approach involves the following:
a) Real-time data monitoring: adaptive index optimization requires real-time monitoring of the data access patterns and query requests in the database. The characteristics and trends of data access, including frequency, range, and order, are learned by collecting and analyzing the actual access patterns.
b) Dynamic index adjustment: the index structure of the database is adjusted dynamically according to the access patterns observed in real time, for example by adding or deleting indexes on particular fields or adjusting index types or index parameters to optimize query performance and response time.
c) Intelligent optimization strategy: an intelligent index optimization strategy is formulated that selects a suitable index configuration for different data characteristics and workloads. This may involve using different index types (e.g., B-tree, hash index, full-text index) or tuning index parameters (e.g., fill factor, block size).
d) Adaptive adjustment: the index optimization strategy should be adaptive, accommodating data changes and workload changes automatically. For example, when the data distribution or access pattern changes, adaptive index optimization adjusts the index structure promptly to keep query performance stable and efficient.
e) Performance evaluation and monitoring: after adaptive index optimization is performed, the performance and effect of the indexes need to be evaluated and monitored continuously. Index performance metrics are collected and analyzed regularly, and the optimization strategy is adjusted in time, to keep the database system performant and stable.
f) Application scenarios: adaptive index optimization suits applications with demanding query performance requirements and continuously changing data access patterns, such as merchandise search on e-commerce platforms, user relationship queries in social networks, and real-time data analysis in Internet of Things systems.
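Items a) and b) above can be illustrated with a small advisor that counts how often each column is used as a query filter within an observation window and then recommends indexes to create or drop. The class name `IndexAdvisor`, the window-based design, and the two thresholds are assumptions made for this sketch; the description does not prescribe a particular decision rule.

```python
from collections import Counter


class IndexAdvisor:
    """Recommends index changes from observed column access counts."""

    def __init__(self, create_threshold, drop_threshold):
        self.create_threshold = create_threshold  # accesses needed to add an index
        self.drop_threshold = drop_threshold      # accesses at or below which to drop
        self.access_counts = Counter()
        self.indexed = set()

    def record_query(self, filtered_columns):
        """Record the columns one query filtered on."""
        self.access_counts.update(filtered_columns)

    def end_window(self):
        """Return (to_create, to_drop) for this window and start a new one."""
        to_create = sorted(
            column
            for column, count in self.access_counts.items()
            if count >= self.create_threshold and column not in self.indexed
        )
        to_drop = sorted(
            column
            for column in self.indexed
            if self.access_counts[column] <= self.drop_threshold
        )
        self.indexed.update(to_create)
        self.indexed.difference_update(to_drop)
        self.access_counts.clear()
        return to_create, to_drop
```

The recommendations would still have to be translated into the target database's own index statements, and evaluated afterwards as item e) describes.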
3. Parallelization:
The monitoring and parsing processes are parallelized so that multiple data sources or multiple change events are processed simultaneously, which improves the overall processing speed.
Example 4:
A multithreaded or distributed architecture is used to monitor multiple data sources simultaneously, and the monitoring and parsing tasks are distributed to different processing units, which improves processing efficiency.
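A minimal multithreaded sketch of Example 4 follows, using a thread pool to check several data sources concurrently. Each source is represented here by an in-memory snapshot dictionary, which is an assumption made to keep the example self-contained; in practice each task would query its own database or file system. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def check_source(name, snapshot, previous):
    """Return the keys whose values differ between two snapshots of one source."""
    changed = sorted(key for key in snapshot if previous.get(key) != snapshot[key])
    return name, changed


def monitor_in_parallel(sources, previous_state, max_workers=4):
    """Check every source on its own worker thread and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(check_source, name, snapshot, previous_state.get(name, {}))
            for name, snapshot in sources.items()
        ]
        return dict(future.result() for future in futures)
```

Threads suit this task because monitoring is mostly waiting on I/O; a distributed architecture would replace the pool with separate worker processes or machines.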
4. Event-based trigger mechanism:
Periodic polling may result in unnecessary performance overhead. With an event-based trigger mechanism, the monitoring and updating operations are triggered only when a change occurs.
Working principle:
a) Event-driven mechanism: an event-based trigger mechanism subscribes to or listens for specific events in the system rather than periodically polling for data changes. When an event occurs, the system immediately triggers the corresponding processing flow, achieving real-time response and processing.
b) System notifications or triggers: database systems and operating systems typically provide event notification mechanisms or trigger functions that can start related operations when a particular event occurs. These events may be data updates, file changes, network messages, etc.
c) No polling required: unlike periodic polling, an event-based trigger mechanism does not query or scan the data source on a timer. Instead, the system receives a notification when the event occurs and processes it immediately, avoiding unnecessary resource waste and delay.
Example 5:
a) Database triggers: in a relational database, a trigger can define the action to take when a particular data operation (e.g., insert, update, delete) occurs. For example, when a database table is updated, the trigger can automatically start the monitoring and updating operations without periodic polling.
b) File system monitoring: the operating system provides mechanisms for file system monitoring, such as inotify on Linux, that report file change events (e.g., file creation, modification, deletion). Applications can subscribe to these events and perform the corresponding operations when they occur, without continually polling file status.
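The database trigger case in Example 5 can be demonstrated with SQLite from the Python standard library. SQLite is used here only because it needs no server; the table names `product` and `change_log` and the trigger name are assumptions made for the example. The trigger writes one row to `change_log` whenever a row of `product` is updated, so the change is recorded by the database itself and no polling of `product` is involved.

```python
import sqlite3


def create_monitored_db():
    """Create an in-memory database whose product table logs its own updates."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL);
        CREATE TABLE change_log (
            table_name TEXT,
            operation  TEXT,
            row_id     INTEGER,
            changed_at TEXT DEFAULT CURRENT_TIMESTAMP
        );
        -- Fires once per updated row and records the change event.
        CREATE TRIGGER product_after_update AFTER UPDATE ON product
        BEGIN
            INSERT INTO change_log (table_name, operation, row_id)
            VALUES ('product', 'UPDATE', NEW.id);
        END;
        """
    )
    return conn


def read_change_events(conn):
    """Return the recorded change events in insertion order."""
    cursor = conn.execute(
        "SELECT table_name, operation, row_id FROM change_log ORDER BY rowid"
    )
    return cursor.fetchall()
```

Inserting a product row and then updating its price leaves exactly one `UPDATE` event in `change_log`, because the trigger is defined for updates only; insert and delete events would need their own triggers.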
Advantages:
a) Real-time response: an event-based trigger mechanism achieves real-time response and processing, avoiding the delay and resource waste caused by periodic polling.
b) Reduced system load: the data source does not need to be continuously monitored or scanned, which saves system resources and reduces system load.
c) Improved efficiency: event-driven processing lets the system handle data changes more efficiently, improving its overall efficiency and performance.
5. Compression algorithms:
When monitoring logs or large-scale data are involved, compression algorithms are used to reduce data transmission and storage costs and improve efficiency. When monitoring logs are transmitted, they are compressed with a compression algorithm (such as Gzip or Snappy) to reduce the amount of data transmitted; when monitoring data is stored, columnar storage and compression algorithms are used to reduce the storage space occupied.
a) Application:
Monitoring log compression: the log data generated by the monitoring system is compressed with a compression algorithm. These logs may contain a large amount of text, and compression reduces the size of the log files, saving storage space and transmission bandwidth.
Transmitted data compression: during data asset monitoring, if monitoring data must be transmitted to other systems or stored on a remote server, it can be compressed to reduce the amount of data transmitted, lowering network transmission cost and delay.
Stored data compression: monitoring data that needs long-term storage can be compressed before it is stored. Compressed data occupies less storage space, which reduces storage cost, and can improve the reading efficiency and access speed of the data.
b) Content to be compressed:
Log data generated by the monitoring system: the monitoring system can generate a large amount of log data recording the events, operations, and results of the monitoring process. This log data can be compressed with a compression algorithm (such as Gzip or Snappy) to reduce the size of the log files and save storage space.
Transmitted monitoring data: during monitoring, the system may need to transmit the monitored data to other systems or remote servers for further processing or storage. Compressing the transmitted data reduces its size, lowering network bandwidth consumption and transmission delay.
c) Benefits of compression:
Saving storage space: compressing the log data generated by the monitoring system can significantly reduce the storage space occupied, saving cost and extending the retention period.
Reducing network traffic: compressing monitoring data before transmission reduces the amount of data transmitted and improves transmission efficiency, which is particularly useful where network bandwidth is limited.
Faster data transmission: compressed data is transmitted faster, which reduces the time cost of data transmission and improves the efficiency and timeliness of data processing.
Improved system performance: a smaller data volume reduces the load on the system and improves its overall performance and response speed.
Data security: compressed data can enhance the security of the data and reduce the risk of data leakage, in particular by offering some protection during data transmission and storage.
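Log compression with Gzip can be sketched with the Python standard library as follows. The helper names and the sample log line are assumptions made for the example; Snappy is not shown because it is not part of the standard library.

```python
import gzip


def compress_log(text):
    """Compress log text to gzip-format bytes for storage or transmission."""
    return gzip.compress(text.encode("utf-8"))


def decompress_log(blob):
    """Restore the original log text from gzip-format bytes."""
    return gzip.decompress(blob).decode("utf-8")


def compression_ratio(text):
    """Compressed size divided by original size (smaller is better)."""
    return len(compress_log(text)) / len(text.encode("utf-8"))
```

Monitoring logs tend to be repetitive, so ratios well below 1 are typical, but the actual saving depends on the content and should be measured on real logs.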
With continued reference to FIG. 3, in an application, a data source adaptation module, a detection middleware, and an asset update module may each be constructed:
1. Data source adaptation: the data source identification module classifies and identifies the database tables to be monitored according to the database model, namely relational database, non-relational database, object-oriented database, and XML database, and establishes a connection with the corresponding server database.
2. Real-time monitoring of database tables: the monitoring module is started, monitors the metadata of the database tables in real time, and detects changes.
3. Parsing the metadata: the metadata parsing module parses the metadata of the database table to obtain the change information.
4. Initiating a real-time update notification: once a change to a database table is detected, the change notification module sends a notification and triggers the real-time update of the data.
5. Updating data in real time: the data is updated in real time according to the change information and kept synchronized with the database table.
By using this method, the invention provides:
(1) Real-time updating: the invention monitors changes to the database tables in real time and updates the corresponding data in real time, ensuring the accuracy and timeliness of the data.
(2) Dynamic adaptability: through real-time monitoring and dynamic updating, the method adapts to dynamic changes in the database table metadata and keeps the data synchronized with the database table.
(3) Reduced cost: the automated data update process reduces the cost of manual intervention, improves efficiency, and saves time and resources for enterprises.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
The foregoing description is only of the preferred embodiments of the invention and is not intended to limit the invention; various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A method for updating data assets in real time under database table changes, comprising:
setting a plurality of partitions of a database according to a data source;
acquiring the use frequency of the corresponding data source in each partition in a single update period;
setting the update priority of the corresponding partition according to the use frequency, and generating a corresponding ordering;
setting a temporary storage area according to the ordering and associating the temporary storage area with the corresponding partition;
the temporary storage area responding to a data table change instruction, updating the corresponding partition according to the partition association when the single update period ends, and determining the new use frequency of the corresponding data source in each partition in the update period;
resetting the ordering of each partition according to the new use frequency;
wherein the partitions are related to the type of the database, and for a single partition, the corresponding data can be completely read in the database;
the ordering is an ascending sort according to the update priority;
the data table change instruction includes a number of operations on the assets in the database;
the operations are performed in the single update period by recording, according to the detection strategy, the accumulated count and the final change result of each partition in the current update period, and outputting the final change result to the corresponding partition when the current update period ends;
when the use frequency is acquired, the items pointing to the corresponding partitions in the update period are recorded, and a counting strategy is set outside the database;
the counting strategy responds to a single-shot item by incrementing the accumulated count of the partition corresponding to the item;
the single-shot item is an item identified by the database as being implementable;
for a single partition, the database establishes the temporary storage area according to the update priority, and stores the asset changes in the partition in the corresponding temporary storage area;
the temporary storage area updates the asset changes to the corresponding partition in response to the update period.
2. The method for real-time updating of data assets under database table variation according to claim 1, wherein the process of setting up said plurality of partitions includes:
determining a number of data sources of the database;
setting the database into a plurality of areas corresponding to the data sources according to the data sources;
wherein for a single region it corresponds to a single data source.
3. The method for real-time updating of data assets under a change in a database table according to claim 1, wherein at the end of said single update period, each partition in which a change has occurred is set independently in said database, and
when the time threshold is updated, the partitions that are unchanged are set as read-only.
4. The method for updating data assets in real time under the change of a database table according to claim 2, wherein when the update priority is set for each partition, a detection strategy for collecting changes to the database table or the database is further provided according to the database table;
the detection strategy responds to several operations on the database table and uses the results of the operations to update the partition, and
responds to several operations on the database, the results of which are used at the end of an update period to set the update priority of the corresponding partition in the next update period.
5. The method of claim 4, wherein for the single update period, the update priority is set according to the accumulated count.
6. The method for real-time updating of data assets under database table variation according to claim 5, wherein said resetting of each partition includes:
sorting the partitions according to the accumulated counts;
marking each partition that has changed;
pushing the partitions onto a stack according to the sequence;
popping the changed partitions according to the sequence, and reordering the partitions.
7. The method for real-time updating of data assets under database table variation according to claim 6, wherein for a single partition of a single update period, said detection strategy is further provided with a temporary storage area of the partition;
Wherein the capacity of the temporary storage area is proportional to the corresponding partition.
8. The method for real-time updating of data assets under database table variation according to any one of claims 5-7, further comprising, after said resetting each partition according to said new usage frequency:
and monitoring the activation state of each partition in a preset check period, and sequentially roll-calling the inactive partitions.
CN202410866175.4A 2024-07-01 2024-07-01 Method for updating data asset in real time under change of database table Active CN118394772B (en)

Publications (2)

Publication Number Publication Date
CN118394772A CN118394772A (en) 2024-07-26
CN118394772B true CN118394772B (en) 2024-10-01

Family

ID=91987653

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant