
CN113127558B - Metadata synchronization method, system, equipment and storage medium - Google Patents

Metadata synchronization method, system, equipment and storage medium

Info

Publication number
CN113127558B
Authority
CN
China
Prior art keywords
metadata
synchronized
data
kylin
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911412597.XA
Other languages
Chinese (zh)
Other versions
CN113127558A (en)
Inventor
杨玉磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yiyiyun Technology Co ltd
Original Assignee
Beijing Yiyiyun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yiyiyun Technology Co ltd
Priority to CN201911412597.XA
Publication of CN113127558A
Application granted
Publication of CN113127558B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275 Synchronous replication

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a metadata synchronization method, system, equipment and storage medium. The method comprises the following steps: receiving a storage path and a project name of the Kylin metadata to be synchronized; acquiring the metadata of the project to be synchronized from the storage path; receiving the location information of a target cluster; and sending the metadata of the project to be synchronized to the location of the target cluster. The invention provides a fast and accurate way to synchronize Kylin metadata for cases where the same Kylin service must be deployed in multiple clusters at once: project metadata is created and maintained manually in only one cluster, and the other clusters can be synchronized to the same state with one click, which avoids repeated manual operations and improves efficiency and accuracy.

Description

Metadata synchronization method, system, equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a metadata synchronization method, system, device, and storage medium.
Background
Since Kylin became a top-level project of the Apache Software Foundation, it has attracted considerably more attention. Kylin's pre-computation technology can greatly speed up OLAP services with fixed query patterns, and it is widely used by Internet companies in China and abroad. OLAP (Online Analytical Processing) refers to analyzing data in a multidimensional manner, with flexible support for roll-up, drill-down, and pivot operations. It is a way of presenting integrated decision information and is commonly used in decision support systems, business intelligence, and data warehouses. Kylin, short for Apache Kylin, is an open source OLAP engine on the Hadoop big data platform; it uses multidimensional cube pre-computation to bring SQL (Structured Query Language) query latency on big data down to the sub-second level. Hadoop is a distributed system infrastructure developed by the Apache Software Foundation; it lets users develop distributed programs without knowing the underlying details of distribution and makes full use of a cluster for high-speed computation and storage.
When using Kylin, a developer manually creates the project, model, cube, and table, and then manually builds on them. This is not a problem when there is only one Kylin cluster. However, when security requirements or other factors call for tens or even hundreds of Kylin clusters, manually creating the same set of Kylin projects on every cluster is too inefficient and prone to human error.
Disclosure of Invention
Aiming at the problems in the prior art, the invention aims to provide a metadata synchronization method, a system, equipment and a storage medium, which can realize rapid and accurate synchronization of Kylin metadata.
The embodiment of the invention provides a metadata synchronization method, which comprises the following steps:
receiving the project name of the Kylin metadata to be synchronized;
acquiring the metadata of the project to be synchronized from the storage path of the Kylin metadata to be synchronized;
receiving the location information of a target cluster;
and sending the metadata of the project to be synchronized to the location of the target cluster.
Optionally, after acquiring the metadata of the project to be synchronized from the storage path, the method further includes the following steps:
extracting each model name and table name from the metadata of the project to be synchronized;
and acquiring the metadata of each model and the metadata of each table from the storage path according to the model name and the table name.
Optionally, after acquiring the metadata of the project to be synchronized from the storage path, the method further includes the following steps:
extracting the names of all data cubes from the metadata of the project to be synchronized;
acquiring the metadata of each data cube from the storage path according to the name of the data cube;
and, for each data cube, acquiring the metadata of each segment (cube segment) of the data cube.
Optionally, after sending the metadata of the project to be synchronized to the location of the target cluster, the method further includes the following steps:
requesting the Kylin build interface, and submitting the build of each segment for each data cube in turn according to the metadata of the segments.
Optionally, acquiring the metadata of all segments of the data cube includes the following steps:
for each data cube, obtaining the name of its description (cube_desc) from the description field of the metadata of the data cube, and obtaining the metadata of the description of the data cube from the storage path according to that name;
and extracting the field content of the segments from the metadata of the description, and generating the metadata of the segments according to that field content.
Optionally, after the metadata of all segments of the data cube is obtained, the method further includes the following steps:
screening the metadata of the segments, and removing the log data and the physical storage information generated when the segments were built.
Optionally, before sending the metadata of the project to be synchronized to the location of the target cluster, the method further includes the following steps:
receiving database renaming information;
and modifying the database name in the metadata of the tables according to the database renaming information.
Optionally, after sending the metadata of the project to be synchronized to the location of the target cluster, the method further includes the following steps:
updating the signature of each data cube in turn.
The embodiment of the invention also provides a metadata synchronization system, which applies the metadata synchronization method and comprises:
The information acquisition module is used for receiving the project name of the Kylin metadata to be synchronized and receiving the position information of the target cluster;
The data extraction module is used for acquiring metadata of the items to be synchronized from a storage path of the Kylin metadata to be synchronized;
And the data importing module is used for sending the metadata of the items to be synchronized to the position of the target cluster.
The embodiment of the invention also provides metadata synchronization equipment, which comprises:
A processor;
A memory having stored therein executable instructions of the processor;
Wherein the processor is configured to perform the steps of the metadata synchronization method via execution of the executable instructions.
The embodiment of the invention also provides a computer readable storage medium for storing a program, which when executed, implements the steps of the metadata synchronization method.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
The metadata synchronization method, system, equipment and storage medium provided by the invention have the following advantages:
The invention provides a method for quickly and accurately synchronizing Kylin metadata, which is used for meeting the requirement that the same Kylin service needs to be realized simultaneously in a plurality of clusters, and only one cluster is needed to manually establish and maintain project metadata, and other clusters can be synchronized into the same state by one key, so that manual repeated operation is avoided, the efficiency is improved, and the investigation and error correction cost caused by manual operation errors can be reduced.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings.
FIG. 1 is a flow chart of a metadata synchronization method according to an embodiment of the present invention;
FIG. 2 is a flow chart of metadata acquisition according to an embodiment of the present invention;
FIG. 3 is a flow chart of metadata synchronization according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a metadata synchronization system according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a metadata synchronization device according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a computer-readable storage medium according to an embodiment of the invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
As shown in fig. 1, in order to solve the above technical problem, an embodiment of the present invention provides a metadata synchronization method, which includes the following steps:
S100: receiving the project name of Kylin metadata to be synchronized;
Metadata, also called intermediary data or relay data, is data that describes data (data about data). It mainly describes the attributes (properties) of data and supports functions such as indicating storage location, recording history, resource lookup, and file recording. In this document, the metadata of Kylin refers to all the data that describes one project. A project represents a set of metadata. Multiple projects can be created in Kylin to effectively separate different kinds of business data.
S200: acquiring metadata of an item to be synchronized from a storage path of Kylin metadata to be synchronized;
S300: receiving the location information of a target cluster, wherein the location information may be the web address (URL) of the target cluster;
S400: and sending the metadata of the items to be synchronized to the position of the target cluster according to the position information of the target cluster.
The invention provides a method for rapidly and accurately synchronizing Kylin metadata, which is used for acquiring Kylin metadata to be synchronized through steps S100 and S200 and synchronizing the Kylin metadata into a target cluster through steps S300 and S400. For the requirement that the same Kylin service needs to be realized in a plurality of clusters at the same time, by adopting the method, project metadata is manually created and maintained in one cluster, other clusters can be synchronized into the same state by one key, manual repeated operation is avoided, and the investigation and error correction cost caused by manual misoperation can be reduced while the efficiency is improved.
The above numbers of the steps are merely for distinguishing the steps, and not for limiting the order of implementation of the steps. Steps S100 to S400 may be performed in the order shown in fig. 1, or the execution order of each step may be adjusted, for example, steps S100 and S300 may be simultaneously performed to obtain all information of metadata synchronization, and then steps S200 and S400 may be performed, which may also achieve the objects and effects of the present invention and fall within the scope of protection of the present invention.
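The four steps can be sketched as a small pipeline. This is only an illustrative sketch, written in Python for brevity: the `SyncRequest` type and the `fetch_project` and `send_to_target` callables are hypothetical stand-ins for reading from the source metadata store and uploading to the target cluster, and none of them comes from Kylin itself. The inputs of S100 and S300 are collected first, then S200 and S400 run, which is one of the orderings the description allows.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SyncRequest:
    project_name: str  # S100: name of the Kylin project to synchronize
    target_url: str    # S300: location (web URL) of the target cluster


def synchronize(
    request: SyncRequest,
    fetch_project: Callable[[str], Dict],         # hypothetical reader for the source store
    send_to_target: Callable[[str, Dict], None],  # hypothetical uploader for the target
) -> Dict:
    """Run S200 and S400 once the inputs of S100 and S300 are known."""
    metadata = fetch_project(request.project_name)  # S200
    send_to_target(request.target_url, metadata)    # S400
    return metadata


# Usage with in-memory stand-ins for the source store and the target cluster.
source_store = {"sales": {"name": "sales", "models": ["sales_model"]}}
target_clusters: Dict[str, Dict] = {}


def demo_send(url: str, meta: Dict) -> None:
    target_clusters[url] = meta


result = synchronize(
    SyncRequest("sales", "http://target.example:7070/kylin"),
    fetch_project=lambda name: source_store[name],
    send_to_target=demo_send,
)
```

Passing the reader and uploader as parameters keeps the ordering logic separate from how metadata is actually read or written, so the same function can be exercised without a cluster.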
The Kylin metadata mainly comprises project item information, model information, cube data cube information, table information and cube segment information.
Table refers to a table of stored data in Hive. The source data for Kylin pre-computation comes from Hive. The database name plus the table name uniquely identifies a table in a Hive data warehouse. Hive is a data warehouse infrastructure built on Hadoop. It provides a set of tools for data extraction, transformation, and loading (ETL), and a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. Hive defines a simple SQL-like query language, called HQL, that allows users familiar with SQL to query data. SQL, the Structured Query Language, is a database query and programming language used to access data and to query, update, and manage relational database systems; it is also the file extension of database script files. A query statement, also called a "data retrieval statement", obtains data from tables and determines how the data is presented to the application.
The Model refers to the association relationship between certain tables in the Hive database, namely the data Model, and also can be the association relationship between all tables in the database.
Cube refers to a data Cube, a technology commonly used for data analysis and indexing; it can build a multidimensional index on the raw data. The data cube is used for analyzing the data, so that the data query efficiency can be greatly improved.
Cube Segment refers to the data Cube data calculated for a Segment in the source data. Typically, the amount of data in a data warehouse will increase with time, and Cube segments are also constructed in chronological order.
As shown in fig. 2, in this embodiment, the step S200: after obtaining the metadata of the item to be synchronized from the storage path of the Kylin metadata to be synchronized, the method further comprises the following steps:
S210: extracting respective model names from metadata of the items to be synchronized;
S220: acquiring metadata of each model from a storage path of the Kylin metadata to be synchronized according to the model names;
S230: extracting each table name from metadata of the items to be synchronized;
S240: and acquiring metadata of each table from the storage path of the Kylin metadata to be synchronized according to the table names.
In one embodiment, the steps S200 to S240 may be implemented as follows:
S200: the metadata of the project specified by the user is obtained (fetched) by using org.apache.kylin.common.persistence.ResourceTool (ResourceTool hereinafter). The storage path in HBase (a distributed, column-oriented open source database) is /project/${project_name}.json; local storage path: ${meta_local_path}/project-
S210: the project metadata acquired in step S200 is parsed and all model names it contains are extracted; then step S220 is executed: the metadata of each model is acquired in turn using ResourceTool. The HBase storage path is /model_desc/${model}.json; local storage path: ${meta_local_path}/model-
S230: the project metadata acquired in step S200 is parsed and all table names it contains are extracted; then step S240 is executed: the metadata of each table is acquired in turn using ResourceTool. The HBase storage path is /table/${table}.json; local storage path: ${meta_local_path}/table-
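The name extraction and path derivation for models and tables can be illustrated as follows. This is a minimal sketch that assumes the project metadata is a JSON object with `models` and `tables` lists; those field names and the sample values are assumptions made for illustration, while the path templates follow the HBase storage paths given above. Fetching each path (done with ResourceTool in the described implementation) is not shown.

```python
import json
from typing import Dict, List


def resource_paths(project_json: str) -> Dict[str, List[str]]:
    """Derive per-model and per-table resource paths from project metadata."""
    project = json.loads(project_json)
    # "models" and "tables" are assumed field names for this illustration.
    models = project.get("models", [])
    tables = project.get("tables", [])
    return {
        "models": [f"/model_desc/{name}.json" for name in models],  # model names -> model paths
        "tables": [f"/table/{name}.json" for name in tables],       # table names -> table paths
    }


sample = json.dumps({
    "name": "sales",
    "models": ["sales_model"],
    "tables": ["DEFAULT.ORDERS", "DEFAULT.CUSTOMERS"],
})
paths = resource_paths(sample)
```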
As shown in fig. 2, in this embodiment, the step S200: after the metadata of the item to be synchronized is obtained from the storage path of the Kylin metadata to be synchronized, the method further comprises the following steps:
S250: extracting names of all data cubes from metadata of the items to be synchronized;
S260: acquiring metadata of each data cube from a storage path of the Kylin metadata to be synchronized according to the names of the data cubes;
S270: for each data cube, metadata for the respective segment of the data cube is obtained.
In this embodiment, the step S270: acquiring metadata of all fragments of a data cube, comprising the steps of:
S271: for each data cube, obtaining the name of its description from the description field of the metadata of the data cube;
S272: acquiring the metadata of the description of the data cube from the storage path of the Kylin metadata to be synchronized according to the name of the description;
S273: extracting the field content of the segments from the metadata of the description, and generating the metadata of the segments according to that field content.
Further, in this embodiment, the step S270: after obtaining the metadata of all the fragments of the data cube, the method further comprises the following steps:
S274: and screening the metadata of the fragments, and removing the log data and the physical storage information data when the fragments are constructed.
Specifically, in a specific example, the steps S250 to S270 may be specifically implemented as follows:
S250: the project metadata acquired in step S200 is parsed and the names of all data cubes it contains are extracted; then step S260 is executed: the metadata of each data cube is acquired in turn using ResourceTool. The HBase storage path is /cube/${cube}.json
S270: for each data cube, the following steps are performed:
S271: the name of the cube_desc is obtained from the descriptor field in the metadata, and then step S272 is performed: the metadata of each cube_desc is acquired using ResourceTool. The HBase storage path is /cube_desc/${cube_desc}.json; local storage path: ${meta_local_path}/cube_desc +.
S273: the contents of the segment field are extracted to generate the metadata of all data cube segments. Local storage path: ${meta_local_path}/cube_segment-
S274: the cube metadata is processed to remove unnecessary data cube segment information, mainly historical data generated during previous builds. This information is used to create the data cube segments after the metadata has been imported into the target cluster. Local storage path: ${meta_local_path}/cube/.
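The screening in step S274 amounts to dropping build-specific fields from each segment record while keeping the fields that define the segment. The sketch below assumes each segment is a flat dictionary; the field names in `BUILD_SPECIFIC_FIELDS` and in the sample record are hypothetical placeholders for build logs and physical storage details, not a confirmed list of Kylin fields.

```python
from typing import Dict, Iterable, List

# Hypothetical field names standing in for build logs and physical storage details.
BUILD_SPECIFIC_FIELDS = frozenset({
    "last_build_job_id",
    "last_build_time",
    "storage_location_identifier",
    "size_kb",
})


def strip_build_info(segments: Iterable[Dict], drop=BUILD_SPECIFIC_FIELDS) -> List[Dict]:
    """Return copies of the segment records without build-specific fields."""
    return [{k: v for k, v in seg.items() if k not in drop} for seg in segments]


segments = [{
    "name": "20190101000000_20190201000000",
    "date_range_start": 1546300800000,
    "date_range_end": 1548979200000,
    "last_build_job_id": "job-123",
    "storage_location_identifier": "KYLIN_ABC",
}]
cleaned = strip_build_info(segments)
```

The function returns new dictionaries rather than editing in place, so the originally extracted metadata remains available for comparison.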
Through the steps, all necessary metadata in the items to be synchronized are already extracted. The metadata extracted in each of steps S200 to S270 is required to conform to a predetermined directory structure.
As shown in fig. 3, in this embodiment, before step S400 (sending the metadata of the project to be synchronized to the location of the target cluster), the method further comprises the following steps:
S281: receiving database renaming information;
S282: modifying the database name in the metadata of the table according to the database renaming information, for example, a RenameDBName tool can be used for unified renaming.
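The renaming step can be pictured as a lookup applied to each table's metadata. This sketch assumes the table metadata is a dictionary with a `database` field and that the renaming information is a mapping from old to new database names; both are assumptions for illustration, and the sketch does not reproduce the behavior of the RenameDBName tool mentioned above.

```python
from typing import Dict


def rename_database(table_meta: Dict, renames: Dict[str, str]) -> Dict:
    """Return a copy of the table metadata with its database name remapped."""
    updated = dict(table_meta)
    # "database" is an assumed field name; unmapped databases are left unchanged.
    old_name = updated.get("database")
    if old_name in renames:
        updated["database"] = renames[old_name]
    return updated


table = {"name": "ORDERS", "database": "DEV_DB"}
renamed = rename_database(table, {"DEV_DB": "PROD_DB"})
```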
Furthermore, before importing the metadata into the target cluster, the method may further include the steps of:
S290: checking whether a project with the same name already exists in the target Kylin cluster, so that an existing project does not conflict with the metadata about to be imported. If a project with the same name exists, the user is notified and the user's next instruction is awaited; depending on that instruction, either the metadata of the existing project is cleaned up or the import is abandoned.
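The decision logic of the conflict check can be sketched as a small function. The return values and the `user_choice` strings below are hypothetical labels chosen for this illustration; how the list of existing project names is obtained from the target cluster is outside the sketch.

```python
from typing import Iterable


def resolve_import(project_name: str, existing: Iterable[str], user_choice: str = "") -> str:
    """Decide how to proceed before import.

    Returns "import" when there is no conflict, "clean_then_import" or "abort"
    according to the user's instruction, and "ask_user" when a conflict exists
    but no instruction has been received yet.
    """
    if project_name not in set(existing):
        return "import"
    if user_choice == "overwrite":
        return "clean_then_import"
    if user_choice == "abort":
        return "abort"
    return "ask_user"


decision = resolve_import("sales", ["sales", "hr"])
```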
When executing step S400, the upload operation of ResourceTool may be used, where the input parameter is the storage path of the extracted project metadata; if the target cluster is password-protected, the password of the target cluster's Kylin must be obtained in advance.
In this embodiment, after step S400 (sending the metadata of the project to be synchronized to the location of the target cluster), the method further comprises the following steps:
S310: the signature is updated for each data cube in turn. Since the database name of the table is modified, the signature needs to be updated for each cube in turn using org.apache.
In this embodiment, after step S400 (sending the metadata of the project to be synchronized to the location of the target cluster), the method further comprises the following steps:
S320: requesting a Kylin construction interface, and sequentially submitting the construction of the fragments for each data cube according to the metadata of the fragments. Specifically, step S320 may be performed using the following steps:
S321: submitting the build of each data cube segment for each data cube in turn by directly requesting the Kylin build interface api/cubes/${cube_name}/build, and recording the build ID.
S322: polling all of the above build IDs and checking the build status of each segment. For builds whose status is FINISHED, ERROR, or DISCARDED, the result is recorded. Any other status indicates that the segment is still being built; the method waits and continues polling until the build ends.
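The submit-and-poll loop can be sketched as below. It is an illustration under stated assumptions: the endpoint is taken to be api/cubes/<cube name>/build relative to a `base_url` parameter introduced here, the terminal states are taken to be FINISHED, ERROR and DISCARDED, and the status lookup is injected as a callable so that no HTTP client or response format is assumed. The `max_rounds` cap is an addition of this sketch to keep the loop bounded.

```python
import time
from typing import Callable, Dict, Iterable

TERMINAL_STATES = {"FINISHED", "ERROR", "DISCARDED"}


def build_url(base_url: str, cube_name: str) -> str:
    """Compose the build endpoint for a cube (base_url is an assumed parameter)."""
    return f"{base_url.rstrip('/')}/api/cubes/{cube_name}/build"


def poll_builds(
    build_ids: Iterable[str],
    get_status: Callable[[str], str],  # hypothetical status lookup, e.g. a REST call
    interval_s: float = 0.0,
    max_rounds: int = 100,
) -> Dict[str, str]:
    """Poll until every build reaches a terminal state or max_rounds is hit."""
    pending = list(build_ids)
    results: Dict[str, str] = {}
    for _ in range(max_rounds):
        still_running = []
        for build_id in pending:
            status = get_status(build_id)
            if status in TERMINAL_STATES:
                results[build_id] = status   # record the final result
            else:
                still_running.append(build_id)  # still building, poll again later
        pending = still_running
        if not pending:
            break
        time.sleep(interval_s)
    return results


# Stand-in status source: each build reports RUNNING once, then its final state.
_final = {"b1": "FINISHED", "b2": "ERROR"}
_seen = set()


def fake_status(build_id: str) -> str:
    if build_id in _seen:
        return _final[build_id]
    _seen.add(build_id)
    return "RUNNING"


outcome = poll_builds(["b1", "b2"], fake_status)
```

Builds that never reach a terminal state within `max_rounds` are simply absent from the returned dictionary, which lets the caller distinguish them from recorded failures.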
As shown in fig. 4, an embodiment of the present invention further provides a metadata synchronization system, which is applied to the metadata synchronization method, where the system includes:
The information acquisition module M100 is used for receiving the project name of the Kylin metadata to be synchronized and receiving the position information of the target cluster;
the data extraction module M200 is used for acquiring metadata of the items to be synchronized from a storage path of the Kylin metadata to be synchronized;
And the data importing module M300 is used for sending the metadata of the items to be synchronized to the position of the target cluster.
The metadata synchronization system of the present invention can directly acquire the synchronization requirement information of the Kylin metadata through the information acquisition module M100, specifically, includes the storage path and the project name of the Kylin metadata to be synchronized and the position information of the target cluster, acquires the Kylin metadata to be synchronized through the data extraction module M200, and synchronizes the Kylin metadata into the target cluster through the data import module M300. For the requirement that the same Kylin service needs to be realized in a plurality of clusters at the same time, by adopting the method, project metadata is manually created and maintained in one cluster, other clusters can be synchronized into the same state by one key, manual repeated operation is avoided, and the investigation and error correction cost caused by manual misoperation can be reduced while the efficiency is improved.
The functions of each module can be realized by adopting the specific implementations of the metadata synchronization method. For example, the functions of the information acquisition module M100 may be implemented using the embodiments of steps S100 and S300 described above. The functions of the data extraction module M200 may be implemented using the embodiment of step S200 described above, and steps S210 to S270 may further be performed to obtain the metadata of a complete project. The functions of the data import module M300 may be implemented using the embodiment of step S400 described above, and steps S281, S282, S290, S310 and S320 may further be performed to complete the import and build of the metadata.
The programs for the metadata synchronization method and system can be written in Java and Linux shell. To run them, a JDK/JRE environment must be deployed on a Unix-like system, and a Kylin client is also required. When the source cluster executes the code corresponding to the data extraction module M200, the parameters the user needs to input are the name of the project to be synchronized and the local storage path for the metadata. The source cluster determines the storage path of the metadata in the source cluster according to the project name, and the data extraction module M200 extracts the metadata from that storage path and stores it in the local storage path.
When data is imported, metadata is transmitted to the target cluster, a user needs to fill in database renaming information, and then code corresponding to the data importing module M300 is executed. Parameters that the user needs to input are: web url (web address) of the target Kylin cluster, password of the target Kylin cluster, and storage path of metadata to be imported. The data import module M300 is configured to import metadata from the local storage path into the target cluster, and complete metadata construction.
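The three import parameters map naturally onto a command-line interface. The sketch below uses Python's argparse purely for illustration (the description names Java and Linux shell as implementation languages); the option names `--target-url`, `--password` and `--meta-path` are hypothetical and not taken from the patent.

```python
import argparse


def parse_import_args(argv):
    """Parse the parameters needed by the import step."""
    parser = argparse.ArgumentParser(
        description="Import Kylin metadata into a target cluster"
    )
    parser.add_argument("--target-url", required=True,
                        help="web URL of the target Kylin cluster")
    parser.add_argument("--password", default=None,
                        help="password of the target Kylin cluster, if any")
    parser.add_argument("--meta-path", required=True,
                        help="local storage path of the metadata to import")
    return parser.parse_args(argv)


args = parse_import_args([
    "--target-url", "http://target.example:7070/kylin",
    "--meta-path", "/tmp/kylin_meta",
])
```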
The embodiment of the invention also provides metadata synchronization equipment, which comprises a processor; a memory having stored therein executable instructions of the processor; wherein the processor is configured to perform the steps of the metadata synchronization method via execution of the executable instructions.
Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may take the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit," "module," or "platform."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 5. The electronic device 600 shown in fig. 5 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 5, the electronic device 600 is embodied in the form of a general purpose computing device. Components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one memory unit 620, a bus 630 connecting the different platform components (including the memory unit 620 and the processing unit 610), a display unit 640, etc.
The memory unit stores program code that is executable by the processing unit 610, such that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the metadata synchronization method section above in this specification. For example, the processing unit 610 may perform the steps as shown in fig. 1.
The memory unit 620 may include readable media in the form of volatile memory units, such as Random Access Memory (RAM) 6201 and/or cache memory unit 6202, and may further include Read Only Memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 630 may be a local bus representing one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. Also, electronic device 600 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 over the bus 630. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage platforms, and the like.
The embodiment of the invention also provides a computer readable storage medium for storing a program which, when executed, implements the steps of the metadata synchronization method. In some possible embodiments, the aspects of the present invention may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps according to the various exemplary embodiments of the invention described in the metadata synchronization method section of this specification.
Referring to FIG. 6, a program product 800 for implementing the above-described method according to an embodiment of the present invention is described. It may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present invention is not limited thereto. In this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
In summary, compared with the prior art, the metadata synchronization method, system, device and storage medium provided by the invention have the following advantages:
The invention provides a method for quickly and accurately synchronizing Kylin metadata, which meets the need to run the same Kylin service in several clusters at the same time. Project metadata needs to be created and maintained manually in only one cluster; the other clusters can then be synchronized to the same state with a single operation. This avoids repeated manual work, improves efficiency, and reduces the cost of investigating and correcting errors caused by manual operation.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (7)

1. A metadata synchronization method, comprising the following steps:
receiving the project name of the Kylin metadata to be synchronized;
acquiring metadata of the project to be synchronized from a storage path of the Kylin metadata to be synchronized;
receiving location information of a target cluster;
sending the metadata of the project to be synchronized to the location of the target cluster;
wherein, before sending the metadata of the project to be synchronized to the location of the target cluster, the method further comprises the following steps:
receiving database renaming information;
modifying the database name in the metadata according to the database renaming information;
checking whether the same project name exists in the target cluster;
if the same project name exists, reminding the user, receiving the user's next instruction, and either clearing the metadata of the original project or abandoning the import according to that instruction;
and wherein, after the metadata of the project to be synchronized is acquired from the storage path of the Kylin metadata to be synchronized, the method further comprises the following steps:
extracting each model name and table name from the metadata of the project to be synchronized;
acquiring metadata of each model and metadata of each table from the storage path according to the model names and the table names;
extracting the names of all data cubes from the metadata of the project to be synchronized;
acquiring metadata of each data cube from the storage path according to the name of the data cube;
for each data cube, acquiring metadata of each segment of the data cube;
screening the metadata of the segments, and removing the log data and the physical storage information data generated when the segments were built.
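By way of non-limiting illustration, the extraction, renaming, and screening steps of claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the metadata store is modeled as an in-memory dictionary keyed by path, and the path layout, the `models`/`tables`/`cubes` keys, and the list of build-specific segment fields are all hypothetical, since the claims do not name them.

```python
import copy

# Assumption: the claims do not name the log or physical-storage fields of a
# segment; these field names are illustrative placeholders only.
BUILD_SPECIFIC_FIELDS = (
    "last_build_job_id",
    "last_build_time",
    "storage_location_identifier",
    "size_kb",
)


def rename_database(table_meta, renames):
    """Return a copy of the table metadata with its database name remapped."""
    updated = copy.deepcopy(table_meta)
    old_name = updated.get("database")
    if old_name in renames:
        updated["database"] = renames[old_name]
    return updated


def screen_segment(segment, drop_fields=BUILD_SPECIFIC_FIELDS):
    """Drop build-time log and physical-storage fields from one segment."""
    return {k: v for k, v in segment.items() if k not in drop_fields}


def collect_project_metadata(store, project_name, renames=None):
    """Gather project, model, table, cube and screened segment metadata.

    `store` is a hypothetical dict mapping storage paths to parsed metadata.
    """
    renames = renames or {}
    project = store[f"/project/{project_name}.json"]
    bundle = {"project": project, "models": {}, "tables": {},
              "cubes": {}, "segments": {}}
    for model in project["models"]:
        bundle["models"][model] = store[f"/model_desc/{model}.json"]
    for table in project["tables"]:
        bundle["tables"][table] = rename_database(
            store[f"/table/{table}.json"], renames)
    for cube in project["cubes"]:
        cube_meta = store[f"/cube/{cube}.json"]
        bundle["cubes"][cube] = cube_meta
        bundle["segments"][cube] = [
            screen_segment(s) for s in cube_meta.get("segments", [])]
    return bundle
```

The source metadata is copied rather than modified in place, so the cluster that maintains the project is left unchanged by a synchronization run.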
2. The metadata synchronization method according to claim 1, further comprising the following step after sending the metadata of the project to be synchronized to the location of the target cluster:
requesting the Kylin build interface and, for each data cube, submitting the builds of its segments in sequence according to the metadata of the segments.
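As a hedged illustration of claim 2, the sketch below only prepares the build requests, one per segment, in ascending time order; it does not send them. The endpoint path and the body fields (`startTime`, `endTime`, `buildType`) are assumptions modeled on Kylin's REST interface and should be checked against the deployed Kylin version. The function name and the `date_range_start`/`date_range_end` segment keys are likewise hypothetical.

```python
def plan_segment_builds(base_url, segments_by_cube):
    """Return one build request description per segment, ordered per cube.

    Assumption: the target cluster exposes a cube build endpoint of the form
    <base_url>/kylin/api/cubes/<cube>/build that accepts a time range.
    """
    planned = []
    root = base_url.rstrip("/")
    for cube_name, segments in segments_by_cube.items():
        # Submit segments in time order so builds are requested sequentially.
        for seg in sorted(segments, key=lambda s: s["date_range_start"]):
            planned.append({
                "method": "PUT",
                "url": f"{root}/kylin/api/cubes/{cube_name}/build",
                "json": {
                    "startTime": seg["date_range_start"],
                    "endTime": seg["date_range_end"],
                    "buildType": "BUILD",
                },
            })
    return planned
```

Separating the planning of requests from their submission keeps the ordering logic testable without a running cluster; an HTTP client of the implementer's choice could then send each entry in turn.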
3. The metadata synchronization method according to claim 1, wherein acquiring the metadata of each segment of the data cube comprises the following steps:
for each data cube, acquiring the description name from the description field of the metadata of the data cube, and acquiring the metadata of the description of the data cube from the storage path according to the description name;
extracting the field content of the segments from the metadata of the description, and generating the metadata of the segments according to the field content of the segments.
4. The metadata synchronization method according to claim 1, further comprising the following step after sending the metadata of the project to be synchronized to the location of the target cluster:
updating the signature of each data cube in turn.
5. A metadata synchronization system, characterized by being applied to the metadata synchronization method of any one of claims 1 to 4, the system comprising:
an information acquisition module, used for receiving the project name of the Kylin metadata to be synchronized and receiving the location information of the target cluster;
a data extraction module, used for acquiring metadata of the project to be synchronized from a storage path of the Kylin metadata to be synchronized;
and a data importing module, used for sending the metadata of the project to be synchronized to the location of the target cluster.
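For illustration only, the three modules of claim 5 might be arranged as in the following sketch, which also includes the name-conflict check of claim 1. All class names, the `on_conflict` callback, and the in-memory dictionaries standing in for the source storage path and the target clusters are hypothetical; a real system would read from and write to the clusters' metadata stores.

```python
from dataclasses import dataclass


@dataclass
class SyncRequest:
    project_name: str      # project whose Kylin metadata is to be synchronized
    target_location: str   # location information of the target cluster


class InformationAcquisitionModule:
    def receive(self, project_name, target_location):
        return SyncRequest(project_name, target_location)


class DataExtractionModule:
    def __init__(self, source_store):
        # Hypothetical stand-in for the storage path: project name -> metadata.
        self.source_store = source_store

    def extract(self, project_name):
        return self.source_store[project_name]


class DataImportModule:
    def __init__(self, clusters):
        # Hypothetical stand-in for clusters: location -> {project: metadata}.
        self.clusters = clusters

    def import_project(self, request, metadata, on_conflict=None):
        """Send metadata to the target; return False if the import is abandoned."""
        target = self.clusters.setdefault(request.target_location, {})
        if request.project_name in target:
            # Same project name exists: ask the user what to do next.
            decision = on_conflict(request.project_name) if on_conflict else "abandon"
            if decision != "overwrite":
                return False
            del target[request.project_name]  # clear the original project metadata
        target[request.project_name] = metadata
        return True
```

Defaulting to abandoning the import when no user instruction is available is a design choice made for this sketch, so that existing metadata in the target cluster is never cleared without an explicit decision.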
6. A metadata synchronization apparatus, comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the metadata synchronization method of any one of claims 1 to 4 via execution of the executable instructions.
7. A computer-readable storage medium storing a program, characterized in that the program when executed implements the steps of the metadata synchronization method of any one of claims 1 to 4.
CN201911412597.XA 2019-12-31 2019-12-31 Metadata synchronization method, system, equipment and storage medium Active CN113127558B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911412597.XA CN113127558B (en) 2019-12-31 2019-12-31 Metadata synchronization method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911412597.XA CN113127558B (en) 2019-12-31 2019-12-31 Metadata synchronization method, system, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113127558A (en) 2021-07-16
CN113127558B (en) 2024-08-06

Family

ID=76770279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911412597.XA Active CN113127558B (en) 2019-12-31 2019-12-31 Metadata synchronization method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113127558B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8078749B2 (en) * 2008-01-30 2011-12-13 Microsoft Corporation Synchronization of multidimensional data in a multimaster synchronization environment with prediction
CN103150394B (en) * 2013-03-25 2014-07-23 中国人民解放军国防科学技术大学 Distributed file system metadata management method facing to high-performance calculation
CN106528341B (en) * 2016-11-09 2019-07-30 上海新炬网络信息技术股份有限公司 Automation disaster tolerance system based on Greenplum database
CN109871384B (en) * 2019-02-22 2021-04-30 携程旅游信息技术(上海)有限公司 Method, system, equipment and storage medium for container migration based on PaaS platform
CN110032575A (en) * 2019-04-15 2019-07-19 网易(杭州)网络有限公司 Data query method, apparatus, equipment and storage medium
CN110543476A (en) * 2019-07-03 2019-12-06 威富通科技有限公司 Synchronization method and device of database table structure and server

Also Published As

Publication number Publication date
CN113127558A (en) 2021-07-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant