CN115168361A - Label management method and device - Google Patents
- Publication number
- CN115168361A (application CN202210847895.7A)
- Authority
- CN
- China
- Prior art keywords
- label
- tag
- static
- dynamic
- sql
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/221—Column-oriented storage; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2308—Concurrency control
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/0486—Drag-and-drop
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/547—Remote procedure calls [RPC]; Web services
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of tag management and provides a tag management method and device. The tag management method comprises: creating a subject object, with the back end creating a corresponding tag storage table in a ClickHouse warehouse; acquiring an external data source, mapping the database table structure and the data to be tagged into ClickHouse through FlinkX, and associating them with the subject object; creating a static tag under the subject object; creating a dynamic tag under the subject object through SQL or by drag and drop; and publishing the dynamic tag and the static tag under the subject object as an API interface. According to exemplary embodiments of the invention, the tag management method and device increase the production speed of tag data; shorten the time until tag data is ready; reduce the average response time of query requests; support near-real-time updating of tag data; keep tag expressions and query SQL user-friendly, which improves the maintainability of the system; and, because tagging and querying are both processed in ClickHouse, can save half of the hardware resources.
Description
Technical Field
The present invention relates to the field of tag management technologies, and in particular, to a tag management method and apparatus.
Background
A tag is a semantic expression of platform business data. It may be a basic attribute of an object, or a characteristic attribute describing the object that is obtained by computing over and analyzing the raw data. Tags are widely used in application scenarios such as user profiles and product profiles.
The existing tag management platform defines the computation process of tags visually on an interface and, using big data technologies such as Spark, Hive and HBase, computes tag data that meets composite business requirements over petabyte-scale data, thereby providing a data basis for user grouping and user tags. The tag management platform manages the full life cycle of tags, and its overall architecture is divided into three layers: a tag management layer, a tag library and a tag service layer. The life cycle of a tag is divided into creation, storage and query. The tag management layer is supported by a tag engine and offers requirements analysts visual ways to define tag rules, such as drag and drop, circle selection and lightweight scripts; the tag engine automatically parses the tag rules, performs tagging and persists the results, and provides complete tag management together with metering and statistics functions, realizing full life cycle management of tags. The tag library is the carrier of tags; it accumulates valuable data and provides the resources on which external tag services rely. The tag service layer comprises a series of services for end applications, such as the tag API service and the dynamic tagging service, so that valuable data can be quickly exposed as services.
Currently, in this field, data synchronization, data storage and tag search in the tag creation process are mainly implemented with a data synchronization engine (DataX), big data storage (Hive) and a search engine (ElasticSearch). In practice, this tag creation process has many steps, takes a long time, and has the following drawbacks:
1. Data synchronization is inefficient. Creating a tag relies on the DataX platform to synchronize the raw data into the tag storage engine library, and synchronization is slow when the data volume is large; after a tag is created successfully, the data has to be synchronized again to ElasticSearch through the DataX platform, which increases system complexity and reduces the efficiency with which tag data can be used. The number of tags under a subject determines the number of columns of the static tag table, and when that number is too large, data synchronization becomes a bottleneck.
2. Real-time performance is poor. Before a tag is created, part of the tag data needs to be previewed according to the creation rule, and tag data is provided to external platforms by creating API services, so query responses must not be too slow. Tags are stored in a Hive library. Taking user tags as an example, the storage table is structured as follows: a profile table is created with userid as the primary key, and the other fields of the table are feature fields of the profile. The selected crowd is matched against the profile table with an IN operation, followed by a GROUP BY operation. Whenever feature fields are added or deleted, the structure of the profile table has to be modified; when the selected crowd is large, the GROUP BY runs over a large record set, and the Hive statement performs poorly with high latency, so query results cannot be obtained quickly and the approach is unsuitable for real-time scenarios. On the other hand, search over generated tags is implemented by storing the tags in a wide table in ElasticSearch. When data is inserted into the wide table, the join operation can only run after the data of every business party is ready, and only then is the join result inserted into ElasticSearch. Task delays on the side of some business party are common, in which case the related ElasticSearch insertion task cannot run and operators cannot use the latest profile data in time.
3. Data is redundant. Under the data rule, every tag value of every object occupies one row of storage, which causes heavy redundancy of data items such as tag names and tag values, and tagging stores multiple redundant records in the static tag table and the dynamic tag table.
4. Tag semantics are limited. Tags can only be created from original static data; there is no way to create a new tag by aggregating existing tags, so the range of features a single tag can describe is small. The tagging process depends entirely on SQL statements over the base tables, a subject object can only select a single database under a single data source, and tagging cannot use join queries across base tables from different sources.
Therefore, how to construct tags with diverse dimensions and semantics, and how to provide an efficient and more widely applicable tag management method and platform that supports fast, real-time querying, is a technical problem to be solved urgently.
Disclosure of Invention
In view of the above, the present invention aims to solve the problems described in the Background.
In one aspect, the present invention provides a tag management method, including:
step S1: creating a subject object, with the back end creating a corresponding tag storage table in a ClickHouse warehouse;
step S2: acquiring an external data source, mapping the database table structure and the data to be tagged into ClickHouse through FlinkX, and associating them with the subject object;
step S3: creating a static tag under the subject object;
step S4: creating a dynamic tag under the subject object through SQL or by drag and drop;
step S5: publishing the dynamic tag and the static tag under the subject object as an API interface.
Further, in step S1 of the tag management method of the present invention, the tag storage table includes a static tag storage table and a dynamic tag storage table, where the static tag storage table stores data corresponding to basic attributes of objects, and the dynamic tag storage table stores tags that are obtained by computing over and aggregating static tags and that can describe a batch of objects.
Further, step S2 of the tag management method of the present invention includes:
step S21: acquiring and storing an external data source, and selecting a data table from the external data source as the tag data source table;
step S22: acquiring the field information of the tag data source table and mapping it to ClickHouse field types;
step S23: assembling a table creation statement from the fields and executing it to create the target table;
step S24: assembling the tag data source table and the target table into FlinkX task execution parameters, and submitting them so that the data is synchronized into ClickHouse.
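Steps S22 and S23 can be sketched as follows. This is a minimal illustration, not the patented implementation: the type mapping, the fallback to `String`, the choice of the `MergeTree` engine and all function, table and column names are assumptions introduced here.

```python
# Hypothetical mapping from source column types (MySQL-style) to ClickHouse types.
TYPE_MAP = {
    "int": "Int32",
    "bigint": "Int64",
    "varchar": "String",
    "text": "String",
    "double": "Float64",
    "datetime": "DateTime",
}


def to_clickhouse_type(source_type: str) -> str:
    """Map a source type such as 'VARCHAR(64)' to a ClickHouse type (step S22)."""
    base = source_type.lower().split("(")[0].strip()
    # Assumption: unknown types fall back to String.
    return TYPE_MAP.get(base, "String")


def build_create_table(table, fields, order_by):
    """Assemble the table creation statement from (name, source_type) pairs (step S23)."""
    columns = ",\n".join(
        f"    `{name}` {to_clickhouse_type(source_type)}"
        for name, source_type in fields
    )
    return (
        f"CREATE TABLE IF NOT EXISTS {table}\n(\n{columns}\n)\n"
        f"ENGINE = MergeTree\nORDER BY `{order_by}`"
    )


ddl = build_create_table(
    "tag_source_user",
    [("user_id", "BIGINT"), ("city", "VARCHAR(64)"), ("age", "INT")],
    order_by="user_id",
)
print(ddl)
```

The generated statement would then be executed against ClickHouse to create the target table before the synchronization task is submitted.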
Further, step S3 of the tag management method of the present invention includes:
step S31: selecting one or more columns of a data source table that has been added as the tag data source;
step S32: adding a static tag column to the static tag storage table;
step S33: assembling the tagging SQL statement;
step S34: copying the structure of the static tag storage table to create a new table;
step S35: executing the tagging SQL statement to write the tag data into the new table, and deleting the original static tag storage table that was copied;
step S36: renaming the new table to the name of the static tag storage table that was copied.
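The copy, write and rename sequence of steps S32 to S36 can be sketched as an ordered list of ClickHouse statements. The sketch below is an assumption-laden illustration: the `_tmp` suffix, the LEFT JOIN form of the tagging SQL, the assumption that the source table shares the key column, and all names are hypothetical and not taken from the patent.

```python
def static_tag_statements(tag_table, key, existing_tags, new_tag, new_type,
                          source_table, source_column):
    """Return the SQL statements for steps S32 to S36, in execution order."""
    tmp_table = f"{tag_table}_tmp"  # hypothetical name for the copied table
    target_cols = ", ".join(f"`{c}`" for c in [key, *existing_tags, new_tag])
    kept_cols = ", ".join(f"t.`{c}`" for c in [key, *existing_tags])
    return [
        # S32: add the static tag column to the static tag storage table.
        f"ALTER TABLE {tag_table} ADD COLUMN IF NOT EXISTS `{new_tag}` {new_type}",
        # S34: copy the table structure into a new table.
        f"CREATE TABLE {tmp_table} AS {tag_table}",
        # S33 + S35: the tagging SQL writes the tag data into the new table.
        f"INSERT INTO {tmp_table} ({target_cols}) "
        f"SELECT {kept_cols}, s.`{source_column}` "
        f"FROM {tag_table} AS t LEFT JOIN {source_table} AS s "
        f"ON t.`{key}` = s.`{key}`",
        # S35: delete the original table that was copied.
        f"DROP TABLE {tag_table}",
        # S36: give the new table the original name.
        f"RENAME TABLE {tmp_table} TO {tag_table}",
    ]


statements = static_tag_statements(
    "user_static_tag", "user_id", ["age"], "city", "String",
    "tag_source_user", "city",
)
for statement in statements:
    print(statement + ";")
```

Writing into a copy and renaming it afterwards means readers of the original table name never see a half-populated column, which is one plausible reason for the sequence described in the steps above.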
Further, step S4 of the tag management method of the present invention includes: performing set operations on static tags in combination with SQL operators to create dynamic tags.
Further, step S4 of the tag management method of the present invention includes:
step S41: establishing dynamic tag creation rules;
step S42: the front end assembling an expression in JSON format, and the back end converting the JSON into an object and checking whether the expression satisfies the dynamic tag creation rules;
step S43: performing set operations on static tags in combination with SQL operators to create the dynamic tag, by parsing the expression, constructing the tagging SQL of the dynamic tag and executing it;
step S44: checking the execution state of the tagging SQL of the dynamic tag and generating the value distribution of the dynamic tag.
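Steps S42 and S43 can be sketched as a small validator and translator. The JSON shape (`logic`/`children` for groups, `tag`/`op`/`value` for conditions), the allowed operators and all names below are assumptions made for illustration; the patent does not specify the expression format.

```python
import json

ALLOWED_OPS = {"=", "!=", ">", ">=", "<", "<="}
ALLOWED_LOGIC = {"AND", "OR"}


def to_condition(node, known_tags):
    """Validate an expression node against the rules and render it as SQL."""
    if "logic" in node:
        logic = str(node["logic"]).upper()
        children = node.get("children") or []
        if logic not in ALLOWED_LOGIC or not children:
            raise ValueError("invalid logic group")
        parts = [to_condition(child, known_tags) for child in children]
        return "(" + f" {logic} ".join(parts) + ")"
    tag, op, value = node["tag"], node["op"], node["value"]
    if tag not in known_tags:
        raise ValueError(f"unknown static tag: {tag}")
    if op not in ALLOWED_OPS:
        raise ValueError(f"operator not allowed: {op}")
    if isinstance(value, str):
        # Sketch only: reject quote and backslash instead of escaping them.
        if "'" in value or "\\" in value:
            raise ValueError("unsupported characters in value")
        literal = f"'{value}'"
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        literal = str(value)
    else:
        raise ValueError("unsupported value type")
    return f"`{tag}` {op} {literal}"


def build_tagging_sql(expression_json, static_table, key, known_tags):
    """Convert the front-end JSON into an object and build the tagging SQL."""
    expression = json.loads(expression_json)
    condition = to_condition(expression, known_tags)
    return f"SELECT `{key}` FROM {static_table} WHERE {condition}"


expr = json.dumps({
    "logic": "AND",
    "children": [
        {"tag": "age", "op": ">=", "value": 18},
        {"tag": "city", "op": "=", "value": "Beijing"},
    ],
})
sql = build_tagging_sql(expr, "user_static_tag", "user_id", {"age", "city"})
print(sql)
```

Checking tag names and operators against whitelists before any SQL is assembled is the part that corresponds to verifying the expression against the creation rules; a production system would more likely pass values as query parameters than inline them.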
Further, step S4 of the tag management method of the present invention further includes: creating a new dynamic tag by dragging dynamic tags and static tags and combining them with judgment conditions and filtering conditions.
Further, step S5 of the tag management method of the present invention includes: generating, by drag and drop, an API interface for third parties to call, using the dynamic tags or static tags under the subject object as input parameters or output parameters.
Further, step S5 of the tag management method of the present invention further includes: dragging the subject object onto an SQL editing page, which automatically brings up the static tag storage table, the dynamic tag storage table and their fields corresponding to the subject object, and writing SQL to generate an API interface for third-party platforms to call.
In another aspect, the present invention provides a tag management apparatus, including:
a tag storage table creating module, configured to create a subject object, with the back end creating corresponding tag storage tables in a ClickHouse warehouse, wherein a static tag storage table stores data corresponding to basic attributes of objects, and a dynamic tag storage table stores tags that are obtained by computing over and aggregating static tags and that can describe a batch of objects;
a data synchronization module, configured to acquire and store an external data source and select a data table from it as the tag data source table; acquire the field information of the tag data source table and map it to ClickHouse field types; assemble a table creation statement from the fields and execute it to create the target table; and assemble the tag data source table and the target table into FlinkX task execution parameters and submit them so that the data is synchronized into ClickHouse;
a static tag creating module, configured to select one or more columns of a data source table that has been added as the tag data source; add a static tag column to the static tag storage table; assemble the tagging SQL statement; copy the structure of the static tag storage table to create a new table; execute the tagging SQL statement to write the tag data into the new table, and delete the original static tag storage table that was copied; and rename the new table to the name of the static tag storage table that was copied;
a dynamic tag creating module, configured to establish dynamic tag creation rules; have the front end assemble an expression in JSON format and the back end convert the JSON into an object and check whether the expression satisfies the dynamic tag creation rules; perform set operations on static tags in combination with SQL operators to create the dynamic tag, by parsing the expression, constructing the tagging SQL of the dynamic tag and executing it; and check the execution state of the tagging SQL of the dynamic tag and generate the value distribution of the dynamic tag; the dynamic tag creating module is further configured to create a new dynamic tag by dragging dynamic tags and static tags and combining them with judgment conditions and filtering conditions;
a tag publishing module, configured to generate, by drag and drop, an API interface for third parties to call, using the dynamic tags or static tags under the subject object as input parameters or output parameters; and further configured so that dragging the subject object onto an SQL editing page automatically brings up the static tag storage table, the dynamic tag storage table and their fields corresponding to the subject object, whereupon SQL is written to generate an API interface for third-party platforms to call.
The label management method and the label management device have the following beneficial effects:
1) Tag data is built in parallel, which increases the production speed of tag data;
2) HDFS files are imported into ClickHouse concurrently, which shortens the time until tag data is ready;
3) The average response time of query requests is reduced;
4) Near-real-time updating of tag data is supported;
5) Tag expressions and query SQL are user-friendly, which improves the maintainability of the system;
6) Tagging and querying are both processed in ClickHouse, which can save half of the hardware resources.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a tag management method according to an exemplary first embodiment of the present invention.
Fig. 2 is a flowchart of a tag management method according to an exemplary second embodiment of the present invention.
Fig. 3 is a flowchart of a tag management method according to an exemplary third embodiment of the present invention.
Fig. 4 is a flowchart of a tag management method according to an exemplary fourth embodiment of the present invention.
Fig. 5 is an architecture diagram of a tag management device according to an exemplary eighth embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be noted that, in the case of no conflict, the features in the following embodiments and examples may be combined with each other; moreover, all other embodiments that can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort fall within the scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
Fig. 1 is a flowchart of a tag management method according to an exemplary first embodiment of the present invention, and as shown in fig. 1, the method of this embodiment includes:
step S1: creating a subject object, with the back end creating a corresponding tag storage table in a ClickHouse warehouse;
step S2: acquiring an external data source, mapping the database table structure and the data to be tagged into ClickHouse through FlinkX, and associating them with the subject object;
step S3: creating a static tag under the subject object;
step S4: creating a dynamic tag under the subject object through SQL or by drag and drop;
step S5: publishing the dynamic tag and the static tag under the subject object as an API interface.
In step S1 of the method of this embodiment, the tag storage table includes a static tag storage table and a dynamic tag storage table, where the static tag storage table is used to store data corresponding to basic attributes of objects, and the dynamic tag storage table is used to store tags that can describe a batch of objects and are obtained through static tag calculation and aggregation.
Fig. 2 is a flowchart of a label management method according to an exemplary second embodiment of the present invention, where this embodiment is a preferred embodiment of the method shown in fig. 1, and as shown in fig. 2, step S2 of the method of this embodiment includes:
step S21: acquiring and storing an external data source, and selecting a data table from the external data source as the tag data source table;
step S22: acquiring the field information of the tag data source table and mapping it to ClickHouse field types;
step S23: assembling a table creation statement from the fields and executing it to create the target table;
step S24: assembling the tag data source table and the target table into FlinkX task execution parameters, and submitting them so that the data is synchronized into ClickHouse.
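Step S24 amounts to packing the source table and the target table into one task description. The sketch below shows only the general idea: the reader/writer layout is modeled on DataX-style job descriptions, and every key name, plugin name and connection value is an assumption for illustration rather than a verified FlinkX schema or part of the patent.

```python
import json


def build_sync_job(source, target, columns, channels=2):
    """Assemble hypothetical task execution parameters for one table sync."""
    return {
        "job": {
            "content": [{
                "reader": {
                    "name": source["plugin"],  # e.g. a JDBC reader plugin
                    "parameter": {
                        "username": source["username"],
                        "password": source["password"],
                        "column": columns,
                        "connection": [{
                            "jdbcUrl": [source["jdbc_url"]],
                            "table": [source["table"]],
                        }],
                    },
                },
                "writer": {
                    "name": target["plugin"],  # e.g. a ClickHouse writer plugin
                    "parameter": {
                        "username": target["username"],
                        "password": target["password"],
                        "column": columns,
                        "connection": [{
                            "jdbcUrl": target["jdbc_url"],
                            "table": [target["table"]],
                        }],
                    },
                },
            }],
            # Assumption: parallelism is expressed as a channel count.
            "setting": {"speed": {"channel": channels}},
        }
    }


job = build_sync_job(
    source={"plugin": "mysqlreader", "username": "reader", "password": "***",
            "jdbc_url": "jdbc:mysql://source-host:3306/crm", "table": "user"},
    target={"plugin": "clickhousewriter", "username": "writer", "password": "***",
            "jdbc_url": "jdbc:clickhouse://ch-host:8123/tags",
            "table": "tag_source_user"},
    columns=["user_id", "city", "age"],
)
job_json = json.dumps(job, indent=2)
print(job_json)
```

The serialized description would then be handed to the synchronization engine; how the task is actually submitted is not described in the text and is left out here.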
Fig. 3 is a flowchart of a label management method according to an exemplary third embodiment of the present invention, where this embodiment is a preferred embodiment of the method shown in fig. 1, and as shown in fig. 3, step S3 of the method of this embodiment includes:
step S31: selecting one or more columns of a data source table that has been added as the tag data source;
step S32: adding a static tag column to the static tag storage table;
step S33: assembling the tagging SQL statement;
step S34: copying the structure of the static tag storage table to create a new table;
step S35: executing the tagging SQL statement to write the tag data into the new table, and deleting the original static tag storage table that was copied;
step S36: renaming the new table to the name of the static tag storage table that was copied.
Fig. 4 is a flowchart of a tag management method according to an exemplary fourth embodiment of the present invention, which is a preferred embodiment of the method shown in Fig. 1. As shown in Fig. 4, step S4 of the method of this embodiment includes: performing set operations on static tags in combination with SQL operators to create dynamic tags.
Specifically, step S4 of the method of this embodiment includes:
step S41: establishing dynamic tag creation rules;
step S42: the front end assembling an expression in JSON format, and the back end converting the JSON into an object and checking whether the expression satisfies the dynamic tag creation rules;
step S43: performing set operations on static tags in combination with SQL operators to create the dynamic tag, by parsing the expression, constructing the tagging SQL of the dynamic tag and executing it;
step S44: checking the execution state of the tagging SQL of the dynamic tag and generating the value distribution of the dynamic tag.
An exemplary fifth embodiment of the present invention provides a tag management method, which is a preferred embodiment of the method shown in Fig. 1. Step S4 of the method of this embodiment further includes: creating a new dynamic tag by dragging dynamic tags and static tags and combining them with judgment conditions and filtering conditions. With the tag management method of this embodiment, tags can be dragged in a visual editing interface, and join conditions and/or judgment conditions, such as magnitude comparisons, can be set graphically to assemble a new tag.
An exemplary sixth embodiment of the present invention provides a tag management method, which is a preferred embodiment of the method shown in Fig. 1. Step S5 of the method of this embodiment includes: generating, by drag and drop, an API interface for third parties to call, using the dynamic tags or static tags under the subject object as input parameters or output parameters.
An exemplary seventh embodiment of the present invention provides a tag management method, which is a preferred embodiment of the method shown in Fig. 1. Step S5 of the method of this embodiment includes: dragging the subject object onto an SQL editing page, which automatically brings up the static tag storage table, the dynamic tag storage table and their fields corresponding to the subject object, and writing SQL to generate an API interface for third-party platforms to call.
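One way to read the sixth embodiment is that tags chosen as input parameters become filter conditions and tags chosen as output parameters become selected columns of the query behind the API. The sketch below illustrates that reading only; the function name, the table name and the pyformat-style `%(name)s` placeholders are assumptions, not details given in the text.

```python
def build_api_query(table, input_tags, output_tags):
    """Build a parameterized query from tags used as API inputs and outputs."""
    if not output_tags:
        raise ValueError("an API needs at least one output tag")
    select_list = ", ".join(f"`{tag}`" for tag in output_tags)
    sql = f"SELECT {select_list} FROM {table}"
    if input_tags:
        # Each input tag becomes a named placeholder filled in per API call.
        filters = " AND ".join(f"`{tag}` = %({tag})s" for tag in input_tags)
        sql += f" WHERE {filters}"
    return sql


query = build_api_query(
    "user_static_tag",
    input_tags=["city", "age"],
    output_tags=["user_id", "member_level"],
)
print(query)
```

Keeping the caller's values out of the SQL text and passing them as named parameters is a design choice of this sketch; it keeps the generated API query reusable across calls.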
Fig. 5 is an architecture diagram of a tag management apparatus according to an exemplary eighth embodiment of the present invention, and as shown in fig. 5, the tag management apparatus of this embodiment includes:
a tag storage table creating module, configured to create a subject object, with the back end creating corresponding tag storage tables in a ClickHouse warehouse, wherein a static tag storage table stores data corresponding to basic attributes of objects, and a dynamic tag storage table stores tags that are obtained by computing over and aggregating static tags and that can describe a batch of objects;
a data synchronization module, configured to acquire and store an external data source and select a data table from it as the tag data source table; acquire the field information of the tag data source table and map it to ClickHouse field types; assemble a table creation statement from the fields and execute it to create the target table; and assemble the tag data source table and the target table into FlinkX task execution parameters and submit them so that the data is synchronized into ClickHouse;
a static tag creating module, configured to select one or more columns of a data source table that has been added as the tag data source; add a static tag column to the static tag storage table; assemble the tagging SQL statement; copy the structure of the static tag storage table to create a new table; execute the tagging SQL statement to write the tag data into the new table, and delete the original static tag storage table that was copied; and rename the new table to the name of the static tag storage table that was copied;
a dynamic tag creating module, configured to establish dynamic tag creation rules; have the front end assemble an expression in JSON format and the back end convert the JSON into an object and check whether the expression satisfies the dynamic tag creation rules; perform set operations on static tags in combination with SQL operators to create the dynamic tag, by parsing the expression, constructing the tagging SQL of the dynamic tag and executing it; and check the execution state of the tagging SQL of the dynamic tag and generate the value distribution of the dynamic tag; the dynamic tag creating module is further configured to create a new dynamic tag by dragging dynamic tags and static tags and combining them with judgment conditions and filtering conditions;
a tag publishing module, configured to generate, by drag and drop, an API interface for third parties to call, using the dynamic tags or static tags under the subject object as input parameters or output parameters; and further configured so that dragging the subject object onto an SQL editing page automatically brings up the static tag storage table, the dynamic tag storage table and their fields corresponding to the subject object, whereupon SQL is written to generate an API interface for third-party platforms to call.
In practical applications, the tag management apparatus of this embodiment has the following features:
1) Tag data are built in parallel, which speeds up tag data production and avoids the drawback that data synchronization has to queue once a certain number is exceeded.
2) HDFS files are imported into ClickHouse concurrently, and tagging is performed directly on HDFS data associated through a ClickHouse table engine, which shortens the time until the tag data are ready.
3) With single-table and multi-table join queries, the average response time of a query request is under 2 seconds, and that of a complex query is under 5 seconds.
4) Near-real-time updating of tag data is supported: tagging and querying are performed directly in ClickHouse, and API results are generated in real time. This overcomes the prior-art drawback that tagged data could be queried through the API interface only on the following day, after being synchronized to ES.
5) The tag expressions and query SQL are user-friendly, which improves the maintainability of the system. With columnar storage and logical computation based on bitmap intersection, union, and difference, the processing scripts are simple and efficient. This overcomes the prior-art drawback that the back-end query SQL required complex row-to-column conversions.
6) Tagging and querying are both handled in ClickHouse. Compared with storing the data in Hive first and publishing an API externally only after synchronizing to ES, this can save half of the hardware resources.
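The bitmap-based intersection, union, and difference mentioned in feature 5 can be illustrated with a small Python sketch. It uses plain Python integers as bitmaps (bit i set means that the object with ID i carries the tag). The tag names and object IDs are invented for the example, and the text does not specify which ClickHouse facilities the embodiment relies on.

```python
def to_bitmap(object_ids):
    """Encode a collection of non-negative object IDs as an integer bitmap."""
    bitmap = 0
    for object_id in object_ids:
        bitmap |= 1 << object_id
    return bitmap


def to_ids(bitmap):
    """Decode an integer bitmap back into a sorted list of object IDs."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids


# Hypothetical static tags, each stored as one bitmap of object IDs.
female = to_bitmap([1, 3, 4, 7])
age_over_30 = to_bitmap([3, 4, 5, 9])
vip = to_bitmap([4, 9])

both = female & age_over_30          # intersection: female AND over 30
either = female | age_over_30        # union: female OR over 30
female_not_vip = female & ~vip       # difference: female but not VIP

print(to_ids(both), to_ids(either), to_ids(female_not_vip))
```

Because each tag value is a single bitmap, a combined tag reduces to a few bitwise operations, with no need to pivot rows into columns first.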
The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope of the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
Claims (10)
1. A tag management method, characterized in that the tag management method comprises:
step S1: creating a subject object, with the back end correspondingly creating a tag storage table in a ClickHouse warehouse;
step S2: acquiring an external data source, mapping the database table structure and the data to be tagged into ClickHouse through FlinkX, and associating them with the subject object;
step S3: creating a static tag under the subject object;
step S4: creating a dynamic tag under the subject object through SQL or by dragging;
step S5: publishing the dynamic tag and the static tag under the subject object as an API interface.
2. The tag management method according to claim 1, wherein in step S1, the tag storage table includes a static tag storage table and a dynamic tag storage table, wherein the static tag storage table is used for storing data corresponding to basic attributes of the objects, and the dynamic tag storage table is used for storing tags that are obtained by static tag calculation and aggregation and can describe a batch of objects.
3. The tag management method according to claim 1, wherein step S2 comprises:
step S21: acquiring and storing an external data source, and selecting a data table from the external data source as the tag data source table;
step S22: acquiring the field information of the tag data source table and mapping it to ClickHouse field types;
step S23: constructing a table-creation statement from the fields and executing it to create the target table;
step S24: assembling the tag data source table and the target table into FlinkX task execution parameters and submitting them for data synchronization into ClickHouse.
4. The tag management method according to claim 1, wherein step S3 comprises:
step S31: selecting one or more columns of an added data source table as the tag data source;
step S32: adding a static tag column to the static tag storage table;
step S33: assembling the tagging SQL statement;
step S34: copying the structure of the static tag storage table to create a new table;
step S35: executing the tagging SQL statement to write the tag data into the new table, and deleting the static tag storage table that was copied;
step S36: renaming the new table to the name of the static tag storage table that was copied.
5. The tag management method according to claim 1, wherein step S4 comprises: performing set operations on the static tags in combination with SQL operators to create dynamic tags.
6. The tag management method according to claim 5, wherein step S4 comprises:
step S41: establishing a dynamic tag creation rule;
step S42: the front end assembling an expression in JSON format, and the back end converting the JSON into an object and checking whether the expression satisfies the dynamic tag creation rule;
step S43: performing set operations on the static tags in combination with SQL operators to create dynamic tags; parsing the expression, and constructing and executing the tagging SQL of the dynamic tag;
step S44: checking the execution state of the tagging SQL of the dynamic tag, and generating the value range distribution of the dynamic tag.
7. The tag management method according to claim 1, wherein step S4 further comprises: creating a new dynamic tag by dragging dynamic tags and static tags and combining them with judgment conditions and filter conditions.
8. The tag management method according to claim 1, wherein step S5 comprises: generating, by dragging, an API interface for third parties to call, with the dynamic tags or static tags under the subject object serving as input or output parameters.
9. The tag management method according to claim 1, wherein step S5 further comprises: dragging the subject object onto the SQL editing page, which automatically brings up the static tag storage table, the dynamic tag storage table, and the fields corresponding to the subject object, and writing SQL to generate an API interface for a third-party platform to call.
10. A tag management apparatus, characterized in that the tag management apparatus comprises:
the tag storage table creation module is used for creating a subject object, with the back end correspondingly creating tag storage tables in a ClickHouse warehouse, wherein a static tag storage table is used for storing data corresponding to basic attributes of the objects, and a dynamic tag storage table is used for storing tags that are obtained by computing and aggregating static tags and can describe a batch of objects;
the data synchronization module is used for acquiring and storing an external data source and selecting a data table from the external data source as the tag data source table; acquiring the field information of the tag data source table and mapping it to ClickHouse field types; constructing a table-creation statement from the fields and executing it to create the target table; and assembling the tag data source table and the target table into FlinkX task execution parameters and submitting them for data synchronization into ClickHouse;
the static tag creation module is used for selecting one or more columns of an added data source table as the tag data source; adding a static tag column to the static tag storage table; assembling the tagging SQL statement; copying the structure of the static tag storage table to create a new table; executing the tagging SQL statement to write the tag data into the new table, and deleting the static tag storage table that was copied; and renaming the new table to the name of the static tag storage table that was copied;
the dynamic tag creation module is used for establishing a dynamic tag creation rule; the front end assembles an expression in JSON format, and the back end converts the JSON into an object and checks whether the expression satisfies the dynamic tag creation rule; set operations are performed on the static tags in combination with SQL operators to create dynamic tags; the expression is parsed, and the tagging SQL of the dynamic tag is constructed and executed; the execution state of the tagging SQL is checked, and the value range distribution of the dynamic tag is generated; the dynamic tag creation module is also used for creating a new dynamic tag by dragging dynamic tags and static tags and combining them with judgment conditions and filter conditions;
the tag publishing module is used for generating, by dragging, an API interface for third parties to call, with the dynamic tags or static tags under the subject object serving as input or output parameters; it is also used for dragging the subject object onto the SQL editing page, which automatically brings up the static tag storage table, the dynamic tag storage table, and the fields corresponding to the subject object, so that SQL can be written to generate an API interface for a third-party platform to call.
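The static tag creation sequence recited above (add a column, copy the table structure, fill the copy, drop the original, rename the copy) can be sketched as an ordered list of ClickHouse statements. The Python helper below is only an illustration under stated assumptions: the function name, the `_new` suffix, the LEFT JOIN on a shared key column, and every table and column name are hypothetical, since the claims give no concrete SQL. Identifiers are assumed to be trusted and are not escaped.

```python
def static_tag_swap_statements(static_table, source_table, key,
                               existing_columns, tag_column, tag_type,
                               source_column):
    """Return the SQL statements for one static-tag creation run, in order.

    existing_columns lists the columns already present in the static tag
    storage table (including the key); tag_column is the new tag column.
    """
    new_table = f"{static_table}_new"          # hypothetical naming scheme
    kept = ", ".join(f"s.{col}" for col in existing_columns)
    targets = ", ".join([*existing_columns, tag_column])

    # S33: assemble the tagging SQL (assumed shape: join source on the key).
    tagging_sql = (
        f"INSERT INTO {new_table} ({targets}) "
        f"SELECT {kept}, d.{source_column} "
        f"FROM {static_table} AS s "
        f"LEFT JOIN {source_table} AS d ON s.{key} = d.{key}"
    )

    return [
        # S32: add the static tag column to the static tag storage table.
        f"ALTER TABLE {static_table} ADD COLUMN {tag_column} {tag_type}",
        # S34: copy the table structure into a new, empty table.
        f"CREATE TABLE {new_table} AS {static_table}",
        # S35 (first half): run the tagging SQL, filling the new table.
        tagging_sql,
        # S35 (second half): delete the table that was copied.
        f"DROP TABLE {static_table}",
        # S36: give the new table the original name.
        f"RENAME TABLE {new_table} TO {static_table}",
    ]


for statement in static_tag_swap_statements(
        "user_static", "crm_users", "object_id",
        ["object_id", "gender"], "city", "String", "city_name"):
    print(statement)
```

Writing into a copy and renaming it at the end presumably keeps the visible table complete until the final swap, rather than exposing a partly filled column while the tagging SQL runs.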
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210847895.7A CN115168361A (en) | 2022-07-19 | 2022-07-19 | Label management method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115168361A (en) | 2022-10-11 |
Family
ID=83495649
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210847895.7A Pending CN115168361A (en) | 2022-07-19 | 2022-07-19 | Label management method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115168361A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117331513A (en) * | 2023-12-01 | 2024-01-02 | 蒲惠智造科技股份有限公司 | Data reduction method and system based on Hadoop architecture |
CN117331513B (en) * | 2023-12-01 | 2024-03-19 | 蒲惠智造科技股份有限公司 | Data reduction method and system based on Hadoop architecture |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109101652B (en) | Label creating and managing system | |
CN108027818B (en) | Inquiry based on figure | |
CN106547809B (en) | Representing compound relationships in a graph database | |
JP5008878B2 (en) | Mapping file system models to database objects | |
US9026901B2 (en) | Viewing annotations across multiple applications | |
CN111966677B (en) | Data report processing method and device, electronic equipment and storage medium | |
CN111506621B (en) | Data statistical method and device | |
CN113312392A (en) | Lightweight rule engine processing method and device | |
CN104035754A (en) | XML (Extensible Markup Language)-based custom code generation method and generator | |
CN106294695A (en) | A kind of implementation method towards the biggest data search engine | |
Kongdenfha et al. | Rapid development of spreadsheet-based web mashups | |
CN101488086A (en) | Software generation method and apparatus based on field model | |
CN116361487A (en) | Multi-source heterogeneous policy knowledge graph construction and storage method and system | |
CN115033646B (en) | Method for constructing real-time warehouse system based on Flink and Doris | |
CN104199978A (en) | System and method for realizing metadata cache and analysis based on NoSQL and method | |
CN110737729A (en) | Engineering map data information management method based on knowledge map concept and technology | |
CN116795859A (en) | Data analysis method, device, computer equipment and storage medium | |
CN117520514A (en) | Question-answering task processing method, device, equipment and readable storage medium | |
CN114820080A (en) | User grouping method, system, device and medium based on crowd circulation | |
CN110232028A (en) | A kind of test exemple automation operation method and system | |
CN115168361A (en) | Label management method and device | |
CN112527918B (en) | Data processing method and device | |
CN111125045B (en) | Lightweight ETL processing platform | |
CN115952203B (en) | Data query method, device, system and storage medium | |
CN114895875B (en) | Zero-code visual information system metadata production application method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||