
CN118227656A - Query method and device based on data lake - Google Patents

Query method and device based on data lake Download PDF

Info

Publication number
CN118227656A
CN118227656A
Authority
CN
China
Prior art keywords
query
processing diagram
data
technology
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410650121.4A
Other languages
Chinese (zh)
Other versions
CN118227656B
Inventor
陈刚
陈纯
伍赛
赵俊博
张东祥
唐秀
宋明黎
高云君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202410650121.4A priority Critical patent/CN118227656B/en
Publication of CN118227656A publication Critical patent/CN118227656A/en
Application granted granted Critical
Publication of CN118227656B publication Critical patent/CN118227656B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • G06F16/244Grouping and aggregation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a query method and device based on a data lake, comprising the following steps: sampling the target data set to be queried according to user input to obtain schema information M and data sample information, so as to construct a query; decomposing the query into a plurality of subtasks, thereby constructing a processing graph; correcting the processing graph, and optimizing the corrected processing graph with a Shuffle technique and/or a Collapse technique in combination with a cost model; and generating code from the optimized processing graph and executing it to output the user's query result. The invention needs no mediation schema, which simplifies the query process, and needs no data conversion or loading, which simplifies operation and improves query efficiency overall. At the level of query details, a query optimizer for LLM-generated code is designed, which greatly improves the execution efficiency of the LLM-generated code and the interpretability of the corresponding method; in particular, correcting the processing graph helps the LLM improve query accuracy, so that the accuracy of the whole natural-language query task is higher than that of traditional methods.

Description

Query method and device based on data lake
Technical Field
The invention relates to the technical field of data management, in particular to a query method and device based on a data lake.
Background
A data lake is designed as a comprehensive storage hub intended to hold, manage, and protect large amounts of data regardless of its structure. Whether the data is meticulously organized, semi-structured, or completely unstructured, a data lake can keep it in its original format, and it handles a wide range of data types without size limitations. However, providing data lake services requires complex engineering effort. Conventional data lakes generally adopt one of two approaches; in both, it is necessary to define a unified mediation schema and establish a mapping between each source schema and this central schema. In the query-driven approach, any query against the mediation schema is converted, according to these mappings, into a corresponding query against each source dataset; these queries are then processed individually by the source systems, and the data lake combines the results into a coherent output. In the data-driven approach, by contrast, all incoming data is transformed to conform to the mediation schema and then loaded into a host system inside the data lake; the host system is responsible for processing all queries, providing a centralized data processing solution.
Implementing a data lake system faces three main challenges:
1. Schema definition requires expert knowledge. Formulating an appropriate mediation schema requires extensive domain expertise and comprehensive knowledge of all source data; such insight is not always attainable.
2. Schema mapping is complex. Even with semi-automatic tools, establishing the schema mapping is a complex and labor-intensive task that often incurs significant processing overhead and requires substantial human intervention.
3. Data conversion is a dilemma. The challenge is twofold, depending on the approach chosen. In the data-driven approach, data conversion and loading are resource-intensive and expensive. In the query-driven approach, on the other hand, query rewriting and result merging introduce complexity, making it hard to ensure the correctness and traceability of query results.
Disclosure of Invention
The invention aims to provide a query method and device based on a data lake to address the problems of difficult schema definition, complex schema mapping, and difficult data conversion in the current multi-model database field.
The aim of the invention is achieved by the following technical scheme: a data lake-based query method, comprising:
(1) sampling the target data set to be queried according to user input to obtain schema information M and data sample information S, so as to construct a query Q;
(2) decomposing the query Q into a plurality of subtasks, thereby constructing a processing graph G;
(3) correcting the processing graph G, and optimizing the corrected processing graph G with a Shuffle technique and/or a Collapse technique in combination with a cost model;
(4) generating code from the optimized processing graph and executing it to output the user's query result.
Further, data sampling is performed on the target data set to obtain the most representative data samples; the core logic of the data sampling is data feature statistics and null-value removal. In addition, connectable tuples are retrieved from associated data by following primary/foreign key relationships, and the data association information is integrated and added to the schema information M, finally forming the query Q.
Further, decomposing the query Q into a number of subtasks includes:
decomposing the query Q into several subtasks using a large model.
Further, the large model is trained as follows:
subtasks and the corresponding operators are used as the training set for training the large model;
the trained large model is used to translate the subtasks into corresponding operators.
Further, correcting the processing graph G includes: form correction, logic correction, and global correction. Form correction checks the table names and column names referenced by the subtask descriptions in the processing graph; logic correction checks the conditions in the subtask descriptions; global correction checks the relationships between subtasks.
Optimizing the corrected processing graph G with a Shuffle technique and/or a Collapse technique in combination with a cost model comprises:
using the Shuffle technique, in combination with the cost model, to change the order of operators in the processing graph G so as to optimize the query plan;
using the Collapse technique, in combination with the cost model, to iteratively compute the cost benefit of merging multiple operators, so as to optimize the query plan;
wherein the operators are translated from the subtasks.
Further, the method further comprises: repeatedly executing step (2) and step (3) to obtain a plurality of optimized processing graphs, and selecting the optimal processing graph for executing step (4).
Further, the optimal processing graph is selected from the plurality of optimized processing graphs by the cost model.
Further, the method further comprises: after the code is generated, performing code verification; if verification fails, step (4) is re-executed, and if verification still fails, steps (2)-(4) are re-executed.
Further, the code verification includes:
performing operator-level verification according to the generated code and the corresponding optimized processing graph.
The invention also provides a query device based on a data lake, comprising:
a preprocessing module for sampling the target data set to be queried according to user input to obtain schema information M and data sample information S, so as to construct a query Q;
a processing graph construction module for decomposing the query Q into a plurality of subtasks, so as to construct a processing graph G;
a processing graph optimization module for correcting the processing graph G, and optimizing the corrected processing graph G with a Shuffle technique and/or a Collapse technique in combination with a cost model;
and a code generation and execution module for generating code from the optimized processing graph and executing it, so as to output the user's query result.
Compared with the prior art, the invention has the following beneficial effects: compared with existing data lake query methods, the method needs no mediation schema, which simplifies the query process, and needs no data conversion or loading, which simplifies operation and improves query efficiency overall.
At the level of query details, a query optimizer for LLM-generated code is designed, which greatly improves the execution efficiency of the LLM-generated code and the interpretability of the corresponding method; in particular, correcting the processing graph helps the LLM greatly improve query accuracy, so that the accuracy of the whole natural-language query task is higher than that of traditional methods.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
FIG. 1 is a schematic flow chart of a query method based on a data lake according to an embodiment of the present invention;
FIG. 2 is a diagram of an exemplary query optimization provided by an embodiment of the present invention;
FIG. 3 is a functional block diagram of the present invention.
Detailed Description
The present invention will be described in detail with reference to the accompanying drawings. The features of the examples and embodiments described below may be combined with each other without conflict.
The data lake-based query method, as shown in FIG. 1, comprises the following steps:
(1) sampling the target data set to be queried according to user input to obtain schema information M and data sample information S, so as to construct a query Q;
For example, when a user submits a query task against the target dataset, such as "query the total number of teachers in the school", the relevant tables of the whole database are scanned (simply filtered by a keyword-based similarity comparison) to obtain the column names, the types, and the primary/foreign key relationship of each column, collectively called the schema information M; data scanning and sampling are then performed to obtain representative statistics (e.g., a score interval of 1-5), the main high-frequency values, and the like, collectively called the data sample information S.
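As a purely illustrative, non-limiting sketch (not part of the claimed method; the table, column, and helper names are assumptions), the schema information M and the data sample information S for one table could be collected in Python with pandas roughly as follows, following the feature-statistics and null-value-removal logic described above:

import pandas as pd

def build_schema_and_samples(df: pd.DataFrame, table_name: str, top_k: int = 3):
    # Schema information M: table name plus each column's name and type;
    # primary/foreign key relations would be appended from catalog metadata.
    schema = {
        "table": table_name,
        "columns": [{"name": c, "dtype": str(df[c].dtype)} for c in df.columns],
    }
    # Data sample information S: remove null values, then keep representative
    # statistics (e.g. value ranges) and the main high-frequency values.
    clean = df.dropna()
    samples = {}
    for c in clean.columns:
        col = clean[c]
        stats = {"top_values": col.value_counts().head(top_k).index.tolist()}
        if pd.api.types.is_numeric_dtype(col):
            stats["range"] = (col.min(), col.max())
        samples[c] = stats
    return schema, samples

# Toy example for the "total number of teachers" query.
teachers = pd.DataFrame({"name": ["Li", "Wang", "Zhao"], "score": [4, 5, 3]})
M, S = build_schema_and_samples(teachers, "teacher")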
(2) decomposing the query Q into a plurality of subtasks, thereby constructing a processing graph G;
Specifically, the query Q is decomposed into a number of subtasks J_i using a large language model (LLM) to construct the processing graph G. The large model is pre-trained as follows: the LLM is trained with the operator list of an operator database (including detailed usage) and query processing examples, and the trained LLM is used to convert an input query Q into a processing graph G composed of a plurality of subtasks.
Each operation uses an operator together with its description information. Specifically, like an execution plan in a database, the whole query logic is decomposed into several sub-operations, each of which describes the specific conditions of the operation; for example, a filter operator contains the exact conditions for filtering the data. A complete query task may consist of read (read the data set), filter (filter condition), select (pick the target columns), write (write the result to a file), and various other query operations.
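As a non-limiting illustration (the concrete encoding below is an assumption, not prescribed by the invention), the processing graph G for the query "query the total number of teachers in the school" could be represented as operator nodes carrying explicit conditions, connected by edges that fix the execution order:

# Hypothetical encoding of a processing graph G: each node is one subtask J_i,
# carrying its operator and the exact conditions that operator works under.
processing_graph = {
    "nodes": [
        {"id": 1, "op": "read",   "args": {"table": "teacher"}},
        {"id": 2, "op": "filter", "args": {"condition": "status == 'active'"}},
        {"id": 3, "op": "select", "args": {"columns": ["id"]}},
        {"id": 4, "op": "count",  "args": {}},
        {"id": 5, "op": "write",  "args": {"target": "result.csv"}},
    ],
    # Edges give the execution order; node 1 has no predecessor.
    "edges": [(1, 2), (2, 3), (3, 4), (4, 5)],
}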
In another embodiment, the method further comprises: building an offline operator database containing typical query pairs (J_i, G_i) derived from a series of typical natural-language queries (i.e., user inputs), where G_i denotes a processing graph built on the basis of subtasks J_i. The processing graphs come from a pre-collected set of query examples, which essentially contain combined examples of the various operators, and the database is updated based on the user input in step (1) and the new processing graph built in step (2).
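A minimal sketch of such an offline operator database, assuming a simple in-memory mapping from a natural-language query to its pair (J_i, G_i); the storage form and the update policy are illustrative assumptions:

# Hypothetical operator database: query text -> (subtask list J_i, processing graph G_i).
operator_database = {
    "query the total number of teachers in the school": (
        ["read the teacher table", "count the rows", "write the result"],
        {"nodes": ["read", "count", "write"], "edges": [(0, 1), (1, 2)]},
    ),
}

def update_operator_database(db, user_query, subtasks, graph):
    # Called after step (2): store the new example so it can serve as a
    # combined operator example for later queries.
    db[user_query] = (subtasks, graph)
    return db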
(3) correcting the processing graph G, and optimizing the corrected processing graph G with a Shuffle technique and/or a Collapse technique in combination with a cost model;
The subtasks generated by the LLM are unreliable, so correction is required to improve their accuracy. The correction mainly comprises form correction, logic correction, and global correction. Form correction checks the table names and column names referenced by the subtask descriptions in the processing graph and attempts to repair them; if they cannot be repaired, the method returns to the second step and retries. Logic correction checks the conditions in the subtask descriptions, for example whether join conditions conform to the foreign key relationships and whether read operations correctly describe data of different schemas. Global correction checks the relationships between subtasks, for example whether the target table and columns of a subsequent operation were read or generated in a previous task.
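A sketch of the form-correction check only, assuming the schema information M is a mapping from table name to column names and using fuzzy matching to attempt a repair; logic and global correction would add condition and inter-subtask checks in the same style, and all names here are illustrative:

from difflib import get_close_matches

def form_correct(subtask, schema):
    # Check the table name and column names referenced by a subtask against
    # schema information M; try to repair near-misses, return None if the
    # subtask cannot be corrected (the caller then returns to step (2)).
    table = subtask.get("table")
    if table not in schema:
        match = get_close_matches(table, list(schema), n=1)
        if not match:
            return None
        subtask["table"] = table = match[0]
    fixed = []
    for col in subtask.get("columns", []):
        if col not in schema[table]:
            match = get_close_matches(col, schema[table], n=1)
            if not match:
                return None
            col = match[0]
        fixed.append(col)
    subtask["columns"] = fixed
    return subtask

# "teachers" is repaired to "teacher" and "nam" to "name".
schema_M = {"teacher": ["id", "name", "score"]}
corrected = form_correct({"table": "teachers", "columns": ["nam"]}, schema_M)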
After correction, logic optimization is further performed on the processing graph to improve efficiency. Specifically, the Shuffle technique, in combination with the cost model, changes the order of operators in the processing graph G to optimize the query plan; the Collapse technique, in combination with the cost model, iteratively computes the cost benefit of merging multiple operators to optimize the query plan. Depending on the specific problem, one or both of the Shuffle technique and the Collapse technique are employed to optimize the processing graph G and thus the query plan.
The Shuffle technique is illustrated by the common combination of join and filter operators shown in FIG. 2. Following its own logic, the processing graph generated by the LLM will attempt to execute the join first and then the filter, but in most cases this order of operations is inefficient. The Shuffle technique first derives a potentially better state after reordering (e.g., executing the filter before the join, as is common practice), then computes the total CPU cost from information such as the data volume, the number of data operations, and the numbers of rows and columns operated on; this is called the computation cost. The two costs are compared, and when the latter is determined to be lower, the processing graph is converted to the latter form.
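A minimal sketch of this cost comparison, where the cost formula (rows processed times columns touched, summed over operators) is only a stand-in for the cost model described above, and the selectivity and fan-out figures are made up:

def estimated_cost(plan, stats):
    # Toy cost model: accumulate rows-processed * columns-touched per operator.
    total, rows = 0, stats["rows"]
    for op in plan:
        if op == "filter":
            rows = int(rows * stats["filter_selectivity"])
        elif op == "join":
            rows = rows * stats["join_fanout"]
        total += rows * stats["columns"]
    return total

def shuffle(plan, stats):
    # Try the reordered state (filter before join) and keep the cheaper plan.
    candidate = ["filter", "join"] if plan == ["join", "filter"] else plan
    return candidate if estimated_cost(candidate, stats) < estimated_cost(plan, stats) else plan

stats = {"rows": 1_000_000, "columns": 5, "filter_selectivity": 0.01, "join_fanout": 3}
better_plan = shuffle(["join", "filter"], stats)   # -> ["filter", "join"]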
The Collapse technique merges operators based on predefined merging rules, with the exact cost computed by the cost model before merging. Specifically, the Collapse technique considers two parts: subquery expansion and optimal code implementation. Subquery expansion addresses subqueries inside a query task, which can cause a large amount of repeated computation; the query needs to be expanded into a non-subquery form, and the equivalent form is obtained based on a large set of predefined subquery rules. Optimal code implementation aims at finally generating more efficient code; for example, a group operation and an aggregation operator such as max are presented separately, but if a separate piece of code were generated for each operator, the data would be repeatedly copied in memory. It is preferable to merge the two operators and finally achieve the goal with a single line of code.
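For the group-plus-aggregation case just mentioned, a pandas sketch of what the optimal-code-implementation part of Collapse aims at: instead of one code fragment per operator with an intermediate object in between, the group operator and the max aggregation are emitted as a single line. This is illustrative only and is not the code actually generated by the LLM:

import pandas as pd

scores = pd.DataFrame({"class": ["A", "A", "B"], "score": [4, 5, 3]})

# Unmerged: one fragment per operator, materializing an intermediate grouping.
grouped = scores.groupby("class")
max_separate = grouped["score"].max()

# Collapsed: the group and max operators merged into one line of code.
max_collapsed = scores.groupby("class")["score"].max()

assert max_separate.equals(max_collapsed)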
In another embodiment, the method further comprises: repeatedly executing step (2) and step (3) to obtain a plurality of optimized processing graphs, and selecting the optimal processing graph for executing step (4), where the optimal processing graph is selected from the plurality of optimized processing graphs by the cost model.
(4) generating code from the optimized processing graph and executing it to output the user's query result.
Specifically, according to the optimized processing graph or the optimal processing graph, code is generated iteratively: the generation process proceeds through the graph starting from graph vertices (operators) without a predecessor. This creates a running state s = [J', G', C'], where J' represents the tasks already completed by the existing subgraph G' and C' represents the current set of code fragments; each new operator converted to code updates the state s, guiding the subsequent code generation. Finally, the code fragments are linked into a coherent output. At each step of code generation, corresponding customized prompts are generated based on the operator category, and the optimal code implementation examples are selected.
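A sketch of this iterative traversal, where generate_code_for is a hypothetical stand-in for the LLM call that turns one operator (plus the current code fragments) into a new code fragment, and the processing graph is assumed to be acyclic:

def generate_code(graph, generate_code_for):
    # Walk the processing graph from vertices without a predecessor, updating
    # the running state s = [J', G', C'] after every operator converted to code.
    nodes = {n["id"]: n for n in graph["nodes"]}
    preds = {nid: set() for nid in nodes}
    for src, dst in graph["edges"]:
        preds[dst].add(src)

    done_tasks, done_subgraph, code_fragments = [], [], []   # J', G', C'
    while len(done_tasks) < len(nodes):
        ready = [nid for nid in nodes
                 if nid not in done_tasks and preds[nid] <= set(done_tasks)]
        for nid in ready:
            fragment = generate_code_for(nodes[nid], code_fragments)
            done_tasks.append(nid)
            done_subgraph.append(nodes[nid])
            code_fragments.append(fragment)
    # Link the code fragments into one coherent output.
    return "\n".join(code_fragments)

# Toy stand-in for the LLM: emit one comment line per operator.
program = generate_code(
    {"nodes": [{"id": 1, "op": "read"}, {"id": 2, "op": "count"}],
     "edges": [(1, 2)]},
    lambda node, current_code: f"# code for {node['op']}",
)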
In an embodiment, the method further comprises: after the code is generated, performing code verification; if verification fails, step (4) is re-executed, and if verification still fails, steps (2)-(4) are re-executed.
The code verification comprises operator-level verification according to the generated code and the corresponding optimized processing graph. Specifically, for each operator in the processing graph, the corresponding implementation is located in the code in order, which is step verification; special attention is paid to whether the operator's conditions have an accurate corresponding implementation in the code, which is logic verification. For example, the condition of a filter operator is converted into code language, and the two need to be matched; this matching is relatively loose, and when no match can be established under the preset rules, execution of the next step is still allowed in order to determine whether a valid result can be obtained. While verifying the operators, the syntax of the code is also verified; finally, the code is executed, the result is returned, and the verification result is judged according to whether user feedback indicates that it meets expectations.
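A sketch of the syntax check plus the deliberately loose operator-level check described above, assuming each operator carries its condition text and that it suffices for the condition's key identifier to appear somewhere in the generated code; the matching rule and all names are assumptions:

import ast

def verify_code(code: str, graph_nodes: list) -> bool:
    # Grammar verification: the generated code must at least parse.
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    # Step + logic verification: each operator's condition should have some
    # corresponding implementation in the code (loose textual check).
    for node in graph_nodes:
        condition = node.get("args", {}).get("condition")
        if condition and condition.split()[0] not in code:
            return False
    return True

generated = "result = df[df['score'] > 3]"
graph_nodes = [{"op": "filter", "args": {"condition": "score > 3"}}]
ok = verify_code(generated, graph_nodes)   # True under this loose rule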
The invention thus answers the query by providing the LLM with comprehensive metadata from the source dataset and with predefined data processing operators carrying detailed semantic descriptions, including correction and optimization of the processing graph and verification of the subsequently generated code. This allows the user to submit queries in natural language for better intuitiveness, completely bypasses the mediation schema, simplifies the query process, avoids data conversion and loading, and simplifies operation. At the same time, the code generation process is simplified, and the performance and the accuracy of the results are improved.
The invention also provides a query device based on a data lake, as shown in FIG. 3, comprising:
a preprocessing module for sampling the target data set to be queried according to user input to obtain schema information M and data sample information S, so as to construct a query Q;
a processing graph construction module for decomposing the query Q into a plurality of subtasks, so as to construct a processing graph G;
a processing graph optimization module for correcting the processing graph G, and optimizing the corrected processing graph G with a Shuffle technique and/or a Collapse technique in combination with a cost model;
and a code generation and execution module for generating code from the optimized processing graph and executing it, so as to output the user's query result.
It should be noted that the apparatus embodiment shown here corresponds to the content of the method embodiment; reference may be made to the method embodiment, and details are not repeated here.
The above embodiments are merely for illustrating the design concept and features of the present invention, and are intended to enable those skilled in the art to understand the content of the present invention and implement the same, the scope of the present invention is not limited to the above embodiments. Therefore, all equivalent changes or modifications according to the principles and design ideas of the present invention are within the scope of the present invention.

Claims (10)

1. A data lake-based query method, comprising:
(1) sampling the target data set to be queried according to user input to obtain schema information M and data sample information S, so as to construct a query Q;
(2) decomposing the query Q into a plurality of subtasks, thereby constructing a processing graph G;
(3) correcting the processing graph G, and optimizing the corrected processing graph G with a Shuffle technique and/or a Collapse technique in combination with a cost model;
(4) generating code from the optimized processing graph and executing it to output the user's query result.
2. The method of claim 1, wherein data sampling is performed on the target data set to obtain the most representative data samples, and the core logic is data feature statistics and null-value removal; in addition, connectable tuples are retrieved from associated data by following primary/foreign key relationships, and the data association information is integrated and added to the schema information M, finally forming the query Q.
3. The method of claim 1, wherein decomposing the query Q into a number of subtasks comprises:
decomposing the query Q into several subtasks using a large model.
4. The method according to claim 3, wherein the large model is further trained as follows:
subtasks and the corresponding operators are used as the training set for training the large model;
the trained large model is used to translate the subtasks into corresponding operators.
5. The method of claim 1, wherein correcting the processing graph G comprises: form correction, logic correction, and global correction; form correction checks the table names and column names referenced by the subtask descriptions in the processing graph; logic correction checks the conditions in the subtask descriptions; global correction checks the relationships between subtasks;
optimizing the corrected processing graph G with the Shuffle technique and/or the Collapse technique in combination with a cost model comprises:
using the Shuffle technique, in combination with the cost model, to change the order of operators in the processing graph G so as to optimize the query plan;
using the Collapse technique, in combination with the cost model, to iteratively compute the cost benefit of merging multiple operators, so as to optimize the query plan;
wherein the operators are translated from the subtasks.
6. The method as recited in claim 1, further comprising: repeatedly executing step (2) and step (3) to obtain a plurality of optimized processing graphs, and selecting the optimal processing graph for executing step (4).
7. The method of claim 6, wherein the optimal processing graph is selected from the plurality of optimized processing graphs by the cost model.
8. The method as recited in claim 1, further comprising: after the code is generated, performing code verification; if verification fails, step (4) is re-executed, and if verification still fails, steps (2)-(4) are re-executed.
9. The method of claim 8, wherein the code verification comprises:
performing operator-level verification according to the generated code and the corresponding optimized processing graph.
10. A data lake-based query device, comprising:
a preprocessing module for sampling the target data set to be queried according to user input to obtain schema information M and data sample information S, so as to construct a query Q;
a processing graph construction module for decomposing the query Q into a plurality of subtasks, so as to construct a processing graph G;
a processing graph optimization module for correcting the processing graph G, and optimizing the corrected processing graph G with a Shuffle technique and/or a Collapse technique in combination with a cost model;
and a code generation and execution module for generating code from the optimized processing graph and executing it, so as to output the user's query result.
CN202410650121.4A 2024-05-24 2024-05-24 Query method and device based on data lake Active CN118227656B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410650121.4A CN118227656B (en) 2024-05-24 2024-05-24 Query method and device based on data lake

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410650121.4A CN118227656B (en) 2024-05-24 2024-05-24 Query method and device based on data lake

Publications (2)

Publication Number Publication Date
CN118227656A true CN118227656A (en) 2024-06-21
CN118227656B CN118227656B (en) 2024-08-13

Family

ID=91509573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410650121.4A Active CN118227656B (en) 2024-05-24 2024-05-24 Query method and device based on data lake

Country Status (1)

Country Link
CN (1) CN118227656B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210365630A1 (en) * 2020-05-24 2021-11-25 Quixotic Labs Inc. Domain-specific language interpreter and interactive visual interface for rapid screening
CN117112590A (en) * 2023-05-10 2023-11-24 深圳华为云计算技术有限公司 Method for generating structural query language and data query equipment
CN116628172A (en) * 2023-07-24 2023-08-22 北京酷维在线科技有限公司 Dialogue method for multi-strategy fusion in government service field based on knowledge graph
CN117271557A (en) * 2023-09-25 2023-12-22 星环信息科技(上海)股份有限公司 SQL generation interpretation method, device, equipment and medium based on business rule
CN117667973A (en) * 2023-11-29 2024-03-08 中国电信股份有限公司技术创新中心 Data query method, device, electronic equipment and storage medium
CN118012900A (en) * 2023-12-21 2024-05-10 浙江大学 Natural language intelligent query method and device based on multi-agent interaction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
纪昌明; 马皓宇; 李传刚; 李宁宁; 俞洪杰: "Parallel dynamic programming based on feasible-region search mapping" (基于可行域搜索映射的并行动态规划), Journal of Hydraulic Engineering (水利学报), no. 06, 27 June 2018 (2018-06-27) *
赵朝阳: "Insights from ChatGPT for large language models and new development directions for multimodal large models" (ChatGPT给语言大模型带来的启示和多模态大模型新的发展思路), Data Analysis and Knowledge Discovery (数据分析与知识发现), 21 March 2023 (2023-03-21) *

Also Published As

Publication number Publication date
CN118227656B (en) 2024-08-13

Similar Documents

Publication Publication Date Title
EP2369506B1 (en) System and method of optimizing performance of schema matching
US9552335B2 (en) Expedited techniques for generating string manipulation programs
US7676453B2 (en) Partial query caching
US8719299B2 (en) Systems and methods for extraction of concepts for reuse-based schema matching
CN104137095B (en) System for evolution analysis
CN107491476B (en) Data model conversion and query analysis method suitable for various big data management systems
CN108710662B (en) Language conversion method and device, storage medium, data query system and method
CN1786950A (en) Method and system for processing abstract query
CN105718593A (en) Database query optimization method and system
CN101055566B (en) Function collection method and device of electronic data table
CN116776895A (en) Knowledge-guided large language model query clarification method and system for API recommendation
US20230126509A1 (en) Database management system and method for graph view selection for a relational-graph database
Paige Viewing a program transformation system at work
WO2011106006A1 (en) Optimization method and apparatus
CN118012900A (en) Natural language intelligent query method and device based on multi-agent interaction
CN116244333A (en) Database query performance prediction method and system based on cost factor calibration
CN118227656B (en) Query method and device based on data lake
CN115794870A (en) Query template parameter instantiation method aiming at unitary radix constraint
CN118626626B (en) Information processing method, apparatus, device, storage medium, and computer program product
CN111581047A (en) Supervision method for intelligent contract behavior
Ehrig et al. Ontology mapping by axioms (OMA)
CN114036188B (en) Method for optimizing and processing union in relational database management system
CN117390064B (en) Database query optimization method based on embeddable subgraph
CN116150187A (en) SQL optimization processing method based on relational database
CN118708607A (en) Method and system for generating automatic data operation flow based on predefined operator

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant