CN112307061A - Method and device for querying data - Google Patents
- Publication number
- CN112307061A; CN201911051553.9A
- Authority
- CN
- China
- Prior art keywords
- query
- data
- target
- query engine
- statement
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the disclosure provide a method and a device for querying data. One embodiment of the method comprises: acquiring a target statement, wherein the target statement is used for operating on data in a data table and comprises at least one table name of the data table; parsing the target statement to obtain the at least one table name contained in the target statement; determining, from a predetermined set of query engines, a query engine associated with the at least one table name as a target query engine; and, in response to the target statement being a query statement, querying the data in the data table corresponding to the target statement by using the target query engine. This implementation enriches the ways in which data can be queried and helps improve data query speed.
Description
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a method and a device for querying data.
Background
Currently, with the rapid development of big data technology, the data volume of large companies keeps accumulating, and exabyte (EB)-level data storage is very common. How to query a target result quickly and efficiently from storage of this scale is receiving more and more attention. In general, a data analyst wants queries to run as fast as possible, and also wants to freely define query dimensions and indexes and flexibly edit query statements.
In the prior art, some query functions are implemented on the Hive (data warehouse tool) engine: data may be stored in HDFS (Hadoop Distributed File System), and data queries are implemented by writing Hive SQL (Structured Query Language). Hive can map structured data files to data tables, provide a simple SQL query function, and convert SQL statements into MapReduce (a computation model, framework and platform for parallel processing of big data) tasks for execution.
Disclosure of Invention
The present disclosure presents methods and apparatus for querying data.
In a first aspect, an embodiment of the present disclosure provides a method for querying data, the method including: acquiring a target statement, wherein the target statement is used for operating on data in a data table and comprises at least one table name of the data table; parsing the target statement to obtain the at least one table name contained in the target statement; determining, from a predetermined set of query engines, a query engine associated with the at least one table name as a target query engine; and, in response to the target statement being a query statement, querying the data in the data table corresponding to the target statement by using the target query engine.
In some embodiments, the target statement contains the table names of at least two data tables; parsing the target statement to obtain the at least one table name contained in the target statement includes: parsing the target statement to obtain the at least two table names contained in the target statement; and determining, from a predetermined set of query engines, a query engine associated with the at least one table name as a target query engine includes: determining, from the predetermined set of query engines, the query engine associated with each of the at least two table names; and, in response to determining that the determined query engines are all the same query engine, taking that query engine as the target query engine.
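The same-engine check for a statement that names several tables can be sketched as follows. This is a minimal illustration rather than the disclosed implementation; the registry `TABLE_ENGINE`, the table names and the function `common_engine` are hypothetical.

```python
from typing import Dict, Iterable, Optional

# Hypothetical table-to-engine association; names are illustrative only.
TABLE_ENGINE: Dict[str, str] = {
    "app.orders": "presto",
    "app.users": "presto",
    "app.events": "kylin",
}


def common_engine(table_names: Iterable[str], registry: Dict[str, str]) -> Optional[str]:
    """Return the engine shared by every table, or None if the engines differ."""
    # Look up the engine associated with each table name and collapse duplicates.
    engines = {registry[name] for name in table_names}
    # Exactly one distinct engine means every table points at the same engine.
    return engines.pop() if len(engines) == 1 else None
```

A `None` result corresponds to the case in which at least two different query engines were found, which the following embodiments handle separately.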
In some embodiments, before querying data in a data table corresponding to the target statement with the target query engine in response to the target statement being the query statement, the method includes: in response to determining that there are at least two different query engines in the respective query engines, for each of the at least two table names, performing the following target query engine determination steps based on the table name: and in response to receiving creation information for indicating that the data table indicated by the table name is created in a predetermined high-speed engine, creating the data table indicated by the table name in the high-speed engine, and taking the high-speed engine after the data table is created as a target query engine.
In some embodiments, the target query engine determining step further comprises: and in response to the frequency of accessing the data in the data table indicated by the table name being greater than or equal to a preset threshold value, and in response to receiving creation information indicating that the data table indicated by the table name is not created in a predetermined high-speed engine, taking a first query engine in the set of query engines as a target query engine.
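The per-table decision described in the two embodiments above might look like the sketch below. The function name, its parameters and the `None` result for the case the text leaves open are assumptions made for illustration; actually creating the table in the high-speed engine is outside the sketch.

```python
from typing import Optional


def decide_target_engine(
    create_in_high_speed: bool,
    access_frequency: int,
    threshold: int,
    high_speed_engine: str,
    first_engine: str,
) -> Optional[str]:
    """Pick the target engine for one table when the associated engines differ."""
    if create_in_high_speed:
        # Creation information asks for the table to be created in the
        # high-speed engine, which then serves as the target query engine.
        return high_speed_engine
    if access_frequency >= threshold:
        # The table is accessed often but will not be created in the
        # high-speed engine: use the first query engine of the set.
        return first_engine
    # Assumption: the source does not specify this case, so nothing is chosen.
    return None
```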
In some embodiments, the target statement is obtained from a user terminal; and creating the data table indicated by the table name in the high-speed engine includes: sending each field in the data table indicated by the table name to the user terminal; acquiring the fields, returned by the user terminal, that the user selected from those fields; and creating the data table indicated by the table name in the high-speed engine based on the selected fields, wherein the fields in the created data table are the fields selected by the user.
In some embodiments, the method further comprises: in response to the first query engine failing to return the data in the data table corresponding to the target statement, querying the data in the data table corresponding to the target statement by using a second query engine, wherein the second query engine is a query engine in the set of query engines that is different from the first query engine.
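The fallback from a first query engine to a second one can be sketched as below. Treating an exception, `None` or an empty result as "data not queried" is an assumption of this sketch, and the engine callables are hypothetical stand-ins for real engine clients.

```python
from typing import Callable, List, Optional, Sequence

Row = dict
QueryFn = Callable[[str], Optional[List[Row]]]


def query_with_fallback(statement: str, engines: Sequence[QueryFn]) -> Optional[List[Row]]:
    """Try each engine in order and return the first non-empty result."""
    for run_query in engines:
        try:
            rows = run_query(statement)
        except Exception:
            # Assumption: a failing engine counts as "data not queried".
            rows = None
        if rows:
            return rows
    return None
```

For example, `query_with_fallback(sql, [first_engine, second_engine])` only calls `second_engine` when `first_engine` raises or returns nothing.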
In some embodiments, parsing the target statement to obtain at least one table name included in the target statement includes: and matching the target statement by adopting a predetermined regular expression to obtain at least one table name contained in the target statement.
In some embodiments, parsing the target statement to obtain at least one table name included in the target statement includes: in response to the target statement containing a first preset keyword, analyzing the target statement into a structured query language according to an analysis rule established for the first preset keyword; at least one table name is extracted from the structured query language.
In some embodiments, the method further comprises: in response to the target statement containing a second preset keyword, processing the data in the data table by using a custom function corresponding to the second preset keyword, so as to perform on the data the custom operation indicated by the second preset keyword.
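One way to associate a second preset keyword with a custom function is a lookup table, sketched below. The keywords `#mask` and `#first` and both functions are invented for illustration and do not come from the disclosure.

```python
from typing import Callable, Dict, List

Rows = List[dict]


def mask_names(rows: Rows) -> Rows:
    """Hypothetical custom operation: hide the 'name' field of every row."""
    return [{**row, "name": "***"} for row in rows]


def first_row(rows: Rows) -> Rows:
    """Hypothetical custom operation: keep only the first row."""
    return rows[:1]


# Hypothetical second preset keywords mapped to their custom functions.
CUSTOM_FUNCTIONS: Dict[str, Callable[[Rows], Rows]] = {
    "#mask": mask_names,
    "#first": first_row,
}


def apply_custom_functions(statement: str, rows: Rows) -> Rows:
    """Apply every custom function whose keyword appears in the statement."""
    for keyword, function in CUSTOM_FUNCTIONS.items():
        if keyword in statement:
            rows = function(rows)
    return rows
```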
In a second aspect, an embodiment of the present disclosure provides an apparatus for querying data, the apparatus including: the obtaining unit is configured to obtain a target statement, wherein the target statement is used for operating data in the data tables, and the target statement comprises at least one table name of the data tables; the analysis unit is configured to analyze the target statement to obtain at least one table name contained in the target statement; a first determining unit configured to determine, from a predetermined set of query engines, a query engine associated with at least one table name as a target query engine; and the first query unit is configured to respond to the target statement as a query statement and query data in the data table corresponding to the target statement by using the target query engine.
In some embodiments, the target statement contains the table names of at least two data tables; the parsing unit is further configured to: parse the target statement to obtain the at least two table names contained in the target statement; and the first determining unit is further configured to: determine, from the predetermined set of query engines, the query engine associated with each of the at least two table names; and, in response to determining that the determined query engines are all the same query engine, take that query engine as the target query engine.
In some embodiments, the apparatus comprises: a second determining unit configured to, in response to the determined existence of at least two different query engines in the respective query engines, perform, for each of the at least two table names, the following target query engine determining steps based on the table name: and in response to receiving creation information for indicating that the data table indicated by the table name is created in a predetermined high-speed engine, creating the data table indicated by the table name in the high-speed engine, and taking the high-speed engine after the data table is created as a target query engine.
In some embodiments, the target query engine determining step further comprises: and in response to the frequency of accessing the data in the data table indicated by the table name being greater than or equal to a preset threshold value, and in response to receiving creation information indicating that the data table indicated by the table name is not created in a predetermined high-speed engine, taking a first query engine in the set of query engines as a target query engine.
In some embodiments, the target statement is obtained from a user terminal; and the second determining unit is further configured to: send each field in the data table indicated by the table name to the user terminal; acquire the fields, returned by the user terminal, that the user selected from those fields; and create the data table indicated by the table name in the high-speed engine based on the selected fields, wherein the fields in the created data table are the fields selected by the user.
In some embodiments, the apparatus further comprises: and a second query unit configured to query the data in the data table corresponding to the target statement by using a second query engine in response to the data in the data table corresponding to the target statement not being queried by using the first query engine, wherein the second query engine is a different query engine in the set of query engines than the first query engine.
In some embodiments, the parsing unit is further configured to: and matching the target statement by adopting a predetermined regular expression to obtain at least one table name contained in the target statement.
In some embodiments, the parsing unit is further configured to: in response to the target statement containing a first preset keyword, analyzing the target statement into a structured query language according to an analysis rule established for the first preset keyword; at least one table name is extracted from the structured query language.
In some embodiments, the apparatus further comprises: a processing unit configured to, in response to the target statement containing a second preset keyword, process the data in the data table by using a custom function corresponding to the second preset keyword, so as to perform on the data the custom operation indicated by the second preset keyword.
In a third aspect, an embodiment of the present disclosure provides an electronic device for querying data, including: one or more processors; a storage device, on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement the method of any of the embodiments of the method for querying data as described above.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium for querying data, on which a computer program is stored, which when executed by a processor, implements the method of any of the embodiments of the method for querying data as described above.
In the method and apparatus for querying data provided by the embodiments of the present disclosure, a target statement is acquired, where the target statement is used for operating on data in a data table and includes at least one table name of the data table. The target statement is then parsed to obtain the at least one table name it contains, and a query engine associated with the at least one table name is determined from a predetermined set of query engines as the target query engine. Finally, when the target statement is a query statement, the data in the data table corresponding to the target statement is queried by using the target query engine. The embodiments of the present disclosure can thus determine, from the set of query engines, a query engine for operating on the data in the data table corresponding to a target statement, so that different query engines can operate on the data tables corresponding to different target statements. Because data operations are carried out according to the characteristics of each query engine, the ways of querying data are enriched and the data query speed is improved.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which some embodiments of the present disclosure may be applied;
FIG. 2 is a flow diagram for one embodiment of a method for querying data, according to the present disclosure;
FIG. 3A is a schematic architecture diagram of a method for querying data according to the present disclosure;
FIG. 3B is a schematic diagram of one application scenario of a method for querying data according to the present disclosure;
FIG. 4 is a flow diagram of yet another embodiment of a method for querying data according to the present disclosure;
FIG. 5 is a schematic diagram of yet another application scenario of a method for querying data according to the present disclosure;
FIG. 6 is a schematic block diagram illustrating one embodiment of an apparatus for querying data according to the present disclosure;
FIG. 7 is a schematic block diagram of a computer system suitable for use with an electronic device to implement some embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 of an embodiment of a method for querying data or an apparatus for querying data to which embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or transmit data (e.g., target statements) or the like. The terminal devices 101, 102, 103 may have various client applications installed thereon, such as video playing software, news information applications, image processing applications, web browser applications, shopping applications, search applications, instant messaging tools, mailbox clients, social platform software, and the like.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with a data query function, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. This is not particularly limited herein.
The server 105 may be a server providing various services, such as a background server that performs the corresponding operations on the data to be operated on (e.g., queried) by the terminal devices 101, 102, 103. The background server may parse and otherwise process a received target statement (e.g., a query statement) to obtain the data in the data table that the target statement is to operate on (e.g., the data in the data table that the target statement is to query). Optionally, when the target statement is a query statement, the background server may further send the queried data to the terminal device. As an example, the server 105 may be a cloud server.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. This is not particularly limited herein.
It should be further noted that the method for querying data provided by the embodiments of the present disclosure may be executed by a server, may also be executed by a terminal device, and may also be executed by the server and the terminal device in cooperation with each other. Accordingly, each part (for example, each unit) included in the apparatus for querying data may be entirely disposed in the server, may be entirely disposed in the terminal device, and may be disposed in the server and the terminal device, respectively.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. When the electronic device on which the method for querying data operates does not require data transmission with other electronic devices in the course of performing the method, the system architecture may include only the electronic device (e.g., server) on which the method for querying data operates.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for querying data in accordance with the present disclosure is shown. The method for querying data comprises the following steps:
In this embodiment, an execution body of the method for querying data (for example, the server shown in fig. 1) may obtain the target statement from another electronic device (for example, a terminal device shown in fig. 1) or locally, through a wired or wireless connection.
The target statement is used for operating data in the data table, and the target statement comprises at least one table name of the data table. By way of example, the target statement may be a statement for performing at least one of the following operations on data in a data table: add, delete, modify, find.
Here, the target statement may be a statement conforming to the SQL syntax or may be a statement conforming to another predetermined syntax rule, and the embodiment of the present disclosure is not limited herein.
In this embodiment, the execution body may analyze the target statement acquired in step 201 to obtain at least one table name included in the target statement.
It should be understood that, in general, the execution body may parse out all table names contained in the target statement. However, when the same table name appears more than once in the target statement, the execution body may parse out only the mutually distinct table names.
Here, when the target statement includes a Union operator, the execution body may first split the target statement at the Union operator for subsequent parsing.
In some optional implementations of this embodiment, the execution body may execute step 202 in the following manner: matching the target statement against a predetermined regular expression to obtain the at least one table name contained in the target statement. A regular expression (also called a regex) is commonly used to retrieve and replace text that conforms to a predetermined rule. Many programming languages support string operations using regular expressions.
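Splitting at the Union operator could be done as in the following sketch. The pattern and the function name `split_on_union` are assumptions; the sketch does not protect the word "union" inside string literals or comments, which a production parser would have to handle.

```python
import re
from typing import List

# Matches UNION or UNION ALL as whole words, case-insensitively.
_UNION = re.compile(r"\bunion(?:\s+all)?\b", re.IGNORECASE)


def split_on_union(statement: str) -> List[str]:
    """Split a statement into the sub-statements joined by UNION / UNION ALL."""
    return [part.strip() for part in _UNION.split(statement) if part.strip()]
```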
It can be appreciated that the speed and accuracy of obtaining the table names can be improved by using regular expressions.
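As a rough illustration of regex-based extraction, the sketch below collects the identifiers that follow `from`, `join`, `into` or `update` and returns each distinct table name once. The pattern is an assumption, not the predetermined regular expression of the disclosure, and it deliberately ignores harder cases such as comma-separated table lists or quoted identifiers.

```python
import re
from typing import List

# Hypothetical pattern: an identifier (optionally schema-qualified) that
# follows one of the keywords that normally introduce a table name.
_TABLE = re.compile(r"\b(?:from|join|into|update)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def extract_table_names(statement: str) -> List[str]:
    """Return the distinct table names in order of first appearance."""
    names: List[str] = []
    for name in _TABLE.findall(statement):
        if name not in names:
            names.append(name)
    return names
```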
In some optional implementations of this embodiment, the executing main body may also execute the step 202 in the following manner:
First, in response to the target statement containing a first preset keyword, the target statement is parsed into structured query language (SQL) according to a parsing rule established for the first preset keyword.
Here, the first preset keyword may be a predetermined keyword different from the SQL keywords. As an example, the first preset keyword may be "foreach", "set", "function", or the like. The parsing rule established for each first preset keyword can be implemented as program statements.
As an example, the parsing rule corresponding to the first preset keyword "foreach" may be "if the target statement contains foreach, the target statement is equivalently transformed into the code corresponding to foreach". For example, a target statement containing foreach, "#foreach(768,838,12301) select * from app.demo_table where id = ${each}", can be equivalently transformed into the following code: select * from app.demo_table where id = 768; select * from app.demo_table where id = 838; select * from app.demo_table where id = 12301.
As yet another example, the parsing rule corresponding to the first preset keyword "set" may be "if the target statement contains set, the target statement is equivalently transformed into the code corresponding to set". For example, a target statement containing set, "#set NAME" #set select * from app.demo_table where NAME = "${NAME}", can be equivalently transformed into the following code: select * from app.demo_table where id = 'abc'.
As still another example, the parsing rule corresponding to the first preset keyword "function" may be "if the target statement contains function, the target statement is equivalently transformed into the code corresponding to function". For example, a target statement containing function, "#function test? select * from $test(1) a left join $test(2) b on a.name = b.name", can be equivalently transformed into the following code: select * from (select * from app.demo_table where id = 1) a left join (select * from app.demo_table where id = 2) b on a.name = b.name.
It should be understood that the parsing rule corresponding to the first preset keyword may be implemented by programming by a skilled person, and this optional implementation manner is not described in detail herein.
At least one table name is then extracted from the structured query language.
Here, after obtaining the structured query language (SQL) by parsing, the execution body may extract at least one table name from it. For example, for the 3 SQL statements obtained by the above parsing, "select * from app.demo_table where id = 768; select * from app.demo_table where id = 838; select * from app.demo_table where id = 12301", the execution body may extract the table name "app.demo_table" from each of the 3 statements.
It can be understood that, in the optional implementation manner, the first preset keyword may be associated with the parsing rule, so that more flexible data query is implemented, and the speed of data query is improved.
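A parsing rule for the "foreach" keyword might be implemented along the lines below. The directive syntax `#foreach(v1,v2,...) <template with ${each}>` is a reconstruction assumed for this sketch, and `expand_foreach` is a hypothetical name.

```python
import re
from typing import List

# Assumed directive shape: "#foreach(v1,v2,...)" followed by an SQL template.
_FOREACH = re.compile(r"^\s*#foreach\s*\(([^)]*)\)\s*(.*)$", re.IGNORECASE | re.DOTALL)


def expand_foreach(statement: str) -> List[str]:
    """Expand a #foreach directive into one SQL statement per listed value."""
    match = _FOREACH.match(statement)
    if match is None:
        # No directive present: the statement is passed through unchanged.
        return [statement]
    values = [value.strip() for value in match.group(1).split(",") if value.strip()]
    template = match.group(2)
    # Substitute each value for the ${each} placeholder in the template.
    return [template.replace("${each}", value) for value in values]
```

Each resulting statement is plain SQL, so table names can then be extracted from it in the usual way.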
Optionally, in the process of determining the target statement, the table name may also be placed at a predetermined position in the target statement (for example, the first 10 characters of the target statement). The execution body may then directly extract the at least one table name contained in the target statement from the predetermined position.
In this embodiment, the execution subject may determine, from a predetermined set of query engines, a query engine associated with at least one table name, and use the determined query engine associated with the at least one table name as a target query engine.
A query engine in the above set of query engines may be any existing query engine, for example, a Kylin query engine, a Phoenix query engine, an Elasticsearch query engine, a Presto query engine, a Spark query engine, or the like; or it may be a query engine built by a technician. The query engines in the set may be different from each other; for example, the set may include the following 5 query engines: the Kylin query engine, the Phoenix query engine, the Elasticsearch query engine, the Presto query engine, and the Spark query engine.
Here, the stored table name for each data table may be pre-associated with one or more query engines in the set of query engines. By way of example, the table name of the data table may be randomly associated with one query engine in the set of query engines, or may be associated with one query engine in the set of query engines according to the characteristics of the data in each data table.
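The pre-association between table names and query engines can be pictured as a small registry, as in this sketch. The engine identifiers, the `preferred` parameter (standing in for a choice based on the characteristics of the data) and the function names are assumptions.

```python
import random
from typing import Dict, Optional, Sequence

ENGINE_SET = ("kylin", "phoenix", "elasticsearch", "presto", "spark")


def associate(
    table_name: str,
    registry: Dict[str, str],
    engines: Sequence[str] = ENGINE_SET,
    preferred: Optional[str] = None,
    rng: Optional[random.Random] = None,
) -> str:
    """Associate a table with a preferred engine, or with a random one."""
    if preferred is not None:
        if preferred not in engines:
            raise ValueError(f"unknown engine: {preferred}")
        engine = preferred
    else:
        # Random association, as one of the options named in the text.
        engine = (rng or random).choice(engines)
    registry[table_name] = engine
    return engine


def target_engine(table_name: str, registry: Dict[str, str]) -> Optional[str]:
    """Look up the engine previously associated with a table name."""
    return registry.get(table_name)
```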
Illustratively, referring to fig. 3A, fig. 3A is a schematic architecture diagram of a method for querying data according to the present disclosure. In fig. 3A, the data at the bottom layer may be stored in a data warehouse constructed based on HDFS (Hadoop Distributed File System), and the data from different service systems is divided into multiple Hive data tables in different service fields after being processed by ETL (Extract-Transform-Load). These data tables may be used only for the underlying storage of data and do not provide query services to the outside.
The data in the data tables may be connected to each advanced query engine (i.e., each query engine in the set of query engines) through different scheduling methods. The Spark query engine is the lowest-level engine that provides query services, and is also responsible for scheduling data to the other query engines; the Kylin query engine and the Phoenix query engine are built on HBase (a distributed, column-oriented open-source database) and are oriented to column storage; the Elasticsearch query engine is built on Lucene (a full-text search engine) and is oriented to document storage; the Presto query engine, like the Hive engine, is based on HDFS, but performs its computation in memory and is more than 10 times faster than the Hive engine. In particular, the Presto query engine relies on the underlying compression of the ORC format for data storage, so the Hive data tables need to be converted into the ORC format by a compression format converter to ensure stable operation of the Presto query engine.
Because the Hive engine queries slowly, it does not provide services to the outside directly; external queries are instead served uniformly by the Spark query engine. In general, the Spark query engine can achieve minute-level response times, improving query performance by roughly half relative to the Hive engine. In addition, the Spark query engine is responsible for synchronizing data to the other query engines and performing incremental updates periodically. Meanwhile, the description information of the data tables on the other query engines, including table names, field types, comments and the like, can be recorded and stored persistently to support the engine routing service (which determines, programmatically, the query engine used to access the data in a data table).
In addition to the Spark query engine, the Presto query engine described above can provide faster query performance. For a data table customized by a service, once its data compression format is set to the ORC format, the Presto query engine can be used to query the data in that data table.
For query scenarios that require second-level response times, a Phoenix query engine can be used: relying on the strong storage capacity and the read/write efficiency of HBase, suitable indexes are built according to the service requirements, and data is pushed through the Spark query engine. Table information (e.g., table name, field type, comments) may also be persisted to provide support for engine routing services.
For service scenarios in which the indexes and dimensions do not change frequently, a Kylin query engine can be used to achieve sub-second query speed. A Cube is configured in the Kylin query engine, and the Spark query engine triggers the Build operation through an application programming interface. Because an excessive number of fields can reduce the query efficiency of the Kylin query engine, a controller can periodically trigger the Merge operation in the SQL statement (the Merge operation combines the update and insert statements of SQL: one table is matched against another table or a sub-query according to a join condition, rows that satisfy the condition are updated, and rows that do not satisfy it are inserted). Table information (e.g., table name, field type, comments) may also be persisted to provide support for engine routing services.
For scenarios that require full-text retrieval, an Elasticsearch query engine may be used. The Spark query engine can extract data to obtain a matrix data table (DataFrame), then convert the matrix data table into a JavaRDD (Java Resilient Distributed Dataset) through a function, and call the Bulk API (used for batch operations) to write the data in batches. Since the query language of the Elasticsearch query engine is not SQL, a plug-in (e.g., the Elasticsearch-SQL plug-in) needs to be deployed on top of the Elasticsearch query engine to provide support for JDBC (Java DataBase Connectivity) drivers.
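The batch write described above can be illustrated by the request body that the Elasticsearch Bulk API expects: newline-delimited JSON in which each document is preceded by an action line, and the body ends with a newline. The following is a minimal sketch in plain Python rather than Spark; the function name `build_bulk_body`, the index name and the sample rows are hypothetical and are not taken from the disclosure.

```python
import json


def build_bulk_body(index, rows, id_field=None):
    """Build a newline-delimited JSON body for an Elasticsearch _bulk request.

    Each row becomes two lines: an "index" action line, then the document.
    `id_field` (hypothetical parameter) names the row key used as document id.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index}}
        if id_field is not None:
            action["index"]["_id"] = str(row[id_field])
        lines.append(json.dumps(action))
        lines.append(json.dumps(row))
    # Every line, including the last one, must be terminated by a newline.
    return "".join(line + "\n" for line in lines)
```

In an actual deployment the resulting string would be sent as the body of a `_bulk` request; how the execution body batches the rows is not specified in the disclosure.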
Listed above are the 5 kinds of query engines in the query engine set (i.e., the Kylin query engine, the Phoenix query engine, the Elasticsearch query engine, the Presto query engine and the Spark query engine). Since each has its own application scenario, an engine routing service can be used to decide which engine to use for a given query. Each table name and its associated query engine may be stored in association in an engine configuration repository. When an SQL statement (e.g., a target statement) is sent, all table names in the statement are obtained by regular-expression matching, and the query engine associated with the obtained table names is then determined from the query engine set as the target query engine.
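The routing idea can be sketched as follows. This is only an illustration under stated assumptions: the dictionary `ENGINE_CONFIG` stands in for the engine configuration repository, its table-to-engine assignments are invented, and the regular expression is a simplified pattern that only recognises identifiers following FROM, JOIN, INTO or UPDATE. The disclosure does not specify the actual expression.

```python
import re

# Hypothetical engine configuration repository: table name -> associated engine.
ENGINE_CONFIG = {
    "emp": "presto",
    "dept": "presto",
    "orders": "kylin",
}

# Simplified pattern: an identifier that follows FROM, JOIN, INTO or UPDATE.
TABLE_PATTERN = re.compile(
    r"\b(?:from|join|into|update)\s+([A-Za-z_][\w.]*)", re.IGNORECASE
)


def extract_table_names(statement):
    """Return the distinct table names in the statement, in order of appearance."""
    names = []
    for name in TABLE_PATTERN.findall(statement):
        if name not in names:
            names.append(name)
    return names


def route(statement, config=ENGINE_CONFIG):
    """Map each table name found in the statement to its associated engine."""
    return {
        name: config[name]
        for name in extract_table_names(statement)
        if name in config
    }
```

For the statement of fig. 3B, `route("SELECT * FROM emp WHERE comm IS NOT NULL")` would map the table name emp to whichever engine the repository associates with it.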
Returning now to fig. 2.
Step 204, in response to the target statement being a query statement, querying the data in the data table corresponding to the target statement by using the target query engine.
In this embodiment, when the target statement acquired in step 201 is a query statement, the execution main body may also query data in the data table corresponding to the target statement by using the target query engine determined in step 203. Wherein the query statement may be a statement for querying data in a data table. The query statement may be a statement conforming to the SQL syntax or a statement conforming to another predetermined syntax rule. The data table corresponding to the target statement may be a data table indicated by a table name included in the target statement.
With continued reference to fig. 3B, fig. 3B is a schematic diagram of an application scenario of the method for querying data according to the present embodiment. In the application scenario of fig. 3B, the server 301 first acquires the target statement 3011. As shown, the target statement 3011 is used to look up the data whose field comm is non-null in the data table with the table name emp. The target statement 3011 includes the table name "emp" of the data table. Then, the server 301 parses the target statement 3011 to obtain the table name 3012 (emp in the drawing) included in the target statement 3011. After that, the server 301 determines the query engine associated with the table name "emp" as the target query engine 3014 from the predetermined query engine set 3013. Finally, the server 301 uses the target query engine 3014 to query the data in the data table 3015 corresponding to the target statement 3011 (i.e., the data table indicated by the table name emp), so as to obtain the query result 3016.
The method for querying data according to the above embodiment of the disclosure first obtains a target statement, where the target statement is used to operate on data in a data table and contains the table name of at least one data table. The target statement is then parsed to obtain the at least one table name it contains, and a query engine associated with the at least one table name is determined from a predetermined query engine set as the target query engine. Finally, in a case where the target statement is a query statement, the target query engine is used to query the data in the data table corresponding to the target statement. In this way, an embodiment of the disclosure may determine, from the query engine set, a query engine for operating on the data in the data table corresponding to the target statement, so that different query engines may be used for the data tables corresponding to different target statements. Because data operations are carried out based on the characteristics of each query engine, the ways of querying data are enriched and the query speed is improved.
In some optional implementations of this embodiment, the target statement may also include the table names of at least two data tables. Thus, step 202 may comprise: parsing the target statement to obtain the at least two table names contained in the target statement. Based on this, the execution body may perform step 203 in the following manner:
First, from a predetermined set of query engines, a query engine associated with each of the at least two table names is determined, respectively.
Then, in a case where the determined respective query engines indicate the same query engine, the same query engine indicated by the respective query engines is taken as a target query engine.
It is to be understood that, in the case that the determined query engines indicate the same query engine, the alternative implementation may determine a target query engine associated with each of the at least two table names, so that the data in the data table indicated by each of the at least two table names may be queried by the target query engine.
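The check for a common engine can be sketched in a few lines. The function name `resolve_common_engine` and the mapping passed as `config` are hypothetical; the sketch assumes that every table name has exactly one associated engine.

```python
def resolve_common_engine(table_names, config):
    """Return the shared engine if every table maps to the same one, else None."""
    engines = {config[name] for name in table_names}
    return engines.pop() if len(engines) == 1 else None
```

A return value of `None` corresponds to the case in which the determined query engines do not indicate the same query engine.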
In some optional implementations of this embodiment, the execution body may further perform the following step:
In response to the target statement containing a second preset keyword, processing the data in the data table by using a custom function corresponding to the second preset keyword, so as to perform the custom operation indicated by the second preset keyword on the data in the data table.
The second preset keyword may be a predetermined character string. The custom function corresponding to the second preset keyword may be a function (also referred to as a method, an operator (e.g., Spark operator)) written by a technician for the second preset keyword. The second preset keyword is different from the first preset keyword.
Illustratively, a custom function corresponding to the second preset keyword "UDF" may be used to perform column transformations; a custom function corresponding to the second preset keyword "UDAF" may be used to aggregate multiple rows of data in the data table; a custom function corresponding to the second preset keyword "${yesterday}" may be used to obtain yesterday's date; a custom function corresponding to the second preset keyword "${last_month}" may be used to obtain the date of last month; a custom function corresponding to the second preset keyword "${N_days_ago}" may be used to obtain the date N days ago, where N may be an integer greater than 2; a custom function corresponding to the second preset keyword "${M_months_ago}" may be used to obtain the date M months ago, where M may be an integer greater than 2.
It should be understood that the above-mentioned custom function corresponding to the second preset keyword may be written by a technician, and this optional implementation manner is not described in detail herein.
It can be understood that, in the optional implementation manner, the data in the data table is processed by using the custom function corresponding to the second preset keyword, so that functions that can be realized by the existing keyword can be further expanded, thereby realizing the SQL-oriented high-level programming requirement and enabling data query to be more efficient and flexible.
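As an illustration of the date keywords, the sketch below replaces them with concrete dates before the statement is sent to an engine. Several points are assumptions rather than part of the disclosure: N and M are taken to be written as digits inside the keyword (e.g., `${3_days_ago}`), a month-based keyword is resolved to the first day of the corresponding month, dates are rendered in ISO format, and the function names are hypothetical.

```python
import re
from datetime import date, timedelta


def _months_ago(today, months):
    """First day of the month lying `months` months before `today` (assumed convention)."""
    index = today.year * 12 + (today.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def expand_keywords(statement, today):
    """Replace ${...} date keywords in the statement with ISO-formatted dates."""

    def replace(match):
        key = match.group(1)
        if key == "yesterday":
            return (today - timedelta(days=1)).isoformat()
        if key == "last_month":
            return _months_ago(today, 1).isoformat()
        days = re.fullmatch(r"(\d+)_days_ago", key)
        if days:
            return (today - timedelta(days=int(days.group(1)))).isoformat()
        months = re.fullmatch(r"(\d+)_months_ago", key)
        if months:
            return _months_ago(today, int(months.group(1))).isoformat()
        return match.group(0)  # leave unknown keywords untouched

    return re.sub(r"\$\{(\w+)\}", replace, statement)
```

Passing `today` explicitly keeps the function deterministic, which makes it easier to test than a version that reads the system clock.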
With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for querying data is illustrated. The process 400 of the method for querying data includes the steps of:
In the present embodiment, an execution subject (e.g., a server shown in fig. 1) of the method for querying data may acquire a target sentence. The target statement is used for operating data in the data tables, and the target statement comprises table names of at least two data tables.
In this embodiment, the execution body may analyze the target statement to obtain at least two table names included in the target statement.
In step 403, a query engine associated with each table name of the at least two table names is determined from a predetermined set of query engines, respectively. Thereafter, step 404 is performed.
In this embodiment, the execution subject may determine, from a predetermined set of query engines, a query engine associated with each of the at least two table names, respectively.
At step 404, it is determined whether the determined query engines indicate the same query engine. Then, if yes, go to step 405; if not, go to step 406.
In this embodiment, the execution subject may determine whether the determined query engines indicate the same query engine.
In step 405, the same query engine indicated by each query engine is used as the target query engine. Thereafter, step 407 is performed.
In this embodiment, in a case where the determined query engines indicate the same query engine, the execution subject may use the same query engine indicated by the query engines as the target query engine.
In this embodiment, in a case that the determined query engines do not indicate the same query engine (i.e. there are at least two different query engines in the determined query engines), the executing entity may execute, for each table name of the at least two table names, the following target query engine determining steps based on the table name: and in response to receiving creation information for indicating that the data table indicated by the table name is created in a predetermined high-speed engine, creating the data table indicated by the table name in the high-speed engine, and taking the high-speed engine after the data table is created as a target query engine. The high-speed engine may be a memory computing engine. As an example, the high speed engine may be a Presto query engine or a Spark query engine. The creation information may be information transmitted from a user side (i.e., an electronic device used by a user) to indicate whether to create a data table in the high-speed engine.
In this embodiment, when the target statement is a query statement, the execution body may query data in a data table corresponding to the target statement by using a target query engine.
It should be noted that, in addition to the above-mentioned contents, the present embodiment may further include the same or similar features and effects as those of the embodiment corresponding to fig. 2, and details are not repeated herein.
As can be seen from fig. 4, the process 400 of the method for querying data in this embodiment may adopt different query engines in the query engine set to implement the operation of data in the data table for different application scenarios, so as to further enrich the query modes of data and improve the query speed of data.
In some optional implementations of this embodiment, the target query engine determining step further includes:
In response to the frequency of accessing the data in the data table indicated by the table name being greater than or equal to a preset threshold, and in response to receiving creation information indicating that the data table indicated by the table name is not to be created in the predetermined high-speed engine, taking a first query engine in the set of query engines as the target query engine. The first query engine may be any one of the set of query engines, or may be a predetermined query engine of the set of query engines. By way of example, the first query engine may be a Presto query engine.
It can be understood that, in this alternative implementation manner, the first query engine may be adopted to query the data in the data table corresponding to the target statement without creating the data table indicated by the table name in a predetermined high-speed engine, so that the step of creating the data table indicated by the table name may be omitted, and the speed of accessing the data in the data table at this time is increased.
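The target query engine determining step, including this optional branch, might be sketched as follows. The three-valued `creation_info` argument, the `create_table` callback and the default engine names are assumptions made for illustration; cases that the embodiment does not describe return `None`.

```python
def determine_engine(table, access_frequency, threshold, creation_info,
                     create_table, high_speed_engine="spark",
                     first_engine="presto"):
    """Pick the target engine for one table name.

    creation_info: True  -> the user asked to create the table in the high-speed engine
                   False -> the user asked not to create it
                   None  -> no creation information was received
    create_table:  hypothetical callback (engine, table) that creates the table
    """
    if creation_info is True:
        create_table(high_speed_engine, table)
        return high_speed_engine
    if creation_info is False and access_frequency >= threshold:
        return first_engine
    return None  # case not covered by this sketch
```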
In some optional implementations of this embodiment, the target statement is obtained from the user side. Thus, the execution body may perform "creating the data table indicated by the table name in the high-speed engine" of the above optional implementation in the following manner:
Step one, each field in the data table indicated by the table name is sent to the user side.
Step two, the fields selected by the user from among those fields and returned by the user side are acquired.
Step three, the data table indicated by the table name is created in the high-speed engine based on the selected fields. The fields in the created data table are the fields selected by the user from among all the fields.
It is understood that after the execution body sends all the fields in the data table indicated by the table name to the user side, the user side may present the received fields, and the user may then select, through the user side, one or more fields from all the fields in the data table indicated by the table name. Here, the user may select fields according to actual requirements. For example, the user may select the fields in the data table whose access frequency is greater than or equal to a preset frequency threshold, or may first determine the storage space occupied by each field in the data table and then select the fields whose occupied storage space is less than or equal to a preset threshold.
It can be understood that, in the alternative implementation manner, the data table indicated by the table name may be created in the high-speed engine according to the field selected by the user, so that the created data table does not include fields not selected by the user, and compared to a scheme in which the data table is created in the high-speed engine by using all fields in the data table, when the high-speed engine is a memory calculation engine, the alternative implementation manner may reduce a memory space occupied by the created data table, and increase a speed of subsequently accessing data in the created data table.
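One possible way to create a table that holds only the selected fields is a CREATE TABLE ... AS SELECT statement, which both Spark SQL and Presto accept in some form. The sketch below only builds the statement text; the function name, the target table name and the validation rules are hypothetical, and the disclosure does not state that the table is created this way.

```python
def build_create_statement(source_table, target_table, all_fields, selected_fields):
    """Build a CREATE TABLE ... AS SELECT statement restricted to the selected fields."""
    if not selected_fields:
        raise ValueError("at least one field must be selected")
    unknown = [f for f in selected_fields if f not in all_fields]
    if unknown:
        raise ValueError(f"fields not in {source_table}: {unknown}")
    columns = ", ".join(selected_fields)
    return f"CREATE TABLE {target_table} AS SELECT {columns} FROM {source_table}"
```

Rejecting fields that were not offered to the user side guards against a malformed or tampered response before any statement reaches the engine.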
In some optional implementations of this embodiment, the execution body may further perform the following step:
In response to the first query engine failing to query the data in the data table corresponding to the target statement, querying the data in the data table corresponding to the target statement by using a second query engine. The second query engine is a query engine in the set of query engines that is different from the first query engine.
Here, the second query engine may be any one of the set of query engines different from the first query engine, or may be a predetermined query engine of the set of query engines different from the first query engine. By way of example, when the set of query engines consists of a Kylin query engine, a Phoenix query engine, an Elasticsearch query engine, a Presto query engine, a Spark query engine, the first query engine may be the Presto query engine and the second query engine may be the Spark query engine.
It can be understood that, since each query engine in the query engine set may be suitable for querying data in different storage forms, for a certain query engine, the storage form of the data in the data table corresponding to the target statement may be different from the storage form of the data suitable for querying, so that the query engine does not query the data in the data table corresponding to the target statement.
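The fallback can be sketched as below. Here each engine is modelled as a plain callable that takes the statement and returns a list of rows, and "not querying the data" is interpreted as returning `None` or an empty result; both choices are assumptions, since the disclosure does not define how a failed query is detected.

```python
def query_with_fallback(statement, first_engine, second_engine):
    """Query with the first engine; fall back to the second if nothing is returned."""
    rows = first_engine(statement)
    if rows:
        return rows
    return second_engine(statement)
```

With a Presto-backed callable as `first_engine` and a Spark-backed callable as `second_engine`, this mirrors the example pairing given above.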
By way of example, continuing to refer to fig. 5, fig. 5 is a schematic diagram of yet another application scenario for a method for querying data according to the present disclosure. This example of a method for querying data may be implemented by the following steps (including steps 501-514):
In this example, an execution subject (e.g., a server shown in fig. 1) of the method for querying data may obtain the target sentence from the user side through a wired connection or a wireless connection. The target statement is used for operating data in the data tables, and the target statement comprises table names of at least two data tables.
In this example, the execution body may parse the target statement to obtain at least two table names included in the target statement.
In this example, the execution subject may determine the query engine associated with each of the at least two table names from a predetermined set of query engines, respectively.
At step 504, it is determined whether the determined query engines indicate the same query engine. Then, if the determined query engines indicate the same query engine, step 505 is executed; if the determined query engines do not indicate the same query engine, i.e., there are at least two different query engines in the determined query engines, then step 506 is performed.
In this example, the execution subject may determine whether the determined respective query engines indicate the same query engine.
Step 505, taking the same query engine indicated by the query engines as the target query engine. Thereafter, step 512 is performed.
In this example, in a case where the determined respective query engines indicate the same query engine, the execution subject may take the same query engine indicated by the respective query engines as the target query engine.
In this example, the execution subject may select a table name from the at least two table names in a case where the determined respective query engines do not indicate the same query engine (i.e., there are at least two different query engines in the determined respective query engines).
In this example, the execution subject may determine whether the frequency of access to the data in the data table indicated by the table name is greater than or equal to a preset threshold.
At step 508, it is determined whether creation information for instructing to create a data table indicated by the table name in a predetermined high-speed engine is received. Thereafter, if the creation information for instructing to create the data table indicated by the table name in the predetermined high-speed engine is received, step 509 is executed; if the creation information indicating that the data table indicated by the table name is created in the predetermined high-speed engine is not received, or if the creation information indicating that the data table indicated by the table name is not created in the predetermined high-speed engine is received, step 510 is performed.
In this example, in the case where it is judged in step 507 that the frequency of access to data in the data table indicated by the table name is greater than or equal to the preset threshold, the execution subject may determine whether creation information indicating that the data table indicated by the table name is created in a predetermined high-speed engine is received.
In this example, in step 508, in a case that it is determined that creation information indicating that the data table indicated by the table name is created in the predetermined high-speed engine is received, the execution main body may send each field in the data table indicated by the table name to the user side, obtain fields returned by the user side and selected by the user from each field, create the data table indicated by the table name in the high-speed engine based on the selected fields, and use the high-speed engine after the data table is created as the target query engine. And the fields in the created data table indicated by the table name are fields selected by the user from various fields.
In this example, in a case where it is determined in step 508 that creation information indicating that the data table indicated by the table name is to be created in the predetermined high-speed engine is not received, or that creation information indicating that the data table indicated by the table name is not to be created in the predetermined high-speed engine is received, the execution subject may take the first query engine in the query engine set as the target query engine.
In this example, the execution subject may determine whether an unselected table name exists in the at least two table names.
Step 512, in response to the target statement being a query statement, querying the data in the data table corresponding to the target statement by using the target query engine. Thereafter, step 514 is performed.
In this example, in a case where it is determined in step 511 that no unselected table name exists among the at least two table names, and the target statement is a query statement, the execution subject may query the data in the data table corresponding to the target statement by using the target query engine.
In step 513, unselected table names are selected from the at least two table names. Thereafter, step 507 is performed.
In this example, in a case where it is determined in step 511 that an unselected table name exists among the at least two table names, the execution subject may select an unselected table name from the at least two table names.
Step 514, in response to the first query engine failing to query the data in the data table corresponding to the target statement, querying the data in the data table corresponding to the target statement by using the second query engine.
In this example, in a case where the first query engine fails to query the data in the data table corresponding to the target statement, the execution subject may use the second query engine to query the data in the data table corresponding to the target statement. The second query engine is a query engine in the set of query engines that is different from the first query engine.
As can be seen from fig. 5, the method for querying data in this application scenario may use different query engines in the query engine set to operate on the data in the data tables for different application scenarios, and may provide query services at the minute level, the second level, and even the sub-second level. Meanwhile, it offers richer syntactic expressiveness, supports simple control logic, and simplifies the logic of time processing, text processing and set processing, so that a user can flexibly use an SQL-like syntax to carry out complex business query tasks.
With further reference to fig. 6, as an implementation of the method shown in fig. 2 described above, the present disclosure provides an embodiment of an apparatus for querying data, the embodiment of the apparatus corresponding to the embodiment of the method shown in fig. 2, and the embodiment of the apparatus may further include the same or corresponding features as the embodiment of the method shown in fig. 2 and produce the same or corresponding effects as the embodiment of the method shown in fig. 2, in addition to the features described below. The device can be applied to various electronic equipment.
As shown in fig. 6, the apparatus 600 for querying data of the present embodiment includes: the obtaining unit 601 is configured to obtain a target statement, where the target statement is used for operating data in a data table, and the target statement contains a table name of at least one data table; the parsing unit 602 is configured to parse the target statement to obtain at least one table name included in the target statement; the first determining unit 603 is configured to determine, from a predetermined set of query engines, a query engine associated with at least one table name as a target query engine; the first query unit 604 is configured to query, with the target query engine, data in the data table corresponding to the target statement in response to the target statement being a query statement.
In this embodiment, the obtaining unit 601 of the apparatus 600 for querying data may obtain the target sentence from another electronic device (e.g., the terminal device shown in fig. 1) or locally through a wired connection manner or a wireless connection manner. The target statement is used for operating data in the data table, and the target statement comprises at least one table name of the data table. By way of example, the target statement may be a statement for performing at least one of the following operations on data in a data table: add, delete, modify, find.
In this embodiment, the parsing unit 602 may parse the target statement acquired by the obtaining unit 601 to obtain at least one table name included in the target statement. When the target statement contains the same table name more than once, the apparatus 600 may parse it to obtain the distinct table names contained in the target statement.
In this embodiment, the first determining unit 603 may determine, from a predetermined query engine set, a query engine associated with at least one table name as a target query engine. The query engine in the above query engine set may be any existing query engine, for example, a Kylin query engine, a Phoenix query engine, an Elasticsearch query engine, a Presto query engine, a Spark query engine, or the like; or may be a query engine built by a technician.
In this embodiment, when the target statement acquired by the acquiring unit 601 is a query statement, the first querying unit 604 may also query data in the data table corresponding to the target statement by using the target query engine determined by the first determining unit 603. Wherein the query statement may be a statement for querying data in a data table. The query statement may be a statement conforming to the SQL syntax or a statement conforming to another predetermined syntax rule.
In some optional implementations of this embodiment, the target statement includes the table names of at least two data tables; and the parsing unit 602 is further configured to: parse the target statement to obtain the at least two table names contained in the target statement. The first determining unit 603 is further configured to: determine, from a predetermined query engine set, a query engine associated with each of the at least two table names, respectively; and, in response to the determined query engines indicating the same query engine, take the same query engine indicated by the query engines as the target query engine.
In some optional implementations of this embodiment, the apparatus 600 includes: a second determining unit (not shown in the figures) configured to, in response to at least two different query engines existing among the determined query engines, perform, for each of the at least two table names, the following target query engine determining step based on the table name: in response to receiving creation information indicating that the data table indicated by the table name is to be created in a predetermined high-speed engine, creating the data table indicated by the table name in the high-speed engine, and taking the high-speed engine in which the data table has been created as the target query engine.
In some optional implementations of this embodiment, the target query engine determining step further includes: and in response to the frequency of accessing the data in the data table indicated by the table name being greater than or equal to a preset threshold value, and in response to receiving creation information indicating that the data table indicated by the table name is not created in a predetermined high-speed engine, taking a first query engine in the set of query engines as a target query engine.
In some optional implementations of this embodiment, the target statement is obtained from the user side; and a second determining unit (not shown in the figure), which may be further configured to: sending each field in the data table indicated by the table name to the user side; acquiring fields which are returned by a user side and selected by the user from all the fields; and creating the data table indicated by the table name in the high-speed engine based on the selected fields, wherein the fields in the created data table indicated by the table name are fields selected from various fields by the user.
In some optional implementations of this embodiment, the apparatus 600 further includes: a second query unit (not shown in the figure) configured to, in response to the first query engine failing to query the data in the data table corresponding to the target statement, query the data in the data table corresponding to the target statement by using a second query engine, wherein the second query engine is a query engine in the query engine set that is different from the first query engine.
In some optional implementations of this embodiment, the parsing unit 602 may be further configured to: and matching the target statement by adopting a predetermined regular expression to obtain at least one table name contained in the target statement.
In some optional implementations of this embodiment, the parsing unit 602 may also be further configured to: in response to the target statement containing a first preset keyword, analyzing the target statement into a structured query language according to an analysis rule established for the first preset keyword; at least one table name is extracted from the structured query language.
In some optional implementations of this embodiment, the apparatus 600 further includes a processing unit (not shown in the figure) configured to, in response to the target statement containing a second preset keyword, process the data in the data table with a custom (user-defined) function corresponding to the second preset keyword, so as to perform on that data the custom operation indicated by the second preset keyword.
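One way to picture the keyword-to-function dispatch is a registry that maps each preset keyword to a custom function. The Python sketch below is only illustrative: the keyword `DEDUP`, the function `dedup_rows`, and the row representation (a list of dictionaries with hashable values) are all hypothetical and not taken from the disclosure.

```python
import re
from typing import Callable, Dict, List

Row = Dict[str, object]


def dedup_rows(rows: List[Row]) -> List[Row]:
    """Hypothetical custom operation: drop duplicate rows, keeping order."""
    seen = set()
    unique: List[Row] = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical registry: preset keyword -> custom function.
CUSTOM_FUNCTIONS: Dict[str, Callable[[List[Row]], List[Row]]] = {
    "DEDUP": dedup_rows,
}


def apply_custom_functions(target_statement: str, rows: List[Row]) -> List[Row]:
    """Apply every custom function whose keyword appears in the statement."""
    for keyword, function in CUSTOM_FUNCTIONS.items():
        pattern = rf"\b{re.escape(keyword)}\b"
        if re.search(pattern, target_statement, re.IGNORECASE):
            rows = function(rows)
    return rows
```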
In the apparatus for querying data provided by the above embodiment of the disclosure, the acquiring unit 601 acquires a target statement, where the target statement is used to operate on data in a data table and contains a table name of at least one data table. The parsing unit 602 then parses the target statement to obtain the at least one table name contained in it, and the first determining unit 603 determines, from a predetermined set of query engines, the query engine associated with the at least one table name as the target query engine. Finally, in response to the target statement being a query statement, the first query unit 604 uses the target query engine to query the data in the data table corresponding to the target statement. The apparatus can thus determine, from the query engine set, the query engine used to operate on the data in the data table corresponding to a given target statement, so that different query engines can serve different target statements and each data operation is carried out based on the characteristics of the engine chosen for it. This enriches the ways in which data can be queried and improves the query speed.
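The overall flow (acquire, parse, determine the engine, query) can be sketched end to end in Python. Everything named here is an assumption made for illustration: the regular expression, the `table_to_engine` mapping, the callable engine interface, and the test for a query statement (a leading `SELECT`). The sketch returns `None` when the table names map to different engines or the statement is not a query, and raises `KeyError` for a table name with no associated engine.

```python
import re
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
QueryEngine = Callable[[str], List[Row]]  # assumed engine interface

# Hypothetical pattern for table names after FROM or JOIN.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def route_and_query(
    statement: str,
    table_to_engine: Dict[str, str],
    engines: Dict[str, QueryEngine],
) -> Optional[List[Row]]:
    """Parse table names, pick the associated engine, and run the query."""
    # Parsing step: obtain the table names contained in the statement.
    table_names = TABLE_PATTERN.findall(statement)
    # Determining step: look up the engine associated with each table name.
    engine_names = {table_to_engine[name] for name in table_names}
    if len(engine_names) != 1:
        # No table names, or tables associated with different engines.
        return None
    if not statement.lstrip().upper().startswith("SELECT"):
        # Only query statements are handled in this sketch.
        return None
    # Querying step: run the statement on the target query engine.
    return engines[engine_names.pop()](statement)
```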
Referring now to Fig. 7, a schematic diagram of an electronic device 700 (e.g., the server or terminal device of Fig. 1) suitable for implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), or a vehicle terminal (e.g., a car navigation terminal), and a fixed terminal such as a digital TV or a desktop computer. The terminal device/server shown in Fig. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 7, the electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage device 708 into a Random Access Memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Generally, the following devices may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; a storage device 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While Fig. 7 illustrates an electronic device 700 having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in Fig. 7 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network via the communication device 709, or may be installed from the storage device 708, or may be installed from the ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of embodiments of the present disclosure.
It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be included in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a target statement, wherein the target statement is used for operating on data in a data table and contains a table name of at least one data table; parse the target statement to obtain the at least one table name contained in the target statement; determine, from a predetermined set of query engines, a query engine associated with the at least one table name as a target query engine; and, in response to the target statement being a query statement, query the data in the data table corresponding to the target statement by using the target query engine.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including an acquiring unit, a parsing unit, a first determining unit, and a first query unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the acquiring unit may also be described as a "unit that acquires a target statement".
The foregoing description is only of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments formed by any combination of the above-mentioned features or their equivalents without departing from the inventive concept defined above. For example, a technical solution may be formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.
Claims (12)
1. A method for querying data, comprising:
acquiring a target statement, wherein the target statement is used for operating on data in a data table, and the target statement contains a table name of at least one data table;
parsing the target statement to obtain the at least one table name contained in the target statement;
determining, from a predetermined set of query engines, a query engine associated with the at least one table name as a target query engine; and
in response to the target statement being a query statement, querying data in the data table corresponding to the target statement by using the target query engine.
2. The method of claim 1, wherein the target statement contains table names of at least two data tables; and
the parsing the target statement to obtain the at least one table name contained in the target statement comprises:
parsing the target statement to obtain the at least two table names contained in the target statement; and
the determining, from a predetermined set of query engines, a query engine associated with the at least one table name as a target query engine comprises:
determining, from the predetermined set of query engines, a query engine associated with each table name of the at least two table names, respectively; and
in response to determining that the determined query engines are the same query engine, taking that query engine as the target query engine.
3. The method of claim 2, wherein before the querying data in the data table corresponding to the target statement by using the target query engine in response to the target statement being a query statement, the method further comprises:
in response to determining that there are at least two different query engines among the determined query engines, performing, for each table name of the at least two table names, the following target query engine determining step based on the table name: in response to receiving creation information indicating that the data table indicated by the table name is created in a predetermined high-speed engine, creating the data table indicated by the table name in the high-speed engine, and taking the high-speed engine in which the data table has been created as the target query engine.
4. The method of claim 3, wherein the target query engine determining step further comprises:
in response to the frequency of access to the data in the data table indicated by the table name being greater than or equal to a preset threshold, and in response to receiving creation information indicating that the data table indicated by the table name is not created in the predetermined high-speed engine, taking a first query engine in the set of query engines as the target query engine.
5. The method of claim 3, wherein the target statement is obtained from a user side; and
the creating the data table indicated by the table name in the high-speed engine comprises:
sending each field in the data table indicated by the table name to the user side;
acquiring the fields, returned by the user side, that the user selected from the fields; and
creating the data table indicated by the table name in the high-speed engine based on the selected fields, wherein the fields in the created data table indicated by the table name are the fields selected by the user.
6. The method of claim 4, wherein the method further comprises:
in response to the first query engine not finding the data in the data table corresponding to the target statement, querying the data in the data table corresponding to the target statement by using a second query engine, wherein the second query engine is a query engine in the set of query engines that is different from the first query engine.
7. The method according to one of claims 1 to 6, wherein the parsing the target statement to obtain the at least one table name included in the target statement comprises:
matching the target statement against a predetermined regular expression to obtain the at least one table name contained in the target statement.
8. The method according to one of claims 1 to 6, wherein the parsing the target statement to obtain the at least one table name included in the target statement comprises:
in response to the target statement containing a first preset keyword, parsing the target statement into a Structured Query Language (SQL) statement according to a parsing rule established for the first preset keyword; and
extracting the at least one table name from the SQL statement.
9. The method according to one of claims 1-6, wherein the method further comprises:
in response to the target statement containing a second preset keyword, processing the data in the data table with a custom function corresponding to the second preset keyword, so as to perform on the data in the data table the custom operation indicated by the second preset keyword.
10. An apparatus for querying data, comprising:
an acquiring unit configured to acquire a target statement, wherein the target statement is used for operating on data in a data table, and the target statement contains a table name of at least one data table;
a parsing unit configured to parse the target statement to obtain the at least one table name contained in the target statement;
a first determining unit configured to determine, from a predetermined set of query engines, a query engine associated with the at least one table name as a target query engine; and
a first query unit configured to, in response to the target statement being a query statement, query data in the data table corresponding to the target statement by using the target query engine.
11. An electronic device, comprising:
one or more processors; and
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-9.
12. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911051553.9A CN112307061A (en) | 2019-10-31 | 2019-10-31 | Method and device for querying data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112307061A (en) | 2021-02-02 |
Family
ID=74485203
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911051553.9A Pending CN112307061A (en) | 2019-10-31 | 2019-10-31 | Method and device for querying data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112307061A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006059049A (en) * | 2004-08-19 | 2006-03-02 | Fuji Xerox Co Ltd | Information search system, method, and program |
US20080250057A1 (en) * | 2005-09-27 | 2008-10-09 | Rothstein Russell I | Data Table Management System and Methods Useful Therefor |
CN105975617A (en) * | 2016-05-20 | 2016-09-28 | 北京京东尚科信息技术有限公司 | Multi-partition-table inquiring and processing method and device |
WO2018095351A1 (en) * | 2016-11-28 | 2018-05-31 | 中兴通讯股份有限公司 | Method and device for search processing |
CN108121709A (en) * | 2016-11-28 | 2018-06-05 | 中兴通讯股份有限公司 | A kind of search processing method and device |
CN108572963A (en) * | 2017-03-09 | 2018-09-25 | 北京京东尚科信息技术有限公司 | Information acquisition method and device |
CN109710859A (en) * | 2019-01-21 | 2019-05-03 | 北京字节跳动网络技术有限公司 | Data query method and apparatus |
CN110222072A (en) * | 2019-06-06 | 2019-09-10 | 江苏满运软件科技有限公司 | Data Query Platform, method, equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
KONSTANTINOS A. VOULGARIS et al.: ""Accelerated Search for Non-Negative Greedy Sparse Decomposition via Dimensionality Reduction"", 2019 SENSOR SIGNAL PROCESSING FOR DEFENCE CONFERENCE (SSPD) * |
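Cheng Long: ""Research and Implementation of XML Data Indexing in the Relational-XML Dual-Engine Database Management System CoSQLRX"", China Master's Theses Full-text Database (Information Science and Technology) * |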
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114138830A (en) * | 2021-11-15 | 2022-03-04 | 紫金诚征信有限公司 | Second-level query method and device for mass data of big data and computer medium |
CN114357276A (en) * | 2021-12-23 | 2022-04-15 | 北京百度网讯科技有限公司 | Data query method and device, electronic equipment and storage medium |
CN114357276B (en) * | 2021-12-23 | 2023-08-22 | 北京百度网讯科技有限公司 | Data query method, device, electronic equipment and storage medium |
CN114817299A (en) * | 2022-05-17 | 2022-07-29 | 在线途游(北京)科技有限公司 | Data analysis method and device based on UDAF |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||