
CN102521307A - Parallel query processing method for share-nothing database cluster in cloud computing environment - Google Patents

Info

Publication number
CN102521307A
Authority
CN
China
Prior art keywords
data
back end
hash
obtains
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011103926770A
Other languages
Chinese (zh)
Inventor
李睿峰
王殿成
冯玉
李祥凯
冷建全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingbase Information Technologies Co Ltd
Original Assignee
Beijing Kingbase Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingbase Information Technologies Co Ltd filed Critical Beijing Kingbase Information Technologies Co Ltd
Priority to CN2011103926770A priority Critical patent/CN102521307A/en
Publication of CN102521307A publication Critical patent/CN102521307A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a parallel query processing method for a shared-nothing database cluster in a cloud computing environment. The method comprises the following steps: first, splitting a query plan to obtain slices that are executed by the individual data nodes; second, processing the slices on the data nodes to obtain the hash join result of each data node; and third, sending the data through a gather (aggregate-and-merge) data flow to the control node, which performs hash aggregation to obtain the query result set. Through node data flow operations, data can flow between the nodes while the query executes, the query is executed in parallel, and execution speed is greatly increased.

Description

Parallel query processing method for a shared-nothing database cluster in a cloud computing environment
Technical field
The present invention relates to a parallel query method for databases, and in particular to a parallel query processing method for a shared-nothing database cluster in a cloud computing environment. It belongs to the technical field of database cluster systems.
Background
As enterprise and e-government applications continue to deepen, database applications grow steadily more complex. There is a pressing need to handle massive data processing, massive data storage, and scalability, so that business support systems receive good data storage and query services. A traditional single-node database cannot fundamentally meet these demands.
For this reason, the concept of the computer cluster (a group of loosely integrated computers whose software and/or hardware are connected so that they cooperate closely to complete computing work) has been borrowed, and multiple databases are connected to form a database cluster system. A database cluster system (DBCS) combines cluster technology with database systems. It is a group of complete, autonomous processing units (nodes). Each node has its own hardware resources such as CPU, memory, and disk, and runs an independent operating system and an autonomous database system. The nodes are interconnected through a high-speed dedicated network or a commodity general-purpose network and compute cooperatively, and the system as a whole provides parallel transaction services as a single unified database.
At present there are two kinds of database cluster: the shared-nothing (Share Nothing) database cluster and the shared-disk (Share Disk) database cluster. With the development of databases and the rise of cloud computing, the shared-nothing database cluster has been widely adopted in cloud computing thanks to advantages such as good parallelism.
Database query service is one of the most frequently run workloads in these database cluster systems. Chinese invention patent application No. 201010277129.9 discloses a parallel query method for a distributed database. In that method, first, a query proxy module is set up to receive query commands directly from external modules. Second, several query core modules associated with the query proxy module are set up. By decomposing tasks, the query core modules provide granular queries over multiple dimensions of the data, which improves overall response speed; they query the database concurrently with multiple threads, exploiting the high throughput and high concurrency of commercial databases, while coordinating the threads and managing the query flow. Third, data is returned in batches and a read-ahead mechanism is used, which improves response speed. By applying a divide-and-conquer approach to query tasks, having the query core modules issue queries concurrently, and returning the fastest-to-query granularity of data first, that method significantly improves overall query performance and can meet the fast-query needs of most application systems.
Summary of the invention
The technical problem to be solved by the present invention is to provide a parallel query processing method for a shared-nothing database cluster in a cloud computing environment.
To achieve the above object, the present invention adopts the following technical solution:
A parallel query processing method for a shared-nothing database cluster in a cloud computing environment, characterized by comprising the following steps:
Step one: split the query plan to obtain slices that are executed by the individual data nodes;
Step two: each data node processes its slice, yielding the hash join result of each data node;
Step three: the data nodes send their data through the gather (aggregate-and-merge) data flow to the control node, which performs hash aggregation to obtain the query result set.
Preferably, in step two, the slice processing of the data nodes comprises the following steps:
Step 1) one data node scans a first table to obtain a first hash table, then scans a second table and hash-joins it with the first hash table to obtain a first data set;
Step 2) another data node scans a third table to obtain a third hash table, then scans a fourth table and hash-joins it with the third hash table to obtain a second data set;
Step 3) said other data node performs a hash join of the first data set with the second data set.
Alternatively, the slice processing of the data nodes uses the following steps:
Step 11) one data node scans a first table to obtain a first hash table, then scans a second table and hash-joins it with the first hash table to obtain a first data set; it also sends the second table to the other data nodes through a broadcast data flow operation or a redistribute data flow operation;
Step 12) another data node scans a third table and builds a third hash table from the received second table and the scanned third table, then scans a fourth table and hash-joins it with the third hash table to obtain a second data set;
Step 13) said other data node performs a hash join of the first data set with the second data set.
Preferably, in step three, each data node first performs hash aggregation and sorting on the hash join result it obtained in step two, and then sends the result to the control node through the gather data flow operation.
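The preferred form of step three (aggregate and sort on each data node, then merge on the control node) can be sketched in Python as follows. This is a minimal illustration under assumptions, not the patent's implementation: the function names `local_aggregate` and `control_merge`, the `(custkey, amount)` row shape, and the sample rows are all hypothetical.

```python
from collections import Counter

def local_aggregate(rows):
    """Run on each data node: sum amounts per customer key, then sort descending."""
    totals = Counter()
    for custkey, amount in rows:
        totals[custkey] += amount
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

def control_merge(partials):
    """Run on the control node: merge the gathered partial totals and sort again."""
    totals = Counter()
    for partial in partials:
        for custkey, amount in partial:
            totals[custkey] += amount
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical joined rows held by two data nodes: (custkey, amount).
node_rows = [
    [(1, 10), (2, 5), (1, 3)],
    [(2, 20), (3, 1)],
]
partials = [local_aggregate(rows) for rows in node_rows]  # per-node work
result = control_merge(partials)                          # gather + merge
print(result)
```

Aggregating before the gather keeps the volume sent to the control node proportional to the number of distinct keys per node rather than the number of joined rows.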
The present invention uses node data flow operations so that data can flow between nodes while a query executes. Each data node thereby obtains all the data it needs, and the query is executed concurrently. Because query statements are processed with parallel query support, execution speed increases significantly.
Description of the drawings
The present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a schematic diagram of node data flow in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the query plan tree used in an embodiment of the present invention.
Detailed description
In a cloud computing environment, the data in a shared-nothing database cluster is stored in shards, and each data node holds only part of the data. When certain SQL statements execute, some data nodes can complete the query plan only by operating on the full data set, including data held by other nodes. The present invention therefore inserts node data flow operations at suitable points in the query plan of a data node, so that data can flow between nodes while the query executes. Each data node thereby obtains all the data it needs, and the query is executed concurrently. This is explained in detail below.
A shared-nothing database cluster serving as an embodiment of the invention comprises a control node and one or more data nodes. Each data node stores data, accepts and executes a query plan, and then returns the result of the plan. User data is initially placed on the data nodes in a reasonably uniform way, for example by hash distribution or range distribution. The control node accepts query requests from clients, analyzes each request and generates a query plan, and distributes the query plan to the data nodes so that they execute the query concurrently.
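The two placement schemes mentioned above, hash distribution and range distribution, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the helper names `hash_node`, `range_node`, and `distribute`, the use of CRC32 as the hash function, and the range bounds are not taken from the patent.

```python
import zlib
from bisect import bisect_right

def hash_node(key, n_nodes):
    """Hash distribution: pick a data node from a stable hash of the key."""
    return zlib.crc32(repr(key).encode("utf-8")) % n_nodes

def range_node(key, upper_bounds):
    """Range distribution: node i holds keys below upper_bounds[i];
    keys at or above the last bound go to the final node."""
    return bisect_right(upper_bounds, key)

def distribute(rows, key_of, place):
    """Group rows by the node id that `place` assigns to each row's key."""
    shards = {}
    for row in rows:
        shards.setdefault(place(key_of(row)), []).append(row)
    return shards

rows = [{"custkey": k} for k in range(10)]
by_hash = distribute(rows, lambda r: r["custkey"], lambda k: hash_node(k, 3))
by_range = distribute(rows, lambda r: r["custkey"], lambda k: range_node(k, [3, 6]))

for node, shard in sorted(by_range.items()):
    print(node, [r["custkey"] for r in shard])
```

A stable hash is used instead of Python's built-in `hash` so that the same key maps to the same node on every run and on every machine.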
To maximize query parallelism, the present invention splits the query plan into different slices for the data nodes to execute, and adds appropriate data flow operations to the slices so that each data node can obtain all the data it needs to execute the query. Each slice is a part of the query plan and can work independently on each data node. Some slices contain one more operation than a traditional database query process: data flow. The query plan is split into slices at its data flow operations, and the two sides of a data flow operation lie in different slices.
As noted above, the query plan of some data nodes can complete only with data from other data nodes. Node data flow operations are therefore inserted at suitable points in the query plan slice of such a data node, so that the data needed during query execution flows between nodes. Each data node then obtains all the data it needs, which also ensures that every query operation can execute locally on each data node. Note that not every query plan needs data flow operations; querying system table information, for example, does not.
In the present invention, data flow falls into three types:
A: Gather (aggregate-and-merge) data flow: each data node sends its data after the hash join to a single node, normally the control node, where it is merged.
B: Redistribute data flow: the filtered data is redistributed to other data nodes according to the hash value of the join column (the join key) of the hash join.
C: Broadcast data flow: a data node broadcasts the data it needs to send to multiple nodes. A redistribute data flow sends data to a specific few data nodes, whereas a broadcast data flow sends data to multiple nodes.
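The three data flow types can be modelled on in-memory lists, one list of rows per data node. The Python sketch below is only an illustration under assumptions: the function names `gather`, `redistribute`, and `broadcast`, the CRC32-based `stable_hash`, and the toy rows are hypothetical, and real nodes would exchange rows over the network rather than through shared lists.

```python
import zlib

def stable_hash(value):
    """Deterministic hash so a key always maps to the same node."""
    return zlib.crc32(repr(value).encode("utf-8"))

def gather(shards):
    """Type A: every node sends its rows to one node (normally the control node)."""
    return [row for shard in shards for row in shard]

def redistribute(shards, key):
    """Type B: every row moves to the node chosen by the hash of its join key."""
    n = len(shards)
    out = [[] for _ in range(n)]
    for shard in shards:
        for row in shard:
            out[stable_hash(row[key]) % n].append(row)
    return out

def broadcast(shards):
    """Type C: every node ends up with a copy of the rows of all nodes."""
    everything = gather(shards)
    return [list(everything) for _ in shards]

shards = [[{"k": 1}], [{"k": 2}, {"k": 1}]]
print(broadcast(shards))
print(redistribute(shards, "k"))
print(gather(shards))
```

After `redistribute`, rows with equal join keys sit on the same node, which is what lets a join run locally; after `broadcast`, every node can join against the whole table at the cost of copying it n times.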
Broadcast and redistribute data flows are needed because each data node holds only part of the data. When a join must be performed across different data nodes, a node has to send its own data to other data nodes. With these two operations, every join operation can execute locally on each data node. The gather data flow runs after each data node has executed its query plan: the data is sent to the control node, which integrates it and returns it to the client.
A broadcast data flow guarantees that the data queried on a single data node is complete in all cases; in some cases a redistribute data operation alone is enough to guarantee this completeness.
In a cloud computing environment, a shared-nothing database cluster is a database with distributed storage. Users need to obtain complete data from this distributed storage at any time. The gather data flow ensures that the data returned to the user is complete, while the broadcast and redistribute data flows ensure that the data is complete when a query runs on a single data node.
Following this classification of data flows, the node data flow operations inserted at suitable points in the query plan of a data node also fall into three types: broadcast data operations, redistribute data operations, and gather data operations.
A broadcast data flow operation broadcasts the data that a node needs to send to multiple nodes.
A redistribute data flow operation redistributes the filtered data to other data nodes according to the hash value of the join column value (the join key).
A gather data flow operation sends the data that each node produces after executing its query to a single node, normally the control node, where it is merged.
Taking an order query in the business management field as an example, the specific implementation steps of the invention are further described below with reference to Fig. 1 and Fig. 2.
This order query example assumes the following query statement:
[Figure BDA0000114771310000051: query statement]
The query statement ranks customers from high to low by their total spending over the most recent two months. It involves four tables: the customer table, the orders table, the line item (lineitem) table, and the nation table.
In the shared-nothing database cluster, assume that the data of these four tables is distributed fairly evenly across n data nodes. The query plan is split into as many slices as there are data nodes, that is, n slices. The data flow between the data nodes is shown in Fig. 1, and the corresponding query plan tree is shown in Fig. 2.
The overall steps of the parallel query used by the invention are:
Step one: split the query plan to obtain slices that are executed by the individual data nodes;
Step two: each data node processes its slice, yielding the hash join result of each data node;
Step three: the data nodes send their data through the gather data flow to the control node, which performs hash aggregation to obtain the query result set.
The slice processing of the data nodes in step two is shown in Fig. 1 and comprises:
Step 1) one data node scans a first table to obtain a first hash table, then scans a second table and hash-joins it with the first hash table to obtain a first data set;
Step 2) another data node scans a third table to obtain a third hash table, then scans a fourth table and hash-joins it with the third hash table to obtain a second data set;
Step 3) said other data node performs a hash join of the first data set with the second data set.
Steps 1) to 3) yield the hash join result of each data node; step three is then carried out to obtain the result set.
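Steps 1) to 3) each build a hash table from one input and probe it with another. A minimal Python sketch of that build-and-probe pattern follows; the `hash_join` helper, the column names, and the toy tables are assumptions chosen to mirror the four tables of the example, not code from the patent.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Build a hash table on build_rows, then probe it with probe_rows."""
    table = defaultdict(list)
    for row in build_rows:                      # build phase (scan first input)
        table[row[build_key]].append(row)
    joined = []
    for row in probe_rows:                      # probe phase (scan second input)
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined

# Hypothetical toy tables.
nation = [{"nationkey": 1, "nation": "CN"}]
customer = [{"custkey": 10, "nationkey": 1}, {"custkey": 11, "nationkey": 2}]
orders = [{"orderkey": 100, "custkey": 10}]
lineitem = [{"orderkey": 100, "price": 5}, {"orderkey": 100, "price": 7}]

first = hash_join(nation, customer, "nationkey", "nationkey")   # step 1)
second = hash_join(orders, lineitem, "orderkey", "orderkey")    # step 2)
final = hash_join(first, second, "custkey", "custkey")          # step 3)
print(final)
```

Building on the smaller input keeps the hash table small; here the nation and orders tables play that role.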
If some data on one data node (say, the second table) is needed by several data nodes, the broadcast data flow operation is used. If some data on one data node (say, the second table) is needed by one other data node, the redistribute data flow operation is used.
Specifically, steps 1) to 3) above can be changed to:
Step 11) one data node scans a first table to obtain a first hash table, then scans a second table and hash-joins it with the first hash table to obtain a first data set; it also sends the second table to the other data nodes through a broadcast data flow operation or a redistribute data flow operation;
Step 12) another data node scans a third table and builds a third hash table from the received second table and the scanned third table, then scans a fourth table and hash-joins it with the third hash table to obtain a second data set;
Step 13) said other data node performs a hash join of the first data set with the second data set.
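Why a table sometimes has to be sent to other nodes before a join can be illustrated with a small Python sketch. The two-node layout, the `local_join` helper, and the rows are hypothetical: each node holds one nation row and one customer row, and the matching rows happen to live on different nodes.

```python
def local_join(nation_rows, customer_rows):
    """Join customers to nations using only the rows visible on one node."""
    names = {n["nationkey"]: n["name"] for n in nation_rows}
    return [
        {**c, "nation": names[c["nationkey"]]}
        for c in customer_rows
        if c["nationkey"] in names
    ]

# Hypothetical shards: index 0 is data node 0, index 1 is data node 1.
nation_shards = [[{"nationkey": 1, "name": "A"}], [{"nationkey": 2, "name": "B"}]]
customer_shards = [[{"custkey": 10, "nationkey": 2}], [{"custkey": 11, "nationkey": 1}]]

# Without any data flow, each node joins only its own shards and finds nothing.
without_flow = [local_join(n, c) for n, c in zip(nation_shards, customer_shards)]

# With a broadcast, every node first receives the full nation table.
full_nation = [row for shard in nation_shards for row in shard]
with_broadcast = [local_join(full_nation, c) for c in customer_shards]

print(without_flow)
print(with_broadcast)
```

The local-only join silently drops both customers, while the broadcast version returns one joined row per node, which is the completeness property the data flow operations are meant to provide.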
A specific embodiment is described in detail below with reference to Fig. 2:
Step A. Every data node scans its own nation table data simultaneously and broadcasts it to the other nodes (a broadcast data flow), so that every data node obtains the complete nation table data. Because the nation table has few rows, this step executes quickly. It is assumed here that the query requires the nation table to be broadcast.
Step B. Every node scans its own customer data simultaneously and performs a hash join between the customer data and the hash table built from the received nation table data, producing the RS-CN data set.
Step C. Every node scans its own orders table data simultaneously and filters it, producing the RS-O data set.
Step D. Every node scans its own lineitem table data simultaneously and filters it, producing the RS-L data set.
Step E. Every node simultaneously hash-joins its own RS-O hash table with RS-L, producing the RS-OL data set. Note that this step needs no redistribute data operation, because the distribution key of both the orders table and the lineitem table is the key queried on (the order key). This guarantees that the rows to be joined are already on the same machine, so the n nodes start joining in parallel right away.
Step F. Every node redistributes the RS-OL data set it produced in step E among all nodes by custkey (the customer key), using a redistribute data flow operation. Data can be redistributed between nodes by hash distribution or by range distribution; this embodiment uses hash distribution by default. Afterwards, every node holds its own RS-OL data set.
Step G. Every node hash-joins the RS-CN data set it produced in step B with the RS-OL data set that the redistribute data flow placed on that node.
Step H. Finally, every data node performs aggregation and sorting and sends the result to the control node.
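Steps A to H can be simulated end to end on toy data. The Python sketch below is an illustration under stated assumptions rather than the patented implementation: three data nodes are modelled as lists in one process, the tables and column names are invented, filtering in steps C and D is omitted, customers are assumed to be distributed by custkey, and CRC32 stands in for the hash distribution function.

```python
import zlib
from collections import defaultdict

N = 3  # number of data nodes (assumed)

def node_of(key):
    """Hash distribution: stable mapping from a key to a node id."""
    return zlib.crc32(str(key).encode("utf-8")) % N

def shard(rows, key):
    shards = [[] for _ in range(N)]
    for row in rows:
        shards[node_of(row[key])].append(row)
    return shards

def hash_join(build, probe, key):
    table = defaultdict(list)
    for row in build:
        table[row[key]].append(row)
    return [{**b, **p} for p in probe for b in table.get(p[key], [])]

def aggregate(rows):
    totals = defaultdict(int)
    for r in rows:
        totals[r["custkey"]] += r["price"]
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy tables, each initially spread over the data nodes.
nation_s = shard([{"nationkey": i, "nation": f"N{i}"} for i in range(3)], "nationkey")
customer_s = shard([{"custkey": c, "nationkey": c % 3} for c in range(6)], "custkey")
orders_s = shard([{"orderkey": o, "custkey": o % 6} for o in range(12)], "orderkey")
lineitem_s = shard(
    [{"orderkey": o, "price": 10 * (o + 1)} for o in range(12) for _ in range(2)],
    "orderkey",
)

all_nation = [r for s in nation_s for r in s]                                   # step A
rs_cn = [hash_join(all_nation, customer_s[i], "nationkey") for i in range(N)]   # step B
rs_ol = [hash_join(orders_s[i], lineitem_s[i], "orderkey") for i in range(N)]   # steps C-E
moved = [[] for _ in range(N)]                                                  # step F
for s in rs_ol:
    for row in s:
        moved[node_of(row["custkey"])].append(row)
joined = [hash_join(rs_cn[i], moved[i], "custkey") for i in range(N)]           # step G
partials = [aggregate(rows) for rows in joined]                                 # step H

# Control node: merge the gathered partial results and sort them.
merged = defaultdict(int)
for partial in partials:
    for custkey, total in partial:
        merged[custkey] += total
result = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
print(result)
```

Step E needs no data movement because orders and lineitem are both sharded by orderkey with the same function, and step G is local because step F uses the same custkey mapping that placed the customers.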
The parallel query processing method for a shared-nothing database cluster provided by the invention makes full use of the computing power of every cluster node and processes data query requests concurrently in a cloud computing environment. In theory, the method can keep raising system throughput as nodes are added, continuing to meet the performance demands of massive data queries.
The parallel query processing method for a shared-nothing database cluster provided by the invention has been described in detail above. For those skilled in the art, any obvious modification made to it without departing from the substance of the invention will constitute an infringement of the patent right of the invention and will bear the corresponding legal liability.

Claims (6)

1. A parallel query processing method for a shared-nothing database cluster in a cloud computing environment, the database cluster comprising a control node and a plurality of data nodes, the control node accepting query requests from clients, generating a query plan, and distributing it to the data nodes, characterized by comprising the following steps:
Step one: splitting the query plan to obtain slices that are executed by the individual data nodes;
Step two: each data node processing its slice, yielding the hash join result of each data node;
Step three: the data nodes sending their data through a gather (aggregate-and-merge) data flow operation to the control node, which performs hash aggregation to obtain the query result set.
2. The parallel query processing method for a shared-nothing database cluster according to claim 1, characterized in that:
in step two, the slice processing of the data nodes comprises the following steps:
Step 1) one data node scans a first table to obtain a first hash table, then scans a second table and hash-joins it with the first hash table to obtain a first data set;
Step 2) another data node scans a third table to obtain a third hash table, then scans a fourth table and hash-joins it with the third hash table to obtain a second data set;
Step 3) said other data node performs a hash join of the first data set with the second data set.
3. The parallel query processing method for a shared-nothing database cluster according to claim 1, characterized in that:
in step two, the slice processing of the data nodes comprises the following steps:
Step 11) one data node scans a first table to obtain a first hash table, then scans a second table and hash-joins it with the first hash table to obtain a first data set; it also sends the second table to the other data nodes through a broadcast data flow operation or a redistribute data flow operation;
Step 12) another data node scans a third table and builds a third hash table from the received second table and the scanned third table, then scans a fourth table and hash-joins it with the third hash table to obtain a second data set;
Step 13) said other data node performs a hash join of the first data set with the second data set.
4. The parallel query processing method for a shared-nothing database cluster according to claim 3, characterized in that:
the broadcast data flow operation broadcasts the data that a node needs to send to multiple nodes.
5. The parallel query processing method for a shared-nothing database cluster according to claim 3, characterized in that:
the redistribute data flow operation redistributes the filtered data to other data nodes according to the hash value of the join column value.
6. The parallel query processing method for a shared-nothing database cluster according to claim 1, characterized in that:
in step three, each data node first performs hash aggregation and sorting on the hash join result it obtained in step two, and then sends the result to the control node through the gather data flow operation.
CN2011103926770A 2011-12-01 2011-12-01 Parallel query processing method for share-nothing database cluster in cloud computing environment Pending CN102521307A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011103926770A CN102521307A (en) 2011-12-01 2011-12-01 Parallel query processing method for share-nothing database cluster in cloud computing environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011103926770A CN102521307A (en) 2011-12-01 2011-12-01 Parallel query processing method for share-nothing database cluster in cloud computing environment

Publications (1)

Publication Number Publication Date
CN102521307A true CN102521307A (en) 2012-06-27

Family

ID=46292228

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011103926770A Pending CN102521307A (en) 2011-12-01 2011-12-01 Parallel query processing method for share-nothing database cluster in cloud computing environment

Country Status (1)

Country Link
CN (1) CN102521307A (en)

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102841944A (en) * 2012-08-27 2012-12-26 南京云创存储科技有限公司 Method achieving real-time processing of big data
CN103123652A (en) * 2013-03-14 2013-05-29 曙光信息产业(北京)有限公司 Data query method and cluster database system
CN103136364A (en) * 2013-03-14 2013-06-05 曙光信息产业(北京)有限公司 Cluster database system and data query processing method thereof
CN103399943A (en) * 2013-08-14 2013-11-20 曙光信息产业(北京)有限公司 Communication method and communication device for parallel query of clustered databases
CN103455633A (en) * 2013-09-24 2013-12-18 浪潮齐鲁软件产业有限公司 Method of distributed analysis for massive network detailed invoice data
CN103823834A (en) * 2013-12-03 2014-05-28 华为技术有限公司 Device and method for data transmission among Hash join operators
WO2014139450A1 (en) * 2013-03-13 2014-09-18 Huawei Technologies Co., Ltd. System and method for distributed sql join processing in shared-nothing relational database clusters using stationary tables
CN104572754A (en) * 2013-10-24 2015-04-29 北大方正集团有限公司 Database system and database system access method and device
CN104885078A (en) * 2012-12-29 2015-09-02 华为技术有限公司 Method for two-stage query optimization in massively parallel processing database clusters
CN105007317A (en) * 2015-07-10 2015-10-28 深圳市创梦天地科技有限公司 Data processing method for distributed nodes, and gateway equipment
CN105045871A (en) * 2015-07-15 2015-11-11 国家超级计算深圳中心(深圳云计算中心) Data aggregation query method and apparatus
CN105183901A (en) * 2015-09-30 2015-12-23 北京京东尚科信息技术有限公司 Method and device for reading database table through data query engine
CN106250519A (en) * 2016-08-04 2016-12-21 曙光信息产业(北京)有限公司 Data query method and apparatus for parallel database
US9576026B2 (en) 2013-03-13 2017-02-21 Futurewei Technologies, Inc. System and method for distributed SQL join processing in shared-nothing relational database clusters using self directed data streams
CN106874272A (en) * 2015-12-10 2017-06-20 华为技术有限公司 A kind of distributed connection method and system
CN107229692A (en) * 2017-05-19 2017-10-03 哈工大大数据产业有限公司 A kind of distributed multi-table connecting method and system based on streamline
CN107545005A (en) * 2016-06-28 2018-01-05 华为软件技术有限公司 A kind of data processing method and device
WO2018006594A1 (en) * 2016-07-08 2018-01-11 华为技术有限公司 Method and apparatus for generating hash connection table
WO2018054221A1 (en) * 2016-09-23 2018-03-29 Huawei Technologies Co., Ltd. Pipeline dependent tree query optimizer and scheduler
CN108021578A (en) * 2016-11-03 2018-05-11 北京国双科技有限公司 The relation query method and device of data file
CN108255871A (en) * 2016-12-29 2018-07-06 华为技术有限公司 A kind of data query method and data query node
CN108304504A (en) * 2018-01-18 2018-07-20 吉浦斯信息咨询(深圳)有限公司 A kind of user online status method for quickly querying and system
CN109101621A (en) * 2018-08-09 2018-12-28 中国建设银行股份有限公司 A kind of batch processing method and system of data
CN110019360A (en) * 2017-10-27 2019-07-16 阿里巴巴集团控股有限公司 A kind of data processing method and device
WO2022057357A1 (en) * 2020-09-18 2022-03-24 华为云计算技术有限公司 Data query method and apparatus, and database system
CN117271132A (en) * 2023-10-10 2023-12-22 星环信息科技(上海)股份有限公司 Database-based big data set operation method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1829988A (en) * 2003-08-01 2006-09-06 甲骨文国际公司 Ownership reassignment in a shared-nothing database system
CN102163195A (en) * 2010-02-22 2011-08-24 北京东方通科技股份有限公司 Query optimization method based on unified view of distributed heterogeneous database

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102841944A (en) * 2012-08-27 2012-12-26 南京云创存储科技有限公司 Method achieving real-time processing of big data
CN104885078A (en) * 2012-12-29 2015-09-02 华为技术有限公司 Method for two-stage query optimization in massively parallel processing database clusters
CN104885078B (en) * 2012-12-29 2018-06-15 华为技术有限公司 For the method for the Two-phrase query optimization in MPP data-base cluster
US9152669B2 (en) 2013-03-13 2015-10-06 Futurewei Technologies, Inc. System and method for distributed SQL join processing in shared-nothing relational database clusters using stationary tables
WO2014139450A1 (en) * 2013-03-13 2014-09-18 Huawei Technologies Co., Ltd. System and method for distributed sql join processing in shared-nothing relational database clusters using stationary tables
US9576026B2 (en) 2013-03-13 2017-02-21 Futurewei Technologies, Inc. System and method for distributed SQL join processing in shared-nothing relational database clusters using self directed data streams
CN105247513A (en) * 2013-03-13 2016-01-13 华为技术有限公司 System and method for distributed SQL join processing in shared-nothing relational database clusters using stationary tables
CN103136364A (en) * 2013-03-14 2013-06-05 曙光信息产业(北京)有限公司 Cluster database system and data query processing method thereof
CN103123652A (en) * 2013-03-14 2013-05-29 曙光信息产业(北京)有限公司 Data query method and cluster database system
CN103136364B (en) * 2013-03-14 2016-08-24 曙光信息产业(北京)有限公司 Clustered database system and data query processing method thereof
CN103399943A (en) * 2013-08-14 2013-11-20 曙光信息产业(北京)有限公司 Communication method and communication device for parallel query of clustered databases
CN103455633A (en) * 2013-09-24 2013-12-18 浪潮齐鲁软件产业有限公司 Method of distributed analysis for massive network detailed invoice data
CN104572754A (en) * 2013-10-24 2015-04-29 北大方正集团有限公司 Database system and database system access method and device
CN104572754B (en) * 2013-10-24 2018-06-05 北大方正集团有限公司 A kind of Database Systems, Database Systems access method and device
CN103823834A (en) * 2013-12-03 2014-05-28 华为技术有限公司 Device and method for data transmission among Hash join operators
CN103823834B (en) * 2013-12-03 2017-04-26 华为技术有限公司 Device and method for data transmission among Hash join operators
CN105007317A (en) * 2015-07-10 2015-10-28 深圳市创梦天地科技有限公司 Data processing method for distributed nodes, and gateway equipment
CN105007317B (en) * 2015-07-10 2019-08-06 深圳市创梦天地科技有限公司 Data processing method for distributed nodes, and gateway equipment
CN105045871B (en) * 2015-07-15 2018-09-28 国家超级计算深圳中心(深圳云计算中心) Data aggregation query method and apparatus
CN105045871A (en) * 2015-07-15 2015-11-11 国家超级计算深圳中心(深圳云计算中心) Data aggregation query method and apparatus
CN105183901A (en) * 2015-09-30 2015-12-23 北京京东尚科信息技术有限公司 Method and device for reading database table through data query engine
CN106874272A (en) * 2015-12-10 2017-06-20 华为技术有限公司 Distributed connection method and system
CN106874272B (en) * 2015-12-10 2020-02-14 华为技术有限公司 Distributed connection method and system
CN107545005A (en) * 2016-06-28 2018-01-05 华为软件技术有限公司 Data processing method and device
WO2018006594A1 (en) * 2016-07-08 2018-01-11 华为技术有限公司 Method and apparatus for generating hash connection table
CN106250519A (en) * 2016-08-04 2016-12-21 曙光信息产业(北京)有限公司 Data query method and apparatus for parallel database
CN109791492A (en) * 2016-09-23 2019-05-21 华为技术有限公司 Pipeline dependent tree query optimizer and scheduler
WO2018054221A1 (en) * 2016-09-23 2018-03-29 Huawei Technologies Co., Ltd. Pipeline dependent tree query optimizer and scheduler
US10671607B2 (en) 2016-09-23 2020-06-02 Futurewei Technologies, Inc. Pipeline dependent tree query optimizer and scheduler
CN108021578A (en) * 2016-11-03 2018-05-11 北京国双科技有限公司 Relation query method and device for data files
CN108255871A (en) * 2016-12-29 2018-07-06 华为技术有限公司 Data query method and data query node
CN108255871B (en) * 2016-12-29 2022-01-28 华为技术有限公司 Data query method and data query node
CN107229692A (en) * 2017-05-19 2017-10-03 哈工大大数据产业有限公司 Pipeline-based distributed multi-table join method and system
CN107229692B (en) * 2017-05-19 2018-05-01 哈工大大数据产业有限公司 Pipeline-based distributed multi-table join method and system
CN110019360A (en) * 2017-10-27 2019-07-16 阿里巴巴集团控股有限公司 Data processing method and device
CN108304504A (en) * 2018-01-18 2018-07-20 吉浦斯信息咨询(深圳)有限公司 Method and system for quickly querying user online status
CN109101621A (en) * 2018-08-09 2018-12-28 中国建设银行股份有限公司 Batch data processing method and system
WO2022057357A1 (en) * 2020-09-18 2022-03-24 华为云计算技术有限公司 Data query method and apparatus, and database system
CN117271132A (en) * 2023-10-10 2023-12-22 星环信息科技(上海)股份有限公司 Database-based big data set operation method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN102521307A (en) Parallel query processing method for share-nothing database cluster in cloud computing environment
US20220407781A1 (en) Intelligent analytic cloud provisioning
CN103106249B (en) Parallel data processing system based on Cassandra
Liu et al. Survey of real-time processing systems for big data
Yang et al. Druid: A real-time analytical data store
Gurjar et al. Cloud business intelligence–is what business need today
US6879984B2 (en) Analytical database system that models data to speed up and simplify data analysis
US8538985B2 (en) Efficient processing of queries in federated database systems
CN102375837B (en) Data acquiring system and method
CN102521246A (en) Cloud data warehouse system
CN105824868B (en) Distributed database data processing method and distributed database system
US20140156586A1 (en) Big-fast data connector between in-memory database system and data warehouse system
CN107180113B (en) Big data retrieval platform
CN106126641A (en) Real-time recommendation system and method based on Spark
CN102722553A (en) Distributed type reverse index organization method based on user log analysis
CN102521389A (en) PostgreSQL database cluster system using a mix of solid state drives and hard disk drives, and optimization method thereof
CN108509437A (en) ElasticSearch query acceleration method
CN110990372A (en) Dimensional data processing method and device and data query method and device
Waas Beyond Conventional Data Warehousing—Massively Parallel Data Processing with Greenplum Database: (Invited Talk)
CN103455633A (en) Method of distributed analysis for massive network detailed invoice data
CN103473276A (en) Storage method of very large data and distributed database system and retrieval method thereof
CN105516284A (en) Clustered database distributed storage method and device
CN104871153A (en) System and method for flexible distributed massively parallel processing (mpp) database
CN111126852A (en) BI application system based on big data modeling
CN105022791A (en) Novel KV distributed data storage method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20120627