CN106339432A - System and method for balancing load according to content to be inquired - Google Patents
- Publication number
- CN106339432A (application CN201610688791.0A)
- Authority
- CN
- China
- Prior art keywords
- node
- server
- data
- database
- query
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Fuzzy Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a system for balancing load according to the content being queried. The system comprises a data segmentation server, a master server, node servers and a front-end display module; the master server comprises a node index storage unit, a thread allocation unit, a simplification unit, a main processor and a temporary table storage unit. The invention further discloses a method for balancing load according to the content being queried. Through parallel computation on multiple nodes, the computation of one large database is distributed across multiple node databases, the computing capability of multiple machines and cores is exploited, and the query or statistics speed of large-volume databases is greatly increased. Balancing load according to query content reduces unnecessary concurrent computation, so the concurrent access capability of the whole system improves several-fold or tens of times on the same hardware. The system and method do not rely on special hardware or networks and can be implemented with ordinary PCs and a gigabit or even 100 Mbit network, so the cost-performance ratio is very high.
Description
Technical field
The present invention relates to a database query or statistics system, and specifically to a system that performs load balancing according to query content.
Background art
With the development and spread of computer technology, large databases have rapidly entered industries such as telecommunications and finance. SQL (Structured Query Language) is a set of operation commands designed specifically for databases; it is a database language. Its main function is to communicate with various databases so that different types of databases can interoperate. ANSI (the American National Standards Institute) has designated SQL as the standard language for relational database management systems. When using SQL, the user only needs to state "what to do" without considering "how to do it". SQL statements can be used to perform all kinds of operations on a database, such as updating data in the database or extracting data from it. At present, most popular relational database management systems, such as Oracle, Sybase, Microsoft SQL Server and Access, adopt the SQL standard. However, as informatization deepens, every industry has built a large number of databases, and their data volumes keep growing, which limits the speed of database queries and statistics. In a billing system, for example, business programs must query the database frequently, the data volume involved is huge, and the access frequency is very high, so excessive database interaction degrades the performance of the programs.
To improve the query and statistics speed of a database, the most common approach is to optimize the hardware system. For example, Chinese patent application No. 200610041548.6 proposes a method for accelerating database searches. As shown in Fig. 1, it sets aside a shared memory segment in system memory for storing data and data indexes. A daemon process loads the data and data indexes from the database into the corresponding shared memory segment in an agreed format, where business processes can call them. The daemon also queries the database records periodically or in a loop and promptly writes updated data content to the shared memory segment.
This method can improve database query speed to some extent and reduce dependence on database performance. For queries or statistics over large-volume databases, however, it cannot fundamentally solve the problem of slow queries, because it is limited by hardware computing speed. Improvements in computing capability, such as raising CPU frequency, adding memory or increasing disk access speed, have limited headroom, and hardware upgrades require substantial investment. How to effectively raise the speed of queries or statistics on large databases has therefore long been a problem to be solved. In most current distributed systems, data is distributed randomly or by hashing. These are purely mathematical methods and do not match the distribution of the business data. Take citizen ID numbers: under hash distribution, the data of people from one province is spread across all nodes, so a query on that province's data has to access every node. Under heavy concurrent access, the concurrency performance of the system is poor, which inconveniences users.
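The contrast between hash distribution and content-based distribution can be illustrated with a minimal sketch. The node names, the province-to-node mapping and the synthetic ID numbers below are assumptions made for illustration only; the sketch relies only on the fact that the first two digits of an 18-digit citizen ID number encode the province.

```python
import hashlib

NODES = ["n1", "n2", "n3", "n4"]  # hypothetical node names

def hash_node(id_number: str) -> str:
    """Hash distribution: the node depends only on a digest of the whole key."""
    digest = hashlib.md5(id_number.encode()).digest()
    return NODES[int.from_bytes(digest[:4], "big") % len(NODES)]

# Hypothetical content-based rule: one province code per node.
PROVINCE_TO_NODE = {"11": "n1", "31": "n2", "44": "n3", "51": "n4"}

def content_node(id_number: str) -> str:
    """Content-based distribution: the node follows the province prefix."""
    return PROVINCE_TO_NODE[id_number[:2]]

# 200 synthetic ID numbers that all share province code 44.
ids = [f"44{serial:016d}" for serial in range(200)]

hash_nodes = {hash_node(i) for i in ids}
content_nodes = {content_node(i) for i in ids}

print("nodes touched with hash distribution:", sorted(hash_nodes))
print("nodes touched with content-based distribution:", sorted(content_nodes))
```

With the content-based rule, a query restricted to one province needs to visit only the node that holds that province, whereas the hashed layout spreads the same records over several nodes.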
Summary of the invention
The object of the present invention is to provide a system that performs load balancing according to query content, so as to solve the problems raised in the background art above.
To achieve this object, the present invention provides the following technical solution:
A system for load balancing according to query content includes a data segmentation server, a master server, node servers and a front-end display module. The master server is connected to the data segmentation server, the front-end display module and the node servers. There are two or more node servers. The data segmentation server is connected to the source database that stores the mass data, and communicates with the node servers, each of which has independent computing capability. The master server includes a node index storage unit, a thread allocation unit, a simplification unit, a main processor and a temporary table storage unit. The node index storage unit is connected to the data segmentation server, the simplification unit and the thread allocation unit; the thread allocation unit is connected to the node servers; and the main processor is connected to the temporary table storage unit.
As a further solution of the present invention: each node server includes a node processor connected to a node database, and the node server is an ordinary PC.
As a further solution of the present invention: the data segmentation server and the node servers are connected in a wired or wireless manner.
The method for load balancing according to query content comprises the following steps:
Step 1: set up multiple node databases;
Step 2: the data segmentation server splits the mass data in the source database according to a rule. When the data is split, if the mapping between data content and node servers is small in volume, it can be used directly as the index information; if the mapping is large, parsing may take too long, so the simplification unit can simplify the index information to improve the parsing efficiency of the thread allocation unit. The split data is then distributed to the node database of each node server;
Step 3: according to the splitting rule, index information representing the data content assigned to each node database is formed while the data is split, and is stored in the node index storage unit;
Step 4: according to the splitting rule, the index information is simplified;
Step 5: the thread allocation unit parses the query or statistics parameters and, using the index information in the node index storage unit, assigns query or statistics tasks to the node databases, identifying the specific nodes that correspond to each query or statistic;
Step 6: each node processor queries or aggregates its node database in parallel and returns the result to the master server. Each node database can compute independently, so each takes a share of the query or statistics task, which greatly improves database access efficiency;
Step 7: if the result set received by the master server is small or the number of node servers is small, the master server can pass the node servers' query or statistics results directly to the front-end display module; if the number of node servers is large or the data volume returned to the master server is large, the results can be copied into the temporary table storage unit, which consolidates them into one temporary table;
Step 8: the master server queries or aggregates the temporary table again to form the final result set and transmits it to the front-end display module, which renders the received data as charts, tables and similar forms and interacts with the technician.
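Steps 5 and 6 can be sketched as follows. This is a minimal illustration and not the patented implementation: the in-memory `NODE_DATA` dictionaries stand in for node databases, `NODE_INDEX` stands in for the node index storage unit, and a thread pool stands in for the thread allocation unit. All names and figures are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for three node databases holding (store, amount) rows.
NODE_DATA = {
    "n1": [("store1", 120), ("store1", 80)],
    "n2": [("store2", 50)],
    "n3": [("store3", 70)],
}
# Hypothetical index information: which node holds which store.
NODE_INDEX = {"store1": "n1", "store2": "n2", "store3": "n3"}

def query_node(node, stores):
    """Stand-in for a node processor: sum the amounts per requested store."""
    totals = {}
    for store, amount in NODE_DATA[node]:
        if store in stores:
            totals[store] = totals.get(store, 0) + amount
    return totals

def run_query(stores):
    # Step 5: resolve the query content to the specific nodes that hold it.
    targets = sorted({NODE_INDEX[s] for s in stores})
    # Step 6: query only those nodes, in parallel; map() preserves target order.
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        partials = list(pool.map(lambda n: query_node(n, stores), targets))
    return targets, partials

targets, partials = run_query({"store1", "store2"})
print(targets, partials)
```

Node n3 is never contacted for this query, which is the saving the method aims for when many queries run concurrently.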
Compared with the prior art, the beneficial effects of the present invention are as follows. Through multi-node parallel computing, the invention distributes the computation of one large database across multiple node databases, making full use of the simultaneous computing power of multiple machines and multiple cores and greatly increasing the query or statistics speed of large-volume databases. Unlike hardware optimization, the invention is not limited by upgrade headroom, and the query or statistics speed can be improved by 10, 100 or even 1000 times. The invention balances load according to the content being queried: for each query it first determines which nodes the query may access, which greatly reduces unnecessary parallel computation and, on the same hardware, can raise the concurrent access capability of the whole system several-fold or even tens of times. The node servers are ordinary PCs, so for the same gain in query or statistics speed, adding node servers costs less than upgrading the master server's hardware. The invention does not depend on special hardware or networks; it can be implemented with ordinary PCs and a gigabit or even 100 Mbit network, giving a very high cost-performance ratio.
Brief description of the drawings
Fig. 1 is a structural diagram of the system for load balancing according to query content.
Fig. 2 is a workflow diagram of the system for load balancing according to query content.
Fig. 3 is a schematic diagram of a large-volume source database in the system for load balancing according to query content.
Fig. 4 is a schematic diagram of the parsing of the large-volume source database in the system for load balancing according to query content.
Reference numerals: 11 - master server, 12 - node server, 13 - source database, 14 - data segmentation server, 15 - main processor, 16 - temporary table storage unit, 17 - node database, 18 - node processor, 19 - front-end display module, 20 - node index storage unit, 21 - thread allocation unit, 22 - simplification unit.
Detailed description of the embodiments
The technical solution of this patent is described in more detail below with reference to specific embodiments.
Referring to Figs. 1-2, a system for load balancing according to query content includes a data segmentation server 14, a master server 11, node servers 12 and a front-end display module 19. The master server 11 is connected to the data segmentation server 14, the front-end display module 19 and the node servers 12. There are two or more node servers 12. The data segmentation server 14 is connected to the source database 13 that stores the mass data, and communicates with the node servers 12, each of which has independent computing capability. The master server 11 includes a node index storage unit 20, a thread allocation unit 21, a simplification unit 22, a main processor 15 and a temporary table storage unit 16. The node index storage unit 20 is connected to the data segmentation server 14, the simplification unit 22 and the thread allocation unit 21; the thread allocation unit 21 is connected to the node servers 12; and the main processor 15 is connected to the temporary table storage unit 16. Each node server 12 includes a node processor 18 connected to a node database 17, and is an ordinary PC. The data segmentation server 14 and the node servers 12 are connected in a wired or wireless manner.
Embodiment 1
Referring to Figs. 3-4, Fig. 3 is a schematic diagram of a large-volume source database. The source database contains four tables: a store table, a sales table, a time table and a product table, with data volumes of 400,000, 100 million, 1,825 and 1,000 respectively. The data in the source database is first split and distributed to the node databases. Because the store and sales tables are large while the time and product tables are small, the store and sales tables are split on the store field, whereas the time and product tables are not split but copied in full to every node database. When splitting, the data can also be sorted by the city and region fields so that, as far as possible, the data of one city or one region lands in a single node database or in adjacent node databases.
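The splitting rule of this embodiment can be sketched as below. The store records, city and region values, and the contiguous-chunk assignment are assumptions chosen for illustration; the text only requires that large tables be split on the store field, that small tables be copied to every node, and that stores of one city or region end up on the same or adjacent nodes where possible. For brevity the sketch splits only the sales table.

```python
def assign_nodes(stores, node_names):
    """Sort stores by region and city, then cut the sorted list into
    contiguous chunks so that neighbouring stores share a node."""
    ordered = sorted(stores, key=lambda s: (s["region"], s["city"], s["name"]))
    return {
        s["name"]: node_names[i * len(node_names) // len(ordered)]
        for i, s in enumerate(ordered)
    }

def split_tables(sales, time_rows, product_rows, store_to_node):
    """Split the large sales table by store; replicate the small tables."""
    nodes = {
        node: {"sales": [], "time": list(time_rows), "product": list(product_rows)}
        for node in set(store_to_node.values())
    }
    for row in sales:
        nodes[store_to_node[row["store"]]]["sales"].append(row)
    return nodes

# Hypothetical sample data.
stores = [
    {"name": "store1", "city": "Shanghai", "region": "east"},
    {"name": "store2", "city": "Beijing", "region": "north"},
    {"name": "store3", "city": "Shanghai", "region": "east"},
    {"name": "store4", "city": "Beijing", "region": "north"},
]
sales = [
    {"store": "store1", "amount": 10},
    {"store": "store2", "amount": 20},
    {"store": "store3", "amount": 30},
]

store_to_node = assign_nodes(stores, ["n1", "n2"])
node_dbs = split_tables(sales, [{"day": "2016-08-19"}], [{"product": "p1"}], store_to_node)
print(store_to_node)
```

The mapping returned by `assign_nodes` doubles as the raw index information that the following paragraphs describe.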
Index information is then formed according to the splitting rule. Assuming the data is split by store name, a mapping between store names and node servers is formed. For ease of description, let the store names be store1, store2, store3, and so on; the index information can then be represented by Table 1:
Table 1
Node server | Store name |
---|---|
n1 | store1 |
n2 | store2 |
n3 | store3 |
If the index information is large (for example, the store names are long or there are many stores), it can be simplified. For example, a mapping table with an upper-level classification field can be produced, as shown in Table 2:
Table 2
Node server | Store name | Upper-level field of store name |
---|---|---|
n1 | store1 | a |
n2 | store2 | b |
n3 | store3 | c |
In Table 2, the upper-level field represents the store names with the letters a, b and c. The upper-level field can also be an abbreviation of the store name, a specific symbol, and so on; its purpose is to simplify the lookup and reduce parsing time.
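One way to read the simplified index is as a second, shorter key that resolves to a node without comparing full store names. The sketch below takes the rows of Table 2 as given; the dictionary layout and the function name are assumptions, not something the patent specifies.

```python
# Rows of Table 2: (node server, store name, upper-level field).
TABLE_2 = [
    ("n1", "store1", "a"),
    ("n2", "store2", "b"),
    ("n3", "store3", "c"),
]

# Full index keyed by store name, and simplified index keyed by the short code.
name_to_code = {name: code for _, name, code in TABLE_2}
code_to_node = {code: node for node, _, code in TABLE_2}

def nodes_for_stores(store_names):
    """Resolve store names to nodes through the short upper-level codes."""
    codes = {name_to_code[name] for name in store_names}
    return sorted({code_to_node[code] for code in codes})

print(nodes_for_stores(["store1", "store2"]))
```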
When a user accesses the database, the query or statistics parameters are parsed and, together with the index information, used to assign query or statistics tasks to the node databases. The query or statistics parameters here can be the query content, the user's permissions, and so on. For example, when a user needs data related to store1 and store2, the data for these stores has been split onto nodes n1 and n2 respectively, so according to the index information the system starts two threads, one on each of n1 and n2, to compute in parallel, and starts no thread on n3. When many users access the database at the same time, this greatly reduces the computation the system performs. Moreover, if a user is the manager of store1 and is only permitted to query data related to store1, then even if the user's query covers store1 and store2, the system takes the intersection of the query content and the user's permissions (this is the parsing process) and starts a thread only on n1.
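The parsing step described above, intersecting the query content with the user's permissions and then looking up the nodes, can be sketched as follows. The function name `plan_threads` and the plan format are hypothetical; the index is the one from Table 1.

```python
# Index information from Table 1: store name -> node server.
INDEX = {"store1": "n1", "store2": "n2", "store3": "n3"}

def plan_threads(requested, permitted, index):
    """Return {node: [stores]} for the stores that are both requested and
    permitted. One thread would be started per key of the returned plan."""
    allowed = set(requested) & set(permitted)
    plan = {}
    for store in sorted(allowed):
        plan.setdefault(index[store], []).append(store)
    return plan

# A user with full permissions asks for store1 and store2: threads on n1 and n2.
full = plan_threads({"store1", "store2"}, {"store1", "store2", "store3"}, INDEX)

# The manager of store1 asks for the same data: a thread on n1 only.
manager = plan_threads({"store1", "store2"}, {"store1"}, INDEX)

print(full, manager)
```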
After the thread tasks have been assigned, the data in each node database is queried or aggregated, that is, each node database executes its SQL statement. The result sets of the nodes are then imported into a temporary table and consolidated, and a second query or aggregation is run: the SQL statement is executed again on the fully imported temporary table to obtain the final result set. Finally, the result set is passed to the front-end display module and presented using display controls such as tables and charts.
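The consolidation step can be sketched with an in-memory SQLite database standing in for the master server's temporary table storage unit. The table name, column names and partial results are assumptions for illustration; the patent does not name a database engine.

```python
import sqlite3

# Hypothetical partial result sets returned by two node servers.
partials = {
    "n1": [("store1", 200.0)],
    "n2": [("store2", 50.0), ("store2", 25.0)],
}

conn = sqlite3.connect(":memory:")
# The temporary table collects the result sets of all queried nodes.
conn.execute("CREATE TEMP TABLE partial_sales (store TEXT, amount REAL)")
for rows in partials.values():
    conn.executemany("INSERT INTO partial_sales VALUES (?, ?)", rows)

# The master server runs the aggregation again over the temporary table.
final = conn.execute(
    "SELECT store, SUM(amount) FROM partial_sales GROUP BY store ORDER BY store"
).fetchall()
conn.close()

print(final)
```

Re-running the aggregate on the master works directly for sums and counts; aggregates such as averages would need the nodes to return sums and counts separately, which is a design consideration beyond what the text states.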
The preferred embodiments of this patent have been described in detail above, but the patent is not limited to these embodiments; within the knowledge of a person of ordinary skill in the art, various changes can be made without departing from the spirit of this patent.
Claims (5)
1. A system for load balancing according to query content, characterized in that it includes a data segmentation server, a master server, node servers and a front-end display module; the master server is connected to the data segmentation server, the front-end display module and the node servers; there are two or more node servers; the data segmentation server is connected to the source database that stores the mass data, and communicates with the node servers, each of which has independent computing capability; the master server includes a node index storage unit, a thread allocation unit, a simplification unit, a main processor and a temporary table storage unit; the node index storage unit is connected to the data segmentation server, the simplification unit and the thread allocation unit; the thread allocation unit is connected to the node servers; and the main processor is connected to the temporary table storage unit.
2. The system for load balancing according to query content according to claim 1, characterized in that each node server includes a node processor connected to a node database, and the node server is an ordinary PC.
3. The system for load balancing according to query content according to claim 1, characterized in that the data segmentation server and the node servers are connected in a wired or wireless manner.
4. A working method of the system for load balancing according to query content according to any one of claims 1-3, characterized by the following steps:
Step 1: set up multiple node databases;
Step 2: the data segmentation server splits the mass data in the source database according to a rule. When the data is split, if the mapping between data content and node servers is small in volume, it can be used directly as the index information; if the mapping is large, parsing may take too long, so the simplification unit can simplify the index information to improve the parsing efficiency of the thread allocation unit. The split data is then distributed to the node database of each node server;
Step 3: according to the splitting rule, index information representing the data content assigned to each node database is formed while the data is split, and is stored in the node index storage unit;
Step 4: according to the splitting rule, the index information is simplified;
Step 5: the thread allocation unit parses the query or statistics parameters and, using the index information in the node index storage unit, assigns query or statistics tasks to the node databases, identifying the specific nodes that correspond to each query or statistic;
Step 6: each node processor queries or aggregates its node database in parallel and returns the result to the master server.
5. Wherein each node database can compute independently, so each takes a share of the query or statistics task, which greatly improves database access efficiency;
Step 7: if the result set received by the master server is small or the number of node servers is small, the master server can pass the node servers' query or statistics results directly to the front-end display module; if the number of node servers is large or the data volume returned to the master server is large, the results can be copied into the temporary table storage unit, which consolidates them into one temporary table;
Step 8: the master server queries or aggregates the temporary table again to form the final result set and transmits it to the front-end display module, which renders the received data as charts, tables and similar forms and interacts with the technician.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610688791.0A CN106339432A (en) | 2016-08-19 | 2016-08-19 | System and method for balancing load according to content to be inquired |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610688791.0A CN106339432A (en) | 2016-08-19 | 2016-08-19 | System and method for balancing load according to content to be inquired |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106339432A true CN106339432A (en) | 2017-01-18 |
Family
ID=57824290
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610688791.0A Pending CN106339432A (en) | 2016-08-19 | 2016-08-19 | System and method for balancing load according to content to be inquired |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106339432A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101120340A (en) * | 2004-02-21 | 2008-02-06 | 数据迅捷股份有限公司 | Ultra-shared-nothing parallel database |
US20110125745A1 (en) * | 2009-11-25 | 2011-05-26 | Bmc Software, Inc. | Balancing Data Across Partitions of a Table Space During Load Processing |
CN101908075A (en) * | 2010-08-17 | 2010-12-08 | 上海云数信息科技有限公司 | SQL-based parallel computing system and method |
CN101916280A (en) * | 2010-08-17 | 2010-12-15 | 上海云数信息科技有限公司 | Parallel computing system and method for carrying out load balance according to query contents |
CN101916281A (en) * | 2010-08-17 | 2010-12-15 | 上海云数信息科技有限公司 | Concurrent computational system and non-repetition counting method |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019128978A1 (en) * | 2017-12-29 | 2019-07-04 | 阿里巴巴集团控股有限公司 | Database system, and method and device for querying database |
US11789957B2 (en) | 2017-12-29 | 2023-10-17 | Alibaba Group Holding Limited | System, method, and apparatus for querying a database |
CN109933719A (en) * | 2019-01-30 | 2019-06-25 | 维沃移动通信有限公司 | A kind of searching method and terminal device |
CN111309805A (en) * | 2019-12-13 | 2020-06-19 | 华为技术有限公司 | Data reading and writing method and device for database |
CN111309805B (en) * | 2019-12-13 | 2023-10-20 | 华为技术有限公司 | Data reading and writing method and device for database |
US11868333B2 (en) | 2019-12-13 | 2024-01-09 | Huawei Technologies Co., Ltd. | Data read/write method and apparatus for database |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10963428B2 (en) | Multi-range and runtime pruning | |
US20220253421A1 (en) | Index Sharding | |
CN106547796B (en) | Database execution method and device | |
US20200019552A1 (en) | Query optimization method and related apparatus | |
US8538954B2 (en) | Aggregate function partitions for distributed processing | |
US20170083573A1 (en) | Multi-query optimization | |
US6801903B2 (en) | Collecting statistics in a database system | |
CN101916280A (en) | Parallel computing system and method for carrying out load balance according to query contents | |
CN111767303A (en) | Data query method and device, server and readable storage medium | |
WO2020135613A1 (en) | Data query processing method, device and system, and computer-readable storage medium | |
CN110909111B (en) | Distributed storage and indexing method based on RDF data characteristics of knowledge graph | |
CN101908075A (en) | SQL-based parallel computing system and method | |
CN104123346A (en) | Structural data searching method | |
US20100235344A1 (en) | Mechanism for utilizing partitioning pruning techniques for xml indexes | |
Labouseur et al. | Scalable and Robust Management of Dynamic Graph Data. | |
US11809468B2 (en) | Phrase indexing | |
US20230401210A1 (en) | Just-In-Time Injection In A Distributed Database | |
CN101916281B (en) | Concurrent computational system and non-repetition counting method | |
CN114297173A (en) | Knowledge graph construction method and system for large-scale mass data | |
CN106339432A (en) | System and method for balancing load according to content to be inquired | |
Braganholo et al. | A survey on xml fragmentation | |
US20230315701A1 (en) | Data unification | |
Xu et al. | Semantic connection set-based massive RDF data query processing in Spark environment | |
CN112818010B (en) | Database query method and device | |
US20170031909A1 (en) | Locality-sensitive hashing for algebraic expressions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20170118 |