CN118277456B - Initiating node output method in MPP distributed system - Google Patents
Initiating node output method in MPP distributed system Download PDFInfo
- Publication number
- CN118277456B CN118277456B CN202410704629.8A CN202410704629A CN118277456B CN 118277456 B CN118277456 B CN 118277456B CN 202410704629 A CN202410704629 A CN 202410704629A CN 118277456 B CN118277456 B CN 118277456B
- Authority
- CN
- China
- Prior art keywords
- node
- hash
- computing
- computing node
- obtaining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 230000000977 initiatory effect Effects 0.000 title claims abstract description 35
- 238000000034 method Methods 0.000 title claims abstract description 27
- 238000013507 mapping Methods 0.000 claims abstract description 27
- 239000011159 matrix material Substances 0.000 claims abstract description 25
- 238000001914 filtration Methods 0.000 claims abstract description 6
- 238000004364 calculation method Methods 0.000 claims abstract description 4
- 239000012634 fragment Substances 0.000 claims abstract 9
- 238000013467 fragmentation Methods 0.000 claims abstract 4
- 238000006062 fragmentation reaction Methods 0.000 claims abstract 4
- 238000012163 sequencing technique Methods 0.000 claims abstract 2
- 238000007726 management method Methods 0.000 description 18
- 239000000463 material Substances 0.000 description 2
- 101000822425 Arthrobacter sp. (strain KUJ 8602) Guanidinobutyrase Proteins 0.000 description 1
- 101001121408 Homo sapiens L-amino-acid oxidase Proteins 0.000 description 1
- 102100026388 L-amino-acid oxidase Human genes 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2255—Hash tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1023—Server selection for load balancing based on a hash applied to IP addresses or costs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1029—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers using data related to the state of servers by a load balancer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
- H04L67/1061—Peer-to-peer [P2P] networks using node-based peer discovery mechanisms
- H04L67/1065—Discovery involving distributed pre-established resource-based relationships among peers, e.g. based on distributed hash tables [DHT]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1095—Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
技术领域Technical Field
本发明涉及数据库技术领域,尤其涉及一种MPP分布式系统中的发起节点输出方法。The present invention relates to the field of database technology, and in particular to an initiating node output method in an MPP distributed system.
背景技术Background Art
在一个MPP分布式系统中,通常包含管理节点和计算节点,管理节点是任务发起节点,负责任务的计划和调度,并接收从计算节点返回的数据;计算节点是数据所在节点,负责任务的执行,并将需要返回的数据返回给发起节点;所以发起节点相当于总调度师,发挥重要的作用。In an MPP distributed system, it usually includes management nodes and computing nodes. The management node is the task initiating node, which is responsible for planning and scheduling tasks and receiving data returned from the computing nodes. The computing nodes are the nodes where the data is located, which are responsible for executing tasks and returning the data that needs to be returned to the initiating node. Therefore, the initiating node is equivalent to the chief dispatcher and plays an important role.
当在网络负载不重,并发任务数不高时,选择哪个管理节点作为任务的发起节点,似乎不太关键,而对于业务负载量大,高并发场景下,网络带宽被挤压,跨网络的通讯网络延迟高,网络就很容易成为性能瓶颈,从而造成任务积压,不能及时完成场景业务,这时候如何正确的选择任务的发起节点就显得至关重要。When the network load is not heavy and the number of concurrent tasks is not high, it does not seem to be critical to choose which management node to use as the initiating node for the task. However, in scenarios with heavy business load and high concurrency, the network bandwidth is squeezed and the cross-network communication network latency is high. The network can easily become a performance bottleneck, resulting in a backlog of tasks and the inability to complete the scenario business in a timely manner. At this time, it is crucial to correctly select the initiating node for the task.
以往DBA在选择任务发起节点时,通常会采用随机选择或顺序选择的方式,但是这两种方法都没有考虑网络对于性能的影响,使得sql不能运行在最佳的发起节点上,可能会导致由于网络瓶颈带来的性能的异常,影响用户使用。In the past, DBAs usually used random selection or sequential selection to select task initiation nodes. However, neither of these two methods considered the impact of the network on performance, which prevented SQL from running on the best initiation node. This may cause performance anomalies due to network bottlenecks, affecting user usage.
发明内容Summary of the invention
本发明旨在至少解决相关技术中存在的技术问题之一。为此,本发明提供一种MPP分布式系统中的发起节点输出方法。The present invention aims to solve at least one of the technical problems existing in the related art. To this end, the present invention provides an initiating node output method in an MPP distributed system.
本发明提供一种MPP分布式系统中的发起节点输出方法,包括:The present invention provides an initiating node output method in an MPP distributed system, comprising:
S1:采集数据库集群中的节点数量及节点IP,根据分片信息进行排序获得节点分片及节点IP的第一映射关系;S1: Collect the number of nodes and node IPs in the database cluster, sort them according to the sharding information to obtain the first mapping relationship between the node shards and the node IPs;
S2:采集服务器信息,根据所述服务器信息创建计算节点与管理节点的IP距离矩阵,获得距离矩阵;S2: Collect server information, create an IP distance matrix between computing nodes and management nodes according to the server information, and obtain a distance matrix;
S3:为每个计算节点分配多个哈希桶,通过管理节点的nodedatamap表记录计算节点与哈希桶的对应关系,并根据nodedatamap表信息及所述对应关系获取节点分片与计算节点的第二映射关系;S3: Allocate multiple hash buckets to each computing node, record the correspondence between the computing nodes and the hash buckets through the nodedatamap table of the management node, and obtain the second mapping relationship between the node shards and the computing nodes according to the nodedatamap table information and the correspondence;
S4:提取查询语句的过滤条件中的哈希键,并由所述第二映射关系及所述第一映射关系,根据所述哈希键查询获得所述查询语句对应的计算节点的节点IP,获得节点IP查询结果;S4: extracting a hash key in the filter condition of the query statement, and obtaining the node IP of the computing node corresponding to the query statement according to the hash key query based on the second mapping relationship and the first mapping relationship, and obtaining a node IP query result;
S5:由所述距离矩阵,根据所述节点IP查询结果查询获得计算节点和管理节点极小值,将极小值对应的节点作为发起节点输出。S5: Obtain the minimum values of computing nodes and management nodes from the distance matrix according to the node IP query result, and output the node corresponding to the minimum value as the initiating node.
根据本发明提供的一种MPP分布式系统中的发起节点输出方法,步骤S1中的所述第一映射关系为一维数组。According to an initiating node output method in an MPP distributed system provided by the present invention, the first mapping relationship in step S1 is a one-dimensional array.
根据本发明提供的一种MPP分布式系统中的发起节点输出方法,步骤S2中的所述距离矩阵为二维矩阵。According to an initiating node output method in an MPP distributed system provided by the present invention, the distance matrix in step S2 is a two-dimensional matrix.
根据本发明提供的一种MPP分布式系统中的发起节点输出方法,所述距离矩阵中,当计算节点的IP与管理节点的IP相同时,计算节点与管理节点的距离为0,当计算节点的IP与管理节点的IP不同时,计算节点与管理节点的距离为1。According to an initiating node output method in an MPP distributed system provided by the present invention, in the distance matrix, when the IP of the computing node is the same as the IP of the management node, the distance between the computing node and the management node is 0; when the IP of the computing node is different from the IP of the management node, the distance between the computing node and the management node is 1.
根据本发明提供的一种MPP分布式系统中的发起节点输出方法,步骤S3中,为每个计算节点分配多个哈希桶时,分配方式为根据哈希桶总数均匀分配。According to an initiating node output method in an MPP distributed system provided by the present invention, in step S3, when multiple hash buckets are allocated to each computing node, the allocation method is uniform allocation according to the total number of hash buckets.
根据本发明提供的一种MPP分布式系统中的发起节点输出方法,所述哈希桶总数为65536个。According to an initiating node output method in an MPP distributed system provided by the present invention, the total number of hash buckets is 65536.
根据本发明提供的一种MPP分布式系统中的发起节点输出方法,步骤S4进一步包括:According to an initiating node output method in an MPP distributed system provided by the present invention, step S4 further comprises:
S41:提取查询语句的过滤条件中分布列的哈希值;S41: extracting the hash value of the distribution column in the filter condition of the query statement;
S42:根据哈希值计算获得哈希键;S42: Obtain a hash key according to the hash value calculation;
S43:由所述第二映射关系,根据所述哈希键获取查询语句对应的数据所在分片;S43: Obtaining the shard where the data corresponding to the query statement is located according to the hash key based on the second mapping relationship;
S44:由所述第一映射关系根据所述数据所在分片获取计算节点对应的节点IP,获得节点IP查询结果。S44: Obtain the node IP corresponding to the computing node according to the shard where the data is located by using the first mapping relationship to obtain a node IP query result.
根据本发明提供的一种MPP分布式系统中的发起节点输出方法,步骤S4中,所述过滤条件为含有where条件且存在等式分布列的精确查询语句。According to an initiating node output method in an MPP distributed system provided by the present invention, in step S4, the filtering condition is a precise query statement containing a where condition and having an equality distribution column.
本发明提供的一种MPP分布式系统中的发起节点输出方法,用于解决当并发压力大,网络负载重时,由于不正确的选择造成网络负载加剧问题,该方法能够根据MPP特点,自动选择最优发起节点,保证所选择的发起节点与计算节点为同一台机器,节省网络开销,避免了由于不正确的选择发起节点,而造成的性能异常,使用户业务更快速完成,提升用户体验及使用效率。The present invention provides an initiating node output method in an MPP distributed system, which is used to solve the problem of aggravated network load due to incorrect selection when the concurrency pressure is large and the network load is heavy. The method can automatically select the optimal initiating node according to the characteristics of MPP, ensure that the selected initiating node and the computing node are the same machine, save network overhead, avoid performance anomalies caused by incorrect selection of initiating nodes, enable user services to be completed more quickly, and improve user experience and usage efficiency.
本发明的附加方面和优点将在下面的描述中部分给出,部分将从下面的描述中变得明显,或通过本发明的实践了解到。Additional aspects and advantages of the present invention will be given in part in the following description and in part will be obvious from the following description, or will be learned through practice of the present invention.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
为了更清楚地说明本发明或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions in the present invention or the prior art, the drawings required for use in the embodiments or the description of the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For ordinary technicians in this field, other drawings can be obtained based on these drawings without paying creative work.
图1是本发明提供的一种MPP分布式系统中的发起节点输出方法流程图。FIG1 is a flow chart of an initiating node output method in an MPP distributed system provided by the present invention.
具体实施方式DETAILED DESCRIPTION
为使本发明的目的、技术方案和优点更加清楚,下面将结合本发明中的附图,对本发明中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。以下实施例用于说明本发明,但不能用来限制本发明的范围。In order to make the purpose, technical scheme and advantages of the present invention clearer, the technical scheme of the present invention will be clearly and completely described below in conjunction with the drawings in the present invention. Obviously, the described embodiments are part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments in the present invention, all other embodiments obtained by ordinary technicians in the field without creative work are within the scope of protection of the present invention. The following embodiments are used to illustrate the present invention, but cannot be used to limit the scope of the present invention.
在本说明书的描述中,参考术语“一个实施例”、“一些实施例”、“示例”、“具体示例”、或“一些示例”等的描述意指结合该实施例或示例描述的具体特征、结构、材料或者特点包含于本发明实施例的至少一个实施例或示例中。在本说明书中,对上述术语的示意性表述不必须针对的是相同的实施例或示例。而且,描述的具体特征、结构、材料或者特点可以在任一个或多个实施例或示例中以合适的方式结合。此外,在不相互矛盾的情况下,本领域的技术人员可以将本说明书中描述的不同实施例或示例以及不同实施例或示例的特征进行结合和组合。In the description of this specification, the description with reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" etc. means that the specific features, structures, materials or characteristics described in conjunction with the embodiment or example are included in at least one embodiment or example of the embodiments of the present invention. In this specification, the schematic representations of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any one or more embodiments or examples in a suitable manner. In addition, those skilled in the art may combine and combine the different embodiments or examples described in this specification and the features of the different embodiments or examples, without contradiction.
下面结合图1描述本发明的实施例。An embodiment of the present invention is described below with reference to FIG. 1 .
本发明提供一种MPP分布式系统中的发起节点输出方法,包括:The present invention provides an initiating node output method in an MPP distributed system, comprising:
S1:采集数据库集群中的节点数量及节点IP,根据分片信息进行排序获得节点分片及节点IP的第一映射关系;S1: Collect the number of nodes and node IPs in the database cluster, sort them according to the sharding information to obtain the first mapping relationship between the node shards and the node IPs;
其中,步骤S1中的所述第一映射关系为一维数组。Wherein, the first mapping relationship in step S1 is a one-dimensional array.
本阶段中,首先获取分片和ip的映射关系,具体来讲获取数据库集群中的节点数量及ip信息,并按分片信息进行排序,定义N列一维数组A,其中的N为计算节点个数,表达式为A[0]=ip1,A[1]=ip2…A[N-1]=ipN。In this stage, we first obtain the mapping relationship between shards and IP addresses. Specifically, we obtain the number of nodes and IP addresses in the database cluster and sort them by shard information. We define an N-column one-dimensional array A, where N is the number of computing nodes. The expression is A[0]=ip1, A[1]=ip2…A[N-1]=ipN.
S2:采集服务器信息,根据所述服务器信息创建计算节点与管理节点的IP距离矩阵,获得距离矩阵;S2: Collect server information, create an IP distance matrix between computing nodes and management nodes according to the server information, and obtain a distance matrix;
其中,步骤S2中的所述距离矩阵为二维矩阵。Wherein, the distance matrix in step S2 is a two-dimensional matrix.
其中,所述距离矩阵中,当计算节点的IP与管理节点的IP相同时,计算节点与管理节点的距离为0,当计算节点的IP与管理节点的IP不同时,计算节点与管理节点的距离为1。In the distance matrix, when the IP of the computing node is the same as the IP of the management node, the distance between the computing node and the management node is 0; when the IP of the computing node is different from the IP of the management node, the distance between the computing node and the management node is 1.
本阶段的目的为获得计算节点和发起节点的距离矩阵,具体来讲首先根据服务器管理员提供的服务器位置信息或者ip信息,定义N*N的二维矩阵M,然后创建计算节点和管理节点ip距离矩阵,ip相同时,距离为0,ip不同时,距离为1。The purpose of this stage is to obtain the distance matrix between the computing node and the initiating node. Specifically, first define an N*N two-dimensional matrix M based on the server location information or IP information provided by the server administrator, and then create an IP distance matrix between the computing node and the management node. If the IPs are the same, the distance is 0, and if the IPs are different, the distance is 1.
S3:为每个计算节点分配多个哈希桶,通过管理节点的nodedatamap表记录计算节点与哈希桶的对应关系,并根据nodedatamap表信息及所述对应关系获取节点分片与计算节点的第二映射关系;S3: Allocate multiple hash buckets to each computing node, record the correspondence between the computing nodes and the hash buckets through the nodedatamap table of the management node, and obtain the second mapping relationship between the node shards and the computing nodes according to the nodedatamap table information and the correspondence;
进一步的,本阶段目的为获得分片与计算节点的关系, MPP集群N个计算节点,id范围为n1-nN,每个节点分配65536/N个hash桶,hash桶id为h0-h65535。Furthermore, the purpose of this stage is to obtain the relationship between shards and computing nodes. The MPP cluster has N computing nodes, with ids ranging from n1 to nN. Each node is allocated 65536/N hash buckets, and the hash bucket ids are h0-h65535.
而MPP发起节点中的nodedatamap(系统表,用于记录节点分片与数据的映射),记录分片和hash桶的关系,每个node上对应多个hash桶,之后我们通过nodedatamap中信息即能获得分片与计算节点ip的关系。The nodedatamap (system table used to record the mapping between node shards and data) in the MPP initiating node records the relationship between shards and hash buckets. Each node corresponds to multiple hash buckets. Then we can obtain the relationship between the shards and the computing node IP through the information in the nodedatamap.
其中,步骤S3中,为每个计算节点分配多个哈希桶时,分配方式为根据哈希桶总数均匀分配。In step S3, when multiple hash buckets are allocated to each computing node, the allocation method is to evenly allocate according to the total number of hash buckets.
其中,所述哈希桶总数为65536个。Among them, the total number of hash buckets is 65536.
在一些实施例中,GBase数据库对应的特征为65536个哈希桶。In some embodiments, the GBase database corresponds to 65536 hash buckets.
S4:提取查询语句的过滤条件中的哈希键,并由所述第二映射关系及所述第一映射关系,根据所述哈希键查询获得所述查询语句对应的计算节点的节点IP,获得节点IP查询结果;S4: extracting a hash key in the filter condition of the query statement, and obtaining the node IP of the computing node corresponding to the query statement according to the hash key query based on the second mapping relationship and the first mapping relationship, and obtaining a node IP query result;
其中,步骤S4进一步包括:Wherein, step S4 further comprises:
S41:提取查询语句的过滤条件中分布列的哈希值;S41: extracting the hash value of the distribution column in the filter condition of the query statement;
S42:根据哈希值计算获得哈希键;S42: Obtain a hash key according to the hash value calculation;
S43:由所述第二映射关系,根据所述哈希键获取查询语句对应的数据所在分片;S43: Obtaining the shard where the data corresponding to the query statement is located according to the hash key based on the second mapping relationship;
S44:由所述第一映射关系根据所述数据所在分片获取计算节点对应的节点IP,获得节点IP查询结果。S44: Obtain the node IP corresponding to the computing node according to the shard where the data is located by using the first mapping relationship to obtain a node IP query result.
其中,步骤S4中,所述过滤条件为含有where条件且存在等式分布列的精确查询语句。Wherein, in step S4, the filtering condition is a precise query statement containing a where condition and having an equal distribution column.
本阶段中,首先提取sql中过滤条件中的hash值v,确定该sql计算节点的ip,具体的在获取到sql后,找到精确查询过滤条件中的hash分布列的值v,标志是含有where条件且存在分布列’=’,找到值v之后进行crc32(v)%65536计算出hashkey,其中的crc为基于crc检验原理的加解密算法,之后从步骤S3中的nodedatamap信息中获取分片和hashkey的关系得到数据所在分片,根据步骤S1中得到的分片和ip映射关系,得到计算节点ip。In this stage, first extract the hash value v in the filter condition in SQL and determine the IP of the SQL computing node. Specifically, after obtaining SQL, find the value v of the hash distribution column in the precise query filter condition. The sign is that it contains the where condition and the distribution column '=' exists. After finding the value v, perform crc32(v)%65536 to calculate the hashkey, where crc is an encryption and decryption algorithm based on the crc verification principle. Then, obtain the relationship between the shard and the hashkey from the nodedatamap information in step S3 to obtain the shard where the data is located. According to the shard and IP mapping relationship obtained in step S1, the computing node IP is obtained.
S5:由所述距离矩阵,根据所述节点IP查询结果查询获得计算节点和管理节点极小值,将极小值对应的节点作为发起节点输出。S5: Obtain the minimum values of computing nodes and management nodes from the distance matrix according to the node IP query result, and output the node corresponding to the minimum value as the initiating node.
本阶段目的为获得最优发起节点ip,根据步骤S4得到的计算节点ip,查找步骤S2中得到的计算节点和发起节点距离矩阵的最小值,min(M[k,i]),k为计算节点ip的分片值,i为0......N-1,将该M节点作为计算节点输出。The purpose of this stage is to obtain the optimal initiating node IP. According to the computing node IP obtained in step S4, the minimum value of the distance matrix between the computing node and the initiating node obtained in step S2 is found, min(M[k,i]), k is the shard value of the computing node IP, i is 0...N-1, and the M node is output as the computing node.
最后应说明的是:以上实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的精神和范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, rather than to limit it. Although the present invention has been described in detail with reference to the aforementioned embodiments, those skilled in the art should understand that they can still modify the technical solutions described in the aforementioned embodiments, or make equivalent replacements for some of the technical features therein. However, these modifications or replacements do not deviate the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410704629.8A CN118277456B (en) | 2024-06-03 | 2024-06-03 | Initiating node output method in MPP distributed system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410704629.8A CN118277456B (en) | 2024-06-03 | 2024-06-03 | Initiating node output method in MPP distributed system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118277456A CN118277456A (en) | 2024-07-02 |
CN118277456B true CN118277456B (en) | 2024-09-20 |
Family
ID=91634412
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410704629.8A Active CN118277456B (en) | 2024-06-03 | 2024-06-03 | Initiating node output method in MPP distributed system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118277456B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119148947B (en) * | 2024-11-18 | 2025-01-24 | 江苏华库数据技术有限公司 | A method and device for dynamically adjusting the number of deployment shards |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112135297A (en) * | 2020-09-22 | 2020-12-25 | 平安科技(深圳)有限公司 | Communication method, central server, equipment and medium of Internet of things |
CN117971506A (en) * | 2024-03-29 | 2024-05-03 | 天津南大通用数据技术股份有限公司 | MPP database query task balancing method, system, equipment and medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6951846B2 (en) * | 2017-03-07 | 2021-10-20 | 株式会社日立製作所 | Computer system and task allocation method |
CN107330098B (en) * | 2017-07-06 | 2020-08-04 | 北京理工大学 | Query method, computing node and query system for custom report |
CN117591608B (en) * | 2024-01-19 | 2024-04-30 | 恒辉信达技术有限公司 | Cloud primary database data slicing method based on distributed hash |
CN118041925A (en) * | 2024-03-27 | 2024-05-14 | 中国工商银行股份有限公司 | Block chain network node, resource file positioning method and device |
-
2024
- 2024-06-03 CN CN202410704629.8A patent/CN118277456B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112135297A (en) * | 2020-09-22 | 2020-12-25 | 平安科技(深圳)有限公司 | Communication method, central server, equipment and medium of Internet of things |
CN117971506A (en) * | 2024-03-29 | 2024-05-03 | 天津南大通用数据技术股份有限公司 | MPP database query task balancing method, system, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN118277456A (en) | 2024-07-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102541990B (en) | Database redistribution method and system utilizing virtual partitions | |
CN107145537B (en) | Table data importing method and system | |
CN108600321A (en) | A kind of diagram data storage method and system based on distributed memory cloud | |
CN118277456B (en) | Initiating node output method in MPP distributed system | |
US20050235001A1 (en) | Method and apparatus for refreshing materialized views | |
WO2019161679A1 (en) | Data processing method and device for use in online analytical processing | |
CN106415534B (en) | The method and apparatus of contingency table subregion in a kind of distributed data base | |
CN110399368B (en) | Method for customizing data table, data operation method and device | |
US20110295907A1 (en) | Apparatus and Method for Expanding a Shared-Nothing System | |
CN113742343A (en) | Data splitting method, device and storage medium based on large amount of service data scenes | |
JP2019504390A (en) | Data inquiry method and apparatus, and database system | |
Cheng et al. | Efficient event correlation over distributed systems | |
Silberstein et al. | Efficient bulk insertion into a distributed ordered table | |
WO2020238546A1 (en) | Kv database configuration method, query method, device and storage medium | |
CN108268614A (en) | A kind of distribution management method of forest reserves spatial data | |
CN103092886A (en) | Achieving method, device and system for data query operation | |
CN109150964B (en) | Migratable data management method and service migration method | |
CN117574428A (en) | Hidden query methods, devices, equipment and media for massive distributed storage | |
CN111026709A (en) | Data processing method and device based on cluster access | |
US20090171921A1 (en) | Accelerating Queries Based on Exact Knowledge of Specific Rows Satisfying Local Conditions | |
CN105978744A (en) | Resource allocation method, device and system | |
CN115062027A (en) | Hash connection method, computing node, storage medium and program product | |
KR101451280B1 (en) | Distributed database management system and method | |
CN114090530A (en) | Log summarizing and inquiring method and device under distributed architecture | |
CN111259062A (en) | Method and device capable of ensuring sequence of result sets of full-table query statements of distributed database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | ||
CP03 | Change of name, title or address |
Address after: Room 201-33, Unit 2, Building 2, No. 39 Gaoxin 6th Road, Binhai Science and Technology Park, Binhai New Area, Tianjin, China 300452 Patentee after: TIANJIN NANKAI UNIVERSITY GENERAL DATA TECHNOLOGIES Co.,Ltd. Country or region after: China Address before: 300384 building J, Haitai green industrial base, 6 Haitai development road, Huayuan Industrial Zone, Binhai New Area, Tianjin Patentee before: TIANJIN NANKAI UNIVERSITY GENERAL DATA TECHNOLOGIES Co.,Ltd. Country or region before: China |