[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

CN106506490A - A kind of Distributed Calculation control method and distributed computing system - Google Patents

A kind of Distributed Calculation control method and distributed computing system Download PDF

Info

Publication number
CN106506490A
CN106506490A CN201610959237.1A CN201610959237A CN106506490A CN 106506490 A CN106506490 A CN 106506490A CN 201610959237 A CN201610959237 A CN 201610959237A CN 106506490 A CN106506490 A CN 106506490A
Authority
CN
China
Prior art keywords
message
node
client
thread
url
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610959237.1A
Other languages
Chinese (zh)
Other versions
CN106506490B (en
Inventor
李军
黄海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guozhi Digital Economy Shenzhen Co ltd
Original Assignee
Shenzhen Zhigaodian Intellectual Property Operation Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhigaodian Intellectual Property Operation Co Ltd filed Critical Shenzhen Zhigaodian Intellectual Property Operation Co Ltd
Priority to CN201610959237.1A priority Critical patent/CN106506490B/en
Publication of CN106506490A publication Critical patent/CN106506490A/en
Application granted granted Critical
Publication of CN106506490B publication Critical patent/CN106506490B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/06Notations for structuring of protocol data, e.g. abstract syntax notation one [ASN.1]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1471Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer And Data Communications (AREA)

Abstract

A kind of Distributed Calculation control method and distributed computing system, method include:S1, the message format for defining communication between node;S2, each node based on message in target URL or type of message addressing:If message is sent to client node by server node, S3 is executed, if message is sent to server node by client node, execute S4, if message is sent to client node by client node, execute S5;The nodename or type of message of S3, the node processes ID according to target URL in message or target URL, finds corresponding connection and message is sent to client node;S4, target URL in message is empty after, directly transmit message to server node;If there is in S5 message complete target URL and the IP address of two client nodes belongs to same network, two client nodes are directly set up and connect and send message, message is sent to server node otherwise and turns S3;Can also be in communication with each other across the node of LAN based on the present invention.

Description

A kind of Distributed Calculation control method and distributed computing system
Technical field
The present invention relates to field of computer technology, more particularly to a kind of Distributed Calculation control method and Distributed Calculation System.
Background technology
Traditional distributed development framework such as RMI based on service registry, CORBA etc., their message transmission rely on IP Addressing, the ISP in LAN be merely able to by inside same LAN user access, across LAN service not Can intercommunication.
Content of the invention
The technical problem to be solved in the present invention is, for the drawbacks described above of prior art, there is provided a kind of Distributed Calculation Control method and distributed computing system.
The technical solution adopted for the present invention to solve the technical problems is:A kind of Distributed Calculation control method is constructed, side Method includes:
S1, the message format for defining communication between node:Message includes that message header and message body, message header are included with lower word Section:Origin url, target URL, type of message, wherein URL are included with properties:IP address, the node processes of the unique mark node ID and the nodename for identifying node type of static configuration;
S2, each node based on message in target URL or type of message addressing:If message is sent to client by server node End node, then execution step S3, if message is sent to server node by client node, execution step S4, if client's end segment Message is sent to client node by point, then execution step S5;
S3, server node according to the node processes ID of target URL in message or the nodename of target URL or Type of message, finds corresponding connection and message is sent to client node;
After target URL in message is empty by S4, client node, message is directly transmitted to server node;
If there is in S5 message complete target URL and the IP address of two client nodes belong to same network, Then two client nodes are directly set up and connect and send message, message is sent to server node otherwise and goes to step S3.
In Distributed Calculation control method of the present invention, step S3 includes:
If S31 targets URL are complete, execution step S32, if target URL has been only written nodename, is executed Step S33, if target URL is sky, execution step S34;
S32, found from connection pool according to the node processes ID in URL and accordingly connect and send message, terminated;
The all URL of nodename identical in S33, target URL from routing table in query node title and message, Enter step S35;
S34, according to the type of message in message, all URL of the type of message are supported in inquiry from routing table, enter step Rapid S35;
S35, all connections accordingly are found according to the node processes ID in URL from connection pool, randomly choose therein One linkup transmit message, if sending failure, selects next connection and all attempts until sending successful or all connections Finish.
In Distributed Calculation control method of the present invention, the IP address includes local IP and public network IP;
Step S5 is specifically included:
If having complete target URL, otherwise execution step S52, execution step S53 in S51 message;
If S52 origin urls are identical with the public network IP in target URL and the network segment of local IP is identical, or, two clients The local IP of end node is respectively identical with the public network IP of itself, then judge that the IP address of two client nodes belongs to same Network, two client nodes are directly set up and connect and send message, enter step S53, otherwise, knot if failure is sent Beam;
S53, message is sent to server node and goes to step S3.
In Distributed Calculation control method of the present invention, before step S2, also include S12:Client node Set up with server node after being connected, by heartbeat message the URL of the client node and the message class that supports upon actuation Type is sent to server node.
In Distributed Calculation control method of the present invention, present invention additionally comprises following condition managing step:Client Whether end node detects currently available with the connection of server by heartbeat message and reports the running status of itself, such as core Jump message and send failure, then client re-establishes one and is connected to server, and the running status includes that client is current Maximum thread, idle line number of passes, completed number of tasks, failed tasks number, untreated number of tasks, task processor state, Software version, server end calculate message distribution amount according to running status.
In Distributed Calculation control method of the present invention, the message header also includes precedence field, priority Field is arranged to effective message and is arranged in Priority Queues, and the message that precedence field is arranged to invalid is arranged in commonly In queue, the queue message of each node is sent according to the order of first in first out, and is retransmited after Priority Queues is sent common Queue.
In Distributed Calculation control method of the present invention, the thread pool of each node is by the thread that works, idle line Journey, interim thread composition, wherein, the Thread Count sum of thread and the idle thread of working within maximum thread, temporary track The Thread Count of journey is within maximum number of concurrent;
Present invention additionally comprises following thread management step:When the message that node is received is arranged to have for precedence field During the message of effect, if work thread reaches maximum thread and interim thread is not up to maximum number of concurrent, temporary track is created Message is otherwise placed in local Priority Queues by journey;Wherein, after the completion of the task of interim thread, if there is no which in Preset Time His waiting task then stops the interim thread.
In Distributed Calculation control method of the present invention, message header also includes the uniqueness mark as each message The message sequence number field of knowledge, message lifetime field and message duration field, wherein, for disappearing for request-respond style Breath, message SN of the response message using request message, message forward message life cycle to successively decrease one, when message life cycle It is changed into when 0, abandoning the message;When node receives message duration field is arranged to effective message, based on mongodb Message is saved in hard disk, after program surprisingly terminates or computer circuit breaking is restarted, recovers the message from hard disk.
In Distributed Calculation control method of the present invention, methods described also includes:
Configuration management step:Client node reports to server node the configuration item required for itself after starting, and takes Business device node beams back client node under loading corresponding configuration item simultaneously from configuration database, client node is for existing local The configuration item of configuration parameter is returned using server node for the configuration item without locally configured parameter using locally configured parameter The parameter of the configuration item for returning;
Management of process step:Start finger daemon to safeguard the running status and initialization calculate node of calculation procedure Program file, keeps a heartbeat signal by standard input and output between finger daemon and calculation procedure, if finger daemon is not Can detect calculation procedure heartbeat signal when, show that calculation procedure comes into abnormality, then force termination calculation procedure And restart calculation procedure;If calculation procedure fails the heartbeat signal for detecting finger daemon, actively terminate own process;
Software upgrading step:Server node sends version information, version information bag to the calculation procedure of client node Digital digest information list containing version number and All Files, client node compare local program after receiving version information The digital digest information of file, finds the file for needing to update;Client node downloads institute's renewal in need from server node File, be saved in an interim catalogue;The calculation procedure of client node sends messages to finger daemon and notifies to have updated Finish;Finger daemon is received after updating the message for finishing, and terminates all of calculation procedure, and the program file in temp directory is merged To original calculation procedure catalogue, to merge and start original calculation procedure after finishing.
The invention also discloses a kind of distributed computing system, including for executing described Distributed Calculation control method Server node and some client nodes.
Implement the Distributed Calculation control method and distributed computing system of the present invention, have the advantages that:This Invention sends message using self-defining message format, and message header includes origin url, target URL, type of message, and URL includes IP Address, node processes ID and nodename, for two client nodes that IP address belongs to same network directly can be built Vertical connection, other situations can forward message by server, and server is according to node processes ID, nodename or message Type determines the client node that will set up connection and then sends message, therefore, can also be in communication with each other across the node of LAN;
Further, for the message of request-respond style, node is recorded and which by the origin url in the message of reception The URL of the node of connection is set up, and the URL based on record determines target URL of message to be sent, improve the property of message transmission Energy;Message header also includes message duration field, realizes message duration based on Mongodb, and it is very high that it processes can system Handling capacity, reach high concurrent, the purpose of low delay, and surprisingly terminating or after computer circuit breaking is restarted in program can be with Recover message from hard disk, prevent information drop-out;Message header also includes that precedence field, prioritized messages are arranged in Priority Queues Preferentially can send.
Description of the drawings
Below in conjunction with drawings and Examples, the invention will be further described, in accompanying drawing:
Fig. 1 is the structural representation of the distributed computing system of the present invention;
Fig. 2 is the Distributed Calculation control method flow chart of the present invention;
Fig. 3 is the graph of a relation of finger daemon and calculation procedure.
Specific embodiment
In order to be more clearly understood to the technical characteristic of the present invention, purpose and effect, now control accompanying drawing is described in detail The specific embodiment of the present invention.
It is the structural representation of the distributed computing system of the present invention with reference to Fig. 1, distributed computing system includes server Node and at least one client node.It is the Distributed Calculation control method flow chart of the present invention with reference to Fig. 2.The present invention divides Cloth calculation control method, method include:
S1, the message format for defining communication between node:Message includes that message header and message body, message header are included with lower word Section:Origin url, target URL, type of message, wherein URL are included with properties:IP address, the node processes of the unique mark node ID and the nodename for identifying node type of static configuration;
S2, each node based on message in target URL or type of message addressing:If message is sent to client by server node End node, then execution step S3, if message is sent to server node by client node, execution step S4, if client's end segment Message is sent to client node by point, then execution step S5;
S3, server node according to the node processes ID of target URL in message or the nodename of target URL or Type of message, finds corresponding connection and message is sent to client node;
After target URL in message is empty by S4, client node, message is directly transmitted to server node;
If there is in S5 message complete target URL and the IP address of two client nodes belong to same network, Then two client nodes are directly set up and connect and send message, message is sent to server node otherwise and goes to step S3.
The present invention is described in detail with a specific embodiment below.
The present invention mainly carries out self-defined to message header point.In preferred embodiment, the message structure of definition is as follows:
In the embodiment, message header includes following field successively:
Message SN, is typically designated as ID, and ID is the unique identification of each message, is managed by each node oneself, leads to Often using incrementally or random mode sequence.Under synchronous communication pattern, for the message of request-respond style, response message Directly using the serial number of request message, so that requesting node is when response message is received, can be looked for by message SN To just in the thread of wait-for-response message.The node that each node is established a connection by the origin url record in the message of reception URL, and the URL based on record determines target URL of message to be sent.
Origin url:That is source node address, for identifying the source address of message, destination node send response message when Time needs to use origin url.Each node can preserve the origin url in message when receiving other nodes and sending message, form one Routing table.
Target URL:That is the route according to this address computation destination node is sent out message by destination node address, system It is sent to destination node.
Message life cycle, TTL is designated as typically, represents message max-forwards number of times, when hop count alreadys exceed TTL settings Value, then abandon the message.Can be such as to be set to a numerical value by TTL, forward and once just successively decrease once, when TTL is just changed into 0 Abandon message.Heartbeat message is the numerical value for setting TTL as 1.
Priority, for representing whether the message is prioritized messages, (is for example arranged when attribute setting is effective For true) represent that the message, for prioritized messages, is otherwise common message.
Message duration, when node is received when message duration field is arranged to effective message (for example when the word Section is set to represent when true that the message, for persistent message, is otherwise common message), message is protected based on mongodb Hard disk is stored to, after program surprisingly terminates or computer circuit breaking is restarted, recovers the message from hard disk.
Type of message, can be the type of message of system definition, such as HEATBEAT, or the self-defining class of node Type.When target URL is not provided with, system can be according to the content of message type field, in all nodes for supporting the message Random choose one selects next node as destination node if transmission is unsuccessful, successful or all until sending Till available node has all had attempted to.
Message is sent in the present invention, mainly according to the content of target URL, URL structure is described below.In preferred embodiment The URL structure of definition is as follows:
In preferred embodiment, URL is included with properties:
Local IP, is typically designated as LocalIP in program, if node is in LAN, for the IP address of LAN;
Public network IP, is typically designated as NetIP in program, if node is in LAN, for the IP address of router;
Nodename, nodename are static configuration, and for identifying node type, process is restarted or IP changes deutomerite Point title will not change.In a distributed system, nodename can repeat, and the effect first of nodename is for seeking Location, second is easy for system manager is more easily managed to node.
Port numbers, that is, be used for the port numbers of Socket connections
Node processes ID;The unique mark node, when a computer has opened multiple client simultaneously, then using process ID Make a distinction.Process ID is a random character string, generates when process initiation, and it is only in a distributed system One presence.
With reference to above-mentioned self-defining message header and the URL in message header, the address procedures of the present invention are discussed in detail.
Specifically, step S3 includes:
If S31 targets URL are complete, execution step S32, if target URL has been only written nodename, is executed Step S33, if target URL is sky, execution step S34;
S32, found from connection pool according to the node processes ID in URL and accordingly connect and send message, terminated;
The all URL of nodename identical in S33, target URL from routing table in query node title and message, Enter step S35;
S34, according to the type of message in message, all URL of the type of message are supported in inquiry from routing table, enter step Rapid S35;
S35, all connections accordingly are found according to the node processes ID in URL from connection pool, randomly choose therein One linkup transmit message, if sending failure, selects next connection and all attempts until sending successful or all connections Finish.
It can be seen that, server to client can be realizing the addressing of three kinds of modes.The first is based on directly in target URL Node processes ID is addressed, and second is that the nodename based on target URL is addressed when first kind of way cannot be realized, and the 3rd It is when above both of which cannot be realized to plant, and is addressed based on type of message.
In due to above-mentioned steps, server need to know that the URL of each client node, type of message could be realized addressing, and be This, client node is set up with server node after be connected upon actuation, by heartbeat message the URL of the client node with The type of message of support is sent to server node.It is the structural representation of heartbeat message as follows:
Heartbeat message is destined for server, so target URL is without filling out, it is address blank.The URL of node is accused by origin url Know server, and the type of message that node is supported is placed on notification server in message body.
Specifically, step S5 is specifically included:
If having complete target URL, otherwise execution step S52, execution step S53 in S51 message;
If S52 origin urls are identical with the public network IP in target URL and the network segment of local IP is identical, or, two clients The local IP of end node is respectively identical with the public network IP of itself, then judge that the IP address of two client nodes belongs to same Network, two client nodes are directly set up and connect and send message, enter step S53, otherwise, knot if failure is sent Beam;
S53, message is sent to server node and goes to step S3.
In step S52, judge that the IP address of two client nodes belongs to same network and has been divided into two kinds of situations:
1), origin url is identical with the public network IP in target URL and the network segment of local IP is identical, it is believed that they are in same Individual network, for example:
Node 1:NetIP=183.14.227.144LocalIP=192.168.2.101
Node 2:NetIP=183.14.227.144LocalIP=192.168.2.6
2), the local IP of node is identical with the public network IP of itself, it is believed that the node is under public network environment, if two Under node is all in public network environment, then it is assumed that they are in same network.For example:
Node 1:NetIP=183.14.227.144LocalIP=183.14.227.144
Node 2:NetIP=183.14.55.165LocalIP=183.14.55.165
In addition, the present invention also to system service in terms of have made some improvements, specifically include:Condition managing step, thread pipe Reason step, configuration management step, management of process step, software upgrading step.
Condition managing step:
Heartbeat message mentioned above mainly reports the URL of node and the type of message that supports, further, can be with Condition managing is carried out based on heartbeat message, specially:Client node detects connection currently with server by heartbeat message Whether can use and report the running status of itself, if heartbeat message sends failure, client re-establishes a connection To server, the running status includes the current maximum thread of client, idle line number of passes, completed number of tasks, mistake Number of tasks, untreated number of tasks, task processor state, software version is lost, server end calculates message point according to running status Send out amount.
As client node can be actively connected to server upon actuation, which passes through heartbeat message guarantees that connection is protected always Hold, so no matter whether client is in LAN, server can send messages to client at any time.Server end according to These state computation message distribution amounts;In addition, running status also provides data to the back-stage management page, looked into for operation maintenance personnel in real time See the running status of each node.
Thread management step:
The thread pool of each node is made up of the thread that works, idle thread, interim thread, and wherein, work thread and sky , within maximum thread, the Thread Count of interim thread is within maximum number of concurrent for the Thread Count sum of idle thread;Thread management Step mainly includes:When the message that node is received is arranged to effective message for precedence field, if work thread Reach maximum thread and interim thread is not up to maximum number of concurrent, then create interim thread, otherwise message is placed in local Priority Queues;Wherein, after the completion of the task of interim thread, stop this if there is no other waiting tasks in Preset Time interim Thread.
Configuration management step:
Used as distributed system, the centralized management of configuration item is essential, and the distributed node of the present invention is supported remotely Dynamic configuration when configuration and operation.Configuration management step is specifically included:Matching somebody with somebody required for itself after client node startup Putting item and reporting to server node, server node beams back client section under corresponding configuration item being loaded simultaneously from configuration database Point, client node adopt locally configured parameter for the configuration item of existing locally configured parameter, for without locally configured ginseng The parameter of the configuration item that several configuration items is returned using server node.That is, configuration item is divided into two-layer using stacked management: The global configuration that server node is issued->The local privately owned configuration of client node, privately owned configuration preference is in global configuration.One Denier global configuration is changed, and server will be broadcasted to all clients, and client need not come into force by restarting configuration item.
Management of process step:
Distributed system be generally all be operated in unmanned in the environment of, often caused by various objective the reason for, meter Always there are some bug in calculation program, so causing program crashing because of the mistake of accumulation after long-play.In order to Unattended operation in 24 hours is reached, in addition to being rigid in checking up in test link, the sheep that operationally can also do mends jail, is this Invention introduces finger daemon, as shown in Figure 3.Management of process step is specifically included:Start finger daemon to safeguard calculation procedure Running status and initialization calculate node program file, between finger daemon and calculation procedure pass through standard input and output Keep a heartbeat signal, if finger daemon fails the heartbeat signal for detecting calculation procedure, force termination calculation procedure And restart calculation procedure;If calculation procedure fails the heartbeat signal for detecting finger daemon, actively terminate own process;
Software upgrading step:
For a distributed computing system, software is generally in a quick state for updating, and application program is in operation When be to change the program file of itself, and finger daemon does not have communication function, it is impossible to enough obtain fresh information from server, So the present invention using calculation procedure cooperate with finger daemon complete by the way of carry out software upgrading.For this purpose, software upgrading step Specifically include:Server node sends version information to the calculation procedure of client node, version information comprising version number and Local program file is compared in digital digest information (MD5) list of All Files, client node after receiving version information Digital digest information, finds the file for needing to update;Client node from server node download renewal in need file, It is saved in an interim catalogue;The calculation procedure of client node sends messages to finger daemon and notifies renewal to finish;Guard Process is received after updating the message for finishing, and is terminated all of calculation procedure, the program file in temp directory is merged into original Calculation procedure catalogue, to merge and start original calculation procedure after finishing.
With needing in distributed system, data to be processed are more and more huger, need the computer of input also more and more, These computer inters mutually cooperate more and more difficult, based on the method for the present invention, it is possible to achieve a set of motility and autgmentability Very high distributed computing system, is embedded into various types of application programs in the way of software kit, so as to reach application program Between data interchange, collaborative work purpose, original application need not change original framework, it is only necessary to introduce related software Bag, you can easily access the distributed computing system of the present invention.
In sum, implement the Distributed Calculation control method and distributed computing system of the present invention, have with following Beneficial effect:The present invention sends message using self-defining message format, and message header includes origin url, target URL, type of message, and URL includes IP address, node processes ID and nodename, for two client nodes that IP address belongs to same network can With directly set up connection, other situations can pass through server forwarding message, server according to node processes ID, nodename, Or type of message determines the client node that will set up connection and then sends message, therefore, can also across the node of LAN It is in communication with each other;Further, for the message of request-respond style, node is recorded and which by the origin url in the message of reception The URL of the node of connection is set up, and the URL based on record determines target URL of message to be sent, improve the property of message transmission Energy;Message header also includes message duration field, realizes message duration based on Mongodb, and it is very high that it processes can system Handling capacity, reach high concurrent, the purpose of low delay, and surprisingly terminating or after computer circuit breaking is restarted in program can be with Recover message from hard disk, prevent information drop-out;Message header also includes that precedence field, prioritized messages are arranged in Priority Queues Preferentially can send.
Embodiments of the invention are described above in conjunction with accompanying drawing, but be the invention is not limited in above-mentioned concrete Embodiment, above-mentioned specific embodiment are only schematic, rather than restricted, one of ordinary skill in the art Under the enlightenment of the present invention, in the case of without departing from present inventive concept and scope of the claimed protection, can also make a lot Form, these are belonged within the protection of the present invention.

Claims (10)

1. a kind of Distributed Calculation control method, it is characterised in that method includes:
S1, the message format for defining communication between node:Message includes that message header and message body, message header include following field: Origin url, target URL, type of message, wherein URL are included with properties:IP address, the node processes ID of the unique mark node The nodename for identifying node type with static configuration;
S2, each node based on message in target URL or type of message addressing:If message is sent to client's end segment by server node Point, then execution step S3, if message is sent to server node, execution step S4 by client node, if client node will Message is sent to client node, then execution step S5;
Nodename or the message of S3, server node node processes ID or target URL according to target URL in message Type, finds corresponding connection and message is sent to client node;
After target URL in message is empty by S4, client node, message is directly transmitted to server node;
If having the IP address of complete target URL and two client nodes to belong to same network in S5 message, two Individual client node is directly set up and connects and send message, message is sent to server node otherwise and goes to step S3.
2. Distributed Calculation control method according to claim 1, it is characterised in that step S3 includes:
If S31 targets URL are complete, execution step S32, if target URL has been only written nodename, execution step S33, if target URL is sky, execution step S34;
S32, found from connection pool according to the node processes ID in URL and accordingly connect and send message, terminated;
The all URL of nodename identical in S33, target URL from routing table in query node title and message, enter Step S35;
S34, according to the type of message in message, all URL of the type of message are supported in inquiry from routing table, enter step S35;
S35, all connections accordingly are found from connection pool according to the node processes ID in URL, randomly choose one of those Linkup transmit message, if sending failure, selects next connection until message sends successful or all connections and all attempts Finish.
3. Distributed Calculation control method according to claim 1, it is characterised in that the IP address include local IP and Public network IP;
Step S5 is specifically included:
If having complete target URL, otherwise execution step S52, execution step S53 in S51 message;
If S52 origin urls are identical with the public network IP in target URL and the network segment of local IP is identical, or two client nodes Local IP respectively identical with the public network IP of itself, then judge two client nodes IP address belong to same network, Two client nodes are directly set up and connect and send message, enter step S53 if failure is sent, and otherwise, terminate;
S53, message is sent to server node and goes to step S3.
4. Distributed Calculation control method according to claim 1, it is characterised in that also include before step S2 S12:Client node is set up with server node after being connected upon actuation, by heartbeat message the URL of the client node Server node is sent to the type of message that supports.
5. Distributed Calculation control method according to claim 4, it is characterised in that present invention additionally comprises following state pipe Reason step:Whether client node detects currently available with the connection of server by heartbeat message and reports the operation of itself State, if heartbeat message sends failure, client re-establishes one and is connected to server, and the running status includes visitor At the current maximum thread in family end, idle line number of passes, completed number of tasks, failed tasks number, untreated number of tasks, task Reason device state, software version, server end are used for calculating message distribution amount according to running status.
6. Distributed Calculation control method according to claim 1, it is characterised in that the message header also includes priority Field, precedence field are arranged to effective message and are arranged in Priority Queues, and precedence field is arranged to invalid and disappears Breath is arranged in common queue, and the queue message of each node is sent according to the order of first in first out, and Priority Queues is sent After retransmit common queue.
7. Distributed Calculation control method according to claim 6, it is characterised in that during the thread pool of each node is by working Thread, idle thread, interim thread composition, wherein, the thread that works is with the Thread Count sum of idle thread in maximum thread Within, the Thread Count of interim thread is within maximum number of concurrent;
Present invention additionally comprises following thread management step:When the message that node is received is arranged to effective for precedence field During message, if work thread reaches maximum thread and interim thread is not up to maximum number of concurrent, interim thread is created, no Message is placed in local Priority Queues then;Wherein, after the completion of the task of interim thread, if not having other to wait to locate in Preset Time Reason task then stops the interim thread.
8. Distributed Calculation control method according to claim 1, it is characterised in that message header also includes disappearing as each The message sequence number field of the unique identification of breath, message lifetime field and message duration field, wherein, for request- The message of respond style, message SN of the response message using request message, message forward message life cycle to successively decrease one, The message is abandoned when message life cycle being changed into 0;When node receives message duration field is arranged to effective message, Message is saved in by hard disk based on mongodb, after program surprisingly terminates or computer circuit breaking is restarted, recovering from hard disk should Message.
9. Distributed Calculation control method according to claim 1, it is characterised in that methods described also includes:
Configuration management step:Client node reports to server node, server the configuration item required for itself after starting Node beams back client node under loading corresponding configuration item simultaneously from configuration database, client node is for existing locally configured The configuration item of parameter adopts locally configured parameter, is returned using server node for the configuration item without locally configured parameter The parameter of configuration item;
Management of process step:Start finger daemon to safeguard the program of the running status and initialization calculate node of calculation procedure File, keeps a heartbeat signal by standard input and output between finger daemon and calculation procedure, if finger daemon fails to examine Measure calculation procedure heartbeat signal when, then terminate calculation procedure restarting calculation procedure;If calculation procedure fails to detect To the heartbeat signal of finger daemon, then actively terminate own process;
Software upgrading step:Server node sends version information to the calculation procedure of client node, and version information includes version This number and the digital digest information list of All Files, client node compare local program file after receiving version information Digital digest information, find need update file;Client node from server node download renewal in need text Part, and it is saved in an interim catalogue;The calculation procedure of client node sends messages to finger daemon and notifies renewal to finish; Finger daemon is received after updating the message for finishing, and terminates all of calculation procedure, the program file in temp directory is merged into Original calculation procedure catalogue, to merge and start original calculation procedure after finishing.
10. a kind of distributed computing system, it is characterised in that include requiring the distribution described in any one of 1-9 for perform claim The server node of formula calculation control method and some client nodes.
CN201610959237.1A 2016-11-03 2016-11-03 A kind of distributed computing control method and distributed computing system Active CN106506490B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610959237.1A CN106506490B (en) 2016-11-03 2016-11-03 A kind of distributed computing control method and distributed computing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610959237.1A CN106506490B (en) 2016-11-03 2016-11-03 A kind of distributed computing control method and distributed computing system

Publications (2)

Publication Number Publication Date
CN106506490A true CN106506490A (en) 2017-03-15
CN106506490B CN106506490B (en) 2019-07-09

Family

ID=58322484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610959237.1A Active CN106506490B (en) 2016-11-03 2016-11-03 A kind of distributed computing control method and distributed computing system

Country Status (1)

Country Link
CN (1) CN106506490B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107402806A (en) * 2017-04-20 2017-11-28 阿里巴巴集团控股有限公司 The task processing method and device of distributed document framework
CN107832165A (en) * 2017-11-23 2018-03-23 国云科技股份有限公司 A kind of method for lifting distributed system processing request stability
CN108768715A (en) * 2018-05-22 2018-11-06 烽火通信科技股份有限公司 Access the business configuration adaptation method and system of webmaster
CN109274641A (en) * 2018-08-09 2019-01-25 广东神马搜索科技有限公司 Connection method and device, calculating equipment and storage medium between client and service node
CN109426574A (en) * 2017-08-31 2019-03-05 华为技术有限公司 Distributed computing system, data transmission method and device in distributed computing system
CN110365754A (en) * 2019-06-28 2019-10-22 苏州浪潮智能科技有限公司 A kind of distributed document transmission storage method, equipment and storage medium
CN111372143A (en) * 2019-12-26 2020-07-03 视联动力信息技术股份有限公司 Signaling interaction method and device and readable storage medium
CN111562990A (en) * 2020-07-15 2020-08-21 北京东方通软件有限公司 Lightweight serverless computing method based on message
CN111736965A (en) * 2019-12-11 2020-10-02 西安宇视信息科技有限公司 Task scheduling method and device, scheduling server and machine-readable storage medium
CN114205889A (en) * 2021-12-09 2022-03-18 中山大学 Zenoh-based communication method in cross-local-area-network distributed system
CN115314493A (en) * 2022-08-03 2022-11-08 苏州创意云网络科技有限公司 Cluster scheduling method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001050754A1 (en) * 1999-12-30 2001-07-12 Swisscom Mobile Ag Image data transmission method
CN101771632A (en) * 2008-12-29 2010-07-07 厦门雅迅网络股份有限公司 Cross-LAN system communication method
CN102947824A (en) * 2010-06-11 2013-02-27 迪内希·阿南德·尼丁 System and method for addressing and accessing information using a key identifier
CN103002001A (en) * 2011-09-08 2013-03-27 宏伍工作室公司 Systems, methods and media for distributing peer-to-peer communications
CN104052723A (en) * 2013-03-15 2014-09-17 联想(北京)有限公司 Information processing method and server

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001050754A1 (en) * 1999-12-30 2001-07-12 Swisscom Mobile Ag Image data transmission method
CN101771632A (en) * 2008-12-29 2010-07-07 厦门雅迅网络股份有限公司 Cross-LAN system communication method
CN102947824A (en) * 2010-06-11 2013-02-27 迪内希·阿南德·尼丁 System and method for addressing and accessing information using a key identifier
CN103002001A (en) * 2011-09-08 2013-03-27 宏伍工作室公司 Systems, methods and media for distributing peer-to-peer communications
CN104052723A (en) * 2013-03-15 2014-09-17 联想(北京)有限公司 Information processing method and server

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107402806B (en) * 2017-04-20 2020-08-18 阿里巴巴集团控股有限公司 Task processing method and device of distributed file architecture
CN107402806A (en) * 2017-04-20 2017-11-28 阿里巴巴集团控股有限公司 The task processing method and device of distributed document framework
CN109426574B (en) * 2017-08-31 2022-04-05 华为技术有限公司 Distributed computing system, data transmission method and device in distributed computing system
CN109426574A (en) * 2017-08-31 2019-03-05 华为技术有限公司 Distributed computing system, data transmission method and device in distributed computing system
CN107832165A (en) * 2017-11-23 2018-03-23 国云科技股份有限公司 A kind of method for lifting distributed system processing request stability
CN108768715A (en) * 2018-05-22 2018-11-06 烽火通信科技股份有限公司 Access the business configuration adaptation method and system of webmaster
CN109274641A (en) * 2018-08-09 2019-01-25 广东神马搜索科技有限公司 Connection method and device, calculating equipment and storage medium between client and service node
CN110365754A (en) * 2019-06-28 2019-10-22 苏州浪潮智能科技有限公司 A kind of distributed document transmission storage method, equipment and storage medium
CN111736965A (en) * 2019-12-11 2020-10-02 西安宇视信息科技有限公司 Task scheduling method and device, scheduling server and machine-readable storage medium
CN111372143A (en) * 2019-12-26 2020-07-03 视联动力信息技术股份有限公司 Signaling interaction method and device and readable storage medium
CN111562990A (en) * 2020-07-15 2020-08-21 北京东方通软件有限公司 Lightweight serverless computing method based on message
CN111562990B (en) * 2020-07-15 2020-10-27 北京东方通软件有限公司 Lightweight serverless computing method based on message
CN114205889A (en) * 2021-12-09 2022-03-18 中山大学 Zenoh-based communication method in cross-local-area-network distributed system
CN114205889B (en) * 2021-12-09 2023-10-20 中山大学 Zenoh-based inter-LAN distributed system communication method
CN115314493A (en) * 2022-08-03 2022-11-08 苏州创意云网络科技有限公司 Cluster scheduling method and device
CN115314493B (en) * 2022-08-03 2024-10-11 苏州创意云网络科技有限公司 Cluster scheduling method and device

Also Published As

Publication number Publication date
CN106506490B (en) 2019-07-09

Similar Documents

Publication Publication Date Title
CN106506490A (en) A kind of Distributed Calculation control method and distributed computing system
JP4647234B2 (en) Method and apparatus for discovering network devices
US20140089619A1 (en) Object replication framework for a distributed computing environment
US20110238820A1 (en) Computer, communication device, and communication control system
CN111538763B (en) Method for determining master node in cluster, electronic equipment and storage medium
US9367261B2 (en) Computer system, data management method and data management program
JP5678723B2 (en) Switch, information processing apparatus and information processing system
JP2019008417A (en) Information processing apparatus, memory control method and memory control program
US20050038888A1 (en) Method of and apparatus for monitoring event logs
EP3817290A1 (en) Member change method for distributed system, and distributed system
JP5617304B2 (en) Switching device, information processing device, and fault notification control program
US11134015B2 (en) Load balancing through selective multicast replication of data packets
CN104793981B (en) A kind of online snapshot management method and device of cluster virtual machine
US7453865B2 (en) Communication channels in a storage network
Abouzamazem et al. Efficient inter-cloud replication for high-availability services
JP5063599B2 (en) Device management system using log management object, and method for generating and controlling logging data in the system
JP4272105B2 (en) Storage group setting method and apparatus
US20060282524A1 (en) Apparatus, system, and method for facilitating communication between an enterprise information system and a client
JP2003015973A (en) Network device management device, management method and management program
JPH0591108A (en) Message communication control method and communication system
JP2007249659A (en) System-switching method, computer system therefor, and program
CN102724080B (en) Network management system and network management method
US20150142960A1 (en) Information processing apparatus, information processing method and information processing system
JP4790579B2 (en) Process monitoring apparatus and monitoring method
JP2007316719A (en) Message communication method, device and program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee after: Zhigaogao Intellectual Property Group Co.,Ltd.

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: SHENZHEN ZHIGAODIAN INTELLECTUAL PROPERTY OPERATION Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20230105

Address after: 518064 109, Building 15, Block B, Building 5, No. 54, Xingnan Road, Nanyou Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong Province

Patentee after: Guozhi Digital Economy (Shenzhen) Co.,Ltd.

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: Zhigaogao Intellectual Property Group Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230428

Address after: Room D504-507, Zhejiang University Science and Technology Park, No. 698 Jingdong Avenue, Nanchang High tech Industrial Development Zone, Nanchang City, Jiangxi Province, 330096

Patentee after: Jiangxi Zhichan Big Data Co.,Ltd.

Address before: 518064 109, Building 15, Block B, Building 5, No. 54, Xingnan Road, Nanyou Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong Province

Patentee before: Guozhi Digital Economy (Shenzhen) Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231227

Address after: 518000 Room 109, Building 15, Nanyou District B, Building 5, No. 54, Xingnan Road, Nanyou Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong

Patentee after: Guozhi Digital Economy (Shenzhen) Co.,Ltd.

Address before: Room D504-507, Zhejiang University Science and Technology Park, No. 698 Jingdong Avenue, Nanchang High tech Industrial Development Zone, Nanchang City, Jiangxi Province, 330096

Patentee before: Jiangxi Zhichan Big Data Co.,Ltd.

TR01 Transfer of patent right