Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a method and an apparatus for routing requests of a distributed cluster that facilitate rapid locating of service failures and input/output blocking, facilitate analysis of load balancing across a federated cluster, and improve the availability of the distributed cluster.
Based on the above object, a first aspect of the embodiments of the present invention provides a method for routing requests of a distributed cluster, including performing the following steps:
receiving a service request of a distributed file system from a client, parsing client information from the service request, and recording the client information to a log;
determining a corresponding sub-cluster in the distributed file system based on the service request, forwarding the service request to the sub-cluster, and recording connection information of the sub-cluster to the log;
receiving feedback information for the service request from the sub-cluster, and recording the feedback information to the log in association with the client information corresponding to the feedback information; and
in response to the feedback information including a service execution failure, transmitting the client information corresponding to the feedback information to the sub-cluster based on the connection information.
In some implementations, parsing the client information from the service request includes parsing at least one of an initiation time of the service request, a network address of the client, a remote procedure call method requested by the client, and method parameters of the remote procedure call method.
In some embodiments, determining a corresponding sub-cluster in the distributed file system based on the service request, forwarding the service request to the sub-cluster and logging connection information of the sub-cluster includes determining a corresponding sub-cluster in the distributed file system based on the service request, determining an active name node in the sub-cluster, forwarding the service request to the name node, and logging a network address of the name node to the log.
In some implementations, receiving feedback information for the service request from the sub-cluster includes receiving at least one of an indication of whether the service request was successfully performed and an end time of the service request.
In some implementations, logging feedback information in association with client information corresponding to the feedback information includes determining an overall time consumption of the service request based on an end time of the service request and an initiation time of the service request, and logging whether the service request was successfully performed, the end time of the service request, and the overall time consumption of the service request in association with the client information corresponding to the feedback information.
In some embodiments, the method further comprises feeding back feedback information to the client based on the client information in response to the feedback information comprising success or failure of service execution.
In some embodiments, the method further comprises, in response to receiving a remote procedure call query request, parsing the remote procedure call method from the query request and returning, as a result, the client information and feedback information recorded in the log in association with that remote procedure call method.
In some implementations, the distributed cluster is a Hadoop cluster and the sub-clusters are subordinate to the Hadoop cluster.
In some implementations, the clients and the sub-clusters form a federated cluster of the Hadoop cluster.
A second aspect of an embodiment of the present invention provides an apparatus, comprising:
a processor; and
a controller storing program code executable by the processor, the processor performing the following steps when executing the program code:
receiving a service request of a distributed file system from a client, parsing client information from the service request, and recording the client information to a log;
determining a corresponding sub-cluster in the distributed file system based on the service request, forwarding the service request to the sub-cluster, and recording connection information of the sub-cluster to the log;
receiving feedback information for the service request from the sub-cluster, and recording the feedback information to the log in association with the client information corresponding to the feedback information; and
in response to the feedback information including a service execution failure, transmitting the client information corresponding to the feedback information to the sub-cluster based on the connection information.
In some implementations, parsing the client information from the service request includes parsing at least one of an initiation time of the service request, a network address of the client, a remote procedure call method requested by the client, and method parameters of the remote procedure call method.
In some embodiments, determining a corresponding sub-cluster in the distributed file system based on the service request, forwarding the service request to the sub-cluster and logging connection information of the sub-cluster includes determining a corresponding sub-cluster in the distributed file system based on the service request, determining an active name node in the sub-cluster, forwarding the service request to the name node, and logging a network address of the name node to the log.
In some implementations, receiving feedback information for the service request from the sub-cluster includes receiving at least one of an indication of whether the service request was successfully performed and an end time of the service request.
In some implementations, logging feedback information in association with client information corresponding to the feedback information includes determining an overall time consumption of the service request based on an end time of the service request and an initiation time of the service request, and logging whether the service request was successfully performed, the end time of the service request, and the overall time consumption of the service request in association with the client information corresponding to the feedback information.
In some implementations, the steps further include feeding back the feedback information to the client based on the client information in response to the feedback information including success or failure of service execution.
In some embodiments, the steps further include, in response to receiving a remote procedure call query request, parsing the remote procedure call method from the query request and returning, as a result, the client information and feedback information recorded in the log in association with that remote procedure call method.
In some implementations, the distributed cluster is a Hadoop cluster and the sub-clusters are subordinate to the Hadoop cluster.
In some implementations, the clients and the sub-clusters form a federated cluster of the Hadoop cluster.
The present invention has the following beneficial effects. With the method and apparatus for routing requests of a distributed cluster provided by the embodiments of the present invention, a service request of a distributed file system is received from a client, and client information is parsed from the service request and recorded to a log; a corresponding sub-cluster is determined in the distributed file system based on the service request, the service request is forwarded to the sub-cluster, and connection information of the sub-cluster is recorded to the log; feedback information for the service request is received from the sub-cluster and recorded to the log in association with the client information corresponding to the feedback information; and, in response to the feedback information including a service execution failure, the client information corresponding to the feedback information is sent to the sub-cluster based on the connection information. This facilitates rapid locating of service failures and input/output blocking, facilitates analysis of load balancing across the federated cluster, and improves the availability of the distributed cluster.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
It should be noted that, in the embodiments of the present invention, the expressions "first" and "second" are used to distinguish two entities or parameters that share the same name but are not the same. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; this is not repeated in the following embodiments.
Based on the above object, a first aspect of the embodiments of the present invention proposes an embodiment of a method for routing requests of a distributed cluster, which facilitates rapid locating of service failures and input/output blocking, facilitates analysis of load balancing across the federated cluster, and improves the availability of the distributed cluster. Fig. 1 is a schematic flow chart of the method for routing requests of a distributed cluster according to the present invention.
The method for routing the request of the distributed cluster, as shown in fig. 1, comprises the following steps:
Step S101, receiving a service request of a distributed file system from a client, parsing client information from the service request, and recording the client information to a log;
Step S103, determining a corresponding sub-cluster in the distributed file system based on the service request, forwarding the service request to the sub-cluster, and recording connection information of the sub-cluster to the log;
Step S105, receiving feedback information for the service request from the sub-cluster, and recording the feedback information to the log in association with the client information corresponding to the feedback information;
Step S107, in response to the feedback information including a service execution failure, transmitting the client information corresponding to the feedback information to the sub-cluster based on the connection information.
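As an illustration only, steps S101 to S107 can be sketched in Python as follows. The Router class, the request fields (src, method, cluster), the handler callables that stand in for sub-clusters, and the notified list that stands in for sending client information back to a sub-cluster are all hypothetical names introduced for this sketch; they are not taken from the embodiment or from any Hadoop API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Router:
    # Hypothetical mapping: sub-cluster name -> (address, request handler).
    sub_clusters: Dict[str, Tuple[str, Callable[[dict], bool]]]
    log: List[dict] = field(default_factory=list)
    # Stand-in for "send client information to the sub-cluster" (S107).
    notified: List[Tuple[str, dict]] = field(default_factory=list)

    def route(self, request: dict) -> bool:
        # S101: parse client information from the request and log it.
        client_info = {"src": request["src"], "method": request["method"]}
        entry = {"client": client_info}
        self.log.append(entry)

        # S103: determine the sub-cluster, log its connection information,
        # and forward the request.
        address, handler = self.sub_clusters[request["cluster"]]
        entry["dst"] = address
        succeeded = handler(request)

        # S105: record the feedback in the same log entry, so that it stays
        # associated with the client information.
        entry["succeeded"] = succeeded

        # S107: on failure, send the client information to the sub-cluster
        # identified by the logged connection information.
        if not succeeded:
            self.notified.append((entry["dst"], client_info))
        return succeeded
```

Keeping the client information, the connection information, and the feedback in one log entry is the design choice that makes a failed request traceable from the client to the sub-cluster that handled it.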
The embodiment of the present invention targets the Router-based federation scenario of the Hadoop Distributed File System (HDFS). It provides a method by which the Router records the client IP address, the client request information, the IP address of the server to which the request is forwarded, the processing result of the server, and the overall time consumed by the RPC call. This improves the efficiency of locating and troubleshooting problems when access to HDFS is abnormal, and at the same time ensures that the logs available for audit are complete.
Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by a computer program for instructing relevant hardware, where the program may be stored on a computer readable storage medium, and where the program, when executed, may comprise the steps of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random-access memory (RAM), or the like. Embodiments of the computer program may achieve the same or similar effects as any of the method embodiments previously described.
In some implementations, parsing the client information from the service request includes parsing at least one of an initiation time of the service request, a network address of the client, a remote procedure call method requested by the client, and method parameters of the remote procedure call method.
In some embodiments, determining a corresponding sub-cluster in the distributed file system based on the service request, forwarding the service request to the sub-cluster and logging connection information of the sub-cluster includes determining a corresponding sub-cluster in the distributed file system based on the service request, determining an active name node in the sub-cluster, forwarding the service request to the name node, and logging a network address of the name node to the log.
In some implementations, receiving feedback information for the service request from the sub-cluster includes receiving at least one of an indication of whether the service request was successfully performed and an end time of the service request.
In some implementations, logging feedback information in association with client information corresponding to the feedback information includes determining an overall time consumption of the service request based on an end time of the service request and an initiation time of the service request, and logging whether the service request was successfully performed, the end time of the service request, and the overall time consumption of the service request in association with the client information corresponding to the feedback information.
The various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In some embodiments, the method further comprises feeding back feedback information to the client based on the client information in response to the feedback information comprising success or failure of service execution.
In some embodiments, the method further comprises, in response to receiving a remote procedure call query request, parsing the remote procedure call method from the query request and returning, as a result, the client information and feedback information recorded in the log in association with that remote procedure call method.
In some implementations, the distributed cluster is a Hadoop cluster and the sub-clusters are subordinate to the Hadoop cluster.
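A minimal sketch of such a query, assuming the log is held as a list of dictionaries in which each entry has a "client" field (with a "method" key) and, once the sub-cluster has answered, a "feedback" field. The function name query_by_method and this log layout are assumptions made for illustration, not part of the embodiment.

```python
from typing import Dict, List


def query_by_method(log: List[Dict], method: str) -> List[Dict]:
    """Return client information and feedback for every logged request
    that called the given RPC method."""
    results = []
    for entry in log:
        if entry["client"]["method"] == method:
            results.append({
                "client": entry["client"],
                # Feedback may be missing if the request is still in flight.
                "feedback": entry.get("feedback"),
            })
    return results
```

Because client information and feedback are returned together, a caller can, for example, list which clients saw failures for a given RPC method.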
The computer-readable storage medium (e.g., memory) described herein may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. By way of example, and not limitation, nonvolatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which acts as external cache memory. By way of example, and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
The following further describes a specific implementation of the present invention with reference to the embodiment shown in Fig. 2. Referring to Fig. 2, when a client initiates a request to HDFS and the RPC (remote procedure call) service of the Router receives the client request, the Router records, from the request information of the client, the request initiation time stime, the IP address src (host:port) of the client, the requested RPC method, and the method parameters.
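The fields recorded on arrival can be pictured as a small record type. The sketch below is illustrative: the ClientInfo class, the parse_client_info function, and the request keys client_address, method, and args are hypothetical, and only the names stime and src come from the description above.

```python
import time
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ClientInfo:
    stime: float       # request initiation time, seconds since the epoch
    src: str           # client address formatted as "host:port"
    method: str        # name of the requested RPC method
    parameters: Tuple  # parameters passed to the RPC method


def parse_client_info(request: dict, now: Optional[float] = None) -> ClientInfo:
    """Extract the fields the Router records when a request arrives.

    `now` lets a caller supply the timestamp; it defaults to the current time.
    """
    host, port = request["client_address"]
    return ClientInfo(
        stime=time.time() if now is None else now,
        src=f"{host}:{port}",
        method=request["method"],
        parameters=tuple(request.get("args", ())),
    )
```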
When forwarding the request, the Router locates the correct sub-cluster according to the request information submitted by the client, forwards the request to the active NameNode of that sub-cluster, and records the IP address dst (host:port) of the target NameNode.
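The description does not state how the Router maps a request to a sub-cluster. One plausible approach, shown here purely as an assumption, is a mount table that maps path prefixes to sub-clusters with longest-prefix matching, followed by selecting the NameNode whose state is active. The table contents, host names, and function names below are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical mount table: path prefix -> sub-cluster name.
MOUNT_TABLE: Dict[str, str] = {"/": "ns0", "/user": "ns1", "/user/logs": "ns2"}

# Hypothetical NameNode list per sub-cluster: (host:port, state).
NAMENODES: Dict[str, List[Tuple[str, str]]] = {
    "ns0": [("nn0a:8020", "active"), ("nn0b:8020", "standby")],
    "ns1": [("nn1a:8020", "standby"), ("nn1b:8020", "active")],
    "ns2": [("nn2a:8020", "active")],
}


def resolve_sub_cluster(path: str, mount_table: Dict[str, str]) -> str:
    """Pick the sub-cluster whose mount prefix is the longest match for path."""
    best = None
    for prefix in mount_table:
        base = prefix.rstrip("/")
        # Match whole path components only, so "/users" does not match "/user".
        if path == prefix or path.startswith(base + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise LookupError(f"no mount point covers {path}")
    return mount_table[best]


def active_namenode(sub_cluster: str, namenodes: Dict[str, List[Tuple[str, str]]]) -> str:
    """Return the address (dst) of the active NameNode of the sub-cluster."""
    for address, state in namenodes[sub_cluster]:
        if state == "active":
            return address
    raise LookupError(f"no active NameNode in {sub_cluster}")
```

The address returned by active_namenode corresponds to the dst value that the Router records during forwarding.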
When the request to the NameNode returns, the Router, based on the result of the call to the NameNode, records whether the execution succeeded (succeed/failed) and the time etime at which execution finished. It then calculates the overall time consumption T = etime - stime of the request from the time stime at which the client initiated the request and the time etime at which execution finished, and records T.
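Putting the recorded fields together, a single log line per request might look like the sketch below. The key=value layout and the function name format_audit_line are assumptions for illustration; the description names the fields (stime, src, dst, succeed/failed, etime, T) but does not specify a log format.

```python
def format_audit_line(stime: float, src: str, method: str, parameters: tuple,
                      dst: str, succeeded: bool, etime: float) -> str:
    """Build one log line that ties client, target NameNode, and result together."""
    elapsed = etime - stime  # overall time consumption T = etime - stime
    status = "succeed" if succeeded else "failed"
    return (
        f"stime={stime:.3f} src={src} method={method} parameters={parameters} "
        f"dst={dst} status={status} etime={etime:.3f} T={elapsed:.3f}"
    )
```

Writing T alongside src and dst in the same line is what lets slow or failed requests be grouped by client or by NameNode when analysing load across the federated sub-clusters.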
Furthermore, the method disclosed according to the embodiments of the present invention may also be implemented as a computer program executed by a CPU, which may be stored in a computer-readable storage medium. When executed by the CPU, the computer program performs the functions defined above in the methods disclosed in the embodiments of the present invention. The above method steps and system units may also be implemented with a controller and a computer-readable storage medium storing a computer program that causes the controller to implement the above steps or unit functions.
According to the method for routing requests of a distributed cluster provided by the embodiments of the present invention, a service request of a distributed file system is received from a client, and client information is parsed from the service request and recorded to a log; a corresponding sub-cluster is determined in the distributed file system based on the service request, the service request is forwarded to the sub-cluster, and connection information of the sub-cluster is recorded to the log; feedback information for the service request is received from the sub-cluster and recorded to the log in association with the client information corresponding to the feedback information; and, in response to the feedback information including a service execution failure, the client information corresponding to the feedback information is sent to the sub-cluster based on the connection information. This facilitates rapid locating of service failures and input/output blocking, facilitates analysis of load balancing across the federated cluster, and improves the availability of the distributed cluster.
It should be noted that the steps in the embodiments of the method for routing requests of a distributed cluster may be interleaved, replaced, added, or deleted. Methods for routing requests of a distributed cluster obtained through such reasonable permutations and combinations shall therefore also fall within the protection scope of the present invention, and the protection scope of the present invention shall not be limited to the embodiments.
Based on the above object, a second aspect of the embodiments of the present invention proposes an embodiment of an apparatus for routing requests of a distributed cluster, which facilitates rapid locating of service failures and input/output blocking, facilitates analysis of load balancing across the federated cluster, and improves the availability of the distributed cluster. The apparatus comprises:
a processor; and
a controller storing program code executable by the processor, the processor performing the following steps when executing the program code:
receiving a service request of a distributed file system from a client, parsing client information from the service request, and recording the client information to a log;
determining a corresponding sub-cluster in the distributed file system based on the service request, forwarding the service request to the sub-cluster, and recording connection information of the sub-cluster to the log;
receiving feedback information for the service request from the sub-cluster, and recording the feedback information to the log in association with the client information corresponding to the feedback information; and
in response to the feedback information including a service execution failure, transmitting the client information corresponding to the feedback information to the sub-cluster based on the connection information.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In some implementations, parsing the client information from the service request includes parsing at least one of an initiation time of the service request, a network address of the client, a remote procedure call method requested by the client, and method parameters of the remote procedure call method.
In some embodiments, determining a corresponding sub-cluster in the distributed file system based on the service request, forwarding the service request to the sub-cluster and logging connection information of the sub-cluster includes determining a corresponding sub-cluster in the distributed file system based on the service request, determining an active name node in the sub-cluster, forwarding the service request to the name node, and logging a network address of the name node to the log.
In some implementations, receiving feedback information for the service request from the sub-cluster includes receiving at least one of an indication of whether the service request was successfully performed and an end time of the service request.
In some implementations, logging feedback information in association with client information corresponding to the feedback information includes determining an overall time consumption of the service request based on an end time of the service request and an initiation time of the service request, and logging whether the service request was successfully performed, the end time of the service request, and the overall time consumption of the service request in association with the client information corresponding to the feedback information.
In one or more exemplary designs, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one location to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general purpose or special purpose computer or a general purpose or special purpose processor. Further, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
In some implementations, the steps further include feeding back the feedback information to the client based on the client information in response to the feedback information including success or failure of service execution.
In some embodiments, the steps further include, in response to receiving a remote procedure call query request, parsing the remote procedure call method from the query request and returning, as a result, the client information and feedback information recorded in the log in association with that remote procedure call method.
In some implementations, the distributed cluster is a Hadoop cluster and the sub-clusters are subordinate to the Hadoop cluster.
In some implementations, the clients and the sub-clusters form a federated cluster of the Hadoop cluster.
The devices and apparatuses disclosed in the embodiments of the present invention may be various electronic terminal devices, for example, mobile phones, personal digital assistants (PDAs), tablet computers (PADs), and smart televisions, or may be large terminal devices, for example, servers. The protection scope disclosed in the embodiments of the present invention should therefore not be limited to a specific type of device or apparatus. The client disclosed in the embodiments of the present invention may be applied to any of the above electronic terminal devices in the form of electronic hardware, computer software, or a combination of the two.
According to the apparatus for routing requests of a distributed cluster provided by the embodiments of the present invention, a service request of a distributed file system is received from a client, and client information is parsed from the service request and recorded to a log; a corresponding sub-cluster is determined in the distributed file system based on the service request, the service request is forwarded to the sub-cluster, and connection information of the sub-cluster is recorded to the log; feedback information for the service request is received from the sub-cluster and recorded to the log in association with the client information corresponding to the feedback information; and, in response to the feedback information including a service execution failure, the client information corresponding to the feedback information is sent to the sub-cluster based on the connection information. This facilitates rapid locating of service failures and input/output blocking, facilitates analysis of load balancing across the federated cluster, and improves the availability of the distributed cluster.
It should be noted in particular that the foregoing apparatus embodiment uses the embodiment of the method for routing requests of a distributed cluster to describe the working process of each module, and those skilled in the art can readily conceive of applying these modules to other embodiments of the method. Of course, since the steps in the method embodiments may be interleaved, replaced, added, or deleted, apparatuses obtained through such reasonable permutations and combinations shall also fall within the protection scope of the present invention, and the protection scope of the present invention shall not be limited to the embodiments.
Embodiments of the invention may also include corresponding computer devices. The computer device includes a memory, at least one processor, and a computer program stored on the memory and executable on the processor, the processor executing any one of the methods described above when the program is executed.
The memory, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for routing requests of a distributed cluster in the embodiments of the present application. By running the non-volatile software programs, instructions, and modules stored in the memory, the processor executes the various functional applications and data processing of the server, that is, implements the method for routing requests of a distributed cluster of the above method embodiments.
The memory may include a storage program area that may store an operating system, an application program required for at least one function, and a storage data area that may store data created according to the use of a routing device for a request of a distributed cluster, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the local module through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Embodiments of the present invention may further include a corresponding computer-readable storage medium storing computer-executable instructions for performing the method for routing requests of a distributed cluster in any of the method embodiments described above and for implementing the apparatus for routing requests of a distributed cluster in any of the apparatus embodiments described above. Embodiments of the computer-readable storage medium may achieve the same or similar effects as any of the foregoing method and apparatus embodiments.
Embodiments of the invention may also include a corresponding computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising instructions which, when executed by a computer, cause the computer to perform the method of routing requests for a distributed cluster in any of the method embodiments described above and routing means for implementing the method of routing requests for a distributed cluster in any of the apparatus embodiments described above. Embodiments of the computer program product may achieve the same or similar results as embodiments of any of the methods and apparatus described previously.
Finally, it should be noted that, as will be appreciated by those skilled in the art, all or part of the procedures in implementing the methods of the embodiments described above may be implemented by a computer program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the program may include the procedures of the embodiments of the methods described above when executed. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random-access memory (RAM), or the like. Embodiments of the computer program may achieve the same or similar effects as any of the method embodiments previously described.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It will be appreciated by persons skilled in the art that the foregoing discussion of any embodiment is merely exemplary and is not intended to imply that the scope of the disclosure of embodiments of the invention, including the claims, is limited to such examples, that combinations of features of the above embodiments or of different embodiments may be made and that many other variations of the different aspects of the embodiments of the invention described above exist within the spirit of the embodiments of the invention, which are not provided in detail for clarity. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the embodiments should be included in the protection scope of the embodiments of the present invention.