
CN110362381A - HDFS cluster high-availability deployment method, system, device and storage medium - Google Patents

HDFS cluster high-availability deployment method, system, device and storage medium

Info

Publication number
CN110362381A
Authority
CN
China
Prior art keywords
image
host
container
high availability
hdfs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910543171.1A
Other languages
Chinese (zh)
Inventor
汪涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huichuan Technology Co Ltd
Shenzhen Inovance Technology Co Ltd
Original Assignee
Shenzhen Huichuan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huichuan Technology Co Ltd
Priority to CN201910543171.1A
Publication of CN110362381A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide an HDFS cluster high-availability deployment method, system, device and storage medium. The method comprises: creating a management node image and a data node image for the HDFS cluster, each of which includes configuration files that configure the HDFS cluster in high-availability mode; uploading the management node image and the data node image to an image registry; and deploying the HDFS cluster onto multiple hosts through a Kubernetes platform, whereby the Kubernetes platform creates a primary management container on a first host and a hot-standby management container on a second host from the management node image uploaded to the image registry, and creates a data storage container on at least one third host from the data node image uploaded to the image registry. Embodiments of the invention allow an HDFS cluster to run on inexpensive hardware, which improves the resource utilization of the HDFS cluster and lowers the threshold and capital investment for an enterprise's digital transformation.

Description

HDFS cluster high-availability deployment method, system, device and storage medium
Technical field
Embodiments of the present invention relate to the field of big data applications, and more specifically to an HDFS (Hadoop Distributed File System) cluster high-availability deployment method, system, device and storage medium.
Background
Big data plays an increasingly visible role in work and daily life: it helps shopping platforms recommend suitable products to customers, helps people avoid traffic congestion, supports health checks, powers entertainment, and so on. Because the data volumes are huge and the demands on computation speed and precision are high, strengthening a single computer by simply adding more processors no longer achieves the expected effect. Big data processing is therefore moving toward distributed computing clusters: computers located in different places are connected over a network into one coordinated cluster, the massive data to be processed is distributed across that cluster and computed in parallel by its computing units, and the partial results are finally merged into the final result.
The conventional high-availability deployment method for a big data cluster based on virtual machines or physical machines is as follows. First, prepare a fully distributed Hadoop environment and a fully distributed ZooKeeper environment, and stop all services of the entire cluster. Then modify the three files core-site.xml, hdfs-site.xml and yarn-site.xml in the Hadoop configuration directory for HA (High Availability) mode. Finally, start the ZooKeeper and Hadoop clusters, and start ZKFC (ZooKeeper Failover Controller) to monitor the state of the management nodes (NameNodes).
In HA mode the active NameNode (primary management node) serves external requests, while the standby NameNode (hot-standby management node) stays ready to take over at any moment. Whenever the primary node makes a modification, it notifies a majority of the JournalNode processes (the nodes dedicated to managing the edit log files). The hot-standby node reads the modification information from the JournalNodes, continuously monitors changes to the edit log files, and applies those changes locally. This ensures that, when a fault occurs in the cluster, the namespace state of the hot-standby management node is fully synchronized with that of the primary management node.
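The majority rule mentioned above can be illustrated with a short Python sketch. The function names `majority` and `edit_is_durable` are hypothetical and are not taken from the patent or from Hadoop; the sketch only shows the arithmetic of "more than half of the JournalNodes".

```python
def majority(journal_node_count: int) -> int:
    """Smallest number of JournalNodes that forms a majority."""
    if journal_node_count < 1:
        raise ValueError("at least one JournalNode is required")
    return journal_node_count // 2 + 1


def edit_is_durable(acks: int, journal_node_count: int) -> bool:
    """True once a majority of JournalNodes has acknowledged the edit."""
    return acks >= majority(journal_node_count)
```

With three JournalNodes, for instance, two acknowledgements are enough, so the loss of a single JournalNode does not block the primary node.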
ZooKeeper is used to guarantee that a high-availability cluster has only one primary management node at any moment. The primary management node and the hot-standby management node of the Hadoop cluster are first both registered in the ZooKeeper system. When the primary management node fails, the ZooKeeper system detects the failure and automatically switches the hot-standby management node to be the primary node.
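The register-then-switch behaviour can be modelled with a toy Python class. This is only an in-memory illustration under stated assumptions: `FailoverRegistry` is a hypothetical name, it does not use the ZooKeeper API, and real failover involves sessions, locks and fencing that are omitted here.

```python
from typing import Dict, Optional


class FailoverRegistry:
    """Toy in-memory stand-in for the coordination service (not the ZooKeeper API)."""

    def __init__(self) -> None:
        self._healthy: Dict[str, bool] = {}
        self.active: Optional[str] = None

    def register(self, node: str) -> None:
        # The first node to register becomes the single active node.
        self._healthy[node] = True
        if self.active is None:
            self.active = node

    def report_failure(self, node: str) -> None:
        # If the active node fails, promote the first healthy registered node.
        self._healthy[node] = False
        if node == self.active:
            self.active = next(
                (name for name, ok in self._healthy.items() if ok), None
            )
```

Registering `Namenode` and then `Namenode-ha` leaves `Namenode` active; reporting its failure promotes `Namenode-ha`, so there is never more than one active node.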
However, the conventional big data high-availability deployment method based on physical machines or virtual machines described above has the following disadvantages:
(1) Wasted machine resources: building the cluster from virtual machines or even expensive servers wastes machine resources.
(2) No load balancing within the cluster: the resources of physical machines and virtual machines are fixed, so some nodes carry heavy computing tasks and excessive load while other nodes sit relatively idle.
(3) Inefficient, time-consuming creation and deployment of application components: deploying conventional Hadoop requires distributing installation files to every machine and configuring the cluster services, which occupies a large amount of the cluster's network bandwidth and machine resources and takes a long time.
(4) Inefficient, time-consuming cluster scale-out: when a big data cluster needs to be expanded, the only option is to add machines. Virtual machines and physical machines must load an operating system kernel at startup, and time must also be spent installing and configuring the big data components, so efficiency is low and expansion takes a long time.
(5) No health checks or automatic repair: virtual machines and physical machines cannot recover from faults by themselves.
Summary of the invention
Embodiments of the present invention address the problems of the conventional big data high-availability deployment method based on physical machines or virtual machines described above, namely wasted machine resources, no load balancing within the cluster, inefficient and time-consuming creation and deployment of application components, inefficient and time-consuming cluster scale-out, and the lack of health checks and automatic repair, by providing an HDFS cluster high-availability deployment method, system, device and storage medium.
To solve the above technical problems, embodiments of the present invention provide an HDFS cluster high-availability deployment method, comprising:
creating a management node image and a data node image for the HDFS cluster, the management node image and the data node image each including configuration files that configure the HDFS cluster in high-availability mode;
uploading the management node image and the data node image to an image registry;
deploying the HDFS cluster onto multiple hosts through a Kubernetes platform, the Kubernetes platform creating a primary management container on a first host and a hot-standby management container on a second host based on the management node image uploaded to the image registry, and creating a data storage container on at least one third host based on the data node image uploaded to the image registry.
Preferably, the management node image and the data node image each include an identical startup script, and the startup script includes:
program code that, after the primary management container starts, periodically broadcasts the configuration information of the first host on which the primary management container runs to all data storage containers;
program code that, after the hot-standby management container starts, periodically broadcasts the configuration information of the second host on which the hot-standby management container runs to all data storage containers;
program code that, after a data storage container starts, periodically sends the configuration information of the third host on which that data storage container runs to the primary management container and the hot-standby management container.
Preferably, deploying the HDFS cluster onto multiple hosts through the Kubernetes platform comprises:
mapping, through a first yaml file, the port of the shared storage log inside the primary management container to the corresponding port of the first host's IP address;
mapping, through a second yaml file, the port of the shared storage log inside the hot-standby management container to the corresponding port of the second host's IP address;
mapping, through a third yaml file, the port of the shared storage log inside the data storage container to the corresponding port of the third host's IP address.
Preferably, the high-availability deployment method further comprises:
mounting, through the first yaml file, at least part of the data files in the primary management container onto the first host;
mounting, through the second yaml file, at least part of the data files in the hot-standby management container onto the second host;
mounting, through the third yaml file, at least part of the data files in the data storage container onto the third host.
Embodiments of the present invention also provide an HDFS cluster high-availability deployment system, which includes an image creation unit, an image upload unit and a cluster deployment unit, wherein:
the image creation unit is configured to create a management node image and a data node image for the HDFS cluster, the management node image and the data node image each including configuration files that configure the HDFS cluster in high-availability mode;
the image upload unit is configured to upload the management node image and the data node image to an image registry;
the cluster deployment unit is configured to deploy the HDFS cluster onto multiple hosts through a Kubernetes platform, the Kubernetes platform creating a primary management container on a first host and a hot-standby management container on a second host based on the management node image uploaded to the image registry, and creating a data storage container on at least one third host based on the data node image uploaded to the image registry.
Preferably, the management node image and the data node image each include an identical startup script, and the startup script includes:
program code that, after the primary management container starts, periodically broadcasts the configuration information of the first host on which the primary management container runs to all data storage containers;
program code that, after the hot-standby management container starts, periodically broadcasts the configuration information of the second host on which the hot-standby management container runs to all data storage containers;
program code that, after a data storage container starts, periodically sends the configuration information of the third host on which that data storage container runs to the primary management container and the hot-standby management container.
Preferably, the cluster deployment unit includes a first mapping subunit, a second mapping subunit and a third mapping subunit, wherein:
the first mapping subunit is configured to map, through a first yaml file, the port of the shared storage log inside the primary management container to the corresponding port of the first host's IP address;
the second mapping subunit is configured to map, through a second yaml file, the port of the shared storage log inside the hot-standby management container to the corresponding port of the second host's IP address;
the third mapping subunit is configured to map, through a third yaml file, the port of the shared storage log inside the data storage container to the corresponding port of the third host's IP address.
Preferably, the cluster deployment unit includes a first mounting subunit, a second mounting subunit and a third mounting subunit, wherein:
the first mounting subunit is configured to mount, through the first yaml file, at least part of the data files in the primary management container onto the first host;
the second mounting subunit is configured to mount, through the second yaml file, at least part of the data files in the hot-standby management container onto the second host;
the third mounting subunit is configured to mount, through the third yaml file, at least part of the data files in the data storage container onto the third host.
Embodiments of the present invention also provide an HDFS cluster high-availability deployment device, comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the steps of any of the HDFS cluster high-availability deployment methods described above.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the HDFS cluster high-availability deployment methods described above.
The HDFS cluster high-availability deployment method, system, device and storage medium of the embodiments of the present invention create, for the HDFS cluster, a management node image and a data node image that contain high-availability configuration files, and deploy the HDFS cluster onto multiple hosts through a Kubernetes platform. The HDFS cluster can therefore run on inexpensive hardware, which improves the resource utilization of the HDFS cluster and lowers the threshold and capital investment for an enterprise's digital transformation.
Furthermore, through containerization on the Kubernetes platform, embodiments of the present invention give the HDFS cluster efficient deployment and scale-out, load balancing, automated operations and maintenance, one-click upgrade and expansion, high availability, and strong fault tolerance and disaster recovery.
Brief description of the drawings
Fig. 1 is a flow diagram of the HDFS cluster high-availability deployment method provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of port mapping in the HDFS cluster high-availability deployment method provided by an embodiment of the present invention;
Fig. 3 is a flow diagram of data file mounting in the HDFS cluster high-availability deployment method provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of the HDFS cluster high-availability deployment system provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the HDFS cluster high-availability deployment device provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely explain the invention and do not limit it.
Fig. 1 is a flow diagram of the HDFS cluster high-availability deployment method provided by an embodiment of the present invention. The HDFS cluster is an application based on HDFS (Hadoop Distributed File System) that is used for business data processing, such as managing and maintaining business data. The method of this embodiment can be executed by a client device on which the Kubernetes platform is deployed and container orchestration software is installed, and comprises:
Step S11: create a management node image and a data node image for the HDFS cluster. The management node image and the data node image include identical configuration files.
The configuration files include the files in the HDFS cluster configuration directory that are used to configure the HDFS cluster in high-availability mode. Specifically, these are the following three files: core-site.xml, hdfs-site.xml and yarn-site.xml. When the files are packaged into the images, the relevant parameters in core-site.xml, hdfs-site.xml and yarn-site.xml must be set to high-availability mode.
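As an illustration of what setting the parameters to high-availability mode might involve, the following Python sketch renders a small set of HA properties in the Hadoop `<configuration>` XML layout. The property keys are standard Hadoop HA settings, but the patent does not list which parameters it changes, so the selection is an assumption. The nameservice `mycluster`, the NameNode IDs `nn1` and `nn2`, the journal hosts `journal1` to `journal3`, the ZooKeeper hosts `zk1` to `zk3`, and the helper `render_hadoop_xml` are all hypothetical.

```python
from typing import Dict
from xml.etree import ElementTree as ET


def render_hadoop_xml(properties: Dict[str, str]) -> str:
    """Render name/value pairs in the Hadoop <configuration> XML layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical identifiers and hosts, chosen only for illustration.
NAMESERVICE = "mycluster"

hdfs_site = {
    "dfs.nameservices": NAMESERVICE,
    f"dfs.ha.namenodes.{NAMESERVICE}": "nn1,nn2",
    f"dfs.namenode.rpc-address.{NAMESERVICE}.nn1": "Namenode:8020",
    f"dfs.namenode.rpc-address.{NAMESERVICE}.nn2": "Namenode-ha:8020",
    "dfs.namenode.shared.edits.dir":
        f"qjournal://journal1:8485;journal2:8485;journal3:8485/{NAMESERVICE}",
    "dfs.ha.automatic-failover.enabled": "true",
}

core_site = {
    "fs.defaultFS": f"hdfs://{NAMESERVICE}",
    "ha.zookeeper.quorum": "zk1:2181,zk2:2181,zk3:2181",
}

hdfs_site_xml = render_hadoop_xml(hdfs_site)
core_site_xml = render_hadoop_xml(core_site)
```

Generating the files from one dictionary is a design choice of this sketch: it keeps the configuration identical in both images, which is what step S11 requires.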
Step S12: upload the management node image and the data node image to an image registry (for example, with the docker build and docker push commands).
In particular, a Dockerfile must be written before an image is packaged and uploaded. A Dockerfile is a script made up of a series of commands and parameters. When writing the Dockerfile, the relevant HDFS cluster component installation packages can be placed in the directory that contains the Dockerfile. The management node image and the data node image have different Dockerfiles.
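The build-and-push step for the two images could be scripted roughly as follows. This Python sketch only assembles the command lines; the registry address `registry.example.com`, the image names, the tag and the directory layout are hypothetical, and `image_commands` is not a function from the patent.

```python
from typing import List


def image_commands(registry: str, name: str, tag: str,
                   dockerfile: str, context: str) -> List[List[str]]:
    """Return the docker build and docker push command lines for one image."""
    ref = f"{registry}/{name}:{tag}"
    return [
        # -t names the image, -f selects the Dockerfile, the last argument is
        # the build context (the directory holding the installation packages).
        ["docker", "build", "-t", ref, "-f", dockerfile, context],
        ["docker", "push", ref],
    ]


# One Dockerfile per image type, as the description requires.
namenode_cmds = image_commands(
    "registry.example.com", "hdfs-namenode", "1.0",
    "namenode/Dockerfile", "namenode")
datanode_cmds = image_commands(
    "registry.example.com", "hdfs-datanode", "1.0",
    "datanode/Dockerfile", "datanode")
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` on a machine where Docker is installed and logged in to the registry.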
Step S13: deploy the HDFS cluster onto multiple hosts through the Kubernetes platform (for example, with the kubectl apply command). The Kubernetes platform creates a primary management container on a first host and a hot-standby management container on a second host based on the management node image uploaded to the image registry, and creates a data storage container on at least one third host based on the data node image uploaded to the image registry, so that the HDFS cluster is deployed as a Kubernetes high-availability container cluster.
By creating a management node image and a data node image and deploying the HDFS cluster onto multiple hosts through the Kubernetes platform, the above method allows the HDFS cluster to run on inexpensive hardware, which improves the resource utilization of the HDFS cluster and lowers the threshold and capital investment for an enterprise's digital transformation. Moreover, because the Kubernetes container cluster resulting from the deployment includes an additional hot-standby management node (the second host running the hot-standby management container), the cluster can switch quickly to the hot-standby management node when the primary management node (the first host running the primary management container) fails, which makes the application highly available.
Because the nodes of a Hadoop cluster communicate using machine domain names, after the HDFS cluster has been deployed through the Kubernetes platform, the mapping between host domain names and IP addresses must be registered on every node. For example, on a Linux server the content shown in Table 1 below can be added to the /etc/hosts file to establish the mapping:
IP address    Host domain name
10.42.3.2     Namenode (primary management node)
10.42.3.3     Namenode-ha (hot-standby management node)
10.42.3.4     Datanode1 (data node 1)
10.42.3.5     Datanode2 (data node 2)
Table 1: node communication settings
The Hadoop system can then access one of the data storage nodes, that is, a third host running a data storage container, through the host domain name of data node 1 (the host running data storage container 1), because the configured /etc/hosts file resolves the host domain name to an IP address. If the information in Table 1 is not configured in that file, the Hadoop cluster cannot resolve the host domain names and its nodes lose contact with one another.
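The entries of Table 1 could be written to and read back from the /etc/hosts format as in this minimal Python sketch. The helper names `render_hosts` and `parse_hosts` are hypothetical; only the four name and address pairs come from Table 1.

```python
from typing import Dict

# Host domain name -> IP address, as listed in Table 1.
TABLE_1 = {
    "Namenode": "10.42.3.2",
    "Namenode-ha": "10.42.3.3",
    "Datanode1": "10.42.3.4",
    "Datanode2": "10.42.3.5",
}


def render_hosts(mapping: Dict[str, str]) -> str:
    """Render one '<ip><TAB><name>' line per entry, in /etc/hosts format."""
    return "".join(f"{ip}\t{name}\n" for name, ip in mapping.items())


def parse_hosts(text: str) -> Dict[str, str]:
    """Parse /etc/hosts-style text back into a name -> IP mapping."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name] = ip
    return mapping
```

A node that receives such text from its peers can merge `parse_hosts(text)` into its own mapping before rewriting its hosts file.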
To establish the above mapping, in one embodiment of the present invention the management node image and the data node image created in step S11 each include an identical startup script. The startup script includes: program code that, after the primary management container starts, periodically broadcasts the configuration information of the first host on which the primary management container runs to all data storage containers; program code that, after the hot-standby management container starts, periodically broadcasts the configuration information of the second host on which the hot-standby management container runs to all data storage containers; and program code that, after a data storage container starts, periodically sends the configuration information of the third host on which that data storage container runs to the primary management container and the hot-standby management container. In this way, the primary management container, the hot-standby management container and the data storage containers always know one another's configuration information, so the Hadoop cluster does not lose internal contact because host domain names cannot be resolved.
Specifically, the primary management container and the hot-standby management container created from the management node image, and the data storage containers created from the data node image, can each determine their node type from the IP address of their host, and thereby select the corresponding code in the startup script to execute.
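The branch selection in the shared startup script might look like the following Python sketch. It assumes the script is given a table of host IP addresses and roles; the patent does not say how that table is obtained, and the role constants, the example addresses and `announcement_targets` are hypothetical.

```python
from typing import Dict, List

# Hypothetical role labels for the three container types.
PRIMARY, STANDBY, DATA = "primary-management", "standby-management", "data-storage"


def announcement_targets(host_ip: str, roles: Dict[str, str]) -> List[str]:
    """Return the host IPs this container should periodically announce itself to.

    Management containers announce to every data storage container;
    data storage containers announce to both management containers.
    """
    role = roles[host_ip]  # node type is derived from the host's IP address
    if role in (PRIMARY, STANDBY):
        return sorted(ip for ip, r in roles.items() if r == DATA)
    return sorted(ip for ip, r in roles.items() if r in (PRIMARY, STANDBY))


# Example role table with made-up host addresses.
ROLES = {
    "192.168.0.11": PRIMARY,
    "192.168.0.12": STANDBY,
    "192.168.0.21": DATA,
    "192.168.0.22": DATA,
}
```

A real script would call this on a timer and send the host's name and address to each target; the transport is left out because the source does not specify it.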
Containers inside the Kubernetes container cluster can communicate with one another, but the cluster cannot be accessed from outside. Therefore, in one embodiment of the present invention, as shown in Fig. 2, deploying the HDFS cluster onto multiple hosts through the Kubernetes platform comprises:
Step S131: through the first yaml file, map the port of the shared storage log inside the primary management container to the corresponding port of the first host's IP address. Specifically, the ports inside the primary management container can be mapped to the web management page port of the Hadoop primary management node, the API interface of the Hadoop primary management node, and the web management page of Hadoop YARN at the first host's IP address.
Step S132: through the second yaml file, map the port of the shared storage log inside the hot-standby management container to the corresponding port of the second host's IP address. Specifically, the ports inside the hot-standby management container can be mapped to the web management page port of the Hadoop hot-standby management node, the API interface of the Hadoop hot-standby management node, and the web management page of the hot-standby Hadoop YARN at the second host's IP address (if the hot-standby management node is in the standby state, the page automatically redirects to the web management page of the primary management node).
Step S133: through the third yaml file, map the port of the shared storage log inside the data storage container to the corresponding port of the third host's IP address. Specifically, the ports inside the data storage container can be mapped to the web management page port of the Hadoop YARN resource management node (this port increases with the number of the data storage container) and the API interface of the Hadoop data storage container (this port also increases with the number of the data storage container) at the third host's IP address.
In this way, external devices (such as a web terminal) can access the primary management container, the hot-standby management container and the data storage containers in the HDFS cluster.
The image (IMAGE) field of the first, second and third yaml files must be filled in with the address of the image registry from step S12.
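The content of such a yaml file can be sketched as a Python dictionary that mirrors a Kubernetes Pod manifest. Whether the patent uses Pods or another resource kind is not stated, so `kind: Pod` is an assumption. The base ports 50075 and 8042, the rule "host port = base port + container number", the node name `host-3`, the registry address and both helper functions are likewise hypothetical.

```python
from typing import Dict, List


def data_container_ports(index: int, base_ports: List[int]) -> Dict[int, int]:
    """Map each container port to a host port that grows with the container number."""
    return {port: port + index for port in base_ports}


def pod_manifest(name: str, image: str, node_name: str,
                 port_map: Dict[int, int]) -> dict:
    """Build a Pod manifest (as a dict) that exposes container ports on the host."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "nodeName": node_name,  # pin the Pod to one specific host
            "containers": [{
                "name": name,
                # The image field carries the registry address from step S12.
                "image": image,
                "ports": [
                    {"containerPort": container_port, "hostPort": host_port}
                    for container_port, host_port in port_map.items()
                ],
            }],
        },
    }


datanode1 = pod_manifest(
    "datanode1",
    "registry.example.com/hdfs-datanode:1.0",
    "host-3",
    data_container_ports(1, [50075, 8042]),
)
```

The dictionary can be serialized to a yaml file with any YAML library and then submitted with kubectl apply.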
In addition, because a container is an ephemeral ("burn after reading") application, the HDFS cluster high-availability deployment method may further include the following steps, so that important data files in a container are not lost after a restart:
Step S134: through the first yaml file, mount at least part of the data files in the primary management container onto the first host; that is, mount the directory that holds the important data files in the primary management container onto the corresponding directory of the first host.
Step S135: through the second yaml file, mount at least part of the data files in the hot-standby management container onto the second host; that is, mount the directory that holds the important data files in the hot-standby management container onto the corresponding directory of the second host.
Step S136: through the third yaml file, mount at least part of the data files in the data storage container onto the third host; that is, mount the directory that holds the important data files in the data storage container onto the corresponding directory of the third host.
In this way, directories inside the containers are mapped to directories on the hosts, and important data files are persisted on the hosts' hard disks.
As shown in Fig. 4, embodiments of the present invention also provide an HDFS cluster high-availability deployment system. The HDFS cluster is an application based on HDFS (Hadoop Distributed File System) that is used for business data processing, such as managing and maintaining business data. The system of this embodiment can be integrated into a client device on which the Kubernetes platform is deployed and container orchestration software is installed. It includes an image creation unit 41, an image upload unit 42 and a cluster deployment unit 43, which can be implemented by software running on a device such as a computer (for example, the above client device with container orchestration software installed).
The image creation unit 41 is configured to create a management node image and a data node image for the HDFS cluster. The management node image and the data node image include identical configuration files.
The configuration files include the files in the HDFS cluster configuration directory that are used to configure the HDFS cluster in high-availability mode. Specifically, these are the following three files: core-site.xml, hdfs-site.xml and yarn-site.xml. When the files are packaged into the images, the relevant parameters in core-site.xml, hdfs-site.xml and yarn-site.xml must be set to high-availability mode.
The image upload unit 42 is configured to upload the management node image and the data node image to an image registry.
The cluster deployment unit 43 is configured to deploy the HDFS cluster onto multiple hosts through the Kubernetes platform (for example, with the kubectl apply command). The Kubernetes platform creates a primary management container on a first host and a hot-standby management container on a second host based on the management node image uploaded to the image registry, and creates a data storage container on at least one third host based on the data node image uploaded to the image registry, so that the HDFS cluster is deployed as a Kubernetes high-availability container cluster.
For the inter-node communication for realizing Hadoop cluster, in one embodiment of the invention, above-mentioned image creating unit The management node mirror image and back end mirror image of 41 creations respectively include identical starting script, and above-mentioned starting script includes: After the starting of main management container, the configuration of the first host where periodically broadcasting main management container to all data storage containers The program code of information;After the starting of hot standby management container, hot standby management container institute periodically is broadcasted to all data storage containers The second host configuration information program code;After data storage container starting, periodically to main management container and heat Standby management container sends the program code of the configuration information of the third host where the data storage container.
To enable external access to the deployed Kubernetes container cluster, the cluster deployment unit 43 includes a first mapping subunit, a second mapping subunit and a third mapping subunit, wherein:
The first mapping subunit is used to map, through a first yaml file, the ports of the shared storage log inside the primary management container to the corresponding ports on the IP address of the first host. Specifically, the ports of the shared storage log inside the primary management container are mapped to the following on the IP address of the first host: the web management page port of the Hadoop primary management node, the API interface of the Hadoop primary management node, and the web management page of Hadoop YARN.
The second mapping subunit is used to map, through a second yaml file, the ports of the shared storage log inside the hot standby management container to the corresponding ports on the IP address of the second host. Specifically, the ports of the shared storage log inside the hot standby management container are mapped to the following on the IP address of the second host: the web management page port of the Hadoop hot standby management node, the API interface of the Hadoop hot standby management node, and the web management page of the hot standby Hadoop YARN (if the hot standby management node is in the standby state, the page automatically redirects to the web management page of the primary management node).
The third mapping subunit is used to map, through a third yaml file, the ports of the shared storage log inside the data storage container to the corresponding ports on the IP address of the third host. Specifically, the ports of the shared storage log inside the data storage container are mapped to the following on the IP address of the third host: the web management page port of the resource management node of Hadoop YARN (this port number is incremented with the number of data storage containers) and the API interface of the Hadoop data storage container (this port number is likewise incremented with the number of data storage containers).
In addition, the address of the image registry must be filled in the image (IMAGE) field of the first yaml file, the second yaml file and the third yaml file.
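The port mapping and the image field described above can be sketched as the container section of a Pod manifest, expressed here as Python dictionaries. The source gives no port numbers, so the Hadoop 3.x default ports, the port names, the registry address and the host IP addresses below are assumptions; `containerPort`, `hostPort` and `hostIP` are the standard Kubernetes container port fields. The `offset` argument illustrates the statement that data storage container ports are incremented with the container number.

```python
# Hypothetical registry address; the source only says it must be filled in.
REGISTRY = "registry.example.com/hadoop"

# Assumed Hadoop 3.x default ports; adjust for the Hadoop version in use.
MANAGEMENT_PORTS = {"nn-web": 9870, "nn-rpc": 8020, "yarn-web": 8088}
DATA_PORTS = {"nm-web": 8042, "dn-ipc": 9867}


def container_spec(name, image, host_ip, ports, offset=0):
    """Build a container entry whose ports are exposed on the host IP.

    offset shifts every host port, so that several data storage
    containers can expose distinct host ports.
    """
    return {
        "name": name,
        "image": image,
        "ports": [
            {
                "name": key,
                "containerPort": port,
                "hostPort": port + offset,
                "hostIP": host_ip,
            }
            for key, port in ports.items()
        ],
    }


primary = container_spec(
    "hdfs-primary", f"{REGISTRY}/management-node:latest", "192.0.2.11", MANAGEMENT_PORTS
)
standby = container_spec(
    "hdfs-standby", f"{REGISTRY}/management-node:latest", "192.0.2.12", MANAGEMENT_PORTS
)
# Second data storage container: host ports shifted by one.
data_2 = container_spec(
    "hdfs-data-2", f"{REGISTRY}/data-node:latest", "192.0.2.13", DATA_PORTS, offset=1
)
```

Each dictionary corresponds to one entry under `spec.containers` in the respective yaml file.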
Further, since a container is an ephemeral ("burn after reading") application, to avoid losing important data files in the container after a restart, the cluster deployment unit 43 may also include a first mount subunit, a second mount subunit and a third mount subunit, wherein: the first mount subunit is used to mount, through the first yaml file, at least part of the data files in the primary management container to the first host; the second mount subunit is used to mount, through the second yaml file, at least part of the data files in the hot standby management container to the second host; and the third mount subunit is used to mount, through the third yaml file, at least part of the data files in the data storage container to the third host.
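One way to express such a mount in a Pod manifest is a `hostPath` volume, sketched below as Python dictionaries. The source does not name the volume type or the directories involved, so the use of `hostPath`, the volume names and every path shown are assumptions for illustration.

```python
import copy


def with_host_mounts(pod_spec, mounts):
    """Return a copy of pod_spec whose first container mounts host directories.

    mounts maps a path inside the container to a directory on the host.
    """
    spec = copy.deepcopy(pod_spec)
    volumes = spec.setdefault("volumes", [])
    container = spec["containers"][0]
    volume_mounts = container.setdefault("volumeMounts", [])
    for index, (container_path, host_path) in enumerate(mounts.items()):
        volume_name = f"data-{index}"
        volumes.append(
            {
                "name": volume_name,
                "hostPath": {"path": host_path, "type": "DirectoryOrCreate"},
            }
        )
        volume_mounts.append({"name": volume_name, "mountPath": container_path})
    return spec


# Hypothetical pod spec and directories.
base_spec = {
    "nodeName": "host-1",
    "containers": [
        {"name": "hdfs-primary", "image": "registry.example.com/hadoop/management-node:latest"}
    ],
}
persistent_spec = with_host_mounts(
    base_spec,
    {
        "/hadoop/dfs/name": "/data/hdfs/name",
        "/hadoop/logs": "/data/hdfs/logs",
    },
)
```

With this layout the files written under the mounted container paths live on the host, so a recreated container on the same host finds them again after a restart.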
The HDFS cluster high-availability deployment system in this embodiment belongs to the same concept as the HDFS cluster high-availability deployment method in the embodiments corresponding to Figs. 1-3 above. Its specific implementation is detailed in the corresponding method embodiments, and the technical features of the method embodiments apply correspondingly in this embodiment, so they are not repeated here.
An embodiment of the present invention also provides an HDFS cluster high-availability deployment device 5, which may specifically be a client on which the Kubernetes platform is deployed. As shown in Fig. 5, the HDFS cluster high-availability deployment device 5 includes a memory 51 and a processor 52. The memory 51 stores a computer program executable on the processor 52, and the processor 52 implements the steps of the HDFS cluster high-availability deployment method described above when executing the computer program.
The HDFS cluster high-availability deployment device 5 in this embodiment belongs to the same concept as the HDFS cluster high-availability deployment method in the embodiments corresponding to Figs. 1-3 above. Its specific implementation is detailed in the corresponding method embodiments, and the technical features of the method embodiments apply correspondingly in this device embodiment, so they are not repeated here.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the HDFS cluster high-availability deployment method described above are implemented. The computer-readable storage medium in this embodiment belongs to the same concept as the HDFS cluster high-availability deployment method in the embodiments corresponding to Figs. 1-3 above. Its specific implementation is detailed in the corresponding method embodiments, and the technical features of the method embodiments apply correspondingly in this embodiment, so they are not repeated here.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is only an example. In practical applications, the above functions may be assigned to different functional units or modules as required. The functional units and modules in the embodiments may be integrated in one processor, each unit may exist physically on its own, or two or more units may be integrated in one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing them from one another and are not intended to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, refer to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed application container cluster alarm implementation method, system and device may be implemented in other ways. For example, the application container cluster alarm implementation system embodiment described above is merely illustrative.
In addition, the functional units in the embodiments of the present application may be integrated in one processor, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module or unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, and when executed by a processor it can implement the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be increased or reduced as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The embodiments described above are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application.

Claims (10)

1. An HDFS cluster high-availability deployment method, characterized in that the high-availability deployment method comprises:
creating a management node image and a data node image for the HDFS cluster, the management node image and the data node image each including a configuration file that configures the HDFS cluster in high-availability mode;
uploading the management node image and the data node image to an image registry respectively;
deploying the HDFS cluster to multiple hosts through a Kubernetes platform, and, through the Kubernetes platform, creating a primary management container on a first host and a hot standby management container on a second host based on the management node image uploaded to the image registry, and creating a data storage container on at least one third host based on the data node image uploaded to the image registry.
2. The HDFS cluster high-availability deployment method according to claim 1, characterized in that the management node image and the data node image each include the same startup script, and the startup script includes:
program code that, after the primary management container starts, periodically broadcasts to all data storage containers the configuration information of the first host on which the primary management container resides;
program code that, after the hot standby management container starts, periodically broadcasts to all data storage containers the configuration information of the second host on which the hot standby management container resides;
program code that, after a data storage container starts, periodically sends to the primary management container and the hot standby management container the configuration information of the third host on which the data storage container resides.
3. The HDFS cluster high-availability deployment method according to claim 2, characterized in that deploying the HDFS cluster to multiple hosts through the Kubernetes platform comprises:
mapping, through a first yaml file, the port of the shared storage log inside the primary management container to the corresponding port on the IP address of the first host;
mapping, through a second yaml file, the port of the shared storage log inside the hot standby management container to the corresponding port on the IP address of the second host;
mapping, through a third yaml file, the port of the shared storage log inside the data storage container to the corresponding port on the IP address of the third host.
4. The HDFS cluster high-availability deployment method according to claim 3, characterized in that the high-availability deployment method further comprises:
mounting, through the first yaml file, at least part of the data files in the primary management container to the first host;
mounting, through the second yaml file, at least part of the data files in the hot standby management container to the second host;
mounting, through the third yaml file, at least part of the data files in the data storage container to the third host.
5. An HDFS cluster high-availability deployment system, characterized in that the high-availability deployment system includes an image creation unit, an image uploading unit and a cluster deployment unit, wherein:
the image creation unit is used to create a management node image and a data node image for the HDFS cluster, the management node image and the data node image each including a configuration file that configures the HDFS cluster in high-availability mode;
the image uploading unit is used to upload the management node image and the data node image to an image registry;
the cluster deployment unit is used to deploy the HDFS cluster to multiple hosts through a Kubernetes platform, and, through the Kubernetes platform, to create a primary management container on a first host and a hot standby management container on a second host based on the management node image uploaded to the image registry, and to create a data storage container on at least one third host based on the data node image uploaded to the image registry.
6. The HDFS cluster high-availability deployment system according to claim 5, characterized in that the management node image and the data node image each include the same startup script, and the startup script includes:
program code that, after the primary management container starts, periodically broadcasts to all data storage containers the configuration information of the first host on which the primary management container resides;
program code that, after the hot standby management container starts, periodically broadcasts to all data storage containers the configuration information of the second host on which the hot standby management container resides;
program code that, after a data storage container starts, periodically sends to the primary management container and the hot standby management container the configuration information of the third host on which the data storage container resides.
7. The HDFS cluster high-availability deployment system according to claim 6, characterized in that the cluster deployment unit includes a first mapping subunit, a second mapping subunit and a third mapping subunit, wherein:
the first mapping subunit is used to map, through a first yaml file, the port of the shared storage log inside the primary management container to the corresponding port on the IP address of the first host;
the second mapping subunit is used to map, through a second yaml file, the port of the shared storage log inside the hot standby management container to the corresponding port on the IP address of the second host;
the third mapping subunit is used to map, through a third yaml file, the port of the shared storage log inside the data storage container to the corresponding port on the IP address of the third host.
8. The HDFS cluster high-availability deployment system according to claim 7, characterized in that the cluster deployment unit includes a first mount subunit, a second mount subunit and a third mount subunit, wherein:
the first mount subunit is used to mount, through the first yaml file, at least part of the data files in the primary management container to the first host;
the second mount subunit is used to mount, through the second yaml file, at least part of the data files in the hot standby management container to the second host;
the third mount subunit is used to mount, through the third yaml file, at least part of the data files in the data storage container to the third host.
9. An HDFS cluster high-availability deployment device, characterized in that it includes a memory and a processor, the memory stores a computer program executable on the processor, and the processor, when executing the computer program, implements the steps of the HDFS cluster high-availability deployment method according to any one of claims 1-5.
10. A computer-readable storage medium, characterized in that a computer program is stored on the storage medium, and when the computer program is executed by a processor, the steps of the HDFS cluster high-availability deployment method according to any one of claims 1 to 4 are implemented.
CN201910543171.1A 2019-06-21 2019-06-21 HDFS cluster High Availabitity dispositions method, system, equipment and storage medium Pending CN110362381A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910543171.1A CN110362381A (en) 2019-06-21 2019-06-21 HDFS cluster High Availabitity dispositions method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910543171.1A CN110362381A (en) 2019-06-21 2019-06-21 HDFS cluster High Availabitity dispositions method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110362381A true CN110362381A (en) 2019-10-22

Family

ID=68217577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910543171.1A Pending CN110362381A (en) 2019-06-21 2019-06-21 HDFS cluster High Availabitity dispositions method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110362381A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795404A (en) * 2019-10-31 2020-02-14 京东方科技集团股份有限公司 Hadoop distributed file system and operation method and repair method thereof
CN110806880A (en) * 2019-11-04 2020-02-18 紫光云技术有限公司 High-reliability, high-performance and high-efficiency container cluster deployment method
CN110806881A (en) * 2019-11-05 2020-02-18 浪潮云信息技术有限公司 Method for deploying different CPU architectures by kubernets
CN110837394A (en) * 2019-11-07 2020-02-25 浪潮云信息技术有限公司 High-availability configuration version warehouse configuration method, terminal and readable medium
CN111026414A (en) * 2019-12-12 2020-04-17 杭州安恒信息技术股份有限公司 HDP platform deployment method based on kubernets
CN111131449A (en) * 2019-12-23 2020-05-08 华中科技大学 Method for constructing service clustering framework of water resource management system
CN111158851A (en) * 2019-12-10 2020-05-15 航天物联网技术有限公司 Rapid deployment method of virtual machine
CN111880934A (en) * 2020-07-29 2020-11-03 北京浪潮数据技术有限公司 Resource management method, device, equipment and readable storage medium
CN111897541A (en) * 2020-08-03 2020-11-06 上海嗨酷强供应链信息技术有限公司 Software interaction platform and method for automatically deploying resources in cloud environment
CN112311886A (en) * 2020-10-30 2021-02-02 新华三大数据技术有限公司 Multi-cluster deployment method, device and management node
CN112667564A (en) * 2020-12-30 2021-04-16 湖南博匠信息科技有限公司 Zynq platform record management method and system
CN112769964A (en) * 2021-04-12 2021-05-07 江苏红网技术股份有限公司 Method for yann support hybrid operation
CN113645071A (en) * 2021-08-10 2021-11-12 广域铭岛数字科技有限公司 Cluster deployment method, system, medium and electronic terminal
CN113656181A (en) * 2021-08-23 2021-11-16 中国工商银行股份有限公司 Method and device for issuing real-time application cluster instance resources
CN113886058A (en) * 2020-07-01 2022-01-04 中国联合网络通信集团有限公司 Cross-cluster resource scheduling method and device
CN113934564A (en) * 2021-09-26 2022-01-14 聚好看科技股份有限公司 Cluster log storage method and device
CN114691357A (en) * 2022-03-16 2022-07-01 东云睿连(武汉)计算技术有限公司 HDFS containerization service system, method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105141456A (en) * 2015-08-25 2015-12-09 山东超越数控电子有限公司 Method for monitoring high-availability cluster resource
CN106888254A (en) * 2017-01-20 2017-06-23 华南理工大学 A kind of exchange method between container cloud framework based on Kubernetes and its each module
CN109271233A (en) * 2018-07-25 2019-01-25 上海数耕智能科技有限公司 The implementation method of Hadoop cluster is set up based on Kubernetes

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795404A (en) * 2019-10-31 2020-02-14 京东方科技集团股份有限公司 Hadoop distributed file system and operation method and repair method thereof
CN110795404B (en) * 2019-10-31 2023-04-07 京东方科技集团股份有限公司 Hadoop distributed file system and operation method and repair method thereof
CN110806880A (en) * 2019-11-04 2020-02-18 紫光云技术有限公司 High-reliability, high-performance and high-efficiency container cluster deployment method
CN110806881A (en) * 2019-11-05 2020-02-18 浪潮云信息技术有限公司 Method for deploying different CPU architectures by kubernets
CN110806881B (en) * 2019-11-05 2023-07-04 浪潮云信息技术股份公司 Method for deploying different CPU architectures by kubernetes
CN110837394B (en) * 2019-11-07 2023-10-27 浪潮云信息技术股份公司 High-availability configuration version warehouse configuration method, terminal and readable medium
CN110837394A (en) * 2019-11-07 2020-02-25 浪潮云信息技术有限公司 High-availability configuration version warehouse configuration method, terminal and readable medium
CN111158851B (en) * 2019-12-10 2022-04-29 航天物联网技术有限公司 Rapid deployment method of virtual machine
CN111158851A (en) * 2019-12-10 2020-05-15 航天物联网技术有限公司 Rapid deployment method of virtual machine
CN111026414A (en) * 2019-12-12 2020-04-17 杭州安恒信息技术股份有限公司 HDP platform deployment method based on kubernets
CN111026414B (en) * 2019-12-12 2023-09-08 杭州安恒信息技术股份有限公司 HDP platform deployment method based on kubernetes
CN111131449B (en) * 2019-12-23 2021-03-26 华中科技大学 Method for constructing service clustering framework of water resource management system
CN111131449A (en) * 2019-12-23 2020-05-08 华中科技大学 Method for constructing service clustering framework of water resource management system
CN113886058A (en) * 2020-07-01 2022-01-04 中国联合网络通信集团有限公司 Cross-cluster resource scheduling method and device
CN111880934A (en) * 2020-07-29 2020-11-03 北京浪潮数据技术有限公司 Resource management method, device, equipment and readable storage medium
CN111897541A (en) * 2020-08-03 2020-11-06 上海嗨酷强供应链信息技术有限公司 Software interaction platform and method for automatically deploying resources in cloud environment
CN111897541B (en) * 2020-08-03 2021-08-17 汇链通供应链科技(上海)有限公司 Software interaction platform and method for automatically deploying resources in cloud environment
CN112311886B (en) * 2020-10-30 2022-03-01 新华三大数据技术有限公司 Multi-cluster deployment method, device and management node
CN112311886A (en) * 2020-10-30 2021-02-02 新华三大数据技术有限公司 Multi-cluster deployment method, device and management node
CN112667564A (en) * 2020-12-30 2021-04-16 湖南博匠信息科技有限公司 Zynq platform record management method and system
CN112769964A (en) * 2021-04-12 2021-05-07 江苏红网技术股份有限公司 Method for yann support hybrid operation
CN113645071B (en) * 2021-08-10 2022-12-09 广域铭岛数字科技有限公司 Cluster deployment method, system, medium and electronic terminal
CN113645071A (en) * 2021-08-10 2021-11-12 广域铭岛数字科技有限公司 Cluster deployment method, system, medium and electronic terminal
CN113656181A (en) * 2021-08-23 2021-11-16 中国工商银行股份有限公司 Method and device for issuing real-time application cluster instance resources
CN113934564A (en) * 2021-09-26 2022-01-14 聚好看科技股份有限公司 Cluster log storage method and device
CN114691357A (en) * 2022-03-16 2022-07-01 东云睿连(武汉)计算技术有限公司 HDFS containerization service system, method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191022
