CN102946323A - Realizing method for location awareness of compute node cabinet in HDFS (Hadoop Distributed File System) and realizing system thereof - Google Patents
Realizing method for location awareness of compute node cabinet in HDFS (Hadoop Distributed File System) and realizing system thereof
- Publication number
- CN102946323A (application CN2012104110497A)
- Authority
- CN
- China
- Prior art keywords
- hadoop
- computing node
- address
- hdfs
- rack
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a method and a system for implementing rack location awareness of compute nodes in HDFS (Hadoop Distributed File System). The method comprises the following steps: A. starting the Hadoop distributed file system; B. checking the configuration option in the standard configuration file of the Hadoop distributed file system; C. linking the configuration option to the detection script file; D. obtaining the IP addresses of the compute nodes in the Hadoop compute cluster; E. judging whether a compute node belongs to the Hadoop compute cluster; F. judging whether the IP address has corresponding rack information; G. returning the corresponding rack information to the Hadoop cluster system; H. returning the default rack information to the Hadoop cluster system; and I. the Hadoop cluster system being in an abnormal state. The method overcomes two problems: the interconnection between rack switches becomes the bottleneck for data lookup and operations between nodes, and all replicas of the same data block may be stored in one rack, so that data safety is hard to guarantee when that rack loses power.
Description
Technical field
The present invention relates to the field of high-performance computing and clusters, and in particular to a method for implementing rack location awareness of compute nodes in HDFS.
Background
An HDFS (Hadoop Distributed File System) cluster is generally large and is usually deployed across several or even dozens of racks. Racks are generally connected through layer-2 aggregation switches, so the bandwidth for data exchanged between switches is generally lower than the bandwidth for data exchanged within a switch; in a cluster, network traffic between nodes in the same rack is therefore usually more efficient than traffic between nodes in different racks. At the same time, the management node tries to place the replicas of a block in different racks to improve the fault tolerance of the system. Both techniques require the HDFS system to know which rack a node belongs to, that is, its rack ID. In other words, HDFS must be rack-aware.
At present there is no clear solution for giving the HDFS file system rack awareness.
Without rack awareness, HDFS runs into the following two problems:
1. The interconnection between rack switches becomes the bottleneck for data lookup and operations between nodes.
2. All replicas of the same data block may end up in the same rack, so data safety is hard to guarantee when that rack loses power.
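The second problem can be made concrete with a small check. The following is a minimal Python sketch and not part of the patent: the node names, block ids, and the function `blocks_confined_to_one_rack` are hypothetical and chosen only for illustration. It assumes a node-to-rack mapping is already available and lists the blocks whose replicas all sit in a single rack.

```python
# Hypothetical node-to-rack mapping (illustrative values only).
NODE_TO_RACK = {
    "datanode1": "/dc1/rack1",
    "datanode2": "/dc1/rack1",
    "datanode3": "/dc1/rack2",
}

# Hypothetical replica locations for two blocks.
BLOCK_REPLICAS = {
    "blk_1": ["datanode1", "datanode2"],  # both replicas in /dc1/rack1
    "blk_2": ["datanode1", "datanode3"],  # replicas span two racks
}


def blocks_confined_to_one_rack(block_replicas, node_to_rack):
    """Return the sorted ids of blocks whose replicas all sit in one rack."""
    at_risk = []
    for block_id, nodes in block_replicas.items():
        racks = {node_to_rack[node] for node in nodes}
        if len(racks) == 1:
            at_risk.append(block_id)
    return sorted(at_risk)


print(blocks_confined_to_one_rack(BLOCK_REPLICAS, NODE_TO_RACK))
```

A block reported by this check would become unavailable if its one rack lost power, which is the situation rack awareness is meant to prevent.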
Summary of the invention
To address the deficiencies of the prior art, the invention provides a method for implementing rack location awareness of compute nodes in HDFS. The method overcomes two problems: the interconnection between rack switches becomes the bottleneck for data lookup and operations between nodes, and all replicas of the same data block may end up in the same rack, so that data safety is hard to guarantee when that rack loses power.
The object of the invention is achieved by the following technical solution:
A method for implementing rack location awareness of compute nodes in HDFS, the improvement being that the method comprises the following steps:
A. starting the Hadoop distributed file system;
B. checking the configuration option in the standard configuration file of the Hadoop distributed file system;
C. linking the configuration option to the detection script file;
D. obtaining the IP addresses of the compute nodes in the Hadoop compute cluster;
E. judging whether the compute node belongs to this Hadoop compute cluster;
F. judging whether the IP address has corresponding rack information;
G. returning the corresponding rack information to the Hadoop cluster system;
H. returning the default rack information to the Hadoop cluster system;
I. the Hadoop cluster system is in an abnormal state.
In step B, the standard configuration file is hadoop-default.xml.
In step D, the Hadoop compute cluster comprises at least one compute node and one management node; each time a compute node is detected connecting to the management node, the IP address of that compute node is obtained and sent to the detection script file.
In step E, after the validity of the IP address is verified, the IP address is compared with the information in the configuration option to judge whether the compute node belongs to this Hadoop compute cluster.
If the compute node is judged to belong to this Hadoop compute cluster, step F is carried out; otherwise step I is carried out.
In step F, if the IP address is judged to have corresponding rack information, step G is carried out; otherwise step H is carried out.
A mapping relationship exists between the compute nodes and the racks.
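Step B, checking the configuration option in the standard file, can be sketched as follows. This is a minimal Python illustration under stated assumptions rather than the patent's implementation: it assumes the file uses Hadoop's usual `<configuration>/<property>/<name>/<value>` layout, the option name `topology.script.file.name` is the one given in the detailed description, and the script path `/etc/hadoop/topology.py` and the helper `read_option` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration content; the script path is a placeholder.
SAMPLE_CONFIG = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>topology.script.file.name</name>
    <value>/etc/hadoop/topology.py</value>
  </property>
</configuration>
"""


def read_option(xml_text, option_name):
    """Return the value of the named property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == option_name:
            return prop.findtext("value")
    return None


script_path = read_option(SAMPLE_CONFIG, "topology.script.file.name")
print(script_path)
```

If the option is missing, `read_option` returns `None`, which a caller could treat as "no detection script configured".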
As a further object, the invention provides a system for implementing rack location awareness of compute nodes in HDFS, the improvement being that the system comprises the following modules:
Start module: used to start the Hadoop distributed file system;
Checking module: used to check the configuration option in the standard configuration file of the Hadoop distributed file system;
IP address acquisition module: used to obtain the IP addresses of the compute nodes in the Hadoop compute cluster;
Compute node judgment module: used to judge whether the compute node belongs to this Hadoop compute cluster;
IP address judgment module: used to judge whether the IP address has corresponding rack information.
Compared with the prior art, the invention achieves the following beneficial effects:
With the method and system for rack location awareness of compute nodes in HDFS provided by the invention, the HDFS file system gains rack awareness, which brings two benefits:
1. Data replicas can be placed in the same rack as far as possible, which keeps data lookup and operations fast and optimizes system performance.
2. The replicas of the same data block are never all placed in the same rack, which keeps the system's data available when one rack loses power and improves system safety.
Description of the drawings
Fig. 1 is a flow chart of the method for implementing rack location awareness of compute nodes in HDFS provided by the invention.
Detailed description
Specific embodiments of the invention are described in further detail below with reference to the accompanying drawing.
As shown in Fig. 1, the method for implementing rack location awareness of compute nodes in HDFS provided by the invention comprises the following steps:
A. Start the Hadoop distributed file system.
B. Check the topology.script.file.name configuration option in the standard configuration file hadoop-default.xml of the Hadoop distributed file system.
C. Link the configuration option to the detection script file.
D. Obtain the IP addresses of the compute nodes in the Hadoop compute cluster: the Hadoop compute cluster comprises a plurality of compute nodes and one management node; each time a compute node is detected connecting to the management node, the IP address of that compute node is obtained and sent to the detection script file.
E. After verifying the validity of the IP address, compare the IP address with the information in the configuration option to judge whether the compute node belongs to this Hadoop compute cluster: if it does, carry out step F; otherwise carry out step I.
F. Judge whether the IP address has corresponding rack information: if it does, carry out step G; otherwise carry out step H.
G. Return the corresponding rack information to the Hadoop cluster system.
H. Return the default rack information to the Hadoop cluster system.
I. The Hadoop cluster system is in an abnormal state.
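The branching in steps D through I can be sketched in Python as follows. This is an illustration under assumptions, not the patented script: the names `resolve_rack`, `ClusterAbnormalError`, `CLUSTER_IPS`, and `IP_TO_RACK`, together with all IP addresses, are hypothetical; the default label `/default-rack` is assumed here; and treating a malformed IP address as the abnormal case of step I is an interpretation, since the text only says validity is verified before the membership check.

```python
import ipaddress

# Assumed default rack label (step H); the text only says a default is returned.
DEFAULT_RACK = "/default-rack"

# Hypothetical cluster membership and IP-to-rack data (illustrative values).
CLUSTER_IPS = {"10.0.1.11", "10.0.1.12", "10.0.2.21"}
IP_TO_RACK = {"10.0.1.11": "/dc1/rack1", "10.0.2.21": "/dc1/rack2"}


class ClusterAbnormalError(Exception):
    """Signals step I: the node cannot be accepted by this cluster."""


def resolve_rack(ip_text, cluster_ips=CLUSTER_IPS, ip_to_rack=IP_TO_RACK):
    """Return the rack for one compute node IP, following steps E to I."""
    # Step E, first half: verify that the text is a well-formed IP address.
    try:
        ip = str(ipaddress.ip_address(ip_text))
    except ValueError as exc:
        raise ClusterAbnormalError(f"invalid IP address: {ip_text!r}") from exc
    # Step E, second half: does the node belong to this cluster? If not, step I.
    if ip not in cluster_ips:
        raise ClusterAbnormalError(f"{ip} is not a member of this cluster")
    # Step F: rack information found gives step G; otherwise step H (default).
    return ip_to_rack.get(ip, DEFAULT_RACK)


print(resolve_rack("10.0.1.11"))
print(resolve_rack("10.0.1.12"))
```

The first call takes the step G branch and the second takes the step H branch, because `10.0.1.12` is a cluster member without a rack entry.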
Example
The following is a sample of the topology configuration file that maps compute nodes to rack information:
Datanode1 /dc1/rack1
Datanode2 /dc1/rack1
Datanode3 /dc1/rack2
Here, Datanode denotes a compute node in the Hadoop system; dc is short for data center; rack denotes the rack information.
Each line of this file carries one piece of information: which rack of which data center a datanode belongs to.
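A file in this format can be read into a lookup table with a few lines of Python. The sketch below is illustrative only: the function names `parse_topology` and `split_rack_path` are hypothetical, and skipping blank lines and `#` comment lines is an added assumption that the patent does not mention.

```python
# Sample content copied from the example above.
TOPOLOGY_TEXT = """\
Datanode1 /dc1/rack1
Datanode2 /dc1/rack1
Datanode3 /dc1/rack2
"""


def parse_topology(text):
    """Map each node name to its rack path, one '<node> <rack path>' per line."""
    mapping = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):  # assumption: allow blanks/comments
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected '<node> <rack path>'")
        node, rack_path = fields
        mapping[node] = rack_path
    return mapping


def split_rack_path(rack_path):
    """Split a two-level path such as '/dc1/rack1' into ('dc1', 'rack1')."""
    datacenter, rack = rack_path.strip("/").split("/")
    return datacenter, rack


topology = parse_topology(TOPOLOGY_TEXT)
print(topology["Datanode3"], split_rack_path(topology["Datanode3"]))
```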
The method provided by the invention uses a script file to call the Java interface of the Hadoop distributed system and pass the rack information of the Datanode compute nodes to the Hadoop cluster system. The cluster thereby becomes aware of node locations, which optimizes system performance and improves system safety. Usually, data is backed up to keep it safe, so that no data is lost when a machine fails. In a Hadoop cluster system, the most common arrangement is to keep two backups of the data; in the best case, one backup is placed in the same rack as the original data and the other is placed in a different rack. If a machine fails, the first choice is to fetch its backup from within the same rack, because the transfer is fast and the data does not have to pass through the inter-rack switch (which avoids the switch "bottleneck" problem). If the whole rack is damaged, the backup data can be sought in another rack. Deciding whether nodes are in the same rack comes down to identifying the location of compute nodes, and the method of the present invention handles this situation well.
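The "best case" placement described above, one backup in the same rack as the original data and one in a different rack, can be sketched as follows. This is a simplified Python illustration and not a reproduction of HDFS's built-in block placement policy; the function `choose_backup_nodes` and the node names are hypothetical, and picking the alphabetically first candidate is an arbitrary choice made only to keep the example deterministic.

```python
# Hypothetical node-to-rack mapping, matching the sample topology file.
NODE_TO_RACK = {
    "Datanode1": "/dc1/rack1",
    "Datanode2": "/dc1/rack1",
    "Datanode3": "/dc1/rack2",
}


def choose_backup_nodes(source_node, node_to_rack):
    """Pick one backup node in the source's rack and one in another rack.

    Returns (same_rack_node, other_rack_node); either may be None when no
    suitable node exists.
    """
    source_rack = node_to_rack[source_node]
    same_rack = sorted(
        node for node, rack in node_to_rack.items()
        if rack == source_rack and node != source_node
    )
    other_rack = sorted(
        node for node, rack in node_to_rack.items() if rack != source_rack
    )
    local = same_rack[0] if same_rack else None
    remote = other_rack[0] if other_rack else None
    return local, remote


print(choose_backup_nodes("Datanode1", NODE_TO_RACK))
```

For data written on `Datanode3`, which is alone in its rack, the same-rack slot comes back as `None`, so a caller would have to fall back to another rack for both backups.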
Finally, it should be noted that the above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or replaced with equivalents, and that any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall fall within the scope of its claims.
Claims (8)
1. A method for implementing rack location awareness of compute nodes in HDFS, characterized in that the method comprises the following steps:
A. starting the Hadoop distributed file system;
B. checking the configuration option in the standard configuration file of the Hadoop distributed file system;
C. linking the configuration option to the detection script file;
D. obtaining the IP addresses of the compute nodes in the Hadoop compute cluster;
E. judging whether the compute node belongs to this Hadoop compute cluster;
F. judging whether the IP address has corresponding rack information;
G. returning the corresponding rack information to the Hadoop cluster system;
H. returning the default rack information to the Hadoop cluster system;
I. the Hadoop cluster system is in an abnormal state.
2. The method for implementing rack location awareness of compute nodes in HDFS according to claim 1, characterized in that, in step B, the standard configuration file is hadoop-default.xml.
3. The method for implementing rack location awareness of compute nodes in HDFS according to claim 1, characterized in that, in step D, the Hadoop compute cluster comprises at least one compute node and one management node; each time a compute node is detected connecting to the management node, the IP address of that compute node is obtained and sent to the detection script file.
4. The method for implementing rack location awareness of compute nodes in HDFS according to claim 1, characterized in that, in step E, after the validity of the IP address is verified, the IP address is compared with the information in the configuration option to judge whether the compute node belongs to this Hadoop compute cluster.
5. The method for implementing rack location awareness of compute nodes in HDFS according to claim 4, characterized in that, if the compute node is judged to belong to this Hadoop compute cluster, step F is carried out; otherwise step I is carried out.
6. The method for implementing rack location awareness of compute nodes in HDFS according to claim 1, characterized in that, in step F, if the IP address is judged to have corresponding rack information, step G is carried out; otherwise step H is carried out.
7. The method for implementing rack location awareness of compute nodes in HDFS according to any one of claims 1 to 6, characterized in that a mapping relationship exists between the compute nodes and the racks.
8. A system for implementing rack location awareness of compute nodes in HDFS, characterized in that the system comprises the following modules:
Start module: used to start the Hadoop distributed file system;
Checking module: used to check the configuration option in the standard configuration file of the Hadoop distributed file system;
IP address acquisition module: used to obtain the IP addresses of the compute nodes in the Hadoop compute cluster;
Compute node judgment module: used to judge whether the compute node belongs to this Hadoop compute cluster;
IP address judgment module: used to judge whether the IP address has corresponding rack information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012104110497A CN102946323A (en) | 2012-10-24 | 2012-10-24 | Realizing method for location awareness of compute node cabinet in HDFS (Hadoop Distributed File System) and realizing system thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102946323A (en) | 2013-02-27 |
Family
ID=47729232
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012104110497A Pending CN102946323A (en) | 2012-10-24 | 2012-10-24 | Realizing method for location awareness of compute node cabinet in HDFS (Hadoop Distributed File System) and realizing system thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102946323A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101901275A (en) * | 2010-08-23 | 2010-12-01 | 华中科技大学 | Distributed storage system and method thereof |
US20120236761A1 (en) * | 2011-03-15 | 2012-09-20 | Futurewei Technologies, Inc. | Systems and Methods for Automatic Rack Detection |
CN102196049A (en) * | 2011-05-31 | 2011-09-21 | 北京大学 | Method suitable for secure migration of data in storage cloud |
Non-Patent Citations (1)
Title |
---|
TOM WHITE: "《Hadoop: The Definitive Guide, Third Edition》", 7 May 2012, O’REILLY MEDIA * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104615606A (en) * | 2013-11-05 | 2015-05-13 | 阿里巴巴集团控股有限公司 | Hadoop distributed file system and management method thereof |
CN104615606B (en) * | 2013-11-05 | 2018-04-06 | 阿里巴巴集团控股有限公司 | A kind of Hadoop distributed file systems and its management method |
CN103561033A (en) * | 2013-11-08 | 2014-02-05 | 西安电子科技大学宁波信息技术研究院 | Device and method for user to have remote access to HDFS cluster |
CN103561033B (en) * | 2013-11-08 | 2016-11-02 | 西安电子科技大学宁波信息技术研究院 | User remotely accesses the device and method of HDFS cluster |
CN105592178A (en) * | 2015-09-17 | 2016-05-18 | 杭州华三通信技术有限公司 | Method and device for determining position of data node |
CN105592178B (en) * | 2015-09-17 | 2018-12-25 | 新华三技术有限公司 | A kind of back end method for determining position and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109308223B (en) | Service request response method and equipment | |
CN106534328B (en) | Node connection method and distributed computing system | |
CN109344014B (en) | Main/standby switching method and device and communication equipment | |
CN103581276A (en) | Cluster management device and system, service client side and corresponding method | |
CN102025630A (en) | Load balancing method and load balancing system | |
CN109787827B (en) | CDN network monitoring method and device | |
CN101753597B (en) | Keeping alive method between peer node and client under peer node-client architecture | |
CN108737574A (en) | A kind of node off-line judgment method, device, equipment and readable storage medium storing program for executing | |
CN103458013A (en) | Streaming media server cluster load balancing system and balancing method | |
CN112118130B (en) | Self-adaptive distributed cache active-standby state information switching method and device | |
CN102006189A (en) | Primary access server determination method and device for dual-machine redundancy backup | |
CN107729205B (en) | Fault processing method and device for business system | |
CN102694689A (en) | Method and device for discovering network topology | |
CN112217847A (en) | Micro service platform, implementation method thereof, electronic device and storage medium | |
CN112291116A (en) | Link fault detection method and device and network equipment | |
WO2016062166A1 (en) | Method, apparatus and system for network operations, administration and maintenance | |
CN104796283B (en) | A kind of method of monitoring alarm | |
CN102946323A (en) | Realizing method for location awareness of compute node cabinet in HDFS (Hadoop Distributed File System) and realizing system thereof | |
CN107592199B (en) | Data synchronization method and system | |
CN105306566A (en) | Method and system for electing master control node in cloud storage system | |
CN104935614B (en) | Data transmission method and device | |
EP3171565B1 (en) | Methods, devices and system for netconf hello packets interaction | |
CN109992531A (en) | Date storage method and device | |
CN114090342A (en) | Storage disaster tolerance link management method, message execution node and storage control cluster | |
CN107451254B (en) | Method for generating unique identifier of database table data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20130227 |