[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

CN106293537B - A lightweight approach to autonomous block management for data-intensive file systems - Google Patents

A lightweight approach to autonomous block management for data-intensive file systems Download PDF

Info

Publication number
CN106293537B
CN106293537B CN201610665489.3A CN201610665489A CN106293537B CN 106293537 B CN106293537 B CN 106293537B CN 201610665489 A CN201610665489 A CN 201610665489A CN 106293537 B CN106293537 B CN 106293537B
Authority
CN
China
Prior art keywords
data
data storage
block
storage node
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610665489.3A
Other languages
Chinese (zh)
Other versions
CN106293537A (en
Inventor
陈付梅
韩德志
毕坤
王军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University filed Critical Shanghai Maritime University
Priority to CN201610665489.3A priority Critical patent/CN106293537B/en
Publication of CN106293537A publication Critical patent/CN106293537A/en
Application granted granted Critical
Publication of CN106293537B publication Critical patent/CN106293537B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0643Management of files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本发明公开了一种轻量级的数据密集型文件系统的自治块管理方法,通过交叉迁移划分(ISD,Intersected Shifted Declustering)实现数据块到数据存储节点的映射、数据存储节点中数据块的快速查找、数据存储节点失效时数据块的快速恢复和新添加数据存储节点时数据块的快速重新分布等,使主节点只负责数据密集型文件系统中文件命名空间的存储和维护,而数据块到数据存储节点的映射关系信息存储和维护,以及数据存储节点失效时数据块的替换和新数据存储节点添加时数据块的重新分布等都由数据存储节点自治完成。该发明节省了数据密集型文件系统中主节点的内存空间,提高了主节点的处理能力,能大幅度提高大数据环境下数据密集型文件系统的数据块管理效率。

The invention discloses a light-weight autonomous block management method of a data-intensive file system, which realizes the mapping of data blocks to data storage nodes and the fast data block storage in data storage nodes through Intersected Shifted Declustering (ISD). Search, fast recovery of data blocks when a data storage node fails, and fast redistribution of data blocks when a new data storage node is added, etc., make the master node only responsible for the storage and maintenance of the file namespace in the data-intensive file system, and the data blocks to The storage and maintenance of mapping relationship information of data storage nodes, the replacement of data blocks when data storage nodes fail, and the redistribution of data blocks when new data storage nodes are added are all completed autonomously by data storage nodes. The invention saves the memory space of the master node in the data-intensive file system, improves the processing capability of the master node, and can greatly improve the data block management efficiency of the data-intensive file system in the big data environment.

Description

一种轻量级的数据密集型文件系统的自治块管理方法A lightweight approach to autonomous block management for data-intensive file systems

技术领域technical field

本发明涉及计算机安全技术,尤其涉及一种轻量级的数据密集型文件系统的自治块管理方法。The invention relates to computer security technology, in particular to a light-weight autonomous block management method of a data-intensive file system.

背景技术Background technique

数据密集型文件系统DiFS,例如谷歌文件系统GFS、Hadoop分布式文件系统HDFS等,已经成为大数据存储管理的主要文件系统。当前的数据密集型文件系统DiFS采用主从式架构,主节点(元数据服务器)管理所有的元数据,从节点(数据存储节点)只负责数据存储。为了维持高可用性,这些存储系统通常将数据文件分为固定大小的块,每个数据块通常有3个副本,并将它们都分配到不同的集群的数据存储节点中。主节点必须记录成百上千个数据存储节点的地址,以及记录所有数据文件的数据块到这些存储节点的映射信息。并且,主节点必须定期地检查所有数据块的地址映射信息的变化。随着数据量的不断增大,这些元数据信息不仅占据了主节点的内存空间,影响主节点的处理能力,而且严重地限制了主节点的可扩展性。Data-intensive file system DiFS, such as Google File System GFS, Hadoop Distributed File System HDFS, etc., has become the main file system for big data storage management. The current data-intensive file system DiFS adopts a master-slave architecture. The master node (metadata server) manages all metadata, and the slave node (data storage node) is only responsible for data storage. In order to maintain high availability, these storage systems usually divide data files into blocks of fixed size, each data block usually has 3 copies, and distribute them to different data storage nodes of the cluster. The master node must record the addresses of hundreds or even thousands of data storage nodes, and record the mapping information from the data blocks of all data files to these storage nodes. Moreover, the master node must periodically check the changes of the address mapping information of all data blocks. As the amount of data continues to increase, these metadata information not only occupy the memory space of the master node, affect the processing capability of the master node, but also seriously limit the scalability of the master node.

为了解决数据密集型文件系统存在的问题,将数据文件物理块的分配和维护从元数据管理中分离出来,由每个数据存储节点执行数据块到存储节点映射信息的维护方法应运而生。应用此方法,主节点不需要再保存大量的数据块元数据信息以及数据块到数据存储节点的映射表信息,而是需要用一组数据块到数据存储节点、数据存储节点到数据块之间的可逆映射函数完成。In order to solve the problems existing in data-intensive file systems, the allocation and maintenance of data file physical blocks are separated from metadata management, and the method of maintaining data block-to-storage node mapping information by each data storage node came into being. Applying this method, the master node does not need to save a large amount of data block metadata information and the mapping table information from data blocks to data storage nodes, but needs to use a set of data blocks to data storage nodes, data storage nodes to data blocks The reversible mapping function of is completed.

数据密集型文件系统管理海量的数据,这些数据具有以下特点:1)数据量大,数据总量增长快;2)数据存储性能需求高;3)要求高可靠性和高可恢复性:当数据发生丢失或数据存储节点失效时,在不影响正常工作的前提下,能够快速的恢复原数据;4)要求能够快速的查找数据块的存储位置;5)要求尽量少的占用主节点的内存空间和尽量少的影响主节点的处理能力;Data-intensive file systems manage massive amounts of data, which have the following characteristics: 1) large data volume and rapid growth in total data volume; 2) high data storage performance requirements; 3) high reliability and high recoverability requirements: when data In the event of loss or data storage node failure, the original data can be quickly restored without affecting normal work; 4) It is required to quickly find the storage location of the data block; 5) It is required to occupy as little memory space as possible on the master node and affect the processing capability of the master node as little as possible;

从以上分析可以看出,传统文件系统的管理方法不适应数据密集型文件系统的管理,主要原因:1)随着数据量的不断增大,文件数据块地址表的存储将占用大量的存储空间;2)主节点负责文件数据块地址表的维护,随着文件数据块地址表的不断增加,大大降低了主节点的处理能力;3)数据量的不断增加不仅占用了主节点大量的存储空间,增大了地址等元数据维护成本,同时还降低了主节点的可扩展性;4)每个数据存储节点在进行存储和查询时都要先咨询主节点,这样增加寻址的时间。It can be seen from the above analysis that the traditional file system management methods are not suitable for the management of data-intensive file systems. The main reasons are: 1) With the continuous increase of data volume, the storage of file data block address tables will occupy a large amount of storage space ; 2) The master node is responsible for the maintenance of the file data block address table. With the continuous increase of the file data block address table, the processing capacity of the master node is greatly reduced; 3) The continuous increase of the data volume not only takes up a large amount of storage space of the master node , which increases the maintenance cost of metadata such as addresses, and also reduces the scalability of the master node; 4) Each data storage node must first consult the master node when storing and querying, which increases the addressing time.

发明内容Contents of the invention

针对数据密集型文件系统的数据块存储和查询管理需求,本发明提供了一种轻量级的数据密集型文件系统的自治块管理方法,通过将物理数据块的分配、查询和相关元数据维护从传统的元数据管理中分离出来,由每个数据存储节点完成,减少主节点存储空间的开销和负担。本发明可提升大数据环境下的数据密集型文件系统的可扩展性、减少数据块寻址时间,并可大副度提高整个系统的性能。Aiming at the data block storage and query management requirements of the data-intensive file system, the present invention provides a light-weight autonomous block management method of the data-intensive file system, through the distribution, query and related metadata maintenance of physical data blocks Separated from traditional metadata management, it is completed by each data storage node, reducing the overhead and burden of the master node's storage space. The invention can improve the expansibility of the data-intensive file system in the big data environment, reduce the data block addressing time, and greatly improve the performance of the whole system.

本发明的技术原理在于,本发明是通过交叉迁移划分方法(ISD,IntersectedShifted Declustering)实现数据块的自治管理,即通过用一组可逆数学函数实现数据块到数据存储节点,以及数据存储节点到数据块的映射,完成数据块的分布式存储和快速查询等。The technical principle of the present invention is that the present invention realizes the autonomous management of data blocks through the intersected shifted division method (ISD, IntersectedShifted Declustering), that is, by using a set of reversible mathematical functions to realize data blocks to data storage nodes, and data storage nodes to data Block mapping, complete distributed storage and fast query of data blocks, etc.

本发明具体包含以下几种操作:The present invention specifically comprises following several operations:

操作1、数据块存储操作;Operation 1. Data block storage operation;

操作2、数据块查找操作;Operation 2. Data block search operation;

操作3、失效数据存储节点失效处理操作;Operation 3. Invalidation data storage node failure processing operation;

操作4、添加新数据存储节点操作。Operation 4. Add a new data storage node operation.

(1)数据块存储操作包括以下步骤:(1) The data block storage operation includes the following steps:

步骤1.1、主节点通过可逆的线性哈希函数选择数据块所在逻辑组(LG);Step 1.1. The master node selects the logical group (LG) where the data block is located through a reversible linear hash function;

步骤1.2、主节点通过可逆的位移分割函数选择逻辑组中数据存储节点存储数据块数据;Step 1.2, the master node selects the data storage node in the logic group through the reversible displacement partition function to store the data block data;

步骤1.3、数据存储节点存储数据块数据和数据块地址映射信息。Step 1.3, the data storage node stores data block data and data block address mapping information.

(2)数据块查找操作包括以下步骤:(2) The data block search operation includes the following steps:

步骤2.1、数据块b所在数据存储节点根据其索引号用反向可逆函数计算数据块b所在逻辑组的新ID;Step 2.1, the data storage node where the data block b is located uses a reverse reversible function to calculate the new ID of the logical group where the data block b is located according to its index number;

步骤2.2、数据块b所在数据存储节点根据数据块b所在逻辑组ID,用反向可逆函数计算数据块b的物理ID,为文件系统恢复完整的数据文件提供条件;Step 2.2, the data storage node where the data block b is located uses the reverse reversible function to calculate the physical ID of the data block b according to the logical group ID where the data block b is located, and provides conditions for the file system to recover the complete data file;

步骤2.3、数据存储节点根据数据块的物理ID,获取数据块在存储节点的映射信息;Step 2.3, the data storage node obtains the mapping information of the data block on the storage node according to the physical ID of the data block;

步骤2.4、数据存储节点根据数据块b的映射信息取数据块b的数据送文件系统。Step 2.4, the data storage node fetches the data of data block b according to the mapping information of data block b and sends it to the file system.

(3)失效数据存储节点失效处理操作包括以下步骤:(3) The failure processing operation of the failure data storage node includes the following steps:

步骤3.1、确定失效数据存储节点所在逻辑分组;Step 3.1, determine the logical grouping where the failed data storage node is located;

步骤3.2、选择数据存储失效节点以外的逻辑分组中负载最小的数据存储节点作为后备节点;Step 3.2, select the data storage node with the smallest load in the logical grouping other than the data storage failure node as the backup node;

步骤3.3、多个后备数据存储节点采用智能重组映射方法并行复制各个逻辑组中对应的该失效数据存储节点中包含的数据。In step 3.3, multiple backup data storage nodes use the intelligent reorganization mapping method to parallelly copy the data contained in the corresponding failure data storage nodes in each logical group.

(4)添加新数据存储节点操作包括以下步骤:(4) The operation of adding a new data storage node includes the following steps:

步骤4.1、计算整个系统中所有逻辑组中数据存储节点的平均负载COVaveStep 4.1, calculating the average load COV ave of data storage nodes in all logical groups in the entire system;

步骤4.2、选择一个逻辑组,计算该组中所有数据存储节点中最大的负载COVmaxStep 4.2, select a logical group, and calculate the maximum load COV max among all data storage nodes in the group;

步骤4.3、比较COVmax和COVave的大小,如果COVmax≥COVave,用新加入数据存储节点替换逻辑组该数据存储节点。否则,选取下一个逻辑组,重复步骤4.1、步骤4.2和步骤4.3,直到新加入的数据存储节点的负载达到或接近系统中数据存储节点的平均负载为止。Step 4.3, compare the size of COV max and COV ave , if COV max ≥ COV ave , replace the data storage node in the logical group with the newly added data storage node. Otherwise, select the next logical group and repeat steps 4.1, 4.2 and 4.3 until the load of the newly added data storage node reaches or approaches the average load of the data storage nodes in the system.

这种数据密集型文件系统自治块管理方法的优势在于:The advantages of this data-intensive approach to file system autonomous block management are:

(1)大大减少了主节点存储空间开销。将数据块到数据存储节点映射信息从传统的元数据中分离出来,由每个数据存储节点自主的进行存储和管理,主节点不需要保存和维护大量的数据块地址信息,使主节点保存的元数据信息比传统文件系统减少90%以上。(1) The storage space overhead of the master node is greatly reduced. The data block to data storage node mapping information is separated from the traditional metadata, and each data storage node independently stores and manages it. The master node does not need to save and maintain a large amount of data block address information, so that the master node saves Metadata information is reduced by more than 90% compared to traditional file systems.

(2)大大的提高主节点的处理能力。数据块和数据存储节点之间的映射信息由每个数据存储节点自主的存储和维护,消除了主节点的负担。此种方法与分布式文件系统HDFS相比,可使主节点的处理性能提高了30%以上。(2) Greatly improve the processing capacity of the master node. The mapping information between data blocks and data storage nodes is independently stored and maintained by each data storage node, eliminating the burden on the master node. Compared with the distributed file system HDFS, this method can improve the processing performance of the master node by more than 30%.

(3)提高了系统的可恢复性和可扩展性。当数据存储节点故障时通过采用智能重组映射方法,当添加新数据存储节点时通过采用解耦地址映射方法,这样只迁移少数数据块就能完成失效数据节点数据的恢复和新添加数据节点数据的复制,大大提高了系统的可恢复性和可扩展性。(3) Improve the recoverability and scalability of the system. When the data storage node fails, the intelligent reorganization mapping method is adopted, and the decoupling address mapping method is adopted when adding a new data storage node, so that only a few data blocks can be migrated to complete the recovery of the data of the failed data node and the data of the newly added data node. Replication greatly improves the recoverability and scalability of the system.

附图说明Description of drawings

图1为本发明具体操作的流程图;Fig. 1 is the flow chart of concrete operation of the present invention;

图2为本发明中主节点和数据存储节点管理功能划分的示意图;Fig. 2 is a schematic diagram of the division of management functions between master nodes and data storage nodes in the present invention;

图3为连续块到数据节点的映射和数据节点到块的查找的示例;Figure 3 is an example of the mapping of continuous blocks to data nodes and the lookup of data nodes to blocks;

图4为数据节点失效恢复过程的示例;Figure 4 is an example of a data node failure recovery process;

图5为新数据节点添加过程的示例。Figure 5 is an example of the process of adding a new data node.

具体实施方式Detailed ways

为了使本发明实现的技术手段、创作特征、达成目的与功效易于明白了解,下面结合图示与具体实施例,进一步阐述本发明提出的一种轻量级的数据密集型文件系统的自治块管理方法。In order to make the technical means, creative features, goals and effects achieved by the present invention easy to understand, the following will further explain the autonomous block management of a light-weight data-intensive file system proposed by the present invention in combination with diagrams and specific embodiments method.

一种轻量级的数据密集型文件系统的自治块管理方法,通过一组可逆数学函数实现数据块到数据节点及数据节点到数据块的映射。如图2所示,本发明中各节点具体功能的划分:主节点只负责系统命名空间维护、数据块到数据存储节点的分布、各个数据存储节点的管理;各个数据存储节点负责数据块的一致性检查、数据块恢复和数据存储节点的映射信息存储和维护。A lightweight autonomous block management method for data-intensive file systems, which implements the mapping from data blocks to data nodes and from data nodes to data blocks through a set of reversible mathematical functions. As shown in Figure 2, the specific functions of each node in the present invention are divided: the master node is only responsible for the maintenance of the system namespace, the distribution of data blocks to data storage nodes, and the management of each data storage node; each data storage node is responsible for the consistency of data blocks Consistency check, data block recovery, and mapping information storage and maintenance of data storage nodes.

如图1所示,本发明所述的自治块管理方法,具体包括以下几种操作:As shown in Figure 1, the autonomous block management method described in the present invention specifically includes the following operations:

操作1、数据块存储操作;Operation 1. Data block storage operation;

操作2、数据块查找操作;Operation 2. Data block search operation;

操作3、失效数据存储节点失效处理操作;Operation 3. Invalidation data storage node failure processing operation;

操作4、添加新数据存储节点操作。Operation 4. Add a new data storage node operation.

(1)数据块存储操作,包括以下步骤:(1) Data block storage operation, including the following steps:

步骤1.1、主节点通过可逆的线性哈希函数选择块所在逻辑组(LG);Step 1.1. The master node selects the logical group (LG) where the block is located through a reversible linear hash function;

步骤1.2、主节点通过可逆的位移分割函数选择逻辑组中数据存储节点存储数据块数据;Step 1.2, the master node selects the data storage node in the logic group through the reversible displacement partition function to store the data block data;

步骤1.3、数据存储节点存储数据块数据和数据块地址映射信息。Step 1.3, the data storage node stores data block data and data block address mapping information.

数据块存储操作的步骤1.1中,通过可逆的线性哈希函数选择数据块所在的逻辑组(LG)公式:In step 1.1 of the data block storage operation, the logic group (LG) formula where the data block is selected is selected through a reversible linear hash function:

其中,g是要映射的逻辑组ID,x是系统中当前的组数,X是开始时系统中的逻辑组数,b是要存储数据块在其文件中的块ID,s是新增的逻辑组数, Among them, g is the logical group ID to be mapped, x is the current number of groups in the system, X is the number of logical groups in the system at the beginning, b is the block ID of the file in which the data block is to be stored, and s is the newly added number of logical groups,

数据块存储操作的步骤1.2中,通过可逆的位移分割函数选择逻辑组中数据存储节点的过程包括:In step 1.2 of the data block storage operation, the process of selecting data storage nodes in the logic group through a reversible displacement partition function includes:

A)计算数据块b映射到逻辑组g后的新块标识,其公式:A) Calculate the new block identifier after data block b is mapped to logical group g, its formula:

其中,a是数据块在逻辑组g中的新标识,x是当前逻辑组数,X是初始的逻辑组数,b给定数据块ID,s是新增的逻辑组数, Among them, a is the new identifier of the data block in the logical group g, x is the current logical group number, X is the initial logical group number, b specifies the data block ID, s is the newly added logical group number,

B)计算数据块b映射到逻辑组g中的数据存储节点的索引ID,其公式:B) Calculate the index ID of data block b mapped to the data storage node in logical group g, its formula:

d=node(a,i)=(a+i)%4 (3)d=node(a,i)=(a+i)%4 (3)

其中,a是数据块b在逻辑组g中的新数据块标识,i是数据块b的副本号(取值0、1、2),d为数据块b在逻辑组选择的数据存储节点的索引(取值0、1、2、3)。Among them, a is the new data block identifier of data block b in logical group g, i is the copy number of data block b (value 0, 1, 2), and d is the data storage node selected by data block b in logical group Index (values 0, 1, 2, 3).

所述的副本号,是指密集型文件系统为每个数据块提供三个副本,充分保证其可用性,其编号为0、1、2;所述数据存储节点的索引,是指一个逻辑组中的所有数据存储节点的编号,本发明中每个逻辑组包括4个数据存储节点,其索引号分别为0、1、2、3。The said copy number means that the intensive file system provides three copies for each data block to fully guarantee its availability, and its number is 0, 1, 2; the index of the data storage node refers to the The numbers of all data storage nodes, each logic group in the present invention includes 4 data storage nodes, and their index numbers are 0, 1, 2, and 3 respectively.

(2)数据块查找操作包括以下步骤:(2) The data block search operation includes the following steps:

步骤2.1、数据块b所在数据存储节点根据其索引号用反向可逆函数计算数据块b所在逻辑组的新ID;Step 2.1, the data storage node where the data block b is located uses a reverse reversible function to calculate the new ID of the logical group where the data block b is located according to its index number;

步骤2.2、数据块b所在数据存储节点根据数据块b所在逻辑组ID,用反向可逆函数计算数据块b的物理ID,为文件系统恢复完整的数据文件提供条件;Step 2.2, the data storage node where the data block b is located uses the reverse reversible function to calculate the physical ID of the data block b according to the logical group ID where the data block b is located, and provides conditions for the file system to recover the complete data file;

步骤2.3、数据存储节点根据数据块的物理ID,获取数据块在存储节点的映射信息;Step 2.3, the data storage node obtains the mapping information of the data block on the storage node according to the physical ID of the data block;

步骤2.4、数据存储节点根据数据块b的映射信息获取数据块b的数据送至文件系统。Step 2.4, the data storage node obtains the data of the data block b according to the mapping information of the data block b and sends it to the file system.

数据块查找操作中的步骤2.1根据数据存储节点索引号d用反向可逆函数计算数据块b所在逻辑组的新ID,其公式:Step 2.1 in the data block search operation calculates the new ID of the logical group where the data block b is located according to the index number d of the data storage node using a reverse reversible function, the formula:

d=(a+i)%4→可逆运算→a=4·j+(d-i)%4 (4)d=(a+i)%4→reversible operation→a=4·j+(d-i)%4 (4)

其中i表示数据块的副本号,可以迭代取0、1、2,j可取0、1、2、…、n等;Where i represents the copy number of the data block, which can be 0, 1, 2 iteratively, and j can be 0, 1, 2, ..., n, etc.;

数据块查找操作中的步骤2.2反向可逆函数计算数据块b的物理ID,其公式:The step 2.2 in the data block lookup operation calculates the physical ID of the data block b with a reverse reversible function, its formula:

其中,g是包含给定数据存储节点逻辑组的索引, where g is the index containing the logical group of given datastore nodes,

图3(a)是连续的数据块通过线性哈希映射到各个逻辑组,并通过迁移划分实现数据块在逻辑组中各个数据存储节点的分布式存储;图3(b)是以数据节点2为例,演示通过可逆函数实现逆向查找数据块的过程。Figure 3(a) shows that continuous data blocks are mapped to each logical group through linear hash, and the distributed storage of data blocks in each data storage node in the logical group is realized through migration and division; Figure 3(b) is based on data node 2 As an example, demonstrate the process of reversely finding data blocks through reversible functions.

(3)失效数据存储节点失效处理操作包括以下步骤:(3) The failure processing operation of the failure data storage node includes the following steps:

步骤3.1、确定失效数据存储节点所在逻辑分组;Step 3.1, determine the logical grouping where the failed data storage node is located;

步骤3.2、选择数据存储失效节点以外的逻辑分组中负载最小的数据存储节点作为后备节点;Step 3.2, select the data storage node with the smallest load in the logical grouping other than the data storage failure node as the backup node;

步骤3.3、多个后备数据存储节点采用智能重组映射方法并行复制各个逻辑组中对应的该失效数据存储节点中包含的数据。In step 3.3, multiple backup data storage nodes use the intelligent reorganization mapping method to parallelly copy the data contained in the corresponding failure data storage nodes in each logical group.

所述的智能重组映射方法,是选取后备数据存储节点数与包含失效数据存储节点的逻辑组数相等,一个失效数据存储节点可能被包含在多个逻辑组中,每一个后备数据存储节点只负责复制一个对应的逻辑组中该失效数据存储节点中的部分数据。The intelligent reorganization mapping method is to select the number of backup data storage nodes to be equal to the number of logical groups containing failure data storage nodes. A failure data storage node may be included in multiple logic groups, and each backup data storage node is only responsible for Part of the data in the failed data storage node in a corresponding logical group is copied.

图4以数据节点2失效为例,演示了各个逻辑组对数据节点2替换恢复过程。Fig. 4 takes the failure of data node 2 as an example to demonstrate the replacement and recovery process of data node 2 by each logical group.

(4)添加新数据存储节点操作,主要采用解耦地址映射方法,包括以下步骤:(4) The operation of adding a new data storage node mainly adopts the method of decoupling address mapping, including the following steps:

步骤4.1、计算整个系统中所有逻辑组中数据存储节点的平均负载COVaveStep 4.1, calculating the average load COV ave of data storage nodes in all logical groups in the entire system;

步骤4.2、选择一个逻辑组,计算该组中所有数据存储节点中最大的负载COVmaxStep 4.2, select a logical group, and calculate the maximum load COV max among all data storage nodes in the group;

步骤4.3、比较COVmax和COVave的大小,如果COVmax≥COVave,用新加入数据存储节点替换逻辑组中负载最大的数据存储节点。否则,选取下一个逻辑组,重复步骤4.1、步骤4.2和步骤4.3,直到新加入的数据存储节点的负载达到或接近系统中数据存储节点的平均负载为止。Step 4.3, compare the size of COV max and COV ave , if COV max ≥ COV ave , replace the data storage node with the largest load in the logic group with the newly added data storage node. Otherwise, select the next logical group and repeat steps 4.1, 4.2 and 4.3 until the load of the newly added data storage node reaches or approaches the average load of the data storage nodes in the system.

图5演示了系统添加新数据节点node128构成新的新逻辑组LG1000时,整个系统的数据块迁移过程。FIG. 5 demonstrates the data block migration process of the entire system when a new data node node 128 is added to the system to form a new logical group LG 1000 .

通过图4和图5可以看出,系统通过可逆函数并采用智能重组映射方法和采用解耦地址映射方法,使数据节点失效和新数据节点添加时,只有很少的数据块迁移,充分保证了系统的稳定性和对用户的可用性。It can be seen from Figure 4 and Figure 5 that the system adopts the reversible function and adopts the intelligent reorganization mapping method and the decoupling address mapping method, so that when data nodes fail and new data nodes are added, only a few data blocks are migrated, which fully guarantees System stability and availability to users.

下面用一个实例来阐述本方法。The method is illustrated below with an example.

选择HDFS作为数据密集型文件系统,通过仿真10000个数据节点,1000000个数据块的大数据环境下,在采用轻量级的数据密集型文件系统的自治块管理方法和不采用该方法时主节点内存占用情况如表1所示,主节点CUP占用情况如表2所示。其中1000000数据块是均匀分布在10000数据节点中,每个数据块大小为64MB。Choose HDFS as the data-intensive file system. By simulating 10,000 data nodes and a large data environment with 1,000,000 data blocks, when using the lightweight data-intensive file system’s autonomous block management method and not using this method, the master node The memory usage is shown in Table 1, and the CPU usage of the master node is shown in Table 2. Among them, 1,000,000 data blocks are evenly distributed among 10,000 data nodes, and the size of each data block is 64MB.

表1主节点管理数据块内存占用情况Table 1 Memory usage of master node management data blocks

数据节点数Number of data nodes 10001000 20002000 50005000 70007000 90009000 1000010000 优化后占用内存(MB)Optimized memory usage (MB) 1515 2020 2727 3636 4242 5050 未优化占用内存(MB)Unoptimized memory usage (MB) 180180 186186 189189 192192 194194 196196

表2主节点管理数据块CPU占用情况Table 2 CPU usage of the master node management data block

数据节点数Number of data nodes 500500 20002000 30003000 40004000 50005000 优化后CPU占用率(%)Optimized CPU usage (%) 1.41.4 2.32.3 2.52.5 3.13.1 4.24.2 未优化后CPU占用率(%)Unoptimized CPU usage (%) 6.36.3 12.112.1 16.616.6 19.819.8 23.223.2

从表1和表2可知,采用轻量级的数据密集型文件系统的自治块管理方法后,主节点的内存占用情况和CPU的占用情况明显优于未采用轻量级的数据密集型文件系统的自治块管理方法的情况。It can be seen from Table 1 and Table 2 that after adopting the autonomous block management method of the lightweight data-intensive file system, the memory usage and CPU usage of the master node are significantly better than those without the lightweight data-intensive file system The case of the autonomous block management method.

尽管本发明的内容已经通过上述优选实施例作了详细介绍,但应当认识到上述的描述不应被认为是对本发明的限制。在本领域技术人员阅读了上述内容后,对于本发明的多种修改和替代都将是显而易见的。因此,本发明的保护范围应由所附的权利要求来限定。Although the content of the present invention has been described in detail through the above preferred embodiments, it should be understood that the above description should not be considered as limiting the present invention. Various modifications and alterations to the present invention will become apparent to those skilled in the art upon reading the above disclosure. Therefore, the protection scope of the present invention should be defined by the appended claims.

Claims (6)

1.一种轻量级的数据密集型文件系统的自治块管理方法,其特征在于,数据密集型文件系统通过交叉迁移划分方法,实现数据块的自治管理,即通过使用一组可逆数学函数,实现数据块到数据存储节点,及数据存储节点到数据块的映射,完成数据块的分布式存储和查找;1. A light-weight autonomous block management method for data-intensive file systems, characterized in that the data-intensive file system realizes autonomous management of data blocks through a cross-migration division method, that is, by using a set of reversible mathematical functions, Realize the mapping from data blocks to data storage nodes, and from data storage nodes to data blocks, and complete the distributed storage and search of data blocks; 通过所述自治块管理方法实现数据块存储操作,包括以下步骤:Realizing data block storage operations through the autonomous block management method includes the following steps: 步骤1.1、主节点通过可逆的线性哈希函数,选择数据块所在逻辑组;Step 1.1. The master node selects the logical group where the data block is located through a reversible linear hash function; 步骤1.2、主节点通过可逆的位移分割函数,在所述逻辑组中选择数据存储节点;Step 1.2, the master node selects a data storage node in the logical group through a reversible displacement partition function; 步骤1.3、在选中的数据存储节点,存储数据块的数据和数据块地址映射信息;Step 1.3, storing the data of the data block and the address mapping information of the data block at the selected data storage node; 通过所述自治块管理方法实现失效数据存储节点的失效处理操作,包括以下步骤:The failure processing operation of the failure data storage node is realized through the autonomous block management method, including the following steps: 步骤3.1、确定失效数据存储节点所在逻辑分组;Step 3.1, determine the logical grouping where the failed data storage node is located; 步骤3.2、选择数据存储失效节点以外的逻辑分组中负载最小的数据存储节点作为后备节点;Step 3.2, select the data storage node with the smallest load in the logical grouping other than the data storage failure node as the backup node; 步骤3.3、多个后备数据存储节点采用智能重组映射方法,并行复制各个逻辑组中对应的该失效数据存储节点中包含的数据;Step 3.3, multiple backup data storage nodes adopt the intelligent reorganization mapping method, and copy the data contained in the corresponding failure data storage nodes in each logical group in parallel; 所述智能重组映射方法,是使选取的后备数据存储节点数与包含失效数据存储节点的逻辑组数相等,一个失效数据存储节点被包含在多个逻辑组中,并且每一个后备数据存储节点只复制一个对应的逻辑组中该失效数据存储节点的部分数据。The intelligent reorganization mapping method is to make the number of selected backup data storage nodes equal to the number of logic groups containing failure data storage nodes, one failure data storage node is included in multiple logic groups, and each backup data storage node only Partial data of the failed data storage node in a corresponding logical group is copied. 2.如权利要求1所述轻量级的数据密集型文件系统的自治块管理方法,其特征在于,2. the autonomous block management method of lightweight data-intensive file system as claimed in claim 1, is characterized in that, 通过所述自治块管理方法实现数据块查找操作,包括以下步骤:Realizing the data block search operation through the autonomous block management method includes the following steps: 步骤2.1、数据块所在数据存储节点根据其索引号,用反向可逆函数计算数据块所在逻辑组的新ID;Step 2.1, the data storage node where the data block is located uses the reverse reversible function to calculate the new ID of the logical group where the data block is located according to its index number; 步骤2.2、数据块所在数据存储节点根据数据块所在逻辑组的新ID,用反向可逆函数计算数据块的物理ID;Step 2.2, the data storage node where the data block is located calculates the physical ID of the data block with a reverse reversible function according to the new ID of the logical group where the data block is located; 步骤2.3、数据存储节点根据数据块的物理ID,获取数据块在存储节点的映射信息;Step 2.3, the data storage node obtains the mapping information of the data block on the storage node according to the physical ID of the data block; 步骤2.4、数据存储节点根据数据块的映射信息,获取数据块的数据送至数据密集型文件系统。Step 2.4, the data storage node obtains the data of the data block and sends it to the data-intensive file system according to the mapping information of the data block. 3.如权利要求1或2所述轻量级的数据密集型文件系统的自治块管理方法,其特征在于,3. The autonomous block management method of the lightweight data-intensive file system as claimed in claim 1 or 2, characterized in that, 通过所述自治块管理方法实现添加新数据存储节点操作,其中采用解耦地址映射方法,包括以下步骤:The operation of adding a new data storage node is realized through the autonomous block management method, wherein a decoupling address mapping method is adopted, including the following steps: 步骤4.1、计算整个系统中所有逻辑组中数据存储节点的平均负载COVaveStep 4.1, calculating the average load COV ave of data storage nodes in all logical groups in the entire system; 步骤4.2、选择任意一个逻辑组,计算该逻辑组中所有数据存储节点中最大的负载COVmaxStep 4.2, select any logical group, and calculate the maximum load COV max among all data storage nodes in the logical group; 步骤4.3、比较COVmax和COVave的大小,如果COVmax≥COVave,用新加入的数据存储节点替换逻辑组中负载最大的数据存储节点;否则,选取下一个逻辑组,重复执行步骤4.1、步骤4.2和步骤4.3,直到新加入的数据存储节点的负载达到或接近系统中数据存储节点的平均负载为止。Step 4.3, compare the size of COV max and COV ave , if COV max ≥ COV ave , replace the data storage node with the largest load in the logical group with the newly added data storage node; otherwise, select the next logical group, and repeat steps 4.1, Step 4.2 and Step 4.3, until the load of the newly added data storage node reaches or approaches the average load of the data storage nodes in the system. 4.如权利要求1所述轻量级的数据密集型文件系统的自治块管理方法,其特征在于,4. the autonomous block management method of lightweight data-intensive file system as claimed in claim 1, is characterized in that, 步骤1.1中,通过可逆的线性哈希函数选择数据块所在逻辑组的公式:In step 1.1, the formula for selecting the logical group where the data block is located is selected through a reversible linear hash function: 其中,g是逻辑组ID,x是系统中当前逻辑组数,X是初始时系统中的逻辑组数,b是要存储的数据块在其文件中的数据块ID,s是新增的逻辑组数, Among them, g is the logical group ID, x is the current logical group number in the system, X is the initial logical group number in the system, b is the data block ID of the data block to be stored in its file, and s is the newly added logic Number of groups, 步骤1.2中,通过可逆的位移分割函数选择逻辑组中数据存储节点的过程包括:In step 1.2, the process of selecting data storage nodes in a logical group through a reversible displacement partition function includes: A)计算数据块b映射到逻辑组g后的新块标识,其公式:A) Calculate the new block identifier after data block b is mapped to logical group g, its formula: 其中,a是数据块b在逻辑组g中的新标识;Among them, a is the new identifier of data block b in logical group g; B)计算数据块b映射到逻辑组g中的数据存储节点的索引ID,其公式:B) Calculate the index ID of data block b mapped to the data storage node in logical group g, its formula: d=node(a,i)=(a+i)%4 (3)d=node(a,i)=(a+i)%4 (3) 其中,i是数据块b的副本号,d为数据块b在逻辑组选择的数据存储节点的索引号。Wherein, i is the copy number of the data block b, and d is the index number of the data storage node selected by the data block b in the logic group. 5.如权利要求4所述轻量级的数据密集型文件系统的自治块管理方法,其特征在于,5. the autonomous block management method of lightweight data-intensive file system as claimed in claim 4, is characterized in that, 步骤2.1中,用反向可逆函数计算数据块b所在逻辑组的新ID;In step 2.1, the new ID of the logical group where the data block b is located is calculated with a reverse reversible function; a=4·j+(d-i)%4 (4)a=4·j+(d-i)%4 (4) 该公式通过对d=(a+i)%4进行可逆运算得到;This formula is obtained by carrying out reversible operation to d=(a+i)%4; 其中,数据块的副本号i迭代取0、1、2,j取零或正整数;Wherein, the iteration number i of the data block is 0, 1, 2, and j is zero or a positive integer; 步骤2.2中,用反向可逆函数计算数据块b的物理ID:In step 2.2, use the reverse reversible function to calculate the physical ID of data block b: 6.如权利要求1所述轻量级的数据密集型文件系统的自治块管理方法,其特征在于,6. the autonomous block management method of lightweight data-intensive file system as claimed in claim 1, is characterized in that, 所述数据密集型文件系统中,通过主节点进行系统命名空间维护、数据块到数据存储节点的分布、各个数据存储节点的管理;In the data-intensive file system, system namespace maintenance, distribution of data blocks to data storage nodes, and management of each data storage node are performed through the master node; 以及,通过各个数据存储节点负责数据块的一致性检查、数据块恢复和数据存储节点的映射信息存储和维护。And, each data storage node is responsible for the consistency check of the data block, the recovery of the data block, and the storage and maintenance of the mapping information of the data storage node.
CN201610665489.3A 2016-08-12 2016-08-12 A lightweight approach to autonomous block management for data-intensive file systems Active CN106293537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610665489.3A CN106293537B (en) 2016-08-12 2016-08-12 A lightweight approach to autonomous block management for data-intensive file systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610665489.3A CN106293537B (en) 2016-08-12 2016-08-12 A lightweight approach to autonomous block management for data-intensive file systems

Publications (2)

Publication Number Publication Date
CN106293537A CN106293537A (en) 2017-01-04
CN106293537B true CN106293537B (en) 2019-11-12

Family

ID=57670722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610665489.3A Active CN106293537B (en) 2016-08-12 2016-08-12 A lightweight approach to autonomous block management for data-intensive file systems

Country Status (1)

Country Link
CN (1) CN106293537B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783398B (en) * 2019-01-18 2020-09-15 上海海事大学 Performance optimization method for FTL (fiber to the Home) solid state disk based on relevant perception page level
CN114844911B (en) * 2022-04-20 2024-07-09 网易(杭州)网络有限公司 Data storage method, device, electronic equipment and computer readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101488104A (en) * 2009-02-26 2009-07-22 北京世纪互联宽带数据中心有限公司 System and method for implementing high-efficiency security memory
CN104052576A (en) * 2014-06-07 2014-09-17 华中科技大学 A data recovery method based on error-correcting codes in cloud storage
CN104077423A (en) * 2014-07-23 2014-10-01 山东大学(威海) Consistent hash based structural data storage, inquiry and migration method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012147087A1 (en) * 2011-04-29 2012-11-01 Tata Consultancy Services Limited Archival storage and retrieval system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101488104A (en) * 2009-02-26 2009-07-22 北京世纪互联宽带数据中心有限公司 System and method for implementing high-efficiency security memory
CN104052576A (en) * 2014-06-07 2014-09-17 华中科技大学 A data recovery method based on error-correcting codes in cloud storage
CN104077423A (en) * 2014-07-23 2014-10-01 山东大学(威海) Consistent hash based structural data storage, inquiry and migration method

Also Published As

Publication number Publication date
CN106293537A (en) 2017-01-04

Similar Documents

Publication Publication Date Title
US11775392B2 (en) Indirect replication of a dataset
US10817212B2 (en) Method and apparatus for creating a short hash handle highly correlated with a globally-unique hash signature
US10761758B2 (en) Data aware deduplication object storage (DADOS)
US10678751B2 (en) Data protection and long term retention
CN105487818B (en) For the efficient De-weight method of repeated and redundant data in cloud storage system
US9418131B1 (en) Synchronization of volumes
US9378106B1 (en) Hash-based replication
CN105069111B (en) Block level data duplicate removal method based on similitude in cloud storage
CN110262922B (en) Erasure code updating method and system based on duplicate data log
CN105320773B (en) A kind of distributed data deduplication system and method based on Hadoop platform
CN106066896B (en) An application-aware big data deduplication storage system and method
CN102609446B (en) Distributed Bloom filter system and application method thereof
US20130054894A1 (en) Increase in deduplication efficiency for hierarchical storage system
CN103139300A (en) Virtual machine image management optimization method based on data de-duplication
CN104077423A (en) Consistent hash based structural data storage, inquiry and migration method
CN113535670B (en) Virtual resource mirror image storage system and implementation method thereof
US11755557B2 (en) Flat object storage namespace in an object storage system
US11334523B2 (en) Finding storage objects of a snapshot group pointing to a logical page in a logical address space of a storage system
US20150058554A1 (en) Systems, Methods, and Computer Program Products Implementing Hybrid File Structures for Data Storage
CN108073472B (en) Memory erasure code distribution method based on heat perception
CN109933564A (en) File system management method, device, terminal and medium for fast rollback based on linked list and N-ary tree structure
Varade et al. Distributed metadata management scheme in hdfs
US11381400B2 (en) Using double hashing schema to reduce short hash handle collisions and improve memory allocation in content-addressable storage systems
CN106293537B (en) A lightweight approach to autonomous block management for data-intensive file systems
Klein et al. Dxram: A persistent in-memory storage for billions of small objects

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant