Distributed storage system and data read/write method thereof
Technical Field
The present application relates to the technical field of computer storage, and more particularly to a distributed storage system and a data read/write method thereof.
Background
With the rapid growth of applications such as mobile devices, social networks and the Internet of Things, the data generated by human society is growing explosively. Traditional disk arrays find it increasingly difficult, in terms of capacity, performance and bandwidth, to meet the storage requirements of data-intensive applications built on massive data. As a result, distributed cluster storage systems have emerged that adopt a scale-out architecture, in which capacity and performance grow linearly as nodes are added, and solid-state drives, which deliver higher IOPS (Input/Output Operations Per Second), are gradually replacing traditional disks as the first choice for I/O-intensive applications. In this context, the high price of solid-state drives and their erase-before-write characteristic require the storage system to compress the physical storage space occupied by data as far as possible and to reduce the number of writes, so as to further improve the cost-effectiveness of distributed flash storage systems.
Data deduplication is one kind of data reduction technique. It is typically used in disk-based backup systems and aims to reduce the storage capacity actually used in a storage system. At present, deduplication usually works by running a deduplication program in the background within some time period, searching for duplicate data blocks at different locations in different files, and replacing the duplicate blocks with indicators so as to reduce the storage capacity occupied. Highly redundant data sets (such as backup data) benefit greatly from deduplication; in addition, deduplication allows backup data to be replicated efficiently and economically between a user's different sites. For a distributed cluster storage system, however, existing deduplication within a single device cannot achieve global deduplication, so the data reduction ratio is not optimal. Moreover, background deduplication cannot reduce the number of data write operations, so for a storage system using solid-state drives it cannot achieve the goal of reducing erase/write cycles and extending the service life of the drives.
Summary of the Invention
The present application provides a distributed storage system and a data read/write method thereof that achieve global data deduplication without performing any actual duplicate-data deletion operation, and that reduce the number of data write operations.
An embodiment of the present application provides a distributed storage system comprising a proxy module, a metadata service module and a plurality of storage service modules, each storage service module managing at least one storage node;
The proxy module is configured to: receive a write request from an application system, compute a hash value of the data to be written according to a blocking parameter to obtain a block identifier, and send a write request carrying the block identifier to the metadata service module; receive the node information returned by the metadata service module and, according to the node information, route the write request to the corresponding storage node; and return a write-success message from the metadata service module or a storage service module to the application system;
The metadata service module is configured to: maintain a global first-level mapping table, which contains the mapping relations between global logical addresses and storage nodes and block identifiers; receive the write request carrying the block identifier from the proxy module and look up the first-level mapping table; if a mapping record for the block identifier already exists, update the first-level mapping table by adding a mapping record between the write start address and the block identifier, and return a write-success message to the application system through the proxy module; if no such record exists, select a storage node and return the node information of that storage node to the proxy module; and receive write information from a storage service module, update the first-level mapping table by adding a mapping record among the write start address, the corresponding block identifier and the corresponding storage node, and return an update-success message to that storage service module;
The storage service module is configured to: maintain a second-level mapping table, which contains the mapping relations between block identifiers and actual physical storage addresses; receive a write request routed to a storage node managed by this storage service module, write the data to the disk of that storage node, update the second-level mapping table by adding a record of the block identifier and the physical address actually written, and send write information to the metadata service module.
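The block-identifier step performed by the proxy module can be illustrated with a minimal Python sketch. The text specifies neither the hash function nor the form of the blocking parameter, so SHA-256, a 4096-byte block size, and the names `BLOCK_SIZE` and `block_identifiers` are assumptions made here for illustration only.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical blocking parameter in bytes; not given in the text


def block_identifiers(data: bytes, block_size: int = BLOCK_SIZE) -> list[tuple[int, str]]:
    """Split data into fixed-size blocks and return (offset, block identifier) pairs.

    The block identifier is the hex digest of the block's hash value.
    SHA-256 is an assumed choice; the text only says "hash value".
    """
    result = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        result.append((offset, hashlib.sha256(block).hexdigest()))
    return result
```

Two blocks with identical content yield the same identifier regardless of their offsets, which is the property the metadata service module relies on when it checks whether a block already exists.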
Optionally, the proxy module is further configured to receive a read request from the application system, forward the read request to the metadata service module, and return the read data from the storage service module to the application system;
The metadata service module is further configured to receive the read request carrying a read start address from the proxy module, look up the mapping record corresponding to the read start address in the first-level mapping table to obtain the corresponding storage node and block identifier, and route the read request to the corresponding storage node;
The storage service module is further configured to receive a read request routed to a storage node managed by this storage service module, look up the second-level mapping table according to the block identifier to obtain the actual physical address, read the data from the actual physical address, and return the read data to the proxy module.
Optionally, the metadata service module further comprises:
A load-balancing unit, configured to select a lightly loaded storage node according to a load-balancing algorithm.
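The text defines neither the load-balancing algorithm nor the load metric. As one possible illustration, the Python sketch below picks the storage node with the lowest fraction of used capacity; the `NodeLoad` fields and the function name `select_storage_node` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeLoad:
    node_id: str
    used_bytes: int      # hypothetical load metric: capacity currently in use
    capacity_bytes: int


def select_storage_node(nodes: list[NodeLoad]) -> NodeLoad:
    """Return the node with the smallest used/capacity ratio (ties: first in the list)."""
    if not nodes:
        raise ValueError("no storage nodes available")
    return min(nodes, key=lambda n: n.used_bytes / n.capacity_bytes)
```

Other metrics (queue depth, recent write count, remaining erase cycles) would fit the same interface; the choice here is only a placeholder.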
Optionally, the metadata service module is deployed separately on an independent node, or is deployed in a distributed manner across the cluster formed by all nodes.
An embodiment of the present application further provides a data write method for a distributed storage system, the distributed storage system being as described above, the data write method comprising:
The proxy module receives a write request from an application system, computes a hash value of the data to be written according to the blocking parameter to obtain a block identifier, and sends a write request carrying the block identifier to the metadata service module;
The metadata service module receives the write request carrying the block identifier from the proxy module and looks up the first-level mapping table; if a mapping record for the block identifier already exists, it updates the first-level mapping table by adding a mapping record between the write start address and the block identifier, returns a write-success message to the application system through the proxy module, and the write flow ends; if no such record exists, it selects a storage node and returns the node information of that storage node to the proxy module;
The proxy module receives the node information returned by the metadata service module and, according to the node information, routes the write request to the corresponding storage node;
The storage service module receives the write request routed to a storage node that it manages, writes the data to the disk of that storage node, updates the second-level mapping table by adding a record of the block identifier and the physical address actually written, and sends write information to the metadata service module;
The metadata service module receives the write information from the storage service module, updates the first-level mapping table by adding a mapping record among the write start address, the corresponding block identifier and the corresponding storage node, and returns an update-success message to the storage service module;
The proxy module returns the write-success message from the metadata service module or the storage service module to the application system.
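The metadata-side branch of the write method above (duplicate block versus new block) can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation: the class and method names are hypothetical, the `node_of_block` index is an added helper for fast lookup by block identifier, and the node selection is a round-robin placeholder for the load-balancing algorithm.

```python
from typing import Optional


class MetadataService:
    """First-level mapping table: logical address -> (storage node, block identifier)."""

    def __init__(self, node_ids: list[str]) -> None:
        self.first_level: dict[int, tuple[str, str]] = {}
        # Assumed helper index so the "does this block already exist?" check is a dict lookup.
        self.node_of_block: dict[str, str] = {}
        self.node_ids = node_ids

    def handle_write(self, lba: int, block_id: str) -> Optional[str]:
        """Return None on a duplicate (write already complete), else the chosen node."""
        node = self.node_of_block.get(block_id)
        if node is not None:
            # Duplicate block: only a mapping record is added, no data is written.
            self.first_level[lba] = (node, block_id)
            return None
        # New block: placeholder selection; the text uses a load-balancing algorithm.
        return self.node_ids[len(self.node_of_block) % len(self.node_ids)]

    def handle_write_info(self, lba: int, block_id: str, node: str) -> None:
        """Record a completed physical write reported by a storage service module."""
        self.first_level[lba] = (node, block_id)
        self.node_of_block[block_id] = node
```

A return value of `None` corresponds to the case where the write flow ends at the metadata service module; a node identifier corresponds to the case where the proxy module must route the request onward.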
An embodiment of the present application further provides a data read method for a distributed storage system, the distributed storage system being as described above, the data read method comprising:
The proxy module receives a read request from an application system and forwards the read request to the metadata service module;
The metadata service module receives the read request carrying a read start address from the proxy module, looks up the mapping record corresponding to the read start address in the first-level mapping table to obtain the corresponding storage node and block identifier, and routes the read request to the corresponding storage node;
The storage service module receives the read request routed to a storage node that it manages, looks up the second-level mapping table according to the block identifier to obtain the actual physical address, reads the data from the actual physical address, and returns the read data to the proxy module;
The proxy module returns the read data from the storage service module to the application system.
As can be seen from the above technical solution, because a two-level metadata organization is used, data that already exists in identical form need not actually be written when data is written, which reduces data write operations and achieves global data deduplication. The solution of the present application can achieve the following technical effects:
Online global data deduplication is realized in a distributed storage system, reducing the physical storage space occupied;
Because duplicate data requires no actual write operation when data is written, data interaction and bandwidth between the application system and the storage system are saved, and storage efficiency is improved;
The metadata service and the storage service can be deployed flexibly on storage nodes, which facilitates building distributed, large-scale storage systems.
Brief Description of the Drawings
Fig. 1 is a schematic architecture diagram of a distributed storage system using the two-level metadata organization method, provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of data writing in the distributed storage system provided by an embodiment of the present application;
Fig. 3 is a schematic flowchart of data reading in the distributed storage system provided by an embodiment of the present application.
Detailed Description
To make the technical principles, features and technical effects of the technical solution of the present application clearer, the technical solution is described in detail below with reference to specific embodiments.
The solution of the present application uses a two-level metadata organization method. The service modules of the distributed storage system are divided into a metadata service module and storage service modules. The metadata service module is responsible for maintaining the global first-level mapping table, which contains the mapping relations between global logical addresses and storage nodes and block identifiers (a block identifier being the hash value of a data block). The storage service module is responsible for maintaining the second-level mapping table, which contains the mapping relations between block identifiers and actual physical storage addresses.
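The shape of the two mapping tables can be illustrated with a small Python sketch. The record layout, the sample addresses and the helper `addresses_sharing` are hypothetical; for simplicity the second-level tables are keyed by storage node, although the text allows one storage service module to manage several nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstLevelRecord:
    node_id: str    # storage node holding the block
    block_id: str   # hash value of the data block


# First-level table (metadata service module, global): logical address -> record
first_level: dict[int, FirstLevelRecord] = {
    0x1000: FirstLevelRecord("node-a", "hash-1"),
    0x2000: FirstLevelRecord("node-a", "hash-1"),  # same content at a second logical address
    0x3000: FirstLevelRecord("node-b", "hash-2"),
}

# Second-level tables (storage service modules): block identifier -> physical address
second_level: dict[str, dict[str, int]] = {
    "node-a": {"hash-1": 0},
    "node-b": {"hash-2": 4096},
}


def addresses_sharing(block_id: str) -> list[int]:
    """Logical addresses whose first-level record points at the given block."""
    return sorted(lba for lba, rec in first_level.items() if rec.block_id == block_id)
```

In this sample, three logical addresses are backed by only two stored blocks: several first-level records may carry the same block identifier, while each block appears once in a second-level table.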
The architecture of the distributed storage system using the two-level metadata organization method, provided by an embodiment of the present application, is shown in Fig. 1. The distributed storage system 100 comprises a proxy module 101, a metadata service module 102 and a plurality of storage service modules 103.
The proxy module 101 is configured to: receive a write request from an application system, compute a hash value of the data to be written according to a blocking parameter to obtain a block identifier, and send a write request carrying the block identifier to the metadata service module 102; receive the node information returned by the metadata service module 102 and, according to the node information, route the write request to the corresponding storage node; and return a write-success message from the metadata service module 102 or a storage service module 103 to the application system;
The metadata service module 102 is configured to: maintain the global first-level mapping table, which contains the mapping relations between global logical addresses and storage nodes and block identifiers; receive the write request carrying the block identifier from the proxy module and look up the first-level mapping table; if a mapping record for the block identifier already exists, update the first-level mapping table by adding a mapping record between the write start address and the block identifier, and return a write-success message to the application system through the proxy module 101; if no such record exists, select a storage node and return the node information of that storage node to the proxy module 101; and receive write information from a storage service module 103, update the first-level mapping table by adding a mapping record among the write start address, the corresponding block identifier and the corresponding storage node, and return an update-success message to that storage service module 103;
The storage service module 103 is configured to: maintain a second-level mapping table, which contains the mapping relations between block identifiers and actual physical storage addresses; receive a write request routed to a storage node managed by this storage service module 103, write the data to the disk of that storage node, update the second-level mapping table by adding a record of the block identifier and the physical address actually written, and send write information to the metadata service module 102.
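The write handling of a storage service module 103 can be sketched as follows. This Python illustration simulates the disk with an in-memory `bytearray` and assumes append-only allocation; the stored length and the field names of the returned write information are additions made for the example and are not specified in the text.

```python
class StorageService:
    """Manages one storage node; the disk is simulated by an in-memory bytearray."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.disk = bytearray()
        # Second-level mapping table: block identifier -> (physical address, length).
        # The length is an added convenience; the text maps only to the physical address.
        self.second_level: dict[str, tuple[int, int]] = {}

    def handle_write(self, lba: int, block_id: str, data: bytes) -> dict:
        """Write the block, update the second-level table, and build the write information."""
        physical_address = len(self.disk)   # assumed append-only allocation
        self.disk.extend(data)
        self.second_level[block_id] = (physical_address, len(data))
        # Write information for the metadata service module (field names are hypothetical).
        return {"lba": lba, "block_id": block_id, "node_id": self.node_id}
```

The returned dictionary carries exactly the three items the metadata service module 102 needs to add its first-level record: the write start address, the block identifier and the storage node.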
According to another embodiment of the present application, the proxy module 101 is further configured to receive a read request from the application system, forward the read request to the metadata service module 102, and return the read data from the storage service module 103 to the application system;
The metadata service module 102 is further configured to receive the read request carrying a read start address from the proxy module 101, look up the mapping record corresponding to the read start address in the first-level mapping table to obtain the corresponding storage node and block identifier, and route the read request to the corresponding storage node;
The storage service module 103 is further configured to receive a read request routed to a storage node managed by this storage service module 103, look up the second-level mapping table according to the block identifier to obtain the actual physical address, read the data from the actual physical address, and return the read data to the proxy module 101.
Optionally, the metadata service module 102 further comprises:
A load-balancing unit, configured to select a lightly loaded storage node according to a load-balancing algorithm.
Optionally, the metadata service module is deployed separately on an independent node, or is deployed in a distributed manner across the cluster formed by all nodes.
The data write flow of the distributed storage system provided by an embodiment of the present application is shown in Fig. 2 and comprises the following steps:
Step 201: The application system requests the distributed storage system to write data with start address LBA1 and length L.
Step 202: The proxy module of the distributed storage system receives the request, computes the hash value of the data according to the blocking parameter to obtain a block identifier, and sends a write request carrying the block identifier to the metadata service module.
Step 203: The metadata service module looks up the first-level mapping table. If a mapping record for the block identifier already exists, step 204 is performed; otherwise, step 205 is performed.
Step 204: The metadata service module updates the first-level mapping table by adding a mapping record between LBA1 and the block identifier, and returns write success to the application system; the flow ends.
Step 205: The metadata service module selects a lightly loaded storage node according to the load-balancing algorithm and returns the node information to the proxy module.
Step 206: The proxy module routes the write request to the corresponding storage node according to the node information returned by the metadata service module.
Step 207: The storage service module corresponding to the storage node receives the write request, writes the data to the disk of the storage node, updates the second-level mapping table by adding a record of the block identifier and the physical address actually written, and sends write information to the metadata service module.
Step 208: The metadata service module receives the write information sent by the storage service module, updates the first-level mapping table by adding a mapping record among LBA1, the corresponding block identifier and the corresponding storage node, and returns an update-success message to the storage service module.
Step 209: The storage service module receives the update-success message from the metadata service module and returns write success to the application system (through the proxy module); the flow ends.
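Steps 201 to 209 can be traced end to end with a compact in-process Python sketch. It makes several simplifying assumptions that are not in the text: each request is treated as a single block, SHA-256 stands in for the unspecified hash, disks are in-memory byte arrays, the existence check in step 203 is a linear scan, and "lightly loaded" is taken to mean "fewest bytes stored". All names are hypothetical.

```python
import hashlib

# First-level table: LBA -> (node, block identifier)
first_level: dict[int, tuple[str, str]] = {}
# Second-level tables: node -> {block identifier -> physical address}
second_level: dict[str, dict[str, int]] = {"n1": {}, "n2": {}}
# Simulated disks, one per storage node
disks: dict[str, bytearray] = {"n1": bytearray(), "n2": bytearray()}


def write(lba: int, data: bytes) -> str:
    """Follow steps 201-209 for one block; return "dedup" or "stored"."""
    block_id = hashlib.sha256(data).hexdigest()                   # step 202
    existing = next(                                              # step 203
        (node for node, bid in first_level.values() if bid == block_id), None
    )
    if existing is not None:
        first_level[lba] = (existing, block_id)                   # step 204
        return "dedup"
    node = min(disks, key=lambda n: len(disks[n]))                # step 205 (assumed metric)
    physical_address = len(disks[node])                           # steps 206-207
    disks[node].extend(data)
    second_level[node][block_id] = physical_address
    first_level[lba] = (node, block_id)                           # step 208
    return "stored"                                               # step 209
```

Calling `write(0, b"x" * 8)` and then `write(100, b"x" * 8)` stores the block once and answers the second request with only a first-level update, so the two logical addresses end up with identical records.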
As can be seen from the write flow, when a data block identical to the data to be written already exists in the distributed storage system, the data need not be written again during the write process; only the first-level mapping table is updated by adding a mapping record between the logical address and the block identifier. This amounts to performing data deduplication automatically, which greatly reduces the physical storage space occupied, shortens the data write flow, reduces data transfer, and improves storage efficiency.
The data read flow of the distributed storage system provided by an embodiment of the present application is shown in Fig. 3 and comprises the following steps:
Step 301: The application system requests the proxy module of the distributed storage system to read data with start address LBA1 and length L.
Step 302: The proxy module forwards the read request to the metadata service module.
Step 303: The metadata service module receives the request and looks up the mapping record corresponding to LBA1 in the first-level mapping table to obtain the corresponding storage node and block identifier.
Step 304: The metadata service module routes the read request to the corresponding storage node.
Step 305: The storage service module corresponding to the storage node receives the read request, looks up the second-level mapping table according to the block identifier to obtain the actual physical address, reads the data from the actual physical address, and returns it to the application system (through the proxy module).
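The two lookups of the read flow can be illustrated with the following Python sketch over pre-populated sample tables. The table contents, the stored block length and the function name `read` are hypothetical, and the message passing between modules is collapsed into plain dictionary lookups.

```python
# First-level table (metadata service module): LBA -> (node, block identifier)
first_level = {0: ("n1", "h1"), 100: ("n1", "h1"), 200: ("n2", "h2")}
# Second-level tables: node -> {block identifier -> (physical address, length)}
second_level = {"n1": {"h1": (0, 5)}, "n2": {"h2": (0, 5)}}
# Simulated disk contents per storage node
disks = {"n1": b"hello", "n2": b"world"}


def read(lba: int) -> bytes:
    """Follow steps 301-305 for one block."""
    node, block_id = first_level[lba]                          # step 303: first-level lookup
    physical_address, length = second_level[node][block_id]    # step 305: second-level lookup
    return disks[node][physical_address:physical_address + length]
```

Logical addresses 0 and 100 share one block identifier in this sample, so both reads are served from the same physical location.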
As can be seen from the above read and write flows, because the technical solution of the present application uses two-level mapping tables, in addition to inherently supporting online global data deduplication in a distributed system, it has the further advantage that the coupling between nodes is looser, so that different deployment modes can be used flexibly. For example, an asymmetric distributed architecture can be used: the metadata service module is deployed on an independent metadata node dedicated to managing the first-level mapping table and managing the storage nodes, while the storage service modules are deployed on the storage nodes to form a cluster and are responsible for managing the second-level mapping tables and for the actual storage of data. Alternatively, a symmetric distributed architecture can be used: the metadata service module and the storage service module are deployed together on every storage node; the metadata service module is responsible for managing the global first-level mapping table and synchronizes it to all nodes in real time over the back-end network, while the storage service is responsible for managing that node's second-level mapping table and for storing the actual data. A hybrid of the two modes can also be used.
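The difference between the two deployment modes can be expressed as a service-placement sketch in Python. The enum, the function `place_services` and the service labels are hypothetical and only restate the description above; synchronization of the first-level mapping table in the symmetric mode is not modelled.

```python
from enum import Enum


class Deployment(Enum):
    ASYMMETRIC = "asymmetric"   # metadata service on dedicated metadata node(s)
    SYMMETRIC = "symmetric"     # both services on every storage node


def place_services(mode: Deployment, storage_nodes: list[str],
                   metadata_nodes: list[str]) -> dict[str, list[str]]:
    """Return a mapping from node name to the services deployed on that node."""
    placement: dict[str, list[str]] = {}
    if mode is Deployment.ASYMMETRIC:
        for node in metadata_nodes:
            placement[node] = ["metadata"]
        for node in storage_nodes:
            placement[node] = ["storage"]
    else:
        for node in storage_nodes:
            placement[node] = ["metadata", "storage"]
    return placement
```

A hybrid deployment would correspond to a placement in which only some storage nodes also carry the metadata service.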
The foregoing describes only preferred embodiments of the present application and is not intended to limit its scope of protection. Any modification, equivalent substitution or improvement made within the spirit and principles of the technical solution of the present application shall fall within the scope of protection of the present application.