CN104331478B - Data consistency management method for self-compaction storage system - Google Patents
- Publication number
- CN104331478B (application CN201410614846.4A)
- Authority
- CN
- China
- Prior art keywords
- metadata
- node
- data
- tree
- space
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
-
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24554—Unary operations; Data partitioning operations
- G06F16/24557—Efficient disk access during query execution
-
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2237—Vectors, bitmaps or matrices
- G06F16/2246—Trees, e.g. B+trees
-
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
-
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24573—Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Library & Information Science (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a data consistency management method for a thin-provisioning storage system. It belongs to the technical field of thin provisioning and defines both a metadata structure for data block management and an implementation scheme for the metadata storage structure. For the metadata structure, an improved B+Tree is designed which, together with metadata such as the superblock, the metadata bitmap, and the data bitmap, implements thin-provisioned management of data blocks. Starting from the original B+Tree data structure, the space of each non-leaf node is doubled and divided into an active domain and an inactive domain, so that essentially no additional disk space has to be allocated while the B+Tree is modified, which reduces the complexity of metadata modification. The extra inactive-domain space can also hold a metadata copy or a record of historical operations, which reduces the cost of maintaining copies or logs in the storage system.
Description
Technical field
The present invention relates to the field of thin provisioning, and specifically to a data consistency management method for a thin-provisioning storage system.
Background technology
The volume of data produced on the internet is growing explosively, which places higher demands on the capacity and performance of storage systems. Existing storage systems suffer from low disk utilization and wasted storage resources, which is why thin provisioning technology has emerged in recent years.
Thin provisioning uses an "allocate on write" strategy: by allocating storage resources on demand, it improves disk space utilization and storage system performance while lowering deployment cost and saving resources. "Allocate on write" means that storage space is allocated from the thin-provisioning storage pool only when data is written to a thin logical volume. In practice, the storage space of the thin pool is divided into equal-sized data blocks that are organized in a structure such as a B+Tree, which supports operations including allocation, reclamation, and lookup of data blocks. The thin pool is divided into a data area and a metadata area. The data area stores data. The metadata area holds the storage pool superblock, the metadata bitmap, the data bitmap, logical volume information, and so on; it organizes and manages the thin pool and is therefore critical. Once the metadata contains errors or becomes inconsistent, user data is lost and the whole storage system may even crash. Moreover, during normal operation the metadata is kept in memory and periodically flushed to disk. When the system encounters an abnormal condition, for example a hardware error such as controller failure, controller power loss, RAID failure, or RAID power loss, the flush may fail and corrupt the metadata. How to guarantee metadata consistency is therefore a key concern for storage systems and for thin provisioning.
Implementations of thin provisioning mostly use a B+Tree data structure to manage data blocks. One way to keep the metadata B+Tree consistent is the following: when the B+Tree is to be modified, an additional B+Tree identical to the original is created and the operation is carried out on this new tree. When the whole operation has completed, the pointer to the root node of the original B+Tree is redirected to the root node of the new B+Tree and the space of the original tree is released, which completes the metadata modification. The advantage of this approach is that it guarantees the consistency of the metadata B+Tree: at every stage of the modification, the metadata stored on disk remains consistent, which protects well against inconsistency caused by hardware errors such as controller failure. Its drawback is equally clear: every modification of the metadata B+Tree requires rebuilding a B+Tree of the same size, and space must be allocated for every node during the rebuild. In addition, to keep the metadata available, a copy is stored elsewhere on the same RAID as the metadata B+Tree. The method is therefore expensive in both time and space.
Another metadata management method dedicates a single RAID in the storage system to storing metadata. That RAID then easily becomes a data access "hot spot" and a single point of failure in the system. One remedy is to spread the metadata over several RAIDs, but metadata access is still severely affected if a controller in the system fails.
Summary of the invention
The present invention provides a data consistency management method for a thin-provisioning storage system that overcomes the shortcomings of the two methods above: it guarantees metadata consistency while reducing the time and space complexity of metadata operations.
The invention defines a metadata structure for data block management and an implementation of the metadata storage structure. For the metadata structure, an improved B+Tree is designed which, together with metadata such as the superblock, the metadata bitmap, and the data bitmap, implements thin-provisioned management of data blocks. Starting from the original B+Tree data structure, the space of each non-leaf node is doubled and divided into two parts, an active domain and an inactive domain, so that essentially no additional disk space is allocated while the B+Tree is modified, which reduces the complexity of metadata modification. The extra inactive-domain space can also serve as a metadata copy or as a record of historical operations, which reduces the cost of copy maintenance or log maintenance in the storage system.
A data consistency management method for a thin-provisioning storage system, characterized by the following.
S1: In the B+Tree that organizes the metadata, the space of every non-leaf node is increased. Starting from the original B+Tree data structure, the space of each non-leaf node is doubled and divided into two parts, an active domain and an inactive domain. The active domain stores the data of the mapping B+Tree node, that is, its (key, value) pairs. Depending on the policy, the inactive domain stores either a copy of the active-domain data or the data as it was before the most recent modification of the node. A node is modified in its inactive domain; once the modification is complete, the active and inactive domains are swapped. When a non-leaf node is allocated, its start address is aligned to the node size. For example, if the node size is 8 KB, with the active and inactive domains taking 4 KB each, the node start address is aligned to 8 KB. As a result, no additional storage space has to be allocated while metadata is modified, which reduces the complexity of metadata modification.
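The node layout described in S1 can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class name `TwoDomainNode`, its methods, and the use of Python dictionaries for the (key, value) pairs are hypothetical, and the `active` index stands in for the parent pointer that the method actually uses to select the active domain.

```python
from dataclasses import dataclass, field

NODE_SIZE = 8 * 1024           # example size from the text: 8 KB per node
DOMAIN_SIZE = NODE_SIZE // 2   # 4 KB active domain + 4 KB inactive domain


@dataclass
class TwoDomainNode:
    """Hypothetical model of a non-leaf node with two equal domains."""
    base_addr: int                                            # node start address
    domains: list = field(default_factory=lambda: [{}, {}])   # domain A, domain B
    active: int = 0                                           # 0 = A active, 1 = B active

    def __post_init__(self):
        # The text requires the start address to be aligned to the node size.
        if self.base_addr % NODE_SIZE != 0:
            raise ValueError("node start address must be aligned to NODE_SIZE")

    def domain_addr(self, index: int) -> int:
        # Domain A starts at the node address, domain B right after it.
        return self.base_addr + index * DOMAIN_SIZE

    def read(self, key):
        # Lookups only ever consult the active domain.
        return self.domains[self.active].get(key)

    def begin_update(self) -> dict:
        # Copy the active data into the inactive domain and hand it out for editing.
        inactive = 1 - self.active
        self.domains[inactive] = dict(self.domains[self.active])
        return self.domains[inactive]

    def commit(self) -> None:
        # Swap roles: the edited inactive domain becomes the active one.
        self.active = 1 - self.active
```

Until `commit()` is called, readers keep seeing the old active domain, which is the property the method relies on for consistency. No new node space is requested for the update because both domains were reserved when the node was allocated.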
Modifying the metadata involves three operations: adding a data block mapping, deleting a data block mapping, and modifying a data block mapping. Each operation proceeds as follows.
A. Adding a data block mapping
1. Search the mapping B+Tree for the parent node N of the new data block.
2. Copy the active-domain data of N to its inactive domain.
3. Modify the inactive domain of N: add the key and the index pointer, point the pointer at the new node, and insert the new node into N.
4. Determine whether N needs to be split. If not, go to step 7; if N needs to be split, go to step 5.
5. Split N into node N' and node N''. After the split, the parent of the original node N will point to N'.
5.1. Search the metadata bitmap B+Tree for a free metadata block.
5.2. Allocate and initialize the new node N'', and update the metadata bitmap B+Tree.
5.3. Compute the metadata that each of N' and N'' holds after the split, that is, their ranges of (key, value) pairs.
5.4. Based on that computation, copy one part of the active-domain data of the node being split, N, into its inactive domain and the other part into the active domain of the newly allocated node N''. Node N is now called N'.
6. Go to step 2 and insert node N'' into M, the parent of the node that was split.
7. Point the parent's pointer to each modified node at the inactive domain of that node.
8. In the data bitmap B+Tree, set the bit corresponding to the new data block to "used".
9. Update the other metadata, such as the superblock, and change the size of logical device objects such as the storage pool and the logical volume.
10. The operation is complete.
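The no-split path of operation A (steps 1 to 3, 7, and 8) can be sketched as follows. The simulated `disk` dictionary, the `parent_slots` mapping, the set used as a data bitmap, and all function names are assumptions made for illustration; the patent does not prescribe these representations.

```python
NODE_SIZE = 8192          # assumed node size, as in the 8 KB example
HALF = NODE_SIZE // 2     # size of one domain


def alloc_node(disk: dict, base: int) -> None:
    """Reserve both domains of a node at an aligned address (hypothetical helper)."""
    if base % NODE_SIZE != 0:
        raise ValueError("node address must be aligned to NODE_SIZE")
    disk[base] = {}           # domain A
    disk[base + HALF] = {}    # domain B


def other_domain(ptr: int) -> int:
    """Return the address of the sibling domain inside the same node."""
    base = ptr - ptr % NODE_SIZE
    return base + HALF if ptr == base else base


def add_mapping(disk, parent_slots, slot, key, block_addr, data_bitmap):
    """Add key -> block_addr under the node referenced by parent_slots[slot]."""
    active_ptr = parent_slots[slot]               # step 1: locate node N
    inactive_ptr = other_domain(active_ptr)
    disk[inactive_ptr] = dict(disk[active_ptr])   # step 2: copy active -> inactive
    disk[inactive_ptr][key] = block_addr          # step 3: edit the inactive domain
    parent_slots[slot] = inactive_ptr             # step 7: repoint the parent
    data_bitmap.add(block_addr)                   # step 8: mark the block as used
```

The pointer update in step 7 is the only point at which readers start to see the new mapping. If the process stops before that line, the parent still refers to the untouched active domain.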
B. Deleting a data block mapping
1. Search the mapping B+Tree for the parent node N of the node to be deleted.
2. Copy the active-domain data of N to its inactive domain.
3. Modify the inactive domain of N: delete the key and the index pointer.
4. Determine whether N needs to be merged with another node. If not, go to step 7; if a merge is needed, go to step 5.
5. Find the node N' to merge with and perform the merge. After the merge, the parent of N will point to the merged node M.
5.1. Examine the nodes N and N' to be merged and determine the metadata that the merged node will hold.
5.2. Based on that computation, copy the data of N and N' into the inactive domain of node N'. The merged node is now called M.
6. Go to step 2 and delete the merged node N.
7. Point the parent's pointer to each modified node at the inactive domain of that node.
8. Release the space of the deleted node.
9. In the data bitmap B+Tree, set the bit corresponding to the deleted data block to "unused".
10. Update the other metadata, such as the superblock, and change the size of logical device objects such as the storage pool and the logical volume.
11. The operation is complete.
C. Modifying a data block mapping
1. Search the mapping B+Tree to determine the parent node N of the data block whose mapping is to be modified and the parent node N' it will have after the mapping has been modified.
2. Delete the data block mapping under node N.
3. Insert the data block under node N'.
4. The operation is complete.
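Operation C is a composition of operations B and A, which the following sketch makes explicit. The class `ShadowNode`, the helper functions, and the set-based data bitmap are hypothetical, node merging and splitting are left out, and the `active` index again stands in for the parent pointer.

```python
class ShadowNode:
    """Hypothetical non-leaf node holding two domains of (key, value) pairs."""

    def __init__(self, entries=None):
        self.domains = [dict(entries or {}), {}]
        self.active = 0

    def entries(self) -> dict:
        return self.domains[self.active]

    def update(self, mutate) -> None:
        inactive = 1 - self.active
        self.domains[inactive] = dict(self.domains[self.active])  # copy active data
        mutate(self.domains[inactive])                            # edit the copy only
        self.active = inactive                                    # switch when done


def delete_mapping(node: ShadowNode, key, bitmap: set):
    """Operation B without merging: remove the key and mark the block unused."""
    block = node.entries()[key]
    node.update(lambda d: d.pop(key))
    bitmap.discard(block)
    return block


def insert_mapping(node: ShadowNode, key, block, bitmap: set) -> None:
    """Operation A without splitting: add the key and mark the block used."""
    node.update(lambda d: d.update({key: block}))
    bitmap.add(block)


def modify_mapping(old_parent, new_parent, old_key, new_key, bitmap: set) -> None:
    """Operation C: delete under N, then insert under N'."""
    block = delete_mapping(old_parent, old_key, bitmap)
    insert_mapping(new_parent, new_key, block, bitmap)
```

After `modify_mapping` returns, the inactive domain of the old parent still holds the state from before the deletion, which is what allows it to serve as the "data before the most recent modification" mentioned in S1.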
S2: The metadata is stored in a distributed (hashed) manner across the underlying storage units of the storage pool.
The metadata is distributed over every RAID in the storage pool and organized in structures such as B+Trees. This improves metadata access performance and also reduces the risk of metadata loss caused by hardware faults.
Because the metadata is stored in every RAID of the storage pool, and the pool is expanded and shrunk in units of RAIDs, the metadata space in each RAID is doubled and likewise divided into an active space and an inactive space. While the system is being expanded or shrunk, only the inactive metadata space is modified, so normal access to the active space is not affected. Once the inactive metadata space has been adjusted, the active and inactive metadata spaces in each RAID are swapped, the new metadata takes effect, and the expansion or shrink of the storage system is complete. Finally, the data in the active and inactive spaces is synchronized and cross-RAID metadata copies are established.
The beneficial effects of the invention are as follows.
1) Flushing metadata to disk becomes easier: compared with the existing way of managing data blocks with a B+Tree, the metadata is less scattered across the disk.
2) The space allocation overhead of metadata modification is reduced: apart from allocating a new node when a node splits, no operation needs additional space.
3) The scheme is flexible: the inactive domains of the non-leaf nodes of the mapping B+Tree can serve either as a copy of the mapping B+Tree metadata or as a record of historical operations that supports rollback. When the inactive domains are used as a copy, the active-domain data of each node is first synchronized into its inactive domain after an operation on the mapping B+Tree finishes. The node pointers in the inactive domains are then rebuilt so that each pointer refers to the inactive domain of the child node. The inactive domains of all nodes then form an independent copy of the mapping B+Tree. If the active-domain data of any node other than the root is corrupted, changing only the pointer to the root node switches quickly to the inactive-domain copy, and the mapping B+Tree metadata can be accessed normally. Because the copy is stored distributed alongside the original, the time and space cost of the extra disk accesses needed to keep the copy consistent is reduced. When the inactive domain of a node is used as an operation history, it holds the data from the last time it was the active domain. This reduces the amount of data the system log has to record, which lowers the time and space cost of logging and also the complexity of rebuilding data during a rollback.
4) Metadata consistency is protected. Before an operation on the mapping B+Tree, the active-domain data of each node is copied to the inactive domain and the operation is carried out there. Even if a controller failure or a RAID power loss occurs during the operation, the data in the active domains remains consistent; only the unfinished modification in the inactive domains is affected. Furthermore, even if the data of a single RAID is lost, it can be reconstructed from the cross copies stored in other RAIDs.
5) Metadata access performance is improved. Because the metadata is distributed over all RAIDs in the system, the concurrency of multiple RAIDs is fully exploited, the IOPS of metadata access increases, and the single-point metadata performance bottleneck is removed. Online expansion of the storage system is also supported, with seamless switching between old and new metadata after the system is expanded or shrunk.
The method replaces the complex operations that existing thin-provisioning storage systems use to keep metadata safe, and it reduces the system overhead and storage space fragmentation caused by repeatedly requesting and releasing disk space when metadata is added and deleted. It improves metadata access performance while guaranteeing metadata consistency. In addition, the fully distributed metadata storage method avoids a single point of failure for metadata access.
Brief description of the drawings
Fig. 1 is a schematic diagram of the metadata structure.
Fig. 2 is a schematic diagram of the mapping B+Tree structure.
Fig. 3 shows step 1 of inserting a node.
Fig. 4 shows step 2 of inserting a node.
Fig. 5 shows step 3 of inserting a node.
Fig. 6 shows step 1 of splitting a node.
Fig. 7 shows step 2 of splitting a node.
Fig. 8 shows step 3 of splitting a node.
Fig. 9 shows step 4 of splitting a node.
Fig. 10 shows step 5 of splitting a node.
Fig. 11 is a schematic diagram of the metadata storage structure.
Embodiment
With reference to the drawings, the following uses the node insertion and node splitting operations of the mapping B+Tree as examples to explain the mapping B+Tree operations involved in adding, deleting, and modifying data block mappings. Node deletion and node merging are the inverse of node insertion and node splitting respectively and are not described again here. The metadata storage structure in each RAID and the metadata operations during storage system expansion and shrinking are also explained.
Figure 1 is a schematic diagram of the mapping B+Tree data structure. Each non-leaf node contains an active domain and an inactive domain of equal size with adjacent address ranges, and the start address of a non-leaf node is aligned to the node size. Leaf nodes are pointers to data blocks. Which domain of a non-leaf node is active is determined by the pointer in the parent node that points to the node. The two address-adjacent spaces in a node are called domain A and domain B, where the start address of domain A is the start address of the node. Because the address of a non-leaf node is aligned to the node size, if the address stored in the parent's pointer to the current node is the start address of domain A, which is also the start address of the node, then domain A is the active domain and domain B is the inactive domain. Otherwise, if the stored address is the start address of domain B, which cannot be aligned to the node size, then domain A is the inactive domain and domain B is the active domain.
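The alignment rule above reduces to a remainder test on the address stored in the parent. The sketch below shows one way to express it; the function names are hypothetical, and the explicit error for an address that is neither domain start is an added assumption, since the text does not discuss invalid pointers.

```python
def active_domain(child_ptr: int, node_size: int) -> str:
    """Return "A" or "B" for the domain that the parent's pointer selects."""
    offset = child_ptr % node_size
    if offset == 0:
        return "A"                # pointer is aligned: it is the node start, so A is active
    if offset == node_size // 2:
        return "B"                # pointer is half a node further: B is active
    raise ValueError("pointer does not reference the start of a domain")


def node_base(child_ptr: int, node_size: int) -> int:
    """Recover the node start address from either domain address."""
    return child_ptr - child_ptr % node_size


def inactive_domain_addr(child_ptr: int, node_size: int) -> int:
    """Address of the domain that is currently not selected by the pointer."""
    base = node_base(child_ptr, node_size)
    if active_domain(child_ptr, node_size) == "A":
        return base + node_size // 2
    return base
```

With an 8 KB node at address 0x4000, a stored pointer of 0x4000 selects domain A and a stored pointer of 0x5000 selects domain B. No separate flag is needed, because the pointer value itself encodes which half is active.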
Figures 3 to 5 illustrate the node insertion operation on the mapping B+Tree:
1) As shown in Figure 3, first search the mapping B+Tree to determine the parent node of the new leaf node, and copy the data in that node's active domain to its inactive domain.
2) As shown in Figure 4, modify the inactive domain of the node: add the index of the new leaf node and change the key values.
3) As shown in Figure 5, modify the pointer to the current node in the current node's parent so that it points to the inactive domain of the current node. This completes the switch between the active and inactive domains and puts the new node metadata into effect; the node insertion is complete.
After a new node is inserted into a non-leaf node of the mapping B+Tree, the number of index entries in the node may exceed the node limit of the mapping B+Tree data structure. The node then has to be split into two new nodes, each holding part of the original node's data. Figures 6 to 10 illustrate the node splitting operation on the mapping B+Tree:
1) As shown in Figure 6, after a node is inserted, the index entries of a non-leaf node of the mapping B+Tree reach the maximum and the node has to be split.
2) As shown in Figure 7, copy the active-domain data of the current node to its inactive domain.
3) As shown in Figure 8, allocate and initialize a new non-leaf node, and move part of the data in the inactive domain of the current node into the active domain of the newly allocated node.
4) As shown in Figure 9, insert the newly allocated node into the mapping B+Tree as a new node.
5) As shown in Figure 10, modify the index pointers to the nodes involved so that they point to the inactive domains that hold the new metadata. This completes the switch between the active and inactive domains of each node and puts the new node metadata into effect; the node split is complete.
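The split steps of Figures 7 to 10 can be sketched on a simulated disk. Several simplifications are assumed here and are not taken from the patent: the parent is a plain Python list of child pointers that is edited in place (in the described method the parent would itself be updated through its own inactive domain), the split point is the middle key, and `split_node`, `disk`, and `new_base` are hypothetical names.

```python
NODE_SIZE = 8192          # assumed node size
HALF = NODE_SIZE // 2     # size of one domain


def split_node(disk: dict, parent_ptrs: list, idx: int, new_base: int):
    """Split the node referenced by parent_ptrs[idx]; return the separator key."""
    active = parent_ptrs[idx]
    base = active - active % NODE_SIZE
    inactive = base + HALF if active == base else base

    # Fig. 7: copy the active-domain data to stage it for the inactive domain.
    staged = dict(disk[active])
    keys = sorted(staged)
    mid = len(keys) // 2

    # Fig. 8: allocate and initialise the new node, then move the upper half
    # of the staged data into the new node's active domain (its domain A).
    disk[new_base] = {k: staged.pop(k) for k in keys[mid:]}
    disk[new_base + HALF] = {}
    disk[inactive] = staged

    # Fig. 9 and Fig. 10: link the new node into the parent and repoint the
    # parent at the inactive domain of the node that was split.
    parent_ptrs[idx] = inactive
    parent_ptrs.insert(idx + 1, new_base)
    return keys[mid]
```

The old active domain of the split node is never written, so it still contains the complete pre-split entry set after the function returns.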
Figure 11 is a schematic diagram of the actual metadata storage structure. A copy of the metadata superblock is stored in every RAID in the system. The superblock holds the root node of the data block mapping B+Tree, the root node of the metadata bitmap B+Tree, the root node of the data bitmap B+Tree, and other metadata in the system such as device UUIDs, device names, device object indexes, and device attribute information. The data block mapping B+Tree, the metadata bitmap B+Tree, and the data bitmap B+Tree are not stored as identical copies in every RAID; instead, the blocks that make up each B+Tree are spread over all RAIDs according to a load-balancing policy. The inactive metadata space in each RAID stores a copy of the metadata in that RAID's active metadata space. In addition, according to the distribution policy, two copies of the data block mapping B+Tree, metadata bitmap B+Tree, and data bitmap B+Tree data held in a RAID are stored in other RAIDs. The two copies are not placed in the same RAID, which ensures that the distributed metadata remains intact even when two RAIDs in the system fail.
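The placement rule (one primary location plus two copies on two different other RAIDs) can be sketched as below. The modulo placement and the "next two RAIDs" choice are assumptions chosen for illustration; the text only says that a load-balancing policy and a distribution policy are used, without naming them.

```python
def place_metadata_block(block_id: int, n_raids: int):
    """Return (primary RAID index, [two replica RAID indexes]) for a metadata block."""
    if n_raids < 3:
        raise ValueError("two replicas on distinct other RAIDs need at least 3 RAIDs")
    primary = block_id % n_raids                     # assumed load-balancing rule
    replicas = [(primary + 1) % n_raids,             # assumed replica rule: the next
                (primary + 2) % n_raids]             # two RAIDs, never the primary
    return primary, replicas


def survives(block_id: int, n_raids: int, failed: set) -> bool:
    """True if at least one copy of the block is on a RAID that has not failed."""
    primary, replicas = place_metadata_block(block_id, n_raids)
    return any(r not in failed for r in [primary, *replicas])
```

Because every block lives on three distinct RAIDs, any two simultaneous RAID failures leave at least one copy readable, which matches the integrity guarantee stated above.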
During expansion, the storage system processes the metadata as follows:
1. Initialize the newly added RAID.
2. Compute the metadata distribution after expansion according to the load-balancing policy.
3. Synchronize the active and inactive metadata spaces in each RAID so that both hold the same metadata.
4. Based on the result of step 2, copy metadata into the active metadata space of the new RAID.
5. Based on the result of step 2, change the metadata in the inactive metadata space of each RAID to its post-expansion state.
6. Update the superblock, the metadata bitmap B+Tree, and the data bitmap B+Tree in each RAID.
7. Make the inactive metadata space of each original RAID the active space.
8. Synchronize the active and inactive metadata spaces of each RAID, and re-establish the cross-RAID metadata copies.
9. The operation is complete.
The shrink operation of the storage system is similar to the expansion operation and is not described again here.
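The expansion steps above can be walked through with a toy model in which each RAID holds an active and an inactive metadata space. Everything concrete here is assumed: the class `RaidMeta`, integer metadata keys, the modulo rule standing in for the load-balancing policy, and the omission of step 6 and of the cross-RAID copies in step 8.

```python
class RaidMeta:
    """Hypothetical per-RAID metadata area with two equally sized spaces."""

    def __init__(self, name, items=None):
        self.name = name
        self.spaces = [dict(items or {}), {}]
        self.active = 0

    def active_space(self) -> dict:
        return self.spaces[self.active]

    def inactive_space(self) -> dict:
        return self.spaces[1 - self.active]

    def sync(self) -> None:
        # Make the inactive space an exact copy of the active space.
        self.spaces[1 - self.active] = dict(self.spaces[self.active])


def expand(raids: list, new_name: str) -> list:
    new_raid = RaidMeta(new_name)                           # step 1
    members = raids + [new_raid]
    items = {}
    for raid in raids:
        items.update(raid.active_space())
    target = {key: key % len(members) for key in items}     # step 2 (assumed rule)
    for raid in raids:                                      # step 3
        raid.sync()
    for index, raid in enumerate(members):
        share = {k: v for k, v in items.items() if target[k] == index}
        if raid is new_raid:
            raid.spaces[raid.active] = share                # step 4
        else:
            raid.spaces[1 - raid.active] = share            # step 5
    # Step 6 (superblock and bitmap updates) is not modelled in this sketch.
    for raid in raids:                                      # step 7
        raid.active = 1 - raid.active
    for raid in members:                                    # step 8 (local sync only)
        raid.sync()
    return members
```

Until step 7 flips the active index, the existing RAIDs keep serving lookups from their unchanged active spaces, which is how the procedure avoids interrupting normal access during expansion.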
The above is only a preferred embodiment of the present invention and is not intended to limit the scope of the invention.
Claims (5)
1. A data consistency management method for a thin-provisioning storage system, characterized in that a metadata structure for data block management and an implementation of a metadata storage structure are designed, wherein
the metadata structure includes:
(1) an improved structure of the B+Tree data structure;
(2) operations for adding a data block mapping, deleting a data block mapping, and modifying a data block mapping, implemented using the improved B+Tree data structure;
(3) a way of determining the active domain and the inactive domain of a non-leaf node in the improved B+Tree data structure;
the metadata is distributed over all RAIDs in the storage system according to different allocation policies and is organized and managed in structures such as B+Trees, and the metadata is cross-backed-up in different RAIDs;
the metadata storage structure includes: (1) storage and backup of the metadata across all RAIDs; (2) storage system expansion and shrink operations that use the metadata storage structure;
in the B+Tree that organizes the metadata, the space of each non-leaf node of the B+Tree is increased: starting from the original B+Tree data structure, the space of each non-leaf node is doubled and divided into two parts, an active domain and an inactive domain, wherein the active domain stores the data of the mapping B+Tree node, namely (key, value) pairs; depending on the policy, the inactive domain stores either a copy of the active-domain data or the data before the most recent modification of the node; a node is modified in its inactive domain, and after the modification is complete the active domain and the inactive domain are swapped; the start address of each non-leaf node is aligned to the node size when the node is allocated;
each non-leaf node contains an active domain and an inactive domain of equal size with adjacent address ranges, and the start address of a non-leaf node is aligned to the node size; leaf nodes are pointers to data blocks; the active domain and the inactive domain of a non-leaf node are determined by the pointer in the parent node that points to the current node; the two address-adjacent spaces in a node are called domain A and domain B, where the start address of domain A is the start address of the node; because the address of a non-leaf node is aligned to the node size, if the address stored in the parent's pointer to the current node is the start address of domain A, which is also the start address of the current node, then domain A is the active domain and domain B is the inactive domain; otherwise, if the address stored in the parent's pointer to the current node is the start address of domain B, which cannot be aligned to the node size, then domain A is the inactive domain and domain B is the active domain.
2. The method according to claim 1, characterized in that the improved structure of the B+Tree data structure allocates additional space to the non-leaf nodes of the original B+Tree data structure, so that the space of an improved non-leaf node is twice the original size and the node start address is aligned to the node size; the node space is divided into an adjacent active domain and inactive domain; the active domain serves normal metadata queries; the inactive domain stores a copy of the active domain or the history record of the previous operation.
3. The method according to claim 1, characterized in that, in the data block management operations implemented with the improved B+Tree data structure, when a mapping operation is performed on a self-compaction data block, the data in the active domain of each node of the improved B+Tree structure is copied to the inactive domain, and all modifications to data in a node are carried out in the inactive domain of that node; after the data in the nodes has been modified, the pointer that points to each modified node is changed so that it points to that node's original inactive domain; the inactive domain of each node, which now holds the modified data, thereby becomes the active domain, and the original active domain of each node becomes the inactive domain.
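The copy, modify, and pointer-switch sequence of claim 3 can be sketched as follows. This is a simplified model under stated assumptions, not the patented implementation: `Store`, `inactive_of`, `shadow_update`, and `NODE_SIZE` are hypothetical names, the address space is a plain dictionary, and the parent is reduced to a list of child pointers.

```python
NODE_SIZE = 8192            # assumed size of an improved (doubled) node
HALF = NODE_SIZE // 2       # each domain occupies half of the node


class Store:
    """Toy address space: maps a domain start address to that domain's keys."""

    def __init__(self):
        self.domains = {}

    def add_node(self, start, keys):
        assert start % NODE_SIZE == 0, "node start must be aligned to NODE_SIZE"
        self.domains[start] = list(keys)     # domain A, initially active
        self.domains[start + HALF] = []      # domain B, initially inactive
        return start                         # pointer to the active domain


def inactive_of(ptr):
    """Return the start address of the domain that ptr does NOT point to."""
    node_start = ptr - ptr % NODE_SIZE
    return node_start + HALF if ptr == node_start else node_start


def shadow_update(store, parent, slot, modify):
    """Modify the child referenced by parent[slot] without touching its active domain."""
    active = parent[slot]
    shadow = inactive_of(active)
    store.domains[shadow] = list(store.domains[active])  # 1. copy active to inactive
    modify(store.domains[shadow])                        # 2. edit only the copy
    parent[slot] = shadow                                # 3. switch the parent pointer
```

Until step 3 runs, readers following `parent[slot]` still see the unmodified domain, so an interruption during steps 1 or 2 leaves the previous state intact. After the switch, the old active domain keeps the previous version of the node.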
4. The method according to claim 1, characterized in that the metadata is stored and backed up across all RAIDs; the metadata space in which each RAID stores metadata is divided into two equally sized, address-adjacent spaces, a metadata active space and a metadata inactive space; the data structure with a small data volume, namely the superblock, is stored as an identical copy in the metadata active space of every RAID in the storage system; the data structures with a large data volume, namely the metadata mapping B+Tree, the metadata bitmap B+Tree and the data bitmap B+Tree, are distributed across the metadata active spaces of the RAIDs according to load balancing, each RAID storing a part of the metadata, and the metadata inactive space of a RAID stores a copy of that RAID's metadata active space; meanwhile, each RAID stores copies of the metadata of two other RAIDs according to a certain strategy.
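A minimal sketch of the placement described in claim 4 follows. The claim does not specify the load-balancing rule or the replica strategy, so both are assumptions here: metadata chunks are spread round-robin, and each RAID holds copies of the metadata of the two RAIDs that precede it in ring order. The function name `place_metadata` and the dictionary keys are hypothetical.

```python
def place_metadata(chunks, raid_count):
    """Return one placement record per RAID.

    Assumptions (not stated in the claim): round-robin load balancing and
    ring-order replica placement; at least three RAIDs are required so that
    every RAID can hold copies from two *other* RAIDs.
    """
    if raid_count < 3:
        raise ValueError("need at least 3 RAIDs to replicate to two others")
    layout = [
        {"superblock": True, "primary": [], "replicas_of": []}
        for _ in range(raid_count)
    ]
    # Large structures: spread the chunks evenly over the active spaces.
    for i, chunk in enumerate(chunks):
        layout[i % raid_count]["primary"].append(chunk)
    # Each RAID additionally keeps copies of two other RAIDs' metadata.
    for r in range(raid_count):
        layout[r]["replicas_of"] = [(r - 1) % raid_count, (r - 2) % raid_count]
    return layout
```

With this assumed strategy, every chunk exists on its primary RAID and on two other RAIDs, while the small superblock is present on all of them.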
5. The method according to claim 1, characterized in that, in the storage system expansion and reduction operations that apply the metadata storage organization, during an expansion or reduction operation the metadata active space and the metadata inactive space are first synchronized; the metadata inactive space in each RAID is then modified according to the metadata distribution strategy; finally, the metadata inactive space of each RAID is switched to be the active space, which enables the new metadata and completes the expansion or reduction operation.
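The three steps of claim 5 (synchronize, rewrite the inactive space, switch) can be sketched as below. This is an illustration only: `RaidMetadata` and `rebalance` are hypothetical names, and round-robin redistribution stands in for the metadata distribution strategy, which the claim does not detail.

```python
class RaidMetadata:
    """Per-RAID metadata area with two equally sized spaces."""

    def __init__(self, chunks):
        self.spaces = [list(chunks), list(chunks)]
        self.active = 0                      # index of the active space

    @property
    def active_space(self):
        return self.spaces[self.active]

    @property
    def inactive_space(self):
        return self.spaces[1 - self.active]


def rebalance(raids, new_raid_count):
    """Expand or shrink the RAID set while the old layout stays readable."""
    all_chunks = sorted(c for r in raids for c in r.active_space)
    while len(raids) < new_raid_count:       # expansion: add empty RAIDs
        raids.append(RaidMetadata([]))
    # 1. Synchronize the inactive space with the active space.
    for r in raids:
        r.inactive_space[:] = r.active_space
    # 2. Rewrite only the inactive spaces according to the new distribution.
    for r in raids:
        r.inactive_space.clear()
    for i, chunk in enumerate(all_chunks):
        raids[i % new_raid_count].inactive_space.append(chunk)
    # 3. Switch: the rewritten inactive space becomes the active space.
    for r in raids:
        r.active = 1 - r.active
    return raids[:new_raid_count]
```

Because step 2 never writes to an active space, the previous layout remains available until the switch in step 3, and it is still present afterwards in the now-inactive space.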
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410614846.4A CN104331478B (en) | 2014-11-05 | 2014-11-05 | Data consistency management method for self-compaction storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410614846.4A CN104331478B (en) | 2014-11-05 | 2014-11-05 | Data consistency management method for self-compaction storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104331478A (en) | 2015-02-04 |
CN104331478B (en) | 2017-09-22 |
Family
ID=52406205
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410614846.4A Active CN104331478B (en) | 2014-11-05 | 2014-11-05 | Data consistency management method for self-compaction storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104331478B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105989140B (en) * | 2015-02-27 | 2019-09-03 | 阿里巴巴集团控股有限公司 | A kind of data block processing method and equipment |
CN104731905A (en) * | 2015-03-24 | 2015-06-24 | 浪潮集团有限公司 | Volume reducing method for simplified storage pool |
CN104820575B (en) * | 2015-04-27 | 2017-08-15 | 西北工业大学 | Realize the method that storage system is simplified automatically |
CN105354315B (en) * | 2015-11-11 | 2018-10-30 | 华为技术有限公司 | Method, sublist node and the system of distributed data base neutron table splitting |
CN105630417B (en) * | 2015-12-24 | 2018-07-20 | 创新科软件技术(深圳)有限公司 | A kind of RAID5 systems and in the subsequent method for continuing data of RAID5 thrashings |
CN105718217B (en) * | 2016-01-18 | 2018-10-30 | 浪潮(北京)电子信息产业有限公司 | A kind of method and device of simplify configuration storage pool data sign processing |
CN107301183B (en) * | 2016-04-14 | 2020-02-18 | 杭州海康威视数字技术股份有限公司 | File storage method and device |
CN107729142B (en) * | 2017-09-29 | 2021-06-29 | 郑州云海信息技术有限公司 | Thread calling method for self-compaction metadata |
CN110825552B (en) * | 2018-08-14 | 2021-04-09 | 华为技术有限公司 | Data storage method, data recovery method, node and storage medium |
CN110232057B (en) * | 2019-05-29 | 2021-03-12 | 掌阅科技股份有限公司 | Data rollback method, electronic device and storage medium |
US11295031B2 (en) | 2019-10-08 | 2022-04-05 | International Business Machines Corporation | Event log tamper resistance |
US11392348B2 (en) | 2020-02-13 | 2022-07-19 | International Business Machines Corporation | Ordering records for timed meta-data generation in a blocked record environment |
CN111338568B (en) * | 2020-02-16 | 2020-11-06 | 西安奥卡云数据科技有限公司 | Data logic position mapping method |
CN112306971B (en) * | 2020-10-27 | 2023-01-24 | 苏州浪潮智能科技有限公司 | File storage method, device, equipment and readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7802063B1 (en) * | 2005-06-10 | 2010-09-21 | American Megatrends, Inc. | Method, system, apparatus, and computer-readable medium for improving disk array performance |
CN101997918A (en) * | 2010-11-11 | 2011-03-30 | 清华大学 | Method for allocating mass storage resources according to needs in heterogeneous SAN (Storage Area Network) environment |
CN103020201A (en) * | 2012-12-06 | 2013-04-03 | 浪潮电子信息产业股份有限公司 | Storage pool capable of automatically simplifying configuration for storage system and organization and management method |
Also Published As
Publication number | Publication date |
---|---|
CN104331478A (en) | 2015-02-04 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN104331478B (en) | Data consistency management method for self-compaction storage system | |
CN107798130B (en) | Method for storing snapshot in distributed mode | |
CN107463447B (en) | B + tree management method based on remote direct nonvolatile memory access | |
CN110447021A (en) | Method, apparatus and system for maintaining metadata and data consistency between data centers | |
CN101777016B (en) | Snapshot storage and data recovery method of continuous data protection system | |
CN105718217B (en) | A kind of method and device of simplify configuration storage pool data sign processing | |
US20060047926A1 (en) | Managing multiple snapshot copies of data | |
CN103106286B (en) | Method and device for managing metadata | |
CN105868396A (en) | Multi-version control method of memory file system | |
CN103354923A (en) | Method, device and system for data reconstruction | |
CN110058822A (en) | A kind of disk array transverse direction expanding method | |
CN104281538B (en) | It is a kind of store equipment dilatation and Snapshot Method and storage equipment | |
CN106250320A (en) | A kind of memory file system management method of data consistency and abrasion equilibrium | |
US11073986B2 (en) | Memory data versioning | |
CN102073739A (en) | Method for reading and writing data in distributed file system with snapshot function | |
CN107784121A (en) | Lowercase optimization method of log file system based on nonvolatile memory | |
CN105045850B (en) | Junk data recovery method in cloud storage log file system | |
CN106354890B (en) | A kind of implementation method of the file system of the random access based on N-ary tree construction | |
CN110196818A (en) | Data cached method, buffer memory device and storage system | |
CN109144416A (en) | The method and apparatus for inquiring data | |
CN109933564A (en) | File system management method, device, terminal, the medium of quick rollback are realized based on chained list and N-ary tree construction | |
CN102541691A (en) | Log check point recovery method applied to memory data base OLTP (online transaction processing) | |
CN107436738A (en) | A kind of date storage method and system | |
CN103473258A (en) | Cloud storage file system | |
CN113704217A (en) | Metadata and data organization architecture method in distributed persistent memory file system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |