
CN109684338A - Data update method for a storage system - Google Patents

Data update method for a storage system

Info

Publication number
CN109684338A
Authority
CN
China
Prior art keywords
data
block
updated
backup
updating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811386532.8A
Other languages
Chinese (zh)
Inventor
张婧垚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huaer Data Technology Co Ltd
Original Assignee
Shenzhen Huaer Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huaer Data Technology Co Ltd
Priority to CN201811386532.8A
Publication of CN109684338A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

A data update method for a storage system. A storage object in the storage system is stored as blocks on at least one node; each node stores one logical data block, and the logical data blocks on all nodes together make up a complete storage object. The method comprises: obtaining an update block and a backup block for the data to be updated; copying the content of the backup block to the log area of the logical data block; and then copying the content of the update block to the corresponding position of the data to be updated in the data area. With this method, historical data can be accurately obtained and restored, and the data of a specified version can be restored or read by version number. Moreover, if an exception occurs during an update, the state before the update can be conveniently restored, which ensures data consistency and update correctness.

Description

Data update method for a storage system
Technical field
The present invention relates to the field of data storage, and in particular to a data update method for a storage system, as well as an exception handling method based on that data update method and processes for reading and rolling back the updated data.
Background
With the continuous progress of science and technology, and especially the rapid development of information technologies such as the Internet, the volume of data generated daily by modern society is growing explosively, and the demand for mass data storage technology is increasingly prominent. Against this background, distributed storage systems that offer large capacity, scalability, high availability, and high reliability are replacing traditional single-node storage systems and becoming the industry mainstream. Compared with a single-node system, a multi-node distributed storage system faces many new and complex problems: it must not only provide higher data access efficiency to serve a large number of simultaneous users, but also guarantee the security and correctness of data. Fields such as telecommunications and finance in particular place higher reliability requirements on storage systems, which calls for dedicated technical measures.
For a storage system, the risk of error lies largely in the data update process. If a failure such as a network interruption or a machine power-off occurs while data is being modified, the operation may not complete normally. In that case not only is the new data not written correctly, but the old data may also be lost or damaged, possibly invalidating the entire block of data. In a distributed storage system, different parts of one block of data may be stored on different nodes. In addition, distributed storage systems usually use redundancy such as multiple replicas or erasure codes to protect data, so even if a whole block of data is placed on a single node, its replicas (with multi-replica redundancy) or its original data blocks and parity blocks (or coded blocks, with erasure codes) are still spread across multiple nodes. This makes data update operations in a distributed storage system more complex. Suppose a block of data has N related storage nodes. When data is modified, the content of all N nodes must be updated correctly. If some of the nodes complete the update while the others fail to update correctly because of a fault or similar factors, the data versions on the N nodes become inconsistent, which may destroy the entire block of data. In this case the update must be executed again on the failed nodes until all related nodes have been updated correctly. Sometimes, however, the fault cannot be cleared quickly, and the nodes that were already updated can only be restored to their state before the update. In other words, the consistency of data across nodes must be guaranteed at all times; this is a basic problem that a distributed storage system has to solve.
Besides guaranteeing data correctness, traceability of historical data is also an essential feature of a high-quality storage system. Application scenarios such as querying the modification history of data, backing up data with snapshot technology, disaster recovery, and undoing a mistaken operation all require that changes to data be recorded accurately. Modern storage systems generally implement this with logging. Depending on the application and the performance requirements, the level and granularity of the log can vary. System-level logs record state changes of the whole system. File-level or object-level logs record operations such as creation, attribute modification, and deletion of a file or storage object, but not the specific changes to data content; Ceph, for example, uses this kind of log. Data-content-level logs record the specific changes to data content, including the modified position, the modified content, and the modification time. The first two are coarse-grained logs, which are relatively simple to implement but cannot trace data content. The last is a fine-grained log, which can trace historical data accurately but faces problems of implementation complexity, space efficiency, and access efficiency.
At present, data-content-level update strategies mainly include the following:
1) Full Overwrite (FO): the old data at the position to be updated is overwritten directly with the new data. This approach amounts to having no log. It is simple to implement but risky: it cannot satisfy data consistency requirements, and the old data cannot be restored after the update.
2) Full Logging (FL): the new data is stored as a log entry in a dedicated log space in the system, together with the position of the modification (the offset relative to the start of the data block) and its length. To read data, the original data block is read first, and the updated parts are then replaced with the new data. This approach is much safer and more reliable than the Full Overwrite strategy in 1). If an update fails, the old data is still retained, and restoring historical data only requires deleting the new data. However, every read has to replace old data with new data, which slows access.
3) Parity Logging (PL): in a distributed storage system that uses erasure codes, data blocks are updated in place with the Full Overwrite strategy in 1), while coded blocks are updated through a log with the Full Logging strategy in 2). A read then returns the modified new data directly, with no replacement step, so access efficiency is higher. When an error requires restoring an old version, the coded block is first reverted to the old data, and the old version of the modified data block is then computed from the coded block and the unmodified data blocks using the erasure code formula in use. The problem is that rollback is complicated and expensive, and the approach only suits cases where few data blocks are updated at a time. If the number of updated data blocks exceeds the theoretical limit of the erasure code in use, the old data cannot be recovered.
4) Parity Logging with Reserved space (PLR): essentially the same as the Parity Logging strategy in 3), except that a section of space adjacent to the coded block is reserved for the log. When an update arrives, the change caused by the update (the parity delta) is stored in the reserved space. Because the reserved space is adjacent to the data block, it is easy to locate and fast to access. However, rollback is still complicated and expensive, and the approach is still limited to cases where few data blocks are updated at a time. If the number of updated data blocks exceeds the theoretical limit of the erasure code in use, the old data cannot be recovered.
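The trade-off between the first two strategies can be illustrated with a minimal Python sketch. The names `full_overwrite` and `FullLoggingBlock` are hypothetical and do not come from the patent; the sketch works on an in-memory byte buffer and ignores persistence, erasure coding, and concurrency.

```python
def full_overwrite(block: bytearray, offset: int, new_data: bytes) -> None:
    """FO: overwrite in place; the old bytes are lost."""
    block[offset:offset + len(new_data)] = new_data


class FullLoggingBlock:
    """FL: keep the original block untouched and append updates to a log."""

    def __init__(self, data: bytes) -> None:
        self.original = bytes(data)
        self.log = []  # (offset, new_data) entries, oldest first

    def update(self, offset: int, new_data: bytes) -> None:
        self.log.append((offset, bytes(new_data)))

    def read(self) -> bytes:
        # Every read replays the log over the original block,
        # which is the access cost described for FL.
        buf = bytearray(self.original)
        for offset, new_data in self.log:
            buf[offset:offset + len(new_data)] = new_data
        return bytes(buf)

    def undo_last(self) -> None:
        # Restoring history only requires dropping the logged new data.
        self.log.pop()


fo_block = bytearray(b"abcdefgh")
full_overwrite(fo_block, 2, b"XY")

fl_block = FullLoggingBlock(b"abcdefgh")
fl_block.update(2, b"XY")
```

After these calls both blocks read back the same content, but only the Full Logging block still holds the original bytes and can undo the update.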
In summary, a reliable storage system, and a distributed storage system in particular, needs means such as logging to guarantee the correctness of data updates and the recoverability of historical data. Data-content-level logging can accurately record detailed changes to data content and has great practical value. Its implementation, however, must keep complexity under control and maintain high access efficiency while restoring historical data accurately, without sacrificing system performance. Existing methods cannot meet all of these requirements at once.
Summary of the invention
The present application provides a data update method for a storage system. The method reduces the complexity of the storage system, and when an update error occurs, the state before the update can be conveniently restored, thereby ensuring data consistency and correctness.
According to a first aspect of the present application, a data update method for a storage system is provided. A storage object in the storage system is stored as blocks on at least one node; each node stores one logical data block, and the logical data blocks on all nodes together make up a complete storage object. The method includes the following steps:
Obtaining an update block and a backup block for the data to be updated; copying the content of the backup block to the log area of the logical data block; and copying the content of the update block to the corresponding position of the data to be updated in the data area.
According to a second aspect of the present application, an exception handling method for a data update process is provided. The data update process uses the data update method above, and the exception handling method is used to restore updated data to its state before the update when an exception is detected during the update. It includes the following steps:
Obtaining information about the data update exception; when the exception is found to have occurred while old data was being replaced with new data, replacing the updated new data with the old data held in the backup block of the data to be updated, and then deleting that backup block from the log area; when the replacement is found not to have started but the backup block of the data to be updated has already been created, deleting that backup block from the log area; and, once the backup block is known not to have been created or to have been deleted, returning update-failure information.
According to a third aspect of the present application, a data reading process for updated data is provided. The data update process uses the data update method above, and the reading process can read the data corresponding to an earlier version. It includes the following steps:
Obtaining the version number of the data to be read; checking the log area and reading the backup blocks whose version number is greater than the version number of the data to be read; reading the logical data block into a buffer; and replacing the corresponding parts of the current logical data block with the old data in the backup blocks read earlier.
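The reading steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `read_version` is a hypothetical name, a backup block is modelled as a dict with the hypothetical keys `version`, `ranges` (a list of (offset, length) pairs), and `old` (the replaced bytes), and backup blocks are applied from newest to oldest so that overlapping ranges end up with the oldest applicable content.

```python
def read_version(data_area: bytes, log_area: list, target_version: int) -> bytes:
    """Return the block content as of target_version without touching data_area."""
    # Check the log area: keep backup blocks newer than the requested version.
    newer = [b for b in log_area if b["version"] > target_version]
    # Read the current logical data block into a buffer.
    buf = bytearray(data_area)
    # Undo the newer updates inside the buffer, newest first.
    for backup in sorted(newer, key=lambda b: b["version"], reverse=True):
        for (offset, length), old in zip(backup["ranges"], backup["old"]):
            buf[offset:offset + length] = old
    return bytes(buf)


# Hypothetical node state: the block was updated at object versions 1 and 4.
current = b"CBBaaDDaaa"
log = [
    {"version": 1, "ranges": [(1, 2)], "old": [b"aa"]},
    {"version": 4, "ranges": [(0, 1), (5, 2)], "old": [b"a", b"aa"]},
]
v1_content = read_version(current, log, 1)
v0_content = read_version(current, log, 0)
```

Because the replacement happens in the buffer, the latest version in the data area is left untouched while the historical content is inspected.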
According to a fourth aspect of the present application, a data rollback process for updated data is provided. The data update process uses the data update method above, and the rollback process rolls the data back to the data corresponding to an earlier version. It includes the following steps:
Reading the version number to which the data is to be rolled back; checking the log area and reading the backup blocks whose version number is greater than the version number to roll back to; replacing the corresponding parts of the current logical data block with the old data in the backup blocks read earlier; and deleting the backup blocks read earlier to free their space.
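A minimal Python sketch of the rollback steps follows. The function name `rollback` and the dict layout of a backup block (keys `version`, `ranges`, `old`) are assumptions made for illustration; the newest-first application order is likewise an assumption chosen so that overlapping ranges are restored correctly.

```python
def rollback(data_area: bytearray, log_area: list, target_version: int) -> int:
    """Roll data_area back to target_version in place; return the number of backups removed."""
    # Check the log area: select backup blocks newer than the target version.
    newer = sorted(
        (b for b in log_area if b["version"] > target_version),
        key=lambda b: b["version"],
        reverse=True,
    )
    # Replace the corresponding parts of the current block with the old data.
    for backup in newer:
        for (offset, length), old in zip(backup["ranges"], backup["old"]):
            data_area[offset:offset + length] = old
    # Delete the backup blocks that were applied and free their space.
    log_area[:] = [b for b in log_area if b["version"] <= target_version]
    return len(newer)


block = bytearray(b"CBBaaDDaaa")
log = [
    {"version": 1, "ranges": [(1, 2)], "old": [b"aa"]},
    {"version": 4, "ranges": [(0, 1), (5, 2)], "old": [b"a", b"aa"]},
]
removed = rollback(block, log, 1)
```

Unlike reading a historical version, rollback modifies the data area itself and discards the newer backup blocks, so the rolled-back versions can no longer be reached afterwards.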
According to a fifth aspect of the present application, a computer-readable storage medium is provided, comprising a program that can be executed by a processor to implement the methods above.
The data update method of the above embodiments lowers the atomicity requirements on update operations and the complexity of the storage system. When an update error occurs, the state before the update can be conveniently restored, which guarantees data consistency, and reading historical data and rolling data back are also made more convenient.
Brief description of the drawings
Fig. 1 is a schematic diagram of the data update process of the storage system;
Fig. 2 is a flow chart of the data update process of one embodiment;
Fig. 3 is a flow chart of the update exception handling process of one embodiment;
Fig. 4 is a flow chart of the process of reading historical data in one embodiment;
Fig. 5 is a schematic diagram of the process of replacing new data with old data in one embodiment;
Fig. 6 is a schematic diagram of the backup block merge operation in one embodiment;
Fig. 7 is a flow chart of the data rollback operation in one embodiment;
Fig. 8a is a timing diagram of the data rollback operation under normal conditions in one embodiment;
Fig. 8b is a timing diagram of the data rollback operation when a failure occurs in one embodiment.
Detailed description of the embodiments
The invention is described in further detail below through specific embodiments in conjunction with the drawings. Similar components in different embodiments use related, similar reference numbers. Many details are described in the following embodiments so that the present application can be better understood. However, those skilled in the art will readily recognize that some of these features can be omitted in different situations, or can be replaced by other elements, materials, or methods. In some cases, certain operations related to the present application are not shown or described in the specification; this is to keep the core of the application from being buried under excessive description. For those skilled in the art, a detailed description of these operations is not necessary, since they can be fully understood from the description in the specification and the general technical knowledge of the field.
In addition, the features, operations, or characteristics described in the specification may be combined in any suitable manner to form various embodiments. The steps or actions in a method description may also be reordered or adjusted in ways obvious to those skilled in the art. The various orders in the specification and drawings are therefore intended only to describe a particular embodiment clearly and do not imply a required order, unless it is stated that a certain order must be followed.
Ordinal labels applied to components herein, such as "first" and "second", serve only to distinguish the objects described and carry no sequential or technical meaning. Unless otherwise stated, "connection" and "coupling" in the present application include both direct and indirect connection (coupling).
An embodiment of the invention provides a data update method for a storage system. The method uses data-content-level logging, can accurately obtain and restore historical data, and can restore or read the data of a specified version by version number. In addition, if an exception occurs while data is being updated with this method, the state before the update can be conveniently restored, which ensures data consistency and update correctness.
Data update is a common operation in a storage system. If an exception occurs during an update, the update is likely to fail, and the data may be lost beyond repair. A high-quality storage system must therefore guarantee the correctness of data updates, and a distributed storage system must additionally guarantee the consistency of data across nodes. Furthermore, it is sometimes necessary to read historical data or roll data back to an earlier version, which is also an indispensable function of a high-quality storage system. This embodiment aims to provide a data update method that gives a storage system these functions.
An embodiment of the invention provides a Full Overwrite with Logging (FOL) data update method that uses data-content-level logging. The method is applied to a storage system, which usually contains multiple storage objects. Each storage object is usually stored as blocks on at least one node; denoting the number of nodes by N, N ≥ 1. Each node holds one logical data block, and all the logical data blocks together make up a complete storage object. Modifying the content of a storage object in practice means updating logical data blocks; denoting by n the number of logical data blocks that need to be updated for one modification, n ≤ N. For convenience, the logical data block on each node is referred to below simply as a "data block".
Referring to Fig. 1, in this embodiment a node that stores a data block includes a data area and a log area. The data area stores the current content of the data block, that is, the content after the latest update. The log area mainly stores the update information of the data to be updated and the content of the old data that was replaced; in specific embodiments, the update information together with the replaced old data is called a "backup block". The log area may be space allocated specifically from the node's storage space, or, as in Parity Logging with Reserved space (PLR), it may be placed in reserved space adjacent to the data block to improve access speed. In some embodiments the node also includes a buffer. The buffer is mainly used when historical data needs to be read: it temporarily holds the data of an old version so that the data can be inspected without affecting the latest version.
In this embodiment, the content of a storage object in the storage system is updated with the data update method shown in Fig. 1. In the initial state, the version number of the data block stored in the data area is denoted V0, and the version number of the data block after an update is denoted Vi (i ≥ 1). Each update replaces either one contiguous section of data to be updated in the data block, or at least two non-adjacent sections, with new content. The new content used for the replacement is called an "update block". In specific embodiments, however, knowing the new content is not enough: the position at which the new content replaces data must also be known, and the version number of each update must be recorded so that earlier versions can be read later. Each update block therefore contains, in addition to the new content of the data to be updated, the update information of the data to be updated. The update information includes at least the following two parts:
1) Update position: records the location of the data to be updated, that is, the position in the data block of the content that needs updating. In specific embodiments the update position can be expressed as a start position and an update length. For example, the start position is the offset of the data to be updated relative to the start of the data block, denoted Pi, and the update length is the number of consecutive bytes updated, denoted Li. The update position is then written (Pi, Li), meaning that Li consecutive bytes starting at Pi in the data block are updated. In other embodiments the update position can be expressed as a start position and an end position. For example, the start position is the offset of the data to be updated relative to the start of the data block, denoted Pi, and the end position is the offset of the data to be updated relative to the end of the data block, denoted Qi. The update position is then written (Pi, Qi), meaning that the consecutive bytes from Pi to Qi in the data block are updated. In still other embodiments, at least two non-adjacent sections of data need to be updated, and an update position must be given for each section. With two sections, for example, the update positions can be written (Pi, Li), (Pj, Lj) or (Pi, Qi), (Pj, Qj) using the notations above. In describing the data update method of this embodiment, the update position is preferably expressed as the start position and update length of the data to be updated, that is, in the form (Pi, Li).
2) Version number: records the version of the data block after the update of the data to be updated completes, denoted Vi in specific embodiments. Note that in a distributed storage system one storage object may be split into multiple data blocks stored on different nodes, and each update may touch only some of the nodes. On any one node, therefore, updates to data blocks on other nodes may occur between two updates of the same data block. To keep the version numbers of a storage object continuous, versions are numbered at the storage object level. If the first update of the storage object modifies the data block on one node and the second update modifies the data block on another node, then the second update of the storage object is the first update of the data block on that other node, yet the version number of that data block is still recorded as V2 rather than V1. Fig. 1 shows the updates to the data block on one node: the first update of this data block is the first update of the corresponding storage object, so its version number is V1; the second update of this data block is the fourth update of the corresponding storage object, so its version number is V4.
In summary, the update information of the data to be updated can be expressed in the form ([(Pi, Li)], Vk) or ([(Pi, Li), (Pj, Lj)], Vk).
In some embodiments the update information also includes a timestamp, denoted Ti; the timestamp is optional. In specific embodiments the timestamp can be the time at which the current update starts, for example 20181010150123370; in other embodiments it can be the time at which the update completes. With a timestamp, historical data can be read and modification records searched by modification time; without one, this can only be done by version number. Without a version number, the data of a particular version cannot be read or restored on demand, although reading and restoring the original version (V0) and reading the latest version are not affected. When the update information includes a timestamp, it can be expressed as ([(Pi, Li)], Vk, Tk) or ([(Pi, Li), (Pj, Lj)], Vk, Tk).
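The update information, update block, and backup block described above can be modelled with a few small Python data classes. This is a sketch for illustration only: the class and field names (`UpdateInfo`, `UpdateBlock`, `BackupBlock`, `make_backup`) are hypothetical, and the on-disk layout of these records is not specified in the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UpdateInfo:
    positions: Tuple[Tuple[int, int], ...]  # each (P_i, L_i): start offset and length
    version: int                            # V_k, numbered per storage object
    timestamp: Optional[int] = None         # T_k, optional


@dataclass(frozen=True)
class UpdateBlock:
    info: UpdateInfo
    new_data: Tuple[bytes, ...]  # one new byte string per updated range


@dataclass(frozen=True)
class BackupBlock:
    info: UpdateInfo
    old_data: Tuple[bytes, ...]  # one replaced byte string per updated range


def make_backup(data_block: bytes, info: UpdateInfo) -> BackupBlock:
    """Collect the old bytes at every update position into a backup block."""
    old = tuple(bytes(data_block[p:p + l]) for p, l in info.positions)
    return BackupBlock(info=info, old_data=old)


# Two non-adjacent ranges updated at object version 4, with the example timestamp.
info = UpdateInfo(positions=((2, 3), (8, 2)), version=4, timestamp=20181010150123370)
backup = make_backup(b"0123456789AB", info)
```

Sharing one `UpdateInfo` record between the update block and the backup block reflects the text's point that both carry the same positions and version number and differ only in whether they hold the new or the old bytes.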
Referring to Fig. 2, the data update method based on the storage system above includes the following steps:
Step 101: obtain the update block and backup block of the data to be updated in the data block. The update position for the current data block is determined from the location of the data to be updated; in this embodiment the update position is preferably expressed as a start position and an update length, so in Fig. 1 the update position of the first update of the data block is (P1, L1). The version number of the data block after this update is then determined from its version number before the update. In Fig. 1, the version number is V0 before the first update and V1 after it, so the version number recorded in the update information of the first update is V1. In some embodiments a timestamp is also recorded from the current time of the update; the timestamp is denoted Ti, where the index i matches that of the version number. The update information of the data block to be updated is thus determined and written ([(P1, L1)], V1, T1). Once the update information is determined, it forms the update block together with the new data that will replace the data to be updated, and forms the backup block together with the old data being replaced. The update block and backup block of the data to be updated are obtained in this way.
Step 102: copy the content of the backup block to the log area of the data block. As shown in Fig. 1, in the first update the content of the data block from P1 to P1+L1 is to be updated. To keep the original data while the data block is updated, the old data at update position (P1, L1) is copied to the log area together with the version number (and timestamp) of this update. Similarly, when at least two non-adjacent sections of data in the data block need updating, the old data of each section must be copied to the log area. In the second update of the data block in Fig. 1, for example, the data at the two update positions (P2, L2) and (P3, L3) needs updating, so the old data of both regions is copied to the log area together with the version number (and timestamp).
Step 103: copy the content of the update block to the corresponding position of the data to be updated in the data area. Step 102 has already stored the old data of the region to be updated in the log area. To speed up reads of the new data, this step replaces the old data with the new data by copying the update block obtained in step 101 to the corresponding position of the data to be updated in the data area. As shown in Fig. 1, the first update replaces the old data at position (P1, L1) with new data, and the second update replaces the old data in each of the two regions with the corresponding new data.
In this embodiment, the data update process is complete once the content of the update block has been copied to the corresponding position of the data to be updated in the data area. In some embodiments the version numbers of all data blocks of the storage object must also be updated according to the content of the update block. Because a storage object is usually stored as blocks on multiple nodes, after the data on one node is updated, the version number of the data block on every node of the storage object needs to be updated from the version number of the data block just updated, so that the version number of the storage object stays consistent. As shown in Fig. 1, the first update of the storage object involves this data block and produces a backup block with version number V1. The second and third updates modify data blocks on other nodes. The fourth update involves this data block again, so the version number of the data block becomes V4, and the version number of the corresponding update block and backup block is also V4. In another embodiment, the version number of the storage object can instead be recorded centrally by a server, so that the version number does not have to be updated in every data block of the storage object.
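Steps 101 to 103 can be sketched in Python as follows. This is a minimal in-memory illustration under stated assumptions: the class name `Node`, its attribute names, and the dict layout of a backup block are hypothetical, and persistence, cross-node version propagation, and failure handling are left out.

```python
class Node:
    """One storage node holding a data area, a log area, and a block version."""

    def __init__(self, data: bytes) -> None:
        self.data_area = bytearray(data)
        self.log_area = []   # backup blocks, oldest first
        self.version = 0     # V0 in the initial state

    def update(self, changes, version: int) -> None:
        """changes: list of (offset, new_bytes); version: object-level version V_k."""
        # Step 101: derive the update positions and build the backup block
        # from the old data at those positions.
        ranges = [(offset, len(new)) for offset, new in changes]
        backup = {
            "version": version,
            "ranges": ranges,
            "old": [bytes(self.data_area[p:p + l]) for p, l in ranges],
        }
        # Step 102: copy the backup block to the log area first.
        self.log_area.append(backup)
        # Step 103: only then overwrite the data area with the new data.
        for offset, new in changes:
            self.data_area[offset:offset + len(new)] = new
        self.version = version


node = Node(b"aaaaaaaaaa")
node.update([(1, b"BB")], version=1)               # first update of the object
node.update([(0, b"C"), (5, b"DD")], version=4)    # fourth update of the object
```

Writing the backup block before touching the data area is the ordering that later makes it possible to undo a partially applied update; the version jump from 1 to 4 mirrors the Fig. 1 example in which the second and third updates of the object touch other nodes.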
Data are updated using data-updating method described in Fig. 2, can accurately record the detailed of data content Variation, and when being abnormal at no point in the update process, it can continue to be updated behaviour according to above-mentioned steps after excluding exception Make, while updated data block can also be restored to the state before updating, to guarantee the consistency of data and update just True property.As shown in figure 3, more new data block is restored to and updates the different of preceding state when being abnormal in data updating process Normal processing method the following steps are included:
Step 111, obtain the data update exception information. When the storage system fails, for example through a network interruption or a machine power loss, an update operation may not complete normally, that is, a data update exception occurs. After the system returns to normal, the exception information of the data update is read against the data updating process to determine which step the update had reached.
Step 112, judge whether the exception occurred while old data was being replaced with new data, that is, while step 103 was copying the content of the update block to the position of the data to be updated in the data area. If so, execute step 113; if not, execute step 114.
Step 113, replace the updated new data with the old data of the data to be updated held in the log area. This is the inverse of step 103. Because the log area may hold old data from several updates of the same data block, the original data must be selected by the version number of the data update in which the exception occurred, so that a wrong selection does not cause a further exception.
Step 114, judge whether the backup block has been created. Because a backup block is stored directly in the log area once created, judging whether the backup block has been created amounts to judging whether the exception occurred while the backup block was being copied to the log area. If the data update exception occurred while the backup block was being copied to the log area, execute step 115; if not, execute step 116.
Step 115, delete the backup block of the data to be updated from the log area. As in step 113, the log area may hold backup blocks from several updates of the same data block, so the backup block whose version corresponds to the failed update must be selected, so that a wrong selection does not cause a further exception.
Step 116, return update failure. After steps 111 to 115, the data has been restored to its state before the update, and the backup block has either been deleted from the log area or was never created. Update failure is returned at this point, and the update is then retried or abandoned as required.
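Steps 111 to 116 can be sketched as one recovery routine. The sketch below is illustrative only: the `recover` function, the `stage` labels, and the dictionary layout of a backup block are assumptions made for the example, and the deletion of the backup block after the data has been restored follows the wording of claim 6.

```python
def recover(data, log, failed_version, stage):
    """Undo a failed update and report the failure.

    stage is an assumed label for where the update stopped:
    "replacing" (step 103 was interrupted), "backing_up" (the backup block
    was being copied to the log area), or "not_started".
    """
    # Select the backup block by the version of the failed update, because
    # the log area may hold backups from several updates of the same block.
    backup = next((b for b in log if b["version"] == failed_version), None)
    if stage == "replacing" and backup is not None:
        # Step 113: put the old data back (inverse of step 103).
        for pos, old in backup["pieces"]:
            data[pos:pos + len(old)] = old
    if backup is not None:
        # Step 115: remove the backup block of the failed update.
        log.remove(backup)
    # Step 116: report the failure to the caller.
    return "update failed"


data = bytearray(b"aaXYaaaaaa")  # the update to version 2 was half applied
log = [{"version": 1, "pieces": [(0, b"bb")]},
       {"version": 2, "pieces": [(2, b"aa"), (7, b"a")]}]
status = recover(data, log, failed_version=2, stage="replacing")
print(status, bytes(data), [b["version"] for b in log])
```

Selecting the backup block by `failed_version` reflects the warning in steps 113 and 115 that the log area may contain several backup blocks for the same data block.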
In a distributed storage system, a storage object is often divided into multiple data blocks, and a single update operation may involve several of them. The update operation can be declared successful only when every data block involved reports a successful update. If any data block fails to update, all data blocks involved must be returned to their pre-update state by the exception handling method above, to guarantee data consistency.
With the data-updating method of this embodiment, the data area always holds the most recent version of the data block, so reading the current data is a direct read with high efficiency. In practical applications, however, historical data of an earlier version sometimes has to be read. When the data has been updated with the data-updating method of this embodiment, the process of reading such data, as shown in Figure 4, includes the following steps:
Step 121, obtain the version number of the data to be read. The version number of the historical data to be read is obtained from user input or a system command. When this version number is the current version of the data, the data can be read directly; this case is not discussed further below.
Step 122, read the backup blocks whose version number is greater than the version number of the data to be read. Let Vi denote the version number of the data to be obtained. The log area is checked first, and the backup blocks in it whose version number is greater than Vi are read out. Suppose there are k such backup blocks, with version numbers Vj1, Vj2, ..., Vjk, where j1 < j2 < ... < jk; this step then reads these k backup blocks from the log area. As shown in Figure 5, if the user asks for the data of version V2, only one backup block in this data block has a version number greater than V2, so only the backup block of version V4 needs to be read; this backup block contains two non-adjacent pieces of data.
Step 123, read the most recently updated data block into the buffer area. The most recently updated data block is the data block in its current state, that is, the data block on which all update operations before the current time have been completed.
Step 124, replace the corresponding data of the data block read into the buffer area in step 123 with the old data in the backup blocks read in step 122. During replacement, the position information of the old data in a backup block must match its position in the data block being replaced. Because the old data in the backup blocks read in step 122 may overlap in position, the replacement is carried out in descending order of backup block version number to guarantee correctness. Taking the k backup blocks of step 122 as an example, the backup block with version number Vjk is applied first, then Vj(k-1), and so on down to Vj1, which yields the data block of version Vi. Again as shown in Figure 5, replacing the data at the corresponding positions in the data block of version V4 with the two pieces of old data in the backup block of version V4 yields the data block of version V2. Performing the above steps on all data blocks of the storage object yields the data of the version to be read.
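Steps 121 to 124 can be sketched for a single data block as follows. The function name `read_version`, the dictionary layout of a backup block, and the sample byte strings are assumptions made for illustration; only the order of operations is taken from the description.

```python
def read_version(current, log, target_version):
    """Rebuild an earlier version of a data block without changing it."""
    # Step 123: read the current data block into a buffer.
    buf = bytearray(current)
    # Step 122: select the backup blocks newer than the target version.
    newer = [b for b in log if b["version"] > target_version]
    # Step 124: apply them from the highest version down to the lowest.
    for backup in sorted(newer, key=lambda b: b["version"], reverse=True):
        for pos, old in backup["pieces"]:
            buf[pos:pos + len(old)] = old
    return bytes(buf)


# The block was updated at versions 1 and 4 (versions 2 and 3 touched
# blocks on other nodes), mirroring the Figure 5 discussion.
log = [{"version": 1, "pieces": [(0, b"00")]},
       {"version": 4, "pieces": [(1, b"10"), (6, b"0")]}]
current = b"1440004000"
v2 = read_version(current, log, 2)
v0 = read_version(current, log, 0)
print(v2, v0)
```

Applying the newest backup first means that, where two backups overlap, the bytes from the older backup are written last and therefore win, which is the state the target version actually had.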
In some embodiments, historical data can also be read by timestamp: the corresponding version number is looked up from the timestamp information, and the above process is then executed to obtain the data of the version to be read. In other embodiments, historical data is read by timestamp only. In that case, the log area is searched for backup blocks whose data update time is later than the timestamp specified by the user, and these are applied in order from latest to earliest to obtain the data of the version to be read.
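The lookup from a timestamp to a version number might be sketched as below. The `version_at` function, the `(update_time, version)` history list, and the convention that version 0 denotes the state before any update are assumptions for the example; the description does not specify how the lookup is stored.

```python
from bisect import bisect_right


def version_at(history, timestamp):
    """Return the latest version whose update time is <= timestamp.

    history is an assumed list of (update_time, version) pairs sorted by
    update time; 0 is returned when the timestamp precedes every update.
    """
    times = [t for t, _ in history]
    # Index of the first update strictly later than the timestamp.
    i = bisect_right(times, timestamp)
    return history[i - 1][1] if i else 0


history = [(10, 1), (20, 2), (30, 3), (40, 4)]
print(version_at(history, 25))
```

The version number returned here would then be passed to the version-based reading process described in steps 121 to 124.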
When old data is replaced from backup blocks by the process described in step 124, the old data in several backup blocks has to be applied one block after another. The older the historical version to be read, the lower the replacement efficiency; in particular, when the positions of the backup blocks overlap heavily, the overlapping regions are replaced several times, which reduces the efficiency of reading the data. To address this, the invention proposes a method for improving the efficiency of reading historical data, as follows:
Let k be the number of backup blocks read in step 122. The k backup blocks are first merged to obtain a (P, L) list, each entry giving a start position and a length, and the replacement is then performed with the old data in the (P, L) list. With this method a single replacement operation yields the data block of the version to be read, and performing the same operation on each data block yields the data of the version to be read. For simplicity, the merge can proceed from the highest version to the lowest: the backup blocks with version numbers Vjk and Vj(k-1) are merged first to obtain a merged backup block of version Vj(k-1); this merged backup block is then merged with the backup block of version Vj(k-2), and so on until all k backup blocks have been merged. When two backup blocks are merged and their positions overlap, the overlapping part takes the data content of the older version, and the non-overlapping parts keep their respective content. Referring to Figure 6, let the two backup blocks to be merged be Vm and Vn, where m < n, and let the start position and update length of the updated region in the two backup blocks be (100, 300) and (200, 500) respectively; that is, backup block Vm covers positions 100 to 400 and backup block Vn covers positions 200 to 700. The two backup blocks then overlap in the region from 200 to 400, and because m < n, the overlapping region takes the data in the backup block of version Vm. Two merged backup blocks are produced, with start position and update length (100, 300) and (400, 300) respectively: the former has version Vm and takes its data content from the original backup block of version Vm; the latter has version Vn and takes its content from positions 400 to 700 of the original backup block of version Vn. The final result is a set of merged backup blocks with no overlapping regions, together with a (P, L) list.
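The merge rule (process backup blocks from the highest version to the lowest, and let the older version win in any overlap) can be sketched as follows. The function `merge_backups` and the tuple layout `(version, start, length, data)` are hypothetical; the sample values reproduce the Vm and Vn example with m = 1 and n = 2.

```python
def merge_backups(backups):
    """Merge backup pieces into non-overlapping (start, length, version, data).

    backups is an assumed list of (version, start, length, data) tuples.
    Pieces are processed from the highest version to the lowest, so an
    older piece overwrites the overlapping part of any newer piece.
    """
    merged = []  # non-overlapping (start, end, version, data)
    for version, start, length, data in sorted(
            backups, key=lambda b: b[0], reverse=True):
        end = start + length
        kept = []
        for s, e, v, d in merged:
            # Keep the parts of the newer piece outside [start, end).
            if s < start:
                cut = min(e, start)
                kept.append((s, cut, v, d[:cut - s]))
            if e > end:
                cut = max(s, end)
                kept.append((cut, e, v, d[cut - s:]))
        kept.append((start, end, version, data))
        merged = sorted(kept, key=lambda p: p[0])
    return [(s, e - s, v, d) for s, e, v, d in merged]


vm = (1, 100, 300, b"m" * 300)  # older backup, positions 100 to 400
vn = (2, 200, 500, b"n" * 500)  # newer backup, positions 200 to 700
result = merge_backups([vm, vn])
print([(p, l, v) for p, l, v, _ in result])
```

Because the merged pieces never overlap, a later replacement pass writes each position at most once.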
Step 123 reads all regions of the current data block into the buffer area. The replacement process shows, however, that the data at the positions covered by the backup blocks read in step 122 will be replaced anyway. To improve efficiency, this unnecessary reading can be avoided: step 123 may read only the data in the regions not covered by the backup blocks or merged backup blocks.
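Finding the regions that still have to be read from the data area amounts to taking the complement of the covered (start, length) regions within the block. The sketch below assumes a hypothetical helper `uncovered_regions` and a block size of 1000; neither comes from the description.

```python
def uncovered_regions(block_size, covered):
    """Return the (start, length) gaps not covered by any backup piece.

    covered is an assumed list of (start, length) pairs; it may be
    unsorted and may contain overlapping regions.
    """
    gaps, cursor = [], 0
    for start, length in sorted(covered):
        if start > cursor:
            # Everything between the cursor and this piece is uncovered.
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < block_size:
        gaps.append((cursor, block_size - cursor))
    return gaps


gaps = uncovered_regions(1000, [(100, 300), (400, 300)])
print(gaps)
```

Only the returned gaps need to be read from the current data block; the remaining positions are filled directly from the backup data.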
Optimizing steps 123 and 124 with the two methods above ensures that the data at each position is read only once, which reduces computing overhead and improves reading efficiency. In some embodiments, the two methods can likewise be used to improve reading efficiency when historical data is read by timestamp; this is not described again here.
The data-updating method of this embodiment also makes it convenient to roll data back. Rolling back each data block is similar to reading historical data. As shown in Figure 7, the data rollback process for updated data includes the following steps:
Step 131, obtain the version number to which the data is to be rolled back. This step is similar to step 121: the version number to roll back to is obtained from user input or a system command.
Step 132, read the backup blocks whose version number is greater than the version number to which the data is to be rolled back. This step uses the same method as step 122 and is not described again here.
Step 133, replace the data of the corresponding regions in the current version of the data with the old data in the backup blocks read in step 132. The replacement uses the same method as step 124; to improve rollback efficiency, the backup block merge described above can also be used.
Step 134, delete the backup blocks read in step 132 and free the space. Rolling data back serves a different purpose from reading historical data. Reading historical data only inspects an earlier version, and the current latest version must still be kept for other uses. Rolling data back to a specified version means that the versions produced by update operations after the specified version are no longer kept, so the backup blocks after the specified version lose their value and are deleted to free space.
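Steps 131 to 134 can be sketched for one data block as two small functions. The names `roll_back` and `discard_newer_backups` and the backup layout are assumptions for the example; the deletion is kept separate so that a caller can postpone it until the rollback is known to have succeeded.

```python
def roll_back(data, log, target_version):
    """Steps 131 to 133: restore the data area in place."""
    newer = [b for b in log if b["version"] > target_version]
    # Apply backups from the highest version down to the lowest.
    for backup in sorted(newer, key=lambda b: b["version"], reverse=True):
        for pos, old in backup["pieces"]:
            data[pos:pos + len(old)] = old


def discard_newer_backups(log, target_version):
    """Step 134: delete backup blocks newer than the target version."""
    log[:] = [b for b in log if b["version"] <= target_version]


data = bytearray(b"1440004000")
log = [{"version": 1, "pieces": [(0, b"00")]},
       {"version": 4, "pieces": [(1, b"10"), (6, b"0")]}]
roll_back(data, log, 1)
discard_newer_backups(log, 1)
print(bytes(data), [b["version"] for b in log])
```

Unlike a historical read, the rollback modifies the data area itself, which is why the newer backup blocks become useless afterwards.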
In a distributed storage system, a rollback operation counts as complete only when all data blocks of the storage object have been rolled back to the specified version or point in time; only then can the backup blocks be deleted. If the rollback of any data block fails, the update operations must be repeated so that all the data blocks involved are restored to their state before the rollback, to guarantee data consistency. The specific procedure is shown in Figures 8a and 8b:
The administrator issues a rollback request to the nodes holding the data blocks of the storage object. Assuming the storage object is stored on N nodes, the administrator issues the rollback request to all N nodes. On receiving the request, each node rolls back the data block stored on it according to the above steps and, when finished, sends a completion confirmation message to the administrator. Once the administrator has received completion confirmation messages from all N nodes, it sends an instruction to end the rollback to each node, and each node may delete its now useless backup blocks on receiving that instruction. If the administrator instead receives a message that a rollback has failed, it sends an instruction to cancel the rollback to the N nodes, and each node, on receiving it, performs a reverse update on the data block regions that were rolled back. In addition, if a node does not receive the instruction to end the rollback before a timeout, it also starts the reverse update to cancel the rollback, which guarantees data consistency.
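The message exchange between the administrator and the N nodes can be sketched as a simple two-phase procedure. The `Node` class, its `state` strings, the `will_fail` switch, and the `coordinate_rollback` function are hypothetical stand-ins for real network calls, and the timeout-driven cancellation is not modeled.

```python
class Node:
    """Hypothetical storage node holding one data block of the object."""

    def __init__(self, name, will_fail=False):
        self.name = name
        self.will_fail = will_fail
        self.state = "current"

    def rollback(self, version):
        """Roll back the local block; return True as the confirmation."""
        if self.will_fail:
            return False
        self.state = "rolled_back"
        return True

    def finish(self):
        """End of rollback: the useless backup blocks may now be deleted."""
        self.state = "finished"

    def cancel(self):
        """Reverse update: restore the regions that were rolled back."""
        if self.state == "rolled_back":
            self.state = "current"


def coordinate_rollback(nodes, version):
    """Administrator side: all nodes finish, or all are cancelled."""
    confirmations = [node.rollback(version) for node in nodes]
    if all(confirmations):
        for node in nodes:
            node.finish()
        return True
    for node in nodes:
        node.cancel()
    return False


ok_nodes = [Node("a"), Node("b")]
print(coordinate_rollback(ok_nodes, 2), [n.state for n in ok_nodes])
```

Holding back the deletion of backup blocks until every node has confirmed is what keeps the cancel path possible, since the reverse update depends on data that a premature deletion would remove.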
The invention applies both to single-node storage systems and to multi-node storage systems. In distributed storage systems in particular, the data-updating method of the invention lowers the requirement for atomicity of update operations and reduces the complexity of the storage system. When an error occurs during an update, the method makes it easy to restore the state before the update, which guarantees data consistency, and it also simplifies reading historical data and rolling data back.
Those skilled in the art will understand that all or part of the functions of the methods in the above embodiments may be implemented in hardware or by a computer program. When all or part of the functions are implemented by a computer program, the program may be stored in a computer-readable storage medium, which may include read-only memory, random access memory, a magnetic disk, an optical disc, a hard disk, and the like, and the functions are realized when the program is executed by a computer. For example, the program is stored in the memory of a device, and all or part of the above functions are realized when a processor executes the program in the memory. Alternatively, the program may be stored on a storage medium such as a server, another computer, a magnetic disk, an optical disc, a flash drive, or a removable hard disk, and be downloaded or copied into the memory of a local device, or used to update the system of the local device; all or part of the functions in the above embodiments are realized when a processor executes the program in the memory.
The invention has been described above with specific examples, which are intended only to aid understanding of the invention and not to limit it. Those skilled in the art may make simple deductions, variations, or substitutions based on the idea of the invention.

Claims (13)

1. A data-updating method for a storage system, wherein a storage object in the storage system is stored on at least one node in the form of blocks, each node stores one logical data block, and the logical data blocks on the nodes together constitute one complete storage object;
characterized in that the data-updating method comprises the following steps:
obtaining an update block and a backup block of data to be updated;
copying the content of the backup block to the log area of the logical data block;
copying the content of the update block to the position of the data to be updated in the data area.
2. The method according to claim 1, characterized in that the method further comprises, after copying the content of the update block to the position of the data to be updated in the data area, updating the version number of the corresponding logical data block according to the content of the update block.
3. The method according to claim 1, characterized in that the content of the update block comprises the new data of the data to be updated and the update information of the data to be updated;
the update information of the data to be updated comprises an update position, which records the position information of the data to be updated, and a version number, which records the version of the data block after the update of the data to be updated is complete.
4. The method according to claim 2, characterized in that the update information of the data to be updated further comprises a timestamp, the timestamp recording the time of each data update of the logical data block.
5. The method according to claim 2, characterized in that, in one update of a logical data block, either one contiguous portion of data to be updated in the logical data block is updated, or at least two non-adjacent portions of data to be updated in the same logical data block are updated; when at least two non-adjacent portions of data to be updated in the same logical data block are updated, the update position comprises the position information of the at least two portions of data to be updated, that is, the backup block containing the old data of the at least two portions of data to be updated is copied to the log area, and the update block containing the at least two portions of data to be updated is then copied to the positions of the at least two portions of data to be updated in the logical data block.
6. An exception handling method for a data updating process, the data updating process using the data-updating method according to any one of claims 1-5, characterized in that the method is used to restore updated data to its state before the update when an exception is detected in the data updating process, the method comprising the following steps:
obtaining the information of the data update exception;
when it is found that the exception occurred while old data was being replaced with new data, replacing the updated new data with the old data in the backup block of the data to be updated, and then deleting the backup block of the data to be updated from the log area;
when it is found that the replacement process had not yet started and the backup block of the data to be updated had been created, deleting the backup block of the data to be updated from the log area;
after it is known that the backup block of the data to be updated had not yet been created or has been deleted, returning information indicating update failure.
7. A data reading process for updated data, the data updating process of the updated data using the data-updating method according to any one of claims 1-5, characterized in that the process is used to read the data of an earlier version, the process comprising the following steps:
obtaining the version number of the data to be read;
checking the log area and reading the backup blocks whose version number is greater than the version number of the data to be read;
reading the logical data block into a buffer area;
replacing the corresponding parts of the current logical data block with the old data in the backup blocks read previously.
8. A data rollback process for updated data, the data updating process of the updated data using the data-updating method according to any one of claims 1-5, characterized in that the process is used to roll data back to the data of an earlier version, the process comprising the following steps:
reading the version number to which the data is to be rolled back;
checking the log area and reading the backup blocks whose version number is greater than the version number to which the data is to be rolled back;
replacing the corresponding parts of the current logical data block with the old data in the backup blocks read previously;
deleting the backup blocks read previously and freeing the space.
9. The method according to claim 7 or 8, characterized in that replacing the corresponding parts of the current logical data block with the old data in the backup blocks read previously comprises: applying the backup blocks read previously one after another in descending order of version number, until the content of all backup blocks read previously has been applied.
10. The method according to claim 7 or 8, characterized in that replacing the corresponding parts of the current logical data block with the old data in the backup blocks read previously comprises: merging the backup blocks read previously to obtain a merged backup block containing the old data information of each backup block, and then replacing the corresponding data in the current logical data block with the content of the merged backup block.
11. The method according to claim 7, characterized in that reading the logical data block into the buffer area comprises: reading only the data at positions not contained in the backup blocks read previously.
12. The method according to claim 8, characterized in that the corresponding backup blocks can be deleted only when all logical data blocks of the storage object have been rolled back to the specified version.
13. A computer-readable storage medium, characterized in that it comprises a program, the program being executable by a processor to implement the method according to any one of claims 1-12.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811386532.8A CN109684338A (en) 2018-11-20 2018-11-20 A kind of data-updating method of storage system

Publications (1)

Publication Number Publication Date
CN109684338A true CN109684338A (en) 2019-04-26

Family

ID=66185423

Country Status (1)

Country Link
CN (1) CN109684338A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190426