Summary of the invention
To solve the technical problems described above, embodiments of the present invention provide a data backup method and device for a cloud storage system. The technical solution is as follows:
An embodiment of the present invention provides a data backup method for a cloud storage system, comprising:
recording the modifications made to a target data set of a client;
determining, according to the recorded modifications, the difference between the current content of the data set and the previously backed-up content;
after receiving a backup operation trigger instruction, performing a backup operation on the determined difference.
According to one embodiment of the present invention, the backup operation trigger instruction is specifically:
a backup operation instruction that is triggered automatically at a preset time point according to the user's settings.
According to one embodiment of the present invention, the backup operation trigger instruction is specifically:
a backup operation instruction that is triggered automatically when the difference, which is monitored, exceeds a preset threshold.
According to one embodiment of the present invention, performing a backup operation on the determined difference is specifically:
performing the backup operation on the determined difference using resumable (breakpoint) transmission.
According to one embodiment of the present invention, recording the modifications made to the target data set comprises:
monitoring the target data set in real time and recording the modifications made to the target data set.
According to one embodiment of the present invention, recording the modifications made to the target data set comprises:
scanning the target data set and recording the difference between the current scan result and the previous scan result.
According to one embodiment of the present invention, scanning the target data set is specifically:
scanning the target data set when the client is idle.
An embodiment of the present invention also provides a data backup device for a cloud storage system, comprising:
a modification recording unit, configured to record the modifications made to a target data set of a client;
a difference determining unit, configured to determine, according to the recorded modifications, the difference between the current content of the data set and the previously backed-up content;
a backup unit, configured to perform a backup operation on the determined difference after receiving a backup operation trigger instruction.
According to one embodiment of the present invention, the backup operation trigger instruction is specifically:
a backup operation instruction that is triggered automatically at a preset time point according to the user's settings.
According to one embodiment of the present invention, the backup operation trigger instruction is specifically:
a backup operation instruction that is triggered automatically when the difference, which is monitored, exceeds a preset threshold.
According to one embodiment of the present invention, the backup unit is specifically configured to:
perform the backup operation on the determined difference using resumable (breakpoint) transmission.
According to one embodiment of the present invention, the modification recording unit is specifically configured to:
monitor the target data set in real time and record the modifications made to the target data set.
According to one embodiment of the present invention, the modification recording unit is specifically configured to:
scan the target data set and record the difference between the current scan result and the previous scan result.
According to one embodiment of the present invention, the modification recording unit scans the target data set when the client is idle.
With the technical solution provided by the embodiments of the present invention, backups of the client user's file system can be managed effectively. The automated backup mode simplifies the user's operations and reduces labor cost on the user side. Incremental backup avoids backing up large amounts of duplicate content every time, which reduces both backup time and backup storage space. The benefit is especially evident for users who hold massive amounts of data: the solution not only improves the efficiency and accuracy of data backup but also lowers the user's storage-related operating costs.
Detailed description of the embodiments
The core of cloud storage is the combination of application software with storage devices: through application software, a storage device is turned into a storage service. Compared with traditional storage equipment, a cloud storage system is not merely hardware but a complex system composed of multiple parts, including network equipment, storage devices, servers, application software, public access interfaces, the access network, and client programs. With the storage devices at the core, these parts provide data storage and service access to the outside through application software.
The term big data is commonly used to describe the large volumes of unstructured and semi-structured data that an enterprise creates, data that would cost too much time and money to load into a relational database for analysis. Big data analytics is often associated with cloud computing, because real-time analysis of large data sets requires a framework such as MapReduce to distribute the work across tens, hundreds, or even thousands of computers. Big data storage is therefore the foundation of big data management.
A cloud service provider can offer a customized cloud storage service to a single enterprise customer, or the enterprise's own IT department can deploy a private cloud service architecture. A private cloud can provide enterprise users with closely tailored, high-quality service and can also reduce security risk to some extent. However, for users whose data is large in volume and updated in real time, backing up that data manually is clearly impractical. For a cloud storage system, making backup convenient for the user and improving backup efficiency is therefore a problem that needs to be solved.
To address the above problem, an embodiment of the present invention provides a data backup method for a cloud storage system, which comprises the following basic steps:
recording the modifications made to a target data set of a client;
determining, according to the recorded modifications, the difference between the current content of the data set and the previously backed-up content;
after receiving a backup operation trigger instruction, performing a backup operation on the determined difference.
The above method may be executed by the cloud storage system itself or by a functionally independent module arranged in the cloud storage system. With this technical solution, backups of the client user's file system can be managed effectively. The automated backup mode simplifies the user's operations and reduces labor cost on the user side. Incremental backup avoids backing up large amounts of duplicate content every time, which reduces both backup time and backup storage space. The benefit is especially evident for users who hold massive amounts of data: the solution not only improves the efficiency and accuracy of data backup but also lowers the user's storage-related operating costs.
To enable those skilled in the art to better understand the technical solution of the present invention, the technical solution in the embodiments of the present invention is described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention shall fall within the scope of protection of the present invention.
Figure 1 is a schematic flowchart of the cloud storage system data backup method provided by an embodiment of the present invention. The method may comprise the following steps:
S101: recording the modifications made to a target data set of a client;
According to an embodiment of the present invention, the cloud storage system can monitor the file system of the client and record changes to the data in that file system, such as file modification, file addition, and file deletion. In practice, the user can specify the particular data objects that need to be backed up, for example a certain disk or a certain subdirectory. In the embodiments of the present invention, the data objects that need to be backed up are called the target data set.
In one embodiment of the present invention, the target data set can be monitored in real time and the modifications made to it recorded. The operations on a file system generally fall into two classes, read operations and write operations. For the purposes of the present solution, only the write operations of the file system need to be monitored in real time. Once data is found to have been written to the file system, the details of the write are recorded, for example which files were added, deleted, or modified.
In another embodiment of the present invention, a scan of the target data set can be triggered periodically or as a scheduled task. After each scan finishes, comparing the current scan result with the previous scan result shows which files have been added, which files have been deleted, and which files have been modified. The scan may be triggered periodically, or conditionally according to a scan plan set by the user. For example, the scan of the target data set may be triggered when the client is idle, that is, when the client load is below a preset threshold.
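The scan-and-compare embodiment above can be illustrated with a minimal Python sketch. It is not the implementation described in this document: the names `take_snapshot` and `compare_snapshots` are hypothetical, and using file size plus modification time as the change indicator is an assumption (a content hash could be substituted).

```python
import os
from typing import Dict, List, Tuple

# A snapshot maps each path (relative to the target data set root)
# to a (size in bytes, modification time) pair.
Snapshot = Dict[str, Tuple[int, float]]


def take_snapshot(root: str) -> Snapshot:
    """Walk the target data set and record size and mtime of every file."""
    snapshot: Snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            info = os.stat(full_path)
            snapshot[os.path.relpath(full_path, root)] = (info.st_size, info.st_mtime)
    return snapshot


def compare_snapshots(previous: Snapshot, current: Snapshot) -> Dict[str, List[str]]:
    """Return the files added, deleted, and modified since the previous scan."""
    added = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return {"added": added, "deleted": deleted, "modified": modified}
```

In use, the result of each `take_snapshot` call would be kept until the next scan so that `compare_snapshots` can be applied to the two most recent results.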
S102: determining, according to the recorded modifications, the difference between the current content of the data set and the previously backed-up content;
The backup scheme adopted by the present invention is as follows: the target data set is backed up repeatedly. Only the first backup needs to cover the entire target data set; each subsequent backup only needs to cover the difference from the previously backed-up content. Suppose the last backup took place at time t0 and the current time is t1. By aggregating the modifications recorded during the period from t0 to t1, the difference between the current content of the target data set and the previously backed-up content can be determined. The determined difference is the content that the next backup actually needs to process.
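As an illustration of aggregating the modifications recorded between t0 and t1, the following Python sketch collapses a change log into one net action per file. The log entry layout `(timestamp, operation, path)`, the operation names, and the function `net_difference` are assumptions made for this example and are not specified in this document.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical log entry: (timestamp, operation, path),
# where operation is one of "add", "modify", "delete".
ChangeEvent = Tuple[float, str, str]


def net_difference(log: Iterable[ChangeEvent], t0: float, t1: float) -> Dict[str, str]:
    """Collapse events with t0 < timestamp <= t1 into one net action per path."""
    net: Dict[str, str] = {}
    for timestamp, operation, path in sorted(log):
        if not (t0 < timestamp <= t1):
            continue  # outside the window since the last backup
        before = net.get(path)
        if operation == "delete":
            if before == "add":
                # Created and removed inside the window: nothing to back up.
                del net[path]
            else:
                net[path] = "delete"
        elif operation == "add":
            # A file re-created after deletion replaces the backed-up copy.
            net[path] = "modify" if before in ("delete", "modify") else "add"
        else:  # "modify"
            # A file that is new in this window stays an addition.
            net[path] = "add" if before == "add" else "modify"
    return net
```

Files whose net action is "add" or "modify" would be uploaded in the next backup, and files whose net action is "delete" would be marked as removed.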
S103: after receiving a backup operation trigger instruction, performing a backup operation on the determined difference.
According to the difference determined in S102, the corresponding data content is uploaded from the client to the cloud storage system to complete the backup operation. The backup operation can be triggered automatically. For example, according to the user's settings, the backup operation is triggered automatically at a preset time point; typically the user sets the backup to be triggered automatically during a period when the network is idle, such as the early morning. Alternatively, the difference determined in S102 is monitored; when the difference is found to exceed a preset threshold, the target data set has changed substantially since the last backup, and the backup operation instruction can be triggered automatically at that point.
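The two automatic trigger conditions described above can be sketched in Python as follows. The `BackupPlan` fields and the `should_trigger` function are hypothetical names for this example, and measuring the difference in bytes is an assumption; this document does not state the unit of the threshold.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass
class BackupPlan:
    """User-configured plan; both fields are illustrative assumptions."""
    scheduled_time: Optional[time] = None       # daily trigger time, e.g. time(2, 0)
    size_threshold_bytes: Optional[int] = None  # trigger when the difference exceeds this


def should_trigger(plan: BackupPlan, now: datetime,
                   last_backup: datetime, pending_bytes: int) -> bool:
    """Decide whether a backup operation instruction should be issued now."""
    # Condition 1: the accumulated difference is larger than the preset threshold.
    if plan.size_threshold_bytes is not None and pending_bytes > plan.size_threshold_bytes:
        return True
    # Condition 2: today's preset time point has passed and no backup has run since.
    if plan.scheduled_time is not None:
        scheduled_today = datetime.combine(now.date(), plan.scheduled_time)
        return last_backup < scheduled_today <= now
    return False
```

A background service could call `should_trigger` at a fixed interval and start the upload of the determined difference whenever it returns `True`.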
Because the backup operation uploads the client's data to the cloud storage system over a network connection, the data can be transferred using resumable (breakpoint) transmission, so that the upload can be recovered effectively when it fails because of a network fault or a similar cause. In addition, if the network is currently busy, the backup operation can be postponed to a period when the network is idle.
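The idea of resumable transmission can be illustrated with a minimal Python sketch, in which the sender remembers the offset of the last confirmed chunk and restarts from there. The function `resumable_upload`, the `send_chunk` callback, and the `TransferInterrupted` exception are hypothetical; a real deployment would use the multipart or range-upload facility of the storage service.

```python
from typing import BinaryIO, Callable, Tuple


class TransferInterrupted(Exception):
    """Raised by the (hypothetical) transport when the network fails."""


def resumable_upload(source: BinaryIO,
                     send_chunk: Callable[[int, bytes], None],
                     start_offset: int = 0,
                     chunk_size: int = 1024 * 1024) -> Tuple[int, bool]:
    """Send `source` in chunks starting at `start_offset`.

    Returns (confirmed_offset, completed). If the transfer is interrupted,
    the caller stores confirmed_offset and passes it back in later.
    """
    offset = start_offset
    source.seek(offset)
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            return offset, True  # end of data reached
        try:
            send_chunk(offset, chunk)
        except TransferInterrupted:
            return offset, False  # resume from this breakpoint next time
        offset += len(chunk)
```

Persisting the returned offset (for example in a small state file) allows the transfer to continue from the breakpoint after a restart instead of starting over.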
The solution of the present invention is further described below with reference to a concrete example:
According to an embodiment of the present invention, a corresponding backup service can be installed on the client. After the service has been installed and started successfully on the client, it presents an operation interface suited to the user's operating system and guides the user through the corresponding operations. For example, a Windows user can manage automatic data backup through a graphical user interface, and a Linux user can manage automatic data backup from the command line.
After the user logs in successfully, the system first checks whether a backup plan exists for this user. For a user logging in for the first time, the backup plan is empty, and the system guides the user to configure one. The system can offer several backup plans for the user to choose from, or the user can configure a personalized backup scheme. A user who has logged in before can, after logging in successfully, see information such as the time of the last backup and the storage space occupied. From the operation interface, the user can enter the backup plan management module to set or modify the automatic backup plan. The modified backup plan runs in the background and waits for the trigger event to occur.
According to the backup objects configured by the user, the monitoring function for the target data set is started. The file system to be backed up is monitored in the background, and information about files that have not yet been backed up is recorded. When the user modifies, deletes, or adds a file, the file modification information and the file attributes are recorded. When the monitoring function is started for the first time, the monitoring module performs an intelligent scan of the file system to be backed up; if the system is currently busy, the scan waits until the system is idle. When the file monitoring system is started on a later occasion, it scans the entire file system and records information about the modified files in the file monitoring log; these files are uploaded when the backup task starts.
When the conditions for automatic backup are met, the files that have changed are uploaded in a batch according to the monitoring results. The trigger condition configured by the user in the automatic backup plan may be a certain time point, or the file system increment exceeding a certain threshold. When the backup plan is triggered, uploads from the file system are always performed as an incremental backup, that is, only files that have changed are uploaded; the first upload is a full backup.
During the upload, information about the files that have been uploaded successfully is recorded. If the system loses power or the storage service becomes unavailable, the upload continues from the unfinished tasks at the next startup or when the service becomes available again, which improves the fault tolerance of the system.
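A minimal Python sketch of this fault-tolerance mechanism follows: each successfully uploaded file is written to a journal, and a restarted batch skips the files already listed there. The JSON journal format and the names `load_journal`, `save_journal`, and `upload_batch` are assumptions for illustration only.

```python
import json
import os
from typing import Callable, Iterable, List, Set


def load_journal(journal_path: str) -> Set[str]:
    """Read the set of files already uploaded in the current backup run."""
    if not os.path.exists(journal_path):
        return set()
    with open(journal_path, "r", encoding="utf-8") as handle:
        return set(json.load(handle))


def save_journal(journal_path: str, done: Set[str]) -> None:
    """Write the journal via a temporary file so a crash cannot truncate it."""
    temp_path = journal_path + ".tmp"
    with open(temp_path, "w", encoding="utf-8") as handle:
        json.dump(sorted(done), handle)
    os.replace(temp_path, journal_path)


def upload_batch(paths: Iterable[str],
                 upload_one: Callable[[str], None],
                 journal_path: str) -> List[str]:
    """Upload every path not yet in the journal; return what was sent this call."""
    done = load_journal(journal_path)
    uploaded_now: List[str] = []
    for path in paths:
        if path in done:
            continue  # finished before the interruption
        upload_one(path)  # may raise; the journal keeps earlier successes
        done.add(path)
        save_journal(journal_path, done)
        uploaded_now.append(path)
    return uploaded_now
```

Once a batch completes without error, the journal would be deleted or reset so that the next backup run starts with an empty record.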
Among existing cloud storage services, Amazon S3 is the most widely used, so the Amazon S3 service has almost become the standard of the cloud storage industry. Compatibility with the Amazon S3 service is therefore also a recognition of this service standard. In the present solution, the batch upload task for the file system is implemented by calling the SDK interface of Amazon S3. Compatibility with the Amazon interface makes it easy for users to migrate freely: an existing Amazon S3 user who adopts the service provided by the present invention does not need to modify the existing code and only needs to change the endpoint, which offers a convenient path to compatibility for vendors that use standard cloud storage.
Corresponding to the above method embodiment, the present invention also provides a data backup device for a cloud storage system. Referring to Figure 2, the device may comprise:
a modification recording unit 201, configured to record the modifications made to a target data set of a client;
According to an embodiment of the present invention, the cloud storage system can monitor the file system of the client and record changes to the data in that file system, such as file modification, file addition, and file deletion. In practice, the user can specify the particular data objects that need to be backed up, for example a certain disk or a certain subdirectory. In the embodiments of the present invention, the data objects that need to be backed up are called the target data set.
In one embodiment of the present invention, the target data set can be monitored in real time and the modifications made to it recorded. The operations on a file system generally fall into two classes, read operations and write operations. For the purposes of the present solution, only the write operations of the file system need to be monitored in real time. Once data is found to have been written to the file system, the details of the write are recorded, for example which files were added, deleted, or modified.
In another embodiment of the present invention, a scan of the target data set can be triggered periodically or as a scheduled task. After each scan finishes, comparing the current scan result with the previous scan result shows which files have been added, which files have been deleted, and which files have been modified. The scan may be triggered periodically, or conditionally according to a scan plan set by the user. For example, the scan of the target data set may be triggered when the client is idle, that is, when the client load is below a preset threshold.
a difference determining unit 202, configured to determine, according to the recorded modifications, the difference between the current content of the data set and the previously backed-up content;
The backup scheme adopted by the present invention is as follows: the target data set is backed up repeatedly. Only the first backup needs to cover the entire target data set; each subsequent backup only needs to cover the difference from the previously backed-up content. Suppose the last backup took place at time t0 and the current time is t1. By aggregating the modifications recorded during the period from t0 to t1, the difference between the current content of the target data set and the previously backed-up content can be determined. The determined difference is the content that the next backup actually needs to process.
a backup unit 203, configured to perform a backup operation on the determined difference after receiving a backup operation trigger instruction.
According to the difference determined by the difference determining unit 202, the corresponding data content is uploaded from the client to the cloud storage system to complete the backup operation. The backup operation can be triggered automatically. For example, according to the user's settings, the backup operation is triggered automatically at a preset time point; typically the user sets the backup to be triggered automatically during a period when the network is idle, such as the early morning. Alternatively, the difference determined by the difference determining unit 202 is monitored; when the difference is found to exceed a preset threshold, the target data set has changed substantially since the last backup, and the backup operation instruction can be triggered automatically at that point.
Because the backup operation uploads the client's data to the cloud storage system over a network connection, the data can be transferred using resumable (breakpoint) transmission, so that the upload can be recovered effectively when it fails because of a network fault or a similar cause. In addition, if the network is currently busy, the backup operation can be postponed to a period when the network is idle.
For convenience of description, the above device is described in terms of separate units divided by function. Of course, when the present invention is implemented, the functions of the units can be realized in one or more pieces of software and/or hardware.
The embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant parts, refer to the description of the method embodiment. The device embodiment described above is merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The above are only specific embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.