WO2024078677A1 - Mapping identifiers to maintain name and location coherency in file system objects - Google Patents
- Publication number
- WO2024078677A1 (application PCT/EP2022/078006)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- location
- incoherency
- file system
- file
- mapping
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/065—Replication mechanisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/178—Techniques for file synchronisation in file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0643—Management of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0647—Migration mechanisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0652—Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Definitions
- The present disclosure, in some embodiments thereof, relates to a computing system and, more particularly but not exclusively, to file system replication and synchronization processes.
- Data storage systems are arrangements of hardware and software that typically include multiple storage processors coupled to arrays of non-volatile storage devices, such as magnetic disk drives, electronic flash drives, and optical drives.
- Storage processors service requests from host machines that specify file elements to be written, read, created, or deleted.
- The processors perform various data processing tasks to organize and secure data stored on the non-volatile storage devices.
- Data storage systems commonly employ replication and synchronization technologies to back up or proliferate the data they store. Such systems may create a copy of an organization's production-site data on a secondary storage system.
- The secondary storage system may be situated in the same physical location as the original production data or in a physically remote location.
- The storage systems may provide data updates to enable data synchronization and recovery at specified points in time. Synchronization seeks to minimize downtime and data incoherency by periodically capturing the changes made to the source data since the latest copy, e.g., in a resynchronization operation. During synchronization, the storage system can access a file on a source system and copy object data and file-level metadata to another, target file system.
- The data on the target file system must be deleted during resynchronization to make room for the new copy from the source file system. That is, the updated copy is saved in place of the deleted copy on the target file system.
- Many of the objects and associated data in the target file system are deleted and then re-copied from the source file system.
- Certain objects on the target file system must be reconstituted immediately after being deleted because those objects remain unchanged from the source file system copy. For instance, much of the data on a target file system may be the same as that on a source file system where only a name or location of an object has changed on the source side.
- Replacing data on the target file system that is not changed or deleted on the source side can be viewed as expending processing and memory resources only to redo existing work.
- A system to maintain coherence in file system (FS) objects includes a memory having a plurality of mapping records, each having a first identifier (ID) and a second ID, for associating a first FS object stored in a first storage device with a second FS object.
- The second FS object may be stored in a second storage device.
- The first FS object and the second FS object may be controlled by different instances of an FS kernel.
- The system may further include a processor configured to access the memory to identify an incoherency between the first FS object and the second FS object and to update a respective mapping record from the plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
- The processor identifies the incoherency by identifying a difference between respective locations of the first and second FS objects. According to some embodiments of the invention, the processor identifies the incoherency by identifying a difference between respective names of the first and second FS objects. According to some embodiments of the invention, the processor is further configured to synchronize the second FS object to a third FS object of a third storage device by associating a third ID with the modified name or the modified location associated with the second ID.
- The processor is further configured to identify a second incoherency between the modified name or location and a corresponding location or name of the third FS object. According to some embodiments of the invention, the processor is further configured to synchronize the first FS object to the third FS object by associating the first ID with a second modified name or second modified location associated with the third ID. According to some embodiments of the invention, the processor is further configured to synchronize the plurality of mapping records by performing the updating in response to a synchronization operation. According to some embodiments of the invention, identifying the incoherency includes comparing a first location of the first FS object within a first path to a second location of the second FS object within a second path.
- Identifying the incoherency includes comparing one or more paths associated with the first ID to a plurality of corresponding paths associated with the second ID, wherein the plurality of corresponding paths is associated with multiple hard links of the modified location of the second FS object. According to some embodiments of the invention, identifying the incoherency includes comparing an attribute associated with the first ID to a corresponding attribute associated with the second ID.
- Identifying the incoherency includes comparing a first checksum value associated with a first path of the first ID to a second checksum value associated with a second path of the second ID. According to some embodiments of the invention, identifying the incoherency includes comparing the plurality of mapping records to a subsequently generated plurality of mapping records. According to some embodiments of the invention, the processor is further configured to reassign the first ID to a third FS object, wherein the first FS object has been deleted.
- Changes are made to both the first and second memories prior to a resynchronizing operation, and the processor is further configured to prompt a user to select the first FS object over the second FS object to be copied.
- Changes are made to both the first and second memories prior to a resynchronizing operation, and the processor is further configured to prompt a user to select a most recently updated FS object to be copied as between the first FS object and the second FS object.
- The processor is further configured to prompt a user to select a correct version to be copied as between the first FS object and the second FS object.
- The processor is further configured to prompt a user to designate the first FS object as a master copy to be copied to multiple FS objects.
- The processor is further configured to store the plurality of mapping records in at least one of: an ordered log file, a key-value store, a database, the first FS object, the second FS object, or as a special extended attribute on the second identifier.
- The processor is further configured to store the plurality of mapping records in a mapping table.
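One of the storage options listed above, a database-backed mapping table, can be sketched as follows. This is only an illustration under stated assumptions: the table name `id_map`, its columns, and the helper names `open_store`, `put_record`, and `get_record` are hypothetical and do not come from the disclosure, and SQLite is used simply because it ships with Python.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a hypothetical mapping table keyed by the first ID."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS id_map ("
        "first_id INTEGER PRIMARY KEY, "
        "second_id INTEGER NOT NULL, "
        "name TEXT NOT NULL, "
        "location TEXT NOT NULL)"
    )
    return db

def put_record(db, first_id, second_id, name, location):
    """Insert a mapping record, or overwrite the existing one for first_id."""
    db.execute(
        "INSERT OR REPLACE INTO id_map (first_id, second_id, name, location) "
        "VALUES (?, ?, ?, ?)",
        (first_id, second_id, name, location),
    )
    db.commit()

def get_record(db, first_id):
    """Return (second_id, name, location) for first_id, or None if unmapped."""
    row = db.execute(
        "SELECT second_id, name, location FROM id_map WHERE first_id = ?",
        (first_id,),
    ).fetchone()
    return row
```

Updating a record after a rename or move is then a second `put_record` call with the same first ID and the modified name or location.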
- A method of modifying a name or location of an FS object to be synchronized with that of another FS object comprises mapping a first ID associated with a first FS object of a first storage device to a second ID associated with a second FS object of a second storage device.
- The method further includes controlling the first FS object and the second FS object using different instances of an FS kernel, identifying an incoherency between the first FS object and the second FS object, and updating a respective mapping record of a plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
- Identifying the incoherency further comprises scanning the first and second memories prior to a synchronizing operation.
- The method further comprises updating a respective name or a respective location of each of a plurality of FS objects comprising copies of the first object.
- The first and second FS objects are included within a plurality of N FS objects, the method further comprising associating the modified name or the modified location with an N-1 name or an N-1 location associated with an N-1 ID of the plurality of N FS objects, and associating the modified name or the modified location of the N-1 ID with an Nth name or an Nth location associated with an Nth ID of the plurality of N FS objects.
- Mapping the first ID further comprises mapping at least one of an inode or a file ID.
- At least one non-transitory computer-readable medium comprising instructions that, in response to execution of the instructions by one or more processors, cause the one or more processors to perform the following operations: mapping a first ID associated with a first FS object of a first storage device to a second ID associated with a second FS object of a second storage device, controlling the first FS object and the second FS object using different instances of an FS kernel; identifying an incoherency between the first FS object and the second FS object, and updating a respective mapping record of a plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
- Implementation of the method and/or system of embodiments of the disclosure can involve performing or completing selected tasks manually, automatically, or a combination thereof. Moreover, according to actual instrumentation and equipment of embodiments of the method and/or system of the invention, several selected tasks could be implemented by hardware, by software or by firmware or by a combination thereof using an operating system.
- A data processor, such as a computing platform for executing a plurality of instructions.
- The data processor includes a volatile memory for storing instructions and/or data and/or a non-volatile storage, for example, a magnetic hard disk and/or removable media, for storing instructions and/or data.
- A network connection is provided as well.
- A display and/or a user input device such as a keyboard or mouse are optionally provided as well.
- Fig. 1 is a simplified block diagram of a computing system;
- Fig. 2 is a block diagram of an implementation of a system comprising a software application and associated hardware that allows a user to maintain coherency between names and locations of objects;
- Fig. 3 is a block diagram illustrating a logical relationship between a file path, a file inode, and file data;
- Fig. 4 is a block diagram illustrating an example of a file system comprising a tree that includes multiple directories and files, in addition to associated inodes;
- Fig. 5 is a diagram showing illustrative directory data, and more particularly, the data included within a directory of the file system tree of Fig. 4;
- Fig. 6 shows an FS object 600 that is similar to the FS object of Fig. 3, symbolically undergoing an operation to change a location;
- Fig. 7 shows a file system 700 comprising the file system of Fig. 4, but after an object is renamed and assigned a new location/path;
- Fig. 8 is a block diagram illustrating an example of a file system comprising a tree that includes multiple directories and files, in addition to associated inodes;
- Fig. 9 is a table comprising an ID map to facilitate the replication of the file system of Fig. 8;
- Fig. 10 is a block diagram of a system that performs multiway, or n-way, synchronization for a plurality of file system trees, with arrows illustrating the order in which changes to the name or location of an object in one file system tree are applied to another during a replication operation;
- Fig. 11 is an embodiment of a method of maintaining coherence in file system objects; and
- Fig. 12 illustrates an example architecture of a computing device, and more particularly, a block diagram of a system that may optionally be utilized to perform one or more aspects of techniques described herein.
- The present invention, in some embodiments thereof, relates to a computing system and, more particularly but not exclusively, to file system replication and synchronization processes.
- An embodiment of a system maintains coherence in file system objects using mapping records that associate objects in corresponding file system trees using unique object identifiers (IDs) to update a modified name or location. For instance, the mapping of object IDs may be used to copy or update a name or hard-link path of a first object in a first file system tree according to a corresponding object in a second file system.
- An embodiment of the system scans a source file system tree.
- The system may then copy each object (e.g., a file or a folder) to a corresponding target location within a target file system tree.
- An embodiment of the system saves a mapping between the unique ID of each source object and the corresponding target object ID. Examples of object IDs include inode numbers and file IDs.
- The system may copy attributes as well.
- An example of an attribute is a last-modification date. In this manner, the attributes are identical in both the source and target file systems.
- An embodiment of the system may scan both current source and target file system trees. For each file system tree, the system may save mapping records between each object ID and the corresponding object ID, such as in an ID-map.
- The mapping records may additionally include paths associated with objects in each tree, such as in path-maps. In an example where there are multiple hard links associated with an object, the mapping of the object ID will include multiple, corresponding paths.
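The path-map described above, in which an object ID with several hard links maps to several paths, can be sketched as follows. This is a minimal illustration, assuming a POSIX-like file system where the inode number (`st_ino`) serves as the object ID; the function name `build_path_map` is hypothetical and not part of the disclosure.

```python
import os
from collections import defaultdict

def build_path_map(root):
    """Map each object ID (inode number) to the sorted relative paths reaching it."""
    path_map = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: describe the entry itself, not a symlink target
            path_map[st.st_ino].append(os.path.relpath(full, root))
    # An object with multiple hard links ends up with multiple paths.
    return {ino: sorted(paths) for ino, paths in path_map.items()}
```

Running the same scan on the source and target trees yields the two path-maps that an ID-map can then relate to each other.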
- The system may identify any changes done on either file system tree since the latest synchronization. The system may then determine how to merge or otherwise synchronize the trees. According to a particular embodiment, the system may choose or prompt a user to choose the synchronization direction (e.g., source to target or vice versa).
- The system may check to determine if the source object ID and the corresponding target object ID exist in both file system trees. If so, the system may determine if the path and attributes of the target and source objects are equal. Where they are determined to be equal, the system may take no action towards synchronizing the objects. Where the path or attributes of the target and source objects are alternatively different, then the system may initiate a synchronization operation.
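The equality check described above can be sketched as a comparison of per-object records looked up through the ID-map. This is an illustrative sketch only: the `ObjectState` fields and the function name `needs_sync` are assumptions, and the attributes compared here (modification time and size) are examples rather than a list given in the disclosure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectState:
    paths: tuple   # all paths (hard links) of the object, sorted
    mtime: float   # last-modification time
    size: int      # size in bytes

def needs_sync(id_map, source_objects, target_objects):
    """Return the source IDs whose mapped target differs in path or attributes.

    id_map maps source object ID -> target object ID; the two *_objects
    dictionaries map an object ID to its ObjectState in that tree.
    """
    differing = []
    for src_id, tgt_id in id_map.items():
        src = source_objects.get(src_id)
        tgt = target_objects.get(tgt_id)
        if src is None or tgt is None:
            continue  # object missing on one side: a creation or deletion case
        if src != tgt:  # dataclass equality compares paths and attributes
            differing.append(src_id)
    return differing
```

Objects for which the comparison succeeds are left alone, which is what avoids deleting and re-copying unchanged data.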
- Examples of synchronization operations may include copying the source version, a latest version, or some combination that includes merging the two versions.
- Other settings may include removing files that have been deleted since a last synchronization operation, or always synchronizing to the source location.
- The system may detect that an object was moved at one or both locations by determining that the associated paths are different. Where an object includes multiple paths, those paths that differ as between file system trees may be deleted and hard links may be created for new paths.
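The hard-link handling described above can be sketched as a set difference between the source and target paths of one object, followed by link creation and removal on the target. The function names `plan_link_changes` and `reconcile_links` are hypothetical, and the sketch assumes the target object already has at least one path and that all paths are relative to a common target root.

```python
import os

def plan_link_changes(source_paths, target_paths):
    """Return (paths to create, paths to delete) on the target for one object."""
    src, tgt = set(source_paths), set(target_paths)
    return sorted(src - tgt), sorted(tgt - src)

def reconcile_links(target_root, source_paths, target_paths):
    """Make the target object's hard links match the source paths without re-copying data."""
    to_create, to_delete = plan_link_changes(source_paths, target_paths)
    # Any existing link to the target object can serve as the link source.
    existing = os.path.join(target_root, sorted(target_paths)[0])
    for rel in to_create:
        dest = os.path.join(target_root, rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        os.link(existing, dest)  # new hard link to the same object
    for rel in to_delete:        # remove stale paths only after the new links exist
        os.unlink(os.path.join(target_root, rel))
```

Creating the new links before removing the old ones keeps the object reachable throughout, so its data never has to be copied again.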
- The system may determine that an object was created since a last synchronization operation by detecting that an object exists only in one location. Alternatively, an object may have been deleted since the last synchronization operation.
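One way to tell a creation from a deletion when an object exists in only one location is to consult the ID-map saved at the previous synchronization. The sketch below rests on that assumption, which the text above does not spell out: an ID present in a tree but absent from the saved map is treated as created, and an ID present in the saved map but absent from the tree is treated as deleted. The function name `classify_changes` is hypothetical.

```python
def classify_changes(id_map, source_ids, target_ids):
    """Classify one-sided objects using the ID-map from the last synchronization.

    id_map maps source object ID -> target object ID as recorded at the last sync;
    source_ids and target_ids are the sets of IDs found by the current scans.
    """
    mapped_targets = set(id_map.values())
    return {
        "created_on_source": sorted(s for s in source_ids if s not in id_map),
        "deleted_on_source": sorted(s for s in id_map if s not in source_ids),
        "created_on_target": sorted(t for t in target_ids if t not in mapped_targets),
        # Reported by source ID: the mapped target object no longer exists.
        "deleted_on_target": sorted(s for s, t in id_map.items() if t not in target_ids),
    }
```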
- The system may identify objects that were changed (in addition or as an alternative to comparing object attributes) by calculating and including in the path-maps a checksum value for each object.
- The checksum value may be included by the system as a parameter for object comparison.
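A checksum suitable for such a comparison can be computed by hashing the file content in chunks. The disclosure does not name a checksum algorithm; SHA-256 is used here purely as an assumption, and the function name `file_checksum` is hypothetical.

```python
import hashlib

def file_checksum(path, chunk_size=1 << 20):
    """Return a hex digest of the file content, read in fixed-size chunks."""
    digest = hashlib.sha256()  # assumed algorithm; any stable hash would do
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two objects whose checksums match can be treated as having identical content even when their paths differ, so only the name or location needs updating.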
- An embodiment of the system may allow both the target file system and the source file system to be changed before a resynchronization.
- The system may prompt the user to choose a resynchronization result. For example, the user may be prompted to choose the file system on the source side to be copied, thus making the target file system identical to the source file system.
- The user may be prompted to have the system choose a latest (e.g., most recently modified) copy as between corresponding objects of the target and source file systems. That is, if the latest update to one of the corresponding objects was on the target file system, then the copy of the object on the target file system is used to update the corresponding object on the source file system. Conversely, if the latest modification was on the object on the source file system, then the copy of the object on the source file system is used to update the corresponding object on the target file system.
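The "latest copy wins" choice described above can be sketched for a single pair of corresponding files. This is an illustrative sketch, assuming that the modification time reported by `os.stat` is the attribute used to decide which copy is latest; the function name `sync_latest` and its return labels are hypothetical.

```python
import os
import shutil

def sync_latest(source_path, target_path):
    """Copy the more recently modified file over the other and report the direction."""
    src_mtime = os.stat(source_path).st_mtime_ns
    tgt_mtime = os.stat(target_path).st_mtime_ns
    if src_mtime > tgt_mtime:
        shutil.copy2(source_path, target_path)  # copy2 also copies the timestamps
        return "source_to_target"
    if tgt_mtime > src_mtime:
        shutil.copy2(target_path, source_path)
        return "target_to_source"
    return "none"  # equal modification times: nothing to do
```

Because `shutil.copy2` carries the modification time across, a second call on the same pair finds equal timestamps and takes no action.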
- The system may prompt the user to select a correct version of the object. For instance, where modifications were made to multiple corresponding objects before a resynchronization, a user may select the changes made to one of the objects as the version to be copied to the other object(s).
- A particular embodiment of the system may create multiple copies of a copy (e.g., a master copy) of a designated source file system. For example, multiple copies of the master copy may be desired at multiple different remote locations.
- The system may maintain mapping records (e.g., an ID-map and/or path-map) for each copy from the time the master copy was copied to each target.
- The system may perform the single copy algorithm described herein for each.
- The synchronization of different copies may be done at different times. Each synchronization may be accomplished according to the target specific objects and a current state of the source object (e.g., master copy).
- An object ID may be reused. For instance, after synchronizing source and target file systems, an object (i.e., object A) on the source file system may be deleted. This deletion may functionally free up the object ID formerly associated with object A. When a new object is created (i.e., object B), the system may reassign the object ID of object A to object B.
- The system may encounter object B in its scan.
- The system may check in the ID-map and see that its object ID exists and may locate its corresponding target object (e.g., object A on the target file system).
- The system may move object A on the target file system to the path of object B.
- The system may also recognize that this object was changed, since its attributes differ. For example, a creation and a modification time would be later than those of deleted object A. Since the attributes differ, the system may synchronize the file data by copying the object B file content to the corresponding object B on the target file system.
- The target file system includes object A with an updated path(s) and content.
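The ID-reuse scenario above can be traced with a small in-memory model. Everything in this sketch is an assumption made for illustration: the dictionaries stand in for the two file system trees, the record fields (`path`, `mtime`, `content`) are hypothetical, and `sync_object` is not a function named in the disclosure.

```python
def sync_object(id_map, source, target, src_id):
    """Bring the mapped target object in line with the source object.

    source and target map an object ID to a record with "path", "mtime" and
    "content"; id_map maps source object ID -> target object ID.
    """
    src = source[src_id]
    tgt = target[id_map[src_id]]
    if tgt["path"] != src["path"]:
        tgt["path"] = src["path"]        # move the target object; no delete and re-copy
    if tgt["mtime"] != src["mtime"]:
        tgt["content"] = src["content"]  # attributes differ, so the file data is copied
        tgt["mtime"] = src["mtime"]

# Object A (source ID 7) was deleted and its ID was reassigned to new object B.
source = {7: {"path": "/docs/b.txt", "mtime": 200, "content": "B"}}
target = {42: {"path": "/old/a.txt", "mtime": 100, "content": "A"}}
sync_object({7: 42}, source, target, 7)
```

After the call, the target object that used to hold object A carries object B's path and content, matching the end state described above.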
- Certain embodiments of the system enable multiway, or n-way, synchronization. For instance, the system may synchronize multiple file system copies that have been independently changed, with all changes being merged to all copies.
- Synchronization may begin with object copy number 1 and synchronize it with object copy number 2 as described herein using mapping records. After the synchronization, object copies 1 and 2 are identical and both contain changes that were made on both objects. The system may then synchronize object copy 2 with object copy 3, continuing in the same manner until synchronizing object copy N-1 with object copy N. The resulting object copy N may thus include combined changes from all the other object copies. The system may then synchronize object copy N with object copy 1.
- The system may loop synchronization processes by synchronizing object copy 1 with 2 and so on, including synchronizing object copy N-2 with object copy N-1. In this manner, all the object copies may be identical and include changes that were made on all other object copies.
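The ordering of pairwise synchronizations described above can be written out as a schedule. In this sketch the function names `nway_schedule` and `simulate` are hypothetical, and a pairwise synchronization is modeled as a plain set union of changes, which is an assumption that ignores conflicts between copies.

```python
def nway_schedule(n):
    """Return the ordered (copy, copy) pairs for n-way synchronization."""
    if n < 2:
        raise ValueError("n-way synchronization needs at least two copies")
    forward = [(i, i + 1) for i in range(1, n)]          # 1-2, 2-3, ..., (N-1)-N
    wrap = [(n, 1)]                                      # N back to 1
    second_pass = [(i, i + 1) for i in range(1, n - 1)]  # 1-2, ..., (N-2)-(N-1)
    return forward + wrap + second_pass

def simulate(changes_per_copy):
    """Model each pairwise sync as merging the change sets of the two copies."""
    copies = {i + 1: set(c) for i, c in enumerate(changes_per_copy)}
    for a, b in nway_schedule(len(copies)):
        merged = copies[a] | copies[b]
        copies[a], copies[b] = merged, set(merged)
    return copies
```

For four copies the schedule is 1-2, 2-3, 3-4, 4-1, 1-2, 2-3: the forward pass accumulates every change in copy N, and the wrap-around plus second pass carries the combined result back to the earlier copies.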
- The system may synchronize the 1st (e.g., master) copy with a 2nd copy based on a defined policy.
- The system may additionally synchronize the 1st copy with the 3rd copy according to the defined policy. If a file is copied from the 3rd copy to the 1st copy, then the process may repeat by synchronizing the 1st with the 2nd copy according to the defined policy. Where desired, synchronization may occur only for the files that were copied from the 3rd copy to the 1st (e.g., master) copy.
- The 2nd copy may hold, access, or otherwise include object IDs corresponding to the 1st object IDs.
- The 3rd copy may include object IDs corresponding to the 2nd object IDs.
- A policy may be defined by the user. For example, the system may prompt the user to define a policy that selects one copy as the master and that makes all other copies identical to the master copy. Another policy may cause the system to select from each copy the latest version. The system may use a copy of the latest version as a master copy with which to update all others. In response to a conflict, the system of another embodiment may prompt the user to select which copy will function as the master copy with which to update all others.
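The policies described above can be sketched as a small selection function. The policy labels `"fixed_master"` and `"latest"`, the function name `choose_master`, and the `prompt` callback are all hypothetical; treating a tie between equally recent copies as the "conflict" that triggers a user prompt is likewise an assumption.

```python
def choose_master(copies, policy, master=None, prompt=None):
    """Pick the copy that all other copies will be updated from.

    copies maps a copy name to its last-modification time; prompt is a callback
    that receives the conflicting copy names and returns the user's choice.
    """
    if policy == "fixed_master":
        return master  # the user designated one copy as the master
    if policy == "latest":
        newest = max(copies.values())
        candidates = sorted(n for n, m in copies.items() if m == newest)
        if len(candidates) == 1:
            return candidates[0]
        return prompt(candidates)  # conflict: let the user decide
    raise ValueError(f"unknown policy: {policy}")
```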
- The methods and apparatus of exemplary embodiments may take the form, at least partially, of program code (i.e., instructions) embodied in tangible media, such as disks, CD-ROMs, hard drives, random access or read-only memory, or any other machine-readable storage medium, including transmission medium.
- The media can include portions in different system components, such as memory in a host, an application instance, and/or a management station.
- The methods and apparatus may be embodied in the form of program code that may be implemented such that when the program code is received and loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the embodiments described herein.
- When implemented on a processor, the program code combines with the processor to provide a unique apparatus that operates analogously to specific logic circuits.
- The program code (software-based logic) for carrying out the method is embodied as part of the system described below.
- FIG. 1 is a simplified illustration of a system 100 configured to maintain coherence in file system objects. More particularly, FIG. 1 shows an example environment 100 in which host computing devices (“hosts”), shown as devices 104, 106, 108, and 110, access a data storage system 116 over a network 114.
- The data storage system 116 includes a storage processor (SP) 118 and memory 120.
- An embodiment of the data storage system 116 may include multiple SPs.
- multiple SPs may be provided as circuit board assemblies, or blades, which plug into a chassis, which encloses and cools the SPs.
- the chassis has a backplane for interconnecting the SPs, and additional connections may be made among SPs using cables.
- FIG. 1 shows only a single data storage system 116, it is understood that many operations described herein involve activities take place between multiple data storage systems, i.e., between a source data storage system and a target data storage system. As such, an additional storage system 122 is shown in dashed line.
- the source and target systems may be connected via the network 114 or via any suitable means.
- the particular construction shown for the data storage system 116 is intended to be representative of both the source and the destination, although it should be understood that the source and the target systems may vary in their details.
- the network 114 can be any type of network or combination of networks, such as a storage area network (SAN), a local area network (LAN), a wide area network (WAN), the Internet, and/or some other type of network or combination of networks, for example.
- the hosts 104, 106, 108, and 110 may connect to the SP 118 using various technologies, such as Fibre Channel, iSCSI, NFS, SMB 3.0, and CIFS, for example. Any number of hosts 104, 106, 108, and 110 may be provided, using any of the above protocols, some subset thereof, or other protocols besides those shown.
- the SP 118 is configured to receive IO requests according to both block-based and file-based protocols and to respond to such IO requests by reading or writing the storage 120 or memory 130.
- the SP 118 is seen to include one or more communication interfaces 124 and a set of processing units 126.
- the communication interfaces 124 include, for example, SCSI target adapters and network interface adapters for converting electronic and/or optical signals received over the network 114 to electronic form for use by the SP 118.
- the set of processing units 126 includes one or more processing chips and/or assemblies. In a particular example, the set of processing units 126 includes numerous multi-core CPUs.
- the memory 130 includes both volatile memory (e.g., RAM), and non-volatile memory, such as one or more ROMs, disk drives, solid state drives, and the like.
- the set of processing units 126 and the memory 130 together form control circuitry, which is constructed and arranged to carry out various methods and functions as described herein, e.g., alone or in coordination with similar control circuitry on another data storage system.
- the memory 130 includes a variety of software constructs realized in the form of executable instructions. When the executable instructions are run by the set of processing units 126, the set of processing units 126 are caused to carry out the operations of the software constructs.
- the memory 130 typically includes many other software constructs, such as an operating system 128, as well as various applications, processes, and daemons.
- the memory 130 includes a replication manager 132 that controls the establishment of replication settings on representative data objects 134, 140, and 142.
- the replication manager 132 establishes replication settings on a per-data-object basis, conducts replication sessions, and orchestrates replication activities, including recovery and failover.
- the replication manager 132 works in coordination with a replication appliance 138.
- the replication appliance 138 assists in performing continuous replication with another data storage system (e.g., with a destination data storage system), which may be located remotely.
- the replication appliance 138 takes the form of a separate hardware unit. Any number of such hardware units may be provided, and they may work together, e.g., in a cluster.
- the operating system 128 additionally includes objects 134, 140, and 142.
- an object may include an inode or a file ID.
- the objects 134, 140, and 142 may be controlled by separate instances of a kernel 150 of the operating system 128.
- Illustrative object 134 includes mapping records 144 that are used to identify and define a location of the object 134 to maintain coherency between file systems, as explained herein. To this end, the mapping records may include object IDs 146 and paths 148.
- Fig. 2 is a block diagram of an implementation of a system 200 comprising a software application and associated hardware that maintain coherence in FS objects.
- the illustrative system 200 of Fig. 2 may have functional components and software in common with the storage processor 118 of Fig. 1.
- the system 200 may be configured to modify a name or a location of an FS object to be synchronized with that of another FS object.
- the implementation of the system 200 includes multiple modules 204 and 206-224 executed by a processor 202 to manage memory processes prior to and during a synchronization operation 214.
- the system 200 includes a memory 204 having a plurality of mapping records 207 associating a plurality of IDs 206 to a plurality of FS objects 209.
- the processor 202 may be configured to access the memory 204 to identify an incoherency between a first FS object 211 and a second FS object 213 of the plurality of FS objects 209.
- the processor may further be configured to update a respective mapping record from the plurality of mapping records 207 to associate the second ID 219 with a modified name of a plurality of names 212 or a modified location of a plurality of locations 210 corresponding to the identified incoherence 223.
- identifying the incoherency 223 includes comparing one or more paths 208 associated with the first ID 217, where the plurality of corresponding paths 208 is associated with multiple hard links of the modified location of the second FS object 219. According to some embodiments of the system 200, identifying the incoherency 223 includes comparing an attribute 220 associated with the first ID 217 to a corresponding attribute associated with the second ID 219.
- identifying the incoherency may include comparing a first checksum value of a plurality of checksum values 218 associated with a first path 208 of the first ID 217 to a second checksum value associated with a second path of the second ID 219.
- the processor 202 of an embodiment of the system 200 may be further configured to prompt a user for user input 224 to select a correct version to be copied as between the first FS object and the second FS object. For example, a user may designate the first FS object 211 as a master copy to be copied to multiple FS objects.
- the processor may be further configured to store the plurality of mapping records in a mapping table 216.
- the mapping table 216 and mapping records 207 are depicted separately, but other embodiments may combine functionalities.
- the processor 202 of another embodiment may be configured to store the plurality of mapping records 207 in at least one of an ordered log file (not shown), a key-value store (not shown), a database, the first FS object 211, the second FS object 219, or as a special extended attribute of a plurality of attributes 220 on the second ID 219.
- the coherency determination algorithm 222 may scan memories prior to a synchronizing operation to determine the incoherencies 223.
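The comparison and record update described for the system 200 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `FsObject` and `MappingRecord` types, the function names, and the choice of SHA-256 as the checksum are all assumptions, and the IDs and paths are sample values only.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class FsObject:
    object_id: int   # unique ID of the object (e.g., an inode number)
    path: str        # current name and location
    data: bytes      # file data


@dataclass
class MappingRecord:
    first_id: int     # ID of the first FS object
    second_id: int    # ID of the corresponding second FS object
    second_path: str  # name/location currently associated with the second ID


def checksum(data: bytes) -> str:
    """Checksum of the object data (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def find_incoherency(first: FsObject, second: FsObject) -> List[str]:
    """List which aspects differ between the two objects."""
    differences = []
    if first.path != second.path:
        differences.append("path")
    if checksum(first.data) != checksum(second.data):
        differences.append("checksum")
    return differences


def update_record(record: MappingRecord, first: FsObject) -> MappingRecord:
    """Associate the second ID with the modified name/location."""
    record.second_path = first.path
    return record


first = FsObject(object_id=4, path="/B/F", data=b"same bytes")
second = FsObject(object_id=14, path="/C", data=b"same bytes")
record = MappingRecord(first_id=4, second_id=14, second_path=second.path)

diffs = find_incoherency(first, second)
if "path" in diffs:
    update_record(record, first)
```

Here only the path differs, so the record for the second ID is updated to the new location while the data is left alone.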
- FIG. 3 is a block diagram illustrating logical relationships between aspects of an illustrative FS object 300.
- the figure shows logical connections between a file path 302, a file inode 304, and file data 306 of the FS object 300.
- the metadata is held in a structure that is called an inode.
- the inode 304 includes a unique ID and attributes.
- the illustrative FS object 300 of Fig. 3 includes three parts. Namely, the FS object 300 includes: the file data 306, the metadata (e.g., attributes) of the inode 304 that holds the files/directories details (e.g., an identifier, a size, a timestamp, a name, a type, a pointer to the data, etc.), and a file path 302 (i.e., location).
- each FS object (e.g., a file or a directory) on a file system has a unique identifier. This identifier remains with the object throughout its existence even if the object is renamed, moved, has its attributes or data changed, or is associated with additional single or multiple hard links.
- Object storage includes one or more paths to each FS object and a unique identifier for each FS object.
- File systems such as the file system of the FS object 300, may include files and directories (e.g., folders).
- a directory may include files or other directories. In such an instance, these directories, themselves, may include files and directories.
- a location of each file system may begin with its root directory (i.e., root node).
- Fig. 4 is a block diagram illustrating an example of a file system 400 comprising a tree that includes multiple directories and files, in addition to associated inodes. More particularly, the file system tree 400 includes a root directory 402 and an associated 1 st inode 404. Under the root directory 402, the file system tree 400 includes directory A 406, directory B 408, and directory C 410. The directories 406, 408, and 410 are associated respectively with inodes 412, 414, and 416.
- directory A 406 includes a 1 st file 418 associated with a 6 th inode 420, and a 2 nd file 422 associated with a 7 th inode 424.
- Directory C 410 includes directory D 426 and associated inode 428, as well as a 5 th file 430 associated with inode 432.
- Directory C 410 also includes directory E 434, which is associated with inode 436.
- the directory D 426 may include a 3 rd file 438 that is associated with a 10 th inode 440.
- a 4 th file 442, associated with an 11 th inode 444, may additionally be included under the D directory 426.
- a program may specify a path through the file system tree, beginning from the root directory 402 and proceeding throughout the directory tree until reaching the file itself.
- the path of the 3 rd file 438 in Fig. 4 may be: "/C/D/File 3".
- Each directory 402, 406, 408, 410, 426, and 434 holds a list of its directory entries (e.g., the files and directories it includes). Each entry may include a name and an identifier (e.g., an inode in the file system of Fig. 4). Each directory 402, 406, 408, 410, 426, and 434 may additionally include an entry pointing to its own inode as well as to its parent directory inode.
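Path resolution over such directory entries can be sketched as a walk from the root inode, one name at a time. The table below mirrors the tree of Fig. 4 using the inode numbers given in the text; the `DIRECTORIES` layout, the `resolve` helper, and the file names other than "File 3" are assumptions made for illustration.

```python
from typing import Dict

# Directory entries keyed by directory inode: entry name -> inode.
DIRECTORIES: Dict[int, Dict[str, int]] = {
    1: {"A": 2, "B": 3, "C": 4},             # root directory
    2: {"File 1": 6, "File 2": 7},           # directory A
    3: {},                                   # directory B
    4: {"D": 5, "File 5": 8, "E": 9},        # directory C
    5: {"File 3": 10, "File 4": 11},         # directory D
    9: {},                                   # directory E
}
ROOT_INODE = 1


def resolve(path: str) -> int:
    """Follow a path from the root directory and return the final inode."""
    inode = ROOT_INODE
    for name in path.strip("/").split("/"):
        if not name:          # the path "/" names the root itself
            continue
        inode = DIRECTORIES[inode][name]
    return inode


file3_inode = resolve("/C/D/File 3")
```

Because each step only consults a name-to-inode entry, renaming or moving a directory changes the entries but not the inodes they point to.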
- Fig. 5 is a diagram 500 showing illustrative directory data, and more particularly, the data included within the directory C 410 of the file system tree 400 of Fig. 4.
- directory C 502 includes a list of its own directories D and E, as well as its own file (i.e., the file 5).
- the inodes 5, 8, and 9 of each directory and file are additionally included on the directory C 502 in an association with their respective file and directories.
- the directory C 502 additionally stores its own inode 4, as well as the inode 1 of its corresponding root directory.
- the directory structure 502 points to a structure 504 that includes inode 5 and that corresponds to directory D.
- the structure 504 includes information pertaining to permissions, an owner/group ID, a directory, and data blocks.
- the structure 504 of directory D additionally points to data blocks 506, 508, and 510.
- the structure 512 includes inode 9 and may correspond to directory E 434 of Fig. 4, and another pointer may link to information 430 (not shown) pertaining to the 5 th file 430 of Fig. 4.
- File systems may have operations that may be done on files and directories, such as create, open, delete, write, and append operations.
- One of these operations is the move operation, which allows the user to change the location of the file/directory.
- the move operation changes only the file/directory path (e.g., not the file data). Put another way, the path of the file changes after the move operation, but the data, itself, and its inode and other attributes are not relocated or changed. By enabling the file data and inode to remain, processing cycles are saved over conventional systems that would delete and replace the data. After the move operation, the same file attributes and data may be accessed using the new path.
- FIG. 6 shows an FS object 600 that is similar to the FS object 300 of Fig. 3, symbolically undergoing an operation to change a location. That is, an old path 602 is deleted (as designated by the “X”), and a new path 608 is assigned to the file inode 604 and file data 606.
- file data is associated with at least one path. However, the file data 606 may be accessed using several paths. Each of these paths is called a hard link. After creating the initial file and its data, it is possible to create more hard links to that FS object using different paths.
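The behavior of move operations and hard links can be observed with standard library calls. The sketch below assumes a POSIX-like file system in which a rename within one file system preserves the inode and hard links are supported; the directory and file names are arbitrary.

```python
import os
import tempfile


def demo_move_and_hard_link() -> dict:
    """Move a file, add a hard link, and report what stayed the same."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "B"))
        old_path = os.path.join(root, "file.txt")
        new_path = os.path.join(root, "B", "file.txt")
        with open(old_path, "w") as handle:
            handle.write("payload")
        inode_before = os.stat(old_path).st_ino

        # Move: only the path changes; inode and data stay in place.
        os.rename(old_path, new_path)
        inode_after = os.stat(new_path).st_ino

        # Hard link: a second path to the same inode and data.
        link_path = os.path.join(root, "alias.txt")
        os.link(new_path, link_path)
        link_stat = os.stat(link_path)
        with open(link_path) as handle:
            content = handle.read()

        return {
            "same_inode_after_move": inode_before == inode_after,
            "old_path_exists": os.path.exists(old_path),
            "same_inode_via_link": link_stat.st_ino == inode_after,
            "link_count": link_stat.st_nlink,
            "content": content,
        }


result = demo_move_and_hard_link()
```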
- Fig. 7 shows a file system 700 comprising the file system 400 of Fig. 4, but after an object is renamed and assigned a new location/path.
- the directory C 410 has been renamed directory F and has been moved under the directory B 408.
- the numbering in Fig. 7 is carried over from Fig. 4 to illustrate the renaming/restructuring operations, as well as to show that associations between inodes and files/directories remain unchanged during the operations.
- the block diagram illustrates an example of a file system 700 comprising a tree that includes multiple directories and files, in addition to associated inodes. More particularly, the file system tree includes a root directory 402 and an associated 1 st inode 404. Under the root directory 402, the file system tree includes directory A 406 and directory B 408. Directory F 410 has been moved under directory B 408. As shown in Fig. 7, the A 406 directory continues to be associated with the 2 nd inode 412, just as the directories 408 and 410 continue to be associated respectively with the 3 rd inode 414 and the 4 th inode 416.
- directory A 406 includes a 1st file 418 associated with a 6 th inode 420, and a 2 nd file 422 associated with a 7 th inode 424.
- Renamed directory F 410 includes directory D 426 and associated inode 428, as well as a 5 th file 430 associated with inode 432.
- Renamed directory F 410 also includes directory E 434, which is associated with inode 436.
- the directory D 426 may include a 3 rd file 438 that is associated with a 10 th inode 440.
- a 4 th file 442, associated with an 11th inode 444, may additionally be included under the D directory 426.
- Fig. 8 is a block diagram illustrating an example of a file system 800 comprising a tree that includes multiple directories and files, in addition to associated inodes.
- the file system 800 may be a target or other replica desired to be in synchronization with the file system 400. While the other nomenclature of the files and directories remains the same as shown in the source file system 400 of Fig. 4, the inode numbering of the target file system 800 in Fig. 8 is different, reflecting that a unique ID number is associated with each directory/file.
- the file system tree includes a root directory 802 and an associated 11 th inode 804. Under the root directory 802, the file system tree includes directory A 806, directory B 808, and directory C 810. The directories 806, 808, and 810 are associated respectively with inodes 812, 814, and 816.
- directory A 806 includes a 1 st file 818 associated with a 16 th inode 820, and a 2 nd file 822 associated with a 17 th inode 824.
- Directory C 810 includes directory D 826 and associated inode 828, as well as a 5th file 830 associated with an 18th inode 832.
- Directory C 810 also includes directory E 834, which is associated with a 19th inode 836.
- the directory D 826 may include a 3rd file 838 that is associated with a 20th inode 840.
- a 4th file 842, associated with a 21st inode 844, may additionally be included under the D directory 826.
- Fig. 9 is a table comprising an ID map 900 for the replication of the file system 800 of Fig. 8.
- the ID map 900 may comprise a portion of the mapping records 144 of Fig. 1.
- a first column 902 displays identifiers (e.g., inode numbers) associated with objects of the source file system 400 of Fig. 4.
- a second column 904 displays identifiers (i.e., inode numbers) associated with objects of the target file system 800 of Fig. 8.
- the first entry 906 in the first row corresponds to the 1 st inode 404 of the root directory 402 of the file system 400 of Fig. 4.
- the second entry 908 in the first row corresponds to the 11 th inode 804 of the root directory 802 of the file system 800 of Fig. 8.
- the ID map 900 logically associates the 1 st inode of the source file system 400 with the 11 th inode of the target file system 800.
- the first entry 910 in the second row corresponds to the 2 nd inode 412 of directory A 406 of the file system 400 of Fig. 4.
- the second entry 912 in the second row corresponds to the 12 th inode 812 of the directory A 806 of the file system 800 of Fig. 8.
- the ID map 900 thus associates the 2 nd inode of the source file system 400 with the 12th inode of the target file system 800.
- the ID map 900 likewise associates inode 3 with inode 13, inode 4 with inode 14, and so on.
- the ID map 900 may be used when a resynchronizing operation begins, to apply the changes to names and locations on the target file system 800 so that it matches the source file system 400. More particularly, after the move shown in Fig. 7, inode 4 of the source file system will appear at the /B/F path, while its corresponding target inode 14 will still appear at the /C path. Since the synchronization direction may be from the source file system 400 to the target file system 800, the synchronization operation may include accessing the ID map 900 to move the /C directory (inode 14) to its corresponding inode location /B/F on the target file system 800.
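The use of the ID map during resynchronization can be sketched as a planning step that compares paths per mapped ID. The inode numbers follow Figs. 4, 7, and 8; the dictionary layout, the `plan_moves` helper, and the file names are hypothetical. Moving a directory is assumed to carry its descendants along, so only the top-most move is planned.

```python
from typing import Dict, List, Tuple

# ID map of Fig. 9: source inode n corresponds to target inode n + 10.
ID_MAP: Dict[int, int] = {n: n + 10 for n in range(1, 12)}

# Source tree after directory C was renamed F and moved under B (Fig. 7).
SOURCE_PATHS: Dict[int, str] = {
    1: "/", 2: "/A", 3: "/B", 4: "/B/F", 5: "/B/F/D",
    6: "/A/File 1", 7: "/A/File 2", 8: "/B/F/File 5", 9: "/B/F/E",
    10: "/B/F/D/File 3", 11: "/B/F/D/File 4",
}

# Target tree as it stood at the last synchronization (Fig. 8).
TARGET_PATHS: Dict[int, str] = {
    11: "/", 12: "/A", 13: "/B", 14: "/C", 15: "/C/D",
    16: "/A/File 1", 17: "/A/File 2", 18: "/C/File 5", 19: "/C/E",
    20: "/C/D/File 3", 21: "/C/D/File 4",
}


def plan_moves(source: Dict[int, str], target: Dict[int, str],
               id_map: Dict[int, int]) -> List[Tuple[int, str, str]]:
    """Return (target_id, old_path, new_path) moves, parents first."""
    working = dict(target)  # simulated target state while planning
    moves = []
    by_depth = sorted(source.items(), key=lambda item: item[1].count("/"))
    for source_id, new_path in by_depth:
        target_id = id_map[source_id]
        old_path = working[target_id]
        if old_path == new_path:
            continue
        moves.append((target_id, old_path, new_path))
        # Moving a directory also relocates everything beneath it.
        prefix = old_path + "/"
        for other_id, other_path in list(working.items()):
            if other_id == target_id:
                working[other_id] = new_path
            elif other_path.startswith(prefix):
                working[other_id] = new_path + "/" + other_path[len(prefix):]
    return moves


moves = plan_moves(SOURCE_PATHS, TARGET_PATHS, ID_MAP)
```

A single move of target inode 14 from /C to /B/F brings every descendant path into line, which is why no data needs to be copied.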
- Fig. 10 is a block diagram of a system 1000 that performs multiway, or n-way, synchronization for a plurality of file system trees 1002, 1004, 1006, and 1008.
- Arrows 1010, 1012, 1014, and 1016 illustrate an order in which the changes in the name or location of an object in one file system tree may be updated according to one another during an embodiment of a resynchronization operation.
- the system 1000 may synchronize the multiple file system copies that have been independently changed, with all changes being merged to all copies. For instance, synchronization may begin with synchronizing an object in copy file system tree 1002 with file system tree 1004 as described herein using mapping records. After the synchronization, file system trees 1002 and 1004 are identical and both contain changes that were made on both objects.
- the system 1000 may then synchronize file system tree 1004 with file system tree 1006, continuing in the same manner by synching file system tree 1006 with file system tree 1008.
- the resulting file system tree 1008 may thus include combined changes from all the other file system trees 1002, 1004, and 1006.
- the system 1000 may then synchronize file system tree 1008 with file system tree 1002. In this manner, all the object copies may be identical and include changes that were made on all other object copies.
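The pairwise sequence above can be sketched as a pass around a ring of trees. Several simplifications are assumed here and are not taken from the source: every tree uses the same object IDs, each entry carries a version counter so the newer entry wins, deletions are not handled, and the pass is repeated until no tree changes.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, str]   # (version counter, path); hypothetical layout
Tree = Dict[int, Entry]   # object ID -> entry


def merge_pair(left: Tree, right: Tree) -> bool:
    """Make two trees identical, keeping the newer entry per object."""
    changed = False
    for object_id in set(left) | set(right):
        present = [entry for entry in (left.get(object_id), right.get(object_id))
                   if entry is not None]
        newest = max(present, key=lambda entry: entry[0])
        for tree in (left, right):
            if tree.get(object_id) != newest:
                tree[object_id] = newest
                changed = True
    return changed


def sync_ring(trees: List[Tree]) -> int:
    """Synchronize neighbors around the ring until nothing changes."""
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for index in range(len(trees)):
            neighbor = trees[(index + 1) % len(trees)]
            if merge_pair(trees[index], neighbor):
                changed = True
    return passes


trees: List[Tree] = [{1: (1, "/A/File 1")} for _ in range(4)]
trees[0][1] = (2, "/B/File 1")   # object 1 moved in the first copy
trees[3][2] = (1, "/A/File 9")   # object 2 created in the last copy
passes = sync_ring(trees)
```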
- the system 1000 may synchronize the file system tree 1002 (e.g., a master copy) with the file system tree 1004 based on a defined policy.
- the system may additionally synchronize the file system tree 1002 with the file system tree 1006 according to the defined policy. If a file is copied from the file system tree 1006 to the file system tree 1002, then the process may repeat by synchronizing the file system tree 1002 with the file system tree 1004 according to the defined policy. Where desired, synchronization may occur only for the files that were copied from the file system tree 1008 to the file system tree 1002 (e.g., master copy).
- the file system tree 1004 may hold, access, or otherwise include object IDs corresponding to the file system tree 1002 object IDs.
- the file system tree 1006 may include object IDs corresponding to the file system tree 1004 object IDs.
- a policy may be defined by the user. For example, the system may prompt the user to define a policy that selects one copy as the master and that makes all other copies identical to the master copy. Another policy may cause the system to select from each copy the latest version. The system may use a copy of the latest version as a master copy with which to update all others. In response to a conflict, the system of another embodiment may initiate prompting the user to select between copies which copy will function as the master copy with which to update all others.
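The policies described above can be sketched as a small selection function. The `Copy` type, the policy names, and the `ask_user` callback are hypothetical; treating a conflict as "equally recent copies with different content" is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Copy:
    tree: str        # which file system tree holds this copy
    modified: float  # last-modification timestamp
    content: str


def pick_master(copies: List[Copy], policy: str,
                ask_user: Optional[Callable[[List[Copy]], Copy]] = None) -> Copy:
    """Choose the copy with which all other copies are updated."""
    if policy == "master":
        # The first copy is assumed to be the designated master.
        return copies[0]
    if policy == "latest":
        newest = max(copy.modified for copy in copies)
        tied = [copy for copy in copies if copy.modified == newest]
        # Conflict: equally recent copies disagree, so prompt the user.
        if len({copy.content for copy in tied}) > 1 and ask_user is not None:
            return ask_user(tied)
        return tied[0]
    raise ValueError(f"unknown policy: {policy}")


copies = [
    Copy(tree="1002", modified=10.0, content="draft"),
    Copy(tree="1004", modified=30.0, content="final"),
    Copy(tree="1006", modified=20.0, content="review"),
]
by_master = pick_master(copies, "master")
by_latest = pick_master(copies, "latest")
```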
- Fig. 11 is an embodiment of a method 1100 of maintaining coherence in file system objects.
- the method 1100 may be performed, for instance, by the systems 100 and 200 of Figs. 1 and 2.
- the system at 1102 may generate one or more copies of (and synchronized to) a single file tree.
- the system may copy each object (e.g., a file or folder) to a corresponding target location within a target file system tree.
- the method 1100 may include mapping a 1st ID of an object in the source file tree to a 2nd ID of a corresponding object in the target file tree.
- An embodiment of the system uses records that map between the source object IDs and each corresponding target object ID.
- the system may also copy attributes, such as a last-modification date. In this manner, the attributes are identical in both the source and target file systems.
- the ID map or other mapping records may be stored in: an ordered log file, a key-value store, a database, an object on the source and/or target side, or as a special extended attribute on the second ID.
- An embodiment of the system may at 1106 control the 1 st and 2 nd objects using different instances of a kernel.
- the kernel 150 of Fig. 1 may be used to control the objects of both the target and source file system trees.
- an embodiment of the method 1100 at 1108 may scan the 1 st and 2 nd file systems. The scan may be ultimately used to identify changes to the target and/or source file systems since the last synchronization.
- the comparison to determine any such changes is represented at 1110 of the flowchart.
- the system may save mapping records for each file system tree.
- the mapping records may include an ID-map, as well as a path-map that includes paths associated with objects in each tree.
- the system may identify any changes done on either file system tree since the latest synchronization.
- the system may determine at 1112 if the source object ID and a corresponding target object ID exist in both file system trees. If so, the system may determine if the path and attributes of the target and source objects are equal. Where they are determined to be equal, the system may take no action towards synchronizing the objects and loop back to 1108.
- the system may determine that the path or attributes of the target and source objects are alternatively different at 1112. For example, the system may detect that an object was moved on either one or both locations from determining that the associated paths are different. Where an object includes multiple paths, those paths that differ as between file system trees may be deleted and hard links may be created for new paths. In another example, the system may determine that an object was created since a last synchronization operation by detecting that an object exists only in one location. Alternatively, an object may have been deleted since the last synchronization operation.
- the method 1100 may identify at 1112 objects that were changed, in addition or alternative to the comparing object attributes. For instance, the system may calculate a checksum value for each object. The checksum value may be included by the system as a parameter for object comparison.
- the system may detect an incoherency by comparing a first location of a first FS object within a first path to a second location of a second FS object within a second path.
- the system may compare one or more paths associated with the first ID to multiple corresponding paths associated with the second ID.
- Another manner of determining an incoherency may include comparing an attribute associated with the first ID to a corresponding attribute associated with the second ID, or comparing the plurality of mapping records to a subsequently generated plurality of mapping records.
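The comparison at 1112 can be sketched as a classification over the saved ID map and the current path maps of both trees. The function name, the status labels, and the rule that an ID absent from the saved map denotes an object created since the last synchronization are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

Pair = Tuple[Optional[int], Optional[int]]  # (source ID, target ID)


def classify_changes(source: Dict[int, str], target: Dict[int, str],
                     id_map: Dict[int, int]) -> Dict[Pair, str]:
    """Classify each object as unchanged, moved, created, or deleted."""
    changes: Dict[Pair, str] = {}
    for source_id, target_id in id_map.items():
        in_source = source_id in source
        in_target = target_id in target
        if in_source and in_target:
            same = source[source_id] == target[target_id]
            changes[(source_id, target_id)] = "unchanged" if same else "moved"
        elif in_source:
            changes[(source_id, target_id)] = "deleted on target"
        elif in_target:
            changes[(source_id, target_id)] = "deleted on source"
        else:
            changes[(source_id, target_id)] = "deleted on both"
    # IDs that the saved map does not know were created after the last sync.
    for source_id in source:
        if source_id not in id_map:
            changes[(source_id, None)] = "created on source"
    mapped_targets = set(id_map.values())
    for target_id in target:
        if target_id not in mapped_targets:
            changes[(None, target_id)] = "created on target"
    return changes


id_map = {4: 14, 6: 16, 9: 19}
source = {4: "/B/F", 6: "/A/File 1", 12: "/A/File 6"}
target = {14: "/C", 16: "/A/File 1", 19: "/C/E"}
changes = classify_changes(source, target, id_map)
```

Objects reported as unchanged need no action, which matches the loop back to 1108 described above.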
- the method 1100 may include applying a defined policy to determine how to merge or otherwise synchronize the file system trees.
- the system may choose or prompt a user to choose the synchronization direction (e.g., source to target or vice versa).
- the system may prompt the user to choose a resynchronization result.
- the user may be prompted to choose the file system on the source side to be copied, thus making the target file system identical to the source file system.
- the user may be prompted to have the system choose a latest (e.g., most recently modified) copy as between corresponding objects of the target and source file systems.
- the system may prompt the user to select a correct version of the object. For instance, where modifications were made to multiple corresponding objects before a resynchronization, a user may select the changes made to one of the objects as the version to be copied to the other objects.
- the system may associate the 2 nd ID with a new name and/or location (e.g., path).
- the association may be used to update the mapping records at 1118 to synchronize the 1 st and 2 nd file systems. That is, the system may initiate a synchronization operation at 1118.
- Examples of synchronization operations may include copying the source version, a latest version, or some combination that includes merging the two versions.
- Other settings may include removing files that have been deleted since a last synchronization operation, or always synchronizing to the source location.
- Fig. 12 is a block diagram of an example computing device 1200 that may optionally be utilized to perform one or more aspects of techniques described herein.
- one or more of a client computing device, user-controlled resources engine, and/or other component(s) may comprise one or more components of the example computing device 1200.
- Computing device 1200 typically includes at least one processor 1214 that communicates with several peripheral devices via bus subsystem 1212. These peripheral devices may include a storage subsystem 1224 that includes, for example, a memory subsystem 1225 and a file storage subsystem 1226, as well as user interface output devices 1220, user interface input devices 1222, and a network interface subsystem 1216.
- the user interface input devices 1222 of an implementation may include a response volume setting, among other features.
- the input and output devices allow user interaction with computing device 1200.
- the network interface subsystem 1216 provides an interface to outside networks and is coupled to corresponding interface devices in other computing devices.
- the user interface input devices 1222 may include a keyboard, pointing devices such as a mouse, trackball, touchpad, or graphics tablet, a scanner, a touchscreen incorporated into the display, audio input devices such as voice recognition systems, microphones, and/or other types of input devices.
- use of the term “input device” is intended to include all possible types of devices and ways to input information into computing device 1200 or onto a communication network.
- User interface output devices 1220 may include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices.
- the display subsystem may include a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image.
- the display subsystem may also provide non-visual display such as via audio output devices.
- use of the term “output device” is intended to include all possible types of devices and ways to output information from computing device 1200 to the user or to another machine or computing device.
- the storage subsystem 1224 stores programming and data constructs that provide the functionality of some or all the modules described herein. For example, the storage subsystem 1224 may include the logic to perform selected aspects of the method and to implement various components depicted in the preceding figures.
- the memory subsystem 1225 used in the storage subsystem 1224 may include a number of memories including a main random access memory (RAM) 1230 for storage of instructions and data during program execution and a read only memory (ROM) 1232 in which fixed instructions are stored.
- a file storage subsystem 1226 can provide persistent storage for program and data files, and may include a hard disk drive, a floppy disk drive along with associated removable media, a CD-ROM drive, an optical drive, or removable media cartridges.
- the modules implementing the functionality of certain implementations may be stored by file storage subsystem 1226 in the storage subsystem 1224, or in other machines accessible by the processor(s) 1214.
- the bus subsystem 1212 provides a mechanism for letting the various components and subsystems of computing device 1200 communicate with each other as intended. Although the bus subsystem 1212 is shown schematically as a single bus, alternative implementations of the bus subsystem may use multiple busses.
- the computing device 1200 can be of varying types including a workstation, server, computing cluster, blade server, server farm, or any other data processing system or computing device. Due to the ever-changing nature of computers and networks, the description of computing device 1200 depicted in Fig. 12 is intended only as a specific example for purposes of illustrating some implementations. Many other configurations of computing device 1200 are possible having more or fewer components than the computing device depicted in Fig. 12.
- the terms “comprises”, “comprising”, “includes”, “including”, “having” and their conjugates mean “including but not limited to”.
- the term “consisting of” means “including and limited to”.
- the term “consisting essentially of” means that the composition, method, or structure may include additional ingredients, steps and/or parts, but only if the additional ingredients, steps and/or parts do not materially alter the basic and novel characteristics of the claimed composition, method, or structure.
- the singular form “a”, “an” and “the” include plural references unless the context clearly dictates otherwise.
- the term “a compound” or “at least one compound” may include a plurality of compounds, including mixtures thereof.
- range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
- whenever a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range.
- the phrases “ranging/ranges between” a first indicate number and a second indicate number and “ranging/ranges from” a first indicate number “to” a second indicate number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals there between.
Abstract
A system and associated method maintain coherence in file system (FS) objects by including a memory having a plurality of mapping records each having a first identifier (ID) and a second ID for associating a first FS object stored in a first storage device. A second FS object may be stored in a second storage device. The first FS object and the second FS object may be controlled by different instances of an FS kernel. The system may further include a processor configured to access the memory to identify an incoherency between the first FS object and the second FS object and to update a respective mapping record from the plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
Description
MAPPING IDENTIFIERS TO MAINTAIN NAME AND LOCATION COHERENCY IN FILE SYSTEM OBJECTS
BACKGROUND
The present disclosure, in some embodiments thereof, relates to a computing system, and more particularly, but not exclusively, to file system replication and synchronization processes.
Data storage systems are arrangements of hardware and software that typically include multiple storage processors coupled to arrays of non-volatile storage devices, such as magnetic disk drives, electronic flash drives, and optical drives. Storage processors service requests from host machines and specify file elements to be written, read, created, or deleted. In addition to managing incoming storage requests, the processors perform various data processing tasks to organize and secure data stored on the non-volatile storage devices.
Data storage systems commonly employ replication and synchronization technologies to back up or proliferate the data they store. Such systems may create a copy of the production site data of an organization on a secondary storage system. The secondary storage system may be situated in the same physical location as the original production data or can be in a physically remote location.
The storage systems may provide data updates to enable data synchronization and recovery at specified points in time. Synchronization seeks to minimize the down time and data incoherency by periodically holding changes that were done in the source data since a latest copy, e.g., a resynchronization operation. During synchronization, the storage system can access a file on a source system and copy object data and file-level metadata to another, target file system.
The data on the target file system must be deleted during resynchronization to make room for the new copy from the source file system. That is, the updated copy is saved in the place of the deleted copy on the target file system. As a result, many of the objects and associated data in the target file system are deleted and then re-copied, again, from the source file system. Put another way, certain objects on the target file system must be reconstituted immediately after being deleted because those objects remain unchanged from the source file system copy. For instance, much of the data on a target file system may be the same as that on a source file system where only a name or location of an object is changed on the source side. Thus, replacing data on the target file system that is not changed or deleted on the source side can be viewed as expending processing and memory resources only to re-accomplish existing work.
SUMMARY
According to an aspect of some embodiments of the present disclosure there is provided a system to maintain coherence in file system (FS) objects by including a memory having a plurality of mapping records each having a first identifier (ID) and a second ID for associating a first FS object stored in a first storage device. A second FS object may be stored in a second storage device. The first FS object and the second FS object may be controlled by different instances of an FS kernel. The system may further include a processor configured to access the memory to identify an incoherency between the first FS object and the second FS object and to update a respective mapping record from the plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
According to some embodiments of the invention, the processor identifies the incoherency by identifying a difference between respective locations of the first and second FS objects. According to some embodiments of the invention, the processor identifies the incoherency by identifying a difference between respective names of the first and second FS objects. According to some embodiments of the invention, the processor is further configured to synchronize the second FS object to a third FS object of a third storage device by associating a third ID with the modified name or the modified location associated with the second ID.
According to some embodiments of the invention, the processor is further configured to identify a second incoherency between the modified name or location and a corresponding location or name of the third FS object. According to some embodiments of the invention, the processor is further configured to synchronize the first FS object to the third FS object by associating the first ID with a second modified name or second modified location associated with the third ID. According to some embodiments of the invention, the processor is further configured to synchronize the plurality of mapping records by performing the updating in response to a synchronization operation.
According to some embodiments of the invention, identifying the incoherency includes comparing a first location of the first FS object within a first path to a second location of the second FS object within a second path. According to some embodiments of the invention, identifying the incoherency includes comparing one or more paths associated with the first ID to a plurality of corresponding paths associated with the second ID, wherein the plurality of corresponding paths is associated with multiple hard links of the modified location of the second FS object. According to some embodiments of the invention, identifying the incoherency includes comparing an attribute associated with the first ID to a corresponding attribute associated with the second ID.
According to some embodiments of the invention, identifying the incoherency includes comparing a first checksum value associated with a first path of the first ID to a second checksum value associated with a second path of the second ID. According to some embodiments of the invention, identifying the incoherency includes comparing the plurality of mapping records to a subsequently generated plurality of mapping records. According to some embodiments of the invention, the processor is further configured to reassign the first ID to a third FS object, wherein the first FS object has been deleted.
According to some embodiments of the invention, changes are made to both the first and second memories prior to a resynchronizing operation, and the processor is further configured to prompt a user to select the first FS object over the second FS object to be copied. According to some embodiments of the invention, changes are made to both the first and second memories prior to a resynchronizing operation, and the processor is further configured to prompt a user to select a most recently updated FS object to be copied as between the first FS object and the second FS object.
According to some embodiments of the invention, changes are made to both the first and second memories prior to a resynchronizing operation, and the processor is further configured to prompt a user to select a correct version to be copied as between the first FS object and the second FS object. According to some embodiments of the invention, the processor is further configured to prompt a user to designate the first FS object as a master copy to be copied to multiple FS objects. According to some embodiments of the invention, the processor is further configured to store the plurality of mapping records in at least one of: an ordered log file, a key-value store, a database, the first FS object, the second FS object, or as a special extended attribute on the second identifier.
According to some embodiments of the invention, the processor is further configured to store the plurality of mapping records in a mapping table. According to an aspect of some embodiments of the present disclosure there is provided a method of modifying a name or location of an FS object to be synchronized with that of another FS object, the method comprising mapping a first ID associated with a first FS object of a first storage device to a second ID associated with a second FS object of a second storage device. The method further includes controlling the first FS object and the second FS object using different instances of an FS kernel, identifying an incoherency between the first FS object and the second FS object, and updating a respective mapping record of a plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
According to some embodiments of the invention, identifying the incoherency further comprises scanning the first and second memories prior to a synchronizing operation. According to some embodiments of the invention, the method further comprises updating a respective name or a respective location of each of a plurality of FS objects comprising copies of the first object. According to some embodiments of the invention, the first and second FS objects are included within a plurality of N FS objects, the method further comprising associating the modified name or the modified location with an N-1 name or an N-1 location associated with an N-1 ID of the plurality of N FS objects, and associating the modified name or the modified location of the N-1 ID with an Nth name or an Nth location associated with an Nth ID of the plurality of N FS objects.
According to some embodiments of the invention, the method further comprises associating the modified name or the modified location of the Nth ID with a first name or a first location associated with the first ID. According to some embodiments of the invention, mapping the first ID further comprises mapping at least one of an inode or a file ID.
According to an aspect of some embodiments of the present disclosure there is provided at least one non-transitory computer-readable medium comprising instructions that, in response to execution of the instructions by one or more processors, cause the one or more processors to perform the following operations: mapping a first ID associated with a first FS object of a first storage device to a second ID associated with a second FS object of a second storage device, controlling the first FS object and the second FS object using different instances of an FS kernel, identifying an incoherency between the first FS object and the second FS object, and updating a respective mapping record of a plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
Unless otherwise defined, all technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of the invention, exemplary methods and/or materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be necessarily limiting.
Implementation of the method and/or system of embodiments of the disclosure can involve performing or completing selected tasks manually, automatically, or a combination thereof. Moreover, according to actual instrumentation and equipment of embodiments of the method and/or system of the invention, several selected tasks could be implemented by hardware, by software or by firmware or by a combination thereof using an operating system.
For example, hardware for performing selected tasks according to embodiments of the disclosure could be implemented as a chip or a circuit. As software, selected tasks according to embodiments of the disclosure could be implemented as a plurality of software instructions being executed by a computer using any suitable operating system. In an exemplary embodiment of the invention, one or more tasks according to exemplary embodiments of method and/or system as described herein are performed by a data processor, such as a computing platform for executing a plurality of instructions. Optionally, the data processor includes a volatile memory for storing instructions and/or data and/or a non-volatile storage, for example, a magnetic hard disk and/or removable media, for storing instructions and/or data. Optionally, a network connection is provided as well. A display and/or a user input device such as a keyboard or mouse are optionally provided as well.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
Some embodiments of the disclosure are herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings makes apparent to those skilled in the art how embodiments of the disclosure may be practiced.
In the drawings:
Fig. 1 is a simplified block diagram of a computing system;
Fig. 2 is a block diagram of an implementation of a system comprising a software application and associated hardware that allows a user to maintain coherency between names and locations of objects;
Fig. 3 is a block diagram illustrating a logical relationship between a file path, a file inode, and file data;
Fig. 4 is a block diagram illustrating an example of a file system comprising a tree that includes multiple directories and files, in addition to associated inodes;
Fig. 5 is a diagram showing illustrative directory data, and more particularly, the data included within a directory of the file system tree of Fig. 4;
Fig. 6 shows an FS object 600 that is similar to the FS object of Fig. 3, symbolically undergoing an operation to change a location;
Fig. 7 shows a file system 700 comprising the file system of Fig. 4, but after an object is renamed and assigned a new location/path;
Fig. 8 is a block diagram illustrating an example of a file system comprising a tree that includes multiple directories and files, in addition to associated inodes;
Fig. 9 is a table comprising an ID map to facilitate the replication of the file system of Fig. 8;
Fig. 10 is a block diagram of a system that performs multiway, or n-way, synchronization for a plurality of file system trees, with arrows illustrating an order in which the changes in the name or location of an object in one file system tree are updated accordingly in another during a replication operation;
Fig. 11 is an embodiment of a method of maintaining coherence in file system objects; and
Fig. 12 illustrates an example architecture of a computing device, and more particularly, a block diagram of a system that may optionally be utilized to perform one or more aspects of techniques described herein.
DESCRIPTION OF SPECIFIC EMBODIMENTS
The present invention, in some embodiments thereof, relates to a computing system and, more particularly, but not exclusively, to file system replication and synchronization processes.
An embodiment of a system maintains coherence in file system objects using mapping records that associate objects in corresponding file system trees using unique object identifiers (IDs) to update a modified name or location. For instance, the mapping of object IDs may be used to copy or update a name or hard-link path of a first object in a first file system tree according to a corresponding object in a second file system.
During a synchronizing operation of a single file system copy, an embodiment of the system scans a source file system tree. The system may then copy each object (e.g., a file or a folder) to a corresponding target location within a target file system tree. An embodiment of the system saves a mapping between each source object unique ID and the corresponding target object ID. Examples of object IDs include inode numbers and file IDs. When copying objects from the source file system to the target file system, the system may copy attributes, as well. An example of an attribute includes a last-modification date. In this manner, the attributes are identical in both the source and target file systems.
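The initial copy described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the disclosed implementation: the function name `initial_sync` is hypothetical, inode numbers (`st_ino`) stand in for the object IDs, and `shutil.copy2` is used because it also copies the last-modification time. Hard links and symbolic links are not handled in this sketch.

```python
import os
import shutil


def initial_sync(source_root, target_root):
    """Copy every object under source_root and return an ID-map.

    The ID-map is a dict of source inode number -> target inode number
    (hypothetical layout chosen for this sketch).
    """
    id_map = {}
    os.makedirs(target_root, exist_ok=True)
    id_map[os.stat(source_root).st_ino] = os.stat(target_root).st_ino

    for dirpath, dirnames, filenames in os.walk(source_root):
        rel = os.path.relpath(dirpath, source_root)
        target_dir = os.path.normpath(os.path.join(target_root, rel))

        # Folders are recreated on the target and recorded in the ID-map.
        for name in dirnames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            os.makedirs(dst, exist_ok=True)
            id_map[os.stat(src).st_ino] = os.stat(dst).st_ino

        # Files are copied with their attributes (copy2 keeps the mtime).
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            shutil.copy2(src, dst)
            id_map[os.stat(src).st_ino] = os.stat(dst).st_ino

    return id_map
```

The returned dictionary plays the role of the saved mapping between each source object ID and its target object ID, and would be persisted for use at the next resynchronization.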
When a need arises to resynchronize the source and target file systems, an embodiment of the system may scan both current source and target file system trees. For each file system tree, the system may save mapping records between each object ID and the corresponding object ID, such as in an ID-map. The mapping records may additionally include paths associated with objects in each tree, such as in path-maps. In an example where there are multiple hard links associated with an object, the mapping of the object ID will include multiple, corresponding paths. By comparing the path-map of the source file system to the path-map of the target file system, and by correlating between the source and target objects using the ID-map, the system may identify any changes done on either file system tree since the latest synchronization. The system may then determine how to merge or otherwise synchronize the trees. According to a particular embodiment, the system may choose or prompt a user to choose the synchronization direction (e.g., source to target or vice versa).
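A minimal sketch of the comparison described above, assuming the ID-map is a dictionary from source ID to target ID and each path-map is a dictionary from object ID to a set of paths (several paths when the object has multiple hard links). The function name `diff_trees` and the report categories are hypothetical and chosen only for illustration.

```python
def diff_trees(id_map, source_paths, target_paths):
    """Classify differences between two scanned trees.

    id_map:       source ID -> target ID, saved at the last synchronization
    source_paths: source ID -> set of paths found in the current source scan
    target_paths: target ID -> set of paths found in the current target scan
    """
    report = {
        "unchanged": [],
        "moved": [],
        "new_on_source": [],
        "new_on_target": [],
        "missing_on_source": [],
        "missing_on_target": [],
    }

    for src_id, paths in source_paths.items():
        tgt_id = id_map.get(src_id)
        if tgt_id is None:
            report["new_on_source"].append(src_id)
        elif tgt_id not in target_paths:
            report["missing_on_target"].append(src_id)
        elif paths != target_paths[tgt_id]:
            # Same object on both sides, but the path sets differ.
            report["moved"].append(src_id)
        else:
            report["unchanged"].append(src_id)

    # Mapped objects that no longer appear in the source scan.
    for src_id, tgt_id in id_map.items():
        if src_id not in source_paths and tgt_id in target_paths:
            report["missing_on_source"].append(src_id)

    # Target objects that were never mapped were created on the target.
    mapped_targets = set(id_map.values())
    for tgt_id in target_paths:
        if tgt_id not in mapped_targets:
            report["new_on_target"].append(tgt_id)

    return report
```

How each category is resolved (for example, source to target or the reverse) is left to the chosen synchronization direction.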
In one implementation, the system may check to determine if the source object ID and the corresponding target object ID exist in both file system trees. If so, the system may determine if the path and attributes of the target and source objects are equal. Where they are determined to be equal, the system may take no action towards synchronizing the objects. Where the path or attributes of the target and source objects are alternatively different, then the system may initiate a synchronization operation.
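The per-object check above could look like the following sketch. The `ObjectState` type and the `needs_sync` function are hypothetical; the choice of last-modification time and size as the compared attributes is an assumption, as is treating an object that is absent on one side as requiring synchronization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectState:
    """Scanned state of one object (illustrative attribute set)."""
    paths: frozenset
    mtime: float
    size: int


def needs_sync(source: Optional[ObjectState], target: Optional[ObjectState]) -> bool:
    """Return True when a synchronization operation should be initiated."""
    if source is None or target is None:
        # The object exists in only one tree (assumed to require action).
        return True
    # Dataclass equality compares paths and attributes field by field.
    return source != target
```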
Examples of synchronization operations may include copying the source version, a latest version, or some combination that includes merging the two versions. Other settings may include removing files that have been deleted since a last synchronization operation, or always synchronizing to the source location.
The system may detect that an object was moved in either one or both locations by determining that the associated paths are different. Where an object includes multiple paths, those paths that differ as between file system trees may be deleted and hard links may be created for new paths.
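For an object with several hard links, the paths to delete and the links to create can be derived with set differences. The sketch below assumes a source-to-target direction and only produces a plan; the function name `plan_link_changes` is hypothetical.

```python
def plan_link_changes(source_paths, target_paths):
    """Return (paths_to_remove, paths_to_link) for one object on the target.

    Both arguments are sets of paths for the same object, taken from the
    source and target path-maps respectively.
    """
    paths_to_remove = sorted(target_paths - source_paths)
    paths_to_link = sorted(source_paths - target_paths)
    return paths_to_remove, paths_to_link
```

On a POSIX system, the plan could then be applied with `os.link` for each new path and `os.unlink` for each stale one, creating the new links first so the object never loses its last path.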
The system may determine that an object was created since a last synchronization operation by detecting that an object exists only in one location. Alternatively, an object may have been deleted since the last synchronization operation.
According to a particular embodiment, the system may identify objects that were changed (in addition or as an alternative to comparing object attributes) by calculating and including in the path-maps a checksum value for each object. The checksum value may be included by the system as a parameter for object comparison.
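A checksum for object comparison might be computed as sketched below. The disclosure does not name a checksum algorithm; SHA-256 from Python's `hashlib` is an assumption made for this example, as is the function name `checksum`.

```python
import hashlib
import io


def checksum(stream, chunk_size=65536):
    """Return a hex digest of a binary stream, read in fixed-size chunks."""
    digest = hashlib.sha256()  # algorithm choice is an assumption
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Identical content yields equal checksums; modified content does not.
same = checksum(io.BytesIO(b"report v1")) == checksum(io.BytesIO(b"report v1"))
changed = checksum(io.BytesIO(b"report v1")) != checksum(io.BytesIO(b"report v2"))
```

For a file on disk, the same function can be called on a handle opened in binary mode, and the resulting value stored alongside the path in the path-map.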
An embodiment of the system may allow both the target file system and the source file system to be changed before a resynchronization.
When both the target file system and the source file system are changed and a resynchronization is useful, the system may prompt the user to choose a resynchronization result. For example, the user may be prompted to choose the file system on the source side to be copied, thus making the target file system identical to the source file system.
In another example, the user may be prompted to have the system choose a latest (e.g., most recently modified) copy as between corresponding objects of the target and source file systems. That is, if the latest update to one of the corresponding objects was on the target file system, then the copy of the object on the target file system is used to update the corresponding object on the source file system. Conversely, if the latest modification was on the object on the source file system, then the copy of the object on the source file system is used to update the corresponding object on the target file system.
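The latest-copy rule above reduces to a comparison of modification times. In this sketch the function name `pick_direction` and the returned labels are hypothetical, and treating equal timestamps as requiring no copy is an assumption.

```python
def pick_direction(source_mtime, target_mtime):
    """Choose which copy updates the other under a latest-wins rule."""
    if source_mtime > target_mtime:
        return "source_to_target"
    if target_mtime > source_mtime:
        return "target_to_source"
    # Equal timestamps: assumed to mean neither side needs updating.
    return "none"
```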
In still another example, the system may prompt the user to select a correct version of the object. For instance, where modifications were made to multiple corresponding objects before a resynchronization, a user may select the changes made to one of the objects as the version to be copied to the other object(s).
A particular embodiment of the system may create multiple copies of a copy (e.g., a master copy) of a designated source file system. For example, multiple copies of the master copy may be desired at multiple different remote locations. The system may maintain mapping records (e.g., an ID-map and/or path-map) for each copy from the time the master copy was copied to each target. When conditions indicate it is time to resynchronize a copy, the system may perform the single copy algorithm described herein for each. The synchronization of different copies may be done at different times. Each synchronization may be accomplished according to the target specific objects and a current state of the source object (e.g., master copy).
According to one or more scenarios, an object ID may be reused. For instance, after synchronizing source and target file systems, an object (i.e., object A) on the source file system may be deleted. This deletion may functionally free up the object ID formerly associated with object A. When a new object is created (i.e., object B), the system may reassign the object ID of object A to object B.
When it is desirable to resynchronize both file systems, the system may encounter object B in its scan. The system may check in the ID-map and see that its object ID exists and may locate its corresponding target object (e.g., object A on the target file system). The system may move object A on the target file system to the path of object B. The system may also recognize that this object was changed, since its attributes differ. For example, a creation and a modification time would be later than those of deleted object A. Since the attributes differ, the system may synchronize the file data by copying the object B file content to the corresponding object B on the target file system. By the end of the synchronization procedure, the target file system includes object A with an updated path(s) and content.
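The ID-reuse scenario above can be traced with a small in-memory model. Everything here is illustrative: the dictionary layout, the function name `resync_object`, the ID values 42 and 77, and the use of the modification time as the only compared attribute are assumptions.

```python
def resync_object(src_id, source, target, id_map):
    """Bring the mapped target object in line with the source object.

    source / target: object ID -> {"path": str, "mtime": int, "content": str}
    Returns the list of actions taken, in order.
    """
    tgt_obj = target[id_map[src_id]]
    src_obj = source[src_id]
    actions = []
    if tgt_obj["path"] != src_obj["path"]:
        # The ID is known, so the target object is moved rather than re-created.
        tgt_obj["path"] = src_obj["path"]
        actions.append("move")
    if tgt_obj["mtime"] != src_obj["mtime"]:
        # Differing attributes indicate changed content, so the data is copied.
        tgt_obj["content"] = src_obj["content"]
        tgt_obj["mtime"] = src_obj["mtime"]
        actions.append("copy")
    return actions


# Object A was deleted on the source and its ID (42) was reused by object B.
source = {42: {"path": "/new/b.txt", "mtime": 200, "content": "B data"}}
target = {77: {"path": "/docs/a.txt", "mtime": 100, "content": "A data"}}
actions = resync_object(42, source, target, {42: 77})
```

After the call, the target object that previously held object A carries the path and content of object B, which matches the outcome described in the paragraph above.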
Certain embodiments of the system enable multiway, or n-way, synchronization. For instance, the system may synchronize multiple file system copies that have been independently changed, with all changes being merged to all copies.
For example, with object copies 1-N, synchronization may begin with object copy number 1 and synchronize it with object copy number 2 as described herein using mapping records. After the synchronization, object copies 1 and 2 are identical and both contain changes that were made on both objects. The system may then synchronize object copy 2 with object copy 3, continuing in the same manner until synching object copy N-1 with object copy N. The resulting object copy N may thus include combined changes from all the other object copies. The system may then synchronize object copy N with object copy 1.
Where so configured, the system may loop synchronization processes by synchronizing object copy 1 with 2 and so on, including synchronizing object copy N-2 with object copy N-1. In this manner, all the object copies may be identical and include changes that were made on all other object copies.
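The ordering of pairwise synchronizations described above can be enumerated as follows. The function names are hypothetical, and modeling each copy as a set of changes that is merged by union is a simplification used only to show that the order makes all copies converge.

```python
def sync_order(n):
    """Return the (copy, copy) pairs for n-way synchronization, 1-indexed."""
    if n < 2:
        raise ValueError("n-way synchronization needs at least two copies")
    forward = [(i, i + 1) for i in range(1, n)]          # 1-2, 2-3, ..., (N-1)-N
    wrap = [(n, 1)]                                      # N-1 closes the ring
    second_pass = [(i, i + 1) for i in range(1, n - 1)]  # 1-2, ..., (N-2)-(N-1)
    return forward + wrap + second_pass


def simulate(copies):
    """Merge sets of changes pairwise in the order given by sync_order."""
    for a, b in sync_order(len(copies)):
        merged = copies[a] | copies[b]
        copies[a] = set(merged)
        copies[b] = set(merged)
    return copies


copies = simulate({1: {"c1"}, 2: {"c2"}, 3: {"c3"}, 4: {"c4"}})
```

With four copies, the order is 1-2, 2-3, 3-4, 4-1, 1-2, 2-3, after which every copy holds the changes from all four.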
In another example of synchronization using a master copy approach, the system may synchronize the 1st (e.g., master) copy with a 2nd copy based on a defined policy. The system may additionally synchronize the 1st copy with the 3rd copy according to the defined policy. If a file is copied from the 3rd copy to the 1st copy, then the process may repeat by synchronizing the 1st with the 2nd copy according to the defined policy. Where desired, synchronization may occur only for the files that were copied from the 3rd copy to the 1st (e.g., master) copy.
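A sketch of the master copy approach, assuming a latest-wins policy and modeling each copy as a dictionary from path to a `(mtime, content)` tuple. The function names `sync_pair` and `master_sync` are hypothetical, and deletions are not modeled.

```python
def sync_pair(master, replica, only=None):
    """Latest-wins sync of two copies; returns paths pulled into the master.

    When `only` is given, the sync is restricted to those paths.
    """
    pulled = set()
    paths = (set(master) | set(replica)) if only is None else set(only)
    for path in paths:
        m, r = master.get(path), replica.get(path)
        if r is not None and (m is None or r[0] > m[0]):
            master[path] = r
            pulled.add(path)
        elif m is not None and (r is None or m[0] > r[0]):
            replica[path] = m
    return pulled


def master_sync(master, replicas):
    """Sync the master with each replica in turn.

    If a replica contributes files to the master, earlier replicas are
    re-synchronized for just those files.
    """
    for index, replica in enumerate(replicas):
        pulled = sync_pair(master, replica)
        if pulled:
            for earlier in replicas[:index]:
                sync_pair(master, earlier, only=pulled)


first = {"a": (1, "a1")}
second = {"a": (1, "a1")}
third = {"a": (5, "a5"), "b": (2, "b2")}
master_sync(first, [second, third])
```

Restricting the repeat pass to the pulled paths reflects the option of synchronizing only the files that were copied from the 3rd copy to the master.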
Where chained copies are implemented, the 2nd copy may hold, access, or otherwise include object IDs corresponding to the 1st object IDs. The 3rd copy may include object IDs corresponding to the 2nd object IDs.
A policy may be defined by the user. For example, the system may prompt the user to define a policy that selects one copy as the master and that makes all other copies identical to the master copy. Another policy may cause the system to select from each copy the latest version. The system may use a copy of the latest version as a master copy with which to update all others. In response to a conflict, the system of another embodiment may initiate prompting the user to select between copies which copy will function as the master copy with which to update all others.
The methods and apparatus of exemplary embodiments may take the form, at least partially, of program code (i.e., instructions) embodied in tangible media, such as disks, CD-ROMs, hard drives, random access or read-only memory, or any other machine-readable storage medium, including transmission medium. When the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the embodiments described herein. The media can include portions in different system components, such as memory in a host, an application instance, and/or a management station. The methods and apparatus may be embodied in the form of program code that may be implemented such that when the program code is received and loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the embodiments described herein. When implemented on a processor, the program code combines with the processor to provide a unique apparatus that operates analogously to specific logic circuits. The program code (software-based logic) for carrying out the method is embodied as part of the system described below.
It is to be understood that the disclosure is not necessarily limited in its application to the details of construction and the arrangement of the components and/or methods set forth in the following description and/or illustrated in the drawings and/or the Examples. The disclosure is capable of other embodiments or of being practiced or carried out in various ways.
Referring now to the drawings, FIG. 1 is a simplified illustration of a system 100 configured to maintain coherence in file system objects. More particularly, FIG. 1 shows an example environment 100 in which host computing devices (“hosts”), shown as devices 104, 106, 108, and 110, access a data storage system 116 over a network 114. The data storage system 116 includes a storage processor (SP) 118 and memory 120. An embodiment of the data storage system 116 may include multiple SPs. For example, multiple SPs may be provided as circuit board assemblies, or blades, which plug into a chassis, which encloses and cools the SPs. The chassis has a backplane for interconnecting the SPs, and additional connections may be made among SPs using cables. It is understood, however, that no particular hardware configuration is required, as any number of SPs, including a single SP, may be provided and the SP 118 can be any type of computing device capable of processing host IOs.
Likewise, although FIG. 1 shows only a single data storage system 116, it is understood that many operations described herein involve activities that take place between multiple data storage systems, i.e., between a source data storage system and a target data storage system. As such, an additional storage system 122 is shown in dashed line. The source and target systems may be connected via the network 114 or via any suitable means. The particular construction shown for the data storage system 116 is intended to be representative of both the source and the destination, although it should be understood that the source and the target systems may vary in their details.
The network 114 can be any type of network or combination of networks, such as a storage area network (SAN), a local area network (LAN), a wide area network (WAN), the Internet, and/or some other type of network or combination of networks, for example. The hosts 104, 106, 108, and 110 may connect to the SP 118 using various technologies, such as Fibre Channel, iSCSI, NFS, SMB 3.0, and CIFS, for example. Any number of hosts 104, 106, 108, and 110 may be provided, using any of the above protocols, some subset thereof, or other protocols besides those shown. The SP 118 is configured to receive IO requests according to both block-based and file-based protocols and to respond to such IO requests by reading or writing the storage 120 or memory 130.
The SP 118 is seen to include one or more communication interfaces 124 and a set of processing units 126. The communication interfaces 124 include, for example, SCSI target adapters and network interface adapters for converting electronic and/or optical signals received over the network 114 to electronic form for use by the SP 118. The set of processing units 126 includes one or more processing chips and/or assemblies. In a particular example, the set of processing units 126 includes numerous multi-core CPUs.
The memory 130 includes both volatile memory (e.g., RAM), and non-volatile memory, such as one or more ROMs, disk drives, solid state drives, and the like. The set of processing units 126 and the memory 130 together form control circuitry, which is constructed and arranged to carry out various methods and functions as described herein, e.g., alone or in coordination with similar control circuitry on another data storage system. Also, the memory 130 includes a variety of software constructs realized in the form of executable instructions. When the executable instructions are run by the set of processing units 126, the set of processing units 126 are caused to carry out the operations of the software constructs. Although certain software constructs are specifically shown and described, it is understood that the memory 130 typically includes many other software constructs, such as an operating system 128, as well as various applications, processes, and daemons.
As shown in FIG. 1, the memory 128 includes a replication manager 132 that controls the establishment of replication settings on representative data objects 134, 140, and 142. The replication manager 132 establishes replication settings on a per-data-object basis, conducts replication sessions, and orchestrates replication activities, including recovery and failover. In some examples, the replication manager 132 works in coordination with a replication appliance 138. The replication appliance 138 assists in performing continuous replication with another data storage system (e.g., with a destination data storage system), which may be located remotely. In some examples, the replication appliance 138 takes the form of a separate hardware unit. Any number of such hardware units may be provided, and they may work together, e.g., in a cluster.
The operating system 128 additionally includes objects 134, 140, and 142. Examples of an object may include an inode or a file ID. The objects 134, 140, and 142 may be controlled by separate instances of a kernel 150 of the operating system 128. Illustrative object 134 includes mapping records 144 that are used to identify and define a location of the object 134 to maintain coherency between file systems, as explained herein. To this end, the mapping records 144 may include object IDs 146 and paths 148.
Fig. 2 is a block diagram of an implementation of a system 200 comprising a software application and associated hardware that maintain coherence in FS objects. The illustrative system 200 of Fig. 2 may have functional components and software in common with the storage processor 118 of Fig. 1. In one implementation, the system 200 may be configured to modify a name or a location of an FS object to be synchronized with that of another FS object.
The implementation of the system 200 includes multiple modules 204 and 206-224 executed by a processor 202 to manage memory processes prior to and during a synchronization operation 214. According to the embodiment shown in Fig. 2, the system 200 includes a memory 204 having a plurality of mapping records 207 associating a plurality of IDs 206 to a plurality of FS objects 209.
The processor 202 may be configured to access the memory 204 to identify an incoherency between a first FS object 211 and a second FS object 213 of the plurality of FS objects 209. The processor 202 may further be configured to update a respective mapping record from the plurality of mapping records 207 to associate the second ID 219 with a modified name of a plurality of names 212 or a modified location of a plurality of locations 210 corresponding to the identified incoherency 223.
According to some embodiments of the system 200, identifying the incoherency 223 includes comparing one or more paths 208 associated with the first ID 217 to a plurality of corresponding paths associated with the second ID 219, where the plurality of corresponding paths is associated with multiple hard links of the modified location of the second FS object 213. According to some embodiments of the system 200, identifying the incoherency 223 includes comparing an attribute 220 associated with the first ID 217 to a corresponding attribute associated with the second ID 219. These and other processes explained herein may be performed in response to the processor 202 executing a coherency determination algorithm 222.
As a further example, identifying the incoherency may include comparing a first checksum value of a plurality of checksum values 218 associated with a first path 208 of the first ID 217 to a second checksum value associated with a second path of the second ID 219. The processor 202 of an embodiment of the system 200 may be further configured to prompt a user for user input 224 to select a correct version to be copied as between the first FS object and the second FS object. For example, a user may designate the first FS object 211 as a master copy to be copied to multiple FS objects.
As illustrated in Fig. 2, the processor 202 may be further configured to store the plurality of mapping records 207 in a mapping table 216. As with the other modules shown in Fig. 2, the mapping table 216 and mapping records 207 are depicted separately, but other embodiments may combine functionalities. For instance, the processor 202 of another embodiment may be configured to store the plurality of mapping records 207 in at least one of an ordered log file (not shown), a key-value store (not shown), a database, the first FS object 211, the second FS object 213, or as a special extended attribute of the plurality of attributes 220 on the second ID 219. The coherency determination algorithm 222 may scan memories prior to a synchronizing operation to determine the incoherencies 223.
FIG. 3 is a block diagram illustrating logical relationships between aspects of an illustrative FS object 300. The figure shows logical connections between a file path 302, a file inode 304, and file data 306 of the FS object 300. On Linux/Unix file systems, the metadata is held in a structure that is called an inode. As shown in Fig. 3, the inode 304 includes a unique ID and attributes.
The illustrative FS object 300 of Fig. 3 includes three parts. Namely, the FS object 300 includes: the file data 306, the metadata (e.g., attributes) of the inode 304 that holds the details of the file or directory (e.g., an identifier, a size, a timestamp, a name, a type, a pointer to the data, etc.), and a file path 302 (i.e., location).
As discussed herein, each FS object (e.g., a file or a directory) on a file system has a unique identifier. This identifier remains with the object throughout its existence even if the object is renamed, moved, has its attributes or data changed, or is associated with additional single or multiple hard links.
On Linux/Unix file systems, the identifier is called the inode number, and on Windows file systems the identifier is referred to as the file-ID. Object storage includes one or more paths to each FS object and a unique identifier for each FS object.
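By way of non-limiting illustration, the persistence of the identifier across a rename may be observed on a POSIX-style file system with the short Python sketch below. The function name and file names are hypothetical and are chosen only for the example; the sketch assumes a file system in which a rename within one directory preserves the inode number.

```python
import os
import tempfile


def inode_survives_rename(directory: str) -> bool:
    """Create a file, rename it, and report whether its inode number is unchanged."""
    old_path = os.path.join(directory, "report.txt")   # hypothetical file name
    new_path = os.path.join(directory, "renamed.txt")  # hypothetical file name
    with open(old_path, "w") as handle:
        handle.write("payload")
    inode_before = os.stat(old_path).st_ino
    os.rename(old_path, new_path)  # only the name changes
    inode_after = os.stat(new_path).st_ino
    return inode_before == inode_after


with tempfile.TemporaryDirectory() as workdir:
    print(inode_survives_rename(workdir))
```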
File systems, such as the file system of the FS object 300, may include files and directories (e.g., folders). A directory may include files or other directories. In such an instance, these directories, themselves, may include files and directories. A location of each file system may begin with its root directory (i.e., root node).
Fig. 4 is a block diagram illustrating an example of a file system 400 comprising a tree that includes multiple directories and files, in addition to associated inodes. More particularly, the file system tree 400 includes a root directory 402 and an associated 1st inode 404. Under the root directory 402, the file system tree 400 includes directory A 406, directory B 408, and directory C 410. The directories 406, 408, and 410 are associated respectively with inodes 412, 414, and 416.
As shown in Fig. 4, directory A 406 includes a 1st file 418 associated with a 6th inode 420, and a 2nd file 422 associated with a 7th inode 424. Directory C 410 includes directory D 426 and associated inode 428, as well as a 5th file 430 associated with inode 432. Directory C 410 also includes directory E 434, which is associated with inode 436.
The directory D 426 may include a 3rd file 438 that is associated with a 10th inode 440. A 4th file 442, associated with an 11th inode 444, may additionally be included under the D directory 426.
To access a file, a program may specify a path through the file system tree, beginning from the root directory 402 and proceeding throughout the directory tree until reaching the file itself. For example, the path of the 3rd file 438 in Fig. 4 may be: "/C/D/File 3".
Each directory 402, 406, 408, 410, 426, and 434 holds a list of its directory entries (e.g., the files and directories it includes). Each entry may include a name and an identifier (e.g., an inode in the file system of Fig. 4). Each directory 402, 406, 408, 410, 426, and 434 may additionally include an entry pointing to its own inode as well as to its parent directory inode.
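By way of non-limiting illustration, the path lookup described above may be sketched in Python using an in-memory model of part of the tree of Fig. 4. The dictionary layout, the function name `resolve`, and the use of the conventional "." and ".." entry names for the self and parent entries are assumptions made for the example; the inode numbers follow Figs. 4 and 5.

```python
# Hypothetical model: directory inode number -> {entry name: inode number}.
DIRECTORIES = {
    1: {".": 1, "..": 1, "A": 2, "B": 3, "C": 4},       # root directory
    4: {".": 4, "..": 1, "D": 5, "E": 9, "File 5": 8},  # directory C
    5: {".": 5, "..": 4, "File 3": 10, "File 4": 11},   # directory D
}
ROOT_INODE = 1


def resolve(path: str) -> int:
    """Walk the path one component at a time, starting from the root directory."""
    inode = ROOT_INODE
    for name in path.split("/"):
        if not name:  # skip the empty components produced by leading or repeated "/"
            continue
        entries = DIRECTORIES.get(inode)
        if entries is None or name not in entries:
            raise FileNotFoundError(path)
        inode = entries[name]
    return inode


print(resolve("/C/D/File 3"))  # 10
```

In this model, a lookup of "/C/D/.." returns inode 4, reflecting the parent entry held by directory D.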
Fig. 5 is a diagram 500 showing illustrative directory data, and more particularly, the data included within the directory C 410 of the file system tree 400 of Fig. 4. As shown in the diagram 500, directory C 502 includes a list of its own directories D and E, as well as its own file (i.e., the file 5). The inodes 5, 8, and 9 of each directory and file are additionally included on the directory C 502 in an association with their respective file and directories. The directory C 502 additionally stores its own inode 4, as well as the inode 1 of its corresponding root directory.
The directory structure 502 points to a structure 504 that includes inode 5 and that corresponds to directory D. The structure 504 includes information pertaining to permissions, an owner/group ID, a directory, and data blocks. The structure 504 of directory D additionally points to data blocks 506, 508, and 510. The structure 512 includes inode 9 and may correspond to directory E 434 of Fig. 4, and another pointer may link to information 430 (not shown) pertaining to the 5th file 430 of Fig. 4.
File systems may have operations that may be done on files and directories, such as create, open, delete, write, and append operations. One of these operations is the move operation, which allows the user to change the location of the file/directory. According to an embodiment of the system, the move operation changes only the file/directory path (e.g., not the file data). Put another way, the path of the file changes after the move operation, but the data, itself, and its inode and other attributes are not relocated or changed. By enabling the file data and inode to remain, processing cycles are saved over conventional systems that would delete and replace the data. After the move operation, the same file attributes and data may be accessed using the new path.
This feature is illustrated in Fig. 6, which shows an FS object 600 that is similar to the FS object 300 of Fig. 3, symbolically undergoing an operation to change a location. That is, an old path 602 is deleted (as designated by the "X"), and a new path 608 is assigned to the file inode 604 and file data 606.
Regardless of the move operation, file data is associated with at least one path. However, the file data 606 may be accessed using several paths. Each of these paths is called a hard link. After creating the initial file and its data, it is possible to create more hard links to that FS object using different paths.
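By way of non-limiting illustration, the creation of an additional hard link may be sketched in Python as follows. The function name and file names are hypothetical, and the sketch assumes a file system that supports hard links. Both paths refer to the same inode, and the data remains reachable through the second path after the first path is removed.

```python
import os
import tempfile


def hard_link_demo(directory: str):
    """Create a file, add a second hard link, then remove the first path."""
    first = os.path.join(directory, "first_path.txt")    # hypothetical name
    second = os.path.join(directory, "second_path.txt")  # hypothetical name
    with open(first, "w") as handle:
        handle.write("shared data")
    os.link(first, second)  # second path to the same inode and data
    same_inode = os.stat(first).st_ino == os.stat(second).st_ino
    link_count = os.stat(first).st_nlink
    os.remove(first)  # removes one path only; the data is still linked
    with open(second) as handle:
        remaining = handle.read()
    return same_inode, link_count, remaining


with tempfile.TemporaryDirectory() as workdir:
    print(hard_link_demo(workdir))
```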
Fig. 7 shows a file system 700 comprising the file system 400 of Fig. 4, but after an object is renamed and assigned a new location/path. In terms of Fig. 4, the directory C 410 has been renamed directory F and has been moved under the directory B 408. The numbering in Fig. 7 is carried over from Fig. 4 to illustrate the renaming/restructuring operations, as well as to show that associations between inodes and files/directories remain unchanged during the operations.
Turning more particularly to Fig. 7, the block diagram illustrates an example of a file system 700 comprising a tree that includes multiple directories and files, in addition to associated inodes. More particularly, the file system tree includes a root directory 402 and an associated 1st inode 404. Under the root directory 402, the file system tree includes directory A 406 and directory B 408. Directory F 410 has been moved under directory B 408. As shown in Fig. 7, directory A 406 continues to be associated with the 2nd inode 412, just as the directories 408 and 410 continue to be associated respectively with the 3rd inode 414 and the 4th inode 416.
As further illustrated in Fig. 7, directory A 406 includes a 1st file 418 associated with a 6th inode 420, and a 2nd file 422 associated with a 7th inode 424. Renamed directory F 410 includes directory D 426 and associated inode 428, as well as a 5th file 430 associated with inode 432. Renamed directory F 410 also includes directory E 434, which is associated with inode 436.
The directory D 426 may include a 3rd file 438 that is associated with a 10th inode 440. A 4th file 442, associated with an 11th inode 444, may additionally be included under the D directory 426.
Fig. 8 is a block diagram illustrating an example of a file system 800 comprising a tree that includes multiple directories and files, in addition to associated inodes. In terms of the file system 400 of Fig. 4, the file system 800 may be a target or other replication desired to be in synchronization with the file system 400. While other nomenclature of the files and directories remains the same as shown in the source file system 400 of Fig. 4, the inode numbering of the target file system 800 in Fig. 8 is different to reflect that a unique ID number is associated with each directory/file.
More particularly, the file system tree includes a root directory 802 and an associated 11th inode 804. Under the root directory 802, the file system tree includes directory A 806, directory B 808, and directory C 810. The directories 806, 808, and 810 are associated respectively with inodes 812, 814, and 816.
As shown in Fig. 8, directory A 806 includes a 1st file 818 associated with a 16th inode 820, and a 2nd file 822 associated with a 17th inode 824. Directory C 810 includes directory D 826 and associated inode 828, as well as a 5th file 830 associated with an 18th inode 832. Directory C 810 also includes directory E 834, which is associated with a 19th inode 836.
The directory D 826 may include a 3rd file 838 that is associated with a 20th inode 840. A 4th file 842, associated with a 21st inode 844, may additionally be included under the D directory 826.
Fig. 9 is a table comprising an ID map 900 for the replication of the file system 800 of Fig. 8. The ID map 900 may comprise a portion of the mapping records 144 of Fig. 1. A first column 902 displays identifiers (e.g., inode numbers) associated with objects of the source file system 400 of Fig. 4. A second column 904 displays identifiers (i.e., inode numbers) associated with objects of the target file system 800 of Fig. 8.
In one example, the first entry 906 in the first row corresponds to the 1st inode 404 of the root directory 402 of the file system 400 of Fig. 4. The second entry 908 in the first row corresponds to the 11th inode 804 of the root directory 802 of the file system 800 of Fig. 8. In this manner, the ID map 900 logically associates the 1st inode of the source file system 400 with the 11th inode of the target file system 800.
The first entry 910 in the second row corresponds to the 2nd inode 412 of directory A 406 of the file system 400 of Fig. 4. The second entry 912 in the second row corresponds to the 12th inode 812 of the directory A 806 of the file system 800 of Fig. 8. The ID map 900 thus associates the 2nd inode of the source file system 400 with the 12th inode of the target file system 800. As shown in Fig. 9, the ID map 900 likewise associates inode 3 with inode 13, inode 4 with inode 14, and so on.
The ID map 900 may be used when a resynchronizing operation begins to apply the changes to the names and locations on the target file system 800 to match the source file system 400. More particularly, inode 4 of the source file system will appear at the /B/F path, while its corresponding target inode 14 will appear at the /C path. Since the synchronization direction may be from the source file system 400 to the target file system 800, the synchronization operation may include accessing the ID map 900 to move the /C directory (inode 14) to its corresponding inode location /B/F on the target file system 800.
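By way of non-limiting illustration, the use of the ID map to derive a move on the target side may be sketched in Python as follows. The inode numbers and paths follow the example of Figs. 4 and 7 through 9, in which the directory of target inode 14 still resides at /C; the variable names and the function `plan_moves` are hypothetical, and the sketch only computes the moves and does not perform them.

```python
# Source inode -> target inode, as in the first rows of the ID map of Fig. 9.
ID_MAP = {1: 11, 2: 12, 3: 13, 4: 14}

# Paths observed on each side when resynchronization begins (hypothetical scan results).
SOURCE_PATHS = {1: "/", 2: "/A", 3: "/B", 4: "/B/F"}     # directory C renamed and moved
TARGET_PATHS = {11: "/", 12: "/A", 13: "/B", 14: "/C"}   # target not yet updated


def plan_moves(id_map, source_paths, target_paths):
    """Return (target inode, current path, wanted path) for each object that moved."""
    moves = []
    for source_id, target_id in id_map.items():
        wanted = source_paths.get(source_id)
        current = target_paths.get(target_id)
        if wanted is not None and current is not None and wanted != current:
            moves.append((target_id, current, wanted))
    return moves


print(plan_moves(ID_MAP, SOURCE_PATHS, TARGET_PATHS))  # [(14, '/C', '/B/F')]
```

Because the objects are correlated by identifier rather than by path, the renamed directory is recognized as a move rather than as a deletion followed by a creation.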
Fig. 10 is a block diagram of a system 1000 that performs multiway, or n-way, synchronization for a plurality of file system trees 1002, 1004, 1006, and 1008. Arrows 1010, 1012, 1014, and 1016 illustrate an order in which the changes in the name or location of an object in one file system tree may be updated according to one another during an embodiment of a resynchronization operation.
In one example, the system 1000 may synchronize multiple file system copies that have been independently changed, with all changes being merged to all copies. For instance, synchronization may begin with synchronizing an object in the file system tree 1002 with the file system tree 1004 as described herein using mapping records. After the synchronization, file system trees 1002 and 1004 are identical and both contain the changes that were made on both objects.
The system 1000 may then synchronize file system tree 1004 with file system tree 1006, continuing in the same manner by synching file system tree 1006 with file system tree 1008. The resulting file system tree 1008 may thus include combined changes from all the other file system trees 1002, 1004, and 1006. The system 1000 may then synchronize file system tree 1008 with file system tree 1002. In this manner, all the object copies may be identical and include changes that were made on all other object copies.
In another example of synchronization using a master copy approach, the system 1000 may synchronize the file system tree 1002 (e.g., a master copy) with the file system tree 1004 based on a defined policy. The system may additionally synchronize the file system tree 1002 with the file system tree 1006 according to the defined policy. If a file is copied from the file system tree 1006 to the file system tree 1002, then the process may repeat by synchronizing the file system tree 1002 with the file system tree 1004 according to the defined policy. Where desired, synchronization may occur only for the files that were copied from the file system tree 1008 to the file system tree 1002 (e.g., master copy).
Where chained copies are implemented, the file system tree 1004 may hold, access, or otherwise include object IDs corresponding to the file system tree 1002 object IDs. The file system tree 1006 may include object IDs corresponding to the file system tree 1004 object IDs.
A policy may be defined by the user. For example, the system may prompt the user to define a policy that selects one copy as the master and that makes all other copies identical to the master copy. Another policy may cause the system to select from each copy the latest version. The system may use a copy of the latest version as a master copy with which to update all others. In response to a conflict, the system of another embodiment may prompt the user to select which of the copies will function as the master copy with which to update all others.
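By way of non-limiting illustration, the selection of a master copy under the two policies described above may be sketched in Python as follows. The class `ObjectCopy`, the policy names "master" and "latest", and the function `choose_master` are hypothetical; the handling of a conflict by prompting the user is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectCopy:
    tree: str        # label of the file system tree holding the copy (hypothetical)
    modified: float  # last-modification timestamp of the copy


def choose_master(copies, policy, master_tree=None):
    """Pick the copy that all other copies will be updated from."""
    if policy == "master":
        # User-designated master: the copy held by the named tree wins.
        for candidate in copies:
            if candidate.tree == master_tree:
                return candidate
        raise ValueError("no copy found in the designated master tree")
    if policy == "latest":
        # Latest-version policy: the most recently modified copy wins.
        return max(copies, key=lambda candidate: candidate.modified)
    raise ValueError("unknown policy: " + policy)


copies = [ObjectCopy("1002", 100.0), ObjectCopy("1004", 250.0), ObjectCopy("1006", 175.0)]
print(choose_master(copies, "latest").tree)          # 1004
print(choose_master(copies, "master", "1002").tree)  # 1002
```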
Fig. 11 is an embodiment of a method 1100 of maintaining coherence in file system objects. The method 1100 may be performed, for instance, by the systems 100 and 200 of Figs. 1 and 2. Turning more particularly to the flowchart, the system at 1102 may generate one or more copies of (and synchronized to) a single file tree. In so doing, the system may copy each object (e.g., a file or folder) to a corresponding target location within a target file system tree.
At 1104, the method 1100 may include mapping a 1st ID of an object in the source file tree to a corresponding 2nd ID of an object in the target file tree. An embodiment of the system uses records that map between the source object IDs and each corresponding target object ID. When copying objects from the source file system to the target file system, the system may also copy attributes, such as a last-modification date. In this manner, the attributes are identical in both the source and target file systems. The ID map or other mapping records may be stored in: an ordered log file, a key-value store, a database, an object on the source and/or target side, or as a special extended attribute on the second ID.
An embodiment of the system may at 1106 control the 1st and 2nd objects using different instances of a kernel. For instance, separate instances of the kernel 150 of Fig. 1 may be used to control the objects of the target and source file system trees.
When a need arises to resynchronize the source and target file systems, an embodiment of the method 1100 at 1108 may scan the 1st and 2nd file systems. The scan may be ultimately used to identify changes to the target and/or source file systems since the last synchronization.
The comparison to determine any such changes is represented at 1110 of the flowchart. As explained herein, the system may save mapping records for each file system tree. The mapping records may include an ID-map, as well as a path-map that includes paths associated with objects in each tree. By comparing the path-map of the source file system to the path-map of the target file system, and by correlating between the source and target objects using the ID-map, the system may identify any changes done on either file system tree since the latest synchronization.
In one implementation, the system may determine at 1112 if the source object ID and a corresponding target object ID exist in both file system trees. If so, the system may determine if the path and attributes of the target and source objects are equal. Where they are determined to be equal, the system may take no action towards synchronizing the objects and loop back to 1108.
Alternatively, the system may determine at 1112 that the path or attributes of the target and source objects are different. For example, the system may detect that an object was moved in either one or both locations by determining that the associated paths are different. Where an object includes multiple paths, those paths that differ as between file system trees may be deleted and hard links may be created for new paths. In another example, the system may determine that an object was created since a last synchronization operation by detecting that an object exists only in one location. Alternatively, an object may have been deleted since the last synchronization operation.
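By way of non-limiting illustration, the determinations made at 1112 may be sketched in Python as follows. The function name, the status labels, and the representation of each object as a (path, attributes) pair are hypothetical. The sketch reports only the first difference found for each object and does not distinguish a creation from a deletion when an object exists on one side only.

```python
def classify_objects(id_map, source_objects, target_objects):
    """Compare mapped objects; each *_objects dict maps an ID to (path, attributes)."""
    report = {}
    for source_id, target_id in id_map.items():
        source = source_objects.get(source_id)
        target = target_objects.get(target_id)
        if source is None and target is None:
            status = "missing on both sides"
        elif source is None or target is None:
            status = "exists on one side only"
        elif source[0] != target[0]:
            status = "moved or renamed"
        elif source[1] != target[1]:
            status = "attributes differ"
        else:
            status = "coherent"
        report[source_id] = status
    # Source objects without a mapping record have no counterpart on the target.
    for source_id in source_objects:
        report.setdefault(source_id, "exists on one side only")
    return report


id_map = {1: 11, 2: 12, 3: 13, 4: 14}
source_objects = {
    1: ("/", {"mtime": 100}),
    2: ("/A", {"mtime": 250}),
    4: ("/B/F", {"mtime": 100}),
    5: ("/G", {"mtime": 300}),
}
target_objects = {
    11: ("/", {"mtime": 100}),
    12: ("/A", {"mtime": 200}),
    13: ("/B", {"mtime": 100}),
    14: ("/C", {"mtime": 100}),
}
print(classify_objects(id_map, source_objects, target_objects))
```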
According to a particular embodiment, the method 1100 may identify at 1112 objects that were changed, in addition or as an alternative to comparing object attributes. For instance, the system may calculate a checksum value for each object. The checksum value may be included by the system as a parameter for object comparison.
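By way of non-limiting illustration, a checksum comparison may be sketched in Python as follows. The disclosure does not name a checksum algorithm; the use of SHA-256 and the function names are assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hexadecimal digest of the object data (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def checksums_differ(first_data: bytes, second_data: bytes) -> bool:
    """Report whether two objects differ according to their checksum values."""
    return checksum(first_data) != checksum(second_data)


print(checksums_differ(b"same", b"same"))     # False
print(checksums_differ(b"same", b"changed"))  # True
```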
In another example, the system may detect an incoherency by comparing a first location of a first FS object within a first path to a second location of a second FS object within a second path. In another or the same embodiment, the system may compare one or more paths associated with the first ID to multiple corresponding paths associated with the second ID. Another manner of determining an incoherency may include comparing an attribute associated with the first ID to a corresponding attribute associated with the second ID, or comparing the plurality of mapping records to a subsequently generated plurality of mapping records.
At 1114, the method 1100 may include applying a defined policy to determine how to merge or otherwise synchronize the file system trees. According to a particular embodiment, the system may choose or prompt a user to choose the synchronization direction (e.g., source to target or vice versa). The system may prompt the user to choose a resynchronization result. For example, the user may be prompted to choose the file system on the source side to be copied, thus making the target file system identical to the source file system. In another example, the user may be prompted to have the system choose a latest (e.g., most recently modified) copy as between corresponding objects of the target and source file systems. In another example, the system may prompt the user to select a correct version of the object. For instance, where modifications were made to multiple corresponding objects before a resynchronization, a user may select the changes made to one of the objects as the version to be copied to the other objects.
At 1116, the system may associate the 2nd ID with a new name and/or location (e.g., path). The association may be used to update the mapping records at 1118 to synchronize the 1st and 2nd file systems. That is, the system may initiate a synchronization operation at 1118. Examples of synchronization operations may include copying the source version, a latest version, or some combination that includes merging the two versions. Other settings may include removing files that have been deleted since a last synchronization operation, or always synchronizing to the source location.
Fig. 12 is a block diagram of an example computing device 1200 that may optionally be utilized to perform one or more aspects of techniques described herein. In some implementations, one or more of a client computing device, user-controlled resources engine, and/or other component(s) may comprise one or more components of the example computing device 1200.
Computing device 1200 typically includes at least one processor 1214 that communicates with several peripheral devices via bus subsystem 1212. These peripheral devices may include a storage subsystem 1224 that includes, for example, a memory subsystem 1225 and a file storage subsystem 1226, as well as user interface output devices 1220, user interface input devices 1222, and a network interface subsystem 1216. The user interface input devices 1222 of an implementation may include a response volume setting, among other features. The input and output devices allow user interaction with computing device 1200. The network interface subsystem 1216 provides an interface to outside networks and is coupled to corresponding interface devices in other computing devices.
The user interface input devices 1222 may include a keyboard, pointing devices such as a mouse, trackball, touchpad, or graphics tablet, a scanner, a touchscreen incorporated into the display, audio input devices such as voice recognition systems, microphones, and/or other types of input devices. In general, use of the term "input device" is intended to include all possible types of devices and ways to input information into computing device 1200 or onto a communication network.
User interface output devices 1220 may include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices. The display subsystem may include a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image. The display subsystem may also provide non-visual display such as via audio output devices. In general, use of the term output device is intended to include all possible types of devices and ways to output information from computing device 1200 to the user or to another machine or computing device.
The storage subsystem 1224 stores programming and data constructs that provide the functionality of some or all the modules described herein. For example, the storage subsystem 1224 may include the logic to perform selected aspects of the method and to implement various components depicted in the preceding figures.
These software modules are generally executed by processor 1214 alone or in combination with other processors. The memory subsystem 1225 used in the storage subsystem 1224 may include a number of memories including a main random access memory (RAM) 1230 for storage of instructions and data during program execution and a read only memory (ROM) 1232 in which fixed instructions are stored. A file storage subsystem 1226 can provide persistent storage for program and data files, and may include a hard disk drive, a floppy disk drive along with associated removable media, a CD-ROM drive, an optical drive, or removable media cartridges. The modules implementing the functionality of certain implementations may be stored by file storage subsystem 1226 in the storage subsystem 1224, or in other machines accessible by the processor(s) 1214.
The bus subsystem 1212 provides a mechanism for letting the various components and subsystems of computing device 1200 communicate with each other as intended. Although the bus subsystem 1212 is shown schematically as a single bus, alternative implementations of the bus subsystem may use multiple busses.
The computing device 1200 can be of varying types including a workstation, server, computing cluster, blade server, server farm, or any other data processing system or computing device. Due to the ever-changing nature of computers and networks, the description of computing device 1200 depicted in Fig. 12 is intended only as a specific example for purposes of illustrating some implementations. Many other configurations of computing device 1200 are possible having more or fewer components than the computing device depicted in Fig. 12.
The terms "comprises", "comprising", "includes", "including", "having" and their conjugates mean "including but not limited to". The term "consisting of" means "including and limited to". The term "consisting essentially of" means that the composition, method, or structure may include additional ingredients, steps and/or parts, but only if the additional ingredients, steps and/or parts do not materially alter the basic and novel characteristics of the claimed composition, method, or structure. As used herein, the singular form "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a compound" or "at least one compound" may include a plurality of compounds, including mixtures thereof.
Throughout this application, various embodiments of this disclosure may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
Whenever a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range. The phrases “ranging/ranges between” a first indicate number and a second indicate number and “ranging/ranges from” a first indicate number “to” a second indicate number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals there between.
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination or as suitable in any other described embodiment of the invention. Certain features described in the context of various embodiments are not to be considered essential features of those embodiments, unless the embodiment is inoperative without those elements.
Although the disclosure has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications, and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.
It is the intent of the Applicant(s) that all publications, patents, and patent applications referred to in this specification are to be incorporated in their entirety by reference into the specification, as if each individual publication, patent, or patent application was specifically and individually noted when referenced that it is to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention. To the extent that section headings are used, they should not be construed as necessarily limiting. In addition, any priority document(s) of this application is/are hereby incorporated herein by reference in its/their entirety.
Claims
1. A system to maintain coherence in file system (FS) objects, comprising: a memory comprising: a plurality of mapping records each having a first identifier (ID) and a second ID for associating a first FS object stored in a first storage device with a second FS object stored in a second storage device, wherein the first FS object and the second FS object are controlled by different instances of an FS kernel; and a processor configured to access the memory to: identify an incoherency between the first FS object and the second FS object; and update a respective mapping record from the plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
2. The system of claim 1, wherein the processor identifies the incoherency by identifying a difference between respective locations of the first and second FS objects.
3. The system of claim 1, wherein the processor identifies the incoherency by identifying a difference between respective names of the first and second FS objects.
4. The system of claim 1, wherein the processor is further configured to synchronize the second FS object to a third FS object of a third storage device by associating a third ID with the modified name or the modified location associated with the second ID.
5. The system of claim 4, wherein the processor is further configured to identify a second incoherency between the modified name or the modified location and a corresponding name or location of the third FS object.
6. The system of claim 4, wherein the processor is further configured to synchronize the first FS object to the third FS object by associating the first ID with a second modified name or second modified location associated with the third ID.
7. The system of claim 1, wherein the processor is further configured to synchronize the plurality of mapping records by performing the updating in response to a synchronization operation.
8. The system of claim 1, wherein identifying the incoherency includes comparing a first location of the first FS object within a first path to a second location of the second FS object within a second path.
9. The system of claim 1, wherein identifying the incoherency includes comparing one or more paths associated with the first ID to a plurality of corresponding paths associated with the second ID, wherein the plurality of corresponding paths is associated with multiple hard links of the modified location of the second FS object.
10. The system of claim 1, wherein identifying the incoherency includes comparing an attribute associated with the first ID to a corresponding attribute associated with the second ID.
11. The system of claim 1, wherein identifying the incoherency includes comparing a first checksum value associated with a first path of the first ID to a second checksum value associated with a second path of the second ID.
12. The system of claim 1, wherein identifying the incoherency includes comparing the plurality of mapping records to a subsequently generated plurality of mapping records.
13. The system of claim 1, wherein the processor is further configured to reassign the first ID to a third FS object, wherein the first FS object has been deleted.
14. The system of claim 1, wherein changes are made to both the first and second memories prior to a resynchronizing operation, and wherein the processor is further configured to prompt a user to select the first FS object over the second FS object to be copied.
15. The system of claim 1, wherein changes are made to both the first and second memories prior to a resynchronizing operation, and wherein the processor is further configured to prompt a user to select a most recently updated FS object to be copied as between the first FS object and the second FS object.
16. The system of claim 1, wherein changes are made to both the first and second memories prior to a resynchronizing operation, and wherein the processor is further configured to prompt a user to select a correct version to be copied as between the first FS object and the second FS object.
17. The system of claim 1, wherein the processor is further configured to prompt a user to designate the first FS object as a master copy to be copied to multiple FS objects.
18. The system of claim 1, wherein the processor is further configured to store the plurality of mapping records in at least one of: an ordered log file, a key-value store, a database, the first FS object, the second FS object, or as a special extended attribute on the second identifier.
19. The system of claim 1, wherein the processor is further configured to store the plurality of mapping records in a mapping table.
20. A method of modifying a name or location of a file system (FS) object to be synchronized with that of another FS object, the method comprising: mapping a first identifier (ID) associated with a first FS object of a first storage device to a second ID associated with a second FS object of a second storage device; controlling the first FS object and the second FS object using different instances of an FS kernel; identifying an incoherency between the first FS object and the second FS object; and updating a respective mapping record of a plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
21. The method of claim 20, wherein identifying the incoherency further comprises scanning the first and second memories prior to a synchronizing operation.
22. The method of claim 20, further comprising updating a respective name or a respective location of each of a plurality of FS objects comprising copies of the first object.
23. The method of claim 20, wherein the first and second FS objects are included within a plurality of n FS objects, the method further comprising associating the modified name or the modified location with an n-1 name or an n-1 location associated with an n-1 ID of the plurality of n FS objects, and associating the modified name or the modified location of the n-1 ID with an nth name or an nth location associated with an nth ID of the plurality of n FS objects.
24. The method of claim 23, further comprising associating the modified name or the modified location of the nth ID with a first name or a first location associated with the first ID.
25. The method of claim 20, wherein mapping the first ID further comprises mapping at least one of an inode or a file ID.
26. At least one non-transitory computer-readable medium comprising instructions that, in response to execution of the instructions by one or more processors, cause the one or more processors to perform the following operations: mapping a first identifier (ID) associated with a first file system (FS) object of a first storage device to a second ID associated with a second FS object of a second storage device; controlling the first FS object and the second FS object using different instances of an FS kernel; identifying an incoherency between the first FS object and the second FS object; and updating a respective mapping record of a plurality of mapping records to associate the second ID with a modified name or a modified location corresponding to the identified incoherency.
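The mapping, identifying, and updating operations recited in claims 1, 20, and 26 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `MappingRecord`, `classify`, `find_incoherencies`, and `update_records`, the integer IDs, and the example paths are all hypothetical. The sketch assumes each mapping record stores the path last recorded for the second ID, and that the current path of each first FS object can be looked up by its first ID. It only updates the mapping record and does not rename or move anything on a storage device.

```python
import posixpath
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MappingRecord:
    """Hypothetical mapping record pairing two FS object identifiers."""
    first_id: int     # e.g. an inode number on the first storage device
    second_id: int    # e.g. an inode number or file ID on the second device
    second_path: str  # name and location currently recorded for second_id


def classify(first_path: str, second_path: str) -> List[str]:
    """Report whether two paths differ in location, in name, or in both."""
    first_dir, first_name = posixpath.split(first_path)
    second_dir, second_name = posixpath.split(second_path)
    kinds = []
    if first_dir != second_dir:
        kinds.append("location")
    if first_name != second_name:
        kinds.append("name")
    return kinds


def find_incoherencies(
    records: List[MappingRecord], first_paths: Dict[int, str]
) -> List[Tuple[MappingRecord, str, List[str]]]:
    """Collect records whose second path no longer matches the first object."""
    stale = []
    for rec in records:
        first_path = first_paths.get(rec.first_id)
        if first_path is None:
            continue  # first object missing; deletion is outside this sketch
        kinds = classify(first_path, rec.second_path)
        if kinds:
            stale.append((rec, first_path, kinds))
    return stale


def update_records(stale: List[Tuple[MappingRecord, str, List[str]]]) -> None:
    """Associate each second ID with the modified name or location."""
    for rec, first_path, _kinds in stale:
        rec.second_path = first_path


# Usage: object 101 was renamed on the first device; object 102 is unchanged.
first_paths = {101: "/docs/report_v2.txt", 102: "/img/logo.png"}
records = [
    MappingRecord(101, 9001, "/docs/report.txt"),
    MappingRecord(102, 9002, "/img/logo.png"),
]
stale = find_incoherencies(records, first_paths)
update_records(stale)
```

In this sketch the comparison is a plain path comparison, in the manner of claims 2, 3, and 8. The attribute and checksum comparisons of claims 10 and 11 would replace `classify` with a different predicate while leaving the record update unchanged.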
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2022/078006 WO2024078677A1 (en) | 2022-10-09 | 2022-10-09 | Mapping identifiers to maintain name and location coherency in file system objects |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024078677A1 (en) | 2024-04-18 |
Family
Family ID: 84329871
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2022/078006 WO2024078677A1 (en) | 2022-10-09 | 2022-10-09 | Mapping identifiers to maintain name and location coherency in file system objects |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024078677A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110295805A1 (en) * | 2010-05-28 | 2011-12-01 | Commvault Systems, Inc. | Systems and methods for performing data replication |
US20130282658A1 (en) * | 2012-04-23 | 2013-10-24 | Google, Inc. | Sharing and synchronizing electronically stored files |
Non-Patent Citations (1)
Title |
---|
TAO VINH VINH TAO@LIP6 FR ET AL: "Merging semantics for conflict updates in geo-distributed file systems", PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR '15, ACM PRESS, NEW YORK, NEW YORK, USA, 26 May 2015 (2015-05-26), pages 1 - 12, XP058512370, ISBN: 978-1-4503-3621-5, DOI: 10.1145/2757667.2757683 * |
Similar Documents
Publication | Title |
---|---|
US10254996B1 (en) | Fast migration of metadata |
EP2619695B1 (en) | System and method for managing integrity in a distributed database |
US9639429B2 (en) | Creating validated database snapshots for provisioning virtual databases |
US8229893B2 (en) | Metadata management for fixed content distributed data storage |
EP1782289B1 (en) | Metadata management for fixed content distributed data storage |
US11449391B2 (en) | Network folder resynchronization |
US11176102B2 (en) | Incremental virtual machine metadata extraction |
US7606842B2 (en) | Method of merging a clone file system with an original file system |
US12072853B2 (en) | Database schema branching workflow, with support for data, keyspaces and VSchemas |
US11704335B2 (en) | Data synchronization in a data analysis system |
US8812445B2 (en) | System and method for managing scalability in a distributed database |
EP4172777A1 (en) | Updating a virtual machine backup |
WO2024078677A1 (en) | Mapping identifiers to maintain name and location coherency in file system objects |
AU2011265370B2 (en) | Metadata management for fixed content distributed data storage |
CN115481198A (en) | Data table synchronization method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22801077; Country of ref document: EP; Kind code of ref document: A1 |