GB2196764A - Hierarchical file system - Google Patents

Hierarchical file system

Info

Publication number
GB2196764A
GB2196764A GB08715199A GB8715199A GB2196764A GB 2196764 A GB2196764 A GB 2196764A GB 08715199 A GB08715199 A GB 08715199A GB 8715199 A GB8715199 A GB 8715199A GB 2196764 A GB2196764 A GB 2196764A
Authority
GB
United Kingdom
Prior art keywords
file
directory
tree
data
files
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB08715199A
Other versions
GB8715199D0 (en
Inventor
Bill M Bruffey
Gursharan Singh Sidhu
Patrick William Dirks
Christopher R Mcfall
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apple Inc
Original Assignee
Apple Computer Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Apple Computer Inc
Publication of GB8715199D0
Publication of GB2196764A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9017 Indexing; Data structures therefor; Storage structures using directory or table look-up
    • G06F16/902 Indexing; Data structures therefor; Storage structures using directory or table look-up using more than one table in sequence, i.e. systems with three or more layers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A hierarchical filing system provides a cataloging of data stored in various locations within a memory device. An upside-down tree type structure provides a downwardly expanding cataloging structure wherein directories provide for further branchings. A branching from a directory is terminated when a file is reached. Each directory is assigned a unique directory identifier value. Then, each file or directory is coupled with the directory identifier value of its parent to provide the interconnection necessary to form the cataloging structure. The complete cataloging structure is organized in the leaf nodes of a B-Tree structure and distributed in an ascending order of the key values to provide a systematic search for a given key. Each file is capable of storing a predetermined number of location descriptors when its data is segmented into non-contiguous segments in memory. A file extents record is used to maintain a record of any further segmentation. File location information is kept in the form of file extents descriptors in the leaf nodes of a separate File Extents B-Tree. This extents information is sorted in ascending order based on a key comprised of a unique file number and the file-relative starting block location of the file extent.
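The keying scheme in the abstract can be illustrated with a minimal Python sketch. It assumes the catalog's leaf records are held in a sorted in-memory list rather than in a real B-Tree, and it borrows the example names and identifier values (root "Volume" with DirID 2 and parent DirID 1, "Folder" with DirID 29, files A to D) from the description's Figure 6 discussion. The names `catalog`, `find` and `resolve`, and the placeholder payload strings, are hypothetical.

```python
from bisect import bisect_left

ROOT_PARENT_ID = 1  # the description gives the root directory a parent DirID of 1

# Hypothetical stand-in for the catalog B-Tree leaf records.
# Key: (parent DirID, name). Value: (record kind, payload).
# A directory's payload is its own DirID; a file's payload is a placeholder
# for its location information.
catalog = sorted(
    [
        ((1, "Volume"), ("dir", 2)),
        ((2, "A"), ("file", "location of A")),
        ((2, "B"), ("file", "location of B")),
        ((2, "Folder"), ("dir", 29)),
        ((29, "C"), ("file", "location of C")),
        ((29, "D"), ("file", "location of D")),
    ],
    key=lambda record: record[0],  # ascending by DirID, then by name
)
keys = [key for key, _ in catalog]


def find(parent_id, name):
    """Binary-search the sorted records for one exact key."""
    i = bisect_left(keys, (parent_id, name))
    if i < len(keys) and keys[i] == (parent_id, name):
        return catalog[i][1]
    return None


def resolve(path):
    """Follow a list of names from the root, one keyed lookup per level."""
    parent_id = ROOT_PARENT_ID
    for depth, name in enumerate(path):
        record = find(parent_id, name)
        if record is None:
            return None
        kind, payload = record
        if depth == len(path) - 1:
            return record
        if kind != "dir":
            return None  # a file cannot have offspring
        parent_id = payload  # descend: this DirID keys the next level
    return None
```

Each level of the path costs one keyed lookup instead of a serial scan, which is the point of pairing every name with its parent's identifier.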

Description

GB2196764A

SPECIFICATION

Hierarchical file system

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to the method of storing and retrieving data using a computer, and more specifically to a hierarchical filing system.

Prior Art

In a computer system, information is typically stored as signals on various storage mediums, such as magnetic tapes, disks, semiconductor devices, etc. As storage densities increased with advances in storage device technology, it became possible for a device to store much more information than previously. When information is stored on a device, it is cataloged so that the same information is later retrieved when desired. Normally, a unique code name is attributed to a particular body of data to differentiate it from others. To retrieve a desired body of data, an appropriate code name associated with that data is used, wherein the device searches for that code name and retrieves the desired data when that code name is found.

It is well-known in the prior art that each separate body of data is termed a file and the cataloging of these files on a device is termed filing. Typically, code names associated with particular data contain pointers which point to areas in memory reserved for mass storage. The various code names and their pointers comprise the cataloging system. When high-density storage devices are used, millions of bits of information are capable of being stored on such a device, which permits hundreds, thousands, and even millions of files to be created. To search through these files in a serial fashion to look for a specific file is time-consuming.

It is appreciated that what is needed is a filing system for a high-density storage medium which rapidly searches and retrieves the desired file stored. Further, with the advent of the personal computer (PC) and the small business computer, where physical size is a concern, it is desirable to have a filing system which may be implemented in a lesser line of program, yet be effectual.

SUMMARY

A method for providing a hierarchical filing system is described. The hierarchical filing system provides a catalog of the data stored in various locations within a memory device. Typically, one cataloging structure is used to organize a volume of memory.

The cataloging structure of the hierarchical filing system is provided by an upside-down tree type structure wherein there is a starting directory which operates as a root directory. Other directories and files emanate as offspring. A plurality of descendant levels branch downward to provide the hierarchical structure of the catalog. The cataloging structure contains the location information of where the actual data is stored.

The file cataloging system is implemented using a B-Tree. The cataloging information is kept in the leaf nodes of the B-Tree. The non-leaf nodes (index nodes) of the B-Tree contain information that allows searching for particular catalog information by using the code name or key of the corresponding file. Key values, which are used to identify and catalog various files in the cataloging system, are also used to organize the catalog in the leaf nodes of the B-Tree. The keys are placed in an ascending order for systematic access. Further, the B-Tree grows by using left rotates and left splits with insertion of catalog information about new files from the right to maintain a balanced tree.

When a file's data is stored, additions, deletions and modifications will typically result in non-contiguous physical storage of the data in the memory device. Each of the contiguous segments of the file is known as a file extent. A record of the physical location of the extents for a particular file is maintained in one or more extents records. The hierarchical filing system uses a file extents list to maintain the extents records of the various files on the memory device.

The present invention maintains the first extents record of a file in the cataloging structure, but any further extents records are maintained in a separate file extents list. This file extents list is also implemented in a second B-Tree structure.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a representation of a prior art flat filing system.

Figure 2 is a representation of a hierarchical filing system of the present invention.

Figure 3 is a representation of a B-Tree structure of the present invention.

Figure 4 is a representation of contents of a node for the B-Tree structure of Figure 3.

Figure 5 is a representation of a left-split and a left-rotate operation of a B-Tree structure of the preferred embodiment.

Figure 6 is a representation of a cataloging structure of the preferred embodiment and an organization of the cataloging structure in various nodes of a B-Tree.

Figure 7 is a representation of a volume allocation mapping in a filing system of the preferred embodiment.

Figure 8 is a representation of a file extents list of the preferred embodiment and showing various file extents in memory.

Figure 9 is a representation showing the file extents organization in the Catalog and Extents B-Trees of the preferred embodiment.
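The Summary's statement that index nodes guide a keyed search down to the leaf nodes can be sketched in Python as follows. This is only an illustration under stated assumptions: each index key is taken to be the first key of the child it points to, and within a node the search picks the highest key that is not greater than the search key, as the detailed description lays out for Figure 3. The `Node` class, the `search` function and the stored strings are hypothetical. Keys 48, 50, 53, 56, 59 and 63 come from the text, while keys 54 and 70 are made-up fillers for the leaf slots the text does not specify.

```python
from bisect import bisect_right


class Node:
    """Hypothetical node: index nodes hold children, leaf nodes hold values."""

    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # kept in ascending order
        self.children = children  # set only on index nodes
        self.values = values      # set only on leaf nodes


def search(node, key):
    """Descend from the root; return the stored value or None."""
    while node.children is not None:
        # Highest key in this index node that is not greater than the search key.
        i = bisect_right(node.keys, key) - 1
        if i < 0:
            return None  # smaller than every key in the tree
        node = node.children[i]
    i = bisect_right(node.keys, key) - 1
    if i >= 0 and node.keys[i] == key:
        return node.values[i]
    return None


# Tree shaped like the Figure 3 discussion (node numbers in the comments).
leaf35 = Node([48, 50], values=["data-48", "data-50"])
leaf36 = Node([53, 54], values=["data-53", "data-54"])  # 54 is a filler key
leaf37 = Node([56, 59], values=["data-56", "data-59"])
leaf38 = Node([63, 70], values=["data-63", "data-70"])  # 70 is a filler key
index33 = Node([48, 53], children=[leaf35, leaf36])
index34 = Node([56, 63], children=[leaf37, leaf38])
root32 = Node([48, 56], children=[index33, index34])
```

Searching for key 59 selects key 56 in the root, then key 56 again in index node 34, and finally finds 59 in leaf node 37, matching the walk-through in the text.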
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention describes a method of storing and retrieving information using a hierarchical filing system. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be obvious, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods have not been described in detail in order not to unnecessarily obscure the present invention.

Referring to Figure 1, a prior art flat filing system 10 is shown having a directory 11 and files 12-15. For ease of understanding, a directory is shown pictorially as a folder and a file is shown as a sheet of paper with a folded corner. The pictorial representation applies well to an analogy of placing papers into folders (files into directories). In the prior art system 10, there is present a single directory 11, which contains locator information for files 12-15. Each of the files 12-15 contains data which would be associated with a specific body of stored information. In this particular example of a prior art system 10, to access file 15, a serial search is made through directory 11, until the file address of file 15 is located, such sequential search resulting in considerable lapse of time when substantial numbers of files exist in the directory 11. Although in this hypothetical example, directory 11 maintains pointer addresses to four files 12-15, directory 11 will continue to store addresses of subsequent files in a sequential fashion.

Figure 2 illustrates the architecture of the Hierarchical Filing System (HFS) of the present invention. This particular HFS 16 includes a root directory 17 and files 21-24. The HFS 16 also includes directories 18-20. Each directory is capable of containing files, as well as other directories such as directory 18 containing directory 20. Each directory is a branching node, allowing for none or a plurality of sub-branching nodes. Each directory contains information which permits the branching to occur. The actual data is stored in the files 21-24. Because each file is a termination node, it does not need to maintain further branching information. Instead, each file stores the actual data. Therefore, the directories 17-20 maintain branching information, while files 21-24 contain the stored data.

HFS 16 accesses files 21-24 in a hierarchical fashion so that serial search for the files is not necessary. Assume in the example of Figure 2 that access to data stored in file 23 is desired. A search of directory 17 reveals that two possible paths exist in seeking the address of file 23. One path from directory 17 leads to directory 18 and the other path leads to directory 19. The desirable path is to directory 18, at which point there are again two paths. The desirable path from directory 18 leads directly to file 23. Although this example is simplistic because of the minuscule number of files shown, one can appreciate the file search time saved when a substantially large number of files are present.

Further, as an example, if file 22 had been chosen, the path from directory 18 would have led to directory 20, at which point two paths exist from directory 20. The desirable path to file 22 from directory 20 then would have been chosen. HFS 16, although shown in a particular form in Figure 2, may have any number of levels (branchings) down from the root directory 17 as well as any number of branches from a particular directory. However, it is to be noted that all data is stored in the represented files 21-24 which are all located at the termination nodes of HFS 16.

In actuality, the cataloging architecture of the preferred embodiment contains cataloging locator description information in the HFS 16 structure. The catalog entries for files 21-24 contain pointers which provide locator descriptions to locate places in storage area where actual stored data is kept.

B-TREE

The HFS of the present invention is implemented using two B-Tree structures in the preferred embodiment, the Catalog B-Tree and the File Extents B-Tree. A B-Tree structure is well-known in the prior art and is described in The Art of Computer Programming Volume 3 (Sorting and Searching); by Donald E. Knuth; at Section 6.2.4; titled "Multiway Trees"; pp 471-479 (1973). The nodes of a B-Tree contain records, wherein each record is comprised of certain information, either pointers or data, and a key associated with that record.

Referring to Figure 3, a hypothetical B-Tree is illustrated. A basic feature of the B-Tree 31 is that data is stored only in leaf nodes 35-38. The internal nodes 32-34, also known as index nodes, contain pointers to other nodes such that these index nodes 32-34 provide an index for accessing the data records stored in the leaf nodes 35-38. Each record 39 includes a key 40 and an information segment 41. Within each node, the records are maintained so that their keys are in ascending order. The example B-Tree 31 of Figure 3 contains hypothetical keys which have been inserted to show the structure of the tree, and the relationship between index nodes 32-34 and leaf nodes 35-38. Leaf node 35 contains key values 48 and 50. The first key of a node is also represented as a key in its ascending node. Therefore key 48, which is the first key of leaf node 35, is also represented as a key within index node 33. Key 53, which is the first key of leaf node 36, is represented as the second key of index node 33. Also, because key 48 is the first key within index node 33, it is again represented as a key within index node 32. This pattern is repeated for each leaf node 35-38 and each ascending index node 32-34 for a B-Tree structure. Although Figure 3 shows only three levels and two keys per node, any number of keys per node, as well as any number of levels, may be chosen for a particular B-Tree structure. B-Tree 31 of Figure 3 as drawn is a hypothetical example for illustration purpose only.

When a data record is needed, the key of the desired record is provided. The search begins at the root node, which is also an index node. A search is performed within the node until the record with the highest key that is not higher than the search key is reached. Assume in the hypothetical example of Figure 3, that data with key 59 is to be selected. The search commences at the root node 32, wherein key 56 is selected because the value 56 is the highest key that is not greater than the search key itself. The pointer of key 56 selects index node 34, wherein the search continues within index node 34. Again, key 56 is chosen because it is the highest key that is not greater than the search key itself (the next key 63 is greater than the search key). The pointer of key 56 in index node 34 selects leaf node 37. Within leaf node 37, another search is made to identify search key 59. When search key 59 is found, its associated information (data) is used.

A particular pointer in an index record leads to another node one level down in the B-Tree 31. For example, node 32 to node 34. The process continues until a leaf node is reached whereupon its records are examined until the desired key is found. If the desired key is not present, then the search stops when a key larger than the search key is reached or when all the records in the leaf node have been examined. The key values may be numeric, alphabetical or alphanumeric.

Referring to Figure 4, it shows the structure of any of the nodes of a B-Tree of the present invention. Each node 42 includes a node descriptor segment 43, records segment 44, record offset segment 46, and can have a free space segment 45. Each node 42 begins with a node descriptor segment 43. NDNRECS 58 contains the number of records currently in the node. NDTYPE 54 indicates the type of node, either leaf or index node. NDHEIGHT 57 indicates the height of the node in the tree, wherein leaf nodes are chosen as level 1, and the index nodes just above them are at level 2, etc. NDBLINK 52 and NDFLINK 51 are used with B-Tree nodes as a way of quickly moving through the records of the various nodes at a given level. For each node, NDBLINK 52 contains a pointer to the previous node, and NDFLINK 51 contains a pointer to the subsequent node at the same level. In Figure 3, NDBLINK for node 36 would point to node 35 and NDFLINK for node 36 would point to node 37. Therefore, NDBLINK 52 and NDFLINK 51 are means of locating adjacent nodes without first reversing back up the B-Tree.

The records segment 44 contains the B-Tree's records, each with its key and pointer or data information. In this particular example, there are two records 60 and 61. The records in a node can be of variable length. For this reason, offsets to the beginning of each record are needed. The records segment begins immediately following the node descriptor segment 43. The records are followed by a free space segment 45, which is basically the unused space of the node. Therefore, free space segment may not exist in some instances.

The record offset segment 46 at the end of the node contains the offset information for records 60 and 61. Offset 68 contains offset information for record 60 and offset 67 contains offset information for record 61. Offset 66 contains the offset necessary to determine free space 62. Thus the record segment 44 builds downward into the free space segment 45, while the record offset segment 46 builds upward into the free space segment 45 from the opposite end.

If node 42 is an index node, then each record 60 and 61 is comprised of a key and pointer information. Further, NDFLINK 51 and NDBLINK 52 would contain adjacent index node linking pointers. If node 42 is a leaf node, then each record 60 and 61 is comprised of a key and data information. NDFLINK 51 and NDBLINK 52 would also contain leaf node linking pointers. It is also appreciated that although a particular format is illustrated for node 42, the format may be modified readily to include other types of information. Also, in the preferred embodiment data information in the leaf nodes of the HFS catalog B-Tree is used to address locations in memory where the actual data is stored.

Referring to Figure 5, a specialized B-Tree expansion architecture as implemented in the preferred embodiment is shown. A node 70, which is equivalent to node 42 of Figure 4, is shown having pointers to two lower-level nodes 71 and 73, which may be index or leaf nodes. Although only two nodes 71 and 73 are shown at the lower level, any number of nodes may reside at this lower level. Also in this particular hypothetical example, nodes 71 and 73 are only partially filled.

For a B-Tree to maintain its balance, records must be kept uniformly spaced within the hierarchical structure. An unbalanced tree will result when records are not maintained uniformly in each node or nodes are heavily stacked toward one branch of the B-Tree. The preferred embodiment uses a technique of left rotate and left splits to provide movement of records from one node to another to maintain a balanced Tree. When records are to be transferred to another node, the left rotate operation is used. In this instance, records in node 73 are left rotated to its left adjacent node 71, as shown by arrow 77.

If another node is needed, such as when records in node 73 must be rotated and node 71 cannot accommodate records from node 73, a left split operation is used to insert node 72 to the left of node 73, between nodes 71 and 73. In this instance, node 72 is inserted to link node 71 and node 73, as shown by arrows 78. When node 72 is inserted, appropriate pointer links will be established with its index node 70 as well as adjacent link pointers for nodes 71 and 73. Continually moving data leftward and inserting new data at the right extremities helps keep the B-Tree balanced. Because the HFS of the present invention is structured to have the ascending nodes organized in a rightward direction, the balancing is maintained even though the rotates and splits are made toward the left direction. It is appreciated that right splits and rotate operations, or balanced insertions using both right and left operations can be used as well. Although the preferred embodiment uses and attempts to maintain a balanced B-Tree for search efficiency, most any B-Tree structure can be used, including unbalanced B-Tree.

CATALOG TREE

Referring to Figure 6, a hypothetical catalog is used to illustrate the implementation of cataloging of the preferred embodiment. The structure 90 has a root directory 91 named "Volume". Each directory of the preferred embodiment is assigned a unique numerical identifier known as the directory identifier (DirID). The root directory of catalog 90 has DirID value of 2. Root directory 91 has three branches comprised of directory 92 and files 93 and 94. Directory 92 has a name of "Folder" and a DirID value of 29. In turn, directory 92 has two branches comprised of files 95 and 96. Files 93-96 are named "A", "B", "C" and "D", respectively in this example. The architecture of the directories and files follows the HFS structure as previously explained in Figure 2. The complete cataloging structure is stored as data records in various leaf nodes of the B-Tree of Figures 3 and 4 known as the catalog B-Tree. It is appreciated that the cataloging structure 90, although a tree, is in itself not a B-Tree. The form of structure 90 is actually stored in the various leaf nodes of a B-Tree. It is to be appreciated that the cataloging structure 90 not be confused with the previous description of the B-Tree. Catalog 90 and the B-Tree structure are two separate and distinct structures. The hierarchical structure of the catalog 90 is implemented as a B-Tree structure and stored as data records in leaf nodes of a B-Tree similar to that of Figures 3 and 4.

The hierarchical catalog structure 90 is stored in a storage device as shown by a memory map 97 of Figure 6. Cataloging map 97 is comprised of three possible types of records: directory records 100, file records 101, and thread records 102. Each record 100-102 is comprised of a key 103 and information segment 104, as earlier described in the description of a leaf node of a B-Tree. The key 103 of each record is comprised of a value 105 and a name 106. The key 103 of a directory record, such as that of 91 and 92, is comprised of its directory name 106 and its parent directory's DirID value 105. An information segment 104 of each directory record, such as that of directories 91 and 92, is comprised of the directory's DirID value 107. For directory 92, the directory's DirID has been given the value of 29, and has a name of "Folder". The parent DirID of record 92 has been given the value 2 because directory 92 is an offspring of directory 91 in the structure 90. Directory record 91 has a directory DirID value of 2, with a corresponding name of "Volume". Because directory 91 is a root directory, the parent DirID value has been given the value of 1, wherein the value 1 refers to the foundation of the filing system itself.

A file record, such as file records 93-96, is also comprised of a key 113 and an information segment 114, wherein key 113 is also comprised of a parent DirID value and a name. However, in the information segment 114, the descriptive location information for the actual stored file data is maintained as well as a unique file number. The information segments 114 of file records 93-96 contain the descriptive location of the actual stored data information.

File record 94, having a file name of B, and file record 93, having a file name A, both have a parent DirID value of 2. The parent DirID value of 2 signifies that files A and B are direct offsprings of directory "Volume" having a DirID value of 2. File 95, having a name C, and file 96, having a name D, have parent DirID values of 29, which reflect the origination of files C and D as offsprings of directory 92 labeled "Folder", having a DirID value of 29. Therefore, by looking at any file or a directory record's key 103, the stored information provides the identification of the name of that particular record as well as the DirID value of the parent node.

To provide the interconnection of the different branches, a thread record 102 is provided for each directory. The key of a thread record contains a DirID value and a null-name, which is equivalent to having no name at all. In the example of Figure 6, thread record 108 provides the connection between the directory "Folder" and files C and D. In the key 111 of thread record 108, only the directory DirID value of "Folder" is given. In the information segment 112 of thread record 108, the DirID of "Folder"'s parent and the directory's name "Folder" are given. Therefore, when file C, having a parent DirID 29, attempts to link to its immediate parent directory 92, which has a DirID of 29, the thread record 108 provides the name (Folder) of the parent directory 92, as well as the parent DirID value of directory 92, which is equal to 2.

Equivalently thread record 109 provides the name (Volume) of directory 91 as well as its parent directory DirID value for the three offsprings 92-94 of directory 91. By having directory records 91-92, file records 93-96, along with thread records 108-109 for each directory, the cataloging structure 90 is interconnected into a HFS, wherein the descriptive location information for the actual stored data is stored in file records 93-96 as shown in the structure 97 of Figure 6.

By implementing the cataloging structure 90 using a B-Tree structure, the hierarchical configuration of structure 90 is easily stored in the leaf nodes of a B-Tree of the earlier description. For example, when file C is to be accessed by a computer, the system will implement a B-Tree search. Referring to the catalog example 90 of Figure 6, when file with name C is to be found, the search path must be specified for this search. This can be given in terms of a sequence of the names of all directories on the path from the root to the said file, thus "Volume", followed by "Folder", and finally "C". The search begins by finding the directory record in the Catalog B-Tree that corresponds to "Volume". Its name is "Volume" and since it is the root, its parent DirID value is 1. The catalog B-Tree is searched for a directory record with key <1> Volume; thus, directory record 91 is found. Its information segment then provides the DirID value 2 of this directory. Now a search is made through the B-Tree for the record with key <2> Folder which leads to the directory record 92, whose information segment provides this directory's DirID value of 29. Thus now a search of the B-Tree is made to find the data record with key <29> C. This immediately leads the search to the file record 95, whose information segment contains the information about the physical location of the data contained in the desired file.

It will be appreciated that the specification of the file of the above example could start with the DirID value of any directory on the path from the root to the desired file, and would then consist of this DirID value and the sequence of names of the directories on the balance of the path from that directory to the desired file. The search mechanism followed is an obvious variant of the one indicated above.

Although cataloging structure 90 is a simplified structure and Figure 6 only shows the presence of a single structure having a single root directory 91, a cataloging structure may be enlarged manifold. The preferred embodiment uses one HFS cataloging structure per memory device, such as a disk. However, such a disk can be partitioned and an HFS catalog assigned to each such partition.

The catalog records of structure 97 of Figure 6 are stored as the data records in the leaf nodes 42 of Figure 4 of a catalog B-Tree. These records are inserted and maintained in the catalog B-Tree in ascending alphanumeric order. Thus, if the leaf nodes of the B-Tree are traversed from left to right, the data records will be encountered in the order shown in structure 97 of Figure 6. This order maintains the records in ascending order first by the DirID value part of the key. Then, among records with the same DirID value in their keys, the order is alphabetical on the name part of the key.

It is also appreciated that other pertinent information may be stored in the various records besides what has been disclosed in Figure 6. For example, directory and file records of the present invention maintain flags, date and time of creation of the directory or the file, as well as the date and time of last modification. Also, file records include such items as flags for locking the file, values to set logical and physical end of files, and size of the file.

FILE EXTENTS TREE

As already noted, the catalog B-Tree's file record of a particular file contains information about the locations in the memory device where the file's data is stored. The memory device is considered to be a sequentially numbered collection of blocks. A series of contiguous memory blocks is called an extent. Ideally, a file would be stored in a single extent having a contiguous memory allocation space. However, due to the size of certain files, as well as subsequent additions, deletions and modifications to existing files, files are usually stored in more than one allocated area of the memory. Except in the case of preallocated or small files, the contents of a particular file are usually stored in more than one extent, separated into non-contiguous sections on a volume. Each file extent can be identified by an extent descriptor. Thus, the complete location information of a particular file is a sequential extents list consisting of the extent descriptors of the various extents containing the file's data.

The file extents list of the present invention is organized also as a B-Tree, known as the File Extents B-Tree, and records the volume location and size of the various extents that comprise the files. Although most any memory allocation system can employ the file extents record of the present invention, a specific memory allocation system is described to illustrate the file extents record of the preferred embodiment.

Referring to Figure 7, a memory volume 120 which is a portion of a memory device, such as a hard disk, is shown. Volume 120 is segmented into a number of logical blocks 126. Typically, each logical block 126 is comprised of a predetermined fixed number of bytes, such as 512 bytes for the preferred embodiment. A fixed number of logical blocks starting at block 0 and ending at block n is reserved for volume information. The balance of the memory device starting at block n+1 is available for data storage and this storage area is separated into allocation units, wherein each allocation unit is comprised of one or more contiguous logical blocks.

Volume 120 includes four areas 121-124. System start-up area 121 contains certain configurable system parameters which are well-known in operating a disk or other memory devices. Volume information area 122 contains information regarding the housekeeping parameters of the volume, such as number and size of each allocation unit. Volume bit

separate B-Tree from the earlier described catalog B-Tree. Each data record of this extents B-Tree consists of a key and an information segment as before in the discussion of Figures 3-5. The information segment of a File Extents B-Tree data record is comprised of a sequence of extents descriptors of a particular file. The maximum number of extents descriptors in such a record can vary from implementation to implementation, but in the preferred embodiment is set to three. The key of the File Extents B-Tree record consists of two fields: the file number of the particular file and the file relative position of the starting block of the first extent descriptor in that record.

These extents records are kept in the leaf nodes of the Extents B-Tree sorted in ascending order first on the file number field and then on the file relative position of the starting block. This allows efficient search through the B-Tree for the location information of data at a particular file relative position.
map 123 maintains record of each allocation In actuality, the preferred embodiment unit on the volume 120 and uses a bit map to stores three extents descriptors, base plus designate use or non-use of each allocation 90 two subsequent extents descriptors, the infor unit. mation data segment 114 of the file's catalog Commencing at block n + 1, a file content B-Tree record such as 94 of Figure 6. There- area 124 extends to the end of the Volume fore, in the example of Figure 8, extent des 120. File content area 124 is separated into a criptors 125a, 126a and 127a are kept in the number of allocation units, wherein each allo- 95 information segment of the cataloging struc cation unit is comprised of a fixed number of ture and extents 128a-131a are kept in the logical blocks. While the bit map 123 main- File Extents B-Tree as shown in Figure 9. Per tains volume space management, it does not mitting limited extent information to be kept in provide file mapping. The file mapping func- the data segments of a cataloging structure tion is provided by the file extents lists. 100 permits faster access to data. Only when a Referring also to Figure 8, a portion of file file contains four extents or more, will it need contents area 124 is shown containing infor- to consult the File Extents B-Tree. It should be mation attributed to a file labeled file E. In this appreciated that the number of extents which hypothetical example the entire contents of file are kept in the file's Catalog B-Tree record E are separated into seven extents 125-131. 105 without using a File Extents B-Tree is arbitrary The first portion of the file is stored in base and can be changed without departing from extent 125, the subsequent portions of the the spirit and scope of the invention.
file are distributed accordingly in extents 2-7 Also referring to Figure 9, it shows a ca- which are labelled 126-131. File E has seven talog file record 145 and File Extents B-Tree extents 125-131 which are not physically con- 110 records 143 and 144. As explained in the tiguous. To maintain file extents information structure of B-Trees of the present invention, an extent descriptor 140 is used for the base each record 143 and 144 is comprised of a extent 125 and each of the subsequent ex- key 148 and 149 and extents list 146 and tents 126-131 of file E. 147, respectively. To locate a certain portion Extent descriptor 140 is comprised of a 115 of the data of a particular file, first the Catalog starting allocation unit number 141 and num- B-Tree is searched for the corresponding file ber of allocation units 142. File E extents list record. From this file record's information seg 135, which is comprised of seven extent des- ment, the file number is extracted. Also, the criptors 125a-131a, provides information as first three extent descriptors in the information to the address and length of each extent 125segment of the catalog B-Tree file record are 131 of file E. For example, the fourth extent examined. If the required file data is contained 128, which has a starting allocation address within the corresponding extents, then the lo of 189 and is only two allocation blocks long, cation information is now readily available. If has a value of 189 in field 141 and a value of however, the desired file data is located in
2 in field 142 of descriptor 128a. 125 extents beyond the three in the catalog's file
Extents descriptors of all files in a volume record, then a search is made of the File Ex- are maintained in the present invention in the tents B-Tree using as a search key the file data records contained in the leaf nodes of 13- number and the computed file relative block Tree such as of Figures 3-5. This tree is position of the desired data. This search will known as the File Extents B-Tree and is a 130 lead to the file extent's BTree record contain- 7 GB2196764A 7 ing the desired location information. medium may be used.
The example using file E is comprised of 22 Thus, a hierarchical filing system for use blocks and having an arbitrary file number with a large capacity memory device in de equal to 20. The extent descriptors contained scribed.
in the catalog file record 145 for file E provide 70

Claims (1)

  1. the location information for the first 3-extents CLAIMS which in turn
    comprises the first 9 blocks 1. In a process where information is stored (3+5+1) of the file. The location information on a memory device, a method for preparing for the remaining 13 blocks (2+3+1+7) of a computer program for cataloging said infor the file is contained in two data records 143 75 mation, comprising the steps of:
    and 144 within the File Extents B-Tree. As- grouping said information into a plurality of sume that the desired data is at file relative files; block position 13 within file E. The extent implementing a hierarchical structure which descriptors contained in the file's catalog re- has a beginning node, a plurality of termina- cord are examined first. Since relative block 80 tion nodes, and a plurality of intermediate 13 is greater than the number of blocks lo- nodes arranged at various subsequent levels cated by the extent descriptors in the file's from said beginning node and interconnecting catalog record, the File Extent B-Tree is some of said termination nodes to said begin searched. The key used for the B-Tree search ning node, such that there is only one inter for relative block position 13 is <20,13> 85 connecting path from said beginning node to Since the key value of "13" is greater than each of said termination nodes; the value -9- of key 148 for the first Files placing location description information for Extents B-Tree record 143 for file E and is each of said files in a predetermined termina less than the value " 15" of key 149 for the tion node, such that each of said termination second record 144, the search results with a 90 nodes includes its associated file location de 11 not found" result but positions to the sec- scription and provides said location description ond B-Tree record 144. By retrieving the pre- for retrieving its associated file; vious record 143 of key 148, the extent des- assigning a unique value to each of said criptor for relative block 13 is obtained. The files; value of -9- for key 148 is derived because 95 whereby said information for a particular file extents list 146 starts at the tenth relative is retrieved by searching for its associated block (allocation unit number 9). The value of value in said hierarchical structure.
    "15" for key 149 is derived because extents 2. The method defined by Claim 1 further list 147 starts at the sixteenth relative block comprising the steps of:
    (allocation unit number 15). 100 implementing a B-Tree structure; and placing each of said unique values and its Implementation associated location description information in a The HFS of the present invention is imple- predetermined leaf node-of said B-Tree.
    mented in a computer which is coupled to a 3. The method defined by Claim 2 wherein memory device, such. as a disk, having an 105 said placing of said unique values in said leaf ability of storing millions of bits of informa- nodes comprises the step of:
    tion, although any storage medium can use arranging said unique values in an ascending the HFS. Typically, the HFS of the present order in said leaf nodes.
    invention provides the cataloging of various 4. The method defined by Claim 3 wherein groupings of data, such as files, which are 110 said placing of said location description infor stored on the disk. mation in said predetermined termination The preferred embodiment implements data nodes further comprises the step of:
    storage by the use of a cataloging structure providing for several location description in- previously described to catalog data stored on formation for each file when said information a large capacity memory device. It also main- 115 for a respective file is segmented, into a plural tains a file extents record of up to three ex- ity of physically non- contiguous segments on tents per file in the catalog. Subsequent exsaid memory device.
    tent information is stored in a separate file 5. In a process where information is catal- extents record. Both the catalog record 'and oged in a filing system, a method for prepar the extents record are maintained using two 120 ing a computer program for providing said fil B-Trees of the earlier described B-Tree struc- ing system, comprising the steps of:
    ture. ordering a hierarchical nodal structure which The HFS as described in the preferred em- has a root directory, a plurality of branching bodiment is controlled by a combination of directories and a plurality of files, wherein hardware and software in a computer system. 125 each said file traces a singular path from itself The HFS controlling routines are stored Jn a to said-root directory such that said singular separate storage device than the device used path can transition through said branching di for storing the actual data. The preferred em- rectories; bodiment stores the routines in a read only assigning a unique identification, value to memory (ROM), although most any storage 130 each of said directories; 8 GB2196764A 8 assigning a unique identification name to itself to said root directory; each of said files; assigning a unique key value to each of said placing location description information of directories to distinguish said directories; stored data in its corresponding file wherein structuring a plurality of files within said hi- each of said files references a particular 70 erarchical structure wherein each of said files grouping of said stored data; branch from its associated directory; each of whereby said particular grouping of said said files having a unique identifying name is stored data is cataloged by its corresponding associated with a particular grouping of data name in said hierarchical structure. stored in said memory device; 6. 
The method defined by Claim 5 further 75 placing location description information for comprising the steps of: each of said particular grouping of data in its implementing a B-Tree structure having a respective file; root index node, a plurality of branching index placing in each directory and file said key nodes arranged at various subsequent levels value of its parent directory such that said from said root index node and terminating in a 80 singular path is determined by referencing said plurality of leaf nodes; and key value of said parent directory; ordering said hierarchical nodal structure in retrieving said particular grouping of data by said leaf nodes by: traversing downward through said hierarchical associating each of said names for each of structure to said respective file by starting at said files with one of said values of a corre- 85 any directory along said respective path, sponding directory which is immediately above wherein said file provides said location de in said singular path; scription information; associating each of said value for each of whereby search and retrieval of stored data said directory with a value of a corresponding is conducted by a systematic and hierarchical directory which is immediately above in said 90 technique.
    singular path; 10. The method defined by Claim 9 further provides linking of said files and directories,- comprising the steps of:
    such that each of said files can be accessed implementing a B-Tree structure having a by accessing any directory along said singular root index node, a plurality of branching index path; 95 nodes arranged at various subsequent levels 7. The method defined by Claim 6 wherein from said root index node and terminating in a said ordering of said hierchical structure in plurality of leaf nodes; said leaf nodes further comprises the step of: organizing said hierarchical cataloging struc- arranging said values in said leaf nodes of a ture in said leaf nodes such that said directo- B-Tree in an ascending order such that each 100 ries and files are distributed according to their said unique value is associated with its re- parent directory value in an ascending order; spective data record comprising of singular placing a first value of each node of said 13- path linking information, wherein a first value Tree in a connected index node of a previous in each of said node is also listed in a con- level to form an interconnecting sequence nected index node of a previous level to form 105 from said root index node to each of said leaf an interconnecting sequence from said root in- nodes; dex node to each of said leaf nodes. searching for a predetermined key value 8. The method defined by Claim 7 wherein from said hierarchical structure by traversing said placing of said location description infor- across a level of said B- Tree until a higher mation for each particular grouping of stored 110 value than said predetermined key value is data further comprises the step of: found, then traversing down to a next lower providing for several location descriptions level by taking a path provided by a next when said grouping is segmented into a plu- lower key value from said higher value, and rality of physically noncontiguous segments on repeating said traversals until one of said leaf said memory device. 115 nodes is reached; 9. In a process where information is catal- 11. 
The method defined by Claim 10 oged in a filing system and retrieved from a wherein placing of said location description in memory device by using said filing system, a formation for each of said particular grouping method for preparing a computer program for of stored data further comprises the step of:
    providing said filing system, comprising the 120 providing for several location descriptions steps of: when said grouping is segmented into a plu- ordering a hierarchical cataloging structure rality of physically noncontiguous segments on which has a root directory, a plurality of said memory device.
    branching directories arranged at various sub- 12. The method defined by Claim 10 further sequent levels from said root directory, 125 comprising the step of:
    wherein some of said branching directories providing a second B-Tree to maintain loca- branch from other of said branching directo- tion description information when said group ries; said branching directories being intercon- ing is segmented into a plurality of physically nected such that for each of said branching non-contiguous segments on said memory de directories there is only a singular path from 130 vice.
    9 GB2196764A 9 13. The method defined by Claim 10 further comprising the step of: Published 1988 at The Patent Office, State House, 66/71 High Holborn, London WC 1 R 4TP. Further copies may be obtained from providing a second B-Tree to maintain loca- The Patent Office, Sales Branch, St Mary Cray, Orpington, Kent BR5 3RD.
    tion description information of excess seg- Printed by Burgess & Son (Abingdon) Ltd. Con. 1/87.
    ments when said non-contiguous segments exceeds a predetermined number.
    14. In a computer a hierarchical filing sys- tem to provide cataloging and retrieval of data stored on a storage device, said hierarchical filing system comprising:
    a memory for storing a program for said cataloging and retrieval; a processor coupled to said memory and said storage device for manipulating said pro gram, catalog and retrieve said data; said program for ordering a hierarchical ca- taloging structure which has a root directory, a plurality of branching directories arranged at various subsequent levels from said root direc tory, wherein some of said branching directo ries branch from other of said branching direc tories; said branching directories being inter connected such that for each of said branch ing directories there is only a singular path from itself to said root directory; said program for assigning a unique key value to each of said directories to distinguish said directories; said program for structuring a plurality of files within said hierarchical structure wherein each of said files branch from its associated directory; each of said files associated with a particular grouping of data stored in said memory device; said program for placing location description information for each of said particular grouping of data in its respective file; said program for placing in each directory and file said key value of its parent directory such that said singular path is determined by referencing said key value of said parent direc tory; said program for retrieving said particular grouping of data by traversing downward through said hierarchical structure to said re spective file, wherein said file provides said location description information; whereby search and retrieval of stored data is conducted by a systematic and hierarchical technique.
    15. The hierarchical filing system defined in Claim 14, wherein said program is stored in a read only memory.
    16. In a process where information is stored on a memory device, a method for preparing a computer program for cataloging said information subatantially as hereinbefore described with reference to the accompanying drawings.
    17. In a computer a hierarchical filing sys- tem to provide cataloging and retrieval of data stored on a storage device, said hierarchical filing system being substantially as hereinbe fore described with reference to the accom panying drawings.
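
As an illustration only, the extent lookup that the description walks through for file E (file number 20, 22 blocks, three extents in the catalog record, the rest in two File Extents B-Tree records keyed 9 and 15) can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: a sorted list stands in for the B-Tree leaf records, one logical block per allocation unit is assumed, and every starting allocation unit except 189 (the fourth extent) is a hypothetical value. The names `Extent`, `scan` and `locate` are likewise hypothetical.

```python
from bisect import bisect_right
from typing import NamedTuple


class Extent(NamedTuple):
    start_unit: int  # starting allocation unit number (field 141)
    num_units: int   # number of allocation units (field 142)


FILE_E = 20  # arbitrary file number used in the description

# First three extents live in the catalog file record (3 + 5 + 1 = 9 blocks).
# Starting units 100, 140 and 170 are hypothetical.
catalog_extents = {FILE_E: [Extent(100, 3), Extent(140, 5), Extent(170, 1)]}

# File Extents B-Tree leaf records, modelled as a list sorted by key:
# (file number, file-relative position of the record's first block).
# Only start unit 189 comes from the description; 210, 230, 300 are hypothetical.
extents_tree = [
    ((FILE_E, 9), [Extent(189, 2), Extent(210, 3), Extent(230, 1)]),
    ((FILE_E, 15), [Extent(300, 7)]),
]


def scan(extents, first_block, target):
    """Return the allocation unit holding relative block `target`, or None."""
    pos = first_block
    for ext in extents:
        if target < pos + ext.num_units:
            return ext.start_unit + (target - pos)
        pos += ext.num_units
    return None


def locate(file_number, rel_block):
    """Check the catalog record first, then fall back to the extents tree."""
    hit = scan(catalog_extents[file_number], 0, rel_block)
    if hit is not None:
        return hit
    # Find the last record whose key is <= (file_number, rel_block). This
    # mirrors the "not found, then step back to the previous record" search.
    keys = [key for key, _ in extents_tree]
    i = bisect_right(keys, (file_number, rel_block)) - 1
    if i < 0 or keys[i][0] != file_number:
        raise LookupError("no extents record for this file")
    (_, first_block), extents = extents_tree[i]
    hit = scan(extents, first_block, rel_block)
    if hit is None:
        raise LookupError("relative block is past the end of the file")
    return hit


print(locate(FILE_E, 13))  # key <20,13> resolves via the record keyed <20,9>
```

With these hypothetical starting units, relative block 13 falls in the second extent of the record keyed <20,9> and the call prints 212; relative block 9 maps to allocation unit 189, the start of the fourth extent.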
GB08715199A 1986-10-30 1987-06-29 Hierarchical file system Pending GB2196764A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US92480286A 1986-10-30 1986-10-30

Publications (2)

Publication Number Publication Date
GB8715199D0 GB8715199D0 (en) 1987-08-05
GB2196764A true GB2196764A (en) 1988-05-05

Family

ID=25450754

Family Applications (1)

Application Number Title Priority Date Filing Date
GB08715199A Pending GB2196764A (en) 1986-10-30 1987-06-29 Hierarchical file system

Country Status (6)

Country Link
JP (1) JPS63116232A (en)
AU (1) AU610092B2 (en)
CA (1) CA1285656C (en)
DE (1) DE3736455A1 (en)
FR (1) FR2606182B1 (en)
GB (1) GB2196764A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0410210A2 (en) * 1989-07-24 1991-01-30 International Business Machines Corporation Method for dynamically expanding and rapidly accessing file directories
EP0453707A2 (en) * 1990-04-26 1991-10-30 International Business Machines Corporation Method and means for managing space re-use in a shadow written B-tree via free space lists
EP0558505A1 (en) * 1990-10-05 1993-09-08 Microsoft Corporation System and method for information retrieval
EP0650131A1 (en) * 1993-10-20 1995-04-26 Microsoft Corporation Computer method and storage structure for storing and accessing multidimensional data
GB2283591A (en) * 1993-11-04 1995-05-10 Northern Telecom Ltd Database management.
GB2336008A (en) * 1998-04-03 1999-10-06 Schlumberger Holdings Simulation system includes simulator and case manager adapted for organizing data files
WO2002029624A1 (en) * 2000-10-04 2002-04-11 Bullant Technology Pty Ltd Data processing structure
GB2369465A (en) * 2000-11-28 2002-05-29 3Com Corp Memory system
US6813611B1 (en) * 1999-06-08 2004-11-02 International Business Machines Corporation Controlling, configuring, storing, monitoring and maintaining accounting of bookkeeping information employing trees with nodes having embedded information

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5371885A (en) * 1989-08-29 1994-12-06 Microsoft Corporation High performance file system
JPH0786880B2 (en) * 1991-04-26 1995-09-20 株式会社椿本チエイン Data storage method
DE4331949A1 (en) * 1993-09-21 1995-03-30 Frank Dipl Ing Mueller Data processing system and method of organising data in data processing systems
US6119151A (en) * 1994-03-07 2000-09-12 International Business Machines Corp. System and method for efficient cache management in a distributed file system
KR100834760B1 (en) 2006-11-23 2008-06-05 삼성전자주식회사 Structure of index, apparatus and method for optimized index searching
CN112579079A (en) * 2019-09-29 2021-03-30 北京向上一心科技有限公司 File processing method and device, computer equipment and storage medium
CN111054082B (en) * 2019-11-29 2023-10-13 珠海金山数字网络科技有限公司 Method for coding Unity resource data set

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2029990A (en) * 1978-08-31 1980-03-26 Fujitsu Ltd Data processing
US4318184A (en) * 1978-09-05 1982-03-02 Millett Ronald P Information storage and retrieval system and method
US4611298A (en) * 1983-06-03 1986-09-09 Harding And Harris Behavioral Research, Inc. Information storage and retrieval system and method
EP0196064A2 (en) * 1985-03-27 1986-10-01 Hitachi, Ltd. System for information storage and retrieval

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE2217500A1 (en) * 1971-04-23 1972-10-26 International Business Machines Corp., Armonk, N.Y. (V.StA.) Method and device for searching for key words in an electronic data processing system
JPS60129873A (en) * 1983-12-19 1985-07-11 Nippon Telegr & Teleph Corp <Ntt> Document storage and retrieval system

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0410210A3 (en) * 1989-07-24 1993-03-17 International Business Machines Corporation Method for dynamically expanding and rapidly accessing file directories
EP0410210A2 (en) * 1989-07-24 1991-01-30 International Business Machines Corporation Method for dynamically expanding and rapidly accessing file directories
EP0453707A2 (en) * 1990-04-26 1991-10-30 International Business Machines Corporation Method and means for managing space re-use in a shadow written B-tree via free space lists
EP0453707A3 (en) * 1990-04-26 1992-12-02 International Business Machines Corporation Method and means for managing space re-use in a shadow written b-tree via free space lists
EP0558505A1 (en) * 1990-10-05 1993-09-08 Microsoft Corporation System and method for information retrieval
EP0558505A4 (en) * 1990-10-05 1993-11-24 Microsoft Corporation System and method for information retrieval
US5799184A (en) * 1990-10-05 1998-08-25 Microsoft Corporation System and method for identifying data records using solution bitmasks
US5752243A (en) * 1993-10-20 1998-05-12 Microsoft Corporation Computer method and storage structure for storing and accessing multidimensional data
EP0650131A1 (en) * 1993-10-20 1995-04-26 Microsoft Corporation Computer method and storage structure for storing and accessing multidimensional data
GB2283591B (en) * 1993-11-04 1998-04-15 Northern Telecom Ltd Database management
GB2283591A (en) * 1993-11-04 1995-05-10 Northern Telecom Ltd Database management.
GB2336008A (en) * 1998-04-03 1999-10-06 Schlumberger Holdings Simulation system includes simulator and case manager adapted for organizing data files
GB2336008B (en) * 1998-04-03 2000-11-08 Schlumberger Holdings Simulation system including a simulator and a case manager adapted for organizing data files
US7561997B1 (en) 1998-04-03 2009-07-14 Schlumberger Technology Corporation Simulation system including a simulator and a case manager adapted for organizing data files for the simulator in a non-conventional tree like structure
US6813611B1 (en) * 1999-06-08 2004-11-02 International Business Machines Corporation Controlling, configuring, storing, monitoring and maintaining accounting of bookkeeping information employing trees with nodes having embedded information
WO2002029624A1 (en) * 2000-10-04 2002-04-11 Bullant Technology Pty Ltd Data processing structure
GB2369465A (en) * 2000-11-28 2002-05-29 3Com Corp Memory system
GB2369465B (en) * 2000-11-28 2003-04-02 3Com Corp A method of sorting and retrieving data files

Also Published As

Publication number Publication date
DE3736455A1 (en) 1988-05-05
FR2606182A1 (en) 1988-05-06
GB8715199D0 (en) 1987-08-05
AU610092B2 (en) 1991-05-16
FR2606182B1 (en) 1993-12-17
CA1285656C (en) 1991-07-02
JPS63116232A (en) 1988-05-20
AU8048587A (en) 1988-05-05
