CN117453153B - File storage method, device, terminal and medium based on Crush rule - Google Patents
- Publication number
- CN117453153B CN202311800629.XA
- Authority
- CN
- China
- Prior art keywords
- rule
- file
- information
- determining
- attribute
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0643—Management of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a file storage method, device, terminal and medium based on a Crush rule. The method comprises the following steps: receiving a file to be stored, determining identification information carried by the file to be stored, and determining directory information corresponding to the identification information; determining a Crush rule corresponding to the directory information based on the directory information; and determining rule attribute information of the Crush rule, and storing the file to be stored based on the rule attribute information, wherein the rule attribute information reflects whether the Crush rule is an original distribution rule or a newly built distribution rule. The invention can determine the corresponding Crush rule based on the file to be stored, so that the file is stored in a distributed manner using an appropriate Crush rule, thereby realizing flexible management and efficient storage of data.
Description
Technical Field
The present invention relates to the field of data storage technologies, and in particular to a file storage method, device, terminal and medium based on a Crush rule.
Background
For a distributed storage system such as Ceph, distributing data reasonably is important. With the advent of large-scale distributed storage systems, a system must be able to distribute data and load evenly, maximize system utilization, and handle system expansion and system failures. Crush rules meet these requirements well. Ceph describes all hardware resources of the system as a tree structure, and then generates a logical tree structure from it according to certain fault tolerance rules; this logical tree structure serves as the Crush map. Based on the Crush map, which describes the current cluster resource status, and according to a rule, Crush obtains a list of OSDs (Object Storage Daemons, the logical units of the disks) and thereby determines which OSDs the data is distributed to. However, current Crush rules are basically preset and bound to a storage pool. When a new Crush rule exists, the corresponding Crush rule cannot be flexibly selected during data distribution, which affects the efficiency of distributed data storage.
Accordingly, there is a need for improvement and advancement in the art.
Disclosure of Invention
The technical problem to be solved by the invention is that, in the prior art, the corresponding Crush rule cannot be flexibly selected during data distribution when a new Crush rule exists, which affects the efficiency of distributed data storage.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
In a first aspect, the present invention provides a file storage method based on a Crush rule, where the file storage method is applied to a distributed file storage system, and the method includes:
receiving a file to be stored, determining identification information carried by the file to be stored, and determining directory information corresponding to the identification information;
determining a Crush rule corresponding to the directory information based on the directory information;
and determining rule attribute information of the Crush rule, and storing the file to be stored based on the rule attribute information, wherein the rule attribute information is used for reflecting whether the Crush rule is an original distribution rule or a newly built distribution rule.
In one implementation manner, the determining the identification information carried by the file to be stored and determining the directory information corresponding to the identification information includes:
analyzing the file to be stored to obtain analysis content, and performing character recognition on the analysis content to obtain the identification information;
determining an identification attribute corresponding to the identification information based on the identification information, wherein the identification attribute is used for reflecting the file type of the file to be stored;
and determining directory information corresponding to the identification attribute based on the identification attribute.
In one implementation manner, the determining, based on the identification information, the identification attribute corresponding to the identification information includes:
acquiring a preset identifier definition comparison table, wherein the identifier definition comparison table reflects attributes corresponding to a plurality of identifiers;
and matching the identification information with the identification definition comparison table to obtain the identification attribute corresponding to the identification information.
In one implementation manner, the determining, based on the identification attribute, directory information corresponding to the identification attribute includes:
determining type information corresponding to the identification attribute based on the identification attribute, wherein the type information is used for reflecting the file type of the file to be stored;
and determining directory information corresponding to the type information based on the type information.
In one implementation manner, the determining, based on the directory information, a Crush rule corresponding to the directory information includes:
matching the directory information with the newly built directory to obtain a matching result;
if the matching result is that the matching fails, determining the directory information as an original directory, and acquiring an original Crush rule corresponding to the original directory;
and if the matching result is that the matching is successful, determining the directory information as a newly built directory, and acquiring a newly built Crush rule corresponding to the newly built directory.
In one implementation, the determining rule attribute information of the Crush rule, and storing the file to be stored based on the rule attribute information includes:
if the Crush rule is an original Crush rule, determining that the rule attribute information is an original distribution rule, and performing distributed storage on the file to be stored based on the original distribution rule;
and if the Crush rule is a newly built Crush rule, determining that the rule attribute information is a newly built distribution rule, and performing distributed storage on the file to be stored based on the newly built distribution rule.
In one implementation manner, the performing distributed storage on the file to be stored based on the newly built distribution rule includes:
slicing the file to be stored to obtain a plurality of file fragments;
and determining a target disk corresponding to the newly built distribution rule based on the newly built distribution rule, and storing the plurality of file fragments into the target disk.
In a second aspect, an embodiment of the present invention further provides a file storage device based on a Crush rule, where the file storage device is applied to a distributed file storage system, and the device includes:
the identification analysis module is used for receiving a file to be stored, determining identification information carried by the file to be stored, and determining directory information corresponding to the identification information;
the rule determining module is used for determining a Crush rule corresponding to the directory information based on the directory information;
and the file storage module is used for determining rule attribute information of the Crush rule and storing the file to be stored based on the rule attribute information, wherein the rule attribute information is used for reflecting whether the Crush rule is an original distribution rule or a newly built distribution rule.
In a third aspect, an embodiment of the present invention further provides a terminal, where the terminal includes a memory, a processor, and a file storage program based on a Crush rule that is stored in the memory and capable of running on the processor, and when the processor executes the file storage program based on the Crush rule, the processor implements the steps of the file storage method based on a Crush rule in any one of the above schemes.
In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where a file storage program based on a Crush rule is stored on the computer readable storage medium, and when the file storage program based on the Crush rule is executed by a processor, the steps of the file storage method based on a Crush rule according to any one of the above schemes are implemented.
The beneficial effects are as follows: compared with the prior art, the invention provides a file storage method based on a Crush rule. The method first receives a file to be stored, determines identification information carried by the file to be stored, and determines directory information corresponding to the identification information. Then, based on the directory information, a Crush rule corresponding to the directory information is determined. Finally, rule attribute information of the Crush rule is determined, and the file to be stored is stored based on the rule attribute information, wherein the rule attribute information is used for reflecting whether the Crush rule is an original distribution rule or a newly built distribution rule. The invention can determine the corresponding Crush rule based on the file to be stored, so that the file is stored in a distributed manner using an appropriate Crush rule. Even if a newly built Crush rule exists, it can be used for storing the file, thereby realizing flexible management and efficient storage of data.
Drawings
FIG. 1 is a flowchart of a specific implementation of a file storage method based on a Crush rule according to an embodiment of the present invention.
FIG. 2 is a functional schematic diagram of a file storage system based on a Crush rule according to an embodiment of the present invention.
FIG. 3 is a schematic block diagram of a terminal according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer and more specific, the present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
This embodiment provides a file storage method based on a Crush rule, which can realize flexible management and efficient storage of data. Specifically, the embodiment first receives a file to be stored, determines identification information carried by the file to be stored, and determines directory information corresponding to the identification information. Then, based on the directory information, a Crush rule corresponding to the directory information is determined. Finally, rule attribute information of the Crush rule is determined, and the file to be stored is stored based on the rule attribute information, wherein the rule attribute information is used for reflecting whether the Crush rule is an original distribution rule or a newly built distribution rule. It can be seen that this embodiment determines the corresponding Crush rule based on the file to be stored, so that the file can be stored in a distributed manner using an appropriate Crush rule, and even if a newly built Crush rule exists, the file can be stored using it.
The file storage method based on a Crush rule can be applied to a terminal, and the terminal can be an intelligent product terminal such as a computer or a mobile phone. Specifically, as shown in FIG. 1, the file storage method based on a Crush rule of this embodiment includes the following steps:
Step S100, receiving a file to be stored, determining identification information carried by the file to be stored, and determining directory information corresponding to the identification information.
The file storage method of this embodiment may be applied to a distributed file storage system. In this embodiment, a Crush rule is preset in the distributed file storage system, that is, an original Crush rule is set in advance, and the original Crush rule also presets which OSD a file is stored to. A suitable Crush rule needs to be flexibly selected according to the file to be stored, so when the terminal in this embodiment receives the file to be stored, it analyzes the file and determines the identification information carried by it. The identification information of this embodiment is used to identify the file type of the file to be stored, so that whether the newly built Crush rule needs to be used can be determined according to the file type. In this embodiment, files of different file types are recorded under different directory information, so each file type corresponds to one piece of directory information. Thus, once the identification information of the file to be stored is obtained, the embodiment can determine the corresponding directory information according to the identification information.
In one implementation, the present embodiment, when determining directory information based on identification information, includes the steps of:
Step S101, analyzing the file to be stored to obtain analysis content, and performing character recognition on the analysis content to obtain the identification information;
Step S102, determining an identification attribute corresponding to the identification information based on the identification information, wherein the identification attribute is used for reflecting the file type of the file to be stored;
Step S103, determining directory information corresponding to the identification attribute based on the identification attribute.
Specifically, after receiving a file to be stored, the terminal in this embodiment parses the file to obtain parsed content, where the parsed content may include the file content carried in the file to be stored and identification information for identifying the file content. To obtain the identification information, the embodiment can recognize certain specific characters in the file to be stored by means of character recognition; the identification information is therefore a set of characters. After the identification information is obtained, the embodiment can further determine an identification attribute corresponding to the identification information, where the identification attribute is used for reflecting the file type of the file to be stored. In one implementation manner, an identifier definition comparison table may be preset in this embodiment, and the table reflects the attributes corresponding to a plurality of identifiers. For example, when the identification information is the character "T", the corresponding identification attribute is a time identifier, and when the identification information is the character "E", the corresponding identification attribute is an error identifier. Therefore, the embodiment can determine the identification attribute corresponding to the identification information based on the identifier definition comparison table. Once the identification attribute is obtained, the embodiment can determine the corresponding type information based on the identification attribute, which gives the file type of the file to be stored. The embodiment can preset the mapping relation between identification attributes and type information, so as to match the type information corresponding to the current identification attribute based on the mapping relation.
For example, if the identification attribute is a time identifier, the type information may be determined to be time-related log information, and the file type of the file to be stored is then a log file. If the identification attribute is an error identifier, the type information can be determined to be fault-related fault information, and the file type of the file to be stored is then a fault file. Therefore, the file type of the file to be stored can be determined by analyzing the identification information.
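The lookup chain described above (identification information, then identification attribute, then type information, then directory information) can be sketched as three table lookups. This is only a minimal illustration under stated assumptions: the table contents beyond "T" and "E", the directory names, and the choice to read the identifier from the first character of the parsed content are all hypothetical and are not specified by the source.

```python
from typing import Optional

# Hypothetical identifier definition comparison table:
# identifier character -> identification attribute.
IDENTIFIER_TABLE = {"T": "time", "E": "error"}

# Hypothetical preset mapping: identification attribute -> type information.
ATTRIBUTE_TO_TYPE = {"time": "log", "error": "fault"}

# Hypothetical preset mapping: type information -> directory information.
TYPE_TO_DIRECTORY = {"log": "/logs", "fault": "/faults"}


def resolve_directory(parsed_content: str) -> Optional[str]:
    """Return the directory for a file, or None if no identifier matches.

    Assumption: the identifier is the first character of the parsed content.
    """
    identifier = parsed_content[:1]
    attribute = IDENTIFIER_TABLE.get(identifier)
    if attribute is None:
        return None
    file_type = ATTRIBUTE_TO_TYPE[attribute]
    return TYPE_TO_DIRECTORY[file_type]
```

Under these assumptions, a file whose content starts with "T" resolves to the log directory, one starting with "E" resolves to the fault directory, and any other file yields no directory.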
In other implementations, the process of determining the identification attribute based on the identification information may also be implemented with a neural network. Specifically, sample identifiers and the sample attributes corresponding to them may be acquired in advance, and a mapping relationship between the sample identifiers and the sample attributes is established. The mapping relationship is input into a preset neural network model for training to obtain an attribute identification model, which can automatically output the corresponding sample attribute based on a sample identifier. Therefore, after determining the identification information, the embodiment can input the identification information into the attribute identification model, and the model automatically outputs the corresponding identification attribute, so that the identification attribute corresponding to the identification information is analyzed efficiently and accurately.
In this embodiment, since different types of files are stored in different directories, after determining the type information corresponding to the identification attribute, the embodiment can determine the corresponding directory information based on the type information. The directory information in this embodiment is directory description information, and is used to describe the type information corresponding to the identification attribute. Each piece of directory information is provided with a corresponding file interface for calling the file to be stored and calling the corresponding Crush rule.
Step S200, determining a Crush rule corresponding to the directory information based on the directory information.
After the directory information is obtained, this embodiment can determine the corresponding Crush rule by analyzing the directory information. Because the terminal has obtained the directory information, it can determine whether that directory information is a newly built directory, and therefore whether the Crush rule determined from it is a newly built Crush rule.
In one implementation, when determining the Crush rule, the embodiment includes the following steps:
Step S201, matching the directory information with a newly built directory to obtain a matching result;
Step S202, if the matching result is that the matching fails, determining the directory information as an original directory, and acquiring an original Crush rule corresponding to the original directory;
Step S203, if the matching result is that the matching is successful, determining the directory information as a newly built directory, and acquiring a newly built Crush rule corresponding to the newly built directory.
Specifically, in this embodiment, a directory is established in advance, that is, a newly built directory, and a file interface and a newly built Crush rule corresponding to the file interface are set for it. The newly built Crush rule is used to store the file to be stored in a distributed manner when the directory information of the file is determined to be the newly built directory. Therefore, after obtaining the directory information of the file to be stored, this embodiment matches the directory information with the newly built directory to obtain a matching result, which reflects whether the directory information is the same as the newly built directory. If the matching fails, the directory information differs from the newly built directory, so the directory information is an original directory and the obtained Crush rule is the original Crush rule corresponding to the original directory. If the matching succeeds, the directory information is the same as the newly built directory, so the directory information is the newly built directory, and the newly built Crush rule corresponding to it can be acquired through the file interface of the newly built directory. In this way, the embodiment determines whether the directory information corresponding to the file to be stored is the newly built directory. If it is, the file is stored using the newly built Crush rule; otherwise the original Crush rule is used, so that files are managed flexibly.
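Steps S201 to S203 amount to a membership test followed by a rule lookup. The sketch below is a hedged illustration, not the patented implementation: the directory names, the rule names, and the dictionary-based registries are assumptions, and the file interface mentioned in the text is reduced to a plain dictionary lookup.

```python
from typing import Tuple

# Hypothetical registry: newly built directory -> newly built Crush rule name.
NEW_DIRECTORY_RULES = {"/faults": "fault_rule"}

# Hypothetical registry: original directory -> original Crush rule name.
ORIGINAL_DIRECTORY_RULES = {"/logs": "replicated_rule"}

# Hypothetical fallback when an original directory has no explicit entry.
DEFAULT_ORIGINAL_RULE = "replicated_rule"


def select_crush_rule(directory: str) -> Tuple[str, str]:
    """Return (rule name, rule attribute) for the given directory.

    The rule attribute is "new" when the directory matches a newly built
    directory (S203) and "original" when the match fails (S202).
    """
    if directory in NEW_DIRECTORY_RULES:
        return NEW_DIRECTORY_RULES[directory], "new"
    rule = ORIGINAL_DIRECTORY_RULES.get(directory, DEFAULT_ORIGINAL_RULE)
    return rule, "original"
```

Returning the attribute together with the rule name keeps the later storage step a simple branch on "new" versus "original".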
Step S300, determining rule attribute information of the Crush rule, and storing the file to be stored based on the rule attribute information, wherein the rule attribute information is used for reflecting whether the Crush rule is an original distribution rule or a newly built distribution rule.
After the Crush rule is determined, this embodiment can obtain the corresponding rule attribute information according to whether the Crush rule is a newly built Crush rule. The rule attribute information reflects whether the Crush rule is an original distribution rule (i.e., an original Crush rule) or a newly built distribution rule (i.e., a newly built Crush rule), so that the corresponding file storage operation is executed based on the rule attribute information.
In one implementation manner, this embodiment includes the following steps when storing a file:
Step S301, if the Crush rule is an original Crush rule, determining that the rule attribute information is an original distribution rule, and performing distributed storage on the file to be stored based on the original distribution rule;
Step S302, if the Crush rule is a newly built Crush rule, determining that the rule attribute information is a newly built distribution rule, and performing distributed storage on the file to be stored based on the newly built distribution rule.
Specifically, if the Crush rule is an original Crush rule, the rule attribute information is determined to be an original distribution rule, and the file to be stored may be stored in a distributed manner based on the original distribution rule. If the Crush rule is a newly built Crush rule, the rule attribute information is determined to be a newly built distribution rule, and the file to be stored is stored in a distributed manner based on the newly built distribution rule. When storing a file, the embodiment can slice the file to be stored to obtain a plurality of file fragments. The purpose of slicing is to divide the file into several fragments so that they can be processed simultaneously during storage, which improves storage efficiency. In addition, slicing the file enables distributed processing and meets the storage requirements of a distributed storage system. Because the file to be stored in this case needs to be stored using the newly built distribution rule (i.e., the newly built Crush rule), the embodiment can acquire the target disk corresponding to the newly built distribution rule, that is, determine which OSD the file is stored in, so that the file fragments of the file to be stored can be stored in the target disk.
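The slicing step can be illustrated with fixed-size fragments. The source does not state how fragments are sized or how the target disk is addressed, so the fragment size parameter and the in-memory dictionary standing in for the target disk are assumptions made only for this sketch.

```python
from typing import Dict, List


def slice_file(data: bytes, fragment_size: int) -> List[bytes]:
    """Split data into consecutive fragments of at most fragment_size bytes."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]


def store_fragments(fragments: List[bytes], target_disk: Dict[int, bytes]) -> None:
    """Store each fragment under its index; the dict is a stand-in for an OSD."""
    for index, fragment in enumerate(fragments):
        target_disk[index] = fragment


def read_back(target_disk: Dict[int, bytes]) -> bytes:
    """Reassemble the original data by joining fragments in index order."""
    return b"".join(target_disk[i] for i in sorted(target_disk))
```

Keeping the fragment index as the key is what allows the file to be reassembled in order; a real system would also need to record which file each fragment belongs to.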
In another implementation manner, after determining the target disk, the embodiment may further obtain the remaining storage space of each storage area in the target disk. The space utilization rate of each storage area is then determined based on the remaining storage space, and all storage areas are sorted by space utilization rate. According to this ranking, as many storage areas are selected as there are file fragments, and the file fragments are stored into the selected storage areas respectively, so that all storage areas in the target disk are used evenly and storage space is not wasted. In addition, the embodiment can analyze the historical usage record of each storage area to determine its usage frequency, that is, the number of times files were stored into it within the same period. The storage areas are sorted by usage frequency, and storage areas are then selected from this ranking according to the number of file fragments, which likewise allows all storage areas in the target disk to be used evenly and avoids wasting storage space.
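The selection by space utilization can be sketched as a sort followed by taking one storage area per fragment. The source does not say in which direction the ranking is read, so choosing the least-utilized areas first is an assumption here, as are the `StorageArea` type and its field names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StorageArea:
    # Hypothetical description of one storage area on the target disk.
    name: str
    capacity: int   # total space, in arbitrary units
    remaining: int  # remaining space, in the same units

    @property
    def utilization(self) -> float:
        """Fraction of the capacity that is already in use."""
        return (self.capacity - self.remaining) / self.capacity


def pick_storage_areas(areas: List[StorageArea], fragment_count: int) -> List[StorageArea]:
    """Select up to fragment_count areas, least utilized first (assumed order)."""
    ranked = sorted(areas, key=lambda area: area.utilization)
    return ranked[:fragment_count]
```

If there are fewer storage areas than fragments, this sketch simply returns every area; how the remaining fragments would be placed is not covered by the source. The usage-frequency variant would follow the same shape with a different sort key.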
In summary, this embodiment can determine the corresponding Crush rule based on the file to be stored, so that the file can be stored in a distributed manner using an appropriate Crush rule, and even if a newly built Crush rule exists, the file can be stored using it, so that flexible management of files is realized.
Based on the above embodiment, the present invention further provides a file storage device based on a Crush rule, where the file storage device is applied to a distributed file storage system. As shown in FIG. 2, the device includes: an identification analysis module 10, a rule determining module 20 and a file storage module 30. Specifically, the identification analysis module 10 is configured to receive a file to be stored, determine identification information carried by the file to be stored, and determine directory information corresponding to the identification information. The rule determining module 20 is configured to determine a Crush rule corresponding to the directory information based on the directory information. The file storage module 30 is configured to determine rule attribute information of the Crush rule, and store the file to be stored based on the rule attribute information, where the rule attribute information is used to reflect whether the Crush rule is an original distribution rule or a newly built distribution rule.
In one implementation, the identification analysis module 10 includes:
the file analysis unit is used for analyzing the file to be stored to obtain analysis content, and carrying out character recognition on the analysis content to obtain the identification information;
the attribute determining unit is used for determining an identification attribute corresponding to the identification information based on the identification information, wherein the identification attribute is used for reflecting the file type of the file to be stored;
and the directory determining unit is used for determining directory information corresponding to the identification attribute based on the identification attribute.
In one implementation, the attribute determining unit includes:
a comparison table obtaining subunit, configured to obtain a preset identifier definition comparison table, where the identifier definition comparison table reflects attributes corresponding to a plurality of identifiers;
and the comparison table matching subunit is used for matching the identification information with the identification definition comparison table to obtain the identification attribute corresponding to the identification information.
In one implementation, the directory determining unit includes:
a type determining subunit, configured to determine type information corresponding to the identifier attribute, where the type information is used to reflect a file type of the file to be stored;
and the type matching subunit is used for determining directory information corresponding to the type information based on the type information.
In one implementation, the rule determination module 20 includes:
a catalog matching unit, configured to match the directory information against the newly created directories to obtain a matching result;
an original rule determining unit, configured to, if the matching fails, determine that the directory information is an original directory and obtain the original flush rule corresponding to the original directory;
and a new-built rule determining unit, configured to, if the matching succeeds, determine that the directory information is a newly created directory and obtain the new flush rule corresponding to the newly created directory.
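A minimal sketch of this branch, assuming the newly created directories are held in a set and each kind of rule is looked up by directory path. The names `RuleChoice` and `determine_rule` and the rule tables are illustrative and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class RuleChoice:
    rule_name: str
    is_new: bool  # True when the directory matched a newly created directory


def determine_rule(directory: str,
                   new_directories: Set[str],
                   original_rules: Dict[str, str],
                   new_rules: Dict[str, str]) -> RuleChoice:
    """Pick the new rule on a successful match, otherwise the original rule."""
    if directory in new_directories:
        # Match succeeded: treat the directory as newly created.
        return RuleChoice(new_rules[directory], True)
    # Match failed: treat the directory as an original directory.
    return RuleChoice(original_rules[directory], False)
```

In this sketch a directory missing from the relevant rule table raises `KeyError`; the text does not describe that case, so it is left unhandled.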
In one implementation, the file storage module 30 includes:
an original mode storage unit, configured to, if the flush rule is an original rule, determine that the rule attribute information is the original distribution rule and perform distributed storage of the file to be stored based on the original distribution rule;
and a new mode storage unit, configured to, if the flush rule is a new rule, determine that the rule attribute information is the new distribution rule and perform distributed storage of the file to be stored based on the new distribution rule.
In one implementation, the new mode storage unit includes:
a file slicing subunit, configured to slice the file to be stored into a plurality of file fragments;
and a distributed storage subunit, configured to determine, based on the new distribution rule, the target disk corresponding to that rule and to store the plurality of file fragments on the target disk.
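Slicing and placement on the target disk might be sketched as follows. Fixed-size slicing, the in-memory `disks` dictionary standing in for real disks, and the rule-to-disk table are all assumptions of this example; the patent does not state how fragment boundaries are chosen or how a rule resolves to a disk.

```python
from typing import Dict, List


def slice_file(data: bytes, fragment_size: int) -> List[bytes]:
    """Split data into consecutive fragments of at most fragment_size bytes."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]


def store_fragments(fragments: List[bytes],
                    rule_to_disk: Dict[str, str],
                    rule_name: str,
                    disks: Dict[str, List[bytes]]) -> str:
    """Resolve the target disk for a rule and append the fragments to it."""
    target = rule_to_disk[rule_name]  # hypothetical rule -> disk lookup
    disks.setdefault(target, []).extend(fragments)
    return target
```

Concatenating the stored fragments in order reproduces the original bytes, which is the property a reader of the file would rely on.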
The working principle of each module in the file storage system based on the flush rule in this embodiment is the same as the principle of each step in the above method embodiment, and will not be described here again.
Based on the above embodiment, the present invention also provides a terminal, a schematic block diagram of which is shown in fig. 3. The terminal may include one or more processors 100 (only one is shown in fig. 3), a memory 101, and a computer program 102 stored in the memory 101 and executable on the one or more processors 100, e.g., a file storage program based on a flush rule. The one or more processors 100, when executing the computer program 102, may implement the steps of the embodiments of the file storage method based on a flush rule. Alternatively, the one or more processors 100, when executing the computer program 102, may implement the functions of the modules/units in the embodiment of the file storage system based on a flush rule, which is not limited herein.
In one embodiment, the processor 100 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
In one embodiment, the memory 101 may be an internal storage unit of the electronic device, such as a hard disk or internal memory of the electronic device. The memory 101 may also be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the electronic device. Further, the memory 101 may include both an internal storage unit and an external storage device of the electronic device. The memory 101 is used to store the computer program and other programs and data required by the terminal. The memory 101 may also be used to temporarily store data that has been output or is to be output.
It will be appreciated by those skilled in the art that the block diagram shown in fig. 3 illustrates only some of the structures relevant to the solution of the present invention and does not limit the terminal to which the solution may be applied; a specific terminal may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program, which may be stored on a non-transitory computer-readable storage medium and which, when executed, may perform the steps of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous-link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (4)
1. A file storage method based on a flush rule, wherein the file storage method is applied to a distributed file storage system, the method comprising:
receiving a file to be stored, determining identification information carried by the file to be stored, and determining directory information corresponding to the identification information;
determining a flush rule corresponding to the directory information based on the directory information;
determining rule attribute information of the rule, and storing the file to be stored based on the rule attribute information, wherein the rule attribute information is used for reflecting whether the rule is an original distribution rule or a newly-built distribution rule;
the determining the identification information carried by the file to be stored and the directory information corresponding to the identification information includes:
analyzing the file to be stored to obtain analysis content, and performing character recognition on the analysis content to obtain the identification information;
determining an identification attribute corresponding to the identification information based on the identification information, wherein the identification attribute is used for reflecting the file type of the file to be stored;
determining directory information corresponding to the identification attribute based on the identification attribute;
the determining, based on the identification information, the identification attribute corresponding to the identification information includes:
acquiring a preset identifier definition comparison table, wherein the identifier definition comparison table reflects attributes corresponding to a plurality of identifiers;
matching the identification information with the identification definition comparison table to obtain an identification attribute corresponding to the identification information;
or, the determining, based on the identification information, the identification attribute corresponding to the identification information includes:
collecting in advance sample identifiers and the sample attribute corresponding to each sample identifier to obtain a mapping relation between the sample identifiers and the sample attributes;
inputting the mapping relation into a preset neural network model for training to obtain an attribute identification model, wherein the attribute identification model is used for outputting corresponding sample attributes based on sample identifiers;
inputting the identification information into the attribute identification model, and outputting an identification attribute;
the determining, based on the identification attribute, directory information corresponding to the identification attribute includes:
determining type information corresponding to the identification attribute based on the identification attribute, wherein the type information is used for reflecting the file type of the file to be stored;
determining directory information corresponding to the type information based on the type information;
the determining, based on the directory information, a flush rule corresponding to the directory information includes:
matching the directory information against the newly created directories to obtain a matching result;
if the matching result is that the matching fails, determining that the directory information is an original directory, and acquiring an original flush rule corresponding to the original directory;
if the matching result is that the matching is successful, determining that the directory information is a newly created directory, and acquiring a new flush rule corresponding to the newly created directory;
the determining rule attribute information of the rule and storing the file to be stored based on the rule attribute information comprises the following steps:
if the rule is an original rule, determining that the rule attribute information is an original distribution rule, and performing distributed storage on the file to be stored based on the original distribution rule;
if the rule is a new rule, determining that the rule attribute information is a new distribution rule, and performing distributed storage on the file to be stored based on the new distribution rule;
the step of performing distributed storage on the file to be stored based on the newly created distribution rule includes:
slicing the file to be stored to obtain a plurality of file fragments;
determining a target disk corresponding to the new distribution rule based on the new distribution rule, and storing a plurality of file fragments into the target disk;
the method further comprises:
after the target disk is determined, acquiring the remaining storage space of each storage area in the target disk;
determining the space utilization rate of each storage area based on the remaining storage space, and sorting all the storage areas based on the space utilization rate;
selecting, according to the number of the file fragments, that number of storage areas by space-utilization ranking, and storing the file fragments in the selected storage areas respectively;
alternatively, the method further comprises:
after the target disk is determined, analyzing the historical usage record of each storage area, and determining the usage frequency of each storage area;
sorting the storage areas based on the usage frequency;
and selecting, according to the number of the file fragments, that number of storage areas by usage-frequency ranking, and storing the file fragments in the selected storage areas respectively.
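The space-utilization selection in claim 1 can be illustrated with a short sketch. The claim does not state the ranking direction, so this example assumes that the least-utilized storage areas are preferred; it also assumes each area reports a capacity and a remaining space in the same unit. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def utilization(capacity: int, remaining: int) -> float:
    """Fraction of an area's capacity that is already used."""
    return (capacity - remaining) / capacity


def pick_areas_by_utilization(areas: Dict[str, Tuple[int, int]],
                              fragment_count: int) -> List[str]:
    """Return up to fragment_count area names, least utilized first.

    areas maps an area name to (capacity, remaining). Ascending order is an
    assumption of this sketch; ties are broken by name for determinism.
    """
    ranked = sorted(areas, key=lambda name: (utilization(*areas[name]), name))
    return ranked[:fragment_count]
```

If there are fewer storage areas than fragments, the function returns every area, and the caller would have to decide how to place the remaining fragments.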
2. A file storage device based on a flush rule, the file storage device being applied to a distributed file storage system, the device comprising:
the identification analysis module is used for receiving a file to be stored, determining identification information carried by the file to be stored, and determining directory information corresponding to the identification information;
the rule determining module is used for determining a flush rule corresponding to the directory information based on the directory information;
the file storage module is used for determining rule attribute information of the rule and storing the file to be stored based on the rule attribute information, wherein the rule attribute information is used for reflecting whether the rule is an original distribution rule or a newly-built distribution rule;
the identification analysis module comprises:
the file analysis unit is used for analyzing the file to be stored to obtain analysis content, and carrying out character recognition on the analysis content to obtain the identification information;
the attribute determining unit is used for determining an identification attribute corresponding to the identification information based on the identification information, wherein the identification attribute is used for reflecting the file type of the file to be stored;
a catalog determining unit, configured to determine catalog information corresponding to the identification attribute based on the identification attribute;
the attribute determination unit includes:
a comparison table obtaining subunit, configured to obtain a preset identifier definition comparison table, where the identifier definition comparison table reflects attributes corresponding to a plurality of identifiers;
a comparison table matching subunit, configured to match the identification information with the identification definition comparison table to obtain an identification attribute corresponding to the identification information;
or, the attribute determining unit is further configured for:
collecting in advance sample identifiers and the sample attribute corresponding to each sample identifier to obtain a mapping relation between the sample identifiers and the sample attributes;
inputting the mapping relation into a preset neural network model for training to obtain an attribute identification model, wherein the attribute identification model is used for outputting corresponding sample attributes based on sample identifiers;
inputting the identification information into the attribute identification model, and outputting an identification attribute;
the catalog determining unit includes:
a type determining subunit, configured to determine type information corresponding to the identifier attribute, where the type information is used to reflect a file type of the file to be stored;
a type matching subunit, configured to determine directory information corresponding to the type information based on the type information;
the rule determining module includes:
the catalog matching unit is used for matching the catalog information with the newly-built catalog to obtain a matching result;
the original rule determining unit is used for determining the directory information as an original directory if the matching result is that the matching fails, and acquiring an original rule corresponding to the original directory;
a new-built rule determining unit, configured to determine that the directory information is a new directory if the matching result is that the matching is successful, and obtain a new-built rule corresponding to the new directory;
the file storage module comprises:
the original mode storage unit is used for determining that the rule attribute information is an original distribution rule if the rule is the original rule, and performing distributed storage on the file to be stored based on the original distribution rule;
a new mode storage unit, configured to determine that the rule attribute information is a new distribution rule if the rule is a new rule, and perform distributed storage on the file to be stored based on the new distribution rule;
the newly-built mode storage unit comprises:
the file slicing subunit is used for slicing the file to be stored to obtain a plurality of file slices;
the distributed storage subunit is used for determining a target disk corresponding to the new distribution rule based on the new distribution rule and storing a plurality of file fragments into the target disk;
the device is further configured for:
after the target disk is determined, acquiring the remaining storage space of each storage area in the target disk;
determining the space utilization rate of each storage area based on the remaining storage space, and sorting all the storage areas based on the space utilization rate;
selecting, according to the number of the file fragments, that number of storage areas by space-utilization ranking, and storing the file fragments in the selected storage areas respectively;
alternatively, the device is further configured for:
after the target disk is determined, analyzing the historical usage record of each storage area, and determining the usage frequency of each storage area;
sorting the storage areas based on the usage frequency;
and selecting, according to the number of the file fragments, that number of storage areas by usage-frequency ranking, and storing the file fragments in the selected storage areas respectively.
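The usage-frequency alternative in claim 2 can be sketched in the same way. Here the historical usage record is assumed to be a flat sequence of area names, one entry per past use, and the least frequently used areas are assumed to be preferred; neither detail is stated in the claim, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def pick_areas_by_frequency(area_names: Iterable[str],
                            history: Iterable[str],
                            fragment_count: int) -> List[str]:
    """Return up to fragment_count area names, least frequently used first.

    history is a hypothetical usage log with one area name per past use.
    Areas absent from the log count as zero uses; ties are broken by name.
    """
    counts = Counter(history)
    ranked = sorted(area_names, key=lambda name: (counts[name], name))
    return ranked[:fragment_count]
```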
3. A terminal comprising a memory, a processor, and a file storage program based on a flush rule that is stored in the memory and executable on the processor, wherein the processor, when executing the file storage program based on a flush rule, implements the steps of the file storage method based on a flush rule as claimed in claim 1.
4. A computer-readable storage medium, wherein a file storage program based on a flush rule is stored on the computer-readable storage medium, and the file storage program based on a flush rule, when executed by a processor, implements the steps of the file storage method based on a flush rule as claimed in claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311800629.XA CN117453153B (en) | 2023-12-26 | 2023-12-26 | File storage method, device, terminal and medium based on flush rule |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117453153A (en) | 2024-01-26 |
CN117453153B (en) | 2024-04-09 |
Family ID: 89591366
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311800629.XA (Active) CN117453153B (en) | 2023-12-26 | 2023-12-26 | File storage method, device, terminal and medium based on flush rule |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117453153B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012164104A (en) * | 2011-02-07 | 2012-08-30 | Nec Corp | File classification device |
CN102938784A (en) * | 2012-11-06 | 2013-02-20 | 无锡江南计算技术研究所 | Method and system used for data storage and used in distributed storage system |
CN104239376A (en) * | 2013-11-07 | 2014-12-24 | 新华瑞德(北京)网络科技有限公司 | Method and device for storing data |
WO2015057240A1 (en) * | 2013-10-18 | 2015-04-23 | Hitachi Data Systems Engineering UK Limited | Target-driven independent data integrity and redundancy recovery in a shared-nothing distributed storage system |
CN107704201A (en) * | 2017-09-11 | 2018-02-16 | 厦门集微科技有限公司 | Data storage handling method and device |
CN110825704A (en) * | 2019-09-27 | 2020-02-21 | 华为技术有限公司 | Data reading method, data writing method and server |
CN112817535A (en) * | 2021-02-03 | 2021-05-18 | 柏科数据技术(深圳)股份有限公司 | Method and device for distributing homing groups and distributed storage system |
CN114153806A (en) * | 2021-12-03 | 2022-03-08 | 杭州安恒信息技术股份有限公司 | File storage method, device, equipment and storage medium |
CN116743780A (en) * | 2022-03-01 | 2023-09-12 | 网联清算有限公司 | Distributed storage system and method |
CN117130983A (en) * | 2023-08-29 | 2023-11-28 | 北京五八信息技术有限公司 | File storage method and device, electronic equipment and storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10222986B2 (en) * | 2015-05-15 | 2019-03-05 | Cisco Technology, Inc. | Tenant-level sharding of disks with tenant-specific storage modules to enable policies per tenant in a distributed storage system |
US11645266B2 (en) * | 2020-08-13 | 2023-05-09 | Red Hat, Inc. | Automated pinning of file system subtrees |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||