CN110134738A - Resource estimation method and device for a distributed storage system - Google Patents
Resource estimation method and device for a distributed storage system
- Publication number
- CN110134738A CN201910425874.4A
- Authority
- CN
- China
- Prior art keywords
- cluster
- resource
- data
- memory system
- distributed memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses a resource estimation method and device for a distributed storage system. The method comprises: receiving a resource usage query request directed at the clusters in the distributed storage system; obtaining, according to the resource usage query request, the metadata of each cluster in the distributed storage system; obtaining, according to the metadata of each cluster, the resource parameters currently occupied by each cluster, the currently occupied resource parameters including the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task; and calculating, according to the resource parameters currently occupied by each cluster, the storage resource usage parameter of the distributed storage system. By extracting the metadata of each cluster in the distributed storage system and using that metadata to determine the resource parameters each cluster currently occupies, the method estimates the resources of the distributed storage system from those parameters.
Description
Technical field
The present application belongs to the field of data processing, and in particular relates to a resource estimation method and device for a distributed storage system.
Background
In the big data era, enterprises generate more and more data, big data clusters keep growing, and enterprise costs rise sharply. If the resources a job will occupy can be estimated before a large-scale job is submitted, this greatly helps in optimizing the job, thereby reducing cluster resource consumption, ensuring cluster stability, and lowering enterprise costs.
At present, resource usage is estimated by building a test environment and testing there, or by running a trial directly online. This approach increases enterprise costs and adversely affects the online cluster.
Summary of the invention
To address the problem that resource usage is currently estimated by testing in a purpose-built test environment or by trial runs directly online, which increases enterprise costs and adversely affects the online cluster, the present application provides a resource estimation method and device for a distributed storage system.
The present application provides a resource estimation method for a distributed storage system, comprising:
receiving a resource usage query request directed at the clusters in the distributed storage system;
obtaining, according to the resource usage query request, the metadata of each cluster in the distributed storage system;
obtaining, according to the metadata of each cluster, the resource parameters currently occupied by each cluster, the currently occupied resource parameters including the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task;
calculating, according to the resource parameters currently occupied by each cluster, the storage resource usage parameter of the distributed storage system.
Optionally, the step of obtaining, according to the resource usage query request, the metadata of each cluster in the distributed storage system comprises:
collecting, at a predetermined period, the binary-format metadata files saved in the distributed storage system, and converting the metadata files to text format;
extracting metadata from the text-format metadata files, the metadata including at least one or any combination of the following: file system directory name, file system directory, access user, user group, permissions, file path, file modification time, and file access time.
Optionally, the step of obtaining, according to the metadata of each cluster, the resource parameters currently occupied by each cluster comprises:
parsing an identifier of the data to be queried from the resource usage query request, and obtaining the metadata of the data to be queried according to that identifier;
determining, according to the metadata of the data to be queried, the target cluster to which the data to be queried belongs;
sending the resource usage query request to the target cluster;
receiving the currently occupied resource parameters sent by the target cluster.
Optionally, after the step of determining, according to the metadata of the data to be queried, the target cluster to which the data to be queried belongs, and before the step of sending the resource usage query request to the target cluster, the method further comprises:
judging whether the target cluster is a heterogeneous cluster and, if so, rewriting the resource usage query request;
the sending of the resource usage query request to the target cluster then comprises: sending the rewritten resource usage query request to the target cluster.
Optionally, the storage resource usage parameter includes a task count and a memory usage, and the step of calculating, according to the resource parameters currently occupied by each cluster, the storage resource usage parameter of the distributed storage system comprises:
for each cluster, determining the maximum of the number of data files and the number of data blocks, calculating the sum of that maximum and the data volume, and calculating the ratio of that sum to the data block size, to obtain the task count of the cluster;
determining the task count of the distributed storage system according to the task count of each cluster;
for each cluster, calculating the product of the task count and the memory required to process each task, to obtain the memory usage of the cluster;
determining the memory usage of the distributed storage system according to the memory usage of each cluster.
The present application also provides a resource estimation device for a distributed storage system, comprising:
a receiving module, configured to receive a resource usage query request directed at the clusters in the distributed storage system;
a first obtaining module, configured to obtain, according to the resource usage query request, the metadata of each cluster in the distributed storage system;
a second obtaining module, configured to obtain, according to the metadata of each cluster, the resource parameters currently occupied by each cluster, the currently occupied resource parameters including the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task;
a computing module, configured to calculate, according to the resource parameters currently occupied by each cluster, the storage resource usage parameter of the distributed storage system.
Optionally, the first obtaining module comprises:
a collecting submodule, configured to collect, at a predetermined period, the binary-format metadata files saved in the distributed storage system, and to convert the metadata files to text format;
an extracting submodule, configured to extract metadata from the text-format metadata files, the metadata including at least one or any combination of the following: file system directory name, file system directory, access user, user group, permissions, file path, file modification time, and file access time.
Optionally, the second obtaining module comprises:
an acquisition submodule, configured to parse an identifier of the data to be queried from the resource usage query request, and to obtain the metadata of the data to be queried according to that identifier;
a determining submodule, configured to determine, according to the metadata of the data to be queried, the target cluster to which the data to be queried belongs;
a sending submodule, configured to send the resource usage query request to the target cluster;
a receiving submodule, configured to receive the currently occupied resource parameters sent by the target cluster.
Optionally, the second obtaining module further comprises:
a judging module, configured to judge whether the target cluster is a heterogeneous cluster and, if so, to rewrite the resource usage query request;
the sending submodule is specifically configured to send the rewritten resource usage query request to the target cluster.
Optionally, the computing module comprises:
a first computing submodule, configured to, for each cluster, determine the maximum of the number of data files and the number of data blocks, calculate the sum of that maximum and the data volume, and calculate the ratio of that sum to the data block size, to obtain the task count of the cluster;
a second computing submodule, configured to determine the task count of the distributed storage system according to the task count of each cluster;
a third computing submodule, configured to, for each cluster, calculate the product of the task count and the memory required to process each task, to obtain the memory usage of the cluster;
a fourth computing submodule, configured to determine the memory usage of the distributed storage system according to the memory usage of each cluster.
The resource estimation method for a distributed storage system provided by the present application builds a full-dimensional profile of the distributed storage system, extracts the metadata of each cluster in the distributed storage system, and parses the submitted resource usage query request. According to the metadata, it determines the resource parameters each cluster currently occupies, and it combines these parameters with the calculation procedure to estimate the total task count and total memory usage of the job, thereby completing the resource estimation for the distributed storage system.
Brief description of the drawings
Fig. 1 is a flowchart of a resource estimation method for a distributed storage system provided by the first embodiment of the present application;
Fig. 2 shows an optional implementation of step S2 in Fig. 1 provided by the first embodiment of the present application;
Fig. 3 shows an optional implementation of step S3 in Fig. 1 provided by the first embodiment of the present application;
Fig. 4 shows another optional implementation of step S4 in Fig. 1 provided by the first embodiment of the present application;
Fig. 5 is a structural schematic diagram of a resource estimation device for a distributed storage system provided by the second embodiment of the present application;
Fig. 6 is another structural schematic diagram of the resource estimation device for a distributed storage system provided by the second embodiment of the present application;
Fig. 7 is another structural schematic diagram of the resource estimation device for a distributed storage system provided by the second embodiment of the present application;
Fig. 8 is another structural schematic diagram of the resource estimation device for a distributed storage system provided by the second embodiment of the present application.
Detailed description
To enable those skilled in the art to better understand the technical solution of the present invention, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
The present application provides a resource estimation method and device for a distributed storage system, which are described in detail one by one below with reference to the drawings of the embodiments provided by the present application.
The resource estimation method for a distributed storage system provided by the first embodiment of the present application is as follows:
Fig. 1 shows a resource estimation method for a distributed storage system provided by an embodiment of the present application, which comprises the following steps.
Step S1: receive a resource usage query request directed at the clusters in the distributed storage system.
In this step, a resource usage query request for each cluster in the distributed storage system, i.e. an SQL query request, is received. SQL is the abbreviation of Structured Query Language. SQL is a database query and programming language used to access data and to query, update, and manage relational database systems; it is also the file extension used for database script files.
Step S2: obtain, according to the resource usage query request, the metadata of each cluster in the distributed storage system.
Metadata, also known as intermediary data or relay data, is data that describes data (data about data). It mainly describes the attributes (properties) of data and supports functions such as indicating storage location, recording history, resource discovery, and file recording. Metadata can be regarded as a kind of electronic catalog: to serve as a catalog, it must describe and collect the content or characteristics of the data, and it thereby assists data retrieval. Metadata is information about the organization of data, data domains, and their relationships; in short, metadata is data about data.
Preferably, as shown in Fig. 2, step S2, obtaining the metadata of each cluster in the distributed storage system according to the resource usage query request, comprises:
Step S201: collect, at a predetermined period, the binary-format metadata files saved in the distributed storage system, and convert the metadata files to text format.
A distributed storage system keeps the details of the whole system's directories and files in memory. To prevent this in-memory data from being lost after a crash, the storage system serializes it to disk in binary form at regular intervals.
In this step, the binary files in the distributed storage system are collected periodically and deserialized into text format for metadata extraction. The predetermined period is a preset value that can be set as required and is not limited here.
Step S202: extract metadata from the text-format metadata files.
In this step, the metadata of each cluster is extracted by distributed computation. The extraction involves custom key-value (KV) pairs, secondary sorting, custom partitioning, custom merging, and custom grouping. The extracted metadata includes at least one or any combination of the following: file system directory name, file system directory, access user (the user who owns the file), user group, permissions, file path, file modification time, and file access time. It may also include other data, such as directory capacity (the capacity of a folder), directory file count (the number of files under a folder), maximum, minimum, and mean file size, and file format.
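The extraction in step S202 can be illustrated with a minimal single-process sketch. It assumes, purely for illustration, that each line of the text-format metadata file is tab-separated and carries the fields in the order shown; the record layout, the names FileMetadata, parse_metadata_line and directory_summary, and the size field are hypothetical and are not specified by the application, which performs this extraction as a distributed computation.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class FileMetadata:
    """One record of the text-format metadata file (hypothetical layout)."""
    path: str
    owner: str
    group: str
    permission: str
    modification_time: str
    access_time: str
    size_bytes: int


def parse_metadata_line(line: str, sep: str = "\t") -> FileMetadata:
    """Split one text line into the metadata fields listed in step S202."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    path, owner, group, permission, mtime, atime, size = fields
    return FileMetadata(path, owner, group, permission, mtime, atime, int(size))


def directory_summary(records: Iterable[FileMetadata]) -> Dict[str, Tuple[int, int]]:
    """Group records by parent directory: (file count, total capacity in bytes)."""
    summary: Dict[str, Tuple[int, int]] = {}
    for record in records:
        directory = record.path.rsplit("/", 1)[0] or "/"
        count, total = summary.get(directory, (0, 0))
        summary[directory] = (count + 1, total + record.size_bytes)
    return summary
```

The grouping by parent directory plays the role that custom partitioning and grouping would play in the distributed version; the per-directory file count and capacity correspond to the optional "directory file count" and "directory capacity" items.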
Specifically, an enterprise often has more than one big data cluster; it may have multiple clusters, or even heterogeneous clusters of different types. A submitted SQL query may need to be computed across several clusters together. Whether it runs on a single cluster or on multiple heterogeneous clusters, the query consumes resources and requires a resource usage estimate, so the system must first collect the metadata of every cluster. Each heterogeneous cluster exposes its metadata through an HTTP interface, and the metadata consumer calls this HTTP interface in its program to obtain the metadata of each heterogeneous cluster. In other words, the steps above are executed for each cluster.
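As a rough sketch of the collection described above, the following code calls one HTTP interface per cluster and gathers the responses. The endpoint URLs, the JSON response format, and the function names are assumptions made for illustration; the application only states that each heterogeneous cluster exposes its metadata over an HTTP interface.

```python
import json
from typing import Callable, Dict
from urllib.request import urlopen


def http_get_json(url: str, timeout: float = 10.0) -> dict:
    """Fetch a URL and decode the body as JSON (assumed response format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def collect_cluster_metadata(
    endpoints: Dict[str, str],
    fetch: Callable[[str], dict] = http_get_json,
) -> Dict[str, dict]:
    """Call the metadata interface of every cluster and key the results by cluster name.

    `endpoints` maps a cluster name to its (hypothetical) metadata URL.
    `fetch` is injectable so the collection logic can be exercised without a network.
    """
    return {name: fetch(url) for name, url in endpoints.items()}
```

Passing the fetch function as a parameter keeps the per-cluster loop independent of the transport, which makes it easy to substitute a stub during testing.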
Step S3: obtain, according to the metadata of each cluster, the resource parameters currently occupied by each cluster.
In this step, the data needed for the resource assessment of each cluster, i.e. the resource parameters each cluster currently occupies, is obtained according to the metadata of each cluster. These parameters include the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task. Note that the currently occupied resource parameters are dynamic data generated during data processing.
Preferably, as shown in Fig. 3, step S3, obtaining the resource parameters currently occupied by each cluster according to the metadata of each cluster, comprises:
Step S301: parse an identifier of the data to be queried from the resource usage query request, and obtain the metadata of the data to be queried according to that identifier.
In this step, when an SQL job is submitted, the logical execution plan and the physical execution plan of the SQL are parsed out. The identifier of the data to be queried is obtained from the parsed logical and physical execution plans, and the metadata corresponding to the data to be queried is obtained according to that identifier. At this point the data to be queried itself has not been parsed; only the identifier indicating which data is to be queried has been parsed out.
Step S302: determine, according to the metadata of the data to be queried, the target cluster to which the data to be queried belongs.
In this step, based on the metadata obtained in the previous step, the file system directory name, file system directory, and file path in the metadata are used to determine which cluster the file path corresponds to.
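One possible way to map a file path to its cluster, shown here only as a sketch, is a longest-prefix match against the root directory of each cluster. The application does not specify the matching rule; the cluster_roots mapping and the function name find_target_cluster are hypothetical.

```python
from typing import Dict, Optional


def find_target_cluster(file_path: str, cluster_roots: Dict[str, str]) -> Optional[str]:
    """Return the cluster whose root directory is the longest prefix of file_path.

    cluster_roots maps a cluster name to the file system directory it owns
    (an assumed representation of the directory information in the metadata).
    Returns None when no cluster owns the path.
    """
    best_name: Optional[str] = None
    best_length = -1
    for name, root in cluster_roots.items():
        normalized = root.rstrip("/")
        prefix = normalized + "/"
        # Match the root itself or anything strictly below it.
        if file_path == normalized or file_path.startswith(prefix):
            if len(prefix) > best_length:
                best_name, best_length = name, len(prefix)
    return best_name
```

Appending the trailing slash before comparing prevents a directory such as /warehouse_old from being attributed to a cluster rooted at /warehouse.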
Step S303: send the resource usage query request to the target cluster.
In this step, the cluster to which the data to be queried belongs has been determined from the metadata in the previous step, and the resource usage query request is then routed to that target cluster.
Preferably, after step S302 and before step S303, the method further comprises: judging whether the target cluster is a heterogeneous cluster and, if so, rewriting the resource usage query request. Step S303, sending the resource usage query request to the target cluster, then comprises: sending the rewritten resource usage query request to the target cluster.
In this step, because the distributed storage system may contain heterogeneous clusters, the SQL needs to be rewritten to some degree during routing. For each heterogeneous cluster, the SQL is rewritten into a statement suited to that cluster; the specific rewritten statement can be set as required and is not limited here. Once the target heterogeneous cluster corresponding to the data to be queried has been identified, the SQL statement is rewritten for that heterogeneous cluster, and the rewritten SQL statement is routed to the corresponding target heterogeneous cluster.
Taking real big data systems as an example: the SQL syntax supported by Hive, ES, and HBase clusters differs to some degree. The cluster type can be obtained from the metadata, and the execution syntax is adapted to the specifics of that cluster type. For example, HBase itself does not support SQL, so the SQL statement must be converted into calls to HBase's built-in API for computation. An API is a calling interface that an operating system provides to application programs; by calling the operating system's API, an application program has the operating system execute its commands.
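The rewriting step can be pictured as a dispatch on cluster type. The sketch below is illustrative only: the registry REWRITERS, the functions keep_sql and to_hbase_call, and the dictionary returned for HBase are hypothetical placeholders, since the application leaves the concrete rewriting rules to the implementer.

```python
from typing import Any, Callable, Dict

Rewriter = Callable[[str], Any]


def keep_sql(sql: str) -> str:
    """Pass the statement through unchanged (assumed for SQL-capable clusters)."""
    return sql


def to_hbase_call(sql: str) -> dict:
    """Placeholder for translating SQL into HBase API calls.

    A real implementation would build actual API requests; this sketch only
    records that the statement was redirected to a non-SQL engine.
    """
    return {"engine": "hbase", "operation": "scan", "original_sql": sql}


# Hypothetical registry: cluster type (from the metadata) -> rewriting rule.
REWRITERS: Dict[str, Rewriter] = {"hive": keep_sql, "hbase": to_hbase_call}


def rewrite_request(sql: str, cluster_type: str, is_heterogeneous: bool) -> Any:
    """Rewrite the query only when the target cluster is heterogeneous."""
    if not is_heterogeneous:
        return sql
    rewriter = REWRITERS.get(cluster_type.lower(), keep_sql)
    return rewriter(sql)
```

Keeping the rules in a registry means that supporting an additional cluster type only requires registering one more rewriting function.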
Step S304: receive the currently occupied resource parameters sent by the target cluster.
In this step, the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task are queried from each target cluster according to the metadata. The extracted metadata mainly includes at least one or any combination of the following: file system directory name, file system directory, access user, user group, permissions, file path, file modification time, and file access time. The file system directory name, file system directory, and file path indicate which cluster the metadata corresponds to; the access user, user group, permissions, file modification time, and file access time are used to obtain the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task for each target cluster.
Note that after the SQL is rewritten, the data to be queried that will be scanned must be parsed again according to the rewritten SQL. According to the metadata corresponding to the data to be queried, the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task are then queried from each target heterogeneous cluster.
Step S4: calculate, according to the resource parameters currently occupied by each cluster, the storage resource usage parameter of the distributed storage system.
In this step, the data needed for the resource assessment of the distributed storage system obtained in step S3, i.e. the currently occupied resource parameters, is used to calculate the final storage resource usage parameters of the distributed storage system, including the total task count and the total memory usage, thereby completing the resource assessment.
Preferably, as shown in Fig. 4, the storage resource usage parameter includes a task count and a memory usage, and step S4, calculating the storage resource usage parameter of the distributed storage system according to the resource parameters currently occupied by each cluster, comprises:
Step S401: for each cluster, determine the maximum of the number of data files and the number of data blocks, calculate the sum of that maximum and the data volume, and calculate the ratio of that sum to the data block size, to obtain the task count of the cluster. Each task corresponds to one CPU core, so number of CPU cores = number of tasks.
In this step, the task count of a single cluster is calculated: number of tasks = max(number of data files, number of data blocks) + data volume / data block size. The task count is thus mainly determined by the number of files and the size of the data volume.
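The per-cluster calculation of step S401 can be sketched as follows. This sketch follows the wording of the step, which forms the sum of the maximum and the data volume first and then divides by the data block size; if the printed formula is instead read with ordinary operator precedence, the division would apply to the data volume only. The function names are hypothetical, and the data volume and data block size are assumed to be expressed in the same unit.

```python
def estimate_task_count(
    file_count: int,
    block_count: int,
    data_volume: float,
    block_size: float,
) -> float:
    """Task count of one cluster, following the wording of step S401.

    (max(file_count, block_count) + data_volume) / block_size
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return (max(file_count, block_count) + data_volume) / block_size


def estimate_cpu_cores(task_count: float) -> float:
    """Each task corresponds to one CPU core, so the two numbers are equal."""
    return task_count
```

For instance, with 100 data files, 80 data blocks, a data volume of 900, and a data block size of 10, the sketch evaluates (100 + 900) / 10.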
In a preferred embodiment, if the mean file size is smaller than the data block size configured in the system, merging is recommended during processing. The main tuning direction is for the minimum shard count to be greater than or equal to the number of data blocks, with the specific tuning based on the resource assessment. For example, if the mean file size is 1M, the data block size is 10M, and a total of 100 files need to be processed, the mean file size is smaller than the data block size, so the minimum shard count during processing is greater than or equal to 100/10 = 10.
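The worked example above can be reproduced with a small sketch. It assumes that the number of data blocks after merging is the total data size divided by the data block size, rounded up; the rounding and the function names should_merge and minimum_shard_count are assumptions for illustration.

```python
import math


def should_merge(mean_file_size: float, block_size: float) -> bool:
    """Merging is recommended when files are, on average, smaller than a block."""
    return mean_file_size < block_size


def minimum_shard_count(file_count: int, mean_file_size: float, block_size: float) -> int:
    """Lower bound on the shard count: the number of blocks the merged data fills."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    total_size = file_count * mean_file_size
    return math.ceil(total_size / block_size)
```

With 100 files of mean size 1M and a block size of 10M, should_merge(1, 10) holds and minimum_shard_count(100, 1, 10) gives the lower bound of 10 stated in the example.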
Step S402: determine the task count of the distributed storage system according to the task count of each cluster.
The resource assessment results of the clusters in the distributed storage system are merged to calculate the total task count and the total number of CPU cores, using the following formulas:
total task count (total_task) = cluster task + cluster task + cluster task + ...;
total number of CPU cores = total task count.
Step S403: for each cluster, calculate the product of the task count and the memory required to process each task, to obtain the memory usage of the cluster.
In this step, the memory usage of a single cluster is calculated: memory = number of tasks * memory required to process each task.
Step S404: determine the memory usage of the distributed storage system according to the memory usage of each cluster.
In this step, the resource assessment results of the clusters in the distributed storage system are merged to calculate the total memory usage, using the following formula:
total memory usage (total_memory) = cluster memory + cluster memory + cluster memory + ...
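Steps S402 to S404 amount to summing the per-cluster results. The following sketch shows one way to express this; the ClusterEstimate class and the aggregate function are hypothetical names, while total_task and total_memory mirror the identifiers used in the formulas.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class ClusterEstimate:
    """Per-cluster inputs to the merge (hypothetical container)."""
    name: str
    task_count: float
    memory_per_task: float

    @property
    def memory(self) -> float:
        # Step S403: memory = number of tasks * memory required per task.
        return self.task_count * self.memory_per_task


def aggregate(estimates: Iterable[ClusterEstimate]) -> Dict[str, float]:
    """Merge per-cluster results into system-wide totals (steps S402 and S404)."""
    items = list(estimates)
    total_task = sum(item.task_count for item in items)
    total_memory = sum(item.memory for item in items)
    return {
        "total_task": total_task,
        "total_cpu_cores": total_task,  # one CPU core per task
        "total_memory": total_memory,
    }
```

As a usage example, two clusters with 100 tasks at 2 units of memory each and 50 tasks at 4 units each give a total of 150 tasks and 400 units of memory.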
In a preferred embodiment, the embodiment of the present application also calculates the bottleneck of each cluster in the distributed storage system: bottleneck = required resources / total resources. The required resources are the memory usage of a cluster, and the total resources are the total memory of that cluster.
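The bottleneck ratio is a single division per cluster, sketched below. Ranking the clusters by this ratio is an added illustration rather than something the application describes, and the function names are hypothetical.

```python
from typing import Dict, Tuple


def bottleneck_ratio(required_memory: float, total_memory: float) -> float:
    """bottleneck = required resources / total resources, for one cluster."""
    if total_memory <= 0:
        raise ValueError("total_memory must be positive")
    return required_memory / total_memory


def most_constrained(clusters: Dict[str, Tuple[float, float]]) -> str:
    """Name of the cluster with the highest ratio.

    `clusters` maps a cluster name to (required memory, total memory).
    """
    return max(clusters, key=lambda name: bottleneck_ratio(*clusters[name]))
```

A ratio close to or above 1 would indicate that the estimated memory usage approaches or exceeds what the cluster has available.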
The resource estimation method for a distributed storage system provided by the present application builds a full-dimensional profile of the distributed storage system, extracts the metadata of each cluster in the distributed storage system, and parses the submitted SQL. According to the generated logical plan, physical plan, and metadata, it determines the resource parameters each cluster currently occupies, and it combines these parameters with the calculation procedure to estimate the total task count and total memory usage of the job, thereby completing the resource estimation for the distributed storage system.
The resource estimation device for a distributed storage system provided by the second embodiment of the present application is as follows:
Fig. 5 is a structural schematic diagram of a resource estimation device for a distributed storage system provided by an embodiment of the present application, which comprises the following modules.
Receiving module 11, configured to receive a resource usage query request directed at the clusters in the distributed storage system;
First obtaining module 12, configured to obtain, according to the resource usage query request, the metadata of each cluster in the distributed storage system;
Second obtaining module 13, configured to obtain, according to the metadata of each cluster, the resource parameters currently occupied by each cluster, the currently occupied resource parameters including the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task;
Computing module 14, configured to calculate, according to the resource parameters currently occupied by each cluster, the storage resource usage parameter of the distributed storage system.
Optionally, as shown in Fig. 6, the first obtaining module 12 comprises:
Collecting submodule 121, configured to collect, at a predetermined period, the binary-format metadata files saved in the distributed storage system, and to convert the metadata files to text format;
Extracting submodule 122, configured to extract metadata from the text-format metadata files, the metadata including at least one or any combination of the following: file system directory name, file system directory, access user, user group, permissions, file path, file modification time, and file access time.
Optionally, as shown in Fig. 7, the second obtaining module 13 comprises:
Acquisition submodule 131, configured to parse an identifier of the data to be queried from the resource usage query request, and to obtain the metadata of the data to be queried according to that identifier;
Determining submodule 132, configured to determine, according to the metadata of the data to be queried, the target cluster to which the data to be queried belongs;
Sending submodule 133, configured to send the resource usage query request to the target cluster;
Receiving submodule 134, configured to receive the currently occupied resource parameters sent by the target cluster.
Optionally, the second obtaining module 13 further comprises (not shown in the figures):
a judging module, configured to judge whether the target cluster is a heterogeneous cluster and, if so, to rewrite the resource usage query request;
the sending submodule is specifically configured to send the rewritten resource usage query request to the target cluster.
Optionally, as shown in Fig. 8, the computing module 14 comprises:
First computing submodule 141, configured to, for each cluster, determine the maximum of the number of data files and the number of data blocks, calculate the sum of that maximum and the data volume, and calculate the ratio of that sum to the data block size, to obtain the task count of the cluster;
Second computing submodule 142, configured to determine the task count of the distributed storage system according to the task count of each cluster;
Third computing submodule 143, configured to, for each cluster, calculate the product of the task count and the memory required to process each task, to obtain the memory usage of the cluster;
Fourth computing submodule 144, configured to determine the memory usage of the distributed storage system according to the memory usage of each cluster.
It should be understood that the above embodiments are merely exemplary embodiments adopted to illustrate the principle of the present invention, and the present invention is not limited thereto. Those of ordinary skill in the art can make various variations and improvements without departing from the spirit and essence of the present invention, and these variations and improvements are also regarded as falling within the protection scope of the present invention.
Claims (10)
1. A distributed storage system resource estimation method, characterized by comprising:
receiving a resource occupation query request for each cluster in a distributed storage system;
obtaining the metadata of each cluster in the distributed storage system according to the resource occupation query request;
obtaining, according to the metadata of each cluster, the resource parameters currently occupied by each cluster, the currently occupied resource parameters including the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task;
calculating a storage resource occupation parameter of the distributed storage system according to the resource parameters currently occupied by each cluster.
2. The distributed storage system resource estimation method according to claim 1, characterized in that the step of obtaining the metadata of each cluster in the distributed storage system according to the resource occupation query request comprises:
collecting, at a preset period, the binary-format metadata files saved in the distributed storage system, and converting the metadata files into text format;
extracting metadata from the text-format metadata files, the metadata including at least one of, or any combination of, the following: file system directory name, file system directory, accessing user, user group, permissions, file path, file modification time, and file access time.
3. The distributed storage system resource estimation method according to claim 1, characterized in that the step of obtaining, according to the metadata of each cluster, the resource parameters currently occupied by each cluster comprises:
parsing the identifier of the data to be queried from the resource occupation query request, and obtaining the metadata of the data to be queried according to that identifier;
determining, according to the metadata of the data to be queried, the target cluster to which the data to be queried belongs;
sending the resource occupation query request to the target cluster;
receiving the currently occupied resource parameters sent by the target cluster.
4. The distributed storage system resource estimation method according to claim 3, characterized in that, after the step of determining, according to the metadata of the data to be queried, the target cluster to which the data to be queried belongs, and before the step of sending the resource occupation query request to the target cluster, the method further comprises:
judging whether the target cluster is a heterogeneous cluster and, if so, rewriting the resource occupation query request;
the sending of the resource occupation query request to the target cluster comprises: sending the rewritten resource occupation query request to the target cluster.
5. The distributed storage system resource estimation method according to claim 1, characterized in that the storage resource occupation parameter includes a task count and a memory usage, and the step of calculating the storage resource occupation parameter of the distributed storage system according to the resource parameters currently occupied by each cluster comprises:
for each cluster, determining the maximum of the number of data files and the number of data blocks, calculating the sum of that maximum and the data volume, and calculating the ratio of that sum to the data block size to obtain the task count of the cluster;
determining the task count of the distributed storage system according to the task count of each cluster;
for each cluster, calculating the product of the task count and the memory required to process each task to obtain the memory usage of the cluster;
determining the memory usage of the distributed storage system according to the memory usage of each cluster.
6. A distributed storage system resource estimation device, characterized by comprising:
a receiving module, configured to receive a resource occupation query request for each cluster in a distributed storage system;
a first obtaining module, configured to obtain the metadata of each cluster in the distributed storage system according to the resource occupation query request;
a second obtaining module, configured to obtain, according to the metadata of each cluster, the resource parameters currently occupied by each cluster, the currently occupied resource parameters including the number of data files, the data volume, the data block size, the number of data blocks, and the memory required to process each task;
a computing module, configured to calculate a storage resource occupation parameter of the distributed storage system according to the resource parameters currently occupied by each cluster.
7. The distributed storage system resource estimation device according to claim 6, characterized in that the first obtaining module comprises:
a collecting submodule, configured to collect, at a preset period, the binary-format metadata files saved in the distributed storage system, and to convert the metadata files into text format;
an extracting submodule, configured to extract metadata from the text-format metadata files, the metadata including at least one of, or any combination of, the following: file system directory name, file system directory, accessing user, user group, permissions, file path, file modification time, and file access time.
8. The distributed storage system resource estimation device according to claim 6, characterized in that the second obtaining module comprises:
an acquisition submodule, configured to parse the identifier of the data to be queried from the resource occupation query request, and to obtain the metadata of the data to be queried according to that identifier;
a determining submodule, configured to determine, according to the metadata of the data to be queried, the target cluster to which the data to be queried belongs;
a sending submodule, configured to send the resource occupation query request to the target cluster;
a receiving submodule, configured to receive the currently occupied resource parameters sent by the target cluster.
9. The distributed storage system resource estimation device according to claim 8, characterized in that the second obtaining module further comprises:
a judging module, configured to judge whether the target cluster is a heterogeneous cluster and, if so, to rewrite the resource occupation query request;
the sending submodule is specifically configured to send the rewritten resource occupation query request to the target cluster.
10. The distributed storage system resource estimation device according to claim 6, characterized in that the computing module comprises:
a first computing submodule, configured to, for each cluster, determine the maximum of the number of data files and the number of data blocks, calculate the sum of that maximum and the data volume, and calculate the ratio of that sum to the data block size to obtain the task count of the cluster;
a second computing submodule, configured to determine the task count of the distributed storage system according to the task count of each cluster;
a third computing submodule, configured to, for each cluster, calculate the product of the task count and the memory required to process each task to obtain the memory usage of the cluster;
a fourth computing submodule, configured to determine the memory usage of the distributed storage system according to the memory usage of each cluster.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910425874.4A CN110134738B (en) | 2019-05-21 | 2019-05-21 | Distributed storage system resource estimation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110134738A (en) | 2019-08-16 |
CN110134738B (en) | 2021-09-10 |
Family
ID=67572348
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910425874.4A, Active, CN110134738B (en) | 2019-05-21 | 2019-05-21 | Distributed storage system resource estimation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110134738B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102542040A (en) * | 2011-12-27 | 2012-07-04 | 北京奇虎科技有限公司 | Capacity acquiring method and system |
CN103678563A (en) * | 2011-12-27 | 2014-03-26 | 北京奇虎科技有限公司 | Capacity obtaining method and system |
US20140372250A1 (en) * | 2011-03-04 | 2014-12-18 | Forbes Media Llc | System and method for providing recommended content |
CN104657260A (en) * | 2013-11-25 | 2015-05-27 | 航天信息股份有限公司 | Achievement method for distributed locks controlling distributed inter-node accessed shared resources |
CN108694071A (en) * | 2017-03-29 | 2018-10-23 | 瞻博网络公司 | Multi-cluster panel for distributed virtualized infrastructure element monitoring and policy control |
US20190051210A1 (en) * | 2017-08-09 | 2019-02-14 | Inchstones, LLC | Distributed architecture for data synchronization |
Non-Patent Citations (1)
Title |
---|
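Cai Tao et al.: "NVMMDS: A Metadata Management Method for Non-Volatile Memory" (蔡涛等: "NVMMDS-一种面向非易失存储器的元数据管理方法"), Journal of Computer Research and Development (《计算机研究与发展》) * |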
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110569317A (en) * | 2019-09-12 | 2019-12-13 | 北京明略软件系统有限公司 | metadata collection method and device for data source |
CN111680799A (en) * | 2020-04-08 | 2020-09-18 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing model parameters |
CN111680799B (en) * | 2020-04-08 | 2024-02-20 | 北京字节跳动网络技术有限公司 | Method and device for processing model parameters |
CN113553166A (en) * | 2020-04-26 | 2021-10-26 | 广州汽车集团股份有限公司 | Cross-platform high-performance computing integration method and system |
CN113111038A (en) * | 2021-03-31 | 2021-07-13 | 北京达佳互联信息技术有限公司 | File storage method, device, server and storage medium |
CN113111038B (en) * | 2021-03-31 | 2024-01-19 | 北京达佳互联信息技术有限公司 | File storage method, device, server and storage medium |
WO2023051270A1 (en) * | 2021-09-30 | 2023-04-06 | 中兴通讯股份有限公司 | Memory occupation amount pre-estimation method and apparatus, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110134738B (en) | 2021-09-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||