
WO2017045472A1 - Resource prediction method, system, and capacity management apparatus - Google Patents

Resource prediction method, system, and capacity management apparatus

Info

Publication number
WO2017045472A1
Authority
WO
WIPO (PCT)
Prior art keywords
service
prediction
service indicator
prediction result
correlation factor
Prior art date
Application number
PCT/CN2016/089927
Other languages
English (en)
French (fr)
Inventor
刘成华
张园园
刘征
赵禄强
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2017045472A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0896 Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/70 Admission control; Resource allocation
    • H04L47/82 Miscellaneous aspects
    • H04L47/822 Collecting or measuring resource availability data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/70 Admission control; Resource allocation
    • H04L47/83 Admission control; Resource allocation based on usage prediction

Definitions

  • the present invention relates to the field of cloud computing, and in particular, to a resource prediction method, system, and capacity management apparatus.
  • Capacity management assesses the capacity of existing network sites: it collects and analyzes the usage of CPU, memory, storage, network, and other resources under the current traffic volume, and estimates the system's future supportability from the current business scale and the traffic growth trend, thereby supporting capacity scheduling or capacity expansion in advance to ensure smooth operation of the system. Therefore, a medium- and long-term resource prediction and estimation method is needed.
  • Resource prediction and estimation for a server or a service node is performed by predicting the resources consumed by the services invoked on that server themselves.
  • However, the factors affecting a service's resource consumption are complicated, and a service may be invoked by other services.
  • As a result, current server predictions usually have larger errors and are less accurate.
  • an object of the present invention is to provide a resource prediction method, system, and capacity management apparatus that take into consideration the relationship between services.
  • the present invention provides a method for resource prediction, including:
  • Acquiring the correlation factor of the initial service indicator includes: acquiring the association factor of the initial service indicator from a database, where the association factor is determined by the cloud controller when the service is deployed.
  • Acquiring the correlation factor of the initial service indicator includes: acquiring the association factor of the initial service indicator from a database, where the association factor is obtained by the call chain device by analyzing the call paths between services, and the call paths are recorded in a log.
  • Obtaining the prediction result of the final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and predicting the resource consumption of the server according to the prediction result of the final service indicator, specifically includes:
  • predicting the resource consumption of the server according to the prediction result of the final service indicator of each service and the unit resource consumption value corresponding to each service.
  • the present invention also provides a capacity management apparatus, including:
  • a first prediction module configured to predict the initial service indicator according to the service prediction algorithm and the historical data of the initial service indicator, and obtain a prediction result of the initial service indicator
  • An obtaining module configured to acquire an association factor of the initial service indicator, and obtain historical data of the association factor
  • a second prediction module configured to predict the correlation factor according to the historical data of the association factor and a service prediction algorithm, and obtain a prediction result of the correlation factor
  • a third prediction module configured to obtain a prediction result of the final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and predict a resource consumption of the server according to the prediction result of the final service indicator.
  • The acquiring module is configured to obtain, from the database, an association factor of the initial service indicator, where the association factor is determined by the cloud controller when the service is deployed.
  • The acquiring module is specifically configured to obtain, from the database, an association factor of the initial service indicator, where the association factor is obtained by the call chain device by analyzing the call paths between services, and the call paths are recorded in a log.
  • The third prediction module is specifically configured to: when multiple services are running on the server, obtain the prediction result of the final service indicator of each service according to the prediction result of the initial service indicator and the prediction result of the association factor; and predict the resource consumption of the server according to the prediction result of the final service indicator of each service and the unit resource consumption value corresponding to each service.
  • the present invention also provides a resource prediction system, including: a capacity management device and a database, where
  • The capacity management device is configured to: predict the initial service indicator according to the service prediction algorithm and the historical data of the initial service indicator, and obtain a prediction result of the initial service indicator; acquire a correlation factor of the initial service indicator, and obtain historical data of the correlation factor; predict the correlation factor according to the historical data of the correlation factor and a service prediction algorithm, and obtain a prediction result of the correlation factor; and obtain a prediction result of the final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and predict the resource consumption of the server according to the prediction result of the final service indicator. The database is configured to store the historical data of the initial service indicator, the correlation factor corresponding to the initial service indicator, and the historical data of the correlation factor.
  • The system further includes a call chain device, configured to obtain a log from the database, analyze the call paths in the log, and obtain the correlation factor of the initial service indicator.
  • the resource prediction of the server not only considers the initial service indicator, but also needs to consider the correlation factor that affects the initial service indicator, so that the resource prediction for the server is more accurate.
  • FIG. 1 is a schematic structural diagram of a system for resource prediction according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a method for resource prediction according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a relationship between calls between services according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a capacity management apparatus according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a capacity management apparatus according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a resource prediction system according to an embodiment of the present invention.
  • The embodiment of the invention provides a resource prediction method, a system, and a capacity management device, which are used to improve the accuracy of resource prediction for a server in cloud computing. They are described in detail below.
  • FIG. 1 is a schematic diagram of a network environment according to an embodiment of the present invention.
  • The system architecture includes the following devices: a cloud controller 102, a service resource pool 104, a service bus 106, a capacity management device 108, a call chain device 110, and an external system 118.
  • The databases include a cloud controller database 112, a capacity management database 114, and a call chain database 116.
  • the application is deployed by the cloud controller 102.
  • An application is composed of a group of services. During deployment, the user can select which other services each service is associated with. For example, the subscription service needs to be associated with the database service.
  • After the application is deployed, the cloud controller records the association relationships of the services in its own cloud controller database 112, and the cloud controller database 112 synchronizes changed associations between the services to the capacity management database 114.
  • When the cloud controller 102 deploys an application, it searches the service resource pool 104 for idle resources on which to deploy the application.
  • In the figure, the service resource pool 104 has resource A, resource B, and resource C; in practice there may be more resources than these.
  • A business process usually requires calls between several services. When one service calls another, it looks up on the service bus 106 where the target service is currently deployed. For example, when the subscription service needs to find the billing service, the service bus returns an available billing service node, and the subscription service can then communicate with the billing service.
  • The call chain device 110 and the capacity management device 108 belong to the operation and maintenance system. Each service records a call log when it is called.
  • The call chain device 110 collects the call logs of each service node, analyzes the call relationships between the services, and records the analyzed call relationships in the call chain database 116. Communication between the service nodes goes through the service bus 106.
  • Calling relationships between services can also be determined at deployment time in the cloud controller: when a new service is deployed in a cloud environment, the administrator specifies the association between the new service and other services.
  • The cloud controller records the nodes on which the service is deployed and the association between the service and other services in the cloud controller database 112.
  • the capacity management device can acquire the association relationship between the various services in the cloud controller database 112, so that the cloud system resources can be predicted and estimated.
  • the capacity management device 108 can also obtain a service association relationship from the capacity management database 114 or the call chain database 116.
  • The capacity management device 108 and the call chain device 110 can be integrated into a single piece of hardware.
  • The cloud controller database 112 stores information about which nodes the cloud controller 102 has deployed a given service on, together with the association relationships between services; the cloud controller database 112 can then synchronize this information to the capacity management database 114.
  • The call chain database 116 can likewise synchronize the association relationships between services analyzed by the call chain device 110 to the capacity management database 114. That is, the latest association relationships between services are stored in the capacity management database 114. Under normal circumstances, when a call relationship between certain services is needed, the capacity management device 108 first looks it up in the capacity management database 114; if it is not found there, the capacity management device 108 looks it up in the cloud controller database 112 or in the call chain database 116.
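  • The lookup order described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the three databases are modeled as hypothetical in-memory dictionaries, the function name `find_associations` is invented for the example, and the order in which the two fallback databases are tried is an assumption (the text only says one or the other is consulted).

```python
from typing import Dict, Optional, Set

# Hypothetical in-memory stand-ins for the three databases; each maps a
# service name to the set of services it is associated with.
ServiceAssociations = Dict[str, Set[str]]


def find_associations(
    service: str,
    capacity_db: ServiceAssociations,
    cloud_controller_db: ServiceAssociations,
    call_chain_db: ServiceAssociations,
) -> Optional[Set[str]]:
    """Look in the capacity management database first, then fall back.

    The fallback order (cloud controller database before call chain
    database) is an assumption made for this sketch.
    """
    for db in (capacity_db, cloud_controller_db, call_chain_db):
        if service in db:
            return db[service]
    return None  # no association recorded in any database
```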
  • The cloud controller database 112, the capacity management database 114, and the call chain database 116 can also be combined into one database, that is, the cloud controller 102, the capacity management device 108, and the call chain device 110 share a common database. In that case, all the data is stored in the capacity management database 114.
  • the external system 118 can directly access the services published in the system.
  • For example, if the system publishes a web service, the external system can access the service through its URL.
  • As shown in FIG. 2, a method for resource prediction according to an embodiment of the present invention specifically includes:
  • Step 201 The capacity management device determines a service indicator that affects the service usage on the server.
  • the capacity management device determines the respective service metrics of each service that affect the service usage.
  • Steps 202-204 illustrate a specific process for predicting a specific service indicator of a service.
  • the prediction process of other service indicators of a service is similar to 202-204.
  • Step 202: The initial service indicator is predicted according to the service prediction algorithm and the historical data of the initial service indicator, and the prediction result of the initial service indicator is obtained.
  • the capacity management device automatically detects the period of the service indicator and obtains the period of the service indicator.
  • the modulus value of each frequency point is the amplitude characteristic under the frequency value.
  • When the period of a service indicator is not obvious, or the period changes as the service evolves over time, manually specifying the period leads to inaccurate prediction.
  • Service indicators are reported to the database every day and saved there. Detecting the period automatically from the latest service indicator data not only reduces manual intervention but also adapts to changes in the period, which increases prediction accuracy.
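  • One way to detect a period from the amplitude at each frequency point is sketched below. The text does not spell out the transform that is used, so this example assumes a plain discrete Fourier transform and picks the frequency with the largest modulus; the function name `detect_period` is hypothetical, and the series is assumed to contain at least two samples.

```python
import cmath
import math
from typing import List


def detect_period(values: List[float]) -> int:
    """Return the dominant period (in samples) of a series.

    Computes a plain discrete Fourier transform, takes the modulus
    (amplitude) at each frequency point, and converts the strongest
    non-zero frequency into a period.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the constant component

    best_k, best_amp = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        amp = abs(coeff)  # modulus at frequency point k
        if amp > best_amp:
            best_k, best_amp = k, amp
    return round(n / best_k)
```

  • For daily data with a weekly rhythm, the function would be expected to return 7. A strong trend concentrates energy at low frequencies, so in practice the trend would probably need to be removed before detection.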
  • The capacity management device applies an m-term moving average to the historical data M_t of the service indicator, once when m is odd or twice when m is even, to obtain the trend component T_t.
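  • The moving-average step can be sketched as follows, assuming the usual centered form: a single m-term average when m is odd, and an m-term average followed by a 2-term average when m is even so that the window is centered on a sample. The function names and the use of `None` for the undefined edge positions are choices made for this example.

```python
from typing import List, Optional


def moving_average(values: List[float], m: int) -> List[float]:
    """Plain m-term moving average; output has len(values) - m + 1 items."""
    return [sum(values[i:i + m]) / m for i in range(len(values) - m + 1)]


def trend_component(values: List[float], m: int) -> List[Optional[float]]:
    """Centered trend T_t aligned with the input; edge positions are None."""
    smoothed = moving_average(values, m)
    if m % 2 == 0:
        # Even m: a second, 2-term average re-centers the window on a sample.
        smoothed = moving_average(smoothed, 2)
    half = m // 2
    pad: List[Optional[float]] = [None] * half
    return pad + smoothed + pad
```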
  • The capacity management device provides a variety of fitting models and, based on the curve characteristics and fitting error of the input data, automatically selects the best-matching trend prediction model from a plurality of models (seven in this example).
  • the capacity management device selects a trend prediction model that best matches the service indicator from the plurality of fitted models.
  • the cloud controller database 112 periodically sends the service indicator data to the capacity management database 114.
  • The capacity management database 114 stores the data of each service indicator. Because new service indicator data is generated continuously, the fitted model needs to be corrected periodically according to the updated data; the interval can be configured, for example one week, one month, or one quarter.
  • the capacity management device 108 acquires the data of the latest business indicator for a period of time from the capacity management database 114; extracts the model judgment feature, determines whether the feature is a constant, and if it is a constant, considers that it matches the corresponding model; the judgment characteristics of the different models are:
  • Feature1 = (Feature1-min(Feature1))/mean(Feature1-min(Feature1));
  • Feature2 = (Feature2-min(Feature2))/mean(Feature2-min(Feature2));
  • Feature5(i) = ((1/L(i+2))-(1/L(i+1)))/((1/L(i+1))-(1/L(i)));
  • Feature6(i) = Feature1(i+1)-Feature1(i);
  • Feature6 = (Feature6-min(Feature6))/mean(Feature6-min(Feature6));
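  • As an illustration of the "is the feature a constant" test, the sketch below computes Feature5 and checks how much it varies. It assumes that L denotes the indicator series (the text does not define L), and the tolerance used to decide "constant" is a hypothetical threshold, not a value from the source. Mathematically, the ratio is exactly constant when 1/L is a constant plus a geometric term, which is the case for a logistic-type growth curve.

```python
from statistics import mean, pstdev
from typing import List


def feature5(series: List[float]) -> List[float]:
    """Feature5(i) = ((1/L(i+2)) - (1/L(i+1))) / ((1/L(i+1)) - (1/L(i)))."""
    inv = [1.0 / v for v in series]
    return [
        (inv[i + 2] - inv[i + 1]) / (inv[i + 1] - inv[i])
        for i in range(len(series) - 2)
    ]


def is_roughly_constant(feature: List[float], tolerance: float = 0.05) -> bool:
    """Treat a feature as constant when its spread is small relative to its mean.

    The tolerance is a hypothetical threshold chosen for this sketch.
    """
    centre = mean(feature)
    return pstdev(feature) <= tolerance * abs(centre)
```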
  • As the service develops, the fitted trend prediction model may change. For example, while the indicator grows exponentially, an exponential prediction model is fitted; once development enters a stable period, a linear prediction model may fit instead. Each prediction therefore needs to correct the fitted trend prediction model based on the latest service data.
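  • Selecting a trend model by fitting error can be sketched as follows. The seven models of the embodiment are not listed in the text, so this example uses only two illustrative candidates (linear and exponential) and picks the one with the smaller sum of squared errors; all function names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Model = Callable[[float], float]


def _least_squares(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_linear(ts: List[float], ys: List[float]) -> Model:
    a, b = _least_squares(ts, ys)
    return lambda t: a + b * t


def fit_exponential(ts: List[float], ys: List[float]) -> Model:
    # Fit log(y) = log(A) + b*t; requires strictly positive observations.
    log_a, b = _least_squares(ts, [math.log(y) for y in ys])
    return lambda t: math.exp(log_a + b * t)


def select_trend_model(ys: List[float]) -> Tuple[str, Model]:
    """Fit every candidate and return the name and model with least error."""
    ts = [float(t) for t in range(len(ys))]
    candidates: Dict[str, Callable[[List[float], List[float]], Model]] = {
        "linear": fit_linear,
    }
    if all(y > 0 for y in ys):
        candidates["exponential"] = fit_exponential

    best_error, best_name, best_model = None, "", None
    for name, fit in candidates.items():
        model = fit(ts, ys)
        error = sum((model(t) - y) ** 2 for t, y in zip(ts, ys))
        if best_error is None or error < best_error:
            best_error, best_name, best_model = error, name, model
    return best_name, best_model
```

  • Re-running `select_trend_model` on the most recent window of data each time is one simple way to let the chosen model follow the change from exponential growth to a stable, roughly linear phase.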
  • Step 203 Acquire an association factor of the initial service indicator from a database, and obtain historical data of the association factor.
  • the relationship between the service indicators can be reflected by the association factor.
  • For example, if service indicator 2 affects service indicator 1, then service indicator 2 is an association factor of service indicator 1.
  • A service indicator may specifically be one of the services involved in a business. In the example below, service A is the first service indicator, service B is the association factor of the first service indicator, and service C and service D are association factors of service B.
  • The key point is acquiring the correlation factor of the service indicator, that is, obtaining the relationship between the services.
  • The embodiment of the present invention provides two solutions that automatically analyze and record the relationship between services: one obtains it through the cloud controller when the service is deployed, and the other obtains it through the call chain device. The two ways are introduced separately below:
  • The first way: obtained when the service is deployed through the cloud controller. When a new service is deployed in a cloud environment, the user specifies the association between the new service and other services.
  • The cloud controller stores the relationship between the service and other services in the cloud controller database.
  • The cloud controller database can then synchronize the associations between services to the capacity management database.
  • The administrator deploys service A through the cloud controller and configures the association relationship between service A and other services (such as service B and service C); the cloud controller searches for available resources in the service resource pool and deploys service A on those available resources.
  • After the cloud controller deploys the new service, it saves in the cloud controller database which nodes service A is deployed on and the relationship between service A and other services.
  • the cloud controller database synchronizes the associations between these services into the capacity management database.
  • the capacity management device can then obtain the association relationship between the service A and other services from the capacity management database.
  • The second way: obtained through the call chain device.
  • A single request may trigger calls among hundreds of services behind the front end. Which of those calls affect the request, which steps slow down the whole process, and how many machines must be allocated to the application cluster at business peaks such as major holidays are all questions that operations and maintenance must consider, but the complexity of the call environment makes accurate manual analysis and evaluation difficult. Faced with massive logs, the analysis needs to be automated: the "isolated" logs of the different components involved in a request are strung together to restore more valuable information and to assist problem location, capacity planning, and stability analysis. That is, the relationship between services is obtained through the call chain.
  • The specific process is as follows:
  • Each service records its call process in a log; the logs are then collected and aggregated into the call chain system.
  • The call chain device analyzes the logs and processes the call paths to obtain the call sources and the association relationships between the services, and records them in the call chain database.
  • the call log is recorded during the invocation of each service, including trace identifier (TraceID), serial number (SeqNo), call service and called service.
  • TraceID is a unique identifier ID generated when an external request accesses the front-end service. The TraceID is unchanged in the entire call chain.
  • SeqNo identifies the order of the tracing points in the call chain. SeqNo is a variable-length string in the format XX...XX, where each X corresponds to one level of call and the number of Xs is the call depth; the value of X increases from 1 in call order; and the SeqNo of the call root node (the node that generates the TraceID) is 1.
  • the service call response SeqNo is the same as the SeqNo of the corresponding service call request.
  • the format of each service record call log is: TraceID
  • service A is a front-end service
  • service B, service C, service D, and service E are back-end services.
  • service A receives a request from an external application, generates a TraceID of 10001, SeqNo is 1, and the recorded log is: 10001
  • Each of the above log records is stored in the call chain database.
  • When the calling relationship between the services of a certain business needs to be analyzed, the call chain device obtains the logs of the services involved from the call chain database and analyzes them. Records with the same TraceID belong to the same business call, and the SeqNo gives the order of the calls. For example, log analysis shows that the records with TraceID 10001 belong to the same business call, and from the SeqNo sequence 1, 1.1, 1.1.2 it follows that service A calls service B and service B calls service E. In this way, the calling relationship between the services of the business can be obtained.
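  • The analysis above can be sketched in a few lines. The exact log line format is truncated in this text, so the record layout used here (TraceID, SeqNo, and the service that handled the call) is a hypothetical simplification, as is the function name `call_edges`. The caller of a record is taken to be the record whose SeqNo is the same string with the last level removed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (TraceID, SeqNo, service that handled the call).
LogRecord = Tuple[str, str, str]


def call_edges(records: Iterable[LogRecord]) -> Set[Tuple[str, str]]:
    """Derive (caller, callee) pairs from TraceID/SeqNo log records."""
    by_trace: Dict[str, Dict[str, str]] = defaultdict(dict)
    for trace_id, seq_no, service in records:
        # Records sharing a TraceID belong to the same business call.
        by_trace[trace_id][seq_no] = service

    edges: Set[Tuple[str, str]] = set()
    for seq_to_service in by_trace.values():
        for seq_no, callee in seq_to_service.items():
            if "." not in seq_no:
                continue  # root node (SeqNo "1") has no caller
            parent_seq = seq_no.rsplit(".", 1)[0]
            caller = seq_to_service.get(parent_seq)
            if caller is not None:
                edges.add((caller, callee))
    return edges
```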
  • the association relationship between the services is synchronized to the capacity management database, so that the capacity management device can acquire the calling relationship between the services from the capacity management database.
  • After the capacity management device acquires the correlation factor of the service indicator, it acquires the historical data of the correlation factor from the capacity management database.
  • Step 204: Predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm to obtain a prediction result of the correlation factor; calculate the predicted value of the service indicator according to the prediction result of the initial service indicator obtained in step 202 and the prediction result of the correlation factor; and store the predicted value of the service indicator in the capacity management database.
  • Step 205 Obtain a prediction result of the service indicator of each service running on the server from the capacity management database, and predict resource consumption of the server.
  • Server resources are often affected by multiple services, and each service has a different period, trend, and busy time. After the impact of each service on server resources is superimposed, the period and trend of the server resources are no longer obvious, so predicting the system resources directly leads to inaccurate predictions.
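  • Step 205 can be illustrated with a small calculation. The text states only that the server's resource consumption is predicted from each service's final indicator forecast and that service's unit resource consumption value; this sketch assumes consumption is linear in the indicator and additive across services, and the function name and sample figures are invented for the example.

```python
from typing import Dict


def predict_server_resource(
    final_indicator_forecast: Dict[str, float],
    unit_resource_consumption: Dict[str, float],
) -> float:
    """Sum, over services, the forecast indicator times resource per unit.

    Linear, additive consumption is an assumption of this sketch.
    """
    return sum(
        forecast * unit_resource_consumption[service]
        for service, forecast in final_indicator_forecast.items()
    )
```

  • For example, with forecasts of 1000 subscription requests and 400 billing requests, and hypothetical unit costs of 0.02 and 0.05 CPU-seconds per request, the sketch yields 1000 × 0.02 + 400 × 0.05 = 40 CPU-seconds.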
  • the resource prediction of the server not only considers the initial service indicator, but also needs to consider the correlation factor that affects the initial service indicator, so that the resource prediction for the server is more accurate.
  • The prediction algorithm of the present invention optimizes the existing scheme by adding automatic period detection and trend fitting model selection, which saves labor cost, reduces interference from human factors, and increases prediction accuracy when the period or the trend changes.
  • When the indicator to be forecast is driven by multiple factors, its period and trend are not obvious and direct forecasting has a large error.
  • The business model is designed according to the characteristics of telecommunication services, and associated-indicator forecasting and resource estimation functions are added, which increases forecast accuracy.
  • a capacity management apparatus provided by the present invention includes:
  • the first prediction module 41 is configured to predict the initial service indicator according to the service prediction algorithm and the historical data of the initial service indicator, and obtain a prediction result of the initial service indicator;
  • the obtaining module 42 is configured to acquire an association factor of the initial service indicator, and obtain historical data of the association factor;
  • the second prediction module 43 is configured to predict the correlation factor according to the historical data of the association factor and the service prediction algorithm, and obtain a prediction result of the correlation factor;
  • the third prediction module 44 is configured to obtain a prediction result of the final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and predict a resource consumption of the server according to the prediction result of the final service indicator.
  • the obtaining module 42 is specifically configured to obtain, from the database, an association factor of the initial service indicator, where the association factor is determined by the cloud controller when the service is deployed.
  • the obtaining module 42 is specifically configured to obtain, from the database, an association factor of the initial service indicator, where the association factor is obtained by the call chain device by analyzing the call paths between the services, and the call paths are recorded in a log.
  • the third prediction module 44 is configured to: when multiple services are running on the server, obtain the prediction result of the final service indicator of each service according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predict the resource consumption of the server according to the prediction result of the final service indicator of each service and the unit resource consumption value corresponding to each service.
  • the resource prediction of the server not only considers the initial service indicator, but also needs to consider the correlation factor that affects the initial service indicator, so that the resource prediction for the server is more accurate.
  • The division into modules in the embodiment of the present invention is schematic and is only a division by logical function; in actual implementation there may be other ways of dividing. In addition, the functional modules in the embodiments of the present application can be integrated into one processing module, each module can exist physically on its own, or two or more modules can be integrated into one module.
  • The above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • If the integrated modules are implemented in the form of software functional modules and sold or used as separate products, they may be stored in a computer-readable storage medium.
  • A number of instructions are included in the computer-readable storage medium to cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application.
  • The foregoing storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random-access memory (RAM), a magnetic disk, an optical disc, and the like.
  • FIG. 5 is a schematic structural diagram of a capacity management device according to an embodiment of the present invention.
  • the capacity management device includes a processor 51 and a memory 52.
  • the processor 51 and the memory 52 are connected.
  • the specific connecting medium between the above components is not limited in the embodiment of the present invention.
  • In the embodiment of the present invention, the processor 51 and the memory 52 in FIG. 5 are connected through the bus 53.
  • The bus is indicated by a thick line in FIG. 5; the connection manner between other components is shown only for illustration and is not limiting.
  • The bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in FIG. 5, but this does not mean that there is only one bus or one type of bus.
  • The memory 52 is used to store the program code executed by the processor 51. The memory 52 may be a volatile memory, such as random-access memory (RAM).
  • The memory 52 can also be a non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or the memory 52 can be any other medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto. Further, the memory 52 may also be a combination of any of the above memories.
  • The processor 51 is configured to call the program code stored in the memory 52 through the bus and, by executing the called program code, to perform:
  • The processor 51 in the embodiment of the present invention may be a central processing unit (CPU).
  • the embodiment of the present invention further provides a resource prediction system, as shown in FIG. 6, including a capacity management device 61 and a database 62, wherein
  • the capacity management device 61 is configured to: predict the initial service indicator according to the service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator; acquire a correlation factor of the initial service indicator, and obtain historical data of the correlation factor; predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; obtain a prediction result of the final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predict resource consumption of the server according to the prediction result of the final service indicator;
  • the database 62 is configured to store historical data of the initial service indicator, the correlation factor corresponding to the initial service indicator, and historical data of the correlation factor.
  • the system further includes: a call chain device 63, configured to obtain record logs from the database, analyze the call paths in the record logs, and obtain the correlation factor of the initial service indicator.
  • embodiments of the present invention can be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • the computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • these computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

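The present invention provides a resource prediction method, a resource prediction system, and a capacity management device. The method includes: predicting an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator; acquiring a correlation factor of the initial service indicator, and obtaining historical data of the correlation factor; predicting the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; obtaining a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predicting resource consumption of a server according to the prediction result of the final service indicator. The resource prediction method, system, and capacity management device provided by the present invention can improve the accuracy of resource prediction.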

Description

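Resource Prediction Method, System, and Capacity Management Device
This application claims priority to Chinese Patent Application No. 201510590666.1, filed with the Chinese Patent Office on September 16, 2015 and entitled "Resource Prediction Method, System, and Capacity Management Device", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of cloud computing, and in particular, to a resource prediction method, a resource prediction system, and a capacity management device.
Background
Capacity management is used to evaluate the capacity of a site on the live network: it collects and analyzes the capacity of CPU, memory, storage, network, and other resources at the current business volume, and estimates how well the system can support future load based on the current business scale and the growth trend of the business volume. This allows capacity scheduling or expansion to be carried out in advance, keeping the system running smoothly. A medium- and long-term resource prediction and estimation method is therefore needed.
At present, resource prediction and estimation for a server or service node only predicts the resources consumed by the services themselves that the business on the server invokes. However, the factors affecting a service's resource consumption are complex, and the consumption may be affected by calls from other services. As a result, current server prediction results usually have large errors and are not accurate enough.
Summary
In view of the above problem, an object of the present invention is to provide a resource prediction method, a resource prediction system, and a capacity management device that take the relationships between services into account.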
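According to a first aspect, the present invention provides a resource prediction method, including:
predicting an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator;
acquiring a correlation factor of the initial service indicator, and obtaining historical data of the correlation factor;
predicting the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; and
obtaining a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and predicting resource consumption of a server according to the prediction result of the final service indicator.
With reference to the first aspect, in a first possible implementation of the first aspect, the acquiring a correlation factor of the initial service indicator specifically includes: acquiring the correlation factor of the initial service indicator from a database, where the correlation factor is determined by a cloud controller when a service is deployed.
With reference to the first aspect, in a second possible implementation of the first aspect, the acquiring a correlation factor of the initial service indicator specifically includes: acquiring the correlation factor of the initial service indicator from a database, where the correlation factor is obtained by a call chain device by analyzing call paths between services, and the call paths are recorded in record logs.
With reference to the first aspect, in a third possible implementation of the first aspect, when multiple businesses run on the server, the obtaining a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and predicting resource consumption of a server according to the prediction result of the final service indicator specifically includes:
obtaining a prediction result of the final service indicator of each business separately, according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and
predicting the resource consumption of the server according to the prediction result of the final service indicator of each business and the unit resource consumption value corresponding to each business.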
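According to a second aspect, the present invention further provides a capacity management device, including:
a first prediction module, configured to predict an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator;
an acquiring module, configured to acquire a correlation factor of the initial service indicator and obtain historical data of the correlation factor;
a second prediction module, configured to predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; and
a third prediction module, configured to obtain a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and to predict resource consumption of a server according to the prediction result of the final service indicator.
With reference to the second aspect, in a first possible implementation of the second aspect, the acquiring module is specifically configured to acquire the correlation factor of the initial service indicator from a database, where the correlation factor is determined by a cloud controller when a service is deployed.
With reference to the second aspect, in a second possible implementation of the second aspect, the acquiring module is specifically configured to acquire the correlation factor of the initial service indicator from a database, where the correlation factor is obtained by a call chain device by analyzing call paths between services, and the call paths are recorded in record logs.
With reference to the second aspect, in a third possible implementation of the second aspect, the third prediction module is specifically configured to: when multiple businesses run on the server, obtain a prediction result of the final service indicator of each business separately, according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predict the resource consumption of the server according to the prediction result of the final service indicator of each business and the unit resource consumption value corresponding to each business.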
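According to a third aspect, the present invention further provides a resource prediction system, including a capacity management device and a database, where
the capacity management device is configured to: predict an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator; acquire a correlation factor of the initial service indicator, and obtain historical data of the correlation factor; predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; obtain a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predict resource consumption of a server according to the prediction result of the final service indicator; and the database is configured to store the historical data of the initial service indicator, the correlation factor corresponding to the initial service indicator, and the historical data of the correlation factor.
With reference to the third aspect, in a first possible implementation of the third aspect, the system further includes a call chain device, configured to obtain record logs from the database, analyze the call paths in the record logs, and obtain the correlation factor of the initial service indicator.
When predicting server resources, the embodiments of the present invention consider not only the initial service indicator but also the correlation factors that affect it, which makes the server resource prediction more accurate.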
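Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a system framework for resource prediction according to an embodiment of the present invention;
FIG. 2 is a flowchart of a resource prediction method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of call relationships between services according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a capacity management device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a capacity management device according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a resource prediction system according to an embodiment of the present invention.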
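Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Apparently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention provide a resource prediction method, a resource prediction system, and a capacity management device, which are used to improve the accuracy of resource prediction for servers in cloud computing. Each is described in detail below.
To facilitate understanding of the embodiments of the present invention, the system architecture for resource prediction is described first.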
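FIG. 1 is a schematic diagram of a network environment according to an embodiment of the present invention. The system architecture includes the following devices: a cloud controller 102, a service resource pool 104, a service bus 106, a capacity management device 108, a call chain device 110, an external system 118, and several databases, namely a cloud controller database 112, a capacity management database 114, and a call chain database 116. Applications are deployed through the cloud controller 102. An application consists of a group of services, and at deployment time the user can select which other services these services are associated with; for example, an ordering service needs to be associated with a database service. After the application is deployed, the cloud controller records the relationships between the services in its own cloud controller database 112, and the cloud controller database 112 synchronizes any changed relationships to the capacity management database 114. When deploying an application, the cloud controller 102 searches the service resource pool 104 for idle resources on which to deploy it; for example, the service resource pool 104 contains resource A, resource B, and resource C, and may of course contain more. Sometimes one business requires calls between several services. When one service calls another, it queries the service bus 106 for where the target service is currently deployed; for example, when the ordering service looks up the billing service, the service bus returns the available billing service nodes, and the ordering service can then communicate with the billing service.
The call chain device 110 and the capacity management device 108 both belong to the operation and maintenance system. Each service records a call log when it is called. The call chain device 110 collects the call logs of all service nodes, analyzes the call relationships between the services, and records the analyzed call relationships in the call chain database 116. The service nodes communicate with each other through the service bus 106.
The call relationships between services can be determined when the cloud controller performs deployment. New services are frequently deployed in a cloud scenario. When deploying a new service, the administrator specifies the relationships between this service and other services, and the cloud controller records, in the cloud controller database 112, the nodes on which the service is deployed and its relationships with other services. The capacity management device can obtain the relationships between services from the cloud controller database 112 and can thus predict and estimate cloud system resources. The capacity management device 108 can of course also obtain the service relationships from the capacity management database 114 or the call chain database 116. The capacity management device 108 and the call chain device 110 may be integrated into a single piece of hardware.
Normally, the cloud controller database 112 stores information about the nodes on which the cloud controller 102 has deployed each service, together with the relationships between services, and the cloud controller database 112 can synchronize this information to the capacity management database 114. The call chain database 116 can synchronize the relationships between services analyzed by the call chain device 110 to the capacity management database 114. In other words, the capacity management database 114 stores the latest relationships between services. Normally, when the call relationships between certain services are needed, the capacity management device 108 first searches the capacity management database 114; if nothing is found there, the capacity management device 108 searches the cloud controller database 112 or the call chain database 116. The cloud controller database 112, the capacity management database 114, and the call chain database 116 may of course be merged into a single database shared by the cloud controller 102, the capacity management device 108, and the call chain device 110; in that case, all data is stored in the capacity management database 114.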
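By way of a non-limiting illustration, the lookup order described above can be sketched as follows. The function name `find_relationships` and the use of plain dictionaries in place of the three databases are assumptions made for this sketch, as is trying the cloud controller database before the call chain database (the description only says "or").

```python
def find_relationships(service, capacity_db, controller_db, call_chain_db):
    """Return the services related to `service`.

    The capacity management store is consulted first; the other two stores
    are fallbacks. Each store is modelled as a dict mapping a service name
    to a set of related service names (an assumption of this sketch).
    """
    for store in (capacity_db, controller_db, call_chain_db):
        related = store.get(service)
        if related:
            return related
    return set()  # no relationship recorded anywhere


# Hypothetical contents of the three stores.
capacity_db = {"ordering": {"billing"}}
controller_db = {"web": {"ordering"}}
call_chain_db = {"billing": {"database"}}

print(find_relationships("web", capacity_db, controller_db, call_chain_db))
```

When the three databases are merged into one, as the description allows, the fallback loop reduces to a single lookup.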
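The external system 118 can directly access the services published in the system. For example, if the system publishes a web service, the external system can access this service through its URL.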
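Referring to FIG. 2, which is a flowchart of an embodiment of a resource prediction method according to the present invention, the method specifically includes:
Step 201: The capacity management device determines the service indicators that affect the usage of the business on the server.
If multiple businesses run on the server, the capacity management device determines, for each business, the service indicators that affect its usage.
Steps 202 to 204 describe the prediction process for one specific service indicator of one business; the prediction process for the other service indicators of the business is similar.
Step 202: Predict the initial service indicator according to the service prediction algorithm and the historical data of the initial service indicator, to obtain a prediction result of the initial service indicator.
There are many service resource prediction algorithms. The embodiment of the present invention uses the following specific algorithm as an example; prediction of service indicators with other algorithms is similar. The process includes: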
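1. The capacity management device automatically detects the period of the service indicator.
The most recent service indicator data within a certain range is obtained from the database and a fast Fourier transform is applied, converting the indicator into the frequency domain. After the transform, the modulus at each frequency point is the amplitude at that frequency. The point with the largest modulus is the main frequency freq, and the reciprocal of the main frequency is the period: m = 1/freq.
In practice, the period of a service indicator may be unclear, or may change as the business changes over time, so a manually specified period leads to inaccurate predictions. The service indicators are reported and collected daily and stored in the database; detecting the period automatically from the latest data both reduces manual intervention and adapts to period changes, improving prediction accuracy.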
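By way of a non-limiting illustration, the period detection of step 1 can be sketched as follows. To stay dependency-free, the sketch evaluates a plain discrete Fourier transform directly rather than a fast Fourier transform; the two give the same spectrum. The function name `detect_period`, the removal of the mean before the transform, and the sample data are assumptions of this sketch.

```python
import cmath
import math


def detect_period(values):
    """Estimate the dominant period of a series as 1 / (main frequency)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # drop the constant term so it cannot win

    best_k, best_mag = 0, 0.0
    # For a real-valued series only bins 1 .. n//2 carry distinct information.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)

    if best_k == 0:
        return None  # constant series: no period can be detected
    # Main frequency is best_k / n cycles per sample, so the period is n / best_k.
    return round(n / best_k)


# Hypothetical daily indicator with a weekly cycle, observed for 8 weeks.
daily = [100 + 20 * math.sin(2 * math.pi * t / 7) for t in range(56)]
print(detect_period(daily))
```

The detected period is only as fine as the frequency grid allows, so in practice the window should cover several full cycles.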
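2. The capacity management device applies an m-period moving average to the historical data M_t of the service indicator, once if m is odd or twice if m is even, to obtain the trend component T_t.
3. The detrended data M_t − T_t is obtained, and on this basis the values at the i-th time point of every period are averaged as the estimate for the i-th time point, giving the seasonal component S_t.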
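By way of a non-limiting illustration, steps 2 and 3 can be sketched as follows. The function names, the padding of the trend with `None` where the centered average is undefined, and the reading of the second pass for an even period as a 2-term average that re-centers the first pass are assumptions of this sketch.

```python
def moving_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def trend_component(values, m):
    """Centered m-period moving average, padded with None at both ends."""
    smoothed = moving_average(values, m)
    if m % 2 == 0:
        # Even period: a second, 2-term pass re-centers the average.
        smoothed = moving_average(smoothed, 2)
    pad = m // 2
    return [None] * pad + smoothed + [None] * pad


def seasonal_component(values, trend, m):
    """Average the detrended values at each position within the period."""
    buckets = [[] for _ in range(m)]
    for t, (value, level) in enumerate(zip(values, trend)):
        if level is not None:
            buckets[t % m].append(value - level)
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Hypothetical series: linear growth plus a repeating 3-step pattern.
pattern = [2, -1, -1]
history = [10 + t + pattern[t % 3] for t in range(12)]
trend = trend_component(history, 3)
print(seasonal_component(history, trend, 3))
```

The series must span at least two full periods so that every position within the period has at least one detrended value to average.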
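4. The trend component T_t is fitted. The capacity management device provides multiple fitting models and, based on the curve characteristics of the input data and the fitting error, automatically selects the best-matching trend prediction model from them (seven models are used as an example here).
Specifically, the capacity management device selects, from the multiple fitting models, the trend prediction model that best matches the service indicator.
The cloud controller database 112 periodically sends the data of each service indicator to the capacity management database 114, which stores it. In other words, new service indicator data is produced continuously, and the fitted model needs to be corrected using the latest data from a recent time window. The window can be configured, for example one week, one month, or one quarter.
The capacity management device 108 obtains the latest service indicator data for a period of time from the capacity management database 114 and extracts the model decision features. If a feature is constant, the data is considered to match the corresponding model. The decision features of the models are as follows: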
Linear feature:
Feature1(i)=L(i+1)-L(i);
Feature1=(Feature1-min(Feature1))/mean(Feature1-min(Feature1));
Exponential feature:
Feature2(i)=log(L(i+1))-log(L(i));
Feature2=(Feature2-min(Feature2))/mean(Feature2-min(Feature2));
Modified exponential feature:
Feature3(i)=(L(i+2)-L(i+1))/(L(i+1)-L(i));
Gompertz curve feature:
Feature4(i)=(log(L(i+2))-log(L(i+1)))/(log(L(i+1))-log(L(i)));
S-curve feature:
Feature5(i)=((1/L(i+2))-(1/L(i+1)))/((1/L(i+1))-(1/L(i)));
Quadratic curve feature:
Feature6(i)=Feature1(i+1)-Feature1(i);
Feature6=(Feature6-min(Feature6))/mean(Feature6-min(Feature6));
Logarithmic curve feature:
Feature7(i)=(L(i+1)-L(i))/(log(i+1)-log(i));
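If none of the model features matches, curve fitting is performed with each model in turn and the R-squared value of each model is checked. The closer R-squared is to 1, the better the fit, and the model whose R-squared is closest to 1 is selected as the trend prediction model.
In this way, curve fitting is performed with each of the seven models and the best one is selected as the trend prediction model, which makes the prediction more accurate.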
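By way of a non-limiting illustration, the R-squared comparison can be sketched as follows. For brevity only two of the seven candidate models (linear and exponential) are fitted, the exponential model is fitted by least squares on the logarithm of the data, and all function names are assumptions of this sketch. A constant input is not handled, because its total sum of squares is zero.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(ys, fitted):
    """Coefficient of determination of fitted values against observations."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def select_trend_model(trend):
    """Fit each candidate model and keep the one with the highest R-squared."""
    ts = list(range(len(trend)))
    candidates = {}

    a, b = fit_line(ts, trend)
    candidates["linear"] = [a + b * t for t in ts]

    if all(v > 0 for v in trend):  # the log transform needs positive data
        log_a, log_b = fit_line(ts, [math.log(v) for v in trend])
        candidates["exponential"] = [math.exp(log_a + log_b * t) for t in ts]

    scores = {name: r_squared(trend, fit) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical trend that grows by 50 percent per step.
best, scores = select_trend_model([2 * 1.5 ** t for t in range(10)])
print(best)
```

Adding the remaining five models only requires adding their fitted values to `candidates`; the selection rule stays the same.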
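5. According to the selected trend prediction model, the trend component of the decomposed service indicator is fitted:
Figure PCTCN2016089927-appb-000001
and the predicted trend component of the decomposed service indicator
Figure PCTCN2016089927-appb-000002
is recorded in the database.
As the business develops, the fitted trend prediction model may change. For example, during its growth stage the business grows exponentially and an exponential prediction model fits; once the growth stage has passed and the business enters a stable stage, a linear prediction model may fit instead. Each prediction therefore needs to re-fit the trend prediction model against the latest business data.
6. The trend component and the seasonal component are removed from the original data to obtain the random component, and the mean μ and standard deviation σ of the random component are estimated, specifically:
Figure PCTCN2016089927-appb-000003
Figure PCTCN2016089927-appb-000004
7. The fitted trend component, the estimated seasonal component, and the random component are superimposed to obtain the prediction model:
Figure PCTCN2016089927-appb-000005
8. The prediction result of the service indicator is then obtained according to the prediction model and the historical service indicator data.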
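By way of a non-limiting illustration, steps 6 to 8 can be sketched as follows. The formulas of the original are given only as figures, so the details here are assumptions of this sketch: the components are combined additively, the random component contributes its mean μ to the point forecast, and σ is estimated as a population standard deviation. The function names are likewise hypothetical.

```python
import statistics


def residual_stats(values, trend, seasonal):
    """Mean and standard deviation of the random component, i.e. what is
    left after removing the trend and seasonal components."""
    m = len(seasonal)
    residuals = [
        value - level - seasonal[t % m]
        for t, (value, level) in enumerate(zip(values, trend))
        if level is not None  # skip points where no trend estimate exists
    ]
    return statistics.mean(residuals), statistics.pstdev(residuals)


def forecast(t, trend_fn, seasonal, mu):
    """Additive forecast for time index t: trend + seasonal + mean residual."""
    return trend_fn(t) + seasonal[t % len(seasonal)] + mu


# Hypothetical inputs: a linear trend and a 3-step seasonal pattern.
seasonal = [2.0, -1.0, -1.0]
history = [10 + t + seasonal[t % 3] for t in range(12)]
trend = [None] + [10.0 + t for t in range(1, 11)] + [None]

mu, sigma = residual_stats(history, trend, seasonal)
print(forecast(12, lambda t: 10.0 + t, seasonal, mu))
```

The estimated σ can be used to attach an uncertainty band to the point forecast if one is required.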
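Step 203: Acquire the correlation factor of the initial service indicator from the database, and obtain the historical data of the correlation factor.
In complex application scenarios, such as telecom services, many service indicators are related to each other: one service indicator depends on one or more other service indicators, that is, on several underlying variables. The period and trend of such an indicator are a superposition of those variables and have no distinct features, so predicting the indicator directly is inaccurate. To solve this, the underlying variables are first separated out and predicted individually, and their predicted values are then superimposed to calculate the predicted value of the indicator, which increases prediction accuracy.
Relationships between service indicators can be expressed through correlation factors: if a second service indicator affects a first service indicator, the second service indicator is a correlation factor of the first. In the embodiments of the present invention, a service indicator may specifically be a service involved in a business. In the example below, service A is the first service indicator, service B is a correlation factor of the first service indicator, and service C and service D are correlation factors of service B.
In distributed, large-scale cluster systems, new services are deployed frequently and the relationships between services need frequent updating, which is hard to maintain by hand. Moreover, in such systems a single request triggers hundreds of front-end and back-end calls across multiple layers of services, so it is difficult to identify the relationships between services manually.
The key point is acquiring the correlation factors of a service indicator, that is, obtaining the relationships between services. The embodiments of the present invention propose two solutions that analyze and record these relationships automatically: one obtains them when the cloud controller deploys a service, and the other obtains them through the call chain device. The two approaches are described below.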
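First approach: acquisition when the cloud controller deploys a service. New services are frequently deployed in a cloud scenario. When a new service is deployed, the user specifies its relationships with other services; the cloud controller stores these relationships in the cloud controller database, and the cloud controller database can store them in the capacity management database. The specific process is as follows:
The administrator deploys service A through the cloud controller and configures the relationships between service A and other services (for example, service B and service C). The cloud controller searches the service resource pool for available resources and deploys service A on them. After deploying the new service, the cloud controller saves information such as the service nodes on which service A is deployed and the relationships between service A and other services to the cloud controller database. The cloud controller database synchronizes these relationships to the capacity management database, from which the capacity management device can then obtain the relationships between service A and other services.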
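Second approach: acquisition through the call chain device.
In distributed, large-scale cluster systems, one request triggers hundreds of calls between front-end and back-end services. Problems in some of these calls affect the request, and certain steps slow down the whole processing flow. How many machines to allocate to an application cluster during holiday traffic peaks is also something operations staff must consider. However, the calling environment has become so complex that accurate manual analysis and evaluation is difficult. Faced with massive volumes of logs, it is necessary to automatically string together the "isolated" logs of the different components involved in one request, recovering more valuable information to assist fault demarcation and location, capacity planning, and stability analysis. In other words, the relationships between services are obtained through the call chain. The specific process is as follows:
Assume there are three kinds of services: a front-end service, a first back-end service, and a second back-end service, all published in the service resource pool. When an external request accesses the front-end service, the front-end service also needs to call the first back-end service and the second back-end service to fulfil the request. As the services call each other, each service records the call in a log; the logs are then collected and aggregated in the call chain system.
The call chain device then analyzes the logs and processes the call paths to obtain the origin of each call and the relationships between the services, and records them in the call chain database.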
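During each service call, a call log is recorded that includes a trace identifier (TraceID), a sequence number (SeqNo), the calling service, the called service, and other information. The TraceID is a unique identifier generated when an external request accesses the front-end service, and it stays the same throughout the call chain. The SeqNo identifies the order of the instrumentation points in the call chain. It is a variable-length string in the format X.X.…X.X, where each X corresponds to one level of call and the number of Xs is the call depth; among calls at the same level, X increases from 1 in call order; and the SeqNo of the root node of the call chain (the node that generates the TraceID) is 1. The SeqNo of a service call response is the same as that of the corresponding service call request. Each service records its call log in the format: TraceID|SeqNo|business flow|caller|callee.
For example, FIG. 3 is a schematic diagram of the call relationships between services. Service A is a front-end service, and service B, service C, service D, and service E are back-end services. In one business flow, service A receives a request from an external application, generates the TraceID 100001 with SeqNo 1, and records the log: 100001|1|business flow A|external application|service A. Service A needs to call service B, so the SeqNo is changed to 1.1 and the recorded log is: 100001|1.1|business flow A|service A|service B. Further, service B needs to call service E, so the SeqNo is changed to 1.1.2 and the recorded log is: 100001|1.1.2|business flow A|service B|service E. Each of these log records is stored in the call chain database.
When the call relationships between the services of a business need to be analyzed, the call chain device obtains the log records of the services of that business from the call chain database and analyzes them. Records with the same TraceID belong to the same business call, and the SeqNo gives the call order. For example, by analyzing the logs, it can be determined that the records with TraceID 100001 belong to the same business call and that the SeqNo order is 1, 1.1, 1.1.2; it follows that service A calls service B and service B calls service E. In this way the call relationships between the services of the business are obtained.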
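By way of a non-limiting illustration, the log analysis described above can be sketched as follows. The record layout TraceID|SeqNo|business flow|caller|callee is taken from the description; the function names, the in-memory list standing in for the call chain database, and the choice to compare SeqNo values level by level as integers are assumptions of this sketch.

```python
from collections import defaultdict


def parse_call_logs(lines):
    """Group 'TraceID|SeqNo|flow|caller|callee' records by TraceID and
    order each group by SeqNo, compared level by level as integers."""
    traces = defaultdict(list)
    for line in lines:
        trace_id, seq_no, _flow, caller, callee = line.strip().split("|")
        seq_key = tuple(int(part) for part in seq_no.split("."))
        traces[trace_id].append((seq_key, caller, callee))
    return {tid: sorted(records) for tid, records in traces.items()}


def service_dependencies(lines):
    """Return {caller: set of callees} collected across all traces."""
    deps = defaultdict(set)
    for records in parse_call_logs(lines).values():
        for _seq, caller, callee in records:
            deps[caller].add(callee)
    return dict(deps)


# The three records of the example, deliberately out of order.
logs = [
    "100001|1.1.2|business flow A|service B|service E",
    "100001|1|business flow A|external application|service A",
    "100001|1.1|business flow A|service A|service B",
]

for seq, caller, callee in parse_call_logs(logs)["100001"]:
    print(".".join(map(str, seq)), caller, "->", callee)
```

Comparing SeqNo values as integer tuples rather than as strings keeps the order correct once a level reaches two digits, for example 1.10 after 1.9.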
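After obtaining the call relationships between services, the call chain device synchronizes them to the capacity management database, so that the capacity management device can obtain them from there.
Once the capacity management device has acquired the correlation factor of the service indicator, it obtains the historical data of the correlation factor from the capacity management database.
Step 204: Predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain the prediction result of the correlation factor; calculate the predicted value of the service indicator according to the prediction result of the initial service indicator from step 202 and the prediction result of the correlation factor; and save the predicted value of the service indicator in the capacity management database.
Following steps 202 to 204, the prediction results of the service indicators of each business are obtained.
Step 205: Obtain the prediction results of the service indicators of each business running on the server from the capacity management database, and predict the resource consumption of the server.
Server resources are usually affected jointly by multiple businesses whose periods, trends, and busy and idle hours all differ. Once their effects are superimposed, the period and trend of server resource usage are no longer distinct, so predicting system resources directly is inaccurate.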
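A resource evaluation model is established in advance according to the characteristics of telecom services, for example S_t = a×M1_t + b×M2_t + c×M3_t + d, where S is the system resource indicator, M1, M2, and M3 are the service indicators, and a, b, c, and d are the resource consumption values per unit of business. Before each resource evaluation, the unit resource consumption values need to be calculated and corrected; that is, the polynomial coefficients are calculated from the decomposed service indicators and the polynomial model. In S_t = a×M1_t + b×M2_t + c×M3_t + d there are four polynomial coefficients, so only the four most recent groups of decomposed service indicators are needed to calculate the unit resource consumption values.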
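By way of a non-limiting illustration, the calculation of the four coefficients from four groups of observations can be sketched as follows. Each observation (M1, M2, M3, S) gives one linear equation a×M1 + b×M2 + c×M3 + d = S, and four such equations determine a, b, c, and d when the observations are linearly independent. The function names and the sample figures are assumptions of this sketch.

```python
def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting; returns the solution."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("observations are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_unit_costs(samples):
    """samples: four (M1, M2, M3, S) observations -> coefficients [a, b, c, d]."""
    matrix = [[m1, m2, m3, 1.0] for m1, m2, m3, _s in samples]
    rhs = [s for _m1, _m2, _m3, s in samples]
    return solve_linear_system(matrix, rhs)


def predict_resource(coeffs, m1, m2, m3):
    """Evaluate S = a*M1 + b*M2 + c*M3 + d for predicted indicator values."""
    a, b, c, d = coeffs
    return a * m1 + b * m2 + c * m3 + d


# Hypothetical latest observations of three indicators and resource usage.
samples = [
    (100, 50, 20, 67.0),
    (120, 40, 30, 76.0),
    (90, 70, 10, 65.0),
    (110, 60, 40, 76.0),
]
coeffs = fit_unit_costs(samples)
print(predict_resource(coeffs, 200, 100, 50))
```

With more than four observations available, a least-squares fit would be a natural alternative that is less sensitive to noise in any single observation; that variant is not described in the original.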
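When predicting server resources, the embodiments of the present invention consider not only the initial service indicator but also the correlation factors that affect it, which makes the server resource prediction more accurate. Further, the prediction algorithm of the present invention improves on existing solutions by adding automatic period detection and trend-fitting model selection, which saves labor, reduces human interference, and increases prediction accuracy when the period or trend changes. In practice, a predicted indicator is driven by several factors at once, its period and trend are not distinct, and direct prediction has large errors. To address this, a business model is designed according to the characteristics of telecom services, and correlated-indicator prediction and resource estimation are added, which increases prediction accuracy.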
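As shown in FIG. 4, a capacity management device provided by the present invention includes:
a first prediction module 41, configured to predict an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator;
an acquiring module 42, configured to acquire a correlation factor of the initial service indicator and obtain historical data of the correlation factor;
a second prediction module 43, configured to predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; and
a third prediction module 44, configured to obtain a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and to predict resource consumption of a server according to the prediction result of the final service indicator.
Optionally, the acquiring module 42 is specifically configured to acquire the correlation factor of the initial service indicator from a database, where the correlation factor is determined by a cloud controller when a service is deployed.
Optionally, the acquiring module 42 is specifically configured to acquire the correlation factor of the initial service indicator from a database, where the correlation factor is obtained by a call chain device by analyzing call paths between services, and the call paths are recorded in record logs.
Optionally, the third prediction module 44 is specifically configured to: when multiple businesses run on the server, obtain a prediction result of the final service indicator of each business separately, according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predict the resource consumption of the server according to the prediction result of the final service indicator of each business and the unit resource consumption value corresponding to each business.
When predicting server resources, the embodiments of the present invention consider not only the initial service indicator but also the correlation factors that affect it, which makes the server resource prediction more accurate.
It should be noted that the division into modules in the embodiments of the present invention is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation. In addition, the functional modules in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented as a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.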
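An embodiment of the present invention further provides a capacity management device. FIG. 5 is a schematic structural diagram of the capacity management device in this embodiment. The capacity management device includes a processor 51 and a memory 52, which are connected to each other. The embodiment of the present invention does not limit the specific connection medium between these components. In FIG. 5, the processor 51 and the memory 52 are connected through a bus 53, which is indicated by a thick line; the connection manner between other components is shown for illustration only and is not limiting. The bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in FIG. 5, but this does not mean that there is only one bus or one type of bus.
In the embodiment of the present invention, the memory 52 is configured to store the program code executed by the processor 51. The memory 52 may be a volatile memory, such as a random-access memory (RAM); the memory 52 may also be a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or the memory 52 may be any other memory that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. In addition, the memory 52 may be a combination of any of the above memories.
In the embodiment of the present invention, the processor 51 is configured to call the program code stored in the memory 52 through the bus and, by executing the called program code, to perform the following:
predicting an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator; acquiring a correlation factor of the initial service indicator, and obtaining historical data of the correlation factor; predicting the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; obtaining a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predicting resource consumption of a server according to the prediction result of the final service indicator.
The processor 51 in the embodiment of the present invention may be a central processing unit (CPU).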
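An embodiment of the present invention further provides a resource prediction system which, as shown in FIG. 6, includes a capacity management device 61 and a database 62, where
the capacity management device 61 is configured to: predict an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator; acquire a correlation factor of the initial service indicator, and obtain historical data of the correlation factor; predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; obtain a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predict resource consumption of a server according to the prediction result of the final service indicator; and
the database 62 is configured to store the historical data of the initial service indicator, the correlation factor corresponding to the initial service indicator, and the historical data of the correlation factor.
Optionally, the system further includes a call chain device 63, configured to obtain record logs from the database, analyze the call paths in the record logs, and obtain the correlation factor of the initial service indicator.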
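A person skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present invention. It should be understood that computer program instructions can implement each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, a person skilled in the art can make further changes and modifications to these embodiments once the basic inventive concept is known. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications falling within the scope of the present invention.
Apparently, a person skilled in the art can make various changes and variations to the embodiments of the present invention without departing from their spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to cover them.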

Claims (10)

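  1. A resource prediction method, comprising:
    predicting an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator;
    acquiring a correlation factor of the initial service indicator, and obtaining historical data of the correlation factor;
    predicting the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; and
    obtaining a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and predicting resource consumption of a server according to the prediction result of the final service indicator.
  2. The resource prediction method according to claim 1, wherein the acquiring a correlation factor of the initial service indicator specifically comprises:
    acquiring the correlation factor of the initial service indicator from a database, wherein the correlation factor is determined by a cloud controller when a service is deployed.
  3. The resource prediction method according to claim 1, wherein the acquiring a correlation factor of the initial service indicator specifically comprises:
    acquiring the correlation factor of the initial service indicator from a database, wherein the correlation factor is obtained by a call chain device by analyzing call paths between services, and the call paths are recorded in record logs.
  4. The resource prediction method according to claim 1, wherein when multiple businesses run on the server, the obtaining a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and predicting resource consumption of a server according to the prediction result of the final service indicator specifically comprises:
    obtaining a prediction result of the final service indicator of each business separately, according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and
    predicting the resource consumption of the server according to the prediction result of the final service indicator of each business and the unit resource consumption value corresponding to each business.
  5. A capacity management device, comprising:
    a first prediction module, configured to predict an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator;
    an acquiring module, configured to acquire a correlation factor of the initial service indicator and obtain historical data of the correlation factor;
    a second prediction module, configured to predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; and
    a third prediction module, configured to obtain a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor, and to predict resource consumption of a server according to the prediction result of the final service indicator.
  6. The capacity management device according to claim 5, wherein the acquiring module is specifically configured to acquire the correlation factor of the initial service indicator from a database, wherein the correlation factor is determined by a cloud controller when a service is deployed.
  7. The capacity management device according to claim 5, wherein the acquiring module is specifically configured to acquire the correlation factor of the initial service indicator from a database, wherein the correlation factor is obtained by a call chain device by analyzing call paths between services, and the call paths are recorded in record logs.
  8. The capacity management device according to claim 5, wherein the third prediction module is specifically configured to: when multiple businesses run on the server, obtain a prediction result of the final service indicator of each business separately, according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predict the resource consumption of the server according to the prediction result of the final service indicator of each business and the unit resource consumption value corresponding to each business.
  9. A resource prediction system, comprising a capacity management device and a database, wherein
    the capacity management device is configured to: predict an initial service indicator according to a service prediction algorithm and historical data of the initial service indicator, to obtain a prediction result of the initial service indicator; acquire a correlation factor of the initial service indicator, and obtain historical data of the correlation factor; predict the correlation factor according to the historical data of the correlation factor and the service prediction algorithm, to obtain a prediction result of the correlation factor; obtain a prediction result of a final service indicator according to the prediction result of the initial service indicator and the prediction result of the correlation factor; and predict resource consumption of a server according to the prediction result of the final service indicator; and
    the database is configured to store the historical data of the initial service indicator, the correlation factor corresponding to the initial service indicator, and the historical data of the correlation factor.
  10. The resource prediction system according to claim 9, further comprising a call chain device, configured to obtain record logs from the database, analyze the call paths in the record logs, and obtain the correlation factor of the initial service indicator.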
PCT/CN2016/089927 2015-09-16 2016-07-13 资源预测方法、系统和容量管理装 WO2017045472A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510590666.1 2015-09-16
CN201510590666.1A CN106549772B (zh) 2015-09-16 2015-09-16 资源预测方法、系统和容量管理装置

Publications (1)

Publication Number Publication Date
WO2017045472A1 true WO2017045472A1 (zh) 2017-03-23

Family

ID=58288140

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/089927 WO2017045472A1 (zh) 2015-09-16 2016-07-13 资源预测方法、系统和容量管理装

Country Status (2)

Country Link
CN (1) CN106549772B (zh)
WO (1) WO2017045472A1 (zh)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107943579A (zh) * 2017-11-08 2018-04-20 深圳前海微众银行股份有限公司 资源瓶颈预测方法、设备、系统及可读存储介质
CN109446041A (zh) * 2018-09-25 2019-03-08 平安普惠企业管理有限公司 一种服务器压力预警方法、系统及终端设备
CN110109750A (zh) * 2019-04-03 2019-08-09 平安科技(深圳)有限公司 虚拟资源获取方法、装置、计算机设备和存储介质
CN110348684A (zh) * 2019-06-06 2019-10-18 阿里巴巴集团控股有限公司 服务调用风险模型生成方法、预测方法及各自装置
CN111181875A (zh) * 2018-11-12 2020-05-19 中移(杭州)信息技术有限公司 带宽调节方法及装置
CN111352733A (zh) * 2020-02-26 2020-06-30 北京奇艺世纪科技有限公司 一种扩缩容状态的预测方法和装置
CN111400147A (zh) * 2019-01-02 2020-07-10 中国移动通信有限公司研究院 一种业务质量测试方法、装置和系统
CN112685173A (zh) * 2020-12-22 2021-04-20 中通天鸿(北京)通信科技股份有限公司 一种基于富媒体的智能路由分配系统
CN112860523A (zh) * 2021-03-16 2021-05-28 中国工商银行股份有限公司 批量作业处理的故障预测方法、装置和服务器
CN112965956A (zh) * 2021-03-18 2021-06-15 上海东普信息科技有限公司 数据库水平扩容方法、装置、设备和存储介质
CN113220299A (zh) * 2021-05-28 2021-08-06 北京达佳互联信息技术有限公司 一种图形化展示的方法及装置
CN113703974A (zh) * 2021-08-27 2021-11-26 深圳前海微众银行股份有限公司 一种预测服务器容量的方法及装置
CN113762688A (zh) * 2021-01-06 2021-12-07 北京沃东天骏信息技术有限公司 业务分析系统、方法以及存储介质
CN113923099A (zh) * 2021-09-03 2022-01-11 华为技术有限公司 一种通信网络故障的根因定位方法及相关设备
CN115080363A (zh) * 2022-08-23 2022-09-20 中国中金财富证券有限公司 一种基于业务日志的系统容量评估方法及装置
CN117407158A (zh) * 2023-10-08 2024-01-16 天翼数字生活科技有限公司 一种集群扩容方法、装置、电子设备及存储介质

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107346472A (zh) * 2017-06-29 2017-11-14 人民法院信息技术服务中心 一种通过在线建模处理运维数据的方法及装置
CN110019110B (zh) * 2017-07-28 2022-11-18 腾讯科技(深圳)有限公司 一种业务系统的容量管理方法、装置、设备及业务系统
CN109510715B (zh) * 2017-09-14 2022-02-08 中国电信股份有限公司 带宽分配方法、装置、数据中心以及存储介质
CN108984304A (zh) * 2018-07-11 2018-12-11 广东亿迅科技有限公司 基于回归方程的服务器扩容计算方法及装置
CN109271453B (zh) * 2018-10-22 2021-08-27 创新先进技术有限公司 一种数据库容量的确定方法和装置
CN109543891B (zh) * 2018-11-09 2022-02-01 深圳前海微众银行股份有限公司 容量预测模型的建立方法、设备及计算机可读存储介质
CN110058939A (zh) * 2018-12-27 2019-07-26 阿里巴巴集团控股有限公司 系统扩容方法、装置及设备
CN110109800A (zh) * 2019-04-10 2019-08-09 网宿科技股份有限公司 一种服务器集群系统的管理方法及装置
CN110096423A (zh) * 2019-05-14 2019-08-06 深圳供电局有限公司 一种基于大数据分析的服务器存储容量分析预测方法
CN110321240B (zh) * 2019-06-28 2023-06-09 创新先进技术有限公司 一种基于时序预测的业务影响评估方法和装置
CN111611517B (zh) * 2020-05-13 2023-07-21 咪咕文化科技有限公司 指标监控方法、装置、电子设备及存储介质
CN111985713B (zh) * 2020-08-19 2023-08-18 中国银行股份有限公司 数据指标波形预测方法及装置
CN112559191B (zh) * 2020-12-23 2023-04-25 平安银行股份有限公司 动态部署gpu资源的方法、装置和计算机设备
CN112702345A (zh) * 2020-12-24 2021-04-23 福建技术师范学院 基于数据元特征的信息漏洞风险评估方法及装置
CN112734195B (zh) * 2020-12-31 2023-07-07 平安科技(深圳)有限公司 数据处理方法、装置、电子设备及存储介质
CN112766698B (zh) * 2021-01-13 2024-02-09 中国工商银行股份有限公司 应用业务压力确定方法及装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102103714A (zh) * 2009-12-22 2011-06-22 阿里巴巴集团控股有限公司 实现业务数据预测的实时处理平台及预测方法
CN103685347A (zh) * 2012-09-03 2014-03-26 阿里巴巴集团控股有限公司 一种网络资源的配置方法和装置
CN103903069A (zh) * 2014-04-15 2014-07-02 广东电网公司信息中心 存储容量预测方法及存储容量预测系统
CN104125584A (zh) * 2013-04-27 2014-10-29 中国移动通信集团福建有限公司 一种针对网络业务的业务指标实现预测的方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102082703A (zh) * 2009-11-26 2011-06-01 中国移动通信集团贵州有限公司 业务支撑系统设备性能监控的方法及装置
US9563663B2 (en) * 2012-09-28 2017-02-07 Oracle International Corporation Fast path evaluation of Boolean predicates
CN103490956A (zh) * 2013-09-22 2014-01-01 杭州华为数字技术有限公司 基于业务量预测的自适应节能控制方法及设备、系统

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102103714A (zh) * 2009-12-22 2011-06-22 阿里巴巴集团控股有限公司 实现业务数据预测的实时处理平台及预测方法
CN103685347A (zh) * 2012-09-03 2014-03-26 阿里巴巴集团控股有限公司 一种网络资源的配置方法和装置
CN104125584A (zh) * 2013-04-27 2014-10-29 中国移动通信集团福建有限公司 一种针对网络业务的业务指标实现预测的方法及装置
CN103903069A (zh) * 2014-04-15 2014-07-02 广东电网公司信息中心 存储容量预测方法及存储容量预测系统

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107943579A (zh) * 2017-11-08 2018-04-20 深圳前海微众银行股份有限公司 资源瓶颈预测方法、设备、系统及可读存储介质
CN109446041B (zh) * 2018-09-25 2022-10-28 平安普惠企业管理有限公司 一种服务器压力预警方法、系统及终端设备
CN109446041A (zh) * 2018-09-25 2019-03-08 平安普惠企业管理有限公司 一种服务器压力预警方法、系统及终端设备
CN111181875A (zh) * 2018-11-12 2020-05-19 中移(杭州)信息技术有限公司 带宽调节方法及装置
CN111400147A (zh) * 2019-01-02 2020-07-10 中国移动通信有限公司研究院 一种业务质量测试方法、装置和系统
CN111400147B (zh) * 2019-01-02 2023-05-05 中国移动通信有限公司研究院 一种业务质量测试方法、装置和系统
CN110109750A (zh) * 2019-04-03 2019-08-09 平安科技(深圳)有限公司 虚拟资源获取方法、装置、计算机设备和存储介质
CN110109750B (zh) * 2019-04-03 2023-08-18 平安科技(深圳)有限公司 虚拟资源获取方法、装置、计算机设备和存储介质
CN110348684A (zh) * 2019-06-06 2019-10-18 阿里巴巴集团控股有限公司 服务调用风险模型生成方法、预测方法及各自装置
CN110348684B (zh) * 2019-06-06 2023-07-18 创新先进技术有限公司 服务调用风险模型生成方法、预测方法及各自装置
CN111352733A (zh) * 2020-02-26 2020-06-30 北京奇艺世纪科技有限公司 一种扩缩容状态的预测方法和装置
CN112685173A (zh) * 2020-12-22 2021-04-20 中通天鸿(北京)通信科技股份有限公司 一种基于富媒体的智能路由分配系统
CN113762688A (zh) * 2021-01-06 2021-12-07 北京沃东天骏信息技术有限公司 业务分析系统、方法以及存储介质
CN112860523A (zh) * 2021-03-16 2021-05-28 中国工商银行股份有限公司 批量作业处理的故障预测方法、装置和服务器
CN112965956A (zh) * 2021-03-18 2021-06-15 上海东普信息科技有限公司 数据库水平扩容方法、装置、设备和存储介质
CN113220299A (zh) * 2021-05-28 2021-08-06 北京达佳互联信息技术有限公司 一种图形化展示的方法及装置
CN113703974A (zh) * 2021-08-27 2021-11-26 深圳前海微众银行股份有限公司 一种预测服务器容量的方法及装置
CN113923099A (zh) * 2021-09-03 2022-01-11 华为技术有限公司 一种通信网络故障的根因定位方法及相关设备
CN115080363A (zh) * 2022-08-23 2022-09-20 中国中金财富证券有限公司 一种基于业务日志的系统容量评估方法及装置
CN117407158A (zh) * 2023-10-08 2024-01-16 天翼数字生活科技有限公司 一种集群扩容方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN106549772B (zh) 2019-11-19
CN106549772A (zh) 2017-03-29

Similar Documents

Publication Publication Date Title
WO2017045472A1 (zh) 资源预测方法、系统和容量管理装
EP3840295B1 (en) Resource configuration prediction method and device
US11836578B2 (en) Utilizing machine learning models to process resource usage data and to determine anomalous usage of resources
AU2019202091B2 (en) Comparative multi-forecasting analytics service stack for cloud computing resource allocation
US11184241B2 (en) Topology-aware continuous evaluation of microservice-based applications
JP6952058B2 (ja) メモリ使用量判断技術
JP5857049B2 (ja) 単語のユーザー挙動数の予測
US20160321331A1 (en) Device and method
US10909503B1 (en) Snapshots to train prediction models and improve workflow execution
US20210255899A1 (en) Method for Establishing System Resource Prediction and Resource Management Model Through Multi-layer Correlations
US10789146B2 (en) Forecasting resource utilization
US10713578B2 (en) Estimating utilization of network resources using time series data
CN114356577A (zh) 一种系统容量预估方法以及装置
CN110928636A (zh) 虚拟机热迁移方法、装置和设备
WO2020206699A1 (en) Predicting virtual machine allocation failures on server node clusters
CN110196751B (zh) 互扰服务的隔离方法及装置、电子设备、存储介质
US20210263718A1 (en) Generating predictive metrics for virtualized deployments
CN111800807A (zh) 一种基站用户数量告警的方法及装置
CN110443451B (zh) 事件定级方法、装置、计算机设备和存储介质
CN117331989A (zh) 一种基于云计算的大数据挖掘方法及挖掘系统
CN113886036B (zh) 用于优化分布式系统集群配置的方法和系统
WO2023129221A1 (en) Resource capacity management in computing systems
WO2015154641A1 (zh) 一种业务并发性预测方法与预测系统
US20240303134A1 (en) Systems and methods for edge resource demand load estimation
US20240303130A1 (en) Systems and methods for edge resource demand load scheduling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16845586

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16845586

Country of ref document: EP

Kind code of ref document: A1