
CN117527611A - Gaussian distribution-based fault dynamic prediction method and related device - Google Patents


Info

Publication number
CN117527611A
Authority
CN
China
Prior art keywords
data
gaussian distribution
distribution
fault
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311475288.3A
Other languages
Chinese (zh)
Other versions
CN117527611B (en
Inventor
王玉臣
安吉元
王志刚
修兴强
胡永鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Taiji Information System Technology Co ltd
Original Assignee
Beijing Taiji Information System Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Taiji Information System Technology Co ltd
Priority to CN202311475288.3A
Publication of CN117527611A
Application granted
Publication of CN117527611B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/147: Network analysis or design for predicting network behaviour
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0631: Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/16: Threshold monitoring

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a fault dynamic prediction method and a related device based on Gaussian distribution, wherein the method comprises the following steps: collecting parameter index historical data related to faults of equipment; constructing time windows of equipment fault abnormal index data, and constructing relatively accurate Gaussian distribution of each time window according to historical data; parameter indexes of equipment are collected and monitored regularly, and warning is given out to possible equipment faults according to abnormal data in time. Compared with the traditional static method, the dynamic fault threshold setting method is more adaptive, the threshold can be automatically adjusted according to the resource type and the scene, and the false alarm rate is reduced. In addition, the technology is applicable to various resource types and has wide applicability. The method also adds automatic evaluation and adjustment to the model, improves the accuracy and efficiency of fault prediction, is beneficial to taking measures in advance, reduces the influence of equipment faults on the service, and improves the reliability and performance of the resource management system.

Description

Gaussian distribution-based fault dynamic prediction method and related device
Technical Field
The application belongs to the technical field of equipment fault prediction, and relates to a fault dynamic prediction method based on Gaussian distribution and a related device.
Background
The intelligent resource management system manages a wide variety of resources. Different kinds of resources have different fault thresholds, and the same kind of resource has different fault thresholds under different task scenarios. Index monitoring and early-warning systems in the prior art generally apply a static fault threshold method to assist in analyzing abnormal conditions of resources. The static fault threshold method sets a fixed fault threshold for a monitored index of the resource and compares the current index data with that threshold: an alarm is generated if the threshold is exceeded, and the index is treated as normal otherwise. However, existing fault prediction techniques typically monitor a single resource of the device, and their thresholds are usually set empirically by practitioners. When only a single resource is monitored, the prediction error may be large; when there are too many kinds of resources, practitioners may not know how the thresholds of some resources should be set. In addition, owing to the characteristics of different resources, their performance indexes may fluctuate, in which case a static threshold cannot accurately reflect the abnormal state of the resource.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a fault dynamic prediction method and a related device based on Gaussian distribution.
In order to achieve the above purpose, the present application is implemented by adopting the following technical scheme:
in a first aspect, the present application provides a method for dynamic fault prediction based on gaussian distribution, including the steps of:
continuously monitoring and collecting the characteristics related to the equipment state and the corresponding characteristic data thereof;
dividing the characteristic data to obtain a plurality of data time windows;
constructing a mapping relation model of the equipment state and the characteristic data by using a Gaussian distribution method;
and calculating probability distribution and a threshold value of each data time window according to the mapping relation model, and comparing the probability distribution with the default threshold value to obtain a fault prediction result.
In a second aspect, the present application provides a gaussian distribution-based fault dynamic prediction system, comprising:
the data acquisition module is used for continuously monitoring and acquiring the characteristics related to the equipment state and the corresponding characteristic data thereof;
the data dividing module is used for dividing the characteristic data to obtain a plurality of data time windows;
the model construction module is used for constructing a mapping relation model of the equipment state and the characteristic data by using a Gaussian distribution method;
and the fault prediction module is used for calculating probability distribution and a threshold value of each data time window according to the mapping relation model, and comparing the probability distribution with the default threshold value to obtain a fault prediction result.
In a third aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method as described above when executing the computer program.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of a method as described above.
Compared with the prior art, the application has the following beneficial effects:
The dynamic fault threshold method looks for the characteristics of sudden service changes, so that resource abnormalities are caught before operations personnel would notice them, overcoming the monitoring blind spots of traditional absolute-value indexes. Following an active-monitoring approach, hidden risks in resource indexes are captured early. Because resource indexes are approximately normally distributed, the method is implemented with a Gaussian distribution, which yields more reasonable and accurate thresholds and automatically generates early-warning levels. This achieves accurate monitoring of resources, uncovers hidden faults in them, and ultimately shifts practice from analyzing problems after a resource fault has occurred, as in the prior art, to actively monitoring indexes before problems arise.
The method and the device divide the characteristic data into different data time windows based on periodicity and distribution characteristics of the characteristic data and service understanding of the data, wherein the characteristic data in the same data time window have similar numerical distribution characteristics.
Based on the Gaussian distribution, the threshold is adjusted automatically from the historical data of each time window. Fluctuating device characteristic indexes are monitored against dynamic thresholds, so that abnormal index behavior is detected promptly and accurately and the likelihood of a fault is predicted.
Drawings
For a clearer description of the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and should therefore not be considered limiting in scope, and that other related drawings can be obtained according to these drawings without the inventive effort of a person skilled in the art.
FIG. 1 is a flow chart of the method of the present application.
Fig. 2 is a schematic diagram of the system of the present application.
FIG. 3 is a flow chart of a method and system for dynamic prediction of a fault based on Gaussian distribution.
FIG. 4 is a plot of 2000 historical data points received during a 1:00-2:00 historical time window of example 2 of the present application.
Fig. 5 is a diagram showing the data amount transmitted by the device in the "normal" state according to embodiment 2 of the present application.
Fig. 6 is a diagram showing the data amount transmitted by the device in the "failure one" state in embodiment 2 of the present application.
Fig. 7 is a diagram showing the data amount transmitted by the device in the "fail-two" state in embodiment 2 of the present application.
Fig. 8 is data obtained by classifying the data of comparative test set of example 2 of the present application into "abnormal state" and "normal state".
Fig. 9 is a result of calculating epsilon when P (β) =0.997 is set in example 2 of the present application.
Fig. 10 is a result of calculating epsilon when P (β) =0.96 is set in example 2 of the present application.
Fig. 11 is a result of calculating epsilon when P (β) =0.97 is set in example 2 of the present application.
Fig. 12 is a result of calculating epsilon when P (β) =0.98 is set in example 2 of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
In the description of the embodiments of the present application, it should be noted that, if the terms "upper," "lower," "horizontal," "inner," and the like indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, or the azimuth or the positional relationship in which the inventive product is conventionally put in use, it is merely for convenience of describing the present application and simplifying the description, and does not indicate or imply that the apparatus or element to be referred to must have a specific azimuth, be configured and operated in a specific azimuth, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and the like, are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.
Furthermore, the term "horizontal", if present, does not require the component to be absolutely horizontal; it may be slightly inclined. "Horizontal" merely means that the direction is closer to horizontal than to vertical.
In the description of the embodiments of the present application, it should also be noted that, unless explicitly specified and limited otherwise, the terms "disposed," "mounted," and "connected" should be construed broadly. A connection may be, for example, fixed, detachable, or integral; mechanical or electrical; direct or indirect through an intermediate medium; or a communication between two elements. The specific meaning of these terms in this application will be understood by those of ordinary skill in the art as the case may be.
The present application is described in further detail below with reference to the accompanying drawings:
referring to fig. 1, an embodiment of the application discloses a fault dynamic prediction method based on gaussian distribution, which includes the following steps:
s1, continuously monitoring and collecting the characteristics related to the equipment state and the corresponding characteristic data thereof.
In practical applications, the feature related to the device state may be one or a combination of multiple features capable of reflecting the device state, such as a CPU temperature, a packet loss rate, or a response delay of the device.
S2, dividing the characteristic data to obtain a plurality of data time windows.
In some embodiments, the feature data is partitioned into a plurality of different data time windows according to the periodicity of the feature data, the distribution characteristics, and the business understanding of the feature data.
In practical applications, the feature data in the same data time window have similar numerical distribution features.
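As a minimal sketch of step S2, the following groups timestamped feature samples into data time windows. Fixed hour-of-day windows and the function name `split_into_hourly_windows` are assumptions made for illustration only; the text leaves the window boundaries to the periodicity, distribution characteristics, and business understanding of the data.

```python
from collections import defaultdict
from datetime import datetime


def split_into_hourly_windows(samples):
    """Group (timestamp, value) pairs by hour of day.

    Hour-of-day windows are an illustrative assumption; any partition in
    which values share a similar distribution could be used instead.
    """
    windows = defaultdict(list)
    for timestamp, value in samples:
        windows[timestamp.hour].append(value)
    return dict(windows)


# Hypothetical CPU temperature readings (values invented for the example).
samples = [
    (datetime(2023, 11, 1, 2, 0, 30), 41.0),
    (datetime(2023, 11, 1, 2, 30, 0), 40.5),
    (datetime(2023, 11, 1, 14, 0, 0), 63.2),
    (datetime(2023, 11, 2, 2, 15, 0), 40.8),
]
windows = split_into_hourly_windows(samples)
print(windows)
```

Readings taken at the same hour on different days land in the same window, which matches the idea that a window collects data with similar numerical distribution.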
S3, constructing a mapping relation model of the equipment state and the characteristic data by using a Gaussian distribution method.
In some embodiments, the mean and variance of the features are calculated to obtain a gaussian distribution of the features, and the probability distribution of the gaussian distribution and a default threshold are determined.
In some embodiments, the two-dimensional data sample D of the characteristic parameters in a data time window and the corresponding device states is as follows:
$$D = \{(x_i, y_i) \mid x_i \in \mathbb{R},\ y_i \in \{\text{normal}, \text{abnormal}\},\ i \in I\}$$
where x_i is the device characteristic value, y_i is the device state, and I is the data index;
dividing the two-dimensional data sample D into a training set D_train and a test set D_test, and then taking the characteristic parameters X(n) in the training set D_train whose state y_i is normal:
$$X(n) = x_1, x_2, \ldots, x_n$$
calculating the mean μ and variance σ² of the characteristic parameters X(n):
$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (1)$$
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2 \qquad (2)$$
representing by δ the deviation of a characteristic value x from the mean μ, and calculating the probability P(δ) that a value x_i in X(n) falls in the range [μ-δ, μ+δ]:
$$P(\delta) = \int_{\mu-\delta}^{\mu+\delta} \rho(x)\,dx \qquad (3)$$
where ρ(x) is the probability density function of the characteristic value x under the Gaussian distribution of X(n):
$$\rho(x) = \rho(x;\ \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \qquad (4)$$
substituting (4) into (3) gives:
$$P(\delta) = \int_{\mu-\delta}^{\mu+\delta} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)dx \qquad (5)$$
considering a value β such that P(β) = 0.95, i.e., the probability that a characteristic value in X(n) falls in the range [μ-β, μ+β] is 95%, and solving the integral equation inversely to obtain β;
then calculating the default threshold ε from β:
$$\varepsilon = \rho(\mu - \beta;\ \mu, \sigma^2) \qquad (6)$$
and S4, calculating probability distribution and a threshold value of each data time window according to the mapping relation model, and comparing the probability distribution with the default threshold value to obtain a fault prediction result.
In practical application, in the data time window, if the gaussian distribution density ρ of the characteristic value is smaller than the threshold epsilon, the equipment state is judged to be abnormal.
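The fitting and decision rule of steps S3 and S4 can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the function names `fit_window` and `is_abnormal` are hypothetical, the variance uses the population (1/n) form, and the integral equation for β is inverted through the standard normal quantile function from Python's `statistics.NormalDist` rather than by numerical integration.

```python
from statistics import NormalDist, fmean


def fit_window(normal_values, coverage=0.95):
    """Return (mu, sigma, beta, epsilon) for one data time window.

    normal_values: training values labelled "normal".
    coverage: the target probability P(beta).
    """
    n = len(normal_values)
    mu = fmean(normal_values)
    # Population (1/n) variance; an assumption of this sketch.
    variance = sum((x - mu) ** 2 for x in normal_values) / n
    sigma = variance ** 0.5
    # P(beta) = coverage  <=>  beta = sigma * Phi^{-1}((1 + coverage) / 2)
    beta = sigma * NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    # Default threshold: the density at the edge of [mu - beta, mu + beta].
    epsilon = NormalDist(mu, sigma).pdf(mu - beta)
    return mu, sigma, beta, epsilon


def is_abnormal(x, mu, sigma, epsilon):
    """Flag x as abnormal when its Gaussian density is below epsilon."""
    return NormalDist(mu, sigma).pdf(x) < epsilon


# Invented training values for illustration.
train = [1.8, 1.9, 2.0, 2.0, 2.1, 2.2, 2.0, 1.9, 2.1, 2.0]
mu, sigma, beta, epsilon = fit_window(train)
print(is_abnormal(2.0, mu, sigma, epsilon), is_abnormal(3.0, mu, sigma, epsilon))
```

A window with zero variance would make the density undefined, so a real deployment would need to handle constant-valued windows separately.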
As shown in fig. 2, an embodiment of the present application discloses a fault dynamic prediction system based on gaussian distribution, including:
the data acquisition module is used for continuously monitoring and acquiring the characteristics related to the equipment state and the corresponding characteristic data thereof;
the data dividing module is used for dividing the characteristic data to obtain a plurality of data time windows;
the model construction module is used for constructing a mapping relation model of the equipment state and the characteristic data by using a Gaussian distribution method;
and the fault prediction module is used for calculating probability distribution and a threshold value of each data time window according to the mapping relation model, and comparing the probability distribution with the default threshold value to obtain a fault prediction result.
The principle of the present application:
the gaussian distribution is used for determining a reference fault threshold of the index and provides support for final threshold prediction of various resources in various scenes. The gaussian distribution is used to describe the probability distribution of the occurrence of the data samples, and the probability distribution of the population is estimated by calculating the mean and variance of the sample values. The basic idea of the anomaly detection algorithm based on Gaussian distribution is to calculate the Gaussian distribution probability of a certain resource characteristic value sample, wherein the probability is larger than or equal to a threshold value and is normal, and the probability is smaller than the threshold value and is abnormal. The mathematical connotation of anomaly detection is derived from a rule of mathematical statistics that small probability events will not generally occur and are considered to be anomalous once they occur.
Constructing a resource abnormality detection algorithm, and firstly selecting a characteristic value related to a resource state, such as the CPU utilization rate of a server. And then calculating the mean value and the variance of each feature, further obtaining the Gaussian distribution of each feature, and determining the abnormal probability threshold value of the Gaussian distribution. Based on the basic theory of the Gaussian distribution detection algorithm, the probability distribution and the threshold value of the resource characteristic value data sample are calculated, and the abnormal state of the resource is rapidly mined and detected by comparing the Gaussian distribution value of the detection sample with the threshold value. The Gaussian normal distribution model constructed by the Gaussian distribution algorithm can rapidly and accurately analyze the state of the resource index and effectively identify the reference threshold value, so that support is provided for dynamic correction of the threshold value. The gaussian distribution can form corresponding distribution for different types of indexes and indexes under different scenes, so that corresponding reference thresholds are obtained.
Example 1:
the embodiment discloses a fault dynamic prediction method based on Gaussian distribution, which comprises the following steps:
step 1: data preparation
Select the features relevant to the device state and their corresponding feature values, such as the CPU temperature, packet loss rate, and response delay of a server. The feature data comes from continuous monitoring and collection of these features.
Step 2: data periodic decomposition
The data is divided into different time windows based on periodicity of the feature data, distribution of the features, and business understanding of the data. The feature data has similar numerical distribution features in the same time window. Taking the CPU temperature of a certain server as an example, the temperature of the server is lowest between 2 and 5 am, which corresponds to the lowest access amount in the time period, so that between 2 and 5 am can be used as a time window of the CPU temperature data. In this application, it is assumed that the eigenvalues conform to a gaussian distribution and that there is a significant difference in the mean and variance of the eigenvalues across different data time windows. In addition, since updating of feature data may cause the periodicity and distribution characteristics of the data to change, it is necessary to dynamically adjust the data time window and perform a gaussian distribution-based failure prediction for each window.
Step 3: modeling method selection
Construct a mapping relation between the device state and the feature value using a Gaussian distribution method. Calculate the mean and variance of the feature to obtain its Gaussian distribution, and determine an anomaly probability threshold for that distribution. Based on the basic theory of the Gaussian distribution detection algorithm, calculate the probability distribution and threshold of the device feature value samples, and compare the Gaussian distribution value of each detected sample with the threshold to quickly detect abnormal states of the device.
Step 4: threshold calculation
The two-dimensional data sample of the characteristic parameters in a given data time window of the device and the corresponding device states is expressed as:
$$D = \{(x_i, y_i) \mid x_i \in \mathbb{R},\ y_i \in \{\text{normal}, \text{abnormal}\},\ i \in I\}$$
where x_i is the device characteristic value, y_i is the device state, and I is the data index.
After dividing the sample D into a training set D_train and a test set D_test, the characteristic parameters in D_train whose state y_i is normal are expressed as:
$$X(n) = x_1, x_2, \ldots, x_n$$
The mean μ and variance σ² of X(n) are calculated as follows:
$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (1)$$
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2 \qquad (2)$$
Let δ denote the deviation of a characteristic value x from the mean μ. The probability P(δ) that a value x_i in X(n) falls in the range [μ-δ, μ+δ] is:
$$P(\delta) = \int_{\mu-\delta}^{\mu+\delta} \rho(x)\,dx \qquad (3)$$
where ρ(x) is the probability density function of the characteristic value x under the Gaussian distribution of X(n):
$$\rho(x) = \rho(x;\ \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \qquad (4)$$
Substituting (4) into (3) gives:
$$P(\delta) = \int_{\mu-\delta}^{\mu+\delta} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)dx \qquad (5)$$
Consider a value β such that P(β) = 0.95, i.e. the probability that a characteristic value in X(n) falls in the range [μ-β, μ+β] is 95%. The integral equation is solved inversely to obtain β.
Then the default threshold ε is calculated from β:
$$\varepsilon = \rho(\mu - \beta;\ \mu, \sigma^2) \qquad (6)$$
That is, in this time window, if the Gaussian distribution density ρ of a characteristic value is smaller than the threshold ε, the device state is determined to be abnormal.
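The inverse solution for β has a closed form. The following derivation is a sketch that introduces the standard normal cumulative distribution function Φ, a symbol that does not appear in the original text. Because the Gaussian density is symmetric about μ, the probability of the interval [μ-β, μ+β] depends only on β/σ:

```latex
\begin{aligned}
P(\beta) &= \Phi\!\left(\frac{\beta}{\sigma}\right) - \Phi\!\left(-\frac{\beta}{\sigma}\right)
          = 2\,\Phi\!\left(\frac{\beta}{\sigma}\right) - 1
\quad\Longrightarrow\quad
\beta = \sigma\,\Phi^{-1}\!\left(\frac{1 + P(\beta)}{2}\right), \\
\varepsilon &= \rho(\mu - \beta;\ \mu, \sigma^2)
             = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{\beta^2}{2\sigma^2}\right).
\end{aligned}
```

For P(β) = 0.95 this gives β ≈ 1.96σ. Since ρ(x) decreases as |x - μ| grows, the test ρ(x) < ε is equivalent to |x - μ| > β, so the density threshold and the interval [μ-β, μ+β] describe the same decision rule.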
Step 5: classification and assessment
Classify the test data set of the device state parameters, and calculate the precision p, recall r, and F1 score to check the classification accuracy. They are calculated as follows:
$$p = \frac{TP}{TP + FP}, \qquad r = \frac{TP}{TP + FN}, \qquad F1 = \frac{2pr}{p + r}$$
where: TP is the number of abnormal samples correctly predicted as abnormal (detected as abnormal and actually abnormal); FP is the number of normal samples mispredicted as abnormal (detected as abnormal but actually normal); FN is the number of abnormal samples mispredicted as normal (detected as normal but actually abnormal).
When the F1 score is unsatisfactory, different values of P(β) are tried in order to calculate different thresholds ε. The ε value with the highest F1 score is then adopted to improve the classification accuracy.
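The threshold search in this step can be sketched as a small loop over candidate coverage values. The helper names `f1_score` and `best_coverage`, the candidate list, and the test points are all hypothetical; the text does not say which values are tried or in what order.

```python
from statistics import NormalDist


def f1_score(tp, fp, fn):
    """F1 from confusion counts, treating "abnormal" as the positive class."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_coverage(mu, sigma, test_set, candidates):
    """Return (f1, coverage, epsilon) for the best-scoring candidate.

    test_set: list of (value, actually_abnormal) pairs.
    candidates: values of P(beta) to try.
    """
    dist = NormalDist(mu, sigma)
    best = None
    for coverage in candidates:
        beta = sigma * NormalDist().inv_cdf((1 + coverage) / 2)
        epsilon = dist.pdf(mu - beta)
        tp = fp = fn = 0
        for value, actually_abnormal in test_set:
            predicted_abnormal = dist.pdf(value) < epsilon
            if predicted_abnormal and actually_abnormal:
                tp += 1
            elif predicted_abnormal:
                fp += 1
            elif actually_abnormal:
                fn += 1
        score = f1_score(tp, fp, fn)
        if best is None or score > best[0]:
            best = (score, coverage, epsilon)
    return best


# Invented labelled test points: (value, actually_abnormal).
test_set = [(2.0, False), (2.1, False), (2.55, False),
            (3.0, True), (1.0, True), (1.9, False)]
score, coverage, epsilon = best_coverage(2.0, 0.25, test_set, [0.95, 0.98])
print(score, coverage)
```

In this toy data the point 2.55 is a false alarm at P(β) = 0.95 but falls inside the wider interval at P(β) = 0.98, so the larger coverage scores higher.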
Step 6: time advance
When the time advances to the next time window, the steps are repeated using the data in the time window to continue monitoring the device status and detecting anomalies.
Example 2:
Consider a device whose transmitted data volume, during normal operation, is normally distributed over its time of use. The faults of the device mainly comprise:
(a) The amount of data transmitted increases because the device operates abnormally;
(b) The CPU cooling system does not operate normally, which slows the work of the device.
The main working time of the equipment is 1:00-6:00 pm. The size of the data volume sent out in a 30 second time interval is recorded every 30 seconds, with one time window per hour, according to the usage characteristics.
As shown in fig. 4, square dots represent history data labelled "failure one", "x" dots represent history data labelled "normal", and round dots represent history data labelled "failure two". The dashed lines represent the Gaussian distribution of the corresponding data.
Splitting the historical data by label, the data volume sent by the device in the "normal" state is shown in fig. 5 and conforms to a Gaussian distribution.
In the historical data, the data volume sent by the device in the "failure one" state is shown in fig. 6. When this fault occurs, the data volume sent by the device is normally distributed, and the temperature is lower than in normal operation.
In the historical data, the data volume sent by the device in the "failure two" state is shown in fig. 7. When this fault occurs, the data volume sent by the device is normally distributed, and the temperature is higher than in normal operation.
After the historical data are divided into a training data set and a test data set in a 7:3 ratio (that is, 1400 training data points and 600 test data points), the CPU temperature mean μ and standard deviation σ for the normal state of the device in the training data set are calculated using formulas (1)-(2), giving:
μ≈2.0035,σ≈0.25002
When P(β) = 0.95 is set, ε is calculated using equations (5)-(6), namely:
$$\varepsilon = \rho(\mu - \beta;\ \mu, \sigma^2)$$
which gives
ε≈0.233762
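The reported threshold can be checked numerically from the rounded μ and σ given above. This sketch assumes the closed-form inversion β = σ·Φ⁻¹((1 + P(β))/2) via `statistics.NormalDist`; the text does not state how the integral equation was actually solved.

```python
from statistics import NormalDist

mu, sigma = 2.0035, 0.25002   # rounded values reported in the text
coverage = 0.95               # P(beta)

# Half-width of the central interval holding `coverage` of the probability.
beta = sigma * NormalDist().inv_cdf((1 + coverage) / 2)
# Threshold: Gaussian density at the lower edge of that interval.
epsilon = NormalDist(mu, sigma).pdf(mu - beta)
print(f"beta = {beta:.4f}, epsilon = {epsilon:.4f}")
```

Under these assumptions the result agrees with the reported ε to about four decimal places; small differences beyond that are expected because μ and σ are rounded.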
By comparing the Gaussian density ρ of each test data point with the threshold ε, the test set data are classified into "abnormal state" and "normal state"; comparing these predictions with the labels yields the data shown in fig. 8.
Among them, there are 248 TP data points (predicted abnormal and actually abnormal); 335 TN data points (predicted normal and actually normal); 13 FP data points (predicted abnormal but actually normal); and 4 FN data points (predicted normal but actually abnormal).
The F1 score is calculated as follows:
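The arithmetic can be worked through from the four counts above with the usual precision, recall, and F1 definitions. The variable names are illustrative, and the F1 value for P(β) = 0.95 is recomputed here rather than taken from the text.

```python
# Confusion counts reported for the test set at P(beta) = 0.95.
tp, tn, fp, fn = 248, 335, 13, 4

precision = tp / (tp + fp)   # 248 / 261
recall = tp / (tp + fn)      # 248 / 252
f1 = 2 * precision * recall / (precision + recall)   # equals 496 / 513

print(f"precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
```

The counts sum to 600, matching the size of the test set, and the F1 score works out to 496/513, about 96.69%.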
Trying other values of P(β):
as shown in fig. 9, when P(β) = 0.997, F1 ≈ 96.08%;
as shown in fig. 10, when P(β) = 0.96, F1 ≈ 97.25%;
as shown in fig. 11, when P(β) = 0.97, F1 ≈ 97.43%;
as shown in fig. 12, when P(β) = 0.98, F1 ≈ 97.80%.
Based on the current data, the threshold calculated with P(β) = 0.98 gives the most accurate prediction among the values tried. In this case, the threshold is ε = 0.106. That is, if the device is operating properly, the size of the transmitted data should lie in the interval [1.42, 2.59].
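The link between the density threshold and the normal-operation interval can be illustrated with the rounded μ and σ from this example. The probe values and variable names below are hypothetical, and β is again obtained from the standard normal quantile function as an assumed way of solving the integral equation.

```python
from statistics import NormalDist

mu, sigma = 2.0035, 0.25002   # rounded values reported in the text
coverage = 0.98               # P(beta) selected by the F1 comparison

dist = NormalDist(mu, sigma)
beta = sigma * NormalDist().inv_cdf((1 + coverage) / 2)
lower, upper = mu - beta, mu + beta
epsilon = dist.pdf(lower)
print(f"interval = [{lower:.2f}, {upper:.2f}]")

# The density test and the interval test give the same verdict.
for x in (1.0, 1.5, 2.0, 2.5, 3.0):
    by_density = dist.pdf(x) >= epsilon
    by_interval = lower <= x <= upper
    print(x, by_density, by_interval)
```

The computed bounds round to the interval stated above, and the computed ε is close to the reported 0.106.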
Upon entering a new time window, the system will repeat the above steps to calculate a new threshold epsilon.
The embodiment of the application provides computer equipment. The computer device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. The steps of the various method embodiments described above are implemented when the processor executes the computer program. Alternatively, the processor may implement the functions of the modules/units in the above-described device embodiments when executing the computer program.
The computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to complete the present application.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer device may include, but is not limited to, a processor, a memory.
The processor may be a central processing unit (CPU), but may also be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
The memory may be used to store the computer program and/or modules, and the processor may implement various functions of the computer device by running or executing the computer program and/or modules stored in the memory, and invoking data stored in the memory.
The modules/units integrated in the computer device may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as stand-alone products. Based on such understanding, the present application may implement all or part of the flow of the method of the above embodiments by instructing related hardware through a computer program, where the computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be appropriately added to or reduced according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
The foregoing is merely a preferred embodiment of the present application and is not intended to limit the present application; those skilled in the art may make various modifications and variations to it. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall fall within the protection scope of the present application.

Claims (10)

1. A Gaussian distribution-based fault dynamic prediction method, characterized by comprising the following steps:
continuously monitoring and collecting the characteristics related to the equipment state and the corresponding characteristic data thereof;
dividing the characteristic data to obtain a plurality of data time windows;
constructing a mapping relation model of the equipment state and the characteristic data by using a Gaussian distribution method;
calculating a probability distribution and a default threshold for each data time window according to the mapping relation model, and comparing the probability distribution with the default threshold to obtain a fault prediction result.
2. The Gaussian distribution-based fault dynamic prediction method according to claim 1, wherein the characteristics related to the equipment state include the CPU temperature, packet loss rate, and response delay of the equipment.
3. The Gaussian distribution-based fault dynamic prediction method according to claim 1, wherein dividing the characteristic data to obtain a plurality of data time windows comprises:
dividing the characteristic data according to the periodicity, distribution characteristics, and business understanding of the characteristic data to obtain a plurality of different data time windows.
4. The Gaussian distribution-based fault dynamic prediction method according to claim 1 or 3, characterized in that the characteristic data within the same data time window have similar numerical distribution characteristics.
5. The Gaussian distribution-based fault dynamic prediction method according to claim 1, wherein calculating the probability distribution and the default threshold of each data time window according to the mapping relation model comprises:
calculating the mean and variance of the characteristics to obtain the Gaussian distribution of the characteristics, and determining the probability distribution and the default threshold of the Gaussian distribution.
6. The Gaussian distribution-based fault dynamic prediction method according to claim 5, wherein determining the probability distribution and the default threshold of the Gaussian distribution comprises:
the two-dimensional data sample $D$, formed from the characteristic parameter of the data time window and the corresponding equipment state, is as follows:
$$D = \{(x_i, y_i) \mid x_i \in \mathbb{R},\ y_i \in \{\text{normal}, \text{abnormal}\},\ i \in I\}$$
where $x_i$ is the equipment characteristic value, $y_i$ is the equipment state, and $I$ is the data index set;
dividing the two-dimensional data sample $D$ into a training set $D_{train}$ and a test set $D_{test}$, then taking from the training set $D_{train}$ the characteristic parameters $X(n)$ whose $y_i$ is normal:
$$X(n) = x_1, x_2, \ldots, x_n$$
calculating the mean $\mu$ and variance $\sigma^2$ of the characteristic parameter $X(n)$;
letting $\delta$ denote the deviation of a characteristic value $x$ from the mean $\mu$, calculating the probability $P(\delta)$ that a value $x_i$ in the characteristic parameter $X(n)$ falls within the range $[\mu-\delta,\ \mu+\delta]$, as follows:
$$P(\delta) = \int_{\mu-\delta}^{\mu+\delta} \rho(x)\,dx \tag{3}$$
where $\rho(x)$ is the probability density function of the characteristic value $x$ under the Gaussian distribution of $X(n)$:
$$\rho(x) = \rho(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \tag{4}$$
substituting (4) into (3) gives:
$$P(\delta) = \int_{\mu-\delta}^{\mu+\delta} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)dx \tag{5}$$
considering a value $\beta$ such that $P(\beta) = 0.95$, i.e., the probability that a characteristic value in $X(n)$ falls within the range $[\mu-\beta,\ \mu+\beta]$ is 95%; solving the integral equation inversely to obtain the value $\beta$;
then, calculating the default threshold $\varepsilon$ using $\beta$:
$$\varepsilon = \rho(\mu-\beta;\ \mu,\ \sigma^2) \tag{6}$$
7. The Gaussian distribution-based fault dynamic prediction method according to claim 5 or 6, wherein comparing the probability distribution with the default threshold to obtain a fault prediction result comprises:
in the data time window, if the Gaussian distribution density ρ of a characteristic value is smaller than the threshold ε, judging that the equipment state is abnormal.
8. A Gaussian distribution-based fault dynamic prediction system, comprising:
a data acquisition module, configured to continuously monitor and collect the characteristics related to the equipment state and the corresponding characteristic data thereof;
a data dividing module, configured to divide the characteristic data to obtain a plurality of data time windows;
a model construction module, configured to construct a mapping relation model of the equipment state and the characteristic data by using a Gaussian distribution method;
a fault prediction module, configured to calculate a probability distribution and a default threshold for each data time window according to the mapping relation model, and to compare the probability distribution with the default threshold to obtain a fault prediction result.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-7 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1-7.
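
The threshold computation recited in claims 5 to 7 can be illustrated with a minimal Python sketch. It is not the applicant's implementation and rests on stated assumptions: the population variance is used (the claims do not say which estimator applies), β is obtained from the Gaussian quantile function rather than by numerically inverting the integral, and the names `fit_window`, `fit_windows`, `is_abnormal`, and `window_of` are hypothetical helpers introduced only for illustration. Each window is assumed to contain at least two distinct normal-state values so that the variance is nonzero.

```python
from statistics import NormalDist, fmean, pvariance


def fit_window(normal_values, coverage=0.95):
    """Fit a Gaussian to the normal-state values of one data time window.

    Returns (dist, beta, epsilon), where P(beta) = coverage and
    epsilon = rho(mu - beta; mu, sigma^2).
    """
    mu = fmean(normal_values)
    # Assumption: population variance; the claims do not fix the estimator.
    sigma = pvariance(normal_values, mu) ** 0.5
    dist = NormalDist(mu, sigma)
    # The interval [mu - beta, mu + beta] is symmetric, so mu + beta is the
    # (1 + coverage) / 2 quantile of the fitted distribution.
    beta = dist.inv_cdf((1 + coverage) / 2) - mu
    epsilon = dist.pdf(mu - beta)
    return dist, beta, epsilon


def fit_windows(samples, window_of, coverage=0.95):
    """Fit one model per data time window.

    samples: iterable of (timestamp, value, state) tuples.
    window_of: hypothetical function mapping a timestamp to a window key.
    Only samples whose state is "normal" are used for fitting.
    """
    grouped = {}
    for timestamp, value, state in samples:
        if state == "normal":
            grouped.setdefault(window_of(timestamp), []).append(value)
    return {key: fit_window(values, coverage) for key, values in grouped.items()}


def is_abnormal(value, dist, epsilon):
    """Flag the value as abnormal when its density is below the threshold."""
    return dist.pdf(value) < epsilon


# Illustrative data only: (hour of day, CPU temperature, labelled state).
samples = [
    (9, 60.0, "normal"), (10, 62.0, "normal"), (11, 58.0, "normal"),
    (14, 61.0, "normal"), (15, 85.0, "abnormal"), (16, 59.0, "normal"),
    (22, 45.0, "normal"), (23, 47.0, "normal"), (2, 44.0, "normal"),
]
models = fit_windows(samples, lambda hour: "day" if 8 <= hour < 20 else "night")
day_dist, day_beta, day_epsilon = models["day"]
print(is_abnormal(85.0, day_dist, day_epsilon))
```

Because the Gaussian density decreases monotonically with distance from the mean, comparing the density with ε is equivalent to checking whether the value lies outside [μ-β, μ+β]; for 95% coverage, β is approximately 1.96σ.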

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311475288.3A CN117527611B (en) 2023-11-07 2023-11-07 Gaussian distribution-based fault dynamic prediction method, system, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117527611A (en) 2024-02-06
CN117527611B (en) 2024-06-07

Family

ID=89759993

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311475288.3A Active CN117527611B (en) 2023-11-07 2023-11-07 Gaussian distribution-based fault dynamic prediction method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117527611B (en)

Also Published As

Publication number Publication date
CN117527611B (en) 2024-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant