CN110909306A - Service abnormity detection method and device, electronic equipment and storage equipment - Google Patents
Service abnormity detection method and device, electronic equipment and storage equipment
- Publication number
- CN110909306A (application CN201811081466.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- point
- residual
- baseline
- service point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Operations Research (AREA)
- Economics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Physics (AREA)
- Game Theory and Decision Science (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Development Economics (AREA)
- Probability & Statistics with Applications (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Debugging And Monitoring (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The application discloses a service abnormity detection method, which comprises the following steps: obtaining first data; the first data comprises historical traffic data; obtaining a baseline containing the future trend of the business data according to the first data; and predicting whether the current service point is an abnormal point or not at least according to the baseline. By adopting the method, the problem of false alarm of service detection is solved.
Description
Technical Field
The application relates to the technical field of service monitoring, in particular to a service abnormity detection method and device, electronic equipment and storage equipment.
Background
In the field of anomaly detection, the conventional method is to set a static threshold value for alarming by manually combing expert knowledge. Firstly, a service scene describing the occurrence of the abnormity is established, and the time of the abnormity and the service abnormity index value are qualitatively described in the scene. When a new service data appears, the current service data is compared with a preset service abnormity index value, and then abnormity detection is carried out.
In the anomaly detection method provided by the prior art, when the threshold is manually set, only rising and falling trend curves of the current time interval (for example, 5 minutes) can be seen, and the periodicity and periodic rising and falling trends of the historical data curve cannot be seen, so that the manually set static threshold is not accurate enough, and false alarm often occurs.
Disclosure of Invention
The application provides a service abnormity detection method, a device, electronic equipment and storage equipment, which are used for solving the problem of false alarm in the conventional service abnormity detection.
The application provides a service anomaly detection method, which comprises the following steps:
obtaining first data; the first data comprises historical traffic data;
obtaining a baseline containing the future trend of the business data according to the first data;
and predicting whether the current service point is an abnormal point or not at least according to the baseline.
Optionally, the first data includes: weekday data and holiday data.
Optionally, the method further includes:
denoising the first data to obtain second data;
and obtaining a baseline containing the future trend of the business data according to the second data.
Optionally, obtaining a baseline containing the future trend of the business data according to the second data includes:
mapping the average difference value of the data of the historical service points of the preset number aiming at the current service point in the second data to the future service points of the preset number;
merging the data of the future service points with the preset number with the second data to obtain merged data;
obtaining a working day baseline and a holiday baseline according to the merged data;
and combining the working day baseline and the holiday baseline to obtain the baseline.
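Optionally, mapping the average difference value of the data of the preset number of historical service points for the current service point in the second data to the preset number of future service points includes: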
calculating an average value of data of a preset number of historical service points aiming at the current service point in the second data;
compressing the data of the historical service point in a first preset time interval before the current service point in the second data to obtain data of a second preset time interval;
taking out the data of the preset number of service points behind the current service point from the data of the second preset time interval;
calculating the difference between the data of each of the preset number of historical service points of the current service point in the second data and the average value;
and taking the sum of the data of the preset number of service points behind the current service point and the corresponding difference value as the data of the preset number of future service points of the current service point.
Optionally, obtaining a working day baseline and a holiday baseline according to the merged data includes:
extracting working day data from the merged data, and generating a working day base line by adopting an STL algorithm;
and extracting holiday data from the combined data, and generating holiday baselines by adopting a moving smoothing algorithm.
Optionally, predicting whether the current service point is an abnormal point according to the baseline includes:
obtaining a current service point residual error, an abnormal detection parameter and a feedback adjustment parameter according to the base line;
and judging whether the current service point is an abnormal point or not at least according to the obtained residual error of the service point and one of the variance of the historical residual error, the abnormal detection parameter and the feedback regulation parameter.
Optionally, the determining whether the current service point is an abnormal point according to at least one of the obtained residual error of the service point and the variance of the historical residual error, the abnormal detection parameter, and the feedback adjustment parameter includes:
if the residual of the current service point satisfies the formula (x - mean) > std × (n + delta), judging that the current service point is an abnormal point;
wherein x is the residual of the current service point; mean is the mean of the historical residuals; std is the variance of the historical residuals; n is the abnormality detection parameter; delta is the feedback adjustment parameter.
Optionally, the obtaining an abnormality detection parameter according to the baseline includes:
obtaining the variation of the mean value and the variance of the residual errors according to the base line and the first data;
acquiring a time interval according to the variation of the mean value and the variance of the residual errors;
obtaining a residual sequence in the time interval according to the base line and the first data;
determining a confidence level of the time interval;
and according to the confidence coefficient, carrying out Gaussian processing on the residual sequence to obtain the abnormal detection parameters of the time interval.
Optionally, the obtaining a feedback adjustment parameter according to the baseline includes:
and determining feedback adjusting parameters according to the percentile of effective alarm and the percentile of ineffective alarm in the time interval.
Optionally, determining the feedback adjustment parameter according to the percentile of effective alarms and the percentile of ineffective alarms in the time interval includes:
determining the percentile x of effective alarms and the percentile y of ineffective alarms in the time interval;
when x > -y, delta = y × exp(y/x - 1);
when x > y, delta = x / (1 + exp(-y/x));
wherein delta is the feedback adjustment parameter.
Optionally, predicting whether the current service point is an abnormal point according to the baseline includes:
obtaining a current service point residual error according to the base line;
judging whether the residual error of the current service point is an extreme residual error;
and if so, determining the service point as an abnormal point.
Optionally, the determining whether the current service point residual is an extreme residual includes:
selecting a window comprising a current time point and a third preset time interval residual error before the current time point from the base line as a first sliding window;
selecting a window comprising the current time point and a fourth preset time interval residual error before the current time point from the base line as a second sliding window;
and judging whether the residual error of the current service point is an extreme residual error or not according to the residual sequence mean value and the variance of the first sliding window and the residual sequence mean value and the variance of the second sliding window.
Optionally, the determining whether the current service point residual is an extreme residual according to the residual sequence mean and the variance of the first sliding window and the residual sequence mean and the variance of the second sliding window includes:
if the following formula is satisfied: 0.499 ═ 0.5 ═ erf((W_1_mean ; W_2_mean)/(W_1_std ═ W_2_std)), the residual of the current service point is judged to be an extreme residual;
wherein W_1_mean is the residual sequence mean of the first sliding window;
W_2_mean is the residual sequence mean of the second sliding window;
W_1_std is the residual sequence variance of the first sliding window;
W_2_std is the residual sequence variance of the second sliding window.
Optionally, predicting whether the current service point is an abnormal point according to the baseline includes:
obtaining a sliding window with the length of the current time point data being a fifth preset time interval according to the base line;
calculating a characteristic value of the sliding window;
performing logistic regression detection according to the characteristic value;
and determining whether the service point is an abnormal point according to the result of the logistic regression detection.
The present application further provides a device for detecting service anomaly, including:
a first data obtaining unit for obtaining first data; the first data comprises historical traffic data;
the base line obtaining unit is used for obtaining a base line containing the future trend of the business data according to the first data;
and the abnormal point judging unit is used for predicting whether the current service point is an abnormal point or not at least according to the baseline.
The present application further provides an electronic device, comprising:
a processor; and
a memory for storing a program of the service anomaly detection method, wherein after the device is powered on and the program of the service anomaly detection method is run by the processor, the following steps are executed:
obtaining first data; the first data comprises historical traffic data;
obtaining a baseline containing the future trend of the business data according to the first data;
and predicting whether the current service point is an abnormal point or not at least according to the baseline.
The application also provides a storage device, which stores a program of the service anomaly detection method, wherein the program is run by a processor and executes the following steps:
obtaining first data; the first data comprises historical traffic data;
obtaining a baseline containing the future trend of the business data according to the first data;
and predicting whether the current service point is an abnormal point or not at least according to the baseline.
Compared with the prior art, the method has the following advantages:
according to the method, the device, the electronic equipment and the storage equipment provided by the application, a baseline containing the future trend of the business data is obtained from historical business data, and whether the current service point is an abnormal point is predicted according to that baseline, which improves prediction accuracy and effectively addresses the problem of false alarms.
Drawings
Fig. 1 is a flowchart of a service anomaly detection method according to a first embodiment of the present application.
Fig. 2 is a schematic diagram of a baseline generation provided in the first embodiment of the present application.
Fig. 3 is a schematic diagram of a service anomaly detection apparatus according to a second embodiment of the present application.
Fig. 4 is a schematic diagram of an electronic device according to a third embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The application may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.
A first embodiment of the present application provides a method for detecting a service anomaly, which is described below with reference to fig. 1.
As shown in fig. 1, in step S101, first data is obtained; the first data includes historical traffic data.
The service data refers to data monitored by the monitoring system, which raises an alarm when the monitored data is abnormal; one example is the transaction amount of an e-commerce platform. The first data also includes the time information corresponding to the service data.
The first data includes: weekday data and holiday data.
As shown in fig. 1, in step S102, a baseline containing the future trend of the business data is obtained according to the first data.
Since the first data may have some noise, the first data may be denoised before the baseline is generated using the first data, so as to obtain the second data.
When abnormal data such as noise and stress-test data are filtered out of the first data, time-series denoising methods such as median filtering and box plots may be used.
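As a rough illustration of the median filtering mentioned above, the following sketch replaces each point with the median of a small centered window. The function name `median_filter`, the window size, and the sample values are assumptions made for this example; the text does not specify them.

```python
from statistics import median


def median_filter(values, window=3):
    """Replace each point with the median of a centered window.

    Near the edges the window is simply truncated (an assumption of this sketch).
    """
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical first data with a single spike at index 3.
first_data = [10, 11, 10, 95, 11, 12, 11]
second_data = median_filter(first_data, window=3)
```

The spike is suppressed because a single outlier can never be the median of a three-point window.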
The obtaining a baseline containing a future trend of the business data according to the first data comprises: and obtaining a baseline containing the future trend of the business data according to the second data.
The obtaining of the baseline including the future trend of the business data according to the second data includes:
mapping the average difference value of the data of the historical service points of the preset number aiming at the current service point in the second data to the future service points of the preset number;
merging the data of the future service points with the preset number with the second data to obtain merged data;
obtaining a working day baseline and a holiday baseline according to the merged data;
and combining the working day baseline and the holiday baseline to obtain the baseline.
Fig. 2 is a schematic diagram of obtaining a baseline containing the future trend of the business data according to the first data. First, the values of the next 100 service points are obtained from the average difference between the current point (current service point) and the 100 historical service points. Then the working day/holiday curve data obtained in preprocessing is merged with the values of the 100 future service points. Holiday data is extracted from the merged data and moving-smoothed to generate a baseline, and season and trend curves are generated through STL; meanwhile, working day data is extracted from the merged data, a baseline is generated through the STL algorithm, and the season and trend curves are stored. The above baselines are merged to obtain a baseline containing the future trend.
Mapping the average difference of the data of the preset number of historical service points for the current service point in the second data to a preset number of future service points, may include the following steps:
calculating an average value of data of a preset number of historical service points aiming at the current service point in the second data;
compressing the data of the historical service point in a first preset time interval before the current service point in the second data to obtain data of a second preset time interval;
taking out the data of the preset number of service points behind the current service point from the data of the second preset time interval;
the data of the historical service points with the preset number of the current service points in the second data is compared with the average value to obtain a difference value;
and taking the sum of the data of the preset number of service points behind the current service point and the corresponding difference value as the data of the preset number of future service points of the current service point.
Obtaining a working day baseline and a holiday baseline according to the merged data includes:
extracting working day data from the merged data, and generating a working day base line by adopting an STL algorithm;
and extracting holiday data from the combined data, and generating holiday baselines by adopting a moving smoothing algorithm.
STL is the abbreviation of Seasonal and Trend decomposition using Loess, a time series decomposition method that splits a series into seasonal, trend and remainder components.
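The moving smoothing used for the holiday baseline can be pictured as a simple moving average. The sketch below is one possible reading, assuming a trailing window; the name `moving_average` and the window length are hypothetical, and the working day baseline (STL) is not shown because it normally relies on a library implementation.

```python
def moving_average(values, window=3):
    """Trailing moving average; the first points average over what is available."""
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Hypothetical holiday data.
holiday_baseline = moving_average([1, 2, 3, 4, 5], window=3)
```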
The process of obtaining a baseline containing future trends in the business data from the second data is described below with reference to specific examples.
1. Calculating an average value of data of a preset number of historical service points for the current service point in the second data: if the preset number is 100, the time point corresponding to the current service point is 10:00, and there is 1 service point every 1 minute, the data of the 100 historical service points before 10:00 refers to the data of the 100 historical service points at 9:59, 9:58, 9:57 …. The average value is the average of the data of these 100 historical service points.
2. Compressing the data of the historical service points in the first preset time interval before the current service point in the second data to obtain the data of the second preset time interval: if the first preset time interval is 2 months and the second preset time interval is 1 day, the data of the 2 months before the current service point in the second data is compressed into the data of 1 day.
3. Taking out the data of the preset number of service points after the current service point from the data of the second preset time interval: the data of the 100 service points from 10:01 onward are extracted from the 1-day data.
4. Calculating the difference between the data of the preset number of service points after the current service point and the average value: the average value is subtracted from the data of the 100 service points from 10:01 onward taken out in step 3, to obtain the corresponding 100 difference values.
5. Taking the sum of the data of the preset number of historical service points of the current service point and the corresponding difference values as the data of the preset number of future service points of the current service point.
6. Merging the data of the preset number of future service points with the second data to obtain merged data.
7. Extracting working day data from the merged data, and generating a working day baseline by adopting an STL algorithm.
8. Extracting holiday data from the merged data, and generating a holiday baseline by adopting a moving smoothing algorithm.
9. Merging the working day baseline and the holiday baseline to obtain the baseline.
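Steps 1 to 5 of the example can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, compression is assumed to be a per-slot average across days, and each historical point is assumed to pair with the profile point at the same position. Under these assumptions, adding (profile point minus average) to a historical point gives the same result as adding (historical point minus average) to a profile point, so the two orderings of the subtraction that appear in the text agree.

```python
from statistics import mean


def compress_to_one_day(days):
    """Compress several equally long daily series into one day.

    Assumption: compression means averaging each time slot across days.
    """
    return [mean(slot) for slot in zip(*days)]


def map_future_points(history, profile_after):
    """Estimate future points from recent history and the one-day profile.

    history:       the k most recent points before the current point
    profile_after: the k profile points that follow the current time of day
    """
    if len(history) != len(profile_after):
        raise ValueError("history and profile_after must have the same length")
    avg = mean(history)
    # future[i] = history[i] + (profile_after[i] - avg)
    return [h + (p - avg) for h, p in zip(history, profile_after)]


# Hypothetical data: two days of four slots each, three points of history.
profile = compress_to_one_day([[10, 20, 30, 40], [12, 22, 32, 42]])
future = map_future_points([18, 20, 22], profile[1:4])
```

The future points can then be appended to the second data before the working day and holiday baselines are generated.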
As shown in fig. 1, in step S103, it is predicted whether the current service point is an outlier according to the baseline.
Predicting whether the current service point is an abnormal point according to the baseline, comprising the following steps:
obtaining a current service point residual error, an abnormal detection parameter and a feedback adjustment parameter according to the base line;
and judging whether the current service point is an abnormal point or not at least according to the obtained residual error of the service point and one of the variance of the historical residual error, the abnormal detection parameter and the feedback regulation parameter.
In specific implementation, whether the current service point is an abnormal point or not can be judged according to the current service point residual error, the historical residual error mean value, the variance of the historical residual error, the abnormal detection parameter and the feedback adjustment parameter:
if the residual of the current service point satisfies the formula (x - mean) > std × (n + delta), judging that the current service point is an abnormal point;
wherein x is the residual of the current service point; mean is the mean of the historical residuals; std is the variance of the historical residuals; n is the abnormality detection parameter; delta is the feedback adjustment parameter.
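The threshold test above translates directly into code. In this minimal sketch the function and argument names are assumptions, and `hist_std` is whatever spread measure the formula calls std.

```python
def is_abnormal(x, hist_mean, hist_std, n, delta=0.0):
    """Return True if (x - mean) > std * (n + delta).

    The test is one-sided, exactly as the formula is written.
    """
    return (x - hist_mean) > hist_std * (n + delta)


# Hypothetical values: mean 1, std 2, n = 3, delta = 0.5, so the threshold is 7.
high = is_abnormal(10, 1, 2, 3, 0.5)    # 9 > 7
normal = is_abnormal(7, 1, 2, 3, 0.5)   # 6 > 7 is false
```

A larger delta raises the threshold, so feedback that marks many alarms as ineffective makes the detector less sensitive.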
The abnormality detection parameter may be obtained by:
obtaining the variation of the mean value and the variance of the residual errors according to the base line and the first data;
acquiring a time interval according to the variation of the mean value and the variance of the residual errors;
obtaining a residual sequence of the time interval according to the base line and the first data;
determining a confidence level of the time interval;
and according to the confidence coefficient, carrying out Gaussian processing on the residual sequence to obtain the abnormal detection parameters of the time interval.
The residual refers to the difference between the service point data of the baseline and the corresponding service point data of the first data curve.
The residual sequence in the time interval refers to a sequence formed by all residual errors in the time interval.
The variance refers to an average value of a sum of squares of differences between the data of each service point in the time interval and an average value of the data of all the service points in the time interval.
Since the tolerance of the service to anomalies differs between time intervals, a day needs to be divided into time intervals, each with its own abnormality detection parameter. The variation of the mean and variance of the residuals is obtained according to the baseline and the first data, and the time intervals are obtained from that variation; for example, (7 am, 10 am) may be one time interval and (10 am, 12 noon) another, each with a different abnormality detection parameter. The initial value is set from a given confidence level alpha, that is, an initial abnormality detection interval value obtained from historical event data; the residual sequence is Gaussianized and sorted, and its 5% quantile is taken as the abnormality detection parameter n.
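One way to read this step is sketched below. Several points are assumptions, since the text does not spell them out: Gaussian processing is taken to mean standardizing the residuals to zero mean and unit spread, the 5% quantile is taken from the upper tail (position 1 - alpha in the sorted sequence), and the name `detection_parameter` is hypothetical.

```python
from statistics import mean, pstdev


def detection_parameter(residuals, alpha=0.05):
    """Standardize the residuals, sort them, and read off an upper-tail quantile."""
    mu = mean(residuals)
    sigma = pstdev(residuals)
    if sigma == 0:
        # All residuals are identical; no spread to measure.
        return 0.0
    z = sorted((r - mu) / sigma for r in residuals)
    idx = min(len(z) - 1, int((1 - alpha) * len(z)))
    return z[idx]


# Hypothetical residual sequence for one time interval.
n = detection_parameter(list(range(100)), alpha=0.05)
```

Each time interval would get its own n, computed from the residual sequence of that interval.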
The feedback regulation parameter can be obtained by the following method:
and determining feedback adjusting parameters according to the percentile of effective alarm and the percentile of ineffective alarm in the time interval.
Determining the feedback adjustment parameter according to the percentile of effective alarms and the percentile of ineffective alarms in the time interval includes:
determining the percentile x of effective alarms and the percentile y of ineffective alarms in the time interval;
when x > -y, delta = y × exp(y/x - 1);
when x > y, delta = x / (1 + exp(-y/x));
wherein delta is the feedback adjustment parameter.
The alarm records can be marked manually, whether the alarm is accurate or not is fed back, and the alarm result is corrected to obtain the percentile of effective alarm and the percentile of ineffective alarm.
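The two delta formulas can be sketched as below. The condition attached to the first formula is not legible in this text, so the sketch assumes it covers every case in which x > y does not hold; that branch choice, the function name, and the handling of x = 0 are assumptions.

```python
import math


def feedback_delta(x, y):
    """x: percentile of effective alarms, y: percentile of ineffective alarms."""
    if x <= 0:
        # Both formulas divide by x; the text does not say what to do here.
        raise ValueError("x must be positive")
    if x > y:
        return x / (1 + math.exp(-y / x))
    # Assumed to be the remaining case (x <= y).
    return y * math.exp(y / x - 1)


equal_case = feedback_delta(2, 2)      # 2 * exp(0)
no_ineffective = feedback_delta(4, 0)  # 4 / (1 + exp(0))
```

Under this reading delta grows quickly once ineffective alarms outnumber effective ones, which raises the alarm threshold.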
Predicting whether the current service point is an abnormal point according to the baseline may also adopt the following steps:
obtaining a current service point residual error according to the base line;
judging whether the residual error of the current service point is an extreme residual error;
and if so, determining the service point as an abnormal point.
The determining whether the current service point residual is an extreme residual includes:
selecting a window comprising a current time point and a third preset time interval residual error before the current time point from the base line as a first sliding window;
selecting a window comprising a current time point and a fourth preset time interval residual error before the current time point from the base line as a second sliding window;
and judging whether the residual error of the current service point is an extreme residual error or not according to the residual sequence mean value and the variance of the first sliding window and the residual sequence mean value and the variance of the second sliding window.
Judging whether the current service point residual is an extreme residual according to the residual sequence mean and the variance of the first sliding window and the residual sequence mean and the variance of the second sliding window, including:
if the following formula is satisfied: 0.499 ═ 0.5 ═ erf((W_1_mean ; W_2_mean)/(W_1_std ═ W_2_std)), the residual of the current service point is judged to be an extreme residual;
wherein W_1_mean is the residual sequence mean of the first sliding window;
W_2_mean is the residual sequence mean of the second sliding window;
W_1_std is the residual sequence variance of the first sliding window;
W_2_std is the residual sequence variance of the second sliding window.
The first sliding window may be a window over a shorter time interval, and the second sliding window may be a window over a longer time interval. For example, the first sliding window W_1 is the window of residuals for the last 5 minutes including the current time point; if the current time point is 10:00, the first sliding window holds the residuals in (9:55, 10:00). The second sliding window W_2 is the window of residuals for the past 20 days of the current time point.
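The comparison of the two windows can be illustrated as follows. The operators of the printed formula are not legible in this text, so the sketch assumes one plausible reading: the score is 0.5 × erf(|W_1_mean - W_2_mean| / (W_1_std + W_2_std)) and the residual is extreme when the score reaches 0.499. The absolute value, the sum in the denominator, the use of the population standard deviation, and the function name are all assumptions.

```python
import math
from statistics import mean, pstdev


def is_extreme_residual(short_window, long_window, threshold=0.499):
    """Compare the short-window residuals against the long-window residuals."""
    m1, m2 = mean(short_window), mean(long_window)
    spread = pstdev(short_window) + pstdev(long_window)
    if spread == 0:
        # No variation in either window: any difference in means is extreme.
        return m1 != m2
    score = 0.5 * math.erf(abs(m1 - m2) / spread)
    return score >= threshold


# Hypothetical residuals: the last 5 minutes sit far above the 20-day history.
recent = [10, 11, 10, 11, 10]
history = [0, 1, 0, 1, 0, 1]
shifted = is_extreme_residual(recent, history)
unchanged = is_extreme_residual(history, history)
```

Because erf saturates at 1, the score approaches 0.5 only when the means differ by several multiples of the combined spread.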
Predicting whether the current service point is an abnormal point according to the baseline, comprising the following steps:
obtaining a sliding window with the length of the current time point data being a fifth preset time interval according to the base line;
calculating a characteristic value of the sliding window;
performing logistic regression detection according to the characteristic value;
and determining whether the service point is an abnormal point according to the result of the logistic regression detection.
The characteristic values are the values used for logistic regression detection and include: the mean and variance of the service point data in the sliding window over the fifth preset time interval; the data of the last service point of the sliding window; the difference between the data of the last service point of the sliding window and the data of the preceding service point; which minute of the day, which day of the month, which day of the week, and which hour of the day the last time point of the sliding window falls on; and whether the last time point of the sliding window is abnormal.
The parameters required for logistic regression detection are trained as follows: values are taken from the first data through a sliding window of length t; for each window, the mean and variance of the service point data in the window are calculated, together with the difference between the data of the last service point of the sliding window and the data of the preceding service point, which minute of the day, which day of the month, which day of the week, and which hour of the day the last time point of the sliding window falls on, and whether the last time point of the sliding window is abnormal. Logistic regression parameters are then trained on the generated features and the labels of the data, and the trained parameters are stored.
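The feature extraction and scoring can be sketched as below. The feature names, the dictionary layout, and the `logistic_score` helper are assumptions made for illustration; training of the weights is not shown, and the abnormal/normal flag is treated as the training label rather than as an input feature.

```python
import math
from datetime import datetime
from statistics import mean, pvariance


def window_features(values, last_time):
    """values: window data, oldest first (at least 2 points);
    last_time: datetime of the last point in the window."""
    return {
        "mean": mean(values),
        "variance": pvariance(values),
        "last_value": values[-1],
        "last_diff": values[-1] - values[-2],
        "minute_of_day": last_time.hour * 60 + last_time.minute,
        "day_of_month": last_time.day,
        "day_of_week": last_time.isoweekday(),  # Monday is 1
        "hour_of_day": last_time.hour,
    }


def logistic_score(features, weights, bias=0.0):
    """Probability from a trained logistic model (weights are hypothetical)."""
    z = bias + sum(weights.get(k, 0.0) * v for k, v in features.items())
    # Numerically stable sigmoid.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


feats = window_features([1, 2, 3, 6], datetime(2018, 9, 17, 10, 0))
prob = logistic_score(feats, weights={})
```

A service point would be flagged when the score exceeds a chosen cut-off, for example 0.5.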
It should be noted that the above three methods for predicting whether the current service point is an abnormal point according to the baseline may be used alone or in combination, and the effect is best when all three are combined. If the three methods are used together, the current service point is determined to be an abnormal point when two of the methods judge it to be abnormal, which makes the judgment more accurate.
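Combining the three detectors amounts to a simple vote. The sketch below assumes that "two of the methods" means at least two of the three; the function and argument names are hypothetical.

```python
def combine_votes(threshold_hit, extreme_hit, regression_hit, min_votes=2):
    """Flag the point when at least min_votes of the three detectors agree."""
    votes = sum(bool(v) for v in (threshold_hit, extreme_hit, regression_hit))
    return votes >= min_votes


two_agree = combine_votes(True, False, True)
one_only = combine_votes(False, False, True)
```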
Corresponding to the above-mentioned service anomaly detection method, a second embodiment of the present application further provides a service anomaly detection apparatus.
As shown in fig. 3, the apparatus includes:
a first data obtaining unit 301 for obtaining first data; the first data comprises historical traffic data;
a baseline obtaining unit 302, configured to obtain a baseline including a future trend of the service data according to the first data;
an abnormal point judgment unit 303, configured to predict whether the current service point is an abnormal point at least according to the baseline.
Optionally, the first data includes: weekday data and holiday data.
Optionally, the apparatus further comprises:
a second data obtaining unit, configured to perform denoising processing on the first data to obtain second data;
the baseline obtaining unit comprises: a baseline obtaining subunit, configured to obtain a baseline containing the future trend of the business data according to the second data.
Optionally, the baseline obtaining subunit includes:
a future service point mapping subunit, configured to map an average difference value of data of a preset number of historical service points for the current service point in the second data to a preset number of future service points;
a data merging subunit, configured to merge the data of the preset number of future service points with the second data to obtain merged data;
a workday and holiday baseline obtaining subunit, configured to obtain a workday baseline and a holiday baseline according to the merged data;
and a baseline obtaining subunit, configured to combine the workday baseline and the holiday baseline to obtain the baseline.
Optionally, the future service point mapping subunit is specifically configured to:
calculating an average value of data of a preset number of historical service points aiming at the current service point in the second data;
compressing the data of the historical service point in a first preset time interval before the current service point in the second data to obtain data of a second preset time interval;
taking out the data of the preset number of service points behind the current service point from the data of the second preset time interval;
comparing the data of the preset number of historical service points for the current service point in the second data with the average value to obtain difference values;
and taking the sum of the data of the preset number of service points behind the current service point and the corresponding difference value as the data of the preset number of future service points of the current service point.
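One possible reading of the five steps above is sketched below. Several points are assumptions rather than statements of the source: "compressing" is taken to mean averaging each time-of-day slot across the available days into a one-day profile, the series is assumed to start at slot 0 of a day, and the difference is taken as each recent value minus the recent average. The function name `project_future_points` is hypothetical.

```python
from statistics import mean


def project_future_points(series, points_per_day, k):
    """Project k future points from a denoised history (illustrative sketch)."""
    if len(series) < points_per_day or not 0 < k <= len(series):
        raise ValueError("not enough history for the requested projection")
    # Step 1: average of the last k historical points before/at the current point.
    recent = series[-k:]
    recent_avg = mean(recent)
    # Step 2 (assumption): compress the history into a one-day profile by
    # averaging the values that share the same slot of the day.
    profile = [mean(series[slot::points_per_day]) for slot in range(points_per_day)]
    # Step 3: take the k profile points that follow the current slot.
    current_slot = (len(series) - 1) % points_per_day
    following = [profile[(current_slot + 1 + i) % points_per_day] for i in range(k)]
    # Step 4 (assumption): difference of each recent point from the recent average.
    diffs = [value - recent_avg for value in recent]
    # Step 5: future point = following profile point + corresponding difference.
    return [f + d for f, d in zip(following, diffs)]


# Two days of four points each; project two points past the end.
future = project_future_points([1, 2, 3, 4, 3, 4, 5, 6], points_per_day=4, k=2)
```

The projected points would then be appended to the second data before the baselines are computed.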
Optionally, the workday and holiday baseline obtaining subunit is specifically configured to:
extracting working day data from the merged data, and generating a working day baseline by using an STL algorithm;
and extracting holiday data from the merged data, and generating a holiday baseline by using a moving smoothing algorithm.
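The text does not specify which moving smoothing algorithm is used for holiday data. A minimal sketch, assuming a centered moving average whose window shrinks at the edges, is given below; the function name and the default window are hypothetical. The STL decomposition for working days is not reproduced here; a library implementation such as the `STL` class in statsmodels could play that role.

```python
def moving_smooth(values, window=3):
    """Centered moving average; the window shrinks near both ends (assumption)."""
    if window < 1:
        raise ValueError("window must be positive")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


holiday_baseline = moving_smooth([1, 2, 9, 2, 1], window=3)
```

The spike at the third point is spread over its neighbours, which is the intended effect of smoothing before the value is used as a baseline.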
Optionally, the abnormal point determining unit includes:
a parameter obtaining subunit, configured to obtain, according to the baseline, a current service point residual error, an anomaly detection parameter, and a feedback adjustment parameter;
and an abnormal point judging subunit, configured to judge whether the current service point is an abnormal point at least according to the obtained residual of the service point and one of: the variance of the historical residuals, the abnormality detection parameter, and the feedback adjustment parameter.
Optionally, the abnormal point determining subunit is specifically configured to:
if the residual of the current service point satisfies the formula (x - mean) > std × (n + delta), judging that the current service point is an abnormal point;
wherein x is the residual of the current service point; mean is the mean of the historical residuals; std is the variance of the historical residuals; n is the abnormality detection parameter; and delta is the feedback adjustment parameter.
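The threshold rule can be written out directly. In the sketch below the function name is hypothetical, and `std` is computed as the population standard deviation of the historical residuals; that choice is an assumption, since the text calls the quantity a variance while naming it std.

```python
from statistics import mean, pstdev


def exceeds_threshold(x, historical_residuals, n, delta=0.0):
    """Apply (x - mean) > std * (n + delta) to the current residual x."""
    m = mean(historical_residuals)
    # Assumption: std is the population standard deviation of past residuals.
    s = pstdev(historical_residuals)
    return (x - m) > s * (n + delta)


history = [0.0, 0.0, 2.0, 2.0]  # mean 1.0, standard deviation 1.0
flagged = exceeds_threshold(4.5, history, n=3)
```

As written, the rule is one-sided: only residuals above the mean can trigger it. A positive delta raises the bar, so a point flagged with delta = 0 may pass with delta = 1.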
Optionally, the obtaining an abnormality detection parameter according to the baseline includes:
obtaining the variation of the mean and the variance of the residuals according to the baseline and the first data;
determining a time interval according to the variation of the mean and the variance of the residuals;
obtaining a residual sequence in the time interval according to the baseline and the first data;
determining a confidence level of the time interval;
and performing Gaussian processing on the residual sequence according to the confidence level to obtain the abnormality detection parameter of the time interval.
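The text does not define "Gaussian processing". One plausible interpretation, shown here purely as an assumption, is to fit a normal distribution to the residual sequence and take the abnormality detection parameter n as the two-sided standard-normal quantile for the chosen confidence level. Both function names are hypothetical.

```python
from statistics import NormalDist


def detection_parameter(confidence):
    """Two-sided standard-normal quantile for a confidence level in (0, 1)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def gaussian_threshold(residuals, confidence):
    """Fit a normal distribution to the residuals and return (n, upper bound)."""
    fitted = NormalDist.from_samples(residuals)
    n = detection_parameter(confidence)
    return n, fitted.mean + n * fitted.stdev


n_value, upper = gaussian_threshold([-1.0, 0.0, 1.0], confidence=0.95)
```

Under this reading a confidence level of about 0.9973 corresponds to n close to 3, the familiar three-sigma rule.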
Optionally, the obtaining a feedback adjustment parameter according to the baseline includes:
and determining feedback adjusting parameters according to the percentile of effective alarm and the percentile of ineffective alarm in the time interval.
Optionally, determining the feedback adjustment parameter according to the percentile of effective alarms and the percentile of ineffective alarms in the time interval includes:
determining the percentile x of effective alarms and the percentile y of ineffective alarms in the time interval;
when x > -y, delta = y × exp(y/x - 1);
when x > y, delta = x/(1 + exp(-y/x));
wherein delta is the feedback adjustment parameter.
Optionally, the abnormal point determining unit includes:
a current service point residual error subunit, configured to obtain a current service point residual error according to the baseline;
an extreme residual error judging subunit, configured to judge whether the current service point residual error is an extreme residual error;
and an abnormal point determining subunit, configured to determine the service point as an abnormal point when the output of the extreme residual error judging subunit is yes.
Optionally, the extreme residual error determining subunit is specifically configured to:
selecting, based on the baseline, a window containing the residuals of the current time point and of a third preset time interval before the current time point as a first sliding window;
selecting, based on the baseline, a window containing the residuals of the current time point and of a fourth preset time interval before the current time point as a second sliding window;
and judging whether the residual of the current service point is an extreme residual according to the mean and variance of the residual sequence of the first sliding window and the mean and variance of the residual sequence of the second sliding window.
Optionally, the determining whether the current service point residual is an extreme residual according to the residual sequence mean and the variance of the first sliding window and the residual sequence mean and the variance of the second sliding window includes:
if the following formula is satisfied: 0.499 ═ 0.5 ═ erf ((W_1_mean; W_2_mean)/(W_1_std ═ W_2_std)), then the residual of the current service point is judged to be an extreme residual;
wherein W_1_mean is the mean of the residual sequence of the first sliding window;
W_2_mean is the mean of the residual sequence of the second sliding window;
W_1_std is the variance of the residual sequence of the first sliding window;
W_2_std is the variance of the residual sequence of the second sliding window.
Optionally, the abnormal point determining unit includes:
obtaining, according to the baseline, a sliding window for the current time point data whose length is a fifth preset time interval;
calculating a characteristic value of the sliding window;
performing logistic regression detection according to the characteristic value;
and determining whether the service point is an abnormal point according to the result of the logistic regression detection.
It should be noted that, for the detailed description of the service anomaly detection apparatus provided in the second embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not described here again.
Corresponding to the above-mentioned service anomaly detection method, a third embodiment of the present application further provides an electronic device.
As shown in fig. 4, the electronic apparatus includes:
a processor 401; and
a memory 402 for storing a program of the service anomaly detection method, wherein after the device is powered on and the program of the service anomaly detection method is run by the processor, the following steps are executed:
obtaining first data; the first data comprises historical traffic data;
obtaining a baseline containing the future trend of the business data according to the first data;
and predicting whether the current service point is an abnormal point or not at least according to the baseline.
Optionally, the first data includes: weekday data and holiday data.
Optionally, the electronic device further performs the following steps:
denoising the first data to obtain second data;
the obtaining a baseline containing a future trend of the business data according to the first data comprises: and obtaining a baseline containing the future trend of the business data according to the second data.
Optionally, the obtaining a baseline including a future trend of the business data according to the second data includes:
mapping the average difference value of the data of the historical service points of the preset number aiming at the current service point in the second data to the future service points of the preset number;
merging the data of the future service points with the preset number with the second data to obtain merged data;
obtaining a working day baseline and a holiday baseline according to the merged data;
and combining the working day baseline and the holiday baseline to obtain the baseline.
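The split-and-combine part of these steps can be sketched as follows. The helper names are hypothetical, counting weekends as holidays is an assumption, and the raw values stand in for the output of the baseline algorithms, which are not reproduced here.

```python
from datetime import date


def split_by_day_type(points, holidays):
    """Split (day, value) pairs into working-day and holiday groups."""
    workday, holiday = {}, {}
    for day, value in points:
        # Assumption: weekends count as holidays alongside the listed dates.
        if day in holidays or day.weekday() >= 5:
            holiday[day] = value
        else:
            workday[day] = value
    return workday, holiday


def merge_baselines(workday_baseline, holiday_baseline):
    """Combine two baselines keyed by day into one time-ordered list."""
    overlap = workday_baseline.keys() & holiday_baseline.keys()
    if overlap:
        raise ValueError("a day cannot be both a working day and a holiday")
    merged = {**workday_baseline, **holiday_baseline}
    return sorted(merged.items())


points = [
    (date(2018, 9, 17), 10.0),  # Monday
    (date(2018, 9, 22), 4.0),   # Saturday
    (date(2018, 9, 24), 3.0),   # Monday, listed as a holiday for this example
]
work, rest = split_by_day_type(points, holidays={date(2018, 9, 24)})
baseline = merge_baselines(work, rest)
```

Keeping the two groups separate until the final merge lets each one be smoothed with the algorithm suited to it.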
Optionally, the mapping, to a preset number of future service points, an average difference of data of a preset number of historical service points for the current service point in the second data, includes:
calculating an average value of data of a preset number of historical service points aiming at the current service point in the second data;
compressing the data of the historical service point in a first preset time interval before the current service point in the second data to obtain data of a second preset time interval;
taking out the data of the preset number of service points behind the current service point from the data of the second preset time interval;
comparing the data of the preset number of historical service points for the current service point in the second data with the average value to obtain difference values;
and taking the sum of the data of the preset number of service points behind the current service point and the corresponding difference value as the data of the preset number of future service points of the current service point.
Optionally, obtaining a working day baseline and a holiday baseline according to the merged data includes:
extracting working day data from the merged data, and generating a working day baseline by using an STL algorithm;
and extracting holiday data from the merged data, and generating a holiday baseline by using a moving smoothing algorithm.
Optionally, predicting whether the current service point is an abnormal point according to the baseline includes:
obtaining a current service point residual error, an abnormal detection parameter and a feedback adjustment parameter according to the base line;
and judging whether the current service point is an abnormal point or not according to the current service point residual error, the historical residual error mean value, the variance, the abnormal detection parameter and the feedback regulation parameter.
Optionally, the determining whether the current service point is an abnormal point according to the current service point residual error, the historical residual error mean, the variance, the abnormal detection parameter, and the feedback adjustment parameter includes:
if the residual of the current service point satisfies the formula (x - mean) > std × (n + delta), judging that the current service point is an abnormal point;
wherein x is the residual of the current service point; mean is the mean of the historical residuals; std is the variance; n is the abnormality detection parameter; and delta is the feedback adjustment parameter.
Optionally, the obtaining an abnormality detection parameter according to the baseline includes:
obtaining the variation of the mean and the variance of the residuals according to the baseline and the first data;
determining a time interval according to the variation of the mean and the variance of the residuals;
obtaining a residual sequence in the time interval according to the baseline and the first data;
determining a confidence level of the time interval;
and performing Gaussian processing on the residual sequence according to the confidence level to obtain the abnormality detection parameter of the time interval.
Optionally, the obtaining a feedback adjustment parameter according to the baseline includes:
and determining feedback adjusting parameters according to the percentile of effective alarm and the percentile of ineffective alarm in the time interval.
Optionally, determining the feedback adjustment parameter according to the percentile of effective alarms and the percentile of ineffective alarms in the time interval includes:
determining the percentile x of effective alarms and the percentile y of ineffective alarms in the time interval;
when x > -y, delta = y × exp(y/x - 1);
when x > y, delta = x/(1 + exp(-y/x));
wherein delta is the feedback adjustment parameter.
Optionally, predicting whether the current service point is an abnormal point according to the baseline includes:
obtaining a current service point residual error according to the base line;
judging whether the residual error of the current service point is an extreme residual error;
and if so, determining the service point as an abnormal point.
Optionally, the determining whether the current service point residual is an extreme residual includes:
selecting, based on the baseline, a window containing the residuals of the current time point and of a third preset time interval before the current time point as a first sliding window;
selecting, based on the baseline, a window containing the residuals of the current time point and of a fourth preset time interval before the current time point as a second sliding window;
and judging whether the residual of the current service point is an extreme residual according to the mean and variance of the residual sequence of the first sliding window and the mean and variance of the residual sequence of the second sliding window.
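The window selection can be sketched as below. The function name and the two length parameters are hypothetical, and the population variance is an assumed choice. The sketch stops at the window statistics and does not attempt the erf-based decision rule.

```python
from statistics import mean, pvariance


def two_window_stats(residuals, first_len, second_len):
    """Mean and variance of two windows that both end at the current point."""
    if max(first_len, second_len) > len(residuals) or min(first_len, second_len) < 1:
        raise ValueError("window lengths must fit inside the residual sequence")
    first = residuals[-first_len:]    # current point plus the shorter look-back
    second = residuals[-second_len:]  # current point plus the longer look-back
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))


w1, w2 = two_window_stats([0.0, 0.0, 0.0, 0.0, 4.0, 4.0], first_len=2, second_len=6)
```

A large gap between the short-window mean and the long-window mean, relative to their spread, is the kind of signal the extreme-residual test looks for.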
Optionally, the determining whether the current service point residual is an extreme residual according to the residual sequence mean and the variance of the first sliding window and the residual sequence mean and the variance of the second sliding window includes:
if the following formula is satisfied: 0.499 ═ 0.5 ═ erf ((W_1_mean; W_2_mean)/(W_1_std ═ W_2_std)), then the residual of the current service point is judged to be an extreme residual;
wherein W_1_mean is the mean of the residual sequence of the first sliding window;
W_2_mean is the mean of the residual sequence of the second sliding window;
W_1_std is the variance of the residual sequence of the first sliding window;
W_2_std is the variance of the residual sequence of the second sliding window.
Optionally, predicting whether the current service point is an abnormal point according to the baseline includes:
obtaining, according to the baseline, a sliding window for the current time point data whose length is a fifth preset time interval;
calculating a characteristic value of the sliding window;
performing logistic regression detection according to the characteristic value;
and determining whether the service point is an abnormal point according to the result of the logistic regression detection.
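The detection step amounts to evaluating a trained logistic model on the window's feature values. The sketch below is illustrative: the weights, bias, and the 0.5 cutoff are hypothetical values, not parameters given in the text.

```python
import math


def logistic_probability(features, weights, bias):
    """Probability from a logistic model, computed in a numerically stable way."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow for large negative z
    return ez / (1.0 + ez)


def is_abnormal(features, weights, bias, cutoff=0.5):
    """Flag the point when the predicted probability reaches the cutoff."""
    return logistic_probability(features, weights, bias) >= cutoff


# Hypothetical trained parameters for a single-feature model.
p = logistic_probability([3.0], weights=[2.0], bias=-1.0)
```

In practice the weights and bias would be the stored parameters from the training procedure of the first embodiment.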
It should be noted that, for the detailed description of the electronic device provided in the third embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated here.
Corresponding to the above-mentioned service anomaly detection method, a fourth embodiment of the present application further provides a storage device
storing a program of the service anomaly detection method, the program being run by a processor to perform the following steps:
obtaining first data; the first data comprises historical traffic data;
obtaining a baseline containing the future trend of the business data according to the first data;
and predicting whether the current service point is an abnormal point or not at least according to the baseline.
It should be noted that, for the detailed description of the storage device provided in the fourth embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not described here again.
Although the present invention has been described with reference to the preferred embodiments, it should be understood that the scope of the present invention is not limited to the embodiments described above, and that various changes and modifications may be made by one skilled in the art without departing from the spirit and scope of the present invention.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Claims (18)
1. A method for detecting service abnormality is characterized by comprising the following steps:
obtaining first data; the first data comprises historical traffic data;
obtaining a baseline containing the future trend of the business data according to the first data;
and predicting whether the current service point is an abnormal point or not at least according to the baseline.
2. The method of claim 1, wherein the first data comprises: weekday data and holiday data.
3. The method of claim 2, further comprising:
denoising the first data to obtain second data;
and obtaining a baseline containing the future trend of the business data according to the second data.
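4. The method of claim 3, wherein obtaining a baseline containing the future trend of the business data according to the second data comprises: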
mapping the average difference value of the data of the historical service points of the preset number aiming at the current service point in the second data to the future service points of the preset number;
merging the data of the future service points with the preset number with the second data to obtain merged data;
obtaining a working day baseline and a holiday baseline according to the merged data;
and combining the working day baseline and the holiday baseline to obtain the baseline.
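5. The method of claim 4, wherein mapping the average difference value of the data of the preset number of historical service points for the current service point in the second data to the preset number of future service points comprises: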
calculating an average value of data of a preset number of historical service points aiming at the current service point in the second data;
compressing the data of the historical service point in a first preset time interval before the current service point in the second data to obtain data of a second preset time interval;
taking out the data of the preset number of service points behind the current service point from the data of the second preset time interval;
comparing the data of the preset number of historical service points for the current service point in the second data with the average value to obtain difference values;
and taking the sum of the data of the preset number of service points behind the current service point and the corresponding difference value as the data of the preset number of future service points of the current service point.
6. The method of claim 4, wherein deriving a weekday baseline and a holiday baseline from the merged data comprises:
extracting working day data from the merged data, and generating a working day baseline by using an STL algorithm;
and extracting holiday data from the merged data, and generating a holiday baseline by using a moving smoothing algorithm.
7. The method of claim 1, wherein predicting whether a current service point is an outlier based on the baseline comprises:
obtaining a current service point residual error, an abnormal detection parameter and a feedback adjustment parameter according to the base line;
and judging whether the current service point is an abnormal point at least according to the obtained residual of the service point and one of: the variance of the historical residuals, the abnormality detection parameter, and the feedback adjustment parameter.
8. The method of claim 7, wherein judging whether the current service point is an abnormal point at least according to the obtained residual of the service point and one of the variance of the historical residuals, the abnormality detection parameter, and the feedback adjustment parameter comprises:
if the residual of the current service point satisfies the formula (x - mean) > std × (n + delta), judging that the current service point is an abnormal point;
wherein x is the residual of the current service point; mean is the mean of the historical residuals; std is the variance of the historical residuals; n is the abnormality detection parameter; and delta is the feedback adjustment parameter.
9. The method of claim 7, wherein said deriving anomaly detection parameters from said baseline comprises:
obtaining the variation of the mean and the variance of the residuals according to the baseline and the first data;
determining a time interval according to the variation of the mean and the variance of the residuals;
obtaining a residual sequence in the time interval according to the baseline and the first data;
determining a confidence level of the time interval;
and performing Gaussian processing on the residual sequence according to the confidence level to obtain the abnormality detection parameter of the time interval.
10. The method of claim 9, wherein said deriving feedback adjustment parameters from said baseline comprises:
and determining feedback adjusting parameters according to the percentile of effective alarm and the percentile of ineffective alarm in the time interval.
11. The method of claim 10, wherein determining the feedback adjustment parameter according to the percentile of effective alarms and the percentile of ineffective alarms in the time interval comprises:
determining the percentile x of effective alarms and the percentile y of ineffective alarms in the time interval;
when x > -y, delta = y × exp(y/x - 1);
when x > y, delta = x/(1 + exp(-y/x));
wherein delta is the feedback adjustment parameter.
12. The method of claim 1, wherein predicting whether a current service point is an outlier based on the baseline comprises:
obtaining a current service point residual error according to the base line;
judging whether the residual error of the current service point is an extreme residual error;
and if so, determining the service point as an abnormal point.
13. The method of claim 12, wherein the determining whether the current service point residual is an extreme residual comprises:
selecting, based on the baseline, a window containing the residuals of the current time point and of a third preset time interval before the current time point as a first sliding window;
selecting, based on the baseline, a window containing the residuals of the current time point and of a fourth preset time interval before the current time point as a second sliding window;
and judging whether the residual of the current service point is an extreme residual according to the mean and variance of the residual sequence of the first sliding window and the mean and variance of the residual sequence of the second sliding window.
14. The method of claim 13, wherein determining whether the current service point residual is an extreme residual according to the mean and the variance of the residual sequence of the first sliding window and the mean and the variance of the residual sequence of the second sliding window comprises:
if the following formula is satisfied: 0.499 ═ 0.5 ═ erf ((W_1_mean; W_2_mean)/(W_1_std ═ W_2_std)), then the residual of the current service point is judged to be an extreme residual;
wherein W_1_mean is the mean of the residual sequence of the first sliding window;
W_2_mean is the mean of the residual sequence of the second sliding window;
W_1_std is the variance of the residual sequence of the first sliding window;
W_2_std is the variance of the residual sequence of the second sliding window.
15. The method of claim 1, wherein predicting whether a current service point is an outlier based on the baseline comprises:
obtaining, according to the baseline, a sliding window for the current time point data whose length is a fifth preset time interval;
calculating a characteristic value of the sliding window;
performing logistic regression detection according to the characteristic value;
and determining whether the service point is an abnormal point according to the result of the logistic regression detection.
16. A traffic anomaly detection apparatus, comprising:
a first data obtaining unit for obtaining first data; the first data comprises historical traffic data;
a baseline obtaining unit, configured to obtain a baseline containing the future trend of the business data according to the first data;
and an abnormal point judging unit, configured to predict whether the current service point is an abnormal point at least according to the baseline.
17. An electronic device, comprising:
a processor; and
a memory for storing a program of the service anomaly detection method, wherein after the device is powered on and the program of the service anomaly detection method is run by the processor, the following steps are executed:
obtaining first data; the first data comprises historical traffic data;
obtaining a baseline containing the future trend of the business data according to the first data;
and predicting whether the current service point is an abnormal point or not at least according to the baseline.
18. A storage device, characterized in that
it stores a program of the service anomaly detection method, the program being run by a processor to perform the following steps:
obtaining first data; the first data comprises historical traffic data;
obtaining a baseline containing the future trend of the business data according to the first data;
and predicting whether the current service point is an abnormal point or not at least according to the baseline.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811081466.3A CN110909306B (en) | 2018-09-17 | 2018-09-17 | Business abnormality detection method and device, electronic equipment and storage equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110909306A | 2020-03-24 |
CN110909306B | 2023-06-16 |
Family
ID=69813481
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811081466.3A Active CN110909306B (en) | 2018-09-17 | 2018-09-17 | Business abnormality detection method and device, electronic equipment and storage equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110909306B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104731816A (en) * | 2013-12-23 | 2015-06-24 | 阿里巴巴集团控股有限公司 | Method and device for processing abnormal business data |
CN105406991A (en) * | 2015-10-26 | 2016-03-16 | 上海华讯网络系统有限公司 | Method and system for generating service threshold by historical data based on network monitoring indexes |
CN107528722A (en) * | 2017-07-06 | 2017-12-29 | 阿里巴巴集团控股有限公司 | Abnormal point detecting method and device in a kind of time series |
US20180176241A1 (en) * | 2016-12-21 | 2018-06-21 | Hewlett Packard Enterprise Development Lp | Abnormal behavior detection of enterprise entities using time-series data |
CN108269189A (en) * | 2017-07-05 | 2018-07-10 | 中国中投证券有限责任公司 | Achievement data monitoring method, device, storage medium and computer equipment |
Non-Patent Citations (2)
Title |
---|
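CLAUDIO MARTELLA et al.: "Visualizing, clustering, and predicting the behavior of museum visitors" * |
Liu Jinzhao; Zhou Yuezhi; Zhang Yaoxue: "Method for detecting abnormal load of online cloud computing services based on wavelet analysis" * |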
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111523084A (en) * | 2020-04-09 | 2020-08-11 | 京东方科技集团股份有限公司 | Service data prediction method and device, electronic equipment and computer readable storage medium |
CN111898094A (en) * | 2020-07-06 | 2020-11-06 | 广州市吉华勘测股份有限公司 | Method and device for processing foundation pit monitoring data, electronic equipment and storage medium |
CN113391982A (en) * | 2021-08-17 | 2021-09-14 | 云智慧(北京)科技有限公司 | Monitoring data anomaly detection method, device and equipment |
CN115017211A (en) * | 2022-06-15 | 2022-09-06 | 平安国际融资租赁有限公司 | Method and device for determining abnormality detection object, storage medium and computer equipment |
CN116702081A (en) * | 2023-08-07 | 2023-09-05 | 西安格蒂电力有限公司 | Intelligent inspection method for power distribution equipment based on artificial intelligence |
CN116702081B (en) * | 2023-08-07 | 2023-10-24 | 西安格蒂电力有限公司 | Intelligent inspection method for power distribution equipment based on artificial intelligence |
Also Published As
Publication number | Publication date |
---|---|
CN110909306B (en) | 2023-06-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110909306B (en) | | Business abnormality detection method and device, electronic equipment and storage equipment |
US7836111B1 (en) | | Detecting change in data |
CN110874674B (en) | | Abnormality detection method, device and equipment |
JP6501982B2 (en) | | Failure risk index estimation device and failure risk index estimation method |
CN113570396A (en) | | Time series data abnormity detection method, device, equipment and storage medium |
CN110008080A (en) | | Operational indicator method for detecting abnormality, device and electronic equipment based on time series |
CN105593864B (en) | | Analytical device degradation for maintenance device |
US9600391B2 (en) | | Operations management apparatus, operations management method and program |
US7505868B1 (en) | | Performing quality determination of data |
JP2021081975A (en) | | Accounting information processor, accounting information processing method and accounting information processing program |
JP2019105927A (en) | | Failure probability calculation device, failure probability calculation method and program |
CN109684320B (en) | | Method and equipment for online cleaning of monitoring data |
CN111897851A (en) | | Abnormal data determination method and device, electronic equipment and readable storage medium |
CN109213651A (en) | | A kind of object monitor method and device, electronic equipment |
CN108959415B (en) | | Abnormal dimension positioning method and device and electronic equipment |
CN112380073A (en) | | Fault position detection method and device and readable storage medium |
CN117494030A (en) | | Abnormal event identification method and related device based on distributed optical fiber acoustic wave sensing |
US20240020436A1 (en) | | Automated data quality monitoring and data governance using statistical models |
JP2020181443A (en) | | Abnormality detection apparatus, abnormality detection method, and computer program |
CN110020744A (en) | | Dynamic prediction method and its system |
CN112988536B (en) | | Data anomaly detection method, device, equipment and storage medium |
CN114328662A (en) | | Abnormal data positioning method and device, electronic equipment and storage medium |
CN112529315A (en) | | Landslide prediction method, device, equipment and storage medium |
Bossons | | The effects of parameter misspecification and non-stationarity on the applicability of adaptive forecasts |
CN112291297B (en) | | Information data processing method, device, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||