CN112000830A - Time sequence data detection method and device - Google Patents

Info

Publication number
CN112000830A
Authority
CN
China
Prior art keywords
image
time sequence
detected
data
sequence data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010869594.5A
Other languages
Chinese (zh)
Inventor
陈欢欢
黄威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN202010869594.5A
Publication of CN112000830A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a time sequence data detection method and device. Time sequence data to be detected is converted into an image to be detected, and the image is classified by the discriminator of a pre-constructed generative adversarial network (GAN) to obtain the discriminator's output result. The time sequence data to be detected is determined to be abnormal when the output result is greater than a preset threshold, and normal when the output result is not greater than the preset threshold. Because the generative adversarial network is an unsupervised model, training it on images requires no simulated labels for abnormal data, which effectively reduces training time and speeds up detection, and the accuracy of anomaly detection on time sequence data can be improved even when abnormal data samples are lacking. Compared with the prior art, the scheme provided by the application can therefore effectively capture the temporal correlation of time sequence data, with higher detection efficiency and more accurate detection results.

Description

Time sequence data detection method and device
Technical Field
The present disclosure relates to the field of time series data processing, and in particular, to a method and an apparatus for detecting time series data.
Background
With the development of society and the progress of science and technology, human society is becoming ever more informatized, and the data generated by information systems is growing geometrically. Many companies and factories generate a large amount of time series data (also referred to as time sequence data) every day, such as passenger flow data, stock data, sales data, and web logs. Time series data detection serves as an auxiliary means of judging the state of the data. It aims to find abnormal data in time, so that a user can promptly take measures to eliminate or weaken the abnormality, reduce the resulting loss, and ensure the normal operation of each service. Abnormal data can be observed from multiple dimensions: for example, data may be regarded as abnormal when its value exceeds a preset threshold range, or when the frequency with which its value changes exceeds a preset range.
Time series data is dynamic data with various states, trends, and periodic variations, and therefore has strong temporal correlation. For example, passenger flow data and sales data have obvious seasonal correlation and show obvious periodicity. However, existing time series data detection methods have difficulty capturing the dependency of time series data along the time dimension. In addition, most existing methods rely on manually simulated labels for abnormal data to classify data as normal or abnormal. Because the states of time series data are diverse and the possible abnormal states cannot all be anticipated, manually simulating labels for abnormal data involves a huge workload and is very difficult to implement, which greatly reduces detection efficiency. Moreover, existing methods cannot obtain enough abnormal data labels to train the classification model (which normally uses both normal and abnormal data labels as training samples), so the accuracy of the classification model's detection results is low.
Disclosure of Invention
The application provides a time sequence data detection method and a time sequence data detection device, and aims to provide an effective detection method for time sequence data so as to solve the problems that the time correlation of the time sequence data is difficult to capture, the detection efficiency is low, and the accuracy of a detection result is not high in the conventional time sequence data detection method.
In order to achieve the above object, the present application provides the following technical solutions:
a method for detecting time series data is characterized by comprising the following steps:
converting time sequence data to be detected into an image to be detected, wherein any pixel in the image to be detected is the inner product of two pieces of data in the time sequence data to be detected;
classifying the image to be detected by using the discriminator of a pre-constructed generative adversarial network (GAN) to obtain an output result of the discriminator; the generative adversarial network is obtained by training on multiple frames of sample images, and the sample images are obtained by converting a plurality of groups of pre-collected sample time sequence data;
determining that the time sequence data to be detected is abnormal under the condition that the output result is greater than a preset threshold value;
and determining that the time sequence data to be detected is normal under the condition that the output result is not greater than the preset threshold value.
Optionally, the method further includes:
acquiring a plurality of groups of target time sequence data belonging to the same time sequence as the time sequence data to be detected; the sequence positions of the multiple groups of target time sequence data and of the time sequence data to be detected are consecutive in the time sequence; the target time sequence data precedes the time sequence data to be detected in the time sequence;
converting the plurality of groups of target time sequence data into a plurality of frames of non-to-be-detected images, wherein any pixel in a non-to-be-detected image is the inner product of two pieces of data in the target time sequence data;
inputting the plurality of frames of non-to-be-detected images into a pre-constructed prediction model to obtain a target image output by the prediction model, wherein an association exists between the target image and the image to be detected, the association indicating that the time sequence data corresponding to the target image and to the image to be detected belong to the same time sequence;
and comparing the pixels of the image to be detected and the target image, and correcting the output result according to the comparison result.
Optionally, the comparing the pixels of the image to be detected and the target image, and correcting the output result according to the comparison result includes:
calculating the difference value between the first numerical value and the second numerical value to obtain a prediction error; the first numerical value is the total number of pixels of the image to be detected, and the second numerical value is the total number of pixels of the target image;
and calculating a sum of the prediction error and the output result, and using the sum as a final value of the output result.
Optionally, the construction process of the prediction model includes:
obtaining a plurality of time series samples in advance, wherein the time series samples comprise a plurality of groups of sample time series data;
converting a plurality of groups of sample time sequence data into a plurality of frames of sample images, wherein any pixel in the sample images is an inner product of two pieces of data in the sample time sequence data;
taking a first image and a second image in each image sequence as training samples of a preset initial prediction model to be input, training the initial prediction model until the output result of the initial prediction model is a third image, and taking the initial prediction model obtained by training as the prediction model; the image sequence comprises n frames of sample images, wherein n is a positive integer greater than 1, and the n frames of sample images belong to one time sequence sample; the first image is a first n-1 frame sample image in each image sequence, the second image is an nth frame sample image in each image sequence, and the similarity between the third image and the second image meets a preset condition.
A time series data detection apparatus comprising:
the first conversion unit is used for converting the time sequence data to be detected into an image to be detected, and any one pixel in the image to be detected is the inner product of two pieces of data in the time sequence data to be detected;
the classification unit is used for classifying the image to be detected by using the discriminator of a pre-constructed generative adversarial network (GAN) to obtain an output result of the discriminator; the generative adversarial network is obtained by training on multiple frames of sample images, and the sample images are obtained by converting a plurality of groups of pre-collected sample time sequence data;
the determining unit is used for determining that the time sequence data to be detected is abnormal under the condition that the output result is greater than a preset threshold value; and determining that the time sequence data to be detected is normal under the condition that the output result is not greater than the preset threshold value.
Optionally, the apparatus further includes:
the acquisition unit is used for acquiring a plurality of groups of target time sequence data belonging to the same time sequence as the time sequence data to be detected; the sequence positions of the multiple groups of target time sequence data and of the time sequence data to be detected are consecutive in the time sequence; the target time sequence data precedes the time sequence data to be detected in the time sequence;
the second conversion unit is used for converting the plurality of groups of target time sequence data into a plurality of frames of non-to-be-detected images, wherein any pixel in a non-to-be-detected image is the inner product of two pieces of data in the target time sequence data;
the prediction unit is used for inputting the plurality of frames of non-to-be-detected images into a pre-constructed prediction model to obtain a target image output by the prediction model, wherein an association exists between the target image and the image to be detected, the association indicating that the time sequence data corresponding to the target image and to the image to be detected belong to the same time sequence;
and the correcting unit is used for comparing the pixels of the image to be detected and the target image and correcting the output result according to the comparison result.
Optionally, the correction unit is specifically configured to:
calculating the difference value between the first numerical value and the second numerical value to obtain a prediction error; the first numerical value is the total number of pixels of the image to be detected, and the second numerical value is the total number of pixels of the target image; and calculating a sum of the prediction error and the output result, and using the sum as a final value of the output result.
Optionally, the process of the prediction unit pre-constructing the prediction model includes:
obtaining a plurality of time series samples in advance, wherein the time series samples comprise a plurality of groups of sample time series data; converting a plurality of groups of sample time sequence data into a plurality of frames of sample images, wherein any pixel in the sample images is an inner product of two pieces of data in the sample time sequence data; taking a first image and a second image in each image sequence as training samples of a preset initial prediction model to be input, training the initial prediction model until the output result of the initial prediction model is a third image, and taking the initial prediction model obtained by training as the prediction model; the image sequence comprises n frames of sample images, wherein n is a positive integer greater than 1, and the n frames of sample images belong to one time sequence sample; the first image is a first n-1 frame sample image in each image sequence, the second image is an nth frame sample image in each image sequence, and the similarity between the third image and the second image meets a preset condition.
A computer-readable storage medium including a stored program, wherein the program executes the time-series data detection method.
A time series data detecting apparatus, characterized by comprising: a processor, a memory, and a bus; the processor and the memory are connected through the bus;
the memory is used for storing programs, and the processor is used for running the programs, wherein the time sequence data detection method is executed when the programs run.
According to the above technical solution, time sequence data to be detected is converted into an image to be detected, and the image is classified by the discriminator of a pre-constructed generative adversarial network (GAN) to obtain the discriminator's output result. The time sequence data to be detected is determined to be abnormal when the output result is greater than a preset threshold, and normal when the output result is not greater than the preset threshold. The generative adversarial network is obtained by training on multiple frames of sample images, which are obtained by converting a plurality of groups of pre-collected sample time sequence data. Time sequence data is thus converted into an image representation, and the images are classified by the generative adversarial network. Because the generative adversarial network is an unsupervised model, training it on images requires no time-consuming simulation of abnormal data labels, which effectively reduces its training time and speeds up detection. As for temporal correlation, the time sequence data can be regarded as a set of vectors (that is, the individual data in the time sequence data). The inner product of two vectors indicates the linear relationship between them, and these pairwise linear relationships in effect express the temporal correlation of the time sequence data. Since a digital image is a matrix and each pixel of the converted image is such an inner product, using the image as the object of detection preserves the temporal correlation of the time sequence data.
In addition, owing to the characteristics of the generative adversarial network, the accuracy of anomaly detection on time sequence data can be ensured even when abnormal data samples are lacking.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic diagram of a time series data detection method according to an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of another time sequence data detection method according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a time sequence data detection apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As shown in fig. 1, a schematic diagram of a time series data detection method provided in an embodiment of the present application includes the following steps:
S101: A plurality of time series samples are collected in advance.
Each time series sample comprises a plurality of groups of sample time sequence data. Specifically, a group of sample time sequence data is $X = \{x_1, \dots, x_N\}$, that is, it includes N data, where N is a positive integer greater than 1. Accordingly, a time series sample is $Y = \{X_1, \dots, X_M\}$, that is, a sequence of M groups of sample time sequence data.
It should be noted that time sequence data is abnormal only in a few special cases, for example when a mechanical device fails or a network is attacked. Although such cases may cause abnormal time sequence data, the user cannot distinguish them manually (that is, cannot identify the abnormality in the time sequence data), so the pre-collected sample time sequence data is generally treated as normal data.
S102: The multiple groups of sample time sequence data under each time series sample are converted into multiple frames of sample images by using the Gramian Angular Field (GAF) algorithm.
Any pixel in a sample image is the inner product of two pieces of data in the sample time sequence data. The specific implementation of converting multiple groups of sample time sequence data into multiple frames of sample images with the Gramian Angular Field algorithm is common knowledge to those skilled in the art and is not described in detail here.
It should be noted that, because the Gramian Angular Field algorithm has several encoding methods, the converted sample images can be of several types, for example the Gramian Angular Summation Field (GASF) type, the Gramian Angular Difference Field (GADF) type, and the Markov Transition Field (MTF) type.
Specifically, the time sequence data is converted into an image as follows: the time sequence data is regarded as a set of vectors, the inner product of every pair of vectors in the set is calculated, a Gram matrix is constructed from these inner products, and the elements of the Gram matrix are used as the pixels of the image. Mathematically, the inner product indicates the linear relationship between two vectors, and the temporal correlation of the time sequence data can be regarded as such a linear relationship.
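The conversion described above can be sketched as follows. This is a minimal illustration of the commonly used summation (GASF) and difference (GADF) encodings, not the applicant's implementation: the rescaling to [-1, 1], the function names, and the list-of-rows image format are assumptions. In the summation encoding each pixel is cos(phi_i + phi_j), which equals x_i * x_j - sqrt(1 - x_i^2) * sqrt(1 - x_j^2) and so plays the role of the pairwise inner product.

```python
import math

def rescale(series):
    """Linearly rescale values to [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    if hi == lo:  # constant series: map every value to 0
        return [0.0 for _ in series]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]

def gramian_angular_field(series, kind="summation"):
    """Return the GAF image of a 1-D series as a list of rows."""
    # Clamp to guard against floating-point values just outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in rescale(series)]
    if kind == "summation":   # GASF: cos(phi_i + phi_j)
        return [[math.cos(a + b) for b in phi] for a in phi]
    if kind == "difference":  # GADF: sin(phi_i - phi_j)
        return [[math.sin(a - b) for b in phi] for a in phi]
    raise ValueError("kind must be 'summation' or 'difference'")
```

A group of N data yields an N x N image, so `gramian_angular_field([1, 2, 3, 4])` produces a 4 x 4 matrix that is symmetric in the summation encoding.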
S103: The multiple frames of sample images are input as training samples to an initial generative adversarial network (GAN), which is trained until the output result of its discriminator meets a preset requirement; the trained network is then used as the generative adversarial network.
The generative adversarial network comprises a discriminator and a generator, and the input of the generator follows a Gaussian distribution.
It should be noted that the specific training process of a generative adversarial network is common knowledge to those skilled in the art and is not described here.
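For orientation, the standard GAN training objective from the general literature can be written as below. It is not stated in this application and is given only as background: D denotes the discriminator, G the generator, x a sample image, and z the generator input, drawn here from a standard Gaussian to match the description above. How the anomaly score is derived from the discriminator's output is not specified in the text.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim \mathcal{N}(0, I)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```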
S104: collecting multi-frame sample images to construct an image sequence, taking a first image and a second image in each image sequence as training sample inputs of a preset initial prediction model, training the initial prediction model until the output result of the initial prediction model is a third image, and determining the initial prediction model obtained by training as a prediction model.
The image sequence comprises n frames of sample images, wherein n is a positive integer larger than 1, and the n frames of sample images belong to a time sequence sample. The first image is the first n-1 frame sample image in each image sequence, the second image is the nth frame sample image in each image sequence, and the similarity between the third image and the second image meets the preset condition.
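The way each image sequence is split into model input and training target can be sketched as follows. The helper names, the use of mean squared error as the similarity measure, and the tolerance value are assumptions made for illustration; the text only requires that the similarity between the third image and the second image meet a preset condition.

```python
def make_training_pair(image_sequence):
    """Split one image sequence into (first n-1 frames, n-th frame)."""
    if len(image_sequence) < 2:
        raise ValueError("an image sequence needs at least 2 frames")
    return image_sequence[:-1], image_sequence[-1]

def mean_squared_error(predicted, target):
    """Mean squared pixel difference between two equally sized images."""
    diffs = [
        (p - t) ** 2
        for predicted_row, target_row in zip(predicted, target)
        for p, t in zip(predicted_row, target_row)
    ]
    return sum(diffs) / len(diffs)

def is_similar_enough(predicted, target, tolerance=1e-2):
    """Hypothetical stopping condition: the predicted (third) image is
    close enough to the n-th (second) image."""
    return mean_squared_error(predicted, target) <= tolerance
```

Training would then repeat until `is_similar_enough` holds for the model's prediction on the first n-1 frames, whichever prediction model is chosen.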
It should be noted that the initial prediction model includes, but is not limited to: regression prediction models, trend extrapolation models, BP neural network prediction models, and the like.
S105: A plurality of groups of target time sequence data are acquired.
The target time sequence data is time sequence data belonging to the same time sequence as the time sequence data to be detected.
The sequence positions of the multiple groups of target time sequence data and of the time sequence data to be detected are consecutive in the time sequence, and the target time sequence data precedes the time sequence data to be detected.
It should be noted that the target time sequence data occurs before the time sequence data to be detected. In practical applications, using past time sequence data as a reference helps in understanding how the current time sequence data has changed.
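The relationship between the target groups and the group to be detected can be sketched as follows. Cutting the series into non-overlapping groups of equal size is an assumption (the text only requires consecutive sequence positions), and both function names are hypothetical.

```python
def split_into_groups(series, group_size):
    """Cut a series into consecutive, non-overlapping groups of equal size.
    Trailing values that do not fill a whole group are dropped."""
    usable = len(series) - len(series) % group_size
    return [series[i:i + group_size] for i in range(0, usable, group_size)]

def select_target_and_test(groups, num_targets):
    """Return (target groups, group to be detected).

    The last group is the one to be detected; the `num_targets` groups
    immediately before it are the target (reference) groups."""
    if num_targets < 1 or len(groups) < num_targets + 1:
        raise ValueError("need at least num_targets + 1 groups")
    return groups[-(num_targets + 1):-1], groups[-1]
```

For a series of ten values and a group size of three, the first three groups are kept; with `num_targets=2` the first two serve as reference and the third is the group under test.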
S106: The time sequence data to be detected is converted into an image to be detected, and the multiple groups of target time sequence data are converted into multiple frames of non-to-be-detected images.
Any pixel in the image to be detected is the inner product of two pieces of data in the time sequence data to be detected. Any pixel in a non-to-be-detected image is the inner product of two pieces of data in the target time sequence data.
It should be noted that, for a specific implementation process and an implementation principle of converting the time series data into the image, reference may be made to the explanation of S102, which is not described herein again.
S107: The image to be detected is classified by the discriminator of the generative adversarial network to obtain the output result of the discriminator.
The output result comprises an anomaly score, which indicates the probability that the image to be detected (that is, the time sequence data to be detected) is abnormal.
It should be noted that, as is known in the art, in most cases a user cannot tell whether time sequence data is abnormal, so the collected time sequence data is generally regarded as normal data. The generative adversarial network, however, is an unsupervised model: a discriminator that effectively distinguishes normal data from abnormal data can be obtained through training without manually collecting and labeling abnormal data. In other words, the network is trained without abnormal data as training samples, and the classification performance of the discriminator can be improved simply by training on a large number of normal samples (that is, normal data). The generative adversarial network is therefore suitable for anomaly detection on time sequence data: no labor is spent simulating abnormal data labels, detection efficiency is significantly improved, and the accuracy of abnormal data detection can be ensured.
S108: The multiple frames of non-to-be-detected images are input into the prediction model to obtain a target image output by the prediction model.
An association exists between the target image and the image to be detected; it indicates that the time sequence data corresponding to the target image and to the image to be detected belong to the same time sequence.
S109: The pixels of the image to be detected and of the target image are compared, and the output result (that is, the anomaly score) is corrected according to the comparison result.
Specifically, the total number of pixels of the image to be detected is obtained as a first numerical value, and the total number of pixels of the target image is obtained as a second numerical value. The difference between the first and second numerical values is calculated to obtain a prediction error, the sum of the prediction error and the output result is calculated, and this sum is used as the final value of the output result.
It should be noted that, in practical applications, if the time sequence data to be detected is abnormal, the total number of pixels of the image to be detected changes correspondingly, so the output result of the discriminator can be corrected with the help of the prediction error, improving the accuracy and robustness of the discriminator.
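The correction step can be sketched as follows. One point is an interpretation rather than something the text states: the "total number of pixels" is read here as the sum of the pixel values of each image, since two images of the same size would otherwise always give a zero difference. The function names are hypothetical, and the sign of the error follows the order given in the text (image to be detected minus target image).

```python
def pixel_total(image):
    """Sum of all pixel values in an image given as a list of rows
    (assumed reading of the 'total number of pixels')."""
    return sum(sum(row) for row in image)

def corrected_score(discriminator_output, test_image, target_image):
    """Add the prediction error to the discriminator's anomaly score."""
    prediction_error = pixel_total(test_image) - pixel_total(target_image)
    return discriminator_output + prediction_error
```

The corrected score is then compared with the preset threshold in place of the raw discriminator output.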
S110: Whether the corrected output result is greater than a preset threshold is judged.
If the corrected output result is greater than the preset threshold, S111 is executed; otherwise, S112 is executed.
S111: The time sequence data to be detected is determined to be abnormal.
S112: The time sequence data to be detected is determined to be normal.
In summary, the time sequence data to be detected is converted into the image to be detected, and the image to be detected is classified and judged by using the pre-constructed discriminator in the generation countermeasure network, so as to obtain the output result of the discriminator. And acquiring a plurality of groups of target time sequence data belonging to the same time sequence with the time sequence data to be detected. And converting the multiple groups of target time sequence data into multiple frames of images which are not to be tested, and inputting the multiple frames of images which are not to be tested into the prediction model to obtain the target image output by the prediction model. And comparing the pixels of the image to be detected and the target image, and correcting the output result according to the comparison result. And determining that the time sequence data to be detected is normal under the condition that the corrected output result is not greater than a preset threshold value. Therefore, time sequence data are converted into image representation, the generated countermeasure network is used for classifying and judging the images, the generated countermeasure network is an unsupervised model, the image training is used for generating the countermeasure network, time consumption is not needed for simulating an abnormal data label, the training time for generating the countermeasure network is effectively reduced, and the detection efficiency is accelerated. 
Furthermore, time series data can be regarded as a set of vectors. The inner product of two vectors indicates the linear relationship between them, and these pairwise linear relationships reflect the temporal correlation of the time series data. Since a digital image is a matrix and each pixel of the image is such an inner product, using the image as the detection object preserves the temporal correlation of the time series data. Owing to the characteristics of the generative adversarial network, the accuracy of time series anomaly detection can be ensured even when abnormal data samples are lacking. In addition, correcting the discriminator's output result with the prediction model can further improve the accuracy and robustness of the output result.
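The conversion of a group of time series data into an image whose pixels are inner products can be sketched as follows. This is a minimal illustration that assumes a group of time series data is given as a list of equal-length vectors. The names `inner_product` and `series_to_image` are hypothetical, and no scaling of the values to a pixel range is applied because the text does not specify one.

```python
from typing import List, Sequence


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def series_to_image(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the m x m matrix whose (i, j) pixel is <vectors[i], vectors[j]>.

    Assumption: one group of time series data is a list of m vectors.
    """
    return [[inner_product(u, v) for v in vectors] for u in vectors]


# Toy group of three 2-dimensional data points (illustrative values only).
group = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
image = series_to_image(group)
```

The resulting matrix is symmetric, and its off-diagonal entries record the pairwise linear relationships that the description associates with temporal correlation.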
The time series data detection method of the foregoing embodiment can be applied to a Van der Pol oscillator test scenario, to detect anomalies in time series data generated by a Van der Pol oscillator.
The time series data generated by the Van der Pol oscillator is discrete, and it is generated according to formula (1).
Formula (1): [equation image BDA0002650583730000101]
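As an illustration of how discrete Van der Pol data might be generated, the sketch below uses the standard continuous-time Van der Pol equation x'' - mu(1 - x^2)x' + x = 0, discretized with a forward-Euler step. This is an assumption and is not necessarily the same as formula (1). The function name `van_der_pol_series` and the parameters `mu`, `h`, `x0` and `y0` are hypothetical.

```python
from typing import List


def van_der_pol_series(steps: int, mu: float = 1.0, h: float = 0.01,
                       x0: float = 1.0, y0: float = 0.0) -> List[float]:
    """Generate `steps` samples of x from a forward-Euler Van der Pol model.

    State: x (position) and y = dx/dt (velocity).
    Assumed dynamics: dx/dt = y, dy/dt = mu * (1 - x^2) * y - x.
    """
    xs: List[float] = []
    x, y = x0, y0
    for _ in range(steps):
        xs.append(x)
        # Both updates use the old (x, y) because the tuple on the right-hand
        # side is evaluated before assignment.
        x, y = x + h * y, y + h * (mu * (1.0 - x * x) * y - x)
    return xs


series = van_der_pol_series(steps=5000)
```

A long series produced this way could then be cut into time series samples of n groups each, as the implementation steps describe.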
The specific implementation process of the detection of the time series data is as follows:
1. Generate a large amount of sample time series data using formula (1).
2. Randomly select multiple time series samples, each with a time span of w unit times (for example, with the hour as the unit time), from the large amount of time series data; each time series sample comprises n groups of sample time series data.
3. Convert all sample time series data into sample images.
4. Train a generative adversarial network using the multiple frames of sample images.
5. Treat the multiple frames of sample images under each time series sample as an image sequence, and train a prediction model using the first n-1 frames of sample images and the nth frame of sample image in each image sequence.
6. Generate a time series comprising n groups of time series data using formula (1); take the first n-1 groups as normal evidence and the nth group as the time series data to be detected.
7. Convert the first n-1 groups of time series data into n-1 frames of images, and convert the nth group into the nth frame of image.
8. Classify the nth frame of image using the discriminator of the generative adversarial network to obtain the discriminator's output result, which includes an anomaly score.
9. Input the n-1 frames of images into the prediction model to obtain the target image output by the prediction model.
10. Calculate the difference between the total number of pixels of the nth frame of image and the total number of pixels of the target image to obtain the prediction error.
11. Calculate the sum of the prediction error and the anomaly score, and take the sum as the final value of the output result.
12. If the output result is greater than 0.5, determine that the time series data to be detected is abnormal; otherwise, determine that it is normal.
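Steps 10 to 12 can be sketched as follows. One interpretation is assumed here and should be treated with caution: "total number of pixels" is read as the sum of the pixel values of an image, since two images of the same size would otherwise always give a zero difference. The difference is kept signed, as stated in step 10. The names `pixel_total`, `corrected_score` and `is_abnormal` are hypothetical.

```python
from typing import Sequence

Image = Sequence[Sequence[float]]


def pixel_total(image: Image) -> float:
    """Sum of all pixel values (assumed reading of 'total number of pixels')."""
    return sum(sum(row) for row in image)


def corrected_score(anomaly_score: float, test_image: Image,
                    target_image: Image) -> float:
    """Step 10: prediction error; step 11: add it to the anomaly score."""
    prediction_error = pixel_total(test_image) - pixel_total(target_image)
    return anomaly_score + prediction_error


def is_abnormal(score: float, threshold: float = 0.5) -> bool:
    """Step 12: abnormal only if the final output result exceeds the threshold."""
    return score > threshold


# Illustrative values only: nth-frame image, predicted target image,
# and a discriminator anomaly score of 0.2.
test_image = [[1.0, 2.0], [2.0, 4.0]]
target_image = [[1.0, 2.0], [2.0, 3.5]]
final_score = corrected_score(0.2, test_image, target_image)
abnormal = is_abnormal(final_score)
```

In this toy case the discriminator score alone would fall below the threshold, and the prediction error pushes the final value above it.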
In summary, a generative adversarial network is trained using the multiple frames of sample images, and its discriminator can effectively perform anomaly detection on the time series data generated by the Van der Pol oscillator. A prediction model is trained using the first n-1 frames of sample images and the nth frame of sample image in each image sequence. The prediction model outputs a target image, and the difference between the total number of pixels of the target image and that of the image to be detected is used to correct the discriminator's output result, which improves the accuracy and robustness of that result.
It should be noted that, in the foregoing embodiment, the process of collecting multiple time-series sequence samples in advance in S101 is an optional implementation manner of the time-series data detection method, and the time-series data detection may also be implemented without collecting time-series sequence samples in advance. In addition, in the above embodiment, the training process of the prediction model in S104 is also an optional implementation manner of the time series data detection method, and the time series data may also be detected without training the prediction model. In addition, in the above embodiment, the process of comparing the pixels of the image to be detected and the target image and correcting the output result according to the comparison result in S109 is an optional implementation manner of the time series data detection method, and the time series data detection can be implemented without correcting the output result of the discriminator. Therefore, the time series data detection method shown in the above embodiment can be summarized as the flow shown in fig. 2.
Fig. 2 is a schematic diagram of another time series data detection method provided in an embodiment of the present application, which includes the following steps:
S201: Convert the time series data to be detected into an image to be detected.
Any pixel in the image to be detected is the inner product of two pieces of data in the time series data to be detected.
S202: Classify the image to be detected using the discriminator of a pre-constructed generative adversarial network to obtain the discriminator's output result.
The generative adversarial network is trained on multiple frames of sample images, which are obtained by converting multiple groups of pre-collected sample time series data.
S203: Determine whether the output result is greater than a preset threshold.
If the output result is greater than the preset threshold, execute S204; otherwise, execute S205.
S204: Determine that the time series data to be detected is abnormal.
S205: Determine that the time series data to be detected is normal.
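The flow S201 to S205 can be sketched end to end as below. The discriminator is represented by an arbitrary callable that maps an image to a score. The `toy_discriminator` shown is a hypothetical placeholder used only to make the sketch runnable, and it is not a trained generative adversarial network. The names `to_image` and `detect` are likewise assumptions.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]


def to_image(vectors: Sequence[Sequence[float]]) -> Image:
    """S201: each pixel is the inner product of two pieces of data."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors]
            for u in vectors]


def detect(vectors: Sequence[Sequence[float]],
           discriminator: Callable[[Image], float],
           threshold: float) -> str:
    """S202 to S205: score the image, then compare with the threshold."""
    image = to_image(vectors)
    score = discriminator(image)
    return "abnormal" if score > threshold else "normal"


def toy_discriminator(image: Image) -> float:
    """Placeholder scorer (NOT a trained discriminator): flags large energy."""
    n = len(image)
    mean_diagonal = sum(image[i][i] for i in range(n)) / n
    return 1.0 if mean_diagonal > 10.0 else 0.0


result_small = detect([[1.0, 0.0], [0.0, 1.0]], toy_discriminator, 0.5)
result_large = detect([[5.0, 0.0], [0.0, 5.0]], toy_discriminator, 0.5)
```

Passing the discriminator in as a callable keeps the decision logic independent of how the network is built or trained.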
In summary, the time series data to be detected is converted into an image to be detected, and the image is classified by the discriminator of a pre-constructed generative adversarial network to obtain the discriminator's output result. The time series data to be detected is determined to be abnormal if the output result is greater than a preset threshold, and normal if it is not. The generative adversarial network is trained on multiple frames of sample images, which are obtained by converting multiple groups of pre-collected sample time series data. Because the time series data is represented as images and the generative adversarial network is an unsupervised model trained on those images, no time needs to be spent simulating labels for abnormal data, which reduces the training time of the network and speeds up detection. Furthermore, time series data can be regarded as a set of vectors. The inner product of two vectors indicates the linear relationship between them, and these pairwise linear relationships reflect the temporal correlation of the time series data. Since a digital image is a matrix and each pixel of the image is such an inner product, using the image as the detection object preserves the temporal correlation of the time series data.
In addition, owing to the characteristics of the generative adversarial network, the accuracy of time series anomaly detection can be ensured even when abnormal data samples are lacking.
Corresponding to the time sequence data detection method provided by the embodiment of the application, the application also provides a time sequence data detection device.
As shown in fig. 3, a schematic structural diagram of a time series data detection apparatus provided in an embodiment of the present application includes:
The first converting unit 100 is configured to convert the time series data to be detected into an image to be detected, where any pixel in the image to be detected is the inner product of two pieces of data in the time series data to be detected.
The classification unit 200 is configured to classify the image to be detected using the discriminator of a pre-constructed generative adversarial network to obtain the discriminator's output result. The generative adversarial network is trained on multiple frames of sample images, which are obtained by converting multiple groups of pre-collected sample time series data.
The determining unit 300 is configured to determine that the time series data to be detected is abnormal when the output result is greater than a preset threshold, and to determine that it is normal when the output result is not greater than the preset threshold.
An obtaining unit 400 is configured to obtain multiple sets of target time series data belonging to the same time series as the time series data to be detected. The sequence positions of the multiple groups of target time sequence data and the time sequence data to be detected in the time sequence are continuous. The sequence bit of the target time sequence data is arranged before the sequence bit of the time sequence data to be detected.
The second converting unit 500 is configured to convert multiple sets of target time series data into multiple frames of non-to-be-detected images, where any one pixel in the non-to-be-detected images is an inner product of two pieces of data in the target time series data.
The prediction unit 600 is configured to input multiple frames of non-to-be-detected images into a pre-constructed prediction model to obtain a target image output by the prediction model, where an association relationship exists between the target image and the non-to-be-detected images, and the association relationship is used to indicate that time sequence data corresponding to the target image and the non-to-be-detected images belong to the same time sequence.
The process by which the prediction unit 600 constructs the prediction model in advance includes: obtaining multiple time series samples in advance, each comprising multiple groups of sample time series data; converting the multiple groups of sample time series data into multiple frames of sample images, where any pixel in a sample image is the inner product of two pieces of data in the sample time series data; and taking the first image and the second image in each image sequence as the training sample input of a preset initial prediction model, training the initial prediction model until its output result is a third image, and taking the trained initial prediction model as the prediction model. Each image sequence comprises n frames of sample images belonging to one time series sample, where n is a positive integer greater than 1. The first image is the first n-1 frames of sample images in each image sequence, the second image is the nth frame of sample image in each image sequence, and the similarity between the third image and the second image satisfies a preset condition.
The correcting unit 700 is configured to compare pixels of the image to be detected and the target image, and correct an output result according to a comparison result.
The specific process by which the correction unit 700 compares the pixels of the image to be detected and the target image and corrects the output result according to the comparison result includes: calculating the difference between a first value and a second value to obtain a prediction error, where the first value is the total number of pixels of the image to be detected and the second value is the total number of pixels of the target image; and calculating the sum of the prediction error and the output result, and taking the sum as the final value of the output result.
In summary, the time series data to be detected is converted into an image to be detected, and the image is classified by the discriminator of a pre-constructed generative adversarial network to obtain the discriminator's output result. The time series data to be detected is determined to be abnormal if the output result is greater than a preset threshold, and normal if it is not. The generative adversarial network is trained on multiple frames of sample images, which are obtained by converting multiple groups of pre-collected sample time series data. Because the time series data is represented as images and the generative adversarial network is an unsupervised model trained on those images, no time needs to be spent simulating labels for abnormal data, which reduces the training time of the network and speeds up detection. Furthermore, time series data can be regarded as a set of vectors. The inner product of two vectors indicates the linear relationship between them, and these pairwise linear relationships reflect the temporal correlation of the time series data. Since a digital image is a matrix and each pixel of the image is such an inner product, using the image as the detection object preserves the temporal correlation of the time series data.
In addition, owing to the characteristics of the generative adversarial network, the accuracy of time series anomaly detection can be ensured even when abnormal data samples are lacking.
The application also provides a computer readable storage medium, which includes a stored program, wherein the program executes the time series data detection method provided by the application.
The present application further provides a time series data detection device, including: a processor, a memory, and a bus. The processor is connected with the memory through a bus, the memory is used for storing programs, and the processor is used for running the programs, wherein when the programs run, the time sequence data detection method provided by the application is executed, and the method comprises the following steps:
converting time sequence data to be detected into an image to be detected, wherein any pixel in the image to be detected is the inner product of two pieces of data in the time sequence data to be detected;
classifying the image to be detected by using a discriminator in a pre-constructed generative adversarial network to obtain an output result of the discriminator; wherein the generative adversarial network is obtained by training with multiple frames of sample images, and the multiple frames of sample images are obtained by converting multiple groups of pre-collected sample time series data;
determining that the time sequence data to be detected is abnormal under the condition that the output result is greater than a preset threshold value;
and determining that the time sequence data to be detected is normal under the condition that the output result is not greater than the preset threshold value.
Optionally, the method further includes:
acquiring a plurality of groups of target time sequence data belonging to the same time sequence with the time sequence data to be detected; the sequence positions of the multiple groups of target time sequence data and the time sequence data to be detected in the time sequence are continuous; the sequence bit of the target time sequence data is arranged in front of the sequence bit of the time sequence data to be detected;
converting a plurality of groups of target time sequence data into a plurality of frames of non-to-be-detected images, wherein any one pixel in the non-to-be-detected images is the inner product of two pieces of data in the target time sequence data;
inputting the multiple frames of non-to-be-detected images into a pre-constructed prediction model to obtain a target image output by the prediction model, wherein an association relationship exists between the target image and the non-to-be-detected images, the association relationship indicating that the time series data corresponding to the target image and to the non-to-be-detected images belong to the same time series;
and comparing the pixels of the image to be detected and the target image, and correcting the output result according to the comparison result.
Optionally, the comparing the pixels of the image to be detected and the target image, and correcting the output result according to the comparison result includes:
calculating the difference value between the first numerical value and the second numerical value to obtain a prediction error; the first numerical value is the total number of pixels of the image to be detected, and the second numerical value is the total number of pixels of the target image;
and calculating a sum of the prediction error and the output result, and using the sum as a final value of the output result.
Optionally, the construction process of the prediction model includes:
obtaining a plurality of time series samples in advance, wherein the time series samples comprise a plurality of groups of sample time series data;
converting a plurality of groups of sample time sequence data into a plurality of frames of sample images, wherein any pixel in the sample images is an inner product of two pieces of data in the sample time sequence data;
taking a first image and a second image in each image sequence as training samples of a preset initial prediction model to be input, training the initial prediction model until the output result of the initial prediction model is a third image, and taking the initial prediction model obtained by training as the prediction model; the image sequence comprises n frames of sample images, wherein n is a positive integer greater than 1, and the n frames of sample images belong to one time sequence sample; the first image is a first n-1 frame sample image in each image sequence, the second image is an nth frame sample image in each image sequence, and the similarity between the third image and the second image meets a preset condition.
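The way training samples for the prediction model are assembled (first n-1 frames as the first image, nth frame as the second image) can be sketched as follows. Frames are treated as opaque objects, so the sketch does not depend on any particular model or image type. The name `training_pairs` is hypothetical, and the model training itself, including the similarity condition on the third image, is left out because the text does not specify it.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def training_pairs(
    image_sequences: Sequence[Sequence[Frame]],
) -> List[Tuple[List[Frame], Frame]]:
    """Split each n-frame sequence into (first n-1 frames, nth frame)."""
    pairs: List[Tuple[List[Frame], Frame]] = []
    for sequence in image_sequences:
        if len(sequence) < 2:
            # n must be a positive integer greater than 1.
            raise ValueError("each image sequence needs more than one frame")
        pairs.append((list(sequence[:-1]), sequence[-1]))
    return pairs


# Frame labels stand in for real sample images in this illustration.
pairs = training_pairs([["f1", "f2", "f3"], ["g1", "g2"]])
```

At detection time the same split applies: the non-to-be-detected images play the role of the first n-1 frames, and the predicted target image is compared with the image to be detected.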
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for detecting time series data is characterized by comprising the following steps:
converting time sequence data to be detected into an image to be detected, wherein any pixel in the image to be detected is the inner product of two pieces of data in the time sequence data to be detected;
classifying the image to be detected by using a discriminator in a pre-constructed generative adversarial network to obtain an output result of the discriminator; wherein the generative adversarial network is obtained by training with multiple frames of sample images, and the multiple frames of sample images are obtained by converting multiple groups of pre-collected sample time series data;
determining that the time sequence data to be detected is abnormal under the condition that the output result is greater than a preset threshold value;
and determining that the time sequence data to be detected is normal under the condition that the output result is not greater than the preset threshold value.
2. The method of claim 1, further comprising:
acquiring a plurality of groups of target time sequence data belonging to the same time sequence with the time sequence data to be detected; the sequence positions of the multiple groups of target time sequence data and the time sequence data to be detected in the time sequence are continuous; the sequence bit of the target time sequence data is arranged in front of the sequence bit of the time sequence data to be detected;
converting a plurality of groups of target time sequence data into a plurality of frames of non-to-be-detected images, wherein any one pixel in the non-to-be-detected images is the inner product of two pieces of data in the target time sequence data;
inputting the multiple frames of non-to-be-detected images into a pre-constructed prediction model to obtain a target image output by the prediction model, wherein an association relationship exists between the target image and the non-to-be-detected images, the association relationship indicating that the time series data corresponding to the target image and to the non-to-be-detected images belong to the same time series;
and comparing the pixels of the image to be detected and the target image, and correcting the output result according to the comparison result.
3. The method of claim 2, wherein comparing the pixels of the image to be measured and the target image and modifying the output result according to the comparison result comprises:
calculating the difference value between the first numerical value and the second numerical value to obtain a prediction error; the first numerical value is the total number of pixels of the image to be detected, and the second numerical value is the total number of pixels of the target image;
and calculating a sum of the prediction error and the output result, and using the sum as a final value of the output result.
4. The method of claim 2, wherein the construction of the predictive model comprises:
obtaining a plurality of time series samples in advance, wherein the time series samples comprise a plurality of groups of sample time series data;
converting a plurality of groups of sample time sequence data into a plurality of frames of sample images, wherein any pixel in the sample images is an inner product of two pieces of data in the sample time sequence data;
taking a first image and a second image in each image sequence as training samples of a preset initial prediction model to be input, training the initial prediction model until the output result of the initial prediction model is a third image, and taking the initial prediction model obtained by training as the prediction model; the image sequence comprises n frames of sample images, wherein n is a positive integer greater than 1, and the n frames of sample images belong to one time sequence sample; the first image is a first n-1 frame sample image in each image sequence, the second image is an nth frame sample image in each image sequence, and the similarity between the third image and the second image meets a preset condition.
5. A time series data detecting apparatus, comprising:
the first conversion unit is used for converting the time sequence data to be detected into an image to be detected, and any one pixel in the image to be detected is the inner product of two pieces of data in the time sequence data to be detected;
the classification unit is used for classifying the image to be detected by using a discriminator in a pre-constructed generative adversarial network to obtain an output result of the discriminator; wherein the generative adversarial network is obtained by training with multiple frames of sample images, and the multiple frames of sample images are obtained by converting multiple groups of pre-collected sample time series data;
the determining unit is used for determining that the time sequence data to be detected is abnormal under the condition that the output result is greater than a preset threshold value; and determining that the time sequence data to be detected is normal under the condition that the output result is not greater than the preset threshold value.
6. The apparatus of claim 5, further comprising:
the acquisition unit is used for acquiring a plurality of groups of target time sequence data belonging to the same time sequence with the time sequence data to be detected; the sequence positions of the multiple groups of target time sequence data and the time sequence data to be detected in the time sequence are continuous; the sequence bit of the target time sequence data is arranged in front of the sequence bit of the time sequence data to be detected;
the second conversion unit is used for converting a plurality of groups of target time sequence data into a plurality of frames of non-to-be-detected images, and any one pixel in the non-to-be-detected images is the inner product of two pieces of data in the target time sequence data;
the prediction unit is used for inputting the multiple frames of non-to-be-detected images into a pre-constructed prediction model to obtain a target image output by the prediction model, wherein an association relationship exists between the target image and the non-to-be-detected images, the association relationship indicating that the time series data corresponding to the target image and to the non-to-be-detected images belong to the same time series;
and the correcting unit is used for comparing the pixels of the image to be detected and the target image and correcting the output result according to the comparison result.
7. The apparatus according to claim 6, wherein the modification unit is specifically configured to:
calculating the difference value between the first numerical value and the second numerical value to obtain a prediction error; the first numerical value is the total number of pixels of the image to be detected, and the second numerical value is the total number of pixels of the target image; and calculating a sum of the prediction error and the output result, and using the sum as a final value of the output result.
8. The apparatus of claim 6, wherein the prediction unit is configured to pre-construct the prediction model by:
obtaining a plurality of time series samples in advance, wherein the time series samples comprise a plurality of groups of sample time series data; converting a plurality of groups of sample time sequence data into a plurality of frames of sample images, wherein any pixel in the sample images is an inner product of two pieces of data in the sample time sequence data; taking a first image and a second image in each image sequence as training samples of a preset initial prediction model to be input, training the initial prediction model until the output result of the initial prediction model is a third image, and taking the initial prediction model obtained by training as the prediction model; the image sequence comprises n frames of sample images, wherein n is a positive integer greater than 1, and the n frames of sample images belong to one time sequence sample; the first image is a first n-1 frame sample image in each image sequence, the second image is an nth frame sample image in each image sequence, and the similarity between the third image and the second image meets a preset condition.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored program, wherein the program, when run, executes the time series data detection method of any one of claims 1 to 4.
10. A time series data detecting apparatus, characterized by comprising: a processor, a memory, and a bus; the processor and the memory are connected through the bus;
the memory is used for storing a program, and the processor is used for running the program, wherein the program, when run, executes the time series data detection method of any one of claims 1 to 4.
CN202010869594.5A 2020-08-26 2020-08-26 Time sequence data detection method and device Pending CN112000830A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010869594.5A CN112000830A (en) 2020-08-26 2020-08-26 Time sequence data detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010869594.5A CN112000830A (en) 2020-08-26 2020-08-26 Time sequence data detection method and device

Publications (1)

Publication Number Publication Date
CN112000830A true CN112000830A (en) 2020-11-27

Family

ID=73471834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010869594.5A Pending CN112000830A (en) 2020-08-26 2020-08-26 Time sequence data detection method and device

Country Status (1)

Country Link
CN (1) CN112000830A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819386A (en) * 2021-03-05 2021-05-18 中国人民解放军国防科技大学 Method, system and storage medium for generating time series data with abnormity
CN113537247A (en) * 2021-08-13 2021-10-22 重庆大学 Data enhancement method for converter transformer vibration signal
CN113743607A (en) * 2021-09-15 2021-12-03 京东科技信息技术有限公司 Training method of anomaly detection model, anomaly detection method and device
CN113780412A (en) * 2021-09-10 2021-12-10 齐齐哈尔大学 Fault diagnosis model training method and system and fault diagnosis model training method and system
CN114219961A (en) * 2021-12-16 2022-03-22 博雅创智(天津)科技有限公司 Vggnet algorithm-based time sequence data anomaly detection method
CN114549930A (en) * 2022-02-21 2022-05-27 合肥工业大学 Rapid road short-time vehicle head interval prediction method based on trajectory data
WO2022141871A1 (en) * 2020-12-31 2022-07-07 平安科技(深圳)有限公司 Time sequence data anomaly detection method, apparatus and device, and storage medium
CN114944831A (en) * 2022-05-12 2022-08-26 中国科学技术大学先进技术研究院 Multi-cycle time series data decomposition method, device, equipment and storage medium

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022141871A1 (en) * 2020-12-31 2022-07-07 平安科技(深圳)有限公司 Time sequence data anomaly detection method, apparatus and device, and storage medium
CN112819386A (en) * 2021-03-05 2021-05-18 中国人民解放军国防科技大学 Method, system and storage medium for generating time series data with abnormity
CN113537247A (en) * 2021-08-13 2021-10-22 重庆大学 Data enhancement method for converter transformer vibration signal
CN113780412A (en) * 2021-09-10 2021-12-10 齐齐哈尔大学 Fault diagnosis model training method and system and fault diagnosis model training method and system
CN113780412B (en) * 2021-09-10 2024-01-30 齐齐哈尔大学 Fault diagnosis model training method and system and fault diagnosis method and system
CN113743607A (en) * 2021-09-15 2021-12-03 京东科技信息技术有限公司 Training method of anomaly detection model, anomaly detection method and device
CN113743607B (en) * 2021-09-15 2023-12-05 京东科技信息技术有限公司 Training method of anomaly detection model, anomaly detection method and device
CN114219961A (en) * 2021-12-16 2022-03-22 博雅创智(天津)科技有限公司 Vggnet algorithm-based time sequence data anomaly detection method
CN114549930A (en) * 2022-02-21 2022-05-27 合肥工业大学 Rapid road short-time vehicle head interval prediction method based on trajectory data
CN114549930B (en) * 2022-02-21 2023-01-10 合肥工业大学 Rapid road short-time vehicle head interval prediction method based on trajectory data
CN114944831A (en) * 2022-05-12 2022-08-26 中国科学技术大学先进技术研究院 Multi-cycle time series data decomposition method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112000830A (en) Time sequence data detection method and device
CN110826648A (en) Method for realizing fault detection by utilizing time sequence clustering algorithm
Pavlovski et al. Hierarchical convolutional neural networks for event classification on PMU measurements
CN110110804B (en) Flight control system residual life prediction method based on CNN and LSTM
CN113128412B (en) Fire trend prediction method based on deep learning and fire monitoring video
CN117041017B (en) Intelligent operation and maintenance management method and system for data center
CN116910752B (en) Malicious code detection method based on big data
CN114239725A (en) Electricity stealing detection method oriented to data poisoning attack
CN115587335A (en) Training method of abnormal value detection model, abnormal value detection method and system
Du et al. Convolutional neural network-based data anomaly detection considering class imbalance with limited data
CN116451139B (en) Live broadcast data rapid analysis method based on artificial intelligence
CN117192416A (en) Battery monitoring system and method based on BMS system
CN117092581A (en) Segment consistency-based method and device for detecting abnormity of electric energy meter of self-encoder
Stržinar et al. Soft sensor for non-invasive detection of process events based on Eigenresponse Fuzzy Clustering
CN115905959A (en) Method and device for analyzing relevance fault of power circuit breaker based on defect factor
JP5905375B2 (en) Misclassification detection apparatus, method, and program
CN115294397A (en) Classification task post-processing method, device, equipment and storage medium
CN115018012A (en) Internet of things time sequence anomaly detection method and system under high-dimensional characteristic
CN113781469A (en) Method and system for detecting wearing of safety helmet based on YOLO improved model
CN112819041A (en) Data processing method and system based on electric power big data platform
CN116232761B (en) Method and system for detecting abnormal network traffic based on shapelet
CN117951646A (en) Data fusion method and system based on edge cloud
CN111581640A (en) Malicious software detection method, device and equipment and storage medium
Mahmoodpour et al. A learning based contrast specific no reference image quality assessment algorithm
CN115909165A (en) Weak surveillance video anomaly detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201127