
CN111738335A - Time series data abnormity detection method based on neural network - Google Patents

Info

Publication number
CN111738335A
Authority
CN
China
Prior art keywords
anomaly
data
time series
neural network
detection method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010577504.5A
Other languages
Chinese (zh)
Inventor
周春姐
李阿丽
张振兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ludong University
Original Assignee
Ludong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ludong University
Priority to CN202010577504.5A
Publication of CN111738335A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention discloses a neural-network-based anomaly detection method for time series data, comprising the following steps: collecting data, performing neural-network-based anomaly detection on the time series data, and verifying the performance of the anomaly detection method. The invention has the following advantages: (1) the anomaly detector makes no assumptions about the underlying mechanism of the anomaly type or anomaly pattern, and instead learns the relevant concepts from the training data set; (2) the anomaly detector does not require a threshold to be selected, which avoids the tedious work of threshold setting while still achieving good anomaly detection performance; (3) as anomaly detection experience accumulates, the anomaly detector improves dynamically and learns new anomalies, strengthening its accumulated knowledge of anomaly detection; (4) the method can be widely applied in industries such as health care, intelligent transportation, and large-scale production systems, for tasks such as human disease monitoring, traffic accident discovery, equipment fault diagnosis, and network intrusion detection.

Description

Time series data abnormity detection method based on neural network
Technical Field
The invention relates to a time series anomaly detection method, in particular to a neural-network-based anomaly detection method for time series data, and belongs to the technical field of artificial intelligence and big data applications.
Background
Anomaly detection is a hot topic in many fields, for example human disease monitoring in health care, traffic accident discovery in intelligent transportation, equipment fault diagnosis in large-scale production systems, and network intrusion detection. Ideally, an anomaly detection method should suit a variety of scenarios and be easy to operate.
However, conventional anomaly detection methods cannot meet this demand. The literature (Chandola V, Banerjee A, and Kumar V. Anomaly detection: A survey. ACM Computing Surveys, 41(3): 1-58, 2009) summarizes anomaly detection methods in depth and finds that different methods make various strong assumptions about the anomaly patterns (Gandhimathi L, Murugaboopathi G. A novel hybrid intrusion detection using flow-based anomaly detection and cross-layer features in wireless sensor network. Automatic Control and Computer Sciences, 54(1): 62-69), as in the case of the distributed non-procedural method; when the assumed conditions do not hold, an anomaly detection method may not achieve satisfactory results (spring F, light b. Anomaly detection and real. Ad: local detection and coverage Ad, 84:82-89, 2019). On the other hand, anomaly detection methods are not always easy to operate. In 2015, Yahoo released its time series anomaly detection system EGADS (Laptev N, Amizadeh S, and Flint I. Generic and scalable framework for automated time-series anomaly detection. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1939-1947, 2015), in which a set of methods is implemented and integrated to generate anomaly detection results. Such complex systems require engineers to understand not only the components but also the methodology in order to tune the parameters of each component. Furthermore, the methods used in industry rarely take into account the evolution of anomaly patterns (Carreno A, Inza I, and Lozano J. Analyzing rare event, anomaly, novelty and outlier detection terms under the supervised classification framework. Artificial Intelligence Review, 53(5): 3575-).
In summary, the existing methods have the following disadvantages:
(i) existing anomaly detection methods make various strong assumptions about the anomaly type and anomaly pattern, and cannot produce satisfactory results when those assumptions are not met;
(ii) existing anomaly detection methods are complex to build, need frequent parameter tuning, and are hard to operate;
(iii) existing methods rarely consider the evolution of anomaly patterns and perform poorly in dynamic scenarios.
Based on the above, the invention provides a neural-network-based anomaly detection method that makes no assumptions about the underlying mechanism of the anomaly pattern, avoids the tedious work of threshold setting while achieving good anomaly detection performance, and detects anomalies in time series data while continuously learning and improving as anomaly detection experience grows.
Disclosure of Invention
In order to eliminate abnormal data caused by emergencies, sudden events and the like, the invention provides a neural-network-based anomaly detection method for time series data that is continuously trained within a reinforcement learning framework. The process of the anomaly detection method is depicted in FIG. 2 and includes three main components: the anomaly detector, the accumulated anomaly values, and the empirical values, all of which are self-learned.
In order to achieve the above object, the present invention adopts the following technical solutions:
a time series data anomaly detection method based on a neural network, characterized by comprising the following steps:
step 1: collecting data
Time series data of a certain duration, together with the various events that may cause sudden changes in the data, are collected, and the collected data are uploaded to the cloud through a mobile terminal or other equipment; this step is prior art;
step 2: neural-network-based time series data anomaly detection process
The invention provides a time series data anomaly detection method in which the anomaly detector is driven by a recurrent neural network and the self-learning process is realized by a reinforcement learning method: for example, the recurrent neural network used to estimate X(z, a) is trained with the X learning method, and the anomaly detector is continuously improved by self-learning from the empirical values, so that good anomaly detection performance is obtained. The proposed method has the following features: it makes no assumptions about the anomaly type or pattern, requires no threshold to be selected, and continuously learns and improves as anomaly detection experience grows, so it can be widely applied in many fields;
step 3: performance verification of the anomaly detection method
Anomaly detection experience is first obtained by training on a set of data records covering a certain duration, and the performance of the anomaly detection method is then verified on a real-time data set. Anomaly detection is carried out on various types of time series data sets (such as sign data of chronic heart failure patients, service data of a public transport network, and equipment energy data in a production system) to eliminate abnormal data caused by emergencies or sudden events. The method can identify the mean value of the target time series, point anomalies, and shifts in the anomaly pattern, and obtains high-quality results on the test data set, with an accuracy of about 100%.
The invention has the advantages that:
(1) the anomaly detector does not make any assumption on the potential mechanism of the anomaly type and the anomaly pattern, but can obtain related concepts from the training data set through learning;
(2) the anomaly detector does not need to select a threshold value, so that the tedious work of setting the threshold value is avoided, and good anomaly detection performance can be obtained;
(3) with the accumulation of the anomaly detection experience, the anomaly detector continuously carries out dynamic improvement to learn new anomalies, so that the knowledge accumulation of the anomaly detection by the anomaly detector is enhanced;
(4) the anomaly detection method provided by the invention can be widely applied in industries such as health care, intelligent transportation, and large-scale production systems, for tasks such as human disease monitoring, traffic accident discovery, equipment fault diagnosis, and network intrusion detection.
Drawings
FIG. 1 is a screenshot of our real research project HeartCarer;
FIG. 2 is a neural network based time series data anomaly detection process;
FIG. 3 is a performance verification of the anomaly detection method.
Detailed Description
The invention is described in detail below with reference to the figures and the embodiments.
The physical sign monitoring data of patients with chronic heart failure are taken as an example.
First, collecting data
Various physiological sign data (such as heart rate, blood pressure, and blood sugar) and the various events that can cause sudden changes in the sign data of chronic heart failure patients (such as mood changes, diet, mental stress, excessive physical exertion, and environmental factors) are collected through wearable technology, and the collected data are then uploaded to the cloud through a mobile terminal or a telephone line. This part is our prior art.
In this embodiment, the data come from our real research project, HeartCarer, shown in FIG. 1. HeartCarer is a home-oriented, cloud-based remote monitoring system dedicated to monitoring chronic heart failure patients and intervening in a timely manner. Through wearable technology, it monitors the physiological sign data of chronic heart failure patients (in particular heart rate, blood pressure, and blood sugar) and the events that can cause sudden changes in those data, and uploads the monitored data to the cloud through a mobile terminal or a telephone line.
The remote monitoring system has been applied in a clinical observational study of 2607 chronic heart failure patients at 6 medical institutions in China. These patients received care between 2015 and 2019; most were over 60 years old (63.8 ± 12 years) and most were male (70%), and the volume of each type of information data for these patients exceeded 100 GB.
We used an OrientDB cluster to store large-scale matrix maps, HBase to store vertex attributes, and Hadoop MapReduce for data analysis and computation. The cluster comprises 8 servers running the CentOS 7.4 operating system, with 12-core (24-thread) Intel Xeon CPUs running at 2.80 GHz and 64 GB of memory.
Second, neural-network-based time series data anomaly detection process
The invention provides a time series data anomaly detection method in which the anomaly detector is driven by a recurrent neural network and the self-learning process is realized by a reinforcement learning method. The specific anomaly detection process is shown in FIG. 2 and includes three main components: the anomaly detector, the accumulated anomaly values, and the empirical values, whose relationship is described as follows:
The empirical value Y is a set of tuples, each tuple being represented as
$$\langle z, a, r, z' \rangle,$$
where $z$ and $z'$ denote the corresponding data records at a given point in time without and with event $a$ (that is, before and after the event), respectively, and $r$ is the transient anomaly caused by event $a$. These events are produced by the anomaly detector during anomaly detection, so the empirical values record all the behavior of the anomaly detector.
We denote the anomaly detector by a conditional probability distribution $\pi = p(A \mid Z)$, where $A$ and $Z$ are the set of events and the actual data record, respectively. Typically $A = \{0, 1\}$, where 1 indicates the presence of an anomaly in the current data record and 0 indicates no anomaly. The quantity $\pi(z, a) = p(A = a \mid Z = z)$ is the probability of event $a$ for a particular data record $z$; for $a = 1$ it is the probability that an anomaly exists in that record.
The invention measures the performance of an anomaly detector $\pi$ by its anomaly detection capability $\eta(\pi)$, which is defined as
$$\eta(\pi) = \sum_{z} d_{\pi}(z) \sum_{a} \pi(z, a)\, X(z, a),$$
where $d_{\pi}(z)$ is the probability of the actual data record $z$ under the anomaly detector $\pi$, and $X(z, a)$ is the cumulative anomaly from data record $z$ under event $a$. That is, the performance is the average cumulative anomaly when the anomaly detector $\pi$ is used.
If the detector satisfies
$$\pi^{*} = \arg\max_{\pi} \eta(\pi),$$
it is an optimal anomaly detector, that is, one that maximizes performance. At the same time, $d_{\pi}(z)$ is substantially identical for all $\pi$. If
$$a = \arg\max_{a'} X(z, a')$$
holds, then $\pi^{*}(z, a) = 1$. That is, the optimal anomaly detector $\pi^{*}$ is determined entirely by the cumulative anomaly function $X(z, a)$.
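The performance measure and the optimal detector described above can be illustrated with a small numeric sketch. It assumes a finite set of data records indexed 0..n-1 and the two events 0 and 1; the function names `detector_performance` and `greedy_detector`, the symbols `d`, `pi` and `X` as Python lists, and all numbers are hypothetical and chosen only for illustration.

```python
def detector_performance(d, pi, X):
    """Average cumulative anomaly: sum_z d[z] * sum_a pi[z][a] * X[z][a]."""
    return sum(
        d[z] * sum(p * x for p, x in zip(pi[z], X[z]))
        for z in range(len(d))
    )


def greedy_detector(X):
    """For each record z, put probability 1 on the event a that maximizes X[z][a]."""
    pi = []
    for row in X:
        best = max(range(len(row)), key=row.__getitem__)
        pi.append([1.0 if a == best else 0.0 for a in range(len(row))])
    return pi


# Hypothetical toy values: 3 data records, events a in {0, 1}.
d = [0.5, 0.3, 0.2]                       # assumed probability of each record
X = [[2.0, 0.0], [1.0, 3.0], [4.0, 2.0]]  # assumed cumulative anomaly X(z, a)

uniform = [[0.5, 0.5]] * 3                # a detector that guesses at random
greedy = greedy_detector(X)               # the detector determined by X alone

perf_uniform = detector_performance(d, uniform, X)
perf_greedy = detector_performance(d, greedy, X)
print(perf_uniform, perf_greedy)
```

With the record probabilities held fixed, the greedy detector scores at least as high as any other detector on this toy input, which is the sense in which the optimal detector is determined by X(z, a).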
According to the above description, the empirical values can be used to better estimate X(z, a), the anomaly detector can be continuously improved by self-learning from the empirical values, and the recurrent neural network used to estimate X(z, a) can be trained with the X learning method, resulting in good anomaly detection performance. In summary, the proposed anomaly detection method has the following features: (1) it makes no assumptions about the anomaly type or pattern; (2) it requires no threshold to be selected; (3) it continuously learns and improves as anomaly detection experience grows, so it can be widely applied in many fields.
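As a rough illustration of how X(z, a) might be estimated from the empirical values, the sketch below assumes that the X learning method behaves like standard Q-learning from reinforcement learning, which the text does not state explicitly. It also replaces the recurrent neural network with a plain lookup table to stay short. The learning rate `alpha`, the discount factor `gamma`, the record names and the rewards are all hypothetical.

```python
from collections import defaultdict


def x_learning_update(X, experience, alpha=0.1, gamma=0.9):
    """Apply one Q-learning-style update per (z, a, r, z_next) tuple.

    X maps a data record to a two-element list [X(z, 0), X(z, 1)].
    alpha and gamma are assumed hyperparameters, not taken from the source.
    """
    for z, a, r, z_next in experience:
        best_next = max(X[z_next])          # best cumulative anomaly after z_next
        target = r + gamma * best_next      # bootstrapped estimate of X(z, a)
        X[z][a] += alpha * (target - X[z][a])
    return X


# Table initialised to zero for every record that is looked up.
X = defaultdict(lambda: [0.0, 0.0])

# Hypothetical experience: flagging record "z0" as anomalous earned +1,
# leaving it unflagged earned -1.
experience = [("z0", 1, 1.0, "z1"), ("z0", 0, -1.0, "z1")]
x_learning_update(X, experience)
print(X["z0"])
```

In the method described here, the table would be replaced by the recurrent neural network, with the same target value used as the regression target during training.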
Third, performance verification of the anomaly detection method
As described in step 1, the data set used for training is the HeartCarer reference data set, which includes various physiological sign data of 2607 patients with chronic heart failure; the data volume of each type of information exceeds 100 GB.
Each time series is converted into a set of multidimensional data instances using a sliding window approach. The events in X learning are
$$a \in \{0, 1\},$$
where 0 indicates no anomaly and 1 indicates an anomaly. To enhance model training, a binary tree strategy is used: the two data records $z'_0$ and $z'_1$ that are generated by performing the different operations 0 and 1 on the previous data record $z$ are both added to the experience set for training. That is, the two records
$$\langle z, 0, r_0, z'_0 \rangle \quad \text{and} \quad \langle z, 1, r_1, z'_1 \rangle$$
are added to the experience set during training. By performing the different operations on the data set $Z$, we obtain the rewards $r_0$ and $r_1$.
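The sliding window conversion and the binary tree strategy can be sketched as follows. The source does not specify the window width, the form of a data record, or the reward scheme, so this sketch makes assumptions: a record is a pair of (window, previous event), the reward is +1 when the event matches a known label and -1 otherwise, and the labelled branch is the one followed to the next step. All names are hypothetical.

```python
def sliding_windows(series, width):
    """Convert a 1-D series into overlapping windows (multidimensional instances)."""
    return [tuple(series[i:i + width]) for i in range(len(series) - width + 1)]


def binary_tree_experience(windows, labels, reward):
    """For every record z, add both branches (a = 0 and a = 1) to the experience set."""
    experience = []
    prev_event = 0  # assumed starting flag: no anomaly before the first window
    for i in range(len(windows) - 1):
        z = (windows[i], prev_event)
        for a in (0, 1):
            z_next = (windows[i + 1], a)   # record reached by performing event a
            experience.append((z, a, reward(a, labels[i]), z_next))
        prev_event = labels[i]             # assumption: continue along the labelled branch
    return experience


def match_reward(a, label):
    """Hypothetical reward: +1 for a correct event, -1 for an incorrect one."""
    return 1 if a == label else -1


series = [1, 2, 3, 10, 4]   # toy readings with one spike
labels = [0, 0, 1, 0]       # one assumed label per window
windows = sliding_windows(series, 2)
experience = binary_tree_experience(windows, labels, match_reward)
print(len(experience), experience[0])
```

Adding both branches doubles the number of tuples per record, so the learner sees the outcome of each event even if the detector would have chosen only one of them.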
As shown in FIG. 3, we perform anomaly detection on various sign data sets of chronic heart failure patients to eliminate abnormal data caused by emotional changes, environmental factors, and the like. Anomaly detection experience is first obtained by training on the sign data records of the first three years, and the performance of the anomaly detection method of the invention is then verified on a real-time sign data set. The anomaly detection performance for the heart rate index and for the blood pressure and blood sugar indexes is shown in FIG. 3(a) and FIG. 3(b), respectively.
From FIG. 3 we can see that:
(1) the grey line represents the original sign data records, and the black line represents the sudden events found by anomaly detection;
(2) the heart rate index is directly related to chronic heart failure, so its test time interval is set to 120 minutes, whereas indexes such as blood pressure and blood sugar are indirectly related, so their interval is set to 240 minutes;
(3) the anomaly detection method can identify the mean value of the target time series, point anomalies, and shifts in the anomaly pattern;
(4) the anomaly detection method obtains high-quality results on the test data set, with an accuracy of about 100%.
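The verification procedure, training on the first three years of records and then testing on later data, can be sketched as a chronological split followed by an accuracy computation. The record layout (year, value, label), the function names, and the toy numbers are assumptions for illustration and are not taken from the HeartCarer data.

```python
def chronological_split(records, train_years):
    """Split (year, value, label) records into an early training set and a later test set."""
    first_year = min(year for year, _, _ in records)
    cutoff = first_year + train_years
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test


def accuracy(predicted, actual):
    """Fraction of records whose predicted event matches the labelled event."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == t for p, t in zip(predicted, actual)) / len(actual)


# Hypothetical yearly heart rate records: (year, reading, anomaly label).
records = [(2015, 72, 0), (2016, 75, 0), (2017, 130, 1), (2018, 74, 0), (2019, 128, 1)]
train, test = chronological_split(records, train_years=3)

# Hypothetical detector output compared with hypothetical labels.
score = accuracy([0, 1, 0, 0], [0, 1, 1, 0])
print(len(train), len(test), score)
```

Splitting by time rather than at random keeps later records out of training, which matches the described setup of learning from past experience and verifying on newer data.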
It should be noted that the above-mentioned embodiments do not limit the present invention in any way, and all technical solutions obtained by using equivalent alternatives or equivalent variations fall within the protection scope of the present invention.

Claims (3)

1. A time series data anomaly detection method based on a neural network, characterized by comprising the following steps:
step 1: collecting data
Time series data of a certain duration, together with the various events that may cause sudden changes in the data, are collected, and the collected data are uploaded to the cloud through a mobile terminal or other equipment; this step is prior art;
step 2: neural-network-based time series data anomaly detection process
The invention provides a neural-network-based time series data anomaly detection method in which the anomaly detector is driven by a recurrent neural network and the self-learning process is realized by a reinforcement learning method, for example by training the recurrent neural network used to estimate X(z, a) with the X learning method; the anomaly detector makes no assumptions about the anomaly type or anomaly pattern and is continuously improved by self-learning from the empirical values; the method can be applied in many fields, such as human disease monitoring, traffic accident discovery, equipment fault diagnosis, and network intrusion detection in industries such as health care, intelligent transportation, and large-scale production systems;
step 3: performance verification of the anomaly detection method
Anomaly detection experience is first obtained by training on a set of data records covering a certain duration, and the performance of the anomaly detection method is then verified on a real-time data set. Anomaly detection is carried out on various types of time series data sets (such as sign data of chronic heart failure patients, service data of a public transport network, and equipment energy data in a production system) to eliminate abnormal data caused by emergencies or sudden events. The method can identify the mean value of the target time series, point anomalies, and shifts in the anomaly pattern, and obtains high-quality results on the test data set, with an accuracy of about 100%.
2. The neural network-based time series data anomaly detection method according to claim 1, wherein in step 2, the anomaly detector does not make any assumption about the type of anomaly and the pattern of anomaly.
3. The neural network-based time series data anomaly detection method according to claim 1, wherein in step 3, the anomaly detector does not need to select a threshold value and does not need to adjust parameters.
CN202010577504.5A 2020-06-23 2020-06-23 Time series data abnormity detection method based on neural network Pending CN111738335A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010577504.5A CN111738335A (en) 2020-06-23 2020-06-23 Time series data abnormity detection method based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010577504.5A CN111738335A (en) 2020-06-23 2020-06-23 Time series data abnormity detection method based on neural network

Publications (1)

Publication Number Publication Date
CN111738335A true CN111738335A (en) 2020-10-02

Family

ID=72650511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010577504.5A Pending CN111738335A (en) 2020-06-23 2020-06-23 Time series data abnormity detection method based on neural network

Country Status (1)

Country Link
CN (1) CN111738335A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115359870A (en) * 2022-10-20 2022-11-18 之江实验室 Disease diagnosis and treatment process abnormity identification system based on hierarchical graph neural network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109819446A (en) * 2019-03-14 2019-05-28 湖南大学 The space access authentication method and software definition edge calculations system of mobile Internet of Things
CN110287439A (en) * 2019-06-27 2019-09-27 电子科技大学 A kind of network behavior method for detecting abnormality based on LSTM
EP3561815A1 (en) * 2018-04-27 2019-10-30 Tata Consultancy Services Limited A unified platform for domain adaptable human behaviour inference
CN110865625A (en) * 2018-08-28 2020-03-06 中国科学院沈阳自动化研究所 Process data anomaly detection method based on time series
CN111191934A (en) * 2019-12-31 2020-05-22 北京理工大学 Multi-target cloud workflow scheduling method based on reinforcement learning strategy
CN111190804A (en) * 2019-12-28 2020-05-22 同济大学 Multi-level deep learning log fault detection method for cloud native system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3561815A1 (en) * 2018-04-27 2019-10-30 Tata Consultancy Services Limited A unified platform for domain adaptable human behaviour inference
CN110865625A (en) * 2018-08-28 2020-03-06 中国科学院沈阳自动化研究所 Process data anomaly detection method based on time series
CN109819446A (en) * 2019-03-14 2019-05-28 湖南大学 The space access authentication method and software definition edge calculations system of mobile Internet of Things
CN110287439A (en) * 2019-06-27 2019-09-27 电子科技大学 A kind of network behavior method for detecting abnormality based on LSTM
CN111190804A (en) * 2019-12-28 2020-05-22 同济大学 Multi-level deep learning log fault detection method for cloud native system
CN111191934A (en) * 2019-12-31 2020-05-22 北京理工大学 Multi-target cloud workflow scheduling method based on reinforcement learning strategy

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHENGQIANG HUANG ET AL.: ""Towards Experienced Anomaly Detector through Reinforcement Learning"", 《THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE》 *

Similar Documents

Publication Publication Date Title
US20210225510A1 (en) Human body health assessment method and system based on sleep big data
CN112529329A (en) Infectious disease prediction method based on BP algorithm and SEIR model
WO2022233121A1 (en) Unsupervised medical behavior compliance assessment method based on electronic medical record
CN113241196B (en) Remote medical treatment and grading monitoring system based on cloud-terminal cooperation
WO2023098303A1 (en) Real-time epileptic seizure detecting and monitoring system for video electroencephalogram examination of epilepsy
Jain et al. Linguistic summarization of in-home sensor data
CN110739076A (en) medical artificial intelligence public training platform
CN111081379B (en) Disease probability decision method and system thereof
US20130282295A1 (en) System and method for classifying respiratory and overall health status of an animal
CN107016457A (en) One kind realizes community's hazardous act pre-warning system and method
CN118213075A (en) Multi-mode prediction and identification system for postoperative delirium syndrome of senile general anesthesia patient
CN117142009B (en) Scraper conveyor health state assessment method based on graph rolling network
CN111738335A (en) Time series data abnormity detection method based on neural network
CN114491078B (en) Community project personnel foothold and peer personnel analysis method based on knowledge graph
Kumar et al. Disease prediction using machine learning algorithms KNN and CNN
CN117789973A (en) Medical management method and system based on remote diagnosis
CN118038548A (en) Abnormal behavior detection method, device, electronic equipment and storage medium
CN116681281A (en) Sudden public health event acquisition system and method based on context awareness
Islam et al. College life is hard!-shedding light on stress prediction for autistic college students using data-driven analysis
CN114185739A (en) Supervision data processing implementation method based on artificial intelligence deep learning
CN113470808A (en) Method for artificial intelligence to predict delirium
Boloka et al. Anomaly detection monitoring system for healthcare
Liu et al. Research on application of data mining in hospital management
Yan Application of ID3 Algorithm in Mental Health Education of College Students
Zhang et al. A Deep Learning Method with Multi-view Attention and Multi-branch GCN for BECT Diagnosis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201002