CN114024835B - Abnormality positioning method and device - Google Patents
- Publication number
- CN114024835B (application CN202111285930.2A)
- Authority
- CN
- China
- Prior art keywords
- abnormal
- determining
- probability
- maintenance
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0677—Localisation of faults
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/142—Network analysis or design using statistical or mathematical methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Algebra (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Pure & Applied Mathematics (AREA)
- Debugging And Monitoring (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention relates to an anomaly locating method and device. Relationships between resource objects and influence relationships between indexes are established on a per-scenario basis, the scope of influence and the propagation chain of an anomaly are defined, and the propagation chain is analyzed dynamically using Bayesian network technology. This addresses the complexity and difficulty of locating anomalies quickly and further advances intelligent operation and maintenance.
Description
Technical Field
The invention relates to the field of anomaly location analysis in intelligent operation and maintenance, and in particular to an anomaly locating method and device.
Background
As enterprises complete digitization and networking through technologies such as cloud computing and distributed services, their digital transformation advances. Operation and maintenance teams face larger cluster scales, more service components, and more complex association relationships; the pressure grows daily, and IT operation and maintenance work faces unprecedented challenges. Any single outage may have a significant impact on the company's business. The problem is therefore how to analyze and locate abnormal situations quickly, give early warning before a fault deteriorates further, and then keep digital services running stably and reliably through rapid repair and adjustment.
The most widely used anomaly or fault location technique at present is based on a CMDB (configuration management database) relational model and locates faults through fault tree analysis. This approach has the following defects:
It depends on the coverage and data quality of the CMDB, and the correctness of configuration items and association relationships must be maintained over the long term. After an enterprise's digital transformation to the cloud, the consistency, integrity, and correctness of the data in a CMDB covering a huge number of cloud resources are all doubtful and hard to guarantee. At the same time, CMDB relationships are configured at object-level granularity, which requires great effort and is difficult to refine or configure dynamically in order to help locate root causes and adapt to changing application scenarios. Fault tree analysis has defects of its own. It is generally used for systems with a definite fault mechanism and clear fault logic, because it rests on two assumptions: events are two-state, and logical relationships are deterministic. Events in a fault tree have only two states, faulty and normal, whereas actual operation and maintenance involves multi-state conditions such as anomalies (an elevated index, degraded performance, and so on). In many complex systems there is no definite causal relationship between events, and such relationships are better described probabilistically.
Disclosure of Invention
The invention aims to provide an anomaly locating method and device that address the technical limitations and one-sidedness of fault tree analysis based on CMDB relational-model reasoning. The method establishes resource-object relationships and influence associations between indexes on a per-scenario basis, defines the scope of influence and the propagation chain of an anomaly, and dynamically analyzes the propagation chain using Bayesian network technology, thereby addressing the complexity and difficulty of locating anomalies and further realizing intelligent operation and maintenance.
To achieve the above purpose, the technical scheme of the invention is an anomaly locating method comprising the following steps:
Step S101, determining the objects and monitoring indexes related to operation and maintenance;
Step S102, constructing one or more abnormal propagation chain relation models according to the operation and maintenance scenario;
Step S103, acquiring multidimensional index data of the operation and maintenance objects;
Step S104, preprocessing the acquired multidimensional index data, and dividing the index data into a normal data set and an abnormal data set;
Step S105, analyzing by Bayesian network technology based on the one or more abnormal propagation chain relation models and the data sets;
Step S106, generating the cause of the abnormal event according to the analysis result.
In an embodiment of the present invention, step S101 includes: determining an operation and maintenance object, determining the attributes of the object, determining the monitoring indexes related to the object, and determining the index attributes related to each monitoring index.
In an embodiment of the present invention, step S102 includes:
determining the relationships among the operation and maintenance objects and the influence associations among the monitoring indexes according to different operation and maintenance scenarios, so as to determine the abnormal propagation relationships;
determining the abnormal nodes according to the abnormal propagation relationships;
determining one or more abnormal propagation chain relation models between the abnormal nodes.
In an embodiment of the present invention, step S105 includes:
determining an abnormal scene top event according to the abnormal data set;
constructing a Bayesian network based on the objects and monitoring indexes associated with the abnormal scene top event and the abnormal propagation chain relationships between them;
analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, performing causal reasoning, and calculating the occurrence probability result.
The invention also provides an anomaly locating device, comprising:
a definition module, used for determining the operation and maintenance object, determining the attributes of the object, determining the monitoring indexes related to the object, and determining the index attributes related to each monitoring index;
a relation model module, used for constructing one or more abnormal propagation chain relation models according to the relationships among objects and the influence relationships among indexes in the operation and maintenance scenario;
an acquisition module, used for acquiring the monitoring index data of each dimension of the operation and maintenance object;
a preprocessing module, used for preprocessing the acquired multidimensional monitoring index data and dividing the index data into a normal data set and an abnormal data set;
a construction module, used for constructing a Bayesian network from the objects and monitoring indexes associated with the abnormal scene top event and the abnormal propagation chain relationships between them;
an analysis module, used for analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, performing causal reasoning, and calculating the occurrence probability result;
a positioning module, used for locating the abnormal event according to the result of the analysis module and providing a cause analysis result.
Compared with the prior art, the invention has the following beneficial effects:
(1) Abnormal propagation relationships are established across the operation and maintenance objects of the IAAS, PAAS, and SAAS layers, together with the influence associations among indexes. The abnormal propagation chain relation model can be configured flexibly from different angles according to the actual operation and maintenance scenario. The method captures the knowledge and experience of experts in design, development, deployment, operation and maintenance, and so on.
(2) Anomaly analysis no longer depends on the coverage and data quality of the CMDB, and continuously changing application scenarios can be accommodated. The two-state event and deterministic-logic limitations of fault tree analysis are overcome.
(3) The root cause of an anomaly can be located quickly and accurately, which shortens analysis and location time, saves manpower, improves operation and maintenance efficiency, and reduces the losses that anomalies cause to the enterprise.
Drawings
FIG. 1 is a main flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a Bayesian network diagram in accordance with embodiments of the present invention;
FIG. 3 is another Bayesian network diagram in accordance with an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main modules of the apparatus according to the embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically described below with reference to the accompanying drawings.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application.
First, the terms used herein are explained. It should be noted that the method of the present invention may be applied to anomaly analysis in any IT operation and maintenance scenario; the objects, indexes, anomalies, relational models, etc. listed herein are only examples and do not limit the embodiments of the present invention.
Object: an operation and maintenance resource monitoring target identified through operation and maintenance practice, covering the infrastructure (IAAS) layer, the platform (PAAS) layer, the software service (SAAS) layer, and so on. Examples include IT and peripheral hardware such as servers, storage, switches, routers, cloud resources, security devices, cameras, ONUs, and OLTs; software such as operating systems, databases, middleware, applications, and distributed components; and system-level objects such as business systems and services.
Object attributes: the attribute fields carried by the object itself, such as object name, object code, IP address, port, and the business system to which it belongs.
Index: a parameter that measures the object, namely the states and performance values of the object while it is running.
Index attributes: the attribute fields carried by the index itself, such as index name, threshold, duration, and alarm level.
Relationship model: based on the operation and maintenance scenario, multiple objects are related according to the relationships among objects and the influences among indexes, and one or more anomaly propagation chain topology models are built.
Anomaly: an abnormal condition in the function, parameters, performance, state, etc. of a component object of the system during operation and maintenance, which may degrade the object's function or lead to a fault or failure.
Fault tree analysis: a top-down deductive failure analysis method that uses Boolean logic to combine lower-order events in order to analyze an undesired state of the system. Fault tree analysis is mainly used in safety engineering and reliability engineering to understand the causes of system failure, to find the best way to reduce risk, or to determine the occurrence rate of a given safety accident or specific system failure. It is also used in software engineering, for debugging, and is related to techniques for eliminating the causes of errors.
Bayesian network: also known as a belief network or directed acyclic graph model, it is a probabilistic graphical model. It is one of the most effective theoretical models currently available for representing and reasoning about uncertain knowledge. It is suited to expressing and analyzing uncertain and probabilistic events and to decisions that depend conditionally on multiple controlling factors, and it can draw inferences from incomplete, inaccurate, or uncertain knowledge or information.
Fig. 1 is a flow chart of an anomaly locating method according to an embodiment of the present invention, including the following steps:
Step S101, determining the objects and monitoring indexes related to operation and maintenance.
This step comprises the following substeps: determining an operation and maintenance object, determining the attributes of the object, determining the monitoring indexes related to the object, and determining the index attributes related to each monitoring index.
S1011, determining an operation and maintenance object.
An operation and maintenance object is a resource monitoring target identified through operation and maintenance practice, and may belong to the infrastructure (IAAS) layer, the platform (PAAS) layer, the software service (SAAS) layer, and so on. Examples include IT and peripheral hardware such as servers, storage, switches, routers, cloud resources, security devices, cameras, ONUs, and OLTs; software such as operating systems, databases, middleware, applications, and distributed components; and system-level objects such as business systems and services.
S1012, determining the attributes of the operation and maintenance object.
Object attributes are the attribute fields carried by the object itself. For example, where the object is a database, its attributes may include object code, object name, resource group, IP address, port, software version, and the business system to which it belongs.
S1013, determining the monitoring indexes related to the operation and maintenance object.
The monitoring indexes are the states and performance values monitored while the object is running, and they reflect the health of the object. For example, where the object is a server, the monitoring indexes include CPU usage, memory usage, disk read throughput, disk write throughput, disk partition usage, inbound network bandwidth, outbound network bandwidth, system load, and the like.
S1014, determining the index attributes related to each monitoring index.
Index attributes are the attribute fields carried by the index itself, such as index name, threshold, duration, alarm level, and silence period.
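The substeps of S101 can be pictured as a small data model. The following is a minimal Python sketch, not the patented implementation; the class names `OpsObject` and `Metric`, the field names, and the sample values are all hypothetical and chosen only to mirror the attributes listed above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Metric:
    """A monitoring index with its index attributes (illustrative fields)."""
    name: str
    threshold: float
    duration_s: int = 60            # how long the threshold must be exceeded
    alarm_level: str = "warning"


@dataclass
class OpsObject:
    """An operation and maintenance object with its object attributes."""
    name: str
    code: str
    ip: str
    layer: str                      # "IAAS", "PAAS" or "SAAS"
    business_system: str = ""
    metrics: Dict[str, Metric] = field(default_factory=dict)

    def add_metric(self, metric: Metric) -> None:
        # Register a monitoring index under its name.
        self.metrics[metric.name] = metric


# S1011/S1012: define an object and its attributes.
server = OpsObject(name="app-server-01", code="SRV-001", ip="10.0.0.11", layer="IAAS")
# S1013/S1014: attach monitoring indexes and their attributes.
server.add_metric(Metric("cpu_usage", threshold=0.90, duration_s=300, alarm_level="critical"))
server.add_metric(Metric("memory_usage", threshold=0.85))
```

Keeping index attributes on the `Metric` rather than on the object lets the same index definition carry a different threshold for each object.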
Step S102, constructing one or more abnormal propagation chain relation models according to the operation and maintenance scenario.
According to different operation and maintenance scenarios, and starting from structural topologies such as the physical topology, the logical topology, and the service call chain, the objects, object attributes, relationships among objects, and influence associations among monitoring indexes are configured and defined; the influence relationships between cause and effect nodes are completed; and the abnormal propagation chain is described. Subsequent anomaly analysis and location are then carried out along the abnormal propagation chain relationships.
In the related art, anomalies are often analyzed through association relationships among the configuration attributes of objects. In the present invention, an abnormal propagation chain relation model is constructed from the association relationships between objects and the influence associations between object indexes, according to the operation and maintenance scenario and expert knowledge and experience. In one implementation, occurrence probabilities and weights are determined in combination with actual index data so as to identify, analyze, and locate anomalies.
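One way to hold such a propagation chain model is a directed graph whose nodes are (object, index) pairs and whose edges point from cause to effect. The sketch below is only an illustration under that assumption; the `PropagationModel` class, its method names, and the sample objects are hypothetical and do not come from the patent.

```python
from collections import defaultdict


class PropagationModel:
    """Directed cause -> effect links between (object, index) nodes."""

    def __init__(self):
        self.children = defaultdict(set)   # cause -> set of effects
        self.parents = defaultdict(set)    # effect -> set of causes

    def add_link(self, cause, effect):
        self.children[cause].add(effect)
        self.parents[effect].add(cause)

    def upstream(self, node):
        """Return every node from which an anomaly can propagate to `node`."""
        seen, stack = set(), [node]
        while stack:
            for parent in self.parents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


model = PropagationModel()
model.add_link(("db-01", "disk_io_wait"), ("db-01", "query_latency"))
model.add_link(("db-01", "query_latency"), ("order-service", "response_time"))
model.add_link(("app-server-01", "cpu_usage"), ("order-service", "response_time"))
model.add_link(("order-service", "response_time"), ("portal", "page_load_time"))

top_event = ("portal", "page_load_time")
candidates = model.upstream(top_event)     # nodes on chains leading to the top event
```

Several such models can coexist, for example one per scenario, which matches the "one or more" wording of the step.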
Step S103, acquiring multidimensional index data of the operation and maintenance objects.
The system periodically and comprehensively monitors and collects the running state, health and performance indexes, characteristic variables, and the like of the operation and maintenance objects.
Step S104, preprocessing the acquired multidimensional index data, and dividing the index data into a normal data set and an abnormal data set.
The acquired multidimensional index data are preprocessed, data from the same moment are integrated, and the data are divided into a normal data set and an abnormal data set according to the time at which the anomaly occurred, the index thresholds, and a built-in algorithm.
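The following Python sketch illustrates the two operations named in step S104: integrating readings taken at the same moment, and splitting the result into normal and abnormal sets. It assumes a plain threshold rule only; the "built-in algorithm" mentioned in the text is not specified in the source, and the function names and sample readings here are hypothetical.

```python
def align_by_timestamp(readings):
    """Merge (timestamp, index, value) readings into one row per timestamp."""
    rows = {}
    for ts, metric, value in readings:
        rows.setdefault(ts, {"ts": ts})[metric] = value
    return [rows[ts] for ts in sorted(rows)]


def split_samples(samples, thresholds):
    """Put a row in the abnormal set if any index exceeds its threshold."""
    normal, abnormal = [], []
    for sample in samples:
        exceeded = any(
            sample[metric] > limit
            for metric, limit in thresholds.items()
            if metric in sample
        )
        (abnormal if exceeded else normal).append(sample)
    return normal, abnormal


readings = [
    (1, "cpu_usage", 0.50), (1, "memory_usage", 0.60),
    (2, "cpu_usage", 0.95), (2, "memory_usage", 0.70),
]
thresholds = {"cpu_usage": 0.90, "memory_usage": 0.85}

samples = align_by_timestamp(readings)
normal, abnormal = split_samples(samples, thresholds)
```

A real deployment would likely also honor the duration and silence-period attributes of each index before flagging a row; that is left out to keep the sketch short.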
Step S105, analyzing by Bayesian network technology based on the one or more abnormal propagation chain relation models and the data sets.
This step comprises the following substeps:
S1051, determining the abnormal scene top event according to the abnormal data set.
The abnormal scene top event is the starting point for locating an anomaly. It may be an abnormal event perceived by users, such as slow system access, or an abnormal event that has not yet reached the level of user perception, such as low available capacity on a disk partition.
S1052, constructing a Bayesian network based on the objects and monitoring indexes associated with the abnormal scene top event, the association relationships among the objects, and the abnormal propagation chains describing the influence associations among the object indexes. Based on the monitoring indexes and data related to the abnormal scene top event, a directed acyclic graph model is constructed according to Bayesian network principles from the one or more abnormal propagation chain relation models built in step S102 and from the node objects and index data on the abnormal propagation chains.
S1053, analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, performing causal reasoning, and calculating the occurrence probability result.
Using the constructed directed acyclic graph model and the actual data sets, analysis proceeds from top to bottom according to the Bayesian method with causal reasoning, and the occurrence probability of each node in the directed acyclic graph is calculated.
Step S106, generating the cause of the abnormal event according to the analysis result.
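To make steps S105 and S106 concrete, the sketch below ranks candidate causes by their posterior probability given that the top event has occurred. It is a toy illustration, assuming two independent binary causes and brute-force enumeration rather than any particular inference algorithm; the node names, the priors, and the conditional probability table `cpt_top` are invented for the example.

```python
from itertools import product

# Hypothetical prior probability that each candidate cause is abnormal.
priors = {"db_slow": 0.1, "cpu_high": 0.2}

# Hypothetical P(top event occurs | db_slow, cpu_high), keyed by (0/1, 0/1).
cpt_top = {(0, 0): 0.01, (0, 1): 0.60, (1, 0): 0.70, (1, 1): 0.95}


def rank_causes(priors, cpt_top):
    """Return causes sorted by P(cause abnormal | top event occurred)."""
    names = list(priors)
    joint = {}
    for states in product((0, 1), repeat=len(names)):
        prob = cpt_top[states]                 # P(top = 1 | states)
        for name, state in zip(names, states):
            prob *= priors[name] if state else 1.0 - priors[name]
        joint[states] = prob                   # P(states, top = 1)
    evidence = sum(joint.values())             # P(top = 1)
    posterior = {
        name: sum(p for states, p in joint.items() if states[i]) / evidence
        for i, name in enumerate(names)
    }
    return sorted(posterior.items(), key=lambda item: item[1], reverse=True)


ranking = rank_causes(priors, cpt_top)         # most probable cause first
```

Enumeration grows exponentially with the number of causes, so it only suits small chains; larger networks call for message passing of the kind described in the remainder of this section.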
In addition, the Bayesian network analysis method used in step S105 is as follows:
Given a joint probability distribution $P(X_1, X_2, \ldots, X_n)$ and an ordering $d$ of the variables, start with $X_1$ as the root node and assign $X_1$ the prior probability distribution $P(X_1)$.
Next, represent $X_2$ by a node. If $X_2$ is related to $X_1$, draw an edge from $X_1$ to $X_2$ and quantify its strength by $P(X_2 \mid X_1)$. If $X_2$ is independent of $X_1$, assign $X_2$ the prior probability distribution $P(X_2)$. At level $i$, draw a set of directed edges from the parent node set $\mathrm{pa}(X_i)$ of $X_i$ to $X_i$ and quantify them by the conditional probability $P(X_i \mid \mathrm{pa}(X_i))$. The result is a directed acyclic graph that represents many of the independence relationships embodied in $P(X_1, X_2, \ldots, X_n)$; it is referred to as a Bayesian network.
Conversely, the conditional probabilities $P(X_i \mid \mathrm{pa}(X_i))$ contain all the information necessary to reconstruct the original distribution function. Under the ordering $d$, the following relationship holds:
$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{pa}(X_i))$$
FIG. 2 is an example of a typical Bayesian network with joint distribution functions of:
P(X1,X2,X3,X4,X5,X6)=P(X6|X5)·P(X5|X3,X2)·P(X4|X2,X1)·P(X3|X1)·P(X2|X1)·P(X1)
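The factorization above can be checked numerically. The Python sketch below encodes the parent sets read off that formula and multiplies one conditional probability per node. The conditional probability rule `p_true` is made up for illustration (the source gives no numbers for FIG. 2), and all nodes are assumed to be binary.

```python
from itertools import product

# Parent sets taken from the factorization of P(X1, ..., X6) above.
parents = {
    "X1": (),
    "X2": ("X1",),
    "X3": ("X1",),
    "X4": ("X2", "X1"),
    "X5": ("X3", "X2"),
    "X6": ("X5",),
}


def p_true(node, parent_states):
    """Hypothetical P(node = 1 | parents): 0.1 plus 0.4 per active parent."""
    return 0.1 + 0.4 * sum(parent_states)


def joint(assignment):
    """P(X1, ..., X6) as the product of one conditional factor per node."""
    prob = 1.0
    for node, node_parents in parents.items():
        p1 = p_true(node, [assignment[p] for p in node_parents])
        prob *= p1 if assignment[node] else 1.0 - p1
    return prob


nodes = list(parents)
# Summing the joint over all 2**6 assignments should give 1 (up to rounding).
total = sum(
    joint(dict(zip(nodes, states)))
    for states in product((0, 1), repeat=len(nodes))
)
```

That `total` comes out at 1 for any valid choice of conditional probabilities is what makes the product of local factors a proper joint distribution.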
Once the network is established, it can serve as the computational structure for inference. The belief is defined as $\mathrm{Bel}(x) = P(x \mid e)$, i.e. the conditional probability that an event $x$ occurs given known evidence $e$, reflecting the probability that a certain event occurs in a certain environment.
The evidence $e$ may be expressed as $e = e_X^- \cup e_X^+$, where $e_X^-$ is the evidence in the subtree with $X$ as its root node and $e_X^+$ is the evidence in the rest of the tree. The belief may then be expressed as
$$\mathrm{Bel}(x) = P(x \mid e_X^-, e_X^+) = \alpha P(e_X^- \mid e_X^+, x) \cdot P(x \mid e_X^+) = \alpha P(e_X^- \mid x) \cdot P(x \mid e_X^+)$$
where $\alpha = [P(e_X^- \mid e_X^+)]^{-1}$ is a normalization factor. Let $\lambda(x) = P(e_X^- \mid x)$ denote the support for diagnosis and $\pi(x) = P(x \mid e_X^+)$ denote the support for forecasting; then $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$. In the actual reasoning process, the network beliefs are updated when observation evidence $e$ arrives.
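The update rule Bel(x) = α λ(x) π(x) amounts to an element-wise product followed by normalization. A minimal Python sketch for a single node follows; the function name `belief` and the numeric vectors are illustrative assumptions, with index 0 standing for "normal" and index 1 for "abnormal".

```python
def belief(lam, pi):
    """Combine diagnostic support lam and predictive support pi for one node."""
    unnormalized = [l * p for l, p in zip(lam, pi)]
    alpha = 1.0 / sum(unnormalized)        # normalization factor
    return [alpha * u for u in unnormalized]


pi = [0.4, 0.6]                            # predictive support, here the prior

# No evidence yet: lam is all ones, so the belief equals the prior.
prior_belief = belief([1.0, 1.0], pi)

# Evidence below the node favors the abnormal state.
updated_belief = belief([0.2, 0.9], pi)
```

With λ set to all ones the belief reduces to π, which matches the initial condition in which nothing has been observed.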
Fig. 3 is another example bayesian network:
In a Bayesian network, calculating the occurrence probability of the top event is equivalent to updating beliefs. Assume that the probability distributions of the basic events are:
$P(X_1) = 0.5$, $P(X_2) = 0.6$, $P(X_3) = 0.7$, $P(X_4) = 0.8$, $P(X_5) = 0.9$, which correspond to the prior probability distributions of the root nodes in the network. Under initial conditions, where nothing is known, the belief distribution in the network is simply the prior probability distribution: every node has $\lambda = 1$, $\pi$ propagates from bottom to top, and each node's support value for forecasting equals its prior probability distribution.
Based on the network belief propagation update rule $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$, the belief value of the top event $T$, i.e. the occurrence probability of the top event, can be calculated.
The calculation of the belief value of each node in the network is described below, taking node $S_f$ as an example. Here $\pi(X_2) = P(X_2) = 0.6$ and $\pi(X_5) = P(X_5) = 0.9$, so $\pi_{S_f}(X_2) = \pi(X_2) = 0.6$ and $\pi_{S_f}(X_5) = \pi(X_5) = 0.9$. Since $X_2$ and $X_5$ are connected to $S_f$ through a logic-gate relationship, the coupling strength $P(S_f \mid X_2, X_5)$ can be obtained (the value $0.54 = 0.6 \times 0.9$ below is the product expected for an AND gate over independent inputs).
Then $\mathrm{Bel}(S_f) = \alpha \lambda(S_f) \pi(S_f) = \alpha (0.54)$; $\alpha$ is eliminated by normalization, giving $\mathrm{Bel}(S_f) = 0.54$. Following the transfer relations among the nodes in the network, the belief value of each node can be calculated, including the belief value of the top event $T$, i.e. the occurrence probability of the top event $T$: $P(T) = \mathrm{Bel}(T) = 0.83$.
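The value Bel(S_f) = 0.54 equals 0.6 × 0.9, which is what a deterministic AND gate over two independent inputs yields; an OR gate over the same inputs would give 0.96. The Python sketch below reproduces both figures. It assumes independent binary inputs and deterministic gates, since FIG. 3 and the coupling table for S_f are not reproduced here; the helper names are hypothetical.

```python
from itertools import product


def gate_probability(gate, input_probs):
    """P(gate output = 1) for independent binary inputs with given P(input = 1)."""
    total = 0.0
    for states in product((0, 1), repeat=len(input_probs)):
        weight = 1.0
        for state, prob in zip(states, input_probs):
            weight *= prob if state else 1.0 - prob
        total += gate(states) * weight     # gate(states) is 0.0 or 1.0
    return total


def and_gate(states):
    return float(all(states))


def or_gate(states):
    return float(any(states))


inputs = [0.6, 0.9]                        # pi(X2) and pi(X5) from the text
p_and = gate_probability(and_gate, inputs)
p_or = gate_probability(or_gate, inputs)
```

Replacing the deterministic gate with a general conditional probability table only requires `gate` to return a probability between 0 and 1 for each input combination.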
FIG. 4 shows an anomaly locating device according to an embodiment of the present invention, comprising:
A definition module, configured to: determine an operation and maintenance object, determine the attributes of the object, determine the monitoring indexes related to the object, and determine the index attributes related to each monitoring index.
A relation model module, configured to: construct one or more abnormal propagation chain relation models according to the relationships among objects and the influence relationships among indexes in the operation and maintenance scenario.
An acquisition module, configured to: acquire the monitoring index data of each dimension of the operation and maintenance object.
A preprocessing module, configured to: preprocess the acquired multidimensional monitoring index data and divide the index data into a normal data set and an abnormal data set.
A construction module, configured to: construct a Bayesian network from the objects and monitoring indexes associated with the abnormal scene top event and the abnormal propagation chain relationships between them.
An analysis module, configured to: analyze from top to bottom using the Bayesian network, the normal data set and the abnormal data set, perform causal reasoning, and calculate the occurrence probability result.
A positioning module, configured to: locate the abnormal event according to the result of the analysis module and provide an analysis result of the cause of the anomaly.
The above examples of implementations of the invention are presented to make the spirit of the invention clearer and easier to understand and are not intended to limit the invention. All modifications, substitutions, combinations, and improvements made within the spirit and principle of the invention are intended to be included within the protection scope of the invention as outlined in the appended claims.
The above is a preferred embodiment of the present invention. All changes made according to the technical solution of the present invention belong to the protection scope of the present invention, provided that the resulting functional effects do not exceed the scope of the technical solution.
Claims (4)
1. An anomaly locating method, characterized by comprising the following steps:
Step S101, determining the objects and monitoring indexes related to operation and maintenance;
Step S102, constructing one or more abnormal propagation chain relation models according to the operation and maintenance scenario;
Step S103, acquiring multidimensional index data of the operation and maintenance objects;
Step S104, preprocessing the acquired multidimensional index data, and dividing the index data into a normal data set and an abnormal data set;
Step S105, analyzing by Bayesian network technology based on the one or more abnormal propagation chain relation models and the data sets;
Step S106, generating the cause of the abnormal event according to the analysis result;
wherein step S105 includes:
determining an abnormal scene top event according to the abnormal data set;
constructing a Bayesian network based on the objects and monitoring indexes associated with the abnormal scene top event and the abnormal propagation chain relationships between them;
analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, performing causal reasoning, and calculating the occurrence probability result; the specific implementation is as follows:
given a joint probability distribution $P(X_1, X_2, \ldots, X_n)$ and an ordering $d$ of the variables, starting with $X_1$ as the root node and assigning $X_1$ the prior probability distribution $P(X_1)$;
then representing $X_2$ by a node: if $X_2$ is related to $X_1$, an edge is established from $X_1$ to $X_2$, and the edge strength is represented by $P(X_2 \mid X_1)$; if $X_2$ is independent of $X_1$, $X_2$ is assigned the prior probability distribution $P(X_2)$; at level $i$, a set of directed edges is drawn from the parent node set $\mathrm{Pa}(X_i)$ of $X_i$ to $X_i$ and quantified by the conditional probability $P(X_i \mid \mathrm{Pa}(X_i))$; the result is a directed acyclic graph representing many of the independence relationships embodied in $P(X_1, X_2, \ldots, X_n)$, referred to as a Bayesian network;
conversely, the conditional probabilities $P(X_i \mid \mathrm{Pa}(X_i))$ contain all the information necessary for reconstructing the original distribution function, and under the ordering $d$ the Bayesian network has the following joint distribution function:
$$P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{Pa}(X_i));$$
defining the belief as $\mathrm{Bel}(x) = P(x \mid e)$, i.e. the conditional probability that event $x$ occurs given known evidence $e$, which reflects the probability of occurrence of an event under a given environment;
$e$ is written as $e = e_X^- \cup e_X^+$, where $e_X^-$ is the evidence in the subtree with $X$ as the root node and $e_X^+$ is the evidence in the rest of the tree; the belief is then expressed as:
$$\mathrm{Bel}(x) = P(x \mid e_X^-, e_X^+) = \alpha P(e_X^- \mid e_X^+, x) \cdot P(x \mid e_X^+) = \alpha P(e_X^- \mid x) \cdot P(x \mid e_X^+)$$
where $\alpha = [P(e_X^- \mid e_X^+)]^{-1}$ is a normalization factor; letting $\lambda(x) = P(e_X^- \mid x)$ denote the support for diagnosis and $\pi(x) = P(x \mid e_X^+)$ denote the support for prognosis, $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$;
in the actual reasoning process, the network beliefs are updated whenever observation evidence $e$ arrives; in the Bayesian network, calculating the occurrence probability of the abnormal scene top event is equivalent to a belief update;
in the initial state, where no conditions are known, the belief distribution in the network is simply the prior probability distribution: every node has $\lambda = 1$, $\pi$ propagates from bottom to top, and each node's support value for prognosis equals its prior probability distribution;
according to the network belief propagation update rule $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$, the belief value of the abnormal scene top event $T$ is calculated, which is the occurrence probability of the abnormal scene top event.
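As an illustration of the belief update recited in claim 1, the following minimal Python sketch evaluates Bel(x) = αλ(x)π(x) on a three-node chain A → X → B and compares it with direct enumeration of the factorized joint distribution. The chain structure, the variable names, and every probability value are hypothetical and are not taken from the patent; the sketch illustrates the formula only and should not be read as the claimed implementation.

```python
# Hypothetical binary chain A -> X -> B (all numbers are illustrative assumptions).
P_A = {0: 0.9, 1: 0.1}                                        # prior P(A)
P_X_GIVEN_A = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}    # P_X_GIVEN_A[a][x]
P_B_GIVEN_X = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.3, 1: 0.7}}    # P_B_GIVEN_X[x][b]


def joint(a, x, b):
    """Joint probability from the factorization P(a) * P(x|a) * P(b|x)."""
    return P_A[a] * P_X_GIVEN_A[a][x] * P_B_GIVEN_X[x][b]


def belief(a_obs, b_obs):
    """Bel(x) = alpha * lambda(x) * pi(x) with e+ = {A = a_obs}, e- = {B = b_obs}."""
    pi = {x: P_X_GIVEN_A[a_obs][x] for x in (0, 1)}    # pi(x) = P(x | e+)
    lam = {x: P_B_GIVEN_X[x][b_obs] for x in (0, 1)}   # lambda(x) = P(e- | x)
    unnormalized = {x: lam[x] * pi[x] for x in (0, 1)}
    alpha = 1.0 / sum(unnormalized.values())           # alpha = 1 / P(e- | e+)
    return {x: alpha * unnormalized[x] for x in (0, 1)}


def belief_by_enumeration(a_obs, b_obs):
    """Reference value: P(x | a_obs, b_obs) computed directly from the joint."""
    scores = {x: joint(a_obs, x, b_obs) for x in (0, 1)}
    total = sum(scores.values())
    return {x: score / total for x, score in scores.items()}


def prior_belief():
    """Initial state with no evidence: lambda(x) = 1, so Bel(x) is the prior of X."""
    return {x: sum(P_A[a] * P_X_GIVEN_A[a][x] for a in (0, 1)) for x in (0, 1)}
```

Calling `belief(1, 1)` and `belief_by_enumeration(1, 1)` should return the same distribution, which is a quick check that the λ/π decomposition agrees with the joint distribution function given above.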
2. The abnormality locating method according to claim 1, characterized in that said step S101 includes: determining the operation and maintenance object, determining the attributes of the operation and maintenance object, determining the monitoring indexes related to the operation and maintenance object, and determining the index attributes related to the monitoring indexes.
3. The abnormality locating method according to claim 1, characterized in that said step S102 includes:
determining the relations among the operation and maintenance objects and the influence associations among the monitoring indexes according to different operation and maintenance scenarios, so as to determine an abnormal propagation relationship;
determining abnormal nodes according to the abnormal propagation relationship;
determining one or more abnormal propagation chain relation models between the abnormal nodes.
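One way to picture the abnormal propagation chain relation models of claim 3 is as a directed graph in which an edge means "an abnormality here can propagate to there". The Python sketch below enumerates every chain that ends at a given top event. The function name `propagation_chains`, the adjacency-dictionary representation, and the node names in the usage note are all hypothetical; the claims do not prescribe any data structure.

```python
def propagation_chains(edges, top_event):
    """Enumerate propagation chains that end at top_event.

    edges maps each node to the list of downstream nodes it can affect.
    Each returned chain is ordered from a node with no upstream influence
    down to top_event. The graph is assumed to be acyclic.
    """
    # Build the reverse map: node -> upstream nodes that can affect it.
    upstream = {}
    for source, targets in edges.items():
        for target in targets:
            upstream.setdefault(target, []).append(source)

    chains = []

    def walk(node, path):
        parents = upstream.get(node, [])
        if not parents:
            # No upstream influence: this node starts a chain.
            chains.append(list(reversed(path)))
            return
        for parent in parents:
            if parent in path:
                continue  # defensive guard against accidental cycles
            walk(parent, path + [parent])

    walk(top_event, [top_event])
    return chains
```

For example, with `edges = {"disk_io_wait": ["db_latency"], "db_latency": ["api_latency"], "cpu_usage": ["api_latency"], "api_latency": ["order_failure_rate"]}`, calling `propagation_chains(edges, "order_failure_rate")` yields one chain starting at `disk_io_wait` and another starting at `cpu_usage`.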
4. An abnormality locating device, characterized by comprising:
the definition module is used for determining the operation and maintenance object, determining the attributes of the operation and maintenance object, determining the monitoring indexes related to the operation and maintenance object, and determining the index attributes related to the monitoring indexes;
the relation model module is used for constructing one or more abnormal propagation chain relation models according to the relations among objects and the influence relations among indexes in the operation and maintenance scenario;
the acquisition module is used for acquiring the monitoring index data of each dimension of the operation and maintenance object;
the preprocessing module is used for preprocessing the acquired multidimensional monitoring index data and dividing the index data into a normal data set and an abnormal data set;
the construction module is used for constructing a Bayesian network from the objects and monitoring indexes associated with the abnormal scene top event and the abnormal propagation chain relations between those objects and monitoring indexes;
the analysis module is used for analyzing from the top downward by using the Bayesian network, the normal data set and the abnormal data set, inferring causal relationships, and calculating an occurrence probability result;
the positioning module is used for locating the abnormal event according to the result of the analysis module and providing a cause analysis result;
wherein analyzing from the top downward with the Bayesian network, the normal data set and the abnormal data set, inferring causal relationships, and calculating the occurrence probability result are implemented as follows:
given a joint probability distribution $P(X_1, X_2, \ldots, X_n)$ and an ordering $d$ of the variables, starting with $X_1$ as the root node and assigning $X_1$ the prior probability distribution $P(X_1)$;
then representing $X_2$ by a node: if $X_2$ is related to $X_1$, an edge is established from $X_1$ to $X_2$, and the edge strength is represented by $P(X_2 \mid X_1)$; if $X_2$ is independent of $X_1$, $X_2$ is assigned the prior probability distribution $P(X_2)$; at level $i$, a set of directed edges is drawn from the parent node set $\mathrm{Pa}(X_i)$ of $X_i$ to $X_i$ and quantified by the conditional probability $P(X_i \mid \mathrm{Pa}(X_i))$; the result is a directed acyclic graph representing many of the independence relationships embodied in $P(X_1, X_2, \ldots, X_n)$, referred to as a Bayesian network;
conversely, the conditional probabilities $P(X_i \mid \mathrm{Pa}(X_i))$ contain all the information necessary for reconstructing the original distribution function, and under the ordering $d$ the Bayesian network has the following joint distribution function:
$$P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{Pa}(X_i));$$
defining the belief as $\mathrm{Bel}(x) = P(x \mid e)$, i.e. the conditional probability that event $x$ occurs given known evidence $e$, which reflects the probability of occurrence of an event under a given environment;
$e$ is written as $e = e_X^- \cup e_X^+$, where $e_X^-$ is the evidence in the subtree with $X$ as the root node and $e_X^+$ is the evidence in the rest of the tree; the belief is then expressed as:
$$\mathrm{Bel}(x) = P(x \mid e_X^-, e_X^+) = \alpha P(e_X^- \mid e_X^+, x) \cdot P(x \mid e_X^+) = \alpha P(e_X^- \mid x) \cdot P(x \mid e_X^+)$$
where $\alpha = [P(e_X^- \mid e_X^+)]^{-1}$ is a normalization factor; letting $\lambda(x) = P(e_X^- \mid x)$ denote the support for diagnosis and $\pi(x) = P(x \mid e_X^+)$ denote the support for prognosis, $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$;
in the actual reasoning process, the network beliefs are updated whenever observation evidence $e$ arrives; in the Bayesian network, calculating the occurrence probability of the abnormal scene top event is equivalent to a belief update;
in the initial state, where no conditions are known, the belief distribution in the network is simply the prior probability distribution: every node has $\lambda = 1$, $\pi$ propagates from bottom to top, and each node's support value for prognosis equals its prior probability distribution;
according to the network belief propagation update rule $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$, the belief value of the abnormal scene top event $T$ is calculated, which is the occurrence probability of the abnormal scene top event.
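The preprocessing module recited in claim 4 divides index data into a normal data set and an abnormal data set, but the claims do not state which rule performs the split. Purely as an illustration, the Python sketch below applies a z-score threshold against a baseline window. The function name `split_index_data`, the baseline window, and the default limit of 3.0 are assumptions introduced here, not features of the patent.

```python
from statistics import mean, pstdev


def split_index_data(samples, baseline, z_limit=3.0):
    """Split samples into (normal, abnormal) lists using a z-score rule.

    baseline is a window of values assumed to reflect healthy behaviour;
    a sample is treated as abnormal when it lies more than z_limit
    population standard deviations from the baseline mean.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    normal, abnormal = [], []
    for value in samples:
        if sigma == 0:
            # Constant baseline: any deviation at all counts as abnormal.
            is_abnormal = value != mu
        else:
            is_abnormal = abs(value - mu) / sigma > z_limit
        (abnormal if is_abnormal else normal).append(value)
    return normal, abnormal
```

In practice the split could just as well come from fixed alert thresholds or labelled incident records; the z-score rule is used here only because it needs no extra inputs.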
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111285930.2A CN114024835B (en) | 2021-11-02 | 2021-11-02 | Abnormality positioning method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111285930.2A CN114024835B (en) | 2021-11-02 | 2021-11-02 | Abnormality positioning method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114024835A CN114024835A (en) | 2022-02-08 |
CN114024835B true CN114024835B (en) | 2024-09-20 |
Family
ID=80059613
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111285930.2A Active CN114024835B (en) | 2021-11-02 | 2021-11-02 | Abnormality positioning method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114024835B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102255764A (en) * | 2011-09-02 | 2011-11-23 | 广东省电力调度中心 | Method and device for diagnosing transmission network failure |
CN112579402A (en) * | 2020-12-14 | 2021-03-30 | 中国建设银行股份有限公司 | Method and device for positioning faults of application system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5960029B2 (en) * | 2012-10-31 | 2016-08-02 | 住友重機械工業株式会社 | Abnormal cause identification system |
CN109409411B (en) * | 2018-09-28 | 2020-11-03 | 东软集团股份有限公司 | Problem positioning method and device based on operation and maintenance management and storage medium |
CN110490433A (en) * | 2019-07-30 | 2019-11-22 | 同济大学 | A kind of train control system methods of risk assessment |
CN111368888B (en) * | 2020-02-25 | 2022-07-01 | 重庆邮电大学 | Service function chain fault diagnosis method based on deep dynamic Bayesian network |
CN112039695A (en) * | 2020-08-19 | 2020-12-04 | 朔黄铁路发展有限责任公司肃宁分公司 | Transmission network fault positioning method and device based on Bayesian inference |
Also Published As
Publication number | Publication date |
---|---|
CN114024835A (en) | 2022-02-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112769605B (en) | Heterogeneous multi-cloud operation and maintenance management method and hybrid cloud platform | |
CN114465874B (en) | Fault prediction method, device, electronic equipment and storage medium | |
CN113516174B (en) | Call chain abnormality detection method, computer device, and readable storage medium | |
US10616040B2 (en) | Managing network alarms | |
CN112199252B (en) | Abnormality monitoring method and device and electronic equipment | |
CN116166505B (en) | Monitoring platform, method, storage medium and equipment for dual-state IT architecture in financial industry | |
CN114064196A (en) | System and method for predictive assurance | |
CN112612671B (en) | System monitoring method, device, equipment and storage medium | |
CN115514619B (en) | Alarm convergence method and system | |
CN112559237A (en) | Operation and maintenance system troubleshooting method and device, server and storage medium | |
CN112379325A (en) | Fault diagnosis method and system for intelligent electric meter | |
CN111913824B (en) | Method for determining data link fault cause and related equipment | |
CN115470025A (en) | Intelligent root cause analysis method, device, medium and equipment in distributed cloud scene | |
CN114024835B (en) | Abnormality positioning method and device | |
CN112579402B (en) | Method and device for positioning faults of application system | |
KR20080087571A (en) | Context prediction system and method thereof | |
CN115150289A (en) | Exception handling method and system based on composite monitoring | |
CN114819367A (en) | Public service platform based on industrial internet | |
CN116522213A (en) | Service state level classification and classification model training method and electronic equipment | |
CN118520405B (en) | Cloud data platform comprehensive service management system and method based on artificial intelligence | |
CN117439899B (en) | Communication machine room inspection method and system based on big data | |
Liu | Enhanced Optimization of Computer Network Connection Based on Neural Network Algorithm | |
CN118590396A (en) | Multi-controller communication method and system based on hybrid tree topology | |
CN117544475A (en) | Alarm data processing method and device of wavelength division system, medium and electronic equipment | |
CN118101422A (en) | Business abnormality management method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||