
CN114024835B - Abnormality positioning method and device - Google Patents

Abnormality positioning method and device

Info

Publication number
CN114024835B
CN114024835B CN202111285930.2A CN202111285930A CN114024835B CN 114024835 B CN114024835 B CN 114024835B CN 202111285930 A CN202111285930 A CN 202111285930A CN 114024835 B CN114024835 B CN 114024835B
Authority
CN
China
Prior art keywords
abnormal
determining
probability
maintenance
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111285930.2A
Other languages
Chinese (zh)
Other versions
CN114024835A (en)
Inventor
陈贵
邹岚
李潇儒
林双
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Youke Communication Technology Co ltd
Original Assignee
China Youke Communication Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Youke Communication Technology Co ltd
Priority to CN202111285930.2A
Publication of CN114024835A
Application granted
Publication of CN114024835B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0677: Localisation of faults
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/142: Network analysis or design using statistical or mathematical methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to an anomaly locating method and device. Relations between resource objects and influence relations between indexes are established on a per-scenario basis, the influence range and propagation chain of an anomaly are defined, and the propagation chain is analyzed dynamically using Bayesian network technology. This addresses the complexity and difficulty of locating anomalies quickly and further advances intelligent operation and maintenance.

Description

Abnormality positioning method and device
Technical Field
The invention relates to the field of intelligent operation and maintenance abnormality positioning analysis, in particular to an abnormality positioning method and device.
Background
As enterprises complete digitization and networking through technologies such as cloud computing and distributed services, their digital transformation advances. Operation and maintenance teams face larger cluster scales, more service components and more complex association relationships; the pressure grows day by day and brings unprecedented challenges to IT operation and maintenance work. Any single outage may have a significant impact on the business of the company. The problem is therefore how to analyze and locate abnormal situations quickly, give early warning before a fault deteriorates further, and then ensure stable and reliable operation of digital services through rapid repair and adjustment.
The most widely used anomaly or fault locating technique at present is based on a CMDB (configuration management database) relation model and uses fault tree analysis to analyze and locate faults. This scheme has the following shortcomings:
It depends on the management scope and data quality of the CMDB, and the correctness of configuration items and association relationships must be maintained over the long term. After an enterprise moves to the cloud as part of its digital transformation, the consistency, completeness and correctness of the data in a huge cloud-resource CMDB are all doubtful and hard to guarantee. In addition, CMDB relationships are configured at object-level granularity, which takes great effort and is hard to refine or configure dynamically in order to locate the root cause of a problem and adapt to changing application scenarios.
Fault tree analysis also has shortcomings. It is generally used for systems with a definite fault mechanism and clear fault logic, because it rests on two assumptions: that event states are binary and that logical relationships are deterministic. Events in a fault tree have only two states, faulty and normal, whereas actual operation and maintenance involves multiple states such as anomalies (an elevated index, reduced performance, and so on). In many complex systems there is no definite causal relationship between events, and such relationships are better described probabilistically.
Disclosure of Invention
The invention aims to provide an anomaly locating method and device that address the technical limitations and one-sidedness of fault tree analysis based on CMDB relation-model reasoning. On a per-scenario basis, the method establishes the relations between resource objects and the influence associations between indexes, defines the influence range and propagation chain of an anomaly, and analyzes the propagation chain dynamically using Bayesian network technology, so as to address the complexity and difficulty of locating anomalies and further advance intelligent operation and maintenance.
To achieve the above purpose, the technical scheme of the invention is as follows. An anomaly locating method comprises the following steps:
Step S101, determining the objects and monitoring indexes related to operation and maintenance;
Step S102, constructing one or more abnormal propagation chain relation models according to the operation and maintenance scenarios;
Step S103, acquiring multidimensional index data of the operation and maintenance objects;
Step S104, preprocessing the acquired multidimensional index data, and dividing the index data into a normal data set and an abnormal data set;
Step S105, analyzing by Bayesian network technology based on the one or more abnormal propagation chain relation models and the data sets;
Step S106, generating the cause result of the abnormal event according to the analysis result.
In an embodiment of the present invention, step S101 includes: determining an operation and maintenance object, determining the attributes of the operation and maintenance object, determining the monitoring indexes related to the operation and maintenance object, and determining the index attributes related to the monitoring indexes.
In an embodiment of the present invention, step S102 includes:
determining the relations among the operation and maintenance objects and the influence associations among the monitoring indexes according to different operation and maintenance scenarios, so as to determine an abnormal propagation relationship;
determining abnormal nodes according to the abnormal propagation relationship;
determining one or more abnormal propagation chain relation models between the abnormal nodes.
In an embodiment of the present invention, step S105 includes:
determining an abnormal scene top event according to the abnormal data set;
constructing a Bayesian network based on the objects and monitoring indexes associated with the abnormal scene top event and on the abnormal propagation chain relations between them;
analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, reasoning over the causal relationships, and calculating the occurrence probability result.
The invention also provides an abnormality locating device, comprising:
a definition module, used for determining the operation and maintenance object, determining the attributes of the operation and maintenance object, determining the monitoring indexes related to the operation and maintenance object, and determining the index attributes related to the monitoring indexes;
a relation model module, used for constructing one or more abnormal propagation chain relation models according to the relations among objects and the influence relations among indexes in the operation and maintenance scenario;
an acquisition module, used for acquiring the monitoring index data of each dimension of the operation and maintenance object;
a preprocessing module, used for preprocessing the acquired multidimensional monitoring index data and dividing the index data into a normal data set and an abnormal data set;
a construction module, used for constructing a Bayesian network from the objects and monitoring indexes associated with the abnormal scene top event and from the abnormal propagation chain relations between them;
an analysis module, used for analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, reasoning over the causal relationships, and calculating the occurrence probability result;
a positioning module, used for locating the abnormal event according to the result of the analysis module and providing a cause analysis result.
Compared with the prior art, the invention has the following beneficial effects:
(1) Abnormal propagation relationships are established across the operation and maintenance objects of the IaaS, PaaS and SaaS layers, together with the influence associations among indexes; the abnormal propagation chain relation model can be configured flexibly from different angles according to the actual operation and maintenance scenario; and the knowledge and experience of experts in design, development, deployment, operation and maintenance are captured and accumulated.
(2) The dependence of anomaly analysis on the management scope and data quality of the CMDB is removed, and continuously changing application scenario requirements are met; the limitations of binary event states and deterministic logical relationships in fault tree analysis are overcome.
(3) The root cause of an anomaly can be located quickly and accurately, which shortens anomaly analysis and locating time, saves manpower, improves operation and maintenance efficiency, and reduces the losses that anomalies cause to enterprises.
Drawings
FIG. 1 is a main flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a Bayesian network diagram in accordance with embodiments of the present invention;
FIG. 3 is another Bayesian network diagram in accordance with an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main modules of the apparatus according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically described below with reference to the accompanying drawings.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application.
First, the terms used herein are explained. It should be noted that the method of the present invention may be applied to anomaly analysis in any IT operation and maintenance scenario; the objects, indexes, anomalies, relation models, etc. listed herein are only examples and do not limit the embodiments of the present invention.
Object: an operation and maintenance resource monitoring target identified from operation and maintenance practice, covering the infrastructure (IaaS) layer, the platform (PaaS) layer, the software service (SaaS) layer, and so on. Examples include IT and peripheral hardware such as servers, storage, switches, routers, cloud resources, security devices, cameras, ONUs and OLTs; software such as operating systems, databases, middleware, applications and distributed components; and system-level objects such as business systems and services.
Object attribute: an attribute field carried by the object itself, such as object name, object code, IP address, port, the business system it belongs to, etc.
Index: a parameter that measures an object, namely the various states and performance values of the object while it is running.
Index attribute: an attribute field carried by the index itself, such as index name, threshold, duration, alarm level, etc.
Relation model: based on an operation and maintenance scenario, multiple objects are associated according to the relations among the objects and the influences among the indexes, and one or more abnormal propagation chain topology models are built.
Abnormality: an abnormal condition in the function, parameters, performance, state, etc. of a component object of the system during operation and maintenance, which may degrade the function of the object or cause it to fail.
Fault tree analysis: a top-down deductive failure analysis method that uses Boolean logic to combine lower-level events in order to analyze an undesired state of a system. Fault tree analysis is used mainly in safety engineering and reliability engineering to understand the causes of system failure, to find the best way to reduce risk, or to determine the occurrence rate of a safety accident or a specific system failure. It is also used in software engineering for debugging and is related to techniques for eliminating the causes of errors.
Bayesian network: also known as a belief network or directed acyclic graphical model, it is a probabilistic graphical model. It is one of the most effective theoretical models currently available for expressing and reasoning about uncertain knowledge. It is suited to expressing and analyzing uncertain and probabilistic events and to decisions that depend conditionally on multiple controlling factors, and it can draw inferences from incomplete, inaccurate or uncertain knowledge or information.
FIG. 1 is a flow chart of an anomaly locating method according to an embodiment of the present invention, which includes the following steps:
Step S101, determining the objects and monitoring indexes related to operation and maintenance.
This step comprises the following substeps: determining an operation and maintenance object, determining the attributes of the operation and maintenance object, determining the monitoring indexes related to the operation and maintenance object, and determining the index attributes related to the monitoring indexes.
S1011, determining an operation and maintenance object
The operation and maintenance resource monitoring targets identified from operation and maintenance practice may be objects in the infrastructure (IaaS) layer, the platform (PaaS) layer, the software service (SaaS) layer, and so on. Examples include IT and peripheral hardware such as servers, storage, switches, routers, cloud resources, security devices, cameras, ONUs and OLTs; software such as operating systems, databases, middleware, applications and distributed components; and system-level objects such as business systems and services.
S1012, determining the attributes of the operation and maintenance object
An object attribute is an attribute field carried by the object itself. For example, where the object is a database, the object attributes may be object code, object name, resource grouping, IP address, port, software version, the business system it belongs to, etc.
S1013, determining the monitoring indexes related to the operation and maintenance object
A monitoring index is a monitored state or performance value of an operation and maintenance object during operation and reflects the health of the object. For example, where the object is a server, the monitoring indexes include CPU usage, memory usage, disk read throughput, disk write throughput, disk partition usage, inbound network bandwidth, outbound network bandwidth, system load, and the like.
S1014, determining the index attributes related to the monitoring indexes
An index attribute is an attribute field carried by the index itself, such as index name, threshold, duration, alarm level, silence period, etc.
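The substeps S1011 to S1014 can be pictured as a small data model. The following is a minimal sketch only; the class names `OpsObject` and `IndexAttribute`, the field names, and all example values are assumptions made for illustration and are not defined by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class IndexAttribute:
    """Attribute fields of one monitoring index (S1014); names are hypothetical."""
    name: str
    threshold: float
    duration_s: int          # how long the threshold must be exceeded
    alarm_level: str
    silence_period_s: int = 0


@dataclass
class OpsObject:
    """An operation and maintenance object with its attributes (S1011, S1012)."""
    code: str
    name: str
    ip_address: str
    port: int
    business_system: str
    indexes: dict[str, IndexAttribute] = field(default_factory=dict)

    def add_index(self, attr: IndexAttribute) -> None:
        # Register a monitoring index for this object (S1013).
        self.indexes[attr.name] = attr


# Illustrative values only.
db = OpsObject("DB-001", "orders-db", "10.0.0.12", 3306, "order-system")
db.add_index(IndexAttribute("cpu_usage", threshold=90.0, duration_s=300, alarm_level="major"))
db.add_index(IndexAttribute("disk_partition_usage", threshold=85.0, duration_s=600, alarm_level="minor"))
print(sorted(db.indexes))
```

Keeping the index attributes on the object keeps the threshold and alarm level next to the index they qualify, which is convenient for the later split into normal and abnormal data.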
Step S102, constructing one or more abnormal propagation chain relation models according to the operation and maintenance scenarios.
For different operation and maintenance scenarios, and starting from structural topologies such as the physical topology, the logical topology and service call chains, the objects, object attributes, relations among objects and influence associations among monitoring indexes are configured and defined; the influence relationships between cause and effect nodes are completed; and the abnormal propagation chain is described. Subsequent anomaly analysis and locating are then carried out along this abnormal propagation chain relation.
In the related art, anomalies are often analyzed through the association relationships among the configuration attributes of objects. In the invention, an abnormal propagation chain relation model is constructed from the association relations between objects and the influence associations between object indexes, according to the operation and maintenance scenario and expert knowledge and experience. In one implementation, occurrence probabilities and weights are determined in combination with actual index data so as to identify, analyze and locate anomalies.
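One way to picture an abnormal propagation chain relation model is as a directed graph whose nodes are object and index pairs and whose edges point from cause to effect. The sketch below is an assumption-laden illustration: the node names, the edge list, and the helper functions `build_chain`, `is_acyclic` and `root_causes` are all hypothetical. It also checks that the chain contains no cycle, since a Bayesian network must be a directed acyclic graph.

```python
from collections import defaultdict

# Hypothetical propagation edges: (cause node, affected node).
EDGES = [
    ("disk.partition_usage", "database.write_latency"),
    ("server.cpu_usage", "database.write_latency"),
    ("database.write_latency", "app.response_time"),
    ("switch.packet_loss", "app.response_time"),
]


def build_chain(edges):
    """Return a node -> list-of-parents mapping for the propagation chain."""
    parents = defaultdict(list)
    for cause, effect in edges:
        parents[effect].append(cause)
        parents.setdefault(cause, [])  # make sure pure causes appear as nodes
    return dict(parents)


def is_acyclic(parents):
    """Depth-first check that the chain is a directed acyclic graph."""
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # reached a node already on the current path: cycle
        state[node] = "visiting"
        ok = all(visit(p) for p in parents[node])
        state[node] = "done"
        return ok

    return all(visit(n) for n in parents)


def root_causes(parents):
    """Nodes without parents are the candidate root causes."""
    return sorted(n for n, ps in parents.items() if not ps)


chain = build_chain(EDGES)
print(is_acyclic(chain), root_causes(chain))
```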
Step S103, acquiring multidimensional index data of the operation and maintenance objects.
The system periodically and comprehensively monitors and collects the running state, health and performance indexes, characteristic variables, etc. of the operation and maintenance objects.
Step S104, preprocessing the acquired multidimensional index data, and dividing the index data into a normal data set and an abnormal data set.
The acquired multidimensional index data is preprocessed, the preprocessed data belonging to the same moment is integrated, and the data is divided into a normal data set and an abnormal data set according to the moment at which the anomaly occurred, the index thresholds, and a built-in algorithm.
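The integration and split described in step S104 can be sketched as follows. This is a simplified illustration under stated assumptions: it uses only a fixed upper threshold per index, whereas the text also mentions the moment of the anomaly and a built-in algorithm that it does not specify. The sample data, the `THRESHOLDS` table and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw samples: (timestamp, index name, value).
SAMPLES = [
    (1000, "cpu_usage", 35.0), (1000, "disk_usage", 60.0),
    (1060, "cpu_usage", 97.0), (1060, "disk_usage", 61.0),
    (1120, "cpu_usage", 40.0), (1120, "disk_usage", 92.0),
]
THRESHOLDS = {"cpu_usage": 90.0, "disk_usage": 85.0}  # assumed upper limits


def integrate_by_time(samples):
    """Merge samples that share a timestamp into one multidimensional record."""
    records = defaultdict(dict)
    for ts, name, value in samples:
        records[ts][name] = value
    return dict(records)


def split(records, thresholds):
    """Split records into (normal, abnormal) using a plain threshold test."""
    normal, abnormal = {}, {}
    for ts, record in sorted(records.items()):
        breached = any(
            record[name] > limit
            for name, limit in thresholds.items()
            if name in record
        )
        (abnormal if breached else normal)[ts] = record
    return normal, abnormal


normal, abnormal = split(integrate_by_time(SAMPLES), THRESHOLDS)
print(sorted(normal), sorted(abnormal))
```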
Step S105, analyzing by Bayesian network technology based on the one or more abnormal propagation chain relation models and the data sets.
This step comprises the following substeps:
S1051, determining the abnormal scene top event according to the abnormal data set
An abnormal scene top event is the starting point for locating an anomaly. It may be an abnormal event perceived by users, such as slow system access, or an abnormal event that has not yet reached the level of user perception, such as low available capacity on a disk partition.
S1052, constructing a Bayesian network based on the objects and monitoring indexes associated with the abnormal scene top event, the association relations among the objects, and the abnormal propagation chains formed by the influence associations among the object indexes
Based on the monitoring indexes and data related to the abnormal scene top event, a directed acyclic graph model is constructed according to the Bayesian network principle from the one or more abnormal propagation chain relation models built in step S102 and from the node objects and index data on the abnormal propagation chain.
S1053, analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, reasoning over causal relationships, and calculating the occurrence probability result
Using the constructed directed acyclic graph model and the actual data sets, the analysis proceeds from top to bottom according to the Bayesian method, the causal relationships are inferred, and the occurrence probability of each node in the directed acyclic graph is calculated.
Step S106, generating the cause result of the abnormal event according to the analysis result.
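Steps S105 and S106 can be illustrated with a toy network in which two candidate causes feed one top event, and the causes are ranked by their posterior probability once the top event has been observed. This is a hedged sketch, not the patented procedure: the node names, the prior probabilities and the conditional probability table are invented for illustration, and the posterior is computed by brute-force enumeration rather than by message passing.

```python
from itertools import product

# Hypothetical network: disk_full -> slow_access <- cpu_saturated.
PRIOR = {"disk_full": 0.05, "cpu_saturated": 0.10}
# P(slow_access = 1 | disk_full, cpu_saturated); illustrative numbers only.
CPT_TOP = {(0, 0): 0.01, (0, 1): 0.60, (1, 0): 0.80, (1, 1): 0.95}


def posterior_given_top_event():
    """Return P(cause = 1 | slow_access = 1) for each cause by enumeration."""
    evidence = 0.0
    weight = {cause: 0.0 for cause in PRIOR}
    for d, c in product((0, 1), repeat=2):
        p_d = PRIOR["disk_full"] if d else 1 - PRIOR["disk_full"]
        p_c = PRIOR["cpu_saturated"] if c else 1 - PRIOR["cpu_saturated"]
        p = p_d * p_c * CPT_TOP[(d, c)]  # joint with slow_access = 1
        evidence += p
        if d:
            weight["disk_full"] += p
        if c:
            weight["cpu_saturated"] += p
    return {cause: w / evidence for cause, w in weight.items()}


# Rank candidate causes, most probable first.
ranking = sorted(posterior_given_top_event().items(), key=lambda kv: kv[1], reverse=True)
for cause, prob in ranking:
    print(f"{cause}: {prob:.3f}")
```

Enumeration is exponential in the number of nodes, so it only suits small chains; it is used here because it makes the top-down reasoning easy to follow.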
In addition, the Bayesian network analysis method used in step S105 is as follows:
Given a joint probability distribution $P(X_1, X_2, \ldots, X_n)$ and an ordering $d$ of the variables, start with $X_1$ as the root node and assign it the prior probability distribution $P(X_1)$.
$X_2$ is then represented by a node. If $X_2$ depends on $X_1$, a link is drawn from $X_1$ to $X_2$, and the link strength is given by $P(X_2 \mid X_1)$. If $X_2$ is independent of $X_1$, $X_2$ is assigned the prior probability distribution $P(X_2)$. At level $i$, a set of directed links is drawn from the parent node set $\Pi_i$ of $X_i$ to $X_i$ and quantified by the conditional probability $P(X_i \mid \Pi_i)$. The result is a directed acyclic graph that represents many of the independence relationships embodied in $P(X_1, X_2, \ldots, X_n)$ and is called a Bayesian network.
Conversely, the conditional probabilities $P(X_i \mid \Pi_i)$ contain all the information necessary to reconstruct the original distribution function, with the following relationship under the ordering $d$:
$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \Pi_i)$$
FIG. 2 is an example of a typical Bayesian network, whose joint distribution function is:
$$P(X_1, X_2, X_3, X_4, X_5, X_6) = P(X_6 \mid X_5) \cdot P(X_5 \mid X_3, X_2) \cdot P(X_4 \mid X_2, X_1) \cdot P(X_3 \mid X_1) \cdot P(X_2 \mid X_1) \cdot P(X_1)$$
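The FIG. 2 factorization can be checked numerically: multiplying one conditional probability per node gives the joint probability of any assignment, and the joint sums to 1 over all assignments. In the sketch below the parent sets are read off the factorization above, while the conditional probability values produced by `p_true` are made up purely for illustration.

```python
from itertools import product

# Parent sets read off the FIG. 2 factorization.
PARENTS = {
    "X1": (), "X2": ("X1",), "X3": ("X1",),
    "X4": ("X2", "X1"), "X5": ("X3", "X2"), "X6": ("X5",),
}


def p_true(parent_values):
    """Hypothetical CPT: P(node = 1 | parents); the numbers are invented."""
    if not parent_values:
        return 0.3  # assumed prior for a root node
    return 0.1 + 0.8 * sum(parent_values) / len(parent_values)


def joint(assignment):
    """Joint probability as the product of P(Xi | parents of Xi)."""
    prob = 1.0
    for node, parents in PARENTS.items():
        p1 = p_true([assignment[p] for p in parents])
        prob *= p1 if assignment[node] else 1.0 - p1
    return prob


nodes = list(PARENTS)
total = sum(
    joint(dict(zip(nodes, values)))
    for values in product((0, 1), repeat=len(nodes))
)
print(abs(total - 1.0) < 1e-9)
```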
Once the network is established, it can be used as a computational structure for inference. The belief is defined as $\mathrm{Bel}(x) = P(x \mid e)$, i.e. the conditional probability that event $x$ occurs given the known evidence $e$; it reflects the probability that an event occurs in a given environment.
The evidence $e$ may be written as $e = e_X^- \cup e_X^+$, where $e_X^-$ is the evidence in the subtree rooted at $X$ and $e_X^+$ is the evidence in the rest of the tree. The belief may then be expressed as
$$\mathrm{Bel}(x) = P(x \mid e_X^-, e_X^+) = \alpha P(e_X^- \mid e_X^+, x) \cdot P(x \mid e_X^+) = \alpha P(e_X^- \mid x) \cdot P(x \mid e_X^+)$$
where $\alpha = [P(e_X^- \mid e_X^+)]^{-1}$ is a normalization factor. Let $\lambda(x) = P(e_X^- \mid x)$ denote the diagnostic support and $\pi(x) = P(x \mid e_X^+)$ denote the predictive support; then $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$. In the actual reasoning process, the network beliefs are updated when observation evidence $e$ arrives.
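The update rule Bel(x) = αλ(x)π(x) amounts to multiplying the two supports state by state and normalizing. The sketch below shows this for a single two-state node; the state names and all numeric values are assumptions chosen for illustration.

```python
def belief(lam, pi):
    """Bel(x) = alpha * lambda(x) * pi(x), normalized over the states of x."""
    unnormalized = {state: lam[state] * pi[state] for state in pi}
    alpha = 1.0 / sum(unnormalized.values())  # the normalization factor
    return {state: alpha * value for state, value in unnormalized.items()}


# Hypothetical node with two states.
pi = {"abnormal": 0.2, "normal": 0.8}            # predictive support
no_evidence = {"abnormal": 1.0, "normal": 1.0}   # lambda = 1 when nothing is observed
alarm_seen = {"abnormal": 0.9, "normal": 0.1}    # diagnostic support after an alarm

print(belief(no_evidence, pi))
print(belief(alarm_seen, pi))
```

With λ equal to 1 for every state the belief stays at the predictive support, which matches the initial condition described for the network when nothing has been observed.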
FIG. 3 is another example Bayesian network:
In a Bayesian network, calculating the occurrence probability of the top event is equivalent to updating beliefs. The probability distribution of each basic event is assumed to be:
$P(X_1) = 0.5$, $P(X_2) = 0.6$, $P(X_3) = 0.7$, $P(X_4) = 0.8$, $P(X_5) = 0.9$, which correspond to the prior probability distributions of the respective root nodes in the network. Under initial conditions, where nothing is known, the belief distribution in the network is simply the distribution of prior probabilities: each node has $\lambda = 1$, $\pi$ propagates from bottom to top, and the predictive support of each node equals its prior probability distribution.
Based on the belief propagation update rule $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$, the belief value of the top event $T$, i.e. the occurrence probability of the top event, can be calculated.
The calculation of the belief value of each node in the network is illustrated below using node $S_f$ as an example. Here $\pi(X_2) = P(X_2) = 0.6$ and $\pi(X_5) = P(X_5) = 0.9$, so $\pi_{S_f}(X_2) = \pi(X_2) = 0.6$ and $\pi_{S_f}(X_5) = \pi(X_5) = 0.9$. Since $X_2$ and $X_5$ are connected to $S_f$ through an AND relationship, the coupling strength $P(S_f \mid X_2, X_5)$ can be obtained, which gives $\pi(S_f) = 0.6 \times 0.9 = 0.54$.
Then $\mathrm{Bel}(S_f) = \alpha \lambda(S_f) \pi(S_f) = \alpha(0.54)$; $\alpha$ is eliminated by normalization, giving $\mathrm{Bel}(S_f) = 0.54$. Following the transfer relations among the nodes in the network, the belief value of every node can be obtained, including that of the top event $T$, i.e. the occurrence probability of the top event: $P(T) = \mathrm{Bel}(T) = 0.83$.
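The value Bel(S_f) = 0.54 in the example can be reproduced by marginalizing over the parent states. The sketch below assumes that X2 and X5 feed S_f through a deterministic AND gate; that gate is an assumption inferred from the arithmetic (0.6 × 0.9 = 0.54), since the coupling-strength formula itself is not reproduced in the text. The rest of the FIG. 3 structure is not given here, so P(T) is not recomputed.

```python
from itertools import product

prior = {"X2": 0.6, "X5": 0.9}  # prior probabilities given in the text


def and_gate(x2, x5):
    """Assumed coupling strength P(S_f = 1 | X2, X5) for an AND relationship."""
    return 1.0 if (x2 and x5) else 0.0


def pi_sf():
    """pi(S_f) = sum over parent states of P(S_f | X2, X5) * pi(X2) * pi(X5)."""
    total = 0.0
    for x2, x5 in product((0, 1), repeat=2):
        p_x2 = prior["X2"] if x2 else 1 - prior["X2"]
        p_x5 = prior["X5"] if x5 else 1 - prior["X5"]
        total += and_gate(x2, x5) * p_x2 * p_x5
    return total


# With no evidence, lambda(S_f) = 1 for both states, so Bel(S_f) = pi(S_f).
bel_sf = pi_sf()
print(round(bel_sf, 2))  # 0.54
```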
FIG. 4 is a diagram of an abnormality locating device according to an embodiment of the present invention, which includes:
A definition module, for determining an operation and maintenance object, determining the attributes of the operation and maintenance object, determining the monitoring indexes related to the operation and maintenance object, and determining the index attributes related to the monitoring indexes.
A relation model module, for constructing one or more abnormal propagation chain relation models according to the relations among the objects and the influence relations among the indexes in the operation and maintenance scenario.
An acquisition module, for acquiring the monitoring index data of each dimension of the operation and maintenance object.
A preprocessing module, for preprocessing the acquired multidimensional monitoring index data and dividing the index data into a normal data set and an abnormal data set.
A construction module, for constructing a Bayesian network from the objects and monitoring indexes associated with the abnormal scene top event and from the abnormal propagation chain relations between them.
An analysis module, for analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, reasoning over the causal relationships, and calculating the occurrence probability result.
A positioning module, for locating the abnormal event according to the result of the analysis module and providing an analysis result of the cause of the abnormality.
The above examples of implementations of the invention are presented to make the spirit of the invention clearer and easier to understand and are not intended to limit the invention. All modifications, substitutions, combinations and improvements made within the spirit and principle of the invention are intended to be included within the scope of protection outlined in the appended claims.
The above is a preferred embodiment of the present invention. All changes made according to the technical solution of the present invention fall within the scope of protection of the present invention, provided that the resulting functional effects do not go beyond the scope of the technical solution.

Claims (4)

1. An anomaly locating method, characterized by comprising the following steps:
Step S101, determining the objects and monitoring indexes related to operation and maintenance;
Step S102, constructing one or more abnormal propagation chain relation models according to the operation and maintenance scenarios;
Step S103, acquiring multidimensional index data of the operation and maintenance objects;
Step S104, preprocessing the acquired multidimensional index data, and dividing the index data into a normal data set and an abnormal data set;
Step S105, analyzing by Bayesian network technology based on the one or more abnormal propagation chain relation models and the data sets;
Step S106, generating the cause result of the abnormal event according to the analysis result;
wherein step S105 includes:
determining an abnormal scene top event according to the abnormal data set;
constructing a Bayesian network based on the objects and monitoring indexes associated with the abnormal scene top event and on the abnormal propagation chain relations between them;
analyzing from top to bottom using the Bayesian network, the normal data set and the abnormal data set, reasoning over causal relationships, and calculating the occurrence probability result; the specific implementation is as follows:
given a joint probability distribution P (X1, X2, xn) and one ordering d of variables, starting X1 as a root node, and giving X1 a priori probability distribution P (X1);
Then, X2 is represented by a node, if X2 is related to X1, a bond is established from X1 to X2, and the bond strength is represented by P (X2|X1); if X2 is independent of X1, giving X2 a priori probability distribution P (X2); from the parent node set of Xi at level i Drawing a set of direction lines connected to Xi and usingThe conditional probability quantitative representation, the result is a directed acyclic graph representing a number of independent relationships embodied in P (X1, X2, & Xn), referred to as a Bayesian network;
in turn, the process may be performed, All the information necessary for reconstructing the original distribution function is contained, and under the order d, there is a joint distribution function of the following bayesian network:
Defining beliefs as Bel (x) =p (x|e), i.e. the conditional probability of occurrence of an event x in case of known evidence e, reflecting the probability of occurrence of a certain event under a predetermined environment;
e is denoted as e=e -x∪e+ X, where e - X reflects the subtree with X as the root node and e + X reflects the rest of the tree, then the belief is expressed as:
Bel(e)=P(x|e-x,e+x)=αP(e-x|e+x,x)·P(x|e+x)=αP(e-x|x)·P(x|e+x)
Where α= [ P (e -x|e+x)]-1 is a normalization factor; let λ (x) =p (e - x|x) denote support for diagnosis, pi (x) =p (x|e + x) denote support for prognosis, then be (x) =αλ (x) pi (x);
in the actual reasoning process, the network beliefs are updated whenever observation evidence $e$ arrives; in the Bayesian network, calculating the occurrence probability of the abnormal scene top event is equivalent to a belief update;
in the initial state, where no evidence has been observed, the belief distribution in the network is just the prior probability distribution: every node has $\lambda = 1$, $\pi$ propagates from bottom to top, and each node's predictive support equals its prior probability distribution;
according to the belief propagation update rule $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$, calculating the belief value of the abnormal scene top event $T$, which is the occurrence probability of the abnormal scene top event.
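With no evidence observed, every λ equals 1 and the belief of the top event reduces to its propagated prior. The sketch below shows that case on a small hypothetical structure. The node names, the OR/AND tables and the prior values are assumptions for illustration, and the calculation is exact only because no two parents of a node share an ancestor.

```python
from itertools import product

# Hypothetical prior probabilities of three independent bottom-level abnormal events.
priors = {"B1": 0.01, "B2": 0.02, "B3": 0.05}

# node -> (parents, table of P(node = 1 | parent states)); order is bottom to top.
gates = {
    "M": (("B1", "B2"), {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.0}),  # OR
    "T": (("M", "B3"), {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}),   # AND
}

def prior_propagation(priors, gates):
    """Propagate pi from the bottom events up to the top event (lambda = 1 everywhere)."""
    prob = dict(priors)
    for node, (pa, table) in gates.items():  # dict order is assumed topological
        total = 0.0
        for states in product((0, 1), repeat=len(pa)):
            weight = 1.0
            for q, s in zip(pa, states):
                weight *= prob[q] if s else 1.0 - prob[q]
            total += weight * table[states]
        prob[node] = total
    return prob

prob = prior_propagation(priors, gates)
top_event_probability = prob["T"]
```

Deterministic 0/1 tables are used here only to keep the arithmetic easy to check; the same loop works with any conditional probability table.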
2. The abnormality locating method according to claim 1, characterized in that step S101 includes: determining an operation and maintenance object, determining attributes of the operation and maintenance object, determining monitoring indexes related to the operation and maintenance object, and determining index attributes related to the monitoring indexes.
3. The abnormality locating method according to claim 1, characterized in that said step S102 includes:
determining, according to different operation and maintenance scenes, the relationships among the operation and maintenance objects and the influence associations among the monitoring indexes, so as to determine an abnormal propagation relationship;
determining abnormal nodes according to the abnormal propagation relationship;
determining one or more abnormal propagation chain relationship models between the abnormal nodes.
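One possible way to represent the abnormal propagation relationship is a directed graph whose simple paths are the propagation chains. The sketch below assumes that representation; the function `propagation_chains` and the index names are hypothetical and do not come from the patent.

```python
def propagation_chains(edges, source, target):
    """Enumerate all simple directed paths from source to target by depth-first search."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)

    chains = []

    def dfs(node, path):
        if node == target:
            chains.append(path)
            return
        for nxt in graph.get(node, []):
            if nxt not in path:  # skip nodes already on the path to avoid cycles
                dfs(nxt, path + [nxt])

    dfs(source, [source])
    return chains

# Hypothetical influence associations: (upstream index, downstream index).
edges = [
    ("disk_io_wait", "db_latency"),
    ("db_latency", "api_latency"),
    ("disk_io_wait", "cache_miss_rate"),
    ("cache_miss_rate", "api_latency"),
    ("api_latency", "order_failure_rate"),
]
chains = propagation_chains(edges, "disk_io_wait", "order_failure_rate")
```

Each returned list is one candidate chain from a possible root cause to the observed abnormal event, which could then be used as the structure of a Bayesian network.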
4. An abnormality locating device, characterized by comprising:
the definition module is used for determining the operation and maintenance object, determining the attributes of the operation and maintenance object, determining the monitoring indexes related to the operation and maintenance object, and determining the index attributes related to the monitoring indexes;
the relation model module is used for constructing one or more abnormal propagation chain relationship models according to the relationships among objects and the influence relationships among indexes in the operation and maintenance scene;
the acquisition module is used for acquiring the monitoring index data of each dimension of the operation and maintenance object;
the preprocessing module is used for preprocessing the acquired multidimensional monitoring index data and dividing the index data into a normal data set and an abnormal data set;
the construction module is used for constructing a Bayesian network from the objects and monitoring indexes associated with the abnormal scene top event and from the abnormal propagation chain relationships between those objects and monitoring indexes;
the analysis module is used for analyzing from the top down using the Bayesian network, the normal data set and the abnormal data set, reasoning about causal relationships and calculating the occurrence probability result;
the positioning module is used for locating the abnormal event according to the result of the analysis module and providing a cause analysis result;
wherein using the Bayesian network, the normal data set and the abnormal data set to analyze from the top down, reason about causal relationships and calculate the occurrence probability result is implemented as follows:
given a joint probability distribution $P(X_1, X_2, \ldots, X_n)$ and an ordering $d$ of the variables, starting with $X_1$ as the root node and assigning $X_1$ the prior probability distribution $P(X_1)$;
then representing $X_2$ by a node; if $X_2$ depends on $X_1$, establishing a link from $X_1$ to $X_2$ whose strength is represented by $P(X_2 \mid X_1)$; if $X_2$ is independent of $X_1$, assigning $X_2$ the prior probability distribution $P(X_2)$; at level $i$, drawing a set of directed edges from the parent node set $\Pi_i$ of $X_i$ to $X_i$ and quantifying them with the conditional probability $P(X_i \mid \Pi_i)$; the result is a directed acyclic graph representing many of the independence relationships embodied in $P(X_1, X_2, \ldots, X_n)$, referred to as a Bayesian network;
conversely, the conditional probabilities $P(X_i \mid \Pi_i)$ contain all the information necessary to reconstruct the original distribution function, and under the ordering $d$ the Bayesian network has the following joint distribution function:
$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \Pi_i)$$
defining the belief as $\mathrm{Bel}(x) = P(x \mid e)$, that is, the conditional probability that event $x$ occurs given the known evidence $e$, which reflects the probability that an event occurs under a given environment;
denoting $e = e_x^- \cup e_x^+$, where $e_x^-$ is the evidence in the subtree rooted at $x$ and $e_x^+$ is the evidence in the rest of the tree, the belief is expressed as:
$$\mathrm{Bel}(x) = P(x \mid e_x^-, e_x^+) = \alpha P(e_x^- \mid e_x^+, x) \cdot P(x \mid e_x^+) = \alpha P(e_x^- \mid x) \cdot P(x \mid e_x^+)$$
where $\alpha = [P(e_x^- \mid e_x^+)]^{-1}$ is a normalization factor; letting $\lambda(x) = P(e_x^- \mid x)$ denote the diagnostic support and $\pi(x) = P(x \mid e_x^+)$ denote the predictive support, $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$;
in the actual reasoning process, the network beliefs are updated whenever observation evidence $e$ arrives; in the Bayesian network, calculating the occurrence probability of the abnormal scene top event is equivalent to a belief update;
in the initial state, where no evidence has been observed, the belief distribution in the network is just the prior probability distribution: every node has $\lambda = 1$, $\pi$ propagates from bottom to top, and each node's predictive support equals its prior probability distribution;
according to the belief propagation update rule $\mathrm{Bel}(x) = \alpha \lambda(x) \pi(x)$, calculating the belief value of the abnormal scene top event $T$, which is the occurrence probability of the abnormal scene top event.
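To illustrate what updating the beliefs on arriving evidence can look like, the sketch below passes a diagnostic message along a hypothetical three-node chain C -> M -> S after S is observed as abnormal. The chain, the variable names and all probabilities are assumptions for illustration and are not taken from the patent.

```python
def normalize(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    z = sum(values)
    return [v / z for v in values]

# Hypothetical chain: cause C -> intermediate index M -> symptom S (0 = normal, 1 = abnormal).
prior_c = [0.9, 0.1]
p_m_given_c = [[0.9, 0.1], [0.1, 0.9]]    # p_m_given_c[c][m]
p_s_given_m = [[0.95, 0.05], [0.2, 0.8]]  # p_s_given_m[m][s]

observed_s = 1  # evidence: the symptom is observed as abnormal

# Diagnostic support flows from the evidence toward the cause.
lam_m = [p_s_given_m[m][observed_s] for m in (0, 1)]
lam_c = [sum(p_m_given_c[c][m] * lam_m[m] for m in (0, 1)) for c in (0, 1)]

# Predictive support flows from the cause toward the evidence.
pi_c = prior_c
pi_m = [sum(p_m_given_c[c][m] * pi_c[c] for c in (0, 1)) for m in (0, 1)]

# Bel(x) = alpha * lambda(x) * pi(x) at each unobserved node.
bel_c = normalize([l * p for l, p in zip(lam_c, pi_c)])
bel_m = normalize([l * p for l, p in zip(lam_m, pi_m)])
```

In this example the belief that the cause is abnormal rises from the prior 0.1 to roughly 0.39 once the abnormal symptom is observed, which is the kind of shift a cause analysis would rank on.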
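The preprocessing and construction modules could be sketched as follows. The claims do not state how samples are divided or how the conditional probabilities are obtained, so the fixed per-index thresholds, the counting estimate, the function names and the sample data are all assumptions made for this example.

```python
def split_by_threshold(samples, thresholds):
    """Divide samples into normal and abnormal sets; a sample is abnormal
    if any monitored index exceeds its (assumed) threshold."""
    normal, abnormal = [], []
    for sample in samples:
        exceeded = any(sample[name] > limit for name, limit in thresholds.items())
        (abnormal if exceeded else normal).append(sample)
    return normal, abnormal

def conditional_frequency(samples, parent, child, thresholds):
    """Estimate P(child abnormal | parent abnormal) by counting; None if no data."""
    parent_bad = [s for s in samples if s[parent] > thresholds[parent]]
    if not parent_bad:
        return None
    both = sum(1 for s in parent_bad if s[child] > thresholds[child])
    return both / len(parent_bad)

# Hypothetical monitoring samples and thresholds.
thresholds = {"cpu": 90, "latency_ms": 500}
samples = [
    {"cpu": 35, "latency_ms": 120},
    {"cpu": 95, "latency_ms": 800},
    {"cpu": 97, "latency_ms": 300},
    {"cpu": 40, "latency_ms": 650},
    {"cpu": 92, "latency_ms": 900},
]

normal, abnormal = split_by_threshold(samples, thresholds)
p_latency_given_cpu = conditional_frequency(samples, "cpu", "latency_ms", thresholds)
```

A frequency estimated this way could populate one entry of a conditional probability table for an edge of the propagation chain; with few samples some form of smoothing would likely be needed.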
CN202111285930.2A 2021-11-02 2021-11-02 Abnormality positioning method and device Active CN114024835B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111285930.2A CN114024835B (en) 2021-11-02 2021-11-02 Abnormality positioning method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111285930.2A CN114024835B (en) 2021-11-02 2021-11-02 Abnormality positioning method and device

Publications (2)

Publication Number Publication Date
CN114024835A (en) 2022-02-08
CN114024835B (en) 2024-09-20

Family

ID=80059613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111285930.2A Active CN114024835B (en) 2021-11-02 2021-11-02 Abnormality positioning method and device

Country Status (1)

Country Link
CN (1) CN114024835B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102255764A (en) * 2011-09-02 2011-11-23 广东省电力调度中心 Method and device for diagnosing transmission network failure
CN112579402A (en) * 2020-12-14 2021-03-30 中国建设银行股份有限公司 Method and device for positioning faults of application system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5960029B2 (en) * 2012-10-31 2016-08-02 住友重機械工業株式会社 Abnormal cause identification system
CN109409411B (en) * 2018-09-28 2020-11-03 东软集团股份有限公司 Problem positioning method and device based on operation and maintenance management and storage medium
CN110490433A (en) * 2019-07-30 2019-11-22 同济大学 A kind of train control system methods of risk assessment
CN111368888B (en) * 2020-02-25 2022-07-01 重庆邮电大学 Service function chain fault diagnosis method based on deep dynamic Bayesian network
CN112039695A (en) * 2020-08-19 2020-12-04 朔黄铁路发展有限责任公司肃宁分公司 Transmission network fault positioning method and device based on Bayesian inference

Also Published As

Publication number Publication date
CN114024835A (en) 2022-02-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant