CN105634781B - Multi-fault data decoupling method and device

Info

Publication number: CN105634781B
Application number: CN201410620670.3A
Authority: CN
Inventors: 赵春华
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2014-11-05
Filing date: 2014-11-05
Publication date: 2020-03-13
Anticipated expiration: 2034-11-05
Also published as: WO2016070642A1; CN105634781A

Abstract

The invention discloses a multi-fault data decoupling method and device. The method applies the frequent itemset results obtained by association analysis to root-cause reduction and de-correlation during fault data decoupling and then, based on the root-caused and de-correlated fault data, uses the correlation coefficient matrix computed from single-fault data to select the attributed fault for the alarm data. The decoupling method inherits the characteristics of association analysis, namely high accuracy and strong robustness. Compared with the manual method used in existing networks, it improves work efficiency and makes large-scale data mining and analysis of fault alarm data possible.

Description

A method and device for decoupling multi-fault data

Technical Field

The present invention relates to the field of communication technologies, and in particular to a multi-fault data decoupling method and device.

Background

A fault in a communication network is a functional abnormality of the hardware or software that makes up the managed network. An alarm in a communication network is an event report, formed from the notifications that a managed object sends when a specific event occurs, and is used to convey alarm information. The relationships among faults, and between faults and alarms, are relatively complex.

Among faults, one fault may exist independently, or one fault may trigger several others. For example, a fault on the IUB interface causes users' calls to drop, and a power system fault causes a board to lose power, which takes the cell out of service and drops users' calls. Between faults and alarms, one fault may generate one alarm or several alarms. Conversely, the appearance of an alarm indicates that a fault may have occurred, not that one certainly has.

Analyzing communication network faults from the alarm information in the network is one of the important tasks in network maintenance and management. In current research on data-mining-based fault analysis for communication networks, there are already many results on analyzing single-fault data from alarm information using various data mining classification algorithms.

The data collected in a real network, however, consists of multiple alarms and the corresponding multiple faults from the same area and time. Related faults may occur together, that is, a root-cause fault and its dependent faults coexist, and multiple unrelated faults may also occur together.

Therefore, to perform root-cause analysis from alarm information with existing research methods, the collected alarm data must be decoupled among the faults when multiple faults occur. This requires:

Analyzing the correlation between fault data when multiple faults occur;

Giving the root cause for related multiple faults;

Giving the fault to which the alarm data is attributed.

In real networks, this data processing is currently done manually by network maintenance engineers. On the one hand, this approach has a high labor cost and its accuracy is limited by the engineers' skill; on the other hand, its efficiency cannot meet the needs of fault analysis on big data.

Summary of the Invention

The present invention provides a multi-fault data decoupling method and device, to solve the problem that the data decoupling method used in the prior art is inefficient and cannot meet the fault analysis needs of big data.

According to one aspect of the present invention, a multi-fault data decoupling method is provided, comprising:

Acquiring K groups of alarm data and K groups of fault data collected in the same area at the same time, wherein each group of fault data is sorted by fault priority;

Applying an association analysis algorithm to the K groups of fault data to obtain a fault frequent itemset X, and converting the fault frequent itemset X into a fault pairwise correlation matrix R;

Based on the fault pairwise correlation matrix R, performing fault de-correlation and root-cause reduction on the fault data groups that contain multiple faults among the K groups of fault data;

Extracting the groups of alarm data corresponding to the fault data groups that still contain multiple unrelated faults after fault de-correlation and root-cause reduction, and determining, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs.

Optionally, in the method of the present invention, the principle for converting the fault frequent itemset X into the fault pairwise correlation matrix R is: two faults that appear together in any frequent itemset are marked as related, and two faults that do not appear together in any frequent itemset are marked as unrelated; each element of the fault pairwise correlation matrix R indicates whether the corresponding pair of faults is related.

Optionally, in the method of the present invention, the principle for performing fault de-correlation and root-cause reduction, based on the fault pairwise correlation matrix R, on the fault data groups that contain multiple faults among the K groups of fault data is:

If two faults marked as related in the fault pairwise correlation matrix R are both present in a multi-fault fault data group, the higher-priority fault is retained and the lower-priority fault is deleted;

If two faults marked as unrelated in the fault pairwise correlation matrix R are both present in a multi-fault fault data group, both faults are retained.

Optionally, in the method of the present invention, determining, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs specifically comprises:

For each extracted group of alarm data, obtaining the faults contained in the corresponding fault data group after fault de-correlation and root-cause reduction, to obtain a fault set;

For each extracted group of alarm data, determining, for each alarm in the alarm data, the fault in the corresponding fault set that has the highest correlation with that alarm as the fault to which the alarm belongs.

Optionally, the method of the present invention further comprises: calculating the Pearson correlation coefficient between each alarm and each fault from the single-fault fault data groups among the K groups of fault data, and using the Pearson correlation coefficient to represent the correlation between alarms and faults.

According to another aspect of the present invention, a multi-fault data decoupling device is provided, comprising:

A data input unit, configured to acquire K groups of alarm data and K groups of fault data collected in the same area at the same time, wherein each group of fault data is sorted by fault priority;

A data processing unit, configured to apply an association analysis algorithm to the K groups of fault data to obtain a fault frequent itemset X, convert the fault frequent itemset X into a fault pairwise correlation matrix R, and, based on the fault pairwise correlation matrix R, perform fault de-correlation and root-cause reduction on the fault data groups that contain multiple faults among the K groups of fault data;

A decoupling unit, configured to extract the groups of alarm data corresponding to the fault data groups that still contain multiple unrelated faults after fault de-correlation and root-cause reduction, and to determine, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs.

Optionally, in the device of the present invention, the principle by which the data processing unit converts the fault frequent itemset X into the fault pairwise correlation matrix R is: two faults that appear together in any frequent itemset are marked as related, and two faults that do not appear together in any frequent itemset are marked as unrelated; each element of the fault pairwise correlation matrix R indicates whether the corresponding pair of faults is related.

Optionally, in the device of the present invention, the principle by which the data processing unit performs fault de-correlation and root-cause reduction, based on the fault pairwise correlation matrix R, on the fault data groups that contain multiple faults among the K groups of fault data is:

If two faults marked as related in the fault pairwise correlation matrix R are both present in a multi-fault fault data group, the higher-priority fault is retained and the lower-priority fault is deleted; if two faults marked as unrelated in the fault pairwise correlation matrix R are both present in a multi-fault fault data group, both faults are retained.

Optionally, in the device of the present invention, the decoupling unit is specifically configured to: for each extracted group of alarm data, obtain the faults contained in the corresponding fault data group after fault de-correlation and root-cause reduction, to obtain a fault set; and, for each extracted group of alarm data, determine, for each alarm in the alarm data, the fault in the corresponding fault set that has the highest correlation with that alarm as the fault to which the alarm belongs.

Optionally, in the device of the present invention, the data processing unit is further configured to calculate the Pearson correlation coefficient between each alarm and each fault from the single-fault fault data groups among the K groups of fault data, so that the Pearson correlation coefficient represents the correlation between alarms and faults.

The beneficial effects of the present invention are as follows:

The technical solution disclosed by the present invention inherits the characteristics of association analysis, namely high accuracy and strong robustness. Compared with the manual method used in existing networks, it improves work efficiency and makes large-scale data mining and analysis of fault alarm data possible.

Brief Description of the Drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of a multi-fault data decoupling method provided by Embodiment 1 of the present invention;

FIG. 2 is a flowchart of a multi-fault data decoupling method provided by Embodiment 2 of the present invention;

FIG. 3 is a structural block diagram of a multi-fault data decoupling device provided by the present invention.

Detailed Description

To solve the problem that the data decoupling method used in the prior art is inefficient and cannot meet the fault analysis needs of big data, the present invention provides a multi-fault data decoupling method and device. The innovation of the solution lies in applying the frequent itemset results obtained by association analysis to root-cause reduction and de-correlation during fault data decoupling and, based on the root-caused and de-correlated fault data, using the correlation coefficient matrix computed from single-fault data to select the attributed fault for the alarm data. The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

Embodiment 1

An embodiment of the present invention provides a multi-fault data decoupling method which, as shown in FIG. 1, includes the following steps:

Step S101: acquire K groups of alarm data and K groups of fault data collected in the same area at the same time, wherein each group of fault data is sorted by fault priority;

Each group of alarm data contains M data items, each corresponding to one alarm and indicating whether that alarm is present;

Each group of fault data contains N data items, each corresponding to one fault and indicating whether that fault is present.

Step S102: apply an association analysis algorithm to the K groups of fault data to obtain a fault frequent itemset X, and convert the fault frequent itemset X into a fault pairwise correlation matrix R;

The principle for converting the fault frequent itemset X into the fault pairwise correlation matrix R is: two faults that appear together in any frequent itemset are marked as related, and two faults that do not appear together in any frequent itemset are marked as unrelated;

Each element of the fault pairwise correlation matrix R indicates whether the corresponding pair of faults is related.

Step S103: based on the fault pairwise correlation matrix R, perform fault de-correlation and root-cause reduction on the fault data groups that contain multiple faults among the K groups of fault data;

The principle for performing fault de-correlation and root-cause reduction, based on the fault pairwise correlation matrix R, on the fault data groups that contain multiple faults among the K groups of fault data is: if two faults marked as related in R are both present in a multi-fault fault data group, the higher-priority fault is retained and the lower-priority fault is deleted; if two faults marked as unrelated in R are both present in a multi-fault fault data group, both faults are retained.

Step S104: extract the groups of alarm data corresponding to the fault data groups that still contain multiple unrelated faults after fault de-correlation and root-cause reduction, and determine, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs.

Determining, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs specifically includes: for each extracted group of alarm data, obtaining the faults contained in the corresponding fault data group after fault de-correlation and root-cause reduction, to obtain a fault set; and determining, for each alarm in the alarm data, the fault in the corresponding fault set that has the highest correlation with that alarm as the fault to which the alarm belongs.

The correlation between alarms and faults is preferably represented by the Pearson correlation coefficient.

The Pearson correlation coefficient is calculated as follows: from the single-fault fault data groups among the K groups of fault data, calculate the Pearson correlation coefficient between each alarm and each fault. The specific calculation is known in the art and is not detailed here.

In summary, the multi-fault data decoupling solution of this embodiment offers high accuracy and strong robustness. Compared with the manual method used in existing networks, it improves work efficiency and makes large-scale data mining and analysis of fault alarm data possible.

Embodiment 2

This embodiment provides a multi-fault data decoupling method whose implementation principle is the same as that of Embodiment 1. It discloses more technical details of the method in order to describe the specific implementation process of the present invention more clearly. Note that this is a preferred embodiment, and its disclosure is not intended to be the sole limitation on how the present invention is implemented.

This embodiment provides a method for decoupling fault data when multiple faults occur in a communication network which, as shown in FIG. 2, includes the following steps:

Step 1: Data collection and preprocessing:

For the communication network, define the priority of each fault and sort the faults by priority. A fault's priority can be evaluated from the number of network elements the fault affects, the amount of hardware it affects, and the criticality of the KPIs (Key Performance Indicators) it affects.

The faults sorted by priority (referred to below as fault variables, to distinguish them from the fault data) are denoted {G₁, G₂, ..., G_N}. Taking the network element NODEB as an example, the set of fault variables may be: {NODEB power failure, ..., NODEB out of service, NODEB control board fault, ..., IUB link down, ...}

The system alarms (referred to below as alarm variables, to distinguish them from the alarm data) are denoted {E₁, E₂, ..., E_M}. For example: {NODEB power failure alarm, ..., RRU out of service, inter-board communication traffic exceeds the alarm threshold, performance threshold crossed}.

Collect K groups of alarm data and K groups of priority-sorted fault data from the live network to form two matrices: a K×M alarm data matrix with elements e_im and a K×N fault data matrix with elements g_in.

The element e_im (1 <= i <= K, 1 <= m <= M) records whether alarm variable E_m is present in the i-th group of sampled data: if alarm variable E_m is present, e_im = 1; otherwise e_im = 0.

The element g_in (1 <= i <= K, 1 <= n <= N) records whether fault variable G_n is present in the i-th group of sampled data: if fault variable G_n is present, g_in = 1; otherwise g_in = 0.

If multiple faults occur in the i-th group of sampled data, then g_i1...g_iN contains more than one non-zero entry, for example: g_i1...g_iN = {1, 0, ..., 1, 0, ...}
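As an illustration only, the 0/1 encoding described in Step 1 can be sketched in Python as follows. The function name `encode`, the variable names and the sample values are assumptions made for this sketch and do not come from the specification.

```python
from typing import Sequence


def encode(samples: Sequence[set[str]], variables: Sequence[str]) -> list[list[int]]:
    """Return a 0/1 matrix; entry [i][j] is 1 if variables[j] is present in sample i."""
    return [[1 if v in sample else 0 for v in variables] for sample in samples]


# Fault variables listed from highest to lowest priority (hypothetical names).
FAULTS = ["power_off", "out_of_service", "board_fault", "iub_link_down"]
# Alarm variables (hypothetical names).
ALARMS = ["power_alarm", "rru_out_of_service", "traffic_over_threshold"]

# K = 3 sampled groups; each set lists the faults or alarms observed in that group.
fault_samples = [{"power_off", "out_of_service"}, {"board_fault"}, {"power_off", "iub_link_down"}]
alarm_samples = [{"power_alarm", "rru_out_of_service"}, {"traffic_over_threshold"}, {"power_alarm"}]

G = encode(fault_samples, FAULTS)  # K x N fault data matrix, elements g_in
E = encode(alarm_samples, ALARMS)  # K x M alarm data matrix, elements e_im

# A row of G with more than one non-zero entry is multi-fault data.
multi_fault_rows = [i for i, row in enumerate(G) if sum(row) > 1]
```

Keeping the fault columns in priority order means that a lower column index always denotes a higher-priority fault, which is the property the later root-cause step relies on.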

Step 2: Apply the Apriori association analysis algorithm to the K groups of fault information samples to obtain the frequent itemset X. Assuming that J frequent itemsets are obtained, denote the frequent itemsets of the fault information as {x₁, x₂, ..., x_J}, where each of x₁ to x_J is a subset of the set of fault variables {G₁, G₂, ..., G_N}. For example, x_j = {NODEB power failure, NODEB out of service}, where j = 1, ..., J.
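The frequent itemset mining in Step 2 might look like the following minimal Apriori sketch. The specification does not state a support threshold, so the `min_support` parameter, the function name `apriori` and the sample transactions are all assumptions for illustration; a production system would more likely use an existing data mining library.

```python
from itertools import combinations


def apriori(transactions: list[frozenset[str]], min_support: float) -> list[frozenset[str]]:
    """Return every itemset contained in at least min_support (a fraction) of the transactions."""
    total = len(transactions)

    def support(itemset: frozenset[str]) -> float:
        return sum(1 for t in transactions if itemset <= t) / total

    items = {item for t in transactions for item in t}
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent: list[frozenset[str]] = []
    k = 1
    while level:
        frequent.extend(level)
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        previous = set(level)
        candidates = {
            c for c in candidates
            if all(frozenset(s) in previous for s in combinations(c, k - 1))
        }
        level = [c for c in candidates if support(c) >= min_support]
    return frequent


# Each transaction is the set of faults present in one sampled group (hypothetical names).
transactions = [
    frozenset({"power_off", "out_of_service"}),
    frozenset({"power_off", "out_of_service"}),
    frozenset({"board_fault"}),
    frozenset({"power_off", "iub_link_down"}),
]
frequent_itemsets = apriori(transactions, min_support=0.5)
```

With these four transactions and a threshold of 0.5, only `power_off`, `out_of_service` and the pair of the two meet the threshold.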

Step 3: Convert the fault frequent itemset X into the fault pairwise correlation matrix R.

Define the element r_xy of the fault pairwise correlation matrix R as the pairwise correlation coefficient of faults G_x and G_y. r_xy is calculated as follows: if G_x and G_y do not appear together in any frequent itemset, r_xy = 0; otherwise r_xy = 1. Here x = 1, ..., N and y = 1, ..., N.
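A minimal Python sketch of this conversion is given below. The function name `pairwise_matrix` and the fault names are hypothetical. The diagonal is left at 0 here because the later steps only compare distinct faults; that choice is an assumption of the sketch rather than something the specification states.

```python
from itertools import combinations


def pairwise_matrix(itemsets: list[frozenset[str]], faults: list[str]) -> list[list[int]]:
    """Build R, where R[x][y] = 1 if faults x and y appear together in any frequent itemset."""
    index = {fault: i for i, fault in enumerate(faults)}
    size = len(faults)
    R = [[0] * size for _ in range(size)]
    for itemset in itemsets:
        # Every pair of distinct faults inside one frequent itemset is marked as related.
        for a, b in combinations(sorted(itemset), 2):
            R[index[a]][index[b]] = 1
            R[index[b]][index[a]] = 1
    return R


faults = ["power_off", "out_of_service", "board_fault"]
itemsets = [frozenset({"power_off"}), frozenset({"power_off", "out_of_service"})]
R = pairwise_matrix(itemsets, faults)
```

Single-item frequent itemsets contribute nothing to R, since they contain no pair of faults.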

Step 4: Using the fault pairwise correlation matrix R, perform fault de-correlation and root-cause reduction on the multi-fault data groups in the sample.

For the i-th group of fault data, if g_i1...g_iN contains more than one non-zero entry, it is considered multi-fault data. De-correlation and root-cause reduction are then applied to the fault data group g_i1...g_iN, converting it into the de-correlated and root-caused fault data group g′_i1...g′_iN. Each g′_in (n = 1, ..., N) is calculated as follows:

Set g′_in = g_in. If g′_in is non-zero, then:

Search all faults with higher priority than the current fault, g_i1, g_i2, ..., g_i(n-1). If some fault data g_in′ is non-zero and the correlation coefficient between that fault and the current fault is r_n′n = 1, set g′_in = 0.
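The rule in Step 4 can be sketched as follows, assuming the fault columns are ordered from highest to lowest priority so that a lower index means a higher priority. The function name `decorrelate` and the sample matrix are illustrative assumptions.

```python
def decorrelate(g_row: list[int], R: list[list[int]]) -> list[int]:
    """Clear every fault that is related to a higher-priority fault present in the same row."""
    reduced = list(g_row)  # g'_in starts as a copy of g_in
    for n, present in enumerate(g_row):
        if not present:
            continue
        # Search the higher-priority faults (indices below n) in the original row.
        if any(g_row[h] and R[h][n] == 1 for h in range(n)):
            reduced[n] = 0
    return reduced


# Faults 0 and 1 are related; faults 2 and 3 are unrelated to everything.
R = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
row_a = decorrelate([1, 1, 0, 1], R)  # fault 1 is dropped in favor of fault 0
row_b = decorrelate([0, 1, 1, 0], R)  # faults 1 and 2 are unrelated, both kept
```

A row that still has more than one non-zero entry after this step, such as both rows above, is multi-unrelated-fault data.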

Step 5: Select the single-fault data groups and, from them, calculate the Pearson correlation coefficients between the alarm variables {E₁, E₂, ..., E_M} and the fault variables {G₁, G₂, ..., G_N}. Denote the Pearson correlation coefficient between alarm E_m and fault G_n as p_mn.
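The specification treats the Pearson calculation as known art, so the following is only one possible sketch. It uses the standard sample formula over the rows of single-fault data; returning 0.0 when a column is constant (where the coefficient is undefined) is an assumption of this sketch, as are the function names and sample data.

```python
from math import sqrt


def pearson(x: list[int], y: list[int]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n < 2:
        return 0.0  # assumption: too few samples is treated as no correlation
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column is treated as no correlation
    return cov / sqrt(var_x * var_y)


def alarm_fault_correlation(E: list[list[int]], G: list[list[int]]) -> list[list[float]]:
    """Return P with P[m][n] = p_mn, computed only from rows of G holding exactly one fault."""
    single = [i for i, row in enumerate(G) if sum(row) == 1]
    num_alarms, num_faults = len(E[0]), len(G[0])
    return [
        [
            pearson([E[i][m] for i in single], [G[i][n] for i in single])
            for n in range(num_faults)
        ]
        for m in range(num_alarms)
    ]


# Five sampled groups; the last one is multi-fault and is therefore excluded.
G = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 1]]
E = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 1]]
P = alarm_fault_correlation(E, G)
```

In this toy data alarm 0 always accompanies fault 0 and never fault 1, so the coefficients are 1 and -1 respectively.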

Step 6: Traverse the fault data after fault de-correlation and root-cause reduction. If a fault data group {g′_i1...g′_iN} is multi-unrelated-fault data, analyze the fault to which each alarm in the corresponding alarm data {e_i1...e_iM} is attributed.

For the i-th group of sampled data, if g′_i1...g′_iN contains more than one non-zero entry, it is considered multi-unrelated-fault data.

If e_im is non-zero (that is, the alarm is present), the fault to which e_im is attributed is determined as follows:

Form a fault set from the faults corresponding to the non-zero entries of {g′_i1...g′_iN}, and take the fault in the fault set that has the largest Pearson correlation coefficient with alarm E_m as the fault to which e_im is attributed.
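A short Python sketch of this attribution rule follows. The function name `attribute_alarms` and the coefficient values are invented for illustration. When two faults tie on the coefficient, `max` keeps the first one, which under the priority ordering is the higher-priority fault; the specification does not say how ties should be resolved, so that behavior is an assumption.

```python
def attribute_alarms(
    e_row: list[int], g_reduced: list[int], P: list[list[float]]
) -> dict[int, int]:
    """Map each active alarm index m to the fault index n in the fault set with the largest P[m][n]."""
    fault_set = [n for n, present in enumerate(g_reduced) if present]
    if not fault_set:
        return {}  # no fault left in this row, nothing to attribute
    return {
        m: max(fault_set, key=lambda n: P[m][n])
        for m, active in enumerate(e_row)
        if active
    }


# Hypothetical coefficients p_mn for 2 alarms (rows) and 3 faults (columns).
P = [
    [0.9, 0.1, 0.2],
    [0.0, 0.3, 0.8],
]
# Faults 0 and 2 remain after de-correlation; both alarms are active.
attribution = attribute_alarms(e_row=[1, 1], g_reduced=[1, 0, 1], P=P)
```

Fault 1 is never chosen here even though alarm 1 correlates with it, because it is not in the fault set of this row.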

Embodiment 3

An embodiment of the present invention provides a multi-fault data decoupling device. Each unit of the device can be implemented as hardware plus a software program: the software program implements the functions of the units described below, and the hardware supports the running of the software program, so that together they form a physical hardware device. As shown in FIG. 3, the device of this embodiment includes:

A data input unit 310, configured to acquire K groups of alarm data and K groups of fault data collected in the same area at the same time, wherein each group of fault data is sorted by fault priority;

A data processing unit 320, configured to apply an association analysis algorithm to the K groups of fault data to obtain a fault frequent itemset X, convert the fault frequent itemset X into a fault pairwise correlation matrix R, and, based on the fault pairwise correlation matrix R, perform fault de-correlation and root-cause reduction on the fault data groups that contain multiple faults among the K groups of fault data;

A decoupling unit 330, configured to extract the groups of alarm data corresponding to the fault data groups that still contain multiple unrelated faults after fault de-correlation and root-cause reduction, and to determine, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs.

Based on the above structure and implementation principle, several specific and preferred implementations under this structure are given below to refine and optimize the functions of the device, as follows:

In this embodiment, the principle by which the data processing unit 320 converts the fault frequent itemset X into the fault pairwise correlation matrix R is: two faults that appear together in any frequent itemset are marked as related, and two faults that do not appear together in any frequent itemset are marked as unrelated; each element of the fault pairwise correlation matrix R indicates whether the corresponding pair of faults is related.

In this embodiment, the principle by which the data processing unit 320 performs fault de-correlation and root-cause reduction, based on the fault pairwise correlation matrix R, on the fault data groups that contain multiple faults among the K groups of fault data is: if two faults marked as related in R are both present in a multi-fault fault data group, the higher-priority fault is retained and the lower-priority fault is deleted; if two faults marked as unrelated in R are both present in a multi-fault fault data group, both faults are retained.

In this embodiment, the decoupling unit 330 is specifically configured to: for each extracted group of alarm data, obtain the faults contained in the corresponding fault data group after fault de-correlation and root-cause reduction, to obtain a fault set; and, for each extracted group of alarm data, determine, for each alarm in the alarm data, the fault in the corresponding fault set that has the highest correlation with that alarm as the fault to which the alarm belongs.

Preferably, in this embodiment, the data processing unit 320 is further configured to calculate the Pearson correlation coefficient between each alarm and each fault from the single-fault fault data groups among the K groups of fault data, so that the Pearson correlation coefficient represents the correlation between alarms and faults.

The multi-fault data decoupling solution of this embodiment offers high accuracy and strong robustness. Compared with the manual method used in existing networks, it improves work efficiency and makes large-scale data mining and analysis of fault alarm data possible.

A person skilled in the art can make various modifications and variations to the present invention without departing from its spirit and scope. If such modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims

1. A multi-fault data decoupling method, characterized by comprising:

Acquiring K groups of alarm data and K groups of fault data collected in the same area at the same time, wherein each group of fault data is sorted by fault priority;

Applying an association analysis algorithm to the K groups of fault data to obtain a fault frequent itemset X, and converting the fault frequent itemset X into a fault pairwise correlation matrix R;

Based on the fault pairwise correlation matrix R, performing fault de-correlation and root-cause reduction on the fault data groups that contain multiple faults among the K groups of fault data;

Extracting the groups of alarm data corresponding to the fault data groups that still contain multiple unrelated faults after fault de-correlation and root-cause reduction, and determining, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs;

Wherein the principle for converting the fault frequent itemset X into the fault pairwise correlation matrix R is: two faults that appear together in any frequent itemset are marked as related, and two faults that do not appear together in any frequent itemset are marked as unrelated;

Each element of the fault pairwise correlation matrix R indicates whether the corresponding pair of faults is related.

2. The method according to claim 1, characterized in that the principle for performing, based on the fault pairwise correlation matrix R, fault de-correlation and root-cause processing on the fault data groups with multiple faults among the K groups of fault data is:

if two faults that the fault pairwise correlation matrix R marks as correlated exist simultaneously in a fault data group with multiple faults, the higher-priority fault is retained and the lower-priority fault is deleted;

if two faults that the fault pairwise correlation matrix R marks as uncorrelated exist simultaneously in a fault data group with multiple faults, both faults are retained.

3. The method according to claim 1, wherein determining, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs specifically comprises:

for each extracted group of alarm data, obtaining the faults contained in the corresponding fault data group after fault de-correlation and root-cause processing, to obtain a fault set;

for each extracted group of alarm data, determining, for each alarm in the alarm data, the fault in the corresponding fault set that has the highest correlation with the alarm as the fault to which the alarm belongs.

4. The method according to claim 1 or 3, wherein the Pearson correlation coefficient between each alarm and each fault is calculated from the single-fault fault data groups among the K groups of fault data, and the correlation between alarms and faults is represented by the Pearson correlation coefficient.

5. A multi-fault data decoupling device, comprising:

a data input unit, configured to obtain K groups of alarm data and K groups of fault data collected in the same area at the same time, wherein each group of fault data is sorted by fault priority;

a data processing unit, configured to apply an association analysis algorithm to the K groups of fault data to obtain a frequent fault itemset X, convert the frequent fault itemset X into a fault pairwise correlation matrix R, and, based on the fault pairwise correlation matrix R, perform fault de-correlation and root-cause processing on the fault data groups with multiple faults among the K groups of fault data;

a fault decoupling unit, configured to extract each group of alarm data corresponding to each fault data group that still contains multiple uncorrelated faults after fault de-correlation and root-cause processing, and to determine, according to the correlation between alarms and faults, the fault to which each alarm in the extracted groups of alarm data belongs;

wherein the principle by which the data processing unit converts the frequent fault itemset X into the fault pairwise correlation matrix R is: any two faults that appear together in any frequent itemset are marked as correlated, and any two faults that do not appear together in any frequent itemset are marked as uncorrelated; and the elements of the fault pairwise correlation matrix R indicate whether the corresponding pairs of faults are correlated.

6. The device according to claim 5, wherein the principle by which the data processing unit performs, based on the fault pairwise correlation matrix R, fault de-correlation and root-cause processing on the fault data groups with multiple faults among the K groups of fault data is:

if two faults that the fault pairwise correlation matrix R marks as correlated exist simultaneously in a fault data group with multiple faults, the higher-priority fault is retained and the lower-priority fault is deleted; if two faults that the fault pairwise correlation matrix R marks as uncorrelated exist simultaneously in a fault data group with multiple faults, both faults are retained.

7. The device according to claim 5, wherein the fault decoupling unit is specifically configured to: for each extracted group of alarm data, obtain the faults contained in the corresponding fault data group after fault de-correlation and root-cause processing, to obtain a fault set; and, for each extracted group of alarm data, determine, for each alarm in the alarm data, the fault in the corresponding fault set that has the highest correlation with the alarm as the fault to which the alarm belongs.

8. The device according to claim 5 or 7, wherein the data processing unit is further configured to calculate, from the single-fault fault data groups among the K groups of fault data, the Pearson correlation coefficient between each alarm and each fault, so that the correlation between an alarm and each fault is represented by the Pearson correlation coefficient.