CN111026870A

CN111026870A - ICT system fault analysis method integrating text classification and image recognition

Info

Publication number: CN111026870A
Application number: CN201911264526.XA
Authority: CN
Inventors: 俞学豪; 孙瑨一; 郑蓉蓉; 李国栋; 赵子岩; 王晨辉; 韩笑; 冯显时; 李雅西; 袁洲; 高金京; 陈亮; 王玮
Original assignee: State Grid Corp of China SGCC; State Grid Information and Telecommunication Co Ltd; North China Electric Power University; Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; State Grid Information and Telecommunication Co Ltd; North China Electric Power University; Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Priority date: 2019-12-11
Filing date: 2019-12-11
Publication date: 2020-04-17

Abstract

The invention discloses an ICT system fault analysis method integrating text classification and image recognition, and belongs to the technical field of fault analysis through neural network learning. The method collects fault data recorded by customer service and preprocesses the data manually; the data then passes through two parallel processes, text classification and image recognition, each of which ends in classification by a classifier. A fault analysis model is then established through fault judgment and used for fault analysis, and after the model is updated the flow returns to the manual preprocessing of the initial data. The method achieves reasonable allocation of ICT system resources, relieves the heavy pressure that the growing number of ICT systems places on customer service operation and maintenance, solves problems such as knowledge that cannot be shared and internal resources that cannot be coordinated efficiently or operated in an orderly way because State Grid ICT customer service currently depends only on personal knowledge and experience, and raises the intelligence level of ICT operation and maintenance.

Description

ICT system fault analysis method integrating text classification and image recognition

Technical Field

The invention belongs to the technical field of fault analysis through neural network learning, and particularly relates to an ICT system fault analysis method integrating text classification and image recognition.

Background

As power grid systems grow more complex, faults can no longer be analyzed and judged by relying only on the knowledge and personal experience of a single worker. System fault analysis has therefore become an important research direction for computer-aided decision systems. At present, fault analysis and judgment is realized mainly by two methods: constructing a fault information database, and performing fault analysis with machine learning.

A fault information database is generally built by constructing a knowledge graph that stores the fault information. The concept of the knowledge graph was first proposed by Google in 2012 as an upgrade to the traditional search model. Its main objective is to describe the entities and concepts that exist in the real world and the relationships between them, where a relationship describes the association between two entities. Because a knowledge graph is constructed logically, fault analysis built on it is strongly logical and convenient for customers to use. However, it performs poorly for faults and causes that have no apparent logical relationship, and whenever new fault information has to be accumulated from experience, the knowledge graph must be updated by professionals who maintain the fault analysis system, which is a heavy workload.

Machine-learning-based fault analysis mainly trains a deep learning network model with historical fault records as the training set, and then classifies new fault information, used as the test set, by text classification. Being a machine learning approach, it readily discovers latent relationships between faults and causes that are hard to find manually, and new fault records, once added to the training set, let the system "build up experience". The method nevertheless has several disadvantages. First, as a specific application of a machine learning algorithm, it places high demands on the quantity of historical data: a larger volume of historical data means higher training precision. Second, the quality of the historical data is closely tied to the final classification result, and that quality depends on how well the person recording the data understood the fault at the time. Finally, the fault classes must be defined in advance; the classification standard rests on experience of handling faults manually, and there can be neither too many nor too few classes, which is another major difficulty of the method.

The invention is, in general, an improvement and optimization of fault analysis based on neural network learning. It targets a specific application field: in practice, fault records of the State Grid ICT (Information and Communication Technology) system contain not only text but also a large amount of fault image information, such as screenshots of system anomaly detection, so image recognition is added as an auxiliary means of judgment in fault analysis.

Disclosure of Invention

The invention aims to provide an ICT system fault analysis method integrating text classification and image recognition, characterized by comprising the following steps:

step one: collecting fault data recorded by customer service and preprocessing the data manually; the data then passes through two parallel processes, text classification and image recognition, each of which ends in classification by a classifier; a fault analysis model is then established through fault judgment and used for fault analysis, and after the model is updated the flow returns to the manual preprocessing of the initial data;

step two: model training is carried out separately in the two parallel processes of text classification and image recognition; the main steps of text classification are text preprocessing, tf-idf calculation, label digitization, data extraction and classifier classification; the main steps of image recognition are image preprocessing, neural network model construction, parameter optimization and classifier classification;

step three: classification judgment: the text classification result and the image classification result obtained in step two are compared with the actual result, and a text classification weight and an image classification weight are determined so that the integrated result is consistent with the actual classification;

step four: fault analysis: a newly generated fault report is input as the test set, the trained model judges the fault class from the report, and operation and maintenance personnel overhaul the system according to that class;

step five: model updating: after the operation and maintenance personnel finish the repair, the fault report is added to the historical fault report set to update the training data, and the model is retrained on the new training set at regular intervals, so as to improve operation and maintenance work.

Regarding the collection of fault data recorded by customer service: fault data reports recorded by customer service in an actual ICT system are all unstructured data stored in Word documents. Because Python offers richer functions for operating on Excel files, the scattered Word document contents are first stored in an Excel table; second, the report contents are simplified manually during preprocessing; finally, each data item is labeled with its class by experienced staff.

The data preprocessing in step one generates a picture list and a label list, matches each picture name with its label, and builds an iterator for reading; during preprocessing the picture name carries the picture's classification information. The preprocessing consists mainly of two functions: get_files and get_batch.

The text preprocessing in step two differs from the preprocessing in step one: it is done mainly by computer with the jieba Chinese language processing toolkit, and its main tasks are word segmentation and stop-word removal.

Like the text preprocessing, the image preprocessing in step two is done by computer. It mainly crops the pictures in the document and resizes them to a suitable size.

Image preprocessing and neural network model construction in step two: the convolutional neural network of the system consists of two convolution + pooling layers, two fully connected layers, and a final softmax layer. The process is implemented mainly by calling the relevant TensorFlow functions.

In the classification judgment of step three, the probability obtained by text classification and the probability obtained by image recognition are each multiplied by their weight and summed to obtain a combined probability value,

P = P₁w₁ + P₂w₂

where P₁ is the probability obtained by text classification, P₂ is the probability obtained by image recognition, P is the final combined probability, and w₁ and w₂ are the corresponding text and image weights; according to multiple tests, the text weight is 0.81 and the image weight is 0.19.
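The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the function names `fuse` and `predict` and the sample probabilities are hypothetical, and only the two weight values come from the text.

```python
# Weights reported in the text: text 0.81, image 0.19.
TEXT_WEIGHT = 0.81
IMAGE_WEIGHT = 0.19


def fuse(text_probs, image_probs, w_text=TEXT_WEIGHT, w_image=IMAGE_WEIGHT):
    """Return P = P1*w1 + P2*w2 for every fault class."""
    if len(text_probs) != len(image_probs):
        raise ValueError("both classifiers must score the same classes")
    return [w_text * p1 + w_image * p2 for p1, p2 in zip(text_probs, image_probs)]


def predict(text_probs, image_probs):
    """Index of the class with the highest combined probability."""
    fused = fuse(text_probs, image_probs)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical per-class probabilities for two fault classes.
fused = fuse([0.7, 0.3], [0.2, 0.8])
print([round(p, 3) for p in fused], predict([0.7, 0.3], [0.2, 0.8]))
```

Because the text weight dominates, the image classifier changes the decision only when the text classifier is close to undecided.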

The method has the beneficial effect that the level of specialization and assimilation of ICT customer service is improved by constructing a knowledge graph in the ICT field. To further exploit the potential value of the massive ICT operation and maintenance data, knowledge processing and big data analysis are used to meet the latent association needs of ICT system users, track hot events, assist in investigating and judging ICT system faults, and create value-added service capability for ICT services. Reasonable allocation of ICT system resources is achieved; the heavy pressure that the growing number of ICT systems places on customer service operation and maintenance is relieved; problems such as knowledge that cannot be shared and internal resources that cannot be coordinated efficiently or operated in an orderly way, which arise because State Grid ICT customer service currently depends only on personal knowledge and experience, are solved; and the intelligence level of ICT operation and maintenance is raised.

Drawings

Fig. 1 is a flow chart of ICT system fault analysis.

Fig. 2 is a flow chart of text classification.

Fig. 3 is a flowchart of image recognition.

Detailed Description

The invention provides an ICT system fault analysis method integrating text classification and image recognition, which specifically comprises: collecting fault data recorded by customer service and preprocessing the data manually; passing the data through two parallel processes, text classification and image recognition, each of which ends in classification by a classifier; then establishing a fault analysis model through fault judgment and using it for fault analysis; and, after the model is updated, returning to the manual preprocessing of the initial data. The invention is described below with reference to the accompanying drawings.

Fig. 1 shows the flow chart of ICT system fault analysis. The method comprises the following steps:

Step one: manual data processing. Fault data reports recorded by customer service in an actual ICT system are all unstructured data stored in Word documents; this step mainly converts the unstructured data into structured data, simplifies the text, and labels each data item.

The system generates a picture list and a label list, matches each picture name with its label, and builds an iterator for reading; during data preprocessing the picture name carries the picture's classification information.

The process consists essentially of two functions: get_files and get_batch.

The goal of the get_files function is to take a given folder path and return the pictures and labels in shuffled order.

listdir returns the file names as a list, the name array caches the result, and name[0] is the classification label of the picture, so the class of the image can be determined and the image added to the array for that class. For example, for a file named NetworkError0001.jpg, name[0] = NetworkError indicates a network fault; the name and path of the NetworkError0001.jpg picture are stored in NetworkError, and a value of 1 is appended to label_NetworkError as its flag (different classes use different flag values).

The classified file data then needs to be shuffled, to eliminate any effect of the data set's particular order on the training result. The picture arrays are merged with np.hstack(), the label arrays are merged in the same way, and both are placed in a two-dimensional array; np.random.shuffle then shuffles the merged array randomly; finally, the first row of the shuffled result is taken as the picture array and the second row as the label array.

Picture path	…	D:/Python/data/NetworkError0001.jpg	…
Label	…	1	…

TABLE 1 Randomly shuffled temporary storage array
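A minimal sketch of a get_files-style function follows. It is an illustration under stated assumptions, not the patented code: the rule for deriving the class from the file name (stripping trailing digits), the `class_ids` mapping, the `LoginError` class and the `seed` argument are all hypothetical, and the standard library's `random` is used in place of np.hstack and np.random.shuffle so that the sketch has no dependencies.

```python
import os
import random
import re
import tempfile


def get_files(file_dir, class_ids, seed=None):
    """Return shuffled (image_paths, labels) for the pictures in file_dir.

    class_ids maps a class prefix such as "NetworkError" to its numeric flag.
    """
    pairs = []
    for file_name in sorted(os.listdir(file_dir)):
        stem = os.path.splitext(file_name)[0]        # "NetworkError0001"
        class_name = re.sub(r"\d+$", "", stem)       # "NetworkError" (assumed rule)
        if class_name in class_ids:
            pairs.append((os.path.join(file_dir, file_name), class_ids[class_name]))
    # Shuffle picture/label pairs together so each label stays with its picture.
    random.Random(seed).shuffle(pairs)
    image_list = [path for path, _ in pairs]
    label_list = [label for _, label in pairs]
    return image_list, label_list


# Usage with throwaway files; the class names and flag values are made up.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("NetworkError0001.jpg", "NetworkError0002.jpg", "LoginError0001.jpg"):
        open(os.path.join(tmp, name), "w").close()
    images, labels = get_files(tmp, {"NetworkError": 1, "LoginError": 0}, seed=0)
    print(list(zip([os.path.basename(p) for p in images], labels)))
```

Shuffling the pairs rather than the two lists separately is what keeps every picture aligned with its label.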

The get_batch function aims to generate batches of the same size.

First, the Python list produced in the previous step is converted into a format that TensorFlow can recognize, using tf.cast for the type conversion. Next, a queue is created with slice_input_producer(), with image and label placed in a list and passed to the function as its parameter. The pictures are then decoded according to their format; in this routine the training data is in jpg format, so the decode_jpeg() decoder is used. Finally, the picture size is unified, mainly with image.resize_image_with_crop_or_pad(image, image_W, image_H), whose parameters are the picture to be processed, the target image width, and the target image height.
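The two ideas in get_batch, bringing every picture to one size by cropping or padding and then grouping pictures into equal-sized batches, can be illustrated without TensorFlow. The sketch below is a pure-Python stand-in rather than the TensorFlow queue pipeline itself; the helper names, the zero fill value, the centring arithmetic and the choice to drop an incomplete final batch are assumptions.

```python
def _fit(seq, target, make_filler):
    """Centre-crop seq to target items, or pad it on both sides."""
    extra = len(seq) - target
    if extra >= 0:                                   # too long: crop
        start = extra // 2
        return list(seq[start:start + target])
    before = (-extra) // 2                           # too short: pad
    after = -extra - before
    return ([make_filler() for _ in range(before)] + list(seq)
            + [make_filler() for _ in range(after)])


def crop_or_pad(image, target_h, target_w, fill=0):
    """Resize a 2-D list of pixels to target_h x target_w without scaling."""
    rows = [_fit(row, target_w, lambda: fill) for row in image]
    return _fit(rows, target_h, lambda: [fill] * target_w)


def batches(images, labels, batch_size):
    """Yield (image_batch, label_batch) pairs of exactly batch_size items."""
    usable = len(images) - len(images) % batch_size  # drop the incomplete tail
    for start in range(0, usable, batch_size):
        yield images[start:start + batch_size], labels[start:start + batch_size]


small = [[1, 2], [3, 4]]
padded = crop_or_pad(small, 4, 4)      # 2x2 picture padded to 4x4
cropped = crop_or_pad(padded, 2, 2)    # cropped back to the centre 2x2
print(padded, cropped)
```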

Step two: this step mainly comprises two parallel processes, text classification and image recognition.

The main steps of text classification are text preprocessing, tf-idf calculation, label digitization, data extraction and classifier classification, as shown in Fig. 2.

The text preprocessing differs from the preprocessing in step one: it is done mainly by computer with the jieba Chinese language processing toolkit, and its main tasks are word segmentation and stop-word removal. Word segmentation divides a complete, coherent sentence into words; setting the parameter cut_all to True selects full mode for segmentation. Stop words are Chinese words that do not affect the sentence semantics, such as "in" and the like.
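Stop-word removal on already segmented text might look like the sketch below. The stop-word set and the sample tokens are hypothetical; in the pipeline described above the tokens would come from jieba (for example `jieba.cut(sentence, cut_all=True)`) rather than being typed by hand, which keeps this sketch free of third-party imports.

```python
# Hypothetical stop-word list; a real run would load a full Chinese list from a file.
STOP_WORDS = {"的", "了", "在", "和"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop stop words and whitespace-only tokens, keeping the original order."""
    return [t for t in tokens if t.strip() and t not in stop_words]


# Tokens as a segmenter might return them for a short fault description.
tokens = ["系统", "在", "登录", "的", "时候", "报错", " "]
print(remove_stop_words(tokens))
```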

TF-IDF stands for term frequency-inverse document frequency. Its main idea is: if a word or phrase appears frequently in one article (its TF is high) and rarely in other articles, the word or phrase is considered to have good category-distinguishing capability and to be suitable for classification. TF-IDF is in fact the product TF × IDF, where TF is the term frequency and IDF is the inverse document frequency. TF represents the frequency with which a term appears in document d.

The main idea of IDF is: the fewer the documents that contain term t, that is, the smaller n is, the larger the IDF, and the better term t distinguishes categories. Suppose m documents in a class C contain term t and k documents in the other classes contain it; the number of documents containing t is then n = m + k. When m is large, n is also large, and the IDF given by the IDF formula is small, which suggests that term t distinguishes categories poorly. In practice, however, a term that appears frequently in the documents of one class represents the text of that class well; such terms should be given a higher weight and selected as feature words of that class to distinguish it from other classes.

In a given document, the term frequency (TF) is the frequency with which a given word appears in that document. It is the term count normalized by document length, which prevents a bias towards long documents (the same word is likely to have a higher count in a long document than in a short one, whether or not it is important). For a word t_i in a particular document d_j, its importance can be expressed as:

$$\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$

The numerator n_{i,j} is the number of occurrences of the word in the document, and the denominator is the sum of the occurrences of all words in the document. Inverse document frequency (IDF) measures the general importance of a word. The IDF of a particular term is obtained by dividing the total number of documents by the number of documents that contain the term and taking the base-10 logarithm of the quotient:

$$\mathrm{idf}_{i} = \log_{10} \frac{|D|}{|\{\, j : t_i \in d_j \,\}|}$$

The numerator |D| is the total number of documents in the corpus, and the denominator is the number of documents containing the word. Finally, the product of TF and IDF is calculated. A high term frequency within a particular document, combined with a low document frequency of the term across the whole collection, produces a high TF-IDF weight. Keywords can then be filtered on this weight.
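The calculation just described (term count divided by document length, multiplied by the base-10 logarithm of total documents over documents containing the term) can be written directly. The function name `tf_idf` and the sample documents are hypothetical; a library implementation may use a different logarithm base or smoothing.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    documents is a list of token lists (already segmented, stop words removed).
    """
    n_docs = len(documents)
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = sum(counts.values())
        weights.append({
            term: (count / total) * math.log10(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical segmented fault reports.
docs = [
    ["network", "timeout", "network"],
    ["login", "timeout"],
    ["disk", "full"],
]
scores = tf_idf(docs)
print(scores[0])
```

In the first report "network" outweighs "timeout": it occurs twice there and in no other report, while "timeout" also appears in the second report.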

Label digitization is achieved by calling pd.Categorical in the pandas toolkit, which converts the text labels into a Categorical object and stores them automatically as numeric codes. The content inside "()" is the input parameters of a function; when a function is referred to without parameters, the parentheses can be dropped and only the function name kept, as is done for pd.Categorical here.
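What label digitization produces can be shown with a dependency-free stand-in. The function below is not pandas; it only mimics the outcome of taking the integer codes of a categorical whose categories are the sorted distinct labels, and the label names are made up.

```python
def digitize_labels(labels):
    """Map each distinct label to an integer code; return (codes, categories)."""
    categories = sorted(set(labels))
    index = {name: code for code, name in enumerate(categories)}
    return [index[name] for name in labels], categories


codes, categories = digitize_labels(["network", "login", "network", "disk"])
print(codes, categories)
```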

Data extraction mainly shuffles the order of the training set data, which avoids errors caused by the training data keeping the same order across repeated training runs.

Classification is done mainly by the logistic regression algorithm from sklearn. Logistic regression is normally used only for prediction in binary problems. To apply it to multi-class classification, one class is marked as 1 and all other classes as 0, and the logistic regression algorithm is used to train the corresponding group of parameters θ. This is repeated k times (k is the number of classes), with a different class set to 1 each time. The k groups of θ are then each applied to the data x to be predicted, giving k prediction results, and the class corresponding to the θ group with the largest prediction value is selected as the predicted class.
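The one-against-the-rest scheme described above can be sketched from scratch. This is an illustrative implementation with plain batch gradient descent, not sklearn's solver; the learning rate, epoch count and toy data are assumptions.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_binary(X, y, lr=0.5, epochs=1000):
    """Fit theta (bias first) for one binary problem by batch gradient descent."""
    theta = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for features, target in zip(X, y):
            row = [1.0] + list(features)
            error = sigmoid(sum(t * v for t, v in zip(theta, row))) - target
            for j, v in enumerate(row):
                grad[j] += error * v
        theta = [t - lr * g / len(X) for t, g in zip(theta, grad)]
    return theta


def train_one_vs_rest(X, labels, n_classes):
    """Train k parameter groups; group c treats class c as 1 and the rest as 0."""
    return [train_binary(X, [1.0 if lab == c else 0.0 for lab in labels])
            for c in range(n_classes)]


def predict(thetas, features):
    """Pick the class whose theta gives the largest predicted probability."""
    row = [1.0] + list(features)
    scores = [sigmoid(sum(t * v for t, v in zip(theta, row))) for theta in thetas]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data: three well-separated clusters, made up for illustration.
X = [[0.0, 0.0], [0.2, 0.1], [3.0, 0.0], [3.1, 0.2], [0.0, 3.0], [0.1, 3.2]]
labels = [0, 0, 1, 1, 2, 2]
thetas = train_one_vs_rest(X, labels, n_classes=3)
print([predict(thetas, x) for x in X])
```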

The main steps of image recognition are image preprocessing, neural network model construction, parameter optimization and classifier classification, as shown in Fig. 3.

Like text preprocessing, image preprocessing is done by computer. It mainly crops the pictures in the document and resizes them to a suitable size. Two important functions are used: queue = tf.train.slice_input_producer([image, label]) and image_batch, label_batch = tf.train.batch([image, label], batch_size=batch_size, num_threads=64, capacity=capacity). The length and width of the processed image are determined by resize_w and resize_h.

The convolutional neural network of the system consists of two convolution + pooling layers, two fully connected layers, and a final softmax layer. The process is implemented mainly by calling the relevant TensorFlow functions.

a. Convolutional layer 1: 16 convolution kernels of 3 × 3 (3 channels), padding='SAME', meaning that the feature map after the padded convolution has the same size as the original, activation function relu(). The content inside "()" is the input parameters of the function; when a function is referred to with no input parameters, the parentheses are dropped and only the function name is kept.

b. Pooling layer 1: 3 × 3 max pooling with stride 2; after pooling, an lrn() operation is performed, since local response normalization benefits training.

c. Convolutional layer 2: 16 convolution kernels of 3 × 3 (16 channels), padding='SAME', meaning that the feature map after the padded convolution has the same size as the original, activation function relu().

d. Pooling layer 2: 3 × 3 max pooling with stride 2; an lrn() operation is performed after pooling.

e. Fully connected layer 3: 128 neurons; the output of the previous pooling layer is reshaped into one row, activation function relu().

f. Fully connected layer 4: 128 neurons, activation function relu().

g. Softmax regression layer: a linear regression is applied to the output of the previous fully connected layer to calculate the score of each class; there are 2 classes here, so this layer outputs two scores.

h. Loss calculation

Input parameters: logits, the output values computed by the network; labels, the true values, here 0 or 1.

Return parameter: loss, the loss value.

i. Loss optimization

Input parameters: loss; learning_rate, the learning rate.

Return parameter: run_op, which is passed to the session (sess) to be run.

j. Evaluation/accuracy calculation

Input parameters: logits, the values computed by the network; labels, the true values, here 0 or 1.

Return parameter: accuracy, the average accuracy of the current step, i.e. how many pictures in the batch are classified correctly.
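To see how the layers listed above change the size of the data, the following sketch traces output shapes through the network. It rests on assumptions the text does not state: the pooling layers are taken to use 'SAME' padding like the convolutions (so each output side is the input side divided by the stride, rounded up), and the 208 × 208 input size is hypothetical.

```python
import math


def same_out(size, stride):
    """Output length along one axis with 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)


def trace_shapes(height, width, channels=3, n_classes=2):
    """Return [(layer_name, shape), ...] for the layer list a to g."""
    shapes = [("input", (height, width, channels))]
    h, w = height, width
    # conv1 and conv2 both use 16 kernels with stride 1, so only depth changes.
    for block in (1, 2):
        shapes.append((f"conv{block}", (h, w, 16)))
        h, w = same_out(h, 2), same_out(w, 2)        # 3x3 max pooling, stride 2
        shapes.append((f"pool{block}+lrn", (h, w, 16)))
    shapes.append(("flatten", (h * w * 16,)))
    shapes.append(("fc3", (128,)))
    shapes.append(("fc4", (128,)))
    shapes.append(("softmax", (n_classes,)))
    return shapes


# 208 x 208 is a hypothetical input size; the text does not state one.
for name, shape in trace_shapes(208, 208):
    print(f"{name:10s} {shape}")
```

Each pooling stage halves the height and width, so the first fully connected layer sees the flattened output of the second pooling layer.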

In this LeNet-style network, training accuracy is tuned mainly by adjusting the following three parameters:

batch_size is the batch parameter; its upper limit is the total number of samples in the training set, and when the data volume is small, batch_size can be set to the full data set (Full Batch Learning);

the learning rate is an important hyperparameter in supervised learning and deep learning, and determines whether and when the objective function converges to a local minimum. A suitable learning rate lets the objective function converge to a local minimum in reasonable time; if lr is too small, the model fails to converge in the available steps or converges too slowly, and if lr is too large, the model may oscillate;

max_step is the maximum number of training steps; when the value is too small, the training set cannot be learned completely, and when it is too large, the result may overfit.

The overall algorithm is one large loop from the first step to max_step. In each iteration the training loss is accumulated and training is carried out through the sess.run method on the neural network defined in the previous section, which yields the trained model.
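The effect of the learning rate and of max_step can be seen on a toy problem. The sketch below minimises f(x) = x² by gradient descent; the function, the starting point and the three learning rates are chosen only for illustration and have nothing to do with the network's actual loss.

```python
def gradient_descent(learning_rate, max_step, start=1.0):
    """Minimise f(x) = x**2 and return the list of x values after each step."""
    x = start
    history = []
    for _ in range(max_step):
        x -= learning_rate * 2 * x     # the gradient of x**2 is 2x
        history.append(x)
    return history


for lr in (0.001, 0.1, 1.1):
    final = gradient_descent(lr, max_step=50)[-1]
    print(f"lr={lr}: x after 50 steps = {final:.4g}")
```

With the smallest rate, x has barely moved after 50 steps; with the middle rate it is close to the minimum at 0; with the largest rate each update overshoots, so x flips sign and grows.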

Step three: classification judgment. The probability obtained by text classification and the probability obtained by image recognition are each multiplied by their weight and summed to obtain a combined probability value.

P = P₁w₁ + P₂w₂

where P₁ is the probability obtained by text classification, P₂ is the probability obtained by image recognition, and P is the final combined probability.

w₁ and w₂ are the weights of the text and the image respectively; according to multiple experiments, the image weight is 0.19 and the text weight is 0.81.
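The text says only that the weights were found through multiple experiments. One plausible way to run such an experiment is a simple grid search over the text weight, scored by agreement with the actual class. The sketch below assumes the two weights sum to 1 (as 0.81 and 0.19 do); the function names and the validation data are hypothetical.

```python
def accuracy(w_text, text_probs, image_probs, truth):
    """Share of reports whose combined arg-max matches the actual class."""
    w_image = 1.0 - w_text
    hits = 0
    for p_text, p_image, actual in zip(text_probs, image_probs, truth):
        fused = [w_text * a + w_image * b for a, b in zip(p_text, p_image)]
        hits += max(range(len(fused)), key=fused.__getitem__) == actual
    return hits / len(truth)


def search_text_weight(text_probs, image_probs, truth, steps=100):
    """Try w_text = 0.00, 0.01, ..., 1.00 and keep the first most accurate value."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w: accuracy(w, text_probs, image_probs, truth))


# Made-up classifier outputs for three reports and their actual classes.
text_probs = [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4]]
image_probs = [[0.3, 0.7], [0.2, 0.8], [0.1, 0.9]]
truth = [0, 1, 1]
best = search_text_weight(text_probs, image_probs, truth)
print(best, accuracy(best, text_probs, image_probs, truth))
```

On this toy data neither classifier alone gets all three reports right, but a range of mixed weights does, which is the situation the weighting in step three is meant to exploit.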

Step four: with the fault analysis system constructed, when a new fault occurs the fault data is input as the test set, the system judges the fault with the trained model, operation and maintenance personnel repair the fault according to that judgment, and the repair is recorded in a fault report.

Step five: the fault report newly generated in step four is added to the historical fault records, and the model is retrained regularly with the updated historical fault set as the training set, so that the model is updated and continuously accumulates experience.

Claims

1. An ICT system fault analysis method integrating text classification and image recognition is characterized by comprising the following steps:

2. The ICT system fault analysis method integrating text classification and image recognition as claimed in claim 1, wherein, for the collection of fault data recorded by customer service, the fault data reports recorded by customer service in an actual ICT system are all unstructured data stored in Word documents; because Python offers richer functions for operating on Excel files, the scattered Word document contents are first stored in an Excel table; second, the report contents are simplified manually during preprocessing; and finally, each data item is labeled with its class by experienced staff.

3. The ICT system fault analysis method according to claim 1, wherein the data preprocessing in step one generates a picture list and a label list, matches each picture name with its label, and builds an iterator for reading, the picture name carrying the picture's classification information during the data preprocessing; and the preprocessing mainly comprises two functions, get_files and get_batch.

4. The ICT system fault analysis method integrating text classification and image recognition according to claim 1, characterized in that the text preprocessing in step two differs from the preprocessing in step one and is done mainly by computer with the jieba Chinese language processing toolkit, its main tasks being word segmentation and stop-word removal.

5. The ICT system fault analysis method according to claim 1, characterized in that the image preprocessing in step two, like the text preprocessing, is done by computer; it mainly crops the pictures in the document and resizes them to a suitable size.

6. The ICT system fault analysis method integrating text classification and image recognition according to claim 1, characterized in that, for the image preprocessing and neural network model construction in step two, the convolutional neural network of the system consists of two convolution + pooling layers, two fully connected layers and a final softmax layer for classification; the process is implemented mainly by calling the relevant TensorFlow functions.

7. The ICT system fault analysis method integrating text classification and image recognition according to claim 1, wherein in the classification judgment of step three the probability obtained by text classification and the probability obtained by image recognition are each multiplied by their respective weight and summed to obtain a combined probability value,

P＝P₁w₁+P₂w₂