CN116318985B

CN116318985B - Computer network security early warning system and method based on big data

Info

Publication number: CN116318985B
Application number: CN202310264041.0A
Authority: CN
Inventors: 蒋耀亮
Original assignee: Shandong Runyun Intelligent Technology Co ltd
Current assignee: Shandong Runyun Intelligent Technology Co ltd
Priority date: 2023-03-02
Filing date: 2023-03-02
Publication date: 2024-06-14
Anticipated expiration: 2043-03-02
Also published as: CN116318985A

Abstract

The invention discloses a computer network security early warning system and method based on big data, belonging to the field of computer network security. The network security early warning system comprises an information acquisition module, a database, a file analysis module and a safety early warning module. The information acquisition module acquires basic data information and monitors transmitted file data; the database encrypts and stores the acquired data and the analysis results; the file analysis module analyzes and processes the acquired data; and the safety early warning module issues early warnings and reminders to the relevant technicians according to the analysis results. In the invention, basic data information is entered, file data is collected and stored in encrypted form, files are classified, the sensitivity of the file content is analyzed, the user withdrawal time is set according to the analysis result, and the user is given a safety early warning through a pop-up window and voice, so that leakage of data is avoided and the security of the data is ensured.

Description

Computer network security early warning system and method based on big data

Technical Field

The invention relates to the field of computer network security, in particular to a computer network security early warning system and method based on big data.

Background

With the continuous progress of technology, computer networks have brought great convenience to daily life. A computer network is a computer system that connects multiple computers in different geographic locations, each with independent functions, together with their external devices through communication lines, and that realizes resource sharing and information transmission under the management and coordination of a network operating system, network management software and network communication protocols. Computer networks facilitate everyday communication, office automation and information integration, and create an electronic, digital living environment.

However, when people send files such as messages, documents, pictures and videos over a computer network, resources are shared quickly, but files are also often sent by mistake or to the wrong recipient, so that private files may be leaked. In the prior art, the attribute state of a file is detected and the target file, screenshot files and downloaded files are replaced or deleted. It remains difficult, however, to prevent other people from photographing or capturing the content with other devices; even if replacement and deletion measures are taken, a mistakenly sent file may already have been viewed by others, causing information leakage.

It is therefore necessary to give the user a safety early warning according to the sensitivity of the file, and to ensure information security when the user sends a file by mistake or to the wrong recipient. Hence there is a need for a computer network security early warning system and method based on big data.

Disclosure of Invention

The invention aims to provide a computer network security early warning system and method based on big data, so as to solve the problems in the background technology.

In order to solve the above technical problems, the invention provides the following technical scheme: a computer network security early warning system based on big data, the network security early warning system comprising an information acquisition module, a database, a file analysis module and a safety early warning module;

the output end of the information acquisition module is connected with the input end of the database, the output end of the database is connected with the input end of the file analysis module, the output end of the file analysis module is connected with the input end of the safety early warning module, and the output end of the file analysis module is connected with the input end of the database; the information acquisition module is used for acquiring basic data information and monitoring transmitted file data, the database is used for carrying out encryption storage on the acquired data and analysis results, the file analysis module is used for carrying out analysis processing on the acquired data, and the safety early warning module is used for carrying out early warning and reminding on related technicians according to the analysis results.
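The connections between the four modules can be sketched as a small pipeline. This is only an illustrative sketch: the names `Database`, `analyze`, `warn` and `handle_file` and the placeholder sensitivity rule are assumptions introduced here, and encryption is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Database:
    """Stands in for the encrypted store; encryption is omitted in this sketch."""
    collected: list = field(default_factory=list)
    results: list = field(default_factory=list)


def analyze(record: dict) -> dict:
    # Placeholder analysis (hypothetical rule): flag text containing "secret".
    # The actual analysis is what steps S2 and S3 of the method describe.
    sensitive = "secret" in record["text"].lower()
    return {"name": record["name"], "sensitive": sensitive}


def warn(result: dict) -> Optional[str]:
    # Safety early warning module: produce a reminder only for sensitive files.
    if result["sensitive"]:
        return f"Warning: {result['name']} looks sensitive, consider withdrawing it"
    return None


def handle_file(db: Database, name: str, text: str) -> Optional[str]:
    record = {"name": name, "text": text}
    db.collected.append(record)          # acquisition module -> database
    result = analyze(db.collected[-1])   # database -> file analysis module
    db.results.append(result)            # file analysis module -> database
    return warn(result)                  # file analysis module -> early warning
```

The analysis result is written back to the database as well as passed to the warning step, mirroring the two outputs of the file analysis module.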

Further, the information acquisition module comprises a basic information input unit and a file detection unit, wherein the basic information input unit is used for entering basic data information, and the file detection unit is used for monitoring the content of a file, classifying the file according to the suffix name of the transmitted file, and acquiring document information and image information.

Further, the database comprises a data encryption unit and a data storage unit. The data encryption unit encrypts the collected data and the analysis results with the SM4 encryption algorithm, thereby ensuring the security of the computer network, avoiding information leakage, and protecting the privacy and property of users. SM4 is a block cipher algorithm with a block length of 128 bits and a key length of 128 bits. It is secure and efficient, and has the following advantages in design and implementation: a. involution: the decryption algorithm is the same as the encryption algorithm except that the round keys are used in reverse order, the decryption round keys being the encryption round keys in reverse; b. the subkey generation algorithm is structurally similar to the encryption algorithm, so resources are reused in the design; c. the encryption algorithm and the key expansion algorithm both adopt a 32-round nonlinear iterative structure, in which the 128-bit plaintext and the key each undergo 32 nonlinear iterations to obtain the final result.

The data storage unit comprises a document information subunit and an image information subunit. The document information subunit stores the collected basic document information and the analyzed historical document information by a hash storage method, and the image information subunit likewise stores the collected basic image information and the historical image information by hash storage. Hash storage is a lookup technique that establishes a correspondence between the storage location of a data element and its key, so that the storage address of a node is calculated directly from the key of the node. Hashing is a development of the array storage mode; compared with an array it offers faster data access and fast lookup, and it has the following advantages: a. the storage space is arbitrary; b. relationships are stored explicitly; c. the ability to express relationships is strong; d. elements are inserted and deleted in O(1) time (only the pointers of nodes need to be changed), so that memory can be managed dynamically.
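The hash storage idea, in which the storage address is computed directly from the key, can be illustrated with a small chained hash table. This is a sketch only: the class name `HashStore`, the bucket count and the use of Python's built-in `hash` are assumptions, since the text does not specify a hash function or a collision strategy.

```python
class HashStore:
    """Minimal hash table with separate chaining (illustrative only)."""

    def __init__(self, buckets: int = 16):
        self._buckets = [[] for _ in range(buckets)]

    def _address(self, key: str) -> int:
        # The storage address is computed directly from the key.
        return hash(key) % len(self._buckets)

    def put(self, key: str, value) -> None:
        chain = self._buckets[self._address(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing entry
                return
        chain.append((key, value))

    def get(self, key: str, default=None):
        for k, v in self._buckets[self._address(key)]:
            if k == key:
                return v
        return default

    def delete(self, key: str) -> bool:
        chain = self._buckets[self._address(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False


store = HashStore()
store.put("sensitive_word:1", "contract")
store.put("sensitive_word:2", "salary")
```

Each operation touches a single bucket, so access stays close to constant time on average as long as the chains remain short.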

Further, the file analysis module comprises a document analysis unit and an image analysis unit, wherein the document analysis unit is used for analyzing and processing document information sent by a user by combining basic document information and historical document information stored in a database, and the image analysis unit is used for analyzing and processing image information sent by the user by combining the basic image information and the historical image information stored in the database.

Further, the safety early warning module comprises a time control unit and an alarm reminding unit. The time control unit sets the user withdrawal time individually according to the analysis results: within the user withdrawal time the user can withdraw a transmitted file, and after the user withdrawal time has elapsed the receiving end receives the file and can view it. The alarm reminding unit gives the user a safety early warning according to the analysis results when an abnormal condition occurs, so that the user learns of it in time and withdraws the transmitted file. This prevents an important file that was sent in error from being opened by others and leaked, and protects the privacy and information security of the user.

A computer network security early warning method based on big data comprises the following steps:

S1, inputting basic data information, monitoring file content sent by a user, collecting document information and image information of the file, and carrying out classified encryption storage;

S2, analyzing the degree of sensitive words in the document sent by the user according to the collected document information and the basic data information;

S3, analyzing the contours of the image sent by the user according to the acquired image information and the basic data information;

S4, according to the analysis results, when an abnormal situation occurs, setting the user withdrawal time individually and giving the user a safety early warning.

Further, in step S1, basic data information is entered, and each transmitted file is classified as a document class file or an image class file according to the suffix name of the monitored file sent by the user. Text information is extracted from document class files, image class files are captured to obtain image information, and the collected data is encrypted and stored in the database.
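The suffix-based classification in step S1 might look like the following sketch. The suffix lists and the function name `classify_file` are hypothetical; the text does not enumerate which suffixes belong to which class.

```python
from pathlib import Path

# Hypothetical suffix lists, chosen only for illustration.
DOCUMENT_SUFFIXES = {".txt", ".doc", ".docx", ".pdf"}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".mp4"}


def classify_file(filename: str) -> str:
    """Return 'document', 'image' or 'unknown' based on the file suffix."""
    suffix = Path(filename).suffix.lower()
    if suffix in DOCUMENT_SUFFIXES:
        return "document"
    if suffix in IMAGE_SUFFIXES:
        return "image"
    return "unknown"
```

Videos are placed in the image class here because the description groups pictures and videos together; a file with an unrecognized suffix is reported as unknown rather than forced into either class.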

Further, in step S2, the sensitivity degree of the document is analyzed according to the collected document information and the basic data information, including the steps of:

S201, recognizing the collected document information, and labeling and segmenting the characters in the document by connected domains, wherein a connected domain generally refers to an image region formed by adjacent foreground pixel points that have the same pixel value; connected-domain analysis can be used in any application scenario in which a foreground target must be extracted for subsequent processing, and it operates on a binarized image;

S202, merging the segmented connected domains: a connected domain A and a connected domain B are selected, and the horizontal overlap rate P₁ and the vertical overlap rate P₂ of the characters are calculated according to the following formula:

where a₁ denotes the leftmost column index of connected domain A, a₂ its rightmost column index, a₃ its uppermost row index, and a₄ its lowermost row index; b₁ denotes the leftmost column index of connected domain B, b₂ its rightmost column index, b₃ its uppermost row index, and b₄ its lowermost row index; p₁ and p₂ are constants set by the relevant technician;

S203, recognizing individual characters with a convolutional neural network, and detecting the meaningful words a_i in the character string by NLP technology to form a character string set A = {a₁, a₂, …, a_m}, where m is the number of character strings. NLP refers to natural language processing, an important direction in computer science and artificial intelligence that studies the theories and methods enabling effective communication between people and computers in natural language; it mainly comprises natural language understanding and natural language generation, and is mainly applied to machine translation, public opinion monitoring, automatic summarization, opinion extraction, text classification, question answering, text semantic comparison, speech recognition, Chinese OCR and the like;

S204, combining the basic data information and the historical document information recorded in the database to form a document sensitive word set B = {b₁, b₂, …, b_n}, where n is the number of document sensitive words, and calculating the character string similarity F by the following formula:

$$F = 1 - \frac{d(a_i, b_j)}{\max\left(L(a_i), L(b_j)\right)}$$

where d(a_i, b_j) denotes the edit distance between character string a_i in the transmitted file and character string b_j in the database, the edit distance being the minimum number of editing operations required to convert one of the two character strings into the other; L(a_i) denotes the length of character string a_i, L(b_j) denotes the length of character string b_j, and max(L(a_i), L(b_j)) denotes the larger of the lengths of character strings a_i and b_j;

A character string similarity threshold F_threshold is set: when F < F_threshold, the two character strings are dissimilar; when F ≥ F_threshold, the two character strings are similar. The number s of similar character strings is counted, and the document sensitivity f is calculated as follows:

$$f = \frac{s}{m} \times 100\%$$

A document sensitivity threshold f_threshold is set: when f < f_threshold, the document sensitivity is low and no alarm is given; when f ≥ f_threshold, the document sensitivity is high, the user is reminded through a pop-up window and voice, and the analysis result is stored in encrypted form.
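The string similarity and document sensitivity calculation of step S204 can be sketched as follows. Two points are assumptions rather than statements of the method: the similarity is taken to be the normalized form F = 1 - d / max(L(a), L(b)), and a document string is counted as similar when it reaches the threshold against at least one sensitive word. The function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using insert, delete and substitute operations."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed form: F = 1 - d(a, b) / max(len(a), len(b))."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def document_sensitivity(words, sensitive_words, string_threshold) -> float:
    """Fraction s/m of document strings similar to some sensitive word."""
    if not words:
        return 0.0
    similar = sum(
        1 for w in words
        if any(similarity(w, s) >= string_threshold for s in sensitive_words)
    )
    return similar / len(words)
```

The returned fraction would then be compared with the document sensitivity threshold to decide whether to warn the user.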

Further, in step S3, the sensitivity of the image is analyzed according to the collected image information and the basic data information, including the following steps:

S301, collecting an image file sent by a user, and performing basic preprocessing;

S302, extracting the contours of the acquired image, comprising the following steps:

S302-1, filtering an image through a nonlinear filter to obtain an image with textures and noise filtered and edges and contours reserved;

The image is placed in a coordinate system, where the position of pixel point i in the image is I(x_i, y_i), x_i and y_i being the values of the pixel on the two coordinate axes, forming a set X = {(x₁, y₁), (x₂, y₂), …, (x_k, y_k)}, where k is the number of pixel points; the filtering index Q is calculated by the following formula:

where I' denotes the filtered image; α denotes the smoothing index, and the larger α is, the smoother the image; β₁ and β₂ denote smoothing weights; ∂ denotes the partial derivative; T denotes the logarithmic brightness channel of the input image, that is, the channel holding the brightness of the image on a logarithmic scale (a color digital image is composed of pixels, each pixel is composed of a set of coded primary colors, and a channel is a grayscale image of the same size as the color image that consists of only one of these primary colors); γ denotes the image gradient index, and the larger γ is, the sharper the preserved image edges; ε denotes a constant. The image I' for which the filtering index Q is minimal is the new, filtered image;

S302-2, performing surround suppression on the filtered image;

The suppression amount r(x, y) is calculated by the following formula:

$$r(x, y) = \{Z * \omega\}(x, y)$$

where Z denotes the result of filtering the phase consistency quantity. Phase consistency (phase congruency) refers to the observation that, in the frequency domain of an image, edge-like features occur most often where the frequency components are in phase; the phase consistency quantity is obtained by the phase consistency calculation, which is prior art and is not described further here. ω denotes the distance weight. The suppressed image u(x, y) is calculated by the following formula:

$$u(x, y) = H\left[\{Z - \delta \cdot r\}(x, y)\right]$$

where H denotes the summation of the convolution results of the odd-symmetric filters, and δ denotes the suppression factor: the smaller δ is, the weaker the suppression; conversely, the larger δ is, the stronger the suppression;

S302-3, binarizing the suppressed image to obtain a contour map of the image;

S303, analyzing the similarity of the images from the contour-extracted image in combination with the basic data information and the historical image information in the database;

Contour pixel points are selected on the contour-extracted image, with position coordinates X_μ(x_μ, y_μ), X_ρ(x_ρ, y_ρ), X_τ(x_τ, y_τ) and X_ξ(x_ξ, y_ξ), to form two vectors, and the angle θ₁ between the two vectors is calculated by the following formula:

Pixel points with the same contour features are selected on the images in the database, with position coordinates Y_μ(x_μ', y_μ'), Y_ρ(x_ρ', y_ρ'), Y_τ(x_τ', y_τ') and Y_ξ(x_ξ', y_ξ'), to form two vectors, and the angle θ₂ between the two vectors is calculated by the following formula:

S304, when θ₁ ≠ θ₂, the vectors formed by the compared pixel points are dissimilar; when θ₁ = θ₂, the vectors formed by the compared pixel points are similar. The number of similar comparisons v and the total number of comparisons c are counted, and the image sensitivity f' is calculated according to the following formula:

$$f' = \frac{v}{c} \times 100\%$$

An image sensitivity threshold f'_threshold is set: when f' < f'_threshold, the image sensitivity is low and no alarm is given; when f' ≥ f'_threshold, the image sensitivity is high, the user is reminded through a pop-up window and voice, and the analysis result is stored in encrypted form.
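The angle comparison in steps S303 and S304 can be sketched as below. Several details are assumptions: each group of four contour points is taken to form one vector from the first point to the second and another from the third to the fourth, the angle is computed with the usual arccosine of the normalized dot product, and a small tolerance `tol` replaces exact equality of θ₁ and θ₂ to cope with floating-point rounding. The function names are hypothetical.

```python
import math


def angle_between(p1, p2, p3, p4) -> float:
    """Angle in radians between vector p1->p2 and vector p3->p4."""
    ux, uy = p2[0] - p1[0], p2[1] - p1[1]
    vx, vy = p4[0] - p3[0], p4[1] - p3[1]
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    if norm == 0:
        raise ValueError("zero-length vector")
    cos_theta = (ux * vx + uy * vy) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def image_sensitivity(sent_quads, stored_quads, tol: float = 1e-6) -> float:
    """Share v/c of compared point groups whose vector angles agree."""
    pairs = list(zip(sent_quads, stored_quads))
    if not pairs:
        return 0.0
    matches = sum(
        1 for q_sent, q_stored in pairs
        if abs(angle_between(*q_sent) - angle_between(*q_stored)) <= tol
    )
    return matches / len(pairs)
```

Because only the angle is compared, two contours that differ by translation or uniform scaling still count as similar in this sketch.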

Further, in step S4, according to the analysis results, the user withdrawal time is set individually after the user sends a file. Within the user withdrawal time the user can withdraw the sent file, and the receiving end does not display the file information; after the user withdrawal time has elapsed, the receiving end receives the file and can view it;

The standard user withdrawal time is set to t_standard, and the actual document withdrawal time t₁ and image withdrawal time t₂ are calculated by the following formulas:

where φ₁ and φ₂ denote coefficients;

When the sensitivity of a document or image sent by the user is higher than the threshold, the user is given a safety early warning through a pop-up window and voice. This avoids leakage of information caused by the user failing to withdraw a mistakenly sent file in time. Because the user withdrawal time is set according to the sensitivity of the file, the receiving end cannot open the file immediately even if the user has sent it by mistake or to the wrong recipient, which ensures the security and privacy of the data.
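The withdrawal window itself, in which the file stays hidden from the receiving end until the user withdrawal time has elapsed, can be sketched as a small state holder. The class name `PendingFile` and its fields are hypothetical, and the length of the window is simply passed in; how it is derived from the sensitivity is outside this sketch.

```python
from dataclasses import dataclass


@dataclass
class PendingFile:
    name: str
    sent_at: float          # seconds, taken from any monotonic clock
    withdraw_window: float  # user withdrawal time in seconds
    withdrawn: bool = False

    def withdraw(self, now: float) -> bool:
        """Withdraw the file; this succeeds only inside the withdrawal window."""
        if not self.withdrawn and now < self.sent_at + self.withdraw_window:
            self.withdrawn = True
        return self.withdrawn

    def visible_to_receiver(self, now: float) -> bool:
        """The receiving end sees the file only after the window has elapsed."""
        return not self.withdrawn and now >= self.sent_at + self.withdraw_window
```

Passing the current time explicitly keeps the sketch deterministic; a real system would read it from a clock when the receiving end asks for the file.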

Compared with the prior art, the invention has the following beneficial effects:

In the invention, basic information is entered and the file information generated by the user is acquired in real time; files are classified as document class or image class according to the suffix name of the file sent by the user, and the file content is analyzed in combination with the basic data information and the historical data information in the database. For documents, the words in the document are compared with the sensitive word information in the database to obtain the document sensitivity, and the user is given a safety early warning when the document sensitivity is higher than the threshold. For images, contour pixel points in the image are extracted to form vectors and the angle between the vectors is calculated; pixel points with the same contour features are selected on the images in the database to form vectors, and the angle between those vectors is calculated likewise. When the angles are equal, the pixel points form similar vectors, from which the image sensitivity is analyzed; when the image sensitivity is greater than the threshold, the user is given a safety early warning. This protects the information security of the user and reminds the user to withdraw the file. The analysis results are encrypted and stored in the database, which enriches the historical data in the database and improves the accuracy and robustness of the system.

Meanwhile, the user withdrawal time is set individually according to the document sensitivity and image sensitivity of the file. Within the user withdrawal time the user can withdraw the transmitted file, and the receiving end does not display the file information; after the user withdrawal time has elapsed, the receiving end receives the file and can view it. The receiving end therefore cannot open the file immediately even if it was sent by mistake or to the wrong recipient, which ensures the security of the data, avoids losses caused by information leakage, and ensures safe use of the computer network.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:

FIG. 1 is a schematic diagram of the module composition of a computer network security early warning system based on big data according to the present invention;

FIG. 2 is a flow chart of the steps of a computer network security pre-warning method based on big data according to the present invention;

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to FIGS. 1-2, the present invention provides the following technical solution: a computer network security early warning system based on big data, the network security early warning system comprising an information acquisition module, a database, a file analysis module and a safety early warning module;

The output end of the information acquisition module is connected with the input end of the database, the output end of the database is connected with the input end of the file analysis module, the output end of the file analysis module is connected with the input end of the safety early warning module, and the output end of the file analysis module is connected with the input end of the database;

The information acquisition module is used for acquiring basic data information and monitoring transmitted file data. It comprises a basic information input unit and a file detection unit: the basic information input unit is used for entering basic data information, such as file types, basic document sensitive word information and basic image information; the file detection unit is used for monitoring the content of a file, classifying the file according to the suffix name of the transmitted file (for example as document class or image class), and acquiring document information and image information, such as the sensitive words of a document and the contours of an image.

The database is used for encrypting and storing the collected data and the analysis results, and comprises a data encryption unit and a data storage unit. The data encryption unit encrypts the collected data and the analysis results with the SM4 encryption algorithm, thereby ensuring the security of the computer network, avoiding information leakage, and protecting the privacy and property of users. SM4 is a block cipher algorithm with a block length of 128 bits and a key length of 128 bits. The national cipher algorithm SM4 is secure and efficient, and has the following advantages in design and implementation: a. involution: the decryption algorithm is the same as the encryption algorithm except that the round keys are used in reverse order, the decryption round keys being the encryption round keys in reverse; b. the subkey generation algorithm is structurally similar to the encryption algorithm, so resources are reused in the design; c. the encryption algorithm and the key expansion algorithm both adopt a 32-round nonlinear iterative structure, in which the 128-bit plaintext and the key each undergo 32 nonlinear iterations to obtain the final result.

The data storage unit comprises a document information subunit and an image information subunit. The document information subunit stores the collected basic document information and the analyzed historical document information by a hash storage method, and the image information subunit likewise stores the collected basic image information and the historical image information by hash storage. Hash storage is a lookup technique that establishes a correspondence between the storage location of a data element and its key, so that the storage address of a node is calculated directly from the key of the node. Hashing is a development of the array storage mode; compared with an array it offers faster data access and fast lookup, and it has the following advantages: a. the storage space is arbitrary; b. relationships are stored explicitly; c. the ability to express relationships is strong; d. elements are inserted and deleted in O(1) time (only the pointers of nodes need to be changed), so that memory can be managed dynamically.

The file analysis module is used for analyzing and processing the collected data, and comprises a document analysis unit and an image analysis unit. The document analysis unit analyzes the document information sent by the user, such as text messages, text documents and linked documents, in combination with the basic document information and historical document information stored in the database; the image analysis unit analyzes the image information sent by the user, such as pictures and videos, in combination with the basic image information and historical image information stored in the database.

The safety early warning module is used for giving early warnings and reminders to the relevant technicians according to the analysis results. It comprises a time control unit and an alarm reminding unit. The time control unit sets the user withdrawal time individually according to the analysis results: within the user withdrawal time the user can withdraw a transmitted file, and after the user withdrawal time has elapsed the receiving end receives the file and can view it. The alarm reminding unit gives the user a safety early warning according to the analysis results when an abnormal condition occurs, for example through a pop-up window or a voice alarm, so that the user learns of it in time and withdraws the transmitted file, which prevents an important file sent in error from being opened by others and leaked.

In step S1, basic data information is entered, and the transmitted files are classified as document class or image class according to the suffix name of the monitored file sent by the user. The document class comprises text messages, text documents, linked documents and the like sent by the user, from which the text information is extracted; the image class comprises pictures, videos, stickers and the like sent by the user, which are captured to obtain image information. The collected data is encrypted and stored in the database.

In step S2, the sensitivity degree of the document is analyzed based on the collected document information and the basic data information, including the steps of:

S203, recognizing individual characters with a convolutional neural network, and detecting the meaningful words a_i in the character string by NLP technology to form a character string set A = {a₁, a₂, …, a_m}, where m is the number of character strings. NLP refers to natural language processing, an important direction in computer science and artificial intelligence that studies the theories and methods enabling effective communication between people and computers in natural language; it mainly comprises natural language understanding and natural language generation, and is mainly applied to machine translation, public opinion monitoring, automatic summarization, opinion extraction, text classification, question answering, text semantic comparison, speech recognition, Chinese OCR and the like;

S204, combining the basic data information and the historical document information recorded in the database to form a document sensitive word set B = {b₁, b₂, …, b_n}, where n is the number of document sensitive words, and calculating the character string similarity F by the following formula:

$$F = 1 - \frac{d(a_i, b_j)}{\max\left(L(a_i), L(b_j)\right)}$$

where d(a_i, b_j) denotes the edit distance between character string a_i in the transmitted file and character string b_j in the database, the edit distance being the minimum number of editing operations required to convert one of the two character strings into the other, where the permitted editing operations are replacing one character with another, inserting a character, and deleting a character; L(a_i) denotes the length of character string a_i, L(b_j) denotes the length of character string b_j, and max(L(a_i), L(b_j)) denotes the larger of the lengths of character strings a_i and b_j;

In step S3, the sensitivity of the image is analyzed according to the collected image information and the basic data information, including the following steps:

S301, collecting an image file sent by a user and performing basic preprocessing, the preprocessing operations including capturing video key frames, rotation calibration and the like;

placing the image in a coordinate system, wherein the position coordinates of a pixel point i in the image are I(x_i, y_i), x_i and y_i representing pixel values of the image, to form a set X = {(x₁, y₁), (x₂, y₂), ..., (x_k, y_k)}, where k is the number of pixel points, and calculating a filtering index Q by the following formula:

S302-2, performing peripheral inhibition on the filtered image;

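The suppression amount r(x, y) is calculated by the following formula:

r(x, y) = {Z*ω}(x, y);

where Z is the filtered result of the phase coincidence quantity and ω is the distance weight; the suppressed image u(x, y) is calculated by the following formula:

u(x, y) = H[{Z - δ·r}(x, y)];

where H is the summation of the convolution results of the odd symmetric filter and δ is the inhibition factor;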
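The two formulas above can be sketched as a convolution followed by a weighted subtraction. This is a minimal sketch: zero padding at the image border is an assumption, the function names are illustrative, and the operator H is left as a pluggable function that defaults to the identity because its construction is not detailed here.

```python
def convolve_same(image, kernel):
    """2-D convolution with zero padding; the output has the shape of image."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r0, c0 = k_rows // 2, k_cols // 2  # kernel centre
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    yy, xx = y - (i - r0), x - (j - c0)
                    if 0 <= yy < rows and 0 <= xx < cols:
                        total += image[yy][xx] * kernel[i][j]
            out[y][x] = total
    return out


def suppress(Z, omega, delta, H=lambda img: img):
    """u = H[Z - delta * r] with r = Z * omega (convolution).
    H defaults to the identity, which is a placeholder assumption."""
    r = convolve_same(Z, omega)
    diff = [[z - delta * rv for z, rv in zip(z_row, r_row)]
            for z_row, r_row in zip(Z, r)]
    return H(diff)
```

With a kernel that is 1 at its centre and 0 elsewhere, r equals Z, so the result reduces to (1 - δ)·Z, which gives a quick sanity check of the subtraction step.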

Selecting pixel points with the same contour characteristics on images in the database, with position coordinates Y_μ(x_μ', y_μ'), Y_ρ(x_ρ', y_ρ'), Y_τ(x_τ', y_τ'), Y_ξ(x_ξ', y_ξ'), and forming two vectors from them; the angle θ₂ between the vectors is calculated by the following formula:
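The angle between two vectors built from contour points can be sketched with the standard dot-product formula. Which pairs of points form each vector is not recoverable from this text, so the pairing is left to the caller, and the function names are illustrative only.

```python
import math


def vector(p, q):
    """Vector from point p to point q, each given as (x, y)."""
    return (q[0] - p[0], q[1] - p[1])


def angle_between(v, w):
    """Angle in degrees between 2-D vectors v and w, via cos(theta) = v.w / (|v||w|)."""
    norm = math.hypot(v[0], v[1]) * math.hypot(w[0], w[1])
    if norm == 0:
        raise ValueError("angle is undefined for a zero-length vector")
    cos_theta = (v[0] * w[0] + v[1] * w[1]) / norm
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))
```

Because the result depends only on direction, two vector pairs that differ by scaling alone yield the same angle.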

In step S4, according to the analysis result, the user withdrawal time is set individually after the user sends a file; within the user withdrawal time, the user can withdraw the sent file and the receiving end does not display the file information, and after the user withdrawal time has elapsed, the receiving end receives the file for viewing;

where φ₁ and φ₂ are coefficients;

Example 1:

If the document information contains 15 character strings, the edit distance between the character string a_i in the transmitted file and the character string b_j in the database is 2, the length of a_i is 2, and the length of b_j is 3, the character string similarity F is calculated from these values. With the threshold F_threshold = 0.2, F > F_threshold indicates that the two character strings are similar. The number of similar character strings is 10 and the sensitivity threshold is f_threshold = 60%; the document sensitivity is calculated and found to be high, so at this moment the user is reminded through popup display and voice, and the analysis result is encrypted and stored. If the coefficient φ₁ is 1 and the standard user withdrawal time is 60 s, the document withdrawal time is calculated accordingly.

If contour pixel points are selected from the image information after contour extraction, with position coordinates (1, 1), (2, 3), (2, 1), (4, 1), the angle θ₁ between the vectors is as follows:

Pixel points with the same contour characteristics are selected on images in the database, with position coordinates (2, 2), (4, 7), (3, 1), (6, 1), and the angle θ₂ between the vectors is as follows:

At this time, θ₁ = θ₂ indicates that the vectors formed by the compared pixel points are similar. The number of similar comparisons is 20, the total number of comparisons is 25, and the image sensitivity threshold is f_threshold' = 75%; the image sensitivity is calculated and found to be high, so at this moment the user is reminded through popup display and voice, and the analysis result is encrypted and stored. If the coefficient φ₂ is 2, the image withdrawal time is calculated accordingly.
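The threshold checks in Example 1 can be sketched as below. The sensitivity formula is not reproduced in this text; treating sensitivity as the ratio of similar items to total items is an assumption that agrees with the counts and thresholds given in the example, and the function names are illustrative only.

```python
def sensitivity(similar_count: int, total_count: int) -> float:
    """Assumed definition: share of compared items judged similar."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return similar_count / total_count


def is_highly_sensitive(similar_count: int, total_count: int,
                        threshold: float) -> bool:
    """True when the assumed sensitivity ratio exceeds the threshold."""
    return sensitivity(similar_count, total_count) > threshold


# Counts and thresholds taken from Example 1.
document_alert = is_highly_sensitive(10, 15, 0.60)  # 10 of 15 strings similar
image_alert = is_highly_sensitive(20, 25, 0.75)     # 20 of 25 comparisons similar
```

Under this assumption both cases exceed their thresholds, which matches the example's conclusion that the document and the image are highly sensitive.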

It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

Finally, it should be noted that the foregoing is only a preferred embodiment of the present invention and does not limit the invention. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or make equivalent replacements of some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims

1. A computer network security early warning method based on big data, characterized by comprising the following steps:

S4, according to the analysis result, when abnormal conditions occur, the withdrawal time of the user is set in a personalized mode, and safety early warning reminding is carried out on the user;

S201, identifying the collected document information, and carrying out mark segmentation on characters in the document through a connected domain;

where a₁ is the leftmost column number of connected domain A, a₂ is the rightmost column number of connected domain A, a₃ is the uppermost row number of connected domain A, and a₄ is the lowermost row number of connected domain A; b₁ is the leftmost column number of connected domain B, b₂ is the rightmost column number of connected domain B, b₃ is the uppermost row number of connected domain B, and b₄ is the lowermost row number of connected domain B; p₁ and p₂ are constants;

S203, identifying single characters through a convolutional neural network, and detecting meaningful words a_i in the character strings through NLP technology to form a character string set A = {a₁, a₂, …, a_m}, where m is the number of character strings;

where d(a_i, b_j) is the edit distance between the character string a_i in the transmitted file and the character string b_j in the database, L(a_i) is the length of the character string a_i, L(b_j) is the length of the character string b_j, and max(L(a_i), L(b_j)) is the larger of the lengths of a_i and b_j;

2. The computer network security early warning method based on big data according to claim 1, characterized in that: in step S1, basic data information is input, the transmitted files are divided into document types and image types according to the suffix names of the files sent by the monitored users, text information in the document type files is extracted, the image type files are captured to obtain image information, and the collected data information is stored in a database in encrypted form.

3. The computer network security early warning method based on big data according to claim 2, characterized in that: in step S3, the image profile sent by the user is analyzed according to the collected image information and the basic data information, comprising the following steps:

S302, extracting the acquired image contour; comprises the following steps:

where I' is the filtered image, α is the smoothing index, β₁ and β₂ are the smoothing weights, ∂ denotes the partial derivative, T is the logarithmic luminance channel of the input image, γ is the image gradient index, and ε is a constant; when the filtering index Q is at its minimum, the image I' at that point is the new filtered image;

S302-2, performing peripheral inhibition on the filtered image;

The suppression amount r (x, y) is calculated by the following formula:

r(x, y) = {Z*ω}(x, y);

wherein Z is the filtered result of the phase coincidence quantity, ω is the distance weight, and the suppressed image u (x, y) is calculated by the following formula:

u(x, y) = H[{Z - δ·r}(x, y)];

wherein H is expressed as the summation of convolution results of the odd symmetric filter, and delta is expressed as the inhibition factor;

Selecting pixel points with the same contour characteristics on images in the database, with position coordinates Y_μ(x_μ', y_μ'), Y_ρ(x_ρ', y_ρ'), Y_τ(x_τ', y_τ'), Y_ξ(x_ξ', y_ξ'), and forming two vectors from them; the angle θ₂ between the vectors is calculated by the following formula:

4. A computer network security pre-warning method based on big data according to claim 3, characterized in that: in step S4, according to the analysis result, after the user sends the file, the withdrawal time of the user is set in a personalized way;

where φ₁ and φ₂ are coefficients;

When the sensitivity of the document or image sent by the user is higher than the threshold value, a security early warning reminder is given to the user through popup display and voice.

5. A big data based computer network security pre-warning system, the system being applied to the implementation of the big data based computer network security pre-warning method as claimed in any one of claims 1 to 4, characterized in that: the network security early warning system comprises: the system comprises an information acquisition module, a database, a file analysis module and a safety early warning module;

6. The big data based computer network security pre-warning system of claim 5, wherein: the information acquisition module comprises a basic information input unit and a file monitoring unit, wherein the basic information input unit is used for inputting basic data information, and the file monitoring unit is used for monitoring the content of a file, classifying the file according to the suffix name of the transmitted file, and acquiring document information and image information.

7. The big data based computer network security pre-warning system of claim 6, wherein: the database comprises a data encryption unit and a data storage unit, wherein the data encryption unit encrypts collected data and analysis results through the SM4 encryption algorithm; the data storage unit comprises a document information subunit and an image information subunit, wherein the document information subunit stores collected basic document information and analyzed historical document information through a hash storage method, and the image information subunit stores the collected basic image information and the historical image information.

8. The big data based computer network security pre-warning system of claim 7, wherein: the file analysis module comprises a document analysis unit and an image analysis unit, wherein the document analysis unit is used for analyzing and processing document information sent by a user by combining basic document information and historical document information stored in a database, and the image analysis unit is used for analyzing and processing image information sent by the user by combining basic image information and historical image information stored in the database.

9. The big data based computer network security pre-warning system of claim 8, wherein: the safety early warning module comprises a time control unit and an alarm reminding unit, wherein the time control unit is used for setting the withdrawal time of a user in a personalized way according to analysis results, and the alarm reminding unit is used for carrying out safety early warning reminding on the user when abnormal conditions occur according to the analysis results.