
WO2021169064A1 - Edge network-based anomaly processing method and apparatus - Google Patents

Edge network-based anomaly processing method and apparatus

Info

Publication number
WO2021169064A1
Authority
WO
WIPO (PCT)
Prior art keywords
service
abnormal
edge node
node
monitoring event
Prior art date
Application number
PCT/CN2020/091867
Other languages
English (en)
Chinese (zh)
Inventor
朱少武
Original Assignee
网宿科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 网宿科技股份有限公司
Publication of WO2021169064A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0631: Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0654: Management of faults, events, alarms or notifications using network fault recovery
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/069: Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441: Countermeasures against malicious traffic
    • H04L63/145: Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/20: Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1001: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers

Definitions

  • The present invention relates to the technical field of network security, and in particular to an abnormality processing method and device based on an edge network.
  • In the prior art, each edge node collects its own service data and reports it to the central node, and the central node then analyzes whether each edge node is abnormal based on this service data. If an abnormality is found, operation and maintenance personnel are notified to repair the abnormal edge node.
  • The problem with this method is that each edge node holds a large amount of service data, and uploading all of it to the central node for centralized anomaly analysis usually costs the central node considerable time and resources. This places heavy pressure on the central node and also reduces the real-time performance of exception handling.
  • The present invention provides an abnormality processing method and device based on an edge network, to solve the technical problems in the prior art that centralized analysis of each edge node's abnormalities by the central node places high pressure on the central node and delays the handling of abnormalities.
  • The present invention provides an abnormality processing method based on an edge network, the edge network including a central node and at least one edge node; the method includes:
  • Any edge node analyzes service data using anomaly analysis rules to determine whether the first service in the edge node is abnormal; the service data includes service data corresponding to the first service. Further, after the edge node determines that the first service is abnormal, if an exception handling rule for the first service exists in the edge node, the edge node uses the exception handling rule to repair the first service; if no exception handling rule for the first service exists in the edge node, the edge node reports the abnormality to the central node.
  • The central node determines the exception handling rule of the first service and sends it to the edge node; accordingly, the edge node receives the exception handling rule of the first service sent by the central node and uses it to repair the first service.
  • When the edge node cannot handle the exception, it reports the exception to the central node, and the central node issues exception handling rules, so that the edge node can handle the exception according to the exception handling rule set by the central node, which improves the accuracy and comprehensiveness of exception handling.
  • Before any edge node analyzes service data using anomaly analysis rules, it also sends a registration request to the central node; the registration request is used to establish a communication connection between the central node and the edge node. After the edge node establishes the communication connection with the central node, it obtains the self-closed-loop strategies corresponding to various services from the central node; the various services include the first service; the self-closed-loop strategy corresponding to any service includes the exception analysis rules for the service, and may also include the exception handling rules for the service.
  • The self-closed-loop strategy can be configured on the central node side instead of being configured separately in each edge node, which improves the flexibility and convenience of self-closed-loop strategy configuration; moreover, configuring the self-closed-loop strategy per service makes abnormality identification more targeted, better reflects the true service capability of the service, and improves the accuracy of anomaly recognition and anomaly handling.
  • The self-closed-loop strategies corresponding to the various services are obtained as follows: after detecting that the user has entered abnormal monitoring configuration information in the abnormal monitoring configuration interface, the central node obtains and parses the abnormal monitoring configuration information, obtains the self-closed-loop strategies corresponding to the various services, and stores them in the local database of the central node.
  • The self-closed-loop strategies corresponding to the various services can be set by the user on the abnormal monitoring configuration interface of the central node, so the self-closed-loop strategy of a service can be decoupled from the business, and different services can be configured with different self-closed-loop strategies, which improves the flexibility of exception handling; moreover, configuring each self-closed-loop strategy through the configuration interface also simplifies operations, reduces manual operation and maintenance costs and events, and improves the efficiency of exception handling.
  • The anomaly analysis rule of any service includes an anomaly analysis rule corresponding to each monitoring event in the service. Any edge node analyzing the service data using the anomaly analysis rule to determine whether the first service in the edge node is abnormal includes: for any monitoring event in the first service, the edge node parses out the service data of the monitoring event from the service data of the first service, and calls the anomaly analysis algorithm that matches the type of the service data of the monitoring event to analyze that service data; if the analysis result meets the first abnormal condition corresponding to the monitoring event, the edge node determines that the monitoring event is abnormal and determines whether the first service is abnormal at least according to the monitoring event; if the analysis result does not meet the first abnormal condition corresponding to the monitoring event, the edge node determines that the first service is not abnormal.
  • The edge node determining whether the first service is abnormal at least according to the monitoring event includes: if the edge node determines that the abnormal condition corresponding to the monitoring event includes only the first abnormal condition, it determines that the first service is abnormal; if it determines that the abnormal condition corresponding to the monitoring event also includes a second abnormal condition, and the second abnormal condition is an impact time, then when the abnormal duration of the monitoring event is less than the impact time, it determines that the first service is not abnormal, and when the abnormal duration of the monitoring event is greater than or equal to the impact time, it determines that the first service is abnormal.
  • The method further includes: if the edge node determines that the second abnormal condition is that associated monitoring events are abnormal at the same time, it determines whether the other monitoring events associated with the monitoring event are abnormal; when all the other monitoring events are also abnormal, it determines that the first service is abnormal, and when at least one other monitoring event is normal, it determines that the first service is not abnormal.
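The decision logic described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `EventRule` class, its field names, and the `service_is_abnormal` function are hypothetical and do not come from the patent, and the sketch assumes at most one kind of second abnormal condition is configured for a given monitoring event.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EventRule:
    """Hypothetical container for the second abnormal condition of one monitoring event."""
    impact_time_s: Optional[float] = None  # second condition: impact time, in seconds
    associated_events: List[str] = field(default_factory=list)  # second condition: associated events


def service_is_abnormal(
    first_condition_met: bool,
    abnormal_duration_s: float,
    rule: EventRule,
    event_states: Dict[str, bool],
) -> bool:
    """Decide whether the service is abnormal based on one monitoring event.

    event_states maps the name of each other monitoring event to True if it is abnormal.
    """
    if not first_condition_met:
        # The analysis result does not meet the first abnormal condition.
        return False
    if rule.impact_time_s is not None:
        # Abnormal only once the abnormal duration reaches the impact time.
        return abnormal_duration_s >= rule.impact_time_s
    if rule.associated_events:
        # Abnormal only if every associated monitoring event is abnormal too.
        return all(event_states.get(name, False) for name in rule.associated_events)
    # Only the first abnormal condition is configured.
    return True
```

For example, with `EventRule(impact_time_s=300)` an event that has been abnormal for 299 seconds does not yet mark the service as abnormal, while 300 seconds does.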
  • the present invention provides an abnormality processing device based on an edge network.
  • the edge network includes a central node and at least one edge node; the device includes:
  • An anomaly analysis module configured to analyze service data using anomaly analysis rules to determine whether the first service in the edge node is abnormal; the service data includes service data corresponding to the first service;
  • the exception processing module is configured to, after determining that the first service is abnormal, if there is an exception handling rule for the first service in the edge node, use the exception handling rule to repair the first service; If there is no exception handling rule for the first service in the edge node, it is reported to the central node.
  • The central node determines the exception handling rule of the first service and sends it to the edge node; the device further includes a transceiver module, the transceiver module is configured to: receive the exception handling rule of the first service sent by the central node; accordingly, the exception handling module is also configured to: use the exception handling rule of the first service to repair the first service.
  • The device further includes a transceiver module; before the abnormality analysis module analyzes the service data using abnormal analysis rules, the transceiver module is configured to: send a registration request to the central node, the registration request being used for the central node to establish a communication connection with the edge node; and, after the communication connection is established with the central node, obtain the self-closed-loop strategies corresponding to various services from the central node; the various services include the first service; the self-closed-loop strategy corresponding to any service includes the exception analysis rules of the service, and may also include the exception handling rules of the service.
  • The self-closed-loop strategies corresponding to the various services are obtained as follows: after detecting that the user has entered abnormal monitoring configuration information in the abnormal monitoring configuration interface, the central node obtains and parses the abnormal monitoring configuration information, obtains the self-closed-loop strategies corresponding to the various services, and stores them in the local database of the central node.
  • The anomaly analysis rule of any service includes an anomaly analysis rule corresponding to each monitoring event in the service; the anomaly analysis module is specifically configured to: for any monitoring event in the first service, parse out the service data of the monitoring event from the service data of the first service, and invoke an abnormality analysis algorithm that matches the type of the service data of the monitoring event to analyze the service data of the monitoring event; if the analysis result meets the first abnormal condition corresponding to the monitoring event, determine that the monitoring event is abnormal, and determine whether the first service is abnormal at least according to the monitoring event; if the analysis result does not meet the first abnormal condition corresponding to the monitoring event, determine that the first service is not abnormal.
  • The abnormality analysis module is specifically configured to: if it is determined that the abnormal condition corresponding to the monitoring event includes only the first abnormal condition, determine that the first service is abnormal; if it is determined that the abnormal condition corresponding to the monitoring event also includes a second abnormal condition, and the second abnormal condition is an impact time, then when the abnormal duration of the monitoring event is less than the impact time, determine that the first service is not abnormal, and when the abnormal duration of the monitoring event is greater than or equal to the impact time, determine that the first service is abnormal.
  • The abnormality analysis module is further configured to: if it is determined that the second abnormal condition is that associated monitoring events are abnormal at the same time, determine whether the other monitoring events associated with the monitoring event are abnormal; when all the other monitoring events are also abnormal, determine that the first service is abnormal, and when at least one other monitoring event is normal, determine that the first service is not abnormal.
  • A computing device includes at least one processor and at least one memory, wherein the memory stores a computer program, and when the program is executed by the processor, the processor performs any of the methods described in the first aspect above.
  • The present invention provides a computer-readable storage medium that stores a computer program executable by a computing device. When the program runs on the computing device, the computing device executes any of the methods described in the first aspect above.
  • FIG. 1 is a schematic diagram of a system architecture of an edge network provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a corresponding process flow of an edge network-based exception handling method provided by an embodiment of the present invention
  • FIG. 3 is a schematic diagram of the overall interaction flow corresponding to an exception handling method provided by an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a monitoring device provided by an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a computing device provided by an embodiment of the present invention.
  • FIG. 1 is a schematic diagram of a system architecture of an edge network provided by an embodiment of the present invention.
  • the edge network includes a central node 110 and at least one edge node, such as an edge node 121, an edge node 122, and an edge node 123.
  • the central node 110 may be connected to any edge node, for example, it may be connected in a wired manner, or may be connected in a wireless manner, which is not specifically limited.
  • the central node 110 is a remote device, and each edge node is a near-end device, and any edge node can also be connected to a client (not shown in FIG. 1) to provide a near-end service to the client.
  • The edge node 121 can be connected to the client 131 and the client 132, and provide near-end services to the client 131 and the client 132;
  • the edge node 122 can be connected to the client 133, and provide near-end services to the client 133;
  • the edge node 123 can be connected to the client 134 and the client 135, and provide near-end services to the client 134 and the client 135.
  • The client can be any terminal device with communication and interaction functions, such as a notebook computer, an iPad, a mobile phone, or a router, which is not limited.
  • The central node 110 can pre-deliver business data to each edge node.
  • When the client has a data access request, the client can send the data access request to the central node 110, and the data access request first arrives at the edge node adjacent to the client.
  • That edge node checks, according to the data access request, whether the service data corresponding to the request is stored locally. If so, it returns the service data directly to the client; if not, it forwards the data access request to the central node 110.
  • The architecture in FIG. 1 is only an exemplary description and does not constitute a limitation on the solution; in a specific implementation, multiple layers (i.e., two or more layers) of edge nodes can also be deployed in the edge network.
  • In that case, the client's data access request first reaches the lowest-layer edge node. If the lowest-layer edge node stores the corresponding business data locally, it returns the business data to the client. If it does not store the corresponding business data locally, it forwards the data access request to the next-level edge node, which performs the same data response operation, and so on until the corresponding business data is returned to the client.
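The layered lookup described above can be illustrated with a short Python sketch. The function name `respond`, the use of in-memory dictionaries as local stores, and the returned source label are assumptions made for illustration only; falling back to the central node after the last edge layer follows the single-layer behaviour described for FIG. 1.

```python
from typing import Dict, List, Optional, Tuple


def respond(
    layers: List[Dict[str, bytes]],
    central: Dict[str, bytes],
    key: str,
) -> Tuple[Optional[bytes], str]:
    """Walk the edge layers from the lowest one upward, then fall back to the central node.

    layers[0] is the store of the lowest-layer edge node (closest to the client).
    Returns the data (or None if nobody has it) and a label naming who answered.
    """
    for level, store in enumerate(layers):
        if key in store:
            # This layer holds the business data locally and answers the client.
            return store[key], f"edge-layer-{level}"
    # No edge layer holds the data, so the request ends up at the central node.
    return central.get(key), "central"
```

With `layers = [{}, {"a": b"1"}]`, a request for `"a"` misses the lowest layer and is answered by the next layer up.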
  • The edge node in the embodiment of the present invention may be a single edge device, a cluster of edge devices, or a process in an edge device, which is not limited.
  • Fig. 2 is a schematic diagram of the process corresponding to an edge network-based exception handling method provided by an embodiment of the present invention.
  • the method is applicable to any edge node in the edge network, and the method includes:
  • Step 201: The edge node analyzes the service data using an abnormality analysis rule to determine whether the first service in the edge node is abnormal; the service data includes service data corresponding to the first service.
  • Step 202: After determining that the first service is abnormal, the edge node determines whether an exception handling rule for the first service exists in the edge node. If so, it uses the exception handling rule to repair the first service; if not, it reports the abnormality of the first service to the central node.
  • By performing service abnormality identification and abnormality repair on the edge node side instead of uniformly reporting to the central node for execution, the work pressure of the central node can be effectively reduced, and network overhead and time cost can be saved.
  • Because the edge node performs self-closed-loop processing of its own abnormalities, it can also discover and handle abnormalities in time, which not only improves the efficiency of abnormality identification and processing but also restores service availability promptly.
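Step 202 can be sketched in Python as a simple repair-or-report branch. The names `handle_abnormal_service`, `local_rules`, and `report_to_central` are hypothetical, and modelling an exception handling rule as a callable is an assumption made to keep the sketch short.

```python
from typing import Callable, Dict

# Assumption: an exception handling rule is modelled as a callable that repairs the service.
RepairAction = Callable[[], None]


def handle_abnormal_service(
    service: str,
    local_rules: Dict[str, RepairAction],
    report_to_central: Callable[[str], None],
) -> str:
    """Repair the service locally if a rule exists; otherwise report to the central node."""
    rule = local_rules.get(service)
    if rule is not None:
        rule()  # self-closed-loop repair on the edge node
        return "repaired"
    report_to_central(service)  # no local rule: escalate to the central node
    return "reported"
```

A caller could pass `{"concurrent_service": restart_fn}` as `local_rules`; any service missing from that mapping is escalated through `report_to_central`.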
  • The anomaly analysis rule can be configured in the edge node based on the anomaly monitoring configuration information.
  • The anomaly monitoring configuration information can be pre-configured on the edge node side by operation and maintenance personnel, or it can be configured by business personnel on the central node and then synchronized to the edge node, or it can be obtained by the edge node from a third-party interface device; the specifics are not limited.
  • the abnormal monitoring configuration information can be configured in the edge node through the following steps:
  • Step a: The central node receives the abnormal monitoring configuration information input by the user.
  • The central node can provide users with an abnormal monitoring configuration interface. After detecting that the user has entered abnormal monitoring configuration information in the abnormal monitoring configuration interface, it can obtain and parse the abnormal monitoring configuration information and, taking the service as the unit, extract from it the abnormal monitoring configuration information belonging to the same service, so as to obtain the abnormal monitoring configuration information corresponding to each service. Further, the central node can parse the abnormal monitoring configuration information corresponding to any service, obtain the self-closed-loop strategy corresponding to the service, and store it in the local database of the central node. The self-closed-loop strategy corresponding to any service may include the exception analysis rule of the service, and may also include the exception handling rule of the service and/or the acquisition rule of service data, which is not limited.
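The per-service extraction in step a can be sketched as a grouping step. The entry layout (a list of dictionaries, each carrying a `"service"` key) and the function name `group_by_service` are assumptions for illustration; the patent does not specify the format of the abnormal monitoring configuration information.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_service(entries: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group raw configuration entries so that each service gets its own list."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for entry in entries:
        # Assumption: every entry names the service it belongs to.
        grouped[entry["service"]].append(entry)
    return dict(grouped)
```

Each resulting list could then be parsed into the self-closed-loop strategy of that service and stored in the central node's local database.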
  • the self-closed-loop strategy refers to a strategy for self-closed-loop processing of abnormal conditions of the service, including various rules related to self-closed-loop processing, such as exception analysis rules, exception handling rules, data acquisition rules, abnormal conditions, and so on.
  • The self-closed-loop strategy is obtained by extracting various rules from the abnormal monitoring configuration information of the service; it is a collective name for the rules used in the self-closed-loop processing of one service, not a processing method.
  • the exception analysis rule corresponding to any service may include the exception analysis rule for each monitoring event in the service, and the exception handling rule corresponding to any service may include the exception processing rule for each monitoring event in the service.
  • Table 1 illustrates a schematic table of a self-closed loop strategy corresponding to each service.
  • any service can correspond to one monitoring event or multiple monitoring events, and each monitoring event can be set with corresponding abnormal conditions and abnormal handling rules.
  • For example, the concurrent service corresponds to two monitoring events, namely the concurrent volume event and the concurrent error rate event. When the concurrent volume is greater than or equal to 10,000, the concurrent volume event is determined to be abnormal, so a concurrent service process can be added to restore the concurrent service in the edge node; when the concurrent error rate is greater than 45%, the concurrent error rate event is determined to be abnormal, so the concurrent service can be restarted to restore the accuracy of the concurrent service in the edge node.
  • A resource service corresponds to one monitoring event, namely the resource occupancy event. When the resource occupancy is greater than or equal to 95% for more than 5 minutes, the resource service is determined to be abnormal, so the cache of the resource service can be cleaned to restore the availability of the resource service in the edge node.
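The example strategies described above can be written down as data. The sketch below is one possible encoding in Python; the dictionary layout, key names, and action labels are hypothetical, and only the thresholds (10,000, 45%, 95%, 5 minutes) come from the text.

```python
import operator
from typing import Any, Dict

# Hypothetical encoding of the example strategies; only the numeric thresholds come from the text.
SELF_CLOSED_LOOP_STRATEGIES: Dict[str, Dict[str, Dict[str, Any]]] = {
    "concurrent_service": {
        "concurrent_volume": {"op": ">=", "threshold": 10_000, "action": "add_service_process"},
        "concurrent_error_rate": {"op": ">", "threshold": 0.45, "action": "restart_service"},
    },
    "resource_service": {
        # impact_time_s is the second abnormal condition (5 minutes).
        "resource_occupancy": {"op": ">=", "threshold": 0.95, "impact_time_s": 300, "action": "clean_cache"},
    },
}

_OPS = {">=": operator.ge, ">": operator.gt}


def first_condition_met(service: str, event: str, value: float) -> bool:
    """Check a measured value against the first abnormal condition of one monitoring event."""
    rule = SELF_CLOSED_LOOP_STRATEGIES[service][event]
    return _OPS[rule["op"]](value, rule["threshold"])
```

Keeping the comparison operator in the data lets the "greater than or equal to" and "greater than" cases from the example share one evaluation function.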
  • The central node can also support update operations by the user, such as creating new self-closed-loop strategies, clearing existing ones, modifying existing ones, or querying existing ones. After detecting that a self-closed-loop strategy has been updated, the central node can automatically load the updated self-closed-loop strategy to improve the accuracy of abnormality handling. Take clearing an existing self-closed-loop strategy as an example: the central node can display the existing self-closed-loop strategies to the user, and the user can either directly select the strategy to be cleared and delete it, or change the state of that strategy from valid to invalid in order to remove it.
  • In this way, the self-closed-loop strategy of a service can be decoupled from the business, and users can configure different self-closed-loop strategies for different services according to their respective business needs, which improves the flexibility of exception handling; moreover, configuring each self-closed-loop strategy through the configuration interface also simplifies operations, reduces manual operation and maintenance costs and events, and improves the efficiency of exception handling.
  • Step b: The edge node sends a registration request to the central node when it starts.
  • Step c: The central node verifies the registration request of the edge node. If the verification succeeds, it establishes a communication connection with the edge node (used to allow the edge node to obtain the self-closed-loop strategies corresponding to various services) and sends a registration success response message to the edge node; if the verification fails, it refuses to establish a communication connection with the edge node and sends a registration failure response message to the edge node.
  • Step d: If the edge node receives the registration success response message, it can obtain the self-closed-loop strategies corresponding to various services from the central node and store them in its local database. Correspondingly, if the edge node does not receive a response message, or receives the registration failure response message, it can periodically resend the registration request to the central node; if registration has still not succeeded after the set number of retransmissions, it gives up registering and generates a warning message.
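The retry behaviour in step d can be sketched as a bounded loop. The function name `register_with_retries`, the convention that `send_request` returns `True` on success, `False` on failure, and `None` when no response arrives, and the injectable `sleep` parameter are all assumptions for illustration; here `max_attempts` counts the first request as well as the retransmissions.

```python
import logging
import time
from typing import Callable, Optional


def register_with_retries(
    send_request: Callable[[], Optional[bool]],
    max_attempts: int,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Send the registration request until it succeeds or the attempt limit is reached."""
    for attempt in range(1, max_attempts + 1):
        if send_request() is True:
            return True  # registration success response received
        if attempt < max_attempts:
            sleep(interval_s)  # wait one period before resending
    # Give up registering and generate a warning message.
    logging.warning("registration abandoned after %d attempts", max_attempts)
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a deployment would simply leave the default.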
  • various services can be services deployed on edge nodes, or all services stored in central nodes, without limitation.
  • For example, the edge node can obtain from the central node the self-closed-loop strategy corresponding to the concurrent service and the self-closed-loop strategy corresponding to the port service, and store them in the local database of the edge node.
  • The edge node can send an acquisition request to the central node, the acquisition request carrying the identifier of the concurrent service and the identifier of the port service, so that the central node returns to the edge node, according to the acquisition request, the self-closed-loop strategy corresponding to the concurrent service and the self-closed-loop strategy corresponding to the port service.
  • the central node can upload the self-closed loop strategy corresponding to all services to the set location, and authorize the access rights of the set location to the edge node, so that the edge node can automatically go to the set location to obtain the self-closed loop corresponding to the concurrent service Strategies and self-closed loop strategies corresponding to port services, etc.
  • The edge node can also periodically obtain the self-closed-loop strategies corresponding to various services from the central node, to ensure that the self-closed-loop strategy corresponding to any service is consistent between the configuration side (that is, the central node) and the execution side (that is, the edge node), which improves the accuracy of exception handling.
  • The central node can also monitor its local database in real time. Once it detects that the user has updated the self-closed-loop strategy corresponding to a certain service, it can issue an update instruction to the edge nodes corresponding to that service, so that those edge nodes obtain the updated self-closed-loop strategy in real time. This ensures that the self-closed-loop strategy corresponding to the service is consistent between the configuration side and the execution side, and improves the accuracy of abnormality handling for the service.
  • The self-closed-loop strategy can be configured on the central node side instead of being configured separately in each edge node, which improves the flexibility and convenience of self-closed-loop strategy configuration; moreover, configuring the self-closed-loop strategy per service makes abnormality identification more targeted, better reflects the true service capability of the service, and improves the accuracy of anomaly recognition and anomaly handling.
  • a service process of any service (such as the first service) is set in the edge node, and the edge node provides the first service to the client or other devices through the service process of the first service.
  • the edge node After the edge node stores the self-closed loop strategy corresponding to the first service in the local database, the edge node can also obtain the service data of the first service by invoking the service process of the first service.
  • For example, an acquisition request can be sent to the service process of the first service, the acquisition request carrying the identifier of the monitoring event, so that the service process of the first service returns the service data corresponding to the monitoring event in real time; or the acquisition request can be sent to the service process of the first service according to a set period, so that the service process of the first service returns the service data corresponding to the monitoring event according to the set period; this is not limited.
  • The edge node can obtain the service data of the first service in the following way: the self-closed-loop strategy also includes the data source interface corresponding to each monitoring event in the first service. The data source interface is a function pre-encapsulated inside the edge node, and it can record the service data corresponding to the monitoring event while the service process provides the first service. In this way, for any monitoring event in the first service, the edge node first determines the data source interface corresponding to the monitoring event from the self-closed-loop strategy, and then obtains the service data of the monitoring event by calling that data source interface.
  • a first service process is set in the edge node, and the first service process is used to provide port services to the Internet Protocol (IP) address 127.0.0.1.
  • The edge node may call the data source interface corresponding to the request count event to send an acquisition request, at the set period, to the port of the first service process at IP address 127.0.0.1, and thereby obtain the request count data (i.e., the service data).
  • the self-closed loop strategy may also include other configuration information required to call the data source interface, such as environment variables and communication protocol conventions, which are not limited.
  • the acquisition operation may be performed by the monitoring process set in the edge node, and socket communication is adopted between the monitoring process and the service process to improve the efficiency and accuracy of communication.
  • the edge node can directly call the data source interface corresponding to the monitoring event to obtain the corresponding service data without manual configuration.
  • the operation is simple, easy to implement, and can also improve the efficiency of service data acquisition.
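The lookup from monitoring event to data source interface described above can be sketched as a simple registry. This is a minimal illustration under stated assumptions: `MonitoringProcess`, `FakePortService`, and the event name `request_count` are hypothetical, and an in-process callable stands in for the socket communication between the monitoring process and the service process.

```python
from typing import Callable, Dict

# Hypothetical type: a data source interface returns one event's service data.
DataSource = Callable[[], dict]


class MonitoringProcess:
    """Maps each monitoring-event identifier to its data source interface."""

    def __init__(self) -> None:
        self._sources: Dict[str, DataSource] = {}

    def register(self, event_id: str, source: DataSource) -> None:
        self._sources[event_id] = source

    def collect(self, event_id: str) -> dict:
        """Look up the event's data source interface and call it."""
        try:
            source = self._sources[event_id]
        except KeyError:
            raise LookupError(f"no data source for event {event_id!r}") from None
        return source()

    def collect_all(self) -> Dict[str, dict]:
        """Collect service data for every registered monitoring event."""
        return {event_id: source() for event_id, source in self._sources.items()}


class FakePortService:
    """Stand-in for a service process that counts the requests it serves."""

    def __init__(self) -> None:
        self.requests = 0

    def handle_request(self) -> None:
        self.requests += 1

    def request_count(self) -> dict:
        return {"event": "request_count", "value": self.requests}
```

Registering `FakePortService().request_count` under `"request_count"` and calling `collect("request_count")` on a timer would correspond to the periodic acquisition described above.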
  • The abnormality analysis rule corresponding to the monitoring event may include one or more abnormal conditions. Each monitoring event may correspond to its own first abnormal condition, which is used to indicate whether the monitoring event is abnormal. If the monitoring event corresponds only to the first abnormal condition, the first abnormal condition indicates both the abnormality of the monitoring event and the abnormality of the service corresponding to the monitoring event. If the monitoring event corresponds to the first abnormal condition and at least one second abnormal condition, the first abnormal condition indicates the abnormality of the monitoring event, and the first abnormal condition and the at least one second abnormal condition together indicate the abnormality of the service corresponding to the monitoring event.
  • the at least one second abnormal condition can be set by those skilled in the art based on experience, or can also be set according to actual needs, which is not specifically limited.
  • If the abnormality analysis rule corresponding to the monitoring event only includes the first abnormal condition, then when the service data corresponding to the monitoring event meets the first abnormal condition, the exception handling rule corresponding to the monitoring event can be directly invoked to process the edge node, so as to restore the service corresponding to the monitoring event in the edge node. If the service data corresponding to the monitoring event does not meet the first abnormal condition, the monitoring event is determined to be in a normal state in the edge node, and no processing is required.
  • The concurrency volume event and the concurrent error rate event in the concurrent service each correspond only to a first abnormal condition, and each corresponds to its own exception handling rule. Therefore, when either the concurrency volume event or the concurrent error rate event is abnormal, the concurrent service is determined to be abnormal, and the exception handling rule corresponding to the abnormal monitoring event can be used to process the concurrent service in the edge node.
  • If the abnormality analysis rule corresponding to the monitoring event also includes at least one second abnormal condition, then when the service data corresponding to the monitoring event meets both the first abnormal condition and the at least one second abnormal condition, the service corresponding to the monitoring event is in an abnormal state in the edge node, so the exception handling rule corresponding to the monitoring event can be invoked to process the edge node and restore the service corresponding to the monitoring event.
  • If the service data corresponding to the monitoring event only meets the first abnormal condition and does not meet the at least one second abnormal condition, the monitoring event is abnormal in the edge node but the service corresponding to the monitoring event is not, so no processing is required.
  • The second abnormal condition may include an associated monitoring event and/or an impact time, and may be determined based on actual failure scenarios of the service. Specifically, for any service, the historical service data corresponding to each monitoring event at the time the service failed can first be obtained; the historical service data of the monitoring events are then combined to analyze the characteristic factors that caused the service failure, and the second abnormal condition is set according to those characteristic factors. For example, if the characteristic factor is that the service is truly abnormal only when a certain monitoring event and other monitoring events are abnormal together, the second abnormal condition of that monitoring event can be set to the associated other monitoring events, and the monitoring event and the other monitoring events can correspond to the same exception handling rule. If the characteristic factor is that the service is truly abnormal only when the abnormality of a certain monitoring event lasts longer than an impact time, the second abnormal condition of that monitoring event can be set to the impact time.
  • The second abnormal condition corresponding to the abnormal status code event in the port service is the associated request count event. After the first abnormal condition corresponding to the abnormal status code event is used to determine that the abnormal status code event is abnormal, it can further be determined whether the request count event associated with the abnormal status code event is abnormal. If the request count event is also abnormal, the port service is determined to be abnormal, and it can be repaired using the exception handling rule corresponding to the abnormal status code event. If the request count event is not abnormal, the port service is determined not to be abnormal, and no handling is needed.
  • the second abnormal condition corresponding to the resource occupancy event in the resource service is the impact time (≥5 minutes).
  • Each monitoring event can also correspond to three or more abnormal conditions. For example, it can also correspond to a third abnormal condition, which indicates an abnormality level of the service: the exception handling rule corresponding to the monitoring event is used for repair only when the abnormality level of the service exceeds the level indicated by the third abnormal condition. A fourth abnormal condition can also be set, which indicates a combined abnormality of services: the exception handling rule corresponding to the monitoring event is used for repair only when all the services indicated by the fourth abnormal condition are abnormal. The specifics are not limited.
  • Setting the second abnormal condition for the monitoring event based on real failure scenarios reduces the probability of detecting falsely abnormal services and improves detection accuracy. In addition, by setting the second abnormal condition to the impact time and/or to the simultaneous abnormality of associated monitoring events, the abnormality of the service can be judged comprehensively from how long the abnormality lasts and/or how many events are abnormal, which improves the accuracy of the judgment.
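The interplay of the first and second abnormal conditions can be sketched as a single decision function. The names below (`EventState`, `SecondConditions`, `service_is_abnormal`) are hypothetical, and the sketch assumes that an associated event with no recorded state counts as not abnormal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EventState:
    abnormal: bool                  # outcome of the first abnormal condition
    abnormal_seconds: float = 0.0   # how long the event has been abnormal


@dataclass
class SecondConditions:
    impact_seconds: Optional[float] = None            # impact time, if set
    associated_events: List[str] = field(default_factory=list)


def service_is_abnormal(event_id: str,
                        states: Dict[str, EventState],
                        second: Optional[SecondConditions] = None) -> bool:
    """Decide whether an abnormal monitoring event means the service is abnormal."""
    state = states[event_id]
    if not state.abnormal:
        return False                # first abnormal condition not met
    if second is None:
        return True                 # only a first condition: event implies service
    if (second.impact_seconds is not None
            and state.abnormal_seconds < second.impact_seconds):
        return False                # abnormal, but shorter than the impact time
    # Every associated monitoring event must be abnormal at the same time.
    return all(name in states and states[name].abnormal
               for name in second.associated_events)
```

For the resource service example, a resource occupancy event that has been abnormal for 120 seconds with a 300-second impact time yields `False`; the same event at 300 seconds or more yields `True`.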
  • the abnormality analysis rule corresponding to the monitoring event can also include the abnormality analysis algorithm corresponding to the monitoring event.
  • Monitoring events of the same type can correspond to the same type of abnormality analysis algorithm. Because the abnormality analysis rule corresponding to a monitoring event includes both the abnormality analysis algorithm and the abnormal conditions, the abnormality analysis rule corresponding to each monitoring event can still be unique.
  • The corresponding abnormality analysis algorithm can be called according to the type of the service data to process the service data and filter out the abnormality judgment data, and it is then judged whether the abnormality judgment data meets the abnormal condition corresponding to the monitoring event. If it does, the monitoring event is determined to be abnormal; if it does not, the monitoring event is determined not to be abnormal.
  • the abnormality analysis algorithm may include any one or more of log keyword analysis method, service health value analysis method, threshold value analysis method, and service self-defined analysis method. The following are respectively analyzed:
  • the log keyword analysis method is used for abnormal analysis of the service data of the log data type.
  • the service data of the log data type includes batch processing time, batch processing success amount, etc.
  • The service data can be segmented based on preset log fields to obtain the monitoring log fields, and a multi-pattern matching algorithm (such as the Aho-Corasick algorithm or the Wu-Manber algorithm) can then be used to match each monitoring log field.
  • the successfully matched monitoring log field is used as the abnormality judgment data and compared with the preset log field in the abnormal condition to determine whether the monitoring event is abnormal.
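The log keyword analysis steps above (segment into fields, match patterns, compare against the abnormal condition) can be sketched as follows. A plain substring scan is swapped in for a true multi-pattern algorithm such as Aho-Corasick, so this shows the data flow rather than the matching performance. The field separator, field names, and keywords are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def split_fields(line: str, field_names: List[str], sep: str = "|") -> Dict[str, str]:
    """Segment one log line into named monitoring log fields."""
    values = [part.strip() for part in line.split(sep)]
    return dict(zip(field_names, values))


def match_keywords(fields: Dict[str, str],
                   keywords: Iterable[str]) -> Dict[str, List[str]]:
    """Return, per field, the keywords found in it (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    hits: Dict[str, List[str]] = {}
    for name, value in fields.items():
        found = [keyword for keyword in lowered if keyword in value.lower()]
        if found:
            hits[name] = found
    return hits


def log_event_is_abnormal(line: str,
                          field_names: List[str],
                          keywords: Iterable[str],
                          monitored_fields: Iterable[str]) -> bool:
    """Abnormal when a matched field is one of the fields named in the condition."""
    hits = match_keywords(split_fields(line, field_names), keywords)
    return any(name in hits for name in monitored_fields)
```

With many keywords or high log volume, the inner scan would be the part to replace with a real multi-pattern matcher.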
  • the service health value analysis method is used to perform abnormal analysis on the service data of the operation data type.
  • the service data of the operation data type includes status code, bandwidth, number of requests, resource occupancy rate, etc.
  • Threshold analysis method is used for abnormal analysis of service data of indicator data type.
  • Service data of indicator data type includes the number of requests, the number of alarms, and so on.
  • the monitoring value of the monitoring event under each specific indicator can be extracted from the service data according to the specific indicator of the service, and the monitoring value under the specific indicator is used as the abnormal judgment data, and the threshold value under the specific indicator in the abnormal condition Make a comparison to determine whether the monitoring event is abnormal.
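The threshold comparison can be sketched in a few lines. The sketch assumes that every threshold is an upper bound and that a value must strictly exceed it to count as abnormal; the indicator names and limits are made up for illustration.

```python
from typing import Dict, List


def threshold_violations(service_data: Dict[str, float],
                         thresholds: Dict[str, float]) -> List[str]:
    """Return the indicators whose monitored value exceeds its threshold."""
    return sorted(
        name for name, limit in thresholds.items()
        if name in service_data and service_data[name] > limit
    )


def indicator_event_is_abnormal(service_data: Dict[str, float],
                                thresholds: Dict[str, float]) -> bool:
    """The monitoring event is abnormal if any indicator violates its threshold."""
    return bool(threshold_violations(service_data, thresholds))
```

Indicators present in the service data but absent from the abnormal condition are ignored, which matches extracting only the "specific indicators" of the service.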
  • The service custom analysis method is used to perform anomaly analysis on service data of unknown data types, or when the user requires a custom anomaly analysis algorithm.
  • the edge node can provide the user with a general interface so that the user can upload the custom anomaly analysis algorithm through the general interface.
  • the edge node can also load the anomaly analysis algorithm, and use the loaded anomaly analysis algorithm to calculate the service data corresponding to the monitoring event to obtain the abnormality judgment data.
  • the user can also customize the abnormal conditions at the same time. After the abnormality judgment data is calculated, the edge node can also compare the abnormality judgment data with the user-defined abnormal conditions to determine whether the monitoring event is abnormal.
  • If the service data type is determined to be the log data type, the log keyword analysis method can be called to analyze the service data for abnormality. If the service data type is determined to be the operation data type, the service health value analysis method can be used. If the service data type is determined to be the indicator data type, the threshold analysis method can be used. If the service data is of another data type, or the user needs a custom anomaly analysis algorithm, the service custom analysis method can be called to perform anomaly analysis on the service data.
  • the abnormality analysis method can be decoupled from the actual business, and the flexibility of abnormality analysis can be improved.
  • the corresponding abnormal analysis algorithm is set for each monitoring event, which reduces the difficulty of development and further improves the flexibility of abnormal analysis.
  • the above method also supports user-defined anomaly analysis algorithms, which can not only continuously supplement new anomaly analysis algorithms according to user settings, improve the applicable scenarios of anomaly analysis, but also meet the needs of different users and improve the versatility of anomaly analysis.
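Choosing an analysis method by service data type, with a user-defined fallback, can be sketched as a registry of analyzers. The type labels (`"log"`, `"operational"`, `"indicator"`, `"custom"`) and the placeholder analyzers are assumptions; real analyzers would compute the abnormality judgment data instead of just tagging the payload.

```python
from typing import Callable, Dict

# Hypothetical signature: an analyzer turns service data into judgment data.
Analyzer = Callable[[dict], dict]

_ANALYZERS: Dict[str, Analyzer] = {}


def register_analyzer(data_type: str, analyzer: Analyzer) -> None:
    """Register (or replace) the analysis algorithm for one data type."""
    _ANALYZERS[data_type] = analyzer


def analyze(service_data: dict) -> dict:
    """Dispatch on the data type; unknown types fall back to the custom analyzer."""
    analyzer = _ANALYZERS.get(service_data.get("type"), _ANALYZERS.get("custom"))
    if analyzer is None:
        raise LookupError("no analyzer registered for this data type")
    return analyzer(service_data)


# Placeholder analyzers that only record which method was chosen.
register_analyzer("log", lambda d: {"method": "log_keyword", "payload": d["payload"]})
register_analyzer("operational", lambda d: {"method": "service_health", "payload": d["payload"]})
register_analyzer("indicator", lambda d: {"method": "threshold", "payload": d["payload"]})
register_analyzer("custom", lambda d: {"method": "user_defined", "payload": d["payload"]})
```

Because the registry is keyed only by data type, a user-uploaded algorithm can be added with one more `register_analyzer` call without touching the dispatch logic, which is the decoupling the text describes.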
  • The edge node can query the local database to determine whether an exception handling rule exists for the first service. If one exists, the edge node can directly call the exception handling rule of the first service to repair the first service; if none exists, an exception message can be generated and reported to the central node 110.
  • the abnormal message carries related abnormal data of the first service, such as the identifier of the abnormal monitoring event in the first service, the abnormal field, the abnormal time, and the abnormal level in the service data corresponding to the abnormal monitoring event.
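The edge-side decision between local repair and reporting can be sketched as below. The function and parameter names are hypothetical, a plain dictionary stands in for the edge node's local database, and a callback stands in for the channel to the central node.

```python
from typing import Callable, Dict

# Hypothetical signature: a handling rule repairs a service given anomaly data.
HandlingRule = Callable[[dict], None]


def handle_abnormal_service(service: str,
                            anomaly: dict,
                            local_rules: Dict[str, HandlingRule],
                            report_to_central: Callable[[dict], None]) -> str:
    """Repair locally when a rule exists; otherwise report to the central node."""
    rule = local_rules.get(service)
    if rule is not None:
        rule(anomaly)
        return "repaired"
    # The exception message carries the related abnormal data of the service.
    report_to_central({"service": service, **anomaly})
    return "reported"
```

Only the second branch generates traffic to the central node, which is where the saving in network overhead comes from.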
  • The central node 110 can first parse the exception message to obtain the abnormal field in the service data corresponding to the abnormal monitoring event, then calculate the matching degree between the abnormal field and each preset abnormal event in the operation and maintenance knowledge base, and take a preset abnormal event whose matching degree is greater than the preset matching degree as the preset abnormal event corresponding to the monitoring event. If such a preset abnormal event exists, the central node 110 may analyze the matched preset abnormal event to generate a corresponding exception handling rule and send it to the edge node. If no preset abnormal event has a matching degree greater than the preset matching degree, the central node 110 may push the exception message to the user, who sets the corresponding exception handling rule, and the central node then sends the set exception handling rule to the edge node.
  • the edge node can not only use the exception handling rule to repair the first service, but also use the abnormal monitoring event in the first service and the exception handling rule of the first service to update the local database.
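The knowledge-base lookup on the central node can be sketched with a generic string similarity. The patent does not specify how the matching degree is computed; here `difflib.SequenceMatcher.ratio()` from the Python standard library is used as a stand-in, the 0.6 preset matching degree is arbitrary, and the sketch keeps only the single best match rather than every match above the threshold.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Optional, Tuple


def best_preset_match(abnormal_field: str,
                      knowledge_base: Dict[str, str],
                      min_degree: float = 0.6) -> Optional[Tuple[str, float]]:
    """Return the preset abnormal event with the highest degree above min_degree."""
    best: Optional[Tuple[str, float]] = None
    for preset_event in knowledge_base:
        degree = SequenceMatcher(
            None, abnormal_field.lower(), preset_event.lower()
        ).ratio()
        if degree > min_degree and (best is None or degree > best[1]):
            best = (preset_event, degree)
    return best


def resolve_handling_rule(abnormal_field: str,
                          knowledge_base: Dict[str, str],
                          ask_user: Callable[[str], str],
                          min_degree: float = 0.6) -> str:
    """Use the knowledge base when something matches; otherwise ask the user."""
    match = best_preset_match(abnormal_field, knowledge_base, min_degree)
    if match is not None:
        return knowledge_base[match[0]]
    return ask_user(abnormal_field)
```

Whichever branch produces the rule, the edge node would then store it locally so that the same anomaly can be repaired next time without contacting the central node.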
  • the central node 110 may also display the service status of each edge node to the user, so that the user can check the abnormal status and distribution status of various services in a timely manner.
  • The displayed information can include any one or more of: the abnormal situation of any service in each edge node, the abnormal situation of each monitoring event in any service, the processing result of abnormal monitoring events, the distribution of abnormal monitoring events, and the correlation between monitoring events.
  • The central node 110 may display this information to the user in the form of a holographic view or in the form of a table, which is not limited.
  • By placing service anomaly identification and repair on the edge node side for execution, instead of uniformly reporting to the central node, the working pressure on the central node can be effectively reduced, and network overhead and time cost can be saved. Moreover, in this solution the edge node performs self-closed-loop processing of its own anomalies and can discover and handle them in time, which not only improves the efficiency of anomaly identification and handling but also restores service availability promptly.
  • Fig. 3 is a schematic diagram of the overall interaction flow corresponding to an exception handling method provided by an embodiment of the present invention. As shown in Fig. 3, the method includes:
  • Step 301 After detecting that the user inputs abnormal monitoring configuration information in the abnormal monitoring configuration interface, the central node acquires and stores the abnormal monitoring configuration information.
  • the abnormality monitoring configuration information may include the self-closed-loop strategy corresponding to each service.
  • the self-closed-loop strategy corresponding to any service may include the abnormality analysis rules of the service, and may also include the exception handling rules of the service and/or the rules for acquiring service data.
  • Step 302 The edge node sends a registration request to the central node when it is started.
  • step 303 the central node verifies the registration request. If the verification is successful, step 304 is executed, and if the verification fails, step 315 is executed.
  • Step 304 The central node sends a response message of successful registration to the edge node.
  • Step 305 The edge node obtains the self-closed loop strategy corresponding to various services from the central node and stores it in the local database of the edge node; the various services include the first service.
  • Step 306 The edge node invokes the data source interface corresponding to the first service to obtain the service data of the first service from the service process of the first service.
  • Step 307 The edge node uses the abnormality analysis rule of the first service to analyze the service data of the first service and determine whether the first service is abnormal.
  • Step 308 The edge node queries the local database to determine whether an exception handling rule exists for the first service; if not, step 309 is executed, and if so, step 312 is executed.
  • Step 309 The edge node sends an abnormal message to the central node, and the abnormal message carries related abnormal data of the first service.
  • Step 310 The central node sets an exception handling rule for the first service based on the parsed related exception data of the first service.
  • Step 311 The central node sends the exception handling rule of the first service to the edge node.
  • Step 312 The edge node uses the exception handling rule of the first service to repair the first service.
  • Step 313 If the central node determines that the exception handling rule of the first service is not stored in the local database, it updates the local database using the exception handling rule of the first service.
  • step 314 the edge node repeatedly sends a registration request to the central node, and after repeatedly sending a set number of times, if the registration is not successful, an alarm message is generated.
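The registration retry in step 314 can be sketched as a bounded loop. The function name, the boolean result of the request callback, and the alarm list are assumptions made for illustration; waiting between attempts is left out to keep the sketch short.

```python
from typing import Callable, List


def register_with_retries(send_request: Callable[[], bool],
                          max_attempts: int,
                          alarms: List[str]) -> bool:
    """Send the registration request up to max_attempts times.

    send_request returns True when the central node accepts the registration.
    If every attempt fails, one alarm message is recorded.
    """
    for _ in range(max_attempts):
        if send_request():
            return True
    alarms.append(f"registration failed after {max_attempts} attempts")
    return False
```

On success the edge node would continue with step 305 and fetch the self-closed-loop strategies from the central node.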
  • Any edge node analyzes service data using an abnormality analysis rule to determine whether the first service in the edge node is abnormal, where the service data includes service data corresponding to the first service. After the edge node determines that the first service is abnormal, if an exception handling rule for the first service exists in the edge node, the edge node uses it to repair the first service; if no exception handling rule for the first service exists in the edge node, the edge node reports to the central node.
  • By placing service anomaly identification and repair on the edge node side for execution, instead of uniformly reporting to the central node for execution, the working pressure on the central node can be effectively reduced, and network overhead and time cost can be saved. Moreover, the edge node performs self-closed-loop processing of its own anomalies and can discover and handle them in time, which not only improves the efficiency of anomaly identification and handling but also restores service availability promptly.
  • an embodiment of the present invention also provides an edge network-based exception handling device, and the specific content of the device can be implemented with reference to the foregoing method.
  • Fig. 4 is a schematic structural diagram of an abnormality processing device based on an edge network provided by an embodiment of the present invention.
  • the edge network includes a central node and at least one edge node; the device includes:
  • an abnormality analysis module 401, configured to analyze service data using an abnormality analysis rule to determine whether the first service in the edge node is abnormal; the service data includes service data corresponding to the first service;
  • an exception handling module 402, configured to, after the first service is determined to be abnormal, use the exception handling rule of the first service to repair the first service if such a rule exists in the edge node, and report to the central node if no exception handling rule for the first service exists in the edge node.
  • the central node determines the exception handling rule of the first service and issues it to the edge node;
  • the device also includes a transceiver module 403, which is configured to: receive the exception handling rule of the first service sent by the central node;
  • the exception handling module 402 is further configured to: use the exception handling rule of the first service to repair the first service.
  • the device further includes a transceiver module 403; before the abnormality analysis module 401 analyzes the service data using anomaly analysis rules, the transceiver module 403 is configured to:
  • obtain, from the central node, the self-closed-loop strategies corresponding to various services, the various services including the first service; the self-closed-loop strategy corresponding to any service includes the abnormality analysis rules of the service, or also includes the exception handling rules of the service.
  • the self-closed loop strategy corresponding to the various services is obtained in the following manner:
  • When the central node detects that the user has entered abnormality monitoring configuration information in the abnormality monitoring configuration interface, it obtains and parses the configuration information, obtains the self-closed-loop strategies corresponding to the various services, and stores them in the local database of the central node.
  • the anomaly analysis rule of any service includes an anomaly analysis rule corresponding to each monitoring event in the service;
  • the abnormality analysis module 401 is specifically used for:
  • the service data of the monitoring event is parsed from the service data of the first service, and an abnormality analysis algorithm that matches the type of the service data of the monitoring event is invoked to analyze the service data of the monitoring event; if the analysis result satisfies the first abnormal condition corresponding to the monitoring event, the monitoring event is determined to be abnormal, and whether the first service is abnormal is determined at least according to the monitoring event; if the analysis result does not satisfy the first abnormal condition corresponding to the monitoring event, it is determined that the first service is not abnormal.
  • the abnormality analysis module 401 is specifically configured to:
  • if it is determined that the abnormal condition corresponding to the monitoring event only includes the first abnormal condition, determine that the first service is abnormal; if it is determined that the abnormal condition corresponding to the monitoring event also includes a second abnormal condition, and the second abnormal condition is the impact time, determine that the first service is not abnormal when the abnormal duration of the monitoring event is less than the impact time, and determine that the first service is abnormal when the abnormal duration of the monitoring event is greater than or equal to the impact time.
  • the abnormality analysis module 401 is further configured to:
  • if the second abnormal condition is that the associated monitoring events are abnormal at the same time, determine whether the other monitoring events associated with the monitoring event are abnormal; when the other monitoring events are also abnormal, determine that the first service is abnormal, and when at least one of the other monitoring events is normal, determine that the first service is not abnormal.
  • Any edge node analyzes service data using an abnormality analysis rule to determine whether the first service in the edge node is abnormal, where the service data includes service data corresponding to the first service. After the edge node determines that the first service is abnormal, if an exception handling rule for the first service exists in the edge node, the edge node uses it to repair the first service; if no exception handling rule for the first service exists in the edge node, the edge node reports to the central node.
  • an embodiment of the present invention also provides a computing device, as shown in FIG. 5, including at least one processor 501 and a memory 502 connected to the at least one processor.
  • The embodiment of the present invention does not limit the specific connection medium between the processor 501 and the memory 502; in FIG. 5, a bus connection between the processor 501 and the memory 502 is taken as an example.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the memory 502 stores instructions that can be executed by at least one processor 501.
  • By executing the instructions stored in the memory 502, the at least one processor 501 can execute the steps included in the aforementioned edge network-based exception handling method.
  • The processor 501 is the control center of the computing device. It can use various interfaces and lines to connect the various parts of the computing device, and it realizes data processing by running or executing the instructions stored in the memory 502 and calling the data stored in the memory 502.
  • the processor 501 may include one or more processing units, and the processor 501 may integrate an application processor and a modem processor.
  • the application processor mainly processes the operating system, user interface, and application programs.
  • The modem processor mainly handles issuing instructions. It can be understood that the modem processor may alternatively not be integrated into the processor 501.
  • the processor 501 and the memory 502 may be implemented on the same chip, and in some embodiments, they may also be implemented on separate chips.
  • The processor 501 may be a general-purpose processor, such as a central processing unit (CPU), a digital signal processor, an application-specific integrated circuit (ASIC), a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention.
  • The general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of edge network-based exception handling can be directly executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • the memory 502 can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules.
  • The memory 502 may include at least one type of storage medium, such as flash memory, hard disk, multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, magnetic disk, optical disc, etc.
  • the memory 502 is any other medium that can be used to carry or store desired program codes in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 502 in the embodiment of the present invention may also be a circuit or any other device capable of realizing a storage function for storing program instructions and/or data.
  • Embodiments of the present invention also provide a computer-readable storage medium that stores a computer program executable by a computing device; when the program runs on the computing device, the computing device executes the edge network-based exception handling method described in any of Figure 2 or Figure 3.
  • the embodiments of the present invention can be provided as a method or a computer program product. Therefore, the present invention may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing; the instructions executed on the computer or other programmable equipment thus provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.


Abstract

The invention relates to an anomaly processing method and apparatus based on an edge network, which address the prior-art technical problems of high pressure on the central node and untimely anomaly processing caused by centralized anomaly processing at the central node. In the method, an edge node analyzes service data by means of an anomaly analysis rule. After determining that a first service in the edge node is abnormal, if the edge node is provided with an anomaly processing rule for the first service, it repairs the first service using that rule; if the edge node is not provided with an anomaly processing rule for the first service, it reports to the central node. By configuring anomaly identification and anomaly repair for a service on the edge node side, instead of reporting everything to the central node in a unified manner, the invention effectively reduces the operating pressure on the central node and reduces network overhead and time cost. In addition, the edge node performs self-closed-loop processing of its own anomalies, so that an anomaly can be detected and processed in a timely manner, which improves anomaly processing efficiency.
PCT/CN2020/091867 2020-02-25 2020-05-22 Anomaly processing method and apparatus based on an edge network WO2021169064A1 (fr)
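
The decision flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class `EdgeNode`, the fields `analysis_rules`, `processing_rules` and `reported`, the service names, and the 5% error-rate threshold are all hypothetical, and the report to the central node is modeled as appending to a list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EdgeNode:
    """Hypothetical edge node: analyze locally, repair locally if possible."""

    # service name -> predicate that returns True when the service data is abnormal
    analysis_rules: Dict[str, Callable[[dict], bool]]
    # service name -> repair action; only some services have one configured
    processing_rules: Dict[str, Callable[[], None]]
    # stand-in for the uplink to the central node (assumption for this sketch)
    reported: List[str] = field(default_factory=list)

    def process(self, service: str, data: dict) -> str:
        is_abnormal = self.analysis_rules.get(service)
        if is_abnormal is None or not is_abnormal(data):
            return "normal"
        repair = self.processing_rules.get(service)
        if repair is not None:
            # A local rule exists: repair on the edge node itself.
            repair()
            return "repaired_locally"
        # No local rule: escalate to the central node.
        self.reported.append(service)
        return "reported_to_central"


restarted: List[str] = []
node = EdgeNode(
    analysis_rules={
        "cache": lambda d: d["error_rate"] > 0.05,
        "dns": lambda d: d["error_rate"] > 0.05,
    },
    # Only the "cache" service has a local processing rule in this example.
    processing_rules={"cache": lambda: restarted.append("cache")},
)
results = [
    node.process("cache", {"error_rate": 0.20}),
    node.process("dns", {"error_rate": 0.20}),
    node.process("cache", {"error_rate": 0.01}),
]
print(results)
```

In this sketch only anomalies without a local processing rule reach the central node, which is the load reduction the abstract claims for the edge-side design.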

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010115008.8 2020-02-25
CN202010115008.8A CN111355610A (zh) 2020-02-25 2020-02-25 一种基于边缘网络的异常处理方法及装置

Publications (1)

Publication Number Publication Date
WO2021169064A1 (fr) 2021-09-02

Family

ID=71197132

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/091867 WO2021169064A1 (fr) 2020-02-25 2020-05-22 Anomaly processing method and apparatus based on an edge network

Country Status (2)

Country Link
CN (1) CN111355610A (fr)
WO (1) WO2021169064A1 (fr)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111800299A (zh) * 2020-07-08 2020-10-20 广州市品高软件股份有限公司 一种边缘云的运营维护系统及其方法
CN112073231B (zh) * 2020-08-31 2023-08-18 深圳市国电科技通信有限公司 局域网联动防护方法、装置、计算机设备及存储介质
CN112492632B (zh) * 2020-11-09 2023-02-17 厦门亿联网络技术股份有限公司 一种基于漫游系统的异常监控方法、系统
CN112583898B (zh) * 2020-11-30 2023-08-15 北京百度网讯科技有限公司 业务流程编排方法、装置、以及可读介质
CN114666075B (zh) * 2020-12-08 2023-04-07 上海交通大学 基于深度特征粗糙编码的分布式网络异常检测方法及系统
CN114765555A (zh) * 2021-01-12 2022-07-19 华为技术有限公司 一种网络威胁的处理方法和通信装置
CN112988327A (zh) * 2021-03-04 2021-06-18 杭州谐云科技有限公司 一种基于云边协同的容器安全管理方法和系统
CN113013990A (zh) * 2021-03-18 2021-06-22 华润电力技术研究院有限公司 一种发电机组故障预警方法、系统及相关设备
CN113887749A (zh) * 2021-08-23 2022-01-04 国网江苏省电力有限公司信息通信分公司 基于云边协同的电力物联网多维度监控处置方法、设备及平台
CN113806092A (zh) * 2021-09-18 2021-12-17 济南浪潮数据技术有限公司 一种存储设备管理方法、系统、设备以及介质
CN114640709B (zh) * 2022-03-31 2023-07-25 苏州浪潮智能科技有限公司 一种边缘节点的处理方法、装置及介质
CN115297124B (zh) * 2022-07-25 2023-08-04 天翼云科技有限公司 一种系统运维管理方法、装置及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015088851A1 (fr) * 2013-12-09 2015-06-18 Cisco Technology, Inc. Repair of failed network routing arcs using a data plane protocol
CN106375328A (zh) * 2016-09-19 2017-02-01 中国人民解放军国防科学技术大学 一种大规模数据分发系统运行时自适应优化方法
CN109769023A (zh) * 2019-01-16 2019-05-17 网宿科技股份有限公司 一种数据传输方法、相关服务器和存储介质
CN109889569A (zh) * 2019-01-03 2019-06-14 网宿科技股份有限公司 Cdn服务调度方法及系统
CN110430071A (zh) * 2019-07-19 2019-11-08 云南电网有限责任公司信息中心 业务节点故障自愈方法、装置、计算机设备及存储介质

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101883016B (zh) * 2009-05-05 2014-11-05 中兴通讯股份有限公司 一种深度报文检测设备联动策略生成系统及方法
CN101583024B (zh) * 2009-06-04 2011-06-22 中兴通讯股份有限公司 分布式网络视频监控系统及其注册控制方法
CN101790156B (zh) * 2009-11-19 2011-10-26 北京邮电大学 基于策略优化的终端软件故障修复方法及装置
CN103166778A (zh) * 2011-12-13 2013-06-19 成都勤智数码科技有限公司 一种故障自动化智能处理方法及其装置
CN104299659B (zh) * 2013-07-16 2017-08-04 中广核工程有限公司 核电站运行状态监控方法、装置及系统
CN103838637A (zh) * 2014-03-03 2014-06-04 江苏智联天地科技有限公司 基于数据挖掘的终端自主故障诊断与恢复方法
CN107026865A (zh) * 2017-04-14 2017-08-08 北京奇虎科技有限公司 异常事件处理方法及系统、客户端及服务端
CN108595333B (zh) * 2018-04-26 2021-08-03 Oppo广东移动通信有限公司 PaaS平台中应用进程的健康检查方法及装置
CN109639516B (zh) * 2018-10-17 2022-05-17 平安科技(深圳)有限公司 分布式网络系统的监控方法、装置、设备及存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUANJIE LIU, LIU ZHENG: "Research on the Application of New Smart City Based on Edge Intelligence Collaboration", SCIENCE AND TECHNOLOGY INNOVATION HERALD, SHIJIE ZHISHI CHUBANSHE, CN, vol. 17, no. 1, 1 January 2020 (2020-01-01), CN, pages 143 - 146,148, XP055840163, ISSN: 1674-098X, DOI: 10.16660/j.cnki.1674-098X.2020.01.143 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113806225A (zh) * 2021-09-24 2021-12-17 上海淇玥信息技术有限公司 一种业务异常节点识别方法、装置和电子设备
CN113806225B (zh) * 2021-09-24 2024-06-07 上海淇玥信息技术有限公司 一种业务异常节点识别方法、装置和电子设备
CN118413562A (zh) * 2024-07-04 2024-07-30 中钢集团武汉安全环保研究院有限公司 一种边缘计算方法、装置和系统

Also Published As

Publication number Publication date
CN111355610A (zh) 2020-06-30

Similar Documents

Publication Publication Date Title
WO2021169064A1 (fr) Anomaly processing method and apparatus based on an edge network
CN111049705B (zh) 一种监控分布式存储系统的方法及装置
WO2019153505A1 (fr) Method for publishing a failure recovery data packet, and server
US20190018667A1 (en) Systems and Methods of Constructing a Network Topology
WO2023071761A1 (fr) Anomaly locating method and device
CN111641524A (zh) 监控数据处理方法、装置、设备和存储介质
WO2021120975A1 (fr) Monitoring method and apparatus
CN108390793A (zh) 一种分析系统稳定性的方法及装置
CN118138471A (zh) 基于知识图谱的网络模型构建方法、设备及存储介质
CN109409948B (zh) 交易异常检测方法、装置、设备及计算机可读存储介质
CN109426597B (zh) 应用性能监控方法、装置、设备、系统及存储介质
CN117151726A (zh) 故障的修复方法、修复装置、电子设备以及存储介质
CN111897643B (zh) 线程池配置系统、方法、装置和存储介质
CN111782456B (zh) 异常检测方法、装置、计算机设备和存储介质
CN107885634B (zh) 监控中异常信息的处理方法和装置
WO2023273461A1 (fr) Robot operating state monitoring system and method
CN116048846A (zh) 数据传输方法、装置、设备和存储介质
CN109714214B (zh) 一种服务器异常的处理方法及管理设备
CN114553682A (zh) 实时告警方法、系统、计算机设备及存储介质
CN115049493A (zh) 一种区块链数据追踪方法、装置及电子设备
WO2012088761A1 (fr) System and method for controlling security information exchange based on data analysis
CN112910733A (zh) 一种基于大数据的全链路监控系统及方法
CN107612755A (zh) 一种云资源的管理方法及其装置
CN116089446A (zh) 一种结构化查询语句的优化控制方法及装置
CN115150253A (zh) 一种故障根因确定方法、装置及电子设备

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20920994; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 20920994; Country of ref document: EP; Kind code of ref document: A1