
CN118487994A - Fault recovery method, MLAG slave device, computer readable storage medium and network device - Google Patents


Info

Publication number
CN118487994A
Authority
CN
China
Prior art keywords
protocol
mlag
priority
lacp
routing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410766682.0A
Other languages
Chinese (zh)
Inventor
陈龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Maipu Communication Technology Co Ltd
Original Assignee
Maipu Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Maipu Communication Technology Co Ltd filed Critical Maipu Communication Technology Co Ltd
Priority to CN202410766682.0A priority Critical patent/CN118487994A/en
Publication of CN118487994A publication Critical patent/CN118487994A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/28Routing or path finding of packets in data switching networks using route fault recovery
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/24Multipath
    • H04L45/245Link aggregation, e.g. trunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a fault recovery method, an MLAG slave device, a computer-readable storage medium and a network device, belonging to the field of communications. When a dual-master fault occurs in an MLAG system, the MLAG protocol on the slave device notifies the dynamic routing protocol to lower the device's network-wide routing priority and advertise it, so that southbound traffic is switched to the master device first. After a delay that allows the route advertisement to complete, the MLAG protocol on the slave device notifies the LACP protocol to lower the system priority of the dynamic link aggregation group ports connected to the dual-homing access device, so that northbound traffic is also switched to the master device. At this point traffic in both directions has been switched, and no packets are lost during the dual-master fault.

Description

Fault recovery method, MLAG slave device, computer readable storage medium and network device
Technical Field
The present invention relates to the field of data communications, and in particular to a fault recovery method, an MLAG slave device, a computer-readable storage medium, and a network device.
Background
With the development of computer networks, data traffic in data center networks keeps growing, and the requirements on data reliability keep rising. MLAG (Multi-Chassis Link Aggregation Group, cross-device link aggregation group) is a protocol that combines two devices into a dual-active system. As shown in fig. 1, MLAG is deployed between Device1 and Device2, which raises the reliability of the services carried by the dual-homing access device (DHD in fig. 1) to the device level. Device1 and Device2 are the two member devices of the MLAG; they are interconnected through a Peer-link (peer-to-peer link) and exchange heartbeats over a keepalive link. The dual-homing access device is a pure layer-2 device that accesses both MLAG devices through link aggregation, the single-homing access device (SD in fig. 1) accesses the MLAG slave device (Device2 in fig. 1) through a physical link, and the northbound devices of the MLAG domain run a dynamic routing protocol. In this scenario, when an MLAG link fails, users expect the system to remain as stable as possible, that is, north-south traffic should suffer little or no packet loss.
The secondary fault, in which a Peer-link fault is compounded by a keepalive link fault, is an important fault type in MLAG scenarios and directly affects the overall failure rate of the MLAG protocol. With reference to fig. 1, the present invention mainly addresses packet loss of uplink and downlink traffic when the keepalive link fails while the Peer-link between Device1 and Device2 is also faulty. Specifically, assume that Device1 is the MLAG master device and Device2 is the MLAG slave device. When the Peer-link between Device1 and Device2 goes DOWN, Device1 remains in the master state, while the MLAG protocol on Device2, following the existing mechanism, shuts down the ports connected to the dual-homing access device and enters the suspended state. If the keepalive link then fails as well, Device2 concludes that it has lost contact with its peer Device1 and that it should take over traffic forwarding, so it promotes itself to the master state and restores all previously shut-down ports to UP. The MLAG domain now contains two master devices, which is the dual-master (Dual-Master) phenomenon. In the dual-master scenario traffic is not affected at first, but once ARP and/or FDB entries age out after a while, new data flows lose packets, and in severe cases packets are lost continuously. With the existing mechanism, a dual-master scenario can therefore cause serious traffic anomalies, and given the core data center requirement of switchover without packet loss, ensuring that north-south traffic switches over correctly becomes very important.
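The role transitions of the slave device under the existing mechanism described above can be summarized in a minimal Python sketch. The names Role and slave_role are hypothetical and are not taken from any real MLAG implementation; the case in which only the keepalive link fails is not described in the text, so keeping the slave role there is an assumption.

```python
from enum import Enum


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"
    SUSPENDED = "suspended"


def slave_role(peer_link_up: bool, keepalive_up: bool) -> Role:
    """Role taken by the original slave device (Device2) under the existing mechanism."""
    if peer_link_up:
        # Assumption: with the Peer-link healthy the slave simply stays slave.
        return Role.SLAVE
    if keepalive_up:
        # Peer-link down, peer still alive: shut the DHD-facing ports and suspend.
        return Role.SUSPENDED
    # Peer-link and keepalive both down: the peer is presumed lost, so promote.
    return Role.MASTER


def is_dual_master(peer_link_up: bool, keepalive_up: bool) -> bool:
    """The original master (Device1) stays master, so a promoted slave means two masters."""
    return slave_role(peer_link_up, keepalive_up) is Role.MASTER
```

Only the combination of both link failures yields the dual-master state; a Peer-link failure alone leaves the slave suspended.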
In the prior art, when the dual-master scenario occurs, the MLAG device whose original role is slave (Device2 in fig. 1) typically lowers the system priority of all member ports of the aggregation group connected to the dual-homing access device. The dynamic link aggregation group ports on the dual-homing access device can then negotiate the Attached state only with the MLAG device whose original role is master (Device1 in fig. 1), so northbound traffic naturally converges on that one device for forwarding. Moreover, because the aggregation group port on Device2 that connects to the dual-homing access device changes to Detached, the corresponding layer-3 VLAN interface also goes DOWN, the northbound device recomputes its routes, and southbound traffic is switched to Device1 for forwarding.
The problem with this existing solution arises when, as shown in fig. 1, a single-homing access device is attached to Device2 and is in the same network segment as the dual-homing access device. In that case the layer-3 VLAN interface containing the member ports of the aggregation group that connects Device2 to the dual-homing access device does not go DOWN. Being in the same network segment means that the two access devices are in the same VLAN, and a layer-3 VLAN interface stays UP as long as at least one physical port in the VLAN is UP. If a secondary fault occurs, Device2 advertises the low system priority of the aggregation group port to the dual-homing access device, which causes the aggregation group port between the dual-homing access device and Device2 to go DOWN. Device2 does not, however, advertise a low system priority to the single-homing access device (and even if it did, the port between Device2 and the single-homing access device would not go DOWN, because the single-homing access device is connected to only one device). One physical port in the VLAN therefore remains UP, and the VLAN interface stays UP. As a result, the northbound device does not recompute its routes, southbound traffic can still be sent to Device2, and once it arrives there it is dropped because the egress port is already Detached.
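The rule that keeps the VLAN interface UP can be illustrated with a short Python sketch. The function name and the port labels are hypothetical and chosen only for this illustration; the rule itself (the interface is UP while any physical port in the VLAN is UP) is the one stated above.

```python
from typing import Dict


def vlan_interface_up(port_states: Dict[str, bool]) -> bool:
    """A layer-3 VLAN interface is UP while at least one member port is UP."""
    return any(port_states.values())


# Device2 with only the DHD attached: lowering the LACP priority takes the
# aggregation port DOWN, the VLAN interface follows, and routes are recomputed.
without_sd = vlan_interface_up({"agg-to-DHD": False})

# Device2 with an SD in the same VLAN: the SD port keeps the VLAN interface UP,
# so the northbound device keeps sending southbound traffic to Device2.
with_sd = vlan_interface_up({"agg-to-DHD": False, "port-to-SD": True})
```

In the second case the interface state gives the northbound device no signal to reroute, which is the gap the invention addresses by lowering the routing priority explicitly.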
Disclosure of Invention
The invention aims to overcome the problems in the prior art and provides a fault recovery method, an MLAG slave device, a computer-readable storage medium and a network device, which ensure that north-south traffic switches over normally in the dual-master scenario.
The aim of the invention is achieved by the following technical solution:
In a first aspect, a recovery method for an MLAG dual-master fault scenario is provided, applied to a slave device in an MLAG system. The method includes the following steps:
S1, when a dual-master fault occurs in the MLAG system, the MLAG protocol on the slave device notifies the dynamic routing protocol of the slave device to lower the network-wide routing priority of the slave device and advertise it;
S2, wait for a delay until the route advertisement is complete;
S3, the MLAG protocol on the slave device notifies the LACP protocol of the slave device to lower the system priority of the dynamic link aggregation group ports connected to the dual-homing access device;
S4, wait until dual-master fault recovery is complete;
S5, the MLAG protocol on the slave device notifies the LACP protocol of the slave device to restore the system priority of the dynamic link aggregation group ports connected to the dual-homing access device;
S6, wait for a delay until LACP protocol recovery is complete;
S7, the MLAG protocol on the slave device notifies the dynamic routing protocol of the slave device to restore the network-wide routing priority of the slave device and advertise it.
Preferably, step S1 specifically includes:
the MLAG protocol on the slave device issues an event notification to the dynamic routing protocol of the slave device;
after receiving the event notification, the dynamic routing protocol changes the routing priority in the messages it exchanges with the network-side devices to the lowest value and keeps advertising it.
Preferably, the delay for waiting for the route advertisement is configured according to the number of routes.
Preferably, step S3 specifically includes:
the MLAG protocol on the slave device issues an event notification to the LACP protocol of the slave device;
after receiving the event notification, the LACP protocol changes the system priority in the LACP protocol messages it exchanges with the dual-homing access device to the lowest value.
Preferably, step S5 specifically includes:
the MLAG protocol on the slave device issues an event notification to the LACP protocol of the slave device;
after receiving the event notification, the LACP protocol changes the system priority in the LACP protocol messages it exchanges with the dual-homing access device back to the original system priority.
Preferably, step S7 specifically includes:
the MLAG protocol on the slave device issues an event notification to the dynamic routing protocol of the slave device;
after receiving the event notification, the dynamic routing protocol changes the routing priority in the messages it exchanges with the network-side devices back to the original routing priority and keeps advertising it.
Preferably, the delay for waiting for the route advertisement is the same as the delay for waiting for LACP protocol recovery.
In a second aspect, an MLAG slave device is provided, applied to an MLAG dual-master fault scenario and including an MLAG protocol module, a dynamic routing protocol module, and an LACP protocol module;
the MLAG protocol module is configured to notify the dynamic routing protocol module to lower the network-wide routing priority of the slave device when a dual-master fault occurs in the MLAG system;
the dynamic routing protocol module is configured to lower the network-wide routing priority of the slave device and advertise it; the MLAG protocol module is further configured to notify the LACP protocol module, after the route advertisement is complete, to lower the system priority of the dynamic link aggregation group ports that connect the slave device to the dual-homing access device;
the LACP protocol module is configured to lower the system priority of the dynamic link aggregation group ports that connect the slave device to the dual-homing access device;
the MLAG protocol module is further configured to notify the LACP protocol module, after dual-master fault recovery is complete, to restore the system priority of the dynamic link aggregation group ports that connect the slave device to the dual-homing access device;
the LACP protocol module is further configured to restore the system priority of the dynamic link aggregation group ports that connect the slave device to the dual-homing access device;
the MLAG protocol module is further configured to notify the dynamic routing protocol module, after LACP protocol recovery is complete, to restore the network-wide routing priority of the slave device; the dynamic routing protocol module is further configured to restore the network-wide routing priority of the slave device and advertise it.
In a third aspect, a computer readable storage medium is provided, the computer readable storage medium storing a computer program, which when executed by a processor, implements the fault recovery method of any one of the above.
In a fourth aspect, a network device is provided, comprising a memory and a processor, the memory having stored thereon computer instructions executable on the processor, the processor executing the fault recovery method when executing the computer instructions.
It should be further noted that the technical features corresponding to the above options may be combined with or substituted for one another to form new technical solutions, provided they do not conflict.
Compared with the prior art, the invention has the beneficial effects that:
(1) With the recovery method, slave device and medium for the MLAG dual-master fault scenario, when a dual-master fault occurs in the MLAG system, the MLAG protocol on the slave device first notifies the dynamic routing protocol to lower the device's network-wide routing priority and advertise it, so that southbound traffic is switched to the master device first. Even if a single-homing access device exists in the same network segment as the dual-homing access device, southbound traffic no longer goes to the slave device, so it is no longer dropped there because the egress port is Detached, and the southbound packet loss problem is solved. After a delay that allows the route advertisement to complete, the MLAG protocol on the slave device notifies the LACP protocol to lower the system priority of the dynamic link aggregation group ports connected to the dual-homing access device, so that northbound traffic is also switched to the master device. At this point traffic in both directions has been switched, and no packets are lost during the dual-master fault. The invention thus resolves the traffic anomalies that can occur in the dual-master scenario by adding handling in the dynamic routing protocol.
(2) In theory the invention achieves zero-packet-loss switchover in the dual-master scenario; in particular, when applied to the primary fault scenario in which only the Peer-link fails, it achieves zero-packet-loss switchover and recovery.
Drawings
FIG. 1 is a schematic diagram of an MLAG system networking according to an embodiment of the present invention;
FIG. 2 is a flow chart of a fault recovery method according to an embodiment of the present invention;
Fig. 3 is a block diagram of an MLAG slave device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the application. The components of the embodiments generally described and illustrated in the figures may be arranged and designed in a wide variety of configurations. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the application.
It should be noted that the drawbacks of the prior-art solutions described above were identified by the inventors through practice and careful study. The discovery of those problems, and the solutions to them proposed in the embodiments below, should therefore be regarded as contributions made by the inventors in the course of the invention, and should not be construed as knowledge already available to those skilled in the art.
Because the present embodiments involve MLAG-related technology, the technical terms that may be involved are explained below to make the purposes, technical solutions and advantages of the embodiments clearer:
1. MLAG (Multi-Chassis Link Aggregation Group, cross-device link aggregation group) is a mechanism for cross-device link aggregation: one device is link-aggregated with two paired devices, which together form a dual-active system.
2. LACP (Link Aggregation Control Protocol) bundles multiple physical links between two devices into one logical link to increase link bandwidth. The physical links within the logical link are redundant and dynamically back each other up, providing a more reliable network connection.
3. OSPF (Open Shortest Path First) is a link-state dynamic routing protocol that uses Dijkstra's shortest path first (SPF) algorithm to compute routes within a single autonomous system (Autonomous System, AS).
4. IS-IS (Intermediate System to Intermediate System) is an interior gateway protocol (IGP) based on the SPF algorithm.
5. BGP (Border Gateway Protocol) is a dynamic routing protocol used to exchange network layer reachability information (Network Layer Reachability Information, NLRI) between autonomous systems (AS).
6. Peer-link: the directly connected aggregation link between the paired MLAG nodes, also called the peer-to-peer link, used to exchange MLAG protocol messages and to carry part of the data traffic.
7. Keepalive: keepalive detection between the paired MLAG nodes, carried over a layer-3 link. After the Peer-link fails, the paired MLAG nodes use it to determine whether the peer is still alive.
8. Dual-Master: if both the Peer-link and the Keepalive link fail, the slave node (SLAVE) concludes that the master node (MASTER) is completely unreachable and promotes itself to master. Because the original master node still exists, there are now two master nodes, which is Dual-Master. Traffic forwarding may be abnormal in the Dual-Master state.
9. SD (Single-homed Device): a device (including a server) that accesses the MLAG system in single-homed mode. Devices accessing an MLAG system are generally required to be dual-homed to improve link reliability, but in practice an older server may have only one network port and can therefore only be single-homed, which makes it an SD.
10. DHD (Dual-homed Device): a device (or server) that is dual-homed to the MLAG system; it may itself be an MLAG system.
11. ARP table: records the mapping between IP addresses and MAC addresses and is mainly used for layer-3 forwarding.
12. FDB table: records the mapping from MAC address + VLAN to port and is mainly used for layer-2 forwarding.
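As an illustration of terms 11 and 12, the following Python sketch models the two tables as dictionaries and chains the two lookups. The addresses, VLAN number, port name and the function egress_port are all hypothetical values chosen for the example; real devices hold these tables in forwarding hardware and age entries out over time.

```python
from typing import Dict, Optional, Tuple

# ARP table: IP address -> MAC address (used for layer-3 forwarding).
arp_table: Dict[str, str] = {"10.0.0.20": "aa:bb:cc:00:00:20"}

# FDB table: (MAC address, VLAN) -> port (used for layer-2 forwarding).
fdb_table: Dict[Tuple[str, int], str] = {("aa:bb:cc:00:00:20", 100): "agg1"}


def egress_port(dst_ip: str, vlan: int) -> Optional[str]:
    """Resolve the egress port for a routed packet, or None if an entry is missing."""
    mac = arp_table.get(dst_ip)
    if mac is None:
        return None  # no ARP entry (for example, it has aged out)
    return fdb_table.get((mac, vlan))  # None if the FDB entry is missing


known = egress_port("10.0.0.20", 100)
unknown = egress_port("10.0.0.99", 100)
```

A missing entry in either table is the condition under which, in the dual-master scenario described in the background, new flows start to lose packets.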
Aiming at the technical problems pointed out in the background art, the embodiment provided by the invention is as follows:
Example 1
In an exemplary embodiment, referring to fig. 2, a fault recovery method for an MLAG dual-master scenario is provided, applied to a slave device in an MLAG system. The method includes the following steps:
S1, when a dual-master fault occurs in the MLAG system, the MLAG protocol on the slave device notifies the dynamic routing protocol of the slave device to lower the network-wide routing priority of the slave device and advertise it;
S2, wait for a delay until the route advertisement is complete;
S3, the MLAG protocol on the slave device notifies the LACP protocol of the slave device to lower the system priority of the dynamic link aggregation group ports connected to the dual-homing access device;
S4, wait until dual-master fault recovery is complete;
S5, the MLAG protocol on the slave device notifies the LACP protocol of the slave device to restore the system priority of the dynamic link aggregation group ports connected to the dual-homing access device;
S6, wait for a delay until LACP protocol recovery is complete;
S7, the MLAG protocol on the slave device notifies the dynamic routing protocol of the slave device to restore the network-wide routing priority of the slave device and advertise it.
Specifically, the technical solution requires the MLAG protocol, the LACP protocol and the dynamic routing protocol to cooperate. When a dual-master fault occurs in the MLAG system, the MLAG protocol on the slave device first notifies the dynamic routing protocol to lower the device's network-wide routing priority and advertise it, so that southbound traffic is switched to the master device first. Even if a single-homing access device exists in the same network segment as the dual-homing access device, southbound traffic no longer goes to the slave device, so it is no longer dropped there because the egress port is in the Detached state, and the southbound packet loss problem is solved. After a delay that allows the route advertisement to complete, the MLAG protocol on the slave device notifies the LACP protocol to lower the system priority of all member ports of the dynamic link aggregation group that connects the slave device to the dual-homing access device, so that northbound traffic is also switched to the master device. At this point traffic in both directions has been switched, and no packets are lost during the dual-master fault.
The dual-master fault recovery used in step S4 is prior art and is not described here; for example, the MLAG dual-master fault recovery method disclosed in CN115250226A may be used.
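The ordering of steps S1 to S7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class SlaveRecovery, its callback parameters and the 30-second delay are hypothetical, and the callbacks stand in for the event notifications that the MLAG protocol sends to the dynamic routing protocol and the LACP protocol.

```python
from typing import Callable, List, Tuple

LOWEST = "lowest"
ORIGINAL = "original"


class SlaveRecovery:
    """Orders the priority changes on the slave device (hypothetical helper)."""

    def __init__(self,
                 set_route_priority: Callable[[str], None],
                 set_lacp_priority: Callable[[str], None],
                 wait: Callable[[float], None],
                 delay_s: float) -> None:
        self.set_route_priority = set_route_priority
        self.set_lacp_priority = set_lacp_priority
        self.wait = wait
        self.delay_s = delay_s

    def on_dual_master(self) -> None:
        self.set_route_priority(LOWEST)    # S1: southbound traffic moves first
        self.wait(self.delay_s)            # S2: let the route advertisement finish
        self.set_lacp_priority(LOWEST)     # S3: northbound traffic moves second

    def on_fault_recovered(self) -> None:  # called once S4 has completed
        self.set_lacp_priority(ORIGINAL)   # S5: northbound traffic returns first
        self.wait(self.delay_s)            # S6: let LACP recovery finish
        self.set_route_priority(ORIGINAL)  # S7: southbound traffic returns last


# Record the order of actions instead of touching real protocols.
log: List[Tuple[str, object]] = []
recovery = SlaveRecovery(
    set_route_priority=lambda level: log.append(("route", level)),
    set_lacp_priority=lambda level: log.append(("lacp", level)),
    wait=lambda seconds: log.append(("wait", seconds)),
    delay_s=30.0,  # hypothetical value; the text leaves the delay configurable
)
recovery.on_dual_master()
recovery.on_fault_recovered()
```

The point of the sketch is the symmetry: routing is lowered before LACP on failure and restored after LACP on recovery, so southbound traffic is never steered to the slave device while its aggregation ports toward the dual-homing access device cannot forward.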
Example 2
Based on the inventive concept of embodiment 1, this embodiment describes the recovery method for the MLAG dual-master fault scenario in detail. With reference to fig. 1, it includes the following:
1) In the stable state, each of the two MLAG node devices records its own original role, that is, whether it is in the master state (MASTER in the figure) or the slave state (SLAVE in the figure). This record is used to decide which device enters the low-priority state when a dual-master scenario occurs.
2) The Peer-link fails and then the Keepalive link fails, or the Keepalive link fails and then the Peer-link fails.
3) Each of the two MLAG node devices computes its priority for the secondary fault scenario from the stable-state role recorded in advance. The device whose original role is slave (Device2 in fig. 1) is assigned the low priority. The MLAG protocol on the low-priority device issues an event notification to the device's dynamic routing protocol module (BGP/OSPF/IS-IS); after receiving the event, the dynamic routing protocol module changes the routing priority in the messages it exchanges with the network-side device (the IP-network device in the figure) to the lowest value and keeps advertising it. The advertised messages follow the standard protocol, so even devices from other vendors can recognize and process them.
4) Wait for a delay until the route advertisement is complete. The delay can be configured according to user needs; in general, the more routes the device has, the longer the delay should be, so that as far as possible the routing priority is fully advertised within the delay. Once the routing priority of Device2 has been advertised as low, the two devices connected to the network-side device have different routing priorities, with Device2 the lower, so the network-side device can no longer form an ECMP (equal-cost multi-path) load-sharing group, and southbound traffic is ultimately switched to Device1 for forwarding.
5) When the delay expires, the MLAG protocol on Device2 issues an event notification to the device's LACP protocol module; after receiving the event, the LACP protocol module changes the system priority in the LACP protocol messages it exchanges with the dual-homing access device (the DHD in the figure) to the lowest value. The two devices connected to the dual-homing access device, Device1 and Device2, now have different system priorities, with Device2 the lower, so only the aggregation group port between the dual-homing access device and Device1 can be UP. Northbound traffic is ultimately switched to Device1 for forwarding. Traffic in both directions has now been switched, and no packets are lost in the process. The LACP protocol messages follow the standard protocol, so even devices from other vendors can recognize and process them.
6) The Peer-link recovers and then the Keepalive link recovers, or the Keepalive link recovers and then the Peer-link recovers.
7) After the two MLAG node devices have elected the master device and the slave device, the MLAG protocol on Device2 issues an event notification to the device's LACP protocol module; after receiving the event, the LACP protocol module changes the system priority in the LACP protocol messages it exchanges with the dual-homing access device back to the original system priority. Device1 and Device2 now have the same priority, so the aggregation group ports that connect both of them to the dual-homing access device can be UP, and northbound traffic is again forwarded in dual-active mode. The LACP protocol messages follow the standard protocol, so even devices from other vendors can recognize and process them.
8) Wait for a delay until LACP protocol recovery is complete. This delay is the same as the delay in step 4).
9) The MLAG protocol on Device2 issues an event notification to the device's dynamic routing protocol. After receiving the event notification, the dynamic routing protocol changes the routing priority in the messages it exchanges with the network-side device back to the original routing priority and keeps advertising it, so that southbound traffic can again be forwarded by Device2 and forwarding of north-south traffic is finally restored.
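The selection behavior that steps 4) and 5) rely on can be sketched in Python. The sketch is illustrative only. The text does not say which protocol attribute carries the routing priority, so the integer route preference (higher value preferred) is a stand-in, and its values 100 and 0 are hypothetical. For LACP, the system priority is a 16-bit value in which a numerically larger number means a lower priority, so 65535 is the lowest; 32768 is used here merely as a typical default.

```python
from typing import Dict, List

LACP_LOWEST_PRIORITY = 65535  # 16-bit field; a larger number means a lower priority


def southbound_next_hops(route_preference: Dict[str, int]) -> List[str]:
    """Next hops used by the network-side device (assumption: higher value wins)."""
    best = max(route_preference.values())
    return sorted(dev for dev, pref in route_preference.items() if pref == best)


def attached_ports(partner_priority: Dict[str, int]) -> List[str]:
    """DHD member ports that can be UP: those facing the best partner system priority."""
    best = min(partner_priority.values())
    return sorted(port for port, prio in partner_priority.items() if prio == best)


# Stable state: equal priorities, so ECMP and dual-active forwarding are possible.
normal_routes = southbound_next_hops({"Device1": 100, "Device2": 100})
normal_ports = attached_ports({"to-Device1": 32768, "to-Device2": 32768})

# After steps 3) to 5): Device2 advertises the lowest priority in both protocols.
fault_routes = southbound_next_hops({"Device1": 100, "Device2": 0})
fault_ports = attached_ports({"to-Device1": 32768,
                              "to-Device2": LACP_LOWEST_PRIORITY})
```

In the stable state both devices are selected in each direction; after the priorities are lowered only Device1 remains, which is the switchover described in steps 4) and 5). Restoring the original values in steps 7) and 9) returns both selections to the stable-state result.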
Example 3
Based on the same inventive concept as embodiment 1, a slave device is provided, applied to an MLAG dual-master fault scenario. As shown in fig. 3, it includes an MLAG protocol module, a dynamic routing protocol module, and an LACP protocol module;
When a dual-master fault occurs in the MLAG system, the MLAG protocol module notifies the dynamic routing protocol module to lower the network-wide routing priority of the slave device; the dynamic routing protocol module lowers the network-wide routing priority of the slave device and advertises it; after the route advertisement is complete, the MLAG protocol module further notifies the LACP protocol module to lower the system priority of all member ports of the dynamic link aggregation group that connects the slave device to the dual-homing access device; the LACP protocol module lowers the system priority of all member ports of that dynamic link aggregation group;
After dual-master fault recovery is complete, the MLAG protocol module notifies the LACP protocol module to restore the system priority of all member ports of the dynamic link aggregation group that connects the slave device to the dual-homing access device; the LACP protocol module restores the system priority of all member ports of that dynamic link aggregation group; after LACP protocol recovery is complete, the MLAG protocol module further notifies the dynamic routing protocol module to restore the network-wide routing priority of the slave device; the dynamic routing protocol module restores the network-wide routing priority of the slave device and advertises it.
Example 4
Based on the same inventive concept as embodiment 1, a computer-readable storage medium is provided. It stores a computer program which, when executed by a processor, implements the fault recovery method provided by the embodiments of the present invention. On this understanding, the technical solution of this embodiment, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
Example 5
Based on the same inventive concept as embodiment 1, a network device is provided, which includes a memory and a processor, wherein the memory stores computer instructions that can be executed on the processor, and the processor executes the fault recovery method provided by the embodiment of the present invention when executing the computer instructions.
The processor may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the invention.
Embodiments of the subject matter and the functional operations described in this specification can be implemented in: tangibly embodied computer software or firmware, computer hardware including the structures disclosed in this specification and structural equivalents thereof, or a combination of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode and transmit information to suitable receiver apparatus for execution by data processing apparatus.
The processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, general and/or special purpose microprocessors, or any other type of central processing unit. Typically, the central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The essential elements of a computer include a central processing unit for carrying out or executing instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Furthermore, the computer may be embedded in another device, such as a mobile phone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
It should be understood that each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing detailed description is provided for illustration only, and the invention is not to be construed as limited to it; those skilled in the art may make simple deductions and substitutions without departing from the spirit of the invention, and all of these are to be considered as falling within the scope of the invention.

Claims (10)

1. A fault recovery method applied to a slave device in an MLAG system, the method comprising the steps of:
S1, when a dual-master fault occurs in the MLAG system, the MLAG protocol on the slave device notifies the dynamic routing protocol of the slave device to lower the network-wide routing priority of the slave device and advertise it externally;
S2, waiting, with a delay, for the route advertisement to complete;
S3, the MLAG protocol on the slave device notifies the LACP protocol of the slave device to lower the system priority of the dynamic link aggregation group ports connected to the dual-homing access device;
S4, waiting for dual-master fault recovery to complete;
S5, the MLAG protocol on the slave device notifies the LACP protocol of the slave device to restore the system priority of the dynamic link aggregation group ports connected to the dual-homing access device;
S6, waiting, with a delay, for LACP protocol recovery to complete;
S7, the MLAG protocol on the slave device notifies the dynamic routing protocol of the slave device to restore the network-wide routing priority of the slave device and advertise it externally.
2. The fault recovery method according to claim 1, wherein the step S1 specifically includes:
the MLAG protocol on the slave device issues an event notification to the dynamic routing protocol of the slave device;
after receiving the event notification, the dynamic routing protocol modifies the routing priority in the messages it exchanges with the network-side device to the lowest value and continues to advertise.
3. The fault recovery method according to claim 1, wherein the delay for waiting for the route advertisement to complete is configured according to the number of routes.
4. The fault recovery method according to claim 1, wherein the step S3 specifically includes:
the MLAG protocol on the slave device issues an event notification to the LACP protocol of the slave device;
after receiving the event notification, the LACP protocol modifies the system priority in the LACP protocol messages it exchanges with the dual-homing access device to the lowest value.
5. The fault recovery method according to claim 1, wherein the step S5 specifically includes:
the MLAG protocol on the slave device issues an event notification to the LACP protocol of the slave device;
after receiving the event notification, the LACP protocol modifies the system priority in the LACP protocol messages it exchanges with the dual-homing access device back to the original system priority.
6. The fault recovery method according to claim 1, wherein the step S7 specifically includes:
the MLAG protocol on the slave device issues an event notification to the dynamic routing protocol of the slave device;
after receiving the event notification, the dynamic routing protocol modifies the routing priority in the messages it exchanges with the network-side device back to the original routing priority and continues to advertise.
7. The fault recovery method according to claim 1, wherein the delay for waiting for the route advertisement to complete is the same as the delay for waiting for LACP protocol recovery to complete.
8. An MLAG slave device, characterized by comprising an MLAG protocol module, a dynamic routing protocol module and an LACP protocol module;
the MLAG protocol module is configured to notify the dynamic routing protocol module to lower the network-wide routing priority of the slave device when a dual-master fault occurs in the MLAG system;
the dynamic routing protocol module is configured to lower the network-wide routing priority of the slave device and advertise it externally; the MLAG protocol module is further configured to notify the LACP protocol module, after the route advertisement is complete, to lower the system priority of the dynamic link aggregation group ports connected to the dual-homing access device;
the LACP protocol module is configured to lower the system priority of the dynamic link aggregation group ports through which the slave device connects to the dual-homing access device;
the MLAG protocol module is further configured to notify the LACP protocol module, after dual-master fault recovery is complete, to restore the system priority of the dynamic link aggregation group ports connected to the dual-homing access device;
the LACP protocol module is further configured to restore the system priority of the dynamic link aggregation group ports through which the slave device connects to the dual-homing access device;
the MLAG protocol module is further configured to notify the dynamic routing protocol module, after LACP protocol recovery is complete, to restore the network-wide routing priority of the slave device; the dynamic routing protocol module is further configured to restore the network-wide routing priority of the slave device and advertise it externally.
9. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the fault recovery method of any one of claims 1-7.
10. A network device comprising a memory and a processor, the memory having stored thereon computer instructions executable on the processor, wherein the processor executes the fault recovery method of any one of claims 1-7 when the computer instructions are executed.
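Two details of the claims above can be illustrated with a short Python sketch, given here only as an illustration under stated assumptions. First, claim 3 says the delay is configured according to the number of routes but gives no formula; the linear rule in `advertisement_delay` and its parameters `base_seconds` and `routes_per_second` are hypothetical. Second, for the "lowest" LACP system priority in claim 4, the sketch assumes the standard IEEE 802.1AX encoding, in which the system priority is a 16-bit value and a numerically larger value means a lower priority, so the lowest priority is 65535. The "original" value of 32768 is merely a common default and is likewise an assumption.

```python
from typing import Tuple

# In IEEE 802.1AX the system priority is 16 bits wide and a numerically
# larger value means a LOWER priority, so 0xFFFF is the lowest priority.
LACP_LOWEST_SYSTEM_PRIORITY = 0xFFFF
ASSUMED_ORIGINAL_SYSTEM_PRIORITY = 32768  # common default; an assumption here

SystemId = Tuple[int, bytes]  # (system priority, system MAC address)


def advertisement_delay(route_count: int,
                        base_seconds: float = 1.0,
                        routes_per_second: float = 1000.0) -> float:
    """Hypothetical rule: a fixed base plus time proportional to route count."""
    if route_count < 0:
        raise ValueError("route_count must be non-negative")
    return base_seconds + route_count / routes_per_second


def has_higher_priority(a: SystemId, b: SystemId) -> bool:
    """True if system ID `a` outranks `b` (the numerically smaller ID wins)."""
    return a < b


if __name__ == "__main__":
    mac = bytes.fromhex("001122334455")
    original = (ASSUMED_ORIGINAL_SYSTEM_PRIORITY, mac)
    lowered = (LACP_LOWEST_SYSTEM_PRIORITY, mac)
    print(advertisement_delay(5000))
    print(has_higher_priority(original, lowered))
```

Under claim 7, the same delay value could be reused both for waiting on the route advertisement (S2) and for waiting on LACP protocol recovery (S6).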
CN202410766682.0A 2024-06-14 2024-06-14 Fault recovery method, MLAG slave device, computer readable storage medium and network device Pending CN118487994A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410766682.0A CN118487994A (en) 2024-06-14 2024-06-14 Fault recovery method, MLAG slave device, computer readable storage medium and network device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410766682.0A CN118487994A (en) 2024-06-14 2024-06-14 Fault recovery method, MLAG slave device, computer readable storage medium and network device

Publications (1)

Publication Number Publication Date
CN118487994A true CN118487994A (en) 2024-08-13

Family

ID=92189638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410766682.0A Pending CN118487994A (en) 2024-06-14 2024-06-14 Fault recovery method, MLAG slave device, computer readable storage medium and network device

Country Status (1)

Country Link
CN (1) CN118487994A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination