WO2015088268A1

WO2015088268A1 - Method for processing failure of network device in software defined networking (SDN) environment

Info

Publication number: WO2015088268A1
Application number: PCT/KR2014/012220
Authority: WO
Inventors: 곽은주; 이광국; 이영욱
Original assignee: 주식회사 케이티
Priority date: 2013-12-11
Filing date: 2014-12-11
Publication date: 2015-06-18

Abstract

Disclosed is a method for processing a failure occurring in a network device. The method for processing the failure, performed in a network device connected to at least one controller, comprises the steps of: predicting the failure of the network device; and when the failure of the network device is predicted, notifying at least one controller that the network device will be down. Accordingly, by defining a processing mechanism for each type of router failure, all controllers concerned can quickly grasp the failure information of the router.

Description

Method for handling failure of a network device in an SDN environment

TECHNICAL FIELD The present invention relates to software defined networking techniques, and more particularly, to a method for handling a failure occurring in a network device.

Software Defined Networking (SDN) technology has emerged, which separates the forwarding plane and the control plane of a communication system so that, for flexible control and cost reduction, the communication network can be centrally defined and controlled in the same way as software programming.

In line with this trend, the Internet Engineering Task Force (IETF) is defining standard interfaces between routers and external controllers, so that an external controller can centrally collect router information or apply routing system control policies, and the SDN concept can be applied while modifying the functions of existing routers as little as possible.

Specifically, the IETF proposes Interface to the Routing System (I2RS) technology, which supports centralized control using external controllers for routing systems, including legacy IP routing systems that do not have separate forwarding and control planes.

That is, by standardizing a routing system interface technology, the IETF is currently defining a framework and an interface for communication between a controller and an existing or new router device.

However, there has been insufficient discussion on how to handle a failure of a network device such as a router in an SDN environment.

An object of the present invention for solving the above problems is to provide a method of handling when a network device such as a router in the SDN environment has a failure.

According to an aspect of the present invention, there is provided a fault handling method for a network device, the fault handling method being performed in a network device connected to at least one controller and comprising: predicting a failure of the network device; and, if a failure of the network device is predicted, informing the at least one controller that the network device will be down.

Here, when the failure of the network device is predicted, the at least one controller may be notified that the network device will be down, together with time information indicating when the network device will go down.

Here, a time stamp generated by the network device may be used as the time information indicating when the network device will go down.

Here, the notifying that the network device will be down may include: searching for a controller associated with the network device in a storage that stores a list of at least one controller; and sending a message informing the discovered controller that the network device is going down.

Here, a message broker may relay a message exchange between at least one controller and a network device.

According to another aspect of the present invention, there is provided a fault handling method for a network device, the fault handling method being performed in a network device connected to at least one controller and comprising: restarting the network device after a failure occurs; and transmitting information about the restart to the at least one controller to inform it of the failure.

The transmitting of the information about the restart to the at least one controller may inform the at least one controller by using the information about the restart that an unexpected failure has occurred in the network device.

Here, the information on the restart count of the network device may be included in the restart information to inform the at least one controller of the failure of the network device.

The transmitting of the information about the restart to the at least one controller may include: searching for a controller associated with the network device in a storage that stores a list of at least one controller; and transmitting the information about the restart to the discovered controller.

According to another aspect of the present invention, there is provided a fault handling method for a network device, the fault handling method being performed by a controller connected to at least one network device and comprising: receiving information distinguished according to the type of failure occurring in the network device; and handling the failure according to the information distinguished by failure type.

Here, the information distinguished according to the failure type may include notification information indicating that the network device will be down if a failure of the network device is predicted, and notification information indicating a restart if the failure of the network device is not predicted.

Here, in the receiving of the information distinguished according to the type of failure occurring in the network device, when the failure of the network device is predicted, the controller may receive notification information including time information indicating when the network device will go down.

Here, in the receiving of the information distinguished according to the type of failure occurring in the network device, when the failure of the network device is not predicted, the number of restarts of the network device may be received.

Here, in the processing of the failure of the network device, the message to be sent to the failed network device may be recorded in a log and the transmission may be suspended.

As described above, the method for handling a failure of a network device according to the present invention defines a handling mechanism for each failure type of a router, namely graceful failure and crash, so that all relevant controllers can quickly grasp the failure information of the router.

In addition, after a router failure, the controller uses the information about the graceful failure or crash to record in a log all the messages it wants to send to the router and then suspends their transmission. This reduces message retransmissions and thereby reduces network load.

FIG. 1 is a block diagram illustrating the structure of a routing system according to an embodiment of the present invention.

FIG. 2 is a flowchart illustrating a failure processing method for a network device according to an embodiment of the present invention.

FIG. 3 is a conceptual diagram illustrating the publication and subscription of an event using a message broker according to an embodiment of the present invention.

FIG. 4 is a flowchart illustrating the publication and subscription of an event using a message broker according to an embodiment of the present invention.

FIG. 5 is a flowchart illustrating a method of handling a predicted failure for a network device using a message broker according to an embodiment of the present invention.

FIG. 6 is a flowchart illustrating a method for a message broker to handle a predicted failure of a network device according to an embodiment of the present invention.

FIG. 7 is a flowchart illustrating a method of handling a predicted failure for a network device in the absence of a message broker according to an embodiment of the present invention.

FIG. 8 is a flowchart illustrating a method of handling an unexpected failure for a network device using a message broker according to an embodiment of the present invention.

FIG. 9 is a flowchart illustrating a method for a message broker to handle an unexpected failure for a network device according to an embodiment of the present invention.

FIG. 10 is a flowchart illustrating a method of handling an unexpected failure for a network device in the absence of a message broker according to an embodiment of the present invention.

As the invention allows for various changes and numerous embodiments, particular embodiments will be illustrated in the drawings and described in detail in the written description. However, this is not intended to limit the present invention to specific embodiments, it should be understood to include all modifications, equivalents, and substitutes included in the spirit and scope of the present invention. In describing the drawings, similar reference numerals are used for similar elements.

Terms such as first, second, A, and B may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component. The term and / or includes a combination of a plurality of related items or any item of a plurality of related items.

When a component is referred to as being "coupled" or "connected" to another component, it may be directly coupled or connected to that other component, but it should be understood that other components may be present in between. On the other hand, when a component is said to be "directly coupled" or "directly connected" to another component, it should be understood that there is no other component in between.

The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting of the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, the terms "comprise" or "have" are intended to indicate that a feature, number, step, operation, component, part, or combination thereof described in the specification is present, and it is to be understood that they do not exclude in advance the possibility of the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art. Terms such as those defined in commonly used dictionaries should be construed as having meanings consistent with their meanings in the context of the related art, and shall not be construed in an ideal or excessively formal sense unless expressly so defined in this application.

Hereinafter, a 'controller' or a 'client' referred to in the present invention means a functional element that controls related components (for example, switches or routers) in order to control the flow of traffic, and is not limited to a particular physical implementation form, implementation location, or the like. For example, a controller may mean a controller function entity defined by ONF, IETF, ETSI, and/or ITU-T.

In addition, the term 'network device' or 'agent' referred to in the present invention means a functional element that actually forwards, switches, or routes traffic (or packets), and may mean a switch, a router, a switch element, a router element, a forwarding element, and the like as defined by ONF, IETF, ETSI, and/or ITU-T.

In addition, embodiments of the present invention described below may be supported by standard documents written by ONF, IETF, ETSI, and ITU-T, which are standardizing SDN technology, and/or by standard documents written by IEEE, ITU-T, and IETF, which are standardizing transport networks. That is, parts of the embodiments of the present invention that are not specifically described, in order to clearly reveal the technical spirit of the present invention, may be supported by the standard documents prepared by these standardization organizations. In addition, all terms used in the present invention can be explained by the above standard documents.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Referring to FIG. 1, a plurality of routers 200 to be controlled by the controller 100 may be configured, and a plurality of controllers 100 may also be configured to increase load distribution and stability.

FIG. 1 shows a case in which M controllers 100, denoted as the first controller through the M-th controller, are located externally, physically separated from the routers 200, and control N routers 200, denoted as the first router through the N-th router.

Each controller 100 may operate in conjunction with the network application 300. In addition, each controller 100 may operate in conjunction with one or more applications 300. For example, each controller 100 may provide information required for the application 300 or perform a request of the application 300.

In detail, FIG. 1 illustrates a structure in which communication between an agent module 211 on the control plane of a router 200 and a client module 101 on a controller 100 is performed through a standardized Interface to the Routing System (I2RS).

The client module 101 may receive a routing policy or a control command from the application 300 and convert the received policy or control command into a message in a form that the agent module 211 can parse.

The agent module 211 may parse the transmitted policy or control information and interact with the topology database 212, the policy database 215, the Routing Information Base (RIB) module 214, the routing/signaling protocol module 213, the OAM event module 216, and the like, within the router 200.

In addition, the forwarding information base module 217 may exist on a data plane of the router 200. Accordingly, the information from the agent module 211 may be delivered to the forwarding information base module 217 of the data plane via the routing information base module 214.

In addition, a monitoring function may be performed that transmits various event information or statistical information of the routers 200, as preset by the operator, to the client module 101 through the agent module 211.

Agent module 211, which is a module in the router 200 that communicates with the controller 100 controlling the router 200 through a standard interface, is very important in terms of stability and reliability of the routing system.

However, at present, there is no defined structure and mechanism for handling the failure of the Agent module 211. In other words, router failure (or agent failure) is discussed in the Interface To the Routing System (I2RS) standardization group, but no specific mechanism is established. Therefore, it is necessary to define a proper way to deal with router failure (or agent failure).

On the other hand, protocol requirements need to be defined in terms of message transmission in the I2RS environment. In an environment in which a plurality of controllers 100 are connected to and operate with a plurality of routers 200 as shown in FIG. 1, the number of relations that each controller 100 and each router 200 must manage for the messages transmitted over the interface between them increases as the number of controllers 100 and the number of routers 200 increase.

For example, when N routers 200 and M controllers 100 form a relationship, the number of relations to be managed directly becomes N × M.

In addition, there are scalability issues: when a new router 200 or controller 100 is added, additional work must be performed on all the controllers 100 and routers 200 affected by the newly added router 200 or controller 100.

Accordingly, the present invention proposes a method for handling a router failure (or agent failure), and provides a way to improve the structure by applying a publish/subscribe method to I2RS interface messages such as router failure (or agent failure) notifications.

Referring to FIG. 2, the router 200 may classify a failure according to whether the failure can be predicted (S210). For example, the router 200 may classify a predictable shutdown or failure as a graceful failure, and classify a case in which the router 200 suddenly fails as a crash.

When the router 200 detects the occurrence of a graceful failure, the router 200 may look up information on all the controllers 100 connected to it (S211) and notify those controllers 100 that the router will be down (S213). At this time, the controllers 100 may record in a log the messages to be sent to the router 200 that is going down and suspend their transmission.
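The graceful-failure steps S211 and S213 can be sketched as follows. This is a minimal in-memory illustration, not the claimed implementation: the class names `ShutdownNotice`, `Controller`, and `Router`, their methods, and the use of direct method calls in place of a network transport are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ShutdownNotice:
    # Hypothetical message: the router announces that it will go down.
    router_id: str
    down_time: float  # time stamp generated by the router itself


@dataclass
class Controller:
    controller_id: str
    down_routers: set = field(default_factory=set)
    pending_log: list = field(default_factory=list)

    def on_shutdown_notice(self, notice):
        # Remember that this router is going down.
        self.down_routers.add(notice.router_id)

    def send(self, router_id, message):
        # Log the message and suspend transmission if the router is down.
        if router_id in self.down_routers:
            self.pending_log.append((router_id, message))
            return False
        return True  # a real controller would transmit here


class Router:
    def __init__(self, router_id, controllers):
        self.router_id = router_id
        # Stands in for the storage that holds the list of controllers.
        self.controller_store = list(controllers)

    def announce_graceful_failure(self, now=None):
        stamp = time.time() if now is None else now
        notice = ShutdownNotice(self.router_id, stamp)
        for controller in self.controller_store:   # S211: look up controllers
            controller.on_shutdown_notice(notice)  # S213: notify each one
        return notice
```

In this sketch a controller keeps accepting `send` calls after the notice, but only appends them to `pending_log`, which mirrors the log-and-suspend behavior described above.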

An unexpected crash may occur in the router 200 (S230). In this case, the controllers 100 may be unaware of the failure of the corresponding router 200. Accordingly, the router 200 may transmit a health-check message, such as a heartbeat, to the controller 100 so that the controller 100 can quickly recognize the router 200 in which the crash occurred (S220). However, the transmission of the heartbeat message by the router 200 is optional.

When the controller 100 does not receive a heartbeat from the router 200 at the predetermined period, or has otherwise not detected the crash that occurred in the router 200 (S231), the controller 100 may request a connection in order to transmit a message to the router 200 (S240). Since the router 200 is in a crashed state, the controller 100 may receive an error reply such as Connection Fail (S241).

Therefore, the controller 100 may detect a crash that occurred in the router 200 through the absence of a heartbeat message or through an error reply such as a connection failure (S243).
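A controller-side check combining the two signals above (missed heartbeats and a failed connection attempt) might look like the sketch below. The class `CrashDetector`, the `missed_limit` threshold, and the injected `connect` callable are assumptions for illustration; the source does not specify how many missed heartbeats count as a failure.

```python
class CrashDetector:
    def __init__(self, heartbeat_period, missed_limit=3):
        self.heartbeat_period = heartbeat_period
        self.missed_limit = missed_limit  # assumed tolerance, not from the source
        self.last_seen = {}               # router_id -> time of last heartbeat

    def on_heartbeat(self, router_id, now):
        self.last_seen[router_id] = now

    def heartbeat_missing(self, router_id, now):
        last = self.last_seen.get(router_id)
        if last is None:
            # Heartbeats are optional, so silence alone proves nothing.
            return False
        return now - last > self.heartbeat_period * self.missed_limit

    def is_crashed(self, router_id, now, connect):
        # Signal 1: heartbeats stopped arriving at the expected period.
        if self.heartbeat_missing(router_id, now):
            return True
        # Signal 2: a connection request is answered with an error.
        try:
            connect(router_id)
        except ConnectionError:
            return True
        return False
```

Passing `connect` in as a callable keeps the sketch independent of any particular transport; a real controller would substitute its own session setup.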

The controller 100 may record in a log the messages to be sent to the router 200 that has entered a crash state and suspend their transmission (S250). In addition, the controller 100 may look up the list of other controllers 100 related to the corresponding router 200 and notify them of the router failure.

On the other hand, the controller 100 may fail to detect the crash that occurred in the router 200, even through the absence of a heartbeat message or an error reply such as Connection Fail. The process in this case is as follows.

The router 200 may recover from the crash and restart (S260). When restarted, the router 200 may notify all related controllers 100 that it has rebooted (S261). At this time, to distinguish the new session from the previous one, the notification may include a session ID, a boot count, a boot time, and the like. Here, the boot count means the number of times the router 200 has been booted.

Therefore, even if the controller 100 did not receive a failure notification from the router 200, the controller 100 can recognize that the router 200 was rebooted because of a failure.
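One way a controller could use the session ID and boot count in a reboot notification to infer an undetected crash is sketched below. The names `RebootNotice` and `RebootTracker`, and the `was_announced` flag standing for "a graceful shutdown notice was received earlier", are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RebootNotice:
    # Fields named after the items listed in the text.
    router_id: str
    session_id: str
    boot_count: int
    boot_time: float


class RebootTracker:
    def __init__(self):
        self.known = {}  # router_id -> last RebootNotice seen

    def on_reboot_notice(self, notice, was_announced=False):
        """Return True if the notice reveals a failure not announced beforehand."""
        previous = self.known.get(notice.router_id)
        self.known[notice.router_id] = notice
        if previous is None:
            return False  # first contact, nothing to compare against
        restarted = (
            notice.boot_count > previous.boot_count
            or notice.session_id != previous.session_id
        )
        return restarted and not was_announced
```

A higher boot count or a changed session ID marks a new session; if no shutdown was announced for it, the sketch treats the restart as evidence of a crash.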

After the router 200 restarts (Agent Reboot), the controller 100 may retransmit or delete, according to policy, the messages that were not sent because of the failure of the router 200 (S263). For example, depending on the message type, information on QoS, statistics, and events may be retransmitted, while change information on topology and RIB may be deleted. Alternatively, unsent messages may be processed by a time-based policy, in which all messages older than 1 hour are deleted and messages from within the last hour are retransmitted.
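The two example policies for unsent messages (by message type, and by age with a one-hour cutoff) can be illustrated with the sketch below. The function `split_pending`, the `PendingMessage` type, and the lowercase type labels are assumptions; in particular, treating every type outside the retransmit set as deletable is a simplification of this example.

```python
from dataclasses import dataclass

# Types the text lists as retransmitted; topology and RIB changes are deleted.
RETRANSMIT_TYPES = {"qos", "statistics", "event"}
MAX_AGE_SECONDS = 3600  # the one-hour cutoff from the text


@dataclass
class PendingMessage:
    msg_type: str
    created_at: float
    payload: str


def split_pending(pending, now, policy):
    """Split logged messages into (retransmit, delete) lists."""
    if policy not in ("by_type", "by_age"):
        raise ValueError("unknown policy: " + policy)
    retransmit, delete = [], []
    for msg in pending:
        if policy == "by_type":
            keep = msg.msg_type in RETRANSMIT_TYPES
        else:
            keep = now - msg.created_at <= MAX_AGE_SECONDS
        (retransmit if keep else delete).append(msg)
    return retransmit, delete
```

Calling `split_pending(log, now, "by_age")` after a reboot notification would, for instance, drop everything logged more than an hour ago and return the rest for retransmission.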

Referring to FIG. 3, when many and various types of messages are exchanged between the controller 100 and the router 200, a publish and subscribe method can be used to reduce the coupling between the controller 100 and the router 200 and to reduce the burden of session management.

In addition, the message broker 400 may be utilized to reduce the mutual dependency between the controller 100 and the router 200 and to reduce the complexity and burden of managing the relationships between the plurality of controllers 100 and the routers 200.

The message broker 400 may relay message exchanges between the plurality of controllers 100 and the plurality of routers 200. For example, the message broker 400 may relay messages between a plurality of controllers 100 and a plurality of routers 200 with reference to a publish/subscribe relation DB 500, and may store log information on the relayed message exchanges in a message log DB 600.

Referring to FIG. 4, a method for publishing and subscribing to an event using a message broker according to an embodiment of the present invention may include a subscription/publication registration step (S410), an authentication/authorization (Authenticate/Authorize) step (S420), an event publication step (S430), and an event subscription step (S440).

The message used in each step will be described with reference to FIG. 4 as follows.

FIG. 4 shows an embodiment of the messages and parameters of each step of publish/subscribe message transmission in a structure in which the message broker (MB) 400 is present.

First, a subscription / publication registration step S410 may be performed using a message for a subscription registration request and a publication registration request.

The controller 100 may transmit a message for subscription registration request to the message broker, and the router 200 may transmit a message for publication registration request to the message broker.

Accordingly, by receiving the messages for the subscription registration request and the publication registration request, the message broker 400 can register the controller 100 as a subscriber and the router 200 as a publisher.

In addition, the message used in the subscription / publication registration step S410 may include information included in Table 1 below.

That is, the publisher and subscriber can be distinguished using the information in Table 1. In addition, by using the information on the request status, it is possible to perform registration, pause, suspension cancellation, cancellation of registration, and the like.

Table 1

parameter	Explanation	Remarks
Msg id	Message id
Requester id	Controller id or router id requesting registration	Identification information of the controller or router requesting registration
Order type	Request status	Register, pause, unpause, unregister
Role	Classification of roles to register	Publisher or Subscriber
Event type	Type of event you want to publish / subscribe	Policy, Routing Information, Fault, Statistics, etc.
Time stamp	Request time	Request time of registration request message
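The registration message of Table 1 could be modeled as in the sketch below. The field names follow the table's parameters, but the Python types, the enum values, and the `Registry` class that applies the four request states are assumptions for illustration.

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class OrderType(Enum):
    REGISTER = "register"
    PAUSE = "pause"
    UNPAUSE = "unpause"
    UNREGISTER = "unregister"


class Role(Enum):
    PUBLISHER = "publisher"
    SUBSCRIBER = "subscriber"


@dataclass
class RegistrationRequest:
    msg_id: str
    requester_id: str   # controller id or router id
    order_type: OrderType
    role: Role
    event_type: str     # e.g. "Policy", "Routing Information", "Fault"
    time_stamp: float = field(default_factory=time.time)


class Registry:
    def __init__(self):
        # (requester_id, role, event_type) -> "active" or "paused"
        self.entries = {}

    def apply(self, req):
        """Apply one request and return the resulting state (None if absent)."""
        key = (req.requester_id, req.role, req.event_type)
        if req.order_type is OrderType.REGISTER:
            self.entries[key] = "active"
        elif req.order_type is OrderType.PAUSE and key in self.entries:
            self.entries[key] = "paused"
        elif req.order_type is OrderType.UNPAUSE and key in self.entries:
            self.entries[key] = "active"
        elif req.order_type is OrderType.UNREGISTER:
            self.entries.pop(key, None)
        return self.entries.get(key)
```

Keying the registry on requester, role, and event type lets the same controller subscribe to Fault events while pausing Statistics events, which is one plausible reading of the Order type and Event type columns.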

In the authentication / authorization step (S420), authentication and authorization may be performed between the message broker 400, the controller 100, and the router 200. That is, the message broker 400, the controller 100, and the router 200 may authenticate each other, and request and grant authority according to each role.

In addition, the message used in the authentication / authorization step (S420) may include information included in Table 2 below.

Table 2

parameter	Explanation	Remarks
Msg id	Message id
Requester id	Message broker id, controller id, or router id requesting authentication / permission	Identification information of the message broker, controller, or router requesting authentication or authorization
Order type	Request status	Register, pause, unpause, unregister
Role	Role division	Publisher or Subscriber or Message Broker
Event type	Type of event you want to publish / subscribe	Policy, Routing Information, Fault, Statistics, etc.
Time stamp	Request time	Request time of the request message

In the event publication step S430, the message broker 400 may receive an event published by the controller 100 or the router 200.

In the event subscription step S440, the message broker 400 may deliver an event published by the controller 100 or the router 200 to the subscribing controller 100 or router 200.

In addition, the message used in the event publication step S430 and the event subscription step S440 may include information included in Table 3 below.

Table 3

parameter	Explanation	Remarks
Msg id	Message id	Subscription message id
Publisher id	Id of the publishing controller or router
Subscriber id	Id of the subscribing controller or router
Priority	Message priority	Higher-priority messages should be sent without delay or loss
Event type	Type of event	Policy, Routing Information, Fault, Statistics, etc.
Event message	Event message	Detailed messages about Router Shutdown, Agent Crash, Agent Reboot, etc.
Event time	The time the event occurred	Router boot time, router shutdown time
Time stamp	Message request time	Request time of subscription message

FIG. 5 is a flowchart illustrating a method of handling a predicted failure of a network device using a message broker according to an embodiment of the present invention, and FIG. 6 is a flowchart illustrating a method by which a message broker handles a predicted failure of a network device according to an embodiment of the present invention.

FIG. 5 illustrates a procedure for handling a graceful failure in a structure in which a message broker 400 is present.

Referring to FIG. 5, a method of handling a predicted failure of a network device using a message broker 400 according to an embodiment of the present invention may include a subscription/publication registration step (S510), an authentication/authorization (Authenticate/Authorize) step (S520), a router failure publication (Router Failure Publication) step (S530), and a router failure subscription (Router Failure Subscription) step (S540). Here, each step according to FIG. 5 may be understood to correspond to each step according to FIG. 4.

In detail, the controller 100 may make a router failure subscription registration request to the message broker 400, and the router 200 may make a router failure publication registration request to the message broker 400 (S510).

The message broker 400 and the controller 100 and the router 200 registered for subscription and publication may authenticate each other, and request and grant authority according to each role (S520).

The router 200 may issue a router failure event to the message broker 400 according to occurrence of a router failure (S530).

Therefore, the message broker 400 may transmit the router failure event to the controller 100 that requested the subscription, and change the state of the router 200 to a failure state (S540).

FIG. 6 describes the steps S530 and S540 of FIG. 5 in more detail.

Referring to FIG. 6, the router 200 may publish a router failure event, and the message broker 400 may notify the controller 100 of the failure of the router. In addition, the message broker 400 may change the state of the corresponding router 200 in which the failure occurs to fail.

The message broker 400 may receive a publication for a router failure and record it in a message log (S610).

The message broker 400 may query the publish/subscribe relationship information to look up the controller 100 that is a subscriber connected to the corresponding router 200 (S620).

In addition, the message broker 400 may place the message notifying the controller 100 of the router failure in a queue according to its transmission priority and process it (S630 and S640). By processing the queue by priority, urgent and important messages among multiple messages can be transmitted without delay or loss.

Finally, the message broker 400 may change the state of the corresponding router 200 in which the router failure occurs to a failure state (S650).
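The broker-side steps S610 to S650 can be sketched with a priority queue as follows. The `MessageBroker` class, its method names, the dictionary-based event layout, and the convention that a lower number means a higher priority are all assumptions of this example; delivery is simulated by returning the dispatch order.

```python
import heapq
import itertools


class MessageBroker:
    def __init__(self, relations):
        # Stands in for the publish/subscribe relation DB:
        # router_id -> list of subscribed controller ids.
        self.relations = relations
        self.message_log = []
        self.queue = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order per priority
        self.router_state = {}

    def enqueue(self, router_id, event_message, event_time, priority):
        event = {
            "publisher_id": router_id,
            "event_message": event_message,
            "event_time": event_time,
            "priority": priority,
        }
        self.message_log.append(event)                           # S610: log
        for controller_id in self.relations.get(router_id, []):  # S620: look up
            item = (priority, next(self._seq), controller_id, event)
            heapq.heappush(self.queue, item)                     # S630: queue

    def dispatch(self):
        # S640: process the queue, most urgent (lowest number) first.
        delivered = []
        while self.queue:
            _prio, _seq, controller_id, event = heapq.heappop(self.queue)
            delivered.append((controller_id, event["event_message"]))
        return delivered

    def handle_router_failure(self, router_id, event_time):
        self.enqueue(router_id, "Router Shutdown", event_time, priority=0)
        delivered = self.dispatch()
        self.router_state[router_id] = "failed"                  # S650: mark state
        return delivered
```

Because the failure event is queued with the most urgent priority, it overtakes routine messages that were already waiting, which is the effect the priority queue is meant to achieve.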

When the message between the controller 100 and the router 200 is processed using the message broker 400 as illustrated in FIGS. 5 and 6, the following advantages are provided.

The message broker 400 may centrally manage whether each connection between a controller 100 and a router 200 is connected or disconnected (for example, by a router failure).

Since the message broker 400 ultimately takes over the roles of subscription and publication, the burden of transmitting messages between the controller 100 and the router 200 may be reduced.

Even in a situation in which message transmission is impossible due to a failure of the controller 100 or the router 200, the message broker 400 stores the messages in a log, which enables asynchronous transmission. For example, the message broker 400 may store messages in a log in case of a router failure, and transmit the unsent messages in a batch after the failure is recovered.

Because the message broker 400 manages message priorities for the network as a whole, it can guarantee priority-based transmission across the entire network when congestion occurs in message transmission. Therefore, the stability and reliability of the network can be improved by quickly delivering events that occur in the network.
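The store-in-log and batch-transmit behavior listed among the advantages above can be sketched as a small store-and-forward helper. The class `StoreAndForward` and the injected `deliver` callable are hypothetical; a real broker would persist the backlog in its message log DB rather than in memory.

```python
from collections import defaultdict


class StoreAndForward:
    def __init__(self, deliver):
        self.deliver = deliver            # callable(router_id, message)
        self.failed = set()
        self.backlog = defaultdict(list)  # router_id -> unsent messages

    def mark_failed(self, router_id):
        self.failed.add(router_id)

    def send(self, router_id, message):
        # While the router is down, keep the message instead of sending it.
        if router_id in self.failed:
            self.backlog[router_id].append(message)
            return False
        self.deliver(router_id, message)
        return True

    def mark_recovered(self, router_id):
        # After recovery, transmit the stored messages in one batch.
        self.failed.discard(router_id)
        pending = self.backlog.pop(router_id, [])
        for message in pending:
            self.deliver(router_id, message)
        return len(pending)
```

The backlog is flushed in the order the messages were stored, so senders do not need to retry on their own while the router is unavailable.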

Referring to FIG. 7, unlike the embodiment according to FIG. 5, failures can be handled through direct information exchange between the controller 100 and the router 200, without the message broker 400 relaying messages between them.

That is, the controller 100 and the router 200 may directly authenticate each other and manage connection information between each other.

Specifically, the method for handling a predicted failure of a network device in the absence of a message broker 400 according to an embodiment of the present invention may include a subscription/publication registration step (S710), an authentication/authorization (Authenticate/Authorize) step (S720), a router failure publication (Router Failure Publication) step (S730), and a router failure subscription (Router Failure Subscription) step (S740). Here, each step according to FIG. 7 may be understood to correspond to each step according to FIG. 4.

The controller 100 may make a router failure subscription registration request to the router 200 (S710).

The controller 100 and the router 200 may authenticate each other, and may request and grant a right according to each role (S720).

The router 200 may issue a router failure event to the controller 100 according to occurrence of a router failure (S730).

The controller 100 may change the state of the corresponding router 200 into a failure state (S740).

Accordingly, a failure processing method performed by the network device will be described with reference to FIGS. 5 to 7.

The network device may predict a failure of the network device, and when a failure of the network device is predicted, the network device may transmit to the controller 100 a message indicating that the network device will be down.

That is, when a failure of the network device is predicted, the network device may inform the controller 100 that it will go down, including time information indicating when it will go down. Here, a time stamp generated by the network device may be used as this time information.

In addition, the network device may search for the controller 100 associated with the network device in a storage that stores a list of controllers 100, and may transmit a message indicating that the network device will be down to the discovered controller 100.

FIG. 8 is a flowchart illustrating a method of handling an unexpected failure of a network device using a message broker according to an embodiment of the present invention, and FIG. 9 is a flowchart illustrating a method by which a message broker handles an unexpected failure of a network device according to an embodiment of the present invention.

Referring to FIG. 8, a method of handling an unexpected failure of a network device using the message broker 400 according to an embodiment of the present invention may include a subscription/publication registration step (S810), an authentication/authorization (Authenticate/Authorize) step (S820), a router failure publication step (S830), and a router failure subscription step (S840). Here, each step of FIG. 8 may be understood to correspond to the respective step of FIG. 4.

In detail, the controller 100 may send a router restart subscription registration request to the message broker 400, and the router 200 may send a router restart publication registration request to the message broker 400 (S810).

The message broker 400, and the controller 100 and the router 200 that registered the subscription and the publication, may authenticate one another and may request and grant authority according to their respective roles (S820).

The router 200 may publish a router restart event to the message broker 400 when the router restarts (S830).

Accordingly, the message broker 400 may transmit the router restart event to the controller 100 that requested the subscription, and may change the state of the corresponding router 200 to a failure state (S840).

FIG. 9 describes steps S830 and S840 of FIG. 8 in more detail.

Referring to FIG. 9, the router 200 may publish a router restart event, and the message broker 400 may notify the controller 100 of the router restart. In addition, the message broker 400 may change the state of the router 200 in which the failure occurred to a fail state.

The message broker 400 may receive the publication of the router restart and record it in a message log (S910).

The message broker 400 may look up the controllers that are subscribers associated with the corresponding router by querying the publish/subscribe relationship information (S920).

In addition, the message broker 400 may place the message notifying the controller 100 of the router failure in a queue according to its transmission priority (S930 and S940). Because the queues are processed by priority, urgent and important messages can be transmitted without delay or loss even when many messages are pending.
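The priority-based queueing of steps S930 and S940 might look like the sketch below. The priority constants and the `PriorityOutbox` class are hypothetical names chosen for illustration; the source does not specify how priorities are encoded or how many levels exist.

```python
import heapq
import itertools

# Hypothetical priority levels; lower numbers are sent first.
PRIORITY_ROUTER_FAILURE = 0
PRIORITY_NORMAL = 10


class PriorityOutbox:
    """Outgoing message queue that always releases the most urgent message first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order within a priority

    def put(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._order), message))

    def get(self):
        _priority, _seq, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


# Usage: a failure notice queued between two ordinary messages is sent first.
outbox = PriorityOutbox()
outbox.put(PRIORITY_NORMAL, "route update A")
outbox.put(PRIORITY_ROUTER_FAILURE, "router 200 failure")
outbox.put(PRIORITY_NORMAL, "route update B")
drained = [outbox.get() for _ in range(len(outbox))]
```

The sequence counter is a design choice of this sketch: it preserves arrival order among messages of equal priority, which a bare heap does not guarantee.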

In addition, the message broker 400 may transmit to the controller 100 a message including information such as a session ID, a boot count, and a boot time. Thus, even if the controller 100 failed to receive earlier information about a router failure or restart, the controller 100 can learn when and how many times the router has restarted due to failure.

Finally, the message broker 400 may change the restarted router state into a fail state (S950).
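Steps S910 to S950 can be put together in a short sketch. Everything named here (`MessageBroker`, `on_restart_published`, the `deliver` callback, the state strings) is an assumption made for illustration, and priority queueing is deliberately left out so that the logging, subscriber lookup, and state change stay visible.

```python
class MessageBroker:
    """Toy broker that follows the S910 to S950 sequence for a restart publication."""

    def __init__(self, deliver):
        self.message_log = []    # S910: record of received publications
        self.subscriptions = {}  # router id -> set of subscribed controller ids
        self.router_state = {}   # router id -> "normal" or "fail"
        self._deliver = deliver  # callable(controller_id, event); assumed transport

    def subscribe(self, controller_id, router_id):
        self.subscriptions.setdefault(router_id, set()).add(controller_id)
        self.router_state.setdefault(router_id, "normal")

    def on_restart_published(self, router_id, event):
        self.message_log.append((router_id, event))                  # S910
        subscribers = sorted(self.subscriptions.get(router_id, ()))  # S920
        for controller_id in subscribers:                            # S930/S940, queueing omitted
            self._deliver(controller_id, event)
        self.router_state[router_id] = "fail"                        # S950
        return subscribers


# Usage: one controller subscribed to one router.
delivered = []
broker = MessageBroker(lambda c, e: delivered.append((c, e)))
broker.subscribe("controller-100", "router-200")
event = {"session_id": "s-1", "boot_count": 4, "boot_time": 1700000000}
notified = broker.on_restart_published("router-200", event)
```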

Referring to FIG. 10, unlike the embodiment of FIG. 8, the controller 100 and the router 200 exchange information directly, without the message broker 400 relaying messages between them. A restart caused by a router failure can be handled in this way.

Specifically, the method for handling an unexpected failure of a network device without the message broker 400 according to an embodiment of the present invention may include a subscription/publication registration step (S1010), an authentication/authorization (Authenticate/Authorize) step (S1020), a router failure publication step (S1030), and a router failure subscription step (S1040). Here, each step of FIG. 10 may be understood to correspond to the respective step of FIG. 4.

The controller 100 may send a router restart subscription registration request to the router 200 (S1010).

The controller 100 and the router 200 may authenticate each other, and may request and grant rights according to their respective roles (S1020).

The router 200 may publish a router restart event to the controller 100 when the router restarts (S1030).

The controller 100 may change the state of the router 200 to a failure state (S1040).

The failure handling method performed by the network device, as described with reference to FIGS. 8 to 10, may be summarized as follows.

The network device may restart as it recovers from a failure. If the restart results from a failure of the network device, information about the restart may be transmitted to the controller 100. For example, the network device may use the restart information to inform the controller 100 that the failure was unexpected. In addition, the network device may notify the controller 100 of its failure based on the number of restarts included in the restart information.

In addition, the network device may search a storage that stores a list of controllers 100 for the controller 100 associated with the network device, and may transmit the information about the restart to the discovered controller 100.
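As a rough sketch of the restart information a device could assemble after recovering, the function below uses the three fields the description mentions (session ID, boot count, boot time). The function name, the dictionary layout, and the use of a plain dict as a stand-in for non-volatile storage are assumptions; the source does not say where the count is kept or how the session ID is generated.

```python
import time
import uuid


def build_restart_info(persistent, now=None):
    """Increment the stored boot count and return restart information.

    `persistent` stands in for storage that survives a crash (an assumption).
    """
    persistent["boot_count"] = persistent.get("boot_count", 0) + 1
    return {
        "session_id": uuid.uuid4().hex,  # assumed: a fresh session after every restart
        "boot_count": persistent["boot_count"],
        "boot_time": time.time() if now is None else now,
    }


# Usage: the device had booted three times before the crash.
nvram = {"boot_count": 3}
info = build_restart_info(nvram, now=1700000000.0)
```

Because the count is incremented on every boot, a controller that compares it with the last value it saw can tell that restarts happened even if individual notifications were lost.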

Meanwhile, a failure processing method performed by the controller 100 will be described with reference to FIGS. 5 to 10.

The controller 100 may receive failure information on the network device from the network device, and determine the type of failure for the network device by using the failure information on the network device to handle the failure for the network device.

Here, the failure information on the network device may include notification information that the network device will be down if the failure of the network device is predicted, and notification information indicating a restart of the network device if the failure of the network device is not predicted.

First, when the failure of the network device is predicted, the controller 100 may identify the failure of the network device by using the notification information that the network device will be down, which includes time information indicating when the network device will be down. Here, the time information may use a time stamp generated by the network device.

When the failure of the network device is not predicted, the controller 100 may determine the failure of the network device by calculating the restart count of the network device using the failure information of the network device.
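The controller-side distinction between the two failure types can be sketched as below. The report layouts and field names (`type`, `down_at`, `boot_count`) are hypothetical, and computing the restart count as a difference from the last boot count the controller saw is one plausible reading of the description, not something the source spells out.

```python
def classify_failure(info, last_boot_count):
    """Return (failure_type, detail) for a failure report from a network device.

    Assumed report shapes (hypothetical field names):
      {"type": "down", "down_at": <device time stamp>}  -> predicted failure
      {"type": "restart", "boot_count": <int>}          -> unpredicted failure
    """
    if info["type"] == "down":
        return "graceful", {"down_at": info["down_at"]}
    if info["type"] == "restart":
        restarts = info["boot_count"] - last_boot_count
        return "crash", {"restarts_since_last_report": restarts}
    raise ValueError(f"unknown failure information: {info!r}")


# Usage: one predicted failure and one unexpected restart.
predicted = classify_failure({"type": "down", "down_at": 1030.0}, last_boot_count=3)
crashed = classify_failure({"type": "restart", "boot_count": 5}, last_boot_count=3)
```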

After identifying the failed network device, the controller 100 may record a message to be sent to the failed network device in a log and suspend transmission.

According to the present invention, a handling mechanism is defined for each type of router failure, namely graceful failure and crash, so that all associated controllers can quickly learn the failure information of the router.

In addition, an urgent message such as a router failure notice can be transmitted preferentially, without delay or loss, according to a QoS-based message priority.

In addition, after the router has rebooted normally, the pending messages can be retransmitted in a batch so that the controller and the router are brought back into synchronization asynchronously, or the pending messages can be canceled.
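The log-and-suspend behavior of the controller, together with the batch retransmission or cancellation after reboot, might be sketched as follows. The class and method names are hypothetical, and the `send` callback stands in for whatever transport the controller actually uses.

```python
class PendingMessages:
    """Holds back messages for failed routers and releases them after reboot."""

    def __init__(self, send):
        self._send = send      # callable(router_id, message); assumed transport
        self._failed = set()   # routers currently marked as failed
        self._pending = {}     # router id -> messages held back, in order

    def mark_failed(self, router_id):
        self._failed.add(router_id)

    def submit(self, router_id, message):
        """Send immediately, or log the message if the router is marked failed."""
        if router_id in self._failed:
            self._pending.setdefault(router_id, []).append(message)
            return False
        self._send(router_id, message)
        return True

    def on_rebooted(self, router_id, retransmit=True):
        """Clear the failed state, then resend or cancel the held messages."""
        self._failed.discard(router_id)
        held = self._pending.pop(router_id, [])
        if retransmit:
            for message in held:
                self._send(router_id, message)
        return held


# Usage: two messages are held during the failure and resent in order afterwards.
wire = []
pending = PendingMessages(lambda r, m: wire.append((r, m)))
pending.mark_failed("router-200")
pending.submit("router-200", "set-route 10.0.0.0/8")
pending.submit("router-200", "set-route 10.1.0.0/16")
before_reboot = list(wire)
pending.on_rebooted("router-200")
```

Calling `on_rebooted(router_id, retransmit=False)` corresponds to the cancellation option: the held messages are dropped and returned to the caller without being sent.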

Although the present invention has been described above with reference to preferred embodiments, those skilled in the art will understand that the invention may be variously modified and changed without departing from the spirit and scope of the invention set forth in the claims below.

Claims

1. A failure handling method performed in a network device connected to at least one controller, the method comprising: predicting a failure of the network device; and, when the failure of the network device is predicted, notifying the at least one controller that the network device will be down.

2. The method according to claim 1, wherein the notifying comprises, when the failure of the network device is predicted, notifying the at least one controller that the network device will be down, including time information indicating when the network device will be down.

3. The method according to claim 2, wherein the time information indicating when the network device will be down uses a time stamp generated by the network device.

4. The method according to claim 1, wherein the notifying that the network device will be down comprises: searching a storage unit storing a list of the at least one controller for a controller associated with the network device; and transmitting a message indicating that the network device will be down to the discovered controller.

5. The method according to claim 1, wherein a message broker relays message exchange between the at least one controller and the network device.
6. A failure handling method performed in a network device connected to at least one controller, the method comprising: restarting the network device after a failure occurs in the network device; and transmitting information about the restart to the at least one controller to notify the at least one controller that the failure has occurred.

7. The method according to claim 6, wherein the transmitting of the information about the restart to the at least one controller comprises using the information about the restart to notify the at least one controller that an unexpected failure has occurred in the network device.

8. The method according to claim 6, wherein the information about the restart includes information on the number of restarts of the network device, by which the at least one controller is notified of the failure of the network device.

9. The method according to claim 6, wherein the transmitting of the information about the restart to the at least one controller comprises: searching a storage unit storing a list of the at least one controller for a controller associated with the network device; and transmitting the information about the restart to the discovered controller.

10. The method according to claim 6, wherein a message broker relays message exchange between the at least one controller and the network device.
11. A failure handling method performed in a controller connected to at least one network device, the method comprising: receiving, from the network device, information distinguished according to the type of failure occurring in the network device; and handling the failure according to the information distinguished according to the failure type.

12. The method according to claim 11, wherein the information distinguished according to the failure type comprises: notification information that the network device will be down, when the failure of the network device is predicted; and notification information indicating a restart of the network device, when the failure of the network device is not predicted.

13. The method according to claim 11, wherein the receiving of the information distinguished according to the type of failure occurring in the network device comprises, when the failure of the network device is predicted, receiving notification information including time information indicating when the network device will be down.

14. The method according to claim 13, wherein the time information indicating when the network device will be down uses a time stamp generated by the network device.

15. The method according to claim 11, wherein the receiving of the information distinguished according to the type of failure occurring in the network device comprises receiving a restart count of the network device when the failure of the network device is not predicted.

16. The method according to claim 11, wherein the handling of the failure of the network device comprises recording in a log a message to be sent to the failed network device and suspending its transmission.

17. The method according to claim 11, wherein a message broker relays message exchange between the at least one controller and the network device.