US20070174695A1 - Log-based rollback-recovery - Google Patents

Log-based rollback-recovery

Info

Publication number
US20070174695A1
US20070174695A1 (U.S. application Ser. No. 11/424,350)
Authority
US
United States
Prior art keywords
component
states
series
processes
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/424,350
Inventor
Srinidhi Varadarajan
Joseph Ruscio
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Librato Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US11/424,350 priority Critical patent/US20070174695A1/en
Application filed by Individual filed Critical Individual
Assigned to EVERGRID, INC. reassignment EVERGRID, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: VARADARAJAN, SRINIDHI, RUSCIO, JOSEPH F.
Publication of US20070174695A1 publication Critical patent/US20070174695A1/en
Assigned to TRIPLEPOINT CAPITAL LLC reassignment TRIPLEPOINT CAPITAL LLC SECURITY AGREEMENT Assignors: EVERGRID, INC.
Assigned to LIBRATO, INC. reassignment LIBRATO, INC. CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: EVERGRID, INC., CALIFORNIA DIGITAL CORPORATION
Assigned to EVERGRID, INC. reassignment EVERGRID, INC. CORRECTIVE ASSIGNMENT TO CORRECT THE RE-RECORDING TO REMOVE INCORRECT APPLICATIONS. PLEASE REMOVE 12/420,015; 7,536,591 AND PCT US04/38853 FROM PROPERTY LIST. PREVIOUSLY RECORDED ON REEL 023538 FRAME 0248. ASSIGNOR(S) HEREBY CONFIRMS THE CHANGE OF NAME SHOULD BE - ASSIGNOR: CALIFORNIA DIGITAL CORPORATION; ASSIGNEE: EVERGRID, INC.. Assignors: CALIFORNIA DIGITAL CORPORATION
Assigned to LIBRATO, INC. reassignment LIBRATO, INC. CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: EVERGRID, INC.
Priority to US12/894,877 priority patent/US8631276B2/en
Priority to US14/152,806 priority patent/US10310947B1/en
Priority to US16/430,934 priority patent/US11093345B1/en
Priority to US17/394,381 priority patent/US11847029B1/en
Legal status: Abandoned (current)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1415Saving, restoring, recovering or retrying at system level
    • G06F11/1438Restarting or rejuvenating

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Retry When Errors Occur (AREA)
  • Hardware Redundancy (AREA)

Abstract

Log-Based Rollback Recovery for system failures. The system includes a storage medium, and a component configured to transition through a series of states. The component is further configured to record in the storage medium the state of the component every time the component communicates with another component in the system, the system being configured to recover the most recent state recorded in the storage medium following a failure of the component.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is related to and claims the benefit of the filing date of U.S. provisional application Ser. No. 60/760,026, filed on Jan. 18, 2006, entitled “Method for Enabling Log-Based Rollback-Recovery of Multiple Flows of Control with Shared State,” which is hereby incorporated by reference.
  • BACKGROUND
  • 1. Field
  • The present disclosure relates generally to distributed systems, and more particularly, to systems and techniques for recovering from system failures in distributed systems.
  • 2. Background
  • Computers and other modern processing systems have revolutionized the electronics industry by enabling complex tasks to be performed with just a few strokes of a keypad. These processing systems have evolved from simple self-contained computing devices, such as the calculator, to highly sophisticated distributed systems. Today, almost every aspect of our daily lives involves, in some way, distributed systems. In its simplest form, a distributed system may be thought of as an individual computer capable of supporting two or more simultaneous processes, or a single process with multiple threads. On a larger scale, a distributed system may comprise a network with a mainframe that allows hundreds, or even thousands, of computers to share software applications. Distributed systems are also being used today to replace traditional supercomputers, with any number of computers, servers, processors, or other components being connected together to perform specialized applications that require immense amounts of computation. The Internet is another example of a distributed system, with a host of Internet servers providing the World Wide Web.
  • As we become more dependent upon distributed systems in our daily lives, it becomes increasingly important to guard against system failures. A system failure can be at the very least frustrating, but in other circumstances could lead to catastrophic results. For the individual computer, a system failure can result in the loss of work product and the inconvenience of having to reboot the computer. In larger systems, system failures can be devastating to the business operations of a company or the personal affairs of a consumer.
  • There are a number of system recovery techniques that are employed today to minimize the impact of system failures. One such technique is known as “rollback recovery.” The basic idea behind rollback recovery is to model the operation of a system as a series of states, and when an error occurs, to roll back the system to a previous error-free state and resume operation. One technique for implementing rollback recovery is commonly referred to as Checkpoint-Based Rollback Recovery. Using this technique, the system saves in a stable database some of the states it reaches during operation as “checkpoints,” and when an error occurs, the system is restored to a previous error-free state from the checkpoints.
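To make the checkpoint/rollback cycle concrete, the following is a minimal Python sketch, not taken from the patent; names such as CheckpointStore and run_with_rollback are illustrative assumptions. The component records its state to a stable store as checkpoints and, when a step fails, restores the most recent error-free checkpoint and resumes.

```python
import pickle

class CheckpointStore:
    """Stands in for the 'stable database' that holds checkpoints (illustrative)."""
    def __init__(self):
        self._checkpoints = []

    def save(self, state):
        # A real store would write to stable storage that survives failures.
        self._checkpoints.append(pickle.dumps(state))

    def latest(self):
        # Most recent error-free checkpoint.
        return pickle.loads(self._checkpoints[-1])

def run_with_rollback(state, steps, store):
    """Run each step and checkpoint the result; on error, roll back and retry once."""
    store.save(state)                      # initial checkpoint
    for step in steps:
        try:
            state = step(state)
        except Exception:
            state = store.latest()         # roll back to a previous error-free state
            state = step(state)            # resume operation from that state
        store.save(state)                  # record the new state as a checkpoint
    return state
```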
  • Log-Based Rollback Recovery is another technique that builds on the concept of Checkpoint-Based Rollback Recovery. In addition to checkpoints, this technique also uses information about non-deterministic events that occur between successive checkpoints. A non-deterministic event is generally an input to the system whose timing and content are unknown by the system prior to receipt. However, for a given input and a given state in which the system receives this input, the execution of the system until the reception of the next input is deterministic. As a result, the execution of the system can be modeled as a sequence of deterministic state intervals, each initiated by a non-deterministic event. This follows the “piecewise deterministic” (PWD) assumption, which postulates that all non-deterministic events that cause state transitions in the system can be recorded as determinants. When this assumption holds true, system recovery may be achieved by restoring the system to a previous error-free state based on the checkpoints, and then replaying the recorded determinants to restore the system to the state that existed just prior to the error.
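The sketch below, again an assumption rather than the patent's implementation, adds the log-based element: inputs received since the last checkpoint are recorded as determinants, and recovery restores the checkpoint and then replays those inputs, relying on the PWD assumption that execution between inputs is deterministic.

```python
import pickle

class DeterminantLog:
    """Illustrative log-based rollback-recovery bookkeeping for one component."""
    def __init__(self):
        self._checkpoint = None
        self._determinants = []        # non-deterministic inputs since the last checkpoint

    def take_checkpoint(self, state):
        self._checkpoint = pickle.dumps(state)
        self._determinants.clear()

    def record_input(self, event):
        # Each recorded input (determinant) starts a new deterministic state interval.
        self._determinants.append(event)

    def recover(self, apply_input):
        # Restore the last checkpoint, then replay the logged determinants; under
        # the PWD assumption this reproduces the state just prior to the error.
        state = pickle.loads(self._checkpoint)
        for event in self._determinants:
            state = apply_input(state, event)
        return state
```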
  • Unfortunately, current Log-Based Rollback-Recovery techniques have no mechanism to deal with certain types of non-determinism inherent in systems capable of handling multiple processes, or a single process with multiple threads, that share a common state (i.e., address space). As an example, consider a distributed system on the Internet in which two computers conducting an e-commerce transaction with a server compete to purchase the same item. In this example, a scheduling entity within the server will determine which computer is granted access first and, hence, is able to consummate the transaction. However, should a system failure occur and the server be rolled back to a previous error-free state that existed prior to the transaction, there is no guarantee that the same computer will be granted access to the server before the other without extremely invasive modifications to the operating system and/or applications. This can be especially problematic when the system fails after the server confirms the original transaction.
  • SUMMARY
  • In one aspect of the present invention, a system includes a storage medium, a component configured to transition through a series of states and record in the storage medium the state of the component every time the component communicates with another component in the system, and a recovery manager configured to recover the most recent state of the component recorded in the storage medium following a failure of the component.
  • In another aspect of the present invention, computer-readable media contains a set of program instructions executable by hardware in a component of a system while the component is transitioning through a series of states. The instructions include a routine to record in a storage medium the state of the component every time the component communicates with another component in the system.
  • In yet another aspect of the present invention, a method of checkpointing a component in a system while the component is transitioning through a series of states includes recording in a storage medium the state of the component every time the component communicates with another component in the system, and recovering the most recent state recorded in the storage medium following a failure of the component.
  • In a further aspect of the present invention, a component configured to operate in a system includes means for transitioning through a series of states, and means for recording in a storage medium the state of the component every time the component communicates with another component in the system.
  • In yet a further aspect of the present invention, a processing node configured to operate in a system includes a processor configured to transition through a series of states, the processor having a checkpoint library configured to record in a storage medium the state of the processor every time the processor communicates with another component of the system.
  • It is understood that other embodiments of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein it is shown and described only various embodiments of the invention by way of illustration. As will be realized, the invention is capable of other and different embodiments and its several details are capable of modification in various other respects, all without departing from the spirit and scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Various aspects of a communications system are illustrated by way of example, and not by way of limitation, in the accompanying drawings, wherein:
  • FIG. 1 is a conceptual block diagram illustrating an example of a distributed system;
  • FIG. 2 is a block diagram illustrating an example of a hardware configuration for a processing node in a distributed system;
  • FIG. 3 is a conceptual block diagram illustrating an example of the communications layering for a processing node in a distributed system;
  • FIG. 4 is a conceptual block diagram illustrating another example of the communications layering for a processing node in a distributed system; and
  • FIG. 5 is a conceptual block diagram illustrating yet another example of the communications layering for a processing node in a distributed system.
  • DETAILED DESCRIPTION
  • The detailed description set forth below in connection with the appended drawings is intended as a description of various embodiments of the invention and is not intended to represent the only embodiments in which the invention may be practiced. The detailed description includes specific details for the purpose of providing a thorough understanding of the invention. However, it will be apparent to those skilled in the art that the invention may be practiced without these specific details. In some instances, well known structures and components are shown in block diagram form in order to avoid obscuring the concepts of the invention.
  • The various techniques described throughout this disclosure may be applied to the constituent components of a distributed system to recover from a system failure, even in the presence of non-deterministic events that are too difficult or expensive to record. According to the PWD assumption, these non-deterministic events must be captured as determinants so the precise set of deterministic state intervals may be recreated. However, the following observation can also be made. The set of deterministic state intervals that occur in a component between any two interactions with the rest of the system appear to all other components in the system as a single deterministic interval. In other words, any non-determinism that occurs internal to one component does not affect any other component in the system until the one component communicates with the system. This communication would commonly take the form of a message passed between the one component and another component in the system, but could also be a modification to a shared file or some other Inter-Process Communication (IPC) mechanism. Thus, a globally consistent state can be reestablished after system failure without replaying the non-deterministic events internal to a component as long as a checkpoint is taken with any communication by the component with the rest of the system. Although the recovered state of the system may not be one that existed prior to the occurrence of the error, it is sufficient if the recovered state could have occurred in the system execution prior to the error.
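A hedged sketch of that observation, with the transport, store, and capture_state objects assumed for illustration rather than defined by the patent: the component checkpoints whenever it communicates with the rest of the system, so internal non-determinism between communications never has to be logged or replayed.

```python
def send_with_checkpoint(store, capture_state, transport, message):
    """Take a checkpoint just before the component communicates outward."""
    # Internal non-determinism (thread scheduling, etc.) since the last
    # interaction is invisible to other components, so capturing the state at
    # this point is enough for a globally consistent recovery.
    store.save(capture_state())
    transport.send(message)             # message, shared-file write, or other IPC

def receive_with_checkpoint(store, capture_state, transport):
    """Take a checkpoint once an external (non-deterministic) input arrives."""
    message = transport.receive()
    store.save(capture_state())         # checkpoint taken with the communication
    return message
```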
  • The distributed system of FIG. 1 will be used to illustrate this concept. The distributed system 100 has a group of processing nodes 102 connected through a network 106. The network 106 may be a packet-based network, such as the Internet or a corporate Intranet, or any other type of suitable network. The group of processing nodes 102 may be any combination of desktop computers, laptop computers, client workstations, server-enabled computers, dedicated servers, mainframes, or other processing nodes.
  • A storage medium 108 is shown connected to the network 106. The storage medium 108 provides a stable database for each processing node 102 to record its current state every time a checkpoint is taken. When a processing node 102 fails, a recovery manager 110 may be used to load the state of the failed processing node 102 that existed when the last checkpoint was taken into a spare processing node 102. Alternatively, the recovery manager 110 may roll back the failed processing node 102 to that last checkpoint state and resume operation. Although the storage medium 108 and the recovery manager 110 are shown as separate entities on the network 106, those skilled in the art will readily appreciate that the storage medium 108 and recovery manager 110 may be integrated into a processing node 102 or other entity on the network 106, or distributed across multiple processing nodes 102 and/or other entities.
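A rough sketch of the recovery manager's role; latest_for, load_state, and resume are assumed helper methods, not part of the patent. On a failure, the failed node's last recorded state is fetched from the stable store and loaded into a spare node, which then resumes the computation.

```python
class RecoveryManager:
    """Illustrative counterpart of recovery manager 110 in FIG. 1."""
    def __init__(self, store, spare_nodes):
        self.store = store               # stable storage medium (cf. element 108)
        self.spare_nodes = spare_nodes   # pool of idle processing nodes

    def recover(self, failed_node_id):
        # Fetch the failed node's most recent checkpoint from stable storage.
        state = self.store.latest_for(failed_node_id)
        # Load it into a spare node; alternatively, the failed node itself
        # could be rolled back to this state and restarted.
        spare = self.spare_nodes.pop()
        spare.load_state(state)
        spare.resume()
        return spare
```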
  • A conceptual block diagram of a processing node is shown in FIG. 2. The processing node 102 includes a processor 202 implemented with one or more processing entities. In one embodiment, the processor 202 includes a general purpose processor, such as a microprocessor, capable of supporting multiple software programs, including an operating system, user applications, and software libraries. The processor 202 may also include memory, which provides a temporary storage medium for the software programs used by the processor 202. The memory may be implemented with RAM, SRAM, SDRAM, or any other high speed volatile memory.
  • The processor 202 is shown connected to the network through a transceiver 204. The transceiver 204 may be capable of supporting any number of connections to the network, including Ethernet, T1, wireless, cable modem, DSL, fiber optic, or the like.
  • The processing node 102 may also include computer-readable media 206 that provides a permanent storage medium for the software programs. The computer-readable media 206 may be implemented with a magnetic hard drive, DVD, CD, CD-ROM, tape backup, reel-to-reel, and/or any other inexpensive permanent memory capable of storing large amounts of data and software programs. Those skilled in the art will recognize that the term “computer-readable media” includes any type of storage device that is accessible by the processor 202 and also encompasses a carrier wave that encodes a data signal.
  • The manner in which each processing node 102 is implemented will depend on the particular application and the design constraints imposed on the overall system. Those skilled in the art will recognize the interchangeability of hardware, firmware, and software configurations under these circumstances, and how best to implement the described functionality for each particular application.
  • FIG. 3 is a conceptual diagram illustrating the layered architecture of the processing node. The processing node includes hardware 302 that supports the operating system 304 or other application execution environment. The operating system is shown running a user, or distributed, application 308 that supports a distributed computation in the distributed system. A checkpoint library 306 is transparently interposed above the operating system 304 and below the distributed application 308 so that all checkpoint functions are processed through the checkpoint library 306. The checkpoint library is responsible for taking a checkpoint every time the processing node 102 communicates with the rest of the system over the network. The checkpoint is taken by recording the current state of the processing node 102 in a stable database (not shown) outside the processing node 102.
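One way to approximate this transparent interposition in Python, offered purely as an assumption since the patent does not prescribe a mechanism, is to wrap the socket send primitive so that every outbound network message first records the application's state; the application code itself is unchanged.

```python
import socket

def interpose_checkpointing(store, capture_state):
    """Wrap socket sends so a checkpoint is taken on every outbound communication."""
    original_send = socket.socket.send

    def checkpointing_send(self, data, *args, **kwargs):
        store.save(capture_state())      # checkpoint before the message leaves the node
        return original_send(self, data, *args, **kwargs)

    # The wrapper sits below the application and above the OS socket layer,
    # analogous to the checkpoint library 306 in FIG. 3.
    socket.socket.send = checkpointing_send
```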
  • Returning to FIG. 1, the individual processing nodes 102 are constituent components of the distributed system 100. A globally consistent state can be reestablished after a processing node 102 fails without replaying the non-deterministic events internal to that processing node 102 as long as a checkpoint is taken every time the processing node 102 communicates with another processing node. When a processing node 102 fails, the state of the failed node 102 when the last checkpoint was taken can be recovered from the stable database and loaded into a spare processing node 102 on the network 106. The distributed computation can then continue. It does not matter whether the recovered state of the distributed system is one that actually existed prior to the occurrence of the error; it is sufficient if the recovered state could have occurred in the system execution prior to the error.
  • An example will now be described with reference to a processing node 102 configured as a server that is capable of supporting e-commerce transactions with other processing nodes. In this example, referring to FIG. 3, the processing node 102 receives requests over the network from two different processing nodes, or computers, attempting to purchase the same item. Once the requests are received, the checkpoint library 306 takes a checkpoint by recording the current state of the distributed application 308 to a stable database external to the processing node 102. The two requests are processed in parallel by separate threads of the distributed application 308. Each thread attempts to access the memory (not shown) to retrieve a state variable j relating to the item. In this example, j=1 if the item is still available, and j=0 if the item has been sold. The operating system 304 uses a scheduling algorithm to determine the order in which the two threads will have access to the state variable j. The first thread granted access by the operating system 304 will load the state variable j into a processor register (not shown), confirm that the item is still available (i.e., j=1), complete the transaction, and decrement the state variable j (i.e., set j=0) before writing it back to the memory. Once the transaction is complete, the checkpoint library 306 takes a checkpoint by recording the current state of the processing node 102 to the stable database. The state of the first thread includes the state variable j=1. The processing node 102 then sends a confirmation over the network to the computer requesting the transaction.
  • Next, the operating system 304 grants the second thread access to the state variable j in the memory. The second thread processes the state variable in the same way, but this time it will not be able to consummate the transaction because the item is no longer available (i.e., the state variable j=0). In this case, the processing node 102 will send a message over the network back to the requesting computer indicating that the item is unavailable.
  • Should the processing node 102 fail while the second thread is processing the request, the state of the processing node 102 when the last checkpoint was taken can be recovered from the stable database and loaded into a spare processing node on the network. In this case, the spare processing node is loaded with the state of the processing node 102 that existed just prior to the processing node 102 sending the confirmation over the network to the computer requesting the item. Once the spare processing node is loaded with this state, the second thread begins processing its request to purchase the item by loading the state variable j from its memory to a processor register. Since the state variable j recovered from the memory is zero, the request to purchase the item will be denied, thereby resulting in a globally consistent state (i.e., the item was not sold to both consumers).
  • A globally consistent state can be achieved even if the processing node 102 fails while the first thread is processing the request. Under this scenario, the spare processing node is loaded with the state of the processing node 102 immediately after the two requests to purchase the item were received, i.e., the state of the processing node 102 when the last checkpoint was taken. When the spare processing node resumes the transaction, it is possible that the second thread will be granted access to the state variable j before the first thread. If this occurs, then the item will be sold to the consumer whose request is being processed by the second thread. Although this result is different than the one that would have occurred had the processing node not failed, it is still a globally consistent state because the item is sold to only one consumer. The consumer whose request was being processed by the first thread does not receive an inconsistent message because the processing node 102 failed before he or she received a confirmation.
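The two-thread scenario can be modeled with the simplified sketch below; the names and the store/send_reply callables are assumptions, not the patent's code. The operating system's scheduler decides which thread reaches the shared state variable j first, and a checkpoint is recorded before any confirmation is sent, so a recovered node can never contradict a confirmation a consumer has already received.

```python
import threading

inventory_lock = threading.Lock()
inventory = {"j": 1}                # j = 1: item available, j = 0: item sold

def handle_purchase(request_id, store, send_reply):
    # The OS scheduler (non-deterministically) decides which thread enters first.
    with inventory_lock:
        if inventory["j"] == 1:
            inventory["j"] = 0                   # consummate the transaction
            store.save(dict(inventory))          # checkpoint before confirming
            send_reply(request_id, "confirmed")
        else:
            send_reply(request_id, "item unavailable")

# Two requests for the same item arrive and are handled by separate threads:
# threading.Thread(target=handle_purchase, args=(1, store, reply)).start()
# threading.Thread(target=handle_purchase, args=(2, store, reply)).start()
```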
  • The same techniques just described can be extended to a processing node with a processor having two sub-processing entities as represented in FIG. 4. In this example, the processing node 102 is the distributed system and the sub-processing entities 202 a-202 c are the constituent components. The two requests to purchase the item are processed by different sub-processing entities 202 a, 202 b. A distributed application attempts to access the memory (not shown) to retrieve the state variable j. Since each distributed application 308 a, 308 b is running on separate hardware 302 a, 302 b, respectively, and they share memory (not shown), a semaphore is likely to be used to manage access to the state variable j. A semaphore is a hardware or software flag, residing in the memory, which indicates the accessibility of the state variable j. A distributed application requiring access to the state variable j will read the semaphore to determine whether the state variable j is available. If the semaphore indicates that the state variable j is available, then the distributed application will set the semaphore to indicate that the memory space occupied by the state variable j is locked, thus preventing other applications from accessing the state variable.
  • In the event the distributed application 308 a is able to access the state variable j, the request processed by this distributed application 308 a will be successful. As explained earlier, the state variable j will be loaded into a processor register (not shown) in the hardware 302 a and the transaction consummated because the state variable j=1. Once the transaction is completed, the state variable j will be decremented (i.e., the state variable j=0) and written back to the memory. The checkpoint library 306 a will take a checkpoint by recording the current state of the sub-processing entity 202 a to non-volatile memory (not shown) in the processing node 102. The distributed application 308 a will then send the confirmation to the computer making the request, and clear the semaphore to unlock the memory space containing the state variable j. All other applications, including the distributed application 308 b, will be prohibited from accessing the state variable j while the semaphore is set.
  • Should the sub-processing entity 202 a fail before the distributed application 308 a confirms the transaction, a spare sub-processing entity 202 c may be loaded with the state of the failed sub-processing entity 202 a that existed just after the request to purchase the item was received (i.e., the state of the failed sub-processing entity 202 a when the last checkpoint was taken). In this state, the semaphore is not set, and therefore, the distributed applications 308 b, 308 c may again compete for access to the semaphore in the memory. The result may or may not be the same as the pre-failure state, but whatever the result, the processing node 102 will obtain a globally consistent state because the consumer whose request was being processed by the distributed application 308 a in the failed sub-processing entity 202 a did not receive a confirmation that the transaction was successful.
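A rough multiprocessing analogue of the FIG. 4 arrangement, again an illustrative assumption rather than code from the patent: two sub-processing entities share the state variable j through shared memory guarded by a semaphore, and the holder checkpoints before its confirmation is sent and the semaphore is cleared.

```python
import multiprocessing as mp

def purchase(j, semaphore, checkpoint, reply):
    with semaphore:                  # lock the memory space holding j
        if j.value == 1:
            j.value = 0              # item sold; write the new value back
            checkpoint(j.value)      # checkpoint before the confirmation is sent
            reply("confirmed")
        else:
            reply("item unavailable")
    # Leaving the 'with' block clears the semaphore, unlocking j for other applications.

if __name__ == "__main__":
    j = mp.Value("i", 1)             # shared state variable j
    semaphore = mp.Semaphore(1)
    workers = [mp.Process(target=purchase, args=(j, semaphore, print, print))
               for _ in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```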
  • Another example will be provided where the processing node is the distributed system and the distributed applications are the constituent components. Referring to FIG. 5, a globally consistent state can be reestablished after a distributed application fails without replaying the non-deterministic events internal to the distributed application as long as checkpoints are taken with every communication between the distributed application and the rest of the system.
  • In this example, the processing node 102 is executing first, second, and third distributed applications 308 a-308 c. The third distributed application 308 c has first and second threads of execution, 508 c x and 508 c y, which share an index variable j that may be stored in a general register (not shown). In response to a query by the first distributed application 308 a to the first thread 308 c x, the first thread 308 c x will increment the variable j and send the resulting value back to the first distributed application 308 a. In a similar manner, a query by the second distributed application 308 b to the second thread 308 c y causes the second thread 308 c y to increment the variable j and send the resulting value back to the second distributed application 308 b.
  • During execution, with j=0, it is possible that the first distributed application 308 a may query the first thread 308 c x at the same time the second distributed application 308 b queries the second thread 308 c y. Once these queries are received, the checkpoint library 306 will take a checkpoint by recording the state of the third distributed application 308 c in non-volatile memory (not shown). A scheduling entity in the operating system 304 may be used to determine which thread enters the synchronization primitive first. Assuming that it is the first thread 308 c x, then the first distributed application 308 a will receive j=1 and the second distributed application 308 b will receive j=2. The checkpoint library 306 will take a checkpoint every time the third distributed application 308 c outputs the state variable j to either the first or second distributed application 308 a, 308 b, respectively.
  • Should the third distributed application 308 c fail, the last checkpoint can be recovered from the non-volatile memory and used to roll back the third distributed application 308 c to an error-free state. By way of example, if the third distributed application 308 c fails before the state variables are sent to the first and second distributed applications 308 a, 308 b, respectively, then the third distributed application 308 c will be rolled back to a state that existed just after receiving the queries from the first and second distributed applications 308 a, 308 b, respectively. When the distributed application 308 c resumes operation from the last checkpoint, the scheduling entity in the operating system 304 may allow the second distributed application 308 b to enter the synchronization primitive first. If this occurs, then the first distributed application 308 a will receive j=2 and the second distributed application 308 b will receive j=1. Although the result is different than the one that would have occurred had the third distributed application 308 c not failed, it is still a globally consistent state because the values of the variable j received by the first and second distributed applications 308 a, 308 b, respectively, are not inconsistent with any communication received from the third distributed application 308 c prior to the failure.
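The shared-index example can be sketched as follows, with thread, store, and send names assumed for illustration: both threads of the third application increment a shared index j inside a synchronization primitive, and a checkpoint is taken every time the resulting value is output to another application.

```python
import threading

j = 0
j_lock = threading.Lock()            # the synchronization primitive the threads enter

def answer_query(requester, store, send):
    """One thread of the third application servicing a query for the index j."""
    global j
    with j_lock:                     # the scheduler decides which query enters first
        j += 1
        value = j
    store.save({"j": value})         # checkpoint every time j is output
    send(requester, value)           # one caller receives 1, the other 2, in either order
```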
  • The various techniques described throughout this disclosure provide an innovative way to integrate checkpoints with Log-Based Rollback-Recovery systems in such a manner that the PWD assumption can be relaxed so as only to require the recording of non-deterministic events that originate somewhere external to a component. These techniques allow the user to determine the set of non-deterministic events that are to be recorded and replayed as determinants, and ignore the rest. A checkpoint is taken with any communication between the component and the rest of the system, and therefore, all non-determinism that could affect the rest of the system is captured.
  • The previous description is provided to enable any person skilled in the art to practice the various embodiments described herein. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments. Thus, the claims are not intended to be limited to the embodiments shown herein, but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” All structural and functional equivalents to the elements of the various embodiments described throughout this disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims. No claim element is to be construed under the provisions of 35 U.S.C. § 112, sixth paragraph, unless the element is expressly recited using the phrase “means for” or, in the case of a method claim, the element is recited using the phrase “step for.”

Claims (44)

1. A system, comprising:
a storage medium;
a component configured to transition through a series of states, and record in the storage medium the state of the component every time the component communicates with another component in the system; and
a recovery manager configured to recover the most recent state of the component recorded in the storage medium following a failure of the component.
2. The system of claim 1 wherein the component is further configured to perform a process having multiple threads, the process resulting in the component transitioning through the series of states.
3. The system of claim 2 wherein at least two of the threads share a common state as the component transitions through the series of states.
4. The system of claim 3 wherein the common state comprises an access by said at least two of the threads to a common resource.
5. The system of claim 1 wherein the component is further configured to perform multiple processes in parallel, the processes resulting in the component transitioning through the series of states.
6. The system of claim 5 wherein at least one of the processes comprises multiple threads.
7. The system of claim 5 wherein at least two of the processes share a common state as the component transitions through the series of states.
8. The system of claim 7 wherein the common state comprises an access by said at least two of the processes to a common resource.
9. Computer-readable media containing a set of program instructions executable by hardware in a component of a system while the component is transitioning through a series of states, comprising:
a routine to record in a storage medium the state of the component every time the component communicates with another component in the system.
10. The computer-readable media of claim 9 wherein the component is further configured to execute multiple threads, the execution of the multiple threads resulting in the component transitioning through the series of states.
11. The computer-readable media of claim 10 wherein at least two of the threads share a common state as the component transitions through the series of states.
12. The computer-readable media of claim 11 wherein the common state comprises an access by said at least two of the threads to a common resource.
13. The computer-readable media of claim 9 wherein the component is further configured to perform multiple processes in parallel, the processes resulting in the component transitioning through the series of states.
14. The computer-readable media of claim 13 wherein at least one of the processes comprises multiple threads.
15. The computer-readable media of claim 14 wherein at least two of the processes share a common state as the component transitions through the series of states.
16. The computer-readable media of claim 15 wherein the common state comprises an access by said at least two of the processes to a common resource.
17. The computer-readable media of claim 9 wherein the set of program instructions comprises a checkpoint library accessible to an application running on the hardware, the running of the application resulting in the component transitioning through the series of states.
18. The computer-readable media of claim 17 wherein the hardware supports an operating system, and wherein communications between the application and the operating system flow through the checkpoint library.
19. A method of checkpointing a component in a system while the component is transitioning through a series of states, the method comprising:
recording in a storage medium the state of the component every time the component communicates with another component in the system; and
recovering the most recent state recorded in the storage medium following a failure of the component.
20. The method of claim 19 wherein the component is performing a process having multiple threads, the process resulting in the component transitioning through the series of states.
21. The method of claim 20 wherein at least two of the threads share a common state as the component transitions through the series of states.
22. The method of claim 21 wherein the common state comprises accessing a common resource by said at least two of the threads.
23. The method of claim 19 wherein the component is performing multiple processes in parallel, the processes resulting in the component transitioning through the series of states.
24. The method of claim 23 wherein at least two of the processes share a common state as the component transitions through the series of states.
25. The method of claim 24 wherein the common state comprises accessing a common resource by said at least two of the processes.
26. A component configured to operate in a system, comprising:
means for transitioning through a series of states; and
means for recording in a storage medium the state of the component every time the component communicates with another component in the system.
27. The component of claim 26 wherein the means for transitioning through a series of states comprises a process having multiple threads.
28. The component of claim 27 wherein at least two of the threads share a common state.
29. The component of claim 28 wherein the common state comprises an access by said at least two of the threads to a common resource.
30. The component of claim 26 wherein the means for transitioning through a series of states comprises multiple processes performed in parallel.
31. The component of claim 30 wherein at least one of the processes comprises multiple threads.
32. The component of claim 30 wherein at least two of the processes share a common state as the component transitions through the series of states.
33. The system of claim 32 wherein the common state comprises an access by said at least two of the processes to a common resource.
34. A processing node configured to operate in a system, comprising:
a processor configured to transition through a series of states, the processor having a checkpoint library configured to record in a storage medium the state of the processor every time the processor communicates with another component of the system.
35. The processing node of claim 34 wherein the processor is further configured to perform a process having multiple threads, the process resulting in the processor transitioning through the series of states.
36. The processing node of claim 35 wherein at least two of the threads share a common state as the processor transitions through the series of states.
37. The processing node of claim 36 wherein the common state comprises an access by said at least two of the threads to a common resource.
38. The processing node of claim 34 further comprising a second processor configured to transition through a series of states, the second processor having a second checkpoint library configured to record in the storage medium the state of the processing node every time the second processor communicates with another component of the system.
39. The processing node of claim 38 wherein the processors are configured to perform multiple processes in parallel, the processes resulting in the processors transitioning through the series of states.
40. The processing node of claim 39 wherein at least one of the processes comprises multiple threads.
41. The processing node of claim 39 wherein at least two of the processes share a common state as the processors transition through the series of states.
42. The processing node of claim 41 wherein the common state comprises an access by said at least two of the processes to a common resource.
43. The processing node of claim 34 wherein the processor includes an application that causes the processor to transition through the series of states, the checkpoint library being accessible to the application.
44. The processing node of claim 43 wherein the processor further includes an operating system, and wherein communications between the application and the operating system flow through the checkpoint library.
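Claims 17-18 and 43-44 place the checkpoint library between the application and the operating system, with communications between the two flowing through it. On a Linux/glibc system, one common way to obtain that vantage point is symbol interposition via LD_PRELOAD; the sketch below intercepts write() and notes each call before forwarding it to the real implementation. This is a generic interposition pattern offered as an assumption-laden illustration, not the library described in the patent.

/*
 * Preloaded shim that sits between the application and libc's write():
 * each intercepted call is noted (where a checkpoint library would log
 * or checkpoint) and then forwarded to the real implementation.
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <unistd.h>
#include <stdio.h>

typedef ssize_t (*write_fn)(int, const void *, size_t);

ssize_t write(int fd, const void *buf, size_t count)
{
    static write_fn real_write = NULL;
    if (!real_write) {
        real_write = (write_fn)dlsym(RTLD_NEXT, "write");
        if (!real_write)
            return -1;               /* could not locate the real write() */
    }

    /* A real checkpoint library would record the component's state or
       log the message here, before the communication happens.  The note
       is emitted with real_write to avoid re-entering this wrapper. */
    char note[128];
    int len = snprintf(note, sizeof note,
                       "[checkpoint] intercepted write of %zu bytes to fd %d\n",
                       count, fd);
    if (len > 0)
        real_write(STDERR_FILENO, note, (size_t)len);

    return real_write(fd, buf, count);
}

Built as a shared object (for example: cc -shared -fPIC -o ckptlog.so ckptlog.c -ldl) and run with LD_PRELOAD=./ckptlog.so, the wrapper observes every write() the application issues through libc before the operating system sees it; the file name and build flags are illustrative and Linux-specific.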
US11/424,350 2006-01-18 2006-06-15 Log-based rollback-recovery Abandoned US20070174695A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US11/424,350 US20070174695A1 (en) 2006-01-18 2006-06-15 Log-based rollback-recovery
US12/894,877 US8631276B2 (en) 2006-01-18 2010-09-30 Log-based rollback-recovery
US14/152,806 US10310947B1 (en) 2006-01-18 2014-01-10 Log-based rollback-recovery
US16/430,934 US11093345B1 (en) 2006-01-18 2019-06-04 Log-based rollback-recovery
US17/394,381 US11847029B1 (en) 2006-01-18 2021-08-04 Log-based rollback-recovery

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US76002606P 2006-01-18 2006-01-18
US11/424,350 US20070174695A1 (en) 2006-01-18 2006-06-15 Log-based rollback-recovery

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/894,877 Continuation US8631276B2 (en) 2006-01-18 2010-09-30 Log-based rollback-recovery

Publications (1)

Publication Number Publication Date
US20070174695A1 true US20070174695A1 (en) 2007-07-26

Family

ID=38287027

Family Applications (5)

Application Number Title Priority Date Filing Date
US11/424,350 Abandoned US20070174695A1 (en) 2006-01-18 2006-06-15 Log-based rollback-recovery
US12/894,877 Active US8631276B2 (en) 2006-01-18 2010-09-30 Log-based rollback-recovery
US14/152,806 Active 2026-11-13 US10310947B1 (en) 2006-01-18 2014-01-10 Log-based rollback-recovery
US16/430,934 Active 2026-08-18 US11093345B1 (en) 2006-01-18 2019-06-04 Log-based rollback-recovery
US17/394,381 Active US11847029B1 (en) 2006-01-18 2021-08-04 Log-based rollback-recovery

Family Applications After (4)

Application Number Title Priority Date Filing Date
US12/894,877 Active US8631276B2 (en) 2006-01-18 2010-09-30 Log-based rollback-recovery
US14/152,806 Active 2026-11-13 US10310947B1 (en) 2006-01-18 2014-01-10 Log-based rollback-recovery
US16/430,934 Active 2026-08-18 US11093345B1 (en) 2006-01-18 2019-06-04 Log-based rollback-recovery
US17/394,381 Active US11847029B1 (en) 2006-01-18 2021-08-04 Log-based rollback-recovery

Country Status (1)

Country Link
US (5) US20070174695A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070174695A1 (en) * 2006-01-18 2007-07-26 Srinidhi Varadarajan Log-based rollback-recovery
US8386594B2 (en) * 2010-02-11 2013-02-26 Intel Corporation Network controller circuitry to initiate, at least in part, one or more checkpoints
US10360095B2 (en) * 2016-03-31 2019-07-23 Change Healthcare Holdings, Llc Methods and apparatuses for improving failure recovery in a distributed system

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5590277A (en) * 1994-06-22 1996-12-31 Lucent Technologies Inc. Progressive retry method and apparatus for software failure recovery in multi-process message-passing applications
US5440726A (en) * 1994-06-22 1995-08-08 At&T Corp. Progressive retry method and apparatus having reusable software modules for software failure recovery in multi-process message-passing applications
US5530802A (en) * 1994-06-22 1996-06-25 At&T Corp. Input sequence reordering method for software failure recovery
US5630047A (en) * 1995-09-12 1997-05-13 Lucent Technologies Inc. Method for software error recovery using consistent global checkpoints
US5996088A (en) * 1997-01-22 1999-11-30 Oracle Corporation High-speed database checkpointing through sequential I/O to disk
US5938775A (en) * 1997-05-23 1999-08-17 At & T Corp. Distributed recovery with κ-optimistic logging
US6952825B1 (en) * 1999-01-14 2005-10-04 Interuniversitaire Micro-Elektronica Centrum (Imec) Concurrent timed digital system design method and environment
GB2353113B (en) * 1999-08-11 2001-10-10 Sun Microsystems Inc Software fault tolerant computer system
US6856950B1 (en) * 1999-10-15 2005-02-15 Silicon Graphics, Inc. Abstract verification environment
US6741990B2 (en) * 2001-05-23 2004-05-25 Intel Corporation System and method for efficient and adaptive web accesses filtering
US7047441B1 (en) * 2001-09-04 2006-05-16 Microsoft Corporation Recovery guarantees for general multi-tier applications
US7231554B2 (en) * 2002-03-25 2007-06-12 Availigent, Inc. Transparent consistent active replication of multithreaded application programs
US7584463B2 (en) * 2003-08-27 2009-09-01 Microsoft Corporation State as a first-class citizen of an imperative language
US7373548B2 (en) * 2003-08-29 2008-05-13 Intel Corporation Hardware recovery in a multi-threaded architecture
US7188273B2 (en) * 2003-11-24 2007-03-06 Tsx Inc. System and method for failover
US7711713B2 (en) * 2004-10-21 2010-05-04 International Business Machines Corporation System for deterministic database recovery time
FR2881246B1 (en) * 2005-01-21 2007-03-23 Meiosys Soc Par Actions Simpli PERFECT PROCESS FOR MANAGING, JOURNALIZING OR REJECTING NON-DETERMINISTIC OPERATIONS IN THE CONDUCT OF AN APPLICATION PROCESS
US7493516B2 (en) * 2005-08-29 2009-02-17 Searete Llc Hardware-error tolerant computing
GB0521465D0 (en) * 2005-10-21 2005-11-30 Law Gregory E W System and method for debugging of computer programs
US9268666B2 (en) * 2005-10-21 2016-02-23 Undo Ltd. System and method for debugging of computer programs
US8122427B2 (en) * 2006-01-04 2012-02-21 Microsoft Corporation Decentralized system services
US20070174695A1 (en) * 2006-01-18 2007-07-26 Srinidhi Varadarajan Log-based rollback-recovery
US7937618B2 (en) * 2007-04-26 2011-05-03 International Business Machines Corporation Distributed, fault-tolerant and highly available computing system

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6195676B1 (en) * 1989-12-29 2001-02-27 Silicon Graphics, Inc. Method and apparatus for user side scheduling in a multiprocessor operating system program that implements distributive scheduling of processes
US5410685A (en) * 1990-06-12 1995-04-25 Regents Of The University Of Michigan Non-intrinsive method and system for recovering the state of a computer system and non-intrusive debugging method and system utilizing same
US5632032A (en) * 1994-02-07 1997-05-20 International Business Machines Corporation Cross address space thread control in a multithreaded environment
US6105148A (en) * 1995-06-16 2000-08-15 Lucent Technologies Inc. Persistent state checkpoint and restoration systems
US5923832A (en) * 1996-03-15 1999-07-13 Kabushiki Kaisha Toshiba Method and apparatus for checkpointing in computer system
US6192391B1 (en) * 1997-05-30 2001-02-20 Nec Corporation Process stop method and apparatus for a distributed memory multi-processor system
US6092213A (en) * 1997-09-30 2000-07-18 Tandem Computers Incorporated Fault tolerant method of maintaining and distributing configuration information in a distributed processing system
US20040220981A1 (en) * 1999-12-20 2004-11-04 Taylor Kenneth J System and method for a backup parallel server data storage system
US20040199812A1 (en) * 2001-11-29 2004-10-07 Earl William J. Fault tolerance using logical checkpointing in computing systems
US20050071379A1 (en) * 2003-09-30 2005-03-31 Veritas Operating Corporation System and method for maintaining temporal data in data storage
US20060053331A1 (en) * 2004-09-03 2006-03-09 Chou Norman C Slave device having independent error recovery

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7587397B2 (en) * 2002-12-18 2009-09-08 Fujitsu Limited Distributed transaction processing control
US20050228834A1 (en) * 2002-12-18 2005-10-13 Fujitsu Limited Distributed transaction processing control
US7779298B2 (en) * 2007-06-11 2010-08-17 International Business Machines Corporation Distributed job manager recovery
US20080307258A1 (en) * 2007-06-11 2008-12-11 International Business Machines Corporation Distributed Job Manager Recovery
US10671400B2 (en) 2008-05-02 2020-06-02 Azul Systems, Inc. Enhanced managed runtime environments that support deterministic record and replay
US9928072B1 (en) * 2008-05-02 2018-03-27 Azul Systems, Inc. Detecting and recording atomic execution
US9928071B1 (en) * 2008-05-02 2018-03-27 Azul Systems, Inc. Enhanced managed runtime environments that support deterministic record and replay
US20100169284A1 (en) * 2008-12-31 2010-07-01 Sap Ag Distributed transactional recovery system and method
US9146759B2 (en) * 2010-07-30 2015-09-29 Apple Inc. Assumption-based compilation
US9195486B2 (en) 2010-07-30 2015-11-24 Apple Inc. Observation and analysis based code optimization
US20120030653A1 (en) * 2010-07-30 2012-02-02 Apple Inc. Assumption-based compilation
US20150134326A1 (en) * 2012-05-14 2015-05-14 Touchtype Limited Mechanism for synchronising devices, system and method
US10055397B2 (en) * 2012-05-14 2018-08-21 Touchtype Limited Mechanism for synchronising devices, system and method
US10922307B2 (en) 2017-12-11 2021-02-16 NextWorld, LLC Automated transaction engine
CN108200157A (en) * 2017-12-29 2018-06-22 北京奇虎科技有限公司 Log synchronization method and apparatus for rollback triggered by a master node
US20200057818A1 (en) * 2018-08-17 2020-02-20 Machbase, Inc. Method and device for searching indexes for sensor tag data
US10706054B2 (en) * 2018-08-17 2020-07-07 Machbase, Inc. Method and device for searching indexes for sensor tag data
US10901957B2 (en) * 2018-08-29 2021-01-26 International Business Machines Corporation Checkpointing for increasing efficiency of a blockchain
US20200073962A1 (en) * 2018-08-29 2020-03-05 International Business Machines Corporation Checkpointing for increasing efficiency of a blockchain
US11196542B2 (en) 2018-08-29 2021-12-07 International Business Machines Corporation Checkpointing for increasing efficiency of a blockchain
US11334439B2 (en) 2018-08-29 2022-05-17 International Business Machines Corporation Checkpointing for increasing efficiency of a blockchain

Also Published As

Publication number Publication date
US11093345B1 (en) 2021-08-17
US20110083040A1 (en) 2011-04-07
US11847029B1 (en) 2023-12-19
US10310947B1 (en) 2019-06-04
US8631276B2 (en) 2014-01-14

Similar Documents

Publication Publication Date Title
US11847029B1 (en) Log-based rollback-recovery
US8065273B2 (en) Automated priority restores
US8903779B1 (en) Methods for returning a corrupted database to a known, correct state
JP6362685B2 (en) Replication method, program, and apparatus for online hot standby database
US9275060B1 (en) Method and system for using high availability attributes to define data protection plans
US20120109919A1 (en) High availability database management system and database management method using same
US7853571B2 (en) Techniques for file system recovery
CN101243446A (en) Online page restore from a database mirror
US6874104B1 (en) Assigning recoverable unique sequence numbers in a transaction processing system
US7827144B1 (en) Methods of reading and writing data
EP1380948A2 (en) Process group resource manager
US10409691B1 (en) Linking backup files based on data partitions
CN110121694B (en) Log management method, server and database system
US11093290B1 (en) Backup server resource-aware discovery of client application resources
JP5154843B2 (en) Cluster system, computer, and failure recovery method
US6895415B1 (en) System and method for concurrent distributed snapshot management
US9734022B1 (en) Identifying virtual machines and errors for snapshots
Barga et al. Persistent applications via automatic recovery
WO2019109257A1 (en) Log management method, server and database system
Arockiam et al. FTM-A middle layer architecture for fault tolerance in cloud computing
US7512834B2 (en) Apparatus, system, and method for providing parallel access to a data set configured for automatic recovery
LeLann Chapter 15. Error recovery
US20240152429A1 (en) Recoverable Processes
US7720812B1 (en) Synchronizing write accesses
Wang et al. Transaction support for log-based middleware server recovery

Legal Events

Date Code Title Description
AS Assignment

Owner name: EVERGRID, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VARADARAJAN, SRINIDHI;RUSCIO, JOSEPH F.;REEL/FRAME:018036/0558;SIGNING DATES FROM 20060724 TO 20060728

AS Assignment

Owner name: TRIPLEPOINT CAPITAL LLC, CALIFORNIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:EVERGRID, INC.;REEL/FRAME:021308/0437

Effective date: 20080429

AS Assignment

Owner name: LIBRATO, INC., CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNORS:CALIFORNIA DIGITAL CORPORATION;EVERGRID, INC.;REEL/FRAME:023538/0248;SIGNING DATES FROM 20060403 TO 20080904

Owner name: LIBRATO, INC.,CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNORS:CALIFORNIA DIGITAL CORPORATION;EVERGRID, INC.;SIGNING DATES FROM 20060403 TO 20080904;REEL/FRAME:023538/0248

AS Assignment

Owner name: EVERGRID, INC., CALIFORNIA

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE RE-RECORDING TO REMOVE INCORRECT APPLICATIONS. PLEASE REMOVE 12/420,015; 7,536,591 AND PCT US04/38853 FROM PROPERTY LIST. PREVIOUSLY RECORDED ON REEL 023538 FRAME 0248. ASSIGNOR(S) HEREBY CONFIRMS THE CHANGE OF NAME SHOULD BE - ASSIGNOR: CALIFORNIA DIGITAL CORPORATION; ASSIGNEE: EVERGRID, INC.;ASSIGNOR:CALIFORNIA DIGITAL CORPORATION;REEL/FRAME:024726/0876

Effective date: 20060403

AS Assignment

Owner name: LIBRATO, INC., CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNOR:EVERGRID, INC.;REEL/FRAME:024831/0872

Effective date: 20080904

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION