WO1988008161A1 - An operations controller for a fault tolerant multiple node processing system - Google Patents
- Publication number
- WO1988008161A1 (PCT/US1988/001245)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- task
- node
- message
- error
- messages
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/187—Voting techniques
- G06F11/188—Voting techniques where exact match is not required
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0721—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
- G06F11/0724—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU] in a multiprocessor or a multi-core unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1415—Saving, restoring, recovering or retrying at system level
- G06F11/142—Reconfiguring to eliminate the error
- G06F11/1425—Reconfiguring to eliminate the error by reconfiguration of node membership
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1479—Generic software techniques for error detection or fault masking
- G06F11/1482—Generic software techniques for error detection or fault masking by means of middleware or OS functionality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1658—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/181—Eliminating the failing redundant component
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/182—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits based on mutual exchange of the output between redundant processing components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/187—Voting techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/161—Computing infrastructure, e.g. computer clusters, blade chassis or hardware partitioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
- G06F11/0754—Error or fault detection not based on redundancy by exceeding limits
- G06F11/076—Error or fault detection not based on redundancy by exceeding limits by exceeding a count or rate limit, e.g. word- or bit count limit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
Definitions
- This invention is related to commonly assigned, copending patent applications (Docket Numbers ATC-102-A, ATC-104-A, ATC-105-A, ATC-105-B, ATC-106-A, ATC-107-A, and ATC-109-A) filed concurrently herewith.
- the invention is related to the field of multiple node processing systems and, in particular, to an operations controller for each node in a multiple node processing system for controlling the operation of its own node in a fault tolerant manner.
- the earliest attempts to produce fault tolerant computer systems provided redundant computers in which each computer simultaneously executed every task required for the control operation. Voting circuits monitoring the outputs of the multiple computers determined a majority output which was assumed to be the correct output for the system. In this type of system, a faulty computer may or may not be detected and the faulty computer may or may not be turned off.
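The classic redundant-computer scheme above can be sketched as a simple majority vote over the outputs of the replicated computers. This is an illustrative sketch only; the value names and three-way configuration are assumptions, not taken from the patent.

```python
# Hedged sketch of majority voting over redundant computer outputs.
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a majority of the redundant
    computers, or None when no majority exists (fault not maskable)."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None

# Three redundant computers, one faulty: the majority masks the fault.
assert majority_vote([42, 42, 17]) == 42
# With no majority the voter cannot decide which computer is correct.
assert majority_vote([1, 2, 3]) is None
```

Note that, as the passage states, such a voter masks the faulty output but does not by itself identify or switch off the faulty computer.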
- the redundant computer concept although highly successful, is expensive because it requires multiple computers of equivalent capabilities. These systems require powerful computers because each computer has to perform every task required for the operation of the system.
- the master-slave concept was introduced, in which the operation of several computers was controlled and coordinated by a master control.
- the master control designated which tasks were to be executed by the individual computers. This reduced the execution time of the control operation because not every computer was required to execute every task, and many of the tasks could be executed in parallel.
- when a faulty computer was detected, the master could remove it from active participation in the system by assigning the tasks that would normally have been assigned to the faulty computer to the other computers.
- the present invention is an operations controller for a fault tolerant multiple node processing system, based on the system taught by Whiteside et al in U.S. Patent No. 4,323,966, which has improved fault tolerance and control capabilities.
- a predecessor of this operations controller has been described by C. J. Walter et al in their paper "MAFT: A Multicomputer Architecture for Fault-Tolerance in Real-Time Control Systems” published in the proceedings of the Real-Time System Symposium, San Diego, December 3 - 6, 1985.
- the invention is an operations controller for each node in a fault tolerant multiple node processing system.
- Each node has an applications processor for executing a predetermined set of tasks and an operations controller for establishing and maintaining its own node in synchronization with every other node in the system, for controlling the operation of its own node, and for selecting the task to be executed by its own applications processor in coordination with all of the other nodes in the system through the exchange of inter-node messages.
- the operations controller has a transmitter for transmitting all of the inter-node messages generated by its own operations controller to all the other nodes in the system.
- the transmitter has an arbitrator for deciding the order in which the inter-node messages are to be transmitted when two or more messages are ready for transmission.
- a message checker for checking each received message for physical and logical errors to generate an inter-node error report containing an error status byte identifying each detected error. The message checker polls each of the receivers to unload the received messages in a repetitive sequence.
- a voter subsystem has a voter for voting on the content of all error free messages containing the same information to generate a voted value and has a deviance checker for generating an inter-node error report identifying each node which sent a message used in the generation of the voted value whose content differed from the voted value by more than a predetermined amount.
- a fault tolerator for passing all error free messages received from the message checker to the voter subsystem, for generating an inter-node error message containing all of the error reports accumulated by all of the subsystems of its own operations controller, for generating a base penalty count for each node in the system based on the number of detected errors and the severity of the detected errors identified in such inter-node error reports, for globally verifying the base penalty count for each node through the exchange of inter-node base penalty count messages, and for generating a system state vector identifying each node whose base penalty count exceeds a predetermined exclusion threshold.
- a task scheduler for selecting the next task to be executed by its own applications processor from an active task list, for maintaining a global data base on the scheduling and execution of tasks by each node through the exchange of task completed/started messages, and for generating an error report identifying each node whose scheduling process differs from the scheduling process replicated for that node.
- the operations controller also has a data memory and a task communicator for storing the voted values in the data memory.
- the task communicator further has means for passing the identity of the task selected by the scheduler to the applications processor, means for extracting the voted values required for the execution of the selected task and passing them to the applications processor, and means for generating the task completed/started messages identifying the task just completed and the new task started by the applications processor and for generating inter-node data value messages containing the data values generated by the applications processor in the execution of the selected tasks.
- the operations controller further includes a synchronizer for synchronizing the operation of its own node with all of the other non-faulty nodes in the system through the exchange of inter-node time-dependent messages.
- the object of the invention is an architecture for a multiple node fault tolerant processing system based on the functional and physical partitioning of the application tasks and the overhead functions.
- Another object of the invention is a distributed multiple node processing system in which no one node is required to execute every one of the application tasks and in which failure of one or more nodes need not prevent execution of any application task.
- Another object of the invention is a multiple node computer architecture in which task selection and fault detection are globally verified.
- Another object of the invention is a fault tolerant computer architecture in which the exclusion of a node from, or its readmittance to, the active set of nodes is made on a global basis.
- Figure 1 is a block diagram of the multi-computer architecture
- FIG. 2 is a block diagram of the Operations Controller
- Figure 3 is the master/atomic period timing diagram
- Figure 4 is the atomic/subatomic period timing diagram
- FIG. 5 is a block diagram of the Transmitter
- Figure 6 is a circuit diagram of one of the interfaces
- FIG. 7 is a block diagram of the Arbitrator
- Figure 8 shows waveforms for the Self-Test Arbitration Logic
- Figure 9 is a block diagram of the Longitudinal Redundancy Code Generator
- Figure 10 is a block diagram of a Receiver
- Figure 11 is a block diagram of the Message Checker
- Figure 12 is a block diagram of the decision logic for the Between Limits Checker:
- Figure 13 is the format for the error status byte generated by the Message Checker
- FIG. 14 is a block diagram of the Fault Tolerator
- FIG. 15 shows the partitioning of the Fault Tolerator RAM
- Figure 16 shows the format of the Message partition of the Fault Tolerator RAM
- Figure 17 shows the format of the Error Code Files partition of the Fault Tolerator RAM
- Figure 18 shows the format of the Group Mapping partition of the Fault Tolerator RAM
- Figure 19 shows the format of the Error Code Files partition of the Fault Tolerator RAM
- Figure 20 shows the format of the Penalty Weight partition of the Fault Tolerator RAM
- Figure 21 is a block diagram of the Fault Tolerator's Message Checker Interface
- Figure 22 is a block diagram of the Fault Tolerator's Error Handler
- Figure 23 is a block diagram of the Error Handler's Error Consistency Checker
- Figure 24 is a block diagram of the Error Handler's Validity Checker
- Figure 25 illustrates the format of the error byte in an error message
- Figure 26 is a timing diagram of the reconfiguration sequence
- FIG. 27 is a block diagram of the Voter Subsystem
- Figure 28 is a flow diagram for the Upper and Lower
- Figure 29 is a circuit diagram of the Lower Medial Value
- Figure 30 is a flow diagram for the Averaging Circuit
- Figure 31 is a circuit diagram of the Averaging Circuit
- Figure 32 is a flow diagram of the Deviance Checker
- Figure 33 is a circuit diagram of a Deviance Checker
- Figure 34 is a block diagram of the Scheduler
- Figure 35 shows the data format of the Scheduler RAM
- Figure 36 shows the data format of the Scheduler ROM
- Figure 37 is a block diagram of the Scheduler's Task Selector Module
- Figure 38 is a flow diagram of the Wake-Up Sequencer's operation
- Figure 39 is a flow diagram of the Execution Timer's operation
- Figure 40 is a flow diagram of the TIC Handler's operation
- Figure 41 is a flow diagram of the TIC Handler's Selection Queue Update sub-process
- Figure 42 is a flow diagram of the TIC Handler's Completion/Termination sub-process
- Figure 43 is a flow diagram of the TIC Handler's Execution Timer Reset sub-process
- Figure 44 is a flow diagram of the TIC Handler's Priority Scan List Update sub-process
- Figure 45 is a flow diagram of the Priority Scanner's operation
- Figure 46 is a flow diagram of the Next Task Selector's operation
- Figure 47 is a block diagram of the Reconfigure Module
- Figure 48 is a flow diagram for the Task Swapper's operation in response to a Node being excluded from the operating set
- Figure 49 is a flow diagram of the Task Swapper's operation in response to a Node being readmitted to the operating set;
- Figure 50 is a flow diagram of the Task Reallocator's operation in response to a Node being excluded from the operating set;
- Figure 51 is a flow diagram of the Task Status Matcher's operation
- Figure 52 is a block diagram of the Task Communicator
- Figure 53 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Store Data Control;
- Figure 54 is a flow diagram of the Store Data Control's operation
- Figure 55 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the DID Request Handler
- Figure 56 is a flow diagram of the DID Request Handler's operation
- Figure 57 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Task Terminated Recorder;
- Figure 58 is a flow diagram of the Task Terminated
- Figure 59 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Task Started Recorder
- Figure 60 is a flow diagram of the Task Started
- Figure 61 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the AP Input Handler;
- Figure 62 is a flow diagram of the AP Input Handler's operation
- Figure 63 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the AP Output Handler:
- Figure 64 is a flow diagram showing the AP Output
- Figure 65 shows the format of the DID Information as stored in the DID List
- Figure 66 shows the format of the DID information with the NUDAT bit appended
- Figure 67 is a partial block diagram of the Task
- Figure 68 is a flow diagram showing the operation of the Reconfigure Control during reconfiguration
- Figure 69 is a partial block diagram of the Task Communicator showing the subsystems involved in "reset";
- Figure 70 is a flow diagram of the Reset Control during reset
- Figure 71 is a block diagram of the Synchronizer
- Figure 72 shows the format of the Synchronizer Memory
- Figure 73 shows the format of the Message Memory
- Figure 74 shows the format of the Time Stamp Memory
- Figure 75 shows the format of the Scratch Pad Memory
- Figure 76 shows the waveforms of the signals generated by the Timing Signal Generator
- Figure 77 is a block diagram of the Synchronizer
- Figure 78 is a flow diagram showing the operation of the Data Handler and Expected Message Checker
- Figure 79 is a flow diagram showing the operation of the Within Hard Error Window and Soft Error Window Checker and the Time Stamper;
- Figure 80 is a flow diagram for the operation of the "HEW to warning count" ;
- Figure 81 is a partial block diagram of the Synchronizer showing the elements associated with the operation of the Message Generator
- Figure 82 is a flow diagram of the operation of the
- Figure 83 shows the waveforms of the timing signals for generating a TIC message
- Figure 84 shows the waveforms of the timing signals for generating a sync System State message
- Figure 85 shows the format of the "cold start" pre-sync message
- Figure 86 is a flow diagram showing the operation of the Synchronizer during a "cold start"
- Figures 87 and 87a are flow diagrams showing the generation of the HEW to warning signal during "cold start";
- Figure 88 is a flow diagram showing the storing of data during a "cold start"
- Figure 89 is a flow diagram showing the operation of the Operating Condition Detector during a "cold start"
- Figure 90 is a timing diagram used in the description of the "cold start"
- Figure 91 is a flow diagram of the operation of the Synchronizer during a "warm start"
- Figure 92 is a timing diagram used in the description of a "warm start"
- Figure 93 is a flow diagram of the operation of the Byzantine Voter to generate Byzantine voted task completed vector and Byzantine voted branch condition bits for the Scheduler;
- Figure 94 is a perspective of the Byzantine Voter's three-dimensional memory
- Figure 95 shows the two-dimensional format of ISW vectors resulting from the first Byzantine vote on the three- dimensional ISW matrices.
- Figure 96 is a functional circuit diagram of the Byzantine Voter.
- the multi-computer architecture for fault tolerance is a distributed multi-computer system based on the functional and physical partitioning of the application tasks and the overhead functions, such as fault tolerance and systems operations.
- the multi-computer architecture consists of a plurality of Nodes 10A through 10N, each having an Operations Controller 12 for performing the overhead functions and an Applications Processor 14 for executing the application tasks.
- For each application, the multi-computer architecture is required to execute a predetermined set of tasks, collectively called application tasks.
- Each Node is allocated an active task set which is a subset of the application tasks.
- Each Node in coordination with all of the other Nodes is capable of selecting tasks from its active task set and executing them in a proper sequence.
- the active task set for each Node may be different from the active task set allocated to the other Nodes and each task in the application tasks may be included in the active task set of two or more Nodes depending upon how many Nodes are in the system and the importance of the task to the particular application.
- the multi-computer architecture defines a distributed multi-computer system in which no one Node 10 is required to execute every one of the application tasks, yet the failure of one or more Nodes need not prevent the execution of any application task.
- the active task set in each Node is static for any given system configuration or system state and will change as the system state changes with an increase or decrease in the number of active Nodes. This change in the active task set, called "reconfiguration," takes place automatically and assures that every one of the important or critical application tasks will be included in the active task set of at least one of the remaining active Nodes in the system.
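The reconfiguration idea described above can be sketched as follows: when a Node leaves the operating set, every critical task it carried must end up in the active task set of at least one remaining Node. The task names, the least-loaded reassignment rule, and the data layout are invented for illustration; the patent specifies only the requirement, not this algorithm.

```python
# Hedged sketch of active-task-set reconfiguration after a node exclusion.
def reconfigure(active_sets, critical_tasks, excluded):
    """Drop `excluded` and reassign any critical task only it carried."""
    remaining = {n: set(ts) for n, ts in active_sets.items() if n != excluded}
    for task in critical_tasks:
        if not any(task in ts for ts in remaining.values()):
            # Orphaned critical task: give it to the least-loaded node.
            node = min(remaining, key=lambda n: len(remaining[n]))
            remaining[node].add(task)
    return remaining

sets = {"A": {"t1", "t2"}, "B": {"t2", "t3"}, "C": {"t4"}}
new_sets = reconfigure(sets, critical_tasks={"t1", "t2", "t3", "t4"}, excluded="C")
# t4 was carried only by the excluded node C, so it is reassigned.
assert any("t4" in ts for ts in new_sets.values())
```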
- Each Node 10A through 10N is connected to every other Node in the multi-computer architecture through its Operational Controller 12 by means of a private communication link 16.
- the Operations Controller "A" is the only Operations
- External information from sensors and manually operated devices, collectively identified as Input Devices 20, is transmitted directly to the Applications Processors 14 of each Node through an input line 18. It is not necessary that every Applications Processor receive information from every sensor and/or Input Device; however, each Applications Processor 14 will receive the information from every sensor and/or Input Device which it needs in the execution of the application tasks.
- the Applications Processor 14 in each Node will transmit data and control signals resulting from the execution of the application tasks to one or more actuators and/or display devices collectively identified as Output Devices 22.
- the data and/or control signals generated by the Applications Processors 14 in the individual Nodes 10A through 10N may be combined by a Combiner/Voter Network 24 before they are transmitted to the Output Devices 22.
- the Combiner/Voter Network 24 may also be used to generate a single voted value which is transmitted to the Output Devices 22.
- the use of a Combiner/Voter Network 24 is optional. It is not necessary that every actuator or display receive the output generated by every Node in the system. A specific actuator or display need only be connected to the Node or Nodes whose Applications Processor 14 is capable of generating the data or command signals it requires.
- the network of Operations Controllers 12 is the heart of the system and is responsible for the inter-node communications, system synchronization, data voting, error detection, error handling, task scheduling, and reconfiguration.
- the Applications Processors 14 are responsible for the execution of the application tasks and for communications with the Input Devices 20 and Output Devices 22.
- the overhead functions performed by the Operations Controllers 12 are transparent to the operations of the Applications Processor 14. Therefore, the structure of the Applications Processor 14 may be based solely upon the application requirements. Because of this, dissimilar Applications Processors 14 may be used in different Nodes without destroying the symmetry of the multi-computer architecture.
- Each Operations Controller 12 has a transmitter 30 for serially transmitting messages on the Node's private communication link 16.
- the Operations Controller illustrated in Figure 2 is the Operations Controller A as shown in Figure 1.
- the Transmitter 30 will transmit messages on the private communication link 16a.
- Each Operations Controller also has a plurality of Receivers 32a through 32n, each of which is connected to a different private communication link. In the preferred embodiment, the number of Receivers 32a through 32n is equal to the number of Nodes in the multi-computer architecture. In this way, each Operations Controller 12 will receive all of the messages transmitted by every Node in the system including its own.
- Each Receiver 32a through 32n will convert each message received over the private communication link to which it is connected from a serial format to a parallel format then forward it to a Message Checker 34. Each Receiver 32a through 32n will also check the vertical parity and the longitudinal redundancy codes appended to each of the received messages and will generate an error signal identifying any errors detected.
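The two per-message checks the Receivers perform can be sketched as below. The choice of even vertical parity and an XOR-based longitudinal redundancy code (LRC) is an assumption; this passage does not fix the exact codes used.

```python
# Hedged sketch of the Receiver's vertical parity and LRC checks.
from functools import reduce

def even_parity_ok(word):
    """Check one received word (data bits plus an even-parity bit):
    the total number of 1 bits must be even."""
    return bin(word).count("1") % 2 == 0

def lrc(data):
    """Longitudinal redundancy code: XOR of all data bytes. Appended by
    the transmitter; the receiver recomputes it and compares."""
    return reduce(lambda a, b: a ^ b, data, 0)

message = bytes([0x12, 0x34, 0x56])
appended_lrc = lrc(message)          # what the transmitter would append
assert lrc(message) == appended_lrc  # receiver recomputes and compares
# Equivalently, XOR over message plus its LRC byte must be zero.
assert lrc(message + bytes([appended_lrc])) == 0
```

On a mismatch the Receiver would raise the error signal that the text describes, identifying the faulty message.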
- the Message Checker 34 monitors the Receivers 32a through 32n and subjects each received message to a variety of physical and logical checks. After completion of these physical and logical checks, the messages are sent to a Fault Tolerator 36. Upon the detection of any errors in any message, the Message Checker 34 will generate an error status byte which is also transmitted to the Fault Tolerator 36.
- the Fault Tolerator 36 performs five basic functions. First, the Fault Tolerator performs further logical checks on the messages received from the Message Checker 34 to detect certain other errors that were not capable of being detected by the Message Checker 34. Second, the Fault Tolerator passes error free messages to a Voter 38 which votes on the content of all messages containing the same information to generate a voted value. Third, it passes selected fields from the error free messages to other subsystems as required. Fourth, the Fault Tolerator aggregates the internal error reports from the various error detection mechanisms in the Operations Controller and generates Error messages which are transmitted to all of the other Nodes in the system by the Transmitter 30.
- Fifth, the Fault Tolerator 36 monitors the health status of each Node in the system and will initiate a local reconfiguration when a Node is added to or excluded from the current set of operating Nodes.
- the Fault Tolerator 36 maintains a base penalty count table which stores the current base penalty counts accumulated for each Node in the system. Each time a Node transmits a message containing an error, every Node in the system, including the one that generated the message, should detect this error and generate an Error message identifying the Node that sent the message containing the error, the type of error detected, and a penalty count for the detected error or errors.
- Each Fault Tolerator 36 will receive these Error messages from every other Node and will increment the base penalty count for that Node which is currently being stored in the base penalty count table, if the detection of the error is supported by Error messages received from a majority of the Nodes.
- the magnitude of the penalty count increment is predetermined and is proportional to the severity of the error. If the incremented base penalty count exceeds an exclusion threshold, as shall be discussed later, the Fault Tolerator initiates a Node exclusion and a reconfiguration process in which the faulty Node is excluded from active participation in the system and the active task sets for the remaining Nodes are changed to accommodate the reduction in the number of active Nodes.
- the Fault Tolerator 36 will also periodically decrement the base penalty count for each Node in the system so that a Node which was previously excluded may be readmitted into the active system.
- if a previously excluded Node continues to operate in an error free manner for a sufficient period of time, its base penalty count will be decremented below a readmittance threshold, which will initiate a Node readmittance and reconfiguration process in which the previously excluded Node is readmitted into the active system.
- the active task set for each Node is readjusted to accommodate the increase in the number of active Nodes in the system.
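The base-penalty-count bookkeeping described in the preceding paragraphs can be sketched as below. The threshold values, penalty weights, and decrement rate are invented placeholders; the patent describes the mechanism but this passage does not give these numbers.

```python
# Hedged sketch of base penalty counts with exclusion and readmittance.
EXCLUSION_THRESHOLD = 100     # placeholder values, not from the patent
READMITTANCE_THRESHOLD = 20
PENALTY_WEIGHTS = {"parity": 5, "sequence": 10, "byzantine": 40}

class PenaltyTable:
    def __init__(self, nodes):
        self.counts = {n: 0 for n in nodes}
        self.excluded = set()

    def report_error(self, node, error_type, confirming_nodes, total_nodes):
        # Increment only when a majority of nodes confirm the error.
        if confirming_nodes > total_nodes // 2:
            self.counts[node] += PENALTY_WEIGHTS[error_type]
            if self.counts[node] > EXCLUSION_THRESHOLD:
                self.excluded.add(node)   # would trigger reconfiguration

    def periodic_decrement(self, amount=1):
        for node in self.counts:
            self.counts[node] = max(0, self.counts[node] - amount)
            if node in self.excluded and self.counts[node] < READMITTANCE_THRESHOLD:
                self.excluded.discard(node)  # readmit after error-free operation

table = PenaltyTable(["A", "B", "C", "D"])
for _ in range(3):
    table.report_error("B", "byzantine", confirming_nodes=3, total_nodes=4)
assert "B" in table.excluded   # 3 * 40 = 120 exceeds the threshold
```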
- the Voter 38 performs an "on-the-fly" vote using all of the current copies of the data values received from the Fault Tolerator 36.
- the voted data value and all copies of the received data are passed to a Task Communicator 44 which stores them in a Data Memory 42.
- the Voter will select a voted data value using an appropriate algorithm as shall be discussed relative to the Voter 38 itself.
- a new voted data value is generated which is written over the prior voted data value stored in the Data Memory 42.
- the Data Memory 42 always stores the most current voted data value, assuring that a voted data value is always available for subsequent processing even if one or more copies of the data value fail to be generated or "hang," causing a late arrival.
- the Voter 38 will also perform a deviance check between the voted data value and each copy of the received data value, and will generate an error vector to the Fault Tolerator identifying each Node which generated a data value which differed from the voted data value by more than a predetermined amount. This arrangement will support both exact and approximate agreement between the copies of the data values.
- the Voter 38 supports several data types, including packed boolean values, fixed point formats, and the IEEE standard 32-bit floating point format.
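A minimal sketch of the kind of vote and deviance check the Voter 38 performs is given below. The mid-value selection and the per-copy deviance comparison are assumptions about one reasonable implementation; the patent discusses its own selection algorithms later, and this sketch supports the approximate-agreement case described above.

```python
# Sketch of an "on-the-fly" vote with a deviance check, assuming a
# mid-value (median) selection. Not the patent's exact algorithm.
def vote(copies, deviance):
    """copies: {node_id: value}; returns (voted_value, error_nodes)."""
    values = sorted(copies.values())
    voted = values[len(values) // 2]          # mid-value select
    # Flag every Node whose copy deviates from the voted value by more
    # than the permitted deviance for this data item; the resulting set
    # models the error vector passed to the Fault Tolerator.
    errors = {nid for nid, v in copies.items() if abs(v - voted) > deviance}
    return voted, errors
```

Setting `deviance` to zero yields exact agreement; a nonzero value yields approximate agreement.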
- a Scheduler 40 has two modes of operation, normal and reconfiguration.
- the Scheduler 40 is an event driven, priority based, globally verified scheduling system which selects from its active task set the next task to be executed by its associated Applications Processor 14. For a given system configuration (set of active Nodes) the active task set assigned to each Node is static.
- the Scheduler 40 selects the next task to be executed.
- the Applications Processor will immediately begin the execution of the selected task and the Task Communicator 44 will immediately initiate the generation of a message informing all of the other Nodes of the identity of the selected task, the identity of the preceding task finished by the Applications Processor 14, and the branch conditions of the preceding task.
- Conditional branching is controlled by the Applications Processor 14 and is determined by conditions in the applications environment.
- the precedence relationship between a task and its successor task may include conditional branches, concurrent forks, and join operations implemented at task boundaries.
- Conditional branching provides an efficient means of switching operational modes and avoids the necessity of scheduling tasks not required by the current conditions.
- An interactive consistency voting process guarantees agreement on the branch conditions generated by the other Nodes which executed the same task.
- the Scheduler 40 in each Node replicates the scheduling process for every other Node in the system and maintains a global data base on the scheduling and execution of tasks by each Node.
- Upon the receipt of a message from another Node identifying the task completed and the task started, the Scheduler 40 will compare the task completed with the task previously reported as started and generate a scheduling error signal if they are not the same.
- the Scheduler 40 will also compare the task reported as started with a task it has scheduled to be started by that Node. If they are different, the Scheduler will also generate a scheduling error signal.
- the Scheduler 40 will pass all scheduling error signals to the Fault Tolerator 36. All of the Scheduler's error detection mechanisms are globally verified and have been designed to ensure that failure of one or more copies of a task does not upset scheduling.
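The two consistency checks the Scheduler applies to each Task Completed/Started message can be sketched as below. The function and dictionary names are illustrative assumptions; the point is the pair of comparisons against the replicated global data base.

```python
# Hypothetical sketch of the Scheduler's checks on a Task
# Completed/Started message from another Node.
def check_cs_message(node, completed, started, last_started, scheduled):
    """Return the scheduling error flags raised by one message.
    last_started: task each Node previously reported as started.
    scheduled: task the local Scheduler expected each Node to start."""
    errors = []
    # Check 1: the task reported completed must match the task that
    # Node previously reported as started.
    if completed != last_started.get(node):
        errors.append("completed/started mismatch")
    # Check 2: the task reported started must match the task the local
    # replicated scheduling process selected for that Node.
    if started != scheduled.get(node):
        errors.append("unexpected task started")
    last_started[node] = started   # update the replicated data base
    return errors
```

Any flags returned would be passed to the Fault Tolerator 36 as scheduling error signals.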
- a reversible path independent reconfiguration algorithm provides graceful degradation of the workload as faulty Nodes are excluded from the operating system. Because the algorithm is reversible it also supports graceful restoration of the workload as previously excluded Nodes are readmitted following an extended period of error free operation.
- the active task set allocated to each Node is altered to compensate for the change in the number of active Nodes.
- the active task set, or at least the critical tasks of a faulty Node's active task set, may be reallocated and included in the active task sets of the other Nodes.
- individual tasks may be globally disabled and replaced with simpler tasks, and some noncritical tasks may be disabled with no replacement.
- the reconfiguration process readjusts the active task set for the active Nodes to accommodate the system capabilities.
- the algorithm supports true distributed processing, rather than just a replication of uniprocessor task loads on redundant Nodes.
- a Task Communicator 44 functions as an input/output (I/O) interface between the Operations Controller 12 and the Applications Processor 14.
- the Applications Processor 14 signals the Task Communicator 44 when it is ready for the next task.
- a simple handshaking protocol is employed to synchronize communications between the Applications Processor 14 and the Task Communicator 44.
- the Task Communicator 44 reads the selected task from the Scheduler 40 and transfers it to the Applications Processor 14.
- the Task Communicator 44 will initiate the transmission of the task completed/task started message identifying the task completed by the Applications Processor 14, the task being started by the Applications Processor and the branch conditions of the completed task.
- the Task Communicator 44 will then fetch the data required for the execution of the started task from the Data Memory 42 and temporarily store it in a buffer in the order in which it is required for the execution of the started task.
- the Task Communicator will pass these data values to the Applications Processor as they are requested. Effectively, the Task Communicator 44 looks like an input file to the Applications Processor 14.
- the Task Communicator 44 also receives the data values generated by the Applications Processor 14 in the execution of the selected task and generates Data Value messages which are broadcast by the Transmitter 30 to all of the other Nodes in the system.
- the Task Communicator will also append to the Data Value message a data identification (DID) code and a message type (MT) code which uniquely identifies the message as a Data Value message.
- the Synchronizer 46 provides two independent functions in the operation of the multi-computer architecture. The first function pertains to the synchronization of the operation of the Nodes 10A through 10N during steady state operation, the second function pertains to the synchronization of the Nodes on start up. During steady state operation, the Synchronizer 46 effects a loose frame base synchronization of the Nodes by the exchange of messages which implicitly denote local clock times. The Synchronizer 46 in each Node counts at its own clock rate, up to a "nominal sync count,” then issues a presynchronization System State message which is immediately broadcast by the Transmitter 30 to all of the other Nodes in the system.
- As presynchronization System State messages from all the Nodes in the system, including its own, are received at each Node, they are time stamped in the Synchronizer as to their time of arrival from the Message Checker 34.
- the time stamps are voted on to determine a voted value for the arrival time of the presynchronization System State messages from all the Nodes.
- the difference between the voted time stamp value and the time stamp of the Node's own presynchronization System State message is an error estimate which is used to compute a corrected sync count.
- the error estimate includes any accumulated skew from previous synchronization rounds and the effects of clock drift.
- the Synchronizer 46 will then count up to the corrected sync count and issue a synchronization System State message which is immediately transmitted by the Transmitter 30 to all of the other Nodes in the system.
- the synchronization System State messages will also be time stamped as to their arrival in the Synchronizers in each Node in the system.
- the time stamps of all presynchronization and synchronization System State messages are compared with the voted time stamp value to determine which Nodes are in synchronization with its own Node and which are not.
- a soft error signal is generated signifying a potential synchronization error.
- a hard error signal is generated signifying a synchronization error has definitely occurred.
- the soft and hard error signals are transmitted to the Fault Tolerator 36 and are handled in the same manner as any other detected error.
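The frame synchronization step described above can be sketched as follows. The mid-value vote on the time stamps, the nominal sync count, and the soft/hard window sizes are all illustrative assumptions; the patent leaves these values to the system designer.

```python
# Sketch of loose frame-base synchronization: vote the arrival time
# stamps of the presynchronization messages, derive this Node's error
# estimate, and compute a corrected sync count. Values are assumptions.
NOMINAL_SYNC_COUNT = 1000
SOFT_WINDOW = 5    # ticks; beyond this, a soft (potential) error
HARD_WINDOW = 20   # ticks; beyond this, a hard (definite) error

def corrected_sync_count(stamps, my_id):
    """stamps: {node_id: arrival time stamp of its presync message}."""
    voted = sorted(stamps.values())[len(stamps) // 2]   # mid-value vote
    error = stamps[my_id] - voted   # accumulated skew + clock drift
    return NOMINAL_SYNC_COUNT - error

def classify(stamp, voted):
    # Compare a Node's time stamp with the voted value to decide
    # whether it is in sync, or merits a soft or hard error signal.
    skew = abs(stamp - voted)
    if skew > HARD_WINDOW:
        return "hard"
    if skew > SOFT_WINDOW:
        return "soft"
    return "in-sync"
```

The corrected count shortens or lengthens the Node's next counting interval so its synchronization message lands with the others.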
- Start up is defined as a process for creating a functional configuration of Nodes called an "operating set." If an "operating set" is in existence, and the functional configuration is changed by the admittance or readmittance of one or more Nodes, the process is called a "warm start." If no "operating set" is in existence, it is called a "cold start." In a warm start, the Synchronizer 46 will recognize the existence of an operating set and will attempt to achieve synchronization with the operating set. A cold start is initiated by a power on reset (POREST) signal generated in response to the initial application of electrical power to the system. Each Synchronizer 46 will attempt to achieve point-to-point synchronization with all the Nodes until an operating set is formed. Once an operating set is formed, those Nodes not included in the operating set will switch to the warm start process and will attempt to achieve synchronization with the operating set.
- the operation of the multi-computer architecture depends upon the exchange of data and operational information by the exchange of inter-node messages.
- inter-node messages are data-flow instructions which indicate to each individual Operations Controller how the message should be processed.
- each inter-node message contains the Node identification (NID) code of the Node from which the message originated and a message type (MT) code identifying the message type.
- the last byte in each inter-node message is always a block check byte which is checked by the Receivers 32a through 32n to detect transmission errors.
- Data Value messages: There are four different Data Value messages, which range from a one byte Data Value message to a four byte Data Value message. These Data Value messages are identified as message types MT0 through MT3.
- the second byte of a Data Value message is a data identification (DID) code which when combined with the message type code uniquely identifies that particular data value from other data values used in the system.
- the data identification (DID) code is used by the Message Checker 34 to define the types of checks that are to be performed.
- the MT/DID codes are used by the Message Checker 34 to identify which limits will be applied, by the Voter 38 to identify the permissible deviance of each actual data value from the voted value, and by the Task Communicator 44 to identify the data value to be supplied to the Applications Processor 14 in the execution of the current task.
- the bytes following the data identification byte are the data values themselves with the last byte being the block check byte as previously indicated.
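The Data Value message layout described above can be sketched as a small encoder and checker. The bit placement of the NID and MT codes follows the Transmitter description later in this document (NID in the three most significant bits, MT in the three least significant bits of the first byte); the XOR form of the block check is an assumption, since the patent only calls it a longitudinal redundancy code.

```python
# Illustrative Data Value message builder. Byte layout per the text:
# [NID|MT] [DID] [data bytes...] [block check byte].
def build_data_value_message(nid, mt, did, data_bytes):
    msg = [(nid & 0x7) << 5 | (mt & 0x7), did & 0xFF, *data_bytes]
    bcc = 0
    for b in msg:
        bcc ^= b            # assumed XOR longitudinal redundancy byte
    return msg + [bcc]

def check_message(msg):
    # Receiver-side transmission-error check on the block check byte.
    bcc = 0
    for b in msg[:-1]:
        bcc ^= b
    return bcc == msg[-1]
```

A corrupted byte anywhere in the message makes the final block check comparison fail, which is how the Receivers detect transmission errors.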
- a Task Interactive Consistency (TIC) message is a special case of the two byte Data Value message which is identified by the DID being set to zero (0).
- the Task Interactive Consistency message, message type MT1, is a rebroadcast of the task completed vector and branch condition data contained in Task Completed/Started (CS) messages received from the other Nodes, and is transmitted at the end of each Subatomic period (SAP), as shall be explained in the discussion of the timing sequence.
- the information content of the Task Interactive Consistency messages is voted on by each Node and the voted values are used by the Scheduler 40 in the task selection and scheduling process.
- a Base Penalty Count (BPC) message contains the base penalty count that the individual Node is storing for each Node in the system including itself. Each Node will use this information to generate a voted base penalty count for each Node in the system. Thereafter, each Node will store the voted base penalty count as the current base penalty count for each Node. This assures that at the beginning of each Master period each Node is storing the same number of base penalty counts for every other Node in the system.
- the Base Penalty Count message is transmitted by each Node at the beginning of each Master period timing interval.
- a System State (SS) message is sent at the end of each Atomic period timing interval and is used for the point-to-point synchronization of the Nodes and to globally affirm reconfiguration when a majority of the Nodes conclude that reconfiguration is required.
- the transmission of the System State message is timed so that the end of its transmission coincides with the end of the preceding Atomic period and the beginning of the next Atomic period.
- the first byte of the System State message contains the node identification (NID) code of the originating Node and the message type (MT) code.
- the second byte contains three function bits, the first two bits are the synchronization and presynchronization bits which are used in the Synchronization process described above.
- the third bit identifies whether the Node is operating or excluded.
- the third and fourth bytes of the System State message are the task completed vector and the branch condition vector, respectively.
- Byte five contains the current system state vector and byte six contains the new system state vector.
- Bytes seven and eight contain the high and low order bits of the Node's own period counter.
- Byte nine is an "in sync with" (ISW) vector which defines which Nodes that particular Node determines it is synchronized with, and byte ten is reserved for future use.
- Byte eleven is the conventional block check byte at the end of the message.
- the Synchronizer uses the time stamp of the pre-synchronization System State messages, identified by the pre-synchronization bit in the second byte being set, to generate an error estimate used to compute a correction to the time duration of the last Subatomic period. This correction synchronizes the beginning of the next Atomic period in that Node with the Atomic period being generated by the other Nodes.
- the period counter bytes are used to align the Master periods of all the Nodes in the system. The period counter counts the number of Atomic periods from the beginning of each Master period and is reset when it counts up to the fixed number of Atomic periods in each Master period. Byte nine is used only during an automatic cold start, as shall be explained in more detail in the discussion of the Synchronizer 46.
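The eleven-byte System State message layout described above can be summarized with a builder like the following. The byte positions come from the text; the message type code for the System State message and the XOR block check are labeled assumptions.

```python
# Illustrative System State message builder following the byte layout
# in the text. MT code 0x6 for this message type is an assumption.
def build_system_state(nid, presync, sync, operating, tc_vec, bc_vec,
                       cur_state, new_state, period_count, isw):
    flags = (presync & 1) | (sync & 1) << 1 | (operating & 1) << 2
    body = [
        (nid & 0x7) << 5 | 0x6,        # byte 1: NID + MT code (assumed)
        flags,                          # byte 2: function bits
        tc_vec & 0xFF,                  # byte 3: task completed vector
        bc_vec & 0xFF,                  # byte 4: branch condition vector
        cur_state & 0xFF,               # byte 5: current system state vector
        new_state & 0xFF,               # byte 6: new system state vector
        (period_count >> 8) & 0xFF,     # byte 7: period counter, high bits
        period_count & 0xFF,            # byte 8: period counter, low bits
        isw & 0xFF,                     # byte 9: "in sync with" vector
        0x00,                           # byte 10: reserved
    ]
    bcc = 0
    for b in body:
        bcc ^= b                        # assumed XOR block check
    return body + [bcc]                 # byte 11: block check byte
```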
- the Task Completed/Started (TC/S) message is generated by the Task Communicator 44 each time the Applications Processor 14 starts a new task.
- the second and third bytes of the Task Completed/Started message contain the task identification (TID) codes of the task completed and new task started by the Node's Applications Processor 14.
- the fourth byte of this message contains the branch condition of the completed task, and an error correction code (ECC).
- the last inter-node message is the Error message, message type MT7, which is sent whenever the Transmitter 30 is free during an Atomic period. Only one error message reporting the errors attributed to a particular Node can be sent in an Atomic period.
- the second byte of the Error message is the Node identification (NID) code of the Node accused of being faulty.
- the following four bytes contain error flags identifying each error detected.
- the seventh and eighth bytes of the error message contain the base penalty count of the identified Node and the increment penalty count which is to be added to the base penalty count if the errors are supported by Error messages received from other Nodes.
- the increment penalty count is based on the number of errors detected and the severity of these errors. This information is used by the other Nodes to generate a new voted base penalty count for the Node identified in the Error message. A separate Error message is sent for each Node which generates a message having a detected error.
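Pulling the Error message fields together, one possible encoding is sketched below. The field order follows the bytes named in the text; the message type value MT7 is stated in the text, while the XOR block check remains an assumption.

```python
# Illustrative Error message (MT7) builder: accused Node's NID, four
# error flag bytes, base penalty count, and increment penalty count.
def build_error_message(my_nid, accused_nid, error_flags, base_count, increment):
    assert len(error_flags) == 4
    body = [
        (my_nid & 0x7) << 5 | 0x7,   # byte 1: originating NID + MT7
        accused_nid & 0x7,           # byte 2: accused Node's NID
        *error_flags,                # bytes 3-6: error flags
        base_count & 0xFF,           # byte 7: base penalty count
        increment & 0xFF,            # byte 8: increment penalty count
    ]
    bcc = 0
    for b in body:
        bcc ^= b                     # assumed XOR block check byte
    return body + [bcc]
```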
- the overall control system of the multi-computer architecture contains a number of concurrently operating control loops with different time cycles.
- the system imposes the constraint that each cycle time be an integer power of two times a fundamental time interval called an Atomic period. This greatly simplifies the implementation of the Operations Controller 12 and facilitates the verification of correct task scheduling.
- the length of the Atomic period is selected within broad limits by the system designer for each particular application.
- the System State messages, which are used for synchronization, are sent at the end of each Atomic period.
- Each Master period contains a fixed number of Atomic periods, as shown in Figure 3. All task scheduling parameters are reinitialized at the beginning of each Master period to prevent the propagation of any scheduling errors.
- the Nodes will also exchange Base Penalty Count messages immediately following the beginning of each Master period.
- the shortest time period used in the system is the Subatomic (SAP) period, as shown in Figure 4, which defines the shortest execution time recognized by the Operations Controller 12 for any one task. For example, if the execution time of a task is less than a Subatomic period, the Operations Controller 12 will not forward the next scheduled task to the Applications Processor 14 until the beginning of the next Subatomic period. However, when the execution time of a task is longer than a Subatomic period, the Operations Controller 12 will forward the next scheduled task to the Applications Processor as soon as it is ready for it.
- There are an integer number of Subatomic periods in each Atomic period which are selectable by the systems designer to customize the multi-computer architecture to the particular application. As shown in Figure 4, each Subatomic period is delineated by a Task Interactive Consistency message as previously described.
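The timing hierarchy constraint described above is simple to state in code. The specific counts below are illustrative placeholders for designer-selected values; only the power-of-two rule for cycle times is taken from the text.

```python
# Sketch of the timing hierarchy: Subatomic periods per Atomic period
# and Atomic periods per Master period are designer-selected (values
# here are illustrative), and every control loop cycle time must be an
# integer power of two times the Atomic period.
SUBATOMIC_PER_ATOMIC = 8     # designer-selectable, illustrative
ATOMIC_PER_MASTER = 64       # fixed per application, illustrative

def valid_cycle(cycle_atomic_periods):
    # A legal cycle time is 2**k Atomic periods for some integer k >= 0;
    # the power-of-two test uses the standard n & (n - 1) trick.
    n = cycle_atomic_periods
    return n >= 1 and (n & (n - 1)) == 0
```

This constraint is what lets each cycle boundary coincide with an Atomic period boundary, simplifying scheduling verification.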
- FIG. 5 is a block diagram of the Transmitter 30 embodied in each of the Operations Controllers 12.
- the Transmitter 30 has three interfaces: a Synchronizer Interface 50 receiving Task Interactive Consistency messages and System State messages generated by the Synchronizer 46, a Fault Tolerator Interface 52 receiving the Error and Base Penalty Count messages generated by the Fault Tolerator 36, and a Task Communicator Interface 54 receiving Data Value and Task Completed/Started messages generated by the Task Communicator 44.
- the three interfaces are connected to a Message Arbitrator 56 and a Longitudinal Redundancy Code Generator 58.
- the Message Arbitrator 56 determines the order in which the messages ready for transmission are to be sent.
- the Longitudinal Redundancy Code Generator 58 generates a longitudinal redundancy code byte which is appended as the last byte to each transmitted message.
- the message bytes are individually transferred to a Parallel-to-Serial Converter 60 where they are framed between a start bit and two stop bits, then transmitted in a serial format on communication link 16.
- the Transmitter 30 also includes a Self-Test Interface 62 which upon command retrieves a predetermined self-test message from an external ROM (not shown) which is input into the Longitudinal Redundancy Code Generator 58 and transmitted to the communication link by the Parallel-to-Serial Converter 60.
- the Transmitter 30 also has an Initial Parameter Load Module 64 which will load into the Transmitter various predetermined parameters, such as the length of the minimum synchronization period between messages and the length of the warning period for Task Interactive Consistency and System State messages.
- each of the three interfaces has an eight bit input register 66 which receives the messages to be transmitted from its associated message source through a multiplexer 68.
- the multiplexer 68 also receives the three bit Node identification (NID) code which identifies the Node which is generating the message.
- a bit counter 70 will count the strobe pulses clocking the message into the Input Register 66 and, in coordination with a flip flop 72 and an AND gate 74, will actuate the multiplexer 68 to clock the three bit Node identification code into the Input Register 66 as the three most significant bits of the first byte.
- the flip flop 72 is responsive to the signal "transmit quiet period" (TQP) generated at the end of the preceding message to generate a first byte signal at its Q output which enables AND gates 74 and 76.
- the AND gate 74 will transmit the three most significant bits generated by the bit counter 70 in response to the strobe signals loading the first byte into the input register 66 and will actuate the multiplexer 68 to load the three bit Node identification code into the three most significant bit places of the input register 66.
- the AND gate 76 will respond to the loading of the eighth bit into input register 66 and will generate an output which will actuate the flip flop 78 to a set state. In the set state, the flip flop 78 will generate a message available signal at its Q output and will terminate the buffer available signal at its Q output.
- the message available (MA) signal will reset the flip flop 72 terminating the first byte signal which in turn disables the AND gates 74 and 76.
- the message available (MA) signal is also transmitted to the Message Arbitrator 56 signifying a message is ready for transmission.
- Termination of the buffer available (BA) signal when the flip flop 78 is put in the set state inhibits the message source from transmitting the remaining bytes of the message to the Transmitter 30.
- the three least significant bits of the first byte, which are the message type code, are communicated directly to the Message Arbitrator 56 and are used in the arbitration process to determine which message is to be sent if more than one message is available for transmission, and whether the sending of that message will interfere with the transmission of a time critical message generated by the Synchronizer 46.
- the Message Arbitrator 56 will generate a transmit (Txxx) signal identifying the next message to be sent when there is more than one message ready for transmission. This signal will actuate the Longitudinal Redundancy Code Generator 58 to pass the selected message to the Parallel-to-Serial Converter for transmission.
- the transmit signal will also reset the flip flop 78 in the appropriate interface which reasserts the buffer available (BA) signal, actuating the associated message source to transmit the remaining bytes of the message to the interface.
- the Message Arbitrator 56 will generate a transmit quiet period (TQP) signal which actuates the Parallel-to-Serial Converter to transmit a null (synchronization) signal for a predetermined period of time following the transmission of each message.
- the quiet period is the time required for the transmission of 24 bits, or two (2) null bytes.
- the transmit quiet period (TQP) signal will also set the flip flop 72 indicating that the preceding message has been sent and that the next byte received from the associated message source will be the first byte of the next message.
- a Fault Tolerator (FLT)-Task Communicator (TSC) Arbitration Logic 82 will generate, in an alternating manner, PFLT and PTSC polling signals which are received at the inputs of AND gates 84 and 86, respectively.
- the AND gate 84 will also receive the Fault Tolerator Message Available (FLTMA) signal generated by the Fault Tolerator Interface 52 while AND gate 86 will receive a Task Communicator message available (TSCMA) signal generated by the Task Communicator Interface 54 after the Task Communicator 44 has completed the loading of the first byte of the message ready for transmission.
- the outputs of the AND gates 84 and 86 are the transmit Fault Tolerator (TFLT) and transmit Task Communicator (TTSC) signals, which are applied to AND gates 88 and 90, respectively.
- the alternate inputs to AND gates 88 and 90 are received from a Time Remaining-Message Length Comparator 92 which produces an enabling signal whenever the transmission of the selected message will not interfere with the transmission of a time dependent message as shall be explained hereinafter.
- If the AND gate 88 is enabled, it will pass the transmit Fault Tolerator (TFLT) signal to the Fault Tolerator Interface 52, to reassert the buffer available signal enabling it to receive the remaining bytes of the message from the Fault Tolerator 36, and to the Longitudinal Redundancy Code Generator 58, enabling it to pass the message byte-by-byte from the Fault Tolerator Interface 52 to the Parallel-to-Serial Converter 60 for transmission on the communication link 16.
- When the AND gate 90 is enabled, and the polling of the Task Communicator Interface 54 indicates that the Task Communicator 44 has a message ready for transmission, the AND gate 86 will generate a transmit Task Communicator (TTSC) signal which, if passed by the AND gate 90, will result in the transmission of the Task Communicator's message.
- the TFLT and TTSC signals, when generated, are fed back to lock the FLT-TSC Arbitration Logic 82 in its current state until after the message is sent.
- the message arbitration between the Fault Tolerator's and Task Communicator's messages is primarily dependent upon the type of the message currently being transmitted.
- the logic performed by the FLT-TSC Arbitration Logic 82 is summarized on Table II.
- the FLT-TSC Arbitration Logic 82 will poll the Fault Tolerator Interface 52 and the Task Communicator Interface 54 in an alternating sequence. However, at the beginning of each Atomic period, the FLT-TSC Arbitration Logic 82 will first poll the Task Communicator Interface 54 for a Task Completed/Started message which will identify the task being started by that Node. If the Task Completed/Started message is not available it will then poll the Fault Tolerator Interface 52.
- At the beginning of each Master period, the FLT-TSC Arbitration Logic will first poll the Fault Tolerator Interface 52 and wait until it receives the Base Penalty Count message from the Fault Tolerator 36. After the transmission of the Base Penalty Count message, it will then poll the Task Communicator Interface 54 and transmit a Task Completed/Started message identifying the task scheduled to be started by the Applications Processor.
- the FLT-TSC Arbitration Logic 82 will resume polling of the Fault Tolerator Interface 52 and the Task Communicator Interface 54 in an alternating sequence. In a like manner, after a self-test message, the FLT-TSC Arbitration Logic 82 will poll the Task Communicator Interface 54 and wait for a Task Completed/Started message.
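The polling order rules above can be condensed into a small decision function. This is an assumed simplification of the logic summarized in Table II (which is not reproduced here); the function and argument names are illustrative.

```python
# Sketch of the alternating FLT/TSC polling with the special ordering
# rules described in the text: Base Penalty Count first at the start of
# a Master period, Task Completed/Started first at the start of an
# Atomic period, otherwise strict alternation.
def next_poll(last_polled, at_atomic_start=False, at_master_start=False):
    if at_master_start:
        return "FLT"      # wait for the Base Penalty Count message
    if at_atomic_start:
        return "TSC"      # look for the Task Completed/Started message
    return "TSC" if last_polled == "FLT" else "FLT"
```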
- the Synchronizer 46 will load the first byte of either a Task Interactive Consistency or System State message in the Synchronizer Interface 50 a predetermined period of time before the beginning of the next Subatomic or Atomic periods.
- a Warning Period Generator 94 will load a warning period counter with a number corresponding to the number of bits that are capable of being transmitted before the Task Interactive Consistency or System State messages are to be transmitted. As described previously, the transmission of the final bit of either of these messages marks the end of the previous Subatomic or Atomic periods respectively, therefore, their transmission will begin a predetermined time (bit counts) before the end of the period. Since the Task Interactive Consistency and System State messages are of different bit lengths, the number loaded into the warning period counter will be different.
- the Warning Period Generator 94 will decode the message type code contained in the first byte of the message stored in the Synchronizer Interface 50 and will load the warning period counter with a number indicative of the length of the warning period for that particular type of time critical message.
- the warning period counter will be counted down at the bit transmission rate of the Parallel-to-Serial Converter 60 to generate a number indicative of the time remaining for the transmission of a time critical message.
- the number of counts remaining in the warning period counter is communicated to a Synchronizer Transmission Control 96 and the Time Remaining-Message Length Comparator 92.
- the Synchronizer Transmission Control 96 will generate a transmit synchronizer (TSYN) signal which will actuate the Synchronizer Interface 50 to reassert the buffer available signal and will actuate the Longitudinal Redundancy Code Generator 58 to pass the message from the Synchronizer Interface 50 to the Parallel-to-Serial Converter 60 for transmission on the Node's own communication link 16.
- the Time Remaining-Message Length Comparator 92 will decode the message type of a message selected for transmission by the FLT-TSC Arbitration Logic and determine the number of bits that have to be transmitted for that message. To this number the Time Remaining-Message Length Comparator 92 will add a number equal to the number of bits corresponding to the quiet period between the messages and compare the sum of the message and the quiet period with the count remaining in the warning period counter to determine if the transmission of the selected message will or will not interfere with the transmission of the time critical message from the Synchronizer Interface 50.
- If there is sufficient time, the Time Remaining-Message Length Comparator 92 will generate a signal enabling AND gates 88 and 90 to pass the TFLT or TTSC signals; otherwise the Time Remaining-Message Length Comparator 92 will generate a signal disabling AND gates 88 and 90, inhibiting the transmission of the selected message from either the Fault Tolerator Interface 52 or the Task Communicator Interface 54. This signal will also toggle the FLT-TSC Arbitration Logic 82 to poll the nonselected interface to determine if it has a message to transmit.
- the Time Remaining-Message Length Comparator 92 will determine if there is sufficient time to transmit the message from the nonselected Interface before the transmission of the time critical message from the Synchronizer Interface 50. If there is sufficient time, the message from the nonselected interface will be transmitted, otherwise the AND gates 88 and 90 will remain disabled.
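The comparator's decision rule reduces to one inequality, sketched below. The 24-bit quiet period is taken from the text; treating the warning period counter in bit counts matches the description of the counter being decremented at the bit transmission rate.

```python
# Sketch of the Time Remaining-Message Length Comparator decision: a
# queued message may be sent only if its bit length plus the
# inter-message quiet period fits within the warning-period counts
# remaining before the time critical Synchronizer message must start.
QUIET_PERIOD_BITS = 24   # two null bytes, per the text

def may_transmit(message_bits, warning_counts_remaining):
    return message_bits + QUIET_PERIOD_BITS <= warning_counts_remaining
```

If the selected message fails this test, the arbitration logic polls the other interface, whose (possibly shorter) message gets the same check.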
- the Message Arbitrator 56 also has a Byte Counter 100 which counts the number of bytes transmitted by the Parallel-to-Serial Converter 60.
- the output of the Byte Counter 100 is received by a Message Byte Logic 102.
- the Message Byte Logic 102 decodes the message type code of the message being transmitted and determines the number of bytes in that message. After the last byte of the message is transmitted, the Message Byte Logic 102 will first generate a transmit longitudinal redundancy code (TLRC) signal which enables the Longitudinal Redundancy Code Generator 58 to transmit the generated longitudinal redundancy code as the final byte of the message.
- the Message Byte Logic 102 will then generate a transmit quiet period (TQP) signal enabling the Parallel-to-Serial Converter 60 to transmit the null signal for a predetermined number of bytes which is used for message synchronization.
- the transmit quiet period (TQP) signal is also transmitted to the Synchronizer Transmission Control 96 where it is used to terminate the transmit synchronizer (TSYN) signal.
- TSYN transmit synchronizer
- the Message Byte Logic 102 will generate an end of quiet period (EQP) signal which will reset the Byte Counter 100 and unlatch the FLT-TSC Arbitration Logic 82 for selection of the next message for transmission.
- a Self-Test Arbitration Logic 104 recognizes a request for a self-test in response to a transmitted Task Completed/Started message in which the task identification (TID) code is the same as the Node identification (NID) code. After the transmission of a self-test request message, the Self-Test Arbitration Logic 104 will inhibit a Task Communicator Enable (TSCE) signal and a Fault Tolerator Enable (FLTE) signal as shown in Figure 8 which, when applied to AND gates 84 and 86, respectively, inhibits all transmissions from the Fault Tolerator Interface 52 and the Task Communicator Interface 54.
- TID task identification
- NID Node identification
- the Self-Test Arbitration Logic 104 will generate a transmit self-test (TSLT) signal which will actuate the Self-Test Interface 62 to read the self-test message from an associated off board read only memory (ROM).
- the (TSLT) signal will also enable the Longitudinal Redundancy Code Generator 58 to pass the self- test message from the Self-Test Interface 62 to the Parallel-to-Serial Converter 60 for transmission.
- the Self-Test Arbitration Logic 104 will restore the Task Communicator Enable (TSCE) signal to permit the transmission of a Task Completed/Started message signifying the completion of the self-test.
- TSCE Task Communicator Enable
- the FLT-TSC Arbitration Logic 82 will automatically select the message from the Task Communicator Interface 54 as the next message to be transmitted following the transmission of the self-test message.
- the Self-Test Arbitration Logic 104 will terminate the Task Communicator Enable (TSCE) signal until after the next Task Interactive Consistency or System State message is transmitted as indicated in Figure 8.
- TSCE Task Communicator Enable
- the Self-Test Interface 62 serves to transfer the self-test message from the off board ROM (not shown) to the Longitudinal Redundancy Code Generator 58.
- the off board ROM will store a plurality of Self-test messages which are transmitted one at a time, one each time a Self-test is requested.
- the first byte of each Self-test message is a number indicative of the number of bytes in the Self-test message, which is passed back to the Message Byte Logic 102 to identify the completion of the self-test.
- the last byte in each self-test message stored in the off board ROM is the starting address for the next Self-test message.
- the starting address is not transmitted, but rather is stored in the Self-Test Interface 62 to locate the next Self-test message in the off board ROM to be transmitted.
- the last byte of the last Self-test message stored in the off board ROM contains the starting address of the first Self-test message, so that the Self- test message sequence is repeated.
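The self-test ROM layout just described amounts to a circular linked list of messages. The sketch below models it under the assumption that the leading byte count covers the transmitted bytes and the trailing link byte is stored but never transmitted; the ROM contents are purely illustrative.

```python
# Model of the self-test ROM: each stored message begins with its byte
# count and ends with the (untransmitted) starting address of the next
# message, the last message linking back to the first.
def read_self_test(rom, start):
    """Return (transmitted_bytes, next_start) for the message at start."""
    length = rom[start]                 # first byte: bytes in the message
    body = rom[start:start + length]    # bytes actually transmitted
    next_start = rom[start + length]    # trailing link byte, kept locally
    return body, next_start

rom = [3, 0xAA, 0xBB, 4,    # message 0: 3 bytes, link -> address 4
       2, 0xCC, 0]          # message 1: 2 bytes, link -> address 0
```

Reading message 0 yields its three transmitted bytes and the address of message 1; reading message 1 links back to address 0, so the sequence repeats as described.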
- the starting address for the first Self-test message is loaded into the Self-Test Interface 62 by the Initial Parameter Load Module 64 in response to an initial load command generated by the Synchronizer 46 in response to the electrical power being turned on.
- the Longitudinal Redundancy Code Generator 58 has a 4:1 Input Multiplexer 110 which receives the message bytes from the Synchronizer Interface 50, Fault Tolerator Interface 52, Task Communicator Interface 54, and Self-Test Interface 62.
- the Input Multiplexer 110 controls which message will be transmitted to the Parallel-to-Serial Converter 60 in response to the transmit (TFLT, TTSC, TSYN, and TSLT) signals generated by the Message Arbitrator 56, as previously described.
- Each byte of a message selected for transmission by the Message Arbitrator 56 is transmitted to an Output Multiplexer 112 by means of nine parallel lines, one for each bit in the received byte plus the parity bit generated by the associated interface.
- a Longitudinal Redundancy (LR) Bit Generator 114 is connected to each of the nine parallel bit lines, and collectively these generators produce a nine bit longitudinal redundancy code.
- Each bit in the longitudinal redundancy code is a function of the bit values in the same bit locations in the preceding bytes.
- the outputs of all the LR bit generators 114 are also received by the Output Multiplexer 112.
- the Output Multiplexer 112 is responsive to the transmit longitudinal redundancy code (TLRC) signal generated by the Message Arbitrator 56 to output the last bit generated by each of the LR bit generators 114 as the last byte of the message being transmitted.
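The LR bit generators can be modeled in software as a column-wise parity accumulated over every byte of the message. The text states only that each LR bit is a function of the same bit position in the preceding bytes; XOR parity, the conventional choice for a longitudinal redundancy code, is an assumption here.

```python
# Column-wise parity model of the nine LR bit generators; 9-bit bytes
# (8 data bits plus the vertical parity bit) are represented as ints.
def lrc(message_bytes):
    """Accumulate a 9-bit longitudinal redundancy code over the bytes."""
    code = 0
    for b in message_bytes:
        code ^= b          # each bit position folds in independently
    return code & 0x1FF    # keep nine bits
```

Appending the result as the final byte makes each column of the full transmission, LRC byte included, have even parity, which is what the receiver's LRC checker can verify.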
- the output of the Output Multiplexer 112 is connected directly to the Parallel-to-Serial Converter 60 which frames each received byte between predetermined start and stop bits before it is transmitted on the Node's communication link.
- the messages from Node A transmitted on communication link 16a are received by a Noise Filter and Sync Detector 116.
- the synchronization portion of the Noise Filter and Sync Detector 116 requires that a proper synchronization interval exists prior to the reception of a message.
- the synchronization interval preferably is the time required for the Transmitter 30 to transmit two complete null bytes after each transmitted message.
- the low pass portion of the Noise Filter and Sync Detector 116 prevents false sensing of the "start" and “stop” bits by the Receiver 32a due to noise which may be present on the communication link 16a.
- the low pass filter portion requires that the signal on the communication link 16a be present for four (4) consecutive system clock cycles before it is interpreted as a start or a stop bit.
- the Noise Filter and Sync Detector 116 will generate a new message signal in response to receiving a start bit after a proper synchronization interval.
- after passing through the Noise Filter and Sync Detector 116, the message, byte-by-byte, is converted from a serial to a parallel format in a Serial-to-Parallel Converter 118.
- the Serial-to-Parallel Converter 118 also determines when a complete 12-bit byte has been received. If the 12-bit byte is not properly framed by a "start" and two "stop" bits, a new bit is added, the bit first received is discarded and the framing is rechecked. Framing errors are not flagged by the Receiver 32a since this fault will manifest itself during a vertical parity check.
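The framing recheck described above can be sketched as a one-bit slip on a 12-bit window. The frame structure assumed here (one start bit of 0, a 9-bit byte, two stop bits of 1) is an inference; the passage gives the 12-bit frame and the slip-and-recheck behavior but not the bit polarities.

```python
from collections import deque

# Bit-slip deframer: on a framing mismatch the first-received bit is
# discarded, a new bit is added, and the framing is rechecked, as the
# Serial-to-Parallel Converter is described to do.
def deframe(bit_stream):
    """Yield 9-bit bytes recovered from a serial bit stream."""
    window = deque(maxlen=12)
    for bit in bit_stream:
        window.append(bit)
        if len(window) == 12:
            if window[0] == 0 and window[10] == 1 and window[11] == 1:
                yield list(window)[1:10]   # strip start and stop bits
                window.clear()
            else:
                window.popleft()           # slip one bit and recheck
```

A spurious leading bit is absorbed after one slip, which is why framing errors need not be flagged: a genuine fault will surface later in the vertical parity check.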
- the start and stop bits are stripped from each byte and the remaining 9-bit byte is transferred to a Longitudinal Redundancy Code and Vertical Parity Code (LRC and VPC) Checker 122 to check for parity errors.
- the error checking logic outputs the current combinational value of the vertical parity and the longitudinal redundancy codes.
- the vertical parity check portion checks the parity vertically across the received message while the longitudinal redundancy code checker portion performs a longitudinal redundancy code check on each byte received from the Serial-to-Parallel Converter 118.
- the Message Checker 34 decodes the message type information contained in the first byte of the message and determines which byte is the last byte in the message and, therefore, for which byte the longitudinal redundancy code check is valid.
- the Message Checker 34 will ignore all other LRC error signals generated by the LRC and VPC Code Checker 122.
- the 8-bit message byte is transferred to a Buffer 120 which interfaces with the Message Checker 34.
- the Buffer 120 temporarily stores each 8-bit message byte until the Message Checker 34 is ready to check it.
- the Buffer will set a byte ready flag signifying to the Message Checker 34 that it has a message byte ready for transfer.
- the Message Checker 34 will unload the message bytes from the Buffer 120 independent of the loading of new message bytes by the Serial-to-Parallel Converter 118.
- the 8-bit message bytes are transferred to the Message Checker 34 via a common bus 124 which is shared with all of the Receivers 32a through 32n in the Operations Controller 12.
- the Message Checker 34 will systematically poll each Receiver one at a time in a repetitious sequence.
- the details of the Message Checker 34 are shown in Figure 11.
- the Message Checker 34 processes the messages received by the Receivers 32a through 32n and verifies their logical content, records any errors detected, and forwards the messages to the Fault Tolerator 36.
- the operation of the Message Checker 34 is controlled by a Sequencer 126 which context switches among the multiple Receivers 32a through 32n in order to prevent overrun of the Buffers 120 in each Receiver.
- Each Receiver 32a through 32n is polled in a token fashion to determine if it has a message byte ready for processing. If the message byte is ready for processing when it is polled by the Sequencer 126 the byte will be processed immediately by the Message Checker 34.
- the Sequencer 126 stores the Node identification (NID) code of the Node 10 associated with each Receiver.
- NID Node identification
- the Sequencer 126 also has a Byte Counter associated with each Receiver 32a through 32n which is indexed each time the Sequencer 126 unloads a byte from that particular Receiver. The byte count uniquely identifies the particular byte being processed by the Message Checker 34.
- the Sequencer 126 will transfer the Node identification code and the byte count to a Data Multiplexer 128 to tag the message byte as it is transferred to the Fault Tolerator 36.
- the Node identification code and the byte count are also transmitted to an Error Check Logic 130 and a Context Storage 132.
- the Error Check Logic 130 will check the Node identification code expected by the Sequencer 126 with the Node identification code contained in the first byte of the message being checked to determine if they are the same. When they are different, the Error Check Logic 130 will generate an error signal which is recorded in an error status byte being generated in the Context Storage 132.
- the Node identification code is also used as an address into the Context Storage 132 where the relevant information pertaining to the message being processed is stored.
- the Context Storage 132 has a separate storage location for each Node 10 in the system which is addressed by the Node identification code contained in the message.
- the Context Storage 132 stores the message type (MT) code, the data identification (DID) code, the byte count, an error status byte, a data value mask, and an intermediate error signal for each message as it is being processed.
- the information in the Context Storage 132 will be used by an Address Generator 134 along with the message type (MT) code, the data identification (DID) code, and the byte count which identifies the specific byte to be processed.
- the Address Generator 134 will output an address where the required processing information is stored in a Message Checker ROM 136.
- the Message Checker ROM 136 stores the maximum and minimum values for the data contained in the message, the valid data identification numbers for each message type, and a data mask which identifies how many data values are contained in the message being processed and the number of bytes in each data value.
- the maximum and minimum data values are transmitted to a Between Limits Checker 138 which will check the data contained in each data byte against these maximum and minimum values.
- the Between Limits Checker 138 will generate four different error signals as a result of the between limits checks. The first two are the maximum value (MXER) and minimum value (MNER) error signals, signifying the data value exceeded the maximum value or was less than the minimum value.
- the other two error signals are the equal to maximum value (MXEQ) and equal to minimum value (MNEQ) error signals.
- MXEQ equal to maximum value
- MNEQ equal to minimum value
- the Error Check Logic 130 will OR the vertical parity code and the longitudinal redundancy code error signals generated by the Receiver and generate a parity error signal which is recorded in the error status byte being generated in the Context Storage 132. As previously described, the Error Check Logic 130 will check the expected Node identification (NID) code against the Node identification code contained in the first byte of the message and will check the message type (MT) code by checking to see if the bits in bit positions 1, 3, and 4 of the first byte are identical. As previously described in the detailed description of the Transmitter 30, the middle bit of the 3-bit message type code is repeated in bit positions 3 and 4 for message type error detection.
- NID Node identification
- MT message type
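The triplicated message type bit check can be sketched as below. The exact bit numbering is an assumption (bit 0 taken as least significant, with the middle MT bit at position 1 and its copies at positions 3 and 4), since the passage names the positions but not the numbering convention.

```python
# Redundancy check on the message type code: the middle MT bit and its
# two copies in the first byte must agree, else a message type error.
def mt_code_error(first_byte):
    """True when bits 1, 3 and 4 of the first byte are not identical."""
    b1 = (first_byte >> 1) & 1
    b3 = (first_byte >> 3) & 1
    b4 = (first_byte >> 4) & 1
    return not (b1 == b3 == b4)
```

A single bit flip in any of the three positions makes the copies disagree, which is exactly the fault this triplication is meant to expose.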
- the Error Check Logic 130 will also check the validity of the data identification (DID) code contained in the second byte of the message against the maximum value for a (DID) code received from the Message Checker ROM 136 and will generate an error signal if the data identification code has a value greater than the maximum value.
- the Error Check Logic 130 will further check the two's complement range of the appropriate data byte and generate a range error (RNGER) signal when a two's complement range error is detected. It will also record in the Context Storage 132 the maximum (MXER) and the minimum (MNER) error signals generated by the Between Limits Checker 138.
- often it can be determined from the first byte of a multi-byte data value if the data value is within or outside the maximum or minimum values received from the Message Checker ROM 136, and checking of the remaining bytes is no longer necessary. However, when the Between Limits Checker 138 generates a MXEQ or MNEQ signal signifying that the data value of the byte being checked is equal to either the maximum or minimum limit value, it will be necessary to check the next byte against a maximum or a minimum value to make a factual determination of whether or not the received data value is within or outside the predetermined limits.
- the Error Check Logic 130, in response to an MXEQ or an MNEQ signal from the Between Limits Checker 138, will store in the Context Storage 132 an intermediate value signal which signifies to the Context Storage 132 that the between limits check is to be continued on the next byte containing that data value. This process will be repeated with the next subsequent byte if necessary to make a final determination. During the checking of the next byte of the particular data value, the Context Storage 132 will supply to the Error Check Logic 130 the stored intermediate value which identifies to which limit, maximum or minimum, the data value of the preceding data byte was equal. From this information, the existence or non-existence of a between the limits error can readily be determined by relatively simple logic as shown on Figure 12.
- a Decoder 140 responsive to the intermediate value stored in the Context Storage 132 will enable AND gates 142 and 144 if the preceding between limits check generated a signal signifying the data value contained in the preceding byte was equal to the maximum value.
- otherwise the intermediate value will enable AND gates 146 and 148, signifying that the data value contained in the preceding byte was equal to the minimum value. If on the second byte the Between Limits Checker 138 detects a maximum limit error (MXER) and AND gate 142 is enabled, the maximum limit error (MXER) will be recorded in the error status byte being generated in the Context Storage 132.
- MXER maximum limit error
- MNER minimum limit error
- MNEQ equal to minimum value
- the data value mask received from the Message Checker ROM 136 identifies the number of individual data values that are in the Data Value message being processed and which data bytes belong to each data value. This mask is used by the Error Check Logic 130 to identify the last byte in each data value. On the last byte of any data value, only maximum or minimum limit errors will be recorded in the Context Storage error status byte; the MXEQ and MNEQ signals will be ignored.
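The byte-by-byte limit comparison with its MXEQ/MNEQ continuation can be sketched in software as follows. Bytes are assumed to arrive most significant first, which is what makes the deferred comparison on equality correct; the limit bytes stand in for those read from the Message Checker ROM.

```python
# Multi-byte between limits check: equality with a limit byte defers the
# decision to the next byte (the stored intermediate value); equality on
# the final byte is not an error.
def between_limits(value_bytes, max_bytes, min_bytes):
    """Return 'MXER', 'MNER', or None for one multi-byte data value."""
    check_max = check_min = True       # intermediate MXEQ / MNEQ state
    for v, hi, lo in zip(value_bytes, max_bytes, min_bytes):
        if check_max:
            if v > hi:
                return "MXER"          # above the maximum limit
            check_max = (v == hi)      # equal: continue on the next byte
        if check_min:
            if v < lo:
                return "MNER"          # below the minimum limit
            check_min = (v == lo)
    return None                        # within the predetermined limits
```

Once a byte is strictly inside both limits, the remaining bytes need no checking, mirroring the observation that the first byte often decides the comparison.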
- the Error Check Logic 130 will also detect if the message contained the correct number of bytes.
- the Context Storage 132 stores the message type (MT) code for each message being processed.
- the Error Check Logic 130 will decode the message type code stored in the Context Storage 132 and generate a number corresponding to the number of bytes that type of message should have. It will then compare this number with the byte count generated by the Sequencer 126 prior to receiving a new message signal from the Receiver 32 and will generate a message length error (LENER) signal when they are not the same.
- LENER message length error
- since the message length error signal may not be generated until after the error status byte has been sent to the Fault Tolerator 36, the message length error signal will be passed to the Fault Tolerator 36 in the error status byte for the next message received from that Node.
- the format of the error status byte formed in the Context Storage 132 is shown in Figure 13.
- the error status byte contains a flag bit for the parity error (PARER), a flag bit for the length error (LENER) of the preceding message, a flag bit for the Node identification (NID) error, a flag bit for the data identification (DID) error, a flag bit for the message type (MT) error, a flag bit for the two's complement range error (RNGER), and flag bits for the maximum and minimum limit (MXER and MNER) errors.
- PARER parity error
- NID Node identification
- DID data identification
- MT message type
- RNGER two's complement range error
- MXER and MNER maximum and minimum limit
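The error status byte of Figure 13 can be modeled as packed flag bits. The particular bit positions used below are an assumption for illustration; the text lists the eight flags but this section does not fix their ordering within the byte.

```python
# Error status byte as a set of named flags packed into eight bits.
# The ordering of ERROR_FLAGS (LSB first) is assumed, not from the patent.
ERROR_FLAGS = ["PARER", "LENER", "NID", "DID", "MT", "RNGER", "MXER", "MNER"]

def pack_error_status(errors):
    """Pack a set of flag names into one 8-bit error status byte."""
    byte = 0
    for position, name in enumerate(ERROR_FLAGS):
        if name in errors:
            byte |= 1 << position
    return byte
```

An all-zero byte (length error excepted) is what downstream logic treats as an error free message.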
- the Data Multiplexer 128 transmits each message byte directly to the Fault Tolerator 36 as it is processed by the Message Checker 34.
- the Data Multiplexer will append to each message byte a descriptor byte which contains the Node identification code (NID) and the byte count (BYTC) received from the Sequencer 126 for that particular byte of the message.
- NID Node identification code
- BYTC byte count
- the Data Multiplexer 128 will transmit the error status byte stored in the Context Storage 132 as the last byte.
- the last byte is identified by a byte count of "15" so that it can readily be identified by the Fault Tolerator 36 for fault analysis.
- the Synchronizer 46 will report to the Error Handler 164 through the Synchronizer Interface 152 any errors it has detected in the Task Interactive Consistency (TIC) and System State (SS) messages.
- the Scheduler Interface 154 will forward to the Scheduler 40 the task identification (TID) code of the task started and the Node identity (NID) of each received Task Completed/Started message. In return, the Scheduler 40 will transmit to the Error Handler 164 through the Scheduler Interface 154 any errors it has detected.
- TID task identification
- NID Node identity
- the Transmitter Interface 156 will forward to the Transmitter 30 the Base Penalty Count and Error messages generated by the Error Handler 164. As previously described, the Transmitter Interface 156 will load the first byte of the message to be transferred into the Transmitter's Input Register to signify it has a message ready for transmission. It will then await the reassertion of the buffer available (BAB) signal by the Transmitter 30 before forwarding the remainder of the message to the Transmitter 30 for transmission.
- BAB buffer available
- a Reset Generator 157 is responsive to a reset signal generated by the Error Handler 164 when it determines its own Node is faulty and to a power on reset (POR) signal generated when electrical power is first applied to the Node to generate an Operations Controller reset (OCRES) signal and an initial parameter load (IPL) signal which are transmitted to the other subsystems, effecting a reset of the Operations Controller 12.
- POR power on reset
- OCRES Operations Controller reset
- IPL initial parameter load
- the Fault Tolerator RAM Interface 160 will store in a Fault Tolerator RAM 162 the data contained in the message bytes as they are received from the Message Checker Interface 150.
- the Fault Tolerator RAM 162 is a random access memory partitioned as shown in Figure 15.
- a message partition section 166 stores in predetermined locations the messages received from each Node. In the message partition section 166 the messages are reassembled to their original format using the descriptor byte appended to the message bytes by the Message Checker 34.
- a double buffering or double partitioning scheme is used to prevent overwriting of the data that is still being used by the Voter 38.
- a context bit generated by the Message Checker Interface 150 determines into which of the two partitions the new data is to be written. Separate context bits are kept for each Node and are toggled only when the error status byte indicates the current message is error free. As previously discussed relative to the Message Checker 34, the message length error (LENER) bit of the error status byte signifies that the preceding message had a message length error and, therefore, is ignored in the determination of an error free condition for the current message.
- LENER message length error
- the format for a single message in the message partition section 166 is illustrated in Figure 16. As shown, the message is reconstructed in its original format in the Fault Tolerator RAM 162 using the Node identification (NID) code and the byte count appended to each message byte in the Message Checker as a portion of the address.
- NID Node identification
- the context bit generated by the Message Checker Interface 150, along with the message partition code (bits 8 through 11) generated by the Fault Tolerator RAM Interface 160, completes the address and identifies in which of the two locations in the message partition section 166 the message from each Node is to be stored.
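The double-buffered addressing just described can be sketched as a small bit-field composition. The field widths are assumptions consistent with details given elsewhere in this section (a 3-bit NID and a byte count of 0 through 15); the actual address layout of the Fault Tolerator RAM is not fixed here.

```python
# Partition address (PID) sketch: the context bit selects one of the two
# double-buffered slots for a Node, the NID selects the Node, and the
# byte count places the byte within the reassembled message.
def partition_address(context_bit, nid, byte_count):
    """Compose [context | 3-bit NID | 4-bit byte count]; widths assumed."""
    return (context_bit << 7) | ((nid & 0b111) << 4) | (byte_count & 0b1111)
```

Because only the context bit differs between the two copies of a Node's message, toggling it on an error free message atomically switches which copy the Voter reads and which copy new bytes overwrite.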
- the Fault Tolerator RAM 162 has three sections used by the Error Handler 164 for generating the Base Penalty Count and Error messages.
- An error code file section 170 stores the error codes used to generate the Error messages transmitted immediately after the beginning of each Atomic period and to generate the increment penalty count which is included in the Error message.
- the formatted error codes, as shown on Figure 17, include an identification of the subsystem which reported the error plus a flag indication of the errors detected.
- the error status byte received from the Message Checker 34 is formatted into two separate error codes.
- the first error code contains the subsystem code 0000 which reported the errors and the error flags from the four least significant bits of the error status byte.
- the second error code contains the subsystem code 0001 and the error flags from the four most significant bits of the error status byte.
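The nibble split described above can be sketched directly: the error status byte becomes two formatted error codes, subsystem code 0000 carrying the low four flag bits and subsystem code 0001 carrying the high four.

```python
# Split one 8-bit error status byte into two formatted error codes,
# each a 4-bit subsystem code followed by four error flag bits.
def split_error_status(status_byte):
    low = (0b0000 << 4) | (status_byte & 0x0F)          # subsystem 0000
    high = (0b0001 << 4) | ((status_byte >> 4) & 0x0F)  # subsystem 0001
    return low, high
```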
- error codes are stored in the error code file section 170 at an address defined by the faulty Node's identification (NID) code and report number as shown in Figure 19.
- the error code file section 170 is double partitioned the same as the message partition section 166 so that two error files are stored for each Node.
- the context bit generated by the Message Checker Interface 150 identifies in which of the two error files for that Node the error code will be reported.
- Each error code is used to address a group mapping section 168 of the Fault Tolerator RAM 162.
- the error code addresses a penalty weight pointer, as shown in Figure 18, which addresses a penalty weight section 172 of the Fault Tolerator RAM.
- the penalty weight pointer addresses a specific penalty weight which is assigned to the specific combination of reported errors contained in the formatted error code.
- the penalty weights resulting from each error code stored in the error file for that Node are summed in the Error Handler 164 and appended to the Error message as an increment penalty count (byte-8) for that Node.
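The two-level table lookup and summation above can be sketched as follows. The table contents are illustrative stand-ins: the real group mapping and penalty weight sections live in the Fault Tolerator RAM and are programmed per error combination.

```python
# Increment penalty count sketch: each error code in a Node's error file
# maps through the group mapping section to a penalty weight pointer,
# which selects a penalty weight; the weights are summed.
GROUP_MAPPING = {0x01: 0, 0x12: 1}   # error code -> penalty weight pointer
PENALTY_WEIGHTS = [5, 20]            # pointer -> penalty weight

def increment_penalty_count(error_file):
    """Sum the penalty weights for every error code filed for one Node."""
    return sum(PENALTY_WEIGHTS[GROUP_MAPPING[code]] for code in error_file)
```

The indirection through the pointer table is what lets many error combinations share one programmable weight.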
- the Error Handler 164 will generate only one Error message in each Atomic period for each Node which transmitted a message which contained an error.
- the Fault Tolerator RAM 162 will also store the deviance limits for the one byte (MT0), two byte (MT1), and four byte (MT3 and MT4) Data Value messages in four separate sections, 174, 176, 178 and 180, which are used by the Voter 38, as shall be explained with reference to the Voter hereinafter.
- a Store Message Module 182 receives the message bytes di rectly from the Message Checker 34 and stores them in the message partition section 166 of the Fault Tolerator RAM 162.
- the Store Message Module 182 will add the context bit stored in the Message Checker Interface Context Store 190 to the descriptor (NID plus byte count) appended to the message byte by the Message Checker 34 to generate a partition address (PID).
- the partition address identifies the location in the message partition section 166 where the particular message byte is to be stored.
- each Node will first transmit a Base Penalty Count message fol lowed by a Task Completed/Started message.
- the Store Message Module 182 stores for each Node a first flag signifying the receipt of the Base Penalty Count message and a second flag signifying the receipt of the subsequent Task Completed/Started message. These flags are set to false at the beginning of each Master period and are reset to true when the Base Penalty Count and the Task Completed/Started messages are received for that Node.
- the Voter Interface Buffer 184 is an 8 × 7 first in-first out buffer in which the four most significant bits are the four most significant bits of the partition address (context bit plus NID) for the received message in the message partition section 166 of the Fault Tolerator RAM 162. The remaining three bits are the message type code contained in the first byte of the message.
- An Error Status Byte Detector 186 listens to the messages being transmitted from the Message Checker 34 to the Fault Tolerator 36 and will detect the receipt of each error status byte (byte 15) generated by the Message Checker 34. If the contents of the error status byte, with the exception of the length error (LENER) bit, are all zeros, the Error Status Byte Detector 186 will enable the Message Checker Interface Context Storage 190 to load the Voter Interface Buffer 184 through the Store Message Module 182, or to load a Task Completed Register 202, or to load a Branch Condition Register 200 as required.
- LENER length error
- the Error Status Byte Detector 186 will load each non-zero error status byte in an Error Status Buffer 188 for subsequent processing by the Error Handler 164.
- the Error Status Buffer 188 is an 8 × 12 first in-first out buffer in which the most significant bit is a self-test flag, the next three bits are the Node's identification (NID) code, and the remaining 8 bits are the received error status byte.
- the Message Checker Interface Context Storage 190 temporarily stores for each Node the information contained in Table III. This information is temporarily stored since it is not known if the message is error free until the error status byte is received.
- the most significant bit, bit 13 signifies that the received message is a Task Interactive Consistency (TIC) message which is processed by the Synchronizer 46.
- the twelfth bit is the partition context bit which identifies in which partition of the message partition section 166 the message will be stored.
- the context bit is toggled when the Error Status Byte Detector 186 indicates the prior message was error free. If the message is not error free, the context bit is not toggled and the next message received from that Node is written over the prior message in the Fault Tolerator RAM 162.
- the message type code bits are received directly from the first byte of the message.
- the branch condition bit, bit-8, is received from a Branch Condition Detector 194 which detects the branch condition contained in the fourth byte of the Task Completed/Started (MT6) message.
- the identification of the started task (TID) is obtained from a Task Started Detector 196 which loads the TID of the started task into the seven least significant bit locations of the Message Checker Interface Context Storage 190.
- upon the receipt of an error status byte which signifies that the received message was error free, and if the message is not a Task Interactive Consistency message, the Message Checker Interface Context Storage 190 will transfer the context bit and the message type to the Store Message Module 182.
- the context bit is added to the Node identification (NID) code to form the starting partition (PID) address of that message in the Fault Tolerator RAM 162.
- NID Node identification
- PID starting partition
- the message type code is appended to the partition address and they are transferred to the Voter Interface Buffer 184 for subsequent use by the Voter 38 to extract the data necessary for the voting process.
- upon the receipt of an error status byte signifying the receipt of an error free Task Completed/Started (MT6) message, the Message Checker Interface Context Storage 190 will transfer the task identification (TID) code of the started task and the Node identification (NID) code to a Scheduler Interface Buffer 198 where it is transferred to the Scheduler 40 when requested.
- the Scheduler Interface Buffer 198 is an 8 × 11 bit first in-first out buffer which is reset at the end of the soft error window (SEW).
- SEW soft error window
- the soft error window is generated by the Synchronizer 46 and defines a period of time bracketing the end of each Subatomic period during which the time critical messages from other Nodes should be received if they are in synchronization with each other.
- the Message Checker Interface Context Storage 190 will transfer the stored branch condition (BC) bit to the Branch Condition Register 200 and transfer the Node identification (NID) code of the Node that sent the message to the Task Completed Register 202.
- BC branch condition
- NID node identification
- These registers are read by the Synchronizer Interface 152 when requested by the Synchronizer 46.
- the Branch Condition Register 200 and the Task Completed Registers 202 are double buffered with a different set of registers being reset at the end of each hard error window (HEW) signal.
- the hard error window signal is generated by the Synchronizer 46 and brackets the soft error window (SEW) at the end of each Subatomic period and defines the maximum deviance in the arrival time of the time critical messages from the other Nodes. The function of the hard error window (HEW) and soft error window (SEW) will be discussed in greater detail in the detailed description of the Synchronizer 46.
- the Error Handler includes an Error Filer 204, an Error Consistency Checker 206, an Error Message Generator 208, and an Error Handler Context Store 210.
- the Error Filer 204 polls the Message Checker Interface 150, the Synchronizer Interface 152, the Scheduler Interface 154, and the Voter Interface 158 for error reports from the various subsystems within the Operations Controller.
- the Error Filer will format the received error reports into a formatted error code, as shown on Figure 17, and tag them with an error file address, as shown on Figure 19.
- the error file address is a 3-bit error file identification code, a context bit which is the one generated by the Message Checker Interface 150 for filing the message in the message partition of the Fault Tolerator RAM 162, the Node identification (NID) code, and a report number.
- NID Node identification
- the formatted error code contains a 4-bit code which identifies the subsystem which detected the error and four flag bits identifying the errors detected.
- the Error Filer 204 will pass these formatted error codes to the Fault Tolerator RAM Interface 160 which will store them in the error code file section 170 of the Fault Tolerator RAM 162.
- the Error Filer 204 will also forward the number of error reports written to the Error Handler Context Store 210 so that the Error Message Generator 208 will be able to determine how many error reports to process from the Fault Tolerator RAM 162.
- the Error Filer 204 will also detect the self-test flag generated by the Message Checker 34 and forward this flag to the Error Message Generator 208.
- the self-test flag is part of one of the group codes whose penalty weight is programmed to be zero or a very small value.
- the self-test error message will identify all of the errors detected and will include the Incremental and Base Penalty Count.
- the Error Consistency Checker 206 is responsible for consistent handling of the error reports and the base penalty counts for each Node in the system. A form of implicit interactive consistency is used to achieve this goal.
- the Error Consistency Checker 206 receives through the Voter Interface 158 a voted base penalty count (VBPC) which is generated by the Voter 38 in response to the Base Penalty Count messages received from all the Nodes in the system including its own.
- VBPC voted base penalty count
- these voted base penalty counts are stored in a Base Penalty Count Store 212 as the base penalty counts for each Node, independent of the base penalty count values stored for the preceding Master period.
- the Base Penalty Count Store 212 also receives a voted increment penalty count (VIPC) which is generated by the Voter 38 from the error messages received from all of the Nodes including its own.
- the voted increment penalty count (VIPC) is added to the base penalty count of the accused Node when the error is verified by a Validity Checker 218.
- the Validity Checker 218 is embodied in the Voter 38, but may be part of the Error Consistency Checker 206 as shown in Figure 23.
- the Error Consistency Checker 206 also maintains a Current System State Register 214 which stores a voted current system state (CSS) vector and a Next System State Register 216 which stores a next system state (NSS) vector.
- the current system state vector identifies which Nodes are currently active in the system and which are excluded, while the next system state vector identifies which Nodes are to be included and/or which are to be excluded in the next systems state of the system.
- the system will change its state at the beginning of the next Master period if the voted next system state vector is different from the current system state vector.
- the current and next system state vectors have 8 flag bits, one for each Node, which are set when the Node is excluded and which are reset when the Node is readmitted to the operating set of Nodes.
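As a sketch of this flag convention, the state vectors can be treated as 8-bit masks, one flag bit per Node. The helper names and the 0-7 node numbering are illustrative assumptions, not from the patent:

```python
# Each vector is an 8-bit mask: a set bit marks an excluded Node,
# a clear bit an active one, per the convention described above.

def exclude(vector: int, nid: int) -> int:
    """Set the flag bit: mark Node `nid` as excluded."""
    return vector | (1 << nid)

def readmit(vector: int, nid: int) -> int:
    """Reset the flag bit: readmit Node `nid` to the operating set."""
    return vector & ~(1 << nid) & 0xFF
```

The system compares such masks at the end of each Master period; a difference between the current and voted next vectors is what triggers a reconfiguration.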
- Table IV is a list of twenty-five fault detection mechanisms used in the system.
- This table lists the error, the subsystem which detects the error, and whether the detection of the error is symmetric (S) or asymmetric (A). Since the system is symmetric in its structure, most of the errors contained in the messages transmitted to each other should be detected by every other Node. Therefore, every Node should generate an error message which identifies the error detected and the incremental penalty counts to be charged against the Node that made the error. These errors which are detected by all of the Nodes are called symmetric errors. Therefore, the existence of symmetric errors should be verified by at least a majority of the active Nodes in the system. There also is the case where channel noise occurs so that an error manifests itself differently among the receiving Nodes. In this case, the majority of the Nodes will agree which Node is faulty.
- the error or errors detected may be different for each Node and the incremental penalty count reported in the various error messages may likewise be different.
- a median vote on the incremental penalty count will be used to increment the base penalty count for that Node.
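A median vote over the reported incremental penalty counts might look like the following sketch. The names are illustrative; the patent's Voter actually derives its value bit-serially from the upper and lower medial values, and the truncating average for an even number of reports is an assumption:

```python
def median_vote(counts):
    """Median of the incremental penalty counts reported against one
    accused Node.  An even number of reports resolves to the truncated
    mean of the two middle values, mirroring the way the Voter averages
    the upper and lower medial values."""
    ordered = sorted(counts)
    mid = len(ordered) // 2
    if len(ordered) % 2:          # odd number of reports: middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) // 2
```

The voted result is then the single increment applied to the accused Node's base penalty count.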
- the Validity Checker 218 will not generate a deviance error report to the Error Filer 204 identifying those Nodes whose incremental penalty counts differed from the voted incremental penalty count by more than the allowed amount.
- This is to prevent the unjust penalizing of a healthy Node.
- the Majority Agree Detector 224 receives a "missing vector" from the Voter 38 indicating from which Nodes the Base Penalty Count or Error messages were not received and a "deviance vector" identifying each Node whose reported value deviated from the voted value by more than the allowed tolerance.
- the Majority Agree Detector 224 also receives a current system state vector from the Current System State Register 214.
- the Majority Agree Detector 224 will subtract the "deviance vector" and the "missing vector" from the current system state vector to generate a number corresponding to the number of Nodes which agree with the voted value. This number is then compared with the number of Nodes currently active in the system identified by the current system state vector.
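In software terms, the detector's arithmetic can be sketched as follows. The bit conventions are assumptions consistent with the vectors described above: a set bit in the state vector marks an excluded Node, and a set bit in the deviance or missing vector marks a disagreeing or silent Node:

```python
def popcount(v: int) -> int:
    """Number of set bits in an 8-bit mask."""
    return bin(v & 0xFF).count("1")

def majority_agrees(css: int, deviance: int, missing: int) -> bool:
    """True when the Nodes that supplied a value within tolerance form
    a majority of the currently active Nodes."""
    active = ~css & 0xFF                          # clear CSS bit = active
    agree = active & ~deviance & ~missing & 0xFF  # "subtract" the vectors
    return 2 * popcount(agree) > popcount(active)
```

When this test fails, the detector raises the write inhibit signal described below.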
- if the message is a Base Penalty Count message, the voted base penalty counts are stored in the Base Penalty Count Store 212. Otherwise, if the message is an Error message, the base penalty count stored in the Base Penalty Count Store 212 is incremented by the voted increment penalty count.
- when fewer than a majority of the active Nodes agree with the voted value, the Majority Agree Detector will generate a write inhibit signal which is applied to the Base Penalty Count Store 212 through an AND gate 234. This write inhibit signal will inhibit the writing of the voted values in the Base Penalty Count Store 212 provided the reported error or errors are not asymmetric errors.
- the Asymmetric Error Detector 226 receives the deviance vector, the missing vector, and the current system state vector, and generates a deviance report inhibit signal when a majority of the Nodes send error messages identifying a particular Node as faulty but they disagree as to the incremental penalty counts to be charged against the faulty Node.
- the Asymmetric Error Detector will interrogate the Error Handler Context Store 210 and will generate the deviance report inhibit signal when the detected errors are determined to be asymmetric errors of the type identified in Table IV.
- the deviance report inhibit signal will inhibit the Error Reporter 230 from reporting to the Error Filer 204 a deviance error for any Node which sent an error message containing an incremental penalty count which deviated from the voted incremental penalty count by more than the permitted tolerance.
- the deviance report inhibit signal is also applied to an inverted (negative) input of the AND gate 234.
- the deviance report inhibit signal will disable the AND gate 234 and block the write inhibit signal generated by the Majority Agree Detector 224. This will enable the voted incremented penalty count to be added to the base penalty count stored in the Base Penalty Count Store 212.
- the Error Reporter 230 receives the missing and deviance vectors from the Voter 38, the current system state (CSS) vector from the Current System State Register 214, the error report inhibit signal from the Asymmetric Error Detector 226, and the write inhibit signal from the output of the AND gate 234. In response to the absence of a write inhibit signal, the Error Reporter 230 will report to the Error Filer 204 the Nodes identified in the deviance vector as having deviance errors; it will also report, in response to the missing vector, each Node which did not send a Base Penalty Count or Error message as required.
- in response to a write inhibit signal and the absence of an error report inhibit signal from the Asymmetric Error Detector 226, the Error Reporter 230 will report each Node having reported an unsupported error. No deviance errors are reported for these unsupported Error messages. Finally, in response to an error report inhibit signal from the Asymmetric Error Detector 226, the Error Reporter 230 will report to the Error Filer 204 any Node which fails to report the asymmetric error as identified by the missing vector. As previously described, the Error Reporter 230 will not report any deviance errors in the presence of a deviance report inhibit signal from the Asymmetric Error Detector.
- the setting of the exclusion flag signifies that in the next System State the Fault Tolerator has determined that the Node whose exclusion flag was set should be excluded from the operating set.
- the current (CSS) and next (NSS) System State vectors are transferred to the Synchronizer 46 and are included in the next System State (MT5) message as the current system state and the new system state vectors, respectively.
- the new system state is globally verified by the Voter 38 upon the receipt of the System State messages from all of the participating Nodes in the system.
- the Error Consistency Checker 206 has a Base Penalty Count Decrementor 222 which will decrement the Base Penalty Count for each Node by a predetermined quantity at the end of each Master period. After decrementing the base penalty count for each Node, the Base Penalty Count Decrementor 222 will enable the Exclude/Readmit Threshold Comparator 220 to compare the decremented base penalty count of each excluded Node with a predetermined readmittance value.
- the Exclude/Readmit Threshold Comparator 220 will reset the flag in the Next System State Register 216 for each previously excluded Node whose decremented base penalty count is less than the readmittance threshold value. This permits a Node to be readmitted to the operating set the next time the system is reconfigured, since its operation has been error free for an extended period of time. This error-free operation indicates that the original fault was transient or had been corrected (repaired or replaced).
- the readmittance threshold value is less than the exclusion threshold value to prevent the system from oscillating between two different system states if a Node has an intermittent fault which causes its base penalty count to fluctuate about the exclusion threshold value.
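The decrement-and-readmit pass at the end of each Master period can be sketched as follows. The threshold and decrement values are illustrative assumptions (the patent leaves them programmable), as are all names:

```python
EXCLUSION_THRESHOLD = 16  # assumed value for illustration
READMIT_THRESHOLD = 4     # deliberately lower, giving hysteresis
DECREMENT = 1             # assumed per-Master-period decay

def end_of_master_period(base_counts, excluded):
    """Decay every Node's base penalty count, then clear the exclusion
    flag of any excluded Node that has decayed below the readmittance
    value.  base_counts and excluded are keyed by Node id."""
    for nid in base_counts:
        base_counts[nid] = max(0, base_counts[nid] - DECREMENT)
        if excluded.get(nid) and base_counts[nid] < READMIT_THRESHOLD:
            excluded[nid] = False  # readmitted at next reconfiguration
    return base_counts, excluded
```

Because READMIT_THRESHOLD sits below EXCLUSION_THRESHOLD, a Node hovering near the exclusion value cannot flip in and out of the operating set on successive Master periods.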
- the Base Penalty Count Store 212, the Current System State Register 214 and the Next System State Register 216 are incorporated in the Error Handler Context Store 210 but may be independent elements in the Error Consistency Checker 206, as shown in Figure 23.
- the Exclude/Readmit Threshold Comparator 220 will also detect the exclusion of its own Node and generate a RESET signal which activates a Reset Generator 157, shown in Figure 22, to generate an Operations Controller Reset signal (OCRES) and an Initial Parameter Load signal (IPL) which will cause the Operations Controller to reset and reload the initial parameters as previously described.
- the Reset Generator 157 is also responsive to the Power On Reset (POR) signal to generate the OCRES and IPL signals each time the electrical power to the Operations Controller is turned on.
- the Error Message Generator 208 will generate, during each Atomic period, an Error Message for each Node which generated a message containing an error detected by its own Operations Controller.
- the Error Message Generator 208 will also generate a Base Penalty Count Message at the beginning of each Master period. These messages are transmitted to the Transmitter 30 through the Transmitter Interface 156.
- the Error Message Generator 208 will set to zero (0) the Increment Penalty Count for each Node. It will then check the error code file section 170 of the Fault Tolerator RAM 162 for any error reports. The error code of each error report is used to address the Group Mapping Section 168 to obtain the pointer to the penalty weight section 172 of the Fault Tolerator RAM 162 to extract a penalty weight. This penalty weight is stored as the Increment Penalty Count for the faulty Node in the Error Handler Context Store 210 and is used to increment the base penalty count currently being stored for that Node.
- This process is repeated for each reported error for each Node until the Fault Tolerator receives a System State message signifying the end of the Atomic period for each individual Node.
- the increment penalty count and base penalty count for that Node are frozen.
- the Message Generator will then, using the content of the error code file section 170 of the Fault Tolerator RAM 162, the stored increment penalty counts, and the base penalty counts stored in the Error Handler Context Store 210, construct an error message for each Node for which a fault was detected.
- the base penalty count and the increment penalty count will be frozen by sensing the High End Of Fuzzy (HEOF) signal generated by the Node's own Synchronizer 46, which signifies that all valid System State messages should have been received. This prevents the transmission of the error and base penalty count messages from being hung up while waiting for a missing System State message.
- HEOF High End Of Fuzzy
- the format of the four (4) error bytes of the Error message (bytes 3 through 7) is shown in Figure 25.
- the most significant bit of the first Byte is a self-test flag which identifies that the reported errors were detected in a Self-Test message.
- the increment penalty count Byte 8 will have a small value or be zero.
- the Error Message Generator 208 will retrieve the base penalty counts currently stored for each Node and will generate a Base Penalty Count message which is the first message transmitted after the System State message which is sent by the Synchronizer 46 at the end of the last Atomic period in each Master period.
- the Transmitter's Message Arbitrator 56 will poll the Fault Tolerator Interface 52 after it sends a System State message at the end of the Master period, and then wait for the Base Penalty Count message generated by the Fault Tolerator 36.
- Figure 26 depicts the sequence of operations of the Operations Controller which results in a reconfiguration of the system and the role played by the Fault Tolerator 36.
- (a) signifies that each Node will broadcast its Base Penalty Count message and reset all of the increment penalty counts in its Error Handler Context Store to zero.
- the Error Handler 164 will then begin the processing of any errors detected by its own Operations Controller's error detection mechanisms.
- at (b), in which a message containing an error occurs, each Node will broadcast Error messages identifying the Node which sent the message for which the errors were detected, together with the increment penalty count and the base penalty count for that Node.
- the Fault Tolerator will process the received Error messages, detect any unsupported Error messages from other Nodes, and perform a median vote on the increment penalty count for the accused Node whose reported error or errors are supported by a majority of the Nodes. This median increment penalty count is then added to the base penalty count of the accused Node. The incremented base penalty count is then compared with the exclusion threshold. If the incremented base penalty count exceeds the exclusion threshold, the exclusion bit for that Node is set in the next System State Vector which is passed to the Synchronizer 46.
- the Synchronizer 46 will include the next System State Vector in the System State message which is broadcast to all of the other Nodes.
- the Fault Tolerator 36 will verify the correctness of the next System State by using a median vote of the healthy Nodes and pass this information to the Synchronizer 46 and to the Scheduler 40.
- upon receipt of this information (g), the Synchronizer 46 and the Scheduler 40 will initiate a reconfiguration process in which the System State identified in the voted next System State Vector becomes the current System State for the system. After the reconfiguration is completed (h), the system will begin a new Master period in the new System State.
- the Fault Tolerator operation is no different if more than one fault is detected for any one Node and more than one Node is accused of being faulty in the received Error messages. This sequence can be overlaid if successive failures occur in different Atomic periods.
- the Voter 38 performs two primary functions in the processing of the data. First, it generates a voted value for all available copies of the data and, second, it performs a deviance check to determine if the data value of each copy of the data is within a predetermined tolerance or allowed deviance. Referring to Figure 27, the Voter 38 has a Loader 236 which receives from the Fault Tolerator 36 the message type (MT) code, the node identification (NID) code, and the data identification (DID) code for each message for which a voted value is to be generated.
- MT message type
- NID node identification
- DID data identification
- the Loader 236 will retrieve and temporarily store each copy of the data currently available in the Fault Tolerator RAM 162, which has the same data identification (DID) code as the received message.
- the Loader 236 will also, using the message type code, retrieve from the deviance sections, 174 through 180, of the Fault Tolerator RAM 162 the predetermined deviances for that particular message.
- the Loader 236 will first transmit the most significant bit of each copy of the data in parallel to an Upper Medial Value Sorter 238 and a Lower Medial Value Sorter 240 which will, respectively, sort the received bits to generate an upper (Un) and a lower (Ln) medial bit value. These upper and lower medial bit values (Un and Ln) are transferred, as they are generated, to an Averaging Circuit 242 and a Deviance Checker 244.
- the Loader 236 will generate a missing vector (MV) identifying each Node which did not send a Task Interactive Consistency or System State message.
- the Loader 236 will also generate a missing vector at the end of each Atomic period identifying each Node which did not generate an Error Message or a Base Penalty Count message.
- the Averaging Circuit 242 adds the upper and lower median bit values and divides the sum by two to produce a voted average.
- the Deviance Checker 244 receives the upper (Un) and the lower (Ln) medial bit values, the deviance values retrieved from the Fault Tolerator RAM 162, and the corresponding data bit from each copy of the data being processed, and will determine for each bit in each copy of the data value whether or not it is within the allowed deviance. This process is repeated for each bit in each copy, from the most significant bit to the least significant bit.
- a deviance error (DERR) vector is sent to the Fault Tolerator 36 identifying each Node whose message contained a deviance error.
- the voted data values generated by the Averaging Circuit 242 for the Data Value messages (MT0, MT1, MT2, and MT3) are transmitted to a Voter-Task Communicator Interface 246 which passes them to the Task Communicator 44 along with the data identification (DID) code for that data value.
- the voted values for the base penalty counts contained in the Base Penalty Count messages, the voted values for the current and new System State Vectors contained in the System State messages and the voted values for the incremental and base penalty counts in the Error messages are transmitted to a Voter-Fault Tolerator Interface 248 where they are passed to the Fault Tolerator 36 along with the deviance error (DERR) and the missing vector, as previously described.
- DERR deviance error
- the voting process and the deviance checks are repeated each time a message is received which requires a voted value to be generated. This assures that at all times the Task Communicator 44 and the Fault Tolerator 36 will have the best and most current voted values for the data value they may need. Using this type of on-the-fly voting, the system will not hang up if one or more copies of the data is unavailable due to a detected fault in the received message or a faulty Node fails to generate the required message.
- the Upper and Lower Medial Value Sorters 238 and 240 extract two values from the data values being processed. The values chosen depend upon whether the median select (MS) or the mean of the medial extremes (MME) voting is implemented. To avoid confusion, only the implementation for extracting the upper and lower medial values will be discussed. Minor changes to implement the mean of the medial extremes (MME) sorting will be discussed briefly hereinafter.
- MS median select
- MME mean of the medial extremes
- Sorters 238 and 240 are shown on the flow diagram illustrated in Figure 28, while the details of the circuit for the Lower Medial Value Sorter 240 are shown in Figure 29. Since the Upper Medial Value Sorter 238 is a mirror image of the Lower Medial Value Sorter 240, it need not be shown or discussed in detail.
- m is the actual number of copies of the data being processed which is obtained from the Loader 236.
- the Upper Medial Value Sorter 238 then counts, as shown in block 252, the number of ones (1's) contained in the most significant bit positions of all the copies to generate a number n1, which is the number of 1-bits counted.
- the Upper Medial Value Sorter 238 will then inquire if n1 - S'n is equal to or less than zero, as shown in decision block 254.
- if so, the upper medial bit value is a 0-bit, as indicated in block 256. If the upper medial value Un is a 0-bit, then all the copies having a 1-bit in the same bit position are excluded from the subsequent processing to determine the value of the remaining upper medial value bits. Because some copies of the data value are now excluded, a new value S'n is computed, as indicated in block 258, by subtracting from S'n the number of excluded copies (n1) to generate a new value for the analysis of the next highest bit.
- otherwise, the upper medial value Un is a 1-bit, as indicated in block 260, and all of the copies having 0-bits in the same bit position are excluded from the subsequent processing.
- the Lower Medial Value Sorter 240 will count the number of 0's, as indicated in block 262, to generate a number n0 equal to the number of 0's counted. If the number of 0's (n0) is less than S0n, as indicated in decision block 264 (n0 - S0n < 0), then the lower medial bit (Ln) is a 1-bit and all the data copies having a 0-bit in the same bit position are excluded from processing of the subsequent lower medial bits. Again, the medial value S0n is corrected by subtracting the number of excluded copies (n0) from the preceding value for S0n, as indicated in block 268.
- otherwise, the lower medial value of the bit position is a 0-bit, as indicated in block 270, and the data copies having a 1-bit in the same bit position are excluded. This process is repeated until all of the lower medial value bits are determined.
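The bit-serial selection both sorters perform is essentially a radix-style select of a ranked element. The sketch below follows the lower sorter's flow (blocks 262 through 270); treating the initial S0n = (m - 1)/2 as a 1-indexed rank from the smallest value is an assumption, as are all names:

```python
def bit_serial_select(values, width, rank):
    """Select the rank-th smallest value (1-indexed) one bit at a time,
    most significant bit first, in the manner of Figure 28."""
    s = rank                      # plays the role of S0n
    active = list(values)
    result = 0
    for bit in range(width - 1, -1, -1):
        zeros = [v for v in active if not (v >> bit) & 1]
        if len(zeros) - s < 0:    # n0 - S0n < 0: medial bit is a 1
            result |= 1 << bit
            s -= len(zeros)       # correct S0n for the excluded copies
            active = [v for v in active if (v >> bit) & 1]
        else:                     # medial bit is a 0; exclude the 1-copies
            active = zeros
    return result
```

With m = 5 copies the initial S0n is (5 - 1)/2 = 2, so under this reading the lower sorter tracks the second smallest copy while its mirror-image upper sorter tracks the second largest.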
- the circuit details for the Upper Medial Value Sorter 238 are the mirror of the circuit shown in Figure 29, except that the Zero Counter 280 is replaced by an equivalent One Counter.
- the data values from each copy of the data value retrieved by the Loader 236 are received bit by bit, from the most significant value to the least significant value, by a plurality of OR gates 276-0 through 276-N, and by a like plurality of exclusive NOR gates collectively indicated by block 278.
- the OR gates 276-0 through 276-N will pass the received bits to the Zero Counter 280.
- the Zero Counter 280 will actually count the number of 1-bits received and subtract that number from the number of copies (m) being processed to generate the number of 0's (n0).
- the Loader 236 counts the number of copies of the data it retrieves from the Fault Tolerator RAM 162 and supplies this number to the Zero Counter 280 and to an S0n Generator 282.
- the S0n Generator 282 subtracts one from m and divides the difference by two to generate the initial value for S0n.
- the output of the S0n Generator 282 is received by a 2:1 Multiplexer 284 which will pass the initial value of S0n to a Register 286.
- the exclusion of the copies having 0-bits in the same bit position is performed by the Exclusive NOR gates 278 and the Exclusion Register 294.
- the 1-bit value produced by the OR gate 290 is applied to an input of each of the Exclusive NOR gates.
- the exclusive NOR gates 278 will generate a logical 1 signal for each copy of the data value which has a 0-bit in the same bit position.
- This logical 1 is stored in the Exclusion Register 294, the output of which is connected to the alternate inputs of OR gates 276-0 through 276-N.
- for the copies whose data has a 0-bit in the same bit position, the alternate inputs of their respective OR gates 276-0 through 276-N will be a 1-bit which is passed to the Zero Counter, thereby excluding them from further participation in the determination of the lower medial bit values.
- the output n0 from the Zero Counter 280 and the lower medial bit value Ln are applied to the input of a Gating Circuit 296 which passes the value of n0 to a Subtraction Circuit 298 when the lower medial bit value Ln is equal to 1.
- This new value of S0n is applied to a second input of the Multiplexer 284 and is passed to the Register 286 for use in the processing of the next lower medial bit value.
- a sequencer circuit (not shown) will monitor the number of bits processed and will clear the Exclusion Register 294 and the S0n Register 286 after the last lower medial bit Ln is generated for the current set of data values, in preparation for the processing of the next set of data values.
- the operation of the Averaging Circuit 242 will be explained with reference to the flow diagram shown in Figure 30 and the circuit diagram shown in Figure 31.
- the averaging process forms the mean of the upper and lower values by keeping two versions of the mean value M and choosing between them as later bits arrive. This process rests on the following two facts:
- (a) if the bits Un and Ln are the same, the mean bit M is that same bit, except for the case described in (b) below. If the bits are different, then the mean is ½, which in binary form is 0.1; and
- (b) a sequence ½, ½, ½, ... ½ can be resolved into binary format only when the first identical pair following the sequence arrives. For example, the sequence ½, ½, ½, ... ½, 0, where 0 represents the arrival of Un and Ln both having 0 values, resolves to 011...11, and the sequence ½, ½, ½, ... ½, 1, where 1 represents the arrival of Un and Ln both having 1 values, resolves to 100...00.
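The resolution rule above can be sketched bit-serially. This sketch assumes Un and Ln arrive as equal-length MSB-first bit streams with U ≥ L, and truncates an unresolved trailing ½-run (how the hardware's M1/M2 register pair finally disposes of such a run is not restated here, so truncation is an assumption):

```python
def bit_serial_mean(upper, lower):
    """Mean of two equal-length MSB-first bit lists Un and Ln (U >= L).
    A run of differing bit pairs (each worth 1/2) is held pending until
    an identical pair resolves it to 011...1 or 100...0."""
    out, pending = [], 0
    for u, l in zip(upper, lower):
        if u != l:
            pending += 1                    # another unresolved 1/2
        elif pending:
            # identical pair resolves the whole run plus this pair
            out += [0] + [1] * pending if u == 0 else [1] + [0] * pending
            pending = 0
        else:
            out.append(u)                   # identical bits pass through
    if pending:                             # trailing 1/2-run: truncate
        out += [0] + [1] * (pending - 1)
    return out
```

The two register values M1 and M2 in the hardware correspond to the two possible resolutions of a pending run, with the arriving identical pair selecting between them.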
- the process begins by initializing the value A to 0 and the bit number n to 0 as indicated by block 300.
- the value A is the Exclusive OR of the preceding values of L n and U n and is set to 0 at the beginning of the process.
- the process then inquires, at decision block 302, whether the current values of Un and Ln are the same or different.
- the process then generates a new value for A depending upon the exclusive OR of the current values of U n and L n as indicated by block 322.
- the medial values, U n and L n are received by an Exclusive OR gate 328 which produces a 1 at its output when U n and L n are different and a 0 bit at its output when U n and L n are the same.
- the output of the Exclusive OR gate 328 is transmitted to the input of a 2-bit Shift Register 330, an inverted input of AND gate 332, and an input of AND gate 348.
- the Shift Register 330 temporarily stores the output of the Exclusive OR gate 328 for use in processing the next Un and Ln bits received from the Upper and Lower Medial Value Sorters 238 and 240.
- the delayed output of the Shift Register 330 is the value A discussed with reference to Figure 30.
- the upper medial bit U n is also applied to the 0 and 3 inputs of a pair of 8:1 Multiplexers 334 and 336.
- the other inputs to Multiplexers 334 and 336 are preset as shown.
- the Multiplexers 334 and 336 would both output the value of Un applied to the 0 and 3 inputs of the Multiplexers 334 and 336.
- in the first example, Un was equal to 0 and, in the second example, Un was equal to 1.
- the Multiplexer 334 will output a 1 and the Multiplexer 336 will output a 0 as indicated by block 306.
- the outputs of the Multiplexers 334 and 336 are received by 3:1 Multiplexers 338 and 340 as shown.
- the Multiplexers 338 and 340 also receive the outputs of an Inverter 342 which is the complement (U n ) of the upper medial bit value U n .
- the outputs of the 3:1 Multiplexers 338 and 340 are received by a M 1 Register 344 and M 2 Register 346, respectively.
- Multiplexers 338 and 340 are controlled by NAND gate 332 and AND gate 348.
- the NAND gate 332 produces a logical 1 output when the output of Exclusive OR gate 328 is 0 and the value A is 1. This actuates the 3:1 Multiplexers 338 and 340 to store the complement of the upper medial bit value Un in both the M1 Register 344 and the M2 Register 346, as indicated by block 314 of Figure 30.
- the AND gate 348 produces a logical 1 output when the output of the Exclusive OR gate 328 is a 1 and A is a 1, which causes the output of the Multiplexer 334 to be stored in the M2 Register 346 and the output of Multiplexer 336 to be stored in the M1 Register 344, as indicated by block 308 in Figure 30.
- the output of the NAND gate 332 is also used to actuate the M1 Register 344 and the M2 Register 346 to copy the content of the M1 Register 344 into the M2 Register 346, or vice versa, depending upon the value of Ln, as indicated by block 316 in Figure 30.
- the output of the NAND gate 332 and the lower medial bit Ln are applied to the inputs of an AND gate 350, the output of which determines whether the content of the M1 Register 344 will be transferred to the M2 Register 346 or vice versa, as indicated by blocks 318 and 320 of Figure 30.
- the operation of the Deviance Checker 244 will be discussed with respect to the flow diagram shown in Figure 32 and the circuit implementation shown in Figure 33.
- the circuit shown in Figure 33 is replicated in the Deviance Checker 244, one circuit for each Node in the system, so that the deviance checks on all the data values can be performed in parallel.
- the Deviance Checker 244 processes the data being checked on a bit-by-bit basis from the most significant bit to the least significant bit as the upper medial U n and the lower medial L n values become available from the Upper Medial Value Sorter 238 and the Lower Medial Value Sorter 240.
- the deviance checking process is based on the condition that A > B can be distinguished from A ≤ B by adding B to the two's complement of A and looking for an overflow at the most significant bit (MSB).
- MSB most significant bit
- M is the medial value ½(Ln + Un);
- D is the predetermined deviance limit retrieved from the Fault Tolerator RAM 162 for the particular data value being checked.
- V is the data value being checked.
- the process proceeds by successively examining the value of the second previous bit B as carries from the later bits are added to it. If a carry occurs beyond the second previous bit, then an overflow occurs and V ≤ M + D. However, if the second previous bit B is 0, without a prior or current overflow, then V > M + D.
- the process then proceeds to add 2V', D', Ln and Un, as indicated in block 354, where 2V' is the two's complement of 2V, D' is 2D + 1, which is the deviance value actually stored in the Fault Tolerator RAM 162, and Un and Ln are the upper and lower medial values received from the Upper and Lower Medial Value Sorters 238 and 240, respectively.
- the result of this addition is a first previous sum bit S-1, which is the sum value obtained during the processing of the preceding data value bit, a carry bit C, and a double carry bit C' obtained in the processing of the current data bit value.
- the process adds the first previous sum bit S-1, generated during the processing of the preceding data value bit, with the current carry bit C, as indicated in block 356, to generate a second previous sum bit interim value B'-2 which is used in the processing of the next data value bit.
- the addition also produces a third carry bit C'', which is indicative of an additional carry resulting from the processing of the current bits and the first previous bits.
- the carry bit C'', from block 356, is added to the double carry bit C' of the current data value being processed and to the interim value B'-2.
- when V > M + D, the "fail" flag is set indicating that the data value failed the deviance test, as indicated in block 366. If B is not equal to zero, the process will proceed to check the next bit of the data value, as indicated by block 368. Finally, if after checking all of the data value bits neither the "pass" nor the "fail" flag is set, the process will automatically set the "fail" flag, ending the process.
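The arithmetic behind this bit-serial check can be restated in ordinary integer form. The sketch below is an illustrative Python model, not part of the specification; the operand width is an assumption, and the V = 0 corner case of two's-complement subtraction is ignored.

```python
def upper_deviance_pass(v, l_n, u_n, d, width=16):
    """Model of the upper-bound deviance test: pass when V <= M + D.

    The hardware adds the two's complement of 2V to D' + Ln + Un, where
    D' = 2D + 1 is the deviance value as stored in the Fault Tolerator RAM;
    a carry out of the most significant bit (an overflow) means
    2V <= Un + Ln + 2D, i.e. V <= M + D with M = (Un + Ln) / 2.
    """
    modulus = 1 << (width + 2)      # data width plus room for the doubled operands
    d_prime = 2 * d + 1             # D' = 2D + 1
    neg_2v = (-2 * v) % modulus     # two's complement of 2V (assumes V > 0)
    total = neg_2v + d_prime + l_n + u_n
    return total >= modulus         # overflow -> V <= M + D (pass)
```

For example, with Ln = 9 and Un = 11 (so M = 10) and D = 2, a value of 12 passes the upper bound while 13 fails it; the symmetric lower-bound check (V ≥ M − D) is assumed to be handled the same way.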
- an Adder 372 adds the first 3 bits U n , L n , and D', to produce an interim sum bit S' and a first carry bit C 1 .
- the interim sum bit S' is received by an AND gate 374 and an Exclusive OR gate 376, as indicated in equation 3.
- the AND gate 374 will output a second carry bit C 2 which is applied to one input of an AND gate 378 and an input of an Exclusive OR gate 380.
- the AND gate 378 receives the carry bit C 1 from the Adder 372 at its other input.
- the Exclusive OR gate 380 also receives the carry bit C 1 from the Adder 372 at its alternate input.
- the output of the Exclusive OR gate 376 is a sum bit S which is temporarily stored in a two bit Shift Register 382 until the processing of the next data value bit.
- the output of the Exclusive OR gate 380 is a single carry bit C which is received at the inputs of an AND gate 384 and an Exclusive OR gate 386.
- the AND gate 384 and the Exclusive OR gate 386 receive the sum bit S -1 at their other inputs from the Shift Register 382.
- the sum bit S -1 is the sum bit S generated during the processing of the previous data value bit.
- the output of the Exclusive OR gate 386 is the sum of the sum bit S -1 and a carry bit generated during the processing of the current data bit which is a preliminary bit value B' -1 which is stored in a second Shift Register 392.
- the preliminary value B' -1 is an interim value of the second preceding bit value before correction for the carry bits.
- the output of the AND gate 384 is a carry bit C'' which is received at an input to an Exclusive OR gate 390, which also receives at its alternate input the double carry output C' from the AND gate 378.
- the output of the Exclusive OR gate 390 is received at an input to an AND gate 388 and an input to an Exclusive OR gate 394.
- the output of the Shift Register 392 is received at the alternate input to the AND gate 388 and Exclusive OR gate 394.
- the output of the AND gate 388 is the carry bit signal "A" for the second preceding bit which is applied to the set input of an S-R flip flop 398.
- the Q output of the S-R flip flop 398 is applied to the D Input of a D-type flip flop 400.
- the output of the Exclusive OR gate 394 is the final bit value B of the second preceding data value bit after corrections for single and double carries.
- the final bit value B is inverted by an Inverter 402 whose output is connected to the SET input of a second S-R flip flop 404.
- the Q output of S-R flip flop 404 is applied to one input of an AND gate 406 whose output is connected to the clock input of the D-type flip flop 400 through an OR gate 408.
- a clock pulse (CLK) is applied to the alternate input of the AND gate 406 which is applied to the input of the D-type flip flop 400 when the AND gate 406 is enabled by the Q output of the S-R flip flop 404.
- a bit counter 410 counts the number of bits processed and generates an overflow pulse after all the bits have been processed.
- the overflow pulse is applied to the clock input of the D-type flip flop 400 through an AND gate 412 and the OR gate 408.
- the alternate input to the AND gate 412 is received from the Q output of the S-R flip flop 398 and is disabled when the S-R flip flop 398 is placed in its SET state by the carry signal A being a 1.
- the Adder 372 produces the interim sum bit S' and the carry bit C1 resulting from the adding of Un, Ln, and D'.
- the AND gate 374 produces a carry bit C2 which results from adding the 2's complement (2V)' of 2V to the sum of Un, Ln, and D'.
- the carry bit C2 is combined with the carry bit C1 from the Adder 372 in AND gate 378 to produce the double carry bit C' when both C1 and C2 are 1's.
- the output of the Exclusive OR gate 380 is indicative of a single carry bit C from either the Adder 372 or the AND gate 374.
- the sum signal S -1 is the sum S output from the Exclusive OR gate 376 which is output from the Shift Register 382 during the processing of the next subsequent data bit.
- the Exclusive OR gate 386 produces an interim sum value bit B' -1 from the sum bit S -1 from the Shift Register 382 and the carry bit C from the Exclusive OR gate 380.
- the sum bit B' -2 is the signal B' -1 output from the Shift Register 392 during the processing of the second subsequent data value bit.
- the carry bit C'' is the output of the AND gate 384, which is a continuation of the carry bit C when the sum bit S-1 generated in the processing of the preceding data value bit is a 1.
- the Exclusive OR gate 390 and the AND gate 388 will generate the value A, and the Exclusive OR gate 394 will generate the bit value B, as indicated in block 358.
- the value of A is 0 when the interim value of the second preceding sum bit, B' -2 is 0 or when both C' and C'' are 0's indicating no carry bits C 1 or C 2 have resulted from the processing of the current data value bit.
- the value of A is 1 when C' or C' ' is a 1 and the interim value of the second preceding sum bit B' -2 is 1.
- the value of B is 1 when B'- 2 is 1 and C' and C'' are 0's or when B' -2 is 0 and C' or C' ' are a 1.
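The correction performed by gates 390, 388 and 394 amounts to a small Boolean function. The sketch below is an illustrative Python model of just those three gates as described; it is not part of the specification, and it follows the text in treating C' and C'' as never simultaneously 1.

```python
def correct_second_previous_bit(b_prime_2, c_p, c_pp):
    """Gates 390, 388 and 394: correct the interim second-previous sum
    bit B'-2 for the double carry C' and the carry C''.

    Returns (A, B): A is the carry signal for the second preceding bit
    (which sets S-R flip flop 398), and B is the final corrected bit value.
    """
    carry = c_p ^ c_pp        # Exclusive OR gate 390
    a = b_prime_2 & carry     # AND gate 388
    b = b_prime_2 ^ carry     # Exclusive OR gate 394
    return a, b
```

A is 1 only when B'-2 is 1 and one of C' or C'' is 1, and B flips whenever exactly one of the carries is present, matching the conditions stated above.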
- a 1 at the Q output of the D-type flip flop 400 is the pass flag as indicated in block 362.
- the Q output of the S-R flip flop 398 will disable the AND gate 412, preventing the overflow bit from the Bit Counter 410 from toggling the D-type flip flop 400 after the processing of the last bit. If the AND gate 412 is not disabled by the Q output of the S-R flip flop 398, the overflow bit from the Bit Counter 410 will toggle the D-type flip flop 400, changing its Q output from a 1 to a 0.
- a 0 Q output of the D-type flip flop 400 is the fail flag, as indicated by block 366.
- the function of the decision block 364 is carried out by the Inverter 402, the S-R flip flop 404 and the AND gate 406.
- when B is 0, the Inverter 402 will cause the S-R flip flop 404 to be placed in the set state, causing its Q output to be a 1.
- a 1 from the Q output from the S-R flip flop 404 enables the AND gate 406 to pass a clock (CLK) pulse which will toggle the D-type flip flop 400 through the OR gate 408.
- a low or 0 Q output of the D-type flip flop 400 as previously indicated is the fail flag indicated in block 366.
- the pass or fail flag is passed to the Fault Tolerator 36 through the Voter Fault Tolerator Interface 248 as a deviance error (DERR).
- the Scheduler 40 has two modes of operation, a normal mode and a reconfiguration mode.
- in the normal mode, the Scheduler 40 schedules the application tasks for each operating Node in the system, including its own, and monitors the execution of these tasks.
- the reconfiguration mode is entered whenever the Fault Tolerator 36 determines that one or more Nodes are to be excluded from or readmitted to the operating set.
- the two modes interact through an activation status which defines which tasks are eligible for execution by each Node.
- the reconfiguration mode modifies the activation status, whereas the normal mode utilizes the activation status to schedule the tasks.
- the Scheduler 40 implements a dynamic, priority-based, non-preemptive task scheduling process. Concurrent programming practices and the resolution of intertask dependencies are supported at the boundaries between the tasks. Task-to-node allocation is static for any given System State (configuration), but the sequencing of tasks and the resolution of dependencies are performed dynamically.
- the Scheduler 40 in each Node replicates the scheduling process for every active Node in the system. Fault detection mechanisms permit each Node to recognize erroneous behavior in the sequencing or timing of the task executed by any Node.
- tasks may be reallocated among the operating Nodes. Tasks may also be added to or deleted from the active task set to conform to changes in the overall system capabilities.
- during start-up or reset of the Operations Controller 12, the Scheduler 40 enters the reconfiguration mode with the assumption that no Nodes are operating. When the Fault Tolerator 36 recognizes an "operating set," that information is passed to the Scheduler 40 as a new System State Vector. The Scheduler then reconfigures the tasks in accordance with the received new System State Vector. By this method the operation of the Scheduler 40 is self-bootstrapping.
- a block diagram of the Scheduler 40 is shown in Figure 34.
- a Task Selector Module 414 receives information from the Fault Tolerator 36 through a Fault Tolerator Interface 416, from the Synchronizer 46 through a Synchronizer Interface 418, and from the Task Communicator 44 through a Task Communicator Interface 420.
- the Task Selector Module 414 also communicates with a Scheduler RAM 422 and a Scheduler ROM 424 through a Memory Interface 426.
- a Reconfiguration Module 428 is responsive to the reception of a new System State Vector from the Fault Tolerator 36 to reallocate the tasks to be selected and executed by the new set of operating Nodes.
- the Reconfiguration Module 428 will change the activation status of the tasks stored in the Scheduler RAM 422 using predetermined information stored in the Scheduler ROM 424.
- a map of the Scheduler RAM 422 is shown in Figure 35.
- the entry Old TID contains an entry for each Node in the system and stores the TID previously started by that Node.
- the Swap table entry contains an entry for each task (TID) and stores a predecessor count which is the total number of immediate predecessors to that particular task.
- a periodicity corresponding to how many Atomic periods must pass between executions of the task, and two swap count numbers used to swap or change the active status of a task on a particular Node, as shall be explained during the discussion of reconfiguration.
- the Allocation Table stores an allocation count for each task-node pair in the system and is used in the reconfiguration process to determine the distribution of the active tasks among the Nodes.
- the Selection Queue 450 has 3 pages, NEXT, PREVIOUS and CHECK. Each page contains three entries for each Node corresponding to the three highest priority tasks currently ready for execution by that Node. "Used" is a Boolean value indicating whether the current iteration of the task in the entry has been started by that Node, ITER is the iteration number of that task in the entry, and TID is the task identification code for that task.
- the NEXT page is the entry from which the next task to be executed for each Node is selected, the PREVIOUS page lists the tasks selected during the preceding Subatomic period, and the CHECK page contains the tasks selected during the second preceding Subatomic period for that Node. The pages are rotated at the beginning of each Subatomic period, and the newly selected task for each Node is stored in the NEXT page.
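The page rotation described above can be sketched as follows. This is an illustrative Python model; the entry layout and attribute names are assumptions drawn from the text, not part of the specification.

```python
class SelectionQueue:
    """Three-page selection queue: NEXT, PREVIOUS and CHECK.

    Each page holds, for every Node, up to three entries of the form
    (used, iter, tid). At the start of each Subatomic period the pages
    rotate: NEXT becomes PREVIOUS, PREVIOUS becomes CHECK, and a freshly
    selected NEXT page is installed.
    """

    def __init__(self, num_nodes):
        self.next_page = [[None, None, None] for _ in range(num_nodes)]
        self.previous_page = [[None, None, None] for _ in range(num_nodes)]
        self.check_page = [[None, None, None] for _ in range(num_nodes)]

    def rotate(self, new_next_page):
        # the old CHECK page is discarded; it will not be accessed again
        self.check_page = self.previous_page
        self.previous_page = self.next_page
        self.next_page = new_next_page
```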
- the Completion Status List contains, for each task, a completion count corresponding to the number of copies of that task that have been completed, a branch condition count which stores the number of received Task Completed/Started messages in which the branch condition has a value of 1, and an allocation entry which contains the allocation of that task among the various Nodes.
- the Priority Scan List stores for each task the predecessor count which is the number of preceding tasks which have to be completed before that task can be executed, the iteration number of that task and its allocation.
- the Task Activity List entry stores for each task the predecessor count, the periodicity of the task, and its allocation.
- a map of the Scheduler ROM 424 is shown in Figure 36.
- the first entry is the Successor List, which lists the successor tasks for each terminated task. This list is accessed by the address of the Successor Offset, as shall be explained hereinafter. There are two Successor Lists, one for each of the two possible branch conditions.
- the next four entries are the Preference Vectors for each task and identify those Nodes preferred for the execution of that task.
- the Relevance Vector entry contains two fields: the first, INCLUDE/EXCLUDE, identifies whether the task is to be executed by the Nodes included in the Operating Set or by the Nodes excluded from the Operating Set; the second is a Relevance Vector which identifies the Nodes to which the task is relevant.
- the Initial Swap Table entry contains for each task, the initial predecessor count, the periodicity, and the initial swap counts for each task which are loaded into the Task Activity List of the Scheduler RAM 422 during reset or reconfiguration as shall be discussed later.
- the next two entries are the Initial Allocation Counters for each task and lists the initial allocation count or toggle point for each task-node combination. These values are loaded into the Allocation Tables in the Scheduler RAM 422 following reset or power-up.
- the entry Maximum Execution Time Table stores the 2's complement of the maximum execution time for each task and is loaded into the execution timer, for that Node, when the task is started.
- the entry Minimum Execution Time Table stores the 2's complement of the minimum execution time for each task and is used to check the execution time of each task when it is reported as being completed.
- the Successor Offset entry contains for each task, the starting address in the Successor List where the successor tasks are stored.
- the Initializing Table entry stores the maximum Node Identification code (NID) and the maximum Task Identification code (TID) used in the system, which are used to identify when a particular operation is completed.
- Figure 37 shows the details of the Task Selector Module 414.
- the NID and started TID fields of the Task Completed/Started messages are transferred directly from the Fault Tolerator Interface 416 to the Task Communicator Interface 420, and are also temporarily stored in an On-Board RAM 430.
- a Completed/Started Handler 432 transfers the TID and NID of each task identified in a Task Completed/Started message from the On-Board RAM 430 to a Started TID Register 434 shortly after the end of the Soft Error Window (SEW) at the end of each Subatomic period.
- the Started TID Register 434 for each Node is a 3-deep queue in which the new NID and TID are added to the tail of the queue and removed from the head.
- the Task Selector Module 414 also has a TIC Handler 436, which is responsive to the Byzantine-voted values of the task completed vector and the branch condition bits of the Task Interactive Consistency (TIC) messages.
- this data, received from the Byzantine Voter in the Synchronizer 46, is used to update a Selection Queue 450 and a Completion Status List 438.
- the Task Selector Module 414 further includes: a Wake-up Sequencer 440, responsive to the various period signals generated by a Period Counter 442, for transferring active tasks from a Task Activity List 444 to a Priority Scan List 446 and to the Completion Status List 438; a Priority Scanner 448, which selects the tasks in the Priority Scan List 446 that are placed in the Selection Queue 450; a Next Task Selector 452, which selects the highest priority task in the Selection Queue 450 and places it in a Next Task Register 454, from where it is transferred to the Task Communicator 44 for execution by the Applications Processor; an Execution Timer 456, which monitors the execution time of each task being executed by the individual Nodes in the system; and an Old TID List 458, which stores the current task being executed by each Node.
- the Task Activity List 444, the Priority Scan List 446, the Completion Status List 438, the Selection Queue 450 and the Old TID List 458 are embodied in the Scheduler RAM 422 as discussed relative to Figure 35.
- the operation of the Wake-up Sequencer 440, the Execution Timer 456, the TIC Handler 436, the Priority Scanner 448, and the Next Task Selector 452 will be discussed relative to the flow diagrams shown in Figures 38 through 46.
- the operation of the Completed/Started Handler 432 is relatively simple in that it transfers the content of the On-Board RAM 430 to the Started TID Register 434 at the beginning of each Subatomic period.
- the flow diagram shown in Figure 38 describes the operation of the Wake-up Sequencer 440.
- the process begins by repeatedly inquiring if the Subatomic period is the last Subatomic period (LSAP) or the third Subatomic period (SAP), as indicated by inquiry blocks 460 and 462. If it is the last Subatomic period, the process initializes the TID pointer to the Task Activity List 444 to 0, as indicated by block 464. The process then inquires, decision block 468, if the periodicity of the task TID is less than the period indicated by the Period Counter 442. If it is, the Priority Scan List is initialized, as indicated in block 470: the Priority Scan List iteration is set equal to the current iteration for that task.
- the predecessor count is set equal to the predecessor count contained in the Task Activity List and the allocation is set equal to the allocation contained in the Task Activity List.
- the process then proceeds to inquire, decision block 472, if the task just processed was the last task. If it is, the operation of the Wake-Up Sequencer 440 is completed, otherwise, the process will index to the next task on the Task Activity List 444 as indicated by block 480 and again check if the periodicity of that task is less than the period of the Period Counter, as indicated by decision block 468. If the TID period is greater than the Period Counter then the task is not entered into the Priority Scan List 446 and the task pointer is indexed to the next task in the Task Activity List as indicated by block 480.
- the last task in the Task Activity List 444 is a null task which has a periodicity of 0. Thus the last task will always be entered into the Priority Scan List 446 as indicated by block 470 when there is no other task whose periodicity is less than the period of the Period Counter 442.
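The wake-up pass over the Task Activity List might be sketched as below. This is illustrative Python; the record field names are hypothetical stand-ins for the values the text describes.

```python
def wake_up_scan(task_activity_list, period_count):
    """Build the Priority Scan List for the coming period (Figure 38).

    A task is entered when its periodicity is less than the current
    Period Counter value; the trailing null task has periodicity 0
    and is therefore always entered.
    """
    priority_scan_list = []
    for task in task_activity_list:              # the last entry is the null task
        if task["periodicity"] < period_count:   # decision block 468
            priority_scan_list.append({          # block 470: initialize the entry
                "tid": task["tid"],
                "iteration": task["iteration"],
                "pred_count": task["pred_count"],    # copied from the Task Activity List
                "allocation": task["allocation"],
            })
    return priority_scan_list
```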
- the Wake-Up Sequencer 440 will again initialize the pointer to the Task Activity List to the first task as indicated by block 482.
- the Wake-Up Sequencer will then inquire, decision block 484, if the periodicity of the task is less than the period indicated by the Period Counter 442. If it is, the Wake-Up Sequencer will initialize the Completion Status List 438, as indicated by block 486. It will then set the iteration in the Completion Status List to 0, the Branch Condition List to 0, and set the allocation to the allocation indicated in the Task Activity List.
- the Wake-Up Sequencer 440 will then inquire, decision block 488, if it is the last task in the Task Activity List.
- if it is the last task, the operation of the Wake-Up Sequencer 440 is completed. Otherwise the TID pointer in the Task Activity List will be indexed to the next task, as indicated by block 490, and the above procedure will be repeated. If the periodicity of the task is greater than the period indicated by the Period Counter 442, the Completion Status List 438 will not be updated and the pointer to the task in the Task Activity List will be indexed to the next task. When the pointer in the Task Activity List is indexed to the last task, that task will always be entered into the Completion Status List since it has a periodicity of 0.
- the Execution Timer 456 will check the execution timer for each Node, as shown in the flow diagram in Figure 39. As previously indicated, the execution time for the task being executed by each Node is stored as the 2's complement of the maximum execution time. This is done because, with current technology, it is easier to increment the time than to decrement it.
- the operation of the Execution Timer 456 begins by initializing the timer pointer to the first Node, as indicated in block 492. The Execution Timer will then increment the time stored for each Node by one (1), as indicated in block 494. The Execution Timer 456 will then check each timer for the time remaining for the execution of the task, as indicated by decision block 496.
- if the timer for any particular Node is equal to 0, the Execution Timer will set an error flag for that Node to true. This information is then sent to the TIC Handler 436 before it is passed to the Fault Tolerator Interface 416, for reasons which shall be explained later. If the current time is not equal to 0, the Execution Timer 456 will inquire, decision block 500, if it has checked the last Node and, if it has, it will exit the execution timer process. Otherwise it will increment the Node pointer to the next Node, as indicated by block 502, and check the current time of the next Node.
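The count-up arrangement can be sketched as follows, with a negative Python integer standing in for the stored 2's complement value (an illustrative model, not the specification):

```python
def tick_execution_timers(timers):
    """Advance each Node's execution timer by one tick and flag overtime.

    Each timer is loaded with the 2's complement (here, simply the
    negative) of the task's maximum execution time, so the check is a
    plain increment followed by a test for zero.
    """
    time_errors = {}
    for node in timers:
        timers[node] += 1                      # block 494: increment, not decrement
        time_errors[node] = timers[node] == 0  # block 496: expired -> error flag
    return time_errors
```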
- the operation of the TIC Handler 436 will be described with reference to Figures 40 through 44.
- the TIC Handler responds to the arrival of the voted value of the Task Interactive Consistency message and modifies the main data structure based upon that data. It treats the occurrence of a time error as equivalent to a confirmed completion so that a stalled or permanently hung copy of a task does not hold up the rest of the work load.
- the operation of the TIC Handler 436 starts following the completion of the Execution Timer checks and the receipt of the Byzantine data from the Synchronizer 46.
- the TIC Handler 436 selects a Node for which either a confirmed completion or an overtime error has been reported.
- the TIC Handler clears the timer error bit associated with that Node since the completion was confirmed during the same Subatomic period in which the timer expired.
- the TIC Handler searches the CHECK page of the Selection Queue 450 for the TID of the first unused task encountered for the Node which was reported to have completed a task. This is the TID of the task which the Node should have started. If this TID does not match the TID currently stored in the Started TID Register 434 for that Node, then a sequence error is recorded.
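That comparison can be sketched as below (illustrative Python; the entry layout is an assumption):

```python
def sequence_error(check_page_entries, started_tid):
    """Compare the first unused CHECK-page entry for a Node (the task it
    should have started) with the TID from the Started TID Register."""
    for entry in check_page_entries:
        if entry is not None and not entry["used"]:
            return entry["tid"] != started_tid   # mismatch -> sequence error
    return False                                 # no unused entry to compare
```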
- the TIC Handler calls each of its sub-processes, Selection Queue Update, Completion/Termination, Execution Timer Reset, and Priority Scan Update, and sequentially updates the data structure for the selected Node.
- the TIC Handler process is repeated for each Node.
- the operation of the TIC Handler begins by inquiring if the Byzantine data is available, as indicated by decision block 504. If it is not available, the TIC Handler 436 will wait until it becomes available. Otherwise the TIC Handler will initialize the pointer to the CHECK page of the Selection Queue 450 to the first Node position, as indicated by block 506. The process will then inquire, as indicated by decision block 508, if the Node completed a task, as indicated by the Byzantine data. If the Byzantine data did not indicate that a task was completed by that Node, the process will check to see if a time error had occurred, as indicated in decision block 524.
- if neither a completed task nor a time error is indicated, the process will increment the Node pointer to the next Node, as indicated by block 526. The process will then check to determine if it has investigated the last or the maximum Node, as indicated by block 528. If it was the last Node, it will exit the program; otherwise it will proceed to check the next Node to see if it had completed a task or a time error had occurred.
- when a Node has completed a task and a time error has been recorded for that Node, the TIC Handler will set the time error to false, as indicated by block 510, since the task was completed in the same Subatomic period in which the time error was detected. Therefore, the time error is invalid and it is cancelled. If either the Node had completed a task or a time error had occurred, the process will then mark as used the first unused entry for that Node found in the CHECK page of the Selection Queue, as indicated by block 512. It will then store as the current TID the TID of the entry that had just been marked used, and will store as the current iteration the iteration of that same entry, as indicated by block 514.
- the process will then check to determine that the current task is also the same task that was reported by that Node in its last Task Completed/Started message, which was stored in the Started TID Register 434, as shown in Figure 37. If the current task and the task reported as completed in the last Task Completed/Started message for that Node are not the same, the TIC Handler 436 will set the Sequence Error flag to "true," as indicated by block 520. The process will then call the Selection Queue (SQ) Update sub-process, as indicated by block 518, and wait for the completion of the Priority Scan List (PSL) Update sub-process, as indicated by block 522. When the Priority Scan List Update is completed, the process will index the Node pointer to the next Node, as indicated by block 526, and then check to see if it has processed the last Node, as indicated by decision block 528.
- the sub-process Selection Queue Update for the TIC Handler searches the NEXT and the PREVIOUS pages of the Selection Queue 450 for the Nodes selected by the TIC Handler. When an entry is found containing both the current task and the current iteration, it is marked "used.” Such entries may or may not be found because the tasks with a higher priority than the current task may have become available between the generation of the CHECK page and the generation of the PREVIOUS or NEXT page. It is not necessary to mark the CHECK page entry since it will not be accessed again before it is refreshed.
- the Selection Queue Update sub-process begins by initializing the pointer to the PREVIOUS page to the 0 entry, as indicated by block 530.
- the process will then index the entry to the first entry, as indicated by block 532, and will inquire if the current TID and iteration are equal to the TID and iteration of the entry, as indicated in decision block 534. If they are the same, then the entry "used” is marked “true,” as indicated by block 536. Otherwise the process will inquire if it has checked all of the three entries of the PREVIOUS page, as indicated by decision block 538. If it has not checked all of the entries on the PREVIOUS page of the Selection Queue 450, it will proceed to index the entry to the second entry and so on until it has checked all three entries on the PREVIOUS page.
- the program will then proceed to the NEXT page of the Selection Queue 450 and again will set the pointer to the 0 entry, as indicated by block 540. It will index the entry, as indicated by block 542, then inquire if the current TID and iteration are the same as the TID and iteration of the entry, as indicated by block 544. If they are, it will mark the "used" entry "true,” as indicated by block 546. Otherwise the process will then inquire if it has checked all three entries, as indicated by decision block 548.
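The marking step applied to each page might be sketched as follows (illustrative Python; the entry layout is an assumption):

```python
def mark_entry_used(page_entries, tid, iteration):
    """Selection Queue Update: mark 'used' the entry on one page that
    matches both the current TID and the current iteration.

    The entry may legitimately be absent, because higher-priority tasks
    may have displaced it between the generation of the pages.
    """
    for entry in page_entries:
        if entry is not None and entry["tid"] == tid and entry["iter"] == iteration:
            entry["used"] = True
            return True
    return False        # no matching entry on this page
```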
- the TIC Handler sub-process Completion/Termination records the completion of each copy of a task in the Completion Status List. If the final copy has been completed (or timed out), then the task is "terminated." The Successor List entries associated with the terminated task and the majority branch condition are accessed via the base address in the Successor-Offset List, as indicated in Figure 36. The predecessor count for each successor of the terminated task is then decremented. If the branch conditions generated by the various copies result in a tie, then branch condition 0 is selected by default.
- the TIC Handler 436 retains an old valid bit for each Node indicating whether the TID listed in the Old TID section of the Scheduler RAM 422, as shown in Figure 35 is a valid Old TID or not. All bits of the old valid are set to false during system reconfiguration to indicate that the next task to be executed by each Node is the first task and that there are no previous tasks to process. The old valid is set to true after the confirmed start of the first task on the Node and before the confirmed start of the second task on the Node.
- if old valid is false, the started task is the first task being executed on that Node following a reconfiguration. Therefore, there is no completed task to process and the Completion/Termination sub-process need not be executed. Similarly, if the completed task is a null task, there is no need to terminate the task.
- the point at which the task is compared to the maximum task is the latest point at which the comparison can be made without potentially reporting a termination of a null task, and this makes the content of the maximum task entry in the Completion Status List irrelevant.
- Completion/Termination begins by checking the old valid flag for the Node, as indicated by block 552. As previously indicated, if old valid is not true, the process will proceed to the next sub-process, Execution Timer Reset, as shall be discussed hereinafter. However, if old valid is true, the process will record the completion of the task by using the TID stored in Old TID as the TID of the completed task, then accessing the Completion Status List 438 and setting the allocation entry for that TID-NID pair to false, as indicated by block 554. The process will then inquire, as indicated in decision block 556, if the branch condition is equal to 1. If it is, it will increment the branch condition entry in the Completion Status List 438, as indicated by block 558. However, if the branch condition is equal to 0, the process will proceed to inquire, as indicated by decision block 560, if all of the copies of that task have been completed. This is indicated by all the entries in the allocation section of the Completion Status List being set to false.
- if all copies have been completed, the sub-process will report to the Task Communicator the identity of the terminated task, as indicated by block 562.
- the process will then get the address of the first successor task from the Successor-Offset entry contained in the Scheduler ROM 424, as indicated by block 564.
- the process will then inquire, as indicated by decision block 566, if the successor task is equal to the maximum successor task which corresponds to the end of the Successor Task List for the terminated task. If that is the end of the Successor Task List, the program will then proceed to call the Execution Timer Reset sub-process, as indicated by block 572.
- otherwise, the process will continue to update the Completion Status Table by decrementing the predecessor count for each successor task by 1, as indicated by block 568. The process will then increment the address to the Successor List, as indicated by block 570, and proceed to analyze the next task on the Successor List.
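The successor walk of blocks 564 through 570 might be sketched as follows. This is illustrative Python; the end-of-list sentinel value and the table layouts are assumptions, not part of the specification.

```python
MAX_SUCCESSOR = 255   # assumed sentinel marking the end of a successor run

def decrement_successors(tid, branch, successor_offset, successor_list, pred_count):
    """On termination of a task, decrement the predecessor count of each
    successor listed for the winning branch condition, walking the
    Successor List from the base address given by the Successor Offset."""
    addr = successor_offset[(tid, branch)]          # block 564: get base address
    while successor_list[addr] != MAX_SUCCESSOR:    # block 566: end of list?
        pred_count[successor_list[addr]] -= 1       # block 568: one fewer predecessor
        addr += 1                                   # block 570: next successor
    return pred_count
```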
- the Execution Timer Reset sub-process of the TIC Handler 436 checks the execution timer for each Node for a minimum time error and reloads the timer for the newly started task. If the old valid flag for that Node is false, then there is no completed task and the error is not recorded.
- the Execution Timer Reset is the last process to access the Old TID entry in the Scheduler RAM 422. It is, therefore, a convenient place in which to copy the current TID and to set the flag old valid true.
- Figure 43 is a flow diagram showing the process executed by the Execution Timer Reset sub-process.
- the process begins by setting the TID equal to the Old TID for that particular Node, as indicated by block 574.
- the process compares, as indicated by decision block 576, the current execution time for that TID with the minimum time. If the current execution time is greater than the minimum execution time it then inquires, as indicated by block 578, if the old valid flag is true. If old valid is true, then the Execution Timer Reset sub-process will set the time error flag for that Node to "true", as indicated by block 580.
- the process will then reset the Execution Timer, as indicated by block 582, by setting the current time for that Node equal to the maximum time for the current task which is contained in the Scheduler ROM 424 in the entry entitled Maximum Execution Time Table, as shown in Figure 36.
- the process will then update the Old TID entry in the Scheduler RAM 422 by setting the Old TID for that Node equal to the current TID, as indicated by block 584, then set the old valid flag for that Node as true, as indicated by block 586.
- the process will then proceed to call up the Priority Scan List Update sub-process, as indicated by block 588.
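The Execution Timer Reset steps of Figure 43 can be sketched as follows. The dictionary layout and the assumption that the timer counts down from the Maximum Execution Time are illustrative, not taken from the patent.

```python
# Hedged sketch of the Execution Timer Reset (blocks 574-586).

def execution_timer_reset(node, timers, min_time, max_time):
    """Flag a minimum-time error, reload the timer for the new task, and
    update the Old TID entry for `node`.  Returns the time-error flag."""
    t = timers[node]
    time_error = False
    # a task whose remaining time is still above the minimum finished too fast
    if t["old_valid"] and t["current_time"] > min_time.get(t["old_tid"], 0):
        time_error = True                           # block 580
    t["current_time"] = max_time[t["current_tid"]]  # block 582: reload timer
    t["old_tid"] = t["current_tid"]                 # block 584: copy current TID
    t["old_valid"] = True                           # block 586: mark entry valid
    return time_error
```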
- the Priority Scan List Update sub-process of the TIC Handler 436 records the start of the current task on the Node by clearing the Priority Scan List current TID allocation for the Node. This process makes the current task ineligible for reexecution by that Node until its next iteration, at which time the Wake-up Sequencer 440 reinitializes the Priority Scan List entry for the task.
- the two iteration values may differ within the first three Subatomic periods of an Atomic period if the task ran during the last three Subatomic periods of the previous Atomic period.
- Figure 44 is a flow diagram showing the procedure executed by the TIC Handler 436 in the execution of the Priority Scan List Update.
- the process begins by inquiring if the entry is current, as indicated by block 590. If the entry is current, the process will then proceed to inquire if the current task is a null task (maximum TID), as indicated by block 592. If the current task is not a null task, the Priority Scan List is updated by recording that the Node has started that particular task, as indicated by block 594. Effectively, the process sets the flag in the allocation entry of the Priority Scan List for that particular Node to false. If the entry is not current, or if the task is a null task, the process returns, as indicated by block 596, to the TIC Handler process illustrated in Figure 40.
- the Priority Scanner 448 selects a candidate task for the next Subatomic period based on the latest confirmed data about the progress of the application work load.
- the operation of the Priority Scanner 448 follows the updating of the Priority Scan List by the TIC Handler 436.
- the Priority Scanner 448 will first rotate the page pointers of the Selection Queue 450 then select three tasks for each Node by scanning the Priority Scan List in the order of increasing TID's. In the Priority Scan List the highest priority tasks have the lower TID numbers and the lowest priority tasks have the higher TID numbers.
- the selected tasks are then written into the NEXT page of the Selection Queue for their respective Nodes.
- the operation of the Priority Scanner 448 begins by rotating the pointers in the Selection Queue 450, as indicated by block 598.
- the Priority Scanner then sets all of the Node entry pointers to the first entry as indicated by block 600. It then starts at the top of the TID list for the first task, as indicated by block 602.
- the Priority Scanner 448 then inquires, as indicated by block 604, if the precedent count for that task is equal to 0, indicating that all of the predecessor tasks have been completed. If all of the preceding conditions are satisfied, the Priority Scanner 448 will investigate if the task has been previously started on that particular Node, as indicated by decision block 606.
- the Priority Scanner will then inquire if that particular Node already has three entries, as indicated by block 608. If it does have three entries, it will then check to see if that Node was the last Node, as indicated by block 610. If it is not the last Node, it will then index to the next Node, as indicated by block 612, and will proceed to check the entries for the next Node. If the Node being evaluated is the last Node, the Priority Scanner 448 will proceed to check if each Node has all three entries, as indicated by block 618. If each Node has three entries, then the operation of the Priority Scanner is completed and it will exit.
- the Priority Scanner 448 will inquire, as indicated in block 620, if it has processed the last task. If it has processed the last task, then it will fill all the remaining entries with the null task which is the maximum TID, as indicated by block 622. However, if the TID is not the maximum or last task in the list, the process will increment the TID number and will repeat.
- the process will copy the TID and iteration from the Priority Scan List to the NEXT page of the Selection Queue 450 for that Node, as indicated by block 614. It will then increment the entry for that Node, as indicated by block 616, and then inquire, as indicated by decision block 610, if that Node was the last Node. If it is not the last Node, the process will proceed to the next Node, as indicated by block 612; otherwise it will check if the entries in all the Nodes are full, as indicated by decision block 618.
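The scanning loop described above can be sketched in Python. The data layout is assumed for illustration, and the iteration-currency check is omitted for brevity; the essential behavior is the priority-ordered scan, the three-entry limit per Node, and the null-task padding of block 622.

```python
# Illustrative sketch of the Priority Scanner of blocks 598-622.

def priority_scan(scan_list, pred_count, nodes, per_node=3, null_tid=255):
    """scan_list: {tid: {node: eligible}}; lower TID = higher priority.
    Returns the NEXT-page queues, one list of up to three TIDs per Node."""
    queue = {n: [] for n in nodes}
    for tid in sorted(scan_list):              # increasing TID order
        if pred_count.get(tid, 0) != 0:        # block 604: predecessors remain
            continue
        for n in nodes:
            if len(queue[n]) < per_node and scan_list[tid].get(n, False):
                queue[n].append(tid)           # block 614: copy to NEXT page
    for n in nodes:                            # block 622: pad with null task
        while len(queue[n]) < per_node:
            queue[n].append(null_tid)
    return queue
```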
- the Next Task Selector 452 examines the first entry of the NEXT page of the Selection Queue 450 for its own Node (NID). If that task has not been previously started by its own Node, then it records that task in its Next Task Register 454, which is passed to the Task Communicator 44.
- Since the Selection Queue 450 is not updated until the third Subatomic period after a task is started, the Next Task Selector must maintain a local record of tasks started on its own Node.
- the TID's of the previous tasks started are maintained in a two entry deep stack to record the previous two tasks actually started by the Node's Task Communicator 44.
- the Scheduler 42 receives immediate notification from the Task Communicator whenever a task is started. It then pushes the currently selected task onto the previous TID stack allowing the oldest entry to fall off the bottom of the stack.
- the operation of the Next Task Selector 452 is triggered by the beginning of the soft-error window, while the Transmitter is occupied with the transmission of a Task Interactive Consistency or a System State message. Therefore, the Task Communicator cannot transmit a Task Completed/Started message during this interval.
- the Next Task Selector 452 is the only module in the Scheduler which has access to its own Node Identification (NID) code.
- NID Node Identification
- the operation of the Next Task Selector 452 will be discussed with reference to the flow diagram shown in Figure 46.
- the operation of the Next Task Selector begins with the setting of the entry pointer to the NEXT page to its own NID and to entry 0, as indicated by block 626.
- the Next Task Selector increments the entry pointer to the first task, as indicated by block 628, and records as the selected task the task that is entered for its own Node in the entry of the Selection Queue 450, as indicated by block 630.
- the Next Task Selector will then inquire, decision block 632, if this is the third entry in its own entry of the NEXT page. If it is, it will store the selected tasks in the Next Task Register 454.
- the Next Task Selector will inquire, as indicated by decision block 636, if the selected task and iteration are the same as the first or second previously selected task and iteration. If the selected task and iteration are the same as a first or second previously selected task and iteration, the Next Task Selector will proceed to increment the entry and examine the next task in the Selection Queue, as indicated by block 628. However, if the selected task and iteration were not previously selected, the Next Task Selector will store the selected task in the Next Task Register 454, as indicated by block 634 completing the selection process.
- the Next Task Selector 452 selects the third entry regardless of its previous selection status. This feature allows multiple entries of the null task to be placed in the Selection Queue simultaneously in the event there are no other tasks ready to run. Thus, when no other tasks are ready to execute, the Node will start the null task every Subatomic period until another task becomes available.
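The selection rule and the two-entry previous-TID stack can be sketched together. Entries are modeled as (TID, iteration) pairs; the function and structure names are assumptions for illustration, not the patent's.

```python
# Hedged sketch of the Next Task Selector of Figure 46 and its
# two-entry previous-TID stack.

def select_next_task(next_page, prev_stack):
    """Return the first NEXT-page entry not already started by this Node.
    The third entry is taken unconditionally, so repeated null tasks can
    be started when nothing else is ready to run."""
    for i, entry in enumerate(next_page):
        if i == 2 or entry not in prev_stack:
            return entry
    return next_page[-1]

def push_started(prev_stack, task):
    """Record a started task; the oldest entry falls off the bottom."""
    prev_stack.insert(0, task)
    del prev_stack[2:]
```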
- the operation of the Task Selector Module begins with the end of the Soft Error Window (SEW), at which time no Task Complete/Started messages should be arriving from the non-faulty Nodes.
- SEW Soft Error Window
- the Completed/Started Handler will transfer the content of the Task Completed/Started messages stored in the On-Board RAM 430 to the Started TID Register 434 before the earliest possible arrival of the Task Completed/Started messages for the next Subatomic period. All of the other processes executed by the submodules, with the exception of the Next Task Selector 452, must be completed before the beginning of the next Soft Error Window.
- the operation of the Next Task Selector 452 is triggered by the beginning of the soft error window and must be completed by the time the Transmitter completes sending its Task Interactive Consistency and/or System State messages and becomes available to the Task Communicator for sending Task Completed/Started messages.
- the operation of the Wake-up Sequencer is triggered by the end of the operation of the Completed/Started Handler 432.
- the Execution Timer 456 will perform its execution timer checks.
- the TIC Handler 436 will then proceed to update the Selection Queue 450 and the Completion Status List 458, to reset the execution timers, and update the Priority Scan List 446.
- after the Priority Scan List is updated, the Priority Scanner 448 copies the selected tasks from the Priority Scan List 446 to the Selection Queue 450.
- the Next Task Selector 452 will select the next task from the Selection Queue 450 and place it in the Next Task Register 454.
- the active task set for a Node is defined as a set of tasks enabled for execution on that Node.
- a Boolean value "activation-status" may be used to represent whether a given task is enabled for execution on the given Node.
- the purpose of reconfiguration is to modify the activation-status for each task-node pair when the System State is modified by the exclusion or readmission of a Node.
- Three independent operations are needed to correctly manage the activation-status values. 1) Individual tasks may be enabled or disabled for all Nodes in the system to account for changes in the overall system capabilities.
- a task may be eliminated completely from the active task set or replaced by functionally equivalent simpler tasks. This operation of activation or deactivation of a task is referred to as swapping.
- a task which is enabled for execution is said to be swapped in, while a task which is disabled is said to be swapped out.
- 2) Active tasks may be reallocated among the operating Nodes of the system. For example, if a Node is excluded, one copy of each task executed by that Node will be lost. In order to maintain the desired redundancy of each task, one copy of each affected task must be executed by some other Node. The Scheduler does not require all these tasks to be reassigned to one Node but rather may distribute them among the remaining Nodes as desired. A side effect of reallocation is that it may require lower priority tasks to be swapped out if the remaining Nodes are highly utilized.
- 3) Tasks may be prohibited from executing on individual Nodes based upon their operational status. For example, when a Node is excluded by a state transition, it is generally desirable to prohibit any application tasks from executing on that Node. However, it is desirable for the excluded Node to initiate a comprehensive sequence of diagnostic tasks.
- the set of all the tasks in the system is divided into two mutually exclusive subsets, the included task set and the excluded task set. Members of the included task set may only be executed by the included Nodes, and members of the excluded task set may only be executed by excluded Nodes.
- the Reconfiguration Module includes a Task Swapper 638, a Task Reallocator 640, and a Task Status Matcher 642.
- a Current and Next System State Comparator 644 receives the System State Vector from the Fault Tolerator Interface 416, as indicated in Figure 34, and generates a Delta System State Vector.
- the Task Swapper 638 generates a Boolean swap-status value indicating whether the task is swapped in or swapped out of the Active Task set. This process uses the Swap Table 646 which is contained in the Scheduler RAM 422 as previously described.
- the Task Reallocator 640 generates one Boolean allocation-status value for each task-node pair in the system.
- the Task Reallocator 640 uses the Allocation Tables 648 which are contained in the Scheduler RAM 422, as indicated in Figure 35.
- the Task Status Matcher 642 generates a Boolean match-status value for each task-node pair.
- the Task Status Matcher 642 uses the Relevance Vector Table 650 which is one of the tables stored in the Scheduler ROM 424, as previously discussed with reference to Figure 36.
- the swap-status value, the allocation-status value, and the match-status value are ANDed together, as symbolically indicated by AND Gate 652, and the result is stored in the Task Activity List 444 shown in Figure 37.
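The function of AND Gate 652 reduces to a conjunction of the three independent status values. A minimal sketch, with table layouts assumed for illustration:

```python
# The symbolic AND Gate 652: a task is enabled on a Node only when the
# swap-status, allocation-status, and match-status values all agree.

def activation_status(swap_status, alloc_status, match_status, tid, nid):
    """Return the Task Activity List entry for one task-node pair."""
    return (swap_status[tid]
            and alloc_status[(tid, nid)]
            and match_status[(tid, nid)])
```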
- the operation performed by the Task Swapper 638 is largely one of determining the relevance of each Node to each task.
- the application designer may define any subset of the Nodes as relevant to the performance of each task.
- the swap-status of each task is determined solely by the number of relevant Nodes included in the operating set.
- the new System State is examined to determine whether the number of relevant Nodes in the operating set will change the swap-status of each task.
- the number of relevant Nodes at which the change is required is defined as the "toggle value" for that task.
- two toggle values are provided to enhance the flexibility for system reconfiguration. The operation of the Task Swapper 638 will be discussed relative to the flow diagram in Figure 48.
- the operation of the Task Swapper 638 begins with the setting of the pointers to the Relevance Vector in the Scheduler ROM 424 and the pointer to the Swap Tables in the Scheduler RAM 422 to the first task, as indicated by block 653.
- the Task Swapper 638 will complement the swap status, as indicated by block 658, then will decrement the swap count for that task in the Swap Table 646, as indicated by block 660. However, if the swap count is not equal to 0, the Task Swapper 638 will not complement the swap status of that task, but will simply decrement the swap count stored in the Swap Table 646. After decrementing the Swap Table 646, the Task Swapper will proceed to increment the TID pointers to the next task, as indicated by block 662, then inquire if this task is the last task in the system, as indicated by decision block 664. If it is the last task, the operation of the Task Swapper is completed; otherwise the Task Swapper will repeat the above process until all the tasks have been evaluated.
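The exclusion path of the Task Swapper can be sketched as follows. The comparison against the toggle value before the decrement is an assumption chosen so that exclusion exactly reverses the readmission path of Figure 49; the table layout is likewise illustrative.

```python
# Hedged sketch of the Task Swapper for an excluded Node (Figure 48).

def swap_on_exclusion(excluded, relevance, swap_table):
    """relevance[tid] is the set of Nodes relevant to the task;
    swap_table[tid] holds the swap count, toggle value, and swap status."""
    for tid, entry in swap_table.items():
        if excluded not in relevance[tid]:      # Node irrelevant to this task
            continue
        if entry["count"] == entry["toggle"]:   # crossing the toggle point
            entry["status"] = not entry["status"]   # block 658
        entry["count"] -= 1                     # block 660: one fewer relevant Node
```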
- the operation of the Task Swapper 638 when the Delta System State Vector indicates that a Node has been readmitted to the system is shown in the flow diagram in Figure 49.
- the Swap Table 646 is part of the Scheduler RAM 422 as illustrated in Figure 34.
- the Task Swapper will then inquire, decision block 668, if the task is relevant to the Node which has been readmitted into the operating set.
- the Task Swapper will proceed to evaluate the next task, as indicated by block 676 and decision block 678. However, if the task is relevant to the readmitted Node, the Task Swapper will increment the swap count in the Swap Table 646, as indicated by block 670, then inquire, as indicated by decision block 672, if the number of relevant Nodes is equal to the Toggle Point. If the number of relevant Nodes equals the Toggle Point, then the Task Swapper 638 will complement the swap status of that task, as indicated by block 674, and proceed to the next task, as indicated by block 676.
- the swap-status of the task will not be complemented and the Task Swapper will proceed to evaluate the next task, as indicated in block 676.
- the Task Swapper will then inquire, as indicated by decision block 678, if the task was the last task to be evaluated. If the last task has been processed, the Task Swapper 638 is finished with its operation, otherwise the process will be repeated for each task until the last task is processed.
- the swapping process has the following properties:
- the swapped status of a task depends only on the System State and not on the sequence of transitions which preceded that state.
- the operation of the Task Reallocator 640 is very similar to the process of the Task Swapper. There are, however, two major differences between swapping and reallocation:
- the new System State is examined to determine whether the number of preferred Nodes in the operating set will change the allocation- status of each task.
- the number of preferred Nodes at which the change is required is defined as a Toggle Value for that task and Node.
- any number of Toggle Values may be defined for any task-node pair. However, only one Toggle Value is required for each task-node pair to provide the flexibility desired for system reconfiguration.
- the Reallocation process begins with the first task, as indicated by block 680, in Figure 50.
- the Task Reallocator 640 will then inquire if the excluded Node (i) is a more preferred Node for that task than the Node (n) being evaluated, as indicated in decision block 684. If the excluded Node (i) is not a more preferred Node for that task, the Task Reallocator will then proceed to determine if it is a more preferred Node than the next Node, as indicated by block 692 and decision block 694.
- the Task Reallocator will repeat the process for each Node until the last Node is evaluated then the Task Reallocator will index to the next task, as indicated by block 696 and repeat this process until all of the task-node combinations have been completed, as indicated by decision block 698.
- the operation of the Task Reallocator treats all tasks, Nodes, and Toggle Points independently.
- the allocation-status depends on the number of available preferred Nodes and not on the identity of these Nodes.
- the operation of the Task Reallocator is reversible and path independent. For example, if the Delta System State Vector indicates a Node has been readmitted into the operating set, the operation of the Task Reallocator parallels that of the Task Swapper in that the allocation count is incremented rather than decremented, as indicated by block 680, and the incrementing of the allocation count takes place prior to the inquiry to determine if the number of preferred Nodes in the state is equal to the Toggle Point, as indicated by decision block 676.
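The reversibility of the reallocation process can be illustrated with a sketch. The preference lists and table layout are assumptions; the point is that readmission increments the count before the toggle test, while exclusion mirrors it, so applying one after the other restores the original allocation-status.

```python
# Hedged sketch of the reversible Task Reallocator (Figure 50).  Only a
# change in a Node *more* preferred than the Node under evaluation
# affects an allocation (decision block 684).

def realloc(readmitted, node_changed, preference, alloc_table):
    """alloc_table[(tid, nid)] holds the allocation count, toggle value,
    and allocation status for one task-node pair."""
    for (tid, nid), entry in alloc_table.items():
        prefs = preference[tid]                # most to least preferred
        if node_changed not in prefs or prefs.index(node_changed) >= prefs.index(nid):
            continue
        if readmitted:                         # count up, then test toggle
            entry["count"] += 1
            if entry["count"] == entry["toggle"]:
                entry["status"] = not entry["status"]
        else:                                  # mirror image for exclusion
            if entry["count"] == entry["toggle"]:
                entry["status"] = not entry["status"]
            entry["count"] -= 1
```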
- the allocation-status of the task-node pairs depends only on the System State and not on the sequence of transitions which preceded that state.
- the operation of the Task Status Matcher 642 will be discussed with reference to Figure 51.
- when a Node is excluded from the operating set for faulty behavior, it is generally desirable to prohibit application tasks from executing on that Node.
- the Swapping and Allocation processes described above are not capable of supporting this function. Therefore, the total task set is divided into two mutually exclusive sets, the included task set and the excluded task set.
- the tasks of the included task set are permitted to be active only on the Nodes included in the operating set.
- the tasks in the excluded task set are permitted to be active only in the Nodes excluded from the operating set.
- if the status of the task matches the status of the Node, the activation-status of that task on the Node is determined by the Swapping and Allocation processes. Conversely, if the status of the task does not match the status of the Node, then that task is prohibited from being executed on that Node regardless of the results of the swapping and reallocation processes.
- the Task Matching Process begins by setting the task pointer to the first task, as indicated by block 700.
- the Task Status Matcher determines if there is a match between the included/excluded status of the task and the included/excluded status of the Node, as indicated by the System State Vector. This is indicated in block 704, which reads "TID-NID match equals the Exclusive OR of the Relevance Vector contained in ROM 424 and bit 'n' of the System State Vector."
- the Task Status Matcher will then process the next Node, as indicated by block 706 and decision block 708 until all of the Nodes have been evaluated with respect to the given task.
- the Task Status Matcher 642 will then index the task pointer to the next task, as indicated by block 710, and repeatedly evaluate all of the tasks until the last task has been evaluated, as indicated by decision block 712. After all of the task-node combinations have been evaluated, the operation of the Task Status Matcher is completed.
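The match computation of block 704 is a single exclusive-OR. The bit encodings below are assumptions chosen to make the XOR come out as stated: 1 in the Relevance Vector marks an excluded-set task, and 1 in bit n of the System State Vector marks an included Node, so the XOR is 1 exactly when task and Node statuses match.

```python
# Sketch of the Task Status Matcher's match bit (block 704),
# under the assumed bit encodings described above.

def match_status(task_excluded_bit, node_included_bit):
    return task_excluded_bit ^ node_included_bit
```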
- the Task Communicator 44 coordinates and supports all the communications between the Operations Controller 12 and its associated Applications Processor 14. Upon demand, the Task Communicator 44 provides the Applications Processor 14 with the Task Identification Code (TID) of the next task to be executed and supplies the required input data values. The Task Communicator receives all output data generated by the Applications Processor 14 and broadcasts them to all the other Nodes in the system via the Transmitter 30. When the Applications Processor 14 reports an error condition, the Task Communicator 44 reports the error condition to the Fault Tolerator 36 through the Voter 38. When a task is completed by the Applications Processor 14, the Task Communicator receives the value of the Branch Condition (BC) generated by the Applications Processor 14, and broadcasts it to all the other Nodes in the next Task Completed/Started message.
- BC Branch Condition
- the Task Communicator 44 communicates directly with the Scheduler 40, the Voter 38, the Transmitter 30, and the Applications Processor 14.
- the Task Communicator has a table which lists by task the expected sequence of the input data to be used by the Applications Processor 14 and the expected sequence of the output data generated by the Applications Processor. Using messages from the Scheduler 40, the Task Communicator keeps track of the tasks currently being executed by all the Nodes and uses this information to supply information to the Voter relating to the message currently being processed.
- the Task Communicator has a Voter Interface 714, a Scheduler Interface 716, and a Transmitter Interface 718.
- the Voter Interface 714 interfaces with the Voter 38 and receives the voted data and deviance vector and the Message Type Code (MT), and Data Identification Code (DID) of the voted data.
- the Voter will also send the Node Identification Code (NID) of a Node from which it has received data and request that the Task Communicator identify the DID of the data it has received.
- the Voter Interface 714 will also receive an Error Report from an Error Reporter 754, which is communicated to the Voter Interface 714.
- the Voter Interface 714 receives data and the associated MT and DID codes from the Voter 38.
- a Store Data Control 720 passes the data to the Data Memory where it is stored using the MT and DID codes and the complement of a context bit taken from a Context Bit Memory 732 as an address.
- the Data Memory is partitioned in a manner similar to that previously discussed with reference to the Fault Tolerator RAM 162 and shown in Figure 16.
- the context bit taken from a Context Bit Memory 732 is used to store the data in the Data Memory 42 in the appropriate partition.
- a DID Request Handler 724 receives a DID request from the Voter 38 through the Voter Interface 714 in the form of a Node Identification Code (NID).
- NID Node Identification Code
- the DID Request Handler 724 will access a Pointer Table 726 and transmit back to the Voter 38, the identity (DID) of the expected data that the Voter should currently be processing. If the Voter detects a mismatch between the expected DID and the DID of the data value it is currently processing it will set an error flag.
- the Scheduler Interface 716 receives from the Scheduler
- a Task Terminated Recorder 730 will flip the context bit of the Context Bit Memory 732 for output DID's of the task reported as terminated.
- the Task Terminated Recorder 730 will also set a new NUDAT bit in a Nudat Bit Memory 722, indicating to the Voter that the next time it requests that same DID, it will be the first request for that DID since the task that generated it was terminated.
- the Task Terminated Recorder 730 will also OR all the deviances stored in the Data Memory 42 for the terminated task output DID's and store them in a Deviance Error Register 734.
- a Task Started Recorder 736 upon the receipt of a message from the Scheduler Interface 716 indicating the starting of a new task by any Node in the system, will access the Pointer Table 726 to determine if the address stored in the Pointer Table points to a null DID indicating that that Node had completed the preceding task. If the address stored in the Pointer Table 726 does not point to a null DID, the Task Started Recorder 736 will set a sequence error flag for that Node which is stored in a Sequence Error Register 738.
- the Task Started Recorder 736 will access the pointer in a DID List 728 with the Task Identification Code (TID) of the task started and store the address for the first DID in that task in the Pointer Table 726.
- TID Task Identification Code
- a Next Task Recorder 740 will store the next task received from the Scheduler in a Next Task Register 742.
- An AP Input Handler 744 will transfer the identification of the next task stored in the Next Task Register 742 to an AP Input FIFO 746.
- the AP Input Handler will then access the Pointer and DID List 728 with the Task Identification Code (TID) and get the address for the data stored in the Data Memory 42 needed for the execution of that task. This data will then be stored in the AP Input FIFO 746.
- TID Task Identification Code
- the AP Input Handler 744 will also generate a Task Completed/Started message which is sent to the Transmitter 30 through the Transmitter Interface 718, which transmits this message to all the other Nodes in the system.
- the Applications Processor 14 executes the selected task, the data resulting from the execution of the task will be stored in an AP Output FIFO 748.
- An Applications Output Handler 750 will access the pointer and DID List 728 and obtain the Message Type Code (MT) and the Data Identification Code (DID) for each data value generated by the Applications Processor.
- MT Message Type Code
- DID Data Identification Code
- the last word generated by the Applications Processor 14 contains an Applications Processor Reported Error (APRE) vector which is stored in an APRE Register 752.
- the last word generated by the Applications Processor 14 also contains the branch condition bit which is to be included in the next Task Completed/Started message generated by the AP Input Handler 744. This branch condition is stored in the Transmitter Interface 718 until it receives the remainder of the Task Completed/Started message from the AP Input Handler 744.
- the Error Reporter 754 receives the Deviance Error Vector from the Deviance Error Register 734, the Sequence Error Vector from the Sequence Error Register 738, and the Application Processor Reported Error Vector from the APRE Error Register 752 and transmits these Error vectors to the Voter 38 through the Voter Interface 714.
- An Arbitrator 756 arbitrates the operation of the various Modules in the Task Communicator.
- the Store Data Control receives the voted data and deviances from the Voter. Along with this information, it also receives the Message Type Code (MT) and the Data Identification Code (DID).
- MT Message Type Code
- DID Data Identification Code
- the operation of the Store Data Control 720 begins by evaluating the first byte received from the Voter Interface 714, as indicated by decision block 760. When the first byte is all 0's, as indicated in block 760, there is no data available and the Store Data Control 720 will wait until it receives the first non-zero first byte. After receiving a non-zero first byte, the Store Data Control 720 will inquire, as indicated in decision block 762, if the data is a System State vector. If the data is not a System State vector, the Store Data Control 720 will access the Context Bit Memory 732 for the state of the context bit, using MT and DID codes as indicated by block 764.
- the TOC bit marks the Atomic period in which the System State Vector was generated.
- the TOC Bit Flip Flop 758 is complemented to mark the beginning of each new Atomic period in response to the rising edge of the Atomic period (AP) signal .
- the Store Data Control 720 will then store the voted data in the Data Memory 42 at the generated address, as indicated by block 774.
- the Store Data Control 720 will then generate an address for the deviance vector by setting the deviance bit equal to 1 and the context bit equal to 0, as indicated by block 776. It will then store the deviance vector in the Data Memory 42, as indicated by block 778.
- the Context Bit Memory 732 stores a context bit for each DID used in the system and the context bit identifies which of the data stored in the Data and Deviance RAM is the current data and which data is pending or incomplete since all of the copies of that data have not been received.
- the context bits in the Context Bit Memory 732 for the output DID's of each terminated task are complemented by the Task Terminated Recorder 730 in response to a message from the Scheduler that a particular task is terminated. The Task Terminated Recorder 730 will then complement the context bits for all of the DID's which resulted from the terminated task, as shall be discussed hereinafter.
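The context-bit mechanism is a double buffer per DID: new voted data is written to the complement of the context bit (block 764 and the Store Data Control), and flipping the bit at task termination atomically promotes the pending copy to current. A minimal sketch, with all class and method names assumed:

```python
# Illustrative sketch of the context-bit double buffering: each DID owns
# two slots; the context bit selects the current one, and incoming voted
# data goes to its complement until the producing task terminates.

class ContextBuffer:
    def __init__(self):
        self.slots = {}      # (did, bit) -> stored value
        self.context = {}    # did -> current-context bit

    def store(self, did, value):
        bit = self.context.setdefault(did, 0)
        self.slots[(did, bit ^ 1)] = value   # write the pending copy

    def current(self, did):
        return self.slots.get((did, self.context.get(did, 0)))

    def terminate(self, did):
        self.context[did] ^= 1               # pending data becomes current
```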
- the DID Request Handler 724 receives from the Voter Interface 714 the identification or NID of the Node whose data is being processed by the Voter 38, indicating that the Voter is requesting the Data Identification (DID) Code of the data currently being processed. A NUDAT bit embedded in this DID code tells the Voter 38 whether this is the first time the Voter has requested this particular DID since termination of the task that generated the data.
- DID Data Identification
- the DID Request Handler will address the Pointer Table 726 with the NID to obtain a Pointer to a DID List 830 which is part of the Pointer and DID List 728 shown in Figure 52.
- the DID Request Handler will then access the DID List 830 and obtain the Data Identification Code DID from the DID List 830. It will then access the Nudat Bit Memory 722 and transmit the nudat bit with the DID back to the Voter 38 through the Voter Interface 714.
- the operation of the DID Request Handler 724 begins by accessing the Pointer Table 726 with the NID to get the Pointer to the DID List 830, as indicated by block 780.
- the DID Request Handler will then access the DID List 830 to get the DID of the data which is currently being processed by the Voter, as indicated by block 782.
- the DID Request Handler 724 will then access the Nudat Bit Memory 722 to get the NUDAT bit, as indicated by block 784. It will then append the NUDAT bit to the DID as indicated by block 786 and pass the DID and the NUDAT bit to the Voter Interface 714.
- the DID Request Handler will then set the NUDAT bit to 0, as indicated by block 788, then inquire, as indicated by decision block 790, if the DID was a null DID. If it was not a null DID, the DID Request Handler 724 will then increment the pointer in the Pointer Table 726 to the next DID for that Node, as indicated by block 792. However, if the DID was a null DID, the DID Request Handler 724 will not increment the pointer in the Pointer Table 726 but will leave it at the null DID, indicating that all the DID's for that task have already been transmitted.
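The pointer walk of blocks 780-792 can be sketched as follows. The null-DID sentinel value, the clearing of the NUDAT bit after the first request, and the table layouts are assumptions for illustration.

```python
# Hedged sketch of the DID Request Handler walk (blocks 780-792).

NULL_DID = 0xFF   # assumed sentinel marking the end of a task's output list

def did_request(nid, pointer_table, did_list, nudat):
    """Return (DID, NUDAT bit) for the Node's current data value and
    advance the pointer, unless the null DID has been reached."""
    ptr = pointer_table[nid]        # block 780: pointer into the DID List
    did = did_list[ptr]             # block 782: expected DID
    first = nudat.get(did, 0)       # block 784: first request since termination?
    nudat[did] = 0                  # block 788: later requests are not first
    if did != NULL_DID:
        pointer_table[nid] = ptr + 1   # block 792: advance to the next DID
    return did, first
```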
- the data stored in the DID List has two 8-bit bytes, as shown in Figure 65.
- the first byte consists of three fields, a Data Value Mask, a Data Type, and a Message Type.
- the second byte is the actual Data Identification Code (DID) of the data.
- This information is used by the AP Output Handler 750 to tag the data generated by the Applications Processor 14 in the transmitted Data Value messages.
- the DID Request Handler 724 will append the NUDAT bit to the most significant bit position (MSB) of the first byte obtained from the DID List 830, as indicated in Figure 66, since the most significant bit of the Data Value Mask is not needed by the Voter 38.
- MSB most significant bit position
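- Since the Voter does not need the most significant bit of the Data Value Mask, the NUDAT bit can simply overwrite that bit position. A minimal sketch of this packing, with the byte layout assumed from Figures 65 and 66:

```python
def pack_nudat(first_byte, nudat_bit):
    # clear the unused MSB of the Data Value Mask, then place the NUDAT bit there
    return (first_byte & 0x7F) | ((nudat_bit & 1) << 7)
```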
- the Task Terminated Recorder 730 receives the Task Identification Code (TID) of the Terminated Task from the Scheduler Interface 716.
- TID Task Identification Code
- the Task Terminated Recorder 730 will then access the Output Pointer List 794 which is part of the Pointer and DID List 728 shown in Figure 52.
- the Output Pointer List 794 and the DID List 830 are embodied in a common off-board ROM (not shown).
- the Task Terminated Recorder 730 will then access the Context Bit Memory 732 and the Nudat Bit Memory 722, and complement the context bits and set the NUDAT bits for all the DIDs that resulted from the terminated task.
- the Task Terminated Recorder 730 will then, using the addresses obtained from the DID List 830, access the deviance vectors stored in the Data Memory 42 that are associated with the DIDs of the terminated task. It will then OR all of these deviance vectors with the content of the Deviance Error Register 734.
- the operation of the Task Terminated Recorder 730 begins by inquiring, as indicated by block 796, if there is a task to be terminated. If there is no task to be terminated, as indicated by the two bits of the Status Word being 0's, the Task Terminated Recorder 730 will do nothing. However, if either of the bits of the Status Word is a 1, then the Task Terminated Recorder will inquire, as indicated by decision block 798, if the Status Word is a 10. The least significant bit of this Status Word indicates whether the task is a null task which requires no further action by the Task Terminated Recorder 730.
- the Task Terminated Recorder 730 will access the Output Pointer List 794 to get a Pointer to the DID List 830, as indicated by block 800.
- the Task Terminated Recorder will then set the least significant bit of the Status Word to 1, as indicated by block 802.
- the Task Terminated Recorder 730 will then access the DID List 830 with the Pointer and will inquire, as indicated by block 804, if the DID is a null DID, indicating that it is the last DID of that task. If it is a null DID, then the Task Terminated Recorder 730 will set the least significant bit of the status word to 0, as indicated by block 814 and terminate the processing of the terminated task.
- the Task Terminated Recorder will set the NUDAT bit for that DID to 1, as indicated by block 806, and complement the context bit in the Context Bit Memory for that DID, as indicated by block 808.
- the Task Terminated Recorder 730 will then OR the deviances stored in the Data Memory 42, as indicated by block 810, and store the ORed values in the Deviance Error Register 734 to generate a deviance vector which is transmitted to the Voter Interface 714 by the Error Reporter 754.
- the task terminated pointer in the Pointer Table 726 is incremented as indicated by block 812.
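- The deviance handling of block 810 amounts to a bitwise OR of the deviance vectors of all DIDs produced by the terminated task into the Deviance Error Register. A minimal sketch, with illustrative names:

```python
def accumulate_deviances(deviance_vectors, deviance_error_register):
    """OR each DID's deviance vector into the register (block 810)."""
    for vector in deviance_vectors:
        deviance_error_register |= vector
    return deviance_error_register
```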
- the Task Started Recorder 736 receives the TID and NID of the task started which is contained in a received Task Started message from the Scheduler 40.
- the Task Started Recorder 736 will first check the Pointer Table 726 to determine if the current DID is a null DID. If it is not, it will record a sequence error in the Sequence Error Register 738 for the identified Node. This check is made because a new task should not have been started until all the data from the preceding task has been received.
- the Task Started Recorder 736 will then update the Pointer Table 726 by accessing the Output Pointer List 794 with the TID to get the Pointer to the DID List 830 for the first DID resulting from the execution of that task.
- the Task Started Recorder 736 will then store the Pointer obtained from the Output Pointer List 794 into the Pointer Table 726.
- the operation of the Task Started Recorder 736 begins with accessing the Pointer Table 726 to get the pointer to the DID List 830, as indicated in block 816.
- the Task Started Recorder will then access the DID List 830 with the pointer to get the DID currently being stored for that Node, as indicated by block 818. Then it will inquire, as indicated by decision block 820, if the DID is a null DID. If it is not, the Task Started Recorder will record a scheduling error in the Sequence Error Register 738, as indicated by block 826.
- the Task Started Recorder will access the Output Pointer List 794 with the TID of the started task to get the Pointer to the first DID for that task in the DID List 830, as indicated by block 822. It will then access the DID List 830 with the Pointer, then write the address of the first DID into the Pointer Table 726 for that Node, as indicated by block 824, completing the recording of the task started.
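- The task-started bookkeeping (blocks 816 through 826) reduces to a sequence-error check followed by a pointer reset. A hypothetical sketch, with stand-in names for the hardware tables:

```python
NULL_DID = 0x00  # assumed sentinel ending a task's DID group

def record_task_started(nid, tid, pointer_table, did_list,
                        output_pointer_list, sequence_errors):
    current_did = did_list[pointer_table[nid]]     # blocks 816-818
    if current_did != NULL_DID:                    # block 820: prior task unfinished
        sequence_errors.add(nid)                   # block 826: scheduling error
    pointer_table[nid] = output_pointer_list[tid]  # blocks 822-824: first DID of task
```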
- the details of the AP Input Handler 744 will be discussed relative to the block diagram shown in Figure 61 and the flow diagram shown in Figure 62.
- the AP Input Handler 744 identifies and fetches the input data values required by the Applications Processor 14 for the execution of the next task. From the viewpoint of the Applications Processor 14, the Task Communicator 44 acts like a storage device containing an input file.
- the values are provided in a predetermined order as specified by the input TID-to-DID Mapping List for the current TID.
- each input data mapping word consists of the Data Value Mask (DVM), the Data Type (DT), the Message Type (MT), and the actual Data Identification Code (DID), which is the starting address of the data value in the Data Memory 42.
- DVM Data Value Mask
- DT Data Type
- MT Message Type
- DID Data Identification Code
- the addresses are ambiguous because each address points to two different locations in the Data Memory 42.
- the context bit for a particular DID defines the location that has the current input data.
- the AP Input Handler 744 fetches each current data value from the Data Memory 42 and loads it into the AP Input FIFO 746. Upon a request from the Applications Processor 14, the data values from the AP Input FIFO 746 are transferred to the Applications Processor. This cycle is repeated until all input data values have been transferred.
- the AP Input Handler 744 interacts with an Input Pointer List 828, the DID List 830, an AP Input Pointer Table 832, and the AP Input FIFO 746.
- the Input Pointer List 828 and the DID List 830 are part of the Pointer and DID List 728 shown in Figure 52 and are embodied in an off-board ROM (not shown).
- the AP Input Handler 744 receives the next task from the Next Task Register 742 and, using the Input Pointer List 828, DID List 830, and the AP Input Pointer Table 832, will extract from the Data Memory 42 the data required for the execution of the task. This information is stored in the AP Input FIFO 746 and made available to the Applications Processor 14. The AP Input Handler 744 will also generate a Task Completed/Started message identifying the task completed and the next task to be started by its own Applications Processor 14. The AP Input Handler 744 will also generate a Task Release message sent to the Scheduler 40 through the Scheduler Interface 716.
- the contents of the TOC Bit Flip Flop 758 are appended to the addresses stored in the AP Input Pointer Table 832 to identify the current system state.
- the contents of the Context Bit Memory 732 are appended to the addresses to identify the current data values stored in the Data Memory 42.
- the mapping list for each task in the Input Pointer List 828 consists of a contiguous group of DID's terminated by a null DID. A pointer to the beginning of each mapping list is stored in the Input Pointer List 828 and is addressed by the TID of the task.
- the null DID is used to identify the end of the contiguous group of DID's in the DID List 830.
- the AP Input Handler 744 first inquires if this is the first task to be executed in this Subatomic period, as indicated by decision block 834. This is because only one task can be started in any given Subatomic period. If this is not the first task, then the AP Input Handler 744 will wait until the beginning of the next Subatomic period. Otherwise the AP Input Handler 744 will inquire if the next task is a new task received since the last CS message or reset or reconfiguration commands, as indicated by decision block 836. If the task is not a new task, the AP Input Handler 744 will wait until it receives a new task.
- the AP Input Handler 744 will send a Task Completed/Started message to the Transmitter Interface 718, as indicated by block 838. This Task Completed/Started message will subsequently be transmitted by the Transmitter 30 to all of the other Nodes in the system. The AP Input Handler 744 will then inquire, as indicated by decision block 840, if the transmission of the Task Completed/Started message has been completed.
- the AP Input Handler will wait until the transmission of the Task Completed/Started message is completed.
- the AP Input Handler 744 will send a Task Release message to the Scheduler 40 through the Scheduler Interface 716, as indicated by block 842, informing it that the Task Communicator has transmitted a Task Completed/Started message.
- the AP Input Handler 744 will then transfer the next task from the Next Task Register 742 and store it as the current task in the AP Input Pointer Table 832, as indicated by block 844.
- the AP Input Handler will then write the TID of the current task into the AP Input FIFO 746, informing the Applications Processor 14 of the identity of the task, as indicated by block 846.
- the AP Input Handler 744 will get the DID Pointer from the Input Pointer List 828 and store the DID Pointer in the AP Input Pointer Table 832, as indicated by block 848.
- Using the DID Pointer, the AP Input Handler will then get the address of the DID in the Data and Deviance RAM and store this address in the AP Input Pointer Table, as indicated by block 850. The AP Input Handler 744 will then inquire, as indicated in decision block 852, if the DID is a null DID. If the DID is a null DID, indicating it is the last DID in the task, the operation of the AP Input Handler 744 is completed. Otherwise, the AP Input Handler 744 will inquire, as indicated in decision block 854, if the DID is a System State Vector.
- the AP Input Handler 744 will access the TOC Bit Flip Flop and get the TOC bit which is used in place of the context bit for addressing the Data Memory 42, as indicated by block 866. Otherwise, if the DID is neither a null DID nor a System State Vector, the AP Input Handler will access the Context Bit Memory for the context bit, as indicated by block 856, and append it to the address in the AP Input Pointer Table. Using the address in the DID List 830 and the context bit or the TOC bit from the TOC Bit Flip Flop, the AP Input Handler will access the Data Memory 42 and write the first two data bytes into the AP Input FIFO, as indicated by block 858.
- the AP Input Handler will then inquire, as indicated by block 860, if the Message Type is either a Message Type 2 or Message Type 3, which have 4 bytes rather than 2 bytes. If the Message Type is not a Message Type 2 or 3, the AP Input Handler 744 will index the DID Pointer and store the indexed DID Pointer in the AP Input Pointer Table, as indicated by block 864. If the Message Type is either a Message Type 2 or a Message Type 3, the AP Input Handler 744 will address the Data Memory 42 again and write the final 2 data bytes into the AP Input FIFO 746, as indicated by block 862, then increment the DID Pointer as indicated in block 864.
- the AP Input Handler 744 will repeat this process to obtain all of the data values needed by the Applications Processor 14 for the execution of the task.
- the last DID for any task is a null DID. This null DID is recognized by the AP Input Handler 744, as indicated in decision block 852 and will terminate the loading of the AP Input FIFO 746.
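- The fetch loop of blocks 848 through 864 can be modeled as below. This is a simplification: the context/TOC bit addressing is omitted, the mapping list is a plain sequence, and a null DID is represented by None. All names are illustrative, not from the specification.

```python
NULL_DID = None  # assumed end-of-list marker

def load_input_fifo(mapping_list, data_memory, fifo):
    """mapping_list: (did, message_type) pairs terminated by a null DID."""
    for did, message_type in mapping_list:
        if did is NULL_DID:                      # block 852: last DID of the task
            break
        fifo.extend(data_memory[did][:2])        # block 858: first two data bytes
        if message_type in (2, 3):               # block 860: four-byte message?
            fifo.extend(data_memory[did][2:4])   # block 862: final two data bytes
    return fifo
```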
- the AP Output Handler 750 will be discussed relative to the block diagram shown in Figure 63 and the flow diagram shown in Figure 64.
- the AP Output Handler 750 receives and processes the output data values generated by the Applications Processor 14 and broadcasts them to all the Nodes in the system. As with input values, output values are expected in a predetermined order specified by the output TID-to-DID Mapping List for the current task.
- the AP Output Handler 750 loads it into an Output Data Register along with its MT and DID codes and initiates broadcast of the Data Value message. After outputting the last value for each task, the Applications Processor generates a combined word containing the Applications Processor Reported Error Vector (APRE) and the Branch Condition (BC).
- the Applications Processor Reported Error portion is used with the previous Applications Processor Reported Error data to be reported to the Voter 38 through the Error Reporter 754.
- the Branch Condition (BC) is used as part of the next Task Completed/Started message generated by the AP Input Handler 744.
- the AP Output Handler 750 responds to the AP Input Handler completing its operation.
- the AP Output Handler 750 first accesses the Pointer Table 726 and obtains the pointer which identifies the current task's entry in the DID List 830. This Pointer is then used to address the DID List 830 to obtain the Data Identification Code (DID), the Message Type (MT), and the Data Value Mask (DVM).
- the AP Output Handler 750 will then store the Message Type bits and the Data Identification Code in an Output Data Register 870. As the data is generated by the Applications Processor 14, the data is stored in the AP Output FIFO 748.
- the data is then transferred from the AP Output FIFO 748 to the Output Data Register 870.
- the AP Output Handler 750 will then transmit the first 2 bytes to the Transmitter Interface 718, and will transmit the Applications Processor Reported Error Vector to the APRE Register 752.
- the AP Output Handler 750 will read the current DID Pointer from the Pointer Table 726 and store it in the AP Output Pointer Table 868, as indicated by block 872. The AP Output Handler will then access the DID List using the pointer stored in the Pointer Table 726 and store the Message Type and DID in the Output Data Register 870, as indicated by block 874.
- After the Message Type and DID bytes are stored in the Output Data Register 870, the AP Output Handler 750 will inquire, as indicated by decision block 875, whether the Applications Processor has placed any data values in the AP Output FIFO 748. If no data has been placed in the AP Output FIFO 748, the AP Output Handler 750 will wait until data is generated by the Applications Processor 14, as indicated by decision block 876. After the Applications Processor 14 has stored the generated data values in the AP Output FIFO 748, the AP Output Handler 750 will transfer the data values to the Output Data Register 870, as indicated by block 878.
- the AP Output Handler 750 will then inquire, as indicated in decision block 880, if the Message Type is a Message Type 2 or a Message Type 3. If the Message Type is a Message Type 2 or 3, the AP Output Handler 750 will wait until the final two data bytes of the data are generated and placed in the AP Output FIFO 748, as indicated by decision block 882. After the Applications Processor has written the second two data bytes into the AP Output FIFO 748, the AP Output Handler 750 will transfer the contents of the AP Output FIFO 748 into the Output Data Register 870, as indicated in block 884.
- the AP Output Handler will inquire if the DID of this data is a null DID, as indicated by decision block 886. If the DID is not a null DID, the AP Output Handler 750 will send the Message Type and Data Identification Code bytes (DID) to the Transmitter Interface 718, as indicated by block 888. The AP Output Handler will then send the data bytes to the Transmitter Interface 718 to complete the Data Value message, as indicated by block 890.
- DID Data Identification Code bytes
- the AP Output Handler 750 will then increment the current DID Pointer in the AP Output Pointer Table 868, as indicated in block 892, and repeat the processing of the next data value generated by the Applications Processor 14. As previously indicated, the last DID in the DID List 830 for that particular task will be a null DID, which will be detected by the AP Output Handler 750, as indicated by decision block 886. If the DID is a null DID, indicating that the Applications Processor 14 has generated all of the data values for that task, the Applications Processor will generate a last data word containing the Applications Processor Reported Error and the Branch Condition (BC) bit for the subsequent tasks.
- the AP Output Handler 750 will store the Applications Processor Reported Error Vector in the APRE Register 752, as indicated by block 894, and then will proceed to process the branch condition, as indicated by decision block 896.
- the AP Output Handler will first inquire if the majority of the branch condition bits are equal to 0.
- the AP Output Handler will generate a branch condition byte consisting of all 0's, as indicated by block 898; otherwise, if the majority of branch condition bits are 1's, the AP Output Handler 750 will generate a branch condition byte of all 1's, as indicated by block 900. Finally, the AP Output Handler 750 will store the branch condition byte in the Transmitter Interface 718, as indicated by block 902. This byte will be appended to the next Task Completed/Started message generated by the AP Input Handler 744 as the branch condition of the completed task.
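- The branch-condition handling of blocks 896 through 902 is a bitwise majority vote expanded to a full byte. A minimal sketch:

```python
def branch_condition_byte(bc_bits):
    """All 1's if the majority of reported BC bits are 1, else all 0's."""
    ones = sum(bc_bits)
    return 0xFF if ones > len(bc_bits) - ones else 0x00
```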
- the Task Communicator 44 also includes a Reconfigure Control 904, as shown in Figure 67, and a Reset Control 920, as shown in Figure 69.
- the Reconfigure Control 904 transmits an interrupt to the Applications Processor 14 and awaits acknowledgement. After acknowledgement, the Reconfigure Control 904 will initialize the AP Output Handler 750 and its Output Data Register 870, the AP Input FIFO 746, the AP Output FIFO 748, and the Pointer Table 726. The operation of the Reconfigure Control 904 will be discussed relative to the flow diagram shown in Figure 68.
- In response to a reconfiguration request from the Scheduler 40, the Reconfigure Control 904 will send a reconfigure interrupt to the Applications Processor 14, as indicated in block 906. It will then terminate all messages to the Transmitter Interface 718, as indicated by block 908, by clearing the Output Data Register 870. The Reconfigure Control 904 will then await acknowledgement of the interrupt signal from the Applications Processor 14, as indicated by decision block 910. After the Applications Processor has acknowledged the interrupt, the Reconfigure Control will clear the AP Input FIFO 746 and the AP Output FIFO 748, as indicated by block 912, then set all the pointers in the Pointer Table 726 to null DID's, as indicated by block 914.
- the Reconfigure Control will restart the AP Input Handler 744, as indicated in block 916, then send a Task Communicator Ready (TSCRDY) signal to the Scheduler 40, as indicated in block 918, indicating that the Task Communicator 44 is ready to begin processing data in the new reconfigured System State.
- TSCRDY Task Communicator Ready
- the Task Communicator also has a Reset Control 920 responsive to the Operations Controller Reset (OCRES) signal, as indicated in Figure 69.
- the Reset Control 920 interfaces with the Applications Processor 14, an AP Ready Flag 922, a Reset Flag 924, the AP Input FIFO 746, the AP Output FIFO 748, the Pointer Table 726, and the AP Input Handler 744.
- the operation of the Reset Control 920 begins by sending a reset request to the Applications Processor 14, as indicated by block 926.
- the Reset Control 920 will then set the Reset Flag 924 to "ON," as indicated by block 928, to signify to the other subsystems of the Task Communicator 44 that the Operations Controller is being reset.
- the Reset Control 920 will then set the AP Ready Flag 922 to "OFF," as indicated by block 930, to signify to the Scheduler 40 that the Applications Processor is not yet ready to start processing any tasks.
- the Reset Control 920 will then proceed to clear the AP Input FIFO 746 and the AP Output FIFO 748, as indicated by block 932, then set all of the pointers in the Pointer Table 726 to null DID's, as indicated by block 934.
- the Reset Control will then start the AP Input Handler 744, as indicated by block 936, and wait for the Applications Processor to signify that it is ready, as indicated by decision block 938.
- the Reset Control 920 will turn the Reset Flag 924 "OFF" and the AP Ready Flag 922 "ON," signifying that the Task Communicator 44 is now ready to start processing data, as indicated by block 940.
- the Synchronizer 46 establishes and maintains the synchronization between all of the Operation Controllers in the System.
- the multi-computer architecture uses loose synchronization, which is accomplished by synchronous rounds of message transmission by each Node in the system.
- each Synchronizer 46 detects and time stamps each time dependent message received by its own Node. These time dependent messages are transmitted by every other Node in the System at predetermined intervals and they are received by all the other Nodes in the system.
- a Node will receive its own time dependent messages along with the other time dependent messages sent by the other Nodes.
- the comparison of the time stamps on a Node's own time dependent message with the time stamps on all of the other time dependent messages is what drives the fault tolerant convergence algorithm.
- the synchronization is done over two timing message intervals. These intervals are delimited by a pre-sync message and a sync message which are transmitted alternately.
- a Node is defined to be in point-to-point synchronization with another Node when it is sending its own sync and pre-sync messages at the same time the other Node is sending its sync and pre-sync messages. Since the synchronization cannot be exact, a window is specified by the user which defines the allowable error in the time that messages are received from the Nodes that are considered to be in synchronization with each other.
- the mechanism which forces the Nodes into synchronization with each other involves a calculation, done by each Node, that determines where in time the center of the last cluster of pre-sync messages occurred.
- Each Node will then determine the difference between its own pre-sync time stamp and that of the center.
- Each Node will then use that difference to adjust the length of the interval from the time it sent its pre-sync message to the transmission of its sync message.
- Fault tolerance in these calculations is required and is accomplished with an approximate agreement algorithm. System convergence is accomplished by having all Nodes repeat these steps continuously.
- Over every sync to pre-sync interval all Nodes listen to the time dependent messages received from all of the Nodes, including their own, determine a correction, and then apply the correction over the following pre-sync to sync interval.
- the pattern is repetitive, taking the form: measure error, make correction, measure error, make correction, and so on.
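- The measure-and-correct cycle can be illustrated with a toy model in which each Node steps toward the center of the observed cluster of time stamps. The simple midpoint used here stands in for the fault tolerant approximate agreement vote; it is not the patented algorithm itself.

```python
def correction(own_stamp, all_stamps):
    """Signed adjustment toward the center of the last cluster."""
    center = (min(all_stamps) + max(all_stamps)) / 2
    return center - own_stamp

def converge(stamps, rounds=10):
    """Repeat measure error / make correction, as described in the text."""
    for _ in range(rounds):
        stamps = [s + correction(s, stamps) for s in stamps]
    return stamps
```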
- the time interval from the sync to the pre-sync message is a nominal user defined value.
- the implemented synchronization logic supports three categories of operation: a cold start, a warm start, and steady state operation.
- Cold start synchronization logic handles the situation where the system has just powered up and no one Node is necessarily in synchronization with any other Node.
- a cold start is executed when no operating set of Nodes exists.
- every good Node in the system attempts to synchronize with every other good Node, and then all good Nodes simultaneously and consistently decide together which Nodes are in synchronization and whether or not the number which are synchronized is larger than the user specified start up size.
- each Node maintains a byte of information called the "in-sync-with" (ISW) vector.
- ISW in-sync-with
- the content of the "in-sync-with" vector defines which other Nodes in the system the local Node believes it is in synchronization with.
- Byzantine Agreement on the ISW vectors of all the good Nodes in the system is used to define the initial operating set.
- Byzantine Agreement is required concerning this "in-sync-with" data in order for cold starts to be fault tolerant.
- POS potential operating set
- All the other healthy Nodes not in the potential operating set will reach the same conclusion, that an operating set is formed, but they will also recognize that they are not included in the operating set and will switch to a warm start mode of operation.
- each good Node continues its efforts to synchronize with the existing operating set. Once a Node in the warm start mode of operation believes it is in synchronization with the operating set it will begin normal operation. After the warm starting Node has behaved correctly long enough, the other Nodes will admit it into the operating set.
- the last situation is the steady state mode of operation, where each operating Node simply maintains synchronization and alignment with the other good Nodes in the system.
- the steady state convergence algorithm runs under all conditions since it has the ability to converge the local Node to a common synchronization point with all other good Nodes in a fault tolerant manner.
- the real differences between warm and cold starts center around the logic used to determine when the operating set is formed and the Nodes in the operating set are ready to start scheduling tasks to be executed by their Applications Processor.
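- One way to picture the cold start decision described above, assuming Byzantine Agreement has already made every good Node's view of the ISW vectors consistent: count mutual in-sync relationships and compare the resulting group against the user specified start up size. The set logic below is illustrative only, not the patented mechanism.

```python
def potential_operating_set(isw_vectors, min_startup_size):
    """isw_vectors: dict mapping each Node to the set of Nodes it is in sync with."""
    nodes = set(isw_vectors)
    # a Node qualifies when itself plus its mutually-in-sync peers reach the
    # user specified minimum start up size
    pos = {n for n in nodes
           if 1 + sum(1 for p in nodes
                      if p != n and p in isw_vectors[n] and n in isw_vectors[p])
              >= min_startup_size}
    return pos if len(pos) >= min_startup_size else set()
```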
- the Synchronizer includes a Synchronizer Control 952 which receives the Task Interactive Consistency (TIC) messages and the System State (SS) messages from the Message Checker through a Message Checker Interface 942.
- the System State messages are the sync and pre-sync messages previously described and are distinguished by a function bit which identifies the System State message as a sync or pre-sync message.
- a Timing Signal Generator 950 generates timing signals which are transmitted to the Synchronizer Control 952.
- the signals generated by the Timing Signal Generator include the Subatomic period (SAP) signal.
- SAP Subatomic period
- the Synchronizer Control 952 also receives a Clock signal and a Reset signal from the system bus.
- the Reset signal may be either the power on Reset (RESET) signal or the internal Operation Controller Reset (OCRESET) signal. These signals have been previously discussed and need not be repeated here.
- the Synchronizer Control 952 will also receive its own 3 bit Node identification (NID) code.
- a Byzantine Voter 954 performs a Byzantine vote on the "in-sync-with" (ISW) vectors received from the other Nodes in the system.
- a Synchronizer Memory 948 stores the data received from the Message Checker Interface and other data required by the Synchronizer Control 952 for establishing and maintaining synchronization between its own Node and the other Nodes in the system.
- the Synchronizer Memory 948 has a Scratch Pad Memory 962, a Message Memory 964 and a Time Stamp Memory 966, as shown in Figure 72.
- An Error Reporter 958 receives the errors detected by the Synchronizer Control 952 and transmits them to the Fault Tolerator Interface 946.
- the Synchronizer Control 952 generates the time dependent Task Interactive Consistency (TIC) and the System State (SS) messages which are transmitted to the other Nodes in the system through the Transmitter Interface 960, as previously described.
- TIC Task Interactive Consistency
- SS System State
- Figure 76 shows the waveforms of the various signals generated by the Timing Signal Generator 950.
- the Master period (MP) is a timing signal which reflects the length of each Master period interval of the Operations Controller. This interval is the longest of the synchronization clocks and reflects the periodicity of the lowest repetition task being run in the application.
- the Master period can be considered to be the "frame size" of the application. During the Master period interval the total pattern of tasks are repeated.
- the Atomic period (AP) is a timing signal which reflects the beginning and end of each Atomic period interval. This interval is representative of the fastest repetition task being run in the application.
- the Master period described above is user specified as an integer number of Atomic periods.
- the Subatomic period is a timing signal which reflects the beginning and end of each Subatomic period interval.
- the Atomic period interval is user specified as an integer number of Subatomic periods.
- the last Atomic period (LAP) is an active high signal that windows the last Atomic period that occurs in each Master period.
- the last Subatomic period (LSAP) is an active high signal that windows the last Subatomic period that occurs in each Atomic period.
- the soft error window (SEW) is an active high signal that brackets a span of time around an event time mark that defines the soft error window for the arrival of System Synchronization messages.
- SEW soft error window
- HEW hard error window
- HEW is an active high signal that brackets a span of time around an event time mark that defines the hard error window for the arrival of system synchronization messages.
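- Assuming the soft error window lies inside the hard error window, the arrival of a synchronization message can be classified against the two windows roughly as follows (the window half-widths are illustrative parameters, not values from the specification):

```python
def classify_arrival(arrival, expected, sew_half_width, hew_half_width):
    offset = abs(arrival - expected)
    if offset <= sew_half_width:
        return "on-time"        # inside the soft error window
    if offset <= hew_half_width:
        return "soft-error"     # outside SEW but inside HEW
    return "hard-error"         # outside the hard error window
```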
- the format of the Message Memory 964 is shown in Figure 73.
- the Message Memory 964 stores for each Node the branch condition byte, the task completed vector, the next system state vector, the current system state vector, the content of the Atomic period counter, and two bytes, one reserved for a cold start and the other reserved for a warm start. This format is repeated for each Node in the system.
- the format of the Time Stamp Memory 966 is shown in Figure 74.
- the Time Stamp Memory consists of a coarse time count and a fine time count and includes an update (u) flag and a time stamp (TS) flag.
- the update flag signifies that the stored time stamp is for a System State message received during the current time stamp interval.
- the time stamp flag indicates whether the time stamp is for a System State message in which the sync function bit is set or for a System State message in which the pre-sync function bit is set.
- the coarse count of the time stamp is indicative of the number of Subatomic periods that have passed since the preceding System State message was generated.
- the fine time stamp count corresponds to the number of synchronizer clock pulses received during the last Subatomic period of the Atomic period.
- the coarse or Subatomic period counts are used primarily during the cold start and the warm start to achieve rapid convergence of the synchronization between the local Node and the other Nodes in the system.
- the fine or synchronizer clock time stamp counts are used primarily during the steady state operation to maintain the synchronization between the Nodes.
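- The two-part time stamp can be modeled as a coarse count of Subatomic periods plus a fine count of synchronizer clock pulses. The clocks-per-Subatomic-period constant below is an assumed parameter, not a value from the specification:

```python
CLOCKS_PER_SAP = 256  # assumed synchronizer clock pulses per Subatomic period

def time_stamp(pulses_since_last_message):
    """Split an elapsed pulse count into (coarse, fine) parts."""
    return divmod(pulses_since_last_message, CLOCKS_PER_SAP)

def stamp_difference(a, b):
    """Signed difference in clock pulses between two (coarse, fine) stamps."""
    return (a[0] - b[0]) * CLOCKS_PER_SAP + (a[1] - b[1])
```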
- the Time Stamp Memory 966 will store a time stamp for each Node in the system and includes a special entry for storing the time stamp of the System State messages transmitted by its own Node.
- the format of the Scratch Pad Memory 962 is shown in Figure 75.
- the Scratch Pad Memory 962 stores the "in-sync-with" (ISW) vectors for each Node in the system. These "in-sync-with" vectors are contained in the sync and pre-sync System State messages.
- the Scratch Pad Memory 962 will also store two message warning counts, one indicative of the time from the end of the warning period to the transmission of the Task Interactive Consistency message and the other indicative of the time from the end of the warning period to the transmission of the System State message.
- the Scratch Pad Memory will also store the Subatomic period count which is used to time stamp the received messages.
- the Scratch Pad Memory also has an entry storing the number of Subatomic periods per an Atomic period, the Atomic period count, and the number of Atomic periods per Master period.
- the Scratch Pad Memory also will store an actual hard error window (HEW) to warning period count and a nominal hard error window (HEW) to warning period count.
- the actual hard error window to warning period reflects the corrected length of the Atomic period between the pre-sync and sync messages which is computed from the difference between the voted time stamp value and its own time stamp value.
- the next entries in the Scratch Pad Memory are the error window parameters.
- the error window parameters include a hard error window count and a soft error window count.
- the next two entries in the Scratch Pad Memory are the computed correction for the Subatomic period delta and the computed correction for the Subatomic period count.
- the next entry is the maximum allowed correction for the Subatomic period delta.
- the final entry in the Scratch Pad Memory is the minimum start up size for determining the existence of a potential operating set.
- the data received from the Message Checker Interface 942 is passed directly to a Data Handler 968 and a Time Stamper 972.
- the Data Handler 968 will store the data in the Message Memory 964, the Scratch Pad Memory 962, or a Byzantine Voter Memory 970 as required.
- the received message is checked by an Expected Message Checker 974 and a Within Hard Error Window and Soft Error Window Checker 976. If the message is not an expected message, as shall be discussed later, the Expected Message Checker 974 will generate a sequence error signal which is transmitted to an Error Reporter 978 and to the Time Stamper 972.
- the Within Hard Error Window and Soft Error Window Checker will generate either a hard error or a soft error which is also transmitted to the Error Reporter 978 and the Time Stamper 972.
- the Time Stamper 972 will not record the time stamp in the Time Stamp Memory 966 if either a sequence error or a hard error is detected. Any message which is received outside the hard error window or not received at all is essentially ignored by the Synchronizer. However, a received vector will be generated showing a missing message error for each Node which failed to report during the hard error window interval.
- the synchronization process will not use any time stamp val ue associated with the Nodes which failed to report within the hard error window. This prevents good Nodes from trying to move towards badly out-of-sync Nodes that may possibly be faulty.
- the Time Stamper 972, however, will record the time stamp in the Time Stamp Memory if only a soft error is detected.
- the Time Stamper 972 will record the number of Subatomic periods counted in a Subatomic Period (SAP) Counter 971 as the coarse time stamp count and the number of sync clock bits from a Sync Clock 969 as the fine time stamp.
- the Time Stamper 972 will set the update flag in the Time Stamp Memory and set the time stamp flag to indicate whether the received message was a sync or pre-sync System State message.
- a Pre-sync Message Counter 980 and a Sync Message Counter 982 will count, respectively, the number of pre-sync messages and sync messages received during the particular time stamp interval.
- a Comparator 984 will determine whether the number of pre-sync messages was greater than the number of sync messages or vice versa.
- This information is communicated to the Time Stamp Voter 956 which will extract from the Time Stamp Memory 966 the time stamps corresponding to whether there were a greater number of pre-sync messages or sync messages received during the time stamp interval.
- the Time Stamp Voter 956 will vote on the received time stamps and generate a medial or voted value for the time stamps of the received messages.
- This information is passed to a Sync Correction Generator 990 which compares the time stamp of the Synchronizer's own System State message with the medial or voted value generated by the Time Stamp Voter 956 and will generate a correction delta.
- This correction delta is added to the nominal interval between System State messages in the Adder 992 to correct for the detected differences.
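- a minimal sketch of this vote-and-correct step, assuming integer time stamp counts and using the low median as the "medial" value (the patent does not specify the exact voting function):

```python
import statistics

def voted_time_stamp(stamps):
    # Medial (voted) value of the time stamps collected from the Nodes.
    return statistics.median_low(stamps)

def corrected_interval(own_stamp, stamps, nominal_interval):
    # Sync Correction Generator 990 / Adder 992, simplified: the delta between
    # the voted value and our own stamp is added to the nominal interval.
    delta = voted_time_stamp(stamps) - own_stamp
    return nominal_interval + delta

# A Node whose stamp lags the voted value lengthens its next interval.
assert corrected_interval(own_stamp=100, stamps=[98, 103, 105], nominal_interval=1000) == 1003
```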
- the sum of the nominal interval plus the correction delta is then stored in the Scratch Pad Memory 962 as the computed correction Subatomic period count and the actual HEW to warning period count, as indicated in the format of the Scratch Pad Memory shown in Figure 75.
- the computed correction Subatomic period count and the actual HEW to warning period count are used in the timing of the duration of the interval from the pre-sync System State message to the sync System State message, as shown in Figure 84.
- a Byzantine Voter 954 performs a byzantine vote on the task completed vector and the branch condition bits contained in the Task Interactive Consistency messages which are passed back to the Scheduler 40 through the Scheduler Interface 944. During a cold start the Byzantine Voter 954 will also perform a byzantine vote on the "in-sync-with" matrix transmitted in the pre-sync System State messages to generate a voted "in-sync-with" vector.
- This "in-sync-with" vector is transmitted to an Operating Condition Detector 1000 which sums the number of "in-sync-with" bits contained in the voted "in-sync-with" vector and compares this sum with the minimum start up size for a potential operating set (POS) of Nodes. If the sum of the bits in the Byzantine voted "in-sync-with" vector is greater than the minimum start up size, the Operating Condition Detector 1000 will then determine if its own Node is contained in the "in-sync-with" vector. If its own Node is contained within the "in-sync-with" vector, the Operating Condition Detector will set an Operating Flag 1004 indicating that it is ready to start operating.
- If the Operating Condition Detector 1000 determines that its own Node is not within the operating set, it will set a Warm Start Flag 1002 indicating the existence of an operating set and that it is not in synchronization with that set. This will cause the Synchronizer 46 to enter into a warm start mode of operation as previously discussed.
- the Operating Condition Detector 1000 will then transmit the "in-sync-with" (ISW) vector and the operating flag bit to the Fault Tolerator 36 through the Fault Tolerator Interface 946.
- An In-Sync-With (ISW) Detector 994 will compare the time stamp of its own System State message with each time stamp stored in the Time Stamp Memory 966 to generate an "in-sync-with" vector which is stored in an In-Sync-With (ISW) Register 996.
- the "in-sync-with" vector stored in the In-Sync-With Register 996 and the state of the Operating Flag 1004 are passed to a Message Generator 998 and are used in the generation of the next System State message.
- the output of the Message Generator 998 is passed to the Transmitter through the Transmitter Interface 960.
- the steady state operation of the Synchronizer will be discussed with reference to the flow diagrams illustrated in Figures 78 through 82 and the waveforms shown in Figures 83 and 84.
- the flow diagram shown in Figure 78 describes the operation of the Data Handler 968, the Expected Message Checker 974, and the Within Hard Error Window and Soft Error Window Checker 976.
- All of the data received from the Message Checker Interface 942 is stored in the Message Memory 964.
- the system will then inquire, as indicated by decision block 1008, if the operating flag is true. If the operating flag is not true, the system will default to either a cold start or a warm start as will be described later herein.
- the Expected Message Checker will then inquire if it is a Task Interactive Consistency (TIC) message, as indicated by decision block 1010. If it is not a Task Interactive Consistency message, then the message is a System State message, and the Expected Message Checker 974 will inquire if the Subatomic period count in the Time Stamp Memory is equal to zero, as indicated by block 1012.
- the Subatomic period count stored in the Time Stamp Memory is the two's complement of the number of Subatomic periods in the Atomic period. This Subatomic period count is incremented each time the Timing Signal Generator 950 generates a Subatomic period signal. When the Subatomic period count in the Time Stamp Memory is equal to zero, then a System State message is the expected message.
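- the two's complement loading scheme can be sketched as follows (a hedged illustration; in Python the two's complement of a count is simply its arithmetic negation):

```python
def load_sap_counter(saps_per_atomic_period: int) -> int:
    # Load the two's complement of the SAP-per-AP count, so that one
    # increment per Subatomic period signal reaches zero exactly when
    # the next System State message is expected.
    return -saps_per_atomic_period

count = load_sap_counter(8)
ticks = 0
while count != 0:   # one increment per Subatomic period signal
    count += 1
    ticks += 1
assert ticks == 8   # a System State message is expected on the 8th SAP
```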
- the Expected Message Checker will reload the time stamp counters for the Node from which the message was received, as indicated by block 1014, and then inquire if the sync/pre-sync (s/p) function bit contained in the message was equal to the complement of the (TS) flag stored in the Time Stamp Memory.
- the sync and pre-sync System State messages are sent in an alternating manner, therefore, the function bits in the received message should be the complement of the function bits of the previous message which is currently stored by the TS flag in the Time Stamp Memory 966.
- If the sync/pre-sync bit is the complement of the time stamp flag, the sequence error flag for the Node from which the message was received (NODE j) is set to false, as indicated by block 1020. Conversely, if the sync/pre-sync bit is not the complement of the time stamp flag, the sequence error flag for the Node from which the message was received is set to true, as indicated by block 1022. If the Subatomic period count in decision block 1012 is not equal to zero, the Expected Message Checker will set the sequence error flag for the Node from which the message was received to true, as indicated by block 1024, then reload the time stamp counters for Node j, as indicated by block 1026.
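- the alternation check on the sync and pre-sync function bits can be sketched as a single comparison (illustrative code; the boolean flag encoding is an assumption):

```python
def sequence_error(received_is_sync: bool, stored_ts_is_sync: bool) -> bool:
    # The received s/p function bit must be the complement of the TS flag
    # recorded for the previous System State message from that Node.
    expected_is_sync = not stored_ts_is_sync
    return received_is_sync != expected_is_sync

assert sequence_error(received_is_sync=True, stored_ts_is_sync=False) is False  # alternation OK
assert sequence_error(received_is_sync=True, stored_ts_is_sync=True) is True    # two syncs in a row
```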
- If the message is a Task Interactive Consistency message, the Expected Message Checker will inquire whether the Subatomic period count in the Time Stamp Memory is equal to or greater than zero, as indicated by decision block 1028. If the Subatomic period time stamp count is equal to or greater than zero, then a System State message should have been received and, therefore, there is a sequence error. The Expected Message Checker 974 will then set the sequence error flag for the Node from which the message was received to true, as indicated by block 1034. However, if the Subatomic period count in the Time Stamp Memory is less than zero, the Expected Message Checker 974 will increment the time stamp count stored in the Time Stamp Memory for that Node (Node j), as indicated by block 1032.
- the Within Hard Error Window and Soft Error Window Checker 976 begins by checking to determine if a sequence error has been detected, as indicated by decision block 1036. If a sequence error has been detected by the Expected Message Checker, the Within Hard Error Window and Soft Error Window Checker 976 will set the update flag in the Time Stamp Memory 966 to false, as indicated by block 1054. Otherwise, the Within Hard Error Window and Soft Error Window Checker 976 will inquire whether the message was received within the hard error window, as indicated by decision block 1040.
- If the message was not received within the hard error window, the Within Hard Error Window and Soft Error Window Checker 976 will set the hard error window flag to true, as indicated by block 1042, then set the update flag in the Time Stamp Memory to false, as indicated by block 1054. If the message was received within the hard error window, the Within Hard Error Window and Soft Error Window Checker 976 will inquire, as indicated by decision block 1044, whether the message was received within the soft error window. If the message was not received within the soft error window, the Checker will set the soft error window flag to true, as indicated by block 1046, and the Checker will proceed to ask, as indicated by decision block 1048, whether the received message was a Task Interactive Consistency (TIC) message.
- If the message is not a Task Interactive Consistency message, the Time Stamper will be enabled to time stamp the received message.
- the time stamp equals the SAP count contained in the SAP Counter 971 and the sync clock count received from the Sync Clock 969.
- If the received message is a pre-sync System State message, the Time Stamper 972 will then set the TS flag bit to pre-sync and the update flag equal to true, as indicated by block 1050.
- If the received message is a sync System State message, the Time Stamper 972 will time stamp the received message and set the time stamp flag to sync and the update flag to true, as indicated by block 1052.
- After the message has been time stamped, it is stored in the Time Stamp Memory 966, as indicated by block 1038.
- the Task Interactive Consistency (TIC) messages are not time stamped.
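- the two-window test above can be sketched as follows (hypothetical names and units; it assumes the soft error window is nested inside the hard error window and that the offset is measured from the expected arrival time):

```python
def classify_arrival(offset: int, soft_window: int, hard_window: int) -> str:
    # "hard": message is ignored and not time stamped;
    # "soft": a soft error is flagged but the message is still time stamped;
    # "ok":   arrival within both windows.
    if abs(offset) > hard_window:
        return "hard"
    if abs(offset) > soft_window:
        return "soft"
    return "ok"

assert classify_arrival(2, soft_window=5, hard_window=20) == "ok"
assert classify_arrival(12, soft_window=5, hard_window=20) == "soft"
assert classify_arrival(-30, soft_window=5, hard_window=20) == "hard"
```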
- the generation of the actual HEW to warning period count is described with reference to the flow diagram shown in Figure 80.
- the Time Stamp Voter will then inquire if the update flag is true, as indicated by decision block 1058. If the update flag is not true, indicating that the time stamp value has not been updated during the current Atomic period, the Time Stamp Voter will then increment the time stamp memory Node pointer to the next Node and inquire if the update flag of that Node is true.
- If the update flag is true, the time stamp value is loaded into the Time Stamp Voter, as indicated by block 1060, then the Node pointer to the Time Stamp Memory is incremented, as indicated by block 1062.
- the Time Stamp Voter 956 will then inquire if the Node to which the pointer is pointing is the maximum or last Node to be polled, as indicated by decision block 1064. If the Node is not the last Node (MAX NID), the process of loading the Time Stamp Voter will continue until the time stamp value from the last Node is loaded into the Time Stamp Voter 956.
- Once the Time Stamp Voter is loaded with all of the current time stamp values, it will vote on the time stamp values which were loaded into it and generate a voted time stamp value (TSv), as indicated by block 1066.
- the Sync Correction Generator 990 will then subtract the Node's own time stamp value from the voted time stamp value to generate a sync delta, as indicated by block 1068.
- the actual HEW to warning period is then generated by adding in Adder 992 the sync delta to the nominal HEW to warning period stored in the Scratch Pad Memory, as indicated by block 1070. This actual HEW to warning count is then stored in the Scratch Pad Memory, as indicated by block 1071.
- the Message Generator 998 receives the clock (CLK), the Last Subatomic (LSAP) period, and the HEW signals from the Timing Signal Generator 950. It also receives the Warm Start Flag 1002, the Operating Flag 1004, and the "in-sync-with" vector from the In-Sync-With (ISW) Register 996.
- the data used in the messages generated by the Message Generator 998 is obtained from the Sync Memory 948, which includes the Scratch Pad Memory 962, the Message Memory 964, and the Time Stamp Memory 966.
- the messages generated by the Message Generator are passed to the Transmitter Interface 960 which ultimately passes these messages to the Transmitter 30.
- the Message Generator 998 first waits until the end of the hard error window, as indicated by decision block 1074.
- the Message Generator will inquire if the Subatomic period is a Last Subatomic period, as indicated by decision block 1076. If the Subatomic period is not a Last Subatomic period, then the message to be generated is a Task Interactive Consistency (TIC) message in which the data identification code (DID) is set equal to zero, as indicated by block 1078. If the current Subatomic period is the last Subatomic period then the message to be transmitted is a System State message in which the sync/pre-sync bit is equal to the complement of the TS flag currently stored in the Time Stamp Memory, as indicated by block 1094.
- the Message Generator will inquire if the operating flag is true, as indicated by decision block 1080. If the operating flag is not true then no TIC message is to be sent. However, if the operating flag is true, the Message Generator 998 will load an Event Counter 1072 with the nominal HEW to warning count stored in the Scratch Pad Memory 962, as indicated in block 1082, then assemble a normal Task Interactive Consistency message, as indicated by block 1083. As shown in Table I, the normal Task Interactive Consistency message includes the task completed vector and the branch condition bits obtained from the Message Memory 964.
- the Message Generator will then wait until the Event Counter 1072 is incremented to zero by the clock signals, as indicated by decision block 1084.
- When the Event Counter is equal to zero, the Message Generator 998 will send the first byte of the message to the Transmitter through the Transmitter Interface 960, as indicated by block 1086, then transfer the remaining bytes of the message to the Transmitter Interface 960, as indicated by block 1088.
- the Transmitter Interface 960 will then wait for the buffer available (BA) signal from the Transmitter 30, as indicated by decision block 1090, then send the remaining bytes of the message to the Transmitter, as indicated by block 1092.
- the sending of the first byte of a message from the Synchronizer will start the warning period for the time dependent Task Interactive Consistency and System State messages.
- the Transmitter will begin the transmission of the time dependent message and will transmit a buffer available signal to the Transmitter Interface, which triggers the transferring of the remaining bytes stored in the Transmitter Interface to the Transmitter.
- the Message Generator will then inquire if the System State message to be transmitted is a sync or pre-sync message, as indicated by block 1096. If the message is a sync message, the Message Generator will load the Event Counter 1072 with the actual HEW to warning count from the Scratch Pad Memory, as indicated by block 1098, and then will generate a normal System State message, as indicated by block 1099.
- If the message is a pre-sync message, the Message Generator will load the Event Counter 1072 with the nominal HEW to warning count, as indicated by block 1097, and then will interrogate the warm start and operational flags to determine if the system is in the cold start mode, as indicated by decision block 1077.
- a cold start is indicated by both the Warm Start and Operating Flags being false.
- If the Synchronizer is not in the cold start mode, the Message Generator 998 will then generate a normal System State message, as indicated by block 1099.
- If the Synchronizer is in the cold start mode, the Message Generator will generate a cold start pre-sync message, as indicated by block 1081.
- the cold start pre-sync message has a format as indicated in Figure 85, which is different from the normal pre-sync System State message shown in Table I.
- This cold start pre-sync message contains an "in-sync-with" matrix containing the "in-sync-with" vectors received from all the operating Nodes in the system.
- the Message Generator 998 will then wait for the end of the HEW to warning period by monitoring the Event Counter, as indicated in decision block 1084.
- the Message Generator will then send the first byte of the message to the Transmitter 30, as indicated by block 1086, then transfer the remaining bytes of the message to the Transmitter Interface, as indicated by block 1088.
- the Transmitter Interface 960 will then pass the remaining bytes of the System State message to the Transmitter, as indicated by block 1092.
- the timing sequences for the transmission of the Task Interactive Consistency messages and the System State messages are shown in Figures 83 and 84, respectively.
- the Message Generator's Event Counter 1072 is loaded at the end of the hard error window (HEW) with the nominal HEW to warning count, as indicated by block 1082 in flow diagram Figure 82.
- the Message Generator 998 will then wait until the end of the HEW to warning period and then transmit the first byte of the Task Interactive Consistency message to the Transmitter, as indicated by the waveform sync dat.
- the receipt of this first byte of the Task Interactive Consistency message will initiate the beginning of the Task Interactive Consistency warning period and will also terminate the buffer available (BA) signal, as indicated by buffer available (BA) waveform, as indicated on Figure 83.
- the Transmitter will initiate the transmission of the first byte to all of the other Nodes in the system. It will also reassert the buffer available signal, causing the Transmitter Interface 960 to send the remaining data bytes to the Transmitter, as indicated by the sync dat and buffer available waveforms.
- the last byte transmitted by the Transmitter is the longitudinal redundancy code check byte, the end of which is timed to coincide with the end of the Subatomic period.
- the HEW to warning period is the same as for the Task Interactive Consistency message; however, the Transmitter will substitute the System State warning period for the Task Interactive Consistency warning period and will begin the transmission of the System State message at a point in time earlier than it would have started to transmit the Task Interactive Consistency message.
- In Figure 84 there is shown the timing sequence for the transmission of a sync System State message.
- the Event Counter 1072 in the Message Generator 998 is loaded with the actual HEW to warning count, as indicated by block 1098 in Figure 82.
- the actual HEW to warning count is the sum of the nominal HEW to warning count plus the calculated sync delta.
- the Message Generator will transmit the first byte of the sync System State message directly to the Transmitter 30 through the Transmitter Interface 960.
- the Transmitter then will initiate the transmission of the System State message at the end of the System State message warning period and will reinstate the buffer available signal, as indicated, causing the Transmitter Interface to transmit the remaining sync System State message to the Transmitter 30.
- the transmission of the last byte of the System State message defines the end of the Atomic period.
- the adding of the sync delta to the nominal HEW to warning period corrects the length of the Atomic period so that its ending should coincide with the ends of the Atomic periods generated by the other Nodes in the system, thus, establishing point-to-point synchronization with all of the other Nodes.
- the cold start procedure begins by inquiring if the Synchronizer is in the cold start mode, as indicated by decision block 1100.
- the cold start is indicated by the absence of the Warm Start Flag and the Operating Flag. If the Synchronizer is not in the cold start mode of operation it will inquire whether it is in the warm start mode of operation, as indicated by decision block 1102. If the Synchronizer is in the warm start mode of operation, as indicated by the Warm Start Flag being true, the Synchronizer will call the warm start procedure, as indicated by block 1105. Otherwise, the Synchronizer will exit the cold start routine and default to the steady state mode of operation, as indicated by block 1103.
- the Synchronizer will listen for messages from the other Nodes over the listening period shown in Figure 90.
- the Synchronizer will then inquire, as indicated by decision block 1104, if the message received is a sync or a pre-sync message, as determined from the function bits contained in the message. If the message is a pre-sync message, the message is time stamped, as indicated by block 1106, and the pre-sync Message Counter 980 is incremented, as indicated by block 1108.
- the In Sync With Detector 994 will then inquire if the time stamp of the received message minus the time stamp of its own message is less than the hard error window, as indicated by decision block 1110.
- If the difference is less than the hard error window, the "in-sync-with" flag corresponding to the Node from which the message was received is set to true, as indicated by block 1112. Otherwise, if the difference between the time stamp of the received message and the time stamp of its own message is greater than the hard error window, the "in-sync-with" flag in the In-Sync-With Register 996 is set to false, as indicated by block 1114.
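- the per-Node comparison of decision block 1110 amounts to the following sketch (reading the test as an absolute difference against the hard error window is an assumption, as are the names):

```python
def in_sync_with_vector(own_stamp: int, node_stamps: list, hard_window: int) -> list:
    # One flag per Node: true when that Node's time stamp lies within the
    # hard error window of our own System State message's time stamp.
    return [abs(stamp - own_stamp) < hard_window for stamp in node_stamps]

assert in_sync_with_vector(100, [99, 104, 180], hard_window=10) == [True, True, False]
```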
- If the message is a sync message, the Time Stamper will time stamp the received message and set the TS flag to sync and the update flag to true, as indicated by block 1116.
- the Synchronizer will then increment the Sync Message Counter 982, as indicated by block 1118.
- the operation of the Sync Correction Generator 990 and Adder 992 shall be explained with reference to the flow diagram illustrated in Figure 87.
- the Sync Correction Generator 990 first inquires, as indicated by decision block 1120, if the listening period is done.
- the listening period during a cold start is equal to a full Atomic period plus the hard error window, as indicated in Figure 90.
- the Within Hard Error Window and Soft Error Window Checker 976 will not generate any error signals during this period in response to the Warm Start Flag and the Operating Flag being set to false.
- Once the listening period is done, the Sync Correction Generator 990 will inquire if the number of pre-sync counts stored in the Pre-Sync Message Counter 980 is equal to the number of sync counts stored in the Sync Message Counter 982. If the pre-sync count is equal to the sync count, the Sync Correction Generator will set the Subatomic period delta equal to zero and the sync delta equal to zero, as indicated by block 1148.
- the Sync Correction Generator 990 will then inquire if the pre-sync count is greater than the sync count, as indicated by block 1124. If the pre-sync count is greater than the sync count, the Time Stamp Voter 956 will extract from the Time Stamp Memory all of the time stamps for which the TS flag is set to pre-sync and the update flag is set to true. The Time Stamp Voter 956 will then generate a voted Subatomic period count and a voted sync clock count using the extracted values. The Sync Correction Generator 990 will then subtract its own Subatomic period count from the voted Subatomic period count to generate the SAP delta and will subtract its own sync clock count from the voted sync clock count to generate a sync delta, as indicated by block 1126.
- Otherwise, if the sync count is greater than the pre-sync count, the Time Stamp Voter 956 will generate a SAP delta and a sync delta using the time stamps having their TS flag set to sync and the update flag equal to true, as indicated by block 1146.
- If the Subatomic period delta is equal to zero, the Sync Correction Generator 990 will set the sync delta equal to the computed sync delta, as indicated in block 1129.
- the Sync Correction Generator 990 will then inquire if the sync delta is greater than the maximum sync delta, as indicated by decision block 1132. If it is, the Sync Correction Generator will set the sync delta equal to the maximum sync delta stored in the Scratch Pad Memory 962, as indicated in Figure 75. If the sync delta is not greater than the maximum sync delta, as determined by decision block 1132, the Sync Correction Generator will inquire if the sync delta is greater than the two's complement of the maximum sync delta, as indicated by decision block 1136.
- If the sync delta is not greater than the two's complement of the maximum sync delta, the Sync Correction Generator 990 will set the sync delta equal to the two's complement of the maximum sync delta, as indicated by block 1138. Otherwise, the sync delta will remain the computed sync delta.
- the Sync Correction Generator 990 will inquire if the Subatomic period delta is greater than zero, as indicated by decision block 1128. If the Subatomic period delta is greater than zero, the Sync Correction Generator will set the Subatomic period delta equal to the Subatomic period delta minus 1 and the sync delta equal to the maximum sync delta, as indicated in block 1130. Otherwise, the Sync Correction Generator will set the Subatomic period delta equal to the Subatomic period delta plus 1 and the sync delta equal to the two's complement of the maximum sync delta, as indicated by block 1144.
- the actual Subatomic period per Atomic period count is generated by adding the Subatomic period delta to the nominal Subatomic period per Atomic period count, as indicated in block 1140.
- the actual HEW to warning period is generated by adding the sync delta to the nominal HEW to warning period in the Adder 992, as indicated by block 1141.
- the actual Subatomic period per Atomic period count and the actual HEW to warning period count are stored in the Scratch Pad Memory 962, in the locations identified in Figure 75.
- the final operation of the Sync Correction Generator 990 is to set the Pre-sync Message Counter 980 and the Sync Message Counter 982 to zero, as indicated in block 1142.
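- the clamping of the computed correction in blocks 1132 through 1138 can be sketched as follows (illustrative; the negative bound is the two's complement of the maximum sync delta, which in ordinary arithmetic is simply its negation):

```python
def clamp_sync_delta(delta: int, max_delta: int) -> int:
    # Limit the correction applied in any one Atomic period so that a badly
    # out-of-sync (possibly faulty) Node cannot drag a good Node far.
    if delta > max_delta:
        return max_delta
    if delta < -max_delta:   # -max_delta is the two's complement bound
        return -max_delta
    return delta

assert clamp_sync_delta(7, max_delta=5) == 5
assert clamp_sync_delta(-9, max_delta=5) == -5
assert clamp_sync_delta(3, max_delta=5) == 3
```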
- the operation of the Data Handler 968 during the cold start mode of operation is shown in the flow diagram of Figure 88.
- the Data Handler inquires if the sync/pre-sync bit is a sync bit, as indicated by decision block 1150. If the sync/pre-sync function bit contained in the message from Node j is a sync bit, the Data Handler, as indicated by block 1152, will store the "in-sync-with" vector of the received message in the In-Sync-With matrix (row j) contained in the Scratch Pad Memory, as shown in Figure 75.
- If the sync/pre-sync function bit contained in the message is a pre-sync bit, the In-Sync-With matrix contained in the pre-sync message is stored in the Byzantine Voter Memory 970, as indicated by block 1154.
- the determination of a potential operating set of Nodes and the setting of the Operating and Warm Start Flags shall be discussed relative to the flow diagram shown in Figure 89.
- the Byzantine Voter 954 will wait until the listening period is over, as indicated by decision block 1156, then execute a byzantine vote using the In-Sync-With matrix stored in the Byzantine Voter Memory 970, as indicated by block 1157. Since each Node sends an In-Sync-With matrix which is stored in the Byzantine Voter Memory, these In-Sync-With matrices form a three-dimensional cube of "in-sync-with" vectors, as shown in Figure 94.
- the Byzantine Voter makes a first vote through the In-Sync-With matrix, as shown by the arrow 1204 in Figure 94, which will reduce the three-dimensional matrix to a two-dimensional matrix, as shown in Figure 95.
- the Byzantine Voter 954 will then take a second vote in a direction of the arrow 1206, shown in Figure 95, to generate a Byzantine Agreement as to which Nodes are in synchronization with each other.
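- the two successive votes over the cube of In-Sync-With matrices can be sketched with a simple bitwise majority (a simplification; the patent's byzantine vote may differ, for example in its tie handling):

```python
def majority(vectors):
    # Bitwise majority over a list of equal-length bit vectors.
    n = len(vectors)
    return [int(sum(col) * 2 > n) for col in zip(*vectors)]

def byzantine_isw(cube):
    # First vote (arrow 1204): collapse the per-Node matrices to one matrix.
    matrix = [majority(rows) for rows in zip(*cube)]
    # Second vote (arrow 1206): collapse the matrix rows to one vector.
    return majority(matrix)

cube = [
    [[1, 1, 0], [1, 1, 0], [0, 0, 1]],  # matrix as reported by Node 0
    [[1, 1, 0], [1, 1, 0], [0, 0, 1]],  # Node 1 agrees
    [[0, 1, 1], [1, 0, 1], [1, 1, 0]],  # Node 2 reports differently
]
assert byzantine_isw(cube) == [1, 1, 0]
```

Here a single disagreeing Node cannot change the agreed "in-sync-with" vector, which is the point of the two-stage vote.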
- the Byzantine Agreement is then forwarded to the Operating Condition Detector 1000 as a potential operating set (POS), as indicated by block 1158.
- the Operating Condition Detector 1000 will then compare the number of Nodes in the potential operating set with the minimum number of Nodes required for start up, as indicated by decision block 1160.
- If the number of Nodes in the potential operating set is not greater than the minimum start up size, the Operating Condition Detector will set the Warm Start Flag 1002 and the Operating Flag 1004 to false, as indicated by block 1161. However, if the number of Nodes in the potential operating set is greater than the start up size, the Operating Condition Detector 1000 will then determine if its own Node is in the potential operating set, as indicated by block 1162. If the Node is in the potential operating set, the Operating Condition Detector will set the Operating Flag equal to true, as indicated by block 1164, and then send the potential operating set to the Fault Tolerator along with the Operating Flag, as indicated by block 1166.
- If its own Node is not in the potential operating set, the Operating Condition Detector will set the Warm Start Flag 1002 to true, as indicated in block 1168.
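- the flag-setting decisions of blocks 1160 through 1168 reduce to the following sketch (names are illustrative, and the potential operating set is modeled as the Byzantine-voted "in-sync-with" bit vector):

```python
def start_flags(potential_operating_set, own_node, min_startup_size):
    # potential_operating_set: voted "in-sync-with" bit vector, one bit per Node.
    if sum(potential_operating_set) <= min_startup_size:
        return {"operating": False, "warm_start": False}  # keep cold starting
    if potential_operating_set[own_node]:
        return {"operating": True, "warm_start": False}   # ready to operate
    return {"operating": False, "warm_start": True}       # set exists without us

assert start_flags([1, 1, 1, 0], own_node=0, min_startup_size=2) == \
    {"operating": True, "warm_start": False}
assert start_flags([1, 1, 1, 0], own_node=3, min_startup_size=2) == \
    {"operating": False, "warm_start": True}
```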
- the setting of the Warm Start Flag to true will switch the operation of the Synchronizer from the cold start mode to the warm start mode, as indicated by block 1105 in Figure 86.
- the potential operating set and the Operating Flag transmitted to the Fault Tolerator are transferred to the Scheduler 40 and initiate the operation of the Operations Controller.
- Figure 90 shows the operation of the Synchronizer during a cold start.
- each Synchronizer will transmit an initial sync System State message in which the "in-sync-with" vector is all zeros.
- the Time Stamper 972 will then time stamp the System State messages received from all of the other Nodes in the system during the time stamp listening period which is equal to an Atomic period plus the hard error window interval, as indicated.
- the Synchronizer will count the number of Subatomic periods which elapsed from the initial sync System State message and will send a pre-sync System State message at the appropriate time.
- the Synchronizer will process the received time stamps and will generate the required SAP delta and the sync deltas and adjust the interval between the pre-sync message and the next sync message.
- the Synchronizer will also compare its time stamp with the voted time stamp and determine which Nodes it is in synchronization with.
- the Synchronizer will again transmit a sync message which contains the generated "in-sync-with" vector.
- the Synchronizer will collect and store the "in-sync-with" vectors received from the other Nodes in the Scratch Pad Memory and assemble an "In-Sync-With" matrix in the Scratch Pad Memory.
- the Synchronizer then will count the nominal number of Subatomic periods per Atomic period and will generate the special "cold-start" pre-sync System State message which contains the In-Sync-With matrix assembled in the Scratch Pad Memory.
- the Synchronizer will time stamp all of the sync messages received from the other Nodes.
- the Synchronizer will collect the In-Sync-With matrices transmitted by the other Nodes and store them in the Byzantine Voter Memory 970.
- after the end of the hard error window, the Synchronizer will compute the sync correction for the interval between the pre-sync and the next sync message to effect synchronization between the Nodes. It will then determine its own "in-sync-with" vector and perform a byzantine vote on the In-Sync-With matrices stored in the Byzantine Voter Memory.
- the Synchronizer will also test to determine if a potential operating set exists and whether or not its own Node is included in the potential operating set.
- the Synchronizer will once again transmit a sync System State message which will include its own “in-sync-with” vector. It will also assemble a new "in-sync-with” matrix from the "in-sync-with” vectors generated by the other Nodes between the second and the third pre-sync System State message. This process is repeated until a potential operating set is determined by the result of the byzantine vote on the In-Sync-With matrices stored in the Byzantine Voter Memory.
- the Synchronizer during a warm start will be discussed relative to the flow diagram shown in Figure 91 and the timing diagram shown in Figure 92.
- the Synchronizer recognizes the existence of a potential operating set and its main function is to establish synchronization with that operating set.
- the warm start begins by the detection of the warm start flag, as indicated by decision block 1170. If the warm start flag is true, the Time Stamper will time stamp each received message, as indicated by block 1172. The In Sync With Detector 994 will then determine if it is "in-sync-with" any of the other Nodes, as indicated by decision block 1174, in which the difference between the Node's own time stamp and the time stamp for each received message is compared with the hard error window.
- the "in-sync-with" flag in the ISW Register 996 is set to true for each Node in which this occurs, as indicated by block 1176. If the difference between its own time stamp and the time stamp of the received message is greater than the hard error window interval, the ISW Detector 994 will set the "in-sync-with" bit for that particular Node stored in the ISW Register 996 to false, as indicated by block 1178.
- the Synchronizer will time stamp all of the System State messages received during a listening period which is equal to one Atomic period plus the hard error window interval. This is the same listening period used during a cold start.
- the Synchronizer will compute the sync correction which will adjust the length of the Atomic period between the pre-sync and the next sync System State message, as indicated by block 1184. The computation of this correction is the same as the computation used during a cold start.
- if the Operating Condition Detector 1000 concludes that its own Node is in synchronization with the existing operating set of Nodes, the Operating Condition Detector will set the operating flag equal to true and the warm start flag equal to false, as indicated by block 1188, and then it will send the "in-sync-with" vector and the operating flag to the Fault Tolerator 36, as indicated by block 1190.
- the Fault Tolerator 36 will use this in-sync-with vector as its initial system state vector during the subsequent start up operations.
- the Synchronizer will only transmit sync and pre-sync System State messages in an alternating sequence.
- the Synchronizer will compute the sync correction to adjust the length of the Atomic period following the pre-sync message to effect synchronization with the existing operating set. It will also generate its own local "in-sync-with" vector during the same processing interval and test this "in-sync-with" vector to determine if its own Node is in synchronization with the operating set. If its own Node is in synchronization with the existing operating set, the Synchronizer will then go to the operating state and will exit the warm up state. As shown in Figure 92, this process is repeated until the Synchronizer is in sync with the existing operating set.
- the Synchronizer also performs a byzantine vote on the task completed vector and the branch condition bits, as previously described with reference to the Scheduler 40.
- the task completed vector and the branch condition bits are embodied as separate bytes in the Task Interactive Consistency and the System State messages and are stored in the Message Memory 964.
- the Synchronizer will transfer the task completed vectors from the Message Memory 964 to the Byzantine Voter Memory 970, as indicated by block 1194. After all the task completed vectors are transferred to the Byzantine Voter Memory the Byzantine Voter 954 will execute a byzantine vote on all of the transferred task completed vectors and generate a voted task completed (TC) vector, as indicated by block 1196. The Synchronizer will then transfer the branch condition bits to the Byzantine Voter Memory 970, as indicated by block 1198, then execute a byzantine vote to generate voted branch condition bits, as indicated by block 1200. The Byzantine Voter 954 will then pass the voted task completed vector and the voted branch condition bits to the Scheduler, as indicated by block 1202. This assures that the Scheduler in each Node will record the task completed in a consistent fault tolerant manner.
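Because the task completed vectors and branch condition bits have already been exchanged among the Nodes, a single column-wise majority vote suffices here, as the text indicates. A sketch under the same assumptions as before (plain lists of 0/1 bits, hypothetical function name):

```python
def vote_bits(reported):
    # reported[k][j]: bit j (a task-completed or branch-condition bit) as
    # reported by Node k. The voted bit is the strict majority of the
    # values reported down each column.
    n = len(reported)
    width = len(reported[0])
    return [1 if sum(row[j] for row in reported) > n // 2 else 0
            for j in range(width)]
```

Applying the same voted result in every Node is what lets each Scheduler record task completion in a consistent, fault tolerant manner.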
- the function of the Byzantine Voter is to guarantee consistency among the Nodes in reaching agreement on certain critical matters.
- the reliability of a distributed fault-tolerant system depends on the ability of all the non-faulty Nodes to reach a consistent agreement despite the presence of one or more faulty Nodes. Since all failure modes of the faulty Node cannot be enumerated, any mechanism for achieving agreement must be provably correct in the presence of arbitrary failures.
- each Synchronizer will generate its own “in-sync-with” vector which is transmitted to all of the other Nodes in the System State messages.
- Each Node will store the "in-sync-with” vectors received from all of the other Nodes in the Scratch Pad Memory 962 to form an "in-sync-with" matrix, as shown in Figure 75.
- this "in-sync-with" matrix is transmitted with each pre-sync System State message to all of the other Nodes in the system.
- Each Synchronizer will then store each of these "in-sync-with" matrices in the Byzantine Voter Memory to form a three-dimensional cube, as shown in Figure 94. This constitutes the two rounds of transmission required for a Byzantine Agreement.
- the Byzantine Voter will first vote on the value of each "in-sync-with" bit of the matrix longitudinally through the matrix, as indicated by the direction of the arrow 1204 in Figure 94.
- the first vote will reduce the three-dimensional cube to a two-dimensional matrix, as shown in Figure 95, where each "in-sync-with" bit is the voted value of the first vote.
- the Byzantine Voter 954 will then vote on the values of the "in-sync-with" bits in each column of the "in-sync-with" matrix shown in Figure 95.
- the direction of the second vote by the Byzantine Voter is indicated by arrow 1206.
- the result of the second vote will be a Byzantine Agreement of the individual "in-sync-with" bits for each Node in the system, which is transmitted to the Operating Condition Detector 1000, as shown in Figure 77.
- the circuit details of the Byzantine Voter are shown in Figure 96.
- the Data Handler 968 will load the data to be voted on by the Byzantine Voter into the Byzantine Voter Memory 970.
- a Byzantine Voter Control Logic 1230 will activate an Address Generator 1210 in response to the message type and the operating flags.
- the Byzantine Voter will vote on the task completed and branch condition vectors contained in the Task Interactive Consistency messages and the "in-sync-with" matrices contained in the pre-sync System State messages during the cold start mode of operation.
- the Address Generator 1210 will address the Byzantine Voter Memory in the appropriate manner and store the addressed data in a Data Register 1208.
- Each bit in the Data Register 1208 is applied to one input of a plurality of AND gates 1212 through 1226. Each AND gate receives a respective one of the data bits stored in the Data Register 1208.
- a Decoder 1228, responsive to the addresses being generated by the Address Generator, will selectively deactivate one of the AND gates which corresponds to the "in-sync-with" bit generated by each Node with respect to itself, as indicated by the X's in the blocks shown in Figure 95.
- a Message Counter 1284 monitors the number of vectors or matrices loaded into the Byzantine Voter Memory 970 and generates a two's complement value corresponding to one half of the number of vectors or matrices loaded into the Byzantine Voter Memory 970. This value is loaded into a plurality of Accumulators 1264 through 1278 through a like plurality of 2:1 Multiplexers 1232 through 1246 and Adders 1248 through 1262. Under the control of the Byzantine Voter Control Logic, the 2:1 Multiplexers 1232 through 1246 are then switched to the outputs of the AND gates 1212 through 1226 and the content stored in the Data Register is added to the amount stored in the Accumulators.
- the Byzantine Control Logic will then load the data from the Byzantine Voter Memory into the Data Register 1208 in accordance with the message type and the operating flags. For example, if the byzantine vote is being taken among the "in-sync-with" matrices generated during the cold start, the Address Generator 1210 will sequentially load the "in-sync-with" vector for Node 0 from the matrix transmitted by Node 0 and then sequentially from the matrices transmitted by Node 1 through Node 7. During the adding of the bit value in each of the Adders 1248 through 1262 to the amount stored in the Accumulators, an overflow bit will be generated when the sum is greater than zero. Any overflow bits generated during the addition process will be stored in a Byzantine Voted Value Register 1280.
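The accumulator arrangement implements a majority vote arithmetically: preloading the two's complement of half the input count means the running sum goes positive only when more than half of the data bits are ones. A software model of that trick (a sketch of the datapath behavior, not the gate-level circuit):

```python
def majority_via_preload(bits):
    # Preload the accumulator with -(N/2), as the Message Counter 1284 does,
    # then add each data bit. The voted value is 1 exactly when the final
    # sum is greater than zero, i.e. when a strict majority of bits are 1.
    acc = -(len(bits) // 2)
    for b in bits:
        acc += b
    return 1 if acc > 0 else 0
```

With eight inputs the preload is -4, so at least five ones are needed for the sum to exceed zero, which matches a strict majority of eight.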
- the content of the Byzantine Voted Value Register is passed to a 1:3 Multiplexer 1282 which stores this data in the Byzantine Voter Memory 970 to form the two-dimensional matrix, such as shown in Figure 95.
- the Address Generator will then index and will process the "in-sync-with" vectors from Node 1 from the matrices from Node 0 through Node 7 as previously done with regard to the "in-sync-with" vectors of Node 0.
- First the Message Counter 1284 will load the Accumulators with a two's complement value corresponding to one half of the number of matrices that will be processed.
- the Address Generator will then load the "in-sync-with" vectors generated by Node 1 taken from the matrices received from Node 0 through Node 7, as previously described.
- the overflow bits, signifying that the results from the addition exceed zero, are stored in the Byzantine Voted Value Register 1280 and are also stored back into the Byzantine Voter Memory 970 through the Multiplexer 1282. This process is repeated for the "in-sync-with" vector generated by each of the Nodes and is terminated when the vectors from Node 7 are completely processed and all the Byzantine Voted Values are stored back into the Byzantine Voter Memory 970, forming the two-dimensional matrix shown in Figure 95.
- the Byzantine Voter Control Logic 1230 will initiate the second vote in which the voting is taken down the columns, as indicated by arrow 1206 in Figure 95.
- the Address Generator 1210 will load the column for Node 0 into the Data Register 1208.
- the Message Counter again will load the two's complement corresponding to the value of one half the number of bits to be processed by the Byzantine Voter into the Accumulators 1264 through 1278.
- the Adders will then add the bits stored in the Data Register to the value stored in the Accumulators 1264 through 1278. This process is repeated until the columns for all of the Nodes have been processed. Again, the overflow bits from the Adders 1248 through 1262 are stored in the Byzantine Voted Value Register 1280.
- the Byzantine Voter Control Logic 1230 will then activate the 1:3 Multiplexer to pass the "in-sync-with" vector stored in the Byzantine Voted Value Register to the Operating Condition Detector 1000, as previously described.
- This "in-sync-with" vector represents a Byzantine Agreement on which Nodes are in synchronization with each other.
- when the Byzantine Voter is voting on the task completed vectors and the branch condition bits contained in the Task Interactive Consistency and System State messages, the Data Handler will load these values into the Byzantine Voter Memory 970.
- the Byzantine Voter Control Logic 1230 will then activate the Address Generator 1210 to load the columns of the task completed vectors into the Data Register 1208, as previously described with reference to the second vote on the "in-sync-with" vectors.
- the voting process is then identical to the voting process for the second vote on the "in-sync-with” vectors and the voted value is loaded into the Byzantine Voted Value Register from the overflow outputs of the Adders 1248 through 1262.
- the Byzantine Voter Control Logic 1230 will then activate the 1:3 Multiplexer 1282 to pass the voted task completed vector and the voted branch condition bits to the Scheduler Interface 944, as previously described.
Abstract
An operations controller (12) for a multiple node fault tolerant processing system having a transmitter (30) for transmitting inter-node messages, a plurality of receivers (32a-32n), each receiving inter-node messages from only one of the nodes, and a message checker (34) for checking each received message for physical and logical errors. A fault tolerator (36) assembles all of the errors detected and decides which nodes are faulty based on the number and severity of the detected errors. A voter (38) generates a voted value for each value received from the other nodes, which is stored in a data memory (42) by a task communicator (44). A scheduler (40) selects the tasks to be executed by an applications processor (14); the selected task is passed to the task communicator (44). The task communicator (44) passes the selected task and the data required for the execution of that task to the applications processor (14) and transmits the data resulting from that task to all of the nodes in the system. A synchronizer (46) synchronizes the operation of its own node with all of the other nodes in the system.
Description
AN OPERATIONS CONTROLLER FOR A FAULT TOLERANT MULTIPLE NODE PROCESSING SYSTEM
CROSS REFERENCE
This invention is related to commonly assigned, copending patent applications (Docket Numbers ATC-102-A, ATC-104-A, ATC-105-A, ATC-105-B, ATC-106-A, ATC-107-A, and ATC-109-A) filed concurrently herewith.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention is related to the field of multiple node processing systems and in particular to an operations controller for each node in the multiple node processor system for controlling the operation of its own node in a fault tolerant manner.
2. Description of the Prior Art
The earliest attempts to produce fault tolerant computer systems provided redundant computers in which each computer simultaneously executed every task required for the control operation. Voting circuits monitoring the outputs of the multiple computers determined a majority output which was assumed to be the correct output for the system. In this type of system, a faulty computer may or may not be detected and the faulty computer may or may not be turned off.
The redundant computer concept, although highly successful, is expensive because it requires multiple computers of equivalent capabilities. These systems require powerful computers because each computer has to perform every task required for the operation of the system. As an alternative, the master-slave concept was introduced, in which the operation of several computers was controlled and coordinated by a master control. The master
control designated which tasks were to be executed by the individual computers. This reduced the execution time of the control operation because all the computers were no longer required to execute every task, and many of the tasks could be executed in parallel. In this type of system when a computer is detected as faulty, the master could remove it from active participation in the system by assigning the task that would normally have been assigned to the faulty computer to the other computers. The problem encountered in the master-slave concept is that the system is totally dependent upon the health of the master and if the master fails then the system fails. This defect may be rectified by using redundant master controls, however, the increased cost of redundant masters limits the applicability of these systems to situations where the user is willing to pay for the added reliability. Typical of such situations are the controls of nuclear power plants, space exploration and other situations where failure of the control system would endanger lives.
Recent improvements to the master-slave and redundant execution fault tolerant computer systems discussed above are exemplified in the October 1978 proceedings of the IEEE, Volume 66, No. 10, which is dedicated to fault tolerant computer systems. Of particular interest are the papers entitled "Pluribus: An Operational Fault Tolerant Multiprocessor" by D. Katsuki et al., Pages 1146 - 1159, and "SIFT: The Design and Analysis of a Fault Tolerant Computer for Aircraft Control" by J. H. Wensley et al., Pages 1240 - 1255. The SIFT system uses redundant execution of each system task and of the master control functions. The Pluribus system has a master copy of the most current information which can be lost if certain types of faults occur.
More recently a new fault tolerant multiple computer architecture has been disclosed by Whiteside et al, in U.S. Patent No. 4,256,547, in which each of the individual task execution nodes has an applications processor and an operations controller
which functions as a master for its own node.
The present invention is an operations controller for a fault tolerant multiple node processing system based on the system taught by Whiteside et al in U.S. Patent No. 4,323,966 which has improved fault tolerance and control capabilities. A predecessor of this operations controller has been described by C. J. Walter et al in their paper "MAFT: A Multicomputer Architecture for Fault-Tolerance in Real-Time Control Systems" published in the proceedings of the Real-Time System Symposium, San Diego, December 3 - 6, 1985.
SUMMARY OF THE INVENTION
The invention is an operations controller for each node in a fault tolerant multiple node processing system. Each node has an applications processor for executing a predetermined set of tasks and an operations controller for establishing and maintaining its own node in synchronization with every other node in the system, for controlling the operation of its own node, and for selecting the task to be executed by its own applications processor in coordination with all of the other nodes in the system through the exchange of inter-node messages.
The operations controller has a transmitter for transmitting all of the inter-node messages generated by its own operations controller to all the other nodes in the system. The transmitter has an arbitrator for deciding the order in which the inter-node messages are to be transmitted when two or more messages are ready for transmission. A plurality of receivers, each receiver associated with a respective one node and only receiving messages from that node. A message checker for checking each received message for physical and logical errors to generate an inter-node error report containing an error status byte identifying each detected error. The message checker polls each of the receivers to unload the received messages in a repetitive
sequence. A voter subsystem has a voter for voting on the content of all error free messages containing the same information to generate a voted value and has a deviance checker for generating an inter-node error report identifying each node which sent a message used in the generation of the voted value whose content differed from the voted value by more than a predetermined amount.
A fault tolerator for passing all error free messages received from the message checker to the voter subsystem, for generating an inter-node error message containing all of the error reports accumulated by all of the subsystems of its own operations controller, for generating a base penalty count for each node in the system based on the number of detected errors and the severity of the detected errors identified in such inter-node error reports, for globally verifying the base penalty count for each node through the exchange of inter-node base penalty count messages, and for generating a system state vector identifying each node whose base penalty count exceeds a predetermined exclusion threshold. A task scheduler for selecting the next task to be executed by its own applications processor from an active task list, for maintaining a global data base on the scheduling and execution of each node through the exchange of task completed/started messages, and for generating an error report identifying each node whose scheduling process differs from the scheduling process replicated for that node.
The operations controller also has a data memory and a task communicator for storing the voted values in the data memory. The task communicator further has means for passing the identity of the task selected by the scheduler to the applications processor, means for extracting the voted values required for the execution of the selected task and passing them to the applications processor, means for generating the task completed/started messages identifying the task just completed and the new task started by the applications processor and for generating inter- node data value messages containing the data values generated by
the applications processor in the execution of the selected tasks.
The operations controller further includes a synchronizer for synchronizing the operation of its own node with all of the other non-faulty nodes in the system through the exchange of inter-node time-dependent messages.
The object of the invention is an architecture for a multiple node fault tolerant processing system based on the functional and physical partitioning of the application task and the overhead functions.
Another object of the invention is a distributed multiple node processing system in which no one node is required to execute every task of the applications task and in which failure of one or more nodes need not prevent execution of any applications task.
Another object of the invention is a multiple node computer architecture in which task selection and fault detection are globally verified.
Another object of the invention is a fault tolerant computer architecture in which the exclusion or readmittance of a node into the active set of nodes is made on a global basis.
These and other objects of the invention will become more apparent from a reading of the specification in conjunction with the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a block diagram of the multi-computer architecture;
Figure 2 is a block diagram of the Operations Controller;
Figure 3 is the master/atomic period timing diagram;
Figure 4 is the atomic/subatomic period timing diagram;
Figure 5 is a block diagram of the Transmitter;
Figure 6 is a circuit diagram of one of the interfaces;
Figure 7 is a block diagram of the Arbitrator;
Figure 8 shows waveforms for the Self-Test Arbitration Logic;
Figure 9 is a block diagram of the Longitudinal Redundancy Code Generator;
Figure 10 is a block diagram of a Receiver;
Figure 11 is a block diagram of the Message Checker;
Figure 12 is a block diagram of the decision logic for the Between Limits Checker;
Figure 13 is the format for the error status byte generated by the Message Checker;
Figure 14 is a block diagram of the Fault Tolerator;
Figure 15 shows the partitioning of the Fault Tolerator RAM;
Figure 16 shows the format of the Message partition of the Fault Tolerator RAM;
Figure 17 shows the format of the Error Code Files partition of the Fault Tolerator RAM;
Figure 18 shows the format of the Group Mapping partition of the Fault Tolerator RAM;
Figure 19 shows the format of the Error Code Files partition of the Fault Tolerator RAM;
Figure 20 shows the format of the Penalty Weight partition of the Fault Tolerator RAM;
Figure 21 is a block diagram of the Fault Tolerator's Message Checker Interface;
Figure 22 is a block diagram of the Fault Tolerator's Error Handler;
Figure 23 is a block diagram of the Error Handler's Error Consistency Checker;
Figure 24 is a block diagram of the Error Handler's Validity Checker;
Figure 25 illustrates the format of the error byte in an error message;
Figure 26 is a timing diagram of the reconfiguration sequence;
Figure 27 is a block diagram of the Voter Subsystem;
Figure 28 is a flow diagram for the Upper and Lower Medial Value Sorters;
Figure 29 is a circuit diagram of the Lower Medial Value Sorter;
Figure 30 is a flow diagram for the Averaging Circuit;
Figure 31 is a circuit diagram of the Averaging Circuit;
Figure 32 is a flow diagram of the Deviance Checker;
Figure 33 is a circuit diagram of a Deviance Checker;
Figure 34 is a block diagram of the Scheduler;
Figure 35 shows the data format of the Scheduler RAM;
Figure 36 shows the data format of the Scheduler ROM;
Figure 37 is a block diagram of the Scheduler's Task Selector Module;
Figure 38 is a flow diagram of the Wake-Up Sequencer's operation;
Figure 39 is a flow diagram of the Execution Timer's operation;
Figure 40 is a flow diagram of the TIC Handler's operation;
Figure 41 is a flow diagram of the TIC Handler's Selection Queue Update sub-process;
Figure 42 is a flow diagram of the TIC Handler's Completion/Termination sub-process;
Figure 43 is a flow diagram of the TIC Handler's Execution Timer Reset sub-process;
Figure 44 is a flow diagram of the TIC Handler's Priority Scan List Update sub-process;
Figure 45 is a flow diagram of the Priority Scanner's operation;
Figure 46 is a flow diagram of the Next Task Selector's operation;
Figure 47 is a block diagram of the Reconfigure Module;
Figure 48 is a flow diagram for the Task Swapper's operation in response to a Node being excluded from the operating set;
Figure 49 is a flow diagram of the Task Swapper's operation in response to a Node being readmitted to the operating set;
Figure 50 is a flow diagram of the Task Reallocator's operation in response to a Node being excluded from the operating set;
Figure 51 is a flow diagram of the Task Status Matcher's operation;
Figure 52 is a block diagram of the Task Communicator;
Figure 53 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Store Data Control;
Figure 54 is a flow diagram of the Store Data Control's operation;
Figure 55 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the DID Request Handler;
Figure 56 is a flow diagram of the DID Request Handler's operation;
Figure 57 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Task Terminated Recorder;
Figure 58 is a flow diagram of the Task Terminated Recorder's operation;
Figure 59 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Task Started Recorder;
Figure 60 is a flow diagram of the Task Started Recorder's operation;
Figure 61 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the AP Input Handler;
Figure 62 is a flow diagram of the AP Input Handler's operation;
Figure 63 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the AP Output Handler;
Figure 64 is a flow diagram showing the AP Output Handler's operation;
Figure 65 shows the format of the DID Information as stored in the DID List;
Figure 66 shows the format of the DID information with the NUDAT bit appended;
Figure 67 is a partial block diagram of the Task Communicator showing the subsystems involved in "reconfiguration";
Figure 68 is a flow diagram showing the operation of the Reconfigure Control during reconfiguration;
Figure 69 is a partial block diagram of the Task Communicator showing the subsystems involved in "reset";
Figure 70 is a flow diagram of the Reset Control during reset;
Figure 71 is a block diagram of the Synchronizer;
Figure 72 shows the format of the Synchronizer Memory;
Figure 73 shows the format of the Message Memory;
Figure 74 shows the format of the Time Stamp Memory;
Figure 75 shows the format of the Scratch Pad Memory;
Figure 76 shows the waveforms of the signals generated by the Timing Signal Generator;
Figure 77 is a block diagram of the Synchronizer Control;
Figure 78 is a flow diagram showing the operation of the Data Handler and Expected Message Checker;
Figure 79 is a flow diagram showing the operation of the Within Hard Error Window and Soft Error Window Checker and the Time Stamper;
Figure 80 is a flow diagram for the operation of the "HEW to warning count";
Figure 81 is a partial block diagram of the Synchronizer showing the elements associated with the operation of the Message Generator;
Figure 82 is a flow diagram of the operation of the Message Generator and the Transmitter Interface;
Figure 83 shows the waveforms of the timing signals for generating a TIC message;
Figure 84 shows the waveforms of the timing signals for generating a sync System State message;
Figure 85 shows the format of the "cold start" pre-sync message;
Figure 86 is a flow diagram showing the operation of the Synchronizer during a "cold start";
Figures 87 and 87a are flow diagrams showing the generation of the HEW to warning signal during "cold start";
Figure 88 is a flow diagram showing the storing of data during a "cold start";
Figure 89 is a flow diagram showing the operation of the Operating Condition Detector during a "cold start";
Figure 90 is a timing diagram used in the description of the "cold start";
Figure 91 is a flow diagram of the operation of the Synchronizer during a "warm start";
Figure 92 is a timing diagram used in the description of a "warm start";
Figure 93 is a flow diagram of the operation of the
Byzantine Voter to generate Byzantine voted task completed vector and Byzantine voted branch condition bits for the Scheduler;
Figure 94 is a perspective of the Byzantine Voter's three-dimensional memory;
Figure 95 shows the two-dimensional format of ISW vectors resulting from the first Byzantine vote on the three-dimensional ISW matrices; and
Figure 96 is a functional circuit diagram of the Byzantine Voter.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
The multi-computer architecture for fault tolerance is a distributed multi-computer system based on the functional and physical partitioning of the application tasks and the overhead functions, such as fault tolerance and systems operations. As shown in Figure 1, the multi-computer architecture consists of a plurality of Nodes 10A through 10N, each having an Operations Controller 12 for performing the overhead functions and an Applications Processor 14 for executing the application tasks.
For each application, the multi-computer architecture is required to execute a predetermined set of tasks, collectively called application tasks. Each Node is allocated an active task set which is a subset of the application tasks. Each Node in coordination with all of the other Nodes is capable of selecting tasks from its active task set and executing them in a proper sequence. The active task set for each Node may be different from the active task set allocated to the other Nodes and each task in the application tasks may be included in the active task set of two or more Nodes depending upon how many Nodes are in the system and the importance of the task to the particular application. In this way, the multi-computer architecture defines a distributed
multi-computer system in which no one Node 10 is required to execute every one of the application tasks, yet the failure of one or more Nodes need not prevent the execution of any application task. As shall be more fully explained later on, the active task set in each Node is static for any given system configuration or system state and will change as the system state changes with an increase or decrease in the number of active Nodes. This change in the active task set, called "reconfiguration," takes place automatically and assures that every one of the important or critical application tasks will be included in the active task set of at least one of the remaining active Nodes in the system.
Each Node 10A through 10N is connected to every other Node in the multi-computer architecture through its Operational Controller 12 by means of a private communication link 16. For example, the Operations Controller "A" is the only Operations
Controller capable of transmitting on communication link 16a. All of the other Nodes are connected to the communication link 16a and will receive every message transmitted by the Operations Controller "A" over communication link 16a. In a like manner, the Operations Controller "B" of Node 10B is the only Operations
Controller capable of transmitting messages on communication link 16b, and Operations Controller N of the Node 10N is the only Operations Controller capable of transmitting messages on communication link 16n.
External information from sensors and manually operated devices, collectively identified as Input Devices 20, is transmitted directly to the Applications Processors 14 of each Node through an input line 18. It is not necessary that every Applications Processor receive information from every sensor and/or Input Device; however, each Applications Processor 14 will receive the information from every sensor and/or Input Device which it needs in the execution of the application tasks.
In a like manner, the Applications Processor 14 in each Node will transmit data and control signals, resulting from the execution of the application tasks, to one or more actuators and/or
display devices, collectively identified as Output Devices 22. The data and/or control signals generated by the Applications Processor 14 in the individual Nodes 10A through 10N may be combined by a Combiner/Voter Network 24 before they are transmitted to the Output Devices 22. Further, when multiple values of the same data and/or control signals are generated by two or more of the Nodes, the Combiner/Voter Network 24 may also be used to generate a single voted value which is transmitted to the Output Devices 22. The use of a Combiner/Voter Network 24 is optional. It is not necessary that every actuator or display receive the output generated by every Node in the system. The specific actuator or display only needs to be connected to the Node or Nodes whose Applications Processor 14 is capable of generating the data or command signals it requires.
The network of Operations Controllers 12 is the heart of the system and is responsible for the inter-node communications, system synchronization, data voting, error detection, error handling, task scheduling, and reconfiguration. The Applications Processors 14 are responsible for the execution of the application tasks and for communications with the Input Devices 20 and Output Devices 22. In the multi-computer architecture, the overhead functions performed by the Operations Controllers 12 are transparent to the operations of the Applications Processor 14. Therefore, the structure of the Applications Processor 14 may be based solely upon the application requirements. Because of this, dissimilar Applications Processors 14 may be used in different Nodes without destroying the symmetry of the multi-computer architecture.
The structural details of the Operations Controller 12 in each Node 10A through 10N are shown in Figure 2. Each Operations Controller 12 has a transmitter 30 for serially transmitting messages on the Node's private communication link 16. For discussion purposes, it will be assumed that the Operations Controller illustrated in Figure 2 is the Operations Controller A as shown in Figure 1. In this case, the Transmitter 30 will transmit messages on the private communication link 16a. Each Operations Controller also has a plurality of Receivers 32a
through 32n, each of which is connected to a different private communication link. In the preferred embodiment, the number of Receivers 32a through 32n is equal to the number of Nodes in the multi-computer architecture. In this way, each Operations Controller 12 will receive all of the messages transmitted by every Node in the system including its own. Each Receiver 32a through 32n will convert each message received over the private communication link to which it is connected from a serial format to a parallel format, then forward it to a Message Checker 34. Each Receiver 32a through 32n will also check the vertical parity and the longitudinal redundancy codes appended to each of the received messages and will generate an error signal identifying any errors detected.
The Message Checker 34 monitors the Receivers 32a through 32n and subjects each received message to a variety of physical and logical checks. After completion of these physical and logical checks, the messages are sent to a Fault Tolerator 36. Upon the detection of any errors in any message, the Message Checker 34 will generate an error status byte which is also transmitted to the Fault Tolerator 36.
The Fault Tolerator 36 performs five basic functions. First, the Fault Tolerator performs further logical checks on the messages received from the Message Checker 34 to detect certain other errors that were not capable of being detected by the Message Checker 34. Second, the Fault Tolerator passes error free messages to a Voter 38 which votes on the content of all messages containing the same information to generate a voted value. Third, it passes selected fields from the error free messages to other subsystems as required. Fourth, the Fault Tolerator aggregates the internal error reports from the various error detection mechanisms in the Operations Controller and generates Error messages which are transmitted to all of the other Nodes in the system by the Transmitter 30. Finally, the Fault Tolerator 36 monitors the health status of each Node in the system and will initiate a local reconfiguration when a Node is added or excluded from the current
number of operating Nodes. The Fault Tolerator 36 maintains a base penalty count table which stores the current base penalty counts accumulated for each Node in the system. Each time a Node transmits a message containing an error, every Node in the system, including the one that generated the message, should detect this error and generate an Error message identifying the Node that sent the message containing the error, the type of error detected, and a penalty count for the detected error or errors. Each Fault Tolerator 36 will receive these Error messages from every other Node and will increment the base penalty count for that Node which is currently being stored in the base penalty count table, if the detection of the error is supported by Error messages received from a majority of the Nodes. The magnitude of the penalty count increment is predetermined and is proportional to the severity of the error. If the incremented base penalty count exceeds an exclusion threshold, as shall be discussed later, the Fault Tolerator initiates a Node exclusion and a reconfiguration process in which the faulty Node is excluded from active participation in the system and the active task sets for the remaining Nodes are changed to accommodate for the reduction in the number of active Nodes.
The Fault Tolerator 36 will also periodically decrement the base penalty count for each Node in the system so that a Node which was previously excluded may be readmitted into the active system. When a previously excluded Node continues to operate in an error free manner for a sufficient period of time, its base penalty count will be decremented below a readmittance threshold which will initiate a Node readmittance and reconfiguration process in which the previously excluded Node is readmitted into the active system. When the previously excluded Node is readmitted into the system the active task set for each Node is readjusted to accommodate for the increase in the number of active Nodes in the system.
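By way of illustration, the base penalty count bookkeeping described above may be sketched as follows. The class name, thresholds, increment values, and the use of a median to combine the reported increments are all illustrative assumptions; the actual values and voting algorithm are application specific.

```python
# Illustrative sketch of the Fault Tolerator's base penalty count
# bookkeeping; all names and numeric values are example assumptions.

EXCLUSION_THRESHOLD = 100     # example exclusion threshold
READMITTANCE_THRESHOLD = 20   # example readmittance threshold
DECREMENT_PER_PERIOD = 1      # example periodic decrement

class PenaltyTable:
    def __init__(self, node_ids):
        self.base = {nid: 0 for nid in node_ids}
        self.excluded = set()

    def apply_error_reports(self, accused, increments):
        """increments: penalty increments reported by each Node for the
        accused Node (zero when a Node reported no error).  The count is
        incremented only when a majority of Nodes support the report."""
        supporters = [inc for inc in increments if inc > 0]
        if len(supporters) * 2 > len(increments):
            supporters.sort()
            # severity-weighted increment, voted here as a simple median
            self.base[accused] += supporters[len(supporters) // 2]
            if self.base[accused] > EXCLUSION_THRESHOLD:
                self.excluded.add(accused)   # triggers reconfiguration

    def periodic_decay(self):
        """Periodically decrement every Node's count so a previously
        excluded Node that behaves may eventually be readmitted."""
        for nid in self.base:
            self.base[nid] = max(0, self.base[nid] - DECREMENT_PER_PERIOD)
            if nid in self.excluded and self.base[nid] < READMITTANCE_THRESHOLD:
                self.excluded.discard(nid)   # triggers reconfiguration
```

In this sketch, a second reconfiguration follows naturally when decay carries the count back below the readmittance threshold, mirroring the reversible behavior described above.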
The Voter 38 performs an "on-the-fly" vote using all of the current copies of the data values received from the Fault
Tolerator 36. The voted data value and all copies of the received data are passed to a Task Communicator 44 which stores them in a Data Memory 42. The Voter will select a voted data value using an appropriate algorithm as shall be discussed relative to the Voter 38 itself. Each time a new copy of a data value is received, a new voted data value is generated which is written over the prior voted data value stored in the Data Memory 42. In this manner, the Data Memory 42 always stores the most current voted data value, assuring that a voted data value is always available for subsequent processing independent of one or more copies of the data value failing to be generated or "hanging," causing a late arrival. The Voter 38 will also perform a deviance check between the voted data value and each copy of the received data value, and will generate an error vector to the Fault Tolerator identifying each Node which generated a data value which differed from the voted data value by more than a predetermined amount. This arrangement will support both exact and approximate agreement between the copies of the data values. The Voter 38 supports several data types, including packed boolean values, fixed point formats, and the IEEE standard 32-bit floating point format.
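The on-the-fly vote and deviance check performed by the Voter 38 may be illustrated by the following sketch, which uses a simple mid-value select on numeric copies. The actual Voter selects its algorithm according to the data type; the function name and deviance handling here are illustrative only.

```python
# Illustrative sketch of the Voter's vote and deviance check; a
# mid-value select stands in for the data-type-specific algorithm.

def vote_and_check(copies, deviance):
    """copies: {node_id: value}.  Returns (voted_value, error_nodes),
    where error_nodes flags each Node whose copy differs from the
    voted value by more than the permitted deviance."""
    values = sorted(copies.values())
    voted = values[len(values) // 2]          # mid-value select
    error_nodes = {nid for nid, v in copies.items()
                   if abs(v - voted) > deviance}
    return voted, error_nodes
```

Because the vote is recomputed as each new copy arrives, a late or missing copy simply leaves the most recent voted value in place, consistent with the "on-the-fly" behavior described above.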
A Scheduler 40 has two modes of operation, normal and reconfiguration. In the normal mode of operation the Scheduler 40 is an event driven, priority based, globally verified scheduling system which selects from its active task set the next task to be executed by its associated Applications Processor 14. For a given system configuration (set of active Nodes) the active task set assigned to each Node is static. Each time the associated Application Processor begins a task, the Scheduler 40 selects the next task to be executed. The Applications Processor will immediately begin the execution of the selected task and the Task Communicator 44 will immediately initiate the generation of a message informing all of the other Nodes of the identity of the selected task, the identity of the preceding task finished by the Applications Processor 14, and the branch conditions of the preceding task. Conditional branching is controlled by the Applications Processor
14 and is determined by conditions in the applications environment. The precedence relationship between a task and its successor task may include conditional branches, concurrent forks, and join operations implemented at task boundaries.
Conditional branching provides an efficient means of switching operational modes and avoids the necessity of scheduling tasks not required by the current conditions. An interactive consistency voting process guarantees agreement on the branch conditions generated by the other Nodes which executed the same task.
The Scheduler 40 in each Node replicates the scheduling process for every other Node in the system and maintains a global data base on the scheduling and execution of tasks by each Node. Upon the receipt of a message from another Node identifying the task completed and the task started, the Scheduler 40 will compare the task completed with the task previously reported as started and generate a scheduling error signal if they are not the same. The Scheduler 40 will also compare the task reported as started with a task it has scheduled to be started by that Node. If they are different, the Scheduler will also generate a scheduling error signal. The Scheduler 40 will pass all scheduling error signals to the Fault Tolerator 36. All of the Scheduler's error detection mechanisms are globally verified and have been designed to ensure that failure of one or more copies of a task does not upset scheduling.
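The Scheduler's cross-check of a received Task Completed/Started report against its replicated global data base may be sketched as follows; the function and flag names are illustrative and do not appear in the embodiment.

```python
# Illustrative sketch of the Scheduler's global cross-check.  Each
# Node replicates every other Node's scheduling, so the task a Node
# reports as completed must match its last reported start, and the
# task it reports as started must match the locally predicted task.

def check_cs_message(last_started, predicted_next, completed, started):
    """Returns a list of scheduling-error flags (names illustrative)."""
    errors = []
    if completed != last_started:
        errors.append("COMPLETED_MISMATCH")
    if started != predicted_next:
        errors.append("STARTED_MISMATCH")
    return errors
```

Any flags returned would be passed to the Fault Tolerator 36 as scheduling error signals, as described above.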
In the reconfiguration mode of operation, a reversible path independent reconfiguration algorithm provides graceful degradation of the workload as faulty Nodes are excluded from the operating system. Because the algorithm is reversible it also supports graceful restoration of the workload as previously excluded Nodes are readmitted following an extended period of error free operation.
In reconfiguration, the active task set allocated to each Node is altered to compensate for the change in the number of
active Nodes. During reconfiguration after the exclusion of a faulty Node, the active task set, or at least the critical tasks of the faulty Node's active task set, may be reallocated and included in the active task sets of the other Nodes. In other instances, individual tasks may be globally disabled and replaced with simpler tasks, and some noncritical tasks may be disabled with no replacement. The reconfiguration process readjusts the active task sets for the active Nodes to accommodate the system capabilities. The algorithm supports true distributed processing, rather than just a replication of uniprocessor task loads on redundant Nodes.
A Task Communicator 44 functions as an input/output (I/O) interface between the Operations Controller 12 and the Applications Processor 14. The Applications Processor 14 signals the Task Communicator 44 when it is ready for the next task. A simple handshaking protocol is employed to synchronize communications between the Applications Processor 14 and the Task Communicator 44. Upon receipt of this signal the Task Communicator 44 reads the selected task from the Scheduler 40 and transfers it to the Applications Processor 14. Concurrently, the Task Communicator 44 will initiate the transmission of the task completed/task started message identifying the task completed by the Applications Processor 14, the task being started by the Applications Processor and the branch conditions of the completed task. The Task Communicator 44 will then fetch the data required for the execution of the started task from the Data Memory 42 and temporarily store it in a buffer in the order in which it is required for the execution of the started task. The Task Communicator will pass these data values to the Applications Processor as they are requested. Effectively, the Task
Communicator 44 looks like an input file to the Applications Processor 14.
The Task Communicator 44 also receives the data values generated by the Applications Processor 14 in the execution of the selected task and generates Data Value messages which are
broadcast by the Transmitter 30 to all of the other Nodes in the system. The Task Communicator will also append to the Data Value message a data identification (DID) code and a message type (MT) code which uniquely identifies the message as a Data Value message.
The Synchronizer 46 provides two independent functions in the operation of the multi-computer architecture. The first function pertains to the synchronization of the operation of the Nodes 10A through 10N during steady state operation; the second function pertains to the synchronization of the Nodes on start up. During steady state operation, the Synchronizer 46 effects a loose frame base synchronization of the Nodes by the exchange of messages which implicitly denote local clock times. The Synchronizer 46 in each Node counts at its own clock rate, up to a "nominal sync count," then issues a presynchronization System State message which is immediately broadcast by the Transmitter 30 to all of the other Nodes in the system. As the presynchronization System State messages from all the Nodes in the system, including its own, are received at each Node, they are time stamped in the Synchronizer as to their time of arrival from the Message Checker 34. The time stamps are voted on to determine a voted value for the arrival time of the presynchronization System State messages from all the Nodes. The difference between the voted time stamp value and the time stamp of the Node's own presynchronization System State message is an error estimate which is used to compute a corrected sync count. The error estimate includes any accumulated skew from previous synchronization rounds and the effects of clock drift. The Synchronizer 46 will then count up to the corrected sync count and issue a synchronization System State message which is immediately transmitted by the Transmitter 30 to all of the other Nodes in the system. The synchronization System State messages will also be time stamped as to their arrival in the Synchronizers in each Node in the system.
The time stamps of all presynchronization and synchronization System State messages are all compared with the voted time stamp value to determine which Nodes are in synchronization
with its own Node and which are not. When the difference in the time stamps exceeds a first magnitude, a soft error signal is generated signifying a potential synchronization error. However, if the time stamp difference exceeds a second magnitude, larger than the first magnitude, a hard error signal is generated signifying a synchronization error has definitely occurred. The soft and hard error signals are transmitted to the Fault Tolerator 36 and are handled in the same manner as any other detected error. Start up is defined as a process for creating a functional configuration of Nodes called an "operating set." If an "operating set" is in existence, and the functional configuration is changed by the admittance or readmittance of one or more Nodes, the process is called a "warm start." If no "operating set" is in existence, it is called a "cold start." In a warm start, the Synchronizer 46 will recognize the existence of an operating set and will attempt to achieve synchronization with the operating set. A cold start is initiated by a power on reset (POREST) signal generated in response to the initial application of electrical power to the system. Each Synchronizer 46 will attempt to achieve point-to-point synchronization with all the Nodes until an operating set is formed. Once an operating set is formed, those Nodes not included in the operating set will switch to the warm start process and will attempt to achieve synchronization with the operating set.
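The presynchronization correction and the soft and hard error windows described above may be illustrated by the following sketch, in which the nominal sync count and window widths are arbitrary example values and a simple median stands in for the time stamp vote.

```python
# Illustrative sketch of the Synchronizer's frame-based correction;
# all names and numeric values are example assumptions.

NOMINAL_SYNC_COUNT = 10_000
SOFT_WINDOW = 8    # ticks: beyond this, a soft (warning) error
HARD_WINDOW = 32   # ticks: beyond this, a hard synchronization error

def corrected_sync_count(stamps, own_node):
    """stamps: {node_id: arrival time stamp of that Node's presync
    System State message}.  Returns (corrected sync count, per-Node
    error flags)."""
    ordered = sorted(stamps.values())
    voted = ordered[len(ordered) // 2]          # voted arrival time
    error_estimate = stamps[own_node] - voted   # skew incl. clock drift
    flags = {}
    for nid, t in stamps.items():
        delta = abs(t - voted)
        if delta > HARD_WINDOW:
            flags[nid] = "hard"
        elif delta > SOFT_WINDOW:
            flags[nid] = "soft"
    return NOMINAL_SYNC_COUNT - error_estimate, flags
```

A Node that issued its message early relative to the voted arrival time thus counts longer before its next synchronization message, and vice versa, pulling the frames of all Nodes together.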
INTER-NODE MESSAGES
The operation of the multi-computer architecture depends upon the exchange of data and operational information through inter-node messages. These inter-node messages are data-flow instructions which indicate to each individual Operations Controller how the message is to be processed.
The various inter-node messages and their information content are listed on Table 1.
The inter-node messages all have the same basic format so as to simplify their handling in the receiving mode. The first byte of each inter-node message contains the Node identification (NID) code of the Node from which the message originated and a message type (MT) code identifying the message type. The last byte in each inter-node message is always a block check byte which is checked by the Receivers 32a through 32n to detect transmission errors.
There are four different Data Value messages which range from a one byte Data Value message to a four byte Data Value message. These Data Value messages are identified as message types MT0 through MT3. The second byte of a Data Value message is a data identification (DID) code which when combined with the message type code uniquely identifies that particular data value from other data values used in the system. The data identification (DID) code is used by the Message Checker 34 to define the types of checks that are to be performed. The MT/DID codes are used to identify which limits will be used by the Message Checker 34 and the deviance to be used by the Voter 38 to define the permissible deviance of each actual data value from the voted values and by the Task Communicator 44 to identify the data value to be supplied to the Applications Processor 14 in the execution of the current task. The bytes following the data identification byte are the data values themselves with the last byte being the block check byte as previously indicated.
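Assuming the first-byte layout described later for the Transmitter, with the message type code in the three least significant bits and the Node identification code in the three most significant bits, the common message header may be sketched as follows; the two middle bits are left unused in this illustration, and the function names are hypothetical.

```python
# Illustrative packing of the common first byte of every inter-node
# message: NID in the three most significant bits, MT in the three
# least significant bits (middle two bits unused in this sketch).

def pack_header(nid, mt):
    assert 0 <= nid < 8 and 0 <= mt < 8
    return (nid << 5) | mt

def unpack_header(byte):
    return byte >> 5, byte & 0x07        # (nid, mt)
```

Every receiving Node can thus identify the originating Node and the message type from the first byte alone, which simplifies message handling as noted above.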
A Task Interactive Consistency (TIC) message is a special case of the two byte Data Value message which is identified by the DID being set to zero (0). The Task Interactive Consistency message, message type MT1, is a rebroadcast of the task completed vector and branch condition data contained in Task Completed/Started (CS) messages received from the other Nodes and is transmitted at the end of each Subatomic period (SAP), as shall be explained in the discussion of the timing sequence. The information content of the Task Interactive Consistency messages is voted on by each Node and the voted values are used by the
Scheduler 40 in the task selection and scheduling process.
A Base Penalty Count (BPC) message, message type MT4, contains the base penalty count that the individual Node is storing for each Node in the system including itself. Each Node will use this information to generate a voted base penalty count for each Node in the system. Thereafter, each Node will store the voted base penalty count as the current base penalty count for each Node. This assures that at the beginning of each Master period each Node is storing the same base penalty counts for every other Node in the system. The Base Penalty Count message is transmitted by each Node at the beginning of each Master period timing interval.
A System State (SS) message, message type MT5, is sent at the end of each Atomic period timing interval and is used for the point-to-point synchronization of the Nodes and to globally affirm reconfiguration when a majority of the Nodes conclude that reconfiguration is required. The transmission of the System State message is timed so that the end of its transmission coincides with the end of the preceding Atomic period and the beginning of the next Atomic period. The first byte of the System State message contains the node identification (NID) code of the originating Node and the message type (MT) code. The second byte contains three function bits; the first two bits are the synchronization and presynchronization bits which are used in the synchronization process described above. The third bit identifies whether or not the Node is operating or excluded. The third and fourth bytes of the System State message are the task completed vector and the branch condition vector, respectively. Byte five contains the current system state vector and byte six contains the new system state vector. When the sending Node has concluded reconfiguration is necessary, the new system state vector will be different from the current state vector. Bytes seven and eight contain the higher and lower order bits of the Node's own period counter. Byte nine is an "in sync with" (ISW) vector which defines which Nodes that particular Node determines it is synchronized with, and byte ten is reserved for future use. Byte eleven is the conventional block check byte at the end of the message. The Synchronizer uses the time stamp of the pre-synchronization System State messages, identified by the pre-synchronization bit in the second byte being set, to generate an error estimate used to compute a correction to the time duration of the last Subatomic period. This correction synchronizes the beginning of the next Atomic period in that Node with the Atomic period being generated by the other Nodes. The period counter bytes are used to align the Master periods of all the Nodes in the system. The period counter counts the number of Atomic periods from the beginning of each Master period and is reset when it counts up to the fixed number of Atomic periods in each Master period. Byte nine is used only during an automatic cold start as shall also be explained in more detail in the discussion of the Synchronizer 46.
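The eleven-byte layout of the System State message described above may be summarized in tabular form as follows; the field names are paraphrases of the text, not identifiers used in the embodiment.

```python
# Illustrative summary of the System State (SS, MT5) message layout;
# field names paraphrase the text and are not part of the embodiment.

SS_LAYOUT = [
    ("nid_mt",            1),   # byte 1: node ID + message type
    ("function_bits",     2),   # byte 2: presync, sync, operating/excluded
    ("task_completed",    3),   # byte 3: task completed vector
    ("branch_condition",  4),   # byte 4: branch condition vector
    ("current_state",     5),   # byte 5: current system state vector
    ("new_state",         6),   # byte 6: new system state vector
    ("period_count_hi",   7),   # byte 7: period counter, high order bits
    ("period_count_lo",   8),   # byte 8: period counter, low order bits
    ("isw_vector",        9),   # byte 9: "in sync with" vector
    ("reserved",         10),   # byte 10: reserved for future use
    ("block_check",      11),   # byte 11: block check byte
]
```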
The Task Completed/Started (TC/S) message, message type MT6, is generated by the Task Communicator 44 each time the Applications Processor 14 starts a new task. The second and third bytes of the Task Completed/Started message contain the task identification (TID) codes of the task completed and new task started by the Node's Applications Processor 14. The fourth byte of this message contains the branch condition of the completed task, and an error correction code (ECC).
The last inter-node message is the Error message, message type MT7, which is sent whenever the Transmitter 30 is free during an Atomic period. Only one error message reporting the errors attributed to a particular Node can be sent in an Atomic period. The second byte of the Error message is the Node identification (NID) code of the Node accused of being faulty. The following four bytes contain error flags identifying each error detected. The seventh and eighth bytes of the error message contain the base penalty count of the identified Node and the increment penalty count which is to be added to the base
penalty count if the errors are supported by Error messages received from other Nodes. The increment penalty count is based on the number of errors detected and the severity of these errors. This information is used by the other Nodes to generate a new voted base penalty count for the Node identified in the Error message. A separate Error message is sent for each Node which generates a message having a detected error.
TIMING PERIODS
The overall control system of the multi-computer architecture contains a number of concurrently operating control loops with different time cycles. The system imposes the constraint that each cycle time be an integer power of two times a fundamental time interval called an Atomic period. This greatly simplifies the implementation of the Operations Controller 12 and facilitates the verification of correct task scheduling. The length of the Atomic period is selected within broad limits by the system designer for each particular application. The System State messages, which are used for synchronization, are sent at the end of each Atomic period.
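The constraint that every control-loop cycle time be an integer power of two times the Atomic period may be checked as in the following sketch; the function name and example values are illustrative.

```python
# Illustrative check that a cycle time is an integer power of two
# times the Atomic period (2**0, 2**1, 2**2, ... Atomic periods).

def is_valid_cycle(cycle_time, atomic_period):
    ratio = cycle_time / atomic_period
    if ratio < 1 or ratio != int(ratio):
        return False
    n = int(ratio)
    return n & (n - 1) == 0      # true only for powers of two
```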
The longest control loop employed by the system is the
Master period. Each Master period contains a fixed number of Atomic periods, as shown in Figure 3. All task scheduling parameters are reinitialized at the beginning of each Master period to prevent the propagation of any scheduling errors. The Nodes will also exchange Base Penalty Count messages immediately following the beginning of each Master period.
The shortest time period used in the system is the Subatomic (SAP) period, as shown in Figure 4, which defines the shortest execution time recognized by the Operations Controller 12 for any one task. For example, if the execution time of a task is less than a Subatomic period, the Operations Controller 12 will not forward the next scheduled task to the Applications Processor
14 until the beginning of the next Subatomic period. However, when the execution time of a task is longer than a Subatomic period, the Operations Controller 12 will forward the next scheduled task to the Applications Processor as soon as it is ready for it. There are an integer number of Subatomic periods in each Atomic period which are selectable by the systems designer to customize the multi-computer architecture to the particular application. As shown in Figure 4, each Subatomic period is delineated by a Task Interactive Consistency message as previously described.
TRANSMITTER
Figure 5 is a block diagram of the Transmitter 30 embodied in each of the Operations Controllers 12. The Transmitter 30 has three interfaces, a Synchronizer Interface 50 receiving Task Interactive Consistency messages and System State messages generated by the Synchronizer 46, a Fault Tolerator
Interface 52 receiving the Error and Base Penalty Count messages generated by the Fault Tolerator 36, and a Task Communicator Interface 54 receiving Data Value and Completed/Started messages generated by the Task Communicator 44. The three interfaces are connected to a Message Arbitrator 56 and a Longitudinal Redundancy Code Generator 58. The Message Arbitrator 56 determines the order in which the messages ready for transmission are to be sent. The Longitudinal Redundancy Code Generator 58 generates a longitudinal redundancy code byte which is appended as the last byte to each transmitted message. The message bytes are individually transferred to a Parallel-to-Serial Converter 60 where they are framed between a start bit and two stop bits, then transmitted in a serial format on communication link 16.
The Transmitter 30 also includes a Self-Test Interface 62 which upon command retrieves a predetermined self-test message from an external ROM (not shown) which is input into the Longitudinal Redundancy Code Generator 58 and transmitted to the
communication link by the Parallel-to-Serial Converter 60. The Transmitter 30 also has an Initial Parameter Load Module 64 which will load into the Transmitter various predetermined parameters, such as the length of the minimum synchronization period between messages, the length of a warning period for Interactive
Consistency and System State messages, and the starting address in the ROM where the self-test messages are stored.
As shown in Figure 6, each of the three interfaces has an eight bit input register 66 which receives the messages to be transmitted from its associated message source through a multiplexer 68. The multiplexer 68 also receives the three bit Node identification (NID) code which identifies the Node which is generating the message.
Whenever the associated message source has a message to be transmitted, it will hold the message until a buffer available signal is present signifying the input register 66 is empty. The message source will then transmit the first byte of the message to the input register 66. A bit counter 70 will count the strobe pulses clocking the message into the Input Register 66 and will, in coordination with a flip flop 72 and an AND gate 74, actuate the multiplexer 68 to clock the three bit Node identification code into the Input Register 66 as the three most significant bits of the first byte. The flip flop 72 is responsive to the signal "transmit quiet period" (TQP) generated at the end of the preceding message to generate a first byte signal at its Q output which enables AND gates 74 and 76. The AND gate 74 will transmit the three most significant bits generated by the bit counter 70 in response to the strobe signals loading the first byte into the input register 66 and will actuate the multiplexer 68 to load the three bit Node identification code into the three most significant bit places of the input register 66.
The AND gate 76 will respond to the loading of the eighth bit into input register 66 and will generate an output which will actuate the flip flop 78 to a set state. In the set state, the flip flop 78 will generate a message available signal
at its Q output and will terminate the buffer available signal at its complementary Q output. The message available (MA) signal will reset the flip flop 72, terminating the first byte signal which in turn disables the AND gates 74 and 76. The message available (MA) signal is also transmitted to the Message Arbitrator 56 signifying a message is ready for transmission.
Termination of the buffer available (BA) signal when the flip flop 78 is put in the set state inhibits the message source from transmitting the remaining bytes of the message to the Transmitter 30. The three least significant bits of the first byte, which are the message type code, are communicated directly to the Message Arbitrator 56 and are used in the arbitration process to determine which message is to be sent when more than one message is available for transmission and whether the sending of that message will interfere with the transmission of a time critical message generated by the Synchronizer 46.
The Message Arbitrator 56 will generate a transmit (Txxx) signal identifying the next message to be sent when there is more than one message ready for transmission. This message will actuate the Longitudinal Redundancy Code Generator 58 to pass the selected message to the Parallel-to-Serial Converter for transmission. The transmit signal will also reset the flip flop 78 in the appropriate interface which reasserts the buffer available (BA) signal, actuating the associated message source to transmit the remaining bytes of the message to the interface.
These are then transmitted directly to the Longitudinal Redundancy Code Generator 58 as they are received. When all of the bytes of the message are transmitted, the Message Arbitrator 56 will generate a transmit quiet period (TQP) signal which actuates the Parallel-to-Serial Converter to transmit a null (synchronization) signal for a predetermined period of time following the transmission of each message. In the preferred embodiment, the quiet period is a time required for the transmission of 24 bits or two (2) null bytes. The transmit quiet period (TQP) signal will also set the flip flop 72 indicating that the preceding message has
been sent and that the next byte received from the associated message source will be the first byte of the next message.
The details of the Message Arbitrator 56 are shown on Figure 7. Under normal operation when no time-critical messages, such as Task Interactive Consistency (TIC) and System State (SS) messages, are to be sent, a Fault Tolerator (FLT)-Task Communicator (TSC) Arbitration Logic 82 will generate, in an alternating manner, PFLT and PTSC polling signals which are received at the inputs of AND gates 84 and 86, respectively. The AND gate 84 will also receive the Fault Tolerator message available (FLTMA) signal generated by the Fault Tolerator Interface 52 while AND gate 86 will receive a Task Communicator message available (TSCMA) signal generated by the Task Communicator Interface 54 after the Task Communicator 44 has completed the loading of the first byte of the message ready for transmission. The outputs of the AND gates 84 and 86 are the transmit Fault Tolerator (TFLT) and transmit Task Communicator (TTSC) signals which are applied to AND gates 88 and 90, respectively. The alternate inputs to AND gates 88 and 90 are received from a Time Remaining-Message Length Comparator 92 which produces an enabling signal whenever the transmission of the selected message will not interfere with the transmission of a time dependent message, as shall be explained hereinafter. If the AND gate 88 is enabled it will pass the transmit Fault Tolerator (TFLT) signal to the Fault Tolerator Interface 52 to reassert the buffer available signal, enabling it to receive the remaining bytes of the message from the Fault Tolerator 36, and to the Longitudinal Redundancy Code Generator 58, enabling it to pass the message, byte-by-byte, from the Fault Tolerator Interface 52 to the Parallel-to-Serial Converter 60 for transmission on the communication link 16.
In a like manner, when the AND gate 90 is enabled, and the polling of the Task Communicator Interface 54 indicates that the Task Communicator 44 has a message ready for transmission, the AND gate 86 will generate a transmit Task Communicator (TTSC) signal which, if passed by the AND gate 90, will result in the transmission of the Task Communicator's
message. The TFLT and the TTSC signals, when generated, are fed back to lock the FLT-TSC Arbitration Logic 82 in its current state until after the message is sent.
The message arbitration between the Fault Tolerator's and Task Communicator's messages is primarily dependent upon the type of the message currently being transmitted. The logic performed by the FLT-TSC Arbitration Logic 82 is summarized on Table II.
Normally the FLT-TSC Arbitration Logic 82 will poll the Fault Tolerator Interface 52 and the Task Communicator Interface 54 in an alternating sequence. However, at the beginning of each Atomic period, the FLT-TSC Arbitration Logic 82 will first poll the Task Communicator Interface 54 for a Task Completed/Started message which will identify the task being started by that Node.
If the Task Completed/Started message is not available it will then poll the Fault Tolerator Interface 52.
At the beginning of each Master period, all of the Nodes should transmit a Base Penalty Count message which is used for global verification of the health of each Node in the system. Therefore, after each System State message which is coincident with the beginning of a Master period, the FLT-TSC Arbitration Logic will first poll the Fault Tolerator Interface 52 and wait until it receives the Base Penalty Count message from the Fault Tolerator 36. After the transmission of the Base Penalty Count message, it will then poll the Task Communicator Interface 54 and transmit a Task Completed/Started message identifying the task scheduled to be started by the Applications Processor. If the Fault Tolerator 36 does not generate a Base Penalty Count message within a predetermined period of time, the FLT-TSC Arbitration Logic 82 will resume polling of the Fault Tolerator Interface 52 and the Task Communicator Interface 54 in an alternating sequence. In a like manner, after a self-test message, the FLT-TSC Arbitration Logic 82 will poll the Task Communicator Interface 54 and wait for a Task Completed/Started message.
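The polling rules above can be summarized in a small decision sketch. The two special cases (the Task Completed/Started message polled first at the start of an Atomic period, the Base Penalty Count message first at the start of a Master period) are simplified to a single decision per poll, and all names are ours, not the specification's.

```python
# Toy model of the alternating FLT/TSC polling performed by the
# FLT-TSC Arbitration Logic 82, with the two period-boundary special cases.

def next_poll(last_polled, atomic_start=False, master_start=False):
    """Return which interface to poll next: 'FLT' or 'TSC'."""
    if master_start:
        return "FLT"   # wait for the Base Penalty Count message first
    if atomic_start:
        return "TSC"   # the Task Completed/Started message goes out first
    # normal operation: strict alternation between the two interfaces
    return "TSC" if last_polled == "FLT" else "FLT"

print(next_poll("FLT"))                     # alternation: TSC follows FLT
print(next_poll("TSC", master_start=True))  # Master period: FLT polled first
```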
The Synchronizer 46 will load the first byte of either a Task Interactive Consistency or System State message in the Synchronizer Interface 50 a predetermined period of time before the beginning of the next Subatomic or Atomic period. A Warning Period Generator 94 will load a warning period counter with a number corresponding to the number of bits that are capable of being transmitted before the Task Interactive Consistency or System State message is to be transmitted. As described previously, the transmission of the final bit of either of these messages marks the end of the previous Subatomic or Atomic period, respectively; therefore, their transmission will begin a predetermined time (bit count) before the end of the period. Since the Task Interactive Consistency and System State messages are of different bit lengths, the number loaded into the warning period counter will be different. The Warning Period Generator 94 will
decode the message type code contained in the first byte of the message stored in the Synchronizer Interface 50 and will load the warning period counter with a number indicative of the length of the warning period for that particular type of time critical message. The warning period counter will be counted down at the bit transmission rate of the Parallel-to-Serial Converter 60 to generate a number indicative of the time remaining for the transmission of a time critical message. The number of counts remaining in the warning period counter is communicated to a Synchronizer Transmission Control 96 and the Time Remaining-
Message Length Comparator 92. When the warning period counter is counted down to zero, the Synchronizer Transmission Control 96 will generate a transmit synchronizer (TSYN) signal which will actuate the Synchronizer Interface 50 to reassert the buffer available signal and will actuate the Longitudinal Redundancy Code Generator 58 to pass the message from the Synchronizer Interface 50 to the Parallel-to-Serial Converter 60 for transmission on the Node's own communication link 16.
The Time Remaining-Message Length Comparator 92 will decode the message type of a message selected for transmission by the FLT-TSC Arbitration Logic and determine the number of bits that have to be transmitted for that message. To this number the Time Remaining-Message Length Comparator 92 will add a number equal to the number of bits corresponding to the quiet period between the messages, and compare the sum of the message and the quiet period with the count remaining in the warning period counter to determine if the transmission of the selected message will or will not interfere with the transmission of the time critical message from the Synchronizer Interface 50. If the transmission of the selected message will not interfere with the sending of the time critical message from the Synchronizer 46, the Time Remaining-Message Length Comparator 92 will generate a signal enabling AND gates 88 and 90 to pass the TFLT or TTSC signals; otherwise the Time Remaining-Message Length Comparator 92 will generate a signal disabling AND gates 88 and 90, inhibiting the transmission of the selected message from either the Fault Tolerator Interface 52 or
the Task Communicator Interface 54. This signal will also toggle the FLT-TSC Arbitration Logic 82 to poll the nonselected interface to determine if it has a message to transmit. If the nonselected interface has a message ready for transmission, the Time Remaining-Message Length Comparator 92 will determine if there is sufficient time to transmit the message from the nonselected interface before the transmission of the time critical message from the Synchronizer Interface 50. If there is sufficient time, the message from the nonselected interface will be transmitted, otherwise the AND gates 88 and 90 will remain disabled.
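The comparison performed by the Time Remaining-Message Length Comparator 92 reduces to a single inequality. A sketch, assuming the 24-bit quiet period of the preferred embodiment and using illustrative message lengths of our own choosing:

```python
# Sketch of the Time Remaining-Message Length Comparator check: a pending
# message may be sent only if its length in bits, plus the inter-message
# quiet period, fits within the count left in the warning period counter.

QUIET_PERIOD_BITS = 24  # two 12-bit null bytes, per the preferred embodiment

def may_transmit(message_bits, warning_counter_bits):
    """True if sending the message cannot delay the time-critical message."""
    return message_bits + QUIET_PERIOD_BITS <= warning_counter_bits

print(may_transmit(48, 100))  # fits: 48 + 24 <= 100
print(may_transmit(96, 100))  # too long: 96 + 24 > 100
```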
The Message Arbitrator 56 also has a Byte Counter 100 which counts the number of bytes transmitted by the Parallel-to-Serial Converter 60. The output of the Byte Counter 100 is received by a Message Byte Logic 102. The Message Byte Logic 102 decodes the message type code of the message being transmitted and determines the number of bytes in that message. After the last byte of the message is transmitted, the Message Byte Logic 102 will first generate a transmit longitudinal redundancy code (TLRC) signal which enables the Longitudinal Redundancy Code Generator 58 to transmit the generated longitudinal redundancy code as the final byte of the message. The Message Byte Logic 102 will then generate a transmit quiet period (TQP) signal enabling the Parallel-to-Serial Converter 60 to transmit the null signal for a predetermined number of bytes which is used for message synchronization. The transmit quiet period (TQP) signal is also transmitted to the Synchronizer Transmission Control 96 where it is used to terminate the transmit synchronizer (TSYN) signal. At the end of the quiet period, the Message Byte Logic 102 will generate an end of quiet period (EQP) signal which will reset the Byte Counter 100 and unlatch the FLT-TSC Arbitration Logic 82 for selection of the next message for transmission.
A Self-Test Arbitration Logic 104 recognizes a request for a self-test in response to a transmitted Task Completed/Started message in which the task identification (TID) code is the same as the Node identification (NID) code. After the
transmission of a self-test request message, the Self-Test Arbitration Logic 104 will inhibit a Task Communicator Enable (TSCE) signal and a Fault Tolerator Enable (FLTE) signal, as shown in Figure 8, which, when applied to AND gates 86 and 84, respectively, inhibits all transmissions from the Fault Tolerator
Interface 52 or the Task Communicator Interface 54. Immediately following the next Task Interactive Consistency or System State message, the Self-Test Arbitration Logic 104 will generate a transmit self-test (TSLT) signal which will actuate the Self-Test Interface 62 to read the self-test message from an associated off board ROM (read-only memory). The (TSLT) signal will also enable the Longitudinal Redundancy Code Generator 58 to pass the self-test message from the Self-Test Interface 62 to the Parallel-to-Serial Converter 60 for transmission. After transmission of the self-test message, the Self-Test Arbitration Logic 104 will restore the Task Communicator Enable (TSCE) signal to permit the transmission of a Task Completed/Started message signifying the completion of the self-test. As indicated in Table II, the FLT-TSC Arbitration Logic 82 will automatically select the message from the Task Communicator Interface 54 as the next message to be transmitted following the transmission of the self-test message. After the transmission of the Task Completed/Started message the Self-Test Arbitration Logic 104 will terminate the Task Communicator Enable (TSCE) signal until after the next Task Interactive Consistency or System State message is transmitted, as indicated in Figure 8.
The Self-Test Interface 62 serves to transfer the self-test message from the off board ROM (not shown) to the Longitudinal Redundancy Code Generator 58. The off board ROM will store a plurality of self-test messages which are transmitted one at a time, each time a self-test is requested. The first byte of each self-test message is a number indicative of the number of bytes in the self-test message, which is passed back to the Message Byte Logic 102 to identify the completion of the self-test. The last byte in each self-test message stored in the off
board ROM is the starting address for the next self-test message. The starting address is not transmitted, but rather is stored in the Self-Test Interface 62 to locate the next self-test message in the off board ROM to be transmitted. The last byte of the last self-test message stored in the off board ROM contains the starting address of the first self-test message, so that the self-test message sequence is repeated. The starting address for the first self-test message is loaded into the Self-Test Interface 62 by the Initial Parameter Load Module 64 in response to an initial load command generated by the Synchronizer 46 when the electrical power is turned on.
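The ROM layout just described forms a circular linked list of self-test records: each record's first byte is its length and its last byte is the start address of the next record. The toy ROM contents and traversal below are assumptions used only to illustrate the chaining, not data from the specification.

```python
# Toy model of the self-test ROM: a circular chain of records, where the
# link byte (the last byte of each record) is stored but never transmitted.

rom = {
    0x00: [3, 0xAA, 0x10],   # length 3, one payload byte, next record at 0x10
    0x10: [3, 0xBB, 0x00],   # last record links back to the first
}

def run_self_tests(rom, start, count):
    """Return the payloads of `count` successive self-test messages."""
    addr = start
    out = []
    for _ in range(count):
        record = rom[addr]
        out.append(record[1:-1])  # transmit everything but the link byte
        addr = record[-1]         # follow the link to the next message
    return out

print(run_self_tests(rom, 0x00, 3))  # the sequence wraps around the chain
```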
As illustrated in Figure 9, the Longitudinal Redundancy Code Generator 58 has a 4:1 Input Multiplexer 110 which receives the message bytes from the Synchronizer Interface 50, Fault Tolerator Interface 52, Task Communicator Interface 54, and
Self-Test Interface 62. The Input Multiplexer 110 controls which message will be transmitted to the Parallel-to-Serial Converter 60 in response to the transmit (TFLT, TTSC, TSYN, and TSLT) signals generated by the Message Arbitrator 56, as previously described. Each byte of a message selected for transmission by the Message Arbitrator 56 is transmitted to an Output Multiplexer 112 by means of nine parallel lines, one for each bit in the received byte plus the parity bit generated by the associated interface. Longitudinal Redundancy (LR) Bit Generators 114 are connected, one to each of the nine parallel bit lines, and collectively generate a nine bit longitudinal redundancy code. Each bit in the longitudinal redundancy code is a function of the bit values in the same bit locations in the preceding bytes. The outputs of all the LR Bit Generators 114 are also received by the Output Multiplexer 112. The Output Multiplexer 112 is responsive to the transmit longitudinal redundancy code (TLRC) signal generated by the Message Arbitrator 56 to output the last bit generated by each of the LR Bit Generators 114 as the last byte of the message being transmitted. The output of the Output Multiplexer 112 is connected directly to the Parallel-to-Serial Converter 60 which frames each received byte between predetermined start and stop bits before it is transmitted on the Node's communication link.
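The column-wise code generated by the LR Bit Generators can be sketched as a running exclusive-OR over the message bytes. The XOR convention is our assumption; the specification states only that each code bit is a function of the bit values in the same bit position of the preceding bytes.

```python
# Sketch of the longitudinal redundancy code: each of the nine code bits is
# the column parity of that bit position across all bytes of the message,
# which is equivalent to folding the bytes together with XOR.

def lrc(bytes_9bit):
    """Fold the 9-bit message bytes together with XOR to get the LRC byte."""
    code = 0
    for b in bytes_9bit:
        code ^= b
    return code

message = [0b101010101, 0b110011001, 0b000000111]
code = lrc(message)
# Appending the code byte makes every column's parity even, so re-running
# the check over the whole message including the code yields zero:
print(lrc(message + [code]))
```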
RECEIVERS
The structures of the Receivers 32a through 32n are identical, therefore, only the structure of the Receiver 32a will be discussed in detail. Referring to Figure 10, the messages from Node A transmitted on communication link 16a are received by a Noise Filter and Sync Detector 116. The synchronization portion of the Noise Filter and Sync Detector 116 requires that a proper synchronization interval exists prior to the reception of a message. As described relative to the Transmitter 30, the synchronization interval preferably is the time required for the Transmitter 30 to transmit two complete null bytes after each transmitted message.
The low pass portion of the Noise Filter and Sync Detector 116 prevents false sensing of the "start" and "stop" bits by the Receiver 32a due to noise which may be present on the communication link 16a. The low pass filter portion requires that the signal on the communication link 16a be present for four (4) consecutive system clock cycles before it is interpreted as a start or a stop bit. The Noise Filter and Sync Detector 116 will generate a new message signal in response to receiving a start bit after a proper synchronization interval.
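The four-cycle acceptance rule above can be modeled as a simple digital debounce. The sampling model and names below are our assumptions, used only to illustrate why short glitches never reach the start/stop bit detection.

```python
# Toy model of the low-pass filter: a new line level is accepted only after
# it has been seen for four consecutive system clock samples.

def debounce(samples, threshold=4):
    """Return the filtered line level seen by the detector at each sample."""
    accepted = samples[0]          # level currently presented to the detector
    candidate, run = samples[0], 0
    out = []
    for s in samples:
        if s == candidate:
            run += 1
        else:
            candidate, run = s, 1  # a new level starts a fresh run
        if run >= threshold:
            accepted = candidate   # held long enough: accept the new level
        out.append(accepted)
    return out

# A 2-sample glitch is rejected; the final 4-sample low level is accepted:
print(debounce([1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]))
```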
After passing through the Noise Filter and Sync Detector 116 the message, byte-by-byte, is converted from a serial to a parallel format in a Serial-to-Parallel Converter 118. The Serial-to-Parallel Converter 118 also determines when a complete 12-bit byte has been received. If the 12-bit byte is not properly framed by a "start" and two "stop" bits, a new bit is added, the bit first received is discarded and the framing is rechecked. Framing errors are not flagged by the Receiver 32a since this fault will manifest itself during a vertical parity check. After conversion to a parallel format, the start and stop bits are stripped from each byte and the remaining 9-bit byte is transferred to a Longitudinal Redundancy Code and Vertical Parity Code (LRC and VPC) Checker 122 to check for parity errors. The error checking logic outputs the current combinational value of the vertical parity and the longitudinal redundancy codes. The vertical parity check portion checks the parity vertically across the received message while the longitudinal redundancy code checker portion performs a longitudinal redundancy code check on each byte received from the Serial-to-Parallel Converter 118. The Message Checker 34 decodes the message type information contained in the first byte of the message and determines which byte is the last byte in the message and, therefore, for which byte the longitudinal redundancy code check is valid. The Message Checker 34 will ignore all other LRC error signals generated by the LRC and VPC Code Checker 122.
In parallel with the vertical parity and longitudinal redundancy checks, the 8-bit message byte is transferred to a Buffer 120 which interfaces with the Message Checker 34. The Buffer 120 temporarily stores each 8-bit message byte until the Message Checker 34 is ready to check it. Upon receipt of a message byte, the Buffer will set a byte ready flag signifying to the Message Checker 34 that it has a message byte ready for transfer. The Message Checker 34 will unload the message bytes from the Buffer 120 independent of the loading of new message bytes by the Serial-to-Parallel Converter 118. The 8-bit message bytes are transferred to the Message Checker 34 via a common bus 124 which is shared with all of the Receivers 32a through 32n in the Operations Controller 12. The transfer of the message between the Receivers 32 and the Message Checker 34 is on a byte-by-byte basis in response to a polling signal generated by the Message Checker. The Message Checker 34 will systematically poll each Receiver one at a time in a repetitious sequence.
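The framing recheck performed by the Serial-to-Parallel Converter 118 amounts to sliding a 12-bit window along the bit stream one bit at a time until a properly framed byte is found. A sketch, with bit polarities (start = 0, stop = 1) assumed:

```python
# Sketch of frame recovery: drop the earliest bit, append the next incoming
# bit, and retest the framing until the 12-bit window is bounded by a start
# bit and two stop bits.

def framed_ok(frame):
    """A 12-bit frame must begin with a start bit and end with two stop bits."""
    return frame[0] == 0 and frame[-2:] == [1, 1]

def find_frame(stream):
    """Slide a 12-bit window along the bit stream until it frames cleanly."""
    window = list(stream[:12])
    pos = 12
    while not framed_ok(window) and pos < len(stream):
        window = window[1:] + [stream[pos]]  # discard first bit, add a new one
        pos += 1
    return window if framed_ok(window) else None

# Two stray bits precede a valid frame; the window realigns after two shifts:
stream = [1, 1] + [0] + [1, 0, 1, 1, 0, 0, 1, 0, 1] + [1, 1]
print(find_frame(stream))
```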
MESSAGE CHECKER
The details of the Message Checker 34 are shown in Figure 11. The Message Checker 34 processes the messages
received by the Receivers 32a through 32n and verifies their logical content, records any errors detected, and forwards the messages to the Fault Tolerator 36. The operation of the Message Checker 34 is controlled by a Sequencer 126 which context switches among the multiple Receivers 32a through 32n in order to prevent overrun of the Buffers 120 in each Receiver. Each Receiver 32a through 32n is polled in a token fashion to determine if it has a message byte ready for processing. If the message byte is ready for processing when it is polled by the Sequencer 126 the byte will be processed immediately by the Message Checker 34. Otherwise the Sequencer 126 will advance and poll the next Receiver in the polling sequence. The Sequencer 126 stores the Node identification (NID) code of the Node 10 associated with each Receiver. The Sequencer 126 also has a Byte Counter associated with each Receiver 32a through 32n which is indexed each time the Sequencer 126 unloads a byte from that particular Receiver. The byte count uniquely identifies the particular byte being processed by the Message Checker 34.
The Sequencer 126 will transfer the Node identification code and the byte count to a Data Multiplexer 128 to tag the message byte as it is transferred to the Fault Tolerator 36. The Node identification code and the byte count are also transmitted to an Error Check Logic 130 and a Context Storage 132. The Error Check Logic 130 will check the Node identification code expected by the Sequencer 126 against the Node identification code contained in the first byte of the message being checked to determine if they are the same. When they are different, the Error Check Logic 130 will generate an error signal which is recorded in an error status byte being generated in the Context Storage 132. The Node identification code is also used as an address into the
Context Storage 132 where the relevant information pertaining to the message being processed is stored. The Context Storage 132 has a separate storage location for each Node 10 in the system which is addressed by the Node identification code contained in the message.
The Context Storage 132 stores the message type (MT) code, the data identification (DID) code, the byte count, an error status byte, a data value mask, and an intermediate error signal for each message as it is being processed. As each byte is unloaded from the Receivers, the information in the Context Storage 132 is used by an Address Generator 134, together with the message type (MT) code, the data identification (DID) code, and the byte count which identifies the specific byte to be processed. In response to this information, the Address Generator 134 will output an address where the required processing information is stored in a Message Checker ROM 136. The Message Checker ROM 136 stores the maximum and minimum values for the data contained in the message, the valid data identification numbers for each message type, and a data mask which identifies how many data values are contained in the message being processed and the number of bytes in each data value.
The maximum and minimum data values are transmitted to a Between Limits Checker 138 which will check the data contained in each data byte against these maximum and minimum values. The Between Limits Checker 138 will generate four different error signals as a result of the between limits checks. The first two are the maximum value (MXER) and minimum value (MNER) error signals, signifying the data value exceeded the maximum value or was less than the minimum value. The other two error signals are the equal to maximum value (MXEQ) and equal to minimum value
(MNEQ) signals. These latter error signals are transmitted to the Error Check logic 130 which will store them in the Context Storage 132 as intermediate error signals.
The Error Check Logic 130 will OR the vertical parity code and the longitudinal redundancy code error signals generated by the Receiver and generate a parity error signal which is recorded in the error status byte being generated in the Context Storage 132. As previously described, the Error Check Logic 130 will check the expected Node identification (NID) code against the Node identification code contained in the first byte of the message and will check the message type (MT) code by checking to
see if the bits in bit positions 1, 3, and 4 of the first byte are identical. As previously described in the detailed description of the Transmitter 30, the middle bit of the 3-bit message type code is repeated in bit positions 3 and 4 for message type error detection. The Error Check Logic 130 will also check the validity of the data identification (DID) code contained in the second byte of the message against the maximum value for a (DID) code received from the Message Checker ROM 136 and will generate an error signal if the data identification code has a value greater than the maximum value. The Error Check Logic 130 will further check the two's complement range of the appropriate data byte and generate a range error (RNGER) signal when a two's complement range error is detected. It will also record in the Context Storage 132 the maximum (MXER) and the minimum (MNER) error signals generated by the Between Limits Checker 138.
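The message type redundancy check above reduces to comparing three bit positions of the first byte. A sketch, where counting bit positions from the least significant bit is our reading of the description:

```python
# Sketch of the message-type redundancy check: the middle bit of the 3-bit
# message type code (bit 1 of the first byte) is repeated in bit positions
# 3 and 4, so all three bits must agree in a valid first byte.

def mt_code_ok(first_byte):
    """True if bits 1, 3, and 4 of the first byte are identical."""
    b1 = (first_byte >> 1) & 1
    b3 = (first_byte >> 3) & 1
    b4 = (first_byte >> 4) & 1
    return b1 == b3 == b4

print(mt_code_ok(0b00011010))  # bits 1, 3, 4 all set: consistent
print(mt_code_ok(0b00001010))  # bit 4 disagrees: message type error
```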
With regard to the Between Limits Checker 138, often it can be determined from the first byte of a multi-byte data value if the data value is within or outside the maximum or minimum values received from the Message Checker ROM 136 and checking of the remaining bytes is no longer necessary. However, when the Between Limits Checker 138 generates a MXEQ or MNEQ signal signifying that the data value of the byte being checked is equal to either the maximum or minimum limit value, it will be necessary to check the next byte against a maximum or a minimum value to make a factual determination of whether or not the received data value is within or outside the predetermined limits. The Error Check Logic 130, in response to an MXEQ or an MNEQ signal from the Between Limits Checker 138, will store in the Context Storage an intermediate value signal which signifies to the Context Storage 132 that the between limits check is to be continued on the next byte containing that data value. This process will be repeated with the next subsequent byte if necessary to make a final determination. During the checking of the next byte of the particular data value, the Context Storage 132 will supply to the Error Check Logic 130 the stored intermediate value which identifies to which limit, maximum or minimum, the data value of the preceding data byte was equal. From this information, the existence or non-existence of a between limits error can readily be determined by relatively simple logic, as shown on Figure 12. A Decoder 140 responsive to the intermediate value stored in the Context Storage 132 will enable AND gates 142 and 144 if the preceding between limits check generated a signal signifying the data value contained in the preceding byte was equal to the maximum value. Alternatively, the intermediate value will enable AND gates 146 and 148, signifying that the data value contained in the preceding byte was equal to the minimum value. If on the second byte the Between Limits Checker 138 detects a maximum limit error (MXER) and AND gate 142 is enabled, the maximum limit error (MXER) will be recorded in the error status byte being generated in the Context Storage 132. In a like manner, if a minimum limit error (MNER) is detected on the second byte and the AND gate 146 is enabled, the minimum limit error (MNER) will be stored in the error status byte. If the second byte applies an equal to maximum (MXEQ) or equal to minimum (MNEQ) signal to the inputs of the AND gates 144 and 148, respectively, an intermediate value will again be stored in the Context Storage 132 and the final decision delayed to the next byte. The data value mask received by the Context Storage 132 from the
Message Checker ROM 136 identifies the number of individual data values that are in the Data Value message being processed and which data bytes belong to each data value. This mask is used by the Error Check Logic 130 to identify the last byte in each data value. On the last byte of any data value, only maximum or minimum limit errors will be recorded in the Context Storage error status byte. The MXEQ and MNEQ signals will be ignored.
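The multi-byte between-limits check above can be sketched as a byte-wise comparison in which an equal-to-limit result defers the decision to the next byte, mirroring the MXEQ and MNEQ intermediate signals. Comparing from the most significant byte first, and all names, are our assumptions.

```python
# Sketch of the multi-byte between-limits check. An "equal" comparison on a
# byte keeps that limit's check open for the next byte; a strict violation
# raises MXER or MNER immediately; equality through the last byte is legal.

def limit_check(value_bytes, max_bytes, min_bytes):
    """Return 'MXER', 'MNER', or 'OK' for a multi-byte data value."""
    above_open = below_open = True      # which limits are still undecided
    for v, hi, lo in zip(value_bytes, max_bytes, min_bytes):
        if above_open:
            if v > hi:
                return "MXER"
            above_open = (v == hi)      # MXEQ: carry the check to next byte
        if below_open:
            if v < lo:
                return "MNER"
            below_open = (v == lo)      # MNEQ: carry the check to next byte
        if not above_open and not below_open:
            break                       # strictly inside both limits already
    return "OK"

print(limit_check([0x12, 0xFF], [0x12, 0x34], [0x00, 0x00]))  # over maximum
print(limit_check([0x12, 0x34], [0x12, 0x34], [0x00, 0x00]))  # equal is legal
```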
The Error Check Logic 130 will also detect if the message contained the correct number of bytes. The Context Storage 132 stores the message type (MT) code for each message being processed. In response to a new message signal received with a message byte from a particular Receiver 32, the Error Check Logic 130 will decode the message type code stored in the Context Storage 132 and generate a number corresponding to the number of bytes that type of message should have. It will then compare this number with the byte count generated by the Sequencer 126 prior to
receiving a new message signal from the Receiver 32 and will generate a message length error (LENER) signal when they are not the same. Because the length error (LENER) signal may not be generated until after the error status byte has been sent to the Fault Tolerator 36, the message length error signal will be passed to the Fault Tolerator 36 in the error status byte for the next message received from that Node.
The format of the error status byte formed in the Context Storage 132 is shown in Figure 13. In an ascending order of bit positions, starting with the least significant or zero bit position, the error status byte contains a flag for the parity error (PARER), a flag for the length error (LENER) for the preceding message, a flag bit for the Node identification (NID) error, a flag bit for the data identification (DID) error, a flag bit for the message type (MT) error, a flag bit for the two's complement range error (RNGER), and flag bits for the maximum and minimum limit (MXER and MNER) errors.
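The bit layout of Figure 13 can be captured in a small packing sketch. The flag order follows the text; the three middle names (NIDER, DIDER, MTER) are abbreviations of ours, since the text names those errors without giving mnemonics.

```python
# Sketch of the error status byte: one flag per bit, packed from the least
# significant bit upward in the order given in Figure 13.

FLAGS = ["PARER", "LENER", "NIDER", "DIDER", "MTER", "RNGER", "MXER", "MNER"]

def pack_status(**errors):
    """Build the error status byte from named error flags."""
    byte = 0
    for bit, name in enumerate(FLAGS):
        if errors.get(name, False):
            byte |= 1 << bit
    return byte

# A parity error (bit 0) plus a maximum limit error (bit 6):
print(bin(pack_status(PARER=True, MXER=True)))
```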
Returning to Figure 11, the Data Multiplexer 128 transmits each message byte directly to the Fault Tolerator 36 as it is processed by the Message Checker 34. The Data Multiplexer will append to each message byte a descriptor byte which contains the Node identification code (NID) and the byte count (BYTC) received from the Sequencer 126 for that particular byte of the message. At the end of the message, independent of its length, the Data Multiplexer 128 will transmit the error status byte stored in the Context Storage 132 as the last byte. The last byte is identified by a byte count of "15" so that it can readily be identified by the Fault Tolerator 36 for fault analysis.
FAULT TOLERATOR
The details of the Fault Tolerator 36 are shown on
Figure 14. The Fault Tolerator 36 has a Message Checker Interface
150 which receives the messages byte-by-byte after being checked by the Message Checker 34. Upon receipt of an error free Task Completed/Started message, the Message Checker Interface 150 will forward the identity (NID) of the Node which sent the message to a Synchronizer Interface 152, and will forward the identity (TID) of the new task started and the branch condition contained in the message to a Scheduler Interface 154. The Message Checker Interface 150 will also send the Node identification (NID) code and the message type (MT) code to a Voter Interface 158 and the data along with a partition bit to a Fault Tolerator RAM Interface 160. The Message Checker Interface 150 will also forward the error status byte (byte 15) generated by the Message Checker 34 to an Error Handler 164 for processing.
The Synchronizer 46 will report to the Error Handler 164 through the Synchronizer Interface 152 any errors it has detected in the Task Interactive Consistency (TIC) and System State (SS) messages. The Scheduler Interface 154 will forward to the Scheduler 40 the task identification (TID) code of the task started and the Node identity (NID) of each received Task Completed/Started message. In return, the Scheduler 40 will transmit to the Error Handler 164 through the Scheduler Interface 154 any errors it has detected.
The Transmitter Interface 156 will forward to the Transmitter 30 the Base Penalty Count and Error messages generated by the Error Handler 164. As previously described, the
Transmitter Interface 156 will load the first byte of the message to be transferred into the Transmitter's Input Register to signify it has a message ready for transmission. It will then await the reassertion of the buffer available (BAB) signal by the Transmitter 30 before forwarding the remainder of the message to the Transmitter 30 for transmission.
A Reset Generator 157 is responsive to a reset signal generated by the Error Handler 164 when it determines its own Node is faulty and to a power on reset (POR) signal generated when electrical power is first applied to the Node to generate an
Operations Controller reset (OCRES) signal and an initial parameter load (IPL) signal which are transmitted to the other subsystems, effecting a reset of the Operations Controller 12.
The Fault Tolerator RAM Interface 160 will store in a Fault Tolerator RAM 162 the data contained in the message bytes as they are received from the Message Checker Interface 150. The Fault Tolerator RAM 162 is a random access memory partitioned as shown in Figure 15. A message partition section 166, as shown on Figure 15, stores in predetermined locations the messages received from each Node. In the message partition section 166 the messages are reassembled to their original format using the identifier byte appended to the message bytes by the Message Checker 34. A double buffering or double partitioning scheme is used to prevent overwriting of the data that is still being used by the Voter 38. A context bit generated by the Message Checker Interface 150 determines into which of the two partitions the new data is to be written. Separate context bits are kept for each Node and are toggled only when the error status byte indicates the current message is error free. As previously discussed relative to the Message Checker 34, the message length (LENER) bit of the error status byte signifies that the preceding message had a message length error and, therefore, is ignored in the determination of an error free condition for the current message.
The format for a single message in the message partition section 166 is illustrated in Figure 16. As shown, the message is reconstructed in its original format in the Fault Tolerator RAM 162 using the Node identification (NID) code and the byte count appended to each message byte in the Message Checker as a portion of the address. The context bit generated by the Message Checker Interface 150, along with the message partition code (bits 8 through 11) generated by the Fault Tolerator RAM Interface 160, completes the address and identifies in which of the two locations in the message partition 166 the message from each Node is to be stored.
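One plausible reading of this addressing scheme places the byte count in the low four bits, the NID above it, the partition code in bits 8 through 11, and the context bit above that; the exact field widths are assumptions, since the text does not spell them out:

```python
def message_ram_address(context_bit, partition_code, nid, byte_count):
    """Compose a Fault Tolerator RAM address for one message byte.

    Field layout (assumed): bit 12 = context bit, bits 8-11 = message
    partition code, bits 4-6 = NID, bits 0-3 = byte count.
    """
    assert 0 <= byte_count <= 15 and 0 <= partition_code <= 15
    return (context_bit << 12) | (partition_code << 8) | (nid << 4) | byte_count
```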
The Fault Tolerator RAM 162 has three sections used by the Error Handler 164 for generating the Base Penalty Count and Error messages.
An error code file section 170 stores the error codes used to generate the Error messages transmitted immediately after the beginning of each Atomic period and to generate the increment penalty count which is included in the Error message.
Since there are thirty-five different error detection mechanisms in each Operations Controller 12, there is a possibility of two to the thirty-fifth power of error combinations that may result from each message transmitted in the system. In order to reduce the number of combinations of errors to a reasonable number, compatible with the state of the art storage capabilities of the Fault Tolerator RAM 162, the error reports from the various subsystems are formatted into special error codes as they are received. The formatted error codes, as shown on Figure 17, include an identification of the subsystem which reported the error plus a flag indication of the errors detected. For example, the error status byte received from the Message Checker 34 is formatted into two separate error codes. The first error code contains the subsystem code 0000 which reported the errors and the error flags from the four least significant bits of the error status byte. The second error code contains the subsystem code 0001 and the error flags from the four most significant bits of the error status byte. These error codes are stored in the error code file section 170 at an address defined by the faulty Node's identification (NID) code and report number as shown in Figure 19. The error code file section 170 is double partitioned the same as the message partition section 166 so that two error files are stored for each Node. The context bit generated by the Message Checker Interface 150 identifies in which of the two error files for that Node the error code will be reported.
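The splitting of the Message Checker's status byte into the two formatted error codes can be sketched as follows, with the 4-bit subsystem code in the high nibble and the four error flags in the low nibble:

```python
def format_error_codes(status_byte):
    """Split the 8-flag status byte into two formatted error codes.

    Subsystem code 0000 carries the four least significant flags and
    subsystem code 0001 the four most significant, as described above.
    """
    low = (0b0000 << 4) | (status_byte & 0x0F)
    high = (0b0001 << 4) | ((status_byte >> 4) & 0x0F)
    return low, high
```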
Each error code is used to address a group mapping section 168 of the Fault Tolerator RAM 162. The error code addresses a penalty weight pointer, as shown in Figure 18, which addresses a penalty weight section 172 of the Fault Tolerator RAM. As shown in Figure 20, the penalty weight pointer addresses a specific penalty weight which is assigned to the specific combination of reported errors contained in the formatted error code. The penalty weights resulting from each error code stored in the error file for that Node are summed in the Error Handler 164 and appended to the Error message as an increment penalty count (byte 8) for that Node. As previously indicated, the Error Handler 164 will generate only one Error message in each Atomic period for each Node which transmitted a message which contained an error.
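The two-level table walk — error code to penalty weight pointer, pointer to penalty weight — and the summation into the increment penalty count can be sketched as follows; the table contents passed in are illustrative stand-ins for the RAM sections:

```python
def increment_penalty_count(error_codes, group_map, penalty_weights):
    """Sum the penalty weight mapped to each filed error code.

    group_map mirrors the group mapping section 168 (error code ->
    penalty weight pointer) and penalty_weights mirrors the penalty
    weight section 172 (pointer -> weight).
    """
    return sum(penalty_weights[group_map[code]] for code in error_codes)
```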
The Fault Tolerator RAM 162 will also store the deviance limits for the one byte (MT0), two byte (MT1), and four byte (MT3 and MT4) Data Value messages in four separate sections, 174, 176, 178 and 180, which are used by the Voter 38, as shall be explained with reference to the Voter hereinafter.
The details of the Message Checker Interface 150 are illustrated in Figure 21. A Store Message Module 182 receives the message bytes directly from the Message Checker 34 and stores them in the message partition section 166 of the Fault Tolerator RAM 162. The Store Message Module 182 will add the context bits stored in a Message Checker Interface Context Store 190 to the descriptor (NID plus byte count) appended to the message byte by the Message Checker 34 to generate a partition address (PID). The partition address identifies the location in the message partition section 166 where the particular message byte is to be stored. As previously discussed, at the beginning of each Master period, each Node will first transmit a Base Penalty Count message followed by a Task Completed/Started message. The Store Message Module 182 stores for each Node a first flag signifying the receipt of the Base Penalty Count message and a second flag signifying the receipt of the subsequent Task Completed/Started message. These flags are set to false at the beginning of each Master period and are set to true when the Base Penalty Count and the Task Completed/Started messages are received for that Node. Unless both of these flags are set to true, the Store Message Module 182 will disable the writing of the address of any subsequently
received messages from that Node in a Voter Interface Buffer 184. As a result, the subsequently received data from that Node will not be processed by the Voter 38 and will be ignored during any subsequent processing. The Voter Interface Buffer is an 8 × 7 first-in, first-out buffer in which the four most significant bits are the four most significant bits of the partition address (context bit plus NID) for the received message in the message partition section 166 of the Fault Tolerator RAM 162. The remaining three bits are the message type code contained in the first byte of the message.
An Error Status Byte Detector 186 listens to the messages being transmitted from the Message Checker 34 to the Fault Tolerator 36 and will detect the receipt of each error status byte (byte 15) generated by the Message Checker 34. If the contents of the error status byte, with the exception of the length error (LENER) bit, are all zeros, the Error Status Byte Detector 186 will enable the Message Checker Interface Context Storage 190 to load the Voter Interface Buffer 184 through the Store Message Module 182, or to load a Task Completed Register 202 or a Branch Condition Register 200 as required. Otherwise the Error Status Byte Detector 186 will load each non-zero error status byte in an Error Status Buffer 188 for subsequent processing by the Error Handler 164. The Error Status Byte Detector 186 will also detect if a message is a self-test message (TID=NID) and set a self-test flag in the Error Status Buffer 188. The Error Status Buffer 188 is an 8 × 12 first-in, first-out buffer in which the most significant bit is the self-test flag, the next three bits are the Node's identification (NID) code and the remaining 8 bits are the received error status byte.
The Message Checker Interface Context Storage 190 temporarily stores for each Node the information contained in Table III. This information is temporarily stored since it is not known whether the message is error free until the error status byte is received.
TABLE III
Message Checker Interface Context Storage
Bit Description When Written
13 TIC Flag MT1, Byte Count = 2 (DID=0)
12 Partition Context Bit Byte Count = 15
11-9 Message Type Code Byte Count = 1
8 Branch Condition Bit MT6, Byte Count = 4
7-0 Started TID MT6, Byte Count = 3
The most significant bit, bit 13, signifies that the received message is a Task Interactive Consistency (TIC) message which is processed by the Synchronizer 46. This flag is set by a Task Interactive Consistency Message Detector 192 in response to a message type MT1 having a data identification code which is all zeros (DID=0), and will inhibit the loading of the address of this message in the Voter Interface Buffer 184 since it is only used by the Synchronizer and no other subsystem of the Operations Controller. The twelfth bit is the partition context bit which identifies in which partition of the message partition section 166 the message will be stored. The context bit is toggled when the Error Status Byte Detector 186 indicates the prior message was error free. If the message is not error free, the context bit is not toggled and the next message received from that Node is written over the prior message in the Fault Tolerator RAM 162.
The message type code bits are received directly from the first byte of the message. The branch condition bit, bit 8, is received from a Branch Condition Detector 194 which detects the branch condition contained in the fourth byte of the Task Completed/Started (MT6) message. The identification of the started task (TID) is obtained from a Task Started Detector 196 which loads the TID of the started task into the eight least significant bit locations of the Message Checker Interface Context Storage 190.
Upon the receipt of an error status byte which signifies that the received message was error free and if the message is not a Task Interactive Consistency message, the Message Checker Interface Context Storage 190 will transfer the context bit and the message type to the Store Message Module 182. In the Store Message Module 182, the context bit is added to the Node identification (NID) code to form the starting partition (PID) address of that message in the Fault Tolerator RAM 162. The message type code is appended to the partition address and they are transferred to the Voter Interface Buffer 184 for subsequent use by the Voter 38 to extract the data necessary for the voting process.
Upon the receipt of an error status byte signifying the receipt of an error free Task Completed/Started (MT6) message, the Message Checker Interface Context Storage 190 will transfer the identification (TID) code of the started task and the Node identification (NID) code to a Scheduler Interface Buffer 198 where it is transferred to the Scheduler 40 when requested. The Scheduler Interface Buffer 198 is an 8 × 11 first-in, first-out buffer which is reset at the end of the soft error window (SEW). The soft error window is generated by the Synchronizer 46 and defines a period of time bracketing the end of each Subatomic period during which the time critical messages from other Nodes should be received if they are in synchronization with each other.
In parallel, the Message Checker Interface Context Storage 190 will transfer the stored branch condition (BC) bit to the Branch Condition Register 200 and transfer the Node identification (NID) code of the Node that sent the message to the Task Completed Register 202. These registers are read by the Synchronizer Interface 152 when requested by the Synchronizer 46. The Branch Condition Register 200 and the Task Completed Register 202 are double buffered with a different set of registers being reset at the end of each hard error window (HEW) signal. The hard error window signal is generated by the Synchronizer 46 and brackets the soft error window (SEW) at the end of each Subatomic period and defines the maximum deviance in the arrival time of the time critical messages from the other Nodes. The function of the
hard error window (HEW) and soft error window (SEW) will be discussed in greater detail in the detailed description of the Synchronizer 46.
The Error Handler, as shown on Figure 22, includes an Error Filer 204, an Error Consistency Checker 206, an Error Message Generator 208, and an Error Handler Context Store 210. The Error Filer 204 polls the Message Checker Interface 150, the Synchronizer Interface 152, the Scheduler Interface 154, and the Voter Interface 158 for error reports from the various subsystems within the Operations Controller. The Error Filer will format the received error reports into a formatted error code, as shown on Figure 17, and tag them with an error file address, as shown on Figure 19. The error file address is a 3-bit error file identification code, a context bit which is the one generated by the Message Checker Interface 150 for filing the message in the message partition of the Fault Tolerator RAM 162, the Node identification (NID) code and a report number. As previously described, the formatted error code contains a 4-bit code which identifies the subsystem which detected the error and four flag bits identifying the errors detected.
The Error Filer 204 will pass these formatted error codes to the Fault Tolerator RAM Interface 160 which will store them in the error code file section 170 of the Fault Tolerator RAM 162. The Error Filer 204 will also forward the number of error reports written to the Error Handler Context Store 210 so that the Error Message Generator 208 will be able to determine how many error reports to process from the Fault Tolerator RAM 162. The Error Filer 204 will also detect the self-test flag generated by the Message Checker 34 and forward this flag to the Error Message Generator 208. The self-test flag is part of one of the group codes whose penalty weight is programmed to be zero or a very small value. The self-test error message will identify all of the errors detected and will include the incremental and base penalty counts.
The Error Consistency Checker 206 is responsible for consistent handling of the error reports and the base penalty counts for each Node in the system. A form of implicit interactive consistency is used to achieve this goal. At the beginning of each Master period, the Error Consistency Checker 206 receives through the Voter Interface 158 a voted base penalty count (VBPC) which is generated by the Voter 38 in response to the Base Penalty Count messages received from all the Nodes in the system including its own. Referring now to Figure 23, these voted base penalty counts are stored in a Base Penalty Count Store 212 as the base penalty counts for each Node independent of the base penalty count values stored for the preceding Master period. In this manner all the Nodes in the system will begin each Master period with the same base penalty counts for each Node in the system. The Base Penalty Count Store 212 also receives a voted increment penalty count (VIPC) which is generated by the Voter 38 from the Error messages received from all of the Nodes including its own. The voted increment penalty count (VIPC) is added to the base penalty count of the accused Node when the error is verified by a Validity Checker 218. Preferably the Validity Checker 218 is embodied in the Voter 38, but may be part of the Error Consistency Checker 206 as shown in Figure 23.
The Error Consistency Checker 206 also maintains a
Current System State Register 214 which stores a voted current system state (CSS) vector and a Next System State Register 216 which stores a next system state (NSS) vector. The current system state vector identifies which Nodes are currently active in the system and which are excluded, while the next system state vector identifies which Nodes are to be included and/or which are to be excluded in the next system state. The system will change its state at the beginning of the next Master period if the voted next system state vector is different from the current system state vector. The current and next system state vectors have 8 flag bits, one for each Node, which are set when the Node is excluded and which are reset when the Node is readmitted to the operating set of Nodes.
Prior to the discussion of the Validity Checker 218, the various types of errors that are detected in each Node will be
discussed briefly. Table IV is a list of twenty-five fault detection mechanisms used in the system.
TABLE IV
Fault Detection Mechanisms
Error Subsystem Sym/Asym
Message Vertical Parity MSC A
Message Longitudinal Redundancy MSC A
Message Length MSC A
Synchronization - Hard MSC A
Synchronization - Soft MSC A
Send Node ID MSC S
Invalid Message Type MSC S
Invalid Data ID MSC S
Task ID Sequence FLT S
Data ID Sequence FLT S
Data Limit MSC S
Data Deviance FLT S
Task Run Time SCH S
Current System State FLT S
Next System State FLT S
Penalty Count Base Deviance FLT S
Penalty Count Increment Deviance FLT S
Missed BPC Message FLT S
Unsupported Error Report FLT S
Missing Error Report FLT S
Self Detection Monitor FLT S
M.P. Misalignment SYN S
Sync Sequence Error SYN S
Sync Missing Message SYN S
Too Many Data Messages VTR S
AP Reported Error TSC S
Last DID Shipped SC S
Wrong Message during SEW FLT A
Too Many Error Reports VTR S
Too Many BPC VTR S
Exceeded Max. No. of Errors FLT A
This table lists the error, the subsystem which detects the error, and whether the detection of the error is symmetric (S) or asymmetric (A). Since the system is symmetric in its structure, most of the errors contained in the messages transmitted to each other should be detected by every other Node. Therefore, every Node should generate an error message which identifies the error detected and the incremental penalty counts to be charged against the Node that made the error. These errors which are detected by all of the Nodes are called symmetric errors. Therefore, the existence of symmetric errors should be verified by at least a majority of the active Nodes in the system. There also is the case where channel noise occurs so that an error manifests itself differently among the receiving Nodes. In this case, the majority of the Nodes will agree which Node is faulty. However, the error or errors detected may be different for each Node and the incremental penalty count reported in the various error messages may likewise be different. A median vote on the incremental penalty count will be used to increment the base penalty count for that Node. However, the Validity Checker 218 will not generate a
deviance error report to the Error Filer 204 identifying those Nodes whose incremental penalty counts differed from the voted incremental penalty count by more than the allowed amount. This is to prevent the unjust penalizing of a healthy Node.
Turning now to Figure 24, the Validity Checker 218, whether embodied in the Voter 38 or the Fault Tolerator 36, has a Majority Agree Detector 224, an Asymmetric Error Detector 226, and an Error Reporter 230. The Majority Agree Detector 224 receives a "missing vector" from the Voter 38 indicating from which Nodes the Base Penalty Count or Error messages were not received and a "deviance vector" indicating which Nodes generated a Base Penalty Count or Error message in which the base penalty or increment penalty counts were outside of the permitted deviances about the voted values. The Majority Agree Detector 224 also receives a current system state vector from the Current System State Register 214. The Majority Agree Detector 224 will subtract the "deviance vector" and the "missing vector" from the current system state vector to generate a number corresponding to the number of Nodes which agree with the voted value. This number is then compared with the number of Nodes currently active in the system identified by the current system state vector. If a tie or a majority of the Nodes sent messages whose values agree with the voted values, then if the message is a Base Penalty Count message, the voted base penalty counts are stored in the Base Penalty Count Store 212. Otherwise, if the message is an Error message, the base penalty count stored in the Base Penalty Count Store 212 is incremented by the voted increment penalty count. If the messages received from the other Nodes do not represent a majority, then the Majority Agree Detector will generate a write inhibit signal which is applied to the Base Penalty Count Store 212 through an AND gate 234. This write inhibit signal will inhibit the writing of the voted values in the Base Penalty Count Store 212 provided the reported error or errors is not an asymmetric error.
The Asymmetric Error Detector 226 receives the deviance
vector, the missing vector, and the current system state vector, and generates a deviance report inhibit signal when a majority of the Nodes send Error messages identifying a particular Node as faulty but disagree as to the incremental penalty counts to be charged against the faulty Node. The Asymmetric Error Detector will interrogate the Error Handler Context Store 210 and will generate the deviance report inhibit signal when the detected errors are determined to be asymmetric errors of the type identified in Table IV. The deviance report inhibit signal will inhibit the Error Reporter 230 from reporting to the Error Filer 204 a deviance error for any Node which sent an Error message containing an incremental penalty count which deviated from the voted incremental penalty count by more than the permitted tolerance. The deviance report inhibit signal is also applied to an inverted (negative) input of the AND gate 234. The deviance report inhibit signal will disable the AND gate 234 and block the write inhibit signal generated by the Majority Agree Detector 224. This will enable the voted increment penalty count to be added to the base penalty count stored in the Base Penalty Count Store 212.
The Error Reporter 230 receives the missing and deviance vectors from the Voter 38, the current system state (CSS) vector from the Current System State Register 214, the error report inhibit signal from the Asymmetric Error Detector 226, and the write inhibit signal from the output of the AND gate 234. In response to the absence of a write inhibit signal, the Error Reporter 230 will report to the Error Filer 204 the Node identified in the deviance vector as having deviance errors. It will also report, in response to the missing vector, each Node which did not send a Base Penalty Count or Error message as required. In response to a write inhibit signal and the absence of an error report inhibit signal from the Asymmetric Error Detector 226, the Error Reporter 230 will report each Node having reported an unsupported error. No deviance errors are reported for these unsupported Error messages. Finally, in response to an error report inhibit signal from the Asymmetric Error Detector 226, the Error
Reporter 230 will report to the Error Filer 204 any Node which fails to report the asymmetric error as identified by the missing vector. As previously described, the Error Reporter 230 will not report any deviance errors in the presence of a deviance report inhibit signal from the Asymmetric Error Detector.
Returning to Figure 23, the Error Consistency Checker 206 also includes an Exclude/Readmit Threshold Comparator 220 responsive to the incrementing of the base penalty count in the Base Penalty Count Store 212 by the voted increment penalty count. The Exclude/Readmit Threshold Comparator 220 will compare the incremented base penalty count with a predetermined exclusion threshold value and, when the incremented base penalty count exceeds the exclusion threshold value, the Exclude/Readmit Threshold Comparator 220 will set the excluded flag in the Next System State Register 216 in the bit position which corresponds to the faulty Node. The setting of the excluded flag signifies that in the next system state the Fault Tolerator has determined that the Node whose exclusion flag was set should be excluded from the operating set. At the end of each Atomic period, the current (CSS) and next (NSS) system state vectors are transferred to the Synchronizer 46 and are included in the next System State (MT5) message as the current system state and the new system state vectors, respectively. The new system state is globally verified by the Voter 38 upon the receipt of the System State messages from all of the participating Nodes in the system. The majority view of what the new system state is to be is the median value generated by the voting process. Thus, an error in a local decision to exclude or include a Node will manifest itself as a deviance error.
Actual reconfiguration of the workload to the new voted system state is carried out by the Scheduler 40 and the time at which the sequence is initiated is based on an application designer selectable parameter. Reconfiguration can either occur at the next Atomic period after which a new system state is globally verified or wait until the next Master period. If reconfiguration occurs at any Atomic period, then the voted new system state vector is passed to the Scheduler 40 as a system state vector during the normal transfer sequence between the Fault Tolerator and the Scheduler 40. However, if reconfiguration occurs at the Master period boundaries, the voted new system state vector is passed to the Scheduler 40 only when the flag signifying the last Subatomic period (LSAP) in the Master period is true.
To permit the readmittance of an excluded Node following an extended period of error free operation, the Error Consistency Checker 206 has a Base Penalty Count Decrementor 222 which will decrement the base penalty count for each Node by a predetermined quantity at the end of each Master period. After decrementing the base penalty count for each Node, the Base Penalty Count Decrementor 222 will enable the Exclude/Readmit Threshold Comparator 220 to compare the decremented base penalty count of each excluded Node with a predetermined readmittance value. The Exclude/Readmit Threshold Comparator 220 will reset the flag in the Next System State Register 216 for each previously excluded Node whose decremented base penalty count is less than the readmittance threshold value. This permits a Node to be readmitted to the operating set the next time the system is reconfigured, since its operation has been error free for an extended period of time. This error free operation indicates that the original fault was transient or had been corrected (repaired or replaced). Preferably, the readmittance threshold value is less than the exclusion threshold value to prevent the system from oscillating between two different system states if a Node has an intermittent fault which causes its base penalty count to fluctuate about the exclusion threshold value. The Base Penalty Count Store 212, the Current System State Register 214 and the Next System State Register 216, preferably, are incorporated in the Error Handler Context Store 210 but may be independent elements in the Error Consistency Checker 206, as shown in Figure 23.
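The exclude/readmit hysteresis can be sketched as follows, folding the exclusion test and the Master-period decrement into one routine for brevity; the threshold and decrement values are hypothetical, since the patent leaves them to the system designer:

```python
EXCLUDE_THRESHOLD = 16   # hypothetical values; the readmittance
READMIT_THRESHOLD = 8    # threshold is kept below the exclusion
DECREMENT = 2            # threshold to prevent oscillation

def end_of_master_period(base_penalty_count, excluded):
    """Decrement one Node's count and update its excluded flag."""
    base_penalty_count = max(0, base_penalty_count - DECREMENT)
    if not excluded and base_penalty_count > EXCLUDE_THRESHOLD:
        excluded = True
    elif excluded and base_penalty_count < READMIT_THRESHOLD:
        excluded = False
    return base_penalty_count, excluded
```

Because the readmit threshold sits below the exclude threshold, a count hovering near the exclusion value cannot toggle the Node in and out of the operating set on successive Master periods.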
The Exclude/Readmit Threshold Comparator 220 will also
detect the exclusion of its own Node and generate a RESET signal which activates the Reset Generator 157 shown in Figure 22 to generate an Operations Controller reset signal (OCRES) and an initial parameter load signal (IPL) which will cause the Operations Controller to reset and reload the initial parameters as previously described. The Reset Generator 157 is also responsive to the Power On Reset (POR) signal to generate the OCRES and IPL signals each time the electrical power to the Operations Controller is turned on.
The Error Message Generator 208 will generate, during each Atomic period, an Error message for each Node which generated a message containing an error detected by its own Operations Controller. The Error Message Generator 208 will also generate a Base Penalty Count message at the beginning of each Master period. These messages are transmitted to the Transmitter 30 through the Transmitter Interface 156.
At the beginning of each Atomic period , the Error Message Generator 208 wil l set to zero (0) the Increment Penalty Count for each Node. It wi ll then check the error code fil e section 170 of the Fault Tol erator RAM 162 for any error reports. The error code of each error report is used to address the Group Mapping Section 168 to obtain the pointer to the penalty weight section 172 of the Fault Tol erator RAM 162 to extract a penalty weight. Thi s penalty weight i s stored as the Increment Penalty Weight for the faulty Node in the Error Handler Context Store 210 and 1s used to increment the base penalty count currently being stored for that Node. Thi s process is repeated for each reported error for each Node until the Fault Tolerator receives a System State message signifying the end of the Atomic period for each individual Node. In response to receiving a System State message from a particular Node , the increment penalty count and base penalty count for that Node are frozen. The Message Generator wil l then , using the content of the error code file section 170 of the Fault Tol erator RAM 162 and the stored increment penal ty count and the base penalty counts stored in the Error Handl er
Context Store 210, construct an error message for each Node for which a fault was detected. In the event a System State message is not received from a faulty Node, the base penalty count and the increment penalty count will be frozen by sensing the High End Of Fuzzy (HEOF) signal generated by the Node's own Synchronizer 46, which signifies that all valid System State messages should have been received. This prevents the transmission of the error and base penalty count messages from being hung up while waiting for the missing System State message.
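The bookkeeping described above can be illustrated with a short sketch; the error codes, weight values, and function names here are assumptions for illustration only and do not appear in the specification:

```python
# Hypothetical penalty-weight table; in hardware, the Group Mapping Section 168
# maps an error code to a pointer into the penalty weight section 172.
PENALTY_WEIGHT = {"CRC": 4, "SEQUENCE": 2, "DEVIANCE": 1}

def accumulate_penalties(error_reports, base_penalty):
    """error_reports: iterable of (node_id, error_code) pairs from the error
    code file section; base_penalty: dict of node_id -> base penalty count."""
    # Increment penalty counts are set to zero at the start of each Atomic period
    increment = {node: 0 for node in base_penalty}
    for node, code in error_reports:
        weight = PENALTY_WEIGHT[code]   # penalty weight for this error class
        increment[node] += weight       # grows the node's increment count
        base_penalty[node] += weight    # and its base penalty count in step
    return increment, base_penalty
```

The freezing of a Node's counts on receipt of its System State message (or on HEOF) is omitted from the sketch.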
The format of the four (4) error bytes of the Error message (bytes 3 through 7) is shown in Figure 25. The most significant bit of the first Byte is a self-test flag which identifies that the reported errors were detected in a Self-Test message. In a Self-Test Error message, the increment penalty count Byte 8 will have a small value or be zero.
At the beginning of each Master period, the Error Message Generator 208 will retrieve the base penalty counts currently stored for each Node and will generate a Base Penalty Count message which is the first message transmitted after the System State message which is sent by the Synchronizer 46 at the end of the last Atomic period in each Master period. As discussed relative to the Transmitter 30, the Transmitter's Message Arbitrator 56 will poll the Fault Tolerator Interface 52 after it sends a System State message at the end of the Master period, then wait for a Base Penalty Count message generated by the Fault Tolerator 36.
Figure 26 depicts the sequence of operations of the Operations Controller which results in a reconfiguration of the system and the role played by the Fault Tolerator 36. Referring to Figure 26, at the beginning of each Master period, (a), each Node will broadcast its Base Penalty Count message and reset all of the increment penalty counts in its Error Handler Context Store to zero. The Error Handler 164 will then begin the processing of any errors detected by its own Operations Controller's error detection mechanisms. At the end of the first
or any subsequent Atomic period, (b), in which a message containing an error occurs, each Node will broadcast Error messages identifying the Node which sent the message for which the errors were detected and the increment penalty count and the base penalty count for that Node. By the end of the next Atomic period (c) the Error messages from all of the Nodes should have been received. During the next Atomic period (d) the Fault Tolerator will process the received Error messages, detect any unsupported Error messages from other Nodes, and perform a median vote on the increment penalty count for the accused Node whose reported error or errors are supported by a majority of the Nodes. This median increment penalty count is then added to the base penalty count of the accused Node. The incremented base penalty count is then compared with the exclusion threshold. If the incremented base penalty count exceeds the exclusion threshold, the exclusion bit for that Node is set in the next System State Vector which is passed to the Synchronizer 46. At the end of that Atomic period, (e), the Synchronizer 46 will include the next System State Vector in the System State message which is broadcast to all of the other Nodes. At the beginning of the next Atomic period (f) the Fault Tolerator 36 will verify the correctness of the next System State by using a median vote of the healthy Nodes and pass this information to the Synchronizer 46 and to the Scheduler 40. Upon receipt of this information (g) the Synchronizer 46 and the Scheduler 40 will initiate a reconfiguration process in which the System State identified in the voted next System State Vector becomes the current System State for the System. After the reconfiguration is completed (h) the system will begin a new Master period in the new System State.
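The voting and threshold comparison of step (d) might be sketched as follows; the function name, the simple-majority support test, and the numeric values are assumptions for illustration:

```python
import statistics

def vote_on_accusation(reported_increments, base_count, exclusion_threshold, n_nodes):
    """reported_increments: increment penalty counts reported for one accused
    node, one entry per accusing node. Returns the new base count and whether
    the exclusion bit should be set in the next System State Vector."""
    if len(reported_increments) <= n_nodes // 2:
        return base_count, False          # accusation lacks majority support
    voted = int(statistics.median(reported_increments))  # median-voted increment
    base_count += voted
    return base_count, base_count > exclusion_threshold
```

For example, with five Nodes and an assumed exclusion threshold of 15, three supporting reports of (3, 3, 4) vote to an increment of 3, raising a base count of 13 to 16 and setting the exclusion bit.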
Although the above example is directed to a single fault by a single Node, the Fault Tolerator operation is no different if more than one fault is detected for any one Node or if more than one Node is accused of being faulty in the received Error messages. These sequences can be overlapped if successive failures occur in different Atomic periods.
VOTER
The Voter 38 performs two primary functions in the processing of the data. First, it generates a voted value from all available copies of the data and, second, it performs a deviance check to determine if the data value of each copy of the data is within a predetermined tolerance or allowed deviance. Referring to Figure 27, the Voter 38 has a Loader 236 which receives from the Fault Tolerator 36 the message type (MT) code, the node identification (NID) code, and the data identification (DID) code for each message for which a voted value is to be generated. As each such message is received, the Loader 236 will retrieve and temporarily store each copy of the data currently available in the Fault Tolerator RAM 162 which has the same data identification (DID) code as the received message. The Loader 236 will also, using the message type code, retrieve from the deviance sections, 174 through 180, of the Fault Tolerator RAM 162 the predetermined deviances for that particular message.
The Loader 236 will first transmit the most significant bit of each copy of the data in parallel to an Upper Medial Value Sorter 238 and a Lower Medial Value Sorter 240 which will, respectively, sort the received bits to generate an upper (Un) and a lower (Ln) medial bit value. These upper and lower medial bit values (Un and Ln) are transferred, as they are generated, to an Averaging Circuit 242 and a Deviance Checker 244. At the end of the hard error window (HEW) the Loader 236 will generate a missing vector (MV) identifying each Node which did not send a Task Interactive Consistency or System State message. The Loader 236 will also generate a missing vector at the end of each Atomic period identifying each Node which did not generate an Error Message or a Base Penalty Count message.
The Averaging Circuit 242 adds the upper and lower median bit values and divides the sum by two to produce a voted average.
The Deviance Checker 244 receives the upper (Un) and the lower (Ln) medial bit values, the deviance values retrieved from
the Fault Tolerator RAM 162, and the corresponding data bit from each copy of the data being processed, and will determine for each bit in each copy of the data value whether or not it is within the allowed deviance. This process is repeated for each bit in each copy, from the most significant bit to the least significant bit. At the end of each message, a deviance error (DERR) vector is sent to the Fault Tolerator 36 identifying each Node whose message contained a deviance error.
The voted data values generated by the Averaging Circuit 242 for the Data Value messages (MT0, MT1, MT2, and MT3) are transmitted to a Voter-Task Communicator Interface 246 which passes them to the Task Communicator 44 along with the data identification (DID) code for that data value. The voted values for the base penalty counts contained in the Base Penalty Count messages, the voted values for the current and new System State Vectors contained in the System State messages, and the voted values for the incremental and base penalty counts in the Error messages are transmitted to a Voter-Fault Tolerator Interface 248 where they are passed to the Fault Tolerator 36 along with the deviance error (DERR) vector and the missing vector, as previously described.
The voting process and the deviance checks are repeated each time a message is received which requires a voted value to be generated. This assures that at all times the Task Communicator 44 and the Fault Tolerator 36 will have the best and most current voted values for the data values they may need. Using this type of on-the-fly voting, the system will not hang up if one or more copies of the data is unavailable due to a detected fault in the received message or because a faulty Node fails to generate the required message.
The Upper and Lower Medial Value Sorters 238 and 240, respectively, extract two values from the data values being processed. The values chosen depend upon whether the median select (MS) or the mean of the medial extremes (MME) voting is implemented. To avoid confusion, only the implementation for extracting the upper and lower medial values will be discussed. The minor changes needed to implement the mean of the medial extremes (MME) sorting will be discussed briefly hereinafter.
The process used by the Upper and Lower Medial Value Sorters 238 and 240 is shown in the flow diagram illustrated in Figure 28, while the details of the circuit for the Lower Medial Value Sorter 240 are shown in Figure 29. Since the Upper Medial Value Sorter 238 is a mirror image of the Lower Medial Value Sorter 240, it need not be shown or discussed in detail.
Referring first to Figure 28, the process begins by initializing the bit count (n) to zero and generating an initial median value S'n = Sºn = (m-1)/2 as shown in block 250. In the calculation of the initial median value, m is the actual number of copies of the data being processed, which is obtained from the Loader 236. The Upper Medial Value Sorter 238 then counts, as shown in block 252, the number of ones (1's) contained in the most significant bit positions of all the copies to generate a number n1, which is the number of one bits counted. The Upper Medial Value Sorter 238 will then inquire if n1 - S'n is equal to or less than zero, as shown in decision block 254. If the number of 1's is equal to or less than S'n, then the upper medial bit value is a 0-bit as indicated in block 256. If the upper medial value Un is a 0-bit, then all the copies having a 1-bit in the same bit position are excluded from the subsequent processing to determine the value of the remaining upper medial value bits. Because some copies of the data value are now excluded, a new value S'n is computed as indicated in block 258 by subtracting from S'n the number of excluded copies (n1) to generate a new value for the analysis of the next most significant bit.
When the number of 1-bits is greater than S'n then the upper median value Un is a 1-bit as indicated in block 260, and all of the copies having 0-bits in the same bit position are
excluded from the subsequent processing. After the upper medial value Un for the most significant bit is determined, the process will proceed to the next most significant bit, block 272, and the above procedure is repeated until all the bits in the data value (n=nmax) have been processed as indicated in decision block 274.
In a similar manner, the Lower Medial Value Sorter 240 will count the number of 0's as indicated in block 262 to generate a number n0 equal to the number of 0's counted. If the number of 0's (n0) is equal to or less than Sºn, as indicated in decision block 264, (n0 - Sºn ≤ 0), then the lower medial bit (Ln) is a 1-bit and all the data copies having a 0-bit in the same bit position are excluded from processing of the subsequent lower medial bits. Again the medial value Sºn is corrected by subtracting the number of excluded copies (n0) from the preceding value of Sºn as indicated in block 268. If the number of 0-bits (n0) is greater than Sºn, then the lower medial value of the bit position is a 0-bit as indicated in block 270 and the data copies having a 1-bit in the same bit position are excluded. This process is repeated until all of the lower medial value bits are determined.
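The bit-serial sorting of Figure 28 can be sketched in software as below; the function, its arguments, and the integer representation of the copies are illustrative assumptions (the hardware operates on bit streams, not on whole words):

```python
def medial_bits(copies, nbits, select):
    """Return the upper ('upper') or lower ('lower') medial value as a list of
    bits, most significant first. copies: the m data-value copies as integers."""
    s = (len(copies) - 1) // 2            # initial S'n = S0n = (m - 1) / 2
    active = set(range(len(copies)))      # copies not yet excluded
    out = []
    for n in range(nbits - 1, -1, -1):    # most significant bit first
        bits = {i: (copies[i] >> n) & 1 for i in active}
        target = 1 if select == 'upper' else 0   # count 1's or 0's
        count = sum(1 for b in bits.values() if b == target)
        if count <= s:
            # too few target bits: the medial bit is the opposite value;
            # exclude the copies carrying the target bit (blocks 256/266)
            out.append(1 - target)
            active -= {i for i, b in bits.items() if b == target}
            s -= count                    # correct S by the excluded count
        else:
            out.append(target)            # blocks 260/270: exclude the others
            active -= {i for i, b in bits.items() if b != target}
    return out
```

With three copies 1, 2, and 3 of a two-bit value, both the upper and lower sorts return the median 2 (bits [1, 0]); with an even number of copies they return values bracketing the middle, whose mean becomes the voted value.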
The circuit details for the Lower Medial Value Sorter
240 are shown in Figure 29. The circuit details for the Upper Medial Value Sorter 238 are the mirror of the circuit shown in Figure 29 except that the Zero Counter 280 is replaced by an equivalent One Counter. Referring to Figure 29, the data values from each copy of the data value retrieved by the Loader 236 are received bit by bit, from the most significant bit to the least significant bit, by a plurality of OR gates 276-0 through 276-N and by a like plurality of Exclusive NOR gates collectively indicated by block 278. The OR gates 276-0 through 276-N will pass the received bits to the Zero Counter 280. The Zero Counter 280 will actually count the number of 1-bits received and subtract that number from the number of copies (m) being processed to generate the number of 0's (n0). The Loader 236 counts the number of copies of the data it retrieves from the Fault Tolerator RAM 162 and supplies this number to the Zero Counter 280 and to an S°n
Generator 282. The S°n Generator 282 subtracts one from m and divides the remainder by two to generate the initial value for Sºn.
The output of the Sºn Generator 282 is received by a 2:1 Multiplexer 284 which will pass the initial value of Sºn to a Register 286. The output (n0) of the Zero Counter 280 and the content of the Register 286 are received by a Dual Comparator 288 which performs the dual comparison of n0 = Sºn and n0 < Sºn. The outputs of the Dual Comparator 288 are applied to the inputs of an OR gate 290 which outputs the lower medial value Ln. If n0 = Sºn or n0 < Sºn, then the lower medial bit value Ln is a 1-bit, as indicated by decision block 264 and block 266 of Figure 28. The exclusion of the copies having 0-bits in the same bit position is performed by the Exclusive NOR gates 278 and the Exclusion Register 294. The 1-bit value produced by the OR gate 290 is applied to an input of each of the Exclusive NOR gates. The Exclusive NOR gates 278 will generate a logical 1 signal for each copy of the data value which has a 0-bit in the same bit position. This logical 1 is stored in the Exclusion Register 294, the output of which is connected to the alternate inputs of the OR gates 276-0 through 276-N. As a result, the alternate inputs of the OR gates 276-0 through 276-N associated with copies having a 0-bit in that bit position will be a 1-bit which is passed to the Zero Counter 280, thereby excluding those copies from further participation in the determination of the lower medial bit values. If the lower medial bit value, Ln, is a 0, then a logical 0 signal is applied to the Exclusive NOR gates 278 which causes a logical 1 to be stored in the Exclusion Register 294 for each copy of the data value which presented a 1-bit for processing.
The output, n0, from the Zero Counter 280 and the lower medial bit value Ln are applied to the input of a Gating Circuit 296 which passes the value of n0 to a Subtraction Circuit 298 when the lower medial bit value Ln is equal to 1. The Subtraction Circuit 298 also receives the current value of Sºn stored in Register 286, and performs the subtraction Sºn = S°n - n0 indicated in block 268 of Figure 28. This new value of Sºn is applied
to a second input of the Multiplexer 284 and is passed to the Register 286 for use in the processing of the next lower medial bit value. A sequencer circuit (not shown) will monitor the number of bits processed and will clear the Exclusion Register 294 and Sºn Register 286 after the last lower medial bit Ln is generated for the current set of data values in preparation for the processing of the next set of data values.
The operation of the Upper Medial Value Sorter 238 and the Lower Medial Value Sorter 240 for producing upper (Un) and lower (Ln) medial values for generating a Mean of the Medial Extremes (MME) voted value is identical to that described above except that the median values S'n and Sºn are replaced with values T'n and Tºn which are the smaller of S'n and Sºn, respectively, or a fixed value. The resulting voted value generated by the Averaging Circuit 242 using these values is then the Mean of the Medial Extremes.
The operation of the Averaging Circuit 242 will be explained with reference to the flow diagram shown in Figure 30 and the circuit diagram shown in Figure 31. The averaging process forms the mean of the upper and lower values by keeping two versions of the mean value M and choosing between them as later bits arrive. This process rests on the following two facts:
a) If the bits Un and Ln are identical at any particular bit position, the mean bit M is the same except for the case described in (b) below. If the bits are different, then the mean is ½, which in binary form is 0.1; and
b) A sequence ½, ½, ½, ... ½ can be resolved into binary format only when the first identical pair following the sequence arrives. For example, the sequence ½, ½, ½, ... ½, 0, where 0 represents the arrival of Un and Ln both having 0 values, resolves to 011...11, and the sequence ½, ½, ½, ... ½, 1, where 1 represents the arrival of Un and Ln both having 1 values, resolves to 100...00.
Referring to Figure 30, the process begins by initializing the value A to 0 and the bit number n to 0 as indicated by block 300. The value A is the Exclusive OR of the preceding values of Ln and Un and is set to 0 at the beginning of the process. The process then inquires, decision block 302, if the
Exclusive OR of Ln and Un is equal to zero (Ln ⊕ Un = 0). If the Exclusive OR of Ln and Un is equal to zero, the process then inquires if A is equal to 0, as indicated by decision block 310. If A is equal to 0, the value of Ln is inserted into both registers M1 and M2 as indicated by block 312. Registers M1 and M2 keep two different versions of the mean M in order to resolve the problem discussed above in (b) where Un and Ln are different. If A in decision block 310 is not equal to 0, then the Averaging Circuit 242 enters the complement (Un) of Un into registers M1 and M2 as indicated by block 314. The process then inquires, decision block 316, if Ln is equal to 0. This is the resolution of the sequences discussed in (b) above, if the sequences exist. In the instant embodiment M1 stores the first sequence (1, 0, 0, ... 0, 0) described above and M2 stores the second sequence (0, 1, 1, ... 1, 1). If Ln = 0 then the sequence is resolved to be the sequence stored in M2 and, therefore, M1 is made equal to M2, as indicated in block 320. Otherwise, if Ln is not equal to 0, then Ln is a 1, the sequence is resolved to be the sequence stored in M1, and M2 is made equal to M1, as indicated in block 318.
If the Exclusive OR of Ln and Un in decision block 302 is equal to 1, signifying Ln and Un are different, the process inquires, decision block 304, if this is the first time this has occurred (A=0). If A=0, then a 1 is inserted into the corresponding bit position of the register M1, starting the sequence (1, 0, 0, ... 0, 0), and a zero is inserted into the register M2, starting the sequence (0, 1, 1, ... 1, 1). If A=1, signifying that this is not a first occurrence of Ln and Un being different, a zero is inserted into the corresponding bit position of register M1 and a 1 is inserted in the corresponding bit position of register M2. The process then generates a new value for A depending upon the
exclusive OR of the current values of Un and Ln as indicated by block 322. The process will then index the bit count to n=n+1, block 324, then inquire, decision block 326, if the last bit (nmax) has been processed. If not, the Averaging Circuit 242 will proceed to process the next values of Un and Ln generated by the Upper and Lower Medial Value Sorters 238 and 240, respectively.
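The two-register resolution scheme of Figure 30 can be sketched as follows; the list-based registers and the function name are illustrative assumptions, and the case of a sequence left unresolved at the last bit is arbitrarily returned from M1:

```python
def bitwise_mean(upper_bits, lower_bits):
    """Bit-serial mean of the Un and Ln streams (MSB first), keeping the two
    candidate versions M1 and M2 of the mean as in Figure 30."""
    m1, m2, a = [], [], 0                 # A = XOR of the preceding Un, Ln
    for u, l in zip(upper_bits, lower_bits):
        if u == l:                        # identical pair (block 302, A check)
            bit = l if a == 0 else 1 - u  # block 312 or block 314
            m1.append(bit)
            m2.append(bit)
            if a != 0:                    # resolve the pending 1/2 sequence
                if l == 0:
                    m1 = list(m2)         # resolves to 011...11 (block 320)
                else:
                    m2 = list(m1)         # resolves to 100...00 (block 318)
        elif a == 0:                      # first differing pair (block 306)
            m1.append(1)
            m2.append(0)
        else:                             # later differing pair (block 308)
            m1.append(0)
            m2.append(1)
        a = u ^ l                         # block 322
    return m1
```

For example, Un = 110 and Ln = 000 (6 and 0) yield 011, the mean 3; Un = 110 and Ln = 010 (6 and 2) yield 100, the mean 4.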
Referring now to Figure 31, the medial values, Un and Ln, respectively, are received by an Exclusive OR gate 328 which produces a 1 at its output when Un and Ln are different and a 0 at its output when Un and Ln are the same. The output of the Exclusive OR gate 328 is transmitted to the input of a 2-bit Shift Register 330, an inverted input of NAND gate 332, and an input of AND gate 348. The Shift Register 330 temporarily stores the output of the Exclusive OR gate 328 for use in processing the next Un and Ln bits received from the Upper and Lower Medial Value Sorters 238 and 240. The delayed output of the Shift Register 330 is the value A discussed with reference to Figure 30. The upper medial bit Un is also applied to the 0 and 3 inputs of a pair of 8:1 Multiplexers 334 and 336. The other inputs to Multiplexers 334 and 336 are preset as shown. The values of A, Un, and Ln are used to address the Multiplexers 334 and 336 to output the value Un or one of the preset values. For example, if A=Ln=Un=0 then the Multiplexers 334 and 336 would both output the 0 input which is the value of Un as indicated by block 312 in Figure 30. Likewise, if A=0 and Ln=Un=1 then the Multiplexers 334 and 336 would both output the value of Un applied to the third input of the Multiplexers 334 and 336. In the first example, Un was equal to 0 and in the second example, Un was equal to 1. Note, if A is 0 and Un and Ln are different, then the Multiplexer 334 will output a 1 and the Multiplexer 336 will output a 0 as indicated by block 306. However, if A=1 and Ln and Un are different, the outputs of the Multiplexers 334 and 336 will be reversed as indicated by block 308 of Figure 30.
The outputs of the Multiplexers 334 and 336 are received
by 3:1 Multiplexers 338 and 340 as shown. The Multiplexers 338 and 340 also receive the outputs of an Inverter 342 which is the complement (Un) of the upper medial bit value Un. The outputs of the 3:1 Multiplexers 338 and 340 are received by a M1 Register 344 and M2 Register 346, respectively. The outputs of the 3:1
Multiplexers 338 and 340 are controlled by NAND gate 332 and AND gate 348. The NAND gate 332 produces a logical 1 output when the output of Exclusive OR gate 328 is 0 and the value A is 1. This actuates the 3:1 Multiplexers 338 and 340 to store the complement of the upper medial bit value Un in both the M1 Register 344 and the M2 Register 346, respectively, as indicated by block 314 of Figure 30. The AND gate 348 produces a logical 1 output when the output of the Exclusive OR gate 328 is a 1 and A is a 1, which causes the output of the Multiplexer 334 to be stored in the M2 Register 346 and the output of Multiplexer 336 to be stored in the M1 Register 344 as indicated by block 308 in Figure 30.
The output of the NAND gate 332 is also used to actuate the M1 Register 344 and the M2 Register 346 to copy the content of the M1 Register 344 into the M2 Register 346 or vice versa depending upon the value of Ln as indicated by block 316 in Figure 30. The output of the NAND gate 332 and the lower medial bit Ln are applied to the inputs of an AND gate 350, the output of which determines whether the content of the M1 Register 344 will be transferred to the M2 Register 346 or vice versa as indicated by blocks 318 and 320 of Figure 30.
The operation of the Deviance Checker 244 shall be discussed with respect to the flow diagram shown in Figure 32 and the circuit implementation shown in Figure 33. The circuit shown in Figure 33 is replicated in the Deviance Checker 244, one circuit for each Node in the system, so that the deviance checks on all the data values can be performed in parallel.
In order not to cause any significant delays in checking
the deviance, the Deviance Checker 244 processes the data being checked on a bit-by-bit basis from the most significant bit to the least significant bit as the upper medial Un and the lower medial Ln values become available from the Upper Medial Value Sorter 238 and the Lower Medial Value Sorter 240. The deviance checking process is based on the condition that A > B can be distinguished from A ≤ B by adding B to the two's complement of A and looking for an overflow at the most significant bit (MSB). In the instant application the Deviance Checker actually checks the relationship of the following equation:
M - D ≤ V ≤ M + D (1)
where: M is the medial value ½ (Ln + Un);
D is the predetermined deviance limit retrieved from the Fault Tolerator RAM 162 for the particular data value being checked; and
V is the data value being checked.
Since the solution for M - D ≤ V is substantially equivalent to the solution for V ≤ M + D, we will only discuss the latter in detail. The differences between the two solutions are well within the purview of one skilled in the art.
The process adds the four available inputs Ln, Un, D and V and looks for an overflow at the most significant bit position using the equation:
M + D - V = ½ (L + U) + D - V ≥ 0 (2)
which can be rewritten, after multiplying through by two, as:

L + U + 2D - 2V ≥ 0 (3)
This process is complicated by the fact that we are adding four bits, rather than three since the bit value of the
mean M is not available. In the addition of four bits there is the possibility that all four bits are 1's causing a double carry to the second previous bit. The solution to this is as follows:
a) A sequence such as ...110XY cannot overflow. For example, in the worst case (X=Y=1) even two double carries gives the result ...111000. Therefore, if an overflow has not already occurred, a zero (0) in the second previous bit position unconditionally indicates that no overflow will occur at the most significant bit position whatever happens to the later bits; and
b) The sequence before the second previous bit will always be 111...111 if neither an overflow nor the condition in (a) above has occurred. Therefore, a carry past the second previous bit will always cause an overflow.
The process proceeds by successively examining the value of the second previous bit B as carries from the later bits are added to it. If a carry occurs beyond the second previous bit, then an overflow occurs and V ≤ M + D. However, if the second previous bit B is 0, without a prior or current overflow, then
V > M + D. Finally, if all bits pass without either of the above conditions occurring, then the sum M + D - V is less than 0 and
V > M + D.
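A whole-word arithmetic equivalent of this bit-serial check can be sketched as below; the function name and arguments are assumptions, and the sketch simply evaluates equation (1) scaled by two rather than reproducing the MSB-first carry tracking:

```python
def deviance_check(value, upper, lower, deviance):
    """True when M - D <= V <= M + D, with M = (upper + lower) / 2.
    Working with doubled quantities (2M = upper + lower) avoids computing
    the possibly fractional mean M directly, much as the hardware works
    with D' = 2D + 1 and the two's complement of 2V."""
    twice_m = upper + lower               # 2M, always an integer
    return twice_m - 2 * deviance <= 2 * value <= twice_m + 2 * deviance
```

For example, with Un = 6, Ln = 4 (so M = 5) and D = 1, a copy with value 6 passes while a copy with value 8 fails.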
Referring now to the flow diagram shown in Figure 32, the circuit is initialized as shown in block 352 by setting the bit number n = 0, the initial sum bit S' = 1, and the interim sum bit of the second preceding bit B'-1 = 0. The process then proceeds to add 2V̄ + D' + L + U as indicated in block 354, where 2V̄ is the 2's complement of 2V, D' is 2D + 1 which is the deviance value actually stored in the Fault Tolerator RAM 162, and Un and Ln are the upper and lower medial values received from the Upper and Lower Medial Value Sorters 238 and 240, respectively. As indicated in block 354, the results of this addition produce a first previous sum bit S-1, which is the sum value obtained during the processing of the preceding data value bit, a carry bit C, and a
double carry bit C' obtained in the processing of the current data bit value.
Next, the process adds the first previous sum bit S-1, generated during the processing of the preceding data value bit, with the current carry bit C as indicated in block 356 to generate a second previous sum bit interim value B'-2 which is used in the processing of the next data value bit. The addition also produces a third carry bit C'' which is indicative of an additional carry resulting from the processing of the current bits and the first previous bits. The carry bit C'', from block 356, is added to the double carry bit C' of the current data value being processed and to the interim value B'-2. The sum and carry bits resulting from the addition of (C' + C'') + B'-2 are a carry bit A for the second preceding bit and a bit value B which is the final bit value of the second previous bit after correcting for all carries. The process then inquires if the carry bit A is equal to 1 as indicated in decision block 360; if A=1 then V ≤ M + D as previously indicated and the "pass" flag is set as indicated in block 362. However, if A=0, the process inquires, decision block 364, if B=0. If the second previous bit B is equal to zero, then there will be no overflow at the most significant bit position. Therefore, V > M + D and the "fail" flag is set indicating that the data value failed the deviance test as indicated in block 366. If B is not equal to zero, the process will proceed to check the next bit of the data value as indicated by block 368. Finally, if all of the data value bits are checked and neither the "pass" nor the "fail" flag is set, the process will automatically set the "fail" flag, ending the process.
Referring now to Figure 33, an Adder 372 adds the first 3 bits Un, Ln, and D' to produce an interim sum bit S' and a first carry bit C1. The interim sum bit S' is received by an AND gate 374 and an Exclusive OR gate 376 as indicated in equation 3. The AND gate 374 will output a second carry bit C2 which is applied to one input of an AND gate 378 and an input of an
Exclusive OR gate 380. The AND gate 378 receives the carry bit C1 from the Adder 372 at its other input. The Exclusive OR gate 380 also receives the carry bit C1 from the Adder 372 at its alternate input.
The output of the Exclusive OR gate 376 is a sum bit S which is temporarily stored in a two bit Shift Register 382 until the processing of the next data value bit. The output of the Exclusive OR gate 380 is a single carry bit C which is received at the inputs of an AND gate 384 and an Exclusive OR gate 386. The AND gate 384 and the Exclusive OR gate 386 receive the sum bit S-1 at their other inputs from the Shift Register 382. The sum bit S-1 is the sum bit S generated during the processing of the previous data value bit. The output of the Exclusive OR gate 386 is the sum of the sum bit S-1 and a carry bit generated during the processing of the current data bit, which is a preliminary bit value B'-1 stored in a second Shift Register 392. The preliminary value B'-1 is an interim value of the second preceding bit value before correction for the carry bits. The output of the AND gate 384 is a carry bit C'' which is received at an input to an Exclusive OR gate 390 which also receives at its alternate input the double carry output C' from the AND gate 378.
The output of the Exclusive OR gate 390 is received at an input to an AND gate 388 and an input to an Exclusive OR gate 394. The output of the Shift Register 392 is received at the alternate input to the AND gate 388 and Exclusive OR gate 394.
The output of the AND gate 388 is the carry bit signal "A" for the second preceding bit which is applied to the set input of an S-R flip flop 398. The Q output of the S-R flip flop 398 is applied to the D input of a D-type flip flop 400. The output of the D-type flip flop 400 is the pass-fail flag for the deviance check. If A=1 as indicated in decision block 360 of Figure 32, then the Q outputs of the S-R flip flop 398 and D-type flip flop 400 are 1's signifying that the data value (V) is less than or equal to the median (M) plus the deviance (D). If the Q output of the S-R flip flop 398
and D-type flip flop 400 are 0's, then the data value failed the deviance check.
The output of the Exclusive OR gate 394 is the final bit value B of the second preceding data value bit after corrections for single and double carries. The final bit value B is inverted by an Inverter 402 whose output is connected to the SET input of a second S-R flip flop 404. The Q output of S-R flip flop 404 is applied to one input of an AND gate 406 whose output is connected to the clock input of the D-type flip flop 400 through an OR gate 408. A clock pulse (CLK) is applied to the alternate input of the AND gate 406 and is passed to the clock input of the D-type flip flop 400 when the AND gate 406 is enabled by the Q output of the S-R flip flop 404.
A bit counter 410 counts the number of bits processed and generates an overflow pulse after all the bits have been processed. The overflow pulse is applied to the clock input of the D-type flip flop 400 through an AND gate 412 and the OR gate 408. The alternate input to the AND gate 412 is received from the Q output of the S-R flip flop 398 and is disabled when the S-R flip flop 398 is placed in its SET state by the carry signal A being a 1.
In operation, the Adder 372 produces the interim sum bit S' and the carry bit C1 resulting from the adding of Un, Ln, and D'. The AND gate 374 produces a carry bit C2 which results from adding the 2's complement (2V̄) of 2V to the sum of Un, Ln, and D'.
The carry bit C2 is combined with the carry bit C1 from the Adder 372 in AND gate 378 to produce the double carry bit C' when both C1 and C2 are 1's. The output of the Exclusive OR gate 380 is indicative of a single carry bit C from either the Adder 372 or the AND gate 374. The sum signal S-1 is the sum S output from the Exclusive OR gate 376 which is output from the Shift Register 382 during the processing of the next subsequent data bit. These are the operations specified in block 354 of Figure 32. The operations of block 356 are carried out by the AND gate 384, Exclusive OR gate 386, and Shift Register 392. The Exclusive OR gate 386 produces an interim sum value bit B'-1 from the sum bit S-1 from the Shift Register 382 and the carry bit C from the Exclusive OR gate 380. The sum bit B'-2 is the signal B'-1 output from the Shift Register 392 during the processing of the second subsequent data value bit. The carry bit C'' is the output of the AND gate 384, which is a continuation of the carry bit C when the sum bit S-1 generated in the processing of the preceding data value bit is a 1. During the processing of the next data value bit, the
Exclusive OR gate 390 and the AND gate 388 will generate the value A and the Exclusive OR gate 394 will generate the bit value B as indicated in block 358. The value of A is 0 when the interim value of the second preceding sum bit, B'-2, is 0 or when both C' and C'' are 0's indicating no carry bits C1 or C2 have resulted from the processing of the current data value bit. The value of A is 1 when C' or C'' is a 1 and the interim value of the second preceding sum bit B'-2 is 1. The value of B is 1 when B'-2 is 1 and C' and C'' are 0's or when B'-2 is 0 and C' or C'' is a 1.
When A is a 1, the S-R flip flop 398 will be set and its Q output will be a 1, which when applied to the D input of the D-type flip flop 400 will cause its Q output to become a 1. A 1 at the Q output of the D-type flip flop 400 is the pass flag as indicated in block 362. The Q output of the S-R flip flop 398 will disable the AND gate 412, preventing the overflow bit from the Bit Counter 410 from toggling the D-type flip flop 400 after the processing of the last bit. If the AND gate 412 is not disabled by the Q output of the S-R flip flop 398, the overflow bit from the Bit Counter 410 will toggle the D-type flip flop 400, changing its Q output from a 1 to a 0. A 0 Q output of the D-type flip flop 400 is the fail flag, as indicated by block 366.
The function of the decision block 364 is carried out by the Inverter 402, the S-R flip flop 404 and the AND gate 406. When B is 0, the Inverter 402 will cause the S-R flip flop 404 to
be placed in the set state causing its Q output to be a 1. A 1 from the Q output of the S-R flip flop 404 enables the AND gate 406 to pass a clock (CLK) pulse which will toggle the D-type flip flop 400 through the OR gate 408. The Q output of the D-type flip flop 400, in the absence of the Q output of the S-R flip flop 398 being a 1, will go to a low or 0 signal. A low or 0 Q output of the D-type flip flop 400, as previously indicated, is the fail flag indicated in block 366. The pass or fail flag is passed to the Fault Tolerator 36 through the Voter Fault Tolerator Interface 248 as a deviance error (DERR).
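As an illustrative sketch only (not part of the specification), the combinational relationships described above for the carry flag A and the final bit B reduce to two boolean functions of the interim sum bit B'-2 and the single and double carry bits C' and C''. The variable names are assumptions chosen for readability:

```python
def carry_and_sum(b2: int, c_prime: int, c_dprime: int) -> tuple:
    """Model of the logic attributed to gates 388, 390 and 394.

    b2       -- interim sum bit B'-2 of the second preceding data value
    c_prime  -- double carry bit C'
    c_dprime -- continuation carry bit C''
    Returns (A, B): A = 1 only when a carry arrives while B'-2 is 1;
    B is B'-2 flipped whenever a carry arrives.
    """
    carry_in = c_prime | c_dprime   # a single or double carry occurred
    a = b2 & carry_in               # A = 1 iff B'-2 = 1 and a carry arrived
    b = b2 ^ carry_in               # B = corrected final bit value
    return a, b
```

The four cases enumerated in the text (A is 0 when B'-2 is 0 or no carries occurred; B is 1 when exactly one of B'-2 and the carry is 1) all follow from these two expressions.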
SCHEDULER
The Scheduler 40 has two modes of operation, a normal mode and a reconfiguration mode. In the normal mode, the Scheduler 40 schedules the application tasks for each operating Node in the system including its own, and monitors the execution of these tasks. The reconfiguration mode is entered whenever the Fault Tolerator 36 determines that one or more Nodes are to be excluded from or readmitted to the operating set. The two modes interact through an activation status which defines which tasks are eligible for execution by each Node. The reconfiguration mode modifies the activation status, whereas the normal mode utilizes the activation status to schedule the tasks.
During normal mode operation, the Scheduler 40 implements a dynamic, priority-based, non-preemptive task scheduling process. Concurrent programming practices and the resolution of inter-task dependencies are supported at the boundaries between the tasks. Task-to-node allocation is static for any given System State (configuration), but the sequencing of tasks and the resolution of dependencies are performed dynamically. The Scheduler 40 in each Node replicates the scheduling process for every active Node in the system. Fault detection mechanisms permit each Node to recognize erroneous behavior in the sequencing or timing of the tasks executed by any Node.
During reconfiguration, tasks may be reallocated among the operating Nodes. Tasks may also be added to or deleted from the active task set to conform to the changes in the overall system capabilities.
During start up or reset of the Operations Controller 12, the Scheduler 40 enters the reconfiguration mode with the assumption that no Nodes are operating. When the Fault Tolerator 36 recognizes an "operating set," that information is passed to the Scheduler 40 as a new System State Vector. The Scheduler then reconfigures the tasks in accordance with the received new System State Vector. By using this method the operation of the Scheduler 40 is self-bootstrapping.
A block diagram of the Scheduler 40 is shown in Figure 34. A Task Selector Module 414 receives information from the Fault Tolerator 36 through a Fault Tolerator Interface 416, from the Synchronizer 46 through a Synchronizer Interface 418, and from the Task Communicator 44 through a Task Communicator Interface 420. The Task Selector Module 414 also communicates with a Scheduler RAM 422 and a Scheduler ROM 424 through a Memory Interface 426.
A Reconfiguration Module 428 is responsive to the reception of a new System State Vector from the Fault Tolerator 36 to reallocate the task to be selected and executed by the new set of operating Nodes. The Reconfiguration Module 428 will change the activation status of the tasks stored in the Scheduler RAM 422 using predetermined information stored in the Scheduler ROM 424.
A map of the Scheduler RAM 422 is shown in Figure 35. The entry Old TID contains an entry for each Node in the system and stores the TID previously started by that Node. The Swap Table entry contains an entry for each task (TID) and stores a predecessor count, which is the total number of immediate predecessors to that particular task; a periodicity corresponding to how many Atomic periods must pass between executions of the task; and two swap count numbers which are used to swap or change the active status of a task on a particular Node, as shall be explained during the discussion relative to reconfiguration.
The Allocation Table stores an allocation count for each task-node pair in the system and is used in the reconfiguration process to determine the distribution of the active tasks among the Nodes.
The Selection Queue 450 has 3 pages, NEXT, PREVIOUS and CHECK. Each page contains three entries for each Node corresponding to the three highest priority tasks currently ready for execution by that Node. "Used" is a Boolean value indicating whether the current iteration of the task in the entry has been started by that Node, ITER is the iteration number of that task in the entry, and TID is the task identification code for that task. The NEXT page is the entry from which the next task to be executed for each Node is selected, the PREVIOUS page lists the tasks selected during the preceding Subatomic period, and the CHECK page contains the tasks selected during the second preceding Subatomic period for that Node. The pages are rotated at the beginning of each Subatomic period, and the newly selected task for each Node is stored in the NEXT page.
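The three-page structure and its per-Subatomic-period rotation can be sketched as follows. This is an illustrative model, not the specified implementation; the class and field names are assumptions:

```python
class SelectionQueue:
    """Sketch of the three-page Selection Queue 450.

    Each page holds, for every Node, three entries for the three
    highest-priority tasks ready for execution by that Node.
    """

    def __init__(self, num_nodes: int):
        def blank_page():
            return [[{"used": False, "iter": 0, "tid": None}
                     for _ in range(3)] for _ in range(num_nodes)]
        self.next, self.previous, self.check = (
            blank_page(), blank_page(), blank_page())

    def rotate(self):
        # At the start of each Subatomic period:
        # NEXT -> PREVIOUS -> CHECK; the old CHECK page is reused as
        # the new NEXT page and will be overwritten with fresh selections.
        self.next, self.previous, self.check = (
            self.check, self.next, self.previous)
```

After two rotations, a task written into NEXT is found on the CHECK page, which is where the TIC Handler later verifies confirmed starts.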
The Completion Status List contains, for each task, a completion count which corresponds to the number of copies of that task that have been completed, the branch condition count which stores a number corresponding to the number of received Task Completed/Started messages in which the branch condition has a value of 1 and an allocation entry which contains the allocation of that task among the various Nodes.
The Priority Scan List stores for each task the predecessor count which is the number of preceding tasks which have to be completed before that task can be executed, the iteration number of that task and its allocation. The Task Activity List entry stores for each task the predecessor count, the periodicity
of the task, and its allocation.
A map of the Scheduler ROM 424 is shown in Figure 36. The first entry is the Successor List which lists the successor tasks for each terminated task. This list is accessed by the address of the Successor Offset as shall be explained hereinafter. There are two Successor Lists, one for each of the two possible branch conditions. The next four entries are the Preference Vectors for each task and identify those Nodes preferred for the execution of that task. The Relevance Vector contains two entries, the first INCLUDE/EXCLUDE identifies whether the task is to be executed by the Nodes included in the Operating Set or executed by the Nodes excluded from the Operating Set, and a Relevance Vector which identifies to which Nodes the task is relevant. The Initial Swap Table entry contains for each task, the initial predecessor count, the periodicity, and the initial swap counts for each task which are loaded into the Task Activity List of the Scheduler RAM 422 during reset or reconfiguration as shall be discussed later.
The next two entries are the Initial Allocation Counters for each task and list the initial allocation count or toggle point for each task-node combination. These values are loaded into the Allocation Tables in the Scheduler RAM 422 following reset or power-up. The entry Maximum Execution Time Table stores the 2's complement of the maximum execution time for each task and is loaded into the execution timer, for that Node, when the task is started. The entry Minimum Execution Time Table stores the 2's complement of the minimum execution time for each task and is used to check the execution time of each task when it is reported as being completed. The Successor Offset entry contains for each task, the starting address in the Successor List where the successor tasks are stored. Finally, the Initializing Table entry stores the maximum Node Identification code (NID) and the maximum Task Identification code (TID) used in the system which are used to identify when a particular operation is completed.
Figure 37 shows the details of the Task Selector Module 414. The NID and started TID fields of the Task Completed/Started messages are transferred directly from the Fault Tolerator Interface 416 to the Task Communicator Interface 420, and are also temporarily stored in an On-Board RAM 430. A Completed/Started Handler 432 transfers the TID and NID of each task identified in a Task Completed/Started message from the On-Board RAM 430 to a Started TID Register 434 shortly after the end of the Soft Error Window (SEW) at the end of each Subatomic period. This is the period of time when all non-faulty Operations Controllers are transmitting their Task Interactive Consistency or System State messages and all the Task Completed/Started messages from the preceding Subatomic period should have been received. The Started TID Register 434 for each Node is a 3-deep queue in which the new NID and TID are added to the tail of the queue and removed from the head.
The Task Selector Module 414 also has a TIC Handler 436 which is responsive to the Byzantine voted values of the task completed vector and the branch condition bits of the Task Interactive Consistency (TIC) messages. This data, received from the Byzantine Voter in the Synchronizer 46, is used to update a Selection Queue 450 and a Completion Status List 438. The Task Selector Module 414 further includes a Wake-up Sequencer 440, responsive to the various period signals generated by a Period Counter 442, for transferring active tasks from a Task Activity List 444 to a Priority Scan List 446 and to the Completion Status List 438; a Priority Scanner 448 which selects the tasks in the Priority Scan List 446 which are placed in the Selection Queue 450; the Next Task Selector 452 which selects the highest priority task in the Selection Queue 450 and places it in a Next Task Register 454 from where it is transferred to the Task Communicator 44 for execution by the Applications Processor; an Execution Timer 456 which monitors the execution time of each task being executed by the individual Nodes in the system; and an Old TID List 458 which stores the current task being executed by each Node. The Task Activity List 444, the Priority Scan List 446, the Completion Status List 438, the Selection Queue 450 and the Old TID List 458 are embodied in the Scheduler RAM 422 as discussed
relative to Figure 35.
The operation of the Wake-up Sequencer 440, the Execution Timer 456, the TIC Handler 436, the Priority Scanner 448, and the Next Task Selector 452 will be discussed relative to the flow diagrams shown in Figures 38 through 46. The operation of the Completed/Started Handler 432 is relatively simple in that it transfers the content of the On-Board RAM 430 to the Started TID Register 434 at the beginning of each Subatomic period.
The flow diagram shown in Figure 38 describes the operation of the Wake-up Sequencer 440. The process begins by repeatedly inquiring if the Subatomic period is the last Subatomic period (LSAP) or is the third Subatomic period (SAP), as indicated by inquiry blocks 460 and 462. If it is the last Subatomic period, the process initializes the TID pointer to the Task Activity List 444 to 0, as indicated by block 464. The process then inquires, as indicated by decision block 468, if the periodicity of the task TID is less than the period indicated by the Period Counter 442. If it is, the Priority Scan List is initialized, as indicated in block 470. The Priority Scan List iteration is set equal to the current iteration for that task, the predecessor count is set equal to the predecessor count contained in the Task Activity List, and the allocation is set equal to the allocation contained in the Task Activity List. The process then proceeds to inquire, as indicated by decision block 472, if the task just processed was the last task. If it is, the operation of the Wake-up Sequencer 440 is completed; otherwise, the process will index to the next task on the Task Activity List 444 as indicated by block 480 and again check if the periodicity of that task is less than the period of the Period Counter, as indicated by decision block 468. If the TID period is greater than the Period Counter then the task is not entered into the Priority Scan List 446 and the task pointer is indexed to the next task in the Task Activity List as indicated by block 480. The last task in the Task Activity List 444 is a null task which has a periodicity of 0. Thus the last task will always be entered into the Priority Scan List 446 as indicated by block 470 when
there is no other task whose periodicity is less than the period of the Period Counter 442.
If the period indicated by the Period Counter 442 is the third Subatomic period, the Wake-up Sequencer 440 will again initialize the pointer to the Task Activity List to the first task as indicated by block 482. The Wake-up Sequencer will then inquire, as indicated by decision block 484, if the periodicity of the task is less than the period indicated by the Period Counter 442. If it is, the Wake-up Sequencer will initialize the Completion Status List 438, as indicated by block 486. It will then set the iteration in the Completion Status List to 0, the Branch Condition List to 0, and set the allocation to the allocation indicated in the Task Activity List. The Wake-up Sequencer 440 will then inquire, as indicated by decision block 488, if it is the last task in the Task Activity List. If it is, the operation of the Wake-up Sequencer 440 is completed. Otherwise the TID pointer in the Task Activity List will be indexed to the next task as indicated by block 490 and the above procedure will be repeated. If the periodicity of the task is greater than the period indicated by the Period Counter 442, the Completion Status List 438 will not be updated and the pointer to the task in the Task Activity List will be indexed to the next task. When the pointer in the Task Activity List is indexed to the last task, it will always be entered into the Completion Status List since it has a periodicity of 0.
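The wake-up step that populates the Priority Scan List can be sketched as a simple filter over the Task Activity List. This is an illustrative model under assumed field names; the "less than" comparison follows the text, which is why the null task with periodicity 0 is always entered:

```python
def wake_up(task_activity: dict, period_count: int) -> dict:
    """Sketch of the Wake-up Sequencer 440 building the Priority Scan List.

    task_activity maps TID -> {"periodicity", "iter", "pred_count", "alloc"}.
    A task is entered when its periodicity is less than the Period
    Counter value; its iteration, predecessor count and allocation are
    copied from the Task Activity List.
    """
    psl = {}
    for tid, t in task_activity.items():
        if t["periodicity"] < period_count:   # task is due this period
            psl[tid] = {"iter": t["iter"],
                        "pred_count": t["pred_count"],
                        "alloc": set(t["alloc"])}
    return psl
```

A parallel pass (the third-Subatomic-period branch) would initialize the Completion Status List entries the same way.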
After the wake-up process is completed, the Execution
Timer 456 will check the execution timer for each Node as shown in the flow diagram in Figure 39. As previously indicated, the execution time for the task being executed by each Node is stored as the 2's complement of the maximum execution time. This is done because with current technology it is easier to increment the time rather than decrement the time. The operation of the Execution Timer 456 begins by initializing the timer pointer to the first Node, as indicated in block 492. The Execution Timer will then increment the time stored for each Node by one (1), as indicated in block 494. The Execution Timer 456 will then check each timer for the
time remaining for the execution of the task, as indicated by decision block 496. If the timer for any particular Node is equal to 0, then the timer will set an error flag for that Node to true. This information is then sent to the TIC Handler 436 before it is passed to the Fault Tolerator Interface 416 for reasons which shall be explained later. If the current time is not equal to 0, the Execution Timer 456 will inquire, as indicated by decision block 500, if it has checked the last Node and, if it has, it will exit the execution timer process. Otherwise it will increment the Node pointer to the next Node, as indicated by block 502, and check the current time of the next Node.
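The 2's complement trick described above (load the negated maximum time, count up, and detect overflow at 0) can be sketched as follows. The timer width is an assumption; the specification does not state it:

```python
TIMER_BITS = 16          # assumed timer width
MOD = 1 << TIMER_BITS

def load_timer(max_execution_time: int) -> int:
    # Store the 2's complement of the maximum execution time so that
    # the hardware can increment instead of decrement.
    return (-max_execution_time) % MOD

def tick_all(timers: dict, time_error: dict) -> None:
    """One Subatomic-period pass of the Execution Timer 456 (sketch).

    Each Node's timer is incremented by one; a timer that wraps to 0
    has overflowed, meaning the maximum execution time has elapsed,
    and the Node's time error flag is set.
    """
    for node in timers:
        timers[node] = (timers[node] + 1) % MOD
        if timers[node] == 0:
            time_error[node] = True
```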
The operation of the TIC Handler 436 will be described with reference to Figures 40 through 44. The TIC Handler responds to the arrival of the voted value of the Task Interactive Consistency message and modifies the main data structure based upon that data. It treats the occurrence of a time error as equivalent to a confirmed completion so that a stalled or permanently hung copy of a task does not hold up the rest of the work load. The operation of the TIC Handler 436 starts following the completion of the Execution Timer checks and the receipt of the Byzantine data from the Synchronizer 46. The TIC Handler 436 selects a Node for which either a confirmed completion or an overtime error has been reported. If a confirmed completion has been reported, the TIC Handler clears the timer error bit associated with that Node since the completion was confirmed during the same Subatomic period in which the timer expired. The TIC Handler then searches the CHECK page of the Selection Queue 450 for the TID of the first unused task encountered for the Node which was reported to have completed a task. This is the TID of the task which the Node should have started. If this TID does not match the TID currently stored in the Started TID Register 434 for that Node, then a sequence error is recorded. Finally, the TIC Handler calls each of its sub-processes, Selection Queue Update, Completion/Termination, Execution Timer Reset, and Priority Scan Update, and sequentially updates the data structure for the selected Node.
The TIC Handler process is repeated for each Node.
As shown in Figure 40, the operation of the TIC Handler begins by inquiring if the Byzantine data is available, as indicated by decision block 504. If it is not available, the TIC Handler 436 will wait until it does become available. Otherwise the TIC Handler will initialize the pointer to the CHECK page of the Selection Queue 450 to the first Node position, as indicated by block 506. The process will then inquire, as indicated by decision block 508, if the Node completed a task, as indicated by the Byzantine data. In parallel, if the Byzantine data did not indicate that a task was completed by that Node, the process will check to see if a time error had occurred, as indicated in decision block 524. If the Byzantine data indicated that the Node did not complete a task and there was no time error, the process will increment the Node pointer to the next Node, as indicated by block 526. The process will then check to determine if it had investigated the last or the maximum Node as indicated by block 528. If it was the last Node, it will exit the program, otherwise it will proceed to check the next Node to see if it had completed a task or a time error had occurred.
When a Node has completed a task and a time error has been recorded for that Node, the TIC Handler will set the time error to false, as indicated by block 510 since the task was completed in the same Subatomic period in which the time error was detected. Therefore, the time error is invalid and it is cancelled. If either the Node had completed a task or a time error had occurred, the process will then mark as used the first unused entry for that Node found in the CHECK page of the Selection Queue, as indicated by block 512. It will then store as the current TID the TID of the entry that had just been marked used and it will store the current iteration as the iteration of that same entry, as indicated by block 514. The process will then check to determine that the current task is also the same task that was reported by that Node in its last Task Completed/Started message which was stored in the Started TID Register 434 as shown
in Figure 37. If the current task and the task reported as completed in the last Task Completed/Started message for that Node are not the same, the TIC Handler 436 will set the Sequence Error flag to "true," as indicated by block 520. The process will then call the Selection Queue (SQ) Update sub-process, as indicated by block 518, and wait for the completion of the Priority Scan List (PSL) Update sub-process, as indicated by block 522. When the Priority Scan List Update is completed, the process will then index a Node pointer to the next Node, as indicated by block 526, and then check to see if it has processed the last Node, as indicated by decision block 528.
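The sequence-error check at the heart of this loop (blocks 512-520) can be sketched as follows. The data layout is an assumption; the behavior when no unused entry exists is not specified in the text and is flagged as such:

```python
def check_sequence(check_page_node: list, started_tid: int) -> tuple:
    """Sketch of TIC Handler blocks 512-520 for one Node.

    Finds the first unused entry in the Node's CHECK page, marks it
    used, and reports a sequence error when its TID differs from the
    TID the Node reported in its last Task Completed/Started message.
    Returns (current_tid, current_iter, sequence_error).
    """
    for entry in check_page_node:
        if not entry["used"]:
            entry["used"] = True
            return entry["tid"], entry["iter"], entry["tid"] != started_tid
    # No unused entry: treated here as a sequence error (assumption).
    return None, None, True
```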
The sub-process Selection Queue Update for the TIC Handler searches the NEXT and the PREVIOUS pages of the Selection Queue 450 for the Nodes selected by the TIC Handler. When an entry is found containing both the current task and the current iteration, it is marked "used." Such entries may or may not be found because the tasks with a higher priority than the current task may have become available between the generation of the CHECK page and the generation of the PREVIOUS or NEXT page. It is not necessary to mark the CHECK page entry since it will not be accessed again before it is refreshed. The Selection Queue Update sub-process begins by initializing the pointer to the PREVIOUS page to the 0 entry, as indicated by block 530. The process will then index the entry to the first entry, as indicated by block 532, and will inquire if the current TID and iteration are equal to the TID and iteration of the entry, as indicated in decision block 534. If they are the same, then the entry "used" is marked "true," as indicated by block 536. Otherwise the process will inquire if it has checked all of the three entries of the PREVIOUS page, as indicated by decision block 538. If it has not checked all of the entries on the PREVIOUS page of the Selection Queue 450, it will proceed to index the entry to the second entry and so on until it has checked all three entries on the PREVIOUS page. After either finding the TID in one of the entries in the PREVIOUS page or completing checking the PREVIOUS page and not finding an
entry, the program will then proceed to the NEXT page of the Selection Queue 450 and again will set the pointer to the 0 entry, as indicated by block 540. It will index the entry, as indicated by block 542, then inquire if the current TID and iteration are the same as the TID and iteration of the entry, as indicated by block 544. If they are, it will mark the entry "used" as "true," as indicated by block 546. Otherwise the process will then inquire if it has checked all three entries, as indicated by decision block 548. If it has not, it will then index the pointer to the next entry and continue to investigate until it has either found the current TID and iteration in the entry or it has checked all three entries. The process will then call the Completion/Termination (CT) sub-process, as indicated by block 550.
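The Selection Queue Update sub-process amounts to a search of the PREVIOUS and NEXT pages for a matching (TID, iteration) entry. A sketch under the same assumed data layout as above:

```python
def selection_queue_update(pages: list, node: int,
                           cur_tid: int, cur_iter: int) -> None:
    """Sketch of the Selection Queue Update sub-process.

    pages is [previous_page, next_page]; each page maps node index to
    a list of entry dicts. Any entry for the Node matching the current
    TID and iteration is marked "used". The CHECK page is deliberately
    not touched: it is refreshed before it is read again.
    """
    for page in pages:
        for entry in page[node]:
            if entry["tid"] == cur_tid and entry["iter"] == cur_iter:
                entry["used"] = True
```

As the text notes, a match may legitimately be absent: a higher-priority task may have displaced the entry between page generations.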
The TIC Handler sub-process Completion/Termination records the completion of each copy of a task in the Completion Status List. If the final copy has been completed (or timed out), then the task is "terminated." The Successor List entries associated with the terminated task and the majority branch conditions are accessed via the base address in the Successor-Offset List, as indicated in Figure 36. The predecessor count for each successor of the terminated task is then decremented. If the branch conditions generated by the various copies result in a tie, then the branch condition 0 is selected by default.
The TIC Handler 436 retains an old valid bit for each Node indicating whether the TID listed in the Old TID section of the Scheduler RAM 422, as shown in Figure 35, is a valid Old TID or not. All bits of the old valid are set to false during system reconfiguration to indicate that the next task to be executed by each Node is the first task and that there are no previous tasks to process. The old valid is set to true after the confirmed start of the first task on the Node and before the confirmed start of the second task on the Node.
If the old valid is false, then the started task is the first task being executed on that Node following a reconfiguration. Therefore, there is no completed task to process and the Completion/Termination sub-process need not be executed. Similarly, if the completed task is a null task, there is no need to terminate the task. In the flow diagram shown in Figure 41, the point at which the task is compared to the maximum task is the latest point at which the comparison can be made without potentially reporting a termination of a null task, and this makes the content of the maximum task entry on the Completion Status List irrelevant.
Referring now to Figure 42, the sub-process
Completion/Termination begins by checking the old valid flag for the Node, as indicated by block 552. As previously indicated, if old valid is not true, the process will then proceed to the next sub-process Execution Timer Reset as shall be discussed hereinafter. However, if old valid is true, the process will then record the completion of the task using the TID stored in Old TID as the TID of the completed task, then accessing the Completion Status List 438 and setting the allocation for that TID-NID pair to false, as indicated by block 554. The process will then inquire, as indicated in decision block 556, if the branch condition is equal to 1. If it is, it will then increment the branch condition entry in the Completion Status List 438, as indicated by block 558. However, if the branch condition is equal to 0, the process will proceed to inquire, as indicated by decision block 560, if all of the copies of that task have been completed. This is indicated by all the entries in the allocation section of the Completion Status List being set to false.
If all of the copies of the task have been completed, the sub-process will proceed to report to the Task Communicator the identity of the terminated task, as indicated by block 562. After reporting the termination of the task to the Task Communicator 44, the process will then get the address of the first successor task from the Successor-Offset entry contained in the Scheduler ROM 424, as indicated by block 564. The process will then inquire, as indicated by decision block 566, if the successor
task is equal to the maximum successor task which corresponds to the end of the Successor Task List for the terminated task. If that is the end of the Successor Task List, the program will then proceed to call the Execution Timer Reset sub-process, as indicated by block 572. If the successor task is not the maximum TID listed on the Successor List for the terminated task, the process will continue to update the Completion Status Table by decrementing the predecessor count for each successor task by 1, as indicated by block 568. The process will then increment the address to the Successor List, as indicated by block 570 and proceed to analyze the next task on the Successor List.
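The completion bookkeeping and successor fan-out described above can be sketched as one function. This is an illustrative model under assumed field names, not the specified hardware flow:

```python
def record_completion(tid: int, node: int, branch_condition: int,
                      status: dict, successors: dict) -> bool:
    """Sketch of the Completion/Termination sub-process for one copy.

    status maps TID -> {"alloc": set of Nodes still to complete,
                        "branch_count", "pred_count"}.
    successors maps TID -> list of successor TIDs (one list per
    majority branch condition in the real Successor List).
    Returns True when the task is terminated (all copies done).
    """
    entry = status[tid]
    entry["alloc"].discard(node)            # this copy has completed
    if branch_condition == 1:
        entry["branch_count"] += 1
    if not entry["alloc"]:                  # final copy: terminate
        for succ in successors.get(tid, []):
            status[succ]["pred_count"] -= 1 # release each successor
        return True
    return False
```

A successor whose predecessor count reaches 0 becomes eligible for selection by the Priority Scanner.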
The Execution Timer Reset sub-process of the TIC Handler 436 checks the execution timer for each Node for a minimum time error and reloads the timer for the newly started task. If the old valid flag for that Node is false, then there is no completed task and the error is not recorded. The Execution Timer Reset is the last process to access the Old TID entry in the Scheduler RAM 422. It is, therefore, a convenient place in which to copy the current TID and to set the flag old valid true.
Figure 43 is a flow diagram showing the process executed by the Execution Timer Reset sub-process. The process begins by setting the TID equal to the Old TID for that particular Node, as indicated by block 574. The process then compares, as indicated by decision block 576, the current execution time for that TID with the minimum time. If the current execution time is greater than the minimum execution time it then inquires, as indicated by block 578, if the old valid flag is true. If old valid is true, then the Execution Timer Reset sub-process will set the time error flag for that Node to "true," as indicated by block 580. If the current execution time is not greater than the minimum time or if the old valid flag is not true or if a time error has been recorded, the process will then reset the Execution Timer, as indicated by block 582, by setting the current time for that Node equal to the maximum time for the current task which is contained in the Scheduler ROM 424 in the entry entitled Maximum Execution
Time Table, as shown in Figure 36. The process will then update the Old TID entry in the Scheduler RAM 422 by setting the Old TID for that Node equal to the current TID, as indicated by block 584, then set the old valid flag for that Node as true, as indicated by block 586. The process will then proceed to call up the Priority Scan List Update sub-process, as indicated by block 588.
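The Execution Timer Reset logic can be sketched in plain elapsed-time terms (the hardware actually holds 2's-complement values, as described earlier); the state layout and names are assumptions:

```python
def execution_timer_reset(node: int, state: dict, cur_tid: int,
                          min_time: dict, max_time: dict) -> None:
    """Sketch of the Execution Timer Reset sub-process for one Node.

    Flags a minimum-time error when the completed task finished too
    quickly, reloads the timer for the newly started task, and rolls
    Old TID forward (the last step that touches Old TID, so old valid
    is set true here).
    """
    old = state["old_tid"].get(node)
    if state["old_valid"].get(node, False):
        # elapsed = maximum allowed time minus time still remaining
        elapsed = max_time[old] - state["remaining"][node]
        if elapsed < min_time[old]:          # completed suspiciously fast
            state["time_error"][node] = True
    state["remaining"][node] = max_time[cur_tid]  # reload for new task
    state["old_tid"][node] = cur_tid
    state["old_valid"][node] = True
```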
The Priority Scan List Update sub-process of the TIC Handler 436 records the start of the current task on the Node by clearing the Priority Scan List current TID allocation for the Node. This process makes the current task ineligible for reexecution by that Node until its next iteration, at which time the Wake-up Sequencer 440 reinitializes the Priority Scan List entry for the task. Two conditions must be satisfied before the update is performed: 1) the started task must not be a null task, since a null task must always be available and may never be removed from the Priority Scan List; and 2) the iteration number of the started task is the same as the iteration number in the Priority Scan List. The two iteration values may differ within the first three Subatomic periods of an Atomic period if the task ran during the last three Subatomic periods of the previous Atomic period.
Figure 44 is a flow diagram showing the procedure executed by the TIC Handler 436 in the execution of the Priority Scan List Update. The process begins by inquiring if the entry is current, as indicated by block 590. If the entry is current, the process will then proceed to inquire if the current task is a null task (maximum TID), as indicated by block 592. If the current task is not a null task, the Priority Scan List is updated by recording that the Node has started that particular task, as indicated by block 594. Effectively, the process sets the flag in the allocation entry of the Priority Scan List for that particular Node to false. If the entry is not current, or if the task is a null task, the process returns, as indicated by block 596, to the TIC Handler process illustrated in Figure 40.
The Priority Scanner 448 selects a candidate task for
the next Subatomic period based on the latest confirmed data about the progress of the application work load. The operation of the Priority Scanner 448 follows the updating of the Priority Scan List by the TIC Handler 436. The Priority Scanner 448 will first rotate the page pointers of the Selection Queue 450 then select three tasks for each Node by scanning the Priority Scan List in the order of increasing TID's. In the Priority Scan List the highest priority tasks have the lower TID numbers and the lowest priority tasks have the higher TID numbers. The selected tasks are then written into the NEXT page of the Selection Queue for their respective Nodes.
The operation of the Priority Scanner 448 begins by rotating the pointers in the Selection Queue 450, as indicated by block 598. The Priority Scanner then sets all of the Node entry pointers to the first entry, as indicated by block 600. It then starts at the top of the TID list with the first task, as indicated by block 602. The Priority Scanner 448 then inquires, as indicated by block 604, if the precedent count for that task is equal to zero, indicating that all of the predecessor tasks have been completed. If so, the Priority Scanner 448 will investigate if the task has been previously started on that particular Node, as indicated by decision block 606. If the task has not been previously started on that Node, the Priority Scanner will then inquire if that particular Node already has three entries, as indicated by block 608. If it does have three entries, it will then check to see if that Node was the last Node, as indicated by block 610. If it is not the last Node, it will then index to the next Node, as indicated by block 612, and will proceed to check the entries for the next Node. If the Node being evaluated is the last Node, the Priority Scanner 448 will proceed to check if every Node has its three entries, as indicated by block 618. If every Node has three entries, then the operation of the Priority Scanner is completed and it will exit. However, if not all of the Nodes have three entries, then the Priority Scanner 448 will inquire, as indicated in block 620, if it has processed the last task. If it
has processed the last task, then it will fill all the remaining entries with the null task which is the maximum TID, as indicated by block 622. However, if the TID is not the maximum or last task in the list, the process will increment the TID number and will repeat.
Referring back to decision block 608, if a particular Node does not yet have three entries, the process will copy the TID and iteration from the Priority Scan List to the NEXT page of the Selection Queue 450 for that Node, as indicated by block 614. It will then increment the entry for that Node, as indicated by block 616, and then inquire, as indicated by decision block 610, if that Node was the last Node. If it is not the last Node, the process will proceed to the next Node, as indicated by block 612; otherwise it will check if the entries for all the Nodes are full, as indicated by decision block 618.
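The scan described above amounts to a bounded greedy selection over the TID list. A sketch follows; the names and data shapes are hypothetical, and only the increasing-TID (decreasing-priority) order, the three-entry limit per Node, and the null-task padding are taken from the text.

```python
NULL_TID = 0xFF  # assumed encoding of the null task (the maximum TID)

def priority_scan(scan_list, nodes, entries_per_node=3):
    """Select up to three ready tasks per Node in increasing-TID order
    (lowest TID = highest priority) and pad short lists with the null
    task, producing the NEXT page of the Selection Queue."""
    next_page = {n: [] for n in nodes}
    for tid in sorted(scan_list):
        task = scan_list[tid]
        # Block 604: only tasks whose predecessors are all complete.
        if task["precedent_count"] != 0:
            continue
        for n in nodes:
            # Blocks 606/608: not yet started on this Node, and the
            # Node's entry list is not yet full.
            if task["allocated"].get(n) and len(next_page[n]) < entries_per_node:
                next_page[n].append((tid, task["iteration"]))
    # Block 622: fill any remaining entries with the null task.
    for n in nodes:
        while len(next_page[n]) < entries_per_node:
            next_page[n].append((NULL_TID, 0))
    return next_page
```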
The Next Task Selector 452 examines the first entry of the NEXT page of the Selection Queue 450 for its own Node (NID). If that task has not been previously started by its own Node, then it records that task in its Next Task Register 454 which is passed to the Task Communicator 44 through the Task Communicator
Interface 420 when requested by the Applications Processor. If the task has been previously started, then the next entry on the NEXT page of the Selection Queue 450 is examined for the same criteria. The process continues until an entry is found which has not been executed, or until the third entry has been examined.
Since the Selection Queue 450 is not updated until the third Subatomic period after a task is started, the Next Task Selector must maintain a local record of tasks started on its own Node. The TID's of the previous tasks started are maintained in a two entry deep stack to record the previous two tasks actually started by the Node's Task Communicator 44. The Scheduler 42 receives immediate notification from the Task Communicator whenever a task is started. It then pushes the currently selected task onto the previous TID stack allowing the oldest entry to fall
off the bottom of the stack. The operation of the Next Task Selector 452 is triggered by the beginning of the soft-error window, while the Transmitter is occupied with the transmission of a Task Interactive Consistency or a System State message. Therefore, the Task Communicator cannot transmit a task
Completed/Started message or start the selected task while the Next Task Selector 452 is modifying the selected task. The Next Task Selector 452 is the only module in the Scheduler which has access to its own Node Identification (NID) code.
The operation of the Next Task Selector 452 will be discussed with reference to the flow diagram shown in Figure 46. The operation of the Next Task Selector begins with the setting of the entry pointer of the NEXT page to its own NID and to entry 0, as indicated by block 626. The Next Task Selector then increments the entry pointer to the first task, as indicated by block 628, and records as the selected task the task that is entered for its own Node in the entry of the Selection Queue 450, as indicated by block 630. The Next Task Selector will then inquire, decision block 632, if this is the third entry in its own entry of the NEXT page. If it is, it will store the selected task in the Next Task Register 454. However, if it is not the third entry, the Next Task Selector will inquire, as indicated by decision block 636, if the selected task and iteration are the same as the first or second previously selected task and iteration. If the selected task and iteration are the same as a first or second previously selected task and iteration, the Next Task Selector will proceed to increment the entry and examine the next task in the Selection Queue, as indicated by block 628. However, if the selected task and iteration were not previously selected, the Next Task Selector will store the selected task in the Next Task Register 454, as indicated by block 634, completing the selection process.
It can be seen from the above flow diagram that if the first two entries in the Selection Queue 450 have been previously executed by this Node, the Next Task Selector 452 selects the third entry regardless of its previous selection status. This
feature allows multiple entries of the null task to be placed in the Selection Queue simultaneously in the event there are no other tasks ready to run. Thus, when no other tasks are ready to execute, the Node will start the null task every Subatomic period until another task becomes available.
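The selection rule of blocks 626 through 636, including the unconditional acceptance of the third entry, might be sketched like this (a minimal illustration under assumed data shapes; entries are hypothetical `(TID, iteration)` pairs):

```python
def select_next_task(next_page_entries, previous_two):
    """Walk this Node's (up to three) entries in the NEXT page.  Skip
    any entry matching one of the two previously started tasks, but
    accept the third entry unconditionally, so repeated null-task
    entries keep the Node busy when nothing else is ready."""
    for i, entry in enumerate(next_page_entries):
        if i == 2:
            return entry  # third entry: selected regardless of history
        if entry not in previous_two:
            return entry
```

With three null-task entries queued and the null task in the recent history, the selector still returns the null task, matching the behavior described above.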
The operation of the Task Selector Module begins with the end of the Soft Error Window (SEW), at which time no Task Completed/Started messages should be arriving from the non-faulty Nodes. First, the Completed/Started Handler will transfer the content of the Task Completed/Started messages stored in the On-Board RAM 430 to the Started TID Register 434 before the earliest possible arrival of the Task Completed/Started messages for the next Subatomic period. All of the other processes executed by the submodules, with the exception of the Next Task Selector 452, must be completed before the beginning of the next Soft Error Window. The operation of the Next Task Selector 452 is triggered by the beginning of the soft error window and must be completed by the time the Transmitter completes sending its Task Interactive Consistency and/or System State messages and becomes available to the Task Communicator for sending Task
Completed/Started messages. The operation of the Wake-up Sequencer is triggered by the end of the operation of the Completed/Started Handler 432. After the operation of the Wake-up Sequencer 440 is completed, the Execution Timer 456 will perform its execution timer checks. The TIC Handler 436 will then proceed to update the Selection Queue 450 and the Completion Status List 458, to reset the execution timers, and to update the Priority Scan List 446. After the Priority Scan List is updated, the Priority Scanner 448 will transfer the selected tasks from the Priority Scan List 446 to the Selection Queue 450. Finally, the Next Task Selector 452 will select the next task from the Selection Queue 450 and place it in the Next Task Register 454.
The details of the Reconfiguration Module 428 will be discussed relative to Figure 47. When the System State is modified by the exclusion or readmission of a Node, it is
necessary to reconfigure the assignment of tasks to the remaining operating Nodes. There are 2^N possible states for an N Node system. Thus, in an 8 Node system there are 256 possible states. The storage of a separate Assignment List for each of these states would require an excessive amount of memory. Therefore, reconfiguration is effected by a transition-based algorithm which does not deal with the new state directly. Rather, it reconfigures the task load based upon the change between the old and new states. The transition-based approach is inherently less complex than a state-based approach since there are only 2N possible transitions, representing exclusion or readmission of each of the N Nodes.
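The state-count versus transition-count comparison is simple arithmetic, and can be checked directly:

```python
N = 8                   # Nodes in the example system
states = 2 ** N         # every subset of Nodes is a possible System State
transitions = 2 * N     # one exclusion and one readmission per Node
assert states == 256 and transitions == 16
```

The gap widens exponentially with N, which is why the transition-based algorithm scales where per-state Assignment Lists do not.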
The active task set for a Node is defined as the set of tasks enabled for execution on that Node. For given tasks and Nodes a Boolean value "activation-status" may be used to represent whether a given task is enabled for execution on the given Node. The purpose of reconfiguration is to modify the activation-status for each task-node pair when the System State is modified by the exclusion or readmission of a Node. Three independent operations are needed to correctly manage the activation-status values. 1) Individual tasks may be enabled or disabled for all Nodes in the system to account for changes in the overall system capabilities. For example, when the total number of operating Nodes falls below some preset value, a task may be eliminated completely from the active task set or replaced by functionally equivalent simpler tasks. This operation of activation or deactivation of a task is referred to as swapping. A task which may be enabled for execution is said to be swapped in while a task which is disabled is said to be swapped out.
2) Active tasks may be reallocated among the operating Nodes of the system. For example, if a Node is excluded, one copy of each task as executed by that Node will be lost. In order to maintain the desired redundancy of each task, one copy of each affected task must be executed by some other Node. The Scheduler does not require all these tasks to be reassigned to one Node but rather may distribute these tasks among the remaining Nodes as desired.
A side effect of reallocation is that it may require that lower priority tasks be swapped out if the remaining Nodes are highly utilized.
3) Tasks may be prohibited from executing on individual Nodes based upon their operational status. For example, when a Node is excluded by a state transition, it is generally desirable to prohibit any application tasks from executing on that Node. However, it is desirable for the excluded Node to initiate a comprehensive sequence of diagnostic tasks. The set of all the tasks in the system is divided into two mutually exclusive subsets, the included task set and the excluded task set. Members of the included task set may only be executed by the included Nodes and the members of the excluded task set may only be executed by excluded Nodes.
The following discussions define the operations required for the reconfiguration of the tasks in response to a State Transition. If multiple changes to the System State are required, they are performed sequentially, one Node at a time. In any multiple reconfiguration, all readmissions are processed before any exclusions are processed.
Referring now to Figure 47, the Reconfiguration Module includes a Task Swapper 638, a Task Reallocator 640, and a Task Status Matcher 642. A Current and Next System State Comparator 644 receives the System State Vector from the Fault Tolerator Interface 416, as indicated in Figure 34 and generates a Delta
System State Vector which identifies only those Nodes whose System State has changed between the next System State and the current System State. The Delta System State Vector also includes a flag indicating whether any Node has been readmitted to the current operating set. The Task Swapper 638 generates a Boolean swap-status value indicating whether the task is swapped in or swapped out of the Active Task set. This process uses the Swap Table 646 which is contained in the Scheduler RAM 422 as previously described. The Task Reallocator 640 generates one Boolean allocation-status value for each task-node pair in the system.
The Task Reallocator 640 uses the Allocation Tables 648 which are contained in the Scheduler RAM 422, as indicated in Figure 35. The Task Status Matcher 642 generates a Boolean match-status value for each task-node pair. The Task Status Matcher 642 uses the Relevance Vector Table 650 which is one of the tables stored in the Scheduler ROM 424, as previously discussed with reference to Figure 36. The swap-status value, the allocation-status value, and the match-status value are ANDed together, as symbolically indicated by AND Gate 652, and the result is stored in the Task Activity List 444 shown in Figure 37.
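The combination symbolized by AND Gate 652 can be sketched as follows; the container names and shapes are hypothetical, but the logic (all three Boolean conditions must hold for a task-node pair to be active) is as described above.

```python
def update_task_activity(swap_status, alloc_status, match_status, tasks, nodes):
    """Build the Task Activity List: a task is active on a Node only
    when its swap-status, allocation-status, and match-status are all
    true (the AND Gate 652 combination)."""
    return {
        (t, n): swap_status[t] and alloc_status[(t, n)] and match_status[(t, n)]
        for t in tasks
        for n in nodes
    }
```

Because the three statuses are maintained by independent processes (swapping, reallocation, and status matching), any one of them can veto activation without the others needing to know.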
The operation performed by the Task Swapper 638 is largely one of determining the relevance of each Node to each task. The application designer may define any subset of the Nodes as relevant to the performance of each task. The swapped-status of each task is determined solely by the number of relevant Nodes included in the operating set. When a State Transition occurs, the new System State is examined to determine whether the number of relevant Nodes in the operating set will change the swap-status of each task. The number of relevant Nodes at which the change is required is defined as the "toggle value" for that task. In the preferred embodiment of the system, two toggle values are provided to enhance the flexibility for system reconfiguration. The operation of the Task Swapper 638 will be discussed relative to the flow diagram in Figure 48.
The operation of the Task Swapper 638 begins with the setting of the pointers to the Relevance Vector in the Scheduler ROM 424 and the pointer to the Swap Tables in the Scheduler RAM 422 to the first task, as indicated by block 653. The Task Swapper will then inquire if the task is relevant to the Node excluded from the operating set, as indicated by decision block 654. If the task is not relevant to the excluded Node, the Task Swapper will proceed to evaluate the next task, as indicated by block 662. However, if the task is relevant to the excluded Node, the Task Swapper will inquire, as indicated by block 656, if the number of relevant Nodes in the System State is equal to the
Toggle Point (swap count = 0). If the number of relevant Nodes equals the Toggle Point, the Task Swapper 638 will complement the swap status, as indicated by block 658, and will then decrement the swap count for that task in the Swap Table 646, as indicated by block 660. However, if the swap count is not equal to zero, the Task Swapper 638 will not complement the swap status of that task, but will simply decrement the swap count stored in the Swap Table 646. After decrementing the swap count, the Task Swapper will proceed to increment the TID pointer to the next task, as indicated by block 662, and then inquire if this task is the last task in the system, as indicated by decision block 664. If it is the last task, the operation of the Task Swapper is completed; otherwise the Task Swapper will repeat the above process until all the tasks have been evaluated.
The operation of the Task Swapper 638 when the Delta
System State Vector indicates that a Node has been readmitted to the system is indicated in the flow diagram in Figure 49. As with the operation of the Task Swapper for an excluded Node, when a Node is readmitted into the operating set, the Task Swapper 638 will first set the pointers to the Relevance Vector entry of the ROM 424 and the Swap Table 646 to the first task (TID = 1), as indicated by block 666. The Swap Table 646 is part of the Scheduler RAM 422 as illustrated in Figure 34. The Task Swapper will then inquire, decision block 668, if the task is relevant to the Node which has been readmitted into the operating set. If the task is not relevant to the readmitted Node, the Task Swapper will proceed to evaluate the next task, as indicated by block 676 and decision block 678. However, if the task is relevant to the readmitted Node, the Task Swapper will increment the swap count in the Swap Table 646, as indicated by block 670, and then inquire, as indicated by decision block 672, if the number of relevant Nodes is equal to the Toggle Point. If the number of relevant Nodes equals the Toggle Point, then the Task Swapper 638 will complement the swap status of that task, as indicated by block 674, and proceed to the next task, as indicated by block 676.
If the number of relevant Nodes is not equal to the Toggle Point (swap count ≠ 0), the swap-status of the task will not be complemented and the Task Swapper will proceed to evaluate the next task, as indicated in block 676. The Task Swapper will then inquire, as indicated by decision block 678, if the task was the last task to be evaluated. If the last task has been processed, the Task Swapper 638 is finished with its operation, otherwise the process will be repeated for each task until the last task is processed.
The swapping process has the following properties:
1) All tasks and toggle points are treated independently;
2) The swapped status depends on the number of available relevant Nodes, not on the identity of those Nodes; and
3) The process is reversible and path independent.
The swapped status of a task depends only on the System State and not on the sequence of transitions which preceded that state.
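The reversibility property holds because exclusion and readmission mirror each other around the swap count. A sketch under assumed data structures (the field names and set-based relevance test are illustrative only):

```python
def swap_on_exclusion(task, excluded_node):
    """Figure 48 sketch: when a relevant Node is excluded, complement
    the swap status at the Toggle Point, then decrement the count."""
    if excluded_node not in task["relevant"]:
        return  # irrelevant Node: task untouched
    if task["swap_count"] == 0:          # Toggle Point reached
        task["swapped_in"] = not task["swapped_in"]
    task["swap_count"] -= 1

def swap_on_readmission(task, readmitted_node):
    """Figure 49 sketch: the mirror image - increment first, then
    complement at the Toggle Point - which makes the process
    reversible and path independent."""
    if readmitted_node not in task["relevant"]:
        return
    task["swap_count"] += 1
    if task["swap_count"] == 0:
        task["swapped_in"] = not task["swapped_in"]
```

Excluding and then readmitting the same Node returns both the count and the swap status to their original values, which is the path-independence property stated above.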
The operation of the Task Reallocator 640 is very similar to the process of the Task Swapper. There are, however, two major differences between swapping and reallocation:
1) In reallocation, not all Nodes respond identically to a particular change of state. For example, if a given Node is excluded, a second Node may be required to assume the excluded Node's tasks, while the rest of the Nodes take no action whatsoever. It is, therefore, necessary to treat each Node independently.
2) In order to reallocate active tasks, it is not sufficient to note just the relevance of a given Node to each task. A method is required to determine which of the operating Nodes will assume or drop tasks in response to the transition. This is accomplished by allocating each task to various Nodes in a predetermined order of preference.
The "preferred set" for a given task-node pair is defined as the set of Nodes which are more preferred than others for execution of a given task. The application designer may define any subset of system Nodes which are the preferred set for each task-node pair. The allocation-status of each task-node pair is determined solely by the number of preferred Nodes included in the current System State. When a State Transition occurs, the new System State is examined to determine whether the number of preferred Nodes in the operating set will change the allocation-status of each task. The number of preferred Nodes at which the change is required is defined as a Toggle Value for that task and Node. In general, any number of Toggle Values may be defined for any task-node pair. However, only one Toggle Value is required for each task-node pair to provide the flexibility desired for system reconfiguration.
The Reallocation process begins with the first task, as indicated by block 680, in Figure 50. The Task Reallocator 640 will then start with the first Node (NID = 0), as indicated by block 682. The Task Reallocator 640 will then inquire if the excluded Node (i) is a more preferred Node for that task than the Node (n) being evaluated, as indicated in decision block 684. If the excluded Node (i) is not a more preferred Node for that task, the Task Reallocator will then proceed to determine if it is a more preferred Node than the next Node, as indicated by block 692 and decision block 694. If the excluded Node is a more preferred Node for the execution of the task, then the Task Reallocator inquires if the number of preferred Nodes in the state equals the Toggle Point (allocation count = 0), as indicated in block 686. If the number of preferred Nodes is equal to the Toggle Point, the allocation-status for Node (n) is complemented, as indicated in block 688, otherwise the allocation-status is not complemented and the allocation count for that task-node combination is decremented, as indicated by block 690. After decrementing the allocation count, the Task Reallocator will increment the pointer to the next Node, as indicated by block 692 then
inquire, as indicated by decision block 694, if the Node is the last Node in the system. If it is not the last Node, the Task Reallocator will repeat the process for each Node until the last Node is evaluated then the Task Reallocator will index to the next task, as indicated by block 696 and repeat this process until all of the task-node combinations have been completed, as indicated by decision block 698.
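The per-pair loop of Figure 50 parallels the Task Swapper, except that the count and Toggle Point are kept for each task-node pair rather than each task. A sketch (data layout and preferred-set test are assumptions):

```python
def reallocate_on_exclusion(alloc, preferred, excluded, tasks, nodes):
    """Figure 50 sketch: for each task-node pair, if the excluded Node
    belongs to the pair's preferred set, complement the
    allocation-status when the Toggle Point is reached, then count
    down.  Pairs for which the excluded Node is not preferred are
    left untouched."""
    for t in tasks:
        for n in nodes:
            pair = (t, n)
            if excluded not in preferred[pair]:
                continue  # block 684: not a more preferred Node
            if alloc[pair]["count"] == 0:     # block 686: Toggle Point
                alloc[pair]["status"] = not alloc[pair]["status"]  # block 688
            alloc[pair]["count"] -= 1         # block 690
```

As with swapping, readmission would mirror this by incrementing before the Toggle Point test, keeping the process reversible.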
The operation of the Task Reallocator treats all tasks, Nodes, and Toggle Points independently. The allocation-status depends on the number of available preferred Nodes and not on the identity of these Nodes. Also, the operation of the Task Reallocator is reversible and path independent. For example, if the Delta System State Vector indicates a Node has been readmitted into the operating set, the operation of the Task Reallocator parallels that of the Task Swapper in that the allocation count is incremented rather than decremented, as indicated by block 680, and the incrementing of the allocation count takes place prior to the inquiry to determine if the number of preferred Nodes in the state is equal to the Toggle Point, as indicated by decision block 676. The allocation-status of the task-node pairs depends only on the
System State and not on the sequence of transitions which preceded that state.
The operation of the Task Status Matcher 642 will be discussed with reference to Figure 51. When a Node is excluded from the operating set for faulty behavior, it is generally desirable to prohibit application tasks from executing on that Node. However, it is desirable to institute a comprehensive set of diagnostic tasks on the excluded Nodes. The Swapping and Allocation processes described above are not capable of supporting this function. Therefore, the total task set is divided into two mutually exclusive sets, the included task set and the excluded task set. The tasks of the included task set are permitted to be active only on the Nodes included in the operating set. Similarly, the tasks in the excluded task set are permitted to be active only on the Nodes excluded from the operating set.
If the included/excluded status of a given task matches the included/excluded status of a given Node, then the activation-status of that task on the Node is determined by the Swapping and Allocation processes. Conversely, if the status of the task does not match the status of the Node, then that task is prohibited from being executed on that Node regardless of the results of the swapping and reallocation processes.
Referring now to Figure 51, the Task Matching Process begins by setting the task pointer to the first task, as indicated by block 700. The Task Status Matcher 642 then sets the pointer to the first Node (NID = 0), as indicated by block 702. The Task Status Matcher then determines if there is a match between the included/excluded status of the task and the included/excluded status of the Node, as indicated by the System State Vector. This is indicated in block 704, which states that the TID-NID match equals the Exclusive OR of the Relevance Vector contained in ROM 424 and bit "n" of the System State Vector. The Task Status Matcher will then process the next Node, as indicated by block 706 and decision block 708, until all of the Nodes have been evaluated with respect to the given task. The Task Status Matcher 642 will then index the task pointer to the next task, as indicated by block 710, and repeatedly evaluate all of the tasks until the last task has been evaluated, as indicated by decision block 712. After all of the task-node combinations have been evaluated, the operation of the Task Status Matcher is completed.
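The per-pair test of block 704 reduces to a single Exclusive OR. The bit encodings below are an assumption (the patent states only the XOR relationship, not which polarity means included):

```python
def match_status(task_excluded_bit, node_operating_bit):
    """Block 704 sketch: an included task (bit 0) matches an operating
    Node (bit 1), and an excluded task (bit 1) matches an excluded
    Node (bit 0) - i.e. the match holds when the two bits differ."""
    return bool(task_excluded_bit ^ node_operating_bit)
```

A mismatch (result False) vetoes the task on that Node regardless of its swap-status and allocation-status.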
TASK COMMUNICATOR
The details of the Task Communicator 44 are shown in Figure 52. However, the operation of the Task Communicator will be discussed in detail with reference to the subsystem diagrams and flow diagrams shown in Figures 53 through 70.
The Task Communicator 44 coordinates and supports all the communications between the Operations Controller 12 and its associated Applications Processor 14. Upon demand, the Task Communicator 44 provides the Applications Processor 14 with the
Task Identification Code (TID) of the next task to be executed and supplies the required input data values. The Task Communicator receives all output data generated by the Applications Processor 14 and broadcasts them to all the other Nodes in the system via the Transmitter 30. When the Applications Processor 14 reports an error condition, the Task Communicator 44 reports the error condition to the Fault Tolerator 36 through the Voter 38. When a task is completed by the Applications Processor 14, the Task Communicator receives the value of the Branch Condition (BC) generated by the Applications Processor 14, and broadcasts it to all the other Nodes in the next Task Completed/Started message.
The Task Communicator 44 communicates directly with the Scheduler 40, the Voter 38, the Transmitter 30, and the Applications Processor 14. The Task Communicator has a table which lists by tasks the expected sequence of the input data to be used by the Applications Processor 14 and the expected sequence of the output data generated by the Applications Processor. Using messages from the Scheduler 40, the Task Communicator keeps track of the tasks currently being executed by all the Nodes and uses this information to supply information to the Voter relating to the message currently being processed.
Referring to Figure 52, the Task Communicator has a Voter Interface 714, a Scheduler Interface 716, and a Transmitter Interface 718. The Voter Interface 714 interfaces with the Voter 38 and receives the voted data and deviance vector and the Message Type Code (MT) and Data Identification Code (DID) of the voted data. The Voter will also send the Node Identification Code (NID) of a Node from which it has received data and request that the Task Communicator identify the DID of the data it has received. The Voter Interface 714 will also receive an Error Report from an Error Reporter 754, which is communicated to the Voter 38.
The Voter Interface 714 receives data and the associated MT and DID codes from the Voter 38. A Store Data Control 720 passes the data to the Data Memory where it is stored using the MT
and DID codes and the complement of a context bit taken from a Context Bit Memory 732 as an address. The Data Memory is partitioned in a manner similar to that previously discussed with reference to the Fault Tolerator RAM 162 and shown in Figure 16. The context bit taken from a Context Bit Memory 732 is used to store the data in the Data Memory 42 in the appropriate partition. A DID Request Handler 724 receives a DID request from the Voter 38 through the Voter Interface 714 in the form of a Node Identification Code (NID). The DID Request Handler 724 will access a Pointer Table 726 and transmit back to the Voter 38, the identity (DID) of the expected data that the Voter should currently be processing. If the Voter detects a mismatch between the expected DID and the DID of the data value it is currently processing it will set an error flag.
The Scheduler Interface 716 receives from the Scheduler
40 the identification of the task terminated, the identification of the task started, the identification of the Node, and the next task selected by the Scheduler. A Task Terminated Recorder 730 will flip the context bit of the Context Bit Memory 732 for the output DID's of the task reported as terminated. The Task Terminated Recorder 730 will also set a new NUDAT bit in a Nudat Bit Memory 722, indicating to the Voter that the next time it requests that same DID, it will be the first request for that DID since the task that generated it was terminated. The Task Terminated Recorder 730 will also OR all the deviances stored in the Data Memory 42 for the terminated task output DID's and store them in a Deviance Error Register 734.
A Task Started Recorder 736, upon the receipt of a message from the Scheduler Interface 716 indicating the starting of a new task by any Node in the system, will access the Pointer Table 726 to determine if the address stored in the Pointer Table points to a null DID indicating that that Node had completed the preceding task. If the address stored in the Pointer Table 726 does not point to a null DID, the Task Started Recorder 736 will set a sequence error flag for that Node which is stored in a
Sequence Error Register 738. After it has completed this check, the Task Started Recorder 736 will access the pointer in a DID List 728 with the Task Identification Code (TID) of the task started and store the address for the first DID in that task in the Pointer Table 726. A Next Task Recorder 740 will store the next task received from the Scheduler in a Next Task Register 742.
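The sequence check performed by the Task Started Recorder 736 can be sketched as follows. The flat DID list with null terminators, the NULL_DID value, and the table shapes are assumptions; the patent specifies only that a non-null pointer target at task-start time raises a sequence error.

```python
NULL_DID = 0  # assumed marker ending each task's output DID sequence

def record_task_started(pointer_table, did_list, first_did_addr,
                        node, tid, seq_error):
    """Task Started Recorder 736 sketch: a Node may only start a new
    task once its previous task's output DID sequence is exhausted
    (i.e. its pointer rests on a null DID); otherwise flag a sequence
    error.  Either way, repoint the Node at the new task's first DID."""
    if did_list[pointer_table[node]] != NULL_DID:
        seq_error[node] = True          # previous task not completed
    pointer_table[node] = first_did_addr[tid]
```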
An AP Input Handler 744 will transfer the identification of the next task stored in the Next Task Register 742 to an AP Input FIFO 746. The AP Input Handler will then access the Pointer and DID List 728 with the Task Identification Code (TID) and get the address for the data stored in the Data Memory 42 needed for the execution of that task. This data will then be stored in the AP Input FIFO 746. When the Applications Processor 14 is ready to begin the execution of the next task, it will access the AP Input FIFO 746 for the Task Identification Code and the data necessary for the execution of the task. The AP Input Handler 744 will also generate a Task Completed/Started message which is sent to the Transmitter 30 through the Transmitter Interface 718, which transmits this message to all the other Nodes in the system. When the Applications Processor 14 executes the selected task, the data resulting from the execution of the task will be stored in an AP Output FIFO 748. An Applications Output Handler 750 will access the Pointer and DID List 728 and obtain the Message Type Code (MT) and the Data Identification Code (DID) for each data value generated by the Applications Processor. Each Message Type Code and Data Identification Code, along with the data, is transmitted to the Transmitter Interface 718 and transmitted by the Transmitter 30 to all the other Nodes in the system. The last word generated by the Applications Processor 14 contains an Applications Processor Reported Error (APRE) vector which is stored in an APRE Register 752. The last word generated by the Applications Processor 14 also contains the branch condition bit which is to be included in the next Task Completed/Started message generated by the AP Input Handler 744. This branch condition is stored in the Transmitter Interface 718 until it receives the remainder of the Task Completed/Started message from the AP Input
Handler 744.
The Error Reporter 754 receives the Deviance Error Vector from the Deviance Error Register 734, the Sequence Error Vector from the Sequence Error Register 738, and the Application Processor Reported Error Vector from the APRE Error Register 752 and transmits these Error vectors to the Voter 38 through the Voter Interface 714. An Arbitrator 756 arbitrates the operation of the various Modules in the Task Communicator.
The operation of the Store Data Control will be discussed relative to Figure 53 and the flow diagram shown in
Figure 54. As more clearly indicated in Figure 53, the Store Data Control receives the voted data and deviances from the Voter. Along with this information, it also receives the Message Type Code (MT) and the Data Identification Code (DID). Referring now to Figure 54, the operation of the Store Data Control 720 begins by evaluating the first byte received from the Voter Interface 714, as indicated by decision block 760. When the first byte is all 0's, as indicated in block 760, there is no data available and the Store Data Control 720 will wait until it receives the first non-zero first byte. After receiving a non-zero first byte, the Store Data Control 720 will inquire, as indicated in decision block 762, if the data is a System State Vector. If the data is not a System State Vector, the Store Data Control 720 will access the Context Bit Memory 732 for the state of the context bit, using the MT and DID codes, as indicated by block 764. The Store Data
Control 720 will then complement the context bit, as indicated by block 766, then generate an address, block 768, using the complemented context bit and the deviance bit set equal to zero (DEV = 0). However, if the data is a System State Vector, the Store Data Control 720 will access a TOC Bit Flip Flop 758 for the TOC bit, as indicated by block 770, then generate the address using the TOC bit as the context bit and setting the deviance bit to zero, as indicated by block 772. The TOC bit marks the Atomic period in which the System State Vector was generated. The TOC Bit Flip Flop 758 is complemented to mark the beginning of each new Atomic period in response to the rising edge of the Atomic period (AP)
signal.
In either case, after the address is generated, the Store Data Control 720 will then store the voted data in the Data Memory 42 at the generated address, as indicated by block 774. The Store Data Control 720 will then generate an address for the deviance vector by setting the deviance bit equal to 1 and the context bit equal to 0, as indicated by block 776. It will then store the deviance vector in the Data Memory 42, as indicated by block 778.
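The addressing scheme described above can be illustrated with a short sketch. This is not taken from the patent itself: the field ordering and widths (DID in the high bits, then a context bit, then a deviance bit in the least significant position) are assumptions chosen only to show how the complemented context bit, the TOC bit, and the DEV bit combine to select one of the paired locations in the Data Memory 42.

```python
def data_address(did, context_bit):
    # Voted data is stored under the complemented context bit (block 766)
    # with the deviance bit DEV = 0 (block 768).
    ctx = (context_bit & 1) ^ 1
    return (did << 2) | (ctx << 1) | 0

def system_state_address(did, toc_bit):
    # For a System State Vector the TOC bit is used in place of the
    # context bit (blocks 770 and 772), again with DEV = 0.
    return (did << 2) | ((toc_bit & 1) << 1) | 0

def deviance_address(did):
    # The deviance vector is stored with DEV = 1 and the context bit
    # forced to 0 (block 776).
    return (did << 2) | (0 << 1) | 1
```

Each DID thus maps to two data locations (selected by the context or TOC bit) plus one deviance location, matching the "ambiguous address" scheme described later in the text.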
As previously indicated, a change in the System State will only occur at the Atomic period boundaries. Therefore, for those tasks which require the current System State Vector, this Vector is stored in the Data Memory 42 and the TOC bit identifies where the current System State Vector is stored. The Context Bit Memory 732 stores a context bit for each DID used in the system and the context bit identifies which of the data stored in the Data and Deviance RAM is the current data and which data is pending or incomplete since all of the copies of that data have not been received. The context bits in the Context Bit Memory 732 for the output DID's of each terminated task are complemented by the Task Terminated Recorder 730 in response to a message from the Scheduler that a particular task is terminated. The Task Terminated Recorder 730 will then complement the context bits for all of the DID's which resulted from the terminated task, as shall be discussed hereinafter.
The DID Request Handler 724 receives from the Voter Interface 714 the identification or NID of the Node whose data is being processed by the Voter 38, indicating that the Voter is requesting the Data Identification (DID) Code of the data currently being processed. A NUDAT bit embedded in this DID code tells the Voter 38 whether this is the first time the Voter has requested this particular DID since termination of the task that generated the data.
As shown in Figure 55, the DID Request Handler will address the Pointer Table 726 with the NID to obtain a Pointer to
a DID List 830 which is part of the Pointer and DID List 728 shown in Figure 52. The DID Request Handler will then access the DID List 830 and obtain the Data Identification Code DID from the DID List 830. It will then access the Nudat Bit Memory 722 and transmit the nudat bit with the DID back to the Voter 38 through the Voter Interface 714.
Referring now to Figure 56, the operation of the DID Request Handler 724 begins by accessing the Pointer Table 726 with the NID to get the Pointer to the DID List 830, as indicated by block 780. The DID Request Handler will then access the DID List 830 to get the DID of the data which is currently being processed by the Voter, as indicated by block 782. The DID Request Handler 724 will then access the Nudat Bit Memory 722 to get the NUDAT bit, as indicated by block 784. It will then append the NUDAT bit to the DID, as indicated by block 786, and pass the DID and the NUDAT bit to the Voter Interface 714. The DID Request Handler will then set the NUDAT bit to 0, as indicated by block 788, then inquire, as indicated by decision block 790, if the DID was a null DID. If it was not a null DID, the DID Request Handler 724 will then increment the pointer in the Pointer Table 726 to the next DID for that Node, as indicated by block 792. However, if the DID was a null DID, the DID Request Handler 724 will not increment the pointer in the Pointer Table 726 but will leave it at the null DID, indicating that all the DID's for that task have already been transmitted.
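The request-handling steps above can be sketched in a few lines. This is an illustrative model, not the patented hardware: the sentinel value for the null DID, the dictionary-based memories, and the assumption that the NUDAT bit is cleared after the first request (so that only the first request since task termination sees NUDAT = 1) are all choices made for the example.

```python
NULL_DID = 0x00  # assumed sentinel marking the end of a task's DID group

def handle_did_request(nid, pointer_table, did_list, nudat_bits):
    # Blocks 780-784: fetch the current DID and its NUDAT bit for this Node.
    ptr = pointer_table[nid]
    did = did_list[ptr]
    nudat = nudat_bits.get(did, 0)
    # Block 788 (assumed): clear NUDAT so later requests see "not new".
    nudat_bits[did] = 0
    # Blocks 790-792: advance the pointer unless parked on the null DID.
    if did != NULL_DID:
        pointer_table[nid] = ptr + 1
    return did, nudat
```

Leaving the pointer at the null DID means repeated requests keep returning the null DID until the Task Started Recorder re-points the table at the next task's DID group.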
The data stored in the DID List has two 8 bit bytes, as shown in Figure 65. The first byte consists of three fields, a Data Value Mask, a Data Type, and a Message Type. The second byte is the actual Data Identification Code (DID) of the data. This information is used by the AP Output Handler 750 to tag the data generated by the Applications Processor 14 in the transmitted Data Value messages. The DID Request Handler 724 will append the NUDAT bit to the most significant bit position (MSB) of the first byte obtained from the DID List 830, as indicated in Figure 66, since the most significant bit of the Data Value Mask is not
needed by the Voter 38.
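The substitution of the NUDAT bit into the first byte can be shown with a small packing sketch. The patent gives only the field names (Data Value Mask, Data Type, Message Type) and the fact that the NUDAT bit replaces the MSB of the Data Value Mask; the field widths used below are assumptions for illustration.

```python
def pack_first_byte(dvm, dt, mt, nudat):
    # Assumed layout: 4-bit Data Value Mask (high), 2-bit Data Type,
    # 2-bit Message Type (low).
    byte = ((dvm & 0xF) << 4) | ((dt & 0x3) << 2) | (mt & 0x3)
    # Overwrite the MSB of the DVM field with the NUDAT bit, since the
    # Voter does not need that mask bit.
    return (byte & 0x7F) | ((nudat & 1) << 7)
```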
The operation of the Task Terminated Recorder 730 shall be discussed relative to the block diagram shown in Figure 57 and the flow diagram shown in Figure 58. Referring first to the block diagram shown in Figure 57, the Task Terminated Recorder 730 receives the Task Identification Code (TID) of the Terminated Task from the Scheduler Interface 716. The Task Terminated Recorder 730 will then access the Output Pointer List 794 which is part of the Pointer and DID List 728 shown in Figure 52. The Output Pointer List 794 and the DID List 830 are embodied in a common off-board ROM (not shown). The Task Terminated Recorder 730 will then access the Context Bit Memory 732 and the Nudat Bit Memory 722, and complement the context bits and set the NUDAT bits for all the DIDs that resulted from the terminated task. The Task Terminated Recorder 730 will then, using the addresses obtained from the DID List 830, access the Data Memory 42 for all the deviance vectors associated with the DIDs of the terminated task. It will then OR all of these deviance vectors with the content of the Deviance Error Register 734.
The operation of the Task Terminated Recorder 730 will now be discussed with reference to the flow diagram shown in Figure 58.
The operation of the Task Terminated Recorder 730 begins by inquiring, as indicated by block 796, if there is a task to be terminated. If there is no task to be terminated, as indicated by the two bits of the Status Word being 0's, the Task Terminated Recorder 730 will do nothing. However, if either of the bits of the Status Word is a 1, then the Task Terminated Recorder will inquire, as indicated by decision block 798, if the Status Word is a 10. The least significant bit of this Status Word indicates whether the task is a null task which requires no further action by the Task Terminated Recorder 730. If it is not a null task, the Task Terminated Recorder 730 will access the Output Pointer List 794 to get a Pointer to the DID List 830, as
indicated by block 800. The Task Terminated Recorder will then set the least significant bit of the Status Word to 1, as indicated by block 802. The Task Terminated Recorder 730 will then access the DID List 830 with the Pointer and will inquire, as indicated by block 804, if the DID is a null DID, indicating that it is the last DID of that task. If it is a null DID, then the Task Terminated Recorder 730 will set the least significant bit of the status word to 0, as indicated by block 814, and terminate the processing of the terminated task. If, however, the DID is not a null DID, the Task Terminated Recorder will set the nudat bit for that DID to 1, as indicated by block 806, and complement the context bit in the Context Bit Memory for that DID, as indicated by block 808. The Task Terminated Recorder 730 will then OR the deviances stored in the Data Memory 42, as indicated by block 810, and store the ORed values in the Deviance Error Register 734 to generate a deviance vector which is transmitted to the Voter Interface 714 by the Error Reporter 754. The task terminated pointer in the Pointer Table 726 is then incremented, as indicated by block 812.
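The per-DID loop of blocks 804 through 812 can be sketched as follows. This is a behavioral model under stated assumptions: the null DID sentinel value, the dictionary-based memories, and the representation of deviance vectors as integers are illustrative, not taken from the patent.

```python
NULL_DID = 0x00  # assumed sentinel ending a task's contiguous DID group

def record_task_terminated(ptr, did_list, nudat_bits, context_bits,
                           deviance_mem, dev_error_reg):
    # Walk the terminated task's DID group until the null DID (block 804).
    while (did := did_list[ptr]) != NULL_DID:
        nudat_bits[did] = 1                         # block 806
        context_bits[did] ^= 1                      # block 808: flip context
        dev_error_reg |= deviance_mem.get(did, 0)   # block 810: OR deviances
        ptr += 1                                    # block 812
    return dev_error_reg
```

Flipping each context bit is what atomically "publishes" the terminated task's outputs: the location that was being filled becomes the current copy.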
The operation of the Task Started Recorder 736 will be discussed with reference to the block diagram shown in Figure 59 and the flow diagram in Figure 60. Referring first to the block diagram of Figure 59, the Task Started Recorder 736 receives the TID and NID of the task started which is contained in a received Task Started message from the Scheduler 40. The Task Started Recorder 736 will first check the Pointer Table 726 to determine if the current DID is a null DID. If it is not, it will record a sequence error in the Sequence Error Register 738 for the identified Node. This check is made because a new task should not have been started until all the data from the preceding task has been received. The Task Started Recorder 736 will then update the Pointer Table 726 by accessing the Output Pointer List 794 with the TID to get the Pointer to the DID List 830 for the first DID resulting from the execution of that task. The Task Started Recorder 736 will then store the Pointer obtained from the Output Pointer List 794 into the Pointer Table 726.
Referring now to Figure 60, the operation of the Task Started Recorder 736 begins with accessing the Pointer Table 726
to get the pointer to the DID List 830, as indicated in block 816. The Task Started Recorder will then access the DID List 830 with the pointer to get the DID currently being stored for that Node, as indicated by block 818. Then it will inquire, as indicated by decision block 820, if the DID is a null DID. If it is not, the Task Started Recorder will record a scheduling error in the Sequence Error Register 738, as indicated by block 826. However, if the DID is a null DID, the Task Started Recorder will access the Output Pointer List 794 with the TID of the started task to get the Pointer to the DID List 830 to the first DID for that task, as indicated by block 822. It will then access the DID List 830 with the Pointer, then write the address of the first DID into the Pointer Table 726 for that Node, as indicated by block 824, completing the recording of the task started.
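The sequence-error check and pointer update can be condensed into a short sketch. As before, the null DID sentinel, the dictionary memories, and the use of a per-Node bit in the error register are illustrative assumptions.

```python
NULL_DID = 0x00  # assumed sentinel ending a task's DID group

def record_task_started(nid, tid, pointer_table, did_list,
                        output_pointer_list, seq_error):
    # Blocks 816-820: if the previous task's outputs were not all
    # received, the pointer is not yet parked on the null DID.
    if did_list[pointer_table[nid]] != NULL_DID:
        seq_error |= 1 << nid       # block 826: record a scheduling error
    # Blocks 822-824: point the table at the first DID of the new task.
    pointer_table[nid] = output_pointer_list[tid]
    return seq_error
```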
The details of the AP Input Handler 744 will be discussed relative to the block diagram shown in Figure 61 and the flow diagram shown in Figure 62. The AP Input Handler 744 identifies and fetches the input data values required by the Applications Processor 14 for the execution of the next task. From the viewpoint of the Applications Processor 14, the
Task Communicator 44 acts like a storage device containing an input file. The values are provided in a predetermined order as specified by the input TID-to-DID Mapping List for the current TID. As previously described, each input data mapping word consists of the Data Value Mask (DVM), the Data Type (DT), the Message Type (MT), and the actual Data Identification Code (DID) which is the starting address of the data value in the Data Memory 42. The addresses are ambiguous because each address points to two different locations in the Data Memory 42. The context bit for a particular DID defines the location that has the current input data.
The AP Input Data Handler 744 fetches each current data value from the Data Memory 42 and loads it into the AP Input FIFO 746. Upon a request from the Applications Processor 14, the data values from the AP Input FIFO 746 are transferred to the
Applications Processor. This cycle is repeated until all input data values have been transferred. Referring now to Figure 61, the AP Input Handler 744 interacts with an Input Pointer List 828, the DID List 830, an AP Input Pointer Table 832, and the AP Input FIFO 746. The Input Pointer List 828 and the DID List 830 are part of the Pointer and DID List 728 shown in Figure 52 and are embodied in an off-board ROM (not shown). The AP Input Handler 744 receives the next task from the Next Task Register 742 and, using the Input Pointer List 828, DID List 830, and the AP Input Pointer Table 832, will extract from the Data Memory 42 the data required for the execution of the task. This information is stored in the AP Input FIFO 746 and made available to the Applications Processor 14. The AP Input Handler 744 will also generate a Task Completed/Started message identifying the task completed and the next task to be started by its own Applications Processor 14. The AP Input Handler 744 will also generate a Task Release message sent to the Scheduler 40 through the Scheduler Interface 716. The contents of the TOC Bit Flip Flop 758 are appended to the addresses stored in the AP Input Pointer Table 832 to identify the current system state. The contents of the Context Bit Memory 732 are appended to the addresses to identify the current data values stored in the Data Memory 42. The mapping list for each task in the Input Pointer List 828 consists of a contiguous group of DID's terminated by a null DID. A pointer to the beginning of each mapping list is stored in the Input Pointer List 828 and is addressed by the TID of the task. The null DID is used to identify the end of the contiguous group of DID's in the DID List 830.
Referring to the flow diagram shown in Figure 62, the AP Input Handler 744 first inquires if this is the first task to be executed in this Subatomic period, as indicated by decision block 834. This is because only one task can be started in any given Subatomic period. If this is not the first task, then the AP Input Handler 744 will wait until the beginning of the next Subatomic period. Otherwise the AP Input Handler 744 will inquire if the next task is a new task received since the last CS message
or reset or reconfiguration commands, as indicated by decision block 836. If the task is not a new task, the AP Input Handler 744 will wait until it receives a new task. If, however, the task is a new task and it is the first task in the Subatomic period, the AP Input Handler 744 will send a Task Completed/Started message to the Transmitter Interface 718, as indicated by block 838. This Task Completed/Started message will subsequently be transmitted by the Transmitter 30 to all of the other Nodes in the system. The AP Input Handler 744 will then inquire, as indicated by decision block 840, if the transmission of the Task
Completed/Started message has started. In the event that the transmission of the Task Completed/Started message is delayed because of the transmission of a time dependent message, such as a System State or a Task Interactive Consistency message, the AP Input Handler will wait until the transmission of the Task
Completed/Started message has begun. After the transmission of the Task Completed/Started message has begun, the AP Input Handler 744 will send a Task Release message to the Scheduler 40 through the Scheduler Interface 716, as indicated by block 842, informing it that the Task Communicator has transmitted a Task
Completed/Started message identifying the selected next task as the started task. The AP Input Handler 744 will then transfer the next task from the Next Task Register 742 and store it as the current task in the Applications Input Pointer Table, as indicated by block 844. The AP Input Handler will then write the TID of the current task into the AP Input FIFO 746, informing the Applications Processor 14 of the identity of the task, as indicated by block 846. Then using the current task, the AP Input Handler 744 will get the DID Pointer from the Input Pointer List 828 and store the DID Pointer in the AP Input Pointer Table 832, as indicated by block 848. Using the DID Pointer, the AP Input Handler will then get the address of the DID in the Data and Deviance RAM and store this address in the AP Input Pointer Table, as indicated by block 850. The AP Input Handler 744 will then inquire, as indicated in decision block 852, if the DID is a null DID. If the DID is a null DID, indicating it is the last DID in the task, the operation of the AP Input Handler 744 is completed.
Otherwise, the AP Input Handler 744 will inquire, as indicated in decision block 854, if the DID is a System State Vector. If the DID is a System State Vector, the AP Input Handler 744 will access the TOC Bit Flip Flop and get the TOC bit which is used in place of the context bit for addressing the Data Memory 42, as indicated by block 866. Otherwise, if the DID is neither a null DID nor a System State Vector, the AP Input Handler will access the Context Bit Memory for the context bit, as indicated by block 856, and append it to the address in the AP Input Pointer Table. Using the address in the DID List 830 and the context bit or the TOC bit from the TOC Bit Flip Flop, the AP Input Handler will access the Data Memory 42 and write the first two data bytes into the AP Input FIFO, as indicated by block 858. The AP Input Handler will then inquire, as indicated by block 860, if the Message Type is either a Message Type 2 or Message Type 3, which have 4 bytes rather than 2 bytes. If the Message Type is not either a Message Type 2 or 3, the AP Input Handler 744 will index the DID Pointer and store the indexed DID Pointer in the AP Input Pointer Table, as indicated by block 864. If the Message Type is either a Message Type 2 or a Message Type 3, the AP Input Handler 744 will address the Data
Memory 42 again and write the final 2 data bytes into the AP Input FIFO 746, as indicated by block 862, then increment the DID Pointer as indicated in block 864. The AP Input Handler 744 will repeat this process to obtain all of the data values needed by the Applications Processor 14 for the execution of the task. As previously discussed, the last DID for any task is a null DID. This null DID is recognized by the AP Input Handler 744, as indicated in decision block 852 and will terminate the loading of the AP Input FIFO 746.
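The fetch loop described above (blocks 846 through 864) can be summarized in a short sketch. The representation is an assumption made for illustration: the mapping list is modeled as (Message Type, DID) pairs with None as the null-DID sentinel, and the Data Memory as a dictionary of byte lists.

```python
def load_input_fifo(tid, input_pointer_list, did_list, data_memory, fifo):
    # Block 846: the TID is written to the FIFO first so the Applications
    # Processor knows which task the data belongs to.
    fifo.append(tid)
    ptr = input_pointer_list[tid]          # block 848: start of mapping list
    while True:
        entry = did_list[ptr]
        if entry is None:                  # block 852: null DID ends the task
            break
        mt, did = entry
        # Block 860: Message Types 2 and 3 carry four data bytes, the
        # other types carry two.
        nbytes = 4 if mt in (2, 3) else 2
        fifo.extend(data_memory[did][:nbytes])   # blocks 858 and 862
        ptr += 1                           # block 864: index the DID Pointer
    return fifo
```

The context/TOC-bit address selection shown earlier would sit inside the `data_memory[did]` lookup in a fuller model.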
The AP Output Handler 750 will be discussed relative to the block diagram shown in Figure 63 and the flow diagram shown in Figure 64. The AP Output Handler 750 receives and processes the output data values generated by the Applications Processor 14 and broadcasts them to all the Nodes in the system. As with input values, output values are expected in a predetermined order specified by the output TID-to-DID Mapping List for the current task.
When a Data Value is received from the Applications Processor 14, the AP Output Handler 750 loads it into an Output Data Register along with its MT and DID codes and initiates broadcast of the Data Value message. After outputting the last value for each task, the Applications Processor generates a combined
Applications Processor Reported Error Vector (APRE) and the Branch Condition (BC). The Applications Processor Reported Error portion is used with the previous Applications Processor Reported Error data to be reported to the Voter 38 through the Error Reporter 754. The Branch Condition (BC) is used as part of the next Task Completed/Started message generated by the AP Input Handler 744.
Referring now to Figure 63, the AP Output Handler 750 responds to the AP Input Handler completing its operation. The AP Output Handler 750 first accesses the Pointer Table 726 and obtains the pointer which identifies the current task which is stored in the DID List 830. This Pointer is then used to address the DID List 830 to obtain the Data Identification Code (DID), the Message Type (MT), and the Data Value Mask (DVM). The AP Output Handler 750 will then store the Message Type bits and the Data Identification Code in an Output Data Register 870. As the data is generated by the Applications Processor 14, the data is stored in the AP Output FIFO 748. The data is then transferred from the AP Output FIFO 748 to the Output Data Register 870. The AP Output Handler 750 will then transmit the first 2 bytes to the Transmitter Interface 718, and will transmit the Applications Processor Reported Error Vector to the APRE Register 752.
The details of the operation of the AP Output Handler 750 will now be discussed with reference to the flow diagram shown in Figure 64. After the AP Input Handler 744 has completed the loading of the initial data value into the AP Input FIFO 746, the AP Output Handler 750 will read the current DID Pointer from the Pointer Table 726 and store it in the AP Output Pointer Table 868, as indicated by block 872. The AP Output Handler will then access the DID List using the pointer stored in the Pointer Table 726 and store the Message Type and DID in the Output Data Register 870, as indicated by block 874. After the Message Type and DID bytes are
stored in the Output Data Register 870, the AP Output Handler 750 will inquire, as indicated by decision block 875, whether the Applications Processor has placed any data values in the AP Output FIFO 748. If no data has been placed in the AP Output FIFO 748, the AP Output Handler 750 will wait until data is generated by the Applications Processor 14, as indicated by decision block 876. After the Applications Processor 14 has stored the generated data values in the AP Output FIFO 748, the AP Output Handler 750 will transfer the data values to the Output Data Register 870, as indicated by block 878. The AP Output Handler 750 will then inquire, as indicated in decision block 880, if the Message Type is a Message Type 2 or a Message Type 3. If the Message Type is a Message Type 2 or 3, the AP Output Handler 750 will wait until the final two data bytes of the data are generated and placed in the AP Output FIFO 748, as indicated by decision block 882. After the Applications Processor has written the second two data bytes into the AP Output FIFO 748, the AP Output Handler 750 will transfer the contents of the AP Output FIFO 748 into the Output Data Register 870, as indicated in block 884. If the Message Type is a Message Type 0 or 1, or after the content of the second two data bytes are written into the Output Data Register 870, the AP Output Handler will inquire if the DID of this data is a null DID, as indicated by decision block 886. If the DID is not a null DID, the AP Output Handler 750 will send the Message Type and Data Identification Code (DID) bytes to the Transmitter Interface 718, as indicated by block 888. The AP Output Handler will then send the data bytes to the Transmitter Interface 718 to complete the Data Value message, as indicated by block 890.
The AP Output Handler 750 will then increment the current DID Pointer in the AP Output Pointer Table 868, as indicated in block 892 and repeat the processing of the next data value generated by the Applications Processor 14. As previously indicated, the last DID in the DID List 830 for that particular task will be a null DID which will be detected by the AP Output Handler 750, as indicated by decision block 886. If the DID is a null DID indicating that the
Applications Processor 14 has generated all of the data values for that task, the Applications Processor will generate a last data word containing the Applications Processor Reported Error and the
Branch Condition (BC) bit for the subsequent tasks. The AP Output Handler 750 will store the Applications Processor Reported Error Vector in the APRE Register 752, as indicated by block 894, and then will proceed to process the branch condition, as indicated by decision block 896. The AP Output Handler will first inquire if the majority of the branch condition bits are equal to 0. If the majority of the branch condition bits are 0's, then the AP Output Handler will generate a branch condition byte consisting of all 0's, as indicated by block 898; otherwise, if the majority of the branch condition bits are 1's, the AP Output Handler 750 will generate a branch condition byte consisting of all 1's, as indicated by block 900. Finally, the AP Output Handler 750 will store the branch condition byte in the Transmitter Interface 718, as indicated by block 902. This byte will be appended to the next Task Completed/Started message generated by the AP Input Handler 744 as the branch condition of the completed task.
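The majority reduction of the branch condition bits into an all-0's or all-1's byte is simple enough to state directly. This sketch is illustrative; the tie-breaking choice (a tie yields the all-0's byte) is an assumption, since the patent only describes the two majority cases.

```python
def branch_condition_byte(bc_bits):
    # Majority vote over the reported branch-condition bits (blocks
    # 896-900): more 1's than 0's yields 0xFF, otherwise 0x00.
    ones = sum(bc_bits)
    return 0xFF if ones > len(bc_bits) - ones else 0x00
```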
The Task Communicator 44 also includes a Reconfigure Control 904, as shown in Figure 67, and a Reset Control 920, as shown in Figure 69. Referring first to Figure 67, the Reconfigure Control 904 transmits an interrupt to the Applications Processor 14 and awaits acknowledgement. After acknowledgement, the Reconfigure Control 904 will initialize the AP Output Handler 750 and its Output Data Register 870, the AP Input FIFO 746, the AP Output FIFO 748, and the Pointer Table 726. The operation of the Reconfigure Control 904 will be discussed relative to the flow diagram shown in Figure 68. In response to a reconfiguration request from the Scheduler 40, the Reconfigure Control 904 will send a reconfigure interrupt to the Applications Processor 14, as indicated in block 906. It will then terminate all messages to the Transmitter Interface 718, as indicated by block 908, by clearing the Output Data Register 870. The Reconfigure Control 904 will then await acknowledgement of the interrupt signal from the Applications Processor 14, as indicated by decision block 910. After the Applications Processor has acknowledged the interrupt, the Reconfigure Control will clear the AP Input FIFO 746 and the AP Output FIFO 748, as indicated by block 912, then set all the pointers in the Pointer Table 726 to null DID's, as indicated by
block 914. After the Input and Output FIFO's have been cleared, the Reconfigure Control will restart the AP Input Handler 744, as indicated in block 916, then send a Task Communicator Ready (TSCRDY) signal to the Scheduler 40, as indicated in block 918, indicating that the Task Communicator 44 is ready to begin processing data in the new reconfigured System State.
The Task Communicator also has a Reset Control 920 responsive to the Operations Controller Reset (OCRES) signal, as indicated in Figure 69. The Reset Control 920 interfaces with the Applications Processor 14, an AP Ready Flag 922, a Reset Flag 924, the AP Input FIFO 746, the AP Output FIFO 748, the Pointer Table 726, and the AP Input Handler 744. Referring to Figure 70, the operation of the Reset Control 920 begins by sending a reset request to the Applications Processor 14, as indicated by block 926. The Reset Control 920 will then set the Reset Flag 924 to "ON," as indicated by block 928, to signify to the other subsystems of the Task Communicator 44 that the Operations Controller is being reset. The Reset Control 920 will then set the AP Ready Flag 922 to "OFF," as indicated by block 930, to signify to the Scheduler 40 that the Applications Processor is not yet ready to start processing any tasks. The Reset Control 920 will then proceed to clear the AP Input FIFO 746 and the AP Output FIFO 748, as indicated by block 932, then set all of the pointers in the Pointer Table 726 to null DID's, as indicated by block 934. The Reset Control will then start the AP Input Handler 744, as indicated by block 936, and wait for the Applications Processor to signify that it is ready, as indicated by decision block 938. After the Applications Processor 14 signifies that it is ready to start processing data, the Reset Control 920 will turn the Reset Flag 924 "OFF" and the AP Ready Flag 922 "ON," signifying that the Task Communicator 44 is now ready to start processing data, as indicated by block 940.
SYNCHRONIZER
The Synchronizer 46 establishes and maintains the
synchronization between all of the Operation Controllers in the System. The multi-computer architecture uses loose synchronization which is accomplished by synchronous rounds of message transmission by each Node in the system. In this method, each Synchronizer 46 detects and time stamps each time dependent message received by its own Node. These time dependent messages are transmitted by every other Node in the System at predetermined intervals and they are received by all the other Nodes in the system. As a result of the wrap-around interconnection shown in Figure 1, a Node will receive its own time dependent messages along with the other time dependent messages sent by the other Nodes. The comparison of the time stamps on a Node's own time dependent message with the time stamps on all of the other time dependent messages is what drives the fault tolerant convergence algorithm.
The synchronization is done over two timing message intervals. These intervals are delimited by a pre-sync message and a sync message which are transmitted alternately. A Node is defined to be in point-to-point synchronization with another Node when it is sending its own sync and pre-sync messages at the same time the other Node is sending its sync and pre-sync messages. Since the synchronization cannot be exact, a window is specified by the user which defines the allowable error in the time that messages are received from the Nodes that are considered to be in synchronization with each other.
Fundamentally, the mechanism which forces the Nodes into synchronization with each other involves a calculation, done by each Node, that determines where in time the center of the last cluster of pre-sync messages occurred. Each Node will then determine the difference between its own pre-sync time stamp and that of the center. Each Node will then use that difference to adjust the length of the interval from the time it sent its pre-sync message to the transmission of its sync message. Fault tolerance in these calculations is required and is accomplished with an approximate agreement algorithm. System convergence is accomplished by having all Nodes repeat these steps continuously.
Over every sync to pre-sync interval all Nodes listen to the time dependent messages received from all of the Nodes, including their own, determine a correction, and then apply the correction over the following pre-sync to sync interval. The pattern is repetitive, taking the form: measure error, make correction; measure error, make correction; and so on. The time interval from the sync to the pre-sync message is a nominal user defined value.
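The measure-and-correct step can be sketched as follows. This is a simplified model: the "center" is taken as the naive midpoint of the earliest and latest pre-sync time stamps, whereas the patent requires a fault tolerant approximate agreement computation so that a faulty Node's time stamp cannot pull the center arbitrarily.

```python
def sync_correction(own_stamp, all_stamps):
    # Locate the center of the last cluster of pre-sync messages
    # (naive min/max midpoint; a real implementation would use a
    # fault-tolerant approximate-agreement midpoint instead).
    center = (min(all_stamps) + max(all_stamps)) / 2
    # The local offset from the center is applied to the length of the
    # following pre-sync-to-sync interval.
    return own_stamp - center
```

A positive result means the local Node transmitted late relative to the cluster and should shorten its next interval; a negative result means it should lengthen it.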
The explanation of the synchronization procedure described above is accurate. However, the actual mechanism implemented is more involved than the steady state system described because it must also be fault tolerant under all conditions, be capable of detecting and characterizing system timing errors, and must support the other time dependent functions of the system.
The implemented synchronization logic supports three categories of operation: a cold start, a warm start, and steady state operation. Cold start synchronization logic handles the situation where the system has just powered up and no one Node is necessarily in synchronization with any other Node. In particular, a cold start is executed when no operating set of Nodes exists. In this case, every good Node in the system attempts to synchronize with every other good Node, and then all the good Nodes simultaneously and consistently decide together which Nodes are in synchronization and whether or not the number which are synchronized is larger than the user specified start up size. In order to accomplish initial synchronization and start up, each Node maintains a byte of information called the "in-sync-with" (ISW) vector. The content of the "in-sync-with" vector defines which other Nodes in the system the local Node believes it is in synchronization with. Byzantine Agreement on the ISW vectors of all the good Nodes in the system is used to define the initial operating set. Byzantine Agreement is required concerning this "in-sync-with" data in order for cold starts to be fault tolerant. Once enough Nodes reach Byzantine Agreement on a potential operating set (POS), all the Nodes that are in that set begin operating. All the other healthy Nodes not in the potential operating set will reach the same conclusion, that an operating set is formed, but they will also
recognize that they are not included in the operating set and will switch to a warm start mode of operation.
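Once Byzantine Agreement gives every good Node the same set of ISW vectors, the decision about the potential operating set can be sketched as a deterministic function of that agreed data. The membership rule used here (a Node joins the set when a majority of Nodes report being in sync with it) is an assumption for illustration; the patent states only that the decision is made simultaneously and consistently from the agreed ISW vectors and compared against the user specified start up size.

```python
def potential_operating_set(isw_vectors, startup_size):
    # isw_vectors: {node_id: byte}, where bit i of a Node's byte means
    # "I believe I am in sync with Node i". After Byzantine Agreement
    # every good Node holds identical copies, so this function yields
    # the same answer everywhere.
    n = len(isw_vectors)
    pos = {j for j in range(n)
           if sum((v >> j) & 1 for v in isw_vectors.values()) > n // 2}
    # The set only becomes operational if it meets the start up size.
    return pos if len(pos) >= startup_size else set()
```

Nodes that find themselves outside the returned set conclude, from the same data, that an operating set exists without them and drop into the warm start mode.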
In the warm start mode of operation each good Node continues its efforts to synchronize with the existing operating set. Once a Node in the warm start mode of operation believes it is in synchronization with the operating set it will begin normal operation. After the warm starting Node has behaved correctly long enough, the other Nodes will admit it into the operating set.
The last situation is the steady state mode of operation where each operating Node simply maintains synchronization and alignment with the other good Nodes in the system. In practice the steady state convergence algorithm runs under all conditions since it has the ability to converge the local Node to a common synchronization point with all other good Nodes in a fault tolerant manner. The real difference between warm and cold starts centers around the logic used to determine when the operating set is formed and the Nodes in the operating set are ready to start scheduling tasks to be executed by their Applications Processor.
The details of the Synchronizer 46 are shown on Figure 71. The Synchronizer includes a Synchronizer Control 952 which receives the Task Interactive Consistency (TIC) messages and the System State (SS) messages from the Message Checker through a Message Checker Interface 942. The System State messages are the sync and pre-sync messages previously described and are distinguished by a function bit which identifies the System State message as a sync or pre-sync message. A Timing Signal Generator 950 generates timing signals which are transmitted to the Synchronizer Control 952. The signals generated by the Timing Signal Generator are the Subatomic period (SAP) signal, the Atomic period (AP) signal, the Master period (MP) signal, the Last Subatomic period (LSAP) signal, the Last Atomic period (LAP) signal, the Soft Error Window (SEW) signal, and the Hard Error Window (HEW) signal. The Synchronizer Control 952 also receives a Clock signal and Reset signal from the system bus. The Reset signal may be either the power on Reset (RESET) or the internal Operation Controller Reset (OCRESET) signals. These signals have been previously discussed and need not be repeated here. The Synchronizer Control 952 will also receive its own 3-bit Node identification (NID) code.
A Byzantine Voter 954 performs a byzantine vote on the
"in-sync-with" matrices received from itself and the other Nodes during the cold start mode of operation and on the content of the Task Interactive Consistency messages. The byzantine vote on the content of the Task Interactive Consistency (TIC) messages is transmitted directly to a Scheduler Interface 944, while the result of the byzantine vote on the "in-sync-with" matrices is passed to a Fault Tolerator Interface 946. A Time Stamp Voter 956 will vote on the time stamps of the System State messages received from all of the Nodes to generate a voted time stamp value. A Synchronizer Memory 948 stores the data received from the Message Checker Interface and other data required by the Synchronizer Control 952 for establishing and maintaining synchronization between its own Node and the other Nodes in the system. The Synchronizer Memory 948 has a Scratch Pad Memory 962, a Message Memory 964 and a Time Stamp Memory 966, as shown in Figure 72. An Error Reporter 958 receives the errors detected by the Synchronizer Control 952 and transmits them to the Fault Tolerator Interface 946. The Synchronizer Control 952 generates the time dependent Task Interactive Consistency (TIC) and the System State (SS) messages which are transmitted to the other Nodes in the system through the Transmitter Interface 960, as previously described.
Figure 76 shows the waveforms of the various signals generated by the Timing Signal Generator 950. The Master period (MP) is a timing signal which reflects the length of each Master period interval of the Operations Controller. This interval is the longest of the synchronization clocks and reflects the periodicity of the lowest repetition task being run in the application. The Master period can be considered to be the "frame size" of the application. During the Master period interval the total pattern of tasks is repeated. The Atomic period (AP) is a
timing signal which reflects the beginning and end of each Atomic period interval. This interval is representative of the fastest repetition task being run in the application. The Master period described above is user specified as an integer number of Atomic periods. The Subatomic period (SAP) is a timing signal which reflects the beginning and end of each Subatomic period interval. The Atomic period interval is user specified as an integer number of Subatomic periods. The last Atomic period (LAP) is an active high signal that windows the last Atomic period that occurs in each Master period. The last Subatomic period (LSAP) is an active high signal that windows the last Subatomic period that occurs in each Atomic period. The soft error window (SEW) is an active high signal that brackets a span of time around an event time mark that defines the soft error window for the arrival of System Synchronization messages. Finally, the hard error window (HEW) is an active high signal that brackets a span of time around an event time mark that defines the hard error window for the arrival of system synchronization messages.
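The nested timing hierarchy described above (Master period as an integer number of Atomic periods, Atomic period as an integer number of Subatomic periods) can be made concrete with a small sketch. All of the numeric counts below are assumed values chosen for illustration; the patent leaves them user specified.

```python
# Illustrative model of the timing hierarchy. The concrete counts are
# assumptions for the example, not values taken from the patent.

SAP_TICKS = 1000     # synchronizer clock ticks per Subatomic period (assumed)
SAPS_PER_AP = 8      # Subatomic periods per Atomic period (assumed)
APS_PER_MP = 25      # Atomic periods per Master period (assumed)

ap_ticks = SAP_TICKS * SAPS_PER_AP   # length of one Atomic period in ticks
mp_ticks = ap_ticks * APS_PER_MP     # length of one Master period ("frame size")

# LSAP windows the last Subatomic period of each Atomic period, and
# LAP windows the last Atomic period of each Master period:
def is_lsap(sap_index: int) -> bool:
    return sap_index == SAPS_PER_AP - 1

def is_lap(ap_index: int) -> bool:
    return ap_index == APS_PER_MP - 1
```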
The format of the Message Memory 964 is shown in Figure 73. The Message Memory 964 stores for each Node the branch condition byte, the task completed vector, the next system state vector, the current system state vector, the content of the Atomic period counter, and two bytes, one reserved for a cold start and the other reserved for a warm start. This format is repeated for each Node in the system.
The format of the Time Stamp Memory 966 is shown in Figure 74. The Time Stamp Memory consists of a coarse time count and a fine time count and includes an update (u) flag and a time stamp (TS) flag. The update flag signifies that the stored time stamp is for a System State message received during the current time stamp interval. The time stamp flag indicates whether the time stamp is for a System State message in which the sync function bit is set or for a System State message in which the pre-sync function bit is set. The coarse count of the time stamp is indicative of the number of Subatomic periods that have passed since the preceding System State message was generated. The fine
time stamp count corresponds to the number of synchronizer clock pulses received during the last Subatomic period of the Atomic period. The coarse or Subatomic period counts are used primarily during the cold start and the warm start to achieve rapid convergence of the synchronization between the local Node and the other Nodes in the system. The fine or synchronizer clock time stamp counts are used primarily during the steady state operation to maintain the synchronization between the Nodes. The Time Stamp Memory 966 will store a time stamp for each Node in the system and includes a special entry for storing the time stamp of the System State messages transmitted by its own Node.
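The per-Node time stamp entry described above can be modeled as a small record. The field names below are illustrative only; the patent specifies the flags and counts but not any particular encoding.

```python
from dataclasses import dataclass

# A minimal model of one Time Stamp Memory entry: update flag, TS flag,
# coarse (Subatomic period) count, and fine (synchronizer clock) count.

@dataclass
class TimeStampEntry:
    update: bool = False    # True if stamped during the current interval
    ts_flag: str = "sync"   # 'sync' or 'presync' per the message's function bit
    coarse: int = 0         # Subatomic periods since the last System State message
    fine: int = 0           # clock pulses in the last Subatomic period of the AP

def stamp(entry: TimeStampEntry, msg_kind: str,
          sap_count: int, clock_count: int) -> TimeStampEntry:
    """Record a time stamp for a received System State message."""
    entry.update = True
    entry.ts_flag = msg_kind
    entry.coarse = sap_count
    entry.fine = clock_count
    return entry
```

The coarse count drives the rapid convergence of cold and warm starts, while the fine count drives the small steady state corrections.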
The format of the Scratch Pad Memory 962 is shown in Figure 75. The Scratch Pad Memory 962 stores the "in-sync-with" (ISW) vectors for each Node in the system. These "in-sync-with" vectors are contained in the sync and pre-sync System State messages. The Scratch Pad Memory 962 will also store two message warning counts, one indicative of the time from the end of the warning period to the transmission of the Task Interactive Consistency message and the other indicative of the time from the end of the warning period to the transmission of the System State message. The Scratch Pad Memory will also store the Subatomic period count which is used to time stamp the received messages. The Scratch Pad Memory also has an entry storing the number of Subatomic periods per an Atomic period, the Atomic period count, and the number of Atomic periods per Master period. The Scratch Pad Memory also will store an actual hard error window (HEW) to warning period count and a nominal hard error window (HEW) to warning period count. The actual hard error window to warning period reflects the corrected length of the Atomic period between the pre-sync and sync messages which is computed from the difference between the voted time stamp value and its own time stamp value. The next entries in the Scratch Pad Memory are the error window parameters. The error window parameters include a hard error window count and a soft error window count. The next two entries in the Scratch Pad Memory are the computed correction for the Subatomic period delta and the computed correction for the Subatomic period count. The next entry is the maximum allowed
correction for the Subatomic period delta. The final entry in the Scratch Pad Memory is the minimum start up size for determining the existence of a potential operating set.
The details of the Synchronizer Control 952 are illustrated in Figure 77. The data received by the Message Checker Interface 942 is passed directly to a Data Handler 968 and a Time Stamper 972. The Data Handler 968 will store the data in the Message Memory 964, the Scratch Pad Memory 962, or a Byzantine Voter Memory 970 as required. Prior to the Time Stamper 972 storing the time stamp of the message in the Time Stamp Memory 966 the received message is checked by an Expected Message Checker 974 and a Within Hard Error Window and Soft Error Window Checker 976. If the message is not an expected message, as shall be discussed later, the Expected Message Checker 974 will generate a sequence error signal which is transmitted to an Error Reporter 978 and to the Time Stamper 972. In a like manner, if the received message is outside the hard error window or the soft error window the Within Hard Error Window and Soft Error Window Checker will generate either a hard error or a soft error which is also transmitted to the Error Reporter 978 and the Time Stamper 972. The Time Stamper 972 will not record the time stamp in the Time Stamp Memory 966 if either a sequence error or a hard error is detected. Any message which is received outside the hard error window or not received at all is essentially ignored by the Synchronizer. However, a received vector will be generated showing a missing message error for each Node which failed to report during the hard error window interval. The synchronization process will not use any time stamp value associated with the Nodes which failed to report within the hard error window. This prevents good Nodes from trying to move towards badly out-of-sync Nodes that may possibly be faulty. The Time Stamper 972, however, will record the time stamp in the Time Stamp Memory if only a soft error is detected.
The Time Stamper 972 will record the number of Subatomic periods counted in a Subatomic Period (SAP) Counter 971 as the coarse time stamp count and the number of sync clock bits from a Sync Clock 969 as the fine time stamp. The Time Stamper 972 will set the update flag in the Time Stamp Memory and set the time stamp flag to indicate if the received message was either a sync or pre-sync System State message.
A Pre-sync Message Counter 980 and a Sync Message Counter 982 will count, respectively, the number of pre-sync and sync messages received during the particular time stamp interval. A Comparator 984 will determine whether the number of pre-sync messages was greater than the number of sync messages or vice versa. This information is communicated to the Time Stamp Voter 956 which will extract from the Time Stamp Memory 966 the time stamps corresponding to whether there were a greater number of pre-sync messages or sync messages received during the time stamp interval. The Time Stamp Voter 956 will vote on the received time stamps and generate a medial or voted value for the time stamps of the received messages. This information is passed to a Sync Correction Generator 990 which compares the time stamp of the Synchronizer's own System State message with the medial or voted value generated by the Time Stamp Voter 956 and will generate a correction delta. This correction delta is added to the nominal interval between System State messages in the Adder 992 to correct for the detected differences. The sum of the nominal interval plus the correction delta is then stored in the Scratch Pad Memory 962 as the computed correction Subatomic period count and the actual HEW to warning period count, as indicated in the format of the Scratch Pad Memory shown in Figure 75. The computed correction Subatomic period count and the actual HEW to warning period count are used in the timing of the duration of the interval from the pre-sync System State message to the sync System State message, as shown in Figure 84.
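The majority selection performed by the Comparator 984 and the medial vote of the Time Stamp Voter 956 can be sketched as follows. The function name is illustrative, and the use of `median_low` as the "medial" value is an assumption; the patent says only that a medial or voted value is generated.

```python
import statistics

# Hedged sketch of the Comparator/Time Stamp Voter step: decide whether
# more sync or pre-sync System State messages arrived in the interval,
# then vote only on the time stamps of the majority message kind.

def select_and_vote(entries):
    """entries: list of (ts_flag, time_stamp) tuples for messages received
    in the current time stamp interval. Returns the medial (voted) value
    of the majority message kind."""
    presync = [t for flag, t in entries if flag == "presync"]
    sync = [t for flag, t in entries if flag == "sync"]
    chosen = presync if len(presync) > len(sync) else sync
    return statistics.median_low(chosen)  # medial value as the voted time stamp
```

Voting only over the majority message kind keeps a straggling message of the other kind from skewing the medial value.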
A Byzantine Voter 954 performs a byzantine vote on the task completed vector and the branch condition bits contained in the Task Interactive Consistency messages which are passed back to the Scheduler 40 through the Scheduler Interface 944. During a cold start the Byzantine Voter 954 will also perform a byzantine
vote on the "in-sync-with" matrix transmitted in the pre-sync System State messages to generate a voted "in-sync-with" vector. This "in-sync-with" vector is transmitted to an Operating Condition Detector 1000 which sums the number of "in-sync-with" bits contained in the voted "in-sync-with" vector and compares this sum with the minimum start up size for a potential operating set (POS) of Nodes. If the sum of the bits in the Byzantine voted "in-sync-with" vector is greater than the minimum start up size, the Operating Condition Detector 1000 will then determine if its own Node is contained in the "in-sync-with" vector. If its own Node is contained within the "in-sync-with" vector, the Operating Condition Detector will set an Operating Flag 1004 indicating that it is ready to start operating. However, if an operating set is detected and the Operating Condition Detector 1000 determines that its own Node is not within the operating set, it will set a Warm Start Flag 1002 indicating the existence of an operating set and that it is not in synchronization with that set. This will cause the Synchronizer 46 to enter into a warm start mode of operation as previously discussed. If an operating set is detected, and its own Node is in the operating set, the Operating Condition Detector 1000 will then transmit the "in-sync-with" (ISW) vector and the operating flag bit to the Fault Tolerator 36 through the Fault Tolerator Interface 946.
An In-Sync-With (ISW) Detector 994 will compare the time stamp of its own System State message with each time stamp stored in the Time Stamp Memory 966 to generate an "in-sync-with" vector which is stored in an In-Sync-With (ISW) Register 996.
The "in-sync-with" vector stored in the In-Sync-With Register 996 and the state of the Operating Flag 1004 are passed to a Message Generator 998 and are used in the generation of the next System State message. The output of the Message Generator 998 is passed to the Transmitter through the Transmitter Interface 960.
The steady state operation of the Synchronizer will be discussed with reference to the flow diagrams illustrated in
Figures 78 through 82 and the waveforms shown in Figures 83 and 84. The flow diagram shown in Figure 78 describes the operation of the Data Handler 968, the Expected Message Checker 974, and the Within Hard Error Window and Soft Error Window Checker 976. As indicated by block 1006, all of the data received from the Message Checker Interface 942 is stored in the Message Memory 964. The system will then inquire, as indicated by decision block 1008, if the operating flag is true. If the operating flag is not true, the system will default to either a cold start or a warm start as will be described later herein. If the operating flag is true the Expected Message Checker will then inquire if it is a Task Interactive Consistency (TIC) message as indicated by decision block 1010. If it is not a Task Interactive Consistency message, then the message is a System State message and the Expected Message Checker 974 will inquire if the Subatomic period count in the Time Stamp Memory is equal to zero, as indicated by block 1012. The Subatomic period count stored in the Time Stamp Memory is the two's complement of the number of Subatomic periods in the Atomic period. This Subatomic period count is incremented each time the Timing Signal Generator 950 generates a Subatomic period signal. When the Subatomic period count in the Time Stamp Memory is equal to zero, then a System State message is the expected message. If the Subatomic period count is equal to zero, the Expected Message Checker will reload the time stamp counters for the Node from which the message was received, as indicated by block 1014, and then inquire if the sync/pre-sync (s/p) function bit contained in the message was equal to the complement of the (TS) flag stored in the Time Stamp Memory.
In normal operation the sync and pre-sync System State messages are sent in an alternating manner, therefore, the function bits in the received message should be the complement of the function bits of the previous message which is currently stored by the TS flag in the Time Stamp Memory 966. If the sync/pre-sync function bit is the complement of the time stamp flag stored in the Time Stamp Memory, then the sequence error flag for the Node from which the message was received (NODE j) is set to false, as indicated by block 1020. Conversely, if the sync/pre-sync bit is not the complement of the
time stamp flag the sequence error flag for the Node from which the message was received is set to true, as indicated by block 1022. If the Subatomic period count in decision block 1012 is not equal to zero, the Expected Message Checker will set the sequence error flag for the Node from which the message was received to true, as indicated by block 1024, then reload the time stamp counters for Node j, as indicated by block 1026.
If the received message is a Task Interactive Consistency (TIC) message, as determined in decision block 1010, the Expected Message Checker 974 will then inquire if the
Subatomic period count in the Time Stamp Memory is equal to or greater than zero, as indicated by decision block 1028. If the Subatomic period time stamp count is equal to or greater than zero, then a System State message should have been received and, therefore, there is a sequence error. The Expected Message Checker 974 will then set the sequence error flag for the Node from which the message was received to true, as indicated by block 1034. However, if the Subatomic period count in the Time Stamp Memory is less than zero, the Expected Message Checker 974 will increment the time stamp count stored in the Time Stamp Memory for that Node (Node j), as indicated by block 1032.
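The Expected Message Checker logic of Figure 78 (blocks 1010 through 1034) can be sketched under the two's complement convention described above: the per-Node Subatomic period count starts at the negative of the number of Subatomic periods per Atomic period and is incremented each Subatomic period, reaching zero exactly when a System State message is due. The function name is illustrative and the reload value of -8 is an assumed count of Subatomic periods per Atomic period.

```python
# Hedged sketch of the Expected Message Checker. Returns the sequence
# error verdict and the updated per-Node Subatomic period count.

SAPS_PER_AP = 8  # assumed Subatomic periods per Atomic period

def check_expected(msg_is_tic: bool, sap_count: int,
                   sp_bit: int, ts_flag: int):
    """Return (sequence_error, new_sap_count) for a received message."""
    if not msg_is_tic:                       # System State message received
        if sap_count == 0:                   # an SS message was indeed expected
            seq_err = (sp_bit == ts_flag)    # must be complement of stored TS flag
        else:
            seq_err = True                   # SS message arrived out of sequence
        return seq_err, -SAPS_PER_AP         # reload the counter (blocks 1014/1026)
    # TIC message received
    if sap_count >= 0:                       # an SS message was due instead
        return True, sap_count               # sequence error (block 1034)
    return False, sap_count + 1              # count the Subatomic period (block 1032)
```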
The operation of the Within Hard Error Window and Soft Error Window Checker 976 and the Time Stamper 972 will be discussed with reference to the flow diagram shown in Figure 79. The operation of the Within Hard Error Window and Soft Error
Window Checker 976 begins by checking to determine if a sequence error has been detected, as indicated by decision block 1036. If a sequence error has been detected by the Expected Message Checker, the Within Hard Error Window and Soft Error Window Checker 976 will set the update flag in the Time Stamp Memory 966 to false, as indicated by block 1054. Otherwise, the Within Hard Error Window and Soft Error Window Checker 976 will inquire whether the message was received within the hard error window, as indicated by decision block 1040. If the message was not received within the hard error window (HEW), the Within Hard Error Window and Soft Error Window Checker 976 will set the hard error window
flag to true, as indicated by block 1042, then set the update flag in the Time Stamp Memory to false, as indicated by block 1054. If the message was received within the hard error window, the Within Hard Error Window and Soft Error Window Checker 976 will inquire, as indicated by decision block 1044, whether the message was received within the soft error window. If the message was not received within the soft error window, the Checker will set the soft error window flag to true, as indicated by block 1046 and the Checker will proceed to ask, as indicated by decision block 1048, whether the received message was a Task Interactive
Consistency (TIC) message. If the message is not a TIC message, the Checker will then proceed to ask if the message was a pre-sync System State message, as indicated by decision block 1049.
If the message was a pre-sync System State message, then the Time Stamper will be enabled to time stamp the received message. The time stamp equals the SAP count contained in the SAP Counter 971 and the sync clock count received from the Sync Clock 969. The Time Stamper 972 will then set the TS flag bit to pre-sync and the update flag equal to true, as indicated by block 1050. However, if the message is a System State sync message, the Time Stamper 972 will time stamp the received message and set the time stamp flag to sync and the update flag to true, as indicated by block 1052. After the message has been time stamped it is stored in the Time Stamp Memory 966, as indicated by block 1038. As indicated by decision block 1049, the Task Interactive Consistency (TIC) messages are not time stamped.
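The window check and time-stamping decisions of Figure 79 reduce to a small amount of logic: a sequence error or a hard error window miss suppresses the time stamp, a soft error window miss is flagged but still stamped, and TIC messages are never stamped. The sketch below is illustrative; the function name and boolean inputs are assumptions.

```python
# Hedged sketch of the Figure 79 flow. Returns the error flags and
# whether the time stamp should be recorded in the Time Stamp Memory.

def window_check(seq_error: bool, in_hew: bool,
                 in_sew: bool, is_tic: bool):
    """Return (hard_error, soft_error, record_stamp)."""
    if seq_error:
        return False, False, False       # update flag stays false (block 1054)
    if not in_hew:
        return True, False, False        # hard error: stamp discarded (1042/1054)
    soft = not in_sew                    # soft error alone is tolerated (block 1046)
    record = not is_tic                  # TIC messages are not time stamped (1049)
    return False, soft, record
```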
The generation of the actual HEW to warning period count is described with reference to the flow diagram shown in Figure 80. The generation of the actual HEW to warning period counts begins by setting the Node (NID) pointer in the Time Stamp Memory to the first Node (NID=0), as indicated by block 1056. The Time Stamp Voter will then inquire if the update flag is true, as indicated by decision block 1058. If the update flag is not true, indicating that the time stamp value has not been updated during the current Atomic period, the Time Stamp Voter will then increment the time stamp memory Node pointer to the next Node and inquire if
the update flag of that Node is true. If the update flag is true, then the time stamp value is loaded into the Time Stamp Voter, as indicated by block 1060, then the Node pointer to the Time Stamp Memory is incremented, as indicated by block 1062. The Time Stamp Voter 956 will then inquire if the Node to which the pointer is pointing is the maximum or last Node to be polled, as indicated by decision block 1064. If the Node is not the last Node, (MAX NID) the process of loading the Time Stamp Voter will continue until the time stamp value from the last Node is loaded into the Time Stamp Voter 956. Once the Time Stamp Voter is loaded with all of the current time stamp values it will vote on the time stamp values which were loaded into it and generate a voted time stamp value (TSv), as indicated by block 1066. The Sync Correction Generator 990 will then subtract the Node's own time stamp value from the voted time stamp value to generate a sync delta, as indicated by block 1068. The actual HEW to warning period is then generated by adding in Adder 992 the sync delta to the nominal HEW to warning period stored in the Scratch Pad Memory, as indicated by block 1070. This actual HEW to warning count is then stored in the Scratch Pad Memory, as indicated by block 1071.
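The Figure 80 sequence can be sketched end to end: gather only the updated time stamps, vote a medial value, subtract the local Node's own time stamp to form the sync delta, and add the delta to the nominal HEW to warning count. The function name and the value of NOMINAL_HEW_TO_WARN are assumed, and `median_low` stands in for the voter's medial value.

```python
import statistics

NOMINAL_HEW_TO_WARN = 500  # nominal HEW to warning count (assumed value)

# Hedged sketch of blocks 1056-1071 of Figure 80.

def actual_hew_to_warning(entries, own_ts: int) -> int:
    """entries: list of (update_flag, time_stamp) per Node. Returns the
    actual HEW to warning count for the next sync System State message."""
    loaded = [ts for updated, ts in entries if updated]  # blocks 1058-1064
    ts_voted = statistics.median_low(loaded)             # voted value (block 1066)
    sync_delta = ts_voted - own_ts                       # block 1068
    return NOMINAL_HEW_TO_WARN + sync_delta              # block 1070
```

A Node running ahead of the voted value gets a positive delta (a longer interval), and a Node running behind gets a negative delta, pulling all good Nodes toward a common synchronization point.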
The operation of the Message Generator 998 will be discussed relative to the block diagram shown in Figure 81 and the flow diagram shown in Figure 82. Referring first to the block diagram shown in Figure 81, the Message Generator receives the clock (CLK), the Last Subatomic (LSAP) period, and the HEW signals from the Timing Signal Generator 950. It also receives the Warm Start Flag 1002, the Operating Flag 1004, and the "in-sync-with" vector from the In-Sync-With (ISW) Register 996. The data used in the messages generated by the Message Generator 998 is obtained from the Sync Memory 948, which includes the Scratch Pad Memory 962, the Message Memory 964, and the Time Stamp Memory 966. The messages generated by the Message Generator are passed to the Transmitter Interface 960 which ultimately passes these messages to the Transmitter 30.
Referring now to the flow diagram shown in Figure 82, the Message Generator 998 first waits until the end of the hard
error window, as indicated by decision block 1074. At the end of the HEW the Message Generator will inquire if the Subatomic period is a Last Subatomic period, as indicated by decision block 1076. If the Subatomic period is not a Last Subatomic period, then the message to be generated is a Task Interactive Consistency (TIC) message in which the data identification code (DID) is set equal to zero, as indicated by block 1078. If the current Subatomic period is the last Subatomic period then the message to be transmitted is a System State message in which the sync/pre-sync bit is equal to the complement of the TS flag currently stored in the Time Stamp Memory, as indicated by block 1094.
If the message type is a Task Interactive Consistency message, the Message Generator will inquire if the operating flag is true, as indicated by decision block 1080. If the operating flag is not true then no TIC message is to be sent. However, if the operating flag is true, the Message Generator 998 will load an Event Counter 1072 with the nominal HEW to warning count stored in the Scratch Pad Memory 962, as indicated in block 1082, then assemble a normal Task Interactive Consistency message, as indicated by block 1083. As shown in Table 1, the normal Task
Interactive Consistency message includes the task completed vector and the branch condition bits obtained from the Message Memory 964. The Message Generator will then wait until the Event Counter 1072 is incremented to zero by the clock signals, as indicated by decision block 1084. When the Event Counter is equal to zero, the Message Generator 998 will send the first byte of the message to the Transmitter through the Transmitter Interface 960, as indicated by block 1086, then transfer the remaining bytes of the message to the Transmitter Interface 960, as indicated by block 1088. The Transmitter Interface 960 will then wait for the buffer available (BA) signal from the Transmitter 30, as indicated by decision block 1090, then send the remaining bytes of the message to the Transmitter, as indicated by block 1092.
As previously described relative to the operation of the Transmitter 30, the sending of the first byte of a message from the Synchronizer will start the warning period for the time dependent Task Interactive Consistency and System State messages. At the end of the warning period, the Transmitter will begin the transmission of the time dependent message and will transmit a buffer available signal to the Transmitter Interface, which triggers the transferring of the remaining bytes stored in the Transmitter Interface to the Transmitter.
If the message type is a System State message, as indicated by block 1094, the Message Generator will then inquire if the System State message to be transmitted is a sync or pre-sync message, as indicated by block 1096. If the message is a sync message, the Message Generator will load the Event Counter 1072 with the actual HEW to warning count from the Scratch Pad Memory, as indicated by block 1098, and then will generate a normal System State message, as indicated by block 1099. If, however, the message is a pre-sync System State message, the Message Generator will load the Event Counter 1072 with the normal HEW to warning count, as indicated by block 1097, and then will interrogate the warm start and operational flags to determine if the system is in the cold start mode, as indicated by decision block 1077. A cold start is indicated by both the Warm Start and Operating Flags being false. If the system is not in a cold start mode, the Message Generator 998 will then generate a normal System State message, as indicated by block 1099. However, if the Synchronizer is in the cold start mode, the Message Generator will generate a cold start pre-sync message, as indicated by block 1081. The cold start pre-sync message has a format as indicated in Figure 85, which is different from the normal pre-sync System State message shown on Table I. This cold start pre-sync message contains an "in-sync-with" matrix containing the in sync vectors received from all the operating Nodes in the system. The Message Generator 998 will then wait for the end of the HEW to warning period by monitoring the Event Counter, as indicated in decision block 1084. The Message Generator will then send the first byte of the message to the Transmitter 30, as indicated by block 1086, then transfer the remaining bytes of the message to the Transmitter Interface, as indicated by block 1088. When the Transmitter generates the
buffer available signal, as indicated in decision block 1090, the Transmitter Interface 960 will then pass the remaining bytes of the System State message to the Transmitter, as indicated by block 1092.
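The message-selection path of Figure 82 can be summarized in one function: a non-last Subatomic period yields a TIC message (if operating), the last Subatomic period yields a System State message whose sync/pre-sync bit is the complement of the stored TS flag, and a cold start substitutes the cold start pre-sync format. The function name, string labels, and symbolic counts below are all illustrative.

```python
# Hedged sketch of the Message Generator decision path of Figure 82.
# Returns the message kind and the Event Counter load value.

def next_message(is_lsap: bool, ts_flag: str, operating: bool,
                 warm_start: bool, nominal_count: int, actual_count: int):
    if not is_lsap:                                      # decision block 1076
        if not operating:
            return None, None                            # no TIC message is sent
        return "TIC", nominal_count                      # blocks 1078-1082
    kind = "presync" if ts_flag == "sync" else "sync"    # complement of TS flag
    if kind == "sync":
        return "SS_sync", actual_count                   # block 1098 (corrected count)
    if not warm_start and not operating:
        return "SS_presync_cold", nominal_count          # cold start format, block 1081
    return "SS_presync", nominal_count                   # blocks 1097/1099
```

Note that only the sync System State message uses the actual (delta-corrected) HEW to warning count; all other messages use the nominal count.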
The timing sequences for the Task Interactive
Consistency messages and the System State messages are shown in Figures 83 and 84, respectively. Referring first to the waveforms shown on Figure 83, the Message Generator's Event Counter 1072 is loaded at the end of the hard error window (HEW) with the nominal HEW to warning count, as indicated by block 1082 in flow diagram Figure 82. The Message Generator 998 will then wait until the end of the HEW to warning period and then transmit the first byte of the Task Interactive Consistency message to the Transmitter, as indicated by the waveform sync dat. As previously described with reference to the Transmitter 30, the receipt of this first byte of the Task Interactive Consistency message will initiate the beginning of the Task Interactive Consistency warning period and will also terminate the buffer available (BA) signal, as indicated by the buffer available (BA) waveform on Figure 83. At the end of the Task Interactive Consistency warning period, the Transmitter will initiate the transmission of the first byte to all of the other Nodes in the system. It will also reassert the buffer available signal, causing the Transmitter Interface 960 to send the remaining data bytes to the Transmitter, as indicated by the sync dat and buffer available waveforms. As shown, the last byte transmitted by the Transmitter is the longitudinal redundancy code check byte, the end of which is timed to coincide with the end of the Subatomic period. As discussed relative to Figure 82, when the next message to be sent is a pre-sync System State message, the HEW to warning period is the same as for the Task Interactive Consistency message; however, the Transmitter will substitute the System State warning period for the Task Interactive Consistency warning period and will begin the transmission of the System State message at a point in time earlier than it would have started to transmit the Task
Interactive Consistency message, as indicated in Figure 84.
Referring now to Figure 84, there is shown the timing
sequence for the transmission of a sync System State message. In the transmission of a sync System State message, the Event Counter 1072 in the Message Generator 998 is loaded with the actual HEW to warning count, as indicated by block 1098 in Figure 82. As previously discussed, the actual HEW to warning count is the sum of the nominal HEW to warning count plus the calculated sync delta. At the end of the actual HEW to warning count, the Message Generator will transmit the first byte of the sync System State message directly to the Transmitter 30 through the Transmitter Interface 960. The Transmitter then will initiate the transmission of the System State message at the end of the System State message warning period and will reinstate the buffer available signal, as indicated, causing the Transmitter Interface to transmit the remaining sync System State message to the Transmitter 30. The transmission of the last byte of the System State message defines the end of the Atomic period. The adding of the sync delta to the nominal HEW to warning period corrects the length of the Atomic period so that its ending should coincide with the ends of the Atomic periods generated by the other Nodes in the system, thus, establishing point-to-point synchronization with all of the other Nodes.
The operation of the Synchronizer in a cold start mode is discussed relative to the flow diagram shown in Figures 86 through 89 and the timing diagram shown in Figure 90.
Referring first to Figure 86, the cold start procedure begins by inquiring if the Synchronizer is in the cold start mode, as indicated by decision block 1100. The cold start is indicated by the absence of the warm start flag and the operation flag. If the Synchronizer is not in the cold start mode of operation it will inquire whether it is in the warm start mode of operation, as indicated by decision block 1102. If the Synchronizer is in the warm start mode of operation, as indicated by the warm start flag being true, the Synchronizer will call the warm start procedure, as indicated by block 1105. Otherwise, the Synchronizer will exit the cold start routine and default to the steady state mode of operation, as indicated by block 1103.
If the Synchronizer is in the cold start mode of operation, the Synchronizer will listen for messages from the other Nodes over the listening period shown in Figure 90. The Synchronizer will then inquire, as indicated by decision block 1104, if the message received is a sync or a pre-sync message, as determined from the function bits contained in the message. If the message is a pre-sync message, the message is time stamped, as indicated by block 1106, and the Pre-Sync Message Counter 980 is incremented, as indicated by block 1108. The In Sync With Detector 994 will then inquire if the time stamp of the received message minus the time stamp of its own message is less than the hard error window, as indicated by decision block 1110. If the difference between the time stamp of the received message and the time stamp of its own message is less than the hard error window, the "in-sync-with" flag corresponding to the Node from which the message was received is set to true, as indicated by block 1112. Otherwise, if the difference between the time stamp of the received message and the time stamp of its own message is greater than the hard error window, the "in-sync-with" flag in the In-Sync-With Register 996 is set to false, as indicated by block 1114.
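The in-sync-with test of blocks 1110 through 1114 can be sketched as follows (a minimal model, assuming the register is a simple per-Node table and that the comparison is on the magnitude of the time stamp difference):

```python
def update_in_sync_with(isw_register, node_id, own_stamp, received_stamp,
                        hard_error_window):
    """Blocks 1110-1114: set the 'in-sync-with' flag for node_id.

    The flag is true only when the received message's time stamp falls
    within the hard error window of this Node's own message.
    """
    isw_register[node_id] = abs(received_stamp - own_stamp) < hard_error_window
    return isw_register

reg = {}
update_in_sync_with(reg, 3, 100, 104, 8)   # within the window
update_in_sync_with(reg, 5, 100, 140, 8)   # outside the window
```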
Returning now to decision block 1104, if the sync/pre-sync function bit contained in the received message is a sync bit, the Time Stamper will time stamp the received message and set the TS flag to sync and the update flag to true, as indicated by block 1116. The Synchronizer will then increment the Sync Message Counter 982, as indicated by block 1118.
The operation of the Sync Correction Generator 990 and Adder 992 shall be explained with reference to the flow diagram illustrated in Figure 87. The Sync Correction Generator 990 first inquires, as indicated by decision block 1120, if the listening period is done. The listening period during a cold start is equal to a full Atomic period plus the hard error window, as indicated in Figure 90. During this phase of the operation, the Within Hard Error Window and Soft Error Window Checker 976 will not generate any error signals, in response to the Warm Start Flag and the Operating Flag being set to false.
Once the listening period is over, the Sync Correction Generator 990 will inquire if the number of pre-sync counts stored in the Pre-Sync Message Counter 980 is equal to the number of sync counts stored in the Sync Message Counter 982. If the pre-sync count is equal to the sync count, the Sync Correction Generator will set the Subatomic period delta equal to zero and the sync delta equal to zero, as indicated by block 1148. If the pre-sync count is not equal to the sync count, the Sync Correction Generator 990 will then inquire if the pre-sync count is greater than the sync count, as indicated by block 1124. If the pre-sync count is greater than the sync count, the Time Stamp Voter 956 will extract from the Time Stamp Memory all of the time stamps for which the TS flag is set to pre-sync and the update flag is set to true. The Time Stamp Voter 956 will then generate a voted Subatomic period count and a voted sync clock count using the extracted values. The Sync Correction Generator 990 will then subtract its own Subatomic period count from the voted Subatomic period count to generate the SAP delta and will subtract its own sync clock count from the voted sync clock count to generate a sync delta, as indicated by block 1126.
Alternatively, if the sync count is greater than the pre-sync count, the Time Stamp Voter 956 will generate a SAP delta and a sync delta using the time stamps having their TS flag set to sync and the update flag equal to true, as indicated by block 1146.
If the Subatomic period delta is equal to zero, as indicated by decision block 1127, then the Sync Correction Generator 990 will set the Subatomic period delta equal to zero and the sync delta equal to the computed sync delta, as indicated in block 1129. The Sync Correction Generator 990 will then inquire if the sync delta is greater than the maximum sync delta, as indicated by decision block 1132. If it is, the Sync Correction Generator will set the sync delta equal to the maximum sync delta stored in the Scratch Pad Memory 962, as indicated in Figure 75. If the sync delta is not greater than the maximum sync delta, as determined by decision block 1132, the Sync Correction Generator will inquire if
the sync delta is greater than the two's complement of the maximum sync delta, as indicated by decision block 1136. If the sync delta is greater than the two's complement of the maximum sync delta, the Sync Correction Generator 990 will set the sync delta equal to the two's complement of the maximum sync delta, as indicated by block 1138. Otherwise, the sync delta will remain the computed sync delta.
Returning now to decision block 1127, if the Subatomic period delta is not equal to zero then the Sync Correction Generator 990 will inquire if the Subatomic period delta is greater than zero, as indicated by decision block 1128. If the Subatomic period delta is greater than zero, the Sync Correction Generator will set the Subatomic period delta equal to the Subatomic period delta minus 1 and the sync delta equal to the maximum sync delta, as indicated in block 1130. Otherwise, the Sync Correction Generator will set the Subatomic period delta equal to the Subatomic period delta plus 1 and the sync delta equal to the two's complement of the maximum sync delta, as indicated by block 1144.
Once the Subatomic period delta and the sync delta are determined, the actual Subatomic period per Atomic period count is generated by adding the Subatomic period delta to the nominal Subatomic period per Atomic period count, as indicated in block 1140. The actual HEW to warning period is generated by adding the sync delta to the nominal HEW to warning period in the Adder 992, as indicated by block 1141. The actual Subatomic period per Atomic period count and the actual HEW to warning period counts are stored in the Scratch Pad Memory 962, in the locations identified in Figure 75. The final operation of the Sync Correction Generator 990 is to set the Pre-Sync Message Counter 980 and the Sync Message Counter 982 to zero, as indicated in block 1142.
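Taken together, blocks 1122 through 1144 reduce to a clamping rule. The sketch below is illustrative, and represents the two's complement of the maximum sync delta as simple negation:

```python
def sync_correction(sap_delta, computed_sync_delta, max_sync_delta):
    """Sketch of blocks 1127-1144 of Figure 87.

    With no Subatomic period slip, the computed sync delta is clamped
    to +/- max_sync_delta.  With slip, one Subatomic period is absorbed
    per Atomic period while the sync delta is pinned at its limit.
    """
    if sap_delta == 0:
        clamped = max(-max_sync_delta, min(computed_sync_delta, max_sync_delta))
        return 0, clamped
    if sap_delta > 0:
        return sap_delta - 1, max_sync_delta
    return sap_delta + 1, -max_sync_delta
```

The effect of the limits is that a large synchronization error is corrected gradually over several Atomic periods rather than in one step.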
The operation of the Data Handler 968 during the cold start mode of operation is shown in the flow diagram of Figure 88. As each message is received from the Message Checker Interface 942, the Data Handler inquires if the sync/pre-sync bit is a sync
bit, as indicated by decision block 1150. If the sync/pre-sync function bit contained in the message from Node j is a sync bit, the Data Handler, as indicated by block 1152, will store the "in-sync-with" vector of the received message in the In-Sync-With matrix (row j) contained in the Scratch Pad Memory, as shown in Figure 75. However, if the sync/pre-sync function bit contained in the message is a pre-sync bit, the In-Sync-With matrix contained in the pre-sync message is stored in the Byzantine Voter Memory 970, as indicated by block 1154.
The determination of a potential operating set of Nodes and the setting of the Operating and Warm Start Flags shall be discussed relative to the flow diagram shown in Figure 89. The Byzantine Voter 954 will wait until the listening period is over, as indicated by decision block 1156, then execute a byzantine vote using the In-Sync-With matrix stored in the Byzantine Voter Memory 970, as indicated by block 1157. Since each Node sends an In-Sync-With matrix which is stored in the Byzantine Voter Memory, these In-Sync-With matrices form a three-dimensional cube of "in-sync-with" vectors, as shown in Figure 94. The Byzantine Voter makes a first vote through the In-Sync-With matrix, as shown by the arrow 1204 in Figure 94, which will reduce the three-dimensional matrix to a two-dimensional matrix, as shown in Figure 95. The Byzantine Voter 954 will then take a second vote in the direction of the arrow 1206, shown in Figure 95, to generate a Byzantine Agreement as to which Nodes are in synchronization with each other. The Byzantine Agreement is then forwarded to the Operating Condition Detector 1000 as a potential operating set (POS), as indicated by block 1158. The Operating Condition Detector 1000 will then compare the number of Nodes in the potential operating set with the minimum number of Nodes required for start up, as indicated by decision block 1160. If the number of Nodes in the potential operating set is less than the minimum start up size, the Operating Condition Detector will set the Warm Start Flag 1002 and the Operating Flag 1004 to false, as indicated by block 1161. However, if the number of Nodes in the potential operating set is equal to or greater than the minimum start up size, the Operating
Condition Detector 1000 will then determine if its own Node is in the potential operating set, as indicated by block 1162. If the Node is in the potential operating set, the Operating Condition Detector will set the Operating Flag equal to true, as indicated by block 1164, and then send the potential operating set to the Fault Tolerator along with the Operating Flag, as indicated by block 1166. If the Node is not within the potential operating set, the Operating Condition Detector will set the Warm Start Flag 1002 to true, as indicated in block 1168. The setting of the Warm Start Flag to true will switch the operation of the Synchronizer from the cold start mode to the warm start mode, as indicated by block 1105 in Figure 86. The potential operating set and the Operating Flag transmitted to the Fault Tolerator are transferred to the Scheduler 40 and initiate the operation of the Operations Controller.
Figure 90 shows the operation of the Synchronizer during a cold start. At the beginning of the cold start, each Synchronizer will transmit an initial sync System State message in which the "in-sync-with" vector is all zeros. The Time Stamper 972 will then time stamp the System State messages received from all of the other Nodes in the system during the time stamp listening period, which is equal to an Atomic period plus the hard error window interval, as indicated. During this period, the Synchronizer will count the number of Subatomic periods which elapsed from the initial sync System State message and will send a pre-sync System State message at the appropriate time. In this first pre-sync message the In-Sync-With matrices are all zeros since the Synchronizer has not received any in-sync-with vectors from the other Nodes at this point in time. At the end of the hard error window following the transmission of the pre-sync System State message, the Synchronizer will process the received time stamps, generate the required SAP delta and sync deltas, and adjust the interval between the pre-sync message and the next sync message. The Synchronizer will also compare its time stamp with the voted time stamp and determine which Nodes it is in synchronization with. At the end of the adjusted interval, the Synchronizer will again transmit a sync message which contains the generated
"in-sync-with" vector. During the interval from the preceding pre-sync message to the following pre-sync message the Synchronizer will collect and store the "in-sync-with" vectors received from the other Nodes in the Scratch Pad Memory and assemble an "In-Sync-With" matrix in the Scratch Pad Memory.
The Synchronizer then will count the nominal number of Subatomic periods per Atomic period and will generate the special "cold-start" pre-sync System State message which contains the In-Sync-With matrix assembled in the Scratch Pad Memory. During the listening period preceding the sending of the second pre-sync System State message, the Synchronizer will time stamp all of the sync messages received from the other Nodes. In the hard error window interval on either side of the transmission of the second pre-sync System State message, the Synchronizer will collect the In-Sync-With matrices transmitted by the other Nodes and store them in the Byzantine Voter Memory 970. After the end of the hard error window, the Synchronizer will compute the sync correction for the interval between the pre-sync and the next sync message to effect synchronization between the Nodes. It will then determine its own "in-sync-with" vector and perform a byzantine vote on the In-Sync-With matrices stored in the Byzantine Voter Memory.
During this processing interval immediately following the end of the HEW interval, the Synchronizer will also test to determine if a potential operating set exists and whether or not its own Node is included in the potential operating set.
At the end of the adjusted synchronization interval the Synchronizer will once again transmit a sync System State message which will include its own "in-sync-with" vector. It will also assemble a new "in-sync-with" matrix from the "in-sync-with" vectors generated by the other Nodes between the second and the third pre-sync System State message. This process is repeated until a potential operating set is determined by the result of the byzantine vote on the In-Sync-With matrices stored in the Byzantine Voter Memory.
The operation of the Synchronizer during a warm start
will be discussed relative to the flow diagram shown in Figure 91 and the timing diagram shown in Figure 92. During the warm start, the Synchronizer recognizes the existence of a potential operating set, and its main function is to establish synchronization with that operating set.
Referring now to Figure 91, the warm start begins by the detection of the warm start flag, as indicated by decision block 1170. If the warm start flag is true, the Time Stamper will time stamp each received message, as indicated by block 1172. The In Sync With Detector 994 will then determine if it is "in-sync-with" any of the other Nodes, as indicated by decision block 1174, in which the difference between the Node's own time stamp and the time stamp for each received message is compared with the hard error window. If the difference between the Node's own time stamp and the time stamp of the received message is less than the hard error window interval the "in-sync-with" flag in the ISW Register 996 is set to true for each Node in which this occurs, as indicated by block 1176. If the difference between its own time stamp and the time stamp of the received message is greater than the hard error window interval, the ISW Detector 994 will set the "in-sync-with" bit for that particular Node stored in the ISW Register 996 to false, as indicated by block 1178.
During the warm start the Synchronizer will time stamp all of the System State messages received during a listening period which is equal to one Atomic period plus the hard error window interval. This is the same listening period used during a cold start. When the listening period is done, as indicated by decision block 1180, the Synchronizer will compute the sync correction which will adjust the length of the Atomic period between the pre-sync and the next sync System State message, as indicated by block 1184. The computation of this correction is the same as the computation used during a cold start. If the Operating Condition Detector 1000 concludes that its own Node is in synchronization with the existing operating set of Nodes, the Operating Condition Detector will set the operating flag equal to true and the warm start flag equal to false, as indicated by block
1188, and then it will send the "in-sync-with" vector and the operating flag to the Fault Tolerator 36, as indicated by block 1190. The Fault Tolerator 36 will use this in-sync-with vector as its initial system state vector during the subsequent start up operations.
Referring now to the timing diagram shown in Figure 92, during the warm start period the Synchronizer will only transmit sync and pre-sync System State messages in an alternating sequence. In the processing interval following the hard error window associated with each sync and pre-sync System State message, the Synchronizer will compute the sync correction to adjust the length of the Atomic period following the pre-sync message to effect synchronization with the existing operating set. It will also generate its own local "in-sync-with" vector during the same processing interval and test this "in-sync-with" vector to determine if its own Node is in synchronization with the operating set. If its own Node is in synchronization with the existing operating set, the Synchronizer will then go to the operating state and will exit the warm start state. As shown in Figure 92, this process is repeated until the Synchronizer is in sync with the existing operating set.
The Synchronizer also performs a byzantine vote on the task completed vector and the branch condition bits, as previously described with reference to the Scheduler 40. The task completed vector and the branch condition bits are embodied as separate bytes in the Task Interactive Consistency and the System State messages and are stored in the Message Memory 964.
Referring now to Figure 93, at the end of each hard error window, as indicated by block 1192, the Synchronizer will transfer the task completed vectors from the Message Memory 964 to the Byzantine Voter Memory 970, as indicated by block 1194. After all the task completed vectors are transferred to the Byzantine Voter Memory the Byzantine Voter 954 will execute a byzantine vote on all of the transferred task completed vectors and generate a voted task completed (TC) vector, as indicated by block 1196. The Synchronizer will then transfer the branch condition bits to the
Byzantine Voter Memory 970, as indicated by block 1198, then execute a byzantine vote to generate voted branch condition bits, as indicated by block 1200. The Byzantine Voter 954 will then pass the voted task completed vector and the voted branch condition bits to the Scheduler, as indicated by block 1202. This assures that the Scheduler in each Node will record the task completed in a consistent fault tolerant manner.
BYZANTINE VOTER
The function of the Byzantine Voter is to guarantee consistency among the Nodes in reaching agreement on certain critical matters. The reliability of a distributed fault-tolerant system depends on the ability of all the non-faulty Nodes to reach a consistent agreement despite the presence of one or more faulty Nodes. Since all failure modes of the faulty Node cannot be enumerated, any mechanism for achieving agreement must be provably correct in the presence of arbitrary failures.
The problem of reaching agreement was originally expounded by analogy to the several divisions of the Byzantine army encamped around an enemy city, as described by Lamport, L., Shostak, R., and Pease, M., "The Byzantine Generals Problem," ACM TOPLAS, Volume 4, Number 3, July 1982, and "Reaching Agreement in the Presence of Faults," JACM, Volume 27, Number 2, April 1980. In the Byzantine army analogy, each division is commanded by a general (Node) which can communicate with the other generals via messengers (communication links). The generals need to reach a consistent decision about whether to attack or retreat. Some of the generals may be traitors who will attempt to confuse the other generals. Since all possible failure modes must be considered, a traitorous general is permitted to lie, to send different messages to different generals, to tamper with relayed messages, to act in collusion with other traitors, or otherwise to act in a pernicious manner.
The system state which guarantees system consistency is referred to as a Byzantine Agreement, and is defined by two conditions.
1. Agreement: All loyal generals agree on the contents of every message sent.
2. Validity: If the sending general is loyal, then all loyal receiving generals agree on the content of his messages as originally sent.
These Agreement conditions embody three important concepts. First, if the sending general is a traitor the specific decision made by the loyal generals is immaterial provided they all make the same decision. Second, reaching agreement does not require identification of the traitors. Third, no assumptions have been made restricting the traitor's behavior.
In order to guarantee Byzantine Agreement regarding a given message, one or more synchronous rounds of transmission are required. During each round, every general broadcasts a copy of every message received during the previous round. Agreement can be guaranteed in the presence of one traitor if there are at least four generals and two rounds of messages are transmitted.
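The four-general, two-round case can be illustrated with a small sketch (hypothetical names; the loyal lieutenants are assumed to relay the commander's value intact):

```python
from collections import Counter

def lieutenant_decision(direct, relayed):
    """A loyal lieutenant's decision after two rounds: a majority vote
    over the value received directly from the commander and the copies
    relayed by the other lieutenants."""
    return Counter([direct] + relayed).most_common(1)[0][0]

# A traitorous commander tells L1 and L2 to attack but L3 to retreat.
# In round two each loyal lieutenant relays what it was told, so all
# three vote on the same multiset of values and reach agreement.
d1 = lieutenant_decision('attack', ['attack', 'retreat'])
d2 = lieutenant_decision('attack', ['retreat', 'attack'])
d3 = lieutenant_decision('retreat', ['attack', 'attack'])
```

Note that the loyal lieutenants agree even though one of them was told the opposite order directly, which is exactly the Agreement condition above.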
For numerical data, it is also possible to define a state of approximate agreement as meeting two similar conditions.
1. Agreement: All non-faulty Nodes eventually agree on the values that are within some small difference of each other.
2. Validity: The voted value obtained by each non-faulty Node must be within the range of initial values generated by the non-faulty Nodes.
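One common realization of approximate agreement (not spelled out in the text, so this is an illustrative sketch) discards the f largest and f smallest received values and takes the midpoint of what remains:

```python
def approx_agreement_step(values, f):
    """One voting round tolerating up to f faulty Nodes.

    Discarding the f highest and f lowest values guarantees the result
    lies within the range of the non-faulty Nodes' initial values.
    """
    trimmed = sorted(values)[f:len(values) - f]
    return (trimmed[0] + trimmed[-1]) / 2

# One faulty Node reports a wild value; the vote stays inside the
# range [10, 12] spanned by the non-faulty values.
voted = approx_agreement_step([10, 11, 12, 1000], f=1)
```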
The details of the Byzantine Voter are shown in Figure
96 and will be discussed with reference to the "in-sync-with" matrices and vectors shown in Figures 94 and 95, respectively. It is to be recognized that the Byzantine Voter discussed here is not limited to the voting on the "in-sync-with" vectors, the task completed vectors, or the binary bits as applied to the instant application.
As previously discussed, each Synchronizer will generate its own "in-sync-with" vector which is transmitted to all of the other Nodes in the System State messages. Each Node will store the "in-sync-with" vectors received from all of the other Nodes in the Scratch Pad Memory 962 to form an "in-sync-with" matrix, as shown in Figure 75. During the cold start mode of operation this "in-sync-with" matrix, as shown in Figure 85, is transmitted with each pre-sync System State message to all of the other Nodes in the system. Each Synchronizer will then store each of these "in-sync-with" matrices in the Byzantine Voter Memory to form a three-dimensional cube, as shown in Figure 94. This constitutes the two rounds of transmission required for a Byzantine Agreement.
The Byzantine Voter will first vote on the value of each "in-sync-with" bit of the matrix longitudinally through the matrix, as indicated by the direction of the arrow 1204 in Figure 94. The first vote will reduce the three-dimensional cube to a two-dimensional matrix, as shown in Figure 95, where each "in-sync-with" bits is the voted value of the first vote. The Byzantine Voter 954 will then v.ote on the values of the "in-sync-with" bit in each column of the "in-sync-with" matrix shown in Figure 95. The direction of the second vote by the Byzantine Voter is indicated by arrow 1206. The result of the second vote will be a Byzantine Agreement of the individual "in-sync-with" bits for each Node in the system, which is transmitted to the Operating Condition Detector 1000, as shown in Figure 77. The circuit details of the Byzantine Voter are shown in Figure 96.
Referring now to Figure 96, the Data Handler 968 will load the data to be voted on by the Byzantine Voter into the Byzantine Voter Memory 970. A Byzantine Voter Control Logic 1230 will activate an Address Generator 1210 in response to the message type and the operating flags. As previously discussed, the Byzantine Voter will vote on the task completed and branch condition vectors contained in the Task Interactive Consistency messages and the "in-sync-with" matrices contained in the pre-sync System State messages during the cold start mode of operation.
The Address Generator 1210 will address the Byzantine Voter Memory in the appropriate manner and store the addressed data in a Data Register 1208. Each bit in the Data Register 1208 is applied to one input of a plurality of AND gates 1212 through 1226. Each AND gate receives a respective one of the data bits stored in the Data Register 1208. A Decoder 1228, responsive to the addresses being generated by the Address Generator, will selectively deactivate one of the AND gates which corresponds to the "in-sync-with" bit generated by each Node with respect to itself, as indicated by the X's in the blocks shown in Figure 95.
A Message Counter 1284 monitors the number of vectors or matrices loaded into the Byzantine Voter Memory 970 and generates a two's complement value corresponding to one half of the number of vectors or matrices loaded into the Byzantine Voter Memory 970. This value is loaded into a plurality of Accumulators 1264 through 1278 through a like plurality of 2:1 Multiplexers 1232 through 1246 and Adders 1248 through 1262. Under the control of the Byzantine Voter Control Logic, the 2:1 Multiplexers 1232 through 1246 are then switched to the outputs of the AND gates 1212 through 1226 and the content stored in the Data Register is added to the amount stored in the Accumulators. The Byzantine Voter Control Logic will then load the data from the Byzantine Voter Memory into the Data Register 1208 in accordance with the message type and the operating flags. For example, if the byzantine vote is being taken among the "in-sync-with" matrices generated during the cold start, the Address Generator 1210 will sequentially load the "in-sync-with" vector for Node 0 from the matrix transmitted by Node 0 and then sequentially from the matrices transmitted by Node 1 through Node 7. During the adding of the bit value in each of the Adders 1248 through 1262 to the amount stored in the
Accumulators 1264 through 1278, an overflow bit will be generated when the sum is greater than zero. Any overflow bits generated during the addition process will be stored in a Byzantine Voted Value Register 1280. After the data from the matrix from Node 7 is processed, the content of the Byzantine Voted Value Register is passed to a 1:3 Multiplexer 1282 which stores this data in the Byzantine Voter Memory 970 to form the two-dimensional matrix,
such as shown in Figure 95.
The Address Generator will then index and will process the "in-sync-with" vectors from Node 1 from the matrices from Node 0 through Node 7, as previously done with regard to the "in-sync-with" vectors of Node 0. First, the Message Counter 1284 will load the Accumulators with a two's complement value corresponding to one half of the number of matrices that will be processed. The Address Generator will then load the "in-sync-with" vectors generated by Node 1 taken from the matrices received from Node 0 through Node 7, as previously described. Again, the overflow bits, signifying that the result of the addition exceeds zero, are stored in the Byzantine Voted Value Register 1280, the content of which is stored back into the Byzantine Voter Memory 970 through the Multiplexer 1282. This process is repeated for the "in-sync-with" vector generated by each of the Nodes and is terminated when the vectors from Node 7 are completely processed and all the Byzantine Voted Values are stored back into the Byzantine Voter Memory 970, forming the two-dimensional matrix shown in Figure 95.
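The preload-and-overflow scheme described above amounts to a strict-majority test. This software model is illustrative of the mechanism, not a gate-level description:

```python
def overflow_majority(bits):
    """Model of one Accumulator/Adder slice of Figure 96.

    The Message Counter preloads the accumulator with the two's
    complement of half the number of inputs; after the data bits are
    added, the sum exceeds zero (the 'overflow' condition) only when
    more than half of the bits are 1.
    """
    acc = -(len(bits) // 2)   # two's-complement preload
    for b in bits:
        acc += b              # Adders 1248-1262 accumulate each bit
    return 1 if acc > 0 else 0

votes_majority = overflow_majority([1, 1, 1, 0])   # three of four set
votes_minority = overflow_majority([1, 0, 0, 0])   # one of four set
```

The design avoids a separate comparator: the sign/overflow output of the final addition directly yields the voted bit.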
After the first vote has been completed on all of the
"in-sync-with" vectors stored in the "in-sync-with" matrices from all of the Nodes, the Byzantine Voter Control Logic 1230 will initiate the second vote in which the voting is taken down the columns, as indicated by arrow 1206 in Figure 95. During the second vote the Address Generator 1210 will load the column for Node 0 into the Data Register 1208. The Message Counter again will load the two's complement corresponding to the value of one half the number of bits to be processed by the Byzantine Voter into the Accumulators 1264 through 1278. The Adders will then add the bits stored in the Data Register to the value stored in the Accumulators 1264 through 1278. This process is repeated until the columns for all of the Nodes have been processed. Again, the overflow bits from the Adders 1248 through 1262 are stored in the Byzantine Voted Value Register 1280. The Byzantine Voter Control Logic 1230 will then activate the 1:3 Multiplexer to pass the
"in-sync-with" vector stored in the Byzantine Voted Value Register
to the Operating Condition Detector 1000, as previously described. This "in-sync-with" vector represents a Byzantine Agreement on which Nodes are in synchronization with each other.
When the Byzantine Voter is voting on the task completed vectors and the branch condition bits contained in the Task Interactive Consistency and System State messages, the Data Handler will load these values into the Byzantine Voter Memory 970. The Byzantine Voter Control Logic 1230 will then activate the Address Generator 1210 to load the columns of the task completed vectors into the Data Register 1208, as previously described with reference to the second vote on the "in-sync-with" vectors. The voting process is then identical to the voting process for the second vote on the "in-sync-with" vectors and the voted value is loaded into the Byzantine Voted Value Register from the overflow outputs of the Adders 1248 through 1262. The Byzantine Voter Control Logic 1230 will then activate the 1:3 Multiplexer 1282 to pass the voted task completed vector and the voted branch condition bits to the Scheduler Interface 944, as previously described.
The Operations Controller and the subsystems discussed herein represent a distributed multi-computer fault-tolerant architecture based on the functional and physical partitioning of the application task and the overhead functions. It is not intended that the invention be limited to the structures illustrated and discussed herein. It is known that those skilled in the art are capable of making changes and improvements within the spirit of this invention as described above and set forth in the appended claims.
What is claimed is:
Claims
1. In a fault tolerant multiple node processing system wherein each node has an applications processor for executing a predetermined set of tasks and an operations controller for establishing and maintaining its own node in synchronization with every other node in the system, for controlling the operation of its own node, and for selecting the tasks to be executed by its own application processor in coordination with all of the other nodes in the system through the exchange of inter-node messages with all of the other nodes in the system, an operations controller comprising: a transmitter for transmitting all of the inter-node messages generated by its own operations controller to all the nodes in the system including its own node over a private communication link, said transmitter having an arbitrator for deciding the order in which said inter-node messages are to be transmitted when two or more messages are ready for transmission at the same time; a plurality of receivers, each receiver associated with a respective one of said plurality of nodes and only receiving messages from that node; a message checker for checking each received message for physical and logical errors to generate an internal error report containing an error status byte identifying each detected error, said message checker polling each of said receivers to unload the received messages in a repetitive sequence; a voter subsystem having a voter for voting on the content of all error free messages containing the same information to generate a voted value and a deviance checker for generating an internal error report containing a deviance vector identifying each node which sent a message used in the generation of said voted value whose content differed from the voted value by more than a predetermined deviance value; a fault tolerator for passing all error free messages received from said message checker to said voter subsystem, for generating an inter-node error message containing all of said 
error reports accumulated by all the subsystems, for generating a base penalty count for each node in the system based on the number of errors detected and the severity of the detected errors identified in said internal error reports, for globally verifying the base penalty count for each node through the exchange of inter-node base penalty count messages, and for generating a system state vector identifying each node whose base penalty count exceeds a predetermined exclusion threshold; a task scheduler for selecting the next task to be executed by the node's own applications processor from an active task list, for replicating the scheduling of other nodes in the system, for maintaining a global data base of the scheduling and execution of tasks by each node through the exchange of task completed/started messages, and for generating an error report identifying each node whose scheduling process differs from the scheduling process replicated for that node, said task scheduler further having means to reconfigure said active task list in response to said system state vector generated by the fault tolerator indicating a change in the number of non-excluded nodes; a data memory; a task communicator for storing said voted values in said data memory, for passing the identity of the task selected by the scheduler to the applications processor, for extracting the voted values required for the execution of the selected task and passing them to the applications processor, for generating said task completed/started messages identifying the task just completed and the new task started by the applications processor, and for generating inter-node data value messages containing the data values generated by the applications processor in the execution of the selected tasks; and a synchronizer for synchronizing the operation of its own node with all of the other non-faulty nodes in the system through the exchange of inter-node time-dependent messages, said synchronizer storing a time stamp 
signifying the local time at which each time-dependent message was received and correcting the synchronization of its own node based on the difference between the time stamp of its own time-dependent message and a voted time stamp derived from the time stamps for all the time-dependent messages received within a predetermined time window.
2. The operations controller of Claim 1 wherein said transmitter comprises: a first interface for receiving the messages generated by said fault tolerator; a second interface for receiving the messages generated by said task communicator; a synchronizer interface for receiving said inter-node time-dependent messages generated by said synchronizer; an arbitrator responsive to said first, second, and synchronizer interfaces receiving messages to be transmitted for arbitrating the priorities of these messages and to generate a transmit signal identifying which message is to be transmitted, said arbitrator delaying the transmission of the inter-node messages received by said first and second interfaces if their transmission will interfere with the transmission of said time-dependent messages; a parallel-to-serial converter for converting said inter-node messages to a serial format for transmission over the private communication link to all of the nodes in the said processing system; and a first multiplexer for passing the inter-node message stored in one of the said first, second, and synchronizer interfaces to said parallel-to-serial converter in response to said transmit signal generated by said arbitrator.
3. The operations controller of Claim 2 wherein each of said inter-node messages has a message type code identifying the type of information contained in the message and a data identification code which uniquely identifies the particular data value contained in the message, said message checker comprising: sequencer means for context switching among said plurality of receivers in a predetermined sequence to unload the received inter-node messages from the receivers; a context storage for storing the relevant information pertaining to the message being processed, said context storage having one entry for each node, each of said entries storing at least the message type code, the data identification code, a byte count which identifies the specific byte being processed and an error status byte; error check logic means for checking the node identification code contained in the message with reference to the expected node identification code associated with the receiver which received the message, for checking the message type code, for checking the data identification code against a maximum data identification code, and for checking the number of bytes contained in the message, said error check logic means recording all detected errors in said error status byte; between limits checker means for checking the data values contained in the messages against predetermined maximum and minimum limit values and for reporting an exceeds limit error to said error check logic means whenever the data value contained in a message is not within said maximum and minimum limit values, said error check logic means recording said exceeds limits error in said error status byte; and multiplexer means for passing on each received message to said fault tolerator for further processing, said multiplexer means appending to each message, as it is passed on, an error report containing said error status byte currently stored in said context storage.
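A minimal sketch of the identity and between-limits checks recited above, with each detected error recorded as a bit in a per-node error status byte. The bit assignments and limit values are illustrative only; the claim does not fix them.

```python
# Assumed bit assignments for the error status byte (not from the patent).
ERR_EXCEEDS_LIMITS = 0x01
ERR_BAD_NODE_ID = 0x02

def check_message(value, lo, hi, node_id, expected_node_id, status=0):
    """Return the updated error status byte for one received message."""
    if not (lo <= value <= hi):        # between-limits check
        status |= ERR_EXCEEDS_LIMITS
    if node_id != expected_node_id:    # node identification code check
        status |= ERR_BAD_NODE_ID
    return status
```

The status byte is then appended to the message as it is passed on, so downstream subsystems see the accumulated error flags.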
4. The operations controller of Claim 3 wherein said fault tolerator comprises: a data memory for storing the content of all error free inter-node messages received from said message checker; an error file for storing the content of all of the received error reports; error handler means for storing the error reports received from said plurality of subsystems in said error file and for generating a base penalty count for each node from the content of said error file, said base penalty count being indicative of the operational status of that particular node, said error handler means further having means for determining which nodes are faulty and for excluding such faulty nodes, in coordination with all of the other nodes in the system, from participating in the operation of said multiple node processing system through the exchange of said inter-node messages, said inter-node messages including error messages, each containing the content of said error file for a particular node and a base penalty count message containing the base penalty count of each node; and interface means for storing in said data memory all of the error free messages passed by said message checker, for passing the identities of the faulty nodes to said scheduler and said synchronizer, and for passing all error reports to said error handler.
5. The operations controller of Claim 4 wherein said voter subsystem comprises: an upper medial value sorter for sorting the data values received from said fault tolerator to generate an upper medial value; a lower medial value sorter for sorting the data values received from said fault tolerator to generate a lower medial value; averaging means for averaging said upper and lower medial values to generate said voted value; deviance checker means for comparing in parallel the content of each received message with said upper and lower medial values to generate said deviance vector; and loader means for loading the content of the messages received from said fault tolerator into said upper and lower medial value sorters and said deviance checker means.
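The medial-value vote and deviance check of Claims 1 and 5 can be sketched as follows. The choice of the two middle values as upper and lower medials, integer averaging, and comparison of deviance against the voted value (rather than against each medial separately) are assumptions for illustration:

```python
def vote(values, deviance):
    """Medial-value vote over one round of node values (illustrative)."""
    s = sorted(values)
    n = len(s)
    upper = s[n // 2]             # upper medial value
    lower = s[(n - 1) // 2]       # lower medial value
    voted = (upper + lower) // 2  # average of the two medial values
    # Deviance vector: one entry per node whose value strayed too far
    # from the voted value.
    dev = [abs(v - voted) > deviance for v in values]
    return voted, dev
```

Because the vote discards the extremes, a single faulty node (the `50` below) cannot drag the voted value away from the healthy majority, and the deviance vector flags it for the fault tolerator.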
6. The operations controller of Claim 5 wherein said task scheduler comprises: a task activity list containing an entry for each active task in said multiple node processing system, each entry containing an execution periodicity and a node allocation for that task; a priority scan list containing a selected portion of the active tasks in said task activity list which are available for execution, said selected portion of said active tasks being stored in their preferred order of execution; a completion status list storing the same selected portion of said active tasks stored in said priority scan list; a selection queue storing for each node the active tasks ready for selection in their preferred order of execution; a period counter for counting said fundamental timing periods to generate a period count corresponding to the number of fundamental periods which have expired since the beginning of a new master period; wake-up sequencer means for interrogating said task activity list to transfer to said priority scan list and said completion status list all of the active tasks whose periodicity is greater than said period count; priority scan means for transferring to said selection queue for each node entry the highest three priority active tasks in said priority scan list which are ready for execution by that node; task selector means for selecting the highest priority active task currently stored in said selection queue for its own node as the next task scheduled for execution by its own applications processor; and a task interactive consistency handler for updating the status of each task in said task activity list, said priority scan list, said completion status list and said selection queue which are identified in inter-node messages reporting the completion of a task.
7. The operations controller of Claim 6 wherein said task communicator comprises: a data memory for storing said voted values, said data memory having at least two partitions identified by a context bit, each partition having a plurality of entries for storing said voted values; a context bit memory for storing a context bit for each data identification code, said context bit identifying said voted values stored in said data memory which are ready for use in the execution of tasks by the applications processor; a task terminated recorder for complementing the context bit in said context bit memory in response to said task terminated signal generated by said task scheduler; a store data control for storing said voted values in said data memory using said message type code, said data identification code and the complement of said context bit associated with the voted value as an address for the appropriate entry in said data memory; a next task register for storing the task identification code of the task selected by the task scheduler for execution by the applications processor; an input FIFO register accessible by the applications processor for storing the identity of the next task to be executed by the applications processor and the voted values required for the execution of said next task; an input handler, responsive to the applications processor completing the preceding task to generate said task completed/started message sent to all of the nodes in the processing system, to transfer the task identification code of said next task to said input FIFO register, and to access said data memory for the voted values required for the execution of said next task, said input handler using said context bits to identify which voted values in said data memory are to be used in the execution of said next task; an output FIFO register for receiving from said applications processor the data values resulting from the execution of each task; and an output handler for generating 
data value messages sent to all the nodes in said multiple node processing system, said data value messages containing the data values stored in said output FIFO register and the identification code for the data values.
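The context-bit double buffering recited in Claim 7 can be modeled as below. The class and method names are hypothetical; the point is that new voted values are written into the complement partition while tasks read the published partition, and complementing the bit at task termination publishes the update atomically.

```python
class DataMemory:
    """Two-partition data memory selected by a per-item context bit (sketch)."""

    def __init__(self, size):
        self.partitions = [[None] * size, [None] * size]
        self.context = [0] * size  # one context bit per data identification code

    def store(self, data_id, voted_value):
        # Store into the complement of the current context bit, so a task
        # in progress keeps reading a stable snapshot.
        self.partitions[1 - self.context[data_id]][data_id] = voted_value

    def read(self, data_id):
        # Readers see only the partition named by the context bit.
        return self.partitions[self.context[data_id]][data_id]

    def task_terminated(self, data_id):
        # Complement the context bit, publishing the latest voted value.
        self.context[data_id] ^= 1
```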
8. The operations controller of Claim 7 wherein said time-dependent messages include alternating sync and pre-sync time-dependent messages, said synchronizer comprising: a message interface for receiving said sync and pre-sync time-dependent messages; counter means for generating a local time; a time stamp memory having one entry for each node in the system, each entry storing a time stamp for the most recent time-dependent message received from the associated node; a time stamper responsive to receiving a time-dependent message from each node for generating a time stamp corresponding to the local time at which said time-dependent message is received and for storing said time stamp in said time stamp memory in the entry associated with the node from which the time-dependent message was received; a time stamp voter for generating a voted time stamp corresponding to a medial value of the time stamps stored in said time stamp memory for said pre-sync time-dependent messages; a sync correction generator for generating a sync delta having a value corresponding to the difference between said voted time stamp and the time stamp of its own pre-sync time-dependent message; means for adding said sync delta to a nominal transmission timing interval to generate an actual transmission timing interval, said nominal transmission timing interval corresponding to a minimal timing interval between the end of the transmission of the sync time-dependent message and the passing of the pre-sync time-dependent message to said transmitter; and message generator means for generating said sync and pre-sync time-dependent messages, said message generator means passing said pre-sync time-dependent messages to the transmitter at the end of said minimal transmission timing interval and passing said sync time-dependent messages to the transmitter at the end of said actual transmission timing interval.
9. In a multiple node processing system having an applications processor (14) for processing a predetermined set of tasks and an operations controller (12) for controlling the operation of its own node, the operations controller (12) having a transmitter (30) for transmitting messages to all of the other nodes (10) in the processing system, a synchronizer (46) for synchronizing the operation of its own node (10) with the operation of all the other nodes (10) in the system through the exchange of inter-node time-dependent messages comprising alternating pre-sync and sync messages separated in time by a nominal timing interval, comprising: a message interface (942) for receiving the sync and pre-sync time-dependent messages; counter means (969, 971) for generating a local time; a time stamp memory (966) having one entry for each node (10) in the system, each entry storing a time stamp for the most recent time-dependent message received from the associated node; a time stamper (972) responsive to receiving a time-dependent message from each node (10) for generating a time stamp corresponding to the local time at which said time-dependent message is received and for storing said time stamp in said time stamp memory (966) in the entry associated with the node (10) from which the time-dependent message was received; a time stamp voter (956) for generating a voted time stamp corresponding to a medial value of the time stamps stored in said time stamp memory (966) for said pre-sync time-dependent messages; a sync correction generator (990) for generating a sync delta having a value corresponding to the difference between said voted time stamp and the time stamp of its own pre-sync time-dependent message; means for adding (992, 962) said sync delta to a nominal transmission timing interval for generating an actual transmission timing interval, said nominal transmission timing interval corresponding to a nominal timing interval between the end of the transmission of the 
sync time-dependent message and the passing of the pre-sync time-dependent message to the transmitter (30); and message generator means (998) for generating said sync and pre-sync time-dependent messages, said message generator means (998) passing said pre-sync time- dependent messages to the transmitter (30) at the end of said nominal transmission timing interval and passing said sync time-dependent messages to the transmitter at the end of said actual transmission timing interval.
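The sync correction of Claims 8 and 9 can be sketched as follows, assuming the voted time stamp is taken as the average of the two medial time stamps (the claims say only "a medial value") and that time is kept as integer counts:

```python
def sync_delta(stamps, own_node, nominal_interval):
    """Compute the sync delta and actual transmission timing interval.

    stamps: {node_id: time stamp of that node's pre-sync message}
    The voted stamp is a medial value of all pre-sync time stamps;
    the delta is the voted stamp minus our own stamp.
    """
    s = sorted(stamps.values())
    n = len(s)
    voted = (s[(n - 1) // 2] + s[n // 2]) // 2  # medial value (assumed form)
    delta = voted - stamps[own_node]
    # The sync message is delayed (or advanced) by the delta, pulling this
    # node toward the ensemble: actual interval = nominal interval + delta.
    return delta, nominal_interval + delta
```

A node that transmitted early (small own stamp) gets a positive delta and stretches its next interval; a late node shortens it, so the ensemble converges.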
10. The synchronizer (46) of Claim 1 having a timing signal generator (950) for generating at least a first timing signal coincident with the end of the transmission of said sync and pre-sync time-dependent messages and a hard error window signal defining an interval of time bracketing said first timing signal during which all of said sync and pre-sync time-dependent messages should be received from the nodes which are in synchronization with its own node, said synchronizer (46) having a hard error window detector (976) for inhibiting said time stamper (972) from recording a time stamp in said time stamp memory (966) for each node (10) whose time-dependent message was not received within said hard error window, and for recording a hard error for each node whose time-dependent message was not received within said hard error window.
11. The synchronizer (46) of Claim 2 wherein said timing signal generator (950) generates a second timing signal which divides said first timing interval into a plurality of sub-intervals, said counter means (969, 971) comprising: a first counter means (971) for counting said second timing signals between said first timing signals to generate a sub-interval count; a second counter means (969) for counting clock pulses between said second timing signals to generate a clock count; and wherein said time stamp comprises a coarse time stamp portion containing said sub-interval count and a fine time stamp portion containing said clock counts.
12. The synchronizer (46) of Claim 3 wherein said time stamp voter (956) generates a voted coarse time stamp corresponding to a medial value of said sub-interval counts stored in said time stamp memory (966) and a voted fine time stamp corresponding to a medial value of said clock counts stored in said time stamp memory (966) and wherein said sync correction generator (990) generates a coarse delta equal to the difference between said voted coarse time stamp and its own coarse time stamp and a fine delta equal to the difference between said voted fine time stamp and the number of clock counts in its own time stamp.
13. The synchronizer (46) of Claim 4 wherein said sync and pre-sync time-dependent messages have the same message type code and are distinguished by function bits contained therein, said synchronizer (46) further having an expected message detector (974) for recording a sequence error for each node (10) which sends a message within the hard error window which has the wrong message type code and for recording a sequence error for each node (10) from which said sync and pre-sync time-dependent messages are not received in an alternating manner, said expected message detector (974) inhibiting said time stamper (972) from recording a time stamp in said time stamp memory (966) for each node (10) which sent a message having a detected sequence error.
14. In a multiple node fault tolerant processing system capable of processing a set of application tasks in which each node (10) has an applications processor (14) for executing a predetermined subset of said set of application tasks and an operations controller (12) for controlling the operation of the node (10) and scheduling the order in which the individual tasks in said predetermined subset of tasks are to be executed by the applications processor (14) through the exchange of inter-node messages containing data and operation information with all of the other nodes (10) in the processing system, the operations controller (12) generating at least two timing period intervals, a fundamental timing period and a master period which is an integer multiple of the fundamental timing period, the master period defining the timing interval during which every task in said predetermined subset of tasks is scheduled for execution by the applications processor at least once, a task scheduler (40) characterized by: a task activity list (444) containing an entry for each active task in said multiple node processing system, each entry containing an execution periodicity and a node allocation for that task; a priority scan list (446) containing a selected portion of the active tasks in the task activity list (444) which are available for execution, said selected portion of said active tasks being stored in their preferred order of execution; a completion status list (438) storing the same selected portion of said active tasks stored in said priority scan list (446); a selection queue (450) storing for each node the active tasks ready for selection in their preferred order of execution; a period counter (442) for counting said fundamental timing periods to generate a period count corresponding to the number of fundamental periods which have expired since the beginning of a new master period; wake-up sequencer means (440) for interrogating said task activity list (444) to transfer 
to said priority scan list (446) and said completion status list (438) all of the active tasks whose periodicity is greater than said period count; priority scan means (448) for transferring to said selection queue (450) for each node entry the highest three priority active tasks which are ready for execution by that node; task selector means (452) for selecting the highest priority active task currently stored in said selection queue for its own node as the next task scheduled for execution by its own applications processor; and a task interactive consistency handler (436) for updating the status of each task in said task activity list (444), said priority scan list (446), said completion status list (438) and said selection queue (450) which are identified in inter-node messages reporting the completion of a task.
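The priority scan of Claim 14, which moves the three highest-priority ready tasks for a node from the priority scan list into that node's selection queue entry, might be modeled as below. The task tuple layout `(task_id, ready, allocated_nodes)` is illustrative; the list is assumed already held in preferred order of execution, as the claim requires.

```python
def fill_selection_queue(priority_scan_list, node, depth=3):
    """Collect up to `depth` highest-priority ready tasks allocated to `node`.

    priority_scan_list: [(task_id, ready, allocated_nodes), ...] in
    preferred (priority) order of execution.
    """
    queue = []
    for task_id, ready, allocated in priority_scan_list:
        if ready and node in allocated:
            queue.append(task_id)
            if len(queue) == depth:  # the claim transfers the top three
                break
    return queue
```

The task selector then simply takes the head of its own node's queue as the next task for the applications processor.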
15. The task scheduler (40) of Claim 1 wherein each task entry of said completion status list (438) has a completion count entry storing the 2's complement of the number of nodes which are scheduled to execute that task, said task interactive consistency handler (436) having means for incrementing said completion count in response to inter-node messages identifying which node completed that task and for setting a terminated flag when said completion count is incremented to zero indicating that the task has been executed by all of the nodes (10) scheduled to execute that task.
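The 2's-complement completion count of Claim 15 is a classic hardware idiom: pre-loading `-n` means the counter's wrap-around to zero, rather than a comparison, signals that all `n` nodes have reported. A sketch, with the 8-bit field width being an assumption:

```python
WIDTH = 8  # assumed counter width
MASK = (1 << WIDTH) - 1

def init_completion_count(n_nodes):
    """Store the 2's complement of the number of scheduled nodes."""
    return (-n_nodes) & MASK

def task_completed(count):
    """Increment on each completion report; a wrap to zero sets the
    terminated flag, meaning every scheduled node has finished."""
    count = (count + 1) & MASK
    return count, count == 0  # (new count, terminated flag)
```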
16. The task scheduler (40) of Claim 1 wherein each task entry of said task activity list (444) and said priority scan list (446) has a predecessor count entry indicative of the number of tasks which must be terminated before it can be executed, said task interactive consistency handler (436) having a successor list storing the identity of all the tasks for which the terminated task is a predecessor, and means responsive to the termination of a task for accessing said successor list to identify each task for which the terminated task is a predecessor, and for decrementing the predecessor count in said task activity list (444) and said priority scan list (446) for each of said identified tasks.
17. The task scheduler (40) of Claim 3 further having an "old task" list (458) storing for each node (10) the task currently being executed by that node, said task interactive consistency handler (436) having means for recording as "used" in said selection queue (450) the highest priority task currently stored for each node (10) which reported it has started a new task and for recording the identity of said highest priority task in said "old task" list (458).
18. The task scheduler (40) of Claim 1 in which the inter-node messages exchanged between the nodes (10) include a task completed/started message and a task interactive consistency message, said task completed/started message is sent to all of the other nodes (10) whenever a node begins a new task, said task completed/started message containing at least the identity of the task started and the identity of the task completed by that node and said task interactive consistency message is sent at predetermined timing intervals and contains a task completed vector identifying each node (10) which sent a task completed/started message, said task completed vector being a voted composite of the task completed/started messages received from all the nodes (10) in said predetermined timing interval, said task scheduler (40) having a started task register (434) for storing the identity of the task reported as started in said task completed/started messages received from that node (10) and said task interactive consistency handler (436) responsive to the task completed vector contained in said task interactive consistency message to compare the identity of the highest priority task stored for each node (10) identified as having completed a task in said task completed vector with the identity of the task stored in said started task register (434) and to generate a sequence error signal when they are not the same.
19. In a multiple node processing system wherein each node has an applications processor for executing a predetermined set of tasks and an operations controller for controlling the operation of its own node in coordination with all of the other nodes in the processing system through the exchange of internode messages, the operations controller having at least three subsystems generating internode messages including a synchronizer which generates time dependent internode messages marking predetermined time intervals used in the synchronization of the multiple nodes with each other and the scheduling of the tasks to be executed by the applications processor, in a coordinated manner, a transmitter for the operations controller comprising: a first interface for receiving the messages generated by a first subsystem; a second interface for receiving the messages generated by a second subsystem; a synchronizer interface for receiving said time dependent internode messages generated by said synchronizer; an arbitrator responsive to said first, second, and synchronizer interfaces receiving messages to be transmitted for arbitrating the priorities of these messages and to generate a transmit signal identifying which message is to be transmitted, said arbitrator delaying the transmission of the messages received by said first and second interfaces if their transmission will interfere with the transmission of said time dependent inter-node messages; a parallel-to-serial converter for converting said internode messages to a serial format for transmission over the private communication link to all of the nodes in the said processing system; and a first multiplexer for passing the message stored in one of said first, second, and synchronizer interfaces to said parallel-to-serial converter in response to said transmit signal generated by said arbitrator.
20. In a multiple node processing system wherein each node in the multiple node processing system has an applications processor for executing a predetermined set of tasks and an operations controller for controlling the operation of the node in coordination with all of the other nodes in the processing system through the exchange of inter-node messages containing data and operational information, each operations controller having a plurality of receivers, each of which receives only the messages transmitted by a selected one of said multiple nodes, each of the inter-node messages containing a node identification code identifying the node which sent the message, a message type code identifying the type of message it is, and a data identification code further distinguishing the data contained in the message, a message checker for the operations controller comprising: sequencer means for context switching among the plurality of receivers in a predetermined sequence to unload the received messages from the receivers; a context storage for storing the relevant information pertaining to the message being processed, said context storage having one entry for each node, each of said entries storing at least the message type code, the data identification code, a byte count which identifies the specific byte being processed and an error status byte; error check logic means for checking the node identification code contained in the message with reference to the expected node identification code associated with the receiver which received the message, for checking the message type code, for checking the data identification code against a maximum data identification code, and for checking the number of bytes contained in the message, said error check logic means recording all detected errors in said error status byte; between limits checker means for checking the data values contained in the messages against predetermined maximum and minimum limit values and for reporting an 
exceeds limit error to said error check logic means whenever the data value contained in a message is not within said maximum and minimum limit values, said error check logic means recording said exceeds limits error in said error status byte; and multiplexer means for passing on each received message for further processing in the operations controller, said multiplexer means appending to each message, as it is passed on, the error status byte currently stored in the context storage.
21. In a multiple node fault tolerant processing system wherein each node has an applications processor for executing application tasks and an operations controller for controlling the operation of the node in coordination with all of the other nodes in the processing system through the exchange of inter-node messages and for selecting the tasks to be executed by the applications processor, the operations controller having a plurality of subsystems including a message checker, a scheduler, a synchronizer and a voter, each of which is capable of detecting errors and generating internal error reports identifying each error detected, each operations controller having at least two operating system states and operative to switch from one operating system state to another in response to the exclusion of a faulty node or the readmittance of a healthy node which changes the number of healthy nodes operating in the processing system, a fault tolerator for said operations controller comprising: a message memory for storing the content of all received error free internode messages; an error file for storing the content of the received error reports; error handler means for storing the error reports received from said plurality of subsystems in said error file and for generating a base penalty count for each node from the content of said error file, said base penalty count being indicative of the operational status of the particular node, said error handler means further having means for determining which nodes are faulty and for excluding such faulty nodes, in coordination with all of the other nodes in the system, from participating in the operation of said multiple node processing system through the exchange of internode messages, said internode messages including error messages containing the content of said error file for a particular node and a base penalty count message containing said base penalty count of each node; and interface means for storing in said message memory all 
of the error free messages passed by the message checker, for passing the identities of the faulty nodes to the scheduler and the synchronizer, and for passing all error reports to said error handler.
22. A voter subsystem for a multiple node fault tolerant processing system wherein each node in the multiple node processing system has an applications processor for executing a predetermined set of application tasks and an operations controller for controlling the operation of the node in coordination with all of the other nodes in the processing system through the exchange of internode messages containing data and operational information and for selecting the application tasks to be executed by the applications processor, said voter subsystem comprising: voter means for processing in parallel a plurality of values received in said internode messages to generate a voted value; deviance checker means for comparing in parallel each of said plurality of values received in said internode messages with said voted value to generate a deviance vector identifying each received value of said plurality of values which differed from said voted value by more than a predetermined deviance value; and loader means for loading said values received in said internode messages into said voter means and said deviance checker means.
23. An on-the-fly voter comprising: sorter means for sorting a plurality of received values to extract an upper and a lower medial value; averaging means for averaging said upper and lower medial values to generate said voted value; and loader means for loading said sorter means with said plurality of received values, bit-by-bit, starting from the most significant bit position to the least significant bit position.
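Functionally, claim 23 describes a median voter: sort the received values, take the two middle (medial) values, and average them. The sketch below shows only that arithmetic in Python; the claimed hardware sorts on the fly, loading the sorter bit-by-bit from the most significant bit down, which this software version does not model.

```python
def on_the_fly_vote(values):
    """Voted value per claim 23: average of the upper and lower medial
    values of the sorted inputs (for an odd count, both are the median)."""
    s = sorted(values)
    n = len(s)
    lower_medial = s[(n - 1) // 2]
    upper_medial = s[n // 2]
    return (lower_medial + upper_medial) / 2
```

Because the medial values sit in the middle of the sorted order, a single wildly faulty input cannot drag the voted value outside the range of the healthy inputs, e.g. `on_the_fly_vote([5, 5, 5, 200])` still yields `5.0`.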
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US039,190 | 1987-04-15 | ||
US07/038,818 US4816989A (en) | 1987-04-15 | 1987-04-15 | Synchronizer for a fault tolerant multiple node processing system |
US038,813 | 1987-04-15 | ||
US07/038,813 US4914657A (en) | 1987-04-15 | 1987-04-15 | Operations controller for a fault tolerant multiple node processing system |
US07/039,190 US4805107A (en) | 1987-04-15 | 1987-04-15 | Task scheduler for a fault tolerant multiple node processing system |
US038,818 | 1987-04-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1988008161A1 (en) | 1988-10-20 |
Family
ID=27365455
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1988/001245 WO1988008161A1 (en) | 1987-04-15 | 1988-04-15 | An operations controller for a fault tolerant multiple node processing system |
Country Status (5)
Country | Link |
---|---|
US (6) | US4805107A (en) |
EP (1) | EP0356460A4 (en) |
JP (1) | JPH02503122A (en) |
CA (1) | CA1301938C (en) |
WO (1) | WO1988008161A1 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0518511A2 (en) * | 1991-06-11 | 1992-12-16 | International Business Machines Corporation | Computer having I/O channel check and parity check detector |
WO2001027726A1 (en) * | 1999-10-08 | 2001-04-19 | Alliedsignal Inc. | System and method for fault-tolerant clock synchronization using interactive convergence |
EP1857935A1 (en) * | 2006-05-16 | 2007-11-21 | Saab Ab | Fault tolerant data bus node in a distributed system |
CN114639496A (en) * | 2022-02-24 | 2022-06-17 | 中核武汉核电运行技术股份有限公司 | Redundancy association method and device for nuclear power technological equipment |
CN114639496B (en) * | 2022-02-24 | 2024-11-12 | 中核武汉核电运行技术股份有限公司 | Nuclear power process equipment redundancy association method and device |
Families Citing this family (405)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4805107A (en) * | 1987-04-15 | 1989-02-14 | Allied-Signal Inc. | Task scheduler for a fault tolerant multiple node processing system |
US5136708A (en) * | 1987-06-09 | 1992-08-04 | Oce-Nederland B.V. | Distributed office automation system with specific task assignment among workstations |
DE3719283A1 (en) * | 1987-06-10 | 1988-12-22 | Bosch Gmbh Robert | METHOD FOR LOCALIZING DEFECTIVE STATIONS IN LOCAL NETWORKS AND RELATED INTERFACE CONTROLLERS |
CA2003338A1 (en) * | 1987-11-09 | 1990-06-09 | Richard W. Cutts, Jr. | Synchronization of fault-tolerant computer system having multiple processors |
AU616213B2 (en) * | 1987-11-09 | 1991-10-24 | Tandem Computers Incorporated | Method and apparatus for synchronizing a plurality of processors |
WO1989008361A1 (en) * | 1988-02-29 | 1989-09-08 | Kabushiki Kaisha Komatsu Seisakusho | Data pick-up circuit for serial controller |
US5091870A (en) * | 1988-05-27 | 1992-02-25 | Ncr Corporation | Apparatus for measuring the speed of transmission of digital characters |
US5025369A (en) * | 1988-08-25 | 1991-06-18 | David Schwartz Enterprises, Inc. | Computer system |
US4985831A (en) * | 1988-10-31 | 1991-01-15 | Evans & Sutherland Computer Corp. | Multiprocessor task scheduling system |
AU625293B2 (en) * | 1988-12-09 | 1992-07-09 | Tandem Computers Incorporated | Synchronization of fault-tolerant computer system having multiple processors |
US4965717A (en) * | 1988-12-09 | 1990-10-23 | Tandem Computers Incorporated | Multiple processor system having shared memory with private-write capability |
US5594866A (en) * | 1989-01-18 | 1997-01-14 | Intel Corporation | Message routing in a multi-processor computer system with alternate edge strobe regeneration |
US5057997A (en) * | 1989-02-13 | 1991-10-15 | International Business Machines Corp. | Interruption systems for externally changing a context of program execution of a programmed processor |
JPH02245864A (en) * | 1989-03-20 | 1990-10-01 | Hitachi Ltd | Multiprocessor system |
JPH02287858A (en) * | 1989-04-28 | 1990-11-27 | Toshiba Corp | Restarting system for distributed processing system |
DE3917715A1 (en) * | 1989-05-31 | 1990-12-06 | Teldix Gmbh | COMPUTER SYSTEM |
US5261085A (en) * | 1989-06-23 | 1993-11-09 | Digital Equipment Corporation | Fault-tolerant system and method for implementing a distributed state machine |
US5138708A (en) * | 1989-08-03 | 1992-08-11 | Unisys Corporation | Digital processor using current state comparison for providing fault tolerance |
US5247650A (en) * | 1989-08-30 | 1993-09-21 | Industrial Technology Institute | System for combining originally software incompatible control, kinematic, and discrete event simulation systems into a single integrated simulation system |
JPH03130842A (en) * | 1989-10-17 | 1991-06-04 | Toshiba Corp | Simultaneous execution controller for data base system |
US6038584A (en) * | 1989-11-17 | 2000-03-14 | Texas Instruments Incorporated | Synchronized MIMD multi-processing system and method of operation |
US5295258A (en) * | 1989-12-22 | 1994-03-15 | Tandem Computers Incorporated | Fault-tolerant computer system with online recovery and reintegration of redundant components |
US5381536A (en) * | 1989-12-29 | 1995-01-10 | Cray Research, Inc. | Method and apparatus for separate mark and wait instructions for processors having multiple memory ports |
US5208914A (en) * | 1989-12-29 | 1993-05-04 | Superconductor Systems Limited Partnership | Method and apparatus for non-sequential resource access |
US5218676A (en) * | 1990-01-08 | 1993-06-08 | The University Of Rochester | Dynamic routing system for a multinode communications network |
US5203004A (en) * | 1990-01-08 | 1993-04-13 | Tandem Computers Incorporated | Multi-board system having electronic keying and preventing power to improperly connected plug-in board with improperly configured diode connections |
US5357632A (en) * | 1990-01-09 | 1994-10-18 | Hughes Aircraft Company | Dynamic task allocation in a multi-processor system employing distributed control processors and distributed arithmetic processors |
US5150367A (en) * | 1990-03-02 | 1992-09-22 | General Electric Company | Composite range constraint propagation control |
US5214780A (en) * | 1990-03-23 | 1993-05-25 | Sun Microsystems, Inc. | Synchronized journaling system |
JP2793993B2 (en) * | 1990-04-05 | 1998-09-03 | 株式会社ゼクセル | Program runaway detection method in microcomputer |
EP0459912B1 (en) * | 1990-05-30 | 1996-09-11 | Fujitsu Limited | An issue processing system for a right to use a resource |
US5202988A (en) * | 1990-06-11 | 1993-04-13 | Supercomputer Systems Limited Partnership | System for communicating among processors having different speeds |
US5440730A (en) * | 1990-08-09 | 1995-08-08 | Bell Communications Research, Inc. | Time index access structure for temporal databases having concurrent multiple versions |
WO1992003784A1 (en) * | 1990-08-23 | 1992-03-05 | Supercomputer Systems Limited Partnership | Scheduling method for a multiprocessing operating system |
US5448731A (en) * | 1990-11-20 | 1995-09-05 | International Business Machines Corporation | Method and apparatus for controlling the deferred execution of user requests in a data processing system |
US5386555A (en) * | 1991-01-29 | 1995-01-31 | Casio Computer Co., Ltd. | Data processing system having a plurality of units in which design of the system can be changed by simple definition |
GB2259797B (en) * | 1991-09-19 | 1995-08-16 | Rockwell International Corp | Fault tolerant multi-drop communications system |
US6295615B1 (en) | 1991-12-09 | 2001-09-25 | Sprint Communications Company, L. P. | Automatic restoration of communication channels |
DE59108895D1 (en) * | 1991-12-18 | 1998-01-08 | Siemens Ag | Procedures for keeping the time in computer networks |
GB2263988B (en) * | 1992-02-04 | 1996-05-22 | Digital Equipment Corp | Work flow management system and method |
US5428769A (en) * | 1992-03-31 | 1995-06-27 | The Dow Chemical Company | Process control interface system having triply redundant remote field units |
US5287508A (en) * | 1992-04-07 | 1994-02-15 | Sun Microsystems, Inc. | Method and apparatus for efficient scheduling in a multiprocessor system |
JP2534430B2 (en) * | 1992-04-15 | 1996-09-18 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Methods for achieving match of computer system output with fault tolerance |
JP3268338B2 (en) | 1992-04-22 | 2002-03-25 | 株式会社日立製作所 | Computer system |
US5325526A (en) * | 1992-05-12 | 1994-06-28 | Intel Corporation | Task scheduling in a multicomputer system |
GB2268817B (en) * | 1992-07-17 | 1996-05-01 | Integrated Micro Products Ltd | A fault-tolerant computer system |
JPH0683652A (en) * | 1992-08-31 | 1994-03-25 | Sharp Corp | Microcomputer system |
US5903757A (en) * | 1992-09-10 | 1999-05-11 | International Business Machines Corporation | Monitoring and handling of exception conditions in computer system |
US5421004A (en) * | 1992-09-24 | 1995-05-30 | International Business Machines Corporation | Hierarchical testing environment |
EP0592080A2 (en) * | 1992-09-24 | 1994-04-13 | International Business Machines Corporation | Method and apparatus for interprocess communication in a multicomputer system |
CA2106280C (en) * | 1992-09-30 | 2000-01-18 | Yennun Huang | Apparatus and methods for fault-tolerant computing employing a daemon monitoring process and fault-tolerant library to provide varying degrees of fault tolerance |
US6951019B1 (en) * | 1992-09-30 | 2005-09-27 | Apple Computer, Inc. | Execution control for processor tasks |
US5715386A (en) * | 1992-09-30 | 1998-02-03 | Lucent Technologies Inc. | Apparatus and methods for software rejuvenation |
US5471631A (en) * | 1992-10-19 | 1995-11-28 | International Business Machines Corporation | Using time stamps to correlate data processing event times in connected data processing units |
CA2100540A1 (en) * | 1992-10-19 | 1994-04-20 | Jonel George | System and method for performing resource reconfiguration in a computer system |
WO1994011820A1 (en) * | 1992-11-06 | 1994-05-26 | University Of Newcastle Upon Tyne | Efficient schemes for constructing reliable computing nodes in distributed systems |
US6151689A (en) * | 1992-12-17 | 2000-11-21 | Tandem Computers Incorporated | Detecting and isolating errors occurring in data communication in a multiple processor system |
JPH06188909A (en) * | 1992-12-18 | 1994-07-08 | Fujitsu Ltd | Abnormal packet processing system |
US5513354A (en) * | 1992-12-18 | 1996-04-30 | International Business Machines Corporation | Fault tolerant load management system and method |
US5507002A (en) * | 1992-12-24 | 1996-04-09 | At&T Global Information Solutions Company | Peripheral component interconnect special cycle protocol using soft message IDS |
FR2700401B1 (en) * | 1993-01-08 | 1995-02-24 | Cegelec | System for synchronizing responding tasks. |
US5426774A (en) * | 1993-04-06 | 1995-06-20 | Honeywell Inc. | Method for maintaining a sequence of events function during failover in a redundant multiple layer system |
US5506956A (en) * | 1993-04-07 | 1996-04-09 | Sprint Communications Company L.P. | Error correction and channel restoration apparatus for T1 digital links |
US5577196A (en) * | 1993-04-07 | 1996-11-19 | Sprint Communications Co. L.P. | Intelligent digital signal hitless protection switch |
JPH06332710A (en) * | 1993-05-21 | 1994-12-02 | Fujitsu Ltd | Object directional data processing system |
JPH0721135A (en) * | 1993-07-02 | 1995-01-24 | Fujitsu Ltd | Data processing system with duplex monitor function |
US5600786A (en) * | 1993-07-30 | 1997-02-04 | Honeywell Inc. | FIFO fail-safe bus |
US5473771A (en) * | 1993-09-01 | 1995-12-05 | At&T Corp. | Fault-tolerant processing system architecture |
US5440741A (en) * | 1993-09-20 | 1995-08-08 | Motorola, Inc. | Software overload control method |
US5812757A (en) * | 1993-10-08 | 1998-09-22 | Mitsubishi Denki Kabushiki Kaisha | Processing board, a computer, and a fault recovery method for the computer |
US5604863A (en) * | 1993-11-01 | 1997-02-18 | International Business Machines Corporation | Method for coordinating executing programs in a data processing system |
US5437032A (en) * | 1993-11-04 | 1995-07-25 | International Business Machines Corporation | Task scheduler for a multiprocessor system |
EP0731945B1 (en) * | 1993-12-01 | 2000-05-17 | Marathon Technologies Corporation | Fault resilient/fault tolerant computing |
US5640504A (en) * | 1994-01-24 | 1997-06-17 | Advanced Computer Applications, Inc. | Distributed computing network |
US5600822A (en) * | 1994-04-05 | 1997-02-04 | International Business Machines Corporation | Resource allocation synchronization in a parallel processing system |
US5577211A (en) * | 1994-05-11 | 1996-11-19 | Ibm Corporation | System and method using chained structure queues for ordering of message delivery between connected nodes wherein unsuccessful message portion is skipped and retried |
JPH07312616A (en) * | 1994-05-18 | 1995-11-28 | Fujitsu Ltd | Double reception preventing method for communication system for selective receiving system |
US5586252A (en) * | 1994-05-24 | 1996-12-17 | International Business Machines Corporation | System for failure mode and effects analysis |
JP3636744B2 (en) * | 1994-06-14 | 2005-04-06 | 株式会社日立製作所 | Distributed system and method for creating automatic operation schedule of distributed system |
JPH0863365A (en) * | 1994-08-23 | 1996-03-08 | Fujitsu Ltd | Data processor |
US5644700A (en) * | 1994-10-05 | 1997-07-01 | Unisys Corporation | Method for operating redundant master I/O controllers |
US5586305A (en) * | 1994-10-21 | 1996-12-17 | Hewlett-Packard Company | Smart distributed measurement and control system with a flexible architecture |
US5657245A (en) * | 1994-11-09 | 1997-08-12 | Westinghouse Electric Corporation | Component maintenance system |
US6876309B1 (en) * | 1994-11-21 | 2005-04-05 | Espeed, Inc. | Bond trading system |
US8588729B2 (en) | 1994-11-21 | 2013-11-19 | Bgc Partners, Inc. | Method for retrieving data stored in a database |
US5596716A (en) * | 1995-03-01 | 1997-01-21 | Unisys Corporation | Method and apparatus for indicating the severity of a fault within a computer system |
US5734817A (en) * | 1995-03-01 | 1998-03-31 | Unisys Corporation | Method for making a data base available to a user program during data base recovery |
JPH096633A (en) * | 1995-06-07 | 1997-01-10 | Internatl Business Mach Corp <Ibm> | Method and system for operation of high-performance multiplelogical route in data-processing system |
US5666486A (en) * | 1995-06-23 | 1997-09-09 | Data General Corporation | Multiprocessor cluster membership manager framework |
US5796935A (en) * | 1995-07-20 | 1998-08-18 | Raytheon Company | Voting node for a distributed control system |
US5802340A (en) * | 1995-08-22 | 1998-09-01 | International Business Machines Corporation | Method and system of executing speculative store instructions in a parallel processing computer system |
US5682470A (en) * | 1995-09-01 | 1997-10-28 | International Business Machines Corporation | Method and system for achieving collective consistency in detecting failures in a distributed computing system |
US5832262A (en) * | 1995-09-14 | 1998-11-03 | Lockheed Martin Corporation | Realtime hardware scheduler utilizing processor message passing and queue management cells |
US5887143A (en) * | 1995-10-26 | 1999-03-23 | Hitachi, Ltd. | Apparatus and method for synchronizing execution of programs in a distributed real-time computing system |
US6141769A (en) | 1996-05-16 | 2000-10-31 | Resilience Corporation | Triple modular redundant computer system and associated method |
US5742753A (en) * | 1996-06-06 | 1998-04-21 | The Boeing Company | Mesh interconnected array in a fault-tolerant computer system |
EP0825506B1 (en) * | 1996-08-20 | 2013-03-06 | Invensys Systems, Inc. | Methods and apparatus for remote process control |
US5790397A (en) * | 1996-09-17 | 1998-08-04 | Marathon Technologies Corporation | Fault resilient/fault tolerant computing |
US6275844B1 (en) * | 1997-02-03 | 2001-08-14 | Electronic Data Systems Corporation | Automatically maintaining applications in a computer system using a check-in time |
CA2228574A1 (en) * | 1997-06-05 | 1999-08-02 | Attention Control Systems, Inc. | An automatic planning and cueing system and method |
DE19742918A1 (en) * | 1997-09-29 | 1999-04-01 | Cit Alcatel | Process for the exchange of data packets within a secure multi-computer system |
US5978928A (en) * | 1997-10-17 | 1999-11-02 | Hewlett-Packard Company | Relative logarithmic time stamps for reduced memory map size |
US6151620A (en) * | 1997-10-22 | 2000-11-21 | Novell, Inc. | Conference messaging system |
WO1999026133A2 (en) | 1997-11-14 | 1999-05-27 | Marathon Technologies Corporation | Method for maintaining the synchronized execution in fault resilient/fault tolerant computer systems |
US6098179A (en) * | 1998-01-22 | 2000-08-01 | Digital Equipment Corporation | Method and apparatus for performing error detection |
JP3609599B2 (en) * | 1998-01-30 | 2005-01-12 | 富士通株式会社 | Node proxy system, node monitoring system, method thereof, and recording medium |
US6601107B1 (en) * | 1998-02-02 | 2003-07-29 | Hughes Electronics Corporation | Adaptive fuzzy control of data acquisition and broadcasting |
US6243856B1 (en) | 1998-02-03 | 2001-06-05 | Amazing Media, Inc. | System and method for encoding a scene graph |
US6263496B1 (en) | 1998-02-03 | 2001-07-17 | Amazing Media, Inc. | Self modifying scene graph |
US6272650B1 (en) * | 1998-02-03 | 2001-08-07 | Amazing Media, Inc. | System and method for disambiguating scene graph loads |
US6347253B1 (en) * | 1998-04-30 | 2002-02-12 | Kabushiki Kaisha Toshiba | Control system for executing a task instruction by performing distributed processing via a number of modules |
US6691183B1 (en) | 1998-05-20 | 2004-02-10 | Invensys Systems, Inc. | Second transfer logic causing a first transfer logic to check a data ready bit prior to each of multibit transfer of a continuous transfer operation |
US6748451B2 (en) * | 1998-05-26 | 2004-06-08 | Dow Global Technologies Inc. | Distributed computing environment using real-time scheduling logic and time deterministic architecture |
US6178522B1 (en) * | 1998-06-02 | 2001-01-23 | Alliedsignal Inc. | Method and apparatus for managing redundant computer-based systems for fault tolerant computing |
JP4086372B2 (en) * | 1998-08-28 | 2008-05-14 | キヤノン株式会社 | COMMUNICATION SYSTEM, INFORMATION PROCESSING DEVICE, OUTPUT DEVICE, CONTROL METHOD, AND STORAGE MEDIUM |
US7013305B2 (en) | 2001-10-01 | 2006-03-14 | International Business Machines Corporation | Managing the state of coupling facility structures, detecting by one or more systems coupled to the coupling facility, the suspended state of the duplexed command, detecting being independent of message exchange |
US6389551B1 (en) | 1998-12-17 | 2002-05-14 | Steeleye Technology, Inc. | Method of preventing false or unnecessary failovers in a high availability cluster by using a quorum service |
GB2345161A (en) * | 1998-12-23 | 2000-06-28 | Motorola Ltd | Microprocessor module and method |
US6363496B1 (en) * | 1999-01-29 | 2002-03-26 | The United States Of America As Represented By The Secretary Of The Air Force | Apparatus and method for reducing duration of timeout periods in fault-tolerant distributed computer systems |
DE19910863A1 (en) * | 1999-03-11 | 2000-09-21 | Siemens Ag | Device and method for processing orders |
EP1052793B8 (en) * | 1999-05-11 | 2009-04-01 | Canon Kabushiki Kaisha | Method and device for synchronisation between two networks |
US7096465B1 (en) | 1999-05-17 | 2006-08-22 | Invensys Systems, Inc. | Process control configuration system with parameterized objects |
AU5025600A (en) * | 1999-05-17 | 2000-12-05 | Foxboro Company, The | Process control configuration system with parameterized objects |
US6754885B1 (en) | 1999-05-17 | 2004-06-22 | Invensys Systems, Inc. | Methods and apparatus for controlling object appearance in a process control configuration system |
US7089530B1 (en) | 1999-05-17 | 2006-08-08 | Invensys Systems, Inc. | Process control configuration system with connection validation and configuration |
US7272815B1 (en) | 1999-05-17 | 2007-09-18 | Invensys Systems, Inc. | Methods and apparatus for control configuration with versioning, security, composite blocks, edit selection, object swapping, formulaic values and other aspects |
US7043728B1 (en) | 1999-06-08 | 2006-05-09 | Invensys Systems, Inc. | Methods and apparatus for fault-detecting and fault-tolerant process control |
US6501995B1 (en) | 1999-06-30 | 2002-12-31 | The Foxboro Company | Process control system and method with improved distribution, installation and validation of components |
US6788980B1 (en) | 1999-06-11 | 2004-09-07 | Invensys Systems, Inc. | Methods and apparatus for control using control devices that provide a virtual machine environment and that communicate via an IP network |
US6393126B1 (en) | 1999-06-23 | 2002-05-21 | Datum, Inc. | System and methods for generating trusted and authenticatable time stamps for electronic documents |
US6381712B1 (en) * | 1999-06-30 | 2002-04-30 | Sun Microsystems, Inc. | Method and apparatus for providing an error messaging system |
AU6615600A (en) | 1999-07-29 | 2001-02-19 | Foxboro Company, The | Methods and apparatus for object-based process control |
US6658556B1 (en) | 1999-07-30 | 2003-12-02 | International Business Machines Corporation | Hashing a target address for a memory access instruction in order to determine prior to execution which particular load/store unit processes the instruction |
US6446165B1 (en) | 1999-07-30 | 2002-09-03 | International Business Machines Corporation | Address dependent caching behavior within a data processing system having HSA (hashed storage architecture) |
US6823471B1 (en) * | 1999-07-30 | 2004-11-23 | International Business Machines Corporation | Method for providing high availability within a data processing system via a reconfigurable hashed storage subsystem |
US6516404B1 (en) | 1999-07-30 | 2003-02-04 | International Business Machines Corporation | Data processing system having hashed architected processor facilities |
US6449691B1 (en) | 1999-07-30 | 2002-09-10 | International Business Machines Corporation | Asymmetrical cache properties within a hashed storage subsystem |
US6470442B1 (en) | 1999-07-30 | 2002-10-22 | International Business Machines Corporation | Processor assigning data to hardware partition based on selectable hash of data address |
US6598118B1 (en) | 1999-07-30 | 2003-07-22 | International Business Machines Corporation | Data processing system with HSA (hashed storage architecture) |
EP1188294B1 (en) * | 1999-10-14 | 2008-03-26 | Bluearc UK Limited | Apparatus and method for hardware implementation or acceleration of operating system functions |
US6473660B1 (en) | 1999-12-03 | 2002-10-29 | The Foxboro Company | Process control system and method with automatic fault avoidance |
US6502141B1 (en) * | 1999-12-14 | 2002-12-31 | International Business Machines Corporation | Method and system for approximate, monotonic time synchronization for a multiple node NUMA system |
US6694362B1 (en) * | 2000-01-03 | 2004-02-17 | Micromuse Inc. | Method and system for network event impact analysis and correlation with network administrators, management policies and procedures |
US7099855B1 (en) | 2000-01-13 | 2006-08-29 | International Business Machines Corporation | System and method for electronic communication management |
US6779128B1 (en) | 2000-02-18 | 2004-08-17 | Invensys Systems, Inc. | Fault-tolerant data transfer |
US6408277B1 (en) | 2000-06-21 | 2002-06-18 | Banter Limited | System and method for automatic task prioritization |
US9699129B1 (en) | 2000-06-21 | 2017-07-04 | International Business Machines Corporation | System and method for increasing email productivity |
US8290768B1 (en) | 2000-06-21 | 2012-10-16 | International Business Machines Corporation | System and method for determining a set of attributes based on content of communications |
US6801202B2 (en) * | 2000-06-29 | 2004-10-05 | Sun Microsystems, Inc. | Graphics system configured to parallel-process graphics data using multiple pipelines |
JP2002082815A (en) * | 2000-09-07 | 2002-03-22 | Oki Electric Ind Co Ltd | Task program control system |
US6963887B2 (en) * | 2000-09-22 | 2005-11-08 | Sarnoff Corporation | Method and device for performing data pattern matching |
US20050157654A1 (en) * | 2000-10-12 | 2005-07-21 | Farrell Craig A. | Apparatus and method for automated discovery and monitoring of relationships between network elements |
US7383191B1 (en) * | 2000-11-28 | 2008-06-03 | International Business Machines Corporation | Method and system for predicting causes of network service outages using time domain correlation |
US20020080888A1 (en) * | 2000-12-22 | 2002-06-27 | Li Shu | Message splitting and spatially diversified message routing for increasing transmission assurance and data security over distributed networks |
DE10065117A1 (en) * | 2000-12-28 | 2002-07-04 | Bosch Gmbh Robert | Method and communication system for exchanging data between at least two participants via a bus system |
US7644057B2 (en) * | 2001-01-03 | 2010-01-05 | International Business Machines Corporation | System and method for electronic communication management |
GB0102220D0 (en) * | 2001-01-27 | 2001-03-14 | Galleria Software Developement | Databases |
DE10109558C1 (en) * | 2001-02-28 | 2003-01-30 | Siemens Ag | Additional circuit on the receiver side for the boundary scan during data transmission with differential signals |
US6526491B2 (en) * | 2001-03-22 | 2003-02-25 | Sony Corporation Entertainment Inc. | Memory protection system and method for computer architecture for broadband networks |
US6966015B2 (en) * | 2001-03-22 | 2005-11-15 | Micromuse, Ltd. | Method and system for reducing false alarms in network fault management systems |
US6744739B2 (en) * | 2001-05-18 | 2004-06-01 | Micromuse Inc. | Method and system for determining network characteristics using routing protocols |
US7043727B2 (en) * | 2001-06-08 | 2006-05-09 | Micromuse Ltd. | Method and system for efficient distribution of network event data |
US7673223B2 (en) * | 2001-06-15 | 2010-03-02 | Qualcomm Incorporated | Node processors for use in parity check decoders |
US6633856B2 (en) * | 2001-06-15 | 2003-10-14 | Flarion Technologies, Inc. | Methods and apparatus for decoding LDPC codes |
US7516208B1 (en) | 2001-07-20 | 2009-04-07 | International Business Machines Corporation | Event database management method and system for network event reporting system |
US7036068B2 (en) | 2001-07-25 | 2006-04-25 | Hewlett-Packard Development Company, L.P. | Error correction coding and decoding in a solid-state storage device |
US6981196B2 (en) | 2001-07-25 | 2005-12-27 | Hewlett-Packard Development Company, L.P. | Data storage method for use in a magnetoresistive solid-state storage device |
US20030023922A1 (en) * | 2001-07-25 | 2003-01-30 | Davis James A. | Fault tolerant magnetoresistive solid-state storage device |
US20050286685A1 (en) * | 2001-08-10 | 2005-12-29 | Nikola Vukovljak | System and method for testing multiple dial-up points in a communications network |
US6766482B1 (en) | 2001-10-31 | 2004-07-20 | Extreme Networks | Ethernet automatic protection switching |
JP3823044B2 (en) * | 2001-10-31 | 2006-09-20 | パナソニック モバイルコミュニケーションズ株式会社 | Time stamp value controller |
US7512931B2 (en) * | 2001-11-13 | 2009-03-31 | National Instruments Corporation | Graphical program nodes for implementing a measurement state model |
US7171493B2 (en) * | 2001-12-19 | 2007-01-30 | The Charles Stark Draper Laboratory | Camouflage of network traffic to resist attack |
US7363368B2 (en) | 2001-12-24 | 2008-04-22 | International Business Machines Corporation | System and method for transaction recording and playback |
US6560512B1 (en) * | 2002-01-04 | 2003-05-06 | Machine Consciousness, Inc. | Relational robotic controller |
US6996443B2 (en) * | 2002-01-11 | 2006-02-07 | Bae Systems Information And Electronic Systems Integration Inc. | Reconfigurable digital processing system for space |
US7168008B2 (en) * | 2002-01-18 | 2007-01-23 | Mobitv, Inc. | Method and system for isolating and protecting software components |
US7171605B2 (en) * | 2002-02-01 | 2007-01-30 | International Business Machines Corporation | Check bit free error correction for sleep mode data retention |
US7093004B2 (en) * | 2002-02-04 | 2006-08-15 | Datasynapse, Inc. | Using execution statistics to select tasks for redundant assignment in a distributed computing platform |
US6973604B2 (en) | 2002-03-08 | 2005-12-06 | Hewlett-Packard Development Company, L.P. | Allocation of sparing resources in a magnetoresistive solid-state storage device |
US7069442B2 (en) | 2002-03-29 | 2006-06-27 | Intel Corporation | System and method for execution of a secured environment initialization instruction |
US20030217054A1 (en) * | 2002-04-15 | 2003-11-20 | Bachman George E. | Methods and apparatus for process, factory-floor, environmental, computer aided manufacturing-based or other control system with real-time data distribution |
GB2387683B (en) * | 2002-04-19 | 2007-03-28 | Hewlett Packard Co | Workflow processing scheduler |
US7668899B2 (en) * | 2002-05-07 | 2010-02-23 | Alcatel-Lucent Usa Inc. | Decoupled routing network method and system |
US7174551B2 (en) * | 2002-05-20 | 2007-02-06 | International Business Machines Corporation | Multiple task wait system for use in a data warehouse environment |
US7440462B2 (en) * | 2002-05-23 | 2008-10-21 | Motorola, Inc. | Quality of service (QOS) control mechanisms using mediation devices in an asynchronous network |
US7492773B2 (en) * | 2002-05-23 | 2009-02-17 | Motorola, Inc. | Media access control and distributed data processing using mediation devices in an asynchronous network |
US20030223090A1 (en) * | 2002-05-28 | 2003-12-04 | Mustafa Seifi | Method and implementation for message-driven job processing |
US7154886B2 (en) | 2002-07-22 | 2006-12-26 | Qlogic Corporation | Method and system for primary blade selection in a multi-module fiber channel switch |
US20040133642A1 (en) * | 2002-07-23 | 2004-07-08 | Vazquez Pedro A. | Server and application programming interface for distributed rendezvous |
US20040024954A1 (en) * | 2002-07-30 | 2004-02-05 | Rust Robert A. | Time stamp management system for disk arrays |
US7334046B1 (en) | 2002-08-05 | 2008-02-19 | Qlogic, Corporation | System and method for optimizing frame routing in a network |
US7397768B1 (en) | 2002-09-11 | 2008-07-08 | Qlogic, Corporation | Zone management in a multi-module fibre channel switch |
US7362717B1 (en) | 2002-10-03 | 2008-04-22 | Qlogic, Corporation | Method and system for using distributed name servers in multi-module fibre channel switches |
US7089484B2 (en) * | 2002-10-21 | 2006-08-08 | International Business Machines Corporation | Dynamic sparing during normal computer system operation |
US7127446B1 (en) * | 2002-10-30 | 2006-10-24 | Advanced Micro Devices, Inc. | File system based task queue management |
US7533382B2 (en) * | 2002-10-30 | 2009-05-12 | Stmicroelectronics, Inc. | Hyperprocessor |
US7415565B2 (en) * | 2002-10-31 | 2008-08-19 | Ring Technology Enterprises, Llc | Methods and systems for a storage system with a program-controlled switch for routing data |
ATE497208T1 (en) * | 2002-10-31 | 2011-02-15 | Ring Technology Entpr S Llc | METHOD AND SYSTEMS FOR A STORAGE SYSTEM |
US7197662B2 (en) * | 2002-10-31 | 2007-03-27 | Ring Technology Enterprises, Llc | Methods and systems for a storage system |
US7707351B2 (en) * | 2002-10-31 | 2010-04-27 | Ring Technology Enterprises Of Texas, Llc | Methods and systems for an identifier-based memory section |
US6879526B2 (en) * | 2002-10-31 | 2005-04-12 | Ring Technology Enterprises Llc | Methods and apparatus for improved memory access |
US7457822B1 (en) | 2002-11-01 | 2008-11-25 | Bluearc Uk Limited | Apparatus and method for hardware-based file system |
US7243264B2 (en) * | 2002-11-01 | 2007-07-10 | Sonics, Inc. | Method and apparatus for error handling in networks |
US8041735B1 (en) | 2002-11-01 | 2011-10-18 | Bluearc Uk Limited | Distributed file system and method |
US7319669B1 (en) | 2002-11-22 | 2008-01-15 | Qlogic, Corporation | Method and system for controlling packet flow in networks |
US7149923B1 (en) * | 2003-01-17 | 2006-12-12 | Unisys Corporation | Software control using the controller as a component to achieve resiliency in a computer system utilizing separate servers for redundancy |
US7389507B2 (en) * | 2003-02-10 | 2008-06-17 | Tandberg Data Corporation | Operating-system-independent modular programming method for robust just-in-time response to multiple asynchronous data streams |
US6957375B2 (en) * | 2003-02-26 | 2005-10-18 | Flarion Technologies, Inc. | Method and apparatus for performing low-density parity-check (LDPC) code operations using a multi-level permutation |
JP2004280558A (en) * | 2003-03-17 | 2004-10-07 | Ricoh Co Ltd | Interface circuit and optical disk device provided with interface circuit |
US7389230B1 (en) | 2003-04-22 | 2008-06-17 | International Business Machines Corporation | System and method for classification of voice signals |
US8495002B2 (en) * | 2003-05-06 | 2013-07-23 | International Business Machines Corporation | Software tool for training and testing a knowledge base |
US20050187913A1 (en) | 2003-05-06 | 2005-08-25 | Yoram Nelken | Web-based customer service interface |
US20040237089A1 (en) * | 2003-05-20 | 2004-11-25 | Teh Jin Teik | Separation of data and instruction for improving system performance in a multiple process environment |
JP4296050B2 (en) * | 2003-07-14 | 2009-07-15 | 三菱電機株式会社 | Electronic control unit |
US7406628B2 (en) * | 2003-07-15 | 2008-07-29 | Seagate Technology Llc | Simulated error injection system in target device for testing host system |
US7620059B2 (en) | 2003-07-16 | 2009-11-17 | Qlogic, Corporation | Method and apparatus for accelerating receive-modify-send frames in a fibre channel network |
US7463646B2 (en) | 2003-07-16 | 2008-12-09 | Qlogic Corporation | Method and system for fibre channel arbitrated loop acceleration |
US7471635B2 (en) | 2003-07-16 | 2008-12-30 | Qlogic, Corporation | Method and apparatus for test pattern generation |
US7355966B2 (en) | 2003-07-16 | 2008-04-08 | Qlogic, Corporation | Method and system for minimizing disruption in common-access networks |
US7453802B2 (en) | 2003-07-16 | 2008-11-18 | Qlogic, Corporation | Method and apparatus for detecting and removing orphaned primitives in a fibre channel network |
US7525910B2 (en) | 2003-07-16 | 2009-04-28 | Qlogic, Corporation | Method and system for non-disruptive data capture in networks |
US7388843B2 (en) | 2003-07-16 | 2008-06-17 | Qlogic, Corporation | Method and apparatus for testing loop pathway integrity in a fibre channel arbitrated loop |
US7792115B2 (en) | 2003-07-21 | 2010-09-07 | Qlogic, Corporation | Method and system for routing and filtering network data packets in fibre channel systems |
US7420982B2 (en) | 2003-07-21 | 2008-09-02 | Qlogic, Corporation | Method and system for keeping a fibre channel arbitrated loop open during frame gaps |
US7466700B2 (en) | 2003-07-21 | 2008-12-16 | Qlogic, Corporation | LUN based hard zoning in fibre channel switches |
US7646767B2 (en) | 2003-07-21 | 2010-01-12 | Qlogic, Corporation | Method and system for programmable data dependant network routing |
US7684401B2 (en) | 2003-07-21 | 2010-03-23 | Qlogic, Corporation | Method and system for using extended fabric features with fibre channel switch elements |
US7447224B2 (en) | 2003-07-21 | 2008-11-04 | Qlogic, Corporation | Method and system for routing fibre channel frames |
US7558281B2 (en) | 2003-07-21 | 2009-07-07 | Qlogic, Corporation | Method and system for configuring fibre channel ports |
US7573909B2 (en) | 2003-07-21 | 2009-08-11 | Qlogic, Corporation | Method and system for programmable data dependant network routing |
US7522529B2 (en) | 2003-07-21 | 2009-04-21 | Qlogic, Corporation | Method and system for detecting congestion and over subscription in a fibre channel network |
US7525983B2 (en) | 2003-07-21 | 2009-04-28 | Qlogic, Corporation | Method and system for selecting virtual lanes in fibre channel switches |
US7477655B2 (en) | 2003-07-21 | 2009-01-13 | Qlogic, Corporation | Method and system for power control of fibre channel switches |
US7406092B2 (en) | 2003-07-21 | 2008-07-29 | Qlogic, Corporation | Programmable pseudo virtual lanes for fibre channel systems |
US7430175B2 (en) * | 2003-07-21 | 2008-09-30 | Qlogic, Corporation | Method and system for managing traffic in fibre channel systems |
US7580354B2 (en) | 2003-07-21 | 2009-08-25 | Qlogic, Corporation | Multi-speed cut through operation in fibre channel switches |
US7583597B2 (en) | 2003-07-21 | 2009-09-01 | Qlogic Corporation | Method and system for improving bandwidth and reducing idles in fibre channel switches |
US7522522B2 (en) | 2003-07-21 | 2009-04-21 | Qlogic, Corporation | Method and system for reducing latency and congestion in fibre channel switches |
US7512067B2 (en) | 2003-07-21 | 2009-03-31 | Qlogic, Corporation | Method and system for congestion control based on optimum bandwidth allocation in a fibre channel switch |
US7894348B2 (en) | 2003-07-21 | 2011-02-22 | Qlogic, Corporation | Method and system for congestion control in a fibre channel switch |
US7630384B2 (en) | 2003-07-21 | 2009-12-08 | Qlogic, Corporation | Method and system for distributing credit in fibre channel systems |
US8234395B2 (en) | 2003-07-28 | 2012-07-31 | Sonos, Inc. | System and method for synchronizing operations among a plurality of independently clocked digital data processing devices |
US11294618B2 (en) | 2003-07-28 | 2022-04-05 | Sonos, Inc. | Media player system |
US11106425B2 (en) | 2003-07-28 | 2021-08-31 | Sonos, Inc. | Synchronizing operations among a plurality of independently clocked digital data processing devices |
US8290603B1 (en) | 2004-06-05 | 2012-10-16 | Sonos, Inc. | User interfaces for controlling and manipulating groupings in a multi-zone media system |
US11106424B2 (en) | 2003-07-28 | 2021-08-31 | Sonos, Inc. | Synchronizing operations among a plurality of independently clocked digital data processing devices |
US9207905B2 (en) | 2003-07-28 | 2015-12-08 | Sonos, Inc. | Method and apparatus for providing synchrony group status information |
US11650784B2 (en) | 2003-07-28 | 2023-05-16 | Sonos, Inc. | Adjusting volume levels |
US8086752B2 (en) | 2006-11-22 | 2011-12-27 | Sonos, Inc. | Systems and methods for synchronizing operations among a plurality of independently clocked digital data processing devices that independently source digital data |
US20050033625A1 (en) * | 2003-08-06 | 2005-02-10 | International Business Machines Corporation | Method, apparatus and program storage device for scheduling the performance of maintenance tasks to maintain a system environment |
US7852856B2 (en) * | 2003-08-29 | 2010-12-14 | Broadcom Corp. | System and method for providing pooling or dynamic allocation of connection context data |
US7352701B1 (en) | 2003-09-19 | 2008-04-01 | Qlogic, Corporation | Buffer to buffer credit recovery for in-line fibre channel credit extension devices |
US20050120272A1 (en) * | 2003-11-13 | 2005-06-02 | Smith Zachary S. | Systems and methods for determining bug ownership |
US7500152B2 (en) * | 2003-12-05 | 2009-03-03 | Freescale Semiconductor, Inc. | Apparatus and method for time ordering events in a system having multiple time domains |
JP2005184335A (en) * | 2003-12-18 | 2005-07-07 | Oki Electric Ind Co Ltd | Missynchronization preventing device in wireless communication device |
US7039661B1 (en) * | 2003-12-29 | 2006-05-02 | Veritas Operating Corporation | Coordinated dirty block tracking |
US7480293B2 (en) | 2004-02-05 | 2009-01-20 | Qlogic, Corporation | Method and system for preventing deadlock in fibre channel fabrics using frame priorities |
US7564789B2 (en) | 2004-02-05 | 2009-07-21 | Qlogic, Corporation | Method and system for reducing deadlock in fibre channel fabrics using virtual lanes |
US7761923B2 (en) * | 2004-03-01 | 2010-07-20 | Invensys Systems, Inc. | Process control methods and apparatus for intrusion detection, protection and network hardening |
US20050228967A1 (en) * | 2004-03-16 | 2005-10-13 | Sony Computer Entertainment Inc. | Methods and apparatus for reducing power dissipation in a multi-processor system |
US8224639B2 (en) | 2004-03-29 | 2012-07-17 | Sony Computer Entertainment Inc. | Methods and apparatus for achieving thermal management using processing task scheduling |
GB2412755A (en) * | 2004-03-30 | 2005-10-05 | Hewlett Packard Development Co | Coordination of lifecycle state changes in software components |
US20050273657A1 (en) * | 2004-04-01 | 2005-12-08 | Hiroshi Ichiki | Information processing apparatus and method, and recording medium and program for controlling the same |
US9374607B2 (en) | 2012-06-26 | 2016-06-21 | Sonos, Inc. | Media playback system with guest access |
US9977561B2 (en) | 2004-04-01 | 2018-05-22 | Sonos, Inc. | Systems, methods, apparatus, and articles of manufacture to provide guest access |
US7340167B2 (en) | 2004-04-23 | 2008-03-04 | Qlogic, Corporation | Fibre channel transparent switch for mixed switch fabrics |
US7930377B2 (en) | 2004-04-23 | 2011-04-19 | Qlogic, Corporation | Method and system for using boot servers in networks |
GB0411054D0 (en) * | 2004-05-18 | 2004-06-23 | Ricardo Uk Ltd | Fault tolerant data processing |
US8868698B2 (en) | 2004-06-05 | 2014-10-21 | Sonos, Inc. | Establishing a secure wireless network with minimum human intervention |
US8326951B1 (en) | 2004-06-05 | 2012-12-04 | Sonos, Inc. | Establishing a secure wireless network with minimum human intervention |
US7404020B2 (en) | 2004-07-20 | 2008-07-22 | Qlogic, Corporation | Integrated fibre channel fabric controller |
US8171474B2 (en) * | 2004-10-01 | 2012-05-01 | Serguei Mankovski | System and method for managing, scheduling, controlling and monitoring execution of jobs by a job scheduler utilizing a publish/subscription interface |
US7593997B2 (en) | 2004-10-01 | 2009-09-22 | Qlogic, Corporation | Method and system for LUN remapping in fibre channel networks |
US8295299B2 (en) | 2004-10-01 | 2012-10-23 | Qlogic, Corporation | High speed fibre channel switch element |
US7411958B2 (en) | 2004-10-01 | 2008-08-12 | Qlogic, Corporation | Method and system for transferring data directly between storage devices in a storage area network |
US7593344B2 (en) * | 2004-10-14 | 2009-09-22 | Temic Automotive Of North America, Inc. | System and method for reprogramming nodes in an automotive switch fabric network |
US7623552B2 (en) | 2004-10-14 | 2009-11-24 | Temic Automotive Of North America, Inc. | System and method for time synchronizing nodes in an automotive network using input capture |
US20060083172A1 (en) * | 2004-10-14 | 2006-04-20 | Jordan Patrick D | System and method for evaluating the performance of an automotive switch fabric network |
WO2006044140A2 (en) * | 2004-10-14 | 2006-04-27 | Motorola, Inc. | System and method for time synchronizing nodes in an automotive network |
US7593429B2 (en) * | 2004-10-14 | 2009-09-22 | Temic Automotive Of North America, Inc. | System and method for time synchronizing nodes in an automotive network using input capture |
US7599377B2 (en) * | 2004-10-15 | 2009-10-06 | Temic Automotive Of North America, Inc. | System and method for tunneling standard bus protocol messages through an automotive switch fabric network |
US7613190B2 (en) * | 2004-10-18 | 2009-11-03 | Temic Automotive Of North America, Inc. | System and method for streaming sequential data through an automotive switch fabric |
FR2877169B1 (en) * | 2004-10-27 | 2007-01-26 | France Telecom | METHOD FOR SYNCHRONIZING A RADIO RECEIVER, AND RECEIVER ADAPTED FOR IMPLEMENTING SUCH A METHOD |
US20060155753A1 (en) * | 2004-11-11 | 2006-07-13 | Marc Asher | Global asynchronous serialized transaction identifier |
US20060155770A1 (en) * | 2004-11-11 | 2006-07-13 | Ipdev Co. | System and method for time-based allocation of unique transaction identifiers in a multi-server system |
US20060123098A1 (en) * | 2004-11-11 | 2006-06-08 | Ipdev | Multi-system auto-failure web-based system with dynamic session recovery |
US7519058B2 (en) | 2005-01-18 | 2009-04-14 | Qlogic, Corporation | Address translation in fibre channel switches |
JP4626322B2 (en) * | 2005-02-03 | 2011-02-09 | 富士ゼロックス株式会社 | program |
US8589944B2 (en) * | 2005-03-16 | 2013-11-19 | Ricoh Production Print Solutions | Method and system for task mapping to iteratively improve task assignment in a heterogeneous computing system |
US7539931B2 (en) * | 2005-04-08 | 2009-05-26 | Hewlett-Packard Development Company, L.P. | Storage element for mitigating soft errors in logic |
US20060236158A1 (en) * | 2005-04-15 | 2006-10-19 | Thayer Larry J | Memory element for mitigating soft errors in logic |
US9459960B2 (en) | 2005-06-03 | 2016-10-04 | Rambus Inc. | Controller device for use with electrically erasable programmable memory chip with error detection and retry modes of operation |
US7831882B2 (en) | 2005-06-03 | 2010-11-09 | Rambus Inc. | Memory system with error detection and retry modes of operation |
US20070033247A1 (en) * | 2005-08-02 | 2007-02-08 | The Mathworks, Inc. | Methods and system for distributing data to technical computing workers |
US8769495B1 (en) * | 2005-09-30 | 2014-07-01 | Sony Computer Entertainment Inc. | Systems and methods for debugging in a multiprocessor environment |
TWI297237B (en) * | 2005-10-28 | 2008-05-21 | Hon Hai Prec Ind Co Ltd | Power switching circuit and power supply system using the same |
US7676604B2 (en) * | 2005-11-22 | 2010-03-09 | Intel Corporation | Task context direct indexing in a protocol engine |
EP1796000A1 (en) * | 2005-12-06 | 2007-06-13 | International Business Machines Corporation | Method, system and computer program for distributing software products in trial mode |
US7562285B2 (en) | 2006-01-11 | 2009-07-14 | Rambus Inc. | Unidirectional error code transfer for a bidirectional data link |
US8438572B2 (en) * | 2006-03-15 | 2013-05-07 | Freescale Semiconductor, Inc. | Task scheduling method and apparatus |
US8315274B2 (en) * | 2006-03-29 | 2012-11-20 | Honeywell International Inc. | System and method for supporting synchronous system communications and operations |
US7860857B2 (en) | 2006-03-30 | 2010-12-28 | Invensys Systems, Inc. | Digital data processing apparatus and methods for improving plant performance |
TW200743028A (en) * | 2006-05-12 | 2007-11-16 | Benq Corp | State synchronization apparatuses and methods |
US8352805B2 (en) | 2006-05-18 | 2013-01-08 | Rambus Inc. | Memory error detection |
US7925791B2 (en) * | 2006-07-17 | 2011-04-12 | The Math Works, Inc. | Recoverable error detection for concurrent computing programs |
US8483853B1 (en) | 2006-09-12 | 2013-07-09 | Sonos, Inc. | Controlling and manipulating groupings in a multi-zone media system |
US8788080B1 (en) | 2006-09-12 | 2014-07-22 | Sonos, Inc. | Multi-channel pairing in a media system |
US9202509B2 (en) | 2006-09-12 | 2015-12-01 | Sonos, Inc. | Controlling and grouping in a multi-zone media system |
US7730029B2 (en) * | 2006-09-15 | 2010-06-01 | Alcatel Lucent | System and method of fault tolerant reconciliation for control card redundancy |
US7891818B2 (en) | 2006-12-12 | 2011-02-22 | Evans & Sutherland Computer Corporation | System and method for aligning RGB light in a single modulator projector |
US8019723B2 (en) * | 2006-12-20 | 2011-09-13 | International Business Machines Corporation | Deferred copy target pull of volume data |
US7925626B2 (en) * | 2006-12-20 | 2011-04-12 | International Business Machines Corporation | Immediate copy target pull of volume data |
US20110072313A1 (en) * | 2007-08-17 | 2011-03-24 | Nxp B.V. | System for providing fault tolerance for at least one micro controller unit |
US8397236B2 (en) * | 2007-08-24 | 2013-03-12 | Virtualmetrix, Inc. | Credit based performance management of computer systems |
CN101377746A (en) * | 2007-08-31 | 2009-03-04 | 鸿富锦精密工业(深圳)有限公司 | System and method for updating arranged task |
US7743274B2 (en) * | 2007-09-12 | 2010-06-22 | International Business Machines Corporation | Administering correlated error logs in a computer system |
US7453910B1 (en) | 2007-12-18 | 2008-11-18 | International Business Machines Corporation | Synchronization of independent clocks |
US8117512B2 (en) * | 2008-02-06 | 2012-02-14 | Westinghouse Electric Company Llc | Failure detection and mitigation in logic circuits |
US7925742B2 (en) * | 2008-02-28 | 2011-04-12 | Microsoft Corporation | Correlating performance data of multiple computing devices |
WO2009118731A2 (en) | 2008-03-27 | 2009-10-01 | Rocketick Technologies Ltd | Design simulation using parallel processors |
US8358317B2 (en) | 2008-05-23 | 2013-01-22 | Evans & Sutherland Computer Corporation | System and method for displaying a planar image on a curved surface |
US8702248B1 (en) | 2008-06-11 | 2014-04-22 | Evans & Sutherland Computer Corporation | Projection method for reducing interpixel gaps on a viewing surface |
CN102124432B (en) | 2008-06-20 | 2014-11-26 | 因文西斯系统公司 | Systems and methods for immersive interaction with actual and/or simulated facilities for process, environmental and industrial control |
JP2010011093A (en) * | 2008-06-27 | 2010-01-14 | Hitachi Ltd | Distributed system |
US9032377B2 (en) * | 2008-07-10 | 2015-05-12 | Rocketick Technologies Ltd. | Efficient parallel computation of dependency problems |
US7983175B2 (en) * | 2008-09-19 | 2011-07-19 | International Business Machines Corporation | System and method for detecting a network failure |
US8077378B1 (en) | 2008-11-12 | 2011-12-13 | Evans & Sutherland Computer Corporation | Calibration system and method for light modulation device |
IT1391785B1 (en) * | 2008-11-21 | 2012-01-27 | St Microelectronics Srl | ELECTRONIC SYSTEM FOR DETECTION OF FAILURE |
US8266477B2 (en) * | 2009-01-09 | 2012-09-11 | Ca, Inc. | System and method for modifying execution of scripts for a job scheduler using deontic logic |
US8463964B2 (en) * | 2009-05-29 | 2013-06-11 | Invensys Systems, Inc. | Methods and apparatus for control configuration with enhanced change-tracking |
US8127060B2 (en) * | 2009-05-29 | 2012-02-28 | Invensys Systems, Inc. | Methods and apparatus for control configuration with control objects that are fieldbus protocol-aware |
US8463865B2 (en) * | 2010-03-09 | 2013-06-11 | Texas Instruments Incorporated | Video synchronization with distributed modules |
US8782653B2 (en) * | 2010-03-26 | 2014-07-15 | Virtualmetrix, Inc. | Fine grain performance resource management of computer systems |
US8677071B2 (en) * | 2010-03-26 | 2014-03-18 | Virtualmetrix, Inc. | Control of processor cache memory occupancy |
US8626927B2 (en) * | 2010-05-06 | 2014-01-07 | Verizon Patent And Licensing Inc. | System for and method of distributing files |
US8930896B1 (en) | 2010-07-23 | 2015-01-06 | Amazon Technologies, Inc. | Data anonymity and separation for user computation |
US11265652B2 (en) | 2011-01-25 | 2022-03-01 | Sonos, Inc. | Playback device pairing |
US11429343B2 (en) | 2011-01-25 | 2022-08-30 | Sonos, Inc. | Stereo playback configuration and control |
JP5648544B2 (en) * | 2011-03-15 | 2015-01-07 | 富士通株式会社 | Scheduling program and information processing apparatus |
US9128748B2 (en) | 2011-04-12 | 2015-09-08 | Rocketick Technologies Ltd. | Parallel simulation using multiple co-simulators |
JP5932242B2 (en) * | 2011-05-20 | 2016-06-08 | キヤノン株式会社 | Information processing apparatus, communication method, and program |
US20130074088A1 (en) * | 2011-09-19 | 2013-03-21 | Timothy John Purcell | Scheduling and management of compute tasks with different execution priority levels |
US9641826B1 (en) | 2011-10-06 | 2017-05-02 | Evans & Sutherland Computer Corporation | System and method for displaying distant 3-D stereo on a dome surface |
US9729115B2 (en) | 2012-04-27 | 2017-08-08 | Sonos, Inc. | Intelligently increasing the sound level of player |
US9008330B2 (en) | 2012-09-28 | 2015-04-14 | Sonos, Inc. | Crossover frequency adjustments for audio speakers |
KR20140071688A (en) * | 2012-12-04 | 2014-06-12 | 삼성디스플레이 주식회사 | Display Device and Driving Method Thereof |
RU2543316C2 (en) | 2012-12-25 | 2015-02-27 | Закрытое акционерное общество "Лаборатория Касперского" | System and method of fail-safe execution of scheduled tasks in distributed media |
US9510055B2 (en) | 2013-01-23 | 2016-11-29 | Sonos, Inc. | System and method for a media experience social interface |
US9092313B2 (en) * | 2013-01-25 | 2015-07-28 | Honeywell International Inc. | System and method for three input voting |
US9573277B2 (en) * | 2013-04-15 | 2017-02-21 | Alan Rosen | Intelligent visual humanoid robot and computer vision system programmed to perform visual artificial intelligence processes |
US9307508B2 (en) | 2013-04-29 | 2016-04-05 | Google Technology Holdings LLC | Systems and methods for synchronizing multiple electronic devices |
US9256461B2 (en) * | 2013-09-18 | 2016-02-09 | International Business Machines Corporation | Handling interrupt actions for inter-thread communication |
US9720576B2 (en) | 2013-09-30 | 2017-08-01 | Sonos, Inc. | Controlling and displaying zones in a multi-zone system |
US9288596B2 (en) | 2013-09-30 | 2016-03-15 | Sonos, Inc. | Coordinator device for paired or consolidated players |
US9654545B2 (en) | 2013-09-30 | 2017-05-16 | Sonos, Inc. | Group coordinator device selection |
US20150095679A1 (en) | 2013-09-30 | 2015-04-02 | Sonos, Inc. | Transitioning A Networked Playback Device Between Operating Modes |
US9300647B2 (en) | 2014-01-15 | 2016-03-29 | Sonos, Inc. | Software application and zones |
US9313591B2 (en) | 2014-01-27 | 2016-04-12 | Sonos, Inc. | Audio synchronization among playback devices using offset information |
US20150220498A1 (en) | 2014-02-05 | 2015-08-06 | Sonos, Inc. | Remote Creation of a Playback Queue for a Future Event |
US9226073B2 (en) | 2014-02-06 | 2015-12-29 | Sonos, Inc. | Audio output balancing during synchronized playback |
US9226087B2 (en) | 2014-02-06 | 2015-12-29 | Sonos, Inc. | Audio output balancing during synchronized playback |
US9679054B2 (en) | 2014-03-05 | 2017-06-13 | Sonos, Inc. | Webpage media playback |
US10587693B2 (en) | 2014-04-01 | 2020-03-10 | Sonos, Inc. | Mirrored queues |
US9302393B1 (en) * | 2014-04-15 | 2016-04-05 | Alan Rosen | Intelligent auditory humanoid robot and computerized verbalization system programmed to perform auditory and verbal artificial intelligence processes |
US20150324552A1 (en) | 2014-05-12 | 2015-11-12 | Sonos, Inc. | Share Restriction for Media Items |
US20150356084A1 (en) | 2014-06-05 | 2015-12-10 | Sonos, Inc. | Social Queue |
US9874997B2 (en) | 2014-08-08 | 2018-01-23 | Sonos, Inc. | Social playback queues |
US10645130B2 (en) | 2014-09-24 | 2020-05-05 | Sonos, Inc. | Playback updates |
US9723038B2 (en) | 2014-09-24 | 2017-08-01 | Sonos, Inc. | Social media connection recommendations based on playback information |
US9667679B2 (en) | 2014-09-24 | 2017-05-30 | Sonos, Inc. | Indicating an association between a social-media account and a media playback system |
US9959087B2 (en) | 2014-09-24 | 2018-05-01 | Sonos, Inc. | Media item context from social media |
US9690540B2 (en) | 2014-09-24 | 2017-06-27 | Sonos, Inc. | Social media queue |
EP3114625A1 (en) | 2014-09-24 | 2017-01-11 | Sonos, Inc. | Social media connection recommendations based on playback information |
US9860286B2 (en) | 2014-09-24 | 2018-01-02 | Sonos, Inc. | Associating a captured image with a media item |
DE102014114883A1 (en) * | 2014-10-14 | 2016-04-14 | Beckhoff Automation Gmbh | Method and system for monitoring a first subscriber of a communication network |
WO2016067420A1 (en) * | 2014-10-30 | 2016-05-06 | 三菱電機株式会社 | Computer, and data processing method and program |
US10248376B2 (en) | 2015-06-11 | 2019-04-02 | Sonos, Inc. | Multiple groupings in a playback system |
US10303422B1 (en) | 2016-01-05 | 2019-05-28 | Sonos, Inc. | Multiple-device setup |
US9886234B2 (en) | 2016-01-28 | 2018-02-06 | Sonos, Inc. | Systems and methods of distributing audio to one or more playback devices |
US10101941B2 (en) * | 2016-09-20 | 2018-10-16 | International Business Machines Corporation | Data mirror invalid timestamped write handling |
US10712997B2 (en) | 2016-10-17 | 2020-07-14 | Sonos, Inc. | Room association based on name |
US11108698B2 (en) * | 2017-02-03 | 2021-08-31 | Microsoft Technology Licensing, Llc | Systems and methods for client-side throttling after server handling in a trusted client component |
US10802878B2 (en) * | 2017-03-31 | 2020-10-13 | Bmc Software, Inc. | Phased start and stop of resources in a mainframe environment |
DE102017208484A1 (en) * | 2017-05-19 | 2018-11-22 | Robert Bosch Gmbh | Method and device for detecting hardware errors in microprocessors |
JP6934754B2 (en) * | 2017-06-15 | 2021-09-15 | 株式会社日立製作所 | Distributed processing system, distributed processing system management method, and distributed processing system management program |
CN107943567B (en) * | 2017-10-20 | 2021-12-28 | 北京知道未来信息技术有限公司 | High-reliability task scheduling method and system based on AMQP protocol |
TWI639921B (en) * | 2017-11-22 | 2018-11-01 | 大陸商深圳大心電子科技有限公司 | Command processing method and storage controller using the same |
US10802929B2 (en) | 2018-01-03 | 2020-10-13 | Tesla, Inc. | Parallel processing system runtime state reload |
US11361839B2 (en) | 2018-03-26 | 2022-06-14 | Rambus Inc. | Command/address channel error detection |
KR20240007689A (en) | 2018-05-31 | 2024-01-16 | 조비 에어로, 인크. | Electric power system architecture and fault tolerant vtol aircraft using same |
US12006048B2 (en) | 2018-05-31 | 2024-06-11 | Joby Aero, Inc. | Electric power system architecture and fault tolerant VTOL aircraft using same |
US10666565B2 (en) * | 2018-06-08 | 2020-05-26 | Citrix Systems, Inc. | Method to measure relative QOS gains and to reduce the variance in QOS for similar connections during bandwidth contention |
WO2020009871A1 (en) | 2018-07-02 | 2020-01-09 | Joby Aero, Inc. | System and method for airspeed determination |
US11323214B2 (en) | 2018-09-17 | 2022-05-03 | Joby Aero, Inc. | Aircraft control system |
US20200331602A1 (en) | 2018-12-07 | 2020-10-22 | Joby Aero, Inc. | Rotary airfoil and design method therefor |
EP3891067B1 (en) | 2018-12-07 | 2024-01-17 | Joby Aero, Inc. | Aircraft control system and method |
US10845823B2 (en) | 2018-12-19 | 2020-11-24 | Joby Aero, Inc. | Vehicle navigation system |
KR102652792B1 (en) | 2019-04-23 | 2024-03-29 | 조비 에어로, 인크. | Battery thermal management system and method |
US11230384B2 (en) | 2019-04-23 | 2022-01-25 | Joby Aero, Inc. | Vehicle cabin thermal management system and method |
EP3959127A4 (en) | 2019-04-25 | 2023-01-11 | Joby Aero, Inc. | Vtol aircraft |
US10951475B2 (en) * | 2019-06-28 | 2021-03-16 | Intel Corporation | Technologies for transmit scheduler dynamic configurations |
EP4162473A4 (en) | 2020-06-05 | 2024-07-03 | Joby Aero Inc | Aircraft control system and method |
CN113254177B (en) * | 2021-05-31 | 2023-06-27 | 广州虎牙科技有限公司 | Task submitting method based on cluster, computer program product and electronic equipment |
CN118626507B (en) * | 2024-08-14 | 2024-10-29 | 济南浪潮数据技术有限公司 | Data consistency processing method, distributed storage system and electronic equipment |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3748449A (en) * | 1971-12-02 | 1973-07-24 | Litton Systems Inc | Device for determining the median number in a series of numbers |
US4183087A (en) * | 1978-03-07 | 1980-01-08 | Hughes Aircraft Company | Peak deviation sampling |
US4318173A (en) * | 1980-02-05 | 1982-03-02 | The Bendix Corporation | Scheduler for a multiple computer system |
US4334223A (en) * | 1980-06-18 | 1982-06-08 | Sperry Corporation | Median detector |
US4392199A (en) * | 1979-09-28 | 1983-07-05 | Siemens Aktiengesellschaft | Fault-tolerant system employing multi-microcomputers using two-out-of-three majority decision |
US4438494A (en) * | 1981-08-25 | 1984-03-20 | Intel Corporation | Apparatus of fault-handling in a multiprocessing system |
US4503534A (en) * | 1982-06-30 | 1985-03-05 | Intel Corporation | Apparatus for redundant operation of modules in a multiprocessing system |
US4503535A (en) * | 1982-06-30 | 1985-03-05 | Intel Corporation | Apparatus for recovery from failures in a multiprocessing system |
US4513440A (en) * | 1982-06-29 | 1985-04-23 | Harris Corporation | Hardware median filter |
US4523273A (en) * | 1982-12-23 | 1985-06-11 | Purdue Research Foundation | Extra stage cube |
US4583088A (en) * | 1982-06-15 | 1986-04-15 | International Business Machines Corp. | Method and apparatus for controlling access to a communication network |
US4627051A (en) * | 1982-12-03 | 1986-12-02 | Nec | Loop network system controlled by a simple clock station |
US4626843A (en) * | 1983-09-27 | 1986-12-02 | Trw Inc. | Multi-master communication bus system with parallel bus request arbitration |
US4630196A (en) * | 1983-04-13 | 1986-12-16 | At&T Information Systems, Inc. | Store and forward facility for use in multiprocessing environment |
US4642756A (en) * | 1985-03-15 | 1987-02-10 | S & H Computer Systems, Inc. | Method and apparatus for scheduling the execution of multiple processing tasks in a computer system |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3544778A (en) * | 1967-11-29 | 1970-12-01 | Westinghouse Electric Corp | Decision network |
US3667057A (en) * | 1970-05-22 | 1972-05-30 | Bendix Corp | Method and means for providing an output corresponding to the average of acceptable input signals |
US3805235A (en) * | 1972-12-26 | 1974-04-16 | Collins Radio Co | Equalization means for multi-channel redundant control system |
CH601874A5 (en) * | 1976-07-07 | 1978-07-14 | Bbc Brown Boveri & Cie | |
US4323966A (en) * | 1980-02-05 | 1982-04-06 | The Bendix Corporation | Operations controller for a fault-tolerant multiple computer system |
US4375683A (en) * | 1980-11-12 | 1983-03-01 | August Systems | Fault tolerant computational system and voter circuit |
US4554661A (en) * | 1983-10-31 | 1985-11-19 | Burroughs Corporation | Generalized fault reporting system |
US4805107A (en) * | 1987-04-15 | 1989-02-14 | Allied-Signal Inc. | Task scheduler for a fault tolerant multiple node processing system |
- 1987
  - 1987-04-15 US US07/039,190 patent/US4805107A/en not_active Expired - Fee Related
  - 1987-04-15 US US07/038,818 patent/US4816989A/en not_active Expired - Lifetime
  - 1987-04-15 US US07/038,813 patent/US4914657A/en not_active Expired - Lifetime
- 1988
  - 1988-04-15 JP JP63504242A patent/JPH02503122A/en active Pending
  - 1988-04-15 EP EP19880904791 patent/EP0356460A4/en not_active Withdrawn
  - 1988-04-15 WO PCT/US1988/001245 patent/WO1988008161A1/en not_active Application Discontinuation
  - 1988-04-15 CA CA000564342A patent/CA1301938C/en not_active Expired - Lifetime
- 1989
  - 1989-05-12 US US07/351,876 patent/US4933940A/en not_active Expired - Lifetime
  - 1989-06-07 US US07/362,960 patent/US4972415A/en not_active Expired - Lifetime
  - 1989-06-07 US US07/362,957 patent/US4980857A/en not_active Expired - Lifetime
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3748449A (en) * | 1971-12-02 | 1973-07-24 | Litton Systems Inc | Device for determining the median number in a series of numbers |
US4183087A (en) * | 1978-03-07 | 1980-01-08 | Hughes Aircraft Company | Peak deviation sampling |
US4392199A (en) * | 1979-09-28 | 1983-07-05 | Siemens Aktiengesellschaft | Fault-tolerant system employing multi-microcomputers using two-out-of-three majority decision |
US4318173A (en) * | 1980-02-05 | 1982-03-02 | The Bendix Corporation | Scheduler for a multiple computer system |
US4334223A (en) * | 1980-06-18 | 1982-06-08 | Sperry Corporation | Median detector |
US4438494A (en) * | 1981-08-25 | 1984-03-20 | Intel Corporation | Apparatus of fault-handling in a multiprocessing system |
US4583088A (en) * | 1982-06-15 | 1986-04-15 | International Business Machines Corp. | Method and apparatus for controlling access to a communication network |
US4513440A (en) * | 1982-06-29 | 1985-04-23 | Harris Corporation | Hardware median filter |
US4503534A (en) * | 1982-06-30 | 1985-03-05 | Intel Corporation | Apparatus for redundant operation of modules in a multiprocessing system |
US4503535A (en) * | 1982-06-30 | 1985-03-05 | Intel Corporation | Apparatus for recovery from failures in a multiprocessing system |
US4627051A (en) * | 1982-12-03 | 1986-12-02 | NEC Corp | Loop network system controlled by a simple clock station |
US4523273A (en) * | 1982-12-23 | 1985-06-11 | Purdue Research Foundation | Extra stage cube |
US4630196A (en) * | 1983-04-13 | 1986-12-16 | At&T Information Systems, Inc. | Store and forward facility for use in multiprocessing environment |
US4626843A (en) * | 1983-09-27 | 1986-12-02 | Trw Inc. | Multi-master communication bus system with parallel bus request arbitration |
US4642756A (en) * | 1985-03-15 | 1987-02-10 | S & H Computer Systems, Inc. | Method and apparatus for scheduling the execution of multiple processing tasks in a computer system |
Non-Patent Citations (1)
Title |
---|
See also references of EP0356460A4 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0518511A2 (en) * | 1991-06-11 | 1992-12-16 | International Business Machines Corporation | Computer having I/O channel check and parity check detector |
EP0518511A3 (en) * | 1991-06-11 | 1993-10-27 | IBM | Computer having I/O channel check and parity check detector |
WO2001027726A1 (en) * | 1999-10-08 | 2001-04-19 | Alliedsignal Inc. | System and method for fault-tolerant clock synchronization using interactive convergence |
US6801951B1 (en) | 1999-10-08 | 2004-10-05 | Honeywell International Inc. | System and method for fault-tolerant clock synchronization using interactive convergence |
EP1857935A1 (en) * | 2006-05-16 | 2007-11-21 | Saab Ab | Fault tolerant data bus node in a distributed system |
US8194533B2 (en) | 2006-05-16 | 2012-06-05 | Saab Ab | Fault tolerant data bus node and system |
CN114639496A (en) * | 2022-02-24 | 2022-06-17 | 中核武汉核电运行技术股份有限公司 | Redundancy association method and device for nuclear power technological equipment |
CN114639496B (en) * | 2022-02-24 | 2024-11-12 | 中核武汉核电运行技术股份有限公司 | Nuclear power process equipment redundancy association method and device |
Also Published As
Publication number | Publication date |
---|---|
US4933940A (en) | 1990-06-12 |
US4914657A (en) | 1990-04-03 |
JPH02503122A (en) | 1990-09-27 |
US4980857A (en) | 1990-12-25 |
US4805107A (en) | 1989-02-14 |
US4972415A (en) | 1990-11-20 |
US4816989A (en) | 1989-03-28 |
EP0356460A4 (en) | 1992-06-24 |
EP0356460A1 (en) | 1990-03-07 |
CA1301938C (en) | 1992-05-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US4914657A (en) | Operations controller for a fault tolerant multiple node processing system | |
US4937741A (en) | Synchronization of fault-tolerant parallel processing systems | |
CN1192309C (en) | Method and apparatus for managing redundant computer-based system for fault tolerant computing | |
Kieckhafer et al. | The MAFT architecture for distributed fault tolerance |
EP0357768B1 (en) | Record lock processor for multiprocessing data system | |
US4321666A (en) | Fault handler for a multiple computer system | |
US4333144A (en) | Task communicator for multiple computer system | |
US7555516B2 (en) | Fast Paxos recovery | |
CA1254303A (en) | Method for multiprocessor communications | |
EP0143125A2 (en) | Computer system | |
CN100422943C (en) | Latent error detection | |
EP0381334B1 (en) | Apparatus for management, comparison, and correction of redundant digital data | |
EP0315303A2 (en) | Duplicated fault-tolerant computer system with error checking | |
Katsuki et al. | Pluribus—An operational fault-tolerant multiprocessor | |
CA2339783A1 (en) | Fault tolerant computer system | |
Weinstock | SIFT: System design and implementation | |
Palumbo et al. | A performance evaluation of the software-implemented fault-tolerance computer |
Jensen | A distributed function computer for real-time control | |
Junior et al. | A Time-Phased Partitioned Checkpoint Approach to Reduce State Snapshot Overhead | |
CN116069495A (en) | Method, system and computer equipment for scheduling elastic deep learning job | |
Moser et al. | Total ordering algorithms | |
Aguilar et al. | Fault tolerance protocols for parallel programs based on tasks replication | |
Liu et al. | Efficient implementation techniques for gracefully degradable multiprocessor systems | |
Mancini et al. | The Join algorithm: ordering messages in replicated systems | |
CN114416436A (en) | Reliability method for single event upset effect based on SoC chip |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): JP |
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH DE FR GB IT LU NL SE |
WWE | Wipo information: entry into national phase |
Ref document number: 1988904791 Country of ref document: EP |
WWP | Wipo information: published in national office |
Ref document number: 1988904791 Country of ref document: EP |
WWW | Wipo information: withdrawn in national office |
Ref document number: 1988904791 Country of ref document: EP |