US7821919B2 - Data processing apparatus and data processing method - Google Patents

Data processing apparatus and data processing method

Info

Publication number
US7821919B2
Authority
US
United States
Prior art keywords
data
error
reception interface
interface sections
reception
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US10/827,433
Other versions
US20040208130A1 (en
Inventor
Fumitoshi Mizutani
Shinya Oda
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Assigned to NEC CORPORATION: assignment of assignors' interest (see document for details). Assignors: MIZUTANI, FUMITOSHI; ODA, SHINYA
Publication of US20040208130A1
Application granted
Publication of US7821919B2
Expired - Fee Related
Adjusted expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • G06F13/4004Coupling between buses
    • G06F13/4027Coupling between buses using bus bridges
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1415Saving, restoring, recovering or retrying at system level
    • G06F11/1443Transmit or communication errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/1629Error detection by comparing the output of redundant processing systems
    • G06F11/1641Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components
    • G06F11/1645Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components and the comparison itself uses redundant hardware
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026PCI express

Definitions

  • the present invention relates to a data processing apparatus and a data processing method which process same data in parallel.
  • One type of computer system that performs data processing is a fault-tolerant computer system with a redundant architecture designed using existing components, as disclosed in, for example, pages 5 to 7 and FIG. 1 of Unexamined Japanese Patent Application KOKAI Publication No. H9-128349.
  • This computer system employs a lock-step system.
  • a plurality of processors with a redundant architecture synchronously process same data in parallel. Then, the outputs from the processors are compared with one another to detect an error if any and the error is corrected.
  • Recent computer systems are employing a fast serial link system, such as the PCI-Express, Hyper-Transport (registered trademark) or InfiniBand (registered trademark), which can ensure fast data transmission and reception, to connect processors to I/O (Input/Output) systems.
  • When individual interface sections, which mediate data transmission and reception between processors and I/O systems, detect communication errors, for example, they request resending of data at their own timings, which differ from one another. Accordingly, the timing and order of the processes executed by the individual processors deviate, so that the lock-step system cannot be maintained. This makes it difficult for a plurality of processors to process the same data synchronously.
  • a data processing apparatus that has a plurality of reception interface sections ( 16 , 26 ) which receive same data from a same data sender and processes data, received by the plurality of reception interface sections, in parallel.
  • each of the reception interface sections includes a communication error processing section which, upon occurrence of an error in the received data, stops receiving the data, sends a communication error signal to stop data reception from the data sender to other reception interface sections, and requests the data sender to resend data.
  • This structure can permit synchronous processing of same data even when a communication error occurs.
  • the communication error processing section of each of the reception interface sections may cancel the error-occurred data, and request the data sender to resend the canceled data.
  • the data sender may send same serial data, and when an error occurs in received serial data, the communication error processing section of each of the reception interface sections may cancel the error-occurred serial data and serial data received following that error-occurred serial data, and request the data sender to resend the canceled serial data.
  • the data sender may send the data packet by packet with a sequence number affixed to each packet, and when an error occurs in data of the received packet, the communication error processing section of each of the reception interface sections may request the data sender to resend data packet by packet based on the sequence number affixed to each packet received.
  • the data processing apparatus may further comprise a frequency divider which generates a sync signal by dividing a frequency of a predetermined clock signal and sends the generated sync signal to each of the reception interface sections, and each of the reception interface sections may receive data according to the sync signal supplied from the frequency divider.
  • a data processing apparatus that has a transmission interface section which transmits transmission data to a plurality of data receivers at a same timing.
  • the transmission interface section generates packet data by dividing the transmission data to data of a data length sendable within one period of a predetermined clock signal and sends individual pieces of packet data generated to the plurality of data receivers at the same timing in synchronism with the clock signal.
  • a data processing method that performs parallel processing of data received by a plurality of reception interface sections which receive same data from a same data sender.
  • the data processing method comprises:
  • the data reception step and the error information output step may be executed according to a sync signal generated by dividing a frequency of a predetermined clock signal.
  • the data processing method may further comprise:
  • the data processing method may further comprise:
  • the data cancellation step and the data reception stopping step are executed in at least one of a case where an error is detected at the error detection step and a case where error information is received at the error information reception step, and
  • the data resend requesting step requests resending of data canceled at the data cancellation step.
  • the data cancellation step may be executed according to the sync signal.
  • a computer program that performs parallel processing of data received by a plurality of reception interface sections which receive same data from a same data sender.
  • the computer program allows a computer to execute:
  • the data reception step and the error information output step may be executed according to a sync signal generated by dividing a frequency of a predetermined clock signal.
  • the computer may be allowed to further execute:
  • the computer may be allowed to further execute:
  • the data cancellation step and the data reception stopping step may be executed in at least one of a case where an error is detected at the error detection step and a case where error information is received at the error information reception step, and
  • the data resend requesting step may request resending of data canceled at the data cancellation step.
  • the data cancellation step may be executed according to the sync signal.
  • the invention can permit synchronous processing of same data even when a communication error occurs.
  • the invention can guarantee the identity of data when same data is processed in parallel.
  • the invention can allow a computer system to be designed without suffering any restriction on the lengths of communication lines.
  • FIG. 1 is a block diagram illustrating the architecture of a computer system according to an embodiment of the invention
  • FIGS. 2A to 2D are explanatory diagrams showing the structure of packet data to be transmitted and received
  • FIGS. 5A to 5E are timing charts illustrating the operation of the memory bridge shown in FIG. 1 ;
  • FIGS. 6A to 6E are timing charts illustrating the operation of the memory bridge shown in FIG. 1 ;
  • FIGS. 7A to 7E are timing charts illustrating the operation of the memory bridge shown in FIG. 1 ;
  • FIGS. 8A to 8E are timing charts illustrating the operation of the memory bridge shown in FIG. 1 ;
  • FIGS. 9A to 9E are timing charts illustrating the operation of the memory bridge shown in FIG. 1 ;
  • FIG. 10 is a flowchart illustrating the procedures of the operation of the computer system according to the embodiment of the invention.
  • the data processing apparatus is explained as a computer system which has a redundant architecture.
  • the sub system 1 has an arithmetic operation system 11 and an I/O (Input/Output) system 12 .
  • the sub system 2 has an arithmetic operation system 21 and an I/O system 22 .
  • the arithmetic operation systems 11 and 21 are supplied with synchronous clock signals CLK of 166 MHz. Accordingly, the sub systems 1 and 2 synchronously and simultaneously execute the same process according to the lock-step system.
  • a frequency divider 31 is connected between the sub systems 1 and 2 .
  • the frequency divider 31 frequency-divides the supplied clock signal CLK of an FSB (Front Side Bus), thereby generating a sync signal S 1 .
  • the frequency divider 31 supplies the generated sync signal S 1 to a memory bridge 16 of the arithmetic operation system 11 , a memory bridge 26 of the arithmetic operation system 21 , an I/O bridge 18 of the I/O system 12 and an I/O bridge 28 of the I/O system 22 .
  • the PCI-Express interface is used in data transmission and reception.
  • the PCI-Express interface employs a serial link to prevent data skew between signal lines that occurs in a parallel bus.
  • the arithmetic operation systems 11 and 21 and the I/O systems 12 and 22 are connected to one another according to the PCI-Express interface.
  • the frequency divider 31 divides the frequency of 166 MHz of the clock signal CLK to a one-sixteenth frequency of 10.4 MHz in such a way that one period of the sync signal S 1 becomes equivalent to 24 symbol times of the PCI-Express interface of 2.5 Gbps/lane.
  • devices such as the arithmetic operation systems 11 and 21 and the I/O systems 12 and 22 , are connected one to one.
  • the link uses four signal lines bidirectionally, two in each direction. The set of four signal lines is called a lane.
  • One symbol time is the time needed to send valid data of 1 byte according to the PCI-Express interface after data in one lane is encoded with 8B/10B.
  • the I/O bridges 18 and 28 and the memory bridges 16 and 26 can send valid data of 24 bytes per lane to one another in one period of the sync signal S 1 .
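  • As a worked check of the figures quoted above, the following Python sketch recomputes the sync signal frequency, the symbol time and the number of symbol times that fit in one period of the sync signal S 1. The constant names are illustrative and do not come from the embodiment, and the value of 10 line bits per symbol is an assumption based on the usual 8B/10B mapping of one byte to ten bits.

```python
# Figures quoted in the description; all names here are illustrative.
CLK_HZ = 166e6          # clock signal CLK supplied to the frequency divider 31
DIVIDE_RATIO = 16       # CLK is divided to a one-sixteenth frequency
LINE_RATE_BPS = 2.5e9   # PCI-Express line rate per lane
BITS_PER_SYMBOL = 10    # assumption: 8B/10B sends 1 valid byte as 10 bits

sync_hz = CLK_HZ / DIVIDE_RATIO                          # about 10.4 MHz
sync_period_ns = 1e9 / sync_hz                           # about 96.4 ns
symbol_time_ns = 1e9 * BITS_PER_SYMBOL / LINE_RATE_BPS   # 4 ns per symbol

# Whole symbol times that fit in one sync period, which is also the
# number of valid bytes one lane can carry in that period.
symbols_per_period = int(sync_period_ns // symbol_time_ns)

print(f"sync signal: {sync_hz / 1e6:.3f} MHz, period {sync_period_ns:.2f} ns")
print(f"symbol time: {symbol_time_ns:.1f} ns")
print(f"valid bytes per lane per sync period: {symbols_per_period}")
```

  • Under these assumptions, one period of the sync signal S 1 holds 24 whole symbol times with a small margin left over, which matches the 24 bytes per lane stated above.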
  • the arithmetic operation system 11 has processors 13 and 14 , a main memory unit 15 and the memory bridge 16 .
  • the arithmetic operation system 21 has processors 23 and 24 , a main memory unit 25 and the memory bridge 26 .
  • the processors 13 , 14 , 23 and 24 execute arithmetic operations.
  • the main memory units 15 and 25 store data and the like.
  • the memory bridge 16 is connected to the processors 13 and 14 by the front side bus (FSB), and the memory bridge 26 is connected to the processors 23 and 24 by the front side bus (FSB).
  • the memory bridges 16 and 26 operate in synchronism with the clock signal CLK.
  • the memory bridges 16 and 26 perform data transmission and reception with the I/O bridges 18 and 28 .
  • the memory bridges 16 and 26 send and receive a communication error signal S 2 to and from each other.
  • the memory bridges 16 and 26 share information on a communication error by the communication error signal S 2 and execute an error process in concert with each other.
  • the communication error signal S 2 is sent and received as an open drain signal. The detailed structures of the memory bridges 16 and 26 will be discussed later.
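  • A minimal sketch of why an open drain line suits a shared error signal: the line stays high unless at least one device pulls it low, so either memory bridge can assert the signal and both observe the same level. The sketch assumes that the low level is the asserted level, as the later description of the communication error signal S 2 indicates; the function name is hypothetical.

```python
def open_drain_level(pull_low_requests):
    """Return the level seen on a shared open drain line.

    pull_low_requests holds one boolean per connected device, True when
    that device pulls the line low. The line reads 'L' if any device
    pulls it low and 'H' (pulled up) otherwise.
    """
    return "L" if any(pull_low_requests) else "H"


# Neither memory bridge reports an error: the line stays high (deasserted).
idle = open_drain_level([False, False])
# Only one memory bridge reports an error: both bridges see a low level.
one_error = open_drain_level([True, False])
```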
  • the I/O system 12 has an I/O device 17 , the I/O bridge 18 and a configuration register 19 .
  • the I/O system 22 has an I/O device 27 , the I/O bridge 28 and a configuration register 29 .
  • the I/O devices 17 and 27 perform data transmission and reception with the I/O bridges 18 and 28 , respectively.
  • the I/O bridges 18 and 28 perform serial transmission with the I/O devices 17 and 27 or the memory bridges 16 and 26 , respectively.
  • the memory bridges 16 and 26 of the respective arithmetic operation systems 11 and 21 are cross-linked to the I/O bridges 18 and 28 of the respective I/O systems 12 and 22 by the links L 1 . That is, the memory bridge 16 of the arithmetic operation system 11 is connected to the I/O systems 12 and 22 , and the memory bridge 26 of the arithmetic operation system 21 is connected to the I/O systems 12 and 22 .
  • This cross-link connection allows each of the arithmetic operation systems 11 and 21 to communicate with the I/O systems 12 and 22 . Accordingly, each of the I/O systems 12 and 22 can communicate with the arithmetic operation systems 11 and 21 .
  • To ensure layer-by-layer upgrading, the functions of the PCI-Express interface are hierarchized. The protocol is defined for each layer.
  • a header is added to data in a transaction layer to thereby generate a transaction layer packet as shown in FIGS. 2A and 2B .
  • a sequence number and a CRC (Cyclic Redundancy Check) are added as status information to the transaction layer packet in a data link layer, thereby generating a data link layer packet (DLLP).
  • frame data is added to the data link layer packet in a physical layer.
  • the resultant packet is transmitted and received.
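  • The layering described above can be sketched in Python as follows. This is an illustration only: the field widths, the placeholder framing bytes and the use of `zlib.crc32` are assumptions made for the example and are not the actual PCI-Express encodings, and all function names are hypothetical.

```python
import struct
import zlib

# Placeholder framing markers; the real physical layer uses special symbols.
FRAME_START = b"\x02"
FRAME_END = b"\x03"


def build_transaction_layer_packet(header: bytes, data: bytes) -> bytes:
    # Transaction layer: a header is added to the data.
    return header + data


def build_data_link_layer_packet(seq: int, tlp: bytes) -> bytes:
    # Data link layer: a sequence number is put in front and a CRC behind.
    body = struct.pack(">H", seq) + tlp          # assumed 16-bit sequence field
    return body + struct.pack(">I", zlib.crc32(body))


def build_frame(dllp: bytes) -> bytes:
    # Physical layer: frame data is added around the data link layer packet.
    return FRAME_START + dllp + FRAME_END


def check_data_link_layer_packet(dllp: bytes):
    # Receiver side: recompute the CRC and compare it with the received one.
    body = dllp[:-4]
    (received_crc,) = struct.unpack(">I", dllp[-4:])
    (seq,) = struct.unpack(">H", body[:2])
    return zlib.crc32(body) == received_crc, seq, body[2:]
```

  • A receiver that flips even a single bit of such a packet sees a CRC mismatch, which is the condition treated as a communication error in the sections that follow.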
  • Each of the I/O bridges 18 and 28 has an interface circuit section (not shown) for transmission and reception of data according to the PCI-Express interface.
  • Each of the configuration-registers 19 and 29 holds data for limiting the packet length and the number of pieces of data of an upstream packet to be sent from the I/O bridges 18 and 28 to the memory bridges 16 and 26 in one period of the sync signal S 1 supplied from the frequency divider 31 .
  • the packet length and the number of pieces of data are limited to prevent a packet to be sent from being influenced by the length of the transmission path and the drifting of the clock. Specifically, the maximum packet length of individual packets is 192 bytes.
  • the I/O bridge 18 or 28 simultaneously sends the same packet to the respective memory bridges 16 and 26 at the rising timing of the sync signal S 1 .
  • When sending a plurality of small packets, the I/O bridges 18 and 28 perform transmission control so that the number of pieces of data does not exceed the maximum data number transmittable in one period of the sync signal S 1 . According to the control, the I/O bridges 18 and 28 work in such a way that transmission time of one packet does not exceed one period of the sync signal S 1 of 10.4 MHz.
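  • One way to picture this transmission control is the Python sketch below, which splits transmission data into packets no longer than a configured maximum and then groups the packets so that the bytes sent in one period of the sync signal never exceed a per-period budget. The function names and the two limit parameters are hypothetical stand-ins for the values held in the configuration registers 19 and 29; the embodiment does not describe its control logic at this level of detail.

```python
def split_into_packets(data: bytes, max_packet_len: int) -> list:
    """Divide transmission data into packets of at most max_packet_len bytes."""
    if max_packet_len <= 0:
        raise ValueError("max_packet_len must be positive")
    return [data[i:i + max_packet_len]
            for i in range(0, len(data), max_packet_len)]


def schedule_by_sync_period(packets: list, max_bytes_per_period: int) -> list:
    """Group packets so each group fits within one period of the sync signal."""
    periods, current, used = [], [], 0
    for packet in packets:
        if len(packet) > max_bytes_per_period:
            raise ValueError("a single packet would exceed one sync period")
        if used + len(packet) > max_bytes_per_period:
            # The budget of this period is used up; start the next period.
            periods.append(current)
            current, used = [], 0
        current.append(packet)
        used += len(packet)
    if current:
        periods.append(current)
    return periods
```

  • For example, 50 bytes split with a 24-byte packet limit give packets of 24, 24 and 2 bytes; with a 24-byte budget per period they occupy three periods, and with a 48-byte budget they occupy two.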
  • the values in the configuration registers 19 and 29 can be changed using the BIOS (Basic Input/Output System).
  • each of the sub systems 1 and 2 has a non-volatile memory (not shown) to store the BIOS.
  • the computer system with the above-described architecture performs failure diagnosis by comparing communication contents exchanged between the sub systems.
  • the computer system masks the sub system having failure and continues the process in progress using the remaining sub system.
  • the structures of the memory bridges 16 and 26 are discussed next. As the structure of the memory bridge 26 is the same as that of the memory bridge 16 , only the structure of the memory bridge 16 is described below.
  • the memory bridge 16 has an interface circuit section 40 , a synchronization buffer 50 and an internal circuit section 60 as shown in FIG. 3 .
  • the interface circuit section 40 is provided in association with the PCI-Express interface.
  • the interface circuit section 40 is separated into a data link/physical layer 41 and a transaction layer 42 .
  • the data link/physical layer 41 is separated into physical layers 43 - 1 to 43 n, a data link layer (RX) 44 and a data link layer (TX) 45 .
  • the transaction layer 42 is separated into a communication error processing section 46 and a transaction layer 47 .
  • the data link/physical layer 41 , the transaction layer 42 and the internal circuit section 60 operate in synchronism with different clock signals.
  • the physical layers 43 - 1 to 43 n send and receive packets shown in FIG. 2D in one period of the sync signal S 1 .
  • the interface circuit section 40 has an elastic buffer (EB) to hold a packet to be transmitted or received.
  • the data link layer (RX) 44 acquires a data link layer packet from the packet shown in FIG. 2D .
  • the data link layer (TX) 45 receives an ACK/NACK/flow control signal output from the communication error processing section 46 .
  • the communication error processing section 46 performs a process associated with a communication error.
  • the data link layer (RX) 44 directly sends some error signals in status information to the transaction layer 47 , and the data link layer (RX) 44 directly sends the ACK/NACK/flow control signal to the data link layer (TX) 45 .
  • the memory bridge 16 has the communication error processing section 46 which acquires the status information at the data link layer (RX) 44 .
  • the communication error processing section 46 sends the acquired status information to the transaction layer 47 and the data link layer (TX) 45 .
  • the communication error processing section 46 checks the CRC affixed to the data link layer packet to detect a communication error if any. The communication error processing section 46 then outputs error information.
  • the communication error processing section 46 sends data and status information as they are to the transaction layer 47 and the data link layer (TX) 45 .
  • the interface circuit section 40 regularly returns an ACK signal to the I/O bridges 18 and 28 which have sent the data, according to the status information.
  • the communication error processing section 46 cancels all the packets received in one period of the sync signal S 1 as lost packets. Then, the communication error processing section 46 stops outputting the received data to the transaction layer 47 .
  • the communication error processing section 46 instructs the data link layer (RX) 44 to set the sequence number of a next packet to be received to the sequence number prior to the reception of the communication error packet.
  • When detecting a communication error, the communication error processing section 46 asserts or enables a communication error signal S 2 for one period of the sync signal S 1 . The communication error processing section 46 sends the asserted communication error signal S 2 to the memory bridge 26 via the signal line.
  • the transaction layer 47 accepts a read request and a write request from a higher-rank software layer and requests the data link layer (RX) 44 and the data link layer (TX) 45 to transfer a packet.
  • the synchronization buffer 50 serves to exchange data between the transaction layer 47 and the internal circuit section 60 .
  • the synchronization buffer 50 holds data output from the transaction layer 47 .
  • the internal circuit section 60 acquires data held in the synchronization buffer 50 at the timing synchronous with the sync signal S 1 and sends the acquired data to the processors 13 and 14 and the main memory unit 15 .
  • the interface circuit sections of the I/O bridges 18 and 28 become transmission interface sections and the interface circuit sections of the memory bridges 16 and 26 become reception interface sections.
  • the memory bridges 16 and 26 can send serial data to the I/O bridges 18 and 28 .
  • the interface circuit sections of the memory bridges 16 and 26 become transmission interface sections and the interface circuit sections of the I/O bridges 18 and 28 become reception interface sections.
  • When supplied with data from the I/O device 17 , the I/O bridge 18 adds a header to the serial data at the transaction layer as shown in FIGS. 2A and 2B . Then, the I/O bridge 18 generates a transaction layer packet.
  • the I/O bridge 18 adds the sequence number and CRC as status information to the generated transaction layer packet at the data link layer. Then, the I/O bridge 18 generates a data link layer packet.
  • the I/O bridge 18 adds frame data to the generated data link layer packet at the physical layer, as shown in FIG. 2D . Then, the I/O bridge 18 sends the packet shown in FIG. 2D to the memory bridges 16 and 26 via the links L 1 .
  • the interface circuit section 40 of the memory bridge 16 receives the data at the physical layers 43 - 1 to 43 n.
  • the interface circuit section 40 temporarily stores all the packets, received at the physical layers 43 - 1 to 43 n in one period of the sync signal S 1 , in the elastic buffer as shown in FIGS. 4A and 4B . Then, the interface circuit section 40 sends the stored packets to the data link layer (RX) 44 .
  • the interface circuit section 40 acquires a data link layer packet from the packet shown in FIG. 2D at the data link layer (RX) 44 .
  • the interface circuit section 40 performs error detection based on the CRC included in the data link layer packet shown in FIG. 2C .
  • the communication error processing section 46 acquires packets received in each period of the sync signal S 1 in synchronism with the next rising of the sync signal S 1 .
  • the communication error processing section 46 sets the communication error signal S 2 to a high (H) level to deassert or disable the communication error signal S 2 . Therefore, the received packets become valid.
  • the communication error processing section 46 sends each packet to the transaction layer 47 in synchronism with the next rising of the sync signal S 1 as shown in FIG. 4E .
  • the interface circuit section 40 acquires a transaction layer packet from the data link layer packet at the transaction layer 47 .
  • the interface circuit section 40 then acquires data from the transaction layer packet and sends the data to the synchronization buffer 50 as shown in FIG. 4F .
  • the internal circuit section 60 acquires data from the synchronization buffer 50 in synchronism with the rising of the sync signal S 1 . Then, the internal circuit section 60 sends the acquired data to the processors 13 and 14 and the main memory unit 15 .
  • the memory bridges 16 and 26 receive data nearly at the same time as shown in FIGS. 5B and 5C , if the length of the link L 1 between the I/O bridge 18 and the memory bridge 16 and the length of the link L 1 between the I/O bridge 18 and the memory bridge 26 hardly differ from each other.
  • the arithmetic operation systems 11 and 21 execute the same process in synchronism with the clock signal CLK.
  • the I/O bridge 18 changes data stored in the configuration register 19 using the BIOS in such a way as to make the length of data in one packet shorter.
  • the interface circuit section 40 of the memory bridge 16 detects a communication error in the packet received at the second clock cycle of the sync signal S 1 as shown in FIGS. 6A and 6B .
  • the communication error processing section 46 cancels all the packets at the third clock cycle even if the packets include a data link layer packet (DLLP).
  • the communication error processing section 46 cancels packets at and following the third clock cycle.
  • the communication error processing section 46 cancels reception of all packets until the packets canceled at the third clock cycle are sent again.
  • the communication error processing section 46 sets the sequence number of the packet managed at the data link layer (RX) 44 to the sequence number prior to the occurrence of the communication error.
  • the communication error processing section 46 sets the communication error signal S 2 to a low (L) level to assert or enable the communication error signal S 2 .
  • At the fourth clock cycle of the sync signal S 1 , no packets are received so that the communication error signal S 2 is deasserted.
  • the communication error processing section 46 requests the I/O bridge 18 , which is the data sender, to resend data.
  • the I/O bridge 18 resends a packet whose resending is requested.
  • the I/O bridge 18 resends a packet whose transmission has not been acknowledged even when a predetermined period has passed without an ACK signal returned from the memory bridge 16 .
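  • The sender-side behavior described above (resend on request, and resend when no ACK signal arrives within a predetermined period) can be sketched as a small replay buffer. This is a sketch under stated assumptions, not the I/O bridge's actual logic: the class and method names are hypothetical, the timeout is counted in sync periods, an ACK is assumed to acknowledge every packet up to and including its sequence number, and sequence number wrap-around is ignored.

```python
class ReplayBuffer:
    """Hypothetical store of sent packets that are not yet acknowledged."""

    def __init__(self, timeout_periods: int):
        self.timeout_periods = timeout_periods
        self.pending = {}   # sequence number -> [payload, periods waited]

    def sent(self, seq: int, payload: bytes) -> None:
        # Remember a packet until its reception is acknowledged.
        self.pending[seq] = [payload, 0]

    def ack(self, seq: int) -> None:
        # Assumption: an ACK covers this packet and all earlier ones.
        for s in [s for s in self.pending if s <= seq]:
            del self.pending[s]

    def resend_request(self, seq: int) -> list:
        # Return the requested packet and every later one, in order.
        out = []
        for s in sorted(self.pending):
            if s >= seq:
                self.pending[s][1] = 0          # restart the wait
                out.append((s, self.pending[s][0]))
        return out

    def tick(self) -> list:
        # One sync period has passed; return packets whose ACK timed out.
        out = []
        for s in sorted(self.pending):
            entry = self.pending[s]
            entry[1] += 1
            if entry[1] >= self.timeout_periods:
                entry[1] = 0
                out.append((s, entry[0]))
        return out
```

  • Returning every packet from the requested sequence number onward matches the description that packets received after an erroneous one are also canceled and must be sent again.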
  • the memory bridge 16 receives a packet with a sequence number 2 in response to the resend request at the sixth clock cycle of the sync signal S 1 as shown in FIG. 7B .
  • the interface circuit section 40 receives the packets canceled at and following the third clock cycle as shown in FIGS. 7C , 7 D and 7 E.
  • the memory bridge 26 sends the low-level communication error signal S 2 to the memory bridge 16 as shown in FIG. 8D .
  • the communication error processing section 46 cancels the packet with the sequence number 2 , held in the communication error processing section 46 at the third clock cycle of the sync signal S 1 , as shown in FIG. 8E .
  • the communication error processing section 46 stops giving a packet to the transaction layer 47 .
  • the communication error processing section 46 sets the sequence number of a packet to be received next, which is managed by the data link layer (RX) 44 , to the value prior to the packet cancellation.
  • the communication error processing section 46 cancels the packet having the communication error first as shown in FIGS. 9C and 9E .
  • When canceling the packet having the communication error, the communication error processing section 46 asserts the communication error signal S 2 to request resending of a sequence of packets with and following the sequence number 2 , as shown in FIG. 9D . While canceling the packet having the communication error, however, the communication error processing section 46 does not cancel the data link layer packet received at the third clock cycle of the sync signal S 1 , as shown in FIGS. 9C and 9E . This is because the data link layer packet has no sequence number so that a sequence number error does not occur.
  • the communication error processing section 46 can specify the order from the sequence numbers of resent packets without canceling the data link layer packet. Accordingly, the memory bridge 16 can receive resent packets without problems.
  • the communication error processing section 46 cancels the received packet. Then, the communication error processing section 46 sends the asserted communication error signal S 2 to the memory bridge 26 and requests the packet sender to resend the canceled packet.
  • the communication error processing sections of the memory bridges 16 and 26 cooperate to request the packet sender to resend a packet. Accordingly, a deviation in synchronism of received data can be avoided.
  • the arithmetic operation systems 11 and 21 can therefore process same data synchronously.
  • the packet lengths and the number of pieces of data of a packet to be transmitted are limited by the I/O bridge 18 to avoid influence of a difference in lengths of the links L 1 if any.
  • the embodiment therefore makes it easier to design the circuit board to construct a fault-tolerant computer system and design the casing of the computer system.
  • FIG. 10 is a flowchart illustrating the procedures of the processing of data received in one period of the sync signal S 1 in the interface circuit section 40 .
  • the interface circuit section 40 receives packets, sent from the I/O bridge 18 , at the physical layers 43 - 1 to 43 n according to the sync signal S 1 shown in FIG. 4A .
  • the interface circuit section 40 temporarily stores all packets, received in one period of the sync signal S 1 , in the elastic buffer, as shown in FIG. 4B , and then sends the packets to the data link layer (RX) 44 .
  • the interface circuit section 40 sends the received data to the communication error processing section 46 .
  • the communication error processing section 46 acquires the data in synchronism with the rising of the sync signal S 1 .
  • the communication error processing section 46 checks the CRC included in the data link layer packet to determine if a communication error is detected.
  • at step 5, the communication error processing section 46 asserts the communication error signal S 2 , that is, sets the communication error signal S 2 to a low (L) level, in synchronism with the rising of the sync signal S 1 to enable the error, as shown in FIG. 6D .
  • the communication error processing section 46 sends the asserted (low-level) communication error signal S 2 to the interface circuit section of the memory bridge 26 via the signal line. Accordingly, a plurality of interface circuit sections can share error information on received packets. As the communication error signal S 2 is synchronous with the rising of the sync signal S 1 , the plural interface circuit sections can share synchronous error information.
  • the communication error processing section 46 determines whether the asserted (low-level) communication error signal S 2 has been received from the interface circuit section of the memory bridge 26 or not.
  • the communication error signal S 2 is asserted in synchronism with the rising of the sync signal S 1 even if the interface circuit section 40 of the memory bridge 16 does not detect a communication error, as shown in FIG. 8D .
  • at step 8, the communication error processing section 46 cancels all the packets received in one period of the sync signal S 1 as shown in FIGS. 6E and 8E .
  • while the communication error signal S 2 is asserted, no packet is sent to the transaction layer 47 in synchronism with the next rising of the sync signal S 1 . Therefore, the sending of packets to the transaction layer 47 is stopped. Further, packets in the next period are canceled.
  • the communication error processing section 46 sets the sequence number of a packet to be received next, which is managed by the data link layer (RX) 44 , to the value prior to the occurrence of the error. This can stop reception of other packets until the packet whose communication error has been detected is received.
  • the communication error processing section 46 requests the I/O bridge 18 , which is the data sender, to resend data.
  • the I/O bridge 18 sends the requested packet to the memory bridges 16 and 26 . Even when a single interface circuit section detects a communication error, therefore, a plurality of interface circuit sections can receive same data synchronously.
  • the communication error processing section 46 deasserts the communication error signal S 2 or sets the communication error signal S 2 to a high (H) level in synchronism with the rising of the sync signal S 1 to disable the error, as shown in FIGS. 6D and 8D .
  • the communication error signal S 2 can be deasserted.
  • the interface circuit section 40 determines whether the packet corresponding to the resend request has been received or not.
  • step 12 is repeated.
  • the flow returns to step 2 and the interface circuit section 40 receives the canceled packet and subsequent packets, as shown in FIGS. 7A to 7E .
  • at step 13, the communication error processing section 46 sends the packet to the transaction layer 47 in synchronism with the next rising of the sync signal S 1 as shown in FIG. 4E .
  • the communication error processing section 46 sends status information to the transaction layer 47 and the data link layer (TX) 45 . According to the status information, the interface circuit section 40 regularly returns the ACK signal to the I/O bridge 18 which has sent the data.
  • at step 14 (ST 14 ), the interface circuit section 40 acquires data from the transaction layer packet and sends the data to the synchronization buffer 50 as shown in FIG. 4F .
  • step 9 at which the sequence number is set can be executed before step 8 at which a packet is canceled, or these steps can be executed in parallel.
  • step 7 at which reception of the communication error signal S 2 is determined can be executed before step 4 at which detection of a communication error is determined, or these steps can be executed in parallel.
  • a computer can be allowed to execute the procedures of the operation by a computer program.
  • the computer program can be recorded on a computer-readable recording medium, such as a floppy disk, CD-ROM or hard disk.
  • when the computer program is loaded into the main memory unit 15 , the computer can perform the operation described above.
  • the invention is not limited to the embodiment described above and can be worked out in various embodiments.
  • each of the memory bridges 16 and 26 and the I/O bridges 18 and 28 is so constructed as to have an interface circuit section in the embodiment.
  • the arithmetic operation systems 11 and 21 may respectively have transmission/reception bridges 71 and 72 in addition to the memory bridges 16 and 26 , as shown in FIG. 11 .
  • the arithmetic operation systems 11 and 21 may respectively have transmission/reception bridges 81 and 82 in addition to the I/O systems 12 and 22 .
  • each of the transmission/reception bridges 71 , 72 , 81 and 82 has a communication error processing section.
  • the transmission/reception bridges 71 and 72 are connected to the memory bridges 16 and 26 , respectively.
  • the transmission/reception bridges 81 and 82 are connected to the I/O bridges 18 and 28 , respectively.
  • the transmission/reception bridges 71 and 72 and the transmission/reception bridges 81 and 82 synchronously perform data exchange according to the lock-step system.
  • the transmission/reception bridges 71 and 72 are connected to the existing memory bridges 16 and 26 by one set of communication links. The connection is achieved by fast serial links that are supported by the existing memory bridges.
  • the length of the link between the memory bridge 16 and the transmission/reception bridge 71 and the length of the link between the memory bridge 26 and the transmission/reception bridge 72 are made as short as possible in order to avoid occurrence of a communication error originating from a difference in reception timing.
  • This structure can realize a fault-tolerant computer system while using existing system chip set components as they are.
  • the computer system takes a double redundant architecture that has two sub systems 1 and 2 which respectively have two processors 13 and 14 and two processors 23 and 24 .
  • the architecture of the computer system is not however restrictive and can take a triple redundant architecture or an architecture having a greater number of redundancy levels.
  • the embodiment has been explained as an example which uses the PCI-Express interface for fast serial links.
  • the link system is not however limited to this particular type.
  • other fast serial links of InfiniBand, HyperTransport or the like may be used instead of the PCI-Express.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Communication Control (AREA)
  • Retry When Errors Occur (AREA)
  • Hardware Redundancy (AREA)
  • Multi Processors (AREA)

Abstract

Each of memory bridges and I/O bridges, cross-linked to one another, is provided with an interface circuit section which performs data transmission and reception according to a PCI-Express interface. Each interface circuit section has a communication error processing section. When an error occurs in data received from the I/O bridge, the communication error processing section of the memory bridge cancels the received data and sends a communication error signal to the other memory bridge. When receiving the communication error signal, the other memory bridge stops receiving the data. Then, the communication error processing section of the memory bridge requests the I/O bridge to resend data.

Description

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a data processing apparatus and a data processing method which process same data in parallel.
2. Description of the Related Art
One of computer systems which perform data processing is a fault-tolerant computer system which has a redundant architecture designed using existing components as disclosed in, for example, pages 5 to 7 and FIG. 1 of Unexamined Japanese Patent Application KOKAI Publication No. H9-128349. This computer system employs a lock-step system.
In the lock-step system, first, a plurality of processors with a redundant architecture synchronously process same data in parallel. Then, the outputs from the processors are compared with one another to detect an error if any and the error is corrected.
Recent computer systems are employing a fast serial link system, such as the PCI-Express, Hyper-Transport (registered trademark) or InfiniBand (registered trademark), which can ensure fast data transmission and reception, to connect processors to I/O (Input/Output) systems.
While the use of such a fast data transmission and reception system in the computer system with the redundant architecture indeed makes the data transmission and reception speed faster, the structure makes it harder to guarantee the identity of data to be processed by plural processors and makes it easier to cause communication errors.
For example, when the individual interface sections that mediate data transmission and reception between processors and I/O systems detect communication errors, each interface section requests resending of data at its own timing, which differs from the timings of the others. Accordingly, the timing and order of processes to be executed by the individual processors deviate, so that the lock-step system cannot be maintained. This makes it difficult for plural processors to synchronously process same data.
When only some of plural interface sections have detected communication errors, for example, the interface sections that have detected the communication errors do not share the error information with the other interface sections. While those interface sections which have detected the communication errors request resending of data, therefore, those interface sections which have not detected them receive data as it is. In this case, the timing of the subsequent processing of the received pieces of data, the same though they are, deviates, so that the identity of data in parallel processing cannot be guaranteed.
Further, such a computer system is likely to suffer a data delay originating from the lengths of communication lines. When the data delay shifts the timing of processing by plural processors, the plural processors have difficulty in synchronously processing same data, as in the case mentioned previously. This requires that the line lengths be made strictly equal, which places considerable restrictions on the degree of freedom in the structure of the casing of the system, the design of the board, and the structure of the board.
SUMMARY OF THE INVENTION
Accordingly, it is a primary object of the invention to provide a data processing apparatus and a data processing method which can synchronously process same data even when a communication error occurs.
It is another object of the invention to provide a data processing apparatus and a data processing method which can guarantee the identity of data when same data is processed in parallel.
It is a further object of the invention to provide a data processing apparatus and a data processing method which can allow a computer system to be designed without suffering any restriction on the lengths of communication lines.
To achieve the objects, according to the first aspect of the invention, there is provided a data processing apparatus that has a plurality of reception interface sections (16, 26) which receive same data from a same data sender and processes data, received by the plurality of reception interface sections, in parallel. In the data processing apparatus, each of the reception interface sections includes a communication error processing section which, upon occurrence of an error in the received data, stops receiving the data, sends a communication error signal to stop data reception from the data sender to other reception interface sections, and requests the data sender to resend data.
This structure can permit synchronous processing of same data even when a communication error occurs.
In the data processing apparatus, when an error occurs in part of received data, the communication error processing section of each of the reception interface sections may cancel the error-occurred data, and request the data sender to resend the canceled data.
The data sender may send same serial data, and when an error occurs in received serial data, the communication error processing section of each of the reception interface sections may cancel the error-occurred serial data and serial data received following that error-occurred serial data, and request the data sender to resend the canceled serial data.
The data sender may send the data packet by packet with a sequence number affixed to each packet, and when an error occurs in data of the received packet, the communication error processing section of each of the reception interface sections may request the data sender to resend data packet by packet based on the sequence number affixed to each packet received.
The data processing apparatus may further comprise a frequency divider which generates a sync signal by dividing a frequency of a predetermined clock signal and sends the generated sync signal to each of the reception interface sections, and each of the reception interface sections may receive data according to the sync signal supplied from the frequency divider.
According to the second aspect of the invention, there is provided a data processing apparatus that has a transmission interface section which transmits transmission data to a plurality of data receivers at a same timing. In the data processing apparatus, the transmission interface section generates packet data by dividing the transmission data to data of a data length sendable within one period of a predetermined clock signal and sends individual pieces of packet data generated to the plurality of data receivers at the same timing in synchronism with the clock signal.
According to the third aspect of the invention, there is provided a data processing method that performs parallel processing of data received by a plurality of reception interface sections which receive same data from a same data sender. The data processing method comprises:
a data reception step of receiving data from the data sender at one of the plurality of reception interface sections;
an error detection step of detecting an error in the received data; and
an error information output step of outputting information on the detected error to other reception interface sections.
The data reception step and the error information output step may be executed according to a sync signal generated by dividing a frequency of a predetermined clock signal.
The data processing method may further comprise:
an error information reception step of receiving error information, output from the other reception interface sections, at the one of the reception interface sections; and
a data resend requesting step of requesting the data sender to resend data in at least one of a case where an error is detected at the error detection step and a case where error information is received at the error information reception step.
The data processing method may further comprise:
a data cancellation step of canceling data; and
a data reception stopping step of stopping data reception, and wherein
the data cancellation step and the data reception stopping step are executed in at least one of a case where an error is detected at the error detection step and a case where error information is received at the error information reception step, and
the data resend requesting step requests resending of data canceled at the data cancellation step.
The data cancellation step may be executed according to the sync signal.
According to the fourth aspect of the invention, there is provided a computer program that performs parallel processing of data received by a plurality of reception interface sections which receive same data from a same data sender. The computer program allows a computer to execute:
a data reception step of receiving data from the data sender at one of the plurality of reception interface sections;
an error detection step of detecting an error in the received data; and
an error information output step of outputting information on the detected error to other reception interface sections.
The data reception step and the error information output step may be executed according to a sync signal generated by dividing a frequency of a predetermined clock signal.
The computer may be allowed to further execute:
an error information reception step of receiving error information, output from the other reception interface sections, at the one of the reception interface sections; and
a data resend requesting step of requesting the data sender to resend data in at least one of a case where an error is detected at the error detection step and a case where error information is received at the error information reception step.
The computer may be allowed to further execute:
a data cancellation step of canceling data; and
a data reception stopping step of stopping data reception, and
the data cancellation step and the data reception stopping step may be executed in at least one of a case where an error is detected at the error detection step and a case where error information is received at the error information reception step, and
the data resend requesting step may request resending of data canceled at the data cancellation step.
The data cancellation step may be executed according to the sync signal.
The invention can permit synchronous processing of same data even when a communication error occurs.
The invention can guarantee the identity of data when same data is processed in parallel.
The invention can allow a computer system to be designed without suffering any restriction on the lengths of communication lines.
BRIEF DESCRIPTION OF THE DRAWINGS
These objects and other objects and advantages of the present invention will become more apparent upon reading of the following detailed description and the accompanying drawings in which:
FIG. 1 is a block diagram illustrating the architecture of a computer system according to an embodiment of the invention;
FIGS. 2A to 2D are explanatory diagrams showing the structure of packet data to be transmitted and received;
FIG. 3 is a block diagram showing the detailed structure of a memory bridge shown in FIG. 1;
FIGS. 4A to 4G are timing charts illustrating the operation of the memory bridge shown in FIG. 1;
FIGS. 5A to 5E are timing charts illustrating the operation of the memory bridge shown in FIG. 1;
FIGS. 6A to 6E are timing charts illustrating the operation of the memory bridge shown in FIG. 1;
FIGS. 7A to 7E are timing charts illustrating the operation of the memory bridge shown in FIG. 1;
FIGS. 8A to 8E are timing charts illustrating the operation of the memory bridge shown in FIG. 1;
FIGS. 9A to 9E are timing charts illustrating the operation of the memory bridge shown in FIG. 1;
FIG. 10 is a flowchart illustrating the procedures of the operation of the computer system according to the embodiment of the invention; and
FIG. 11 is a block diagram showing an application example of the computer system according to the embodiment of the invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
A data processing apparatus according to one preferred embodiment of the invention is described below with reference to the accompanying drawings.
The data processing apparatus according to the embodiment is explained as a computer system which has a redundant architecture.
FIG. 1 illustrates the architecture of a computer system according to the embodiment.
The computer system according to the embodiment is a fault-tolerant computer system which has a plurality of processors with a redundant architecture and sub systems 1 and 2. This system operates according to the “lock-step system” in which plural processors synchronously process same data in parallel.
The sub system 1 has an arithmetic operation system 11 and an I/O (Input/Output) system 12. The sub system 2 has an arithmetic operation system 21 and an I/O system 22.
The arithmetic operation systems 11 and 21 are supplied with synchronous clock signals CLK of 166 MHz. Accordingly, the sub systems 1 and 2 synchronously and simultaneously execute the same process according to the lock-step system.
A frequency divider 31 is connected between the sub systems 1 and 2. The frequency divider 31 frequency-divides the supplied clock signal CLK of an FSB (Front Side Bus), thereby generating a sync signal S1.
The frequency divider 31 supplies the generated sync signal S1 to a memory bridge 16 of the arithmetic operation system 11, a memory bridge 26 of the arithmetic operation system 21, an I/O bridge 18 of the I/O system 12 and an I/O bridge 28 of the I/O system 22.
The embodiment assumes that the PCI-Express interface is used for data transmission and reception. The PCI-Express interface employs a serial link to prevent the data skew between signal lines that occurs in a parallel bus. The arithmetic operation systems 11 and 21 and the I/O systems 12 and 22 are connected to one another according to the PCI-Express interface.
The frequency divider 31 divides the frequency of 166 MHz of the clock signal CLK to a one-sixteenth frequency of 10.4 MHz in such a way that one period of the sync signal S1 becomes equivalent to 24 symbol times of the PCI-Express interface of 2.5 Gbps/lane.
According to the PCI-Express interface, devices, such as the arithmetic operation systems 11 and 21 and the I/O systems 12 and 22, are connected one to one. When data is transferred using a differential signal, the link uses four signal lines in total, two in each direction. The set of four signal lines is called a lane.
One symbol time is the time needed to send valid data of 1 byte according to the PCI-Express interface after data in one lane is encoded with 8B/10B.
As the frequency of the sync signal S1 becomes 10.4 MHz, the I/O bridges 18 and 28 and the memory bridges 16 and 26 can send valid data of 24 bytes per lane to one another in one period of the sync signal S1.
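The relation between the sync signal S1 and the symbol time can be checked numerically. The following Python sketch is illustrative only: the function name and its parameters are assumptions that do not appear in the embodiment, while the clock frequency, the division ratio of 16, the 2.5 Gbps line rate and the 8B/10B encoding are the values stated above. With the nominal 166 MHz the result is slightly more than 24 symbol times; if the clock is assumed to be 500/3 MHz (about 166.67 MHz, which is not stated in the text), one period is exactly 24 symbol times.

```python
import math
from fractions import Fraction


def symbols_per_sync_period(clk_hz, divide_ratio=16,
                            line_rate_bps=2_500_000_000, bits_per_symbol=10):
    """Return how many symbol times fit in one period of the divided clock."""
    # One sync period lasts divide_ratio cycles of the input clock.
    sync_period_s = Fraction(divide_ratio) / Fraction(clk_hz)
    # With 8B/10B, one valid byte occupies 10 bit times on the lane.
    symbol_time_s = Fraction(bits_per_symbol, line_rate_bps)
    return sync_period_s / symbol_time_s


# Nominal clock as stated in the text.
nominal = symbols_per_sync_period(166_000_000)
whole_symbols = math.floor(nominal)  # 24 whole symbols, i.e. 24 valid bytes per lane
print(float(nominal), whole_symbols)

# Assumed clock of 500/3 MHz (hypothetical, not stated in the text).
exact = symbols_per_sync_period(Fraction(500_000_000, 3))
print(exact)
```

Exact fractions are used instead of floating point so that the comparison with 24 is not affected by rounding.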
The arithmetic operation system 11 has processors 13 and 14, a main memory unit 15 and the memory bridge 16. The arithmetic operation system 21 has processors 23 and 24, a main memory unit 25 and the memory bridge 26.
The processors 13, 14, 23 and 24 execute arithmetic operations. The main memory units 15 and 25 store data and the like. The memory bridge 16 is connected to the processors 13 and 14 by the front side bus (FSB), and the memory bridge 26 is connected to the processors 23 and 24 by the front side bus (FSB). The memory bridges 16 and 26 operate in synchronism with the clock signal CLK.
The memory bridges 16 and 26 perform data transmission and reception with the I/O bridges 18 and 28. The memory bridges 16 and 26 send and receive a communication error signal S2 to and from each other. The memory bridges 16 and 26 share information on a communication error by the communication error signal S2 and execute an error process in concert with each other. The communication error signal S2 is sent and received as an open drain signal. The detailed structures of the memory bridges 16 and 26 will be discussed later.
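The sharing of the communication error signal S2 over an open drain line can be modeled as follows. This is a minimal sketch under stated assumptions: the text says only that S2 is an open drain signal, and the wired behavior shown here (the line is low whenever any connected device pulls it low, and high otherwise) is the usual property of such lines rather than a detail given in the embodiment. The function name is hypothetical.

```python
def open_drain_level(drivers_pulling_low):
    """Return the level of a shared open drain line.

    drivers_pulling_low: one boolean per connected device, True when that
    device asserts the signal by pulling the line low.
    """
    # A pull-up keeps the line high unless at least one device pulls it low.
    return "L" if any(drivers_pulling_low) else "H"


# Neither memory bridge detects an error: S2 stays deasserted (high).
print(open_drain_level([False, False]))
# Only one memory bridge detects an error: both see S2 asserted (low).
print(open_drain_level([True, False]))
```

Because either memory bridge alone can pull the line low, both bridges observe the same level of S2, which is what allows them to execute the error process in concert.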
The I/O system 12 has an I/O device 17, the I/O bridge 18 and a configuration register 19. The I/O system 22 has an I/O device 27, the I/O bridge 28 and a configuration register 29.
The I/O devices 17 and 27 perform data transmission and reception with the I/O bridges 18 and 28, respectively.
The I/O bridges 18 and 28 perform serial transmission with the I/O devices 17 and 27 or the memory bridges 16 and 26, respectively.
The I/O bridges 18 and 28 and the memory bridges 16 and 26 are connected together by ×8 links L1 of the PCI-Express interface.
The memory bridges 16 and 26 of the respective arithmetic operation systems 11 and 21 are cross-linked to the I/O bridges 18 and 28 of the respective I/O systems 12 and 22 by the links L1. That is, the memory bridge 16 of the arithmetic operation system 11 is connected to the I/O systems 12 and 22, and the memory bridge 26 of the arithmetic operation system 21 is connected to the I/O systems 12 and 22.
This cross-link connection allows each of the arithmetic operation systems 11 and 21 to communicate with both of the I/O systems 12 and 22. Likewise, each of the I/O systems 12 and 22 can communicate with both of the arithmetic operation systems 11 and 21.
To ensure layer-by-layer upgrading, the functions of the PCI-Express interface are hierarchized. The protocol is defined for each layer.
According to the PCI-Express interface, a header is added to data in a transaction layer to thereby generate a transaction layer packet as shown in FIGS. 2A and 2B.
As shown in FIG. 2C, a sequence number and a CRC (Cyclic Redundancy Check) are added as status information to the transaction layer packet in a data link layer, thereby generating a data link layer packet (DLLP).
As shown in FIG. 2D, frame data is added to the data link layer packet in a physical layer. The resultant packet is transmitted and received.
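The layering of FIGS. 2A to 2D can be sketched in Python as follows. The sketch is illustrative and rests on several assumptions that are not taken from the embodiment: the sequence number is stored in two bytes, the CRC is computed with the general-purpose `zlib.crc32` function as a stand-in for the CRC of the PCI-Express interface, and the framing is represented by single placeholder bytes. The names `build_packet` and `parse_packet` are hypothetical.

```python
import struct
import zlib

# Placeholder framing bytes; the real interface uses special symbols for
# framing, so these values are purely illustrative.
FRAME_START = b"\xfb"
FRAME_END = b"\xfd"


def build_packet(seq, header, data):
    """Wrap data layer by layer: transaction, data link, physical."""
    tlp = header + data                                   # transaction layer packet
    body = struct.pack(">H", seq) + tlp                   # sequence number affixed
    dl_packet = body + struct.pack(">I", zlib.crc32(body))  # CRC appended
    return FRAME_START + dl_packet + FRAME_END            # frame data added


def parse_packet(frame):
    """Return (seq, tlp, crc_ok) for a frame produced by build_packet."""
    dl_packet = frame[1:-1]                               # strip the framing
    body, crc_bytes = dl_packet[:-4], dl_packet[-4:]
    (seq,) = struct.unpack(">H", body[:2])
    (crc,) = struct.unpack(">I", crc_bytes)
    return seq, body[2:], zlib.crc32(body) == crc


frame = build_packet(7, b"HDR0", b"payload")
print(parse_packet(frame))

# Flipping one bit in transit makes the CRC check fail at the receiver.
damaged = bytearray(frame)
damaged[5] ^= 0x01
print(parse_packet(bytes(damaged))[2])
```

A failed CRC check of this kind is the event that the communication error processing section treats as a communication error.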
Each of the I/O bridges 18 and 28 has an interface circuit section (not shown) for transmission and reception of data according to the PCI-Express interface.
Each of the configuration registers 19 and 29 holds data for limiting the packet length and the number of pieces of data of an upstream packet to be sent from the I/O bridges 18 and 28 to the memory bridges 16 and 26 in one period of the sync signal S1 supplied from the frequency divider 31.
The packet length and the number of pieces of data are limited to prevent a packet to be sent from being influenced by the length of the transmission path and the drifting of the clock. Specifically, the maximum packet length of individual packets is 192 bytes.
According to the limitation, the I/O bridge 18 or 28 simultaneously sends the same packet to the respective memory bridges 16 and 26 at the rising timing of the sync signal S1.
When sending a plurality of small packets, the I/O bridges 18 and 28 perform transmission control so that the number of pieces of data does not exceed the maximum data number transmittable in one period of the sync signal S1. According to the control, the I/O bridges 18 and 28 work in such a way that transmission time of one packet does not exceed one period of the sync signal S1 of 10.4 MHz.
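The transmission control described above can be sketched as a grouping of packets into sync periods. The budget of 192 bytes per period follows from the ×8 link and the 24 valid bytes per lane stated in the text (8 × 24 = 192). The grouping policy itself, which fills each period in order and starts a new period when the next packet would not fit, is an assumption made for illustration; the embodiment does not specify how the I/O bridges group packets or whether packet overhead counts against the limit. The function name is hypothetical.

```python
LANES = 8                       # x8 link L1
BYTES_PER_LANE_PER_PERIOD = 24  # valid bytes per lane in one period of S1
BUDGET = LANES * BYTES_PER_LANE_PER_PERIOD  # 192 bytes


def schedule(packet_lengths, budget=BUDGET):
    """Group packet lengths, in order, so no sync period exceeds the budget."""
    periods, current, used = [], [], 0
    for length in packet_lengths:
        if length > budget:
            raise ValueError("packet longer than one sync period allows")
        if used + length > budget:
            # The next packet would not fit: close this period, open a new one.
            periods.append(current)
            current, used = [], 0
        current.append(length)
        used += length
    if current:
        periods.append(current)
    return periods


print(schedule([64, 64, 64, 32, 192]))
```

In this example the three 64-byte packets exactly fill the first period, the 32-byte packet is sent alone in the second, and the 192-byte packet occupies the third.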
The values in the configuration registers 19 and 29 can be changed using the BIOS (Basic Input/Output System). Each of the sub systems 1 and 2 has a non-volatile memory (not shown) to store the BIOS.
The computer system with the above-described architecture performs failure diagnosis by comparing communication contents exchanged between the sub systems. Upon deciding that a specific sub system has failed, the computer system masks the failed sub system and continues the process in progress using the remaining sub system.
The structures of the memory bridges 16 and 26 are discussed next. As the structure of the memory bridge 26 is the same as that of the memory bridge 16, only the structure of the memory bridge 16 is described below.
The memory bridge 16 has an interface circuit section 40, a synchronization buffer 50 and an internal circuit section 60 as shown in FIG. 3.
The interface circuit section 40 is provided in association with the PCI-Express interface. The interface circuit section 40 is separated into a data link/physical layer 41 and a transaction layer 42.
The data link/physical layer 41 is separated into physical layers 43-1 to 43n, a data link layer (RX) 44 and a data link layer (TX) 45. The transaction layer 42 is separated into a communication error processing section 46 and a transaction layer 47.
The data link/physical layer 41, the transaction layer 42 and the internal circuit section 60 operate in synchronism with different clock signals.
The physical layers 43-1 to 43n send and receive packets shown in FIG. 2D in one period of the sync signal S1. The interface circuit section 40 has an elastic buffer (EB) to hold a packet to be transmitted or received. The interface circuit section 40 outputs error information when a communication error is detected at the physical layers 43-1 to 43n.
The data link layer (RX) 44 acquires a data link layer packet from the packet shown in FIG. 2D.
The data link layer (TX) 45 receives an ACK/NACK/flow control signal output from the communication error processing section 46.
The communication error processing section 46 performs a process associated with a communication error.
According to the conventional PCI-Express, the data link layer (RX) 44 directly sends some error signals in status information to the transaction layer 47, and the data link layer (RX) 44 directly sends the ACK/NACK/flow control signal to the data link layer (TX) 45. According to the embodiment, the memory bridge 16 has the communication error processing section 46 which acquires the status information at the data link layer (RX) 44. The communication error processing section 46 sends the acquired status information to the transaction layer 47 and the data link layer (TX) 45.
The communication error processing section 46 checks the CRC affixed to the data link layer packet to detect a communication error if any. The communication error processing section 46 then outputs error information.
When no communication error is detected at the physical layers 43-1 to 43n and the data link layer (RX) 44, the communication error processing section 46 sends data and status information as they are to the transaction layer 47 and the data link layer (TX) 45. When the received data has no communication error, the interface circuit section 40 regularly returns an ACK signal to the I/O bridges 18 and 28 which have sent the data, according to the status information.
When a communication error is detected at the physical layers 43-1 to 43n or the data link layer (RX) 44, on the other hand, the communication error processing section 46 cancels all the packets received in one period of the sync signal S1 as lost packets. Then, the communication error processing section 46 stops outputting the received data to the transaction layer 47.
When canceling the packets, the communication error processing section 46 instructs the data link layer (RX) 44 to set the sequence number of a next packet to be received to the sequence number prior to the reception of the communication error packet.
When detecting a communication error, the communication error processing section 46 asserts or enables a communication error signal S2 for one period of the sync signal S1. The communication error processing section 46 sends the asserted communication error signal S2 to the memory bridge 26 via the signal line.
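The error handling described above can be illustrated with a small simulation of two reception interface sections that share the communication error signal S2. This is a simplified sketch and not the circuit of the embodiment: the class and function names are hypothetical, a packet is modeled as a tuple of sequence number, data and CRC result, and the sender is assumed to resend immediately in the following period, whereas the actual timing is governed by the sync signal S1 and the timing charts.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """One reception interface section (hypothetical model)."""
    expected_seq: int = 0
    delivered: list = field(default_factory=list)

    def detects_error(self, packets):
        # A packet is (seq, data, crc_ok); any failed CRC is an error.
        return any(not crc_ok for _seq, _data, crc_ok in packets)

    def commit(self, packets):
        # Forward the period's packets and advance the expected sequence number.
        for seq, data, _crc_ok in packets:
            self.delivered.append(data)
            self.expected_seq = seq + 1


def run_period(receivers, views):
    """Process one sync period; return the sequence number to resend from, or None."""
    s2_asserted = any(r.detects_error(v) for r, v in zip(receivers, views))
    if s2_asserted:
        # Every receiver cancels the whole period and keeps its old expected_seq.
        return receivers[0].expected_seq
    for r, v in zip(receivers, views):
        r.commit(v)
    return None


def simulate(data, per_period, corrupt):
    """corrupt holds (period, receiver_index) pairs whose reception is corrupted."""
    receivers = [Receiver(), Receiver()]
    next_seq, period = 0, 0
    while next_seq < len(data):
        batch = range(next_seq, min(next_seq + per_period, len(data)))
        views = [[(s, data[s], (period, i) not in corrupt) for s in batch]
                 for i in range(len(receivers))]
        resend_from = run_period(receivers, views)
        next_seq = next_seq + len(batch) if resend_from is None else resend_from
        period += 1
    return receivers, period


# The second receiver sees a corrupted copy in period 1 only.
receivers, periods = simulate(["p0", "p1", "p2", "p3"], 2, {(1, 1)})
print(receivers[0].delivered, receivers[1].delivered, periods)
```

Although only one receiver detects the error, both cancel the same period and both receive the resent packets in the same later period, so the two receivers deliver identical data in identical order.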
The transaction layer 47 accepts a read request and a write request from a higher-rank software layer and requests the data link layer (RX) 44 and the data link layer (TX) 45 to transfer a packet.
The synchronization buffer 50 serves to exchange data between the transaction layer 47 and the internal circuit section 60. The synchronization buffer 50 holds data output from the transaction layer 47.
The internal circuit section 60 acquires data held in the synchronization buffer 50 at the timing synchronous with the sync signal S1 and sends the acquired data to the processors 13 and 14 and the main memory unit 15.
In the case where the I/O bridges 18 and 28 send serial data to the memory bridges 16 and 26, the interface circuit sections of the I/O bridges 18 and 28 become transmission interface sections and the interface circuit sections of the memory bridges 16 and 26 become reception interface sections.
As mentioned above, the memory bridges 16 and 26 can send serial data to the I/O bridges 18 and 28. In this case, the interface circuit sections of the memory bridges 16 and 26 become transmission interface sections and the interface circuit sections of the I/O bridges 18 and 28 become reception interface sections.
The operation of the computer system according to the embodiment is described next.
The following description will be given of the case where the I/O bridge 18 sends serial data to the memory bridges 16 and 26.
When supplied with data from the I/O device 17, the I/O bridge 18 adds a header to the serial data at the transaction layer as shown in FIGS. 2A and 2B. Then, the I/O bridge 18 generates a transaction layer packet.
As shown in FIG. 2C, the I/O bridge 18 adds the sequence number and CRC as status information to the generated transaction layer packet at the data link layer. Then, the I/O bridge 18 generates a data link layer packet.
Next, the I/O bridge 18 adds frame data to the generated data link layer packet at the physical layer, as shown in FIG. 2D. Then, the I/O bridge 18 sends the packet shown in FIG. 2D to the memory bridges 16 and 26 via the links L1.
The interface circuit section 40 of the memory bridge 16 receives the data at the physical layers 43-1 to 43n.
The interface circuit section 40 temporarily stores all the packets, received at the physical layers 43-1 to 43n in one period of the sync signal S1, in the elastic buffer as shown in FIGS. 4A and 4B. Then, the interface circuit section 40 sends the stored packets to the data link layer (RX) 44.
The interface circuit section 40 acquires a data link layer packet from the packet shown in FIG. 2D at the data link layer (RX) 44. The interface circuit section 40 performs error detection based on the CRC included in the data link layer packet shown in FIG. 2C.
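The encapsulation performed by the sender and the CRC check performed by the receiver can be illustrated with a minimal Python sketch. The field widths (a 2-byte sequence number and a 4-byte CRC), the use of zlib.crc32, the frame byte values and all function names are assumptions made for illustration; the embodiment does not specify them.

```python
import struct
import zlib

# Hypothetical frame bytes; the embodiment does not give their values.
FRAME_START = b"\xfb"
FRAME_END = b"\xfd"


def build_transaction_layer_packet(header: bytes, data: bytes) -> bytes:
    """Transaction layer: add a header to the data (FIGS. 2A and 2B)."""
    return header + data


def build_data_link_layer_packet(seq: int, tlp: bytes) -> bytes:
    """Data link layer: add a sequence number and a CRC (FIG. 2C)."""
    body = struct.pack(">H", seq) + tlp  # assumed 2-byte sequence number
    return body + struct.pack(">I", zlib.crc32(body))  # assumed 4-byte CRC


def build_physical_packet(dl_packet: bytes) -> bytes:
    """Physical layer: add frame data around the packet (FIG. 2D)."""
    return FRAME_START + dl_packet + FRAME_END


def receive(physical_packet: bytes):
    """Strip the frame data and check the CRC.

    Returns (sequence number, transaction layer packet, crc_ok).
    """
    dl_packet = physical_packet[len(FRAME_START):-len(FRAME_END)]
    body, (crc,) = dl_packet[:-4], struct.unpack(">I", dl_packet[-4:])
    (seq,) = struct.unpack(">H", body[:2])
    return seq, body[2:], zlib.crc32(body) == crc
```

In this sketch a packet whose recomputed CRC does not match the received CRC is reported with crc_ok set to False, which corresponds to the detection of a communication error.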
As shown in FIG. 4C, the communication error processing section 46 acquires packets received in each period of the sync signal S1 in synchronism with the next rising of the sync signal S1.
When no error is detected in the received packets, as shown in FIG. 4D, the communication error processing section 46 sets the communication error signal S2 to a high (H) level to deassert or disable the communication error signal S2. Therefore, the received packets become valid.
Then, the communication error processing section 46 sends each packet to the transaction layer 47 in synchronism with the next rising of the sync signal S1 as shown in FIG. 4E.
The interface circuit section 40 acquires a transaction layer packet from the data link layer packet at the transaction layer 47. The interface circuit section 40 then acquires data from the transaction layer packet and sends the data to the synchronization buffer 50 as shown in FIG. 4F.
The internal circuit section 60 acquires data from the synchronization buffer 50 in synchronism with the rising of the sync signal S1. Then, the internal circuit section 60 sends the acquired data to the processors 13 and 14 and the main memory unit 15.
In the case where the I/O bridge 18 sends data to the memory bridges 16 and 26, the memory bridges 16 and 26 receive data nearly at the same time as shown in FIGS. 5B and 5C, if the length of the link L1 between the I/O bridge 18 and the memory bridge 16 and the length of the link L1 between the I/O bridge 18 and the memory bridge 26 hardly differ from each other.
When the link L1 between the I/O bridge 18 and the memory bridge 26 is longer than the link L1 between the I/O bridge 18 and the memory bridge 16, however, the timings at which the memory bridges 16 and 26 receive data differ from each other, as shown in FIGS. 5D and 5E.
Even if there is a timing difference, as long as the difference lies within the same period of the sync signal S1, the arithmetic operation systems 11 and 21 execute the same process in synchronism with the clock signal CLK.
If the memory bridge 26 receives data over the first period and the second period of the sync signal S1, the I/O bridge 18 changes data stored in the configuration register 19 using the BIOS in such a way as to make the length of data in one packet shorter.
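The condition that reception must not extend over two periods of the sync signal S1 can be expressed as simple arithmetic. The following Python sketch assumes that arrival times and packet lengths are counted in clock cycles and that the reception start offset at each memory bridge is known; the function names are hypothetical, and the sketch does not model the configuration register 19 or the BIOS.

```python
def straddles_period(start: int, length: int, period: int) -> bool:
    """True if data occupying clock cycles [start, start + length) crosses a
    boundary between two periods of the sync signal."""
    return start // period != (start + length - 1) // period


def fit_packet_length(start_offsets, length: int, period: int) -> int:
    """Shorten the packet length until reception at every receiver stays
    within one period of the sync signal."""
    while length > 1 and any(
        straddles_period(s, length, period) for s in start_offsets
    ):
        length -= 1
    return length
```

For example, with a period of 8 clock cycles and reception starting at cycle 1 at one memory bridge and at cycle 3 at the other, a length of 6 cycles extends into the next period at the second memory bridge, and the sketch shortens the length to 5 cycles.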
Next, the interface circuit section 40 of the memory bridge 16 detects a communication error in the packet received at the second clock cycle of the sync signal S1 as shown in FIGS. 6A and 6B.
In this case, as shown in FIGS. 6C and 6E, the communication error processing section 46 cancels all the packets at the third clock cycle even if the packets include a data link layer packet (DLLP).
The communication error processing section 46 cancels packets at and following the third clock cycle. The communication error processing section 46 cancels reception of all packets until the packets canceled at the third clock cycle are sent again.
When detecting a communication error, the communication error processing section 46 sets the sequence number of the packet managed at the data link layer (RX) 44 to the sequence number prior to the occurrence of the communication error.
When detecting a communication error, as shown in FIG. 6D, the communication error processing section 46 sets the communication error signal S2 to a low (L) level to assert or enable the communication error signal S2. At the fourth clock cycle of the sync signal S1, no packets are received so that the communication error signal S2 is deasserted.
The communication error processing section 46 requests the I/O bridge 18 or the data sender to resend data. When receiving a resend request from the memory bridge 16, the I/O bridge 18 resends a packet whose resending is requested. The I/O bridge 18 also resends a packet whose transmission has not been acknowledged when a predetermined period has passed without an ACK signal being returned from the memory bridge 16.
Subsequently, the memory bridge 16 receives a packet with a sequence number 2 in response to the resend request at the sixth clock cycle of the sync signal S1 as shown in FIG. 7B. When there is no communication error, the interface circuit section 40 receives the packets canceled at and following the third clock cycle as shown in FIGS. 7C, 7D and 7E.
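The resending behavior of the data sender can be sketched as a buffer of unacknowledged packets. In the following Python sketch, the class name ReplaySender, the cumulative treatment of the ACK signal, the time unit and the absence of sequence number wraparound are all assumptions made for illustration and are not stated in the embodiment.

```python
from collections import OrderedDict


class ReplaySender:
    """Keeps sent packets until they are acknowledged (hypothetical model)."""

    def __init__(self, timeout: int):
        self.timeout = timeout
        self.unacked = OrderedDict()  # sequence number -> (packet, time sent)

    def send(self, seq: int, packet: bytes, now: int):
        self.unacked[seq] = (packet, now)
        return [(seq, packet)]

    def on_ack(self, seq: int):
        # Assumption: an ACK for seq also acknowledges all earlier packets.
        for s in [s for s in self.unacked if s <= seq]:
            del self.unacked[s]

    def on_resend_request(self, seq: int, now: int):
        # Resend the requested packet and every later unacknowledged packet.
        return self._resend([s for s in self.unacked if s >= seq], now)

    def on_tick(self, now: int):
        # Resend packets for which no ACK arrived within the timeout.
        expired = [s for s, (_, sent) in self.unacked.items()
                   if now - sent >= self.timeout]
        return self._resend(expired, now)

    def _resend(self, seqs, now: int):
        out = []
        for s in seqs:
            packet, _ = self.unacked[s]
            self.unacked[s] = (packet, now)  # restart the timeout
            out.append((s, packet))
        return out
```

Both paths, the explicit resend request and the expiry of the predetermined period, end in the same resend routine, so the receiver sees the same sequence of packets in either case.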
Next, even if the memory bridge 16 detects no communication error in the received data as shown in FIGS. 8A, 8B and 8C, when the memory bridge 26 detects a communication error, the memory bridge 26 sends the low-level communication error signal S2 to the memory bridge 16 as shown in FIG. 8D.
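The sharing of the communication error signal S2 between the memory bridges can be modeled as a single line that is at the low level whenever at least one memory bridge pulls it low, which is the behavior of an open drain signal as recited in the claims. The following Python sketch is a simplified logical model under that assumption; the names line_level and is_asserted are hypothetical.

```python
HIGH, LOW = "H", "L"


def line_level(pulling_low) -> str:
    """Level of the shared error line: low if any driver pulls it low,
    otherwise high (the line is assumed to be pulled up when idle)."""
    return LOW if any(pulling_low) else HIGH


def is_asserted(level: str) -> bool:
    """The communication error signal S2 is asserted at the low level."""
    return level == LOW
```

With this model, the memory bridge 16 observes an asserted signal when only the memory bridge 26 detects an error, for example is_asserted(line_level([False, True])) evaluates to True.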
When the memory bridge 26 asserts the communication error signal S2 at the third clock cycle of the sync signal S1, the communication error processing section 46 cancels the packet with the sequence number 2, held in the communication error processing section 46 at the third clock cycle of the sync signal S1, as shown in FIG. 8E.
At and following the fourth clock cycle of the sync signal S1, the communication error processing section 46 stops giving a packet to the transaction layer 47.
The communication error processing section 46 then sets the sequence number of a packet to be received next, which is managed by the data link layer (RX) 44, to the value prior to the packet cancellation.
When the memory bridge 16 receives a packet constructed only by a data link layer packet following a packet having a communication error as shown in FIGS. 9A and 9B, the communication error processing section 46 cancels the packet having the communication error first as shown in FIGS. 9C and 9E.
When canceling the packet having the communication error, the communication error processing section 46 asserts the communication error signal S2 to request resending of a sequence of packets with and following the sequence number 2, as shown in FIG. 9D. While canceling the packet having the communication error, however, the communication error processing section 46 does not cancel the data link layer packet received at the third clock cycle of the sync signal S1, as shown in FIGS. 9C and 9E. This is because the data link layer packet has no sequence number so that a sequence number error does not occur.
The communication error processing section 46 can specify the order from the sequence numbers of resent packets without canceling the data link layer packet. Accordingly, the memory bridge 16 can receive resent packets without problems.
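The two cancellation cases described above (FIGS. 6 and 9) can be sketched as a rule applied period by period. In the following Python sketch, the Packet class, the use of None to represent a data link layer packet without a sequence number, and the function name cancel_from_error are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    name: str
    seq: Optional[int]  # None models a packet without a sequence number


def cancel_from_error(periods: List[List[Packet]],
                      error_period: int) -> List[List[Packet]]:
    """Return the packets kept in each period of the sync signal.

    Packets before the error period are kept. The error period is canceled.
    A later period is kept only if it consists solely of packets without a
    sequence number; otherwise all of its packets are canceled.
    """
    kept = []
    for index, packets in enumerate(periods):
        only_unsequenced = bool(packets) and all(p.seq is None for p in packets)
        if index < error_period or (index > error_period and only_unsequenced):
            kept.append(list(packets))
        else:
            kept.append([])
    return kept
```

A period that mixes sequenced packets with a data link layer packet is canceled as a whole in this sketch, which corresponds to the case of FIGS. 6C and 6E, while a period containing only a data link layer packet is kept, which corresponds to the case of FIGS. 9C and 9E.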
According to the embodiment, as explained above, when the interface circuit section 40 of the memory bridge 16 detects a communication error, the communication error processing section 46 cancels the received packet. Then, the communication error processing section 46 sends the asserted communication error signal S2 to the memory bridge 26 and requests the packet sender to resend the canceled packet.
Even when a communication error occurs, therefore, the communication error processing sections of the memory bridges 16 and 26 cooperate to request the packet sender to resend a packet. Accordingly, a deviation in synchronism of received data can be avoided. The arithmetic operation systems 11 and 21 can therefore process the same data synchronously.
Further, the packet lengths and the number of pieces of data of a packet to be transmitted are limited by the I/O bridge 18 to avoid influence of a difference in lengths of the links L1 if any.
The embodiment therefore makes it easier to design the circuit board to construct a fault-tolerant computer system and design the casing of the computer system.
The procedures of the operation of the computer system according to the embodiment of the invention are described referring to FIG. 10.
FIG. 10 is a flowchart illustrating the procedures of the processing of data received in one period of the sync signal S1 in the interface circuit section 40.
First, at step 1 (ST1), the interface circuit section 40 receives packets, sent from the I/O bridge 18, at the physical layers 43-1 to 43n according to the sync signal S1 shown in FIG. 4A.
At the next step 2 (ST2), the interface circuit section 40 temporarily stores all packets, received in one period of the sync signal S1, in the elastic buffer, as shown in FIG. 4B, and then sends the packets to the data link layer (RX) 44.
At the next step 3 (ST3), the interface circuit section 40 sends the received data to the communication error processing section 46. As shown in FIG. 4C, the communication error processing section 46 acquires the data in synchronism with the rising of the sync signal S1.
At the next step 4 (ST4), the communication error processing section 46 checks the CRC included in the data link layer packet to determine if a communication error is detected.
When a communication error is detected, the flow goes to step 5 (ST5) where the communication error processing section 46 asserts the communication error signal S2 or sets the communication error signal S2 to a low (L) level in synchronism with the rising of the sync signal S1 to enable the error, as shown in FIG. 6D.
Then, at step 6 (ST6), the communication error processing section 46 sends the asserted (low-level) communication error signal S2 to the interface circuit section of the memory bridge 26 via the signal line. Accordingly, a plurality of interface circuit sections can share error information on received packets. As the communication error signal S2 is synchronous with the rising of the sync signal S1, the plural interface circuit sections can share synchronous error information.
At step 7 (ST7), the communication error processing section 46 determines whether the asserted (low-level) communication error signal S2 has been received from the interface circuit section of the memory bridge 26 or not. When the asserted communication error signal S2 has been received, the communication error signal S2 is asserted in synchronism with the rising of the sync signal S1 even if the interface circuit section 40 of the memory bridge 16 does not detect a communication error, as shown in FIG. 8D.
When the communication error signal S2 is asserted, the flow goes to step 8 (ST8) where the communication error processing section 46 cancels all the packets received in one period of the sync signal S1 as shown in FIGS. 6E and 8E. As the communication error signal S2 is asserted, no packet is sent to the transaction layer 47 in synchronism with the next rising of the sync signal S1. Therefore, the sending of packets to the transaction layer 47 is stopped. Further, packets in the next period are canceled.
At the next step 9 (ST9), the communication error processing section 46 sets the sequence number of a packet to be received next, which is managed by the data link layer (RX) 44, to the value prior to the occurrence of the error. This can stop reception of other packets until the packet whose communication error has been detected is received.
At the next step 10 (ST10), the communication error processing section 46 requests the I/O bridge 18 or the data sender to resend data. In response to the request, the I/O bridge 18 sends the requested packet to the memory bridges 16 and 26. Even when a single interface circuit section detects a communication error, therefore, a plurality of interface circuit sections can receive the same data synchronously.
At the next step 11 (ST11), the communication error processing section 46 deasserts the communication error signal S2 or sets the communication error signal S2 to a high (H) level in synchronism with the rising of the sync signal S1 to disable the error, as shown in FIGS. 6D and 8D. As other packets than the canceled packet are not received until the canceled packet is resent, the communication error signal S2 can be deasserted.
At the next step 12 (ST12), the interface circuit section 40 determines whether the packet corresponding to the resend request has been received or not.
When the packet is not received, step 12 is repeated. When it is determined the packet has been received, the flow returns to step 2 and the interface circuit section 40 receives the canceled packet and subsequent packets, as shown in FIGS. 7A to 7E.
When no communication error is detected at step 4 and the asserted communication error signal S2 is not received at step 7, the communication error signal S2 is not asserted as shown in FIG. 4D. In this case, the flow goes to step 13 (ST13) where the communication error processing section 46 sends the packet to the transaction layer 47 in synchronism with the next rising of the sync signal S1 as shown in FIG. 4E. Further, the communication error processing section 46 sends status information to the transaction layer 47 and the data link layer (TX) 45. According to the status information, the interface circuit section 40 regularly returns the ACK signal to the I/O bridge 18 which has sent the data.
The flow then goes to step 14 (ST14) where the interface circuit section 40 acquires data from the transaction layer packet and sends the data to the synchronization buffer 50 as shown in FIG. 4F.
The individual steps can be modified adequately according to the conditions. For example, step 9 at which the sequence number is set can be executed before step 8 at which a packet is canceled, or these steps can be executed in parallel. Further, step 7 at which reception of the communication error signal S2 is determined can be executed before step 4 at which detection of a communication error is determined, or these steps can be executed in parallel.
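The flow of FIG. 10 can be sketched for two receivers operating in lock-step. The following Python sketch is a simplified model under stated assumptions: a packet is represented as a tuple of sequence number, data and CRC result, the shared communication error signal S2 is modeled as a logical OR of the local error results, and the names Receiver and process_period are hypothetical. It does not model the elastic buffer, the ACK signal or signal timing within a period.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Packet = Tuple[int, str, bool]  # (sequence number, data, crc_ok)


@dataclass
class Receiver:
    expected_seq: int = 1
    delivered: List[str] = field(default_factory=list)


def process_period(receivers: List[Receiver],
                   received: List[List[Packet]]) -> Optional[int]:
    """Process one period of the sync signal for all receivers.

    Returns the sequence number to be resent, or None if no error occurred.
    """
    # ST4 to ST7: each receiver checks the CRC result; the error signal is
    # shared, so an error at any receiver is seen by all of them.
    error = any(not ok for packets in received for _, _, ok in packets)
    if error:
        # ST8 to ST10: cancel the packets of this period, leave the expected
        # sequence number unchanged, and request a resend from that number.
        return receivers[0].expected_seq
    for receiver, packets in zip(receivers, received):
        for seq, data, _ in packets:
            # Packets other than the expected one are not accepted (ST9).
            if seq == receiver.expected_seq:
                receiver.delivered.append(data)  # ST13 and ST14
                receiver.expected_seq += 1
    return None
```

In this sketch, when only one receiver sees a CRC error for the packet with the sequence number 2, neither receiver delivers that packet, later packets are ignored until the packet with the sequence number 2 arrives again, and both receivers end up with the same delivered data in the same order.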
A computer can be allowed to execute the procedures of the operation by a computer program. The computer program can be recorded on a computer readable recording medium, such as a floppy disk, CD-ROM or hard disk. In the embodiment of the invention, when the program is installed on the computer, for example, when the computer program is loaded into the main memory unit 15, the computer can perform the operation described above.
The invention is not limited to the embodiment described above and can be worked out in various embodiments.
For example, each of the memory bridges 16 and 26 and the I/O bridges 18 and 28 is so constructed as to have an interface circuit section in the embodiment. However, the arithmetic operation systems 11 and 21 may respectively have transmission/reception bridges 71 and 72 in addition to the memory bridges 16 and 26, as shown in FIG. 11. Further, the arithmetic operation systems 11 and 21 may respectively have transmission/reception bridges 81 and 82 in addition to the I/O systems 12 and 22.
In this case, each of the transmission/reception bridges 71, 72, 81 and 82 has a communication error processing section. The transmission/reception bridges 71 and 72 are connected to the memory bridges 16 and 26, respectively. The transmission/reception bridges 81 and 82 are connected to the I/O bridges 18 and 28, respectively.
The transmission/reception bridges 71 and 72 and the transmission/reception bridges 81 and 82 synchronously perform data exchange according to the lock-step system. The transmission/reception bridges 71 and 72 are connected to the existing memory bridges 16 and 26 by one set of communication links. The connection is achieved by fast serial links that are supported by the existing memory bridges. In this case, the length of the link between the memory bridge 16 and the transmission/reception bridge 71 and the length of the link between the memory bridge 26 and the transmission/reception bridge 72 are made as short as possible in order to avoid occurrence of a communication error originating from a difference in reception timing.
This structure can realize a fault-tolerant computer system while using existing system chip set components as they are.
In the embodiment, the computer system takes a double redundant architecture that has two sub systems 1 and 2 which respectively have two processors 13 and 14 and two processors 23 and 24. The architecture of the computer system is not however restrictive and can take a triple redundant architecture or an architecture having a greater number of redundancy levels.
The embodiment has been explained as an example which uses the PCI-Express interface for fast serial links. The link system is not however limited to this particular type. For example, other fast serial links of InfiniBand, HyperTransport or the like may be used instead of the PCI-Express.
The foregoing description of the embodiment has been given of the case where the memory bridges 16 and 26 exchange serial data with the I/O bridges 18 and 28. The data to be exchanged may be parallel data instead of serial data.
Various embodiments and changes may be made thereunto without departing from the broad spirit and scope of the invention. The above-described embodiments are intended to illustrate the present invention, not to limit the scope of the present invention. The scope of the present invention is shown by the attached claims rather than the embodiments. Various modifications made within the meaning of an equivalent of the claims of the invention and within the claims are to be regarded to be in the scope of the present invention.
This application is based on Japanese Patent Application No. 2003-115621 filed on Apr. 21, 2003 and including specification, claims, drawings and summary. The disclosure of the above Japanese Patent Application is incorporated herein by reference in its entirety.

Claims (17)

1. A data processing apparatus that has a plurality of reception interface sections which receive same data from a same data sender and processes data, received by said plurality of reception interface sections, in parallel, comprising:
a frequency divider which generates a sync signal by dividing a frequency of a predetermined clock signal and sends said generated sync signal to each of said reception interface sections and said data sender,
wherein each of said reception interface sections receives data, which is divided by said data sender to data of a data length shorter than one period length of said sync signal supplied from said frequency divider, from said data sender according to said sync signal,
wherein each of said reception interface sections includes a communication error processing section which, upon occurrence of an error in said received data by one of said reception interface sections, stops receiving said data, sends a communication error signal to all other of said reception interface sections to stop data reception from said data sender, and requests said data sender to resend data,
wherein each of said reception interface sections includes an arithmetic operation unit, an I/O unit, and a memory bridge that provides data from said arithmetic operation unit to said I/O unit of the respective reception interface section,
wherein said error in said received data is detected by said memory bridge of said one of said reception interface sections, and
wherein said memory bridge of said one of said reception interface sections sends the communication error signal to said other memory bridges of said other reception interface sections, and further comprising:
a transaction layer that receives the communication error signal output from the communication error processing section;
an internal circuit section; and
a synchronization buffer that exchanges data between the transaction layer and the internal circuit section,
wherein the internal circuit section acquires data held in the synchronization buffer at a timing synchronous with said sync signal and sends the acquired data to a processor external to the data processing apparatus.
2. The data processing apparatus according to claim 1, wherein when an error occurs in part of received data, said communication error processing section of each of said reception interface sections cancels said error-occurred data, and requests said data sender to resend said canceled data.
3. The data processing apparatus according to claim 1, wherein said data sender sends same serial data, and
when an error occurs in received serial data, said communication error processing section of each of said reception interface sections cancels said error-occurred serial data and serial data received following that error-occurred serial data, and requests said data sender to resend said canceled serial data.
4. The data processing apparatus according to claim 1, wherein said data sender sends said data packet by packet with a sequence number affixed to each packet, and
when an error occurs in data of the received packet, said communication error processing section of each of said reception interface sections requests said data sender to resend data packet by packet based on said sequence number affixed to each packet received.
5. A data processing method of a data processing apparatus that performs parallel processing of data received by a plurality of reception interface sections which receive same data from a same data sender and comprises:
a frequency division step of generating a sync signal by dividing a frequency of a predetermined clock signal and sending said generated sync signal to each of said reception interface sections and said data sender;
a data reception step of receiving data, which is divided by said data sender to data of a data length shorter than one period length of said sync signal generated in said frequency division step, from said data sender at each of said plurality of reception interface sections according to said sync signal;
an error detection step of detecting an error in said received data by one of said reception interface sections; and an error information output step of outputting information on said detected error by said one of said reception interface sections to all other of said reception interface sections,
wherein each of said reception interface sections includes an arithmetic operation unit, an I/O unit, and a memory bridge that provides data from said arithmetic operation unit to said I/O unit of said respective reception interface section,
wherein said error detection step comprises detecting said error in said received data by said memory bridge of said one of said reception interface sections, and
wherein said error information output step comprises sending, by said memory bridge of said one of said reception interface sections, said information on said detected error to said other memory bridges of said other reception interface sections, and further comprising: receiving, by a transaction layer, a communication error signal output in the error information output step;
exchanging data, by a synchronization buffer, between a transaction layer and an internal circuit section, wherein the internal circuit section acquires data held in the synchronization buffer at a timing synchronous with said sync signal; and
sending the acquired data by the internal circuit section to a processor external to the data processing apparatus.
6. The data processing method according to claim 5, wherein said error information output step is executed according to said sync signal.
7. The data processing method according to claim 6, further comprising:
an error information reception step of receiving error information, output from said other reception interface sections, at said one of said reception interface sections; and
a data resend requesting step of requesting said data sender to resend data in at least one of a case where an error is detected at said error detection step and a case where error information is received at said error information reception step.
8. The data processing method according to claim 7, further comprising:
a data cancellation step of canceling data; and
a data reception stopping step of stopping data reception, and wherein
said data cancellation step and said data reception stopping step are executed in at least one of a case where an error is detected at said error detection step and a case where error information is received at said error information reception step, and
said data resend requesting step requests resending of data canceled at said data cancellation step.
9. The data processing method according to claim 8, wherein said data cancellation step is executed according to said sync signal.
10. A non-transitory computer readable medium having thereon a computer program, which when executed, performs parallel processing of data received by a plurality of reception interface sections which receive same data from a same data sender and allows a data processing apparatus to execute:
a frequency division step of generating a sync signal by dividing a frequency of a predetermined clock signal and sending said generated sync signal to each of said reception interface sections and said data sender;
a data reception step of receiving data, which is divided by said data sender to data of a data length shorter than one period length of said sync signal generated in said frequency division step, from said data sender at each of said plurality of reception interface sections according to said sync signal;
an error detection step of detecting an error in said received data by one of said reception interface sections; and
an error information output step of outputting information on said detected error by said one of said reception interface sections to all other of said reception interface sections,
wherein each of said reception interface sections includes an arithmetic operation unit, an I/O unit, and a memory bridge that provides data from said arithmetic operation unit to said I/O unit of said respective reception interface section,
wherein said error detection step comprises detecting said error in said received data by said memory bridge of said one of said reception interface sections, and
wherein said error information output step comprises sending, by said memory bridge of said one of said reception interface sections, said information on said detected error to said other memory bridges of said other reception interface sections, and wherein said data processing apparatus is allowed to further execute: receiving, by a transaction layer, a communication error signal output in the error information output step;
exchanging data, by a synchronization buffer, between a transaction layer and an internal circuit section, wherein the internal circuit section acquires data held in the synchronization buffer at a timing synchronous with said sync signal; and
sending the acquired data by the internal circuit section to a processor external to the data processing apparatus.
11. The computer readable medium according to claim 10, wherein said error information output step is executed according to said sync signal.
12. The computer readable medium according to claim 11, wherein said data processing apparatus is allowed to further execute:
an error information reception step of receiving error information, output from said other reception interface sections, at said one of said reception interface sections; and a data resend requesting step of requesting said data sender to resend data in at least one of a case where an error is detected at said error detection step and a case where error information is received at said error information reception step.
13. The computer readable medium according to claim 12, wherein said data processing apparatus is allowed to further execute:
a data cancellation step of canceling data; and
a data reception stopping step of stopping data reception, and
said data cancellation step and said data reception stopping step are executed in at least one of a case where an error is detected at said error detection step and a case where error information is received at said error information reception step, and
said data resend requesting step requests resending of data canceled at said data cancellation step.
14. The computer readable medium according to claim 13, wherein said data cancellation step is executed according to said sync signal.
15. The data processing apparatus according to claim 1, wherein said memory bridge of said one of said reception interface sections sends the communication error signal to said other memory bridges of said other reception interface sections as an open drain signal.
16. The data processing method according to claim 5, wherein, in said error information output step, said memory bridge of said one of said reception interface sections sends the communication error signal to said other memory bridges of said other reception interface sections as an open drain signal.
17. The computer readable medium according to claim 10, wherein, in said error information output step, said memory bridge of said one of said reception interface sections sends the communication error signal to said other memory bridges of said other reception interface sections as an open drain signal.
US10/827,433 2003-04-21 2004-04-20 Data processing apparatus and data processing method Expired - Fee Related US7821919B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2003115621A JP4492035B2 (en) 2003-04-21 2003-04-21 Data processing device
JP2003-115621 2003-04-21

Publications (2)

Publication Number Publication Date
US20040208130A1 US20040208130A1 (en) 2004-10-21
US7821919B2 true US7821919B2 (en) 2010-10-26

Family

ID=33028288

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/827,433 Expired - Fee Related US7821919B2 (en) 2003-04-21 2004-04-20 Data processing apparatus and data processing method

Country Status (7)

Country Link
US (1) US7821919B2 (en)
EP (1) EP1477899B1 (en)
JP (1) JP4492035B2 (en)
CN (1) CN1287284C (en)
AU (1) AU2004201674A1 (en)
CA (1) CA2464779A1 (en)
DE (1) DE602004024266D1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100135153A1 (en) * 2008-12-03 2010-06-03 Micron Technology, Inc. Redundant signal transmission
US20150052263A1 (en) * 2013-08-15 2015-02-19 Fujitsu Limited Information processing system and control method of information processing system
US20190198131A1 (en) * 2017-12-21 2019-06-27 SK Hynix Inc. Semiconductor apparatus and system relating to performing a high speed test in a low speed operation environment
US11200312B1 (en) * 2018-07-02 2021-12-14 Rockwell Collins, Inc. Dual lock step processor system

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4165499B2 (en) 2004-12-13 2008-10-15 日本電気株式会社 Computer system, fault tolerant system using the same, and operation control method thereof
KR100853290B1 (en) 2004-12-14 2008-08-21 엘지전자 주식회사 Method and device for Bus controlling of mobile apparatus
JP2006178618A (en) * 2004-12-21 2006-07-06 Nec Corp Fault tolerant computer and data transmission control method
JP4154610B2 (en) 2004-12-21 2008-09-24 日本電気株式会社 Fault tolerant computer and control method thereof
JP2006178616A (en) 2004-12-21 2006-07-06 Nec Corp Fault tolerant system, controller used thereform, operation method and operation program
JP4168403B2 (en) 2004-12-21 2008-10-22 日本電気株式会社 Fault tolerant system, control device used therefor, access control method, and control program
CN100351734C (en) * 2004-12-31 2007-11-28 华硕电脑股份有限公司 Mainboard and PCI Express x16 slot thereof
JP4558519B2 (en) 2005-01-18 2010-10-06 富士通株式会社 Information processing apparatus and system bus control method
JP4410190B2 (en) 2005-03-24 2010-02-03 富士通株式会社 PCI-Express communication system
US7765357B2 (en) 2005-03-24 2010-07-27 Fujitsu Limited PCI-express communications system
CN100456273C (en) * 2005-03-24 2009-01-28 富士通株式会社 PCI-Express communications system
US20070009267A1 (en) * 2005-06-22 2007-01-11 Crews Darren S Driving a laser using an electrical link driver
US7487274B2 (en) 2005-08-01 2009-02-03 Asic Architect, Inc. Method and apparatus for generating unique identification numbers for PCI express transactions with substantially increased performance
US20070028152A1 (en) * 2005-08-01 2007-02-01 Mishra Kishore K System and Method of Processing Received Line Traffic for PCI Express that Provides Line-Speed Processing, and Provides Substantial Gate-Count Savings
US7669073B2 (en) * 2005-08-19 2010-02-23 Stratus Technologies Bermuda Ltd. Systems and methods for split mode operation of fault-tolerant computer systems
US7536489B2 (en) 2005-08-30 2009-05-19 Ricoh Company Limited Information processing system for determining payload size based on packet-to-payload size ratio
CN100517257C (en) * 2005-10-28 2009-07-22 鸿富锦精密工业(深圳)有限公司 Tool for testing high speed peripheral component interconnected bus interface
US8050290B2 (en) 2007-05-16 2011-11-01 Wilocity, Ltd. Wireless peripheral interconnect bus
US9075926B2 (en) * 2007-07-19 2015-07-07 Qualcomm Incorporated Distributed interconnect bus apparatus
US8010838B2 (en) * 2008-11-20 2011-08-30 International Business Machines Corporation Hardware recovery responsive to concurrent maintenance
US9400722B2 (en) * 2011-11-15 2016-07-26 Ge Aviation Systems Llc Method of providing high integrity processing
EP3109769B1 (en) * 2012-12-13 2019-10-16 Coherent Logix, Incorporated Multiprocessor system with improved secondary interconnection network
CN103684689A (en) * 2013-11-29 2014-03-26 重庆西信天元数据资讯有限公司 Self-check data transmission method
CN106796541B (en) 2015-03-20 2021-03-09 瑞萨电子株式会社 Data processing apparatus

Citations (25)

Publication number Priority date Publication date Assignee Title
JPH01154242A (en) 1987-09-04 1989-06-16 Digital Equip Corp <Dec> Double-zone failure-proof computer system
US4860119A (en) * 1987-05-29 1989-08-22 Ricoh Co., Ltd. Image forming system with memory duplex printing
JPH02218022A (en) 1989-02-17 1990-08-30 Ricoh Co Ltd Separate type optical pickup device
JPH02264337A (en) 1989-04-04 1990-10-29 Nec Corp Data transfer control system
JPH0471037A (en) 1990-07-12 1992-03-05 Toshiba Corp Duplex system for electronic computer
JPH05241095A (en) 1992-03-02 1993-09-21 Matsushita Electric Ind Co Ltd Method for compensating spherical aberration of optical disk and optical head using the same
JPH06324281A (en) 1993-05-10 1994-11-25 Ricoh Co Ltd Optical pickup device
US5422893A (en) 1994-08-04 1995-06-06 International Busines Machines Corporation Maintaining information from a damaged frame by the receiver in a communication link
US5502733A (en) 1991-03-08 1996-03-26 Matsushita Electric Industrial Co., Ltd. Data transfer device
JPH08278950A (en) 1995-04-04 1996-10-22 Hitachi Ltd Multiplexed computer system and fault restoring method
EP0747803A2 (en) 1992-12-17 1996-12-11 Tandem Computers Incorporated Clock for a fail-fast, fail-functional, fault-tolerant multiprocessor system
JPH0963108A (en) 1995-08-28 1997-03-07 Sony Corp Optical pickup device
US5630056A (en) 1994-09-20 1997-05-13 Stratus Computer, Inc. Digital data processing methods and apparatus for fault detection and fault tolerance
JPH10154085A (en) 1996-11-21 1998-06-09 Fujitsu Ltd System supervisory and controlling method by dual supervisory/controlling processor and dual supervisory/ controlling processor system
WO1999026133A2 (en) 1997-11-14 1999-05-27 Marathon Technologies Corporation Method for maintaining the synchronized execution in fault resilient/fault tolerant computer systems
JPH11144294A (en) 1997-11-07 1999-05-28 Sony Corp Optical pickup device
JPH11296394A (en) 1998-04-15 1999-10-29 Nec Corp Duplex information processor
JP2000040249A (en) 1998-07-17 2000-02-08 Pioneer Electronic Corp Aberration correcting device, astigmatism measuring method, and optical pickup
US6151154A (en) 1998-03-12 2000-11-21 Pioneer Electronic Corporation Optical pickup, aberration correction unit and astigmatism measurement method
JP2001290668A (en) 2000-04-04 2001-10-19 Koken:Kk Fault tolerant computer and communication system using the same
US6330701B1 (en) 1997-12-10 2001-12-11 Telefonaktiebolaget Lm Ericsson (Publ) Method relating to processors, and processors adapted to function in accordance with the method
JP2002133697A (en) 2000-10-26 2002-05-10 Asahi Glass Co Ltd Optical head device
JP2002342975A (en) 2000-12-28 2002-11-29 Sony Corp Optical disk recording and/or reproducing device, and aberration adjusting method
JP2002373445A (en) 2001-06-13 2002-12-26 Nec Corp Optical head device
US7068576B2 (en) 2000-12-28 2006-06-27 Sony Corporation Optical disc recording and/or reproducing apparatus and aberration adjustment method

Patent Citations (29)

Publication number Priority date Publication date Assignee Title
US4860119A (en) * 1987-05-29 1989-08-22 Ricoh Co., Ltd. Image forming system with memory duplex printing
JPH01154242A (en) 1987-09-04 1989-06-16 Digital Equip Corp <Dec> Double-zone failure-proof computer system
JPH02218022A (en) 1989-02-17 1990-08-30 Ricoh Co Ltd Separate type optical pickup device
JPH02264337A (en) 1989-04-04 1990-10-29 Nec Corp Data transfer control system
JPH0471037A (en) 1990-07-12 1992-03-05 Toshiba Corp Duplex system for electronic computer
US5502733A (en) 1991-03-08 1996-03-26 Matsushita Electric Industrial Co., Ltd. Data transfer device
JPH05241095A (en) 1992-03-02 1993-09-21 Matsushita Electric Ind Co Ltd Method for compensating spherical aberration of optical disk and optical head using the same
EP0747803A2 (en) 1992-12-17 1996-12-11 Tandem Computers Incorporated Clock for a fail-fast, fail-functional, fault-tolerant multiprocessor system
US6496940B1 (en) 1992-12-17 2002-12-17 Compaq Computer Corporation Multiple processor system with standby sparing
JPH09128349A (en) 1992-12-17 1997-05-16 Tandem Comput Inc Fail-first, fail-functional and fault-tolerant multiprocessor system
JPH06324281A (en) 1993-05-10 1994-11-25 Ricoh Co Ltd Optical pickup device
US5422893A (en) 1994-08-04 1995-06-06 International Busines Machines Corporation Maintaining information from a damaged frame by the receiver in a communication link
US5630056A (en) 1994-09-20 1997-05-13 Stratus Computer, Inc. Digital data processing methods and apparatus for fault detection and fault tolerance
JPH08278950A (en) 1995-04-04 1996-10-22 Hitachi Ltd Multiplexed computer system and fault restoring method
JPH0963108A (en) 1995-08-28 1997-03-07 Sony Corp Optical pickup device
JPH10154085A (en) 1996-11-21 1998-06-09 Fujitsu Ltd System supervisory and controlling method by dual supervisory/controlling processor and dual supervisory/ controlling processor system
JPH11144294A (en) 1997-11-07 1999-05-28 Sony Corp Optical pickup device
WO1999026133A2 (en) 1997-11-14 1999-05-27 Marathon Technologies Corporation Method for maintaining the synchronized execution in fault resilient/fault tolerant computer systems
JP2001526422A (en) 1997-12-10 2001-12-18 テレフオンアクチーボラゲツト エル エム エリクソン(パブル) Processor-related methods and processors adapted for functions based on the methods
US6330701B1 (en) 1997-12-10 2001-12-11 Telefonaktiebolaget Lm Ericsson (Publ) Method relating to processors, and processors adapted to function in accordance with the method
US6151154A (en) 1998-03-12 2000-11-21 Pioneer Electronic Corporation Optical pickup, aberration correction unit and astigmatism measurement method
JPH11296394A (en) 1998-04-15 1999-10-29 Nec Corp Duplex information processor
JP2000040249A (en) 1998-07-17 2000-02-08 Pioneer Electronic Corp Aberration correcting device, astigmatism measuring method, and optical pickup
JP2001290668A (en) 2000-04-04 2001-10-19 Koken:Kk Fault tolerant computer and communication system using the same
JP2002133697A (en) 2000-10-26 2002-05-10 Asahi Glass Co Ltd Optical head device
JP2002342975A (en) 2000-12-28 2002-11-29 Sony Corp Optical disk recording and/or reproducing device, and aberration adjusting method
US7068576B2 (en) 2000-12-28 2006-06-27 Sony Corporation Optical disc recording and/or reproducing apparatus and aberration adjustment method
JP2002373445A (en) 2001-06-13 2002-12-26 Nec Corp Optical head device
US6791927B2 (en) 2001-06-13 2004-09-14 Nec Corporation Optical head having optimum tilt angles

Cited By (7)

Publication number Priority date Publication date Assignee Title
US20100135153A1 (en) * 2008-12-03 2010-06-03 Micron Technology, Inc. Redundant signal transmission
US8570860B2 (en) * 2008-12-03 2013-10-29 Micron Technology, Inc. Redundant signal transmission
US20150052263A1 (en) * 2013-08-15 2015-02-19 Fujitsu Limited Information processing system and control method of information processing system
US9509780B2 (en) * 2013-08-15 2016-11-29 Fujitsu Limited Information processing system and control method of information processing system
US20190198131A1 (en) * 2017-12-21 2019-06-27 SK Hynix Inc. Semiconductor apparatus and system relating to performing a high speed test in a low speed operation environment
US10529437B2 (en) * 2017-12-21 2020-01-07 SK Hynix Inc. Semiconductor apparatus and system relating to performing a high speed test in a low speed operation environment
US11200312B1 (en) * 2018-07-02 2021-12-14 Rockwell Collins, Inc. Dual lock step processor system

Also Published As

Publication number Publication date
JP4492035B2 (en) 2010-06-30
CN1287284C (en) 2006-11-29
DE602004024266D1 (en) 2010-01-07
CN1540514A (en) 2004-10-27
AU2004201674A1 (en) 2004-11-04
EP1477899A2 (en) 2004-11-17
US20040208130A1 (en) 2004-10-21
CA2464779A1 (en) 2004-10-21
EP1477899A3 (en) 2008-06-11
JP2004326151A (en) 2004-11-18
EP1477899B1 (en) 2009-11-25

Similar Documents

Publication Publication Date Title
US7821919B2 (en) Data processing apparatus and data processing method
US6425033B1 (en) System and method for connecting peripheral buses through a serial bus
US6400682B1 (en) Method and apparatus for a fault tolerant, software transparent and high data integrity extension to a backplane bus or interconnect
US8117525B2 (en) Method for parallel data integrity checking of PCI express devices
US7106742B1 (en) Method and system for link fabric error detection and message flow control
US7124319B2 (en) Delay compensation for synchronous processing sets
EP1825382B1 (en) Low protocol, high speed serial transfer for intra-board or inter-board data communication
US20020087921A1 (en) Method and apparatus for detecting and recovering from errors in a source synchronous bus
US7139965B2 (en) Bus device that concurrently synchronizes source synchronous data while performing error detection and correction
US20100138573A1 (en) System including transmitter and receiver
US20100241909A1 (en) Fault-tolerant system
US7031258B1 (en) Digital data system with link level message flow control
JP5772911B2 (en) Fault tolerant system
US6862283B2 (en) Method and apparatus for maintaining packet ordering with error recovery among multiple outstanding packets between two devices
US9178692B1 (en) Serial link training method and apparatus with deterministic latency
US11226790B2 (en) Arithmetic processing apparatus with delay-and-swap processing circuit
US6834362B2 (en) Apparatus and method for error detection on source-synchronous buses
WO2024102715A1 (en) In-band data package transmission
JP4048988B2 (en) Fault tolerant system and synchronization method used therefor
CN116685959A (en) Logical physical layer interface specification supporting PCIE 6.0, CXL 3.0 and UPI 3.0 protocols
RU2700560C1 (en) Gigaspacewire communication interface device
US6587988B1 (en) Dynamic parity inversion for I/O interconnects
JP4204885B2 (en) Data communication system
JPH04355532A (en) Fault detection system in bus network

Legal Events

Date Code Title Description
AS Assignment

Owner name: NEC CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MIZUTANI, FUMITOSHI;ODA, SHINYA;REEL/FRAME:015237/0529

Effective date: 20040419

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552)

Year of fee payment: 8

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20221026