CN118509392A - Network-on-chip system based on self-adaptive routing - Google Patents

Network-on-chip system based on self-adaptive routing

Info

Publication number
CN118509392A
Authority
CN
China
Prior art keywords
data
arbitration
packet
source node
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410948724.2A
Other languages
Chinese (zh)
Other versions
CN118509392B (en
Inventor
刘帆
毕立强
杨亮
赵达
钱黎明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cetc Shentai Information Technology Co ltd
Original Assignee
Cetc Shentai Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cetc Shentai Information Technology Co ltd
Priority to CN202410948724.2A
Publication of CN118509392A
Application granted
Publication of CN118509392B
Legal status: Active (current)
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/10 Packet switching elements characterised by the switching fabric construction
    • H04L49/109 Integrated on microchip, e.g. switch-on-chip
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807 System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G06F15/7825 Globally asynchronous, locally synchronous, e.g. network on chip
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F5/00 Methods or arrangements for data conversion without changing the order or content of the data handled
    • G06F5/06 Methods or arrangements for data conversion without changing the order or content of the data handled for changing the speed of data flow, i.e. speed regularising or timing, e.g. delay lines, FIFO buffers; over- or underrun control therefor
    • G06F5/065 Partitioned buffers, e.g. allowing multiple independent queues, bidirectional FIFO's
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/12 Avoiding congestion; Recovering from congestion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention relates to the field of embedded processor technology, and in particular to a network-on-chip system based on adaptive routing. The system is built from an input buffer, decoding logic, an input state machine, read enable logic, station invalidation logic, first-level round-robin arbitration, an output state machine, data stations, and packet-level round-robin arbitration. The invention provides a feasible scheme for adaptive routing in which data from the same source node is never transmitted on two destination nodes at the same time, which improves bandwidth utilization. It also introduces a weighted round-robin data arbitration that reduces the probability of data congestion and improves data transmission efficiency.

Description

A Network-on-Chip System Based on Adaptive Routing

Technical Field

The present invention relates to the technical field of embedded processors, and in particular to a network-on-chip system based on adaptive routing.

Background Art

With the development and spread of high-bandwidth multi-core and many-core chips, on-chip multiprocessor (multi-core) system design has become the prevailing trend in modern embedded systems and the most widely applied form of very-large-scale integration (VLSI) design. As the most promising architecture for next-generation on-chip multiprocessor systems, many-core interconnects based on a network-on-chip (NoC) offer strong parallel processing capability, high-bandwidth on-chip data transfer, efficient use of computing and communication resources, and good scalability, and they are already widely used in high-performance embedded systems. There is therefore a pressing need for a network-on-chip design that improves data transfer between on-chip core and non-core hardware units.

Summary of the Invention

In view of the shortcomings of the prior art, the present invention provides a network-on-chip system based on adaptive routing. It addresses data transfer between on-chip core and non-core hardware units, provides a feasible scheme for adaptive routing, and introduces a weighted round-robin data arbitration that reduces the probability of data congestion and improves bandwidth utilization.

The present invention is achieved through the following technical solutions:

A network-on-chip system based on adaptive routing comprises an input buffer, decoding logic, an input state machine, read enable logic, station invalidation logic, first-level round-robin arbitration, an output state machine, a data station, and packet-level round-robin arbitration.

The input buffer uses asynchronous FIFOs to buffer data from the source nodes. The data path between the network-on-chip and each node is a single physical channel divided into multiple virtual channels. Each virtual channel carries one type of data packet, the control architecture defines three packet formats in total, and each virtual channel has its own pair of write-enable and read-enable signals.

The decoding logic decodes the data in the buffer. The decode result identifies the destination node of the data, and a Req request is sent to that destination node.

The input state machine controls when the decoding logic sends its requests.

The read enable logic controls when the input port sends a read-enable signal to the source node. By counting these pulses, the source node tracks the available depth of the input port's asynchronous FIFO.

The station invalidation logic handles the case where the same data could be transmitted on both destination nodes: it selects one of them for the transfer and marks the other destination node as invalid.

The first-level round-robin arbitration arbitrates among requests from different source nodes. As long as an asynchronous FIFO is non-empty, a request takes part in arbitration every cycle, and the arbitration result is produced in the same cycle.

The output state machine ensures that, once one source node's packet has been fully transmitted, the priority of the round-robin arbiter is switched according to the state-machine state so that the next source node's packet is transmitted.

The data station keeps data transmission pipelined.

The packet-level round-robin arbitration arbitrates data requests from different virtual channels. The packet-level round-robin arbiter has weights and an arbitration enable. Weights are set as packet counts: Data0 switches priority after 5 packets have been transmitted, while Data1 switches priority after every packet. The arbitration enable is defined as the next-level station being writable.

Preferably, the asynchronous FIFO is 8 entries deep and 322 bits wide. The lower 288 bits are the data field, comprising 256 bits of data and 32 bits of ECC. The upper 34 bits carry sideband information, including SrcID (source node), DstID (destination node), TYPE (data virtual-channel type), the MAF number, and an even parity over the sideband information. This description uses 2 packet formats. The first consists of 4 flits, each carrying both sideband information and data. The second consists of 5 flits, namely 1 header and 4 data flits: the header carries only sideband information, with its data bits all 0, and each data flit carries only data, with its sideband bits all 0.
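The 322-bit flit layout above can be illustrated with a small packing sketch. The 288/34 split and the 256/32 split come from the text; the position of the ECC bits relative to the data bits inside the 288-bit field, and all function names, are assumptions made for illustration only. The sub-fields of the 34-bit sideband are left opaque because their widths are not given.

```python
FLIT_WIDTH = 322
PAYLOAD_WIDTH = 288    # lower 288 bits: data plus ECC
DATA_WIDTH = 256
ECC_WIDTH = 32
SIDEBAND_WIDTH = 34    # upper 34 bits: SrcID, DstID, TYPE, MAF number, parity


def pack_flit(sideband: int, ecc: int, data: int) -> int:
    """Pack one flit into an integer.

    Assumption: ECC occupies bits [287:256] and data bits [255:0];
    the source only states that both live in the lower 288 bits.
    """
    for name, value, width in (("sideband", sideband, SIDEBAND_WIDTH),
                               ("ecc", ecc, ECC_WIDTH),
                               ("data", data, DATA_WIDTH)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (sideband << PAYLOAD_WIDTH) | (ecc << DATA_WIDTH) | data


def unpack_flit(flit: int) -> tuple[int, int, int]:
    """Return (sideband, ecc, data) from a packed flit."""
    data = flit & ((1 << DATA_WIDTH) - 1)
    ecc = (flit >> DATA_WIDTH) & ((1 << ECC_WIDTH) - 1)
    sideband = flit >> PAYLOAD_WIDTH
    return sideband, ecc, data


def make_header(sideband: int) -> int:
    """Header flit of the 5-flit format: sideband only, data bits all 0."""
    return pack_flit(sideband, 0, 0)


def make_data_flit(ecc: int, data: int) -> int:
    """Data flit of the 5-flit format: data only, sideband bits all 0."""
    return pack_flit(0, ecc, data)


def even_parity_bit(value: int) -> int:
    """Bit that makes the total number of 1s (value plus this bit) even."""
    return bin(value).count("1") & 1
```

Keeping the sideband as a single opaque field avoids guessing at the widths of SrcID, DstID, TYPE and the MAF number.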

Preferably, the decoding module determines the destination node of the data from the DstID and TYPE fields of the sideband information of the buffered data.

Preferably, the input state machine controls the sending of the source node's virtual-channel data requests and has three states: ARB, TRANS1 and TRANS2. In ARB, the Req request is decoded. In TRANS1, if the asynchronous FIFO is non-empty, the data to be transmitted belongs to the whole packet decoded in ARB; if the FIFO is empty, nothing is transmitted. At most one flit is transmitted in this state. In TRANS2, the remaining data is transmitted.

Preferably, in the read enable logic, when the round-robin arbiter of an output port issues an arbitration grant to the input port, a read-enable pulse is sent to the source node. The source node increments its credit value by 1 for every pulse it receives; its initial credit value is the depth of the input port's asynchronous FIFO. When this read-enable signal is 1 it is valid, and the FIFO's read pointer is incremented by 1.

Preferably, the first-level round-robin arbitration arbitrates requests on the same virtual channel from different source nodes. Once a source node wins arbitration, the priority is switched only after that source node's entire packet has been transmitted. The arbiter also has an arbitration enable, defined as the next-level station being writable.

Preferably, the output state machine reflects the state of data transmission on the virtual channels of the two destination nodes and has three states: ARB, TRANS1 and TRANS2. In ARB, the packet header is arbitrated. In TRANS1, for the invalidated destination node the current data is not actually transmitted to that node, while for the non-invalidated destination node the remaining data of the packet keeps being transmitted. In TRANS2, the destination node that was invalidated in TRANS1 transmits the entire packet of the next source node.

Preferably, the data station keeps data transmission pipelined: a packet from one source node must be transmitted continuously to completion before the packet of the next source node is transmitted, and a station is provided for every virtual channel of every node output port.

Preferably, the packet-level round-robin arbitration arbitrates packets of 2 different virtual channels and outputs the arbitration result in the same cycle as the request. While the arbiter is enabled, arbitration takes place every cycle. For bubble-free transmission, the grant signal is output in the cycle of the request; when there is a bubble, no grant is generated. Once a virtual channel wins arbitration, a counter counts the packets transmitted on that channel, and arbitration switches to the next virtual channel only when the counter reaches that channel's weight. The arbiter also has an arbitration enable, defined as the next-level station being writable.

The beneficial effects of the present invention are:

1) Adaptive routing mechanism: data from the same source node is never transmitted on two destination nodes at the same time, and the invalidated port can carry data from other source nodes, which improves bandwidth utilization.

2) Weighted round-robin data arbitration reduces the probability of data congestion and improves data transmission efficiency.

Brief Description of the Drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is an overall schematic diagram of the adaptive data routing of the present invention;

FIG. 2 is a state diagram of the input state machine of the present invention;

FIG. 3 is a schematic diagram of the station invalidation logic of the present invention;

FIG. 4 is a state diagram of the output state machine of the present invention;

FIG. 5 is a schematic diagram of the first-level round-robin arbiter of the present invention;

FIG. 6 is a schematic diagram of the station of the present invention.

Detailed Description

To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.

As shown in FIG. 1, the present invention provides a network-on-chip system based on adaptive routing, built from an input buffer, decoding logic, an input state machine, read enable logic, station invalidation logic, first-level round-robin arbitration, an output state machine, data stations, and packet-level round-robin arbitration.

The input buffer uses asynchronous FIFOs to buffer data from the source nodes. The data path between the network-on-chip and each node is a single physical channel divided into multiple virtual channels. Each virtual channel carries one type of data packet, the control architecture defines three packet formats in total, and each virtual channel has its own pair of write-enable and read-enable signals. When a source node writes one data item into the asynchronous FIFO, the write enable is asserted and the FIFO write pointer is incremented by 1. When the arbitration grant output by the round-robin arbiter of an output-port virtual channel is valid, the FIFO read enable is asserted and the FIFO read pointer is incremented by 1.

The decoding logic determines the destination node from the DstID and TYPE fields of the sideband information of the data in the asynchronous FIFO. Once the decode is valid, a Req request of the corresponding type is sent to the destination node. In the second packet format, the header carries only sideband information (data bits all 0) and the data flits carry only data (sideband bits all 0), so the data flits cannot be decoded on their own. Therefore, when the header decode is valid, the Req request is latched in a register under control of the state machine, and it remains asserted throughout the TRANS1 and TRANS2 states.

As shown in FIG. 2, the input state machine controls the sending of the source node's virtual-channel data requests and has three states: ARB, TRANS1 and TRANS2. In ARB, decoding is performed on the sideband information. In TRANS1 and TRANS2, the request is held in the latch. In TRANS1, the data about to be transmitted is the first data flit after the header. In TRANS2, the remaining data is transmitted; the state machine returns to ARB only when the counter reaches its full count and otherwise stays in TRANS2. The request actually issued by the input port is the AND of this held signal and the asynchronous FIFO non-empty signal.
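The input state machine can be sketched as a small behavioural model. The three state names, the latched request, and the gating of the request by FIFO non-empty come from the text. The exact transition conditions (advance only on a granted, non-empty cycle), the counter value of 3 remaining flits for a 5-flit packet, and all class and method names are assumptions chosen for illustration.

```python
from enum import Enum


class State(Enum):
    ARB = 0      # header is decoded and arbitrated
    TRANS1 = 1   # first data flit after the header
    TRANS2 = 2   # remaining data flits


class InputFSM:
    """Behavioural sketch of the input state machine (not the patented RTL)."""

    def __init__(self, data_flits: int = 4):
        self.state = State.ARB
        self.data_flits = data_flits   # data flits following the header
        self.latched_dst = None        # latched Req target
        self.remaining = 0             # flits still to send in TRANS2

    def request(self, fifo_nonempty: bool, decoded_dst=None):
        """Destination requested this cycle, or None.

        The issued request is the held (or freshly decoded) request
        ANDed with the FIFO non-empty signal.
        """
        if not fifo_nonempty:
            return None
        return decoded_dst if self.state is State.ARB else self.latched_dst

    def step(self, fifo_nonempty: bool, granted: bool, decoded_dst=None) -> State:
        """Advance one cycle; a flit moves only if the FIFO has data and a grant arrived."""
        if not (fifo_nonempty and granted):
            return self.state
        if self.state is State.ARB:
            self.latched_dst = decoded_dst       # latch the header's request
            self.state = State.TRANS1
        elif self.state is State.TRANS1:
            self.remaining = self.data_flits - 1
            self.state = State.TRANS2 if self.remaining else State.ARB
        else:
            self.remaining -= 1
            if self.remaining == 0:              # counter full: packet done
                self.state = State.ARB
        if self.state is State.ARB:
            self.latched_dst = None
        return self.state
```

With the default of 4 data flits, one header grant, one TRANS1 grant and three TRANS2 grants return the model to ARB, which matches the 5-flit packet format.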

In the read enable logic, the round-robin arbiter of each virtual channel of an output port returns its arbitration result to the input port. The OR of all these arbitration grant signals is the read-enable pulse sent to the source node, and the source node increments its credit value by 1 for every pulse it receives. When this signal is 1 (valid), the read pointer of the asynchronous FIFO is also incremented by 1.
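The credit mechanism can be sketched as follows. The OR of the grant signals, the increment per read-enable pulse, and the initial credit equal to the FIFO depth (8 in the preferred embodiment) come from the text. That the source node spends one credit per flit it writes is an assumption, as are the class and method names; the text only describes the increment side.

```python
FIFO_DEPTH = 8  # depth of the input port's asynchronous FIFO


def read_enable(grants) -> bool:
    """Read-enable pulse: OR of the grant signals of all virtual-channel arbiters."""
    return any(grants)


class SourceCredits:
    """Source-side view of the free space in the input port's FIFO."""

    def __init__(self, depth: int = FIFO_DEPTH):
        self.depth = depth
        self.credits = depth          # initial credit equals the FIFO depth

    def can_send(self) -> bool:
        return self.credits > 0

    def on_send(self) -> None:
        """Assumption: writing one flit into the FIFO consumes one credit."""
        if self.credits == 0:
            raise RuntimeError("no credit: the input FIFO may be full")
        self.credits -= 1

    def on_read_enable_pulse(self) -> None:
        """One flit was read out of the FIFO, so one slot is free again."""
        self.credits = min(self.depth, self.credits + 1)
```

A source that sends 8 flits without receiving any pulse ends up with zero credits and must wait for the next read-enable pulse.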

As shown in FIG. 3, the station invalidation logic selects one destination node for the transfer when the data could be transmitted on both. When the first-level round-robin arbiters of the Data0 virtual channel on both the Dst0 and Dst1 output ports grant the Data0 request issued by Src0, the Dst0 and Dst1 output ports each return their own c_Dst2Srci_Data_Grant signal to the Src0 input port. Because timing is tight, the Src0 input port registers the two c_Dst02Srci_Data_Grant signals for one clock cycle and then uses Src0_Dst_Ptr to select Dst0 for the transfer. This pointer chooses one destination when both are available: 0 selects Dst0, 1 selects Dst1, and its initial value is 0. At this point c_Src02Dst1_Data_Grant_Invalid is set to 1 and output to Dst1.
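The selection step can be sketched as a pure function of the two registered grants and the pointer. The function name and return shape are hypothetical. How Src0_Dst_Ptr is updated after a selection is not described in the text, so the sketch leaves the pointer unchanged.

```python
def resolve_grants(grant_dst0: bool, grant_dst1: bool, dst_ptr: int):
    """Pick one destination and derive the invalidation flags.

    Returns (selected, invalid_dst0, invalid_dst1), where `selected` is
    0, 1, or None when neither destination granted. The invalid flags
    play the role of signals such as c_Src02Dst1_Data_Grant_Invalid.
    """
    if dst_ptr not in (0, 1):
        raise ValueError("dst_ptr must be 0 or 1")
    if grant_dst0 and grant_dst1:
        # Both output ports granted the same source: keep the one the
        # pointer names and invalidate the other.
        return dst_ptr, dst_ptr != 0, dst_ptr != 1
    if grant_dst0:
        return 0, False, False
    if grant_dst1:
        return 1, False, False
    return None, False, False


def station_write_valid(stage_valid: bool, grant_invalid: bool) -> bool:
    """Station write is gated by the inverted invalidation flag."""
    return stage_valid and not grant_invalid
```

For the case described above (both grants valid, pointer 0), the function selects Dst0 and raises the invalidation flag for Dst1.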

As shown in FIG. 4 and FIG. 5, the first-level round-robin arbiter arbitrates requests on the same virtual channel from different source nodes. Suppose the requests arriving at both the Dst0 and Dst1 output ports are 3'b101. With the output state machine in ARB, the first-level round-robin arbiters of Dst0 and Dst1 both output the grant 3'b001 in the same cycle. In TRANS1, the data in the asynchronous FIFO is driven to the stations of both Dst0 and Dst1, but c_Src02Dst1_Data_Grant_Invalid is 1 at this point; the inverse of this signal ANDed with the station Valid signal is 0, so the data is not written to Dst1. Meanwhile, the Dst1 port arbitrates the packet header of the next source node and moves to TRANS2. For Dst0, the data after the header continues to be transmitted; when the packet is complete the state machine moves to ARB, otherwise it stays in TRANS1. In TRANS2, the Dst1 port transmits the data of the next source node; when that packet is complete the state machine moves to ARB, otherwise it stays in TRANS2.
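A first-level arbiter of this kind can be modelled as a round-robin arbiter whose priority pointer moves only when a whole packet has been sent. The request vector 3'b101 and grant 3'b001 reproduce the example above. The initial pointer of 0, the lowest-index-first search order, and the class and method names are assumptions for illustration.

```python
def round_robin_grant(requests: int, pointer: int, width: int = 3) -> int:
    """One-hot grant for the first asserted request at or after `pointer`."""
    for offset in range(width):
        idx = (pointer + offset) % width
        if (requests >> idx) & 1:
            return 1 << idx
    return 0


class PacketHoldArbiter:
    """Round-robin arbiter that keeps its priority until a packet completes."""

    def __init__(self, width: int = 3):
        self.width = width
        self.pointer = 0   # assumed initial priority: source 0 first

    def arbitrate(self, requests: int, enable: bool = True) -> int:
        """Same-cycle grant; `enable` models 'next-level station is writable'."""
        if not enable:
            return 0
        return round_robin_grant(requests, self.pointer, self.width)

    def packet_done(self, grant: int) -> None:
        """Rotate priority past the granted source once its whole packet is sent."""
        if grant == 0:
            return
        self.pointer = (grant.bit_length() % self.width)
```

Because `grant` is one-hot, `grant.bit_length()` equals the granted index plus one, so the pointer moves to the source immediately after the one that just finished.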

As shown in FIG. 6, the data station keeps data transmission pipelined. When Stage_Wr_En is valid, data can be written into the station, and the station valid bit Stage_Valid becomes 1. When Stage_Valid is 0, no request takes part in arbitration. When Stage_Wr_En and Stage_Rd_En are both 1 in the same cycle, data flows through the station in pipelined fashion.
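One way to picture the station is as a single-entry register with a valid bit. The signal names Stage_Wr_En, Stage_Rd_En and Stage_Valid come from the text; modelling the station as exactly one entry, and the class below, are assumptions made for illustration.

```python
class DataStation:
    """Single-entry pipeline register with a valid bit (behavioural sketch)."""

    def __init__(self):
        self.stage_valid = False   # Stage_Valid
        self.data = None

    def tick(self, stage_wr_en: bool, wr_data, stage_rd_en: bool):
        """Advance one cycle and return the data read out, or None.

        The read is applied before the write, so asserting both enables
        in the same cycle drains the old entry and stores the new one
        without a bubble.
        """
        out = None
        if stage_rd_en and self.stage_valid:
            out = self.data
            self.stage_valid = False
        if stage_wr_en:
            if self.stage_valid:
                raise RuntimeError("write to a full station")
            self.data = wr_data
            self.stage_valid = True
        return out
```

In this model the upstream arbitration enable corresponds to the station being empty or being read in the same cycle.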

The packet-level round-robin arbitration arbitrates the data held in the stations of the different virtual channels of an output port. The packet-level round-robin arbiter has an arbitration enable and weights. The arbitration enable is defined as the next-level station being writable, and the weight is set per data virtual-channel type, for example 5 for Data0 and 1 for Data1. During bubble-free transmission, each complete Data0 packet produces a one-cycle SendSucc signal; a counter counts these pulses, and only when the count reaches the configured weight does the round-robin priority switch so that the other virtual channel's data is transmitted. For Data1, the round-robin priority switches after every complete packet. When transmission contains bubbles, the round-robin priority switches normally. The grant signal output by the packet-level round-robin arbiter serves as the write enable of the corresponding data virtual channel and travels to the next stage together with that channel's data.
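The weighted switching rule can be sketched at packet granularity. The weights 5 and 1 and the SendSucc counting come from the text. How the arbiter behaves when the priority channel has a bubble is only loosely described, so the sketch assumes the other requesting channel is served and takes over the priority; the class and method names are hypothetical.

```python
class WeightedPacketArbiter:
    """Packet-level weighted round-robin over virtual channels (sketch)."""

    def __init__(self, weights=(5, 1)):
        self.weights = weights   # packets per turn: Data0 = 5, Data1 = 1
        self.current = 0         # channel that currently holds priority
        self.sent = 0            # SendSucc pulses counted in this turn

    def grant(self, requests, enable: bool = True):
        """Return the granted channel index, or None.

        `enable` models the arbitration enable (next-level station writable).
        """
        if not enable:
            return None
        n = len(self.weights)
        for offset in range(n):
            vc = (self.current + offset) % n
            if requests[vc]:
                return vc
        return None

    def send_succ(self, vc: int) -> None:
        """Count one completed packet on channel `vc` and rotate if its weight is reached."""
        if vc != self.current:
            # Assumption: the priority channel had a bubble, so priority
            # moves to the channel that was actually served.
            self.current = vc
            self.sent = 0
        self.sent += 1
        if self.sent >= self.weights[self.current]:
            self.current = (self.current + 1) % len(self.weights)
            self.sent = 0
```

With both channels requesting continuously, the served order is five Data0 packets followed by one Data1 packet, repeating.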

The above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the invention has been described in detail with reference to these embodiments, a person of ordinary skill in the art should understand that the technical solutions described in them may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1.一种基于自适应路由的片上网络系统,所述的片上网络系统中带有权重的数据轮转仲裁,降低了数据的堵塞概率,其特征在于:包括输入缓冲、译码逻辑、输入状态机、读使能逻辑、站台无效逻辑、一级轮转仲裁、输出状态机、数据站台和包级轮转仲裁;1. A network-on-chip system based on adaptive routing, wherein the network-on-chip system has a weighted data round-robin arbitration, which reduces the probability of data congestion, and is characterized by comprising an input buffer, a decoding logic, an input state machine, a read enable logic, a station invalidation logic, a first-level round-robin arbitration, an output state machine, a data station, and a packet-level round-robin arbitration; 所述输入缓冲,采用异步FIFO的形式缓存来自源节点的数据;片上网络和各个节点间的数据通道只有一个物理通道,但划分多个虚通道,每个虚通道表示一种类型的数据包,控制架构共三种数据包格式,且每个虚通道均设置一组写使能和读使能信号;第一种数据包格式只有1个流控单元,该数据包格式含有边带信息和数据;第二种数据包格式包含有5个流控单元,具体为1个包头和4个数据,其中包头只包含边带信息不含数据,数据位为全0;数据不含边带信息而只有数据,边带信息位为全0;第三种数据包格式包含有4个流控单元,每个流控单元均包含边带信息和数据;The input buffer caches data from the source node in the form of an asynchronous FIFO; the data channel between the on-chip network and each node has only one physical channel, but is divided into multiple virtual channels, each virtual channel represents a type of data packet, and the control architecture has three data packet formats, and each virtual channel is set with a set of write enable and read enable signals; the first data packet format has only one flow control unit, and the data packet format contains sideband information and data; the second data packet format contains five flow control units, specifically one packet header and four data, wherein the packet header only contains sideband information but no data, and the data bits are all 0; the data does not contain sideband information but only data, and the sideband information bits are all 0; the third data packet format contains four flow control units, each of which contains sideband information and data; 所述译码逻辑,译码异步FIFO中的数据,根据译码结果能够获知该数据的目的节点,并向目的节点发起Req请求;The decoding logic decodes the data in the asynchronous FIFO, can know the destination node of the data according to the decoding 
result, and initiates a Req request to the destination node; 所述输入状态机,控制译码逻辑的请求发送;The input state machine controls the request sending of the decoding logic; 所述读使能逻辑,控制输入端口何时向源节点发送读使能信号,源节点计数脉冲信号,可知输入端口异步FIFO的可用深度值;The read enable logic controls when the input port sends a read enable signal to the source node. The source node counts the pulse signal to know the available depth value of the asynchronous FIFO of the input port; 所述站台无效逻辑,控制数据在两个目的节点都传输时,选择其中一个进行传输,同时另一个目的节点需要置无效;The station invalidation logic controls the data to be transmitted to two destination nodes, and selects one of them for transmission, while the other destination node needs to be invalidated; 所述一级轮转仲裁,仲裁来自不同源节点发起的请求,只要异步FIFO非空,每一拍都有请求参与仲裁且当拍生成仲裁结果;The first-level round-robin arbitration arbitrates requests initiated by different source nodes. As long as the asynchronous FIFO is not empty, there are requests participating in the arbitration in each beat and the arbitration result is generated in the beat; 所述输出状态机,控制一个源节点的数据包传输完成后,根据状态机状态切换轮转仲裁器的优先级,传输下一个源节点的数据包;The output state machine controls the priority of the round-robin arbiter to transmit the data packet of the next source node after the data packet transmission of a source node is completed according to the state of the state machine; 所述数据站台,保证数据传输为流水设计;The data station ensures that data transmission is a pipeline design; 所述包级轮转仲裁,仲裁不同虚通道的数据请求,且包级轮转仲裁器带有权重和仲裁使能;权重根据包的数量设置,分为Data0需要传输5包数据后切换优先级,Data1每包都切换优先级;仲裁使能定义为可写入下级站台。The packet-level round-robin arbitration arbitrates data requests of different virtual channels, and the packet-level round-robin arbiter has weights and arbitration enables; the weights are set according to the number of packets, and are divided into Data0, which needs to transmit 5 packets of data before switching priorities, and Data1 switches priorities for each packet; the arbitration enable is defined as being writable to the lower-level station. 
2. The network-on-chip system based on adaptive routing according to claim 1, characterized in that the asynchronous FIFO has a depth of 8 and a width of 322 bits; the lower 288 bits are data bits, comprising 256 bits of data and 32 bits of ECC; the upper 34 bits are sideband information, comprising SrcID, DstID, the data type TYPE, the MAF number, and an even parity check of the sideband information; there are two data packet formats: the first contains 4 flits, each carrying both sideband information and data; the second contains 5 flits, namely 1 header and 4 data flits, where the header carries only sideband information and its data bits are all 0, and each data flit carries only data and its sideband bits are all 0.
3. The network-on-chip system based on adaptive routing according to claim 1, characterized in that the decoding logic determines the destination node of the data from the DstID and TYPE fields of the sideband information of the data in the asynchronous FIFO; once the decode is valid, a Req request can be initiated to the destination node.
4. The network-on-chip system based on adaptive routing according to claim 1, characterized in that the input state machine controls the sending of virtual-channel data requests of the source node and comprises three states, ARB, TRANS1 and TRANS2; in the ARB state, the Req request is decoded; in the TRANS1 state, when the asynchronous FIFO is not empty, the data to be transmitted belongs to the whole packet decoded in the ARB state, and when the FIFO is empty, no data is transmitted; at most one flit is transmitted in this state; in the TRANS2 state, the remaining data is transmitted.
5. The network-on-chip system based on adaptive routing according to claim 1, characterized in that the read enable logic issues a read enable pulse to the source node when the round-robin arbiter of the output port outputs an arbitration grant signal to the input port; each time the source node receives a pulse, its credit value is incremented by 1; the initial credit value of the source node is the depth of the FIFO in the input port; in addition, when the generated signal is asserted (value 1), the read pointer of the FIFO is incremented by 1.
6. The network-on-chip system based on adaptive routing according to claim 1, characterized in that, when both destination nodes would transmit data from the same source node, the station invalidation logic delays the arbitration grant signals returned by the output ports of the two destination nodes by one cycle and then invalidates one of them, so that the whole data packet is transmitted only on the destination node whose arbitration grant remains valid.
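As a purely illustrative aid (not part of the claims), the 322-bit FIFO word of claim 2 can be modelled in Python as an integer split into a 34-bit sideband field and a 288-bit payload. The function names are hypothetical. Two points are assumptions that the claim does not state: that the 32 ECC bits sit directly above the 256 data bits, and that the even parity bit is counted inside the 34 sideband bits so that a valid sideband has an even number of ones.

```python
DATA_BITS = 256
ECC_BITS = 32
SIDEBAND_BITS = 34
PAYLOAD_BITS = DATA_BITS + ECC_BITS        # lower 288 bits
FLIT_BITS = PAYLOAD_BITS + SIDEBAND_BITS   # 322-bit FIFO width


def pack_flit(sideband, ecc, data):
    """Pack three fields into one 322-bit word (field order is an assumption)."""
    for value, width in ((sideband, SIDEBAND_BITS), (ecc, ECC_BITS), (data, DATA_BITS)):
        if not 0 <= value < (1 << width):
            raise ValueError("field does not fit in %d bits" % width)
    return (sideband << PAYLOAD_BITS) | (ecc << DATA_BITS) | data


def unpack_flit(flit):
    """Split a 322-bit word back into (sideband, ecc, data)."""
    data = flit & ((1 << DATA_BITS) - 1)
    ecc = (flit >> DATA_BITS) & ((1 << ECC_BITS) - 1)
    sideband = (flit >> PAYLOAD_BITS) & ((1 << SIDEBAND_BITS) - 1)
    return sideband, ecc, data


def even_parity_ok(sideband):
    """Assumed check: sideband including its parity bit has an even count of ones."""
    return bin(sideband).count("1") % 2 == 0
```

Under this model, the header flit of the 5-flit format is `pack_flit(sideband, 0, 0)`, and each of its data flits is `pack_flit(0, ecc, data)`.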
7. The network-on-chip system based on adaptive routing according to claim 1, characterized in that the first-level round-robin arbitration arbitrates requests for the same virtual channel from different source nodes; once the data of a source node has won arbitration, the priority is switched only after the whole packet of that source node has been transmitted; in addition, the round-robin arbiter has an arbitration enable, defined as the next-level station being writable.
8. The network-on-chip system based on adaptive routing according to claim 1, characterized in that the output state machine indicates the data transmission state of the virtual channels of the two destination nodes and comprises three states, ARB, TRANS1 and TRANS2; in the ARB state, the packet header is arbitrated; in the TRANS1 state, for the invalidated destination node, the current data is not actually transmitted to that destination node, whereas for the destination node that was not invalidated, the remaining data of the packet is transmitted throughout this state; in the TRANS2 state, the destination node that was invalidated in the TRANS1 state transmits the whole packet of the next source node.
9. The network-on-chip system based on adaptive routing according to claim 1, characterized in that the station ensures that data transmission is pipelined; a data packet from one source node must be transmitted contiguously to completion before the data packet of the next source node is transmitted; and one station is provided for each virtual channel of each node output port.
10. The network-on-chip system based on adaptive routing according to claim 1, characterized in that the packet-level round-robin arbiter arbitrates data packets of two different virtual channels and outputs the arbitration result in the same cycle as the request; while the arbiter is enabled, arbitration takes place in every cycle; when transmission has no bubbles, the arbitration grant signal is output in the same cycle as the request; when there is a bubble, no arbitration grant is generated; once the data of one virtual channel has won arbitration, a counter counts the number of packets transmitted on that virtual channel, and only when the counter reaches the weight of that virtual channel does the arbiter switch to the data of the next virtual channel; in addition, the round-robin arbiter has an arbitration enable, defined as the next-level station being writable.
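For illustration only (not part of the claims), the weighted packet-level arbitration of claim 10 can be sketched in Python with a per-channel packet counter, using the weights stated in claim 1 (5 packets for Data0, 1 packet for Data1). The class and method names are hypothetical, each call models the arbitration of one whole packet, and the behavior when the channel holding priority has no request (serve the other channel and restart the count there) is an assumption that the claims do not specify.

```python
class WeightedPacketArbiter:
    """Behavioral sketch of a weighted round-robin arbiter over virtual channels.

    Names and idle-channel behavior are illustrative assumptions.
    """

    def __init__(self, weights=(5, 1)):
        self.weights = weights   # packets per turn: index 0 = Data0, index 1 = Data1
        self.current = 0         # virtual channel that currently holds priority
        self.count = 0           # packets granted to that channel in this turn

    def grant(self, requests, enable=True):
        """Return the virtual channel granted one packet, or None.

        enable models the arbitration enable (next-level station writable).
        """
        if not enable or not any(requests):
            return None
        n = len(self.weights)
        for offset in range(n):
            vc = (self.current + offset) % n
            if requests[vc]:
                break
        if vc != self.current:
            # Assumed: an idle priority channel yields its turn to the requester.
            self.current, self.count = vc, 0
        self.count += 1
        if self.count >= self.weights[vc]:
            # Weight reached: hand priority to the next virtual channel.
            self.current, self.count = (vc + 1) % n, 0
        return vc
```

When both channels request continuously, this model grants five Data0 packets, then one Data1 packet, and repeats.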

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410948724.2A CN118509392B (en) 2024-07-16 2024-07-16 Network-on-chip system based on self-adaptive routing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410948724.2A CN118509392B (en) 2024-07-16 2024-07-16 Network-on-chip system based on self-adaptive routing

Publications (2)

Publication Number Publication Date
CN118509392A true CN118509392A (en) 2024-08-16
CN118509392B CN118509392B (en) 2024-11-19

Family

ID=92243508

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410948724.2A Active CN118509392B (en) 2024-07-16 2024-07-16 Network-on-chip system based on self-adaptive routing

Country Status (1)

Country Link
CN (1) CN118509392B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120110106A1 (en) * 2010-11-02 2012-05-03 Sonics, Inc. Apparatus and methods for on layer concurrency in an integrated circuit
CN102685017A (en) * 2012-06-07 2012-09-19 桂林电子科技大学 On-chip network router based on field programmable gate array (FPGA)
CN105871730A (en) * 2016-03-22 2016-08-17 广东工业大学 Novel compact, efficient and fast on-chip network router based on network coding
US20220103478A1 (en) * 2020-09-28 2022-03-31 Vmware, Inc. Flow processing offload using virtual port identifiers
CN115905103A (en) * 2022-12-01 2023-04-04 电子科技大学 A cross-chip interconnection system and adaptive routing method
CN117156006A (en) * 2023-11-01 2023-12-01 中电科申泰信息科技有限公司 Data route control architecture of network on chip

Also Published As

Publication number Publication date
CN118509392B (en) 2024-11-19

Similar Documents

Publication Publication Date Title
US8085801B2 (en) Resource arbitration
US7277449B2 (en) On chip network
Chen et al. The IBM Blue Gene/Q interconnection network and message unit
US7051150B2 (en) Scalable on chip network
US6996651B2 (en) On chip network with memory device address decoding
US7200137B2 (en) On chip network that maximizes interconnect utilization between processing elements
US6751684B2 (en) System and method of allocating bandwidth to a plurality of devices interconnected by a plurality of point-to-point communication links
US7143219B1 (en) Multilevel fair priority round robin arbiter
US6691192B2 (en) Enhanced general input/output architecture and related methods for establishing virtual channels therein
US7139860B2 (en) On chip network with independent logical and physical layers
US20090300292A1 (en) Using criticality information to route cache coherency communications
US7526626B2 (en) Memory controller configurable to allow bandwidth/latency tradeoff
US6674720B1 (en) Age-based network arbitration system and method
CN103810133B (en) Method and apparatus for managing the access to sharing read buffer resource
CN101841420B (en) Low Latency Router Architecture for Network-on-Chip
CN102185751B (en) One-cycle router on chip based on quick path technology
CN112152932B (en) Network-on-chip routing control method, network-on-chip router and readable storage medium
US20140281081A1 (en) Proactive quality of service in multi-matrix system bus
US20060190641A1 (en) Buffer management in packet switched fabric devices
CN117156006B (en) Data route control architecture of network on chip
US7796624B2 (en) Systems and methods for providing single-packet and multi-packet transactions in an integrated circuit
TW200407712A (en) Configurable multi-port multi-protocol network interface to support packet processing
US7657682B2 (en) Bus interconnect with flow control
US7039750B1 (en) On-chip switch fabric
CN118509392A (en) Network-on-chip system based on self-adaptive routing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant