CN115550281B - Resource scheduling method and architecture for AWGR (arrayed waveguide grating router) optical switching data center network - Google Patents
- Publication number
- CN115550281B (CN202211520300.3A)
- Authority
- CN
- China
- Prior art keywords
- rack
- module
- data
- optical
- scheduler
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H04L47/6275: Queue scheduling characterised by scheduling criteria for service slots or service orders based on priority
- H04L47/6255: Queue scheduling characterised by scheduling criteria for service slots or service orders queue load conditions, e.g. longest queue first
- H04L47/6295: Queue scheduling characterised by scheduling criteria using multiple queues, one for each individual QoS, connection, flow or priority
- H04Q11/0062: Selecting arrangements for multiplex systems using optical switching; Network aspects
- H04Q2011/0075: Wavelength grouping or hierarchical aspects
- H04Q2011/0086: Network resource allocation, dimensioning or optimisation
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a resource scheduling method for an AWGR (arrayed waveguide grating router) optical switching data center network. A scheduler collects flow information, sorts the flows, and examines them in sorted order to determine whether each flow can currently be sent; a flow that can be sent is added to a result list. Once the length of the result list has been confirmed, the scheduler issues a scheduling instruction. The corresponding architecture comprises racks, a scheduler and arrayed waveguide grating routers, the scheduler and the arrayed waveguide grating routers being connected to the racks; each rack comprises a top-of-rack switch and a plurality of servers, and the servers communicate through the data switching topology of the top-of-rack switch. By allocating time slots and wavelengths, the scheduling method addresses optical packet collisions and limited optical wavelength resources in the optical switching network; its scheduling strategy addresses the timely allocation of optical wavelength resources and the retransmission of collided optical packets; and its staggered time slot method avoids the waiting delay caused by the request-response process.
Description
Technical Field
The invention relates to the technical field of communication, and in particular to a resource scheduling method and architecture for an AWGR (arrayed waveguide grating router) optical switching data center network.
Background
With the rapid development of cloud computing, mobile internet and streaming media, data center network traffic is growing exponentially, which places ever higher demands on data center performance. Currently, data centers use electrical switches to interconnect tens of thousands or even millions of servers, which meet the real-time dynamic needs of users through frequent interaction and data storage. Due to the limitations of packaging technology, the bandwidth of the I/O ports of electrical switches grows far more slowly than network traffic demand. Meanwhile, traditional data centers adopt a network architecture built from electrical switches, so frequent optical-electrical-optical conversion is needed for communication between switches, which consumes a large amount of energy. Data packets also experience queuing and processing latency at every switch stage they traverse, which greatly increases data transmission latency. Data center networks based on conventional electrical switching therefore face great challenges, in both switching nodes and network architecture, from rapidly increasing traffic. To overcome these drawbacks, new optical switching data center network architectures are attracting growing attention.
Because optical switches are transparent to the data modulation format and transmission rate, they offer extremely large I/O port switching bandwidth. Compared with the traditional multi-layer electrical switching architecture, a flattened optical switching architecture with highly distributed control provides large capacity and scalability while shortening controller processing time, and it improves overall network performance in terms of throughput, delay and so on. An arrayed waveguide grating router (AWGR) is a passive optical device that cyclically routes each wavelength arriving at a given input port to a specific output port; it is used together with fixed wavelength lasers to form the optical interconnect switching fabric of a data center. Because there is no optical buffering, when two or more source ports (wavelengths) send data to the same destination port at the same time, optical collision and packet loss occur at the destination port. In a small-scale data center optical network, a fully connected mode can be adopted, in which every pair of transceiver ends has a dedicated wavelength for data transmission. However, because of the inherent characteristics of the AWGR, its number of ports is limited, and a fully connected network structure cannot interconnect a large-scale data center network.
To solve the above wavelength collision problem, a scheduling scheme is required to schedule data transmission through the AWGR so that data packets do not collide. To accommodate the growth of data centers despite limited wavelength resources, time slot division is used to arrange the orderly transmission of wavelengths. Some current static scheduling schemes use polling to schedule forwarding time periods and wavelengths. That is, assuming that n wavelengths can transmit data and m ports need to transmit data (n < m), ports 1 to n of the m ports are selected to transmit data in the current time period, ports n+1 to 2n are selected in the next time period, and so on; once the last transmitting port has been selected, the cycle starts again from the first transmitting port.
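The polling scheme described above can be illustrated with a minimal sketch. The function name `polling_schedule` and the example values (6 ports, 4 wavelengths) are hypothetical and serve only to show how the selection wraps around; they are not taken from any cited scheme.

```python
def polling_schedule(num_ports: int, num_wavelengths: int, num_periods: int) -> list[list[int]]:
    """Return, for each time period, the 1-based ports allowed to transmit."""
    schedule = []
    next_port = 0  # 0-based index of the first port to serve in the next period
    for _ in range(num_periods):
        # Serve num_wavelengths consecutive ports, wrapping to port 1 after port m.
        selected = [(next_port + k) % num_ports + 1 for k in range(num_wavelengths)]
        schedule.append(selected)
        next_port = (next_port + num_wavelengths) % num_ports
    return schedule


# Hypothetical example: m = 6 ports share n = 4 wavelengths.
schedule = polling_schedule(num_ports=6, num_wavelengths=4, num_periods=3)
for period, ports in enumerate(schedule):
    print(f"period {period}: ports {ports}")
```

Note that the selection depends only on the period index, not on how much data each port holds, which is the inflexibility that the background section criticizes.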
Disadvantages of the fully connected mode: the wavelength collision problem of an AWGR data center network can be solved with a fully connected mode, in which every pair of transceiver ends is interconnected by a dedicated physical wavelength channel, but this interconnection mode occupies a large number of AWGR ports. Due to the limitations of the manufacturing process, the number of AWGR ports cannot grow without limit as the network scales, so this interconnection mode can only be applied to very small data center networks.
Disadvantages of the static scheduling scheme: in the static scheduling scheme, forwarding time slots and wavelengths are arranged by polling, so that ports that are likely to collide are allocated different time slots or wavelengths as far as possible; however, optical bandwidth cannot be flexibly allocated on demand in real time. Meanwhile, in the request-response process of the scheduling scheme, data transmission must wait for the response, which increases the waiting delay of data transmission.
Chinese patent CN105959163A discloses a software-defined passive optical interconnection network structure and data communication method, which can reduce power consumption and delay and improve reliability, but its application scenarios are limited and real-time requirements cannot be guaranteed.
Disclosure of Invention
In view of the above problems, the present invention aims to provide a resource scheduling method and architecture for an AWGR optical switching data center network, so as to solve the wavelength collision problem in AWGR-based optical switching data centers, the inability of static scheduling schemes to flexibly allocate optical bandwidth on demand in real time, and the waiting delay caused by the request-response process in the scheduling scheme. In order to achieve the above purpose, the present invention provides the following technical solution: a resource scheduling method for an AWGR optical switching data center network, comprising the following steps:
S1, each top-of-rack switch collects the occupancy of each buffer in each group, together with the source node and destination node corresponding to each buffer, then sends a request to the scheduler and waits for a reply command; the scheduler receives the requests sent by the top-of-rack switches and parses the flow information;
S2, the scheduler stores the collected flow information in a priority queue, which sorts the flows by traffic volume, with larger flows placed first;
S3, the elements of the priority queue are traversed in order, and each flow is analyzed to determine whether it can currently be sent; if so, the method proceeds to the next step, otherwise S3 is repeated for the next flow;
S4, the current flow is added to a result array; if the length of the result array reaches the upper limit or the priority queue is empty, the calculation stops, otherwise S3 is executed again;
S5, the scheduler issues control commands to each top-of-rack switch according to the contents of the result array; after receiving its command from the scheduler, each top-of-rack switch controls the corresponding buffers of its sending modules to send data in a staggered time slot manner.
Preferably, the different buffers in the top-of-rack switch are grouped, and the buffers in a group share one sending module, so that flows destined for different destinations are sent in different time slots.
Preferably, the staggered time slots are specifically as follows: data transmission starts at time t1 of time slot N and ends at time t2; at this moment the buffer module of the top-of-rack switch collects the buffer status and sends a request to the scheduler; the scheduler completes scheduling at time t3 and sends a transmission command to each top-of-rack switch, and data transmission of time slot N+1 starts at time t3, completing one cycle.
Preferably, step S3 further comprises: while the elements of the priority queue are traversed in order, the states of the sending modules and receiving modules in each top-of-rack switch are stored in two dictionary arrays, map_T and map_R. If the corresponding element values in map_T and map_R are both 0, the sending module and the receiving module are unoccupied and the flow can be sent; if the element value in map_T or map_R is 1, the sending module or the receiving module is occupied and the flow cannot be sent. The number of sending modules is the same as the number of receiving modules.
Preferably, S4 is specifically as follows: the calculation stops if the length of the result array reaches M×N, meaning that the number of data flows that can be sent has reached its upper limit, or if the length of the result array is still less than M×N but the priority queue is empty; here N is the number of top-of-rack switches, M is the number of sending modules in each top-of-rack switch, and at most M×N flows can be sent simultaneously.
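Steps S2 to S4, together with the map_T / map_R check and the M×N stop condition, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Flow` record, its field names, and the assumption that each request already identifies the sending and receiving module it needs are all hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    src_tor: int  # source top-of-rack switch index (assumed field)
    tx: int       # sending module index at the source (assumed field)
    dst_tor: int  # destination top-of-rack switch index (assumed field)
    rx: int       # receiving module index at the destination (assumed field)
    size: int     # buffered traffic volume reported in the request


def schedule_slot(flows, num_tors, modules_per_tor):
    """Select the flows to send in the next time slot (steps S2 to S4)."""
    # S2: priority queue ordered by flow size, largest first.
    # The index i breaks ties so Flow objects are never compared.
    heap = [(-f.size, i, f) for i, f in enumerate(flows)]
    heapq.heapify(heap)
    # map_t / map_r: 0 means the module is idle, 1 means it is occupied.
    map_t = [[0] * modules_per_tor for _ in range(num_tors)]
    map_r = [[0] * modules_per_tor for _ in range(num_tors)]
    result = []
    limit = num_tors * modules_per_tor  # at most M x N simultaneous flows
    # S3/S4: stop when the result array is full or the queue is empty.
    while heap and len(result) < limit:
        _, _, flow = heapq.heappop(heap)
        if map_t[flow.src_tor][flow.tx] == 0 and map_r[flow.dst_tor][flow.rx] == 0:
            map_t[flow.src_tor][flow.tx] = 1
            map_r[flow.dst_tor][flow.rx] = 1
            result.append(flow)
    return result


# Hypothetical requests: two flows contend for receiver 0 of ToR 1.
requests = [
    Flow(0, 0, 1, 0, 500),
    Flow(2, 0, 1, 0, 900),
    Flow(0, 1, 2, 1, 300),
    Flow(0, 0, 3, 0, 100),
]
chosen = schedule_slot(requests, num_tors=4, modules_per_tor=2)
print([f.size for f in chosen])
```

In this example the 500-unit flow loses the contested receiver to the larger 900-unit flow and must wait for a later slot, while the 100-unit flow can still use the sending module that the skipped flow left idle.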
The architecture for the resource scheduling method for the AWGR optical switching data center network comprises racks, a scheduler and arrayed waveguide grating routers, the scheduler and the arrayed waveguide grating routers being connected to the racks; a top-of-rack switch and a plurality of servers are arranged in each rack, and the servers communicate through the data switching topology of the top-of-rack switch.
Preferably, the arrayed waveguide grating router has a 4×4 structure and routes optical signals to the corresponding output ports in a cyclic wavelength routing manner, and its 4 input ports input signals of different wavelengths. The number of sending modules in each top-of-rack switch is fixed, and each sending module in a given top-of-rack switch can only send its own wavelength λ, different from those of the other sending modules. Because the arrayed waveguide grating router is used as the optical switching node, the signal of each wavelength at each input port reaches a fixed output port. The sending modules are responsible for sending the data packets in the buffer modules as optical signals of different wavelengths;
wherein, in $\lambda_w^i$, λ indicates that the signal is an optical signal with a specific wavelength, the superscript i is the input port number of the signal, and the subscript w is the wavelength number of the signal.
Preferably, the internal structure of the top-of-rack switch comprises an ethernet switch module; buffer modules connect the ethernet switch module to the sending modules and to the receiving modules, and a buffer module also connects the ethernet switch module to the servers. Data from a server in the rack is uploaded to the ethernet switch module, which dispatches it according to the destination of the data packet: if the destination is a server in the same rack, the data is forwarded directly to that server; if the destination is a server outside the rack, the data is uploaded to the buffer module corresponding to the sending module, where it waits for a control instruction before being transmitted.
Preferably, number of sending modules = number of top-of-rack switches / number of time slots = number of wavelengths;
number of buffer modules = number of top-of-rack switches / number of wavelengths = number of arrayed waveguide grating routers;
number of wavelengths = number of arrayed waveguide grating router ports.
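The three sizing relations above can be checked numerically. The sketch below assumes that the number of top-of-rack switches and the AWGR port count are the given quantities and derives the rest; the function name `derive_dimensions` and the example of 16 top-of-rack switches are hypothetical.

```python
def derive_dimensions(num_tors: int, awgr_ports: int) -> dict[str, int]:
    """Derive the remaining quantities from the stated sizing relations."""
    num_wavelengths = awgr_ports   # wavelengths = AWGR ports
    num_tx = num_wavelengths       # sending modules = wavelengths
    if num_tors % num_wavelengths != 0:
        raise ValueError("number of ToRs must be a multiple of the wavelength count")
    num_slots = num_tors // num_tx             # from Tx = ToRs / slots
    num_buffers = num_tors // num_wavelengths  # buffers = ToRs / wavelengths
    num_awgrs = num_buffers                    # AWGRs = buffers
    return {
        "wavelengths": num_wavelengths,
        "tx_modules": num_tx,
        "time_slots": num_slots,
        "buffers": num_buffers,
        "awgrs": num_awgrs,
    }


# Hypothetical example: 16 top-of-rack switches with 4x4 AWGRs.
dims = derive_dimensions(num_tors=16, awgr_ports=4)
print(dims)
```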
Compared with the prior art, the invention has the beneficial effects that:
By allocating time slots and wavelengths, the invention solves the problems of optical packet collision and limited optical wavelength resources in an optical switching network. Through its scheduling strategy, the invention solves problems that static scheduling schemes cannot, namely the timely allocation of optical wavelength resources and the retransmission delay of collided packets caused by optical packet contention; and by adopting the staggered time slot method, it avoids the waiting delay caused by the request-response process in traditional scheduling schemes.
Drawings
Fig. 1 is a flowchart of the data center network resource scheduling method according to the present invention.
Fig. 2 is a diagram showing the internal structure of the rack of the present invention.
Fig. 3 is a schematic diagram of an AWGR-oriented optical switching data center network according to the present invention.
Fig. 4 is a schematic diagram of optical signal flow through a 4x4 arrayed waveguide grating according to the present invention.
Fig. 5 is an internal structure diagram of an ethernet switch module according to the present invention.
Fig. 6 is a diagram of the staggered time slots and conventional scheduling time slots of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "configured" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art. Hereinafter, an embodiment of the present invention will be described in accordance with its entire structure.
The invention addresses the wavelength collision problem of AWGR-based optical switching data center networks. It performs non-polling scheduling according to the occupancy of the buffers at the sending ports; this scheduling scheme solves the packet contention problem and allows optical bandwidth to be allocated flexibly in the optical data center network. Finally, a "staggered time slot" method is used to avoid the waiting delay caused by the "request-response" process in the scheduling strategy.
The AWGR-based data center network architecture mainly comprises a scheduler, top-of-rack switches (ToR), arrayed waveguide grating routers (AWGR) and servers. As shown in fig. 2, each rack consists of K servers and a top-of-rack switch (ToR), and the K servers communicate with other servers inside the rack or with servers outside the rack through the top-of-rack switch (ToR). Fig. 3 shows the inter-rack data switching topology, which is the core of the present invention; it consists of N racks, a scheduler and a number of AWGRs. The AWGR is a passive optical device with a cyclic wavelength routing feature and can be used to build an optical interconnection network. An AWGR is typically used together with tunable lasers or fixed wavelength lasers; in the present invention, all-optical switching is implemented in combination with fixed wavelength lasers. The combination of AWGRs and fixed wavelength lasers has the advantages of high capacity, low processing delay and low insertion loss. Fig. 4 is a schematic diagram of a 4x4 AWGR structure, in which 4 input ports input signals of different wavelengths; in $\lambda_w^i$, λ denotes an optical signal with a specific wavelength, the superscript i is the input port number of the signal, and the subscript w is the wavelength number of the signal. The AWGR routes optical signals to the corresponding output ports in a cyclic wavelength routing manner.
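Cyclic wavelength routing in a 4x4 AWGR can be illustrated with a short sketch. The text does not give the exact port mapping, so the rule used here, output port = (input port + wavelength index) mod 4, is an assumed convention chosen only to show the cyclic property; a real device may use a different but equivalent permutation.

```python
NUM_PORTS = 4  # 4x4 AWGR, as in fig. 4


def awgr_output_port(input_port: int, wavelength: int, num_ports: int = NUM_PORTS) -> int:
    """Assumed cyclic rule: wavelength w entering input i exits at (i + w) mod N."""
    return (input_port + wavelength) % num_ports


# routing_table[i][w] is the output port reached by wavelength w from input i.
routing_table = [
    [awgr_output_port(i, w) for w in range(NUM_PORTS)] for i in range(NUM_PORTS)
]
for i, row in enumerate(routing_table):
    print(f"input {i}: outputs per wavelength {row}")
```

Under this convention each wavelength from a given input reaches a different output, and the same wavelength from different inputs never lands on the same output, which is why the route of a flow is fixed once its input port and wavelength are known.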
The top-of-rack switch (ToR) is implemented on an FPGA, and its internal modules are shown in fig. 5. Data to be forwarded to other servers is uploaded from the servers in the rack to the ethernet switch module, which dispatches it according to the destination of the data packet: if the destination is another server in the same rack, the data is forwarded directly; if the destination is a server outside the rack, the data is uploaded to the corresponding buffer of a sending module (Tx). The buffers in the ToR that store data for different destinations are grouped; the buffers in a group share one sending module (Tx) and therefore cannot send at the same time. By staggering the time slots, flows destined for different destinations are sent in different time slots. The sending module (Tx) sends the data packets in the buffers as optical signals on a given wavelength. The number of sending modules (Tx) in each ToR is fixed, and each sending module (Tx) can only send one wavelength λ, different from the others. Because an AWGR is used as the optical switching node, the output port reached by each wavelength from each input port is fixed; that is, the route of each flow sent on an optical wavelength is fixed. When data is forwarded at the ToR, it is forwarded according to a routing table configured in advance.
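The forwarding decision inside the ToR can be sketched in software, although the text describes an FPGA implementation. The class `TorSwitch`, the packet layout `(dst_rack, dst_server, payload)` and the per-destination buffers are hypothetical simplifications, not details from the patent.

```python
from collections import defaultdict, deque


class TorSwitch:
    """Toy model of the ToR dispatch logic; not the FPGA design itself."""

    def __init__(self, rack_id, local_servers):
        self.rack_id = rack_id
        self.local_servers = set(local_servers)
        self.delivered = []                   # packets forwarded inside the rack
        self.tx_buffers = defaultdict(deque)  # destination rack -> buffer

    def forward(self, packet):
        dst_rack, dst_server, _payload = packet
        if dst_rack == self.rack_id and dst_server in self.local_servers:
            # Intra-rack traffic is forwarded directly to the destination server.
            self.delivered.append(packet)
            return "local"
        # Inter-rack traffic waits in a buffer until the scheduler allows sending.
        self.tx_buffers[dst_rack].append(packet)
        return "buffered"

    def release(self, dst_rack):
        """Drain one buffer after a control command from the scheduler."""
        buf = self.tx_buffers[dst_rack]
        sent = list(buf)
        buf.clear()
        return sent


tor = TorSwitch(rack_id=0, local_servers=["s0", "s1"])
first = tor.forward((0, "s1", b"intra-rack"))
second = tor.forward((2, "s5", b"inter-rack"))
released = tor.release(2)
print(first, second, released)
```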
When scheduling starts, the scheduler first collects the flow information stored by the buffer modules in each top-of-rack switch (ToR). The scheduler sorts all the flow information and then analyzes the flows in order: if a flow can be sent, it is added to the result array; if it cannot be sent, the next flow is analyzed. When the result array is full, the analysis ends, and control commands are issued to each ToR according to the contents of the result array. After receiving its command from the scheduler, the ToR controls the corresponding buffers of its sending modules (Tx) to send data according to the command. The ToR also has receiving modules (Rx), equal in number to the sending modules (Tx).
Here, number of wavelengths = number of sending modules (Tx) = number of top-of-rack switches (ToR) / number of time slots;
number of buffers = number of top-of-rack switches (ToR) / number of wavelengths;
number of AWGRs = number of buffers; number of AWGR ports = number of wavelengths.
The flowchart of the algorithm of the invention is shown in fig. 1. The scheduling mechanism is carried out mainly by a scheduler implemented on an FPGA, and the scheduling algorithm proceeds as follows:
Each top-of-rack switch (ToR) first collects the occupancy of each buffer in each group, together with the source node and destination node corresponding to each buffer. The ToR then sends a request to the scheduler and waits for a command before sending data. The scheduler receives the requests from the ToRs and parses the flow information from them for the subsequent operations.
Let the number of ToRs be N, and let each ToR have M sending modules (Tx) (and likewise M receiving modules), so that at most M×N flows can be sent simultaneously. The scheduler stores the collected information about these flows in a priority queue, which sorts the flows by size, with larger flows placed first.
The elements of the priority queue are then traversed in order, and the states of the transceivers in each ToR are stored in two dictionary arrays, map_T and map_R. For example, map_T[1] represents the states of the M sending modules (Tx) in ToR 1: if the value of map_T[1][Tx] is 0, the sending module (Tx) is currently unoccupied; if the value is 1, it is already occupied, and any later flow in the priority queue that needs this sending module (Tx) cannot be sent in this time slot and must wait for another time slot. Similarly, map_R[1] represents the states of the M receiving modules (Rx) in ToR 1: if the value of map_R[1][Rx] is 0, the receiving module (Rx) is unoccupied; if the value is 1, it is already occupied, and any later flow in the priority queue that must be received by this receiving module (Rx) cannot be sent in this time slot and must wait for another time slot.
Therefore, whether a flow can be sent is determined by the states of the sending module (Tx) and the receiving module (Rx): the necessary and sufficient condition for a flow to be sent is that the sending module (Tx) and the receiving module (Rx) it requires are both idle.
Each flow that can be sent is put into the result array. The calculation stops when the length of the array reaches the upper limit on the number of data flows that can be sent, or when the number of flows that can be sent is still less than M×N but the priority queue is empty.
Finally, the scheduler issues control instructions to the ToRs according to the calculation result; each ToR determines from its instruction which buffers may send their flows, and then opens the corresponding buffers to send data.
The commands issued at the beginning of each time slot are calculated at the end of the previous time slot. As shown in fig. 6, data transmission starts at time t1 of time slot N and ends at time t2; at this moment the buffer module of the top-of-rack switch (ToR) collects the buffer status and sends a request to the scheduler; at time t3 the scheduler completes scheduling and sends a transmission command to each ToR, and data transmission of time slot N+1 then starts at time t3. That is, network traffic information is collected and scheduling is performed as each time slot nears its end. Because the scheduling time is short compared with the time slot length, the change in traffic information during the scheduling time is negligible. For example, assume that each rack holds 40 servers and that each server has a bandwidth of 10 Gbps. The scheduling time is on the order of nanoseconds; if it is 5 ns, the maximum amount of data generated during scheduling is $40 \times 10 \times 10^{9} \times 5 \times 10^{-9} = 2000$ bits. Even in the extreme case where all of this data is destined for the same ToR in the cluster, the amount of data added to one buffer is negligible compared with the buffer length. The information is collected and the calculation performed in advance, and a new control command is issued to the ToRs before the next time slot begins, so the staggered time slot scheduling method also ensures the real-time performance of the scheduling.
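The estimate above can be reproduced with integer arithmetic. The sketch uses the figures given in the text (40 servers, 10 Gbps each, 5 ns scheduling time); the variable names are illustrative.

```python
servers_per_rack = 40
link_rate_bps = 10 * 10**9  # 10 Gbps per server
scheduling_time_ns = 5      # scheduling time stated as an example in the text

# Bits generated by one rack while the scheduler is computing:
# rate (bit/s) * time (ns) / 1e9 (ns per second).
max_bits = servers_per_rack * link_rate_bps * scheduling_time_ns // 10**9
max_bytes = max_bits // 8

print(f"{max_bits} bits ({max_bytes} bytes) per rack during scheduling")
```

At most a few hundred bytes arrive per rack while the scheduler runs, which supports the claim that traffic changes during the scheduling interval can be ignored.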
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof, and accordingly, the embodiments are to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Claims (6)
1. A resource scheduling method for an AWGR-based optical switching data center network, characterized by comprising the following steps:
S1, each top-of-rack switch collects the occupancy of each buffer in each group, together with the source node and destination node corresponding to each buffer, then sends a request to a scheduler and waits to receive a reply command; the scheduler receives the requests transmitted by the top-of-rack switches and parses the flow information;
S2, the scheduler stores the collected flow information in a priority queue, and the priority queue sorts the flows by the traffic volume of each flow, with larger flows placed first;
S3, the elements of the priority queue are traversed in order, and each flow is analyzed to determine whether the current flow can be sent; if so, the next step is carried out; otherwise, step S3 is executed again;
step S3 further comprises: while traversing the elements of the priority queue in order, storing the states of the sending modules and receiving modules in each top-of-rack switch in two dictionary arrays, map_T and map_R; if the corresponding element values in map_T and map_R are both 0, the sending module and the receiving module are unoccupied and the flow can be sent; if the element value in map_T or map_R is 1, the sending module or the receiving module is occupied and the flow cannot be sent; wherein the number of sending modules is the same as the number of receiving modules;
S4, the current flow is added to a result array; if the length of the result array reaches the upper limit on the number of flows, or the priority queue is empty, the calculation stops; otherwise, S3 is executed again;
step S4 specifically comprises: the calculation stops if the length of the result array reaches M×N, meaning that the number of data flows that can be sent has reached its upper limit, or if the length of the result array is less than M×N at that moment but the priority queue is empty; wherein the number of top-of-rack switches is N, the number of sending modules in each top-of-rack switch is M, and at most M×N flows can be sent simultaneously;
S5, the scheduler issues control commands to each top-of-rack switch according to the information in the result array; after receiving its command from the scheduler, each top-of-rack switch controls the corresponding buffers in its sending modules to send data in a staggered-time-slot manner;
the staggered time slots are specifically: data transmission starts at time t1 of time slot N and ends at time t2; at that point the buffer module of each top-of-rack switch collects its buffer information and sends a request to the scheduler; the scheduler completes scheduling at time t3 and sends a transmission command to each top-of-rack switch; data transmission of time slot N+1 is carried out at time t3, and the cycle repeats.
2. The resource scheduling method for an AWGR optical switching data center network according to claim 1, wherein the buffers in the top-of-rack switch are grouped, and the buffers in a group share one sending module, so that flows destined for different destinations are sent in different time slots.
3. An architecture for the resource scheduling method for an AWGR optical switching data center network according to any one of claims 1-2, comprising a rack, a scheduler, and an arrayed waveguide grating connected to the rack, wherein the rack comprises a top-of-rack switch and a plurality of servers, and the servers communicate with each other through the data switching topology of the top-of-rack switch.
4. The architecture of claim 3, wherein the arrayed waveguide grating is a 4×4 array; the arrayed waveguide grating routes optical signals to the corresponding output ports by cyclic wavelength routing; the 4 input ports of the arrayed waveguide grating receive signals of different wavelengths; the number of sending modules in each top-of-rack switch is fixed, and each sending module in the same top-of-rack switch transmits only a unique wavelength λ different from those of the other sending modules; the arrayed waveguide grating serves as the optical switching node, so that signals of different wavelengths from each input port reach fixed output ports; and the sending modules are responsible for sending the data packets in the buffer modules in the form of optical signals of different wavelengths.
5. The architecture of claim 3, wherein the internal structure of the top-of-rack switch comprises an Ethernet switching module, buffer modules, sending modules and receiving modules; the buffer modules are disposed between the Ethernet switching module and the sending and receiving modules and connect them; the Ethernet switching module is disposed between the servers and the buffer modules and is connected to the buffer modules; data is uploaded from a server in the rack to the Ethernet switching module, which dispatches it according to the destination of each data packet: if the destination is a server in the same rack, the data is forwarded directly to the destination server in the rack; if the destination is an external server, the data is uploaded to the buffer module corresponding to the sending module and waits for a control instruction before being transmitted.
6. The architecture of claim 3, wherein number of sending modules = number of top-of-rack switches / number of time slots = number of wavelengths;
number of buffer modules = number of top-of-rack switches / number of wavelengths = number of arrayed waveguide gratings;
number of wavelengths = number of arrayed waveguide grating ports.
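The cyclic wavelength routing recited in claim 4 can be illustrated with a short Python sketch. The claims do not state the exact port mapping, so the rule used here, output port = (input port + wavelength index) mod 4, is an assumption based on one common convention for cyclic AWGRs; the function names are likewise hypothetical.

```python
def awgr_output_port(input_port: int, wavelength: int, ports: int = 4) -> int:
    """Assumed cyclic routing rule: the output port depends only on the
    input port and the wavelength index, wrapping around modulo the port count."""
    return (input_port + wavelength) % ports


def routing_table(ports: int = 4):
    """Row i, column w gives the output port reached from input i on wavelength w."""
    return [[awgr_output_port(i, w, ports) for w in range(ports)]
            for i in range(ports)]


table = routing_table()
for i, row in enumerate(table):
    print(f"input {i}: " + ", ".join(f"w{w}->out{o}" for w, o in enumerate(row)))
```

Under this assumed rule the table is a Latin square: each input port reaches every output port by choosing a different wavelength, and signals on the same wavelength from different input ports never arrive at the same output port. This is consistent with the claim that each sending module in a top-of-rack switch uses its own wavelength and reaches a fixed output port.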
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211520300.3A CN115550281B (en) | 2022-11-30 | 2022-11-30 | Resource scheduling method and architecture for AWGR (arrayed waveguide grating router) optical switching data center network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115550281A (en) | 2022-12-30 |
CN115550281B (en) | 2023-04-28 |
Family
ID=84721644
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211520300.3A Active CN115550281B (en) | 2022-11-30 | 2022-11-30 | Resource scheduling method and architecture for AWGR (arrayed waveguide grating router) optical switching data center network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115550281B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103441942A (en) * | 2013-08-26 | 2013-12-11 | 重庆大学 | Data center network system and data communication method based on software definition |
WO2016045055A1 (en) * | 2014-09-25 | 2016-03-31 | Intel Corporation | Network communications using pooled memory in rack-scale architecture |
WO2022022271A1 (en) * | 2020-07-29 | 2022-02-03 | 浙江大学 | Unbuffered optical interconnection architecture for data center and method |
CN114827782A (en) * | 2022-04-25 | 2022-07-29 | 南京航空航天大学 | Flow group scheduling method in photoelectric hybrid data center network |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105162721B (en) * | 2015-07-31 | 2018-02-27 | 重庆大学 | Full light network data centre network system and data communications method based on software defined network |
US10194222B2 (en) * | 2016-10-17 | 2019-01-29 | Electronics And Telecommunications Research Institute | Packet-based optical signal switching control method and apparatus |
CN106941633B (en) * | 2017-02-20 | 2021-01-15 | 武汉邮电科学研究院 | SDN-based all-optical switching data center network control system and implementation method thereof |
CN110113271B (en) * | 2019-04-04 | 2021-04-27 | 中国科学院计算技术研究所 | MPI application acceleration system and method based on photoelectric hybrid switching network |
CN113709606B (en) * | 2021-08-31 | 2022-07-12 | 山东大学 | Flexible reconfigurable optical switching node system facing elastic optical network and working method |
Also Published As
Publication number | Publication date |
---|---|
CN115550281A (en) | 2022-12-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8223759B2 (en) | High-capacity data switch employing contention-free switch modules | |
CN101917331B (en) | Systems, methods, and apparatus for a data centre | |
US20080069125A1 (en) | Means and apparatus for a scalable congestion free switching system with intelligent control | |
US20090097497A1 (en) | Flexible bandwidth allocation in high-capacity packet switches | |
CN105162721A (en) | All-optical interconnection data center network system based on software defined network and data communication method | |
US9634960B2 (en) | Petabits-per-second packet switch employing cyclically interconnected switch units | |
JP2001007822A (en) | Data flow control switch and its scheduling method | |
US20050078666A1 (en) | Temporal-spatial burst switching | |
WO2006130130A1 (en) | Scheduling method and system for optical burst switched networks | |
Baziana et al. | Collision-free distributed MAC protocol for passive optical intra-rack data center networks | |
CN100428660C (en) | Optical burst exchange node with internal acceleration | |
CN115550281B (en) | Resource scheduling method and architecture for AWGR (arrayed waveguide grating router) optical switching data center network | |
Chunming et al. | Polymorphic Control for Cost‐Effective Design of Optical Networks | |
CN101193050B (en) | A method for data receiving and transmitting of core node switching device in optical sudden network | |
Wang et al. | Efficient protocols for multimedia streams on WDMA networks | |
Yang et al. | ABOI: AWGR-Based optical interconnects for single-wavelength and multi-wavelength | |
CN115086185B (en) | Data center network system and data center transmission method | |
Lin et al. | CORNet: A scalable and bandwidth-efficient optical burst switching ring architecture for metro area networks | |
Che et al. | Switched optical star-topology network with edge electronic buffering and centralized control | |
Kam et al. | Real-time distributed scheduling algorithm for supporting QoS over WDM networks | |
Binh et al. | Improved WDM packet switch architectures with output channel grouping | |
CA2570834C (en) | Scalable router-switch | |
Zhao et al. | Contention resolution mechanisms for asynchronous optical packet switching based high performance computing system | |
Okorafor | Design and analysis of a 3-dimensional cluster multicomputer architecture using optical interconnection for petaFLOP computing | |
Peng et al. | On the design of node architectures and MAC protocols for optical burst-switched ring networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||