Network transmit batching
The networking developers are always looking for ways to squeeze a little more performance from their code. Krishna Kumar took a look at the behavior described above and wondered: why not pass the list of accumulated packets to the driver in a single call? Batching transmission operations in this way could spread the cost of locking and device preparation across several packets, making packet transmission as a whole more efficient. To explore this idea, Krishna has posted a few versions of the SKB batching patch set.
Implementing SKB batching requires a couple of driver API changes - but they are small and only required for batching-aware drivers. The first step is to set the NETIF_F_BATCH_SKBS bit in the features field of the net_device structure. That flag tells the network stack that the driver can handle batched transmissions.
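As a rough illustration of that first step, the sketch below uses a userspace stand-in for struct net_device. The bit value chosen for NETIF_F_BATCH_SKBS and the mydrv_setup() and dev_can_batch() helpers are assumptions made for this example; none of them are taken from the patch set itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value; the real one is whatever the patch set defines. */
#define NETIF_F_BATCH_SKBS (1u << 20)

/* Userspace stand-in for the kernel's struct net_device. */
struct net_device {
    uint32_t features;
};

/* What a batching-aware driver might do while setting up its device. */
static void mydrv_setup(struct net_device *dev)
{
    dev->features |= NETIF_F_BATCH_SKBS;
}

/* How the stack could test whether batched transmission is allowed. */
static bool dev_can_batch(const struct net_device *dev)
{
    return (dev->features & NETIF_F_BATCH_SKBS) != 0;
}
```

A driver that never sets the bit would keep receiving one packet per hard_start_xmit() call, which is why the change is optional for existing drivers.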
The prototype for hard_start_xmit() is:
    int (*hard_start_xmit)(struct sk_buff *skb, struct net_device *dev);
That prototype does not change, but a driver which has indicated that batching is acceptable for dev may find its hard_start_xmit() method called with skb set to NULL. The NULL value is an indication that there is a batch of packets to transmit; that batch will be found enqueued on the (new) list found at dev->skb_blist. So the (much simplified) form of a batching-aware driver's hard_start_xmit() function will look something like:
    driver_specific_locking_and_setup();
    if (skb)
        ret = send_a_packet(internal_dev, skb);
    else {
        while ((skb = __skb_dequeue(dev->skb_blist)) != NULL) {
            ret = send_a_packet(internal_dev, skb);
            if (ret)
                break;
        }
    }
    driver_specific_cleanup();
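To make the potential saving concrete, here is a small userspace model of the same control flow. It is only a sketch: struct sk_buff, struct skb_list and struct net_device are simplified stand-ins rather than the kernel definitions, and the setups counter is a hypothetical device for counting how often the locking and setup step runs.

```c
#include <stddef.h>

/* Simplified stand-ins for kernel types; not the real definitions. */
struct sk_buff {
    struct sk_buff *next;
    int id;
};

struct skb_list {               /* models the list at dev->skb_blist */
    struct sk_buff *head, *tail;
};

struct net_device {
    struct skb_list *skb_blist;
    int setups;                 /* times locking and setup ran */
    int sent;                   /* packets handed to the "hardware" */
};

/* Append a packet to the tail of the batch list. */
static void list_enqueue(struct skb_list *l, struct sk_buff *skb)
{
    skb->next = NULL;
    if (l->tail)
        l->tail->next = skb;
    else
        l->head = skb;
    l->tail = skb;
}

/* Remove and return the packet at the head, or NULL if the list is empty. */
static struct sk_buff *list_dequeue(struct skb_list *l)
{
    struct sk_buff *skb = l->head;

    if (skb) {
        l->head = skb->next;
        if (!l->head)
            l->tail = NULL;
        skb->next = NULL;
    }
    return skb;
}

/* Pretend to hand one packet to the device; always succeeds here. */
static int send_a_packet(struct net_device *dev, struct sk_buff *skb)
{
    (void)skb;
    dev->sent++;
    return 0;
}

/* Same shape as the simplified driver function: NULL skb means "send the batch". */
static int model_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
    int ret = 0;

    dev->setups++;              /* stands in for locking and device setup */
    if (skb) {
        ret = send_a_packet(dev, skb);
    } else {
        while ((skb = list_dequeue(dev->skb_blist)) != NULL) {
            ret = send_a_packet(dev, skb);
            if (ret)
                break;
        }
    }
    return ret;
}
```

In this model, queuing four packets and calling model_hard_start_xmit(NULL, &dev) runs the setup step once, where four single-packet calls would run it four times.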
The reality of the situation can be a bit more complicated, especially if the driver implements optimizations like suppressing completion interrupts until the last packet of the batch has been sent. But the core of the change is as described here - not a whole lot to it.
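The completion-interrupt optimization mentioned above might look something like the following sketch. The struct tx_desc layout, the BATCH_MAX limit and the fill_descriptors() helper are all invented for illustration; real hardware descriptors differ from device to device.

```c
#include <stdbool.h>
#include <stddef.h>

#define BATCH_MAX 16            /* hypothetical transmit ring capacity */

/* Hypothetical transmit descriptor; real layouts are device-specific. */
struct tx_desc {
    int pkt_id;
    bool irq_on_complete;       /* ask the device to interrupt when done */
};

/*
 * Fill descriptors for up to BATCH_MAX packets and request a completion
 * interrupt only for the last one. Returns the number of descriptors used.
 */
static size_t fill_descriptors(const int *pkt_ids, size_t n,
                               struct tx_desc *ring)
{
    if (n > BATCH_MAX)
        n = BATCH_MAX;
    for (size_t i = 0; i < n; i++) {
        ring[i].pkt_id = pkt_ids[i];
        ring[i].irq_on_complete = (i == n - 1);
    }
    return n;
}
```

With a batch of three packets, only the third descriptor asks for an interrupt, so the driver would handle one completion event for the batch rather than three.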
As of this writing, the networking developers are still trying to determine
what the performance effects of this patch are. There is particular
interest in seeing how batching compares with TCP segmentation offloading,
which is also, at its core, a transmission batching mechanism. The proof
is very much in the benchmarks for a patch like this; if the results are
good enough, the patch will likely be merged.