The return of network channels
That does not mean that no work is happening in this area, though. Evgeniy Polyakov, perhaps the most discouragement-resistant hacker out there, continues to develop his channel patches; the 22nd release came out on December 4.
This version of the patch has a well-defined internal structure to allow kernel code to hook into channels. The best-developed mode, however, is the one which simply transfers packets to and from user space. To that end, there is a new system call:
int netchannel_control(struct unetchannel_control *ctl);
The full contents of the unetchannel_control structure can be seen in the patch. The more important fields are:
- cmd, describing the action that the calling process wishes
to execute. Unlike previous versions of the patch, the current code
only supports one action: NETCHANNEL_CREATE, which makes a
new channel.
- type, the type of the channel to create. At the moment, the
only implemented type is NETCHANNEL_COPY_USER, which copies
packets to and from user space.
- unc.data, which describes the channel to be created: it contains source and destination addresses and ports and a protocol number.
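As a rough illustration of the fields just described, a creation request might be filled in along the following lines. This is only a sketch: the structure layouts, the member names inside unc.data, and the numeric values of the constants are all assumptions made for the example; the real definitions are in the patch. The "_sketch" suffixes mark the stand-in types, and channel_request_init() is a hypothetical helper.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values; the patch defines the real constants. */
enum { NETCHANNEL_CREATE = 0 };
enum { NETCHANNEL_COPY_USER = 0 };

/* Hypothetical layout of the channel description (unc.data). */
struct unetchannel_sketch {
    uint32_t saddr, daddr;  /* IPv4 addresses, network byte order */
    uint16_t sport, dport;  /* ports, network byte order */
    uint8_t  proto;         /* IP protocol number, e.g. 6 for TCP */
};

/* Hypothetical layout of the control structure. */
struct unetchannel_control_sketch {
    uint32_t cmd;           /* action to perform */
    uint32_t type;          /* kind of channel to create */
    struct {
        struct unetchannel_sketch data;
    } unc;
};

/* Fill in a request for a copy-to-user channel matching one flow.
 * Addresses and ports are passed in host byte order. */
void channel_request_init(struct unetchannel_control_sketch *ctl,
                          uint32_t saddr, uint16_t sport,
                          uint32_t daddr, uint16_t dport,
                          uint8_t proto)
{
    assert(ctl != NULL);
    memset(ctl, 0, sizeof(*ctl));
    ctl->cmd = NETCHANNEL_CREATE;
    ctl->type = NETCHANNEL_COPY_USER;
    ctl->unc.data.saddr = htonl(saddr);
    ctl->unc.data.daddr = htonl(daddr);
    ctl->unc.data.sport = htons(sport);
    ctl->unc.data.dport = htons(dport);
    ctl->unc.data.proto = proto;
}
```

A filled-in structure of this sort would then be handed to netchannel_control(); the system call number and the way the resulting file descriptor is returned are not covered here, so the sketch stops short of making the call.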
Once a network channel is created, it is added to a search tree which is oriented toward blindingly fast lookups. There is a new hook in the packet receive code which looks up each incoming packet in that tree; packets which do not turn up a hit there are processed normally by the kernel's networking stack. Any packet whose addresses, ports, and protocol are matched by an entry in the tree, however, is shunted over to the channel code before even being queued by the network stack.
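The lookup step can be pictured with a small example. The code below is a sketch only: it uses a plain, unbalanced binary search tree keyed on the five values named above, which is a stand-in and not the data structure the patch uses, and every name in it (flow_key, channel_node, channel_insert(), channel_lookup()) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The five values a channel is matched on. */
struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

struct channel_node {
    struct flow_key key;
    struct channel_node *left, *right;
};

/* Total order on keys: compare field by field. */
static int flow_cmp(const struct flow_key *a, const struct flow_key *b)
{
    if (a->saddr != b->saddr) return a->saddr < b->saddr ? -1 : 1;
    if (a->daddr != b->daddr) return a->daddr < b->daddr ? -1 : 1;
    if (a->sport != b->sport) return a->sport < b->sport ? -1 : 1;
    if (a->dport != b->dport) return a->dport < b->dport ? -1 : 1;
    if (a->proto != b->proto) return a->proto < b->proto ? -1 : 1;
    return 0;
}

/* Link a caller-allocated node into the tree; returns the root.
 * A node whose key is already present is ignored. */
struct channel_node *channel_insert(struct channel_node *root,
                                    struct channel_node *node)
{
    struct channel_node **link = &root;

    assert(node != NULL);
    node->left = node->right = NULL;
    while (*link) {
        int c = flow_cmp(&node->key, &(*link)->key);
        if (c == 0)
            return root;
        link = c < 0 ? &(*link)->left : &(*link)->right;
    }
    *link = node;
    return root;
}

/* Receive-side check: a non-NULL result means the packet belongs to a
 * channel; NULL means it should follow the normal stack path. */
const struct channel_node *channel_lookup(const struct channel_node *root,
                                          const struct flow_key *key)
{
    while (root) {
        int c = flow_cmp(key, &root->key);
        if (c == 0)
            return root;
        root = c < 0 ? root->left : root->right;
    }
    return NULL;
}
```

A receive hook built on this would call channel_lookup() with the key extracted from each incoming packet and fall through to the regular path on a NULL result. A real implementation would also need locking (or a lockless scheme) and some form of balancing, both of which are left out here.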
The final piece (on the receive side) is a simple read() implementation. A process wishing to receive a packet from a network channel need only read the associated file descriptor and the next available packet will be copied into the supplied buffer. It would, of course, be nice to do away with that copy operation, but that is a hard trick to carry out: the packet must be received before its destination is known. There are network adapters which can direct packets based on their header information, but the current channel code does not have the driver API enhancements which would be required to use that capability for zero-copy packet reception.
Similarly, a write() operation causes the associated packet to be copied into the kernel and fed into the networking stack at a fairly low level. There is currently no zero-copy write support.
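From user space, then, moving packets through a channel comes down to ordinary read() and write() calls on the channel's file descriptor. The helpers below are a minimal sketch of that usage; channel_recv() and channel_send() are hypothetical names, not part of the patch, and they assume only that the descriptor behaves as described above (one packet per call). The only thing they add to the raw system calls is a retry when a signal interrupts the call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the next available packet into buf.
 * Returns the number of bytes copied, or -1 with errno set. */
ssize_t channel_recv(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}

/* Hand one packet to the kernel for transmission.
 * Returns the number of bytes accepted, or -1 with errno set. */
ssize_t channel_send(int fd, const void *pkt, size_t len)
{
    ssize_t n;

    assert(pkt != NULL);
    do {
        n = write(fd, pkt, len);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

Both helpers work on any file descriptor, so they can be tried out with a pipe in place of a channel. Each call still involves a copy between user and kernel space, as the text notes.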
Evgeniy clearly has zero-copy operations in mind, though, probably using his network allocator patch. Even without that feature, the channel code, when used with his user-space network stack, appears to be quite fast. Some posted benchmark results claim significant improvements over the core Linux networking stack - three times the maximum bandwidth with one-third of the CPU usage when small packets are being transferred. For larger (4096-byte) packets the performance improvements essentially disappear - most likely the cost of copying the packets into and out of the kernel is the dominating factor there.
Improvements in small-packet performance are welcome: there are a number of
applications, including high-end financial trading, which require large
numbers of small transfers. The addition of zero-copy logic has the
potential to make the large-packet performance better as well. The real
test, though, will be the addition of all of the other features expected by
contemporary networking users, most of which are currently absent from the
channels implementation. There are hooks in the code aimed at the
insertion of per-packet processing; they could be used for filtering,
address translation, traffic control, or any of the other things that one
might want to have. Whether those hooks can be used without killing the
performance advantages of channels remains to be seen, though. But one
suspects that Evgeniy will not give up until he has an answer to that
question.