
The trouble with SMC-R

By Jonathan Corbet
May 18, 2017
Among the many features merged for the 4.11 kernel was the "shared memory communications over RDMA" (SMC-R) protocol from IBM. SMC-R is a high-speed data-center communications protocol that is claimed to be much more efficient than basic TCP sockets. As it turns out, though, the merging of this code was a surprise — and an unpleasant one at that — to a relevant segment of the kernel development community. This issue and the difficulties in resolving it are an indicator of how the increasingly fast-paced kernel development community can go off track.

The patch set that was eventually merged (via the networking tree) for 4.11 claims a decrease in CPU consumption of up to 60% over basic TCP sockets. The protocol is designed in such a way that existing TCP applications can be made to use it simply by linking them against a special library; no code changes are required. On the other hand, it requires bypassing much of the network stack (including firewalls, monitoring, and traffic control) and shorting out the code that tries to keep the networking layer from creating too much memory pressure. In many settings, that may be a price that users are willing to pay.

The problem, as raised by Christoph Hellwig on May 1, is that this RDMA-based protocol was merged without any input from the RDMA development community; it was never posted to the linux-rdma mailing list. Once the RDMA developers took a look at it, they found a number of things to dislike. SMC-R adds a new API, for example, rather than using the existing RDMA APIs. It has no support for IPv6, and the fact that it defines its own AF_SMC address family makes it unclear how an application could ever specify whether it wanted IPv6 or not. (It's worth noting that missing IPv6 support has blocked other protocol implementations in the past.) There is also a significant security issue with SMC-R, in that it opens read/write access to all of memory from a remote system.
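The address-family concern can be seen in what an application has to write to ask for an SMC-R socket. The following is a minimal sketch, not code from the patch set: it assumes a Linux system, uses the value 43 that the kernel headers assign to AF_SMC (Python's socket module may not export a constant for it), and the helper name open_smc_socket is hypothetical. The family argument is the slot where an application would normally choose AF_INET or AF_INET6; once it is occupied by AF_SMC, there is no obvious place left to express that choice.

```python
import socket

# Assumption: AF_SMC is 43, as defined in the kernel's include/linux/socket.h.
# Use the module's own constant if this Python build happens to provide one.
AF_SMC = getattr(socket, "AF_SMC", 43)


def open_smc_socket():
    """Try to create an SMC stream socket; return None if unsupported.

    Hypothetical helper for illustration only. The address family slot,
    which would otherwise carry AF_INET or AF_INET6, is taken by AF_SMC.
    """
    try:
        return socket.socket(AF_SMC, socket.SOCK_STREAM, 0)
    except OSError:
        # Kernel built without SMC support, module not loaded, or not Linux.
        return None


if __name__ == "__main__":
    sock = open_smc_socket()
    if sock is None:
        print("AF_SMC sockets are not available on this system")
    else:
        print("created an AF_SMC socket, family =", int(sock.family))
        sock.close()
```

On a kernel without SMC support, the call fails and the helper returns None, so the sketch is safe to run anywhere.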

The RDMA developers, being less than pleased with all of this and feeling that they should have been consulted prior to the merging of SMC-R, want to do something about it. But what can actually be done is not entirely clear at this point. Hellwig posted a patch marking the subsystem as "broken" and adding a strong warning about the security issue, but that patch has not yet been merged and probably never will be in that form.

Networking maintainer David Miller responded that Hellwig was being "overbearing" by trying to mark SMC-R as being broken, and added that there is no possibility of changing the API before it develops users: "The API is out there already so we are out of luck, and neither you nor I nor anyone else can 'stop' this from happening". SMC-R, in other words, is a fait accompli that cannot be removed at this point.

RDMA maintainer Doug Ledford disagreed, noting that 4.11 has only been out since the end of April and has almost certainly not appeared in distributions yet. The "standard" that defines this protocol (RFC 7609) is, he pointed out, just an informational posting from IBM without actual standard status. There is nothing, he said, that prevents recalling SMC-R at this time. For now, Miller has applied a version of Hellwig's patch that removes the "broken" marker but keeps the security warning. Ledford thinks, though, that the option of marking SMC-R broken (or moving it to staging) should still be on the table.

Ledford, along with others, also complained loudly that this subsystem was merged without having ever been brought to the attention of the RDMA mailing list. Miller fired back that he had explicitly tried to slow the progress of this patch set in the hope that it would get some substantive reviews, but "I can't push back on people with silly coding style and small semantic issues forever". He complained that evidently nobody from the RDMA community is following the netdev mailing list, which is where the patches were posted. The discussion went around a bit on whether Miller should have asked the SMC-R submitters to copy their patches to the linux-rdma list as well, without any real agreement being reached.

The reason that there are no RDMA developers on netdev, despite the obvious overlap between RDMA and networking, is an old story: the traffic on netdev (150-200 messages per day) has reached a level where the RDMA developers feel they simply cannot keep up with it. Developers used to say the same thing about linux-kernel, before everybody simply gave up on it altogether. As the community grows and the patch volume increases, this type of process-scalability issue will move downward through the subsystem hierarchy. Developers stop keeping up with relevant discussions because they cannot read all that email and still have time to actually get some development done.

Ledford proposed a solution of sorts for the problem of email volume: split netdev into separate lists for core networking, Ethernet drivers, and "netdev-packet". Ironically, that is likely to make the sort of communication issue that led to this discussion worse; as the development community segregates itself into increasingly specialized lists, communication across the community as a whole will be reduced. In a small town, everybody knows what everybody else is up to; that is not true in a large city. The kernel project resembles an increasingly large city in this regard.

This fracturing of the kernel community has been evident for at least two decades; it is likely to present significant scalability issues if the kernel project continues to grow. For the time being, the SMC-R issue appears to be headed toward a resolution, with the RDMA developers seeing a path by which the problems in the protocol and its implementation can be addressed. But this will certainly not be the last time that the development community is tripped up as a result of developers not being able to keep up with what their colleagues are doing.

Index entries for this article
Kernel: Development model/Code review
Kernel: Networking/Protocols



The trouble with SMC-R

Posted May 18, 2017 20:24 UTC (Thu) by ejr (subscriber, #51652) [Link] (1 responses)

Perhaps an automated gizmo could generate weekly email reports of activity on kernel-related lists. A list manager could register kernel source paths of interest. Any patch posted to any monitored list that touches those paths would be mentioned in a weekly posting to the list that registered those paths (excluding those posted to the same list).

Hooking it up to a patchwork-like web interface could be handy as well for the more web-inclined set.

On the technical topic, RDMA, distributed shared memory, and non-volatile memory are going to be cross-cutting concerns with plenty of activity for a while. HP's "the Machine" is one example system where these are deeply tied. There will be many APIs and interfaces. If history is relevant, most will go away. But the kernel still supports DECnet, so...
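The path-registration digest proposed at the top of this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the Patch record, the build_digests function, and the sample list names and paths are all hypothetical, and a real tool would have to parse patches from list archives rather than take them as ready-made objects.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Hypothetical record of one posted patch."""
    subject: str
    posted_to: str   # mailing list the patch was sent to
    paths: tuple     # source files the patch touches


def build_digests(registrations, patches):
    """Return {list_name: [subjects]} for the weekly report to each list.

    registrations maps a list name to the path prefixes it registered.
    A patch is reported to a list if it touches a registered prefix and
    was NOT already posted to that same list.
    """
    digests = defaultdict(list)
    for patch in patches:
        for list_name, prefixes in registrations.items():
            if patch.posted_to == list_name:
                continue  # subscribers have already seen it
            touches = any(
                path.startswith(prefix)
                for path in patch.paths
                for prefix in prefixes
            )
            if touches:
                digests[list_name].append(patch.subject)
    return dict(digests)


if __name__ == "__main__":
    # Made-up registrations and patches, for illustration only.
    registrations = {"linux-rdma": ["drivers/infiniband/", "include/rdma/"]}
    patches = [
        Patch("example: touch an RDMA driver", "netdev",
              ("drivers/infiniband/core/example.c",)),
        Patch("example: brand-new directory", "netdev",
              ("net/smc/example.c",)),
        Patch("example: posted to the RDMA list", "linux-rdma",
              ("drivers/infiniband/hw/example.c",)),
    ]
    print(build_digests(registrations, patches))
```

Note that the second sample patch, which only touches a directory nobody has registered, is reported to no one; prefix matching can only catch code that lands in paths somebody already cares about.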

The trouble with SMC-R

Posted May 23, 2017 0:18 UTC (Tue) by florianfainelli (subscriber, #61952) [Link]

> Perhaps an automated gizmo could generate weekly email reports of activity on kernel-related lists. A list manager could register kernel source paths of interest. Any patch posted to any monitored list that touches those paths would be mentioned in a weekly posting to the list that registered those paths (excluding those posted to the same list).

Problem was that AF_SMC was an entirely new path (net/smc) so the ownership and appropriate recipients would have been a bit hard to track down. You could say: with lack of a specialized recipients list, broadcast, but then we go back to square one: too much traffic.

The trouble with SMC-R

Posted May 18, 2017 23:58 UTC (Thu) by shorne (guest, #110879) [Link]

I have Gmail set up listening to the linux-kernel list with a filter to tag any time 'openrisc' pops up in conversations. This allows me to track any activity.

If you don't mind using Gmail, something similar could be done to track 'rdma' or other keywords you are interested in. I'm sure one could get procmail to do the same.
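For those who would rather script it than use Gmail or procmail, the same keyword tagging could be sketched with Python's standard email module. This is an assumption-laden illustration, not a description of the filter mentioned above: the tags_for function and the keyword list are hypothetical, and it only looks at the subject and a plain-text body.

```python
from email import message_from_string

# Hypothetical keywords of interest; matching is case-insensitive.
KEYWORDS = ("rdma", "openrisc")


def tags_for(raw_message, keywords=KEYWORDS):
    """Return the keywords found in a message's subject or plain-text body."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    # Multipart messages are skipped here to keep the sketch short.
    body = "" if msg.is_multipart() else msg.get_payload()
    text = f"{subject}\n{body}".lower()
    return [word for word in keywords if word in text]


if __name__ == "__main__":
    sample = (
        "From: someone@example.org\n"
        "Subject: [PATCH] example: use RDMA verbs\n"
        "\n"
        "This is a made-up message body.\n"
    )
    print(tags_for(sample))
```

Plain substring matching will also fire on unrelated words that happen to contain a keyword, so a real filter would likely want word-boundary matching.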

The trouble with SMC-R

Posted May 19, 2017 5:14 UTC (Fri) by pabs (subscriber, #43278) [Link]

Sounds like they need to switch to mailman, which supports splitting a mailing list up into sub-topics. Recently RH did that for their security mailing list:

https://access.redhat.com/blogs/product-security/posts/rh...

The trouble with SMC-R

Posted May 19, 2017 9:04 UTC (Fri) by liam (guest, #84133) [Link]

Isn't there a linux-api ml?

.......

Hmm, why yes there is!

And there's even this interesting bit of history:
"The difficulty of answering that question is a contributing factor to many problems in the Linux API—for example, insufficient design review before release (with the consequence that mistakes in API designs are recognized too late), insufficient prerelease testing, poor or late documentation, and delays before kernel APIs are made available via C libraries."

https://www.kernel.org/doc/man-pages/linux-api-ml.html

The trouble with SMC-R

Posted May 19, 2017 14:22 UTC (Fri) by marduk (subscriber, #3831) [Link] (7 responses)

A large city... or pretty much any large corporation (and even some not-so-large ones). This problem is definitely not unique to Linux.

The trouble with SMC-R

Posted May 21, 2017 6:17 UTC (Sun) by marcH (subscriber, #57642) [Link] (6 responses)

> This problem is definitely not unique to Linux.

These development scalability issues are indeed very common. On the other hand, trying to mitigate them with a pure email, "database-free" approach is less common.
https://lwn.net/Articles/702177/ "Why kernel development still uses email only"

The trouble with SMC-R

Posted May 25, 2017 14:16 UTC (Thu) by smitty_one_each (subscriber, #28989) [Link] (5 responses)

Does every new API need a one-release "engagement" phase, so that it's in the kernel prior to "marriage"? A reverse deprecation, if you will?
There is always going to be someone who doesn't get the memo, and is surprised.
Putting some additional delay at the "ground truth" of a release would be one way to minimize that set of people.

The trouble with SMC-R

Posted May 25, 2017 16:02 UTC (Thu) by jem (subscriber, #24231) [Link] (4 responses)

Or everybody should get an invitation to the wedding, so that when the preacher says "speak now or forever hold your peace", you get a chance to object.

The trouble with SMC-R

Posted May 25, 2017 17:28 UTC (Thu) by excors (subscriber, #95769) [Link] (3 responses)

But if everyone is sent ten thousand wedding invitations per marriage window, they'll filter them straight into the bin, and you're back to the original problem of how to make them aware of the few weddings that they really didn't want to miss.

The trouble with SMC-R

Posted May 25, 2017 17:31 UTC (Thu) by smitty_one_each (subscriber, #28989) [Link] (2 responses)

I'm saying make the debut of a new API /in the kernel/ the "engagement".
How many APIs does the kernel debut per release?

The trouble with SMC-R

Posted May 26, 2017 1:59 UTC (Fri) by rkeene (guest, #88031) [Link] (1 responses)

The trouble with SMC-R

Posted May 26, 2017 10:58 UTC (Fri) by smitty_one_each (subscriber, #28989) [Link]

With a curated tag ontology, you could. . .
fantasize aloud about solutions that would require an unlikely cultural shift to implement.


Copyright © 2017, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds