[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
|
|
Subscribe / Log in / New account

A Rust implementation of Android's Binder

By Jonathan Corbet
November 30, 2023
LPC
The Android system was once famous for extensive, out-of-tree kernel enhancements. Many of those have been eliminated or upstreamed over the years, bringing Android much closer to the mainline kernel. One significant component in the "upstreamed" category is Binder, an interprocess communication mechanism that is used only by Android. There are a number of factors that make Binder a good candidate for rewriting in the Rust language; at the 2023 Linux Plumbers Conference, Carlos Llamas and Alice Ryhl described the motivation behind and implementation of a rewrite of Binder in Rust.

[Carlos Llamas] Llamas began the talk by describing Binder as the primary interprocess communication channel for the Android system. It has a long history, having originated in BeOS and been used in Palm OS before being adopted by Android. The Binder driver lives in the kernel; most applications access it by way of the libbinder library. The Binder service implements a thread pool that is able to take requests from clients in parallel; a memory-mapped region is used for transactions between server and client processes.

There are a number of Binder features that make it attractive for Android. It performs no buffering at all in the kernel. There is an extensive set of operations, all accessed via ioctl(), including a combined read/write command. Binder implements priority inheritance and the ability to share file descriptors; it has a number of remote object management features and a strong reference-counting implementation.

Given Binder's long history, why would one want to rewrite it now? The module, Llamas said, features a high level of complexity and a large amount of technical debt; that combination has proved to be a recipe for security problems. Analysis has shown that Binder contains about 3.1 vulnerabilities per thousand lines of code, which is a relatively high number — and it is not getting better. Exploits are known for about half of the known vulnerabilities. Given that even the lowest-privilege sandboxes in Android have access to Binder, this security record is troubling indeed.

There is a lot that can be improved in Binder, Llamas said. Functions nearly 1,000 lines long (example) are generally a bad sign. The error handling in Binder is "problematic" and the source of a lot of bugs. But any improvements to this complex body of code tend to be risky, leading to a lot of "fix to the fix to the fix" patches. The Android developers are tired of chasing bugs like these and are looking for a better way.

The new Binder

Ryhl then took over to talk about the new Binder implementation, which was posted to the mailing lists at the beginning of November. One of the big advantages of working in Rust, Ryhl said, is that it makes object ownership visible in the type system. For example, C code working with reference-counted objects has to check for a non-zero reference count at run time, but Rust code simply will not compile if it lacks an object reference where one is needed. In C, one can look at a structure full of function pointers and get no sense for which of those manipulate reference counts; in Rust, that can be made clear by using different types.

Rust also makes it impossible to forget to clean up in error paths. The slides for the talk pointed to some error-handling code in the C implementation that is implemented as a long series of goto targets at the end of a function; the Rust equivalent was simply the "}" that ends the function. Rust also actively prevents almost all of the vulnerabilities that have been found in Binder; over half of them were use-after-free bugs, which "never happen in Rust". Other types of vulnerabilities, such as out-of-bounds array accesses, will be caught at run time, turning a potential compromise into a denial-of-service problem — a much less severe situation.

[Alice Ryhl] Ryhl has thus duly rewritten Binder in Rust. The result implements all of the features provided by the C implementation and passes the entire Binder test suite; it is able to boot and run an Android device. Ryhl brandished a phone that was running the Rust Binder implementation as proof, to applause. The performance is as good as the C version; it "took a bit of optimization" to reach that point, but the C version has also seen a fair amount of optimization work.

Rust, she said, can be rather more verbose than C; there are a lot of invariants that have to be expressed in the code. But that is countered by the need for far less error-handling code; it turns out to be a wash, with the size of the two implementations being about the same.

With regard to unsafe code, she said that there will always be a need for some unsafe code in the kernel, but that the Binder implementation needed little of it, mostly to interface with the (still in C) binderfs implementation. The abstractions needed to interface with the rest of the kernel can involve unsafe code in general; the workqueue abstractions (which Ryhl upstreamed for 6.7) have a fair amount of it. But that is a single module that will be shared across all drivers using workqueues; it only has to be gotten right once.

Writing in Rust does not guarantee the end of vulnerabilities, she said at the conclusion of the talk; one was discovered in the Rust implementation during the conference. It turned out that the C implementation contains a similar bug. In C, that bug can be exploited to cause a use-after-free vulnerability, which is "game over". In Rust, there is no use-after-free, but the result could be a confusion of the mappings used to map messages, which is still a severe vulnerability, though rather harder to exploit. Using Rust, she concluded, prevented the memory-safety problem, but logic errors remain.

In the questions after the talk, Julia Lawall asked about the kinds of optimizations that were needed for Rust code. A source of bottlenecks, Ryhl said, is calling into C to access tiny functions. Performance can be improved by rewriting those functions in Rust, but that leads to duplicated code and the associated maintainability problems.

There were a couple of questions about abstractions and their relationship to the C code. One attendee asked if, during the process of implementing Binder, changes in the kernel's C code had forced changes to the abstractions. The answer was that there was one change in process freezing that has caused some problems. I then asked about the workqueue abstractions; if somebody makes an incompatible change to the (C) workqueue implementation, who is responsible for making the abstractions work again? The answer to that is not entirely clear; eventually, it seems, the kernel's usual "you broke it, you fix it" rule will probably apply, but it is expected that many developers will lack the Rust skills needed to update those abstractions for some time and will thus need help from the Rust-for-Linux developers.

[Thanks to the Linux Foundation, LWN's travel sponsor, for supporting our travel to this event.]

Index entries for this article
KernelAndroid
KernelDevelopment tools/Rust
ConferenceLinux Plumbers Conference/2023


to post comments

A Rust implementation of Android's Binder

Posted Nov 30, 2023 16:56 UTC (Thu) by JoeBuck (subscriber, #2330) [Link] (12 responses)

Would there be any advantage to having Binder available for use in a Linux server or desktop? Or do existing interprocess communication mechanisms work just as well?

A Rust implementation of Android's Binder

Posted Nov 30, 2023 18:26 UTC (Thu) by stop50 (subscriber, #154894) [Link] (1 responses)

In theory nothing stops devs to use the binder, but out of tree modules are rarely used in practice.
binder apparently makes more than the sysv ipc.
i have found only an older link that describes binder: https://elinux.org/Android_Binder

A Rust implementation of Android's Binder

Posted Nov 30, 2023 20:56 UTC (Thu) by gdamjan (subscriber, #33634) [Link]

> but out of tree modules are rarely used in practice.

But binder is actually not out-of-tree.

A Rust implementation of Android's Binder

Posted Nov 30, 2023 18:55 UTC (Thu) by ringerc (subscriber, #3071) [Link] (6 responses)

I'm curious about what it offers over D-Bus. It must be used for a lot more if performance is any significant concern.

A Rust implementation of Android's Binder

Posted Nov 30, 2023 19:04 UTC (Thu) by quotemstr (subscriber, #45331) [Link] (4 responses)

Binder is point-to-point. All DBus messages have to go through the bus manager process.

A Rust implementation of Android's Binder

Posted Nov 30, 2023 20:23 UTC (Thu) by tao (subscriber, #17563) [Link] (3 responses)

I know there was a P2P extension for D-Bus suggested at some point. I never heard more about it though, I guess it went nowhere.

A Rust implementation of Android's Binder

Posted Nov 30, 2023 23:41 UTC (Thu) by ebassi (subscriber, #54855) [Link] (1 responses)

All D-Bus implementations support peer-to-peer channels that don't go through the D-Bus server/broker. You set up the channel through the shared bus, and then talk to the peer using the same wire protocol.

The main advantage of D-Bus is, unsurprisingly, the shared bus, so you get multicast; the other advantage is the service activation—though that got kind of superseded by systemd though a similar mechanism: talk to a well-known name on the bus, the service that provides the name gets activated if it's not already running. For that, you need a central broker.

A Rust implementation of Android's Binder

Posted Dec 1, 2023 11:51 UTC (Fri) by smcv (subscriber, #53363) [Link]

> the other advantage is the service activation—though that got kind of superseded by systemd though a similar mechanism

Several D-Bus implementations now use systemd as an optional or required backend for service activation in preference to doing the fork-and-exec themselves, but it's still the message bus (the broker) that needs to be responsible for the autostarting behaviour where it recognises a message addressed to a not-yet-started service, asks for the service to be started, delays delivery of the message until that has happened, and avoids sending an error reply "service is not running" in the meantime.

systemd would not be able to do this without the message bus's help, because it's the message bus that needs to implement the delayed re-delivery: it wouldn't work without being integrated into the more general logic for how and when to deliver messages.

A Rust implementation of Android's Binder

Posted Dec 1, 2023 12:19 UTC (Fri) by smcv (subscriber, #53363) [Link]

There are a couple of ways D-Bus can be used without a message bus (broker). There is nothing magic about D-Bus from the kernel's perspective: it's just an application-layer protocol over AF_UNIX sockets, like X11, Wayland, Pulseaudio and Pipewire.

Many of the conveniences that are reasons to use D-Bus are really message bus features - distributed naming, broadcast/multicast signals, authentication, service activation with auto-starting, imposing a total order on messages to make them easier to reason about - and bypassing the message bus means reduced access to those conveniences. It's up to an individual application/interface author whether that's a worthwhile tradeoff or not.

The other big reason to use D-Bus is not needing significant incremental work to add more communication with some other component that already uses D-Bus (network effect): in the freedesktop.org ecosystem, if components can agree on using D-Bus for multiple forms of communication that don't have particularly demanding latency or throughput requirements, then that avoids reinventing too many wheels for those forms of communication, allowing the time saved to be re-invested in places where it matters more. Again, that's really more of a message bus feature than a protocol feature. I assume that in Android, there's a similar pressure to use Binder as a "default" IPC mechanism in any situation where there's no particularly good reason to do otherwise.

For situations where the message bus would be a problem, one way to bypass it is that the D-Bus wire protocol can be used as message framing for a simple client/server protocol over AF_UNIX or even TCP, with clients talking directly to the server. systemd does this, over AF_UNIX (the server listens on /run/systemd/private), so that its components can communicate during early boot before the message bus is available. This lets services reuse the D-Bus message framing and type system, which is particularly useful if they already needed that code for some other reason (for example systemd provides more normal D-Bus services over the message bus after it becomes available, so it would have needed access to a D-Bus implementation anyway).

Another way to bypass the message bus without abandoning D-Bus completely is that two peers initially communicating via the message bus can use it to negotiate a side channel, either via fd-passing (using SCM_RIGHTS to attach file descriptors to a D-Bus message) or by sharing the address of a socket in the filesystem, and then do their latency- or throughput-sensitive communication through that side channel, either reusing the D-Bus message framing and type system or switching to some other protocol that is more optimal for what this particular application wants. They can either move their entire communication into the side channel, or continue to use the message bus for "control" messages that are less sensitive to latency/throughput.

A Rust implementation of Android's Binder

Posted Nov 30, 2023 19:25 UTC (Thu) by nickodell (subscriber, #125165) [Link]

I was curious about the same thing, and I found this discussion of how Android uses binder on LKML: https://lkml.org/lkml/2009/6/25/3
For a rough idea of the scope of the binder's use in Android, here is a list of the basic system services that are implemented on top of it: package manager, telephony manager, app widgets, audio services, search manager, location manager, notification manager, accessibility manager, connectivity manager, wifi manager, input method manager, clipboard, status bar, window manager, sensor service, alarm manager, content service, activity manager, power manager, surface compositor.

A Rust implementation of Android's Binder

Posted Nov 30, 2023 23:38 UTC (Thu) by mcon147 (subscriber, #56569) [Link] (2 responses)

I'm curious what the gap is that binder is filling compared to other ipc mechanisms, and if there is one why regular desktop linux didn't feel it was necessary to use something like binder

A Rust implementation of Android's Binder

Posted Dec 1, 2023 8:25 UTC (Fri) by Wol (subscriber, #4433) [Link]

I think binder assumes that no two processes trust each other. On the desktop, I guess that would probably be a hindrance, not a help. (And yes, that is security-naive...)

Cheers,
Wol

A Rust implementation of Android's Binder

Posted Dec 7, 2023 10:59 UTC (Thu) by maxfragg (subscriber, #122266) [Link]

Linux is rather badly suited for IPC heavy architectures, since it exactly lacks an IPC mechanism, which does not shovel data through the kernel and at the same time provides sufficient security and abstractions, which simple shared memory is lacking.
In addition, binder originally was intended as an IPC mechanism, with implicit control flow changes (true RPC), which is something every microkernel offers, but linux historically lacked.
I am not sure, if binder still behaves that way, with all shift towards more of its code running in userspace, though.

A Rust implementation of Android's Binder

Posted Nov 30, 2023 16:58 UTC (Thu) by kleptog (subscriber, #1183) [Link] (7 responses)

That's pretty cool. I especially liked the: We should write this in Rust, oh look, we already did and here's the proof.

I don't see anything about how long this look and whether it was particularly difficult. I assume the original C implementation has man-years of work in it, to get similar performance already is very nice indeed.

A Rust implementation of Android's Binder

Posted Nov 30, 2023 20:36 UTC (Thu) by ballombe (subscriber, #9523) [Link] (6 responses)

On the other hand, if you are allowed to use the C code for inspiration, there is no reason you lose performance much,
because you will lift all the performance improvement when converting the code.

I have done a lot of language conversion in HPC. This is way easier than it sound. It is several order of magnitude easier than rewriting from scratch.

A Rust implementation of Android's Binder

Posted Dec 1, 2023 14:52 UTC (Fri) by tialaramex (subscriber, #21167) [Link] (5 responses)

Because Rust's core library (which is necessarily available in Rust, unlike std or even alloc, it's required) is more extensive than the equivalent in C there are often nicer options available which the C programmer could have written but it was a lot of work so they chose something simpler but not as optimal, or they spend a week writing that part and checking it works as intended - while the Rust programmer can just ask for what they meant.

There are some fundamental algorithmic building blocks which a good C programmer might know in principle, but in C you need to implement them, in Rust they're just method calls. For example [T].select_nth_unstable_by you can see how to write that in C but it's a bunch of work any time you need it, would you do that work? Perhaps only if a simpler but worse performing solution wouldn't pass review or in a later optimisation cycle. Of course you do still need to recognise that select_nth_unstable_by is actually solving the problem you had, this is not trivial, there's a well known C++ meme which falls out of a Sean Parent talk, "That's a rotate" because more often than you'd expect the complicated ad hoc shuffling of data in a contiguous structure which you wrote by hand turns out to actually be equivalent to just... rotate (in Rust [T].rotate_left or [T].rotate_right)

A Rust implementation of Android's Binder

Posted Dec 1, 2023 19:02 UTC (Fri) by ballombe (subscriber, #9523) [Link] (4 responses)

This does not apply to large C frameworks like the kernel that do not use the libc for anything important and provide already all those algorithms already. C programmers had a 30+ years head start implementing algorithms. And do not underestimate the power of cpp!

A Rust implementation of Android's Binder

Posted Dec 2, 2023 9:45 UTC (Sat) by tialaramex (subscriber, #21167) [Link] (2 responses)

Hmm. Where do you see an analogue of [T].select_nth_unstable_by ? My inexpert searching didn't find anything like this.

A Rust implementation of Android's Binder

Posted Dec 7, 2023 0:45 UTC (Thu) by wahern (subscriber, #37304) [Link] (1 responses)

Rust has a surfeit of library support for operating on array-based structures as lists and trees are quite cumbersome. But kernel data structures, when they need something sorted, are more likely to keep them in a sorted state, e.g. a sorted tree. Among other reasons, this approach provides consistent latency, as well as better avoids accidental performance issues. Until recently select_nth_unstable itself was accidentally quadratic, apparently: https://github.com/rust-lang/rust/issues/102451. But an interface like select_nth_unstable is a simple loop away from accidental quadratic behavior, anyhow.

This basic pattern is true of many niches where C remains common. The absence of language environment support for many of these algorithms is not felt as keenly as it would be were they missing from other languages such as Rust. Much of the work of a kernel is fundamentally about juggling shared, global resources, which necessarily involve gnarly reference graphs to which a language like Rust is fundamentally hostile, the design of which is largely predicated on pass-by-value and tidy ownership rules which track the call stack. Many kernel modules would be better off written in Rust, but also would ideally not run in kernel space at all.

A Rust implementation of Android's Binder

Posted Dec 9, 2023 19:15 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

Accidentally quadratic _worst case_ only. It's the Quick Select defect, essentially a defect in the core of Quick Sort which for the same reason impacts most C qsort implementations into the 1990s and made some C++ standard library implementations non-compliant (the ISO document eventually promised better performance for its unstable sort than is actually achievable with Quick Sort) long after that.

I was skimming an old (like 30+ years old) Fortran book last week which gives this problem as a reason why its authors don't like Quick Sort although they admit it's technically faster on average. Today of course that book's advice is obsolete, you should use (and later Rust's authors did here) an introsort or some other modern better sort which didn't exist when that Fortran book was written

Keeping everything in order is excellent except when that's the wrong order, hence select_nth_unstable_by takes a function for the ordering we want, which doesn't have to be (indeed usually won't be) the natural ordering of this type. The kernel does have a lot of gnarly intrusive linked lists, and for its purpose this really is sometimes, even often the Right Thing™ because it's doing tricky concurrency acrobatics for example - but since linked lists are also the go-to data structure for C programmers that's not a sure bet as to why there's a linked list, sometimes it just "Not bad enough to pick something else" and a Vec<T> or similar structure natural to a Rust programmer might be better.

A Rust implementation of Android's Binder

Posted Dec 2, 2023 20:21 UTC (Sat) by Paf (subscriber, #91811) [Link]

There is a real paucity of library availability. The kernel has a bunch of the stuff it needs, but it's still pretty limited compare to what *could* exist. C++ obviously has overwhelmingly huge libraries available, but that's not very helpful for C and even less relevant for kernel stuff.

Video and slides

Posted Nov 30, 2023 20:19 UTC (Thu) by corbet (editor, #1) [Link] (1 responses)

I forgot to put links to the video and slides from the talk — apologies for the omission.

Video and slides

Posted Nov 30, 2023 20:25 UTC (Thu) by ojeda (subscriber, #143370) [Link]

The single-talk (i.e. split) video is also available now (linked for each talk in the Rust MC page now) — I hope that helps!

A Rust implementation of Android's Binder

Posted Dec 1, 2023 0:53 UTC (Fri) by andresfreund (subscriber, #69562) [Link] (3 responses)

> A source of bottlenecks, Ryhl said, is calling into C to access tiny functions. Performance can be improved by rewriting those functions in Rust, but that leads to duplicated code and the associated maintainability problems.

Could that, at least partially, be addressed by compiling with llvm and enabling LTO? Or is the glue code too much of an optimization barrier?

A Rust implementation of Android's Binder

Posted Dec 1, 2023 1:56 UTC (Fri) by nksingh (subscriber, #94354) [Link]

Yeah, I had a similar question. I'd guess that these tiny functions are in headers because I understand the linux kernel generally builds without LTO. If that's the case, I wonder if just those functions could be put in the same LLVM compilation module as the rust code.

A Rust implementation of Android's Binder

Posted Dec 1, 2023 14:50 UTC (Fri) by willy (subscriber, #9762) [Link] (1 responses)

Here's a conversation about exactly that:

https://lore.kernel.org/linux-fsdevel/20231129152305.GB23...

A Rust implementation of Android's Binder

Posted Dec 2, 2023 13:10 UTC (Sat) by foom (subscriber, #14868) [Link]

That thread looks like a slightly different thing: auto-translating the C headers into Rust, so they can be compiled by the Rust compiler.

But, I think what the comment above was suggesting was to compile the C inline functions using Clang into LLVM IR bitcode, and then merge that Clang bitcode and Rust code together when compiling to an object file.

Of course if you build the whole kernel with LTO or ThinLTO this would happen automatically. But you should be able to arrange it to happen on a smaller file-by-file scale only for importing C inline functions, I bet.

A Rust implementation of Android's Binder

Posted Dec 2, 2023 15:34 UTC (Sat) by bangert (subscriber, #28342) [Link]

I would be interested to know if and how the effort to port Binder to Rust is or will be related to Fuchsia
https://en.wikipedia.org/w/index.php?title=Google_Fuchsia
which says: As of 2022, most of Fuchsia's code is written in Rust.


Copyright © 2023, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds