Topics from the LLVM microconference

By Jake Edge
August 26, 2015

A persistent theme throughout the LLVM microconference at the 2015 Linux Plumbers Conference was that of "breaking the monopoly" of GCC, the GNU C library (glibc), and other tools that are relied upon for building Linux systems. One could quibble with the "monopoly" term, since it is self-imposed and not being forced from the outside, but the general idea is clear: using multiple tools to build our software will help us in numerous ways.

Kernel and Clang

Most of the microconference was presentation-oriented, with relatively little discussion. Jan-Simon Möller kicked things off with a status report on the efforts to build a Linux kernel using LLVM's Clang C compiler. The number of patches needed for building the kernel has dropped from around 50 to 22 "small patches", he said. Most of those are in the kernel build system or are for little quirks in driver code. Of those, roughly two-thirds can likely be merged upstream, while the others are "ugly hacks" that will probably stay in the LLVM-Linux tree.

There are currently five patches needed in order to build a kernel for the x86 architecture. Two of those are for problems building the crypto code (the AES_NI assembly code will not build with the LLVM integrated assembler and there are longstanding problems with variable-length arrays in structures). The integrated assembler also has difficulty handling some "assembly" code that is used by the kernel build system to calculate offsets; GCC sees it as a string, but the integrated assembler tries to actually assemble it.

The goal of building an "allyesconfig" kernel has not yet been realized, but a default configuration (defconfig) can be built using the most recent Git versions of LLVM and Clang. It currently requires disabling the integrated assembler for the entire build, but the goal is to disable it just for the files that need it.

Other architectures (including ARM for the Raspberry Pi 2) can be built using roughly half-a-dozen patches per architecture, Möller said. James Bottomley was concerned about the "Balkanization" of kernel builds once Linus Torvalds and others start using Clang for their builds; obsolete architectures and those not supported by LLVM may stop building altogether, he said. But microconference lead Behan Webster thought that to be an unlikely outcome. Red Hat and others will always build their kernels using GCC, he said, so that will be supported for quite a long time, if not forever.

Using multiple compilers

Kostya Serebryany is a member of the "dynamic testing tools" team at Google, which has the goal of providing tools for the C++ developers at the company to find bugs without any help from the team. He was also one of the proponents of the "monopoly" term for GCC, since it is used to build the kernel, glibc, and all of the distribution binaries. But, he said, making all of that code buildable using other compilers will allow various other tools to also be run on the code.

For example, the AddressSanitizer (ASan) can be used to detect memory errors such as stack overflow, use after free, using stack memory after a function has returned, and so on. Likewise, ThreadSanitizer (TSan), MemorySanitizer (MSan), and UndefinedBehaviorSanitizer (UBSan) can find various kinds of problems in C and C++ code. ~~But all are based on Clang and LLVM, so only code that can be built with that compiler suite can be sanitized using these tools.~~

GCC already has some similar tools and the Linux kernel has added some as well (the kernel address sanitizer, for example), which have found various bugs, including quite a few security bugs. GCC's support has largely come about because of the competition with LLVM and still falls short in some areas, he said.
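As a rough illustration of the kind of memory error ASan is meant to catch (this is not an example from the talk), consider the sketch below. The function name `sum_first()` and the four-element array are made up for the illustration. The `-fsanitize=address` option is accepted by both Clang and recent GCC releases.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: sums the first `count` entries of a four-element
 * heap array. Deliberate bug: `count` is never checked against the array
 * size, so any count greater than 4 reads past the end of the allocation.
 */
int sum_first(size_t count)
{
    int *values = malloc(4 * sizeof *values);
    int total = 0;

    if (values == NULL)
        return -1;

    for (size_t i = 0; i < 4; i++)
        values[i] = (int)(i + 1);      /* values = {1, 2, 3, 4} */

    for (size_t i = 0; i < count; i++)
        total += values[i];            /* out-of-bounds read when count > 4 */

    free(values);
    return total;
}
```

When the file is built with something like `clang -g -fsanitize=address`, a call such as `sum_first(5)` should produce a heap-buffer-overflow report at run time; without the sanitizer, the same call will often appear to work.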

There are other techniques, though, that go beyond "best effort" tools like the sanitizers. Fuzzing and hardening, for example, can be used to find more bugs and to eliminate certain classes of bugs, respectively. He stated that coverage-guided fuzzing can be used to home in on problem areas in the code. LLVM's LibFuzzer can be used to perform that kind of fuzzing. He noted that the Heartbleed bug can be "found" using LibFuzzer in roughly five seconds on his laptop.
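To give a feel for what a LibFuzzer target looks like, here is a minimal sketch. The entry point `LLVMFuzzerTestOneInput()` is LibFuzzer's real interface, but the toy `echo_heartbeat()` function and its message layout are invented for this illustration; they only imitate the class of bug behind Heartbleed (trusting a length field supplied by the peer) and are not OpenSSL code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical toy protocol: byte 0 holds the claimed payload length and
 * the payload follows. `reply` must have room for at least 255 bytes.
 * Deliberate bug: the claimed length is never compared against the number
 * of bytes that actually arrived, so a short message with a large length
 * byte makes memcpy() read past the end of `msg`.
 */
static size_t echo_heartbeat(const uint8_t *msg, size_t msg_len, uint8_t *reply)
{
    size_t claimed;

    if (msg_len < 1)
        return 0;

    claimed = msg[0];
    memcpy(reply, msg + 1, claimed);   /* missing check: claimed <= msg_len - 1 */
    return claimed;
}

/* LibFuzzer calls this entry point repeatedly with generated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    uint8_t reply[256];

    echo_heartbeat(data, size, reply);
    return 0;
}
```

With a current Clang, building this with `clang -g -fsanitize=fuzzer,address` and running the resulting binary should let the fuzzer stumble on an over-long length byte quickly, at which point ASan reports the out-of-bounds read. The exact build steps in 2015 were different, so treat the command line as an assumption about today's toolchain.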

Two useful hardening techniques are also available with LLVM: control flow integrity (CFI) and SafeStack. CFI will abort the program when it detects certain kinds of undesired behavior—for example that the virtual function table for a program has been altered. SafeStack protects against stack overflows by placing local variables on a separate stack. That way, the return address and any variables are not contiguous in memory.

Serebryany said that it was up to the community to break the monopoly. He was not suggesting a wholesale switch to LLVM, but rather ensuring that the kernel, glibc, and distributions can all be built with it. Furthermore, he said that continuous integration should be set up so that all of these pieces can always be built with both compilers. When other compilers arrive, they should also be added into the mix.

To that end, Webster asked if Google could help get the kernel patches needed to build with Clang merged upstream. Serebryany said that he thought that, by showing some of the advantages of being able to build with Clang (such as the fuzzing support), Google might be able to help get those patches merged.

BPF and LLVM

The "Berkeley Packet Filter" (BPF) language has expanded its role greatly over the years, moving from simply being used for packet filtering to now providing the in-kernel virtual machine for security (seccomp), tracing, and more. Alexei Starovoitov has been the driving force behind extending the BPF language (into eBPF) as well as expanding its scope in the kernel. LLVM can be used to compile eBPF programs for use by the kernel, so Starovoitov gave a talk on the language and its uses at the microconference.

He began by noting wryly that he "works for Linus Torvalds" (in the same sense that all kernel developers do). He merged his first patches into GCC some fifteen years ago, but he has "gone over to Clang" in recent years.

The eBPF language is supported by both GCC and LLVM using backends that he wrote. He noted that the GCC backend is half the size of the LLVM version, but that the latter took much less time to write. "My vote goes to LLVM for the simplicity of the compiler", he said. The LLVM-BPF backend has been used to demonstrate how to write a backend for the compiler. It is now part of LLVM stable and will be released as part of LLVM 3.7.

GCC is built for a single backend, so you have to specifically create a BPF version, but LLVM has all of its backends available using command-line arguments (--target bpf). LLVM also has an integrated assembler that can take the C code describing the BPF and turn it into in-memory BPF bytecode that can be loaded into the kernel.
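As a sketch of what "C code describing the BPF" might look like, the fragment below is a filter that drops short packets. Everything here is illustrative: `struct pkt_ctx` is a stand-in for the kernel's real context structure (it is not a kernel type), the section name is arbitrary, and the `__bpf__` macro is assumed to be defined by Clang when targeting BPF.

```c
/* Hypothetical stand-in for the context the kernel passes to a filter. */
struct pkt_ctx {
    unsigned int len;        /* packet length in bytes */
    unsigned int protocol;   /* unused in this sketch */
};

/*
 * Loaders typically locate programs by ELF section name; only request the
 * section when building for BPF (assumes Clang defines __bpf__ there).
 */
#if defined(__bpf__)
__attribute__((section("socket_filter"), used))
#endif
int drop_short_packets(struct pkt_ctx *ctx)
{
    /* Socket-filter convention: 0 drops the packet, otherwise the
     * return value is the number of bytes to keep. */
    if (ctx->len < 64)
        return 0;
    return (int)ctx->len;
}
```

With the BPF backend built in, a command along the lines of `clang -O2 -target bpf -c filter.c -o filter.o` should produce an object file containing BPF bytecode rather than native code. The same source also compiles natively, which is convenient for testing the logic outside the kernel.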

BPF for tracing is currently a hot area, Starovoitov said. It is a better alternative to SystemTap and runs two to three times faster than Oracle's DTrace. Part of that speed comes from LLVM's optimizations plus the kernel's internal just-in-time compiler for BPF bytecode.

Another interesting tool is the BPF Compiler Collection (BCC). It makes it easy to write and run BPF programs by embedding them into Python (either directly as strings in the Python program or by loading them from a C file). Underneath the Python "bpf" module is LLVM, which compiles the program before the Python code loads it into the kernel. A simple printk() can easily be added into the kernel without recompiling it (or rebooting). He noted that Brendan Gregg has added a bunch of example tools to show how to use the C+Python framework.

Under the covers, the framework uses libbpfprog, which compiles a C source file into BPF bytecode using Clang/LLVM. The library can also load the bytecode and any BPF maps into the kernel using the bpf() system call and attach the program(s) to various types of hooks (e.g. kprobes, tc classifiers/actions). The Python bpf module simply provides bindings for the library.

The presentation was replete with examples, which are available in the slides [PDF] as well.

Alternatives for the core

There was a fair amount of overlap between the last two sessions I was able to sit in on. Both Bernhard Rosenkraenzer and Khem Raj were interested in replacing more than just the compiler in building a Linux system. Traditionally, building a Linux system starts with GCC, glibc, and binutils, but there are now alternatives to those. How much of a Linux system can be built using those alternatives?

Some parts of binutils are still needed, Rosenkraenzer said. The binutils gold linker can be used instead of the traditional ld. (Other linker options were presented in Mark Charlebois's final session of the microconference, which I unfortunately had to miss.) The gas assembler from binutils can be replaced with Clang's integrated assembler for the most part, but there are still non-standard assembly constructs that require gas.

Tools like nm, ar, ranlib, and others will need to be made to understand three different formats: regular object files, LLVM bytecode, and the GCC intermediate representation. Rosenkraenzer showed a shell-script wrapper that could be used to add this support to various utilities.
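Rosenkraenzer's wrapper was a shell script, and its contents are not reproduced here. Purely as an illustration of the underlying problem, the sketch below shows how a tool might tell some of these formats apart by their leading magic bytes. The function and enum names are invented for this example. GCC's intermediate representation is, as far as I know, stored in sections of otherwise ordinary ELF objects, so it cannot be distinguished by magic number alone and is not handled here.

```c
#include <stddef.h>
#include <string.h>

enum object_kind {
    KIND_UNKNOWN,
    KIND_ELF,            /* regular object file (may also carry GCC LTO data) */
    KIND_LLVM_BITCODE,   /* raw LLVM IR file */
    KIND_AR_ARCHIVE      /* static library; members need checking one by one */
};

/* Classify a file from its first bytes; `len` is how many bytes are valid. */
enum object_kind classify_object(const unsigned char *buf, size_t len)
{
    static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };
    static const unsigned char bc_magic[4]  = { 'B', 'C', 0xC0, 0xDE };
    static const char ar_magic[] = "!<arch>\n";   /* 8 significant bytes */

    if (len >= 8 && memcmp(buf, ar_magic, 8) == 0)
        return KIND_AR_ARCHIVE;
    if (len >= 4 && memcmp(buf, elf_magic, 4) == 0)
        return KIND_ELF;
    if (len >= 4 && memcmp(buf, bc_magic, 4) == 0)
        return KIND_LLVM_BITCODE;
    return KIND_UNKNOWN;
}
```

A wrapper could use a check like this to decide whether to hand a file to the binutils tool directly or to pass along a plugin option or an LLVM equivalent of the tool; which of those the actual script did was not something I noted.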

For the most part, GCC can be replaced by Clang. OpenMandriva switched to Clang as its primary compiler in 2014. The soon-to-be-released OpenMandriva 3 is almost all built with Clang 3.7. Some packages are still built with gcc or g++, however. OpenMandriva still needed to build GCC, though, to get libraries that were needed such as libgcc, libatomic, and others (including, possibly, libstdc++).

The GCC compatibility claimed by Clang is too conservative, Rosenkraenzer said. The __GNUC__ macro definition in Clang is set to 4.2.1, but switching that to 4.9 produces better code. There were two thoughts on why Clang has chosen 4.2.1, though they are related: 4.2.1 was the last GPLv2 release of GCC, so some people may not be allowed to look at later versions; in addition, GCC 4.2.1 was the last version that was used to build the BSD portions of OS X.
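One plausible way the reported version can affect generated code is through version checks in source and headers. The sketch below is not from the talk; the macro `COMPILER_GNUC_VERSION` and the function `swap32()` are made up for the illustration. It gates a builtin on the GCC version, relying on `__builtin_bswap32()` having appeared in GCC 4.3.

```c
#include <stdint.h>

/* Fold the three GCC version macros into one comparable number. */
#if defined(__GNUC__)
# define COMPILER_GNUC_VERSION \
    (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
# define COMPILER_GNUC_VERSION 0
#endif

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t x)
{
#if COMPILER_GNUC_VERSION >= 40300
    /* Taken by GCC 4.3 and later. */
    return __builtin_bswap32(x);
#else
    /* Taken by any compiler reporting an older version, which includes
     * a Clang that claims to be GCC 4.2.1. */
    return (x >> 24) |
           ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) |
           (x << 24);
#endif
}
```

A Clang that reports 4.2.1 lands in the fallback branch even though it supports the builtin. The optimizer may well recover the same instruction in a case this simple, but more elaborate fallbacks are less likely to fare as well. Code written with Clang in mind can test for the feature directly with `__has_builtin` instead of comparing version numbers.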

There is a whole list of GCC-isms that should be avoided for compatibility with Clang. Rosenkraenzer's slides [PDF] list many of them. He noted that a number of bugs have been found via Clang warnings or errors when building various programs; GCC did not complain about those problems.

Another "monopoly component" that one might want to replace would be glibc. The musl libc alternative is viable, but only if binary compatibility with other distributions is not required. But musl cannot be built with Clang, at least not yet.

Replacing GCC's libstdc++ with LLVM's libc++ is possible but, again, binary compatibility is sacrificed. That is a bigger problem than it is for musl, though, Rosenkraenzer said. Using both is possible, but there are problems when libraries (e.g. Qt) are linked to, say, libc++ and a binary-only Qt program uses libstdc++, which leads to crashes. libc++ is roughly half the size of libstdc++, however, so environments like Android (which never used libstdc++) are making the switch.

Cross-compiling under LLVM/Clang is easier since all of the backends are present and compilers for each new target do not need to be built. There is still a need to build the cross-toolchains, though, for binutils, libatomic, and so on. Rosenkraenzer has been working on a tool to do automated bootstrapping of the toolchain and core system.

Conclusion

It seems clear that use of LLVM within Linux is growing and that growth is having a positive effect. The competition with GCC is helping both to become better compilers, while building our tools with both is finding bugs in critical components like the kernel. Whether it is called "breaking the monopoly" or "diversifying the build choices", this trend is making beneficial changes to our ecosystem.

[I would like to thank the Linux Plumbers Conference organizing committee for travel assistance to Seattle for LPC.]


Topics from the LLVM microconference

Posted Aug 27, 2015 12:42 UTC (Thu) by ehiggs (subscriber, #90713) [Link] (3 responses)

> But all are based on Clang and LLVM, so only code that can be built with that compiler suite can be sanitized using these tools.

GCC has UBSan and TSan so I think this line is incorrect.

Topics from the LLVM microconference

Posted Aug 27, 2015 12:47 UTC (Thu) by jwakely (subscriber, #60262) [Link]

And also Asan. Using the same code:

https://gcc.gnu.org/viewcvs/gcc/trunk/libsanitizer/

Topics from the LLVM microconference

Posted Aug 27, 2015 13:00 UTC (Thu) by JGR (subscriber, #93631) [Link]

>> But all are based on Clang and LLVM, so only code that can be built with that compiler suite can be sanitized using these tools.
> GCC has UBSan and TSan so I think this line is incorrect.

GCC also supports AddressSanitizer and LeakSanitizer.
To the best of my knowledge the only one that isn't supported is MemorySanitizer.

Topics from the LLVM microconference

Posted Aug 27, 2015 22:50 UTC (Thu) by jake (editor, #205) [Link]

[ oops, i just realized, i never actually published this, though I created it a half-dozen hours ago. ]

It is certainly possible that I misunderstood what was said. Clearly most of the sanitizers are shared, so I struck through the offending line.

thanks,

jake

Topics from the LLVM microconference

Posted Aug 27, 2015 13:15 UTC (Thu) by jwakely (subscriber, #60262) [Link] (2 responses)

> Rosenkraenzer's slides [PDF] list many of them. He noted that there have been a number of bugs found via Clang warnings or errors when building various programs—GCC did not complain about those problems.

Which would be interesting if it was true, but GCC does complain about both his examples:

w.c:6:31: warning: ‘sizeof’ on array function parameter ‘n’ will return size of ‘char *’ [-Wsizeof-array-argument]
w.c:6:31: warning: argument to ‘sizeof’ in ‘memcmp’ call is the same pointer type ‘char *’ as the first source; expected ‘char’ or an explicit length [-Wsizeof-pointer-memaccess]

w.c:21:11: warning: the address of ‘a’ will always evaluate as ‘true’ [-Waddress]

At least he's removed the C++ accepts-invalid example he used at FOSDEM, which was also not true (although that might have been simplified from an original using templates, and it's true that G++ doesn't always do access checking correctly inside templates).

Topics from the LLVM microconference

Posted Aug 27, 2015 15:08 UTC (Thu) by lsl (subscriber, #86508) [Link] (1 responses)

Maybe it's another one of the bogus comparisons where a current yet-unreleased Clang SVN snapshot is compared against some vintage GCC from nearly 10 years ago because "teh GPLv3 is evil!". It really gets annoying sometimes.

Topics from the LLVM microconference

Posted Aug 29, 2015 15:31 UTC (Sat) by krakensden (subscriber, #72039) [Link]

Not using OS X is practically thoughtcrime now. Presumably the source of the error.

GPL3 phobia

Posted Aug 27, 2015 13:32 UTC (Thu) by fuhchee (guest, #40059) [Link] (1 responses)

"4.2.1 was the last GPLv2 release of GCC, so some people may not be allowed to look at later versions"

Can someone elaborate why someone may be worried about looking at GPL3 versions of GCC?

GPL3 phobia

Posted Aug 27, 2015 19:06 UTC (Thu) by HIGHGuY (subscriber, #62277) [Link]

I can think of 2 options:
- libgcc (GPLv3 with static link exception) being compiled in by default might still scare many corporate lawyers.
- compiler plugins

Topics from the LLVM microconference

Posted Aug 29, 2015 21:15 UTC (Sat) by tdz (subscriber, #58733) [Link] (3 responses)

Reading the article left me with the impression that LLVM is mostly about not being GCC. Picking a third-party libc just to not use Glibc makes them look like jerks to me. I guess I'm wrong here, but I still don't like the attitude.

Topics from the LLVM microconference

Posted Sep 4, 2015 0:53 UTC (Fri) by ssokolow (guest, #94568) [Link]

In my case, I just want to have a concise, conversation-ending argument ("There's no GNU componentry on my Linux system and it's still fully ABI-compatible") to use against preachy "It's GNU/Linux!" people.

Right now, I'm stuck with:

- Even with GCC (which isn't in your average desktop loadout by default), there's more X.org than GNU in a typical distro and X11/GNU/Linux is even less likely, so, if anything, it'd be X11/Linux, not GNU/Linux.
- Stallman intentionally ignores anything beyond what's needed to run Emacs in a text-mode terminal (eg. a GUI like X11) when drawing the boundary between 'OS' and 'extra apps'.
- While it IS the ABI that people care about (which is why Android isn't "Linux"), you'll find that application build strings tend to say things like 'X11; Linux', not 'GNU; Linux'. Convince them first.
- "GNU/Linux" has too many syllables (more than three) and people prefer to use the same terminology in speech and print.
- "It's hard enough getting people to call it 'Linux' rather than 'Ubuntu'. You've already lost."

Topics from the LLVM microconference

Posted Sep 4, 2015 14:04 UTC (Fri) by mlopezibanez (guest, #66088) [Link] (1 responses)

The majority of LLVM folks are pragmatists: If it works better for their purposes, just use it, it doesn't matter if it is not copyleft (or even proprietary). Many of them would not work on LLVM (nor GCC) if they were not paid for it. Their companies see LLVM as a way to monetize open-source by incorporating it into their proprietary software, thus they fund their work. (Others just want the press visibility of being associated with the new cool kid).

However, some people pushing for LLVM are anti-copyleft, anti-GNU, anti-FSF and disdain Richard Stallman. These people have seen an opening to attack GNU/FSF and they are going for the jugular. They could not care less if the alternatives are inferior or impractical.

Topics from the LLVM microconference

Posted Sep 4, 2015 17:54 UTC (Fri) by pizza (subscriber, #46) [Link]

> Their companies see LLVM as a way to monetize open-source by incorporating it into their proprietary software, thus they fund their work

As long as said companies contribute work back to LLVM proper, we're all better off -- but from an end-user's perspective, whether said proprietary software is built on top of LLVM or not, it's still proprietary software. The end-user is still dependent on the vendor to fix problems or keep it updated with whatever upstream LLVM does.

> However, some people pushing for LLVM are anti-copyleft, anti-GNU, anti-FSF and disdain Richard Stallman. These people have seen an opening to attack GNU/FSF and they are going for the jugular.

Unfortunately, the biggest backers of LLVM fall under this "anything-but-GNU" umbrella. In particular, a certain Fruit company based in Cupertino...

There's also a third group -- Folks who develop other free/copyleft software, but use LLVM due to some technical superiority over GCC making it more suitable for what they are trying to do. In particular, LLVM has better modularity and is much easier to embed into other software. Mesa is an example of Free Software that uses LLVM for things that GCC is technically incapable of doing.

Topics from the LLVM microconference

Posted Aug 31, 2015 10:44 UTC (Mon) by nix (subscriber, #2304) [Link] (2 responses)

> BPF for tracing is currently a hot area, Starovoitov said. It is a better alternative to SystemTap and runs two to three times faster than Oracle's DTrace. Part of that speed comes from LLVM's optimizations plus the kernel's internal just-in-time compiler for BPF bytecode.

This claim seems exceptionally unlikely to me. Interpreting DOF is really not an expensive operation: it's just a switch plus some very simple prologue/epilogue code for shuffling the arguments and return value into place plus the code needed to actually do what the DOF has asked, and most DTrace uses I've seen (even Brendan's! :) ) have no probes with anything longer than a few hundred opcodes attached to them: lacking loops and with only non-nested analogues of conditionals, D is not a language in which one would write something long or complicated enough to need optimization. All of DOF interpretation plus all the buffer management is going to be hugely dominated by the cost of taking a trap (for sdt/usdt) or a ring transition into kernel space (for systrace), so this only really applies to fbt, and if he's tested fbt on Linux I'd be quite astonished since it only exists on one person's computer so far.

But it may be true! It's possible that LLVM's native code for argument marshalling is better than the handwritten stuff DTrace uses, and it's just barely possible that in some synthetic workloads this dominates. If there's some actual data showing it, particularly if it's relevant outside pure benchmarks, I'd be fascinated to see it.

Topics from the LLVM microconference

Posted Aug 31, 2015 16:46 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Also consider locality and branch prediction - compiled eBPF code has no need to access other data or to branch (in most cases).

Topics from the LLVM microconference

Posted Sep 1, 2015 8:00 UTC (Tue) by nix (subscriber, #2304) [Link]

Hm. True enough. Accessing other data is unlikely to be relevant -- it's likely to be in L1 cache (it's almost always parameters of the containing function or other data that the kernel has just touched). Branches, though... the question is, why do branches dominate? The huge number of branches the kernel does anyway is *still* likely to dwarf them in anything but, say, synthetic benchmarks of tracing getpid() or something near-empty like that.

Really, without knowing the benchmark I'm left grasping in the dark.

(As for fixing it... branches could definitely be reduced, or predicted, I suppose, at least in the hot spots. We haven't really done much performance optimization of this bit of the system -- the assumption has been that getting into dtrace_probe() would almost always be the expensive part. So there is surely room for improvement here.)

Topics from the LLVM microconference

Posted Sep 3, 2015 18:53 UTC (Thu) by dalias (guest, #95815) [Link]

The article says musl can't be built with clang, which should not be the case. There were clang bugs that affected musl in the past but as far as I know they were all resolved. Was this just a mistake or is there a new problem I'm not aware of?