Security patterns and anti-patterns in embedded development
When it comes to security, telling developers to do (or not do) something can be ineffective. Helping them understand the why behind instructions, by illustrating good and bad practices using stories, can be much more effective. With several such stories Marta Rybczyńska fashioned an interesting talk about patterns and anti-patterns in embedded Linux security at the Embedded Open Source Summit (EOSS), co-located with Open Source Summit North America (OSSNA), on April 16 in Seattle, Washington.
Rybczyńska started the talk by discussing her relevant experience as a security consultant, as well as her background as a developer by training. (Though not mentioned, she is a frequent guest author for LWN as well.) She then moved on to her picks of recent, high-profile examples, including bricked trains, HTTP/2 protocol-implementation issues, leaked signing keys, and the XZ backdoor.
The little engines that couldn't
The first story touched on embedded security practices for devices where security and safety are of the utmost priority: trains. A Polish railway operator, the Lower Silesian Railway (LSR), encountered problems with trains purchased from Newag. LSR sent the trains out to be serviced by Serwis Pojazdów Szynowych (SPS), a competing train maintenance provider, rather than Newag. SPS found that the trains would no longer start, and brought in a third party to investigate the software used in the trains. (A presentation on the investigation was given at the 37th Chaos Communication Congress in 2023.)
What they found, Rybczyńska said, was that the trains were intentionally bricked under certain conditions: for example, when a train had been stopped for a long period of time, or when its GPS coordinates matched those of SPS repair facilities. The trains were also found to lock up on specific dates (possibly meant to force maintenance), and the researchers discovered "cheat codes" that could unlock the trains.
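None of the firmware is public, but it is easy to picture the sort of check the researchers described. The sketch below is purely illustrative; the coordinates, thresholds, and names are all invented and are not taken from the actual trains.

```python
# Purely hypothetical illustration of the lockout conditions described by
# the researchers; coordinates, thresholds, and names are invented.
from datetime import date, timedelta

# Invented bounding box standing in for a competitor's repair facility
BLOCKED_AREAS = [
    {"lat": (51.00, 51.10), "lon": (16.90, 17.00)},
]
MAX_IDLE = timedelta(days=10)   # invented "stopped for too long" threshold

def should_refuse_to_start(lat, lon, last_run, today=None):
    """Anti-pattern: refuse service based on location or idle time."""
    today = today or date.today()
    for area in BLOCKED_AREAS:
        lat_lo, lat_hi = area["lat"]
        lon_lo, lon_hi = area["lon"]
        if lat_lo <= lat <= lat_hi and lon_lo <= lon <= lon_hi:
            return True   # parked at a "forbidden" facility
    if today - last_run > MAX_IDLE:
        return True       # stopped long enough to suggest third-party repair
    return False
```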
Rybczyńska identified several anti-patterns in this story. In addition to the apparent ransomware built into the firmware, the trains also suffered from more general flaws. She mentioned that nearly every train had a different firmware version, with no indication of version control. "So we can have some doubts about the quality of the development process, right?" One might, she said, also have doubts about the certification process that failed to detect any of these problems before certifying the trains for use in public transit. "I don't have access to the certification documents, so I'm not able to say what they're checking, but that's an interesting part."
Finally, Rybczyńska questioned the ethics of the developers for including functionality that would prevent third-party repairs and parts, or allow disabling a train according to its location. "Especially for the GPS conditions, because that is pretty obvious what it is going to do."
HTTP/2 implementations
Having ridden the train story to its conclusion, she then switched tracks to HTTP/2 implementations in embedded systems. She looked at CVE-2023-44487, an HTTP/2 rapid-reset flaw that impacted NGINX, nghttp2, Apache Tomcat, Apache Traffic Server, and others. In this attack, a client sends multiple requests and then cancels them in rapid succession, causing the server to do extra work processing the requests. This can lead to a denial of service, as the server becomes unable to process new incoming requests.
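The mitigation most of the affected servers adopted amounts to rate-limiting stream cancellations. A minimal, hypothetical sketch of such a guard might look like the following; the threshold, window, and structure are invented for illustration and are not taken from any of the servers named above.

```python
# Hypothetical rapid-reset guard: connections that cancel too many streams
# in a short window are treated as abusive. Threshold and window invented.
import time

class ResetGuard:
    def __init__(self, max_resets=100, window=1.0):
        self.max_resets = max_resets   # cancellations tolerated per window
        self.window = window           # seconds
        self.resets = {}               # connection id -> recent reset times

    def on_rst_stream(self, conn_id):
        """Return True if the connection should be closed as abusive."""
        now = time.monotonic()
        recent = [t for t in self.resets.get(conn_id, [])
                  if now - t < self.window]
        recent.append(now)
        self.resets[conn_id] = recent
        return len(recent) > self.max_resets
```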
Part of the problem, said Rybczyńska, was a weakness in the HTTP/2 protocol itself. That does not excuse the vulnerabilities, however. She said developers were responsible for not just implementing standards, but anticipating what might happen. "The protocol is not protecting you from everything." (LWN has recently covered continuation-flood attacks on HTTP/2 that might have been prevented with better implementations of the protocol.)
She also asserted that web servers written for embedded systems were "way less affected than the other ones" because they are subject to more stringent resource allocations. Her thesis was that software written for resource-constrained systems, such as embedded systems, would be less likely to be vulnerable to some attacks. As an example of this, she cited lighttpd, a web server designed for low-resource usage compared to other popular web servers. Lighttpd is not considered vulnerable to CVE-2023-44487 in its default configuration. What it did differently, she said, was to process HTTP/2 frames in batches and set a limit of eight streams per client, rather than 100 or greater as recommended by the RFC. This meant that an attack that debilitated other web servers merely caused lighttpd to increase its resource usage.
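Lighttpd's actual logic lives in its C HTTP/2 state machine, but the effect of a small concurrent-stream cap is simple to sketch; the names and structure below are invented for illustration.

```python
# Sketch of a per-connection cap on concurrent HTTP/2 streams, in the
# spirit of lighttpd's limit of eight; names and structure are invented.
MAX_CONCURRENT_STREAMS = 8   # the RFC recommends allowing 100 or more

class Http2Connection:
    def __init__(self):
        self.open_streams = set()

    def on_headers(self, stream_id):
        if len(self.open_streams) >= MAX_CONCURRENT_STREAMS:
            return "REFUSED_STREAM"   # cheap refusal instead of extra work
        self.open_streams.add(stream_id)
        return "ACCEPTED"

    def on_end_or_reset(self, stream_id):
        self.open_streams.discard(stream_id)
```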
Watch your keys
Next, she turned to an embarrassing incident for hardware vendor MSI from early 2023. The company was the subject of a ransomware attack and data breach to the tune of 1.5TB of data. The stolen data included source code, firmware, and, perhaps worst of all, image-signing keys for UEFI firmware. Used correctly by a hardware vendor, the signing keys would allow the Intel Boot Guard system to verify firmware before loading it at boot time. In the hands of attackers, they would allow distribution of UEFI bootkits that can bypass secure boot features on MSI devices using 11th- through 13th-generation Intel CPUs.
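Boot Guard's checks happen in hardware and early firmware, but the underlying trust model is ordinary public-key signature verification. The rough sketch below uses Python's cryptography package and an RSA key purely for illustration; it is not MSI's or Intel's actual code, and it shows why possession of the private key is all an attacker needs.

```python
# Illustrative signature check only; real Boot Guard verification is done
# in hardware/microcode. Assumes an RSA key for simplicity.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def firmware_is_trusted(firmware: bytes, signature: bytes,
                        pubkey_pem: bytes) -> bool:
    public_key = serialization.load_pem_public_key(pubkey_pem)
    try:
        public_key.verify(signature, firmware,
                          padding.PKCS1v15(), hashes.SHA256())
        return True
    except InvalidSignature:
        return False

# Anyone holding the leaked private key can produce a signature that makes
# firmware_is_trusted() return True for a malicious image.
```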
Rybczyńska identified several anti-patterns in this story that embedded developers should take pains to avoid. Firstly, she noted that the keys could not have been well-secured if they were caught up in a general data breach. Signing keys, she pointed out, should not be on a machine connected to a company's general network. Ideally, they would be stored on hardware tokens or systems that are air-gapped from the main network to reduce the chance they could be exfiltrated.
Better protection of signing keys could have prevented their exposure, but it's not a guarantee. MSI's other sin, in this case, was that the keys had no revocation mechanism. This means attackers can attempt to exploit any of the affected hardware through the entire life of the systems, with no way for MSI or Intel to revoke the vulnerable keys. The one positive in this story, she said, was that MSI had used a separate key for each product rather than a single signing key for all of its products.
XZ, of course
The XZ backdoor episode was a dominant topic at EOSS and OSSNA. If things had gone a bit differently, Rybczyńska said, the backdoor might have been caught by the Yocto project, because XZ versions 5.6.0 and 5.6.1 broke its builds. It went unnoticed only because Yocto's build-system maintainer did not have time to investigate why the builds were failing before the backdoor was discovered elsewhere.
The reason, or one reason, that the compromised versions of XZ wouldn't build is that Yocto does not use the build scripts provided with the source tarball, in part because Yocto targets a broader set of compilers and architectures than mainstream Linux distributions. She surveyed the room and asked how many people really understood Autoconf's m4 scripts. In a room with about 100 attendees, few hands went up. "That's the issue," she said, "hide your backdoor in m4 scripts." Developers, she said, should be using build-system languages that aren't obscure and difficult to read.
She also called out, like many others, that developers need to consider their dependencies. She suggested that having a dependency on a project with a sole maintainer who is underfunded and overworked is something to be wary of. "It is important to consider your dependencies. Those projects maintained by the person in Nebraska, are you really sure you want to use them?"
Rybczyńska wrapped up by summing up some of the lessons learned from the stories in her talk, and reminded the audience that security practices evolve from real-world situations. "Security practices are there for a reason [...] if there's a security practice that is making your life harder, ask the security person why" it exists and see if there's another way to mitigate risk. Odds are, there's a story behind the practice.
[Thanks to the Linux Foundation, LWN's travel sponsor, for supporting our travel to this event.]
Index entries for this article:
Conference: Embedded Open Source Summit/2024
Conference: Open Source Summit North America/2024
Posted Apr 30, 2024 16:15 UTC (Tue)
by rweikusat2 (subscriber, #117920)
[Link] (11 responses)
How many people know that meta/classes/devtool-source.bbclass¹ exists and how many of these understand it?
¹ From the dated Yocto version I've worked with last.
Posted Apr 30, 2024 16:59 UTC (Tue)
by dskoll (subscriber, #1630)
[Link]
This. I haven't used Yocto directly, but I did use Xilinx's PetaLinux, which is built on top of Yocto, and it was a complete and utter mess. Over-engineered, under-documented, and requiring Internet searches to do just about anything.
Posted Apr 30, 2024 17:50 UTC (Tue)
by Paf (subscriber, #91811)
[Link] (4 responses)
Posted Apr 30, 2024 18:03 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (2 responses)
I think the bigger thing here is that Yocto doesn't take upstream's build system; it rewrites it, instead. As a result, backdoors in the build system don't get run, and there's a human tracing through how the build system works in order to redo the important bits in Yocto's system. Because the build system is "executed" by a human, a backdoor that depends on the build system will get caught in that execution.
Posted Apr 30, 2024 19:18 UTC (Tue)
by dezgeg (subscriber, #92243)
[Link]
Posted Apr 30, 2024 21:16 UTC (Tue)
by rweikusat2 (subscriber, #117920)
[Link]
Posted Apr 30, 2024 21:27 UTC (Tue)
by rweikusat2 (subscriber, #117920)
[Link]
Posted Apr 30, 2024 18:23 UTC (Tue)
by atai (subscriber, #10977)
[Link] (4 responses)
Posted Apr 30, 2024 22:27 UTC (Tue)
by willy (subscriber, #9762)
[Link] (3 responses)
When it's something I don't know like m4, cmake or Bazel, I'm screwed. And honestly when it's something like Bazel which appears to exist Because Google Is Better At Everything Than You Are, I am disinclined to learn.
Posted May 1, 2024 11:47 UTC (Wed)
by smurf (subscriber, #17840)
[Link] (1 responses)
Funnily enough, that last step often is looking for the precise flavor of the pthread[s] library, which habitually "fails" because the check for the 'wrong' variant prints an error message. A web search for pthreads "breaking" your cmake script yields a heap of confused examples.
Makefiles aren't exactly anti-pattern-free either. You can make them arbitrarily complex, if not NP-complete. Look at the Linux kernel's build system if you need an example.
Posted May 3, 2024 1:06 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
Posted May 3, 2024 9:40 UTC (Fri)
by mss (subscriber, #138799)
[Link]
Posted Apr 30, 2024 19:40 UTC (Tue)
by epa (subscriber, #39769)
[Link] (8 responses)
There are forces pulling in two opposing directions. On the one hand you want to build from pristine sources, straight out of git, and not from the traditional release tarball. That argues for doing the full autoconf setup each time, rather than relying on a configure script that the maintainer has generated. On the other hand, you want the build to have as few steps as possible and to be understandable in its entirety without having to know m4 and all that.
Just possibly the answer could be to take a step further away from the original sources. Nowadays there is less variety among Unix-like systems, or at least those that 99% of developers care about. Your contributors are not using a mixture of Irix, AIX and old SunOS versions. People using those obscure systems might still run the configure script, but for everyone else why not ship a pre-generated makefile that assumes sensible defaults for a GNU/Linux system? Simple customizations like install root could be set via environment variables. Then Linux distributions could pull the sources, delete the autoconf/ subdirectory just to make sure it’s not used, and build the rest in a predictable way.
Is that at all feasible? Or is it like Microsoft Office, where we agree that only 10% of Autoconf’s functionality is needed, but nobody can agree which 10%?
Posted Apr 30, 2024 20:20 UTC (Tue)
by Paf (subscriber, #91811)
[Link]
Shell-independent shell scripting, babe-y. It's a thing of beauty. Or at least ... a thing.
Posted Apr 30, 2024 20:23 UTC (Tue)
by Paf (subscriber, #91811)
[Link] (3 responses)
The problem - or so it seems to me - is that configure is the main way builds find and report missing dependencies. So I'm building X and the build requires totally-reasonable-but-not-universal library Y (or god forbid it requires obscure library Z). Those dependencies are generally expressed via the configure script. Seems like a non-starter.
Posted Apr 30, 2024 20:50 UTC (Tue)
by epa (subscriber, #39769)
[Link] (2 responses)
There will be exceptions, but generally I think you could define a default build that requires most of the dependencies without having to sniff whether they are available. If that’s not flexible enough for everyone, some will stay using the configure script.
Posted May 1, 2024 7:16 UTC (Wed)
by epa (subscriber, #39769)
[Link] (1 responses)
Autodetection found that libfoo is present on this system. This configure script was invoked with --strict. The optional dependency 'libfoo' has not been specified. You must pass either --with-libfoo or --without-libfoo.
Then build systems for Linux distributions would tend to use the --strict flag and nail down exactly what dependencies they want, leaving nothing to autodetection. That would have stopped the xz attack where the configure script stopped including the Landlock dependency and nobody noticed.
Posted May 3, 2024 1:22 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
Posted May 1, 2024 15:35 UTC (Wed)
by mb (subscriber, #50428)
[Link] (2 responses)
Today there is pretty much *no* good reason for 95% of the autotools mess.
Almost all projects don't care whether the build system works on obscure operating systems of the past. Yet most code and checks in autotools are there because of some obsolete and obscure operating system that it used to support (or claims to still support). Your app won't work on these operating systems *anyway* if you have never actually tested that.
Seriously, if you are still using autotools, you need to migrate away from that mess.
Having autotools in a project is an anti-quality indicator.
Posted May 1, 2024 16:07 UTC (Wed)
by paulj (subscriber, #341)
[Link] (1 responses)
- If a project's build system is not generally a series of declarative statements, with no more than a modicum of small, confined, and clear ad-hoc logic (for whatever transforms or tests are needed), then that is an anti-pattern.
It is quite possible to have a small, clean, fast, declarative build system in auto*, and it's possible to make a mess in pretty much any build-system tool. The anti-pattern is the mess. The anti-pattern is the willingness of the people who made the build system to engage in dirty hacks, rather than read the documentation of the tool and do it properly (whether using the features of the tool properly or extending it cleanly).
Basically: You can evaluate a project simply on the amount of ad-hoc logic they've stuffed into their build system, and the actual build system doesn't really matter that much to this.
Posted May 6, 2024 10:26 UTC (Mon)
by LtWorf (subscriber, #124958)
[Link]
Posted May 6, 2024 10:28 UTC (Mon)
by LtWorf (subscriber, #124958)
[Link]
I fear that in practice, they are kept in a vault so that they can be used automatically, and are vulnerable to being exfiltrated by anyone who can place an "echo $PRIVKEY" in the appropriate place.
meson seems to be this decade's build system of choice for OSS projects.
cmake was more of a year-2010 thing.
>test, and Makefile generation is indeed obscure to all but the most advanced wizards.
>There are good reasons for that—it has a messy job to do.
Some projects have started to migrate away from it like 20 years ago.