Systemd as tragedy
Tragedy has been defined as "a form of drama based on human suffering that invokes an accompanying catharsis or pleasure in audiences". Benno Rice took his inspiration from that definition for his 2019 linux.conf.au talk on the story of systemd which, he said, involves no shortage of suffering. His attempt to cast that story for the pleasure of his audience resulted in a sympathetic and nuanced look at a turbulent chapter in the history of the Linux system.
Rice was also influenced by Aurynn Shaw's writing on "contempt culture". According to Shaw, people use contempt (of developers using a different programming language, for example) as a social signifier, a way of showing that they belong to the correct group. This sort of contempt certainly plays into this story, where large groups identify themselves primarily by their disdain for systemd and those who work with it. A related concept is change, or the resistance thereto. The familiar is comfortable, but it isn't necessarily good, especially if it has been around for a long time.
The roots of the tragedy
The ancestry of systemd, he said, is tied to the origin of Unix, which was "a happy accident" — a reaction to the perceived complexity of the systems that came before. It was brutally simple in all regards, including how its user space was bootstrapped. Putting up an early init man page, he called out the "housekeeping functions" that it was designed to carry out, including mounting filesystems and starting daemons. Those are two distinct tasks, but they had been lumped together into this one process.
In those days, there were few daemons to worry about; cron, update (whose job was to write out the filesystem superblocks occasionally), and the init process itself listening on a few terminals were about it. By the time that 4BSD came around, Unix had gained a proper getty daemon, network daemons like routed and telnetd, and the "superdaemon" inetd. That is where things started to get interesting, but it still worked well enough for a while.
Then the Internet happened. Using inetd worked well enough for small amounts of traffic, but then the World Wide Web became popular and it was no longer possible to get away with forking a new process for every incoming connection. Sites on the net started running databases and other systems with a great deal of stored state that could not go away between connections. All this shifted the notion of a daemon toward "services", which are a different beast. Old-style init could start services, but was pretty much useless thereafter.
Part of the problem was the conflation of services and configuration. Tasks like mounting filesystems are of the latter variety; they are generally done once at boot time and forgotten thereafter. But that approach is not sufficient for automated service management, which requires ongoing attention. Thus we saw the birth of more service-oriented systems like Upstart and systemd. This is something other operating systems figured out a while back. Windows NT had a strong service model from the beginning, he said, and Mac OS has one now in the form of launchd. Other systems had to play a catch-up game to get there.
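To make the distinction concrete, here is a rough sketch (the names are illustrative, not taken from the talk): a line in /etc/fstab is pure configuration, consumed once at boot, while a service needs a description of an ongoing lifecycle, such as a unit with a restart policy.

# /etc/fstab -- static configuration, read once at boot
# device     mountpoint  type  options   dump pass
/dev/sda2    /srv        ext4  defaults  0    2

# example.service -- hypothetical unit describing ongoing service management
[Unit]
Description=Illustrative long-running service
After=network.target

[Service]
ExecStart=/usr/bin/example-daemon
Restart=on-failure

[Install]
WantedBy=multi-user.target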
Apple's launchd showed up in the Tiger release and replaced a whole series of event-handling daemons, including init, cron, and inetd. Systemd, Rice said, was an attempt to take a lot of good ideas from launchd. When Lennart Poettering started thinking about the problem, he first looked at Upstart, which was an event-based system that was still based around scripts, but he concluded that he could do a better job. His "Rethinking PID 1" blog post cited launchd as an example to work from. He was concerned about improving boot speed and the need for the init system to be tuned into the hardware and software changes on a running system. When the Unix init system was designed, systems were static, but the environment in which the operating system runs now is far more dynamic.
The service layer
Classic Unix-like systems are split into two major components: the kernel and user space. But kernels have become more dynamic and changeable over time, responding to the hardware on which they run. That has led to the need for a new layer, the "service layer", to sit between the kernel and user space. This layer includes components like udev and Network Manager, but systemd seeks to provide a comprehensive service layer; that is why it has pulled in functionality like udev over time. It has been quite successful, achieving wide (but not universal) adoption through much of the Linux distribution space, often creating a great deal of acrimony in the process.
There are a number of often-heard arguments against systemd; one of those is that it violates the Unix philosophy. This argument, he said, seems to be predicated on the notion that systemd is a single, monolithic binary. That would indeed be silly, but that's not how systemd is structured. It is, instead, a lot of separate binaries maintained within a single project. As "a BSD person" (he is a former FreeBSD core-team member), Rice thought this pulling-together of related concepts makes sense. The result is not the bloated, monolithic system that some people seem to see in systemd.
Another frequently heard criticism is that systemd is buggy. "It's software" so of course it's buggy, he said. The notion that systemd has to be perfect, unlike any other system, raises the bar too high. At least systemd has reasonable failure modes much of the time, he said. Then, there is the recurrent complaint usually expressed as some form of "I can't stand Lennart Poettering". Rice declined to defend Poettering's approach to community interaction, but he also said that he had to admire Poettering's willpower and determination. Not everybody could have pushed through such a change.
Systemd makes no attempt to be portable to non-Linux systems, which leads to a separate class of complaints. If systemd becomes the standard, there is a risk that non-Linux operating systems will find themselves increasingly isolated. Many people would prefer that systemd stuck to interfaces that were portable across Unix systems, but Rice had a simple response for them: "Unix is dead". Once upon a time, Unix was an exercise in extreme portability that saw some real success. But now the world is "Linux and some rounding errors" (something that, as a FreeBSD person, he finds a little painful to say), and it makes no sense to stick to classic Unix interfaces. The current situation is "a pathological monoculture", and Linux can dictate the terms that the rest of the world must live by.
Systemd has gained a lot from this situation. For example, control groups are a highly capable and interesting mechanism for process management; it would be much harder to do the job without them. They are much more powerful and granular than FreeBSD jails, he said. Developers for systems like FreeBSD can see systemd's use of these mechanisms, and its consequent non-portability, as a threat. But they can also see it as a liberation, freeing them to pursue their own solutions to these problems.
Change and tragedy
The whole systemd battle, Rice said, comes down to a lot of disruptive change; that is where the tragedy comes in. Nerds have a complicated relationship with change: it's awesome when we are the ones creating the change, but it's untrustworthy when it comes from outside. Systemd represents that sort of externally imposed change that people find threatening. That is true even when the change isn't coming from developers like Poettering, who has shown little sympathy toward the people who have to deal with the change that has been imposed on them. That leads to knee-jerk reactions, but people need to step back and think about what they are doing. "Nobody needs to send Lennart Poettering death threats over a piece of software". Contempt is not cool.
Instead, it pays to think about this situation; why did systemd show up, and why is it important? What problem is it solving? One solution for people who don't like it is to create their own alternative; that is a good way to find out just how much fun that task is. Among other things, systemd shows how the next generation doesn't think about systems in the same way; they see things more in terms of APIs and containers, for example.
So what can we learn from systemd? One is that messaging transports are important. Systemd uses D-Bus heavily, which gives it a lot of flexibility. Rice is not a fan of D-Bus, but he is very much a fan of messaging systems. He has been pushing for BSD systems to develop a native message transport, preferably built into the kernel with more security than D-Bus offers. On top of that one can make a proper remote procedure call system, which is a way to make kernel and user-space components operate at the same level. In a properly designed system, a process can simply create an API request without having to worry about where that request will be handled.
Other lessons include the importance of supporting a proper service lifecycle without having to install additional service-management systems to get there. Service automation via APIs is important; systemd has provided much of what is needed there. Support for containers is also important; they provide a useful way to encapsulate applications.
Systemd, he concluded, fills in the service layer for contemporary Linux systems; it provides a good platform for service management, but it certainly does not have to be the only implementation of such a layer. It provides a number of useful features, including painless user-level units and consistent device naming; even the logging model is good, Rice said. Binary logs are not a bad thing as long as you have the tools to pull them apart. And systemd provides a new model of an application; rather than being a single binary, an application becomes a bunch of stuff encapsulated within some sort of container.
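As one small, hedged illustration of those features (the unit name and paths are invented): a user-level unit needs no root access, and journalctl is the tool that pulls the binary logs apart.

# ~/.config/systemd/user/sync-notes.service (hypothetical example)
[Service]
ExecStart=/usr/bin/rsync -a %h/notes/ backup-host:notes/

$ systemctl --user start sync-notes.service
$ journalctl --user -u sync-notes.service   # read the binary journal as text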
The world is changing around us, Rice said. We can either go with that change or try to resist it; one path is likely to be more rewarding than the other. He suggested that anybody who is critical of systemd should take some time to look more closely and try to find one thing within it that they like. Then, perhaps, the catharsis phase of the tragedy will be complete and we can move on.
A video of this talk is available on YouTube.
[Thanks to linux.conf.au and the Linux Foundation for supporting my travel
to the event.]
Index entries for this article
Conference: linux.conf.au/2019
Posted Jan 28, 2019 22:36 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
Posted Jan 29, 2019 4:39 UTC (Tue)
by drag (guest, #31333)
[Link]
But I am hopeful we can get an in-kernel message bus before that happens.
Posted Feb 1, 2019 2:16 UTC (Fri)
by SiliconSlick (guest, #39955)
[Link]
Anyway, in 2038 when the sun blows up and the planet catches on fire (OK, maybe only the latter will happen), and the clock on your computer is broken, then it just won't matter if you run systemd or initd or the shell script you dug up from an old AmigaOS floppy, tragedy is sure to follow. (We can always hope for a good movie or play out of the story.)
Posted Jan 28, 2019 22:41 UTC (Mon)
by mangix (guest, #126006)
[Link] (2 responses)
Disruptive change indeed. The pitchforks won't be as bad though.
Posted Jan 29, 2019 14:02 UTC (Tue)
by mskarbek (guest, #115025)
[Link]
Posted Jan 30, 2019 4:16 UTC (Wed)
by j16sdiz (guest, #57302)
[Link]
The ZFS core is quite platform-independent. Most of the platform-specific code is confined to the SPL.
Posted Jan 29, 2019 0:35 UTC (Tue)
by dgm (subscriber, #49227)
[Link] (7 responses)
Posted Jan 29, 2019 4:24 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Posted Jan 29, 2019 15:05 UTC (Tue)
by nettings (subscriber, #429)
[Link] (4 responses)
Posted Jun 4, 2019 19:25 UTC (Tue)
by mikebabcock (guest, #132457)
[Link] (3 responses)
This is not a well-integrated system (it almost can't be) and is redundant. There are good ways to handle inter-process communication and it doesn't belong in systemd.
Posted Jun 4, 2019 20:41 UTC (Tue)
by zdzichu (subscriber, #17118)
[Link] (2 responses)
Posted Jun 24, 2019 14:54 UTC (Mon)
by nix (subscriber, #2304)
[Link]
Posted Sep 5, 2019 19:25 UTC (Thu)
by soes (guest, #134247)
[Link]
I use cfengine, which has its own rather dependable solution to authorizing/authenticating access.
The rule language itself has constructs to authorize policy users, i.e. hosts, with only particular access.
cf-serverd uses a public/private key pair to authenticate the connection, including authenticating itself!
The authentication is done in C (C++) code, and has had basically no security holes in the last 10 years.
Posted Feb 1, 2019 4:21 UTC (Fri)
by zblaxell (subscriber, #26385)
[Link]
The lifecycle management, scheduling, dependency management, and external triggers (sockets / mount points / etc) are all jumbled together in systemd, whereas they were traditionally managed by separate specialized daemons and command-line tools.
Posted Jan 29, 2019 0:36 UTC (Tue)
by gerdesj (subscriber, #5446)
[Link] (32 responses)
A very badly chosen example. My laptop wanders around many networks, sometimes with VPN connections enabled and sometimes not. The local filesystems are guaranteed but remote ones should appear as and when I need them. I have to say that autofs does a remarkable job of making my SMB mounts appear as needed from Windows boxes back at the office when the path is clear.
No, filesystems should be considered transient as well as anything else in life. The days of NFS blocking shutdown and crapping all over your files/life/data should be long, long past. Soz to calling out NFS - bad memories.
Posted Jan 29, 2019 1:32 UTC (Tue)
by rahulsundaram (subscriber, #21946)
[Link] (31 responses)
Not really. If you watch the video, the example is clearly about local filesystem mount points and in general, those are indeed static. Regardless of the specific example, the larger point is that older init systems co-mingle different types of configuration.
Posted Jan 29, 2019 13:40 UTC (Tue)
by cesarb (subscriber, #6266)
[Link] (4 responses)
As long as you don't plug and unplug external USB disks...
Posted Jan 30, 2019 2:47 UTC (Wed)
by rahulsundaram (subscriber, #21946)
[Link] (3 responses)
Those haven't been handled in the same way anyway, even before. They weren't listed in /etc/fstab, there were other non init system components (hal etc) managing them on desktop systems and mounting them in /media instead of /mnt etc
Posted Jan 30, 2019 14:09 UTC (Wed)
by mjthayer (guest, #39183)
[Link] (2 responses)
>Those haven't been handled in the same way anyway, even before. They weren't listed in /etc/fstab, there were other non init system components (hal etc) managing them on desktop systems and mounting them in /media instead of /mnt etc
I'm rather surprised that fstab still exists at all in systemd days, or at least that it is not automatically generated from systemd information. It seems to have systemd-fstab-generator for the opposite direction.
Posted Jan 30, 2019 15:00 UTC (Wed)
by rahulsundaram (subscriber, #21946)
[Link]
You don't need it. Support is retained for compatibility. Having said that, you just reminded me to file https://github.com/rhinstaller/anaconda/issues/1788
Posted Jan 30, 2019 16:02 UTC (Wed)
by MarcB (subscriber, #101804)
[Link]
But we run an archival system where data is cycled in and out in the form of SquashFS images and there we use Systemd mount units that are automatically started, stopped, generated/enabled and disabled/deleted. This is much saner than parsing and modifying fstab while providing the same persistence.
We could use autofs for that, but Systemd covers this just fine.
In theory, Systemd should be able to work without fstab, but since fstab is usually so static, it works fine as it is.
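A hedged sketch of the kind of mount unit being described (the image path and mount point are invented; systemd requires the unit name to be the escaped mount path, here srv-images-foo.mount):

# /etc/systemd/system/srv-images-foo.mount
[Unit]
Description=Illustrative SquashFS image mount

[Mount]
What=/var/lib/images/foo.squashfs
Where=/srv/images/foo
Type=squashfs
Options=ro,loop

[Install]
WantedBy=multi-user.target

Such files can be generated, started with "systemctl start srv-images-foo.mount", and deleted again without ever touching fstab.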
Posted Jan 30, 2019 4:25 UTC (Wed)
by j16sdiz (guest, #57302)
[Link] (25 responses)
Posted Jan 30, 2019 4:35 UTC (Wed)
by rahulsundaram (subscriber, #21946)
[Link]
I could see that happening
Have you reported these to systemd developers or publicly documented what those corner cases are?
Posted Jan 30, 2019 11:15 UTC (Wed)
by Limax (guest, #129516)
[Link]
Posted Feb 5, 2019 0:17 UTC (Tue)
by jccleaver (subscriber, #127418)
[Link] (22 responses)
HAHAHAHA
From: https://www.freedesktop.org/wiki/Software/systemd/Predict...
Posted Feb 5, 2019 1:28 UTC (Tue)
by anselm (subscriber, #2796)
[Link] (21 responses)
And please, sir, how is that a systemd problem?
Posted Feb 5, 2019 1:44 UTC (Tue)
by jccleaver (subscriber, #127418)
[Link] (20 responses)
My thoughts exactly -- but probably not in how you were envisioning the question.
Posted Feb 5, 2019 2:03 UTC (Tue)
by pizza (subscriber, #46)
[Link] (19 responses)
(I'm personally a big fan of the 'biosdevname' policy. Because why shouldn't default device names match the vendor's labels on the physical ports?)
Posted Feb 5, 2019 8:45 UTC (Tue)
by jccleaver (subscriber, #127418)
[Link] (18 responses)
That's great, and this no doubt solves a corner case for someone. But having worked at massive Dell shops I can say that even on 6-NIC servers I've *never* been hit by spontaneous random interface re-ordering on boot. Ever. Furthermore, the simplest possible case, *especially* in a VM-focused environment, is undoubtedly a "single host with single NIC", and whatever benefits un-"predictable device naming" purports to provide is completely obliterated by no longer being able to assume eth0 has a meaning, when 99.5% of the time it previously worked as expected. (The only issue addressed here is when your MAC changes, but all major hypervisors have hooks in guest services to deal with that.)
Requiring others to re-code, re-design, or insert hacks into software and configs to solve weird edge cases LP found on his laptop one day and then telling everyone they should be happy in the long run for this extra work epitomizes the systemd "cabal"'s approach to doing things.
Posted Feb 5, 2019 9:46 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Feb 5, 2019 13:05 UTC (Tue)
by pizza (subscriber, #46)
[Link] (15 responses)
Are you sure that isn't because udev has hidden re-ordering behind its default behaviour of persistent interface renaming?
Posted Feb 5, 2019 15:28 UTC (Tue)
by jccleaver (subscriber, #127418)
[Link] (13 responses)
> Are you sure that isn't because udev has hidden re-ordering behind its default behaviour of persistent interface renaming?
Pretty sure, unless the times we did that intentionally were logged in a different way than when it's automatically performing the function. That also really only helps for persistency once the OS is installed -- if it were truly random, then kickstarts without MAC codification would be expected to randomly fail as NICs went to different switches or ports got bonded incorrectly, and that didn't happen for us with any of our models, even though only the GB1 NIC was used by us for tracking in our PXE boot server.
Of course, the persistence with udev kind of proves the point: Even if the kernel can't be trusted deterministically, and with biosdevname out there, udev was already taking care of it. It was a "solved problem". There was no need for systemd to rip everything out and remove the long-standing conventions and make the most common case (a box with a single nic) annoying and difficult.
And with udev being subsumed into systemd, for no good reason, no one outside of the project had any real say in the matter.
Posted Feb 6, 2019 0:00 UTC (Wed)
by anselm (subscriber, #2796)
[Link] (12 responses)
The principal maintainers of udev also happen to be the principal maintainers of systemd, and they seem to find it convenient to arrange things this way. According to them it avoids code duplication and cyclic dependencies and makes administering the code base easier. Their prerogative.
Also, AFAIR the connection between systemd and udev doesn't really go deeper than the fact that they're sharing the same upstream tarball. It is still possible to build and use udev without systemd (and presumably vice-versa, although there would probably be little point – on the whole, udev for its faults seems to be a lot less controversial than systemd).
In any case, the network-interface naming issue is strictly a udev thing and has nothing whatsoever to do with systemd. It also applies to non-systemd installations if they're using udev, and IIRC even predates systemd altogether.
Posted Feb 6, 2019 0:28 UTC (Wed)
by jccleaver (subscriber, #127418)
[Link] (11 responses)
But that's not really the case operationally, and it's why eudev was forked away from it for Gentoo. Once you combine them and remove independent support, let's face it: systemd may have different binaries, but when you take away any attempt at a stable platform between those pieces, you're forcing use of it as an integrated whole.
> In any case, the network-interface naming issue is strictly a udev thing and has nothing whatsoever to do with systemd. It also applies to non-systemd installations if they're using udev, and IIRC even predates systemd altogether.
Sure, but it's under the systemd aegis that they decided to implement this change, and with removing udev's support for non-systemd distributions, you no longer really have an option. Rather than allow for distinct approaches, now you're forced (well, by default) to adopt this new methodology that solves for edge cases by frustrating the simple, common case:
From: https://www.freedesktop.org/wiki/Software/systemd/Predict...
Thanks, Lennart. *slow clap*
Posted Feb 6, 2019 9:48 UTC (Wed)
by anselm (subscriber, #2796)
[Link]
That would be the developers of the “eudev” fork of udev talking, so this is not exactly an unbiased view. Whether that fork was really necessary is by no means clear.
Posted Feb 6, 2019 12:50 UTC (Wed)
by pizza (subscriber, #46)
[Link] (9 responses)
For all your bellyaching, that feature is trivially disabled, as per the second-to-last section on that page.
So, yes, thanks Lennart, for building a mechanism that most distributions found useful enough to enable by default, documenting it heavily, and providing an easy way to configure or disable it altogether.
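For reference, two commonly documented ways of opting out look roughly like this (a sketch; packaging details vary by distribution):

# Either boot with this on the kernel command line:
net.ifnames=0

# ...or mask the default link policy so interfaces keep their kernel names:
ln -s /dev/null /etc/systemd/network/99-default.link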
Posted Feb 6, 2019 18:33 UTC (Wed)
by jccleaver (subscriber, #127418)
[Link] (8 responses)
I'm aware that it's disableable, and regularly do so. My original point was that "making the common use case very smooth" is risible for systemd as a whole. The most common use case for a host will be a single attached NIC, and now your options for this are:
And it's not even difficult to come up with workarounds for this -- maybe a built-in alias so that "eth0" *always* means "the only NIC" on single-NIC systems even if the real name is a mess, so that things would continue to work the way they had for the last 20 years in most cases.
> So, yes, thanks Lennart, for building a mechanism that most distributions found useful enough to enable by default, documenting it heavily, and providing an easy way to configure or disable it altogether.
That's a sheer bandwagon argument, and it fails even before you get to the underlying arguments about how responsive some distributions are to either their userbases or their downstreams (hint: not always, especially if they regularly kick policy decisions over to members in the same group popping on a different hat).
Posted Feb 6, 2019 22:26 UTC (Wed)
by jccleaver (subscriber, #127418)
[Link] (7 responses)
https://www.ispcolohost.com/2015/07/11/kickstarting-with-...
And of course, this doesn't even address the straight-up clusterf*ck of the 7.2 to 7.3 update:
https://access.redhat.com/solutions/2592561
Nor does it hit the "systemd will silently hang all boot/reboot attempts because no one tested SELinux policy application" that many of us hit while trying to address the issue: https://bugzilla.redhat.com/show_bug.cgi?id=1393505
Although this was a flat-out collapse of RedHat QA, it was precipitated by a belief by the systemd crew that it should be able to barrel over all sorts of underlying real-world behaviors, blithely making assumptions that no one had guaranteed and no one had agreed to. And when it inevitably breaks because of some other issue out there, their initial response is *always* "Hey, works for us. Not our problem!"
See also: UEFI bricking, usernames starting with a number, incorrect /etc/fstab entries hanging boot, flooding debug data to klogd, and God knows what else...
Posted Feb 6, 2019 23:00 UTC (Wed)
by johannbg (guest, #65743)
[Link] (2 responses)
Red Hat has a number to call for their support contract; have you tried using it?
I'm pretty sure they can give you the moral support you need and even re-train you.
And you still aren't criticizing systemd in an area you could arguably criticize it for, but carry on throwing bricks at a dead pony...
Posted Feb 9, 2019 22:46 UTC (Sat)
by nix (subscriber, #2304)
[Link] (1 responses)
> Red Hat has a number to call for their support contract have you tried using it?
I'm fairly sure this is not argument in good faith on your part. (At the very least, you are entirely ignoring your interlocutor.)
Posted Feb 10, 2019 10:50 UTC (Sun)
by johannbg (guest, #65743)
[Link]
This problem existed long before udev, as can be seen from the historic multiple probe-and-pray workarounds from various upstreams and distributions (and *nixes) trying to get it right, all the way down to administrators trying to manage it themselves by stuffing something as simple as ifconfig $foo name $bar into /etc/rc.conf, or writing more complex scripts which boiled down, more or less, to finding the MAC address and then running ifconfig on its output.
Users can disable or hack the udev rule to their liking if they don't agree with its implementation.
And systemd's networkd interface configuration files aren't more complex than this...
/etc/systemd/network/20-wired.network
[Match]
[Network]
And a simple (fallback) DHCP configuration:
/etc/systemd/network/99-wired-dhcp.network
[Match]
[Network]
And the keen administrator's eye will notice that *both* of those networkd network files combined are at least half the size of a single
The fact is, udev is not to blame for the device-naming enumeration problem, nor is systemd's network configuration complex or hard to understand.
So all this constant blaming of udev, systemd, or both for all the world's problems makes, to me, little to no sense.
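As a hedged illustration of what the two .network files above typically contain (the interface names and addresses here are invented, not the commenter's):

# /etc/systemd/network/20-wired.network
[Match]
Name=enp1s0

[Network]
Address=192.0.2.10/24
Gateway=192.0.2.1

# /etc/systemd/network/99-wired-dhcp.network
[Match]
Name=en*

[Network]
DHCP=yes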
Posted Feb 13, 2019 8:44 UTC (Wed)
by flussence (guest, #85566)
[Link] (3 responses)
(and personally I'd prefer if /sbin/mount would grok GPT partition metadata so that fstab could go away, but that's neither here nor there.)
Posted Feb 13, 2019 18:14 UTC (Wed)
by Wol (subscriber, #4433)
[Link] (1 responses)
Cheers,
Posted Feb 14, 2019 7:25 UTC (Thu)
by zdzichu (subscriber, #17118)
[Link]
I think they're utilised by the systemd-gpt-auto-generator.
Posted Feb 14, 2019 17:51 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Posted Mar 5, 2019 7:07 UTC (Tue)
by immibis (guest, #105511)
[Link]
Posted May 21, 2019 5:03 UTC (Tue)
by buchanmilne (guest, #42315)
[Link]
Posted Jan 29, 2019 2:41 UTC (Tue)
by roman (guest, #24157)
[Link] (55 responses)
/sbin/init is a symlink to /lib/systemd/systemd, which on my recent Fedora installation is 1793808 bytes, and dynamically links to /usr/lib/systemd/libsystemd-shared-238.so which is 2621680 bytes, for a total of about 4.4MB.
For comparison, /sbin/init (Upstart) on Ubuntu 14.04 is 265848 bytes; and if memory serves me well, the SysVInit binary was under 100K.
Runtime memory usage tells a similar story. For systemd, "ps u 1" shows a virtual size of 239560K and resident set size of 8496K for systemd; with Upstart, virtual size is 33888K and resident set size is 3120K.
Posted Jan 29, 2019 4:22 UTC (Tue)
by ncm (guest, #165)
[Link] (14 responses)
I like what I can do with systemd, but this just smacks of poor discipline. Is no one embarrassed about it?
Posted Jan 29, 2019 13:17 UTC (Tue)
by tchernobog (guest, #73595)
[Link] (12 responses)
Sure, it might be an issue in some embedded systems, but there you can forego systemd altogether and just bake your own init script.
Bottom line: we use bloated systems every day. If they make developers' lives easier, we often put up with them. Like Java on your phone.
For me, systemd works extremely well and does what I need it to do within reasonable performance constraints. Anyway, you are right on one thing: probably systemd memory footprint can be reduced, and patches are welcome...
Posted Jan 29, 2019 15:01 UTC (Tue)
by MatejLach (guest, #84942)
[Link]
Posted Jan 29, 2019 15:52 UTC (Tue)
by drag (guest, #31333)
[Link] (7 responses)
At the level of the OS stack that systemd sits at, the primary concern should probably be "correctness of code". Following closely behind that is probably "usability for system administrators", and then, a more distant third, "scalability". By "correctness of code" I mean that it behaves in the most predictable and bug-free manner possible. "System administrators" includes the owner of a single-user desktop as well as people who must manage systems at scale. "Scalability" means working well at both the smallest scale and a very large scale, which covers more issues than just resource utilization (although that is certainly part of it).
Code bloat is diametrically opposed to "correctness of code". The more code you have, the more bugs you have. The more bugs you have, the more likely you are to run into the subset of bugs that cause security concerns or other highly disruptive problems.
So whether or not the systemd daemon's 'binary bloat' is actually a problem is highly dependent on whether or not that code is actually necessary. It could be that a binary that large is simply required to get done what systemd sets out to do, and in that case it's not really a problem at all. It could also be that they have gotten sloppy in an effort to add features. One can only know 'the level of correctness', unfortunately, by being very familiar with the code itself.
Posted Jan 29, 2019 19:18 UTC (Tue)
by zdzichu (subscriber, #17118)
[Link] (2 responses)
Systemd's binary size is not directly related to code bloat. A big chunk of the binary is made up of strings – messages, textual representations of enums, and other good stuff that makes debugging and daily operation easier.
Posted Jan 29, 2019 20:20 UTC (Tue)
by zblaxell (subscriber, #26385)
[Link] (1 responses)
Indeed, a size comparison that doesn't use the output of 'size' or 'objdump' is mostly meaningless. Buried in megabytes of executable are kilobytes of code.
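For example, the comparison being suggested would look something like this (no output shown, since the numbers vary from build to build):

# "text" covers code plus read-only data (including all those strings);
# "data" and "bss" cover writable data. Compare these, not the file sizes.
size /lib/systemd/systemd /usr/lib/systemd/libsystemd-shared-238.so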
Posted Jan 29, 2019 21:42 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Posted Jan 29, 2019 22:13 UTC (Tue)
by HenrikH (subscriber, #31152)
[Link]
Posted Jan 30, 2019 15:27 UTC (Wed)
by MarcB (subscriber, #101804)
[Link] (1 responses)
But so does faulty layering and a lack of code reuse.
SysV init does not cover functionality that is essential for every init system, so there are countless reimplementations of this functionality; usually in shell script. Most of them with severe bugs.
The prime example is process management. There is at least one implementation per distribution - usually that one is mostly OK, since it had a decade or more to mature. But then there are countless implementations by upstream developers and those often are a nightmare (they were usually written as an afterthought by developers who had no experience whatsoever in that area).
Systemd's larger core functionality and declarative syntax are a huge improvement. Things that should have been easy from the very beginning finally are.
Posted Jan 31, 2019 19:04 UTC (Thu)
by drag (guest, #31333)
[Link]
Posted Feb 1, 2019 19:12 UTC (Fri)
by mebrown (subscriber, #7960)
[Link]
Posted Jan 30, 2019 15:42 UTC (Wed)
by imMute (guest, #96323)
[Link] (2 responses)
I work at a place where we make "embedded" controllers for LED video walls. I say "embedded" with quotes because the current controller isn't really embedded - it's an x86_64 Supermicro motherboard with custom PCIe cards and it runs a mostly stock debian with some tweaks and a pile of custom software. Between systemd for service management (our software is very "microservice" oriented) and the journal (logging isn't much more complicated than fprintf(stderr, ...)) the systemd project has made our lives muuuuch simpler than in previous generations.
>Sure, it might be an issue in some embedded systems, but there you can forego systemd altogether and just bake your own init script.
Which brings me to this. We're starting development on a "low-end" controller based around a custom SBC with low-cost ARM (well, mid-cost really) processors. That system probably won't get more than 4 to 8 GB of RAM. It would be a huge deflection for us to abandon using stock Debian on this lower end controller. I definitely don't want to abandon systemd, as it plays a big role in our service management. We made the conscious decision to tie ourselves to systemd because it was a force multiplier for our small team.
>Bottom line: we use bloated systems every day. If they make developers' lives easier, we often put up with them.
TLDR: Yes, we sacrifice a little bit of bloat to make development easier/faster. That doesn't mean we should give up trying to keep the excess fat to a minimum.
Posted Jan 30, 2019 17:33 UTC (Wed)
by dezgeg (subscriber, #92243)
[Link] (1 responses)
Posted Jan 30, 2019 18:05 UTC (Wed)
by zdzichu (subscriber, #17118)
[Link]
Posted Jan 31, 2019 0:18 UTC (Thu)
by motk (subscriber, #51120)
[Link]
Posted Jan 29, 2019 4:29 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link] (35 responses)
Posted Jan 29, 2019 7:23 UTC (Tue)
by mjthayer (guest, #39183)
[Link] (34 responses)
Applying that to the larger resident size begs the question: should the resident size of the systemd process be compared to that of several processes on non-systemd systems? (Assuming yes.)
Posted Jan 29, 2019 17:07 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link] (33 responses)
Posted Jan 29, 2019 21:47 UTC (Tue)
by zblaxell (subscriber, #26385)
[Link] (32 responses)
I think it's pretty easy to do that sort of thing without being part of pid1. Replicating and improving cron (or replacing it outright with a different design) is a coding exercise for students. The result can integrate with cron or systemd. I did mine decades ago and still use it today (the need for a tool implementing that scheduling behavior, as you point out, is obvious).
The problem with the question "what functional elements should belong to pid1?" is that with the right subset of supporting evidence, the answers "all of them" and "none of them" can both be as valid as any point in between those extremes. systemd delegates a number of its functions to external processes and reserves a number of functions to itself more or less at random--sometimes there's an identifiable historical practice or a functional requirement, other times features just seem to get added to some random existing binary in the systemd package (e.g. systemd-logind or pid 1). Many agree the specific arrangement systemd currently uses is somehow wrong (i.e. pid1 is "bloated"), but many disagree on how best to change it.
Posted Jan 29, 2019 22:17 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link] (31 responses)
The problem I forsee is proper process tracking. Is this tool going to hook into systemd's cgroups layout? Is it going to spawn processes directly? Then it has the same job as pid1 for proper process management. Is it going to call `systemctl start` and `systemctl stop` on a schedule? If so, how much code are you really saving? Is it going to work with other init systems? That's a lot of code since systemd does lots of heavy lifting for this tool that would need reimplemented for sysvinit. IMO, it belongs with wherever service management is done for both code savings and being more robust. And having read upstream's arguments and those arguing for it, moving process management out of pid1 into a hypothetical pid2 manager doesn't sound like a great idea.
Posted Jan 30, 2019 0:05 UTC (Wed)
by zblaxell (subscriber, #26385)
[Link] (30 responses)
> Is this tool going to hook into systemd's cgroups layout?
The scheduler tool doesn't know what a cgroup is. A process-launcher tool invoked by the scheduler tool might, if that was relevant for the service.
The cgroups layout we use is older than systemd's, and we usually want to run it instead of the systemd one; however, we have systems running both.
> Is it going to spawn processes directly?
Sure, why wouldn't it? A tool that spawns processes on a schedule will spawn processes, and to minimize dependencies it will spawn them directly.
> Then it has the same job as pid1 for proper process management.
In the sense that it's mostly a while loop calling fork, a bunch of child process setup calls, exec, and waitid, with some interface for communicating state with other parts of the system, then yes. In the sense that it's replicating systemd's pid 1, then no.
> Is it going to call `systemctl start` and `systemctl stop` on a schedule?
If the jobs it schedules and runs invoke 'systemctl start' and 'systemctl stop', then yes; otherwise, no.
> If so, how much code are you really saving?
The simplest versions of these tools weigh in at a few hundred bytes (plus the system shell that's running anyway). More complicated ones run a few kilobytes. If the only part of systemd we needed was the timer feature, then we save the entire code of systemd.
The scheduler tool will also function as an 'init', though you'd have to "schedule" the execution of every system service in that case--and because it's a single-purpose tool, you'd need something to watch network sockets, mount filesystems, respond to device connections, etc. On embedded systems and VM containers, though, sometimes the scheduler tool is all you need to make one or two services happen over and over.
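As a rough illustration of how small such a tool can be, here is a generic sketch (not the commenter's actual program) of a restart-on-a-fixed-interval runner in plain shell:

#!/bin/sh
# Run a command repeatedly, at most once per interval, restarting it
# whenever it exits. Usage: sched.sh <interval-seconds> <command> [args...]
interval="$1"; shift
while :; do
    start=$(date +%s)
    "$@"                                   # run the job in the foreground
    elapsed=$(( $(date +%s) - start ))
    [ "$elapsed" -lt "$interval" ] && sleep $(( interval - elapsed ))
done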
> That's a lot of code since systemd does lots of heavy lifting for this tool that would need reimplemented for sysvinit
I'm not sure what you're getting at. There's not a lot of code, most of it serves the tool's primary function, and there's no systemd or sysvinit dependency (other than they have to start the tool, or the tool itself has to be pid 1 to start itself). What heavy lifting do you think systemd could be doing?
> IMO, it belongs with wherever service management is done for both code savings and being more robust.
Isolation between unrelated subsystems improves robustness, and at under 1K, there is not much code to be saved. I would agree with you if you had mentioned the human-factors benefit of consistency in the management interface, or some of the non-timer capabilities of systemd. Configuring things in multiple redundant places can suck.
We mostly use the scheduler tool to give ourselves a consistent interface to systems that do and do not run systemd, and some of the helper tools to replicate specific systemd features in legacy systems that can't be (easily) upgraded.
> And having read upstream's arguments and those arguing for it, moving process management out of pid1 into a hypothetical pid2 manager doesn't sound like a great idea.
At this point I agree further churn in systemd is bad. It took years to get to where it is now.
Since systemd started there have been kernel hooks added to support pid2 process managers (because systemd ended up needing them, apparently), so it's possible to make a more complete service manager implementation in pid2 now than it once was. A new project starting today would have no need to do anything in pid 1 except spawn pid 2 in a loop (or maybe fallback to an earlier version if an upgrade goes badly). systemd is not a new project starting today--it is code that has had the benefit of years of debugging, and most people don't want anyone to mess with it now.
If you're asking "should I use systemd?", the answer is not "yes because timers are awesome."
Posted Jan 30, 2019 6:13 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (28 responses)
Long ago I had an issue - our CI server (Jenkins) was spawning a task that launched a background daemon listening on a specific port. The problem was that this daemon sometimes hung on shutdown, ignoring anything short of a targeted SIGKILL but still keeping the port open. So our tests periodically failed because of that. The fix way back then was to "lsof | xexec kill" at the start of the test.
Systemd solves this issue cleanly.
Posted Jan 30, 2019 15:48 UTC (Wed)
by imMute (guest, #96323)
[Link]
Technically, cgroups solves that issue cleanly. Systemd is just one [easy] way to use cgroups, but there are other inits that utilize cgroups.
Posted Jan 30, 2019 17:27 UTC (Wed)
by zblaxell (subscriber, #26385)
[Link] (24 responses)
Watch the cgroup tasks or events files (or use notification_agent if you're old-school).
> The fix way back then was to "lsof | xexec kill" at the start of the test. Systemd solves this issue cleanly.
No, cgroups solve that issue cleanly. systemd's cgroup controller is showing its age and doesn't solve the issue in some cases.
Systemd's cgroup killer plays whack-a-mole with processes in a tight loop, instead of freezing the cgroup then picking off processes at leisure without looping. It looks like it's theoretically possible to create a fork bomb that systemd can't kill. systemd hasn't been updated to take advantage of new cgroup features ("new" meaning "somewhere around kernel 3.17, between 2014 and 2015, when we removed assorted workarounds from our cgroup controllers").
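A hedged sketch of the freeze-then-kill approach being described, using the cgroup-v1 freezer controller (the cgroup name is hypothetical):

# Freeze the group so nothing in it can fork while we work...
echo FROZEN > /sys/fs/cgroup/freezer/myservice/freezer.state
# ...queue SIGKILL for every member at leisure, with no loop needed...
xargs -r -n1 kill -9 < /sys/fs/cgroup/freezer/myservice/cgroup.procs
# ...then thaw the group so the pending signals take effect.
echo THAWED > /sys/fs/cgroup/freezer/myservice/freezer.state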
The concrete bug that we hit in production is that systemd assumes processes respond instantly to SIGKILL, and can be ignored after the signal is sent. On Linux, SIGKILL can take nontrivial amounts of time to process, especially if processes have big temporary files or use lots of RAM (e.g. databases). If a service restarts, and there was a timeout during the stop, systemd can spawn a new service instance immediately, which will try to allocate its own large memory before its predecessor has released it, OOMing the machine to death. There's a policy problem here: ignoring a slowly dying process somewhere in the cgroup is correct behavior for the shutdown use case (since the machine is being turned off anyway, nobody cares if the service is still stuck in the kernel when that happens), but the policy is incorrect for service restart.
Various workarounds are possible (e.g. use pid files to find the previous service instance and wait until it finishes being killed, check the service cgroup at startup to see if it's empty then wait if it's not, or limit resources available to the application to mitigate the specific OOM risk), but the cleanest (and often easiest, since we already have to do it to support non-systemd machines anyway) solution is to just implement a micro service manager that doesn't have the systemd problems in the first place, then configure systemd (or whatever the machine is using) to forward start/stop requests to it.
Posted Jan 30, 2019 22:21 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (23 responses)
> Systemd's cgroup killer plays whack-a-mole with processes in a tight loop, instead of freezing the cgroup then picking off processes at leisure without looping.
> The concrete bug that we hit in production is that systemd assumes processes respond instantly to SIGKILL, and can be ignored after the signal is sent. On Linux, SIGKILL can take nontrivial amounts of time to process, especially if processes have big temporary files or use lots of RAM (e.g. databases). If a service restarts, and there was a timeout during the stop, systemd can spawn a new service instance immediately, which will try to allocate its own large memory before its predecessor has released it, OOMing the machine to death.
And I think you can increase the timeout time and/or disable it completely in this case.
Posted Jan 31, 2019 21:08 UTC (Thu)
by zblaxell (subscriber, #26385)
[Link] (22 responses)
systemd is not written in stone, but it is unusually difficult to analyze, modify and deploy. Any day that starts with "understand how hundreds or thousands of units interact at runtime with a dynamic dependency resolver to produce some result" usually ends with "I need to do these 20 things, let's write a 25-line program that does those 20 things in the right order and run that program instead of systemd." I can never figure out how to turn that sort of day into a systemd pull request.
> in practice systemd is much faster at killing processes than the kernel is at forking.
That may be true, assuming no scheduler shenanigans; however, the loop in systemd that kills processes terminates when the cgroup contains no processes _that systemd has not already killed once_, so it's vulnerable to pid reuse attacks. If a forkbomb manages to get a reused pid during the kill loop, systemd will not attempt to kill it.
> This doesn't sound right. systemd will wait until the cgroup is empty, by which time all the resources should be freed.
It writes "Processes still around after SIGKILL. Ignoring." on the log and ignores the process(es). It might send SIGKILL to the same process again when the service restarts, but additional SIGKILLs aren't helpful.
There's no single right answer here, and no documented systemd configuration option I can find to address this case. The documentation talks a lot about waiting for various subsets of processes, but the code revolves around actions taken during state transitions for systemd services. These are not the same thing.
If the killed process is truly stuck, e.g. due to a kernel bug, then it will never exit. Processes that can't ever exit pollute systemd's service dependency model (e.g. you can't get to target states that want that service to be dead). systemd solves that problem by ignoring processes whose behavior doesn't fit into its dependency model.
If the killed process isn't stuck, but just doing something in the kernel that takes a long time, then to fit systemd's state model, we need to continue to wait for the process after we sent it SIGKILL. We only need to wait for the process to exit if we're going to do something where the process exit matters (e.g. start the process again, or umount a filesystem the process was using), but systemd doesn't have a place for such optimizations in its data model.
It's probably possible to do this in systemd with a helper utility in ExecStartPre to block restarts until the service cgroup is completely empty except for itself, but that's no longer "clean", and you'd have to configure it separately for every service that might need it.
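A hedged sketch of such an ExecStartPre helper, assuming the cgroup-v2 layout and a hypothetical unit named example.service:

#!/bin/sh
# wait-for-old-instance.sh -- block until nothing but this helper remains
# in the service's cgroup, so a restart cannot race a predecessor that is
# still exiting after SIGKILL. The check uses only shell builtins, so no
# extra processes appear in the cgroup at the moment cgroup.procs is read.
procs=/sys/fs/cgroup/system.slice/example.service/cgroup.procs
while :; do
    leftover=0
    while read -r pid; do
        [ "$pid" = "$$" ] || leftover=1    # ignore this helper itself
    done < "$procs"
    [ "$leftover" -eq 0 ] && exit 0
    sleep 1
done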
> And I think you can increase the timeout time and/or disable it completely in this case.
I can't find a post-KILL timeout in the documentation or code, and I've already spent more time looking for it than I typically spend implementing application-specific service managers.
Posted Jan 31, 2019 22:16 UTC (Thu)
by pizza (subscriber, #46)
[Link] (15 responses)
If you don't get a deterministic dependency graph for the things you need to happen in a given order, then you haven't sufficiently specified your dependencies. No dependency resolver can read your mind.
Posted Feb 1, 2019 3:56 UTC (Fri)
by zblaxell (subscriber, #26385)
[Link] (14 responses)
We don't want it to read our minds, or block devices, or assorted device buses, or any data source we didn't explicitly authorize it to consume. We want systems with predictable (or at least highly repeatable) behavior so they behave the same way in test and production.
Yes, we can translate our 25-line startup program into a simple dependency graph, lock down all the places where systemd could pick up extra nodes in the graph, and dodge the parts of systemd that have built-in exceptions to the dependency evaluation rules. That's a lot of project risk, though, and sometimes significant cost, and--most important of all--nobody is paying us to do any of that work.
If we're doing safety or security audit on the system, the systemd dependency resolver (and any external code it executes) gets sucked into the audit since it's executing our dependency graph. If the choice comes down to "audit 25 lines of code" or "audit 25 lines of code and also systemd", well, one of those is cheaper.
Posted Feb 1, 2019 12:15 UTC (Fri)
by pizza (subscriber, #46)
[Link] (13 responses)
You're being disingenuous. That first statement should read "Audit 25 lines of code and also the shell interpreter [1] and also everything else the script invokes. [2]"
[1] i.e. bash, dash, csh, busybox, perl, or whatever it is that actually parses and executes that script.
(When all of that "external code" is factored in, I suspect systemd will come out way, way ahead on the "least amount of total code that needs auditing" metric)
Posted Feb 1, 2019 14:12 UTC (Fri)
by zblaxell (subscriber, #26385)
[Link] (3 responses)
Already done for other projects, no need to do them again. Arguably if we ever did a systemd audit then we could reuse the result for multiple projects, but nobody wants to be the first.
> grep, psutils, and util-linux as well [3] Don't forget libreadline, ..., libstdc++,
We don't use 'em. It's mostly shell builtins, flock, maybe a couple of C wrappers for a couple of random kernel API calls. 'echo' and /sys/fs/cgroup are sufficient for a lot of use cases. Our scope ends once the application is execve()ed in a correctly configured environment and resumes when the application exits.
Posted Feb 1, 2019 15:18 UTC (Fri)
by pizza (subscriber, #46)
[Link] (2 responses)
What exactly do you mean when you say "audit"?
That could mean anything from "reviewing the license and patent overlap" to "line-by-line inspection/verification" of the sort that's needed to certify rocket avionics. Are you auditing algorithms and state machines to ensure all states and transitions are sane under any possible input? Are you auditing all input processing to make sure it can't result in buffer overflows or other sorts of security problems? Are your audits intended to ensure there's no leakage of personal information (eg in logs) that could run afoul of things like the GDPR?
Posted Feb 1, 2019 22:31 UTC (Fri)
by zblaxell (subscriber, #26385)
[Link] (1 responses)
Closer to rocket avionics. Part of it is running the intended code under test and verifying that all executed lines behave correctly with a coverage analyzer. Sometimes you can restrict scope by locking down the input and verifying only the parts of the shell that execute, other times you swap in a simpler shell to interpret the service management code and audit 100% of the simpler shell.
> Are you auditing algorithms and state machines to ensure all states and transitions are sane under any possible input?
Attack the problem the other way: precompute an execution plan for the service dependency graph, ordered and annotated for parallel execution. The service manager just executes that at runtime. Conceptually, "shell script" is close enough to get the idea across to people, and correct enough to use in prototyping, but it might be a compiled or interpreted representation of the shell script by the time it gets certified. Add a digital signature somewhere to verify it before executing it.
Most of the time the execution plan can be computed and verified by humans: you need storage, then start networking and UI in parallel, then your application runs until it crashes, then reboot. Someone checks that UI and networking do not in fact depend on each other. In more complicated cases you'd want a tool to do the ordering task, so you provide the auditors with evidence your tool is suitable for the way you use it.
The execution plan does not look anything like sysvinit scripts. It does not have all the capabilities of systemd, since the whole point is to avoid having to audit the parts of systemd you didn't use. Only the required functionality ends up on the target system.
Normally these do not change at runtime except in tightly constrained ways (e.g. a template service can have a variable number of children). If for some reason you need a crazy-high amount of flexibility at runtime in a certified system, there is theoretically a point where the complexity curves cross over and it's cheaper to just sit down and audit systemd.
> Are you auditing all input processing to make sure it can't result in buffer overflows or other sorts of security problems?
If auditing something like a shell, you get a report that says things like "can accept input lines up to the size of the RAM in the system, but don't do that, that would be bad" and "don't let random users control the input of the shell, that would be bad".
So the problem for the service manager script reduces to proving that the shell, as used for the specific service manager input in the specific service manager environment, behaves correctly. This can be as little as some light code review and coverage testing (plus a copy of the shell audit report and a checklist of all the restrictions you observed).
> Are your audits intended to ensure there's no leakage of personal information (eg in logs) that could run afoul of things like the GDPR?
It's never come up. The safety-critical systems don't have any personal information in them, and the security-critical ones have their own logging requirements that are out of scope of the service manager.
In some cases you can mix safety-critical and non-safety-critical services on a system provided there is appropriate isolation between them. Then you audit the safety-critical service, the service manager, and their dependencies, and mostly ignore the rest.
Posted Feb 1, 2019 23:15 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Feb 4, 2019 18:55 UTC (Mon)
by jccleaver (subscriber, #127418)
[Link] (7 responses)
And you're not living in the real world. Your shell is already going to be audited, or an upstream audit is taking place if it matters.
Shell scripts can be complex. A 25 line shell script that just imperatively executes 20 commands and then exec's into a final daemon is not complex, and is in fact the simplest possible way to accomplish a deterministic goal on a *nix system.
It boggles my mind that there are administrators out there that would consider some other solution as simpler. Why are people so scared of procedural programming using a *nix shell?
Posted Feb 4, 2019 19:08 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (5 responses)
Writing a robust shell initscript is incredibly hard. It's not 25 lines of code; many initscripts are hundreds of lines of code (and are still buggy as hell).
Posted Feb 4, 2019 19:20 UTC (Mon)
by zblaxell (subscriber, #26385)
[Link] (1 responses)
cgroups, watchdogs, service pings...
> Are you sure PID files are correct?
Don't use 'em, because cgroups.
> How do you kill the daemon in case it's stuck?
cgroups
> What if you want to allow unprivileged users to terminate the service?
We don't.
> Do you need a separate SUID binary?
No.
> Writing a robust shell initscript is incredibly hard.
No.
> It's not 25 lines of code; many initscripts are hundreds of lines of code (and are still buggy as hell).
No.
Posted Feb 4, 2019 19:31 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link]
During one of the systemd flamewars some years ago I tried to find an actual init script that uses cgroups for comparison with systemd units. I was not able to do it.
Writing it was also decidedly non-trivial, I had to create my own cgroup hierarchy convention and make sure that there are no race conditions in cgroups manipulation. Both are not easy at all, and I was not even trying to use cgroups controllers to limit the resource use.
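For what it's worth, with the unified (v2) hierarchy the minimal version of that idea is fairly small; a hedged sketch (the names are made up, and it ignores controllers and the races mentioned above):

#!/bin/sh
# Start a daemon in a cgroup of its own so that it, and every process it
# ever forks, can later be found and killed as a group.
cg=/sys/fs/cgroup/mydaemon
mkdir -p "$cg"
echo $$ > "$cg/cgroup.procs"    # move this shell into the new cgroup
exec /usr/sbin/mydaemon         # the daemon and its children inherit it

# Stopping everything later amounts to:
#   xargs -r -n1 kill -9 < /sys/fs/cgroup/mydaemon/cgroup.procs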
Posted Feb 4, 2019 20:30 UTC (Mon)
by jccleaver (subscriber, #127418)
[Link] (2 responses)
The OP was concerned about doing dependency ordering, the 25 line script was not the init script itself.
> Writing a robust shell initscript is incredibly hard. It's not 25 lines of code; many initscripts are hundreds of lines of code (and are still buggy as hell).
No, it's not. On RedHat systems it's this (which, other than the URL, has not changed since basically 2004 - EL3/EL4):
1) Cut/paste this: https://fedoraproject.org/wiki/EPEL:SysVInitScripts#Inits...
If your initscript for a basic daemon has a more complex structure than this, then you're probably doing something wrong. If your distribution forces you to do something more complex than this, then I'm sorry for you.
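For readers who have not seen that template, a heavily condensed sketch of what it boils down to (illustrative only, not the actual Fedora/EPEL text; "mydaemon" is a placeholder):

#!/bin/sh
# chkconfig: 345 80 20
# description: illustrative Red Hat-style init script skeleton
. /etc/init.d/functions

prog=mydaemon
exec=/usr/sbin/$prog

case "$1" in
    start)   daemon "$exec" ;;
    stop)    killproc "$prog" ;;
    status)  status "$prog" ;;
    restart) "$0" stop; "$0" start ;;
    *)       echo "Usage: $0 {start|stop|status|restart}"; exit 2 ;;
esac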
Posted Feb 4, 2019 20:37 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
Noted.
> No, it's not. On RedHat systems it's this (which, other than the URL, has not changed since basically 2004 - EL3/EL4)
Posted Feb 4, 2019 22:05 UTC (Mon)
by jccleaver (subscriber, #127418)
[Link]
cgroup *definition* is out of scope. To *assign* this process to a cgroup just set
> Incorrect. This doesn't have PID file management, for starters.
Incorrect. You can see in the start/stop sections that it's recommended to use the 'daemon' and 'killproc' functions, which handle PID file management for you and fall back to looking for the process's exec if not found. If your daemon does something weird in launch, you can have your daemon make the pidfile and pass that file name into the functions with -p.
The default 'status' function handles pid files automatically too.
All of this is in /etc/init.d/functions.
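Stripped to its bones, the skeleton being discussed looks roughly like this; the daemon name and paths are placeholders, and the real template is at the EPEL wiki page linked above:

#!/bin/sh
# chkconfig: - 85 15
# description: example daemon (sketch only)
. /etc/init.d/functions

prog=mydaemon
exec=/usr/sbin/$prog
pidfile=/var/run/$prog.pid

start() {
    echo -n "Starting $prog: "
    daemon --pidfile="$pidfile" "$exec"
    retval=$?
    echo
    return $retval
}

stop() {
    echo -n "Stopping $prog: "
    killproc -p "$pidfile" "$prog"
    retval=$?
    echo
    return $retval
}

case "$1" in
    start)   start ;;
    stop)    stop ;;
    restart) stop; start ;;
    status)  status -p "$pidfile" "$prog" ;;
    *)       echo "Usage: $0 {start|stop|restart|status}"; exit 2 ;;
esac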
Posted Feb 4, 2019 19:10 UTC (Mon)
by pizza (subscriber, #46)
[Link]
So why is systemd being held to a different standard?
> It boggles my mind that there are administrators out there that would consider some other solution as simpler. Why are people so scared of procedural programming using a *nix shell?
Shell is used for the same reason that folks use a screwdriver as a chisel or a punch. Sure it's convenient, and often gets the job done, but there's a much higher chance of unintended, and often quite painful, consequences.
Posted Mar 5, 2019 7:26 UTC (Tue)
by immibis (guest, #105511)
[Link]
Posted Feb 1, 2019 6:12 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (5 responses)
Systemd failed to kill the process (as expected) and the service entered the "failed" state. It was not restarted.
Posted Feb 1, 2019 14:14 UTC (Fri)
by zblaxell (subscriber, #26385)
[Link] (4 responses)
Not the bug I was looking for, but a bug nonetheless.
Posted Feb 1, 2019 20:15 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Posted Feb 1, 2019 22:11 UTC (Fri)
by zblaxell (subscriber, #26385)
[Link] (2 responses)
Look up the page a little: long-running processes that take a long time to exit after SIGKILL, but eventually get there. You want to restart them, but only after they exit, and there's a big time gap between KILL and exit.
Posted Feb 1, 2019 23:23 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
Duh.
Posted Feb 9, 2019 22:38 UTC (Sat)
by nix (subscriber, #2304)
[Link]
Posted Feb 4, 2019 20:44 UTC (Mon)
by jccleaver (subscriber, #127418)
[Link] (1 responses)
Well, I mean, it sounds like the problem was more that you were sending buggy code through the system. I'd have yelled first at the daemon dev, and second at the test writer for not cleaning up after itself, potentially with kill -9 if the daemon wasn't responsive.
> The fix way back then was to "lsof | xexec kill" at the start of the test.
A test shouldn't have left it hanging, so I'd run that at the end. But if the problem was the blocked port, then, sure this would work too.
Congratulations, you fixed the blocker. I'd much prefer that approach, which is clean, easy to understand, and easy for a human to debug, to *ripping out PID1 and replacing it with something 50x more complicated* just because someone left a hanging process lying around.
Posted Feb 4, 2019 20:57 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link]
> A test shouldn't have left it hanging, so I'd run that at the end. But if the problem was the blocked port, then, sure this would work too.
> Congratulations, you fixed the blocker. I'd much prefer that approach, which is clean, easy to understand, and easy for a human to debug, to *ripping out PID1 and replacing it with something 50x more complicated* just because someone left a hanging process lying around.
This is why we have protected memory.
Posted Jan 30, 2019 22:25 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
> If you're asking "should I use systemd?", the answer is not "yes because timers are awesome."
No, timers alone certainly aren't enough to convert. All the other things that are possible because systemd unifies lots of functionality (like configuration file formats) I see as good byproducts, but not as features in and of themselves. But that's a difference of viewpoint, I suppose. Timers also certainly aren't the best feature overall, but for the functionality provided, I'd think the amount of code dedicated specifically to them is worth more per line (or per unit of code size) than most other feature code in systemd (again, IMO).
Posted Jan 29, 2019 13:08 UTC (Tue)
by blossy (guest, #112606)
[Link] (1 responses)
Posted Jan 30, 2019 10:07 UTC (Wed)
by ale2018 (guest, #128727)
[Link]
The nice thing about X is that you can call `startx` after logging in, after all daemons are up and running, and even then you have a choice of window managers. So, as long as it doesn't interfere with the server's main tasks, it is a pleasure to avail oneself of its features, notwithstanding its monolithic and complicated aspects.
Posted Jan 29, 2019 16:30 UTC (Tue)
by glaubitz (subscriber, #96452)
[Link] (1 responses)
So, for a fair comparison, you should include the various helper tools which Upstart and init need to call in order to achieve the same or similar functionality as systemd.
Posted Jan 29, 2019 22:26 UTC (Tue)
by HenrikH (subscriber, #31152)
[Link]
Posted Jan 29, 2019 3:51 UTC (Tue)
by akkornel (subscriber, #75292)
[Link] (19 responses)
Case in point: I deal with several clusters that use Stanford Central LDAP for account info, and our UID numbers have gotten pretty high. So much so that it overlaps the range systemd uses for dynamic service UIDs.
The biggest annoyance is that the UID range is hard-coded, so I can’t provide an alternate range. Having something upstream take a block of UIDs is not new; Debian’s policy has several such ranges. But in those cases, I can simply grab the package source and rebuild it with a different configured UID. In fact, we do exactly that for one package (specifying a different UID to the install script). But with systemd, the range is buried in the code, and the change is that much more dangerous.
More details: https://github.com/systemd/systemd/issues/9843 (but please don’t spam the GitHub issue!)
Posted Jan 29, 2019 5:55 UTC (Tue)
by mbiebl (subscriber, #41876)
[Link] (6 responses)
option('system-uid-max', type : 'integer', value : '-1',
It should be possible to recompile systemd adjusting the values to your needs. Or am I missing something obvious?
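Assuming the option shown above (and presumably similar options for the other ranges), the rebuild itself would be little more than:

# Sketch only: rebuild systemd with a different system UID ceiling;
# the value here is just an example.
git clone https://github.com/systemd/systemd.git
cd systemd
meson setup build -Dsystem-uid-max=59999
ninja -C build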
Posted Jan 29, 2019 6:18 UTC (Tue)
by akkornel (subscriber, #75292)
[Link] (5 responses)
The other reason is, systemd is such a critical component, that I don't want to mess with the distro's packaging process.
That is one thing worth noting: I'm not talking about an individual system, where I might be willing to do `make ; make install` (or the equivalent). I need to be able to take this change and package it up for at least Debian, Ubuntu 18.04, and maybe RHEL 8 if it includes this. I need to do that because we have a substantial number of systems running those distributions of Linux.
So, that is three build infrastructures that I would have to maintain for this, along with an additional GPG key that I'll have to push out during system commissioning (because I'd want to have signed packages/repositories). I will also have to keep it in sync with each distributor's patches to systemd, and it also means that if there is a security update to systemd, I will have to rush to pull in the change and rebuild everything.
That is a lot of infrastructure to maintain, and what annoys me is, I would only have to do this because the UID range assumption was hard-coded. If, for example, it were possible to specify a UID range via the kernel command-line, then that would be perfectly fine with me (and would actually work really well for our diskless systems).
Posted Jan 29, 2019 9:28 UTC (Tue)
by jani (subscriber, #74547)
[Link] (1 responses)
N.b. it's the *kernel* command-line. Not userspace command-line.
Posted Jan 29, 2019 15:59 UTC (Tue)
by akkornel (subscriber, #75292)
[Link]
Posted Jan 29, 2019 16:22 UTC (Tue)
by mbiebl (subscriber, #41876)
[Link] (1 responses)
Posted Feb 5, 2019 0:15 UTC (Tue)
by jccleaver (subscriber, #127418)
[Link]
I've regularly version-bumped upstream RPMs or added in a custom patch (if necessary) for sites I've been at, but systemd is far too complex and too central to risk something like that, which really honestly does help prove the larger point that systemd's project design is flawed.
I can't imagine there's anything I'd need to patch traditional init, or Upstart on RHEL6, for; but if I had to, I'd feel comfortable releasing it after a bit of testing. There's simply not much it does; the complexity is mostly in the things it launches, like /etc/rc.sysinit. No way I'd risk that with systemd.
Posted Jan 31, 2019 17:10 UTC (Thu)
by ermo (subscriber, #86690)
[Link]
You could even re-use the verbiage from the comments you just made?
Posted Jan 29, 2019 6:05 UTC (Tue)
by luya (subscriber, #50741)
[Link] (5 responses)
Posted Jan 29, 2019 6:36 UTC (Tue)
by akkornel (subscriber, #75292)
[Link] (4 responses)
If you have a shared computing environment, then you want all of your systems to have identical copies of account information. Our largest computing environment is Sherlock, which has at least a thousand compute nodes, and multiple thousands of users. So we need to be sure that each user has the same UID across all systems.
Of course, you could just maintain your own LDAP server. But that is extra infrastructure, and not everybody is an OpenLDAP expert. So, the Central LDAP service is run on good hardware, with good people who know OpenLDAP, with support from the company that has the people who develop OpenLDAP. And so, that is where the UID number is allocated.
It is worth noting that there are many environments where NFS version 3 is still in use. For NFS version 3, the protocol only works with UID and GID, so both client and server must be working off of the same list of accounts. Yes, I know NFS version 4 does use names instead of IDs, and many of our compute environments are running version 4, but it has not been smooth. We have several compute environments that use scalable NFS storage from a very large, well-known company. We keep up to date with the storage server software. But when we switched to NFS 4.0, we encountered many major bugs (at least one bug was completely new), and got to the point where faculty were very unhappy.
It is also worth noting that reusing UIDs can be very dangerous. Even if your NFS server is running NFS 4, and sending names over the connection, the files are still being stored on disk with a UID. So, if that UID eventually becomes linked to a different person, then that person might have access to old files that they should not be able to see.
I should also note, everything I said does not mean that we are ignoring cloud. In fact, on Wednesday two of my co-workers will be giving a session on what we are doing in cloud. But traditional, multi-user environments are still going strong, because that is still the most cost-effective option for most workloads. And that requires a single, authoritative source for UIDs.
Posted Jan 29, 2019 7:29 UTC (Tue)
by drag (guest, #31333)
[Link] (1 responses)
Most people's experience with Linux is just using it as a server platform, and for that case people have long since moved away from using Unix account management for end users for large-scale internet stuff. Providing Unix workstation and shell accounts to people on such a large scale is very unusual. It's impressive that it works as well as it sounds like it does.
Cloud computing can probably help somewhat by reducing costs. Even though the per-resource expense may be higher, the convenience it offers to end users ends up saving money; when it works out. Traditionally, for most enterprise environments, getting access to compute resources involves a lengthy ritual involving getting the attention and approval of some sysadmin somewhere, typically via some ticketing system. It's a painful and time-consuming process where you are forced to justify your need in front of some jaded sysadmin looking for some excuse to shoot you down or change how you want to do things. People end up hanging on to resources once they get them because of the effort required to obtain them in the first place. Whereas when a cloud is done right, users are provided a budget they can use to spin up resources in mere moments. When people can spin up dozens of instances in seconds using whatever tool they prefer, then it's no longer a big deal to release those resources when you are finished with them. Especially when it's their own budget.
Obviously, though, this isn't a solution to the UID issue if applications and environment dictates shared access to Unix systems.
Posted Jan 29, 2019 9:32 UTC (Tue)
by nilsmeyer (guest, #122604)
[Link]
Indeed. Most environments I worked in used static accounts, typically deployed using something like Ansible or Puppet. This of course has other issues.
> Cloud computing can probably help somewhat by reducing costs. Even though the per-resource expense may be higher, the convenience it offers to end users ends up saving money; when it works out.
I've often seen that the costs are a lot higher than projected, especially if you have requirements for spare capacity (HA) and your application doesn't scale well horizontally. You do have a very good point with the time saving for users, it's very easy to overlook that factor.
> Traditionally, for most enterprise environments, getting access to compute resources involves a lengthy ritual involving getting the attention and approval of some sysadmin somewhere, typically via some ticketing system. It's a painful and time-consuming process where you are forced to justify your need in front of some jaded sysadmin looking for some excuse to shoot you down or change how you want to do things.
Not only a sysadmin but also often someone who is actually allowed to spend money, even if it's not "real money" in a sense that the hardware is already paid for. I would say though that it may often be advisable to fix the sysadmins or remove them from the process. This BOFH obstructionist attitude that some people bring really isn't helping things - of course that's usually an issue with overall corporate culture.
> Whereas when a cloud is done right, users are provided a budget they can use to spin up resources in mere moments. When people can spin up dozens of instances in seconds using whatever tool they prefer, then it's no longer a big deal to release those resources when you are finished with them. Especially when it's their own budget.
I agree but the caveat "done right" of course applies, and this is where it often gets hairy since some organizations don't like to spend resources on better tooling. Then you end up with a lot of unused capacity dangling around, budgets being depleted through carelessness and mistakes or things end up breaking when someone pulls the plug once the budget is spent.
Posted Jan 29, 2019 22:33 UTC (Tue)
by bfields (subscriber, #19510)
[Link] (1 responses)
At the NFS layer, yes, at the RPC layer, no. NFSv4 can use strings when getting and setting attributes like file owners or groups or ACLs. At the RPC layer (which is what identifies who is performing a given operation) it still uses numeric IDs, unless you're using kerberos. The NFSv4 design pretty much assumed everyone would want kerberos.
You may already know that, and it's a bit of a digression, apologies. But it causes frequent confusion.
Posted Jan 31, 2019 17:40 UTC (Thu)
by akkornel (subscriber, #75292)
[Link]
Posted Jan 29, 2019 23:26 UTC (Tue)
by intgr (subscriber, #39733)
[Link] (3 responses)
To me, static non-configurable UID ranges in this case actually sound like a feature: that way you can adapt your own UID allocation to reliably avoid that range. Although sure, it's probably a major inconvenience to do the migration.
It seems the situation would be worse if the dynamic UID range could vary from machine to machine; that way the collisions would be more surprising and harder to avoid.
Posted Jan 31, 2019 9:56 UTC (Thu)
by jschrod (subscriber, #1646)
[Link] (2 responses)
That is my major issue with systemd: that it postulates that everybody else shall adapt to its conventions, without configurability, even if other conventions have existed for a very long time.
Another sign of this mindset is the refusal to add proper environment variable support to service unit files, which would enable the relocation of directory trees according to local setup standards (e.g., not wanting to place your PostgreSQL database files in /var/lib/pgsql, or managing multiple installations of a service).
Posted Jan 31, 2019 17:49 UTC (Thu)
by akkornel (subscriber, #75292)
[Link]
>That is my major issue with systemd: that it postulates that everybody else shall adapt to its conventions, without configurability, even if other conventions have existed for a very long time.
This is also pretty difficult here. In our case, Central LDAP is used by groups throughout the University, and getting a complete list of people who use the UIDs is difficult, because the UID attribute is one that is available without needing to authenticate.
So, if we wanted to change someone's UID, we would have to perform a large outreach campaign, which still wouldn't catch everyone. Then, on the day of the change, everyone involved would trigger a big series of `find -user … -exec chown new_uid {} \;` commands, across all of their file storage. Oh, and you'd have to ensure that the user isn't logged in _anywhere_ while the change is done.
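Spelled out, each storage owner would be running something along these lines (the path and the IDs are placeholders):

# Hypothetical example of the per-user remap.
OLD_UID=312045
NEW_UID=412045
find /srv/storage -xdev -user "$OLD_UID" -exec chown -h "$NEW_UID" {} +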
Now, in a normal organization, there would be a manager somewhere up the hierarchy who is in charge, and who would tell people "you have to do this, on this schedule". Or, you will have a corporate IT department who sets the policies. Here, the common manager is the University President. Forcing a move like this would burn a fair amount of 'political capital'.
Plus, and this may sound petty, but we were there first. What I mean is, we allocated those UIDs before systemd picked that range to use.
So I hope you can understand now why "adapt[ing] your own UID allocation" is pretty difficult to do here.
Posted Jan 31, 2019 22:32 UTC (Thu)
by rahulsundaram (subscriber, #21946)
[Link]
Not sure what you mean? There are multiple ways to expose environment variables to services. The most straightforward one being
https://coreos.com/os/docs/latest/using-environment-varia...
What more is required here?
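For instance, a drop-in plus an environment file is enough to relocate a data directory without touching the shipped unit. A minimal sketch, assuming a service that honours such a variable (the unit and variable names here are made up):

# Sketch: add environment settings to an existing unit via a drop-in.
mkdir -p /etc/systemd/system/mydaemon.service.d
cat > /etc/systemd/system/mydaemon.service.d/override.conf <<'EOF'
[Service]
EnvironmentFile=-/etc/sysconfig/mydaemon
Environment=DATA_DIR=/srv/mydaemon/data
EOF
systemctl daemon-reload
systemctl restart mydaemon.service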
Posted Jan 30, 2019 4:41 UTC (Wed)
by filbranden (guest, #87848)
[Link] (1 responses)
Also, most of the developers there agreed that there should be some dynamic configs for your particular case. (They haven't been implemented yet, but I find there's agreement that there should be some.) So I imagine your problem will get solved eventually.
The uid range clash is mainly for systemd Dynamic Users and that feature is not really being used in the wild yet (there are some more large issues with it, such as D-Bus handling of the dynamic owner user/group of the service.) So just not using any Dynamic Users for now is definitely an option for you too.
In short, this should be addressed in time.
Posted Jan 31, 2019 17:58 UTC (Thu)
by akkornel (subscriber, #75292)
[Link]
Yeah, from what I could tell, I would need to have one or more processes maintaining `flock()` locks on 2,498 files. That seemed like a bit of a stretch.
> Also, most of the developers there agreed that there should be some dynamic configs for your particular case. (They haven't been implemented yet, but I find there's agreement that there should be some.) So I imagine your problem will get solved eventually.
Ah, thanks for the insight! I wasn't really sure because the GitHub issue is still tagged 'needs-discussion'. It hasn't yet been tagged anything else (like 'rfe'), so my impression was that there was not yet a consensus (or agreement), and that my RFE was still at risk of being closed.
> The uid range clash is mainly for systemd Dynamic Users and that feature is not really being used in the wild yet (there are some more large issues with it, such as D-Bus handling of the dynamic owner user/group of the service.) So just not using any Dynamic Users for now is definitely an option for you too.
Indeed. My biggest concern is that, the longer this takes, the harder it will be to get any change brought back into the distros we use.
Posted Jan 29, 2019 7:51 UTC (Tue)
by epa (subscriber, #39769)
[Link] (3 responses)
Posted Jan 29, 2019 13:47 UTC (Tue)
by BradReed (subscriber, #5917)
[Link]
Posted Feb 4, 2019 19:01 UTC (Mon)
by jccleaver (subscriber, #127418)
[Link] (1 responses)
Well, compared to systemd, yes.
Really, though, tools like chkconfig take the drudgery out of symlink management, and from there you're left with a relatively simple control loop for flipping your state:
telinit > /etc/rc > scan /etc/rc.d/ directories for things to start and stop
IMO it provides the right amount of flexibility for controlling what is essentially a deterministic startup. For an embedded system where every cycle counts, sure the BSD approach might be appropriate, but for a general server installation having the flexibility of arbitrary control of the system through directories that are read when things need to change (and not single all-encompassing config files) makes sense.
This is from a Red Hat perspective, though. If I had to manually move symlinks around I might feel differently.
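For reference, the chkconfig workflow above boils down to a few commands (the service name is a placeholder):

chkconfig --add mydaemon        # register the script's default runlevels
chkconfig mydaemon on           # enable it; the symlinks are managed for you
chkconfig --list mydaemon       # show its per-runlevel on/off state
service mydaemon start          # and start it now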
Posted Mar 5, 2019 7:22 UTC (Tue)
by immibis (guest, #105511)
[Link]
For my desktop, and the embedded devices I work on for a job, I've never really wanted more than two runlevels at most.
If I did want multiple runlevels, they should be up to me to configure; they shouldn't be there by default.
Posted Jan 29, 2019 8:57 UTC (Tue)
by linusw (subscriber, #40300)
[Link]
Posted Jan 29, 2019 16:39 UTC (Tue)
by jthill (subscriber, #56558)
[Link]
Posted Jan 30, 2019 10:05 UTC (Wed)
by tarjei (guest, #29357)
[Link] (2 responses)
I'm willing to bet that a lot of the hate and contempt[1] that systemd is getting is rooted in the frustration of feeling a lot less productive on the command line when doing your job of setting up or debugging a server.
The normal response is often that you should define your own aliases and command-line expansions, which is really missing the point. Very few admins today work on a few systems where they can add their own aliases and move on. Most touch a large number of different systems, and _the defaults must be good_.
A shorter version of systemctl (sd?) and very fast command-line completion support are more important for getting the haters to hate less than any amount of explaining why it was needed.
Systemd solves a real problem, but for a lot of its users it does so by making their work more frustrating to do.
1. With arguments ranging from "it's not UNIX" to "the binary is too large". IMHO these are just arguments used because pointing to UX frustrations is not "technical".
Posted Jan 30, 2019 13:05 UTC (Wed)
by anselm (subscriber, #2796)
[Link]
Wait, what? Systemd contains auto-completion configuration files for most of its commands.
Also, I think the “admin user experience” is enhanced considerably by systemd features like the ability to query the status of a service along with the most recent log messages from that service, the ability to kill reliably any service that has got stuck (no matter how many processes it uses), or the ability to separate local configuration completely from the system-provided configuration, which makes it easy to inspect local changes to systemd units and carry them forward across upgrades. Not to mention the various ways in which services can be “hardened” by limiting their access to the rest of the system.
Compared to these, bemoaning the fact that "syst<TAB>c<TAB>" is (gasp) seven keystrokes seems petty, especially since the seasoned system administrators who are purportedly vexed by this should find it straightforward to come up with any number of methods to abbreviate that to, say, "sc" in a way that can be auto-deployed with new server installations. For example, if you're using Ansible, a playbook entry that drops a small alias file into /etc/profile.d/ would do; a sketch of such a file follows.
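Something along these lines, deployable by whatever configuration management is already in place (the file name and abbreviations are arbitrary):

# /etc/profile.d/sc.sh -- site-wide abbreviations for interactive shells
alias sc='systemctl'
alias jc='journalctl'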
Posted Jan 30, 2019 13:06 UTC (Wed)
by pizza (subscriber, #46)
[Link]
I'm not sure what you're referring to here. Tab completion on the cmdline is not the responsibility of systemd, sysvinit or any other init system -- instead your login shell is responsible for such things.
That said, there have been shell completion hooks shipped with systemd since v12 all the way back in November 2010. So tab completion has been part of the systemd admin experience nearly all along. So if your distro doesn't ship or enable shell completion, that's not systemd's fault...
Posted Jan 30, 2019 21:06 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link] (2 responses)
If you want to sell something to people who are critical of it, consider not pseudo-psychoanalyzing their conjectured personality deficiencies, especially in absolutely unchanged form for years. Rational people will embrace positive changes and reject negative ones.
And BTW, a tragedy is a form of drama where the protagonists meet their demise because they're forced into situations in which they cannot prevail because of their fundamental, usually positive, character traits.
Posted Jan 30, 2019 21:44 UTC (Wed)
by mpr22 (subscriber, #60784)
[Link]
When you meet one, you can test this hypothesis for us.
Posted Feb 1, 2019 19:40 UTC (Fri)
by xtifr (guest, #143)
[Link]
Slightly off-topic, but this is certainly not the *dictionary* definition. It might be a specialized (jargon) definition within the realm of literary criticism or something, but it's certainly not *the* definition. (It's not even *the* definition when referring to fiction or theatre, let alone when talking about real-life tragedies--real life doesn't have "protagonists".)
Posted Jan 31, 2019 11:44 UTC (Thu)
by freehck (guest, #91143)
[Link] (27 responses)
Posted Jan 31, 2019 14:05 UTC (Thu)
by pizza (subscriber, #46)
[Link] (26 responses)
Just because you're used to pain of beating your head against a given wall doesn't mean that continuing to do so is a sound strategy.
Posted Jan 31, 2019 18:32 UTC (Thu)
by zblaxell (subscriber, #26385)
[Link] (9 responses)
The former status quo (sysvinit, upstart, et al, in the last few years before systemd came along) was a series of disastrous regressions. It seemed every other upgrade introduced subtle (or not so subtle) changes in behavior that broke something important. Every time something broke, we would take over maintenance of that component, i.e. replace it with something that met our requirements, and disable the old thing. Eventually we locked down all our critical core services because upstream unapologetically broke every one of them at some point. Throughout this process we maintained compatibility with the former former status quo, i.e. decades-old management applications continued to work, and we'd still use most of the upstream initscripts (with suitable helper tools that solved their most obvious problems).
Transitioning to systemd is a tsunami of regressions. systemd comes with a massive reorganization of core services, fueled in part by the systemd community's encouragement of upstream developers to abandon compatibility with the former status quo (kudos to those developers who refused!). Problem areas that we previously identified and locked down need to be identified and fixed again because they now occur in new places. systemd also adds a bunch of new problems you only get to experience by running systemd on specific hardware/firmware combinations.
Some pieces of systemd are clearly better, but we can't run just those pieces without getting serious regressions at the same time. A few people are trying to break systemd into more easily consumable pieces, but so far that's not better, just a distinct set of regressions for us to manage. The former status quo was never fully integrated in the first place, so it's trivial to pick out fragments of working code (often you don't even need to compile anything).
At present, when we try to integrate systemd into an existing system image, that system image immediately stops working properly, so we don't do that. We have a lot of these images and they have indefinite life cycles, so we could be not using systemd for some decades (or until a catastrophic external event occurs, e.g. amd64 hardware stops being available and in-place upgrades cease to be possible).
Posted Jan 31, 2019 18:40 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (5 responses)
What are the regressions? Systemd can fall back to the old SysV behavior if you don't use unit files.
Posted Feb 1, 2019 5:20 UTC (Fri)
by zblaxell (subscriber, #26385)
[Link] (4 responses)
It depends on the method of upgrade. We can do a package manager install (e.g. something like 'apt-get install systemd'), or build and install systemd from source on a self-hosting system, or we can just build a completely new boot image with systemd and throw that on a VM or target device that previously ran a similar image without systemd. Each of these approaches fails in its own unique ways.
So far:
* assorted filesystem failures (filesystems don't mount, get mounted rw, get mounted in the wrong place, get stuck because DNS isn't up yet, get stuck because network down, mount with wrong flags, mount in multiple places, expose wrong filesystems to chroots, doesn't fsck, fscks with wrong flags)
* assorted network failures (firewall rules not established before devices up, devices up when they shouldn't be, devices not up when they should be, unwanted DHCP, unwanted autoconfigured IP addresses, firewall rules broken by network device name changes)
* changes in error handling behavior (mount failure stops boot dead if the device is not noauto in fstab)
* changes in power management behavior (responds to previously deconfigured inputs, suspend loops, triggers buggy firmware, ignores existing PM configuration)
* package installation failures ('apt-get install random-package' blows up dependency resolver, packages conflict at runtime, init crashes during the upgrade and panics the kernel, system left unbootable)
* missing dependency information is now a RC bug (systemd picks an execution order different from the order used for the previous decade)
* miscellaneous data loss (cgroup service kill behavior for ssh, wipes /tmp on boot, backups fail due to not enough mounts or fill up disk because too many)
* different cgroup controller organization (broke cgroup management code, caused a few host crashes and a lot of performance losses)
A lot of the above are "systemd has a knob we can turn to fix this, but the knob is in a different place from the one we previously turned." Some of those are fixable and we have fixed them, but the new failure modes just keep coming. Some of them aren't even systemd problems, they're problems with random upstream packages that have introduced behavior changes to accommodate systemd.
We haven't yet seen a system that can just upgrade in place from sysvinit to systemd. Even "install Debian jessie on wiped system previously running Debian squeeze" managed to find a way to fail: our earlier netinst/debootstrap-based system images don't include any packages that touch firmware, but systemd ships with power management turned on by default, and the machine got into a state that the power button couldn't get it out of. I had to take the machine apart to make it boot again.
> Systemd can fallback to old SysV behavior if you stop using unit files.
That's...different from the usual advice I get. Is the idea to start with no unit files and systemd as /sbin/init, then replace one legacy config file with a unit at a time? And how do I explain what I'm doing to apt?
Posted Feb 1, 2019 6:00 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
> * assorted network failures (firewall rules not established before devices up, devices up when they shouldn't be, devices not up when they should be, unwanted DHCP, unwanted autoconfigured IP addresses, firewall rules broken by network device name changes)
> * changes in error handling behavior (mount failure stops boot dead if the device is not noauto in fstab)
> * changes in power management behavior (responds to previously deconfigured inputs, suspend loops, triggers buggy firmware, ignores existing PM configuration)
> * package installation failures ('apt-get install random-package' blows up dependency resolver, packages conflict at runtime, init crashes during the upgrade and panics the kernel, system left unbootable)
> * missing dependency information is now a RC bug (systemd picks an execution order different from the order used for the previous decade)
> * miscellaneous data loss (cgroup service kill behavior for ssh, wipes /tmp on boot, backups fail due to not enough mounts or fill up disk because too many)
> * different cgroup controller organization (broke cgroup management code, caused a few host crashes and a lot of performance losses)
So to recap, you have a pile of crap that barely moves and breaks horribly when somebody breathes in its direction. I have no trouble believing that you have "regressions" with it.
Posted Feb 1, 2019 20:17 UTC (Fri)
by zblaxell (subscriber, #26385)
[Link]
> This has nothing to do with systemd.
The "expose wrong systems to chroots", and "get mounted in the wrong place" are directly caused by systemd. Systemd makes / into a shared namespace, changing to the kernel default behavior (originally the _only_ behavior) which is private. So an application that doesn't expect shared namespace does bind mounts with default options (at one point there were no options), and that application gets moved to a system running systemd, and now suddenly the bind mount appears in places it should not, or can be umounted from places it should not. If it's a non-bind mount then there can be problems with device usage at umount (extra references, can't free the block device to manipulate it)
> BTW, it can track network dependencies for mounting
We usually prefer it not to. Unless the machine has an important database it needs to flush to disk or controls an external device that has specific powerdown requirements, we normally just SIGKILL all processes and go straight to poweroff on shutdown, without considering or modifying the state of filesystems or network devices. We have to have the crash code path because we'll always be forced to use it if the host crashes. There's no point in having a separate, longer shutdown code path with more complexity, interacting subsystems, and failure modes to test, when the project has no feature (like the database state) that needs it.
> > * assorted network failures (firewall rules not established before devices up, devices up when they shouldn't be, devices not up when they should be, unwanted DHCP, unwanted autoconfigured IP addresses, firewall rules broken by network device name changes)
Sure, /lib/systemd/systemd probably didn't do that. It was more likely a unit file supplied by the distro with some unfortunate defaults set. But only systemd reads unit files, so here we are.
> > * changes in error handling behavior (mount failure stops boot dead if the device is not noauto in fstab)
If the choices have to be "inaccurate fstab emulation" and "no fstab emulation" I'd prefer the latter; however, last time I checked, we were talking about upgrades, so our initial condition here is that there's fstab and it has important stuff in it that systemd('s units) should do something about.
If there was a tool that did a one-time conversion from /etc/fstab to an equivalent systemd unit collection, and could report failure if the fstab contents couldn't be exactly represented by systemd (because nobody wants to replicate 20-years-old undocumented quirks of behavior)...it'd probably flag every fstab file we have, so maybe not so helpful.
If systemd didn't process /etc/fstab at all (i.e. it calls mount -av to handle fstab because fstab is a legacy config file for a legacy tool, and anyone who really wants systemd to handle their mounting isn't using fstab any more)...then installing systemd on a legacy system wouldn't change behavior so much. Sure, it'd be slower, but correct (assuming replicating the legacy configuration exactly is correct) is more important than fast.
> > * package installation failures ('apt-get install random-package' blows up dependency resolver, packages conflict at runtime, init crashes during the upgrade and panics the kernel, system left unbootable)
Sorry, I left out "random-package-that-depends-on-systemd-somehow." apt-get dist-upgrade is never completely trouble free, but when systemd is involved the failures can be spectacular. Very few Debian packages can trigger reboots when they fail--basically only packages that contain a pid 1 process.
> > * missing dependency information is now a RC bug (systemd picks an execution order different from the order used for the previous decade)
Yes it did. We had to dispose of Debian insserv too.
The insserv event made us pivot our private service manager from "a small watchdog that pings a couple of important services on a few critical machines" to "a standard thing that we install everywhere, that handles everything we need done right by itself, and then (maybe) runs an expurgated version of the distro's normal init as a service after that's done."
> > * miscellaneous data loss (cgroup service kill behavior for ssh, wipes /tmp on boot, backups fail due to not enough mounts or fill up disk because too many)
Only "wipes /tmp on boot" happens with SysV init, and it has an off switch that unit files supplied with systemd don't respect. The cgroup service kill behavior definitely doesn't come with SysV init (though there are more ways to get it than installing systemd).
The backup issues are a consequence of the earlier private/shared mount behavior changes. Sorry, I didn't mean to repeat.
> > * different cgroup controller organization (broke cgroup management code, caused a few host crashes and a lot of performance losses)
That one's only an issue if you were already using cgroups to wrangle services before systemd was installed on the system, and you were relying on resource constraints to prevent services from interfering with each other or the host in general. There are multiple layouts possible when mounting the cgroup fs, systemd just picked a different one than we did, and it's not possible for different layouts to coexist in the same kernel instance. We can adapt to systemd there, but the problem from an upgrade perspective is that we had to adapt to systemd there.
> So to recap, you have a pile of crap that barely moves and breaks horribly when somebody breathes in its direction.
We have requirements, and people who measure whether we meet them. That makes us naturally inflexible (some might say "rigid") about the changes we can accept from upstream. We can wait a year or two (or ten!) while upstream works their issues out, or switch to another upstream with fewer issues.
We are now living on an island that isolated us from the init daemon wars. We built the island because Debian threw insserv at us, and it was cheaper to move to a mole-free hill than to play whack-a-mole long enough to fix Debian. We kind of like it here--it's a cheap place to live, we're not constantly playing whack-a-mole any more, our machines do their jobs and enable us to do ours despite all the crazy bad stuff that happens to them, and we can still use the important parts of Debian. Why would we leave? More people should live here.
If systemd is going to be another round of whack-a-mole like insserv then we'll just stay here on this nice island thanks. If Debian gets their packaging sorted so installing systemd doesn't break existing systems so hard, we might leave the island (the tiny amount of maintenance work we have to do on the island is cheap, but no work at all is cheaper). No rush--there's plenty of other stuff to do while we're waiting for that to happen.
> I have no trouble believing that you have "regressions" with it.
We have lots of cases of the form "triggering event: systemd install" -> "root cause: new behavior introduced by systemd, its default configuration, or its dependencies" -> "fix: update mature code for the first time in 10-20 years." Other cases end with a more direct reference to a previously fixed bug (like "sometimes xyz service is not running") that now has to be fixed again. If those are not "regressions" then I need a better thesaurus.
Posted Feb 3, 2019 13:48 UTC (Sun)
by judas_iscariote (guest, #47386)
[Link] (1 responses)
So, it exposes bugs on your systems that were ignored by other init systems.
> assorted network failures (firewall rules not established before devices up, devices up when they shouldn't be, devices not up when they should be, unwanted DHCP, unwanted autoconfigured IP addresses, firewall rules broken by network device name changes)
Systemd does not set up the firewall, another component does; and unless you are explicitly using systemd-networkd, it does not set up network interfaces other than "lo".
> changes in error handling behavior (mount failure stops boot dead if the device is not noauto in fstab)
As opposed to never noticing, or the failure simply being ignored.
> changes in power management behavior (responds to previously deconfigured inputs, suspend loops, triggers buggy firmware, ignores existing PM configuration)
Yes, it does respond to previously deconfigured inputs... I do not know what suspend loop you are talking about, nor why you think bugs in firmware have anything to do with systemd. And what do you mean by existing "PM configuration"? Do you mean that runtime configuration set by a bunch of mostly useless shell scripts from pm-utils is no longer applied? That is a distribution policy thing. Most if not all power management configuration consists of kernel knobs, and the kernel usually sets no policy whatsoever by default.
> package installation failures ('apt-get install random-package' blows up dependency resolver, packages conflict at runtime, init crashes during the upgrade and panics the kernel, system left unbootable)
Well, packages are not part of systemd, for a start.
> missing dependency information is now a RC bug (systemd picks an execution order different from the order used for the previous decade)
Missing dependency information will make obvious what, in the previous system, was that "funny" occurrence that caused a service to fail on a particular phase of the moon. As you probably already know, rc scripts are provided by the particular distribution you are using and are incompatible with others.
> miscellaneous data loss (cgroup service kill behavior for ssh, wipes /tmp on boot, backups fail due to not enough mounts or fill up disk because too many)
cgroup service kill behaviour is configurable, and better than it was before (i.e. there was none; if you were lucky, things started correctly and that was it). /tmp being wiped on boot is configurable and a distribution policy issue.
> different cgroup controller organization (broke cgroup management code, caused a few host crashes and a lot of performance losses)
Not sure what that means at all.
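On the kill-behaviour point, the knob in question is a single per-unit directive. A minimal sketch, assuming the Red Hat name for the SSH service unit:

# Sketch: disable the cgroup-wide kill for one service so that, for example,
# sessions spawned from it are left alone when the service is stopped.
mkdir -p /etc/systemd/system/sshd.service.d
cat > /etc/systemd/system/sshd.service.d/killmode.conf <<'EOF'
[Service]
KillMode=process
EOF
systemctl daemon-reload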
Posted Feb 5, 2019 16:49 UTC (Tue)
by zblaxell (subscriber, #26385)
[Link]
We configured the other init to not have those problems. There's no packaged, working upgrade path from anything to systemd, so on upgrade every configuration knob we've ever set over the years instantly reverts to whatever the distro default is today, and that breaks a lot of stuff.
Some of the bugs are unique to systemd, i.e. if systemd didn't exist, we'd deterministically never have the problems. Other inits don't depend on non-default kernel behavior to the extent systemd does, so we hit problems that happen with test systems using systemd and nowhere else.
The original question was "what are the regressions?" so I just summarized the last decade of ways existing code can combine with systemd and fail (so far).
> > changes in error handling behavior (mount failure stops boot dead if the device is not noauto in fstab)
Those errors were intentionally ignored (by setting fsck pass field to 0, or using the 'bg' option for NFS). Absent and busted filesystems are a significant use case for us, so all the service applications we run handle it. 'mount -a' normally doesn't even try to mount if the device isn't present at all.
Posted Feb 1, 2019 1:12 UTC (Fri)
by anselm (subscriber, #2796)
[Link] (2 responses)
After 15 years of teaching Linux to new system administrators I can confidently state that IMHO the former status quo really, really sucked. There's nothing like explaining the setup to someone who isn't already invested in it that brings out just how terrible it really was. With hindsight it baffles me (a) how we managed to tolerate it for so long (30+ years!), and (b) how people can seriously believe that, in a world where systemd exists, it is still something worth using.
Posted Feb 1, 2019 1:25 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link]
For the first few years of my Linux experience I was treating init scripts like something magical. I remember cutting&pasting a script to run Jetty (a Java web server) as a service and spending a whole day trying to understand how to make it launch automatically.
Posted Sep 6, 2019 0:06 UTC (Fri)
by soes (guest, #134247)
[Link]
Posted Feb 1, 2019 1:08 UTC (Fri)
by flussence (guest, #85566)
[Link] (14 responses)
Yes, it's *good* that the LSB trash fire was put out, less so that the explosives used for it levelled the rest of the town.
Posted Feb 1, 2019 1:51 UTC (Fri)
by luya (subscriber, #50741)
[Link] (5 responses)
Systemd is basically what Upstart should have been, had Canonical properly relicensed it (i.e. dropped the CLA requirement) as requested. Ask both Poettering and Remnant, the original author of Upstart.
OpenRC still relied on the shell-script method that the majority of distributions no longer wanted. Linux badly needed its own service management. While both Upstart and OpenRC tried, they weren't enough because of those shell scripts.
Posted Feb 4, 2019 6:14 UTC (Mon)
by flussence (guest, #85566)
[Link] (4 responses)
Florid prose can't hide your total lack of understanding. You speak of OpenRC in the past tense, ignoring the fact that multiple distros use it and that systemd is *still* playing catch-up (it doesn't even use cgroups in a secure way!)
Posted Feb 4, 2019 11:31 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link]
What exactly is more secure in OpenRC?
Posted Feb 7, 2019 0:29 UTC (Thu)
by luya (subscriber, #50741)
[Link] (2 responses)
Posted Feb 13, 2019 8:19 UTC (Wed)
by flussence (guest, #85566)
[Link] (1 responses)
Posted Feb 13, 2019 10:51 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Feb 1, 2019 5:25 UTC (Fri)
by mjg59 (subscriber, #23239)
[Link]
Posted Feb 2, 2019 0:19 UTC (Sat)
by rgmoore (✭ supporter ✭, #75)
[Link] (6 responses)
It would be easier to believe that RedHat was ignoring Upstart if they hadn't adopted it as their init system for RHEL 6. Fedora also used it starting with Fedora 9. In practice, most of the major distributions first switched to Upstart (or something other than SysV init) and then moved to systemd because they found it did a better job of solving their problems.
Posted Feb 2, 2019 1:10 UTC (Sat)
by rahulsundaram (subscriber, #21946)
[Link] (5 responses)
Indeed. Red Hat developers involved in Fedora looked at either developing a new system or adopting existing alternatives to sysvinit, and chose Upstart:
https://fedoraproject.org/wiki/FCNewInit
Red Hat specifically moved to using Upstart in RHEL 6 and had no corporate plan to move away from it. Lennart developed systemd on his own time (despite discouragement from his manager, precisely because RHEL 6 had already committed to Upstart), originally after discussing it with Scott, the primary developer of Upstart. It is likely that, had Canonical given up on its insistence on a CLA, Upstart would have morphed into adopting many of the ideas from systemd (which were themselves heavily based on launchd).
Red Hat, at least at that time, was heavily developer driven. There was no corporate plan to diss Canonical or anything like that. On the contrary, there were prominent Red Hat contributors running Debian and even contributing to it, and several of the major JBoss contributors were big Ubuntu users/fans, etc. Development for the enterprise releases at that time was surprisingly loosely organized (fun fact: RHEL 6 was originally planning to ship both up2date and yum until I argued, fairly last minute, on the corresponding OS development list that it was pointless to do so, and won that debate), and the resulting quality was often driven by passionate, heroic developers. Judging by the external signs, even in Fedora things have become a lot more organized and process driven, but anyone thinking that systemd was somehow corporate driven is simply mistaken.
Posted Feb 4, 2019 19:15 UTC (Mon)
by jccleaver (subscriber, #127418)
[Link] (4 responses)
And it's worth reiterating for folks that RHEL 6's Upstart implementation was mostly identical to the previous (traditional) init implementation, at least for anything that most admins had to deal with on a regular basis. (You could go your entire deployment without having to care how getty was started in early boot, for example, and no one was using /etc/inittab for starting things directly by that point anyway.)
Hard to say if a RHEL7 w/o systemd would have tried to move towards using more of upstart's dynamic features, but I doubt too much change ever would be done in RHEL6 when the static SysV-based method worked fine enough. Those that needed dynamic or monitored service management knew how to hook those *into* a SysV framework, which is why the PID1 vs PID2 debate now is so salient.
Posted Feb 5, 2019 22:08 UTC (Tue)
by johannbg (guest, #65743)
[Link] (3 responses)
So the only person who could truly compare and criticize systemd from an Upstart perspective is the same man who wrote Upstart and who recommended that another init system be written rather than his being fixed (hence systemd was born)...
So people can contemplate that *fact* while they continue riding wishful-thinking ponies in circles, throwing not rocks but bricks in the glass house, worshipping the shell gods, and chanting rhymes about glorious SysV, OpenRC, and whatnot.
Bottom line: systemd can be criticized for many things (and arguably justly so), but none of them are the things it has been criticized for in this thread.
Comparing legacy shell-script-based init systems with systemd is comparing apples to oranges...
Posted Feb 6, 2019 0:16 UTC (Wed)
by anselm (subscriber, #2796)
[Link] (1 responses)
The main advantage of systemd is that it actually exists today. Over and over again we hear a lot about how System-V init or for that matter OpenRC, with just a few bits and pieces added to them, could be far better than systemd, but nobody seems to be prepared to do the drudge work to actually prove this by demonstration.
IOW, one Lennart Poettering who actually releases working code is better than ten people who complain about how bad systemd is, and fantasise about the great code somebody (not them) could write that would make init system XYZ obviously superior to systemd.
Posted Feb 6, 2019 23:28 UTC (Wed)
by johannbg (guest, #65743)
[Link]
I might drop by at the next BSD con to see where they are at with this, as in whether they have realized they need a service manager (Solaris came up with SMF what, ten or twenty years ago), and if so whether they are discussing what they can learn from systemd and adapt to their own service manager, and what they are going to leave out, or whether they are still in denial.
Posted Feb 11, 2019 22:55 UTC (Mon)
by oak (guest, #2786)
[Link]
Posted Feb 4, 2019 19:08 UTC (Mon)
by jccleaver (subscriber, #127418)
[Link]
The "former status quo" is still the "status quo" for RHEL6 users. And it still works just fine.
Posted Feb 11, 2019 12:21 UTC (Mon)
by yxejamir (guest, #103429)
[Link] (1 responses)
https://www.youtube.com/watch?v=6AeWu1fZ7bY
Posted Feb 12, 2019 23:33 UTC (Tue)
by johannbg (guest, #65743)
[Link]
Posted Feb 11, 2019 23:08 UTC (Mon)
by oak (guest, #2786)
[Link] (2 responses)
Posted Feb 12, 2019 13:41 UTC (Tue)
by tao (subscriber, #17563)
[Link]
Posted Feb 12, 2019 23:35 UTC (Tue)
by johannbg (guest, #65743)
[Link]
Posted May 28, 2019 12:43 UTC (Tue)
by chhex (guest, #132284)
[Link]
Posted Sep 6, 2019 0:11 UTC (Fri)
by soes (guest, #134247)
[Link]
He basically says that writing a system is the fast part.
Verifying that it is correct and documenting it is 75% of the work, that is, if you want it done properly.
Yeah, I know that in many cases the supplied shell scripts, for example, hadn't been through that, i.e. the check: is this correct? Does it do what it needs to? Can we document it? Describe it?
Posted Dec 20, 2019 23:52 UTC (Fri)
by markt- (guest, #136239)
[Link] (3 responses)
I consider this sort of statement to be very much like someone saying that anybody in a country who is critical of its dictator should try and find one thing about the dictatorship that they like.
It doesn't matter if you like it or hate it, it is being forced upon you and you have zero choice in the matter (beyond to leave, if that option happens to be available). This is antithetical to both Unix specifically and freedom in general, and very much the foundation of a lot of objection to it.
Posted Dec 21, 2019 0:20 UTC (Sat)
by pizza (subscriber, #46)
[Link]
Huh? When has Unix ever been about choice or freedom?
Did Unix System V give you a choice of kernels, libc, or even init systems? How about Solaris, HP-UX, AIX, or Irix? Where was the source code that you could modify to suit your needs?
GNU (and to a lesser extent, Linux) owes its entire existence to Unix's shortcomings, especially with respect to freedom (and Freedom). There are good reasons for all the legacy Unix baggage in autotools and gcc; every Unix variant was a special snowflake with its own set of source-level issues, not-quite-compatible utilities, and system integration headaches galore.
Even the very-much-not-Unix BSDs don't give you that choice; indeed, they're actually quite monolithic, with their kernel, libc, and init stuff tightly coupled together in the same source tree. But since you have the source code you can always make it do something different.
Posted Dec 21, 2019 1:27 UTC (Sat)
by rahulsundaram (subscriber, #21946)
[Link]
systemd is a free software project under a GPL-family license from the Free Software Foundation, and every distribution has a choice about whether to use it or not; some do and some do not. Anyone can fork it and make modifications to it. Comparing it to a dictatorship is just silly.
Posted Dec 21, 2019 9:27 UTC (Sat)
by flussence (guest, #85566)
[Link]
Two for two on factually wrong statements, off to a good start.
> This is antithetical to both Unix specifically and freedom in general
A PID1 that requires Linux-specific interfaces is antithetical to Unix? You don't say!
Here's an idea: how about you go run an actual Unix (not Linux, which is not Unix) for a week, and tell us how much “freedom” we're missing out on? I recommend Mac OS X for someone of your technical level.
Systemd as tragedy
policy distribution process (or the policy activation on a client).
Writing a systemd-aware version of cf-serverd (which distributes policy and also is used to activate the installed policy) would require a fork of the software, i.e. cfengine, due to:
IP addresses will get an open socket at all; other hosts will not get anything, i.e. connects, at all.
Systemd as tragedy
For corner cases, it breaks your hack every few versions.
Systemd as tragedy
> Does this have any drawbacks? Yes, it does. Previously it was practically guaranteed that hosts equipped with a single ethernet card only had a single "eth0" interface. With this new scheme in place, an administrator now has to check first what the local interface name is before they can invoke commands on it where previously they had a good chance that "eth0" was the right name.
Systemd as tragedy
I have had that happen on my desktop with two Ethernet ports.
Systemd as tragedy
And with udev being subsumed into systemd, for no good reason, no one outside of the project had any real say in the matter.
Systemd/udev and complex simple single nic cases
From: https://lwn.net/Articles/529314/
> > Earlier this year, udev upstream was absorbed into systemd. udev often breaks compatibility with older systems by depending upon recent Linux kernel releases, even when such dependencies are avoidable. This became worse after udev became part of systemd, which has jeopardized our ability to support existing installations. The systemd developers are uninterested in providing full support in udev to systemd alternatives.
> > Starting with v197 systemd/udev will automatically assign predictable, stable network interface names for all local Ethernet, WLAN and WWAN interfaces. This is a departure from the traditional interface naming scheme ("eth0", "eth1", "wlan0", ...), but should fix real problems.
Systemd/udev and complex simple single nic cases
> Thanks, Lennart. *slow clap*
Systemd/udev and complex simple single nic cases
a) an interface name that is completely unpredictable in advance
b) having to add kernel flags at install/kickstart time and on your GRUB command line to undo this
Systemd/udev and complex simple single NIC cases
https://access.redhat.com/discussions/916973
https://access.redhat.com/discussions/2620221
https://bugzilla.redhat.com/show_bug.cgi?id=1392506
https://bugzilla.redhat.com/show_bug.cgi?id=1391944
https://github.com/systemd/systemd/commit/6c1e69f9456d022...
Systemd/udev and complex simple single NIC cases
> I'm pretty sure they can give you the moral support you need and even re-train you.
Systemd/udev and complex simple single NIC cases
(This problem is not specific to Linux; it is present on BSD and Solaris as well.)
[Match]
Name=enp1s0

[Network]
Domains=<your search domains>
DNS=<your dns>
DNS=<your dns>
Address=<your ipv4 address>
Address=<your ipv6 address>
Gateway=<your gateway>

or, for DHCP:

[Match]
Name=enp1*

[Network]
DHCP=yes

instead of an ifcfg script.
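A rough sketch of how such a .network file is used, assuming systemd-networkd is what manages the interface: drop it into /etc/systemd/network/ (the file name, e.g. 20-wired.network, is arbitrary as long as it ends in .network) and enable the service:

systemctl enable --now systemd-networkd
networkctl status enp1s0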
Systemd/udev and complex simple single NIC cases
The UEFI bricking vuln wasn't unique to systemd.
Systemd/udev and complex simple single NIC cases
Wol
Systemd/udev and complex simple single NIC cases
https://en.wikipedia.org/wiki/GUID_Partition_Table#Partit...
Systemd as tragedy
If they'd stuck to wlan0 and eth0 then I wouldn't have had a problem.
Systemd as tragedy
> That's great, and this no doubt solves a corner case for someone. But having worked at massive Dell shops I can say that even on 6-NIC servers I've *never* been hit by spontaneous random interface re-ordering on boot.
You obviously never had NICs with different drivers, where a kernel patch-release or new initrd resulted in the drivers being loaded in a different order and your network configuration being broken (e.g. IPs, VLANs, bonds/teams being on the wrong devices).
To be clear, on quad-GbE servers with dual-Intel and dual-Broadcom (e.g. Sun Galaxy range), it was quite common to see this. The workaround (before biosdevname) was to hard-code the MAC addresses in the interface configuration files, and then you would have to update them if ever the motherboard was replaced ...
> Ever. Furthermore, the simplest possible case, *especially* in a VM-focused environment, is undoubtedly a "single host with single NIC",
Speak for your own environments. In mine, Linux is the hypervisor, and you absolutely want biosdevname when you have two or more physical NICs with 1 physical function and 14 (SR-IOV) virtual functions each.
> and whatever benefits un-"predictable device naming" purports to provide is completely obliterated by no longer being able to assume eth0 has a meaning, when 99.5% of the time it previously worked as expected. (The only issue addressed here is when your MAC changes, but all major hypervisors have hooks in guest services to deal with that.)
VMware (in the vSphere 4.x days) training specifically called out Linux as being very difficult to manage changes to vNICs compared to Windows ... and yes, I have experienced it too. biosdevname worked much better on the VMware VMs than getting an unused eth0 and an unconfigured eth1 after cloning a template to a new VM.
> Requiring others to re-code, re-design, or insert hacks into software and configs to solve weird edge cases LP found on his laptop one day and then telling everyone they should be happy in the long run for this extra work epitomizes the systemd "cabal"'s approach to doing things.
As noted previously in the comments, biosdevname was not invented by LP. In fact, it was invented by Dell, pre-systemd, and was deployed in many distros before the systemd project was started (e.g. RHEL6 with upstart had it by default).
And if you don't want it, disable it (net.ifnames=0 at boot or one of the other methods).
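For reference, the commonly documented opt-outs look roughly like this (a sketch; paths and the exact .link file name may differ by distribution):

# On the kernel command line (e.g. appended to GRUB_CMDLINE_LINUX):
net.ifnames=0 biosdevname=0

# Or mask udev's default naming policy so the kernel-assigned names are kept:
ln -s /dev/null /etc/systemd/network/99-default.link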
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Here's the output from bloaty:
Not a "bloated, monolithic system"?
% ./bloaty -d segments =init
     VM SIZE                       FILE SIZE
 --------------                 --------------
  48.9%   724Ki  LOAD [RX]        724Ki  45.2%
  33.9%   502Ki  LOAD [R]         502Ki  31.4%
  17.1%   253Ki  LOAD [RW]        253Ki  15.8%
   0.0%       0  [Unmapped]       119Ki   7.5%
   0.0%       0  [ELF Headers]   1.94Ki   0.1%
 100.0%  1.45Mi  TOTAL           1.56Mi 100.0%
So roughly half of it is code, a third is static data, and another fifth is globals(?). By section:
     VM SIZE                                FILE SIZE
 --------------                          --------------
  46.7%   691Ki  .text                     691Ki  43.2%
  16.5%   243Ki  .data.rel.ro              243Ki  15.2%
  14.0%   206Ki  .rodata                   206Ki  12.9%
   0.0%       0  .gnu.build.attributes     110Ki   6.9%
   6.9%   102Ki  .rela.dyn                 102Ki   6.4%
   5.9%  87.9Ki  .eh_frame                87.9Ki   5.5%
   1.8%  26.1Ki  .dynsym                  26.1Ki   1.6%
   1.6%  24.3Ki  .rela.plt                24.3Ki   1.5%
   1.3%  19.6Ki  .dynstr                  19.6Ki   1.2%
   1.2%  17.6Ki  .gcc_except_table        17.6Ki   1.1%
   1.1%  16.2Ki  .plt                     16.2Ki   1.0%
   1.1%  16.2Ki  .plt.sec                 16.2Ki   1.0%
   0.9%  13.7Ki  .eh_frame_hdr            13.7Ki   0.9%
   0.0%       0  [Unmapped]               8.57Ki   0.5%
   0.6%  8.50Ki  .got                     8.50Ki   0.5%
   0.0%     736  [ELF Headers]            2.66Ki   0.2%
   0.1%  2.17Ki  .gnu.version             2.17Ki   0.1%
   0.0%     727  [15 Others]              1.05Ki   0.1%
   0.0%     672  .dynamic                    672   0.0%
   0.0%     496  .gnu.version_r              496   0.0%
   0.0%     416  .bss                          0   0.0%
 100.0%  1.45Mi  TOTAL                    1.56Mi 100.0%
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
It gets even worse once you look at typical "in-house" development. I have seen things called "init scripts", written by - otherwise capable - Java developers, that I wish I could unsee. Those creations have caused real damage.
Also, those scripts are rarely portable to other shells or non-GNU userspaces. In the worst case they won't even fail noisily, but simply stop working correctly in specific circumstances.
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
(/me runs a systemd-based distro on a first-gen RPi with 256 MiB.)
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
How do you know that all processes started by the task have finished?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Well, systemd is not set in stone. What needs to be changed to fix it?
The "pids" cgroup controller might be even better suited for this, but in practice systemd is much faster at killing processes than the kernel is at forking them.
This doesn't sound right. systemd will wait until the cgroup is empty, by which time all the resources should be freed.
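To make that concrete, here is a rough way to watch it from a shell, assuming a unified cgroup hierarchy and a hypothetical unit name foo.service:

# Show every process systemd is tracking for the unit
systemd-cgls /system.slice/foo.service

# Or read the cgroup directly; an empty file means there is nothing left to wait for
cat /sys/fs/cgroup/system.slice/foo.service/cgroup.procs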
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
[2] Likely to include grep, psutils, and util-linux as well [3]
[3] Don't forget libreadline, glibc, libstdc++, and everything else the shell and those utilities depend on!
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
I'm sorry. If your safety-critical systems have programs that can't be SIGKILL-ed cleanly and have several hundred tightly interconnected modules, then I want to run in the opposite direction from them.
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Nope. How do you track the daemon state? Are you sure PID files are correct? How do you kill the daemon in case it's stuck? What if you want to allow unprivileged users to terminate the service? Do you need a separate SUID binary? ...
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
OK. Can you show me an init script that is 25 lines long AND uses cgroups to guarantee process termination?
Not a "bloated, monolithic system"?
2) Edit the primary process's name and path
3) Add any additional custom logic your daemon needs
Not a "bloated, monolithic system"?
So now it's 25 lines for deps, then another 1k lines for cgroups manipulation in Bash.
Incorrect. This doesn't have PID file management, for starters.
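For comparison, a minimal sketch of the systemd-native equivalent, which needs no PID file at all; the unit and binary names here are hypothetical:

[Unit]
Description=Example daemon (hypothetical)

[Service]
# Run in the foreground; systemd tracks the process and everything it forks
# through the unit's cgroup, so there is no PID file to manage and nothing
# escapes "systemctl stop".
ExecStart=/usr/sbin/foo --no-daemonize
KillMode=control-group
Restart=on-failure

[Install]
WantedBy=multi-user.target

KillMode=control-group is the default; it is spelled out here only for emphasis.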
RH /etc/init.d/functions tooling
> So now it's 25 lines for deps, then another 1k lines for cgroups manipulation in Bash.
Noted.
CGROUP_DAEMON="cpu,memory:test1" (or whatever) in /etc/sysconfig/foo
I'm sorry that you don't appear to be using a RedHat system, which is clearly the better distribution.
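For illustration only, a sketch of what such an init script fragment might look like, assuming (as described above) that the daemon() helper from /etc/rc.d/init.d/functions honors a CGROUP_DAEMON setting; the service name "foo" and its paths are hypothetical:

#!/bin/sh
# /etc/rc.d/init.d/foo -- hypothetical RHEL6-style init script fragment
. /etc/rc.d/init.d/functions                       # provides daemon(), killproc()
[ -f /etc/sysconfig/foo ] && . /etc/sysconfig/foo  # may set CGROUP_DAEMON="cpu,memory:test1"

prog=foo
exec=/usr/sbin/foo
pidfile=/var/run/foo.pid

start() {
    echo -n "Starting $prog: "
    # Assumption from the comment above: daemon() starts $exec inside the
    # cgroup named by CGROUP_DAEMON.
    daemon --pidfile="$pidfile" "$exec"
    echo
}

stop() {
    echo -n "Stopping $prog: "
    killproc -p "$pidfile" "$prog"
    echo
}

case "$1" in
    start) start ;;
    stop)  stop ;;
    *)     echo "Usage: $0 {start|stop}"; exit 2 ;;
esac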
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
I tried to replicate it by creating an unresponsive NFS share.
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Uhm, why? A service that can't be SIGKILL-ed is clearly not safe to be restarted.
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
This was a closed source binary from a big vendor with the name starting with O and ending with "racle".
Except that a test could also die in the middle of its run. Sometimes from OOM.
The correct decision here is EXACTLY to create a generic solution that can be used to make sure that no bad code can cause damage.
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Not a "bloated, monolithic system"?
Please don’t hard-code assumptions
Please don’t hard-code assumptions
option('system-gid-max', type : 'integer', value : '-1',
option('dynamic-uid-min', type : 'integer', value : 0x0000EF00,
option('dynamic-uid-max', type : 'integer', value : 0x0000FFEF,
option('container-uid-base-min', type : 'integer', value : 0x00080000,
option('container-uid-base-max', type : 'integer', value : 0x6FFF0000,
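These are build-time options, so the ranges can be changed when compiling systemd rather than patched in the source; a minimal sketch (the values shown are just the decimal equivalents of the hex defaults above; substitute your own):

meson setup build -Ddynamic-uid-min=61184 -Ddynamic-uid-max=65519
ninja -C build

For an already-configured build directory, "meson configure build -D..." does the same.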
Please don’t hard-code assumptions
So I wondered if you weren't aware of those config switches which let you easily change those settings.
Please don’t hard-code assumptions
From an educated observation, why not take a look at the functionality of Stanford's central LDAP? As noted in one of the comments, the hard-coded UIDs are from the 90s era, and making them part of systemd's configuration may not be a good idea with modern technology.
Please don’t hard-code assumptions
> The biggest annoyance is that the UID range is hard-coded
SysV init is now ok?
One is a normal startup. The other is an emergency shell. And I can get the second one by putting "init=/bin/sh" in the kernel command line - I don't need any userspace support for that.
Systemd as tragedy
An earlier version of the talk, which I somewhat prefer; I guess "the focus has since shifted to emphasize storytelling less and pedagogy more" is roughly right. Some of the points are made more implicitly and humorously there: the new version puts up "contempt is not cool", while the older one, addressed to BSD developers, makes the point with an example and a "that's a bit rich" remark.
Systemd is missing a DX focus
What I think systemd is missing is a focus on the admin user experience. With old init scripts you could tab-complete your way through a few keypresses to restart a service. Now you have to type most of the command manually - and be sure that it works.
- name: Make shorter alternative name for the systemctl command
  file:
    src: "/usr/bin/systemctl"
    path: "/usr/local/bin/sc"
    state: link
should do the trick. YMMV.
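With that symlink in place, the shortened form works wherever the full command would, e.g. (foo.service being a placeholder unit name):

sc restart foo.service
sc status foo.service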
Systemd as tragedy
http://judecnelson.blogspot.com/2014/09/systemd-biggest-f...
Systemd as tragedy
Honestly, this sounds like so much nonsense.
Systemd as tragedy
* assorted network failures (firewall rules not established before devices up, devices up when they shouldn't be, devices not up when they should be, unwanted DHCP, unwanted autoconfigured IP addresses, firewall rules broken by network device name changes)
* changes in error handling behavior (mount failure stops boot dead if the device is not noauto in fstab)
* changes in power management behavior (responds to previously deconfigured inputs, suspend loops, triggers buggy firmware, ignores existing PM configuration)
* package installation failures ('apt-get install random-package' blows up dependency resolver, packages conflict at runtime, init crashes during the upgrade and panics the kernel, system left unbootable)
* missing dependency information is now an RC bug (systemd picks an execution order different from the order used for the previous decade)
* miscellaneous data loss (cgroup service kill behavior for ssh, wipes /tmp on boot, backups fail due to not enough mounts or fill up disk because too many)
* different cgroup controller organization (broke cgroup management code, caused a few host crashes and a lot of performance losses)
Systemd as tragedy
This has nothing to do with systemd. BTW, it can track network dependencies for mounting.
Nothing to do with systemd.
Don't use fstab.
Not sure about this.
Duh. Nothing to do with systemd.
Debian's insserv from the last decade has the same issue.
All happened with regular SysV init.
Not sure about that one.
Systemd as tragedy
> Nothing to do with systemd.
> Don't use fstab.
> Duh. Nothing to do with systemd.
> Debian's insserv from the last decade has the same issue.
> All happened with regular SysV init.
> Not sure about that one.
Systemd as tragedy
> As opposed to never noticing or getting ignored.
Systemd as tragedy
- Has the situation with regard to how to design systemd dependencies become
any clearer? One of the troubles with systemd is the lack of a clear and concise explanation of
how to properly design dependencies, especially so that they are resilient.
I personally have trouble understanding that little gem (and I do have
exposure to more, and more varied, systems than I expect most people in this group do).
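As a rough illustration of the distinction that usually matters for resilience (Wants= versus Requires=, combined with After= for ordering), here is a minimal sketch of a unit; foo.service and bar.service are hypothetical names:

[Unit]
Description=Example service (hypothetical)
# After= only orders startup; it does not pull anything in by itself.
After=network-online.target bar.service
# Wants= is a soft dependency: pull these in, but keep running if they fail.
Wants=network-online.target bar.service
# Use Requires= instead of Wants= only when this unit is useless without bar;
# then a failing or explicitly stopped bar.service takes this unit down with it.
#Requires=bar.service

[Service]
ExecStart=/usr/bin/foo --serve
Restart=on-failure

[Install]
WantedBy=multi-user.target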
Systemd as tragedy
If that were the case, OpenRC would be widely adopted in the majority of distributions. In reality, as pointed out, its usage is mostly limited to Gentoo-based distributions.
Shell script used both for init and, by extension, for system management does not cut it in modern environments, for a reason.
Systemd as tragedy
Comparing legacy shell script based init systems with systemd is comparing apples to oranges...
BSDCan 2018 talk
https://plus.google.com/+LennartPoetteringTheOneAndOnly/p...
Systemd as hope
Systemd as tragedy
something which is a software PRODUCT, not something which will bite
the customer (and your own customer engineer) hard. Which means that the
guy on your side who sold the system will get a very angry phone call (I
say so because IBM was never afraid to send a stiff bill, but these days the IBM representative
would be expected by the boss to leave a note with her mobile phone number, saying: any trouble, call, anytime).