
The imminent stable-version apocalypse

By Jonathan Corbet
February 5, 2021
As has often been pointed out, the stable-kernel releases are meant to be stable; that means they should be even more averse to ABI breaks than mainline releases, if that is possible. This may be a hard promise to keep for the next set of stable kernels, though, for the most mundane of reasons: nobody thought that there would be more than 255 minor updates to any given kernel release.

For most of the existence of the kernel project, few developers within the project itself have maintained any given kernel release for more than a couple of years or so, and maintenance releases were relatively rare. There were some exceptions; the 2.4 release happened at the beginning of 2001, and Willy Tarreau finally stopped maintaining it more than eleven years later. Even then, the final version was 2.4.37, though one could perhaps call it 2.4.48 after the final set of eleven small "fixup" releases. Releases for kernels maintained for the long term were relatively few and far apart.

In recent years, though, that situation has changed, with some older kernels receiving much more long-term-maintenance attention. Thus, February 3 saw the release of the 4.9.255 and 4.4.255 updates. Those kernels have received 18,765 and 16,986 patches, respectively, and there is no sign of things slowing down. The current posted plan is to maintain 4.9 through January 2023 and 4.4 through February 2022.

These kernel-release numbers are now a problem, as was pointed out by Jari Ruusu. There are a couple of macros defined within the kernel relating to version codes; these can be found in include/generated/uapi/linux/version.h in a built kernel:

    #define LINUX_VERSION_CODE 330496
    #define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))

The first macro, LINUX_VERSION_CODE, is calculated in the top-level makefile; it is the result of:

    (5 << 16) + (11 << 8) + 0

That number (which is 0x50b00) identifies this as a 5.11-rc kernel; it is the same result one gets from KERNEL_VERSION(5,11,0).

One does not have to look long to see that neither of these macros is going to generate the expected result once the minor version ("c" in the KERNEL_VERSION() macro) exceeds 255. Running that macro on a 4.9.255 kernel yields 0x409ff, but on 4.9.256 it will instead return 0x40a00 — which looks like 4.10.0. That might just cause some confusion in the user community.
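The carry into the next field can be checked at compile time. The sketch below copies KERNEL_VERSION() as quoted above; the KV_MAJOR(), KV_PATCHLEVEL(), and KV_SUBLEVEL() helpers are hypothetical names introduced here (they are not from any kernel header) to unpack a code the way a consumer assuming eight-bit fields would.

```c
/* Same definition as in the generated version.h quoted above. */
#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical decoding helpers, for illustration only: they unpack a
 * version code under the assumption of eight bits per field.
 */
#define KV_MAJOR(code)      ((code) >> 16)
#define KV_PATCHLEVEL(code) (((code) >> 8) & 0xff)
#define KV_SUBLEVEL(code)   ((code) & 0xff)

/* 4.9.255 still fits in the low byte... */
_Static_assert(KERNEL_VERSION(4, 9, 255) == 0x409ff,
               "4.9.255 encodes cleanly");

/* ...but 256 carries into the patchlevel byte. */
_Static_assert(KERNEL_VERSION(4, 9, 256) == KERNEL_VERSION(4, 10, 0),
               "4.9.256 is indistinguishable from 4.10.0");
_Static_assert(KV_MAJOR(KERNEL_VERSION(4, 9, 256)) == 4,
               "major version survives");
_Static_assert(KV_PATCHLEVEL(KERNEL_VERSION(4, 9, 256)) == 10,
               "decoded patchlevel is 10, not 9");
_Static_assert(KV_SUBLEVEL(KERNEL_VERSION(4, 9, 256)) == 0,
               "decoded sublevel is 0, not 256");
```

A C11 compiler accepts this file only if every assertion holds, so a successful `cc -std=c11 -c` run confirms the numbers given in the text.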

This problem does not come as a complete surprise to the stable-kernel maintainers; Sasha Levin posted this patch in mid-January in an attempt to fix it. It changes both LINUX_VERSION_CODE and KERNEL_VERSION() to use 16 bits for the minor version, thus eliminating the overflow. This patch got into linux-next, but seems unlikely to stay there; as Jiri Slaby noted, these macros are used by user space and constitute a part of the kernel's ABI. He added that both the GNU C Library and the GCC compiler (the BPF code in particular) use the kernel version code in its current form and would not handle a change well. There are also many other places in the kernel that exchange these version codes with user space; see this media ioctl() command, for example. Changing the kernel's idea of how KERNEL_VERSION() works will break programs compiled with the older macro, which is not something that is allowed.
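To see why widening the field is an ABI change, consider a comparison that was compiled into a program with the existing macro. The sketch below assumes, purely for illustration, a widened layout with eight bits for the major version, eight for the patchlevel, and sixteen for the minor version; the actual patch may have arranged the bits differently, and both macro names here are hypothetical.

```c
/* The existing, user-space-visible layout: eight bits each for b and c. */
#define KERNEL_VERSION_OLD(a,b,c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical widened layout with 16 bits for the minor version.
 * This is an assumption made for the example, not the posted patch.
 */
#define KERNEL_VERSION_WIDE(a,b,c) (((a) << 24) + ((b) << 16) + (c))

/*
 * An already-built program asking "is this kernel at least 5.0?" has the
 * old constant baked in. Handed a 4.9.256 code in the wide layout, the
 * comparison comes out true.
 */
_Static_assert(KERNEL_VERSION_WIDE(4, 9, 256) >= KERNEL_VERSION_OLD(5, 0, 0),
               "old comparison misreads the wide encoding as >= 5.0");

/* Decoding the wide code with the old shift yields a nonsensical major. */
_Static_assert((KERNEL_VERSION_WIDE(4, 9, 256) >> 16) == 0x409,
               "old decoder sees major version 1033");

/* With both sides using the old layout, the answer is the expected one. */
_Static_assert(KERNEL_VERSION_OLD(4, 9, 255) < KERNEL_VERSION_OLD(5, 0, 0),
               "4.9.255 is older than 5.0 under the old layout");
```

Whatever the exact bit assignment, any layout that moves the major and patchlevel fields will disagree with constants that existing binaries were built with.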

So what is to be done? As of this writing that has not yet been worked out, but there are a couple of options on the table:

  • Ruusu's note pointing out the problem suggested that stable releases could start incrementing the EXTRAVERSION field instead; this is the field that normally contains strings like -rc7 (for mainline test releases), or a Git commit ID. The minor version would presumably remain at 255. This would avoid breaking ABI, but would also make it harder for user-space code to distinguish between stable releases after 255. It might also create minor trouble for distributors who are using that field to identify their own builds.
  • Stable maintainer Greg Kroah-Hartman suggested that he could "leave it alone and just see what happens". But, as Slaby pointed out, that will create the wrapping problem described above, which could confuse user space. If this is done, he said, it would be necessary to mask the minor version to eight bits, causing it to wrap back around to zero; whether that would cause confusion is another question. Version numbers are normally expected to increase monotonically.
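The masking variant mentioned in the second option can be sketched as follows. KERNEL_VERSION_MASKED() is a hypothetical name used only for this example; it keeps the low eight bits of the minor version so that nothing carries into the patchlevel field.

```c
/* Unmodified macro, as quoted earlier in the article. */
#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical masked variant: only the low eight bits of c are kept. */
#define KERNEL_VERSION_MASKED(a,b,c) \
        (((a) << 16) + ((b) << 8) + ((c) & 0xff))

/* The collision with 4.10.0 is gone... */
_Static_assert(KERNEL_VERSION_MASKED(4, 9, 256) < KERNEL_VERSION(4, 10, 0),
               "masked 4.9.256 stays below 4.10.0");

/* ...but 4.9.256 now encodes exactly like 4.9.0... */
_Static_assert(KERNEL_VERSION_MASKED(4, 9, 256) == KERNEL_VERSION(4, 9, 0),
               "masked 4.9.256 wraps around to 4.9.0");

/* ...so the version code goes backward from one release to the next. */
_Static_assert(KERNEL_VERSION_MASKED(4, 9, 256) <
               KERNEL_VERSION_MASKED(4, 9, 255),
               "masked codes are not monotonic");
```

The last assertion is the monotonicity concern in concrete form: a program checking for "at least 4.9.100" would pass on 4.9.255 and fail on 4.9.256.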

The most likely outcome can be seen in the kernel's history, though. Once upon a time, mainline kernel releases had three significant numbers rather than two — 2.6.30, for example. In those days, the minor version field wasn't available for stable updates, so the EXTRAVERSION field was used instead. Looking at the 2.6.30.3 makefile, one sees:

    VERSION = 2
    PATCHLEVEL = 6
    SUBLEVEL = 30
    EXTRAVERSION = .3
    NAME = Man-Eating Seals of Antiquity

That solution worked for years, so there should be no real reason why it wouldn't work now as well. Most likely SUBLEVEL would remain stuck at 255, with EXTRAVERSION indicating the real release number.

It is evidently Leon Trotsky who once said that "old age is the most unexpected of all things that can happen to a man". Perhaps similar forces are at play here; running out of bits is the most unexpected of things that can happen to a kernel developer. This version-number overflow could have been foreseen some time ago, and the date of its occurrence forecast with reasonable certainty. But now some sort of solution has to be found before the next stable-kernel release can be made. Happily, the problem should be easier to resolve than that of old age.

Update: Kroah-Hartman appears to have chosen the "do nothing" option with the release of 4.9.256 and 4.4.256, both of which increment the version number but make no other change. "I'll try to hold off on doing a 'real' 4.9.y release for a week to give everyone a chance to test this out and get back to me. The pending patches in the 4.9.y queue are pretty serious, so I am loath to wait longer than that, consider yourself warned..."

Update 2: In the end, it appears that the clamping solution will be taken, with the minor number fixed at 255 going forward.
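One way clamping could be expressed is shown below. This is a minimal sketch rather than the change that was actually merged; the macro name KERNEL_VERSION_CLAMPED() is an assumption made for the example.

```c
/*
 * Hypothetical clamped variant: minor versions above 255 all encode as
 * 255, so the code never carries into the patchlevel field and never
 * decreases.
 */
#define KERNEL_VERSION_CLAMPED(a,b,c) \
        (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/* Ordinary versions are unaffected. */
_Static_assert(KERNEL_VERSION_CLAMPED(4, 9, 17) == 0x40911,
               "small minor versions encode as before");

/* Everything from 255 upward shares one code. */
_Static_assert(KERNEL_VERSION_CLAMPED(4, 9, 256) == 0x409ff,
               "4.9.256 saturates at 0x409ff");
_Static_assert(KERNEL_VERSION_CLAMPED(4, 9, 300) ==
               KERNEL_VERSION_CLAMPED(4, 9, 255),
               "releases past 255 are indistinguishable by code");

/* Ordering against the next patchlevel is preserved. */
_Static_assert(KERNEL_VERSION_CLAMPED(4, 9, 300) <
               KERNEL_VERSION_CLAMPED(4, 10, 0),
               "clamped 4.9.y stays below 4.10.0");
```

The cost is visible in the third assertion: code that needs to tell 4.9.256 from 4.9.300 has to obtain the minor version from somewhere other than the version code.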

Index entries for this article
Kernel: Development model/User-space ABI
Kernel: Releases/Stable updates



The imminent stable-version apocalypse

Posted Feb 5, 2021 15:29 UTC (Fri) by leromarinvit (subscriber, #56850) [Link] (2 responses)

A Trotsky quote in an article about kernel version numbers - that must be the second most unexpected of all things that can happen to a man. Well done, sir, well done!

As always, thanks for the excellent reporting, Jon!

The imminent stable-version apocalypse

Posted Feb 9, 2021 20:35 UTC (Tue) by MarkFrankSharefkin (guest, #135746) [Link] (1 responses)

Red diaper baby here: not Trotsky, either Chernyshevsky or Lenin, IIRC (I'm trying to forget)...

The imminent stable-version apocalypse

Posted Feb 10, 2021 10:29 UTC (Wed) by mbg (subscriber, #4940) [Link]

Not to mention the subtle interposition of a Lenin quote earlier in the article...

The imminent stable-version apocalypse

Posted Feb 5, 2021 16:14 UTC (Fri) by edeloget (subscriber, #88392) [Link]

The use of EXTRAVERSION might be a possibility, but I'm not sure it's not already used for other reasons (such as the dashed version suffixes of various distributions, although I haven't checked this).

The main problem will be users of the KERNEL_VERSION() macro; tests such as

    #if (LINUX_VERSION_CODE >= KERNEL_VERSION(4,5,0))
    ...
    #endif

will break on 4.4.256, but such a break will likely be noticed (because this kind of code is used to test whether a feature from kernel 4.5 can be used).

The other version of the test

    #if (LINUX_VERSION_CODE < KERNEL_VERSION(4,5,0))
    ...
    #endif

will be silently compiled out even though it should not be, and it's very likely that something will break.
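Both outcomes can be reproduced with the preprocessor alone. In the sketch below, LINUX_VERSION_CODE is a stand-in for what an unmodified 4.4.256 build would generate, and the two result macros (USES_4_5_FEATURES and KEEPS_PRE_4_5_WORKAROUND) are hypothetical names that record which branch was taken.

```c
/* Unmodified macro, as quoted in the article. */
#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))

/* Stand-in for the generated value on an unmodified 4.4.256 kernel. */
#define LINUX_VERSION_CODE KERNEL_VERSION(4, 4, 256)

/* First form: feature test for 4.5 and later. */
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(4,5,0))
#define USES_4_5_FEATURES 1         /* wrongly taken on 4.4.256 */
#else
#define USES_4_5_FEATURES 0
#endif

/* Second form: compatibility code for kernels older than 4.5. */
#if (LINUX_VERSION_CODE < KERNEL_VERSION(4,5,0))
#define KEEPS_PRE_4_5_WORKAROUND 1
#else
#define KEEPS_PRE_4_5_WORKAROUND 0  /* workaround silently dropped */
#endif

_Static_assert(USES_4_5_FEATURES == 1,
               "4.4.256 is mistaken for a kernel with 4.5 features");
_Static_assert(KEEPS_PRE_4_5_WORKAROUND == 0,
               "pre-4.5 compatibility code is compiled out on 4.4.256");
```

In real code the first case tends to fail loudly at build time, because the 4.5-only interfaces are missing, while the second compiles cleanly and fails later.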

The imminent stable-version apocalypse

Posted Feb 5, 2021 16:44 UTC (Fri) by willy (subscriber, #9762) [Link] (3 responses)

I suggest saturating KERNEL_VERSION at xxxxFF.

Code which needs to care about versions after 255 can check LINUX_VERSION_SUBLEVEL directly.

The imminent stable-version apocalypse

Posted Feb 5, 2021 17:52 UTC (Fri) by mchehab (subscriber, #41156) [Link] (1 responses)

> I suggest saturating KERNEL_VERSION at xxxxFF.

That seems to be the best thing to be done.

Without that, media applications will break, as several of them rely on the kernel version in order to enable some features:

drivers/media/cec/core/cec-api.c: caps.version = LINUX_VERSION_CODE;
drivers/media/mc/mc-device.c: info->media_version = LINUX_VERSION_CODE;
drivers/media/v4l2-core/v4l2-ioctl.c: cap->version = LINUX_VERSION_CODE;
drivers/media/v4l2-core/v4l2-subdev.c: cap->version = LINUX_VERSION_CODE;

The imminent stable-version apocalypse

Posted Feb 5, 2021 20:24 UTC (Fri) by nix (subscriber, #2304) [Link]

So does glibc, but thankfully it never compares the LINUX_VERSION_CODE against 4.10.x or 4.5.x. Phew, dodged a bullet.

The imminent stable-version apocalypse

Posted Feb 5, 2021 19:43 UTC (Fri) by tglx (subscriber, #31301) [Link]

> I suggest saturating KERNEL_VERSION at xxxxFF.

Yes. And the commit message of x.x.x.255 wants to be:

This kernel has been finally stapled to death. Move on.

The imminent stable-version apocalypse

Posted Feb 5, 2021 18:05 UTC (Fri) by jemarch (subscriber, #116773) [Link]

Just wanted to point out that GCC is not actually impacted by this.

The BPF backend in GCC 10 accepts a -mkernel option where the user can specify a string identifying a kernel version, from 4.0 to 4.20 and from 5.0 to 5.2. This information is then used internally.

However, after getting feedback from the kernel community we decided that -mkernel wasn't useful in practice, and consequently GCC 11 ignores it.

An eventual change in the encoding of LINUX_VERSION_CODE wouldn't impact GCC at all.

Salud!

The imminent stable-version apocalypse

Posted Feb 5, 2021 18:47 UTC (Fri) by flussence (guest, #85566) [Link] (3 responses)

It's been 20 years and we're still haunted by the vengeful ghost of web browser UA sniffing.

I think userspace code might have a few valid reasons to check patchlevel, but anything that cares about specific sublevels in what's supposed to be a stable kernel series is probably doing something horribly wrong - and breaking the former for the latter is the wrong tradeoff. Any program that needs to know about numbers above 255 is getting frequent enough updates that it can be taught to read them from something else.

The imminent stable-version apocalypse

Posted Feb 5, 2021 22:05 UTC (Fri) by Nahor (subscriber, #51583) [Link] (2 responses)

> [...] anything that cares about specific sublevels in what's supposed to be a stable kernel series is probably doing something horribly wrong

If there is a new kernel, it means something changed. Why would userspace not be interested? That change might mean that a workaround for some issue is no longer necessary, or it might mean an updated driver that doesn't need to be pulled from out-of-tree anymore, ...

The imminent stable-version apocalypse

Posted Feb 6, 2021 13:25 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

What driver is going to go from out-of-tree into a stable tree? Even with 6 years of support, that seems unlikely to me to be suitable for a backport…

The imminent stable-version apocalypse

Posted Feb 12, 2021 19:01 UTC (Fri) by flussence (guest, #85566) [Link]

If a userspace program cares about the distinction between 4.9.255 and 4.9.>255, then either it's still being worked on in 1Q2021 and can adapt to reading the extra version in an appropriate way, or it made speculation based on a future it couldn't possibly predict accurately and is asking for trouble.

On the other hand, it's entirely reasonable for code that hasn't been touched in *years* to have different paths for kernel 4.9.x and 4.10.0, and that's the sort of program that'll break when the number wraps around and may be hard or impossible to fix when it does.

The imminent stable-version apocalypse

Posted Feb 5, 2021 19:30 UTC (Fri) by amarao (subscriber, #87073) [Link] (8 responses)

Integers are the most precious thing in the world. Just look at the current price of a meager /8 IPv4 network to appreciate that. The same goes for short ICQ numbers (if someone remembers _that_ antiquity), same with phone numbers, etc. Humanity is always in severe deficiency of integers.

p.s. how much would you pay to have 4 billion VLANs instead of 4k?

The imminent stable-version apocalypse

Posted Feb 5, 2021 19:43 UTC (Fri) by edeloget (subscriber, #88392) [Link] (1 responses)

> p.s. how much would you pay to have 4 billion VLANs instead of 4k?

Not much :) as I can use S-VLAN and C-VLAN, and that would open a fantastic world of "add as many VLANs as you want" at the price of adding yet another 4-byte L2 header. S-VLAN headers can be chained, so you can have

DSTMAC[6] SRCMAC[6] SVLAN[4] SVLAN[4] CVLAN[4] ETHTYPE[2] ...

Which gives you 68 billion VLANs (and you can add even more SVLAN levels).

Granted, you may have to use recent network appliances, but Linux at least has supported 802.1ad for years.

And yet...

> Humanity is always in severe deficiency of integers.

I can't agree more :)

The imminent stable-version apocalypse

Posted Feb 6, 2021 18:32 UTC (Sat) by champtar (subscriber, #128673) [Link]

Not sure whether you will still get all hardware offloads when using stacked VLANs. If you don't mind the overhead (internal network with jumbo frames), better try VXLAN ;)

The imminent stable-version apocalypse

Posted Feb 5, 2021 22:27 UTC (Fri) by Sesse (subscriber, #53779) [Link]

I'll be happy if I don't have to run 4 billion separate spanning-tree instances, at least!

The imminent stable-version apocalypse

Posted Feb 5, 2021 23:33 UTC (Fri) by jengelh (subscriber, #33263) [Link]

> The same goes for short ICQ numbers (if someone remembers _that_ antiquity), same with phone numbers

ICQ account numbers were assigned in monotonically increasing fashion, I believe.
However, the phone numbers seem to be allocated as a radix tree here, so they are prone to be a lot more unbalanced than ICQ#, or kernel versions for that matter.

(The OSM database, which I have taken care of for a handful of years, yields these statistics about phone numbers for my area code: /^[0-9]/: 4807 POIs, /^9/: 450, /^99/: 242, /^999/: 133.)

The imminent stable-version apocalypse

Posted Feb 7, 2021 1:29 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (2 responses)

One supposes we might move to IPv6, which fixes this problem once and for all. But IPv6 was standardized (in RFC 1883) a mere 25 years ago, so *obviously* it must not be mature enough.

/s

The imminent stable-version apocalypse

Posted Feb 8, 2021 10:23 UTC (Mon) by khim (subscriber, #9252) [Link] (1 responses)

It doesn't matter when IPv6 was developed. It only matters when the IPv4 pool was exhausted. That happened about one year ago.

The fact that anyone started pushing IPv6 before that moment is a miracle in itself.

This 15-year-old article explains that phenomenon well… and it looks as if the Linux kernel follows the same trajectory: people only fix things when they break. Not before.

The imminent stable-version apocalypse

Posted Feb 8, 2021 12:11 UTC (Mon) by tzafrir (subscriber, #11501) [Link]

This certainly wasn't the first time we were about to run out of IPv4 addresses (it also mentions that time in 2012). But previously, in the 1990s, IPv4 address blocks were in short supply. So IPv6 was invented. But so was subnetting.

The imminent stable-version apocalypse

Posted Feb 7, 2021 6:29 UTC (Sun) by pr1268 (subscriber, #24648) [Link]

> The same goes for short ICQ numbers

This reminds me of how high the demand is for low-numbered "classic" Delaware license plates — the lower the number, the higher the auction price.

The imminent stable-version apocalypse

Posted Feb 5, 2021 19:30 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

So... This is finally a stable update that is not one of the you-must-upgrade-or-the-world's-gonna-blow-up kind.

The imminent stable-version apocalypse

Posted Feb 5, 2021 23:10 UTC (Fri) by Sesse (subscriber, #53779) [Link] (1 responses)

“A number of important fixes, and everybody should upgrade”

*eyes some fixes for ATM drivers and m68k*

The imminent stable-version apocalypse

Posted Feb 6, 2021 1:01 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

The imminent stable-version apocalypse

Posted Feb 5, 2021 20:45 UTC (Fri) by zyga (subscriber, #81533) [Link]

The imminent stable-version apocalypse

Posted Feb 6, 2021 13:39 UTC (Sat) by mss (subscriber, #138799) [Link]

I would simply define KERNEL_VERSION_CAPPED at (4, 9, 255) and add 16-bit KERNEL_VERSION_16 at (4, 9, 256).
For userspace-visible headers there would be #define KERNEL_VERSION KERNEL_VERSION_CAPPED, but the kernel internally would use KERNEL_VERSION_16.

This split version personality would of course need to be introduced upstream, too, since things like VIDIOC_QUERYCAP need to be changed to use KERNEL_VERSION_CAPPED.

The imminent stable-version apocalypse

Posted Feb 7, 2021 11:12 UTC (Sun) by meerdan (subscriber, #119439) [Link] (1 responses)

I hope they find a way to fix this in mainline.

To do nothing might work out for the moment, but it will come back and bite Greg in the future. Especially if he keeps doing this for new stable branches.

The imminent stable-version apocalypse

Posted Feb 7, 2021 13:17 UTC (Sun) by gregkh (subscriber, #8) [Link]

Patches to fix this have already been submitted for review, this will happen again...

The imminent stable-version apocalypse

Posted Feb 18, 2021 1:06 UTC (Thu) by opalmirror (subscriber, #23465) [Link] (2 responses)

  • No one will EVER need more than 640KB of RAM!
  • No one will EVER need more than 3GB of user space!
  • No one will EVER need more than 4GB of RAM!

Whatever has happened before, will happen again. So say we all.

The imminent stable-version apocalypse

Posted Feb 18, 2021 6:36 UTC (Thu) by jem (subscriber, #24231) [Link] (1 responses)

Citation needed. Who said no one will ever need more than 4 GB of RAM?

Somebody may have said "More than 4 GB will not be needed for a long time, and it is not economical to support it now" when 32-bit PCs emerged in the 1980s.

The imminent stable-version apocalypse

Posted Feb 18, 2021 22:32 UTC (Thu) by opalmirror (subscriber, #23465) [Link]

Fair comment. I guess nobody really said that no one would ever need to use more than 4GB; it was genuinely just a technology goal point for the entire industry.


Copyright © 2021, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds