System call conversion for year 2038
Current Linux system calls use a number of different data types to represent times, from the simple time_t value through the timeval and timespec structures and others. Each, though, has one thing in common: an integer value counting the number of seconds since the beginning of 1970 (or from the current time in places where a relative time value is needed). On 32-bit systems, that count is a signed 32-bit value; it clearly needs to gain more bits to function in a world where post-2038 dates need to be represented.
Time representations
One possibility is to simply create 64-bit versions of these time-related structures and use them. But if an incompatible change is to be made, it might be worthwhile thinking a bit more broadly; to that end, Thomas Gleixner recently suggested the creation of a new set of (Linux-specific) system calls that would use a signed, 64-bit nanosecond counter instead. This counter would mirror the ktime_t type (defined in <include/linux/ktime.h>) used to represent times within the kernel:
    union ktime {
        s64 tv64;
    };

    typedef union ktime ktime_t;    /* Kill this */
(Incidentally, the "kill this" comment was added by Andrew Morton in 2007; nobody has killed it yet.)
Having user space work with values that mirror those used within the kernel has a certain appeal; a lot of time-conversion operations could be eliminated. But Arnd Bergmann pointed out a number of difficulties with this approach, including the fact that it makes a complicated changeover even more complicated. The fatal flaw, though, turns up in this survey of time-related system calls posted by Arnd shortly thereafter: system calls that deal with file timestamps need to be able to represent times prior to 1970. They also need to be able to express a wider range of times than is possible with a 64-bit ktime_t. So some variant of time_t must be used with them, at least. (The need to represent times before 1970 also precludes the use of an unsigned value to extend the forward range of a 32-bit time_t value.)
So universal use of signed nanosecond time values does not appear to be in the cards, at least not as part of the year-2038 disaster-prevention effort. Still, there is room for some simplification. The current plan is to use the 64-bit version of struct timespec (called, appropriately, struct timespec64 in the kernel, though user space will still see it as simply struct timespec) for almost all time values passed into or out of the kernel. The various system calls that use the other (older) time formats can generally be emulated in user space. So, for example, a call to gettimeofday() (which uses struct timeval) will be turned into a call to clock_gettime() before entry into the kernel. That reduces the number of system calls for which compatibility must be handled in kernel space.
Thus, in the future, a 32-bit system that is prepared to survive 2038 will use struct timespec64 for all time values exchanged with the kernel. That just leaves the minor problem of how to get there with a minimal amount of application breakage. The current plan can be seen in Arnd's patch set, which includes a number of steps to move the kernel closer to a year-2038-safe mode of operation.
Getting to a year-2038-safe system
The first of those steps is to prepare to support 32-bit applications while moving the kernel's internal time-handling code to 64-bit times in all situations. The internal kernel work has been underway for a while, but the user-space interfaces still need work, starting with the implementation of a set of routines that will convert between 32-bit and 64-bit values at the system-call boundary. The good news is that these routines already exist in the form of the "compatibility" system calls used by 32-bit applications running on a 64-bit kernel. In the future, all kernels will be 64-bit when it comes to time handling, so the compatibility functions are just what is needed (modulo a few spots where other data types must be converted differently). So the patch set causes the compatibility system calls to be built into 32-bit kernels as well as 64-bit kernels. These compatibility functions are ready for use, but will not be wired up until the end of the patch series.
The next step is the conversion of the kernel's native time-handling system calls to use 64-bit values exclusively. This process is done in two broad sub-steps, the first of which is to define a new set of types describing the format of native time values in user space. For example, system calls that currently accept struct timespec as a parameter will be changed to take struct __kernel_timespec instead. By default, the two structures are (nearly) the same, so the change has no effect on the built kernel. If the new CONFIG_COMPAT_TIME configuration symbol is set, though, struct __kernel_timespec will look like struct timespec64 instead.
The various __kernel_ types are used at the system-call boundary, but not much beyond that point. Instead, they are immediately converted to 64-bit types on all machines; on 64-bit machines, obviously, there is little conversion to do. Once each of the time-related system calls is converted in this manner, it will use 64-bit time values internally, even if user space is still dealing in 32-bit time values. Any time values returned to user space are converted back to the __kernel_ form before the system call returns. There is still no change visible to user space, though.
The final step is to enable the use of 64-bit time values on 32-bit systems without breaking existing 32-bit binaries. There are three things that must all be done together to make that happen:
- The CONFIG_COMPAT_TIME symbol is set, causing all of the __kernel_ data structures to switch to their 64-bit versions.
- All of the existing time-related system calls are replaced with the 32-bit compatibility versions. So, for example, on the ARM architecture, clock_gettime() is system call number 263. After this change, applications invoking system call 263 will get compat_sys_clock_gettime() instead. If the compatibility functions have been done correctly, binary applications will not notice the change.
- The native 64-bit versions of the system calls are given new system call numbers; clock_gettime() becomes system call 388, for example. Thus, only newly compiled code that is prepared to deal with 64-bit time values will see the 64-bit versions of these calls.
And that is about as far as the kernel can take things. Existing 32-bit binaries will call the compatibility versions of the time-related system calls and will continue to work, at least until 2038 comes around.
That leaves a fair amount of work to be done in user space. In a simplified view of the situation, the C libraries can be changed to use the 64-bit data structures and invoke the new versions of the relevant system calls. Applications can then be recompiled against the new library, perhaps with some user-space fixes required as well; after that, they will no longer participate in the year 2038 debacle. In practice, all of the libraries in a system and all applications may need to be rebuilt together to ensure that they have a coherent idea of how times are represented. The GNU C library uses symbol versioning, so it can be made to work with both time formats simultaneously, but many other libraries lack that flexibility. So converting a full distribution is likely to be an interesting challenge even once the work on the kernel side is complete.
Finishing the job
Even on the kernel side, though, there are a few pieces of the puzzle that have not yet been addressed. One significant problem is ioctl() calls; of the thousands of them supported by the kernel, a few deal in time_t values. They will have to be located and fixed one-by-one, a process that could take some time.
Filesystem timestamps are another problem. The ext4 filesystem stores timestamps as 32-bit time_t values, though some variants of the on-disk format extend those fields to 34 bits. Ext3 does not support 34-bit timestamps, though, so the solution there is likely to be to drop it entirely in favor of ext4. NFSv3 has a similar problem, and may meet a similar fate; XFS also has some challenges to deal with. The filesystem issues, notably, affect 64-bit systems as well. There are, undoubtedly, many other surprises like this lurking in both the kernel and user space, so the task of making a system ready for 2038 goes well beyond migrating to 64-bit time values in system calls. Still, fixing the system calls is a start.
Once the remaining problems have been addressed, there is a final patch that can be applied. It makes CONFIG_COMPAT_TIME optional, but in a way that leaves the 64-bit paths in place while removing the 32-bit compatibility system calls. If this option is turned off, any binary using the older system calls will fail to run. This is thus a useful setting for testing year-2038 conversions or deploying long-lived systems that must survive past that date. As Arnd put it:
Presumably somebody will be paying attention and will remember to carry out this removal twenty years from now (if they are feeling truly inspired, they might just kill ktime_t while they are at it). At that point, they will likely be grateful to the developers who put their time into dealing with this problem before it became an outright emergency. The rest of us, instead, will just have to find some other way to fund our retirement.
(Thanks to Arnd Bergmann for his helpful comments and suggestions on an earlier draft of this article.)
Posted May 7, 2015 3:01 UTC (Thu)
by nevets (subscriber, #11875)
I think I peed my pants!
Posted May 7, 2015 5:16 UTC (Thu)
by eru (subscriber, #2753)
Posted May 7, 2015 12:05 UTC (Thu)
by meuh (guest, #22042)
Posted May 7, 2015 13:14 UTC (Thu)
by arnd (subscriber, #8866)
Posted May 7, 2015 15:26 UTC (Thu)
by jnareb (subscriber, #46500)
Posted May 10, 2015 12:45 UTC (Sun)
by Karellen (subscriber, #67644)
If we doubled the size just one more time, with 64-bit seconds (±10^12 years) and 64-bit sub-second precision (10^-18s), we could handle almost any time anyone would wish to theoretically manipulate.
Transistor density, and therefore memory sizes, have increased by a factor on the order of a million since the early '70s when time_t was first invented. Surely we can afford to do more than just double the size of our timestamps?
Posted May 12, 2015 9:14 UTC (Tue)
by arnd (subscriber, #8866)
For interfaces that deal with timeouts (e.g. clock_nanosleep()), or current time (e.g. clock_gettime()), using unsigned 64-bit nanoseconds gives us until 2554, over 500 years before we have to come up with something else. If we can gain a noticeable performance improvement using the nanosecond based interface until then, it should be worth the effort.
Posted May 13, 2015 3:08 UTC (Wed)
by eternaleye (guest, #67051)
Furthermore, at that point "now" could be defined in an offset-ish way - use point time for "when we booted" and interval time for "...and how long has it been since then?"
But that'd be an even more drastic API change, and thus deeply unlikely.
Posted May 12, 2015 16:44 UTC (Tue)
by flussence (guest, #85566)
Posted May 13, 2015 3:10 UTC (Wed)
by eternaleye (guest, #67051)
Posted May 13, 2015 17:56 UTC (Wed)
by joib (subscriber, #8541)
Posted May 13, 2015 20:37 UTC (Wed)
by eternaleye (guest, #67051)
This is really reinforcing my preference for Rust's [ui]{64,32,16,8} integer types over C's (unsigned) {long long,long,int,short,char} fuzzily-sized ones :/
(and Rust then has usize, defined as "capable of holding a pointer", for the cases where that's relevant)
Posted May 13, 2015 21:08 UTC (Wed)
by rleigh (guest, #14622)
Posted May 13, 2015 23:01 UTC (Wed)
by eternaleye (guest, #67051)
(of course, for FFI there's libc::c_int and so on, but that's just it: for FFI.)
Posted May 15, 2015 3:56 UTC (Fri)
by neilbrown (subscriber, #359)
Similar, but not the same.
See the RFC 1813 definition of nfstime3. The Linux server and client get this right (but silently truncate any timestamps outside that range).
Year 2038 and other such roll-overs are getting even the attention of BBC:
http://www.bbc.com/future/story/20150505-the-numbers-that-lead-to-disaster
2038 in the mainstream news
Time, Clock, and Calendar Programming In C
NFSv3 time stamps are unsigned, so they reach until 2106, but don't go back before 1970.