Restartable sequences in glibc

Posted Feb 2, 2022 2:38 UTC (Wed) by NYKevin (subscriber, #129325)
Parent article: Restartable sequences in glibc

If I'm understanding things correctly, it seems that the only real problem glibc is solving is the following:

1. In order to use restartable sequences, someone needs to allocate a struct rseq and tell the kernel about it. Also, you can't free that struct until the thread exits or you tell the kernel to stop using it.
2. If multiple different libraries or codepaths try to do that, they will step on each other.
3. Therefore, someone needs to "own" the struct rseq for each thread, and everyone needs to agree on the ownership of this struct.
4. However, once everyone agrees on who owns the struct, there's nothing wrong with foreign code overwriting the struct, so long as it stays within its own thread and doesn't have any reentrancy issues (the struct only needs to be valid over short instruction sequences, and nesting is explicitly unsupported - clobbering somebody else's rseq state is a non-issue as long as you don't move the struct, free it, or clobber it from a signal handler).
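As a rough illustration of points 1 and 2 (a sketch added for clarity, not code from glibc or the kernel tree): the C below tries to register a thread-local struct rseq with the raw rseq() system call. The names demo_area, demo_register_rseq, and DEMO_RSEQ_SIG are made up for this example, and the signature value is arbitrary. As far as I can tell from the kernel's behaviour, if something else (such as a recent glibc) has already registered a different area for the thread, the call fails with EINVAL, which is the "stepping on each other" case. If nobody registered first, it succeeds, and repeating the identical call returns EBUSY.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/rseq.h>   /* struct rseq (kernel UAPI header) */

/* Arbitrary signature for this sketch (an assumption, not a standard value).
 * The kernel only requires that the same value be used consistently. */
#define DEMO_RSEQ_SIG 0x53053053u

/* One area per thread. It must stay valid until it is unregistered or the
 * thread exits, which is why it lives in TLS rather than on the heap. */
static __thread struct rseq demo_area;

/* Try to register demo_area for the calling thread.
 * Returns 0 if this call became the owner, otherwise the errno value:
 *   EINVAL  a different area is already registered for this thread
 *   EBUSY   this exact area is already registered
 *   ENOSYS  the kernel (or sandbox) does not provide rseq() */
int demo_register_rseq(void)
{
    /* The kernel rejects areas that are not 32-byte aligned. */
    assert(((uintptr_t)&demo_area % 32) == 0);

    long rc = syscall(SYS_rseq, &demo_area, sizeof(demo_area), 0,
                      DEMO_RSEQ_SIG);
    return rc == 0 ? 0 : errno;
}
```

The caller learns from the error code that it lost the race, but not where the winning struct lives, and that address is the piece of information the rest of this comment is about.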

glibc fills the role of the "someone" in step 3. However, I don't understand why this role must necessarily be filled by a specific, fixed userspace component at all. If the kernel exposed an API for querying the address of a thread's current struct rseq (which the kernel surely knows), then you could just take a "first to call rseq() wins" approach and sidestep the ownership issue altogether. You would still have the problem that the struct must be freed when the thread exits (and no earlier!), and in practice this might result in glibc trying to be the first to initialize it anyway, but there would be no need for an explicit userspace ABI for this sort of coordination - everybody could just use the kernel to coordinate who owns the struct. OTOH, I suppose there might be some sequencing issues when the thread exits (i.e. during the thread-exiting process, exactly when does it become "safe" to free/reclaim the struct rseq?), but I tend to imagine that there are ways of solving this problem (e.g. a rule that it must be allocated on the owning thread's stack, or that it must be freed from a different thread after the owning thread is gone, or something similar), and it probably wouldn't be too hard to agree on a convention for how to do that.

Have I misunderstood something?

Restartable sequences in glibc

Posted Feb 2, 2022 4:06 UTC (Wed) by foom (subscriber, #14868)

glibc plans to use rseq internally to do useful things like speed up malloc, so it needs to do rseq registration regardless of whether anyone else wants to use it. There doesn't seem to be much point in creating a mechanism to share registration duties if libc will always "win" anyway.

As far as not exposing any user-space ABI: if you don't, every user would then need to make a syscall to retrieve the rseq area's location separately per thread? And presumably then cache it in a TLS variable for performance? That seems a bit silly and wasteful, when it's easy enough to just make a constant offset from the thread pointer available in user space to anyone who needs it.
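For comparison, this is roughly what the constant offset looks like from the consumer side with the symbols glibc 2.35 exports in <sys/rseq.h> (__rseq_offset and __rseq_size). It is a minimal sketch, assuming a compiler that provides __builtin_thread_pointer() on the target architecture. The helper names glibc_rseq_area and demo_current_cpu are hypothetical and not part of glibc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/rseq.h>   /* glibc 2.35 or later: __rseq_offset, __rseq_size */

/* Return the rseq area glibc registered for the calling thread, or NULL
 * if registration failed or was disabled (glibc then sets __rseq_size
 * to 0). No system call and no per-library TLS cache is needed. */
struct rseq *glibc_rseq_area(void)
{
    if (__rseq_size == 0)
        return NULL;

    struct rseq *rs =
        (struct rseq *)((char *)__builtin_thread_pointer() + __rseq_offset);
    assert(((uintptr_t)rs % 32) == 0);   /* the area is 32-byte aligned */
    return rs;
}

/* Read the current CPU number from the shared area, falling back to
 * sched_getcpu() when no area is registered. */
int demo_current_cpu(void)
{
    struct rseq *rs = glibc_rseq_area();
    if (rs == NULL)
        return sched_getcpu();
    /* The kernel updates cpu_id behind our back, hence the volatile read. */
    return (int)*(volatile uint32_t *)&rs->cpu_id;
}
```

The lookup is a thread-pointer read plus an add, which is the cheapness argument made above.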