Preventing stack guard-page hopping

Posted Jun 19, 2017 20:13 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
In reply to: Preventing stack guard-page hopping by tux3
Parent article: Preventing stack guard-page hopping

A better change is to modify alloca() in libc to touch at least one byte on each allocated page. But it'll take some time to percolate through the userspace software.

Preventing stack guard-page hopping

Posted Jun 19, 2017 20:26 UTC (Mon) by cpitrat (subscriber, #116459) [Link] (15 responses)

This would protect remote attacks but wouldn't prevent an attacker to write his own stack allocation for local privilege escalation.

I'm surprised a 900 lines patch is only about increasing the size of the page guard. Isn't there more in it ?

Preventing stack guard-page hopping

Posted Jun 19, 2017 21:09 UTC (Mon) by roc (subscriber, #30627) [Link] (2 responses)

> This would protect remote attacks but wouldn't prevent an attacker to write his own stack allocation for local privilege escalation.

The local privilege escalation threat assumes that the high-privilege C code is trusted, and then exploits it.

If the attacker can write high-privilege C code, you've already lost.

Preventing stack guard-page hopping

Posted Jun 20, 2017 9:43 UTC (Tue) by moltonel (guest, #45207) [Link] (1 responses)

The libc isn't high-privilege/trusted, and any local attacker can use his own vulnerable libc-equivalent routines instead. So a protection at libc-level would only protect against remote attacks, where the attacker has to contend with the local libc or use a different vulnerability to bring his own libc-equivalent.

Preventing stack guard-page hopping

Posted Jun 20, 2017 10:13 UTC (Tue) by matthias (subscriber, #94967) [Link]

There are certainly some suid binaries linking against libc. Thus the libc is high-privilege code. The local attacker can only use the code/libraries linked into suid binaries.

If the attacker has the ability to run his own code with privileges, everything is already lost. No need for an exploit.

Preventing stack guard-page hopping

Posted Jun 20, 2017 6:54 UTC (Tue) by vbabka (subscriber, #91706) [Link]

> I'm surprised a 900 lines patch is only about increasing the size of the page guard. Isn't there more in it ?

Well, it's 900 lines of .patch file text, but the diffstat is around 300 added+deleted, so not that much.

It's large because, as explained in the commit log, the old 1 stack guard page code simply extended to N pages made many accounting issues visible, because the guard page(s) were part of the VMA's [start, end] addresses. The patch deletes that approach and replaces it so that the gap is always between VMA boundaries. That means adjusting the code to check allowed VMA placement/enlargement so that it maintains the gap if the next/prev VMA is a stack one.

Preventing stack guard-page hopping

Posted Jun 20, 2017 9:55 UTC (Tue) by moltonel (guest, #45207) [Link] (9 responses)

> A better change is to modify alloca() in libc to touch at least one byte on each allocated page.

That's going to mess up with the performance profile (allocating pages earlyer than expected) and decrease total performance in case the app wasn't going to touch those pages at all.

> This would protect remote attacks but wouldn't prevent an attacker to write his own stack allocation for local privilege escalation.

Assuming we accept the performance hit, can we use the same technique in the kernel ? Disable overcommit ? Or is the kernel not aware of what the app is considering its stack space ?

Preventing stack guard-page hopping

Posted Jun 20, 2017 10:39 UTC (Tue) by nix (subscriber, #2304) [Link] (7 responses)

> That's going to mess up with the performance profile (allocating pages earlyer than expected) and decrease total performance in case the app wasn't going to touch those pages at all.

It's... not common for applications to allocate page-size structures on the heap that are not optimized out and then not use them for anything. I suppose functions that have big local variables and then do early exit based only on the parameters, but in that case the compiler can adapt to adjust the stack only after the early exits, if this is really significant (which I very much doubt).

Preventing stack guard-page hopping

Posted Jun 20, 2017 15:15 UTC (Tue) by zblaxell (subscriber, #26385) [Link] (6 responses)

> It's... not common for applications to allocate page-size structures on the [stack] that are not optimized out and then not use them for anything.

In one project I found an innocuous-looking state structure that turned out to have ~5MB of unused bytes in the middle, buried under a pyramid of macro expansion, arrays, nested members, and unreadable coding style. The code did use all the other members in the struct, on both sides of the hole.

Also it's fairly common in userland to do IO to a buffer on the stack, where the buffer is huge and the IO is tiny.

Preventing stack guard-page hopping

Posted Jun 20, 2017 16:30 UTC (Tue) by gutschke (subscriber, #27910) [Link] (5 responses)

Most programs that I see have relatively manageable buffers on the stack. Often, it is just a handful of bytes for state machines that try to read things line by line. And at other times, it might be as much as maybe a handful of kilobytes. That usually faults in no more than an extra page or two. And that's well within the normal variation of stack depth. So, performance impact should be zero.

Do you really commonly see programs allocate many hundred of kilobytes if not many megabytes on the stack? That's not a pattern that I have encountered frequently. Buffers this large are more commonly allocated on the heap.

I am not saying it doesn't happen. Anything stupid that you can think of, somebody else probably thought of before. But common? Hopefully not.

Preventing stack guard-page hopping

Posted Jun 21, 2017 11:14 UTC (Wed) by PaXTeam (guest, #24616) [Link]

besides C and buffers/alloca there's also C++ and classes instantiated as local variables, so code can inadvertantly require larger than usual stack frames that way.

Preventing stack guard-page hopping

Posted Jun 21, 2017 11:24 UTC (Wed) by nix (subscriber, #2304) [Link] (2 responses)

Besides, I never said it didn't happen -- just that most functions don't do it and those that do take so long that a bit of extra page touching is irrelevant (doing I/O into big buffers was what I was thinking of: any function that does I/O is going to have the I/O dominate its performance profile.)

Preventing stack guard-page hopping

Posted Jun 21, 2017 14:57 UTC (Wed) by zblaxell (subscriber, #26385) [Link] (1 responses)

If the program uses the huge thing it allocated on the stack then it's going to fault in all the pages anyway, and that's a pretty big hit the first time around, much larger than the cost of the probe.

On the other hand, if a function is being called in a loop then the probes keep happening over and over even though the page faults don't, so the probing gets expensive.

For programs that handle toxic data there might not be a quick and easy solution--they might just have to suck up the cost of doing probes all the time, or use other techniques (e.g. constant-stack algorithm proofs, coding standards forbidding alloca() and sparse structures, etc.) to make sure stack overflows don't happen.

Since changes to alloca require recompiling the program, it's up to individual applications to make the performance/security tradeoff anyway. Isn't there already a compiler option to do this?

Preventing stack guard-page hopping

Posted Jun 22, 2017 22:37 UTC (Thu) by mikemol (guest, #83507) [Link]

They could also tune these performance/security trade-offs on a routine-by-routine basis, by stuffing sensitive routines in their own compilation unit.

LTO will need to be careful to let these considerations bubble up to the final binary, however.

Preventing stack guard-page hopping

Posted Oct 3, 2019 13:18 UTC (Thu) by ychevali (guest, #134753) [Link]

when you allocate a large array on the stack, initialization (from index 0) starts on the far end and thus jumps over all the pages in between. Case in point: for some programs, it makes a lot of sense to start by computing an array of ``small'' prime numbers (say up to 100,000 or 1,000,000) by Eratosthene sieve.

Preventing stack guard-page hopping

Posted Jun 26, 2017 9:25 UTC (Mon) by anton (subscriber, #25547) [Link]

A better change is to modify alloca() in libc to touch at least one byte on each allocated page.
That's going to mess up with the performance profile (allocating pages earlyer than expected) and decrease total performance in case the app wasn't going to touch those pages at all.

I don't think that that's a significant issue, but anyway: You just need to read the byte (the guard page is not readable, is it?). So all the not-yet-used stack pages can be the same page containing zeroes (which also means that the same cache line will be used for all these reads in a physically-tagged (i.e., normal these days) cache). Only when it is used for real, a physical page is allocated.

Preventing stack guard-page hopping

Posted Jun 26, 2017 9:09 UTC (Mon) by anton (subscriber, #25547) [Link]

This would protect remote attacks but wouldn't prevent an attacker to write his own stack allocation for local privilege escalation.

I don't think that preventing this attack scenario prevents any halfway-competent attack. If the attacker can write his own stack allocation, he can write it to jump over guard regions of any size; actually, he can put the memory writes to the area below the stack in his otherwise-regular stack-allocation code directly. In other words: If you allow the attacker to execute his code in a setting that can escalate priviledges, you are already owned, guard page or not.

Preventing stack guard-page hopping

Posted Jun 20, 2017 14:50 UTC (Tue) by BenHutchings (subscriber, #37955) [Link]

alloca() can't be implemented as a real function, so it's only "in" glibc in the sense that the definition is in a glibc header. Further, that definition just defers to the compiler's pseudo-function __builtin_alloca(). So even rebuilding against an updated glibc isn't enough to fix this. glibc has been updated to make its own use of alloca() safer, though.