On Fri, 2023-03-03 at 16:30 +0000, szabolcs.nagy@xxxxxxx wrote:
> the points that i think are worth raising:
>
> - shadow stack size logic may need to change later.
>   (it can be too big, or too small in practice.)

Looking at making it more efficient in the future seems great. But
since we are not in the position of being able to make shadow stacks
completely seamless (see below)

> - shadow stack overflow is not recoverable and the
>   possible fix for that (sigaltshstk) breaks longjmp
>   out of signal handlers.
> - jump back after SS_AUTODISARM swapcontext cannot be
>   reliable if alt signal uses thread shadow stack.
> - the above two concerns may be mitigated by different
>   sigaltstack behaviour which may be hard to add later.

Are you aware that you can't simply emit a restore token on x86 without
first restoring to another restore token? This is why (I'm assuming)
glibc uses incssp to implement longjmp instead of just jumping back to
the setjmp point with a shadow stack restore. So of course then longjmp
can't jump between shadow stacks.

So there are sort of two categories of restrictions on binaries that
mark the SHSTK ELF bit. The first category is that they have to take
special steps when switching stacks or jumping around on the stack.
Once they handle this, they can work with shadow stack. The second
category is that they can't do certain patterns of jumping around on
stacks, regardless of the steps they take. So certain previously
allowed software patterns are now impossible, including ones
implemented in glibc. (And the exact restrictions on the glibc APIs are
not documented, and this should be fixed.) If applications would
violate either of these categories of restrictions, they should not
mark the SHSTK ELF bit.

Now that said, there is an exception to these restrictions on x86,
which is the WRSS instruction, which can write to the shadow stack. The
arch_prctl() interface allows this to be optionally enabled and locked.
The v2 signal analysis I pointed to earlier mentions how this might be
used by glibc to support more of the currently restricted patterns.
Please take a look if you haven't (section "setjmp()/longjmp()"). It
also explains why in the non-WRSS scenarios the kernel can't easily
help improve the situation.

WRSS opens up writing to the shadow stack, and so a glibc-WRSS mode
would be making a security/compatibility tradeoff. I think starting
with the more restricted mode was ultimately good in creating a kernel
ABI that can support both. If userspace could paper over ABI gaps with
WRSS, we might not have realized the issues we did.

> - end token for backtrace may be useful, if added
>   later it can be hard to check.

Yes, this seems like a good idea. Thanks for the suggestion. I'm not
sure it can't be added later though. I'll POC it and do some more
thinking.
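
To make the incssp point above a bit more concrete, here is a toy model
of why a longjmp that only pops shadow stack entries cannot cross
between shadow stacks. This is only a sketch under stated assumptions:
it is not glibc's implementation, it executes no CET instructions, and
the names Ssp, IncsspError and incssp_longjmp are made up for
illustration. It assumes a shadow stack pointer can be described by
which stack it is on plus how many entries are live, and that popping
is the only operation available (no restore token, no WRSS).

```python
from dataclasses import dataclass


class IncsspError(Exception):
    """Raised when popping entries alone cannot reach the saved pointer."""


@dataclass(frozen=True)
class Ssp:
    """Toy shadow stack pointer (hypothetical representation)."""
    stack: str  # label of the shadow stack this pointer is on
    depth: int  # number of live entries on that stack


def incssp_longjmp(current: Ssp, saved: Ssp) -> Ssp:
    """Model a longjmp that may only pop entries from the current stack."""
    if saved.stack != current.stack:
        # Popping moves the pointer within one stack; it never switches stacks.
        raise IncsspError("saved pointer is on a different shadow stack")
    if saved.depth > current.depth:
        # The frame that called setjmp has already been popped.
        raise IncsspError("saved pointer is deeper than the current pointer")
    to_pop = current.depth - saved.depth
    return Ssp(current.stack, current.depth - to_pop)


# Same stack, unwinding to an older frame: allowed in this model.
print(incssp_longjmp(Ssp("thread", 5), Ssp("thread", 2)))

# Handler running on a separate (alt) shadow stack: not reachable by popping.
try:
    incssp_longjmp(Ssp("alt", 1), Ssp("thread", 2))
except IncsspError as err:
    print("rejected:", err)
```

The second case corresponds to the sigaltshstk concern quoted above:
once the handler runs on a different shadow stack, no amount of popping
gets back to the thread's stack, so something beyond pop-only unwinding
(a restore token or WRSS) would be needed.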