On Tue, 2023-06-20 at 10:17 +0100, szabolcs.nagy@xxxxxxx wrote:
> if there is a fix that's good, i haven't seen it.
>
> my point was that the current unwinder works with current kernel
> patches, but does not allow future extensions which prevents
> sigaltshstk to work. the unwinder is not versioned so this cannot
> be fixed later. it only works if distros ensure shstk is disabled
> until the unwinder is fixed. (however there is no way to detect
> old unwinder if somebody builds gcc from source.)

This is a problem the kernel is having to deal with, not causing. The
userspace changes were upstreamed before the kernel. Userspace folks
are adamantly against moving to a new ELF bit, to start over with a
clean slate. I tried everything to influence this and was not
successful. So I'm still not sure what the proposal here is for the
kernel.

I am guessing that the -fnon-call-exceptions/expanded frame size
incompatibilities could end up causing something to grow an opt-in at
some point.

>
> also note that there is generic code in the unwinder that will
> deal with this and likely the x86 patches will conflict with
> arm and riscv etc patches that try to fix the same issue..
> so posting patches on the tools side of the abi would be useful
> at this point.

The glibc patches are unfortunately mostly upstream already. See HJ for
the diff that targets the new enabling interface. From lessons learned
earlier in this effort, he was not going to push those changes before
the kernel support was upstream. There shouldn't be any glibc changes
to signal or longjmp stuff in those AFAIK though.

[ snip ]

> how does "fixed shadow stack signal frame size" relate to
> "-fnon-call-exceptions"?
>
> if there were instruction boundaries within a function where the
> ret addr is not yet pushed or already popped from the shstk then
> the flag would be relevant, but since push/pop happens atomically
> at function entry/return -fnon-call-exceptions makes no
> difference as far as shstk unwinding is concerned.

As I said, the existing unwinding code for -fnon-call-exceptions
assumes a fixed shadow stack signal frame size of 8 bytes. Since the
exception is thrown out of a signal, it needs to know how to unwind
through the shadow stack signal frame.

[ snip ]

> there is no magic, longjmp should be implemented as:
>
>     target_ssp = read from jmpbuf;
>     current_ssp = read ssp;
>     for (p = target_ssp; p != current_ssp; p--) {
>         if (*p == restore-token) {
>             // target_ssp is on a different shstk.
>             switch_shstk_to(p);
>             break;
>         }
>     }
>     for (; p != target_ssp; p++)
>         // ssp is now on the same shstk as target.
>         inc_ssp();
>
> this is what setcontext is doing and longjmp can do the same:
> for programs that always longjmp within the same shstk the first
> loop is just p = current_ssp, but it also works when longjmp
> target is on a different shstk assuming nothing is running on
> that shstk, which is only possible if there is a restore token
> on top.
>
> this implies if the kernel switches shstk on signal entry it has
> to add a restore-token on the switched away shstk.

I actually did a POC for this, but rejected it. The problem is, if
there is a shadow stack overflow at that point then the kernel can't
push the shadow stack token to the old stack. And shadow stack overflow
is exactly the alt shadow stack use case. So it doesn't really solve
the problem.

This reasoning was actually elaborated on when the alt shadow stack
patches were posted. And it looks like I previously pointed you at it.
The history here is quite long and complicated, but I've done my best
to summarize it in the cover letters. It would be helpful if you could
review those links.
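
To make the trade-off concrete, here is a minimal user-space model of
the quoted longjmp scheme and of the overflow case that made me reject
it. This is only a sketch under stated assumptions: the two shadow
stacks are plain array regions that grow down, RESTORE_TOKEN is a
made-up sentinel (a real x86 restore token encodes an address), and
reset(), push_ret(), signal_switch() and shstk_longjmp() are
hypothetical helpers, not kernel or glibc interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel; a real x86 restore token encodes an address. */
#define RESTORE_TOKEN UINT64_MAX

/* Two simulated shadow stacks in one array, each growing downward. */
enum { STK_A_LIMIT = 0, STK_A_BASE = 16, STK_B_LIMIT = 16, STK_B_BASE = 32 };

static uint64_t mem[STK_B_BASE];
static size_t ssp;                 /* simulated shadow stack pointer (index) */

static void reset(size_t start_ssp)
{
    memset(mem, 0, sizeof mem);
    ssp = start_ssp;
}

/* Simulated call: push a fake, non-token return address. */
static void push_ret(void)
{
    ssp--;
    mem[ssp] = 0x1000 + ssp;
}

static void inc_ssp(void) { ssp++; }
static void switch_shstk_to(size_t p) { ssp = p; }

/*
 * Simulated signal entry onto an alt shadow stack. The scheme needs a
 * restore token on the old stack; if that stack is already full there
 * is no slot for it, so the model reports failure with -1.
 */
static int signal_switch(size_t old_limit, size_t alt_base)
{
    if (ssp == old_limit)
        return -1;                 /* overflowed stack: token cannot be pushed */
    ssp--;
    mem[ssp] = RESTORE_TOKEN;
    ssp = alt_base;
    return 0;
}

/* The quoted algorithm, transcribed onto the simulated memory. */
static void shstk_longjmp(size_t target_ssp)
{
    size_t p;

    for (p = target_ssp; p != ssp; p--) {
        if (mem[p] == RESTORE_TOKEN) {
            /* target_ssp is on a different shadow stack */
            switch_shstk_to(p);
            break;
        }
    }
    /* ssp is now on the same shadow stack as the target: pop up to it */
    for (; p != target_ssp; p++)
        inc_ssp();
}

/* Returns 1 if a longjmp from the alt stack lands back on the saved SSP. */
static int demo_cross_stack(void)
{
    size_t target;

    reset(STK_A_BASE);
    push_ret();
    push_ret();
    target = ssp;                  /* what "setjmp" would have saved */
    push_ret();
    if (signal_switch(STK_A_LIMIT, STK_B_BASE) != 0)
        return 0;
    push_ret();                    /* handler frame on the alt stack */
    shstk_longjmp(target);
    return ssp == target;
}
```

In this model the cross-stack longjmp works as long as signal_switch()
found room for the token. Fill stack A completely before the signal
and signal_switch() returns -1, which is the overflow case described
above: the token the longjmp scan depends on was never written.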
[ snip ]

> i think longjmp should really be discussed with libc devs,
> not on the kernel list, since they know the practical
> constraints and trade-offs better. however longjmp is
> relevant for the signal abi design so it's not ideal to
> push a linux abi and then have the libc side discussion
> later..

It sounds like you are aware of the limitations the pre-existing
upstream userspace places on the shadow stack signal frame. We also
previously discussed how the kernel had to work around other aspects of
upstream userspace that assumed an undecided kernel ABI.

How on earth are you getting that the kernel ABI is being pushed before
input from the userspace side? The situation is the opposite.