On Fri, 2023-06-23 at 17:25 +0100, szabolcs.nagy@xxxxxxx wrote:
> why?
>
> a stack can be active or inactive (task executing on it or not),
> and if inactive it can be live or dead (has stack frames that can
> be jumped to or not).
>
> this is independent of shadow stacks: longjmp is only valid if the
> target is either the same active stack or an inactive live stack.
> (there are cases that may seem to work, but fundamentally broken
> and not supportable: e.g. two tasks executing on the same stack
> where the one above happens to not clobber frames deep enough to
> collide with the task below.)
>
> the proposed longjmp design works for both cases. no assumption is
> made about ucontext or signals other than the shadow stack for an
> inactive live stack ends in a restore token,

One of the problems for the case of longjmp() from another stack is
how to find the old stack's token. HJ and I had previously discussed
searching for the token from the target SSP forward, but the problems
are that it is 1, not guaranteed to be there and 2, pretty awkward
and potentially slow.

> which is guaranteed by
> the isa so we only need the kernel to do the same when it switches
> shadow stacks. then longjmp works by construction.

No it's not.

>
> the only wart is that an overflowed shadow stack is inactive dead
> instead of inactive live because the token cannot be added. (note
> that handling shstk overflow and avoiding shstk overflow during
> signal handling still works with alt shstk!)

I thought we were on the same page as far as pushing a restore token
on signal not being robust against shadow stack overflow, so is this
a new idea?

Usually people around here say "code talks", but all I'm asking for
is a full explanation of what you are trying to accomplish and what
the idea is. Otherwise this is asking to hold up this feature based
on hand waving.

Could you answer the questions here, along with a full description of
your proposal:
https://lore.kernel.org/all/1cd67ae45fc379fd82d2745190e4caf74e67499e.camel@xxxxxxxxx/

If we are talking past each other, it could help to do a level-set at
this point.

>
> an alternative solution is to allow jump to inactive dead stack
> if that's created by a signal interrupt. for that a syscall is
> needed and longjmp has to detect if the target stack is dead or
> live. (the kernel also has to be able to tell if switching to the
> dead stack is valid for security reasons.) i don't know if this
> is doable (if we allow some hacks it's doable).
>
> unwinding across signal handlers is just a matter of having
> enough information at the signal frame to continue, it does
> not have to follow crazy jumps or weird coroutine things:
> that does not work without shadow stacks either. but unwind
> across alt stack frame should work.

Even if there is some workable idea, there is already a bunch of
userspace built around the existing solution, and users waiting for
it. In addition, the whole discussion around alt shadow stack cases
will require alt shadow stacks to be implemented, and we might be
constrained there anyway.

If we want to take learnings and do something new, let's build it
around a new elf bit. This current kernel ABI is to support the old
elf bit userspace stack, which has been solidified by the existing
upstream userspace. I had thought we should start from scratch now
and proposed a patch to block the old elf bit to force this, but lost
that battle with other glibc developers.

Speaking of which, I don't see any enthusiasm from any other glibc
developers that have been involved in this previously. Who were you
thinking was going to implement any of this on the glibc side? Have
you made any progress in getting any of them onboard with this order
of operations in the months since you first brought up changing the
design?