On Thu, May 02, 2019 at 11:02:40AM -0700, Linus Torvalds wrote: > On Thu, May 2, 2019 at 9:21 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: > > > > TL;DR, on x86_32 kernel->kernel IRET frames are only 3 entries and do > > not include ESP/SS, so not only wasn't regs->sp setup, if you changed it > > it wouldn't be effective and corrupt random stack state. > > Indeed, the 32-bit case for same-RPL exceptions/iret is entirely > different, and I'd forgotten about that. > > And honestly, this makes the 32-bit case much worse. Now the entry > stack modifications of int3 suddenly affect not just the entry, but > every exit too. We could fix this by not using the common exit path on int3; not sure we want to go there, but that is an option. > This is _exactly_ the kind of subtle kernel entry/exit code I wanted > us to avoid. > > And while your code looks kind of ok, it's subtly buggy. This sequence: > > + pushl %eax > + movl %esp, %eax > + > + movl 4*4(%eax), %esp # restore (modified) regs->sp > + > + /* rebuild IRET frame */ > + pushl 3*4(%eax) # flags > + pushl 2*4(%eax) # cs > + pushl 1*4(%eax) # ip > + > + andl $0x0000ffff, 4(%esp) # clear high CS bits > + > + movl (%eax), %eax # restore eax > > looks very wrong to me. When you do that "restore (modified) > regs->sp", isn't that now resetting %esp to the point where %eax now > points below the stack? So if we get an NMI in this sequence, that > will overwrite the parts you are trying to copy from? ARGH; I knew it was too pretty :/ Yes, something like what you suggest will be needed, I'll go look at that once my brain recovers a bit from staring at entry code all day.