On Sat, Jul 21, 2018 at 4:19 PM, M. Vefa Bicakci <m.v.b@xxxxxxxxxx> wrote: > On 07/21/2018 05:45 PM, Andy Lutomirski wrote: >> >> On Sat, Jul 21, 2018 at 12:49 PM, M. Vefa Bicakci <m.v.b@xxxxxxxxxx> >> wrote: >>> >>> Commit 3ac6d8c787b8 ("x86/entry/64: Clear registers for >>> exceptions/interrupts, to reduce speculation attack surface") >>> unintendedly >>> broke Xen PV virtual machines by clearing the %rbx register at the end of >>> xen_failsafe_callback before the latter jumps to error_exit. >>> error_exit expects the %rbx register to be a flag indicating whether >>> there should be a return to kernel mode. >>> >>> This commit makes sure that the %rbx register is not cleared by >>> the PUSH_AND_CLEAR_REGS macro, when the macro in question is instantiated >>> by xen_failsafe_callback, to avoid the issue. >> >> >> Seems like a genuine problem, but: >> >>> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S >>> index c7449f377a77..96e8ff34129e 100644 >>> --- a/arch/x86/entry/entry_64.S >>> +++ b/arch/x86/entry/entry_64.S >>> @@ -1129,7 +1129,7 @@ ENTRY(xen_failsafe_callback) >>> addq $0x30, %rsp >>> UNWIND_HINT_IRET_REGS >>> pushq $-1 /* orig_ax = -1 => not a system call */ >>> - PUSH_AND_CLEAR_REGS >>> + PUSH_AND_CLEAR_REGS clear_rbx=0 >>> ENCODE_FRAME_POINTER >>> jmp error_exit >> >> >> The old code first set RBX to zero then, if frame pointers are on, >> sets it to some special non-zero value, then crosses its fingers and >> hopes for the best. Your patched code just skips the zeroing part, so >> RBX either contains the ENCODE_FRAME_POINTER result or is >> uninitialized. >> >> How about actually initializing rbx to something sensible like, say, 1? > > > Hello Andy, > > Thank you for the review! Apparently, I have not done my homework fully. > I will test your suggestion and report back, most likely in a few hours. > > I have been testing with the next/linux-next tree's master branch > (dated 20180720), and I noticed that ENCODE_FRAME_POINTER changes the > frame pointer (i.e., RBP) register, as opposed to the RBX register, > which the patch aims to avoid changing before jumping to error_exit. > It is possible that I am missing something though -- I am not sure about > the connection between the RBP and RBX registers. Sorry, brain fart on my part. > > The change introduced by commit 3ac6d8c787b8 is in the excerpt below. Would > it > be valid to state that the original code had the same issue that you > referred > to (i.e., leaving the RBX register uninitialized)? Presumably. I would propose a rather different fix: https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/pti&id=bb3d76b50c3bc78b67d79cf90d328f38a435c793 Any chance you could test that and see if it fixes your problem?