On Wed, Aug 12, 2015 at 6:08 PM, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Aug 12, 2015 at 06:03:26PM -0700, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
>>
>> This is a note to let you know that I've just added the patch titled
>>
>>     x86/nmi/64: Switch stacks on userspace NMI entry
>>
>> to the 4.1-stable tree which can be found at:
>>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
>>
>> The filename of the patch is:
>>     x86-nmi-64-switch-stacks-on-userspace-nmi-entry.patch
>> and it can be found in the queue-4.1 subdirectory.
>>
>> If you, or anyone else, feels it should not be added to the stable tree,
>> please let <stable@xxxxxxxxxxxxxxx> know about it.
>>
>>
>> From 9b6e6a8334d56354853f9c255d1395c2ba570e0a Mon Sep 17 00:00:00 2001
>> From: Andy Lutomirski <luto@xxxxxxxxxx>
>> Date: Wed, 15 Jul 2015 10:29:35 -0700
>> Subject: x86/nmi/64: Switch stacks on userspace NMI entry
>>
>> From: Andy Lutomirski <luto@xxxxxxxxxx>
>>
>> commit 9b6e6a8334d56354853f9c255d1395c2ba570e0a upstream.
>>
>> Returning to userspace is tricky: IRET can fail, and ESPFIX can
>> rearrange the stack prior to IRET.
>>
>> The NMI nesting fixup relies on a precise stack layout and
>> atomic IRET.  Rather than trying to teach the NMI nesting fixup
>> to handle ESPFIX and failed IRET, punt: run NMIs that came from
>> user mode on the normal kernel stack.
>>
>> This will make some nested NMIs visible to C code, but the C
>> code is okay with that.
>>
>> As a side effect, this should speed up perf: it eliminates an
>> RDMSR when NMIs come from user mode.
>>
>> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> Reviewed-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
>> Reviewed-by: Borislav Petkov <bp@xxxxxxx>
>> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>>
>> ---
>>  arch/x86/kernel/entry_64.S | 61 ++++++++++++++++++++++++++++++++++++++++++---
>>  1 file changed, 57 insertions(+), 4 deletions(-)
>>
>> --- a/arch/x86/kernel/entry_64.S
>> +++ b/arch/x86/kernel/entry_64.S
>> @@ -1424,19 +1424,72 @@ ENTRY(nmi)
>>  	 * a nested NMI that updated the copy interrupt stack frame, a
>>  	 * jump will be made to the repeat_nmi code that will handle the second
>>  	 * NMI.
>> +	 *
>> +	 * However, espfix prevents us from directly returning to userspace
>> +	 * with a single IRET instruction.  Similarly, IRET to user mode
>> +	 * can fault.  We therefore handle NMIs from user space like
>> +	 * other IST entries.
>>  	 */
>>
>>  	/* Use %rdx as our temp variable throughout */
>>  	pushq_cfi %rdx
>>  	CFI_REL_OFFSET rdx, 0
>>
>> +	testb	$3, CS-RIP+8(%rsp)
>> +	jz	.Lnmi_from_kernel
>> +
>> +	/*
>> +	 * NMI from user mode.  We need to run on the thread stack, but we
>> +	 * can't go through the normal entry paths: NMIs are masked, and
>> +	 * we don't want to enable interrupts, because then we'll end
>> +	 * up in an awkward situation in which IRQs are on but NMIs
>> +	 * are off.
>> +	 */
>> +
>> +	SWAPGS
>> +	cld
>> +	movq	%rsp, %rdx
>> +	movq	PER_CPU_VAR(kernel_stack), %rsp
>
> Note, this differs from what is in 4.2-rc, and what was in Ben's
> backported version for 4.0 because we don't have a KERNEL_STACK_OFFSET
> anymore in 4.1, and we don't yet have cpu_current_top_of_stack either.
>
> So odds are, this is wrong, but if so, what should I do here for 4.1?
> Backport the cpu_current_top_of_stack logic?

I haven't tested this directly, but it looks correct.  In 4.1,
KERNEL_STACK_OFFSET was removed and effectively became zero.

--Andy
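
As an illustration of why a zero offset makes the plain load sufficient,
here is a small user-space sketch (not kernel code).  It assumes that,
before 4.1, the per-CPU kernel_stack value was stored KERNEL_STACK_OFFSET
bytes below the top of the thread stack, so entry code had to add the
offset back.  The helper name stack_top, the example address, and the
40-byte old offset are assumptions chosen for illustration only.

```c
#include <stdint.h>

/* Illustrative values only; neither is taken from the patch above. */
#define EXAMPLE_STACK_TOP   UINT64_C(0xffff880000004000)
#define EXAMPLE_OLD_OFFSET  UINT64_C(40)  /* assumed pre-4.1 KERNEL_STACK_OFFSET */

/*
 * Hypothetical helper: given the value held in the per-CPU kernel_stack
 * variable and the offset that was subtracted when it was stored,
 * return the address the entry code should load into %rsp.
 */
static uint64_t stack_top(uint64_t kernel_stack, uint64_t kernel_stack_offset)
{
	return kernel_stack + kernel_stack_offset;
}
```

With the offset at zero, stack_top() returns kernel_stack unchanged, which
matches the unadjusted "movq PER_CPU_VAR(kernel_stack), %rsp" in the 4.1
backport.  With a nonzero offset, the same load would leave %rsp that many
bytes short of the top.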