On Thu, Aug 13, 2015 at 11:51:12AM -0700, Andy Lutomirski wrote:
> On Wed, Aug 12, 2015 at 6:08 PM, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > On Wed, Aug 12, 2015 at 06:03:26PM -0700, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> >>
> >> This is a note to let you know that I've just added the patch titled
> >>
> >>     x86/nmi/64: Switch stacks on userspace NMI entry
> >>
> >> to the 4.1-stable tree which can be found at:
> >>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> >>
> >> The filename of the patch is:
> >>     x86-nmi-64-switch-stacks-on-userspace-nmi-entry.patch
> >> and it can be found in the queue-4.1 subdirectory.
> >>
> >> If you, or anyone else, feels it should not be added to the stable tree,
> >> please let <stable@xxxxxxxxxxxxxxx> know about it.
> >>
> >>
> >> >From 9b6e6a8334d56354853f9c255d1395c2ba570e0a Mon Sep 17 00:00:00 2001
> >> From: Andy Lutomirski <luto@xxxxxxxxxx>
> >> Date: Wed, 15 Jul 2015 10:29:35 -0700
> >> Subject: x86/nmi/64: Switch stacks on userspace NMI entry
> >>
> >> From: Andy Lutomirski <luto@xxxxxxxxxx>
> >>
> >> commit 9b6e6a8334d56354853f9c255d1395c2ba570e0a upstream.
> >>
> >> Returning to userspace is tricky: IRET can fail, and ESPFIX can
> >> rearrange the stack prior to IRET.
> >>
> >> The NMI nesting fixup relies on a precise stack layout and
> >> atomic IRET. Rather than trying to teach the NMI nesting fixup
> >> to handle ESPFIX and failed IRET, punt: run NMIs that came from
> >> user mode on the normal kernel stack.
> >>
> >> This will make some nested NMIs visible to C code, but the C
> >> code is okay with that.
> >>
> >> As a side effect, this should speed up perf: it eliminates an
> >> RDMSR when NMIs come from user mode.
> >>
> >> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> >> Reviewed-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> >> Reviewed-by: Borislav Petkov <bp@xxxxxxx>
> >> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> >> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> >> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >> Cc: stable@xxxxxxxxxxxxxxx
> >> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> >> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> >>
> >> ---
> >>  arch/x86/kernel/entry_64.S | 61 ++++++++++++++++++++++++++++++++++++++++++---
> >>  1 file changed, 57 insertions(+), 4 deletions(-)
> >>
> >> --- a/arch/x86/kernel/entry_64.S
> >> +++ b/arch/x86/kernel/entry_64.S
> >> @@ -1424,19 +1424,72 @@ ENTRY(nmi)
> >>       * a nested NMI that updated the copy interrupt stack frame, a
> >>       * jump will be made to the repeat_nmi code that will handle the second
> >>       * NMI.
> >> +     *
> >> +     * However, espfix prevents us from directly returning to userspace
> >> +     * with a single IRET instruction. Similarly, IRET to user mode
> >> +     * can fault. We therefore handle NMIs from user space like
> >> +     * other IST entries.
> >>       */
> >>
> >>      /* Use %rdx as our temp variable throughout */
> >>      pushq_cfi %rdx
> >>      CFI_REL_OFFSET rdx, 0
> >>
> >> +    testb $3, CS-RIP+8(%rsp)
> >> +    jz .Lnmi_from_kernel
> >> +
> >> +    /*
> >> +     * NMI from user mode. We need to run on the thread stack, but we
> >> +     * can't go through the normal entry paths: NMIs are masked, and
> >> +     * we don't want to enable interrupts, because then we'll end
> >> +     * up in an awkward situation in which IRQs are on but NMIs
> >> +     * are off.
> >> +     */
> >> +
> >> +    SWAPGS
> >> +    cld
> >> +    movq %rsp, %rdx
> >> +    movq PER_CPU_VAR(kernel_stack), %rsp
> >
> > Note, this differs from what is in 4.2-rc, and what was in Ben's
> > backported version for 4.0 because we don't have a KERNEL_STACK_OFFSET
> > anymore in 4.1, and we don't yet have cpu_current_top_of_stack either.
> >
> > So odds are, this is wrong, but if so, what should I do here for 4.1?
> > Backport the cpu_current_top_of_stack logic?
>
> I haven't tested directly, but this looks correct. In 4.1,
> KERNEL_STACK_OFFSET was removed and effectively became zero.

Great, thanks for letting me know.

greg k-h
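
For readers following the thread, here is a rough C sketch of the decision the quoted hunk makes. It is not kernel code. The helper names (nmi_from_user, nmi_pick_stack) and the kernel_stack_offset parameter are hypothetical, introduced only for illustration. The sketch assumes that the "testb $3" check looks at the low two bits of the saved CS selector, and that the thread stack top is the per-CPU kernel_stack value plus an offset that is zero on 4.1, per Andy's remark above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * The low two bits of a saved CS selector hold the privilege level.
 * Zero means the NMI interrupted kernel code; nonzero means user mode.
 * This models "testb $3, CS-RIP+8(%rsp); jz .Lnmi_from_kernel".
 */
static inline bool nmi_from_user(uint16_t saved_cs)
{
    return (saved_cs & 3) != 0;
}

/*
 * Hypothetical helper: choose the stack pointer the NMI handler runs on.
 * kernel_stack stands in for PER_CPU_VAR(kernel_stack); the offset
 * parameter is an assumption used to model trees that still carry
 * KERNEL_STACK_OFFSET. Pass 0 to model 4.1.
 */
static inline uint64_t nmi_pick_stack(uint16_t saved_cs, uint64_t nmi_rsp,
                                      uint64_t kernel_stack,
                                      uint64_t kernel_stack_offset)
{
    if (!nmi_from_user(saved_cs))
        return nmi_rsp; /* from kernel: stay on the NMI stack */

    /* from user mode: switch to the thread's normal kernel stack */
    return kernel_stack + kernel_stack_offset;
}
```

With illustrative selector values (0x10 for kernel code, 0x33 for 64-bit user code), nmi_pick_stack(0x10, ...) returns the NMI stack pointer unchanged, while nmi_pick_stack(0x33, ..., kernel_stack, 0) returns kernel_stack itself, which matches the single movq in the 4.1 version of the patch.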