+ update-x86_64-mm-xen-use-iret-directly-where-possible.patch added to -mm tree

The patch titled
     update x86_64-mm-xen-use-iret-directly-where-possible
has been added to the -mm tree.  Its filename is
     update-x86_64-mm-xen-use-iret-directly-where-possible.patch

*** Remember to use Documentation/SubmitChecklist when testing your code ***

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

------------------------------------------------------
Subject: update x86_64-mm-xen-use-iret-directly-where-possible
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>

There's only a minor code change from the version you've got, but the
comments are more accurate.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xxxxxxxxxxxxx>
Cc: Andi Kleen <ak@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 arch/i386/xen/xen-asm.S |   56 +++++++++++++++++++++++++-------------
 1 files changed, 37 insertions(+), 19 deletions(-)

diff -puN arch/i386/xen/xen-asm.S~update-x86_64-mm-xen-use-iret-directly-where-possible arch/i386/xen/xen-asm.S
--- a/arch/i386/xen/xen-asm.S~update-x86_64-mm-xen-use-iret-directly-where-possible
+++ a/arch/i386/xen/xen-asm.S
@@ -108,14 +108,28 @@ ENDPATCH(xen_restore_fl_direct)
 	      4: cs
 	esp-> 0: eip
 
-	This attempts to make sure that any pending events are dealt with
-	on return to usermode, but there is a small window in which an event
-	can happen just before entering usermode.  This has three effects:
-	 - There can be interrupt recursion on the stack, which is
-	   unbounded in theory (but very unlikely in practice)
-	 - New softirq events can be queued up, but they won't get
-	   processed until the cpu next enters and leaves the kernel.
-	 - Signals likewise.
+	This attempts to make sure that any pending events are dealt
+	with on return to usermode, but there is a small window in
+	which an event can happen just before entering usermode.  If
+	the nested interrupt ends up setting one of the TIF_WORK_MASK
+	pending work flags, they will not be tested again before
+	returning to usermode. This means that a process can end up
+	with pending work, which will be unprocessed until the process
+	enters and leaves the kernel again, which could be an
+	unbounded amount of time.  This means that a pending signal or
+	reschedule event could be indefinitely delayed.
+
+	The fix is to notice a nested interrupt in the critical
+	window, and if one occurs, then fold the nested interrupt into
+	the current interrupt stack frame, and re-process it
+	iteratively rather than recursively.  This means that it will
+	exit via the normal path, and all pending work will be dealt
+	with appropriately.
+
+	Because the nested interrupt handler needs to deal with the
+	current stack state in whatever form it's in, we keep things
+	simple by only using a single register which is pushed/popped
+	on the stack.
 
 	Non-direct iret could be done in the same way, but it would
 	require an annoying amount of code duplication.  We'll assume
@@ -127,9 +141,6 @@ ENTRY(xen_iret_direct)
 	testl $(X86_EFLAGS_VM | XEN_EFLAGS_NMI), 8(%esp)
 	jnz hyper_iret
 
-	/* check IF state we're restoring */
-	testb $X86_EFLAGS_IF>>8, 8+1(%esp)
-
 	push %eax
 	ESP_OFFSET=4	# bytes pushed onto stack
 
@@ -144,6 +155,9 @@ ENTRY(xen_iret_direct)
 	movl $per_cpu__xen_vcpu_info, %eax
 #endif
 
+	/* check IF state we're restoring */
+	testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
+
 	/* Maybe enable events.  Once this happens we could get a
 	   recursive event, so the critical region starts immediately
 	   afterwards.  However, if that happens we don't end up
@@ -187,7 +201,7 @@ hyper_iret:
 
    The stack format at this point is:
 	----------------
-	 ss		:
+	 ss		: (ss/esp may be present if we came from usermode)
 	 esp		:
 	 eflags		}  outer exception info
 	 cs		}
@@ -219,17 +233,21 @@ hyper_iret:
    The only caveat is that if the outer eax hasn't been
    restored yet (ie, it's still on stack), we need to insert
    its value into the SAVE_ALL state before going on, since
-   its usermode state which we eventually need to restore.
+   it's usermode state which we eventually need to restore.
  */
 ENTRY(xen_iret_crit_fixup)
 	/* offsets +4 for return address */
 
-	/* Paranoia: make sure we're really coming from userspace.
-	   Once could imagine a case where userspace jumps into
-	   the critical range address, but just before the CPU
-	   delivers a GP, it decides to deliver an interrupt
-	   instead.  Unlikely?  Definitely.  Easy to avoid?
-	   Yes. (Some virtual environments get this wrong.) */
+	/*
+	   Paranoia: Make sure we're really coming from userspace.
+	   One could imagine a case where userspace jumps into the
+	   critical range address, but just before the CPU delivers a GP,
+	   it decides to deliver an interrupt instead.  Unlikely?
+	   Definitely.  Easy to avoid?  Yes.  The Intel documents
+	   explicitly say that the reported EIP for a bad jump is the
+	   jump instruction itself, not the destination, but some virtual
+	   environments get this wrong.
+	 */
 	movl PT_CS+4(%esp), %ecx
 	andl $SEGMENT_RPL_MASK, %ecx
 	cmpl $USER_RPL, %ecx
_
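
For readers following the offset change in xen_iret_direct, here is a
minimal C sketch (not part of the patch) of two checks the assembly
performs: the IF test on the saved eflags, and the RPL test on the saved
cs in xen_iret_crit_fixup.  The stack is modelled as a byte array, and
the helper names (put_le32, build_frame, if_set_after_push,
if_set_stale_offset, from_user) are hypothetical.  The constant values
are what I believe the i386 headers define; treat them as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, mirroring the i386 kernel headers. */
#define X86_EFLAGS_IF    0x00000200u /* interrupt-enable flag, bit 9 */
#define SEGMENT_RPL_MASK 0x3u        /* low two bits of a selector */
#define USER_RPL         0x3u        /* ring 3 */
#define ESP_OFFSET       4           /* bytes pushed by "push %eax" */

/* Store a 32-bit value little-endian, as x86 does. */
static void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Model the stack after "push %eax":
 *   esp-> 0: eax   4: eip   8: cs   12: eflags
 */
static void build_frame(uint8_t *esp, uint32_t eax, uint32_t eip,
			uint32_t cs, uint32_t eflags)
{
	assert(esp);
	put_le32(esp + 0, eax);
	put_le32(esp + 4, eip);
	put_le32(esp + 8, cs);
	put_le32(esp + 12, eflags);
}

/*
 * Mirrors: testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
 * IF is bit 9, so it is bit 1 of the second byte of eflags.
 */
static int if_set_after_push(const uint8_t *esp)
{
	assert(esp);
	return (esp[8 + 1 + ESP_OFFSET] & (X86_EFLAGS_IF >> 8)) != 0;
}

/*
 * What 8+1(%esp) would read once eax has been pushed: the second
 * byte of cs, not of eflags.
 */
static int if_set_stale_offset(const uint8_t *esp)
{
	assert(esp);
	return (esp[8 + 1] & (X86_EFLAGS_IF >> 8)) != 0;
}

/* Mirrors: andl $SEGMENT_RPL_MASK, %ecx ; cmpl $USER_RPL, %ecx */
static int from_user(uint32_t cs)
{
	return (cs & SEGMENT_RPL_MASK) == USER_RPL;
}
```

With a frame built from an illustrative user selector such as 0x73 and
eflags of 0x202, if_set_after_push() reports IF as set, while
if_set_stale_offset() inspects the saved cs and does not.  That is why
the test needs the extra ESP_OFFSET once it sits after the push.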

Patches currently in -mm which might be from jeremy@xxxxxxxx are

git-kbuild.patch
add-kstrndup-fix.patch
xen-build-fix.patch
fix-x86_64-mm-xen-xen-smp-guest-support.patch
more-fix-x86_64-mm-xen-xen-smp-guest-support.patch
fix-x86_64-mm-xen-add-xen-virtual-block-device-driver.patch
fix-x86_64-mm-add-common-orderly_poweroff.patch
tidy-up-usermode-helper-waiting-a-bit-fix.patch
update-x86_64-mm-xen-use-iret-directly-where-possible.patch
x86-use-elfnoteh-to-generate-vsyscall-notes-fix.patch
paravirt-helper-to-disable-all-io-space-fix-2.patch
paravirt-helper-to-disable-all-io-space-fix-3.patch
maps2-uninline-some-functions-in-the-page-walker.patch
maps2-eliminate-the-pmd_walker-struct-in-the-page-walker.patch
maps2-remove-vma-from-args-in-the-page-walker.patch
maps2-propagate-errors-from-callback-in-page-walker.patch
maps2-add-callbacks-for-each-level-to-page-walker.patch
maps2-move-the-page-walker-code-to-lib.patch
maps2-simplify-interdependence-of-proc-pid-maps-and-smaps.patch
maps2-move-clear_refs-code-to-task_mmuc.patch
maps2-regroup-task_mmu-by-interface.patch
maps2-make-proc-pid-smaps-optional-under-config_embedded.patch
maps2-make-proc-pid-clear_refs-option-under-config_embedded.patch
maps2-add-proc-pid-pagemap-interface.patch
maps2-add-proc-kpagemap-interface.patch
add-argv_split-fix.patch
add-common-orderly_poweroff-fix.patch
lguest-the-guest-code.patch
