- pi-futex-futex_lock_pi-futex_unlock_pi-support-fix.patch removed from -mm tree

The patch titled

     Fix for Bug in PI exit code

has been removed from the -mm tree.  Its filename is

     pi-futex-futex_lock_pi-futex_unlock_pi-support-fix.patch

This patch was dropped because it was folded into pi-futex-futex_lock_pi-futex_unlock_pi-support.patch

------------------------------------------------------
Subject: Fix for Bug in PI exit code
From: Dinakar Guniguntala <dino@xxxxxxxxxx>


We were frequently seeing oopses like the ones below when using PI mutexes:

===============================================================================
j9/3939[CPU#1]: BUG in free_pi_state at kernel/futex.c:361
 [<c011dd92>] __WARN_ON+0x41/0x57 (8)
 [<c0130e41>] free_pi_state+0x2d/0xb1 (48)
 [<c0131a31>] unqueue_me_pi+0x5c/0x75 (12)
 [<c0132236>] futex_lock_pi+0x50a/0x5b5 (16)
 [<c014bab0>] do_wp_page+0x325/0x340 (36)
 [<c03121a3>] do_page_fault+0x202/0x528 (156)
 [<c01328d9>] do_futex+0x85/0x8b (20)
 [<c0132966>] sys_futex+0x87/0x94 (24)
 [<c01027c7>] sysenter_past_esp+0x54/0x75 (40)
j9/3939[CPU#2]: BUG in free_pi_state at kernel/futex.c:362
 [<c011dd92>] __WARN_ON+0x41/0x57 (8)
 [<c0130e5e>] free_pi_state+0x4a/0xb1 (48)
 [<c0131a31>] unqueue_me_pi+0x5c/0x75 (12)
 [<c0132236>] futex_lock_pi+0x50a/0x5b5 (16)
 [<c014bab0>] do_wp_page+0x325/0x340 (36)
 [<c03121a3>] do_page_fault+0x202/0x528 (156)
 [<c01328d9>] do_futex+0x85/0x8b (20)
 [<c0132966>] sys_futex+0x87/0x94 (24)
 [<c01027c7>] sysenter_past_esp+0x54/0x75 (40)
BUG: Unable to handle kernel NULL pointer dereference at virtual address 00000514
 printing eip:
c0311096
*pde = 2cd36001
Oops: 0002 [#1]
PREEMPT SMP
Modules linked in: loop ipv6 i2c_dev i2c_core nfs lockd sunrpc dm_mirror
dm_multipath dm_mod joydev button battery ac ohci_hcd hw_random shpchp tg3 ext3
jbd sd_mod
CPU:    2
EIP:    0060:[<c0311096>]    Not tainted VLI
EFLAGS: 00010002   (2.6.16-rayrt12.1.1smp #1)
EIP is at _raw_spin_lock_irq+0xb/0x1a
eax: 00000514   ebx: d61349c0   ecx: c038511c   edx: ed040000
esi: d61349c8   edi: c0475d30   ebp: 08054638   esp: ed041e88
ds: 007b   es: 007b   ss: 0068   preempt: 00000002
Process j9 (pid: 3939, threadinfo=ed040000 task=d0951300 stack_left=7764
worst_left=-1)
Stack: <0>c0130e6b ed041f08 ed041f0c c0131a31 ed041f08 ed040000 fffffffc c0132236
       c80c70b0 c0475d30 00000001 00000f67 c0475d30 00000000 00000000 d0951300
       c014bab0 49e7e067 00000000 49e7e025 00000000 ffa20448 ffa52000 ffa51000
Call Trace:
 [<c0130e6b>] free_pi_state+0x57/0xb1 (4)
 [<c0131a31>] unqueue_me_pi+0x5c/0x75 (12)
 [<c0132236>] futex_lock_pi+0x50a/0x5b5 (16)
 [<c014bab0>] do_wp_page+0x325/0x340 (36)
 [<c03121a3>] do_page_fault+0x202/0x528 (156)
 [<c01328d9>] do_futex+0x85/0x8b (20)
 [<c0132966>] sys_futex+0x87/0x94 (24)

===============================================================================
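
A note on reading the trace: the faulting address (00000514) equals eax at the
point where _raw_spin_lock_irq faults, which is consistent with
free_pi_state() computing &pi_state->owner->pi_lock while owner is NULL, so
the "address" is just the field offset. The sketch below illustrates that
arithmetic in userspace. struct model_task and lock_address() are
hypothetical, and the 0x514 padding is an assumption chosen to match this
oops, not the real task_struct layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for task_struct; the padding size is an
 * assumption picked so that pi_lock lands at the offset seen in the oops. */
struct model_task {
	char other_fields[0x514];
	int pi_lock;
};

/* Check the assumed layout at compile time. */
static_assert(offsetof(struct model_task, pi_lock) == 0x514,
	      "pi_lock is expected at offset 0x514 in this model");

/* Address of the pi_lock field for a task located at owner_addr.
 * With owner_addr == 0 (a NULL owner) the result is the bare offset. */
static uintptr_t lock_address(uintptr_t owner_addr)
{
	return owner_addr + offsetof(struct model_task, pi_lock);
}
```

Calling lock_address(0) models the NULL owner case and yields the field
offset itself, which is the kind of small, non-zero faulting address shown
in the oops.
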
After a lot of debugging we found that this is caused by the following race.
PM is a PI mutex; A and B are RT threads.

        Thread A (RT)                  Thread B (RT)
            |
            v
    pthread_mutex_lock (PM)                 |
    (glibc) got mutex                       v
         do work                   pthread_mutex_lock (PM)
                                   rt_mutex_timed_lock

          EINTR                    EINTR (Process gets aborted)

         do_exit                   lock(pi_mutex->lock->wait_lock)
    exit_pi_state_list             clear_waiters
    lock(hb->lock)
    pi_state->owner = NULL         unlock(pi_mutex->lock->wait_lock)
    rt_mutex_unlock(pi_mutex)      lock(hb->lock) (blocks)
    unlock(hb->lock)               unblock -> free_pi_state
    continue exit processing       doesn't expect pi_state->owner to be NULL
                                   Panic
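
The ordering above can be modelled in userspace as a rough sketch. All names
here (model_task, model_pi_state, model_exit_pi_state, model_free_pi_state)
are hypothetical stand-ins, not kernel APIs, and the locking is reduced to
plain fields. It only shows why the final reference drop has to tolerate an
owner that the exiting thread has already cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the owning task. */
struct model_task {
	int pi_lock_held;	/* models owner->pi_lock */
	int pi_state_count;	/* models the owner's pi_state list length */
};

/* Hypothetical stand-in for struct futex_pi_state. */
struct model_pi_state {
	int refcount;
	struct model_task *owner;	/* NULL once a dying owner let go */
	bool mutex_released;		/* models the rt_mutex being unlocked */
};

/* Thread A's side: the exiting owner detaches the pi_state, clears
 * owner, and releases the mutex. */
static void model_exit_pi_state(struct model_pi_state *ps)
{
	assert(ps != NULL);
	if (ps->owner) {
		ps->owner->pi_state_count--;
		ps->owner = NULL;
	}
	ps->mutex_released = true;
}

/* Thread B's side: drop a reference. Returns true when the last
 * reference went away. The owner check is the guard under discussion;
 * without it, the NULL owner would be dereferenced here. */
static bool model_free_pi_state(struct model_pi_state *ps)
{
	assert(ps != NULL);
	if (--ps->refcount != 0)
		return false;

	if (ps->owner) {
		ps->owner->pi_lock_held = 1;
		ps->owner->pi_state_count--;
		ps->owner->pi_lock_held = 0;
		ps->mutex_released = true;
	}
	return true;
}
```

Running model_exit_pi_state() and then model_free_pi_state() on the same
object reproduces the sequence in the diagram: the second call sees a NULL
owner and simply skips the owner-side cleanup.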

The patch attached below seems to make this problem go away. It has been
stress tested quite a bit over the past 24 hours.
Does it look sane to you?

Signed-off-by: Dinakar Guniguntala <dino@xxxxxxxxxx>
Acked-by: Ingo Molnar <mingo@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 kernel/futex.c |   17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff -puN kernel/futex.c~pi-futex-futex_lock_pi-futex_unlock_pi-support-fix kernel/futex.c
--- devel/kernel/futex.c~pi-futex-futex_lock_pi-futex_unlock_pi-support-fix	2006-06-10 01:32:26.000000000 -0700
+++ devel-akpm/kernel/futex.c	2006-06-10 01:32:26.000000000 -0700
@@ -355,14 +355,17 @@ static void free_pi_state(struct futex_p
 	if (!atomic_dec_and_test(&pi_state->refcount))
 		return;
 
-	WARN_ON(!pi_state->owner);
-	WARN_ON(!rt_mutex_is_locked(&pi_state->pi_mutex));
-
-	spin_lock_irq(&pi_state->owner->pi_lock);
-	list_del_init(&pi_state->list);
-	spin_unlock_irq(&pi_state->owner->pi_lock);
+	/*
+	 * If pi_state->owner is NULL, the owner is most probably dying
+	 * and has cleaned up the pi_state already
+	 */
+	if (pi_state->owner) {
+		spin_lock_irq(&pi_state->owner->pi_lock);
+		list_del_init(&pi_state->list);
+		spin_unlock_irq(&pi_state->owner->pi_lock);
 
-	rt_mutex_proxy_unlock(&pi_state->pi_mutex, pi_state->owner);
+		rt_mutex_proxy_unlock(&pi_state->pi_mutex, pi_state->owner);
+	}
 
 	if (current->pi_state_cache)
 		kfree(pi_state);
_

Patches currently in -mm which might be from dino@xxxxxxxxxx are

pi-futex-futex_lock_pi-futex_unlock_pi-support.patch
pi-futex-futex_lock_pi-futex_unlock_pi-support-fix.patch
