[patch -rt] Fix infinite loop with 2.6.31.4-rt14

Hi Thomas,

I see an application hang in 2.6.31.4-rt14 when running some Java tests.

The kernel seems to be continuously looping in 
       futex_wait_requeue_pi -> futex_wait_setup ->
       ret -EAGAIN -> goto retry -> futex_wait_setup -> on and on

===============================================================================

    java-5544  [001] 79682.800631: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800631: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800632: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800632: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800632: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800632: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800633: rt_spin_lock_fastunlock <-rt_spin_unlock
    java-5544  [001] 79682.800633: drop_futex_key_refs <-queue_unlock
    java-5544  [001] 79682.800633: put_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800633: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800633: put_futex_key <-do_futex
    java-5544  [001] 79682.800634: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800634: get_futex_key <-do_futex
    java-5544  [001] 79682.800634: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800634: futex_wait_setup <-do_futex
    java-5544  [001] 79682.800635: get_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800635: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800635: queue_lock <-futex_wait_setup
    java-5544  [001] 79682.800635: get_futex_key_refs <-queue_lock
    java-5544  [001] 79682.800635: hash_futex <-queue_lock
    java-5544  [001] 79682.800636: rt_spin_lock <-queue_lock
    java-5544  [001] 79682.800636: rt_spin_lock_fastlock <-rt_spin_lock
    java-5544  [001] 79682.800636: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800636: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800637: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800637: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800637: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800637: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800637: rt_spin_lock_fastunlock <-rt_spin_unlock
    java-5544  [001] 79682.800638: drop_futex_key_refs <-queue_unlock
    java-5544  [001] 79682.800638: put_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800638: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800638: put_futex_key <-do_futex
    java-5544  [001] 79682.800639: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800639: get_futex_key <-do_futex
    java-5544  [001] 79682.800639: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800639: futex_wait_setup <-do_futex
    java-5544  [001] 79682.800639: get_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800640: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800640: queue_lock <-futex_wait_setup
    java-5544  [001] 79682.800640: get_futex_key_refs <-queue_lock
    java-5544  [001] 79682.800640: hash_futex <-queue_lock
    java-5544  [001] 79682.800640: rt_spin_lock <-queue_lock
    java-5544  [001] 79682.800641: rt_spin_lock_fastlock <-rt_spin_lock
    java-5544  [001] 79682.800641: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800641: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800641: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800642: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800642: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800642: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800642: rt_spin_lock_fastunlock <-rt_spin_unlock


===============================================================================

This looks to be caused by the following earlier patch:
      -> http://patchwork.kernel.org/patch/53483/

I'm not sure if this is the best way to go here, but the patch below seems to
resolve the problem for me.
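
For readers following the control flow, here is a small user-space model of
the two retry placements. It is only a sketch under assumptions, not kernel
code: wait_setup_model() is a hypothetical stand-in for futex_wait_setup(),
assumed to return -EWOULDBLOCK when the futex word no longer holds the
expected value, and EWOULDBLOCK is assumed to be an alias of EAGAIN, as it is
on Linux. Under those assumptions, a retry test on the common exit path also
catches the setup failure, which would match the loop in the trace above.

```c
#include <assert.h>	/* only needed for the checks in the note below */
#include <errno.h>

/* Hypothetical stand-in for futex_wait_setup(): fails when the futex
 * word does not hold the expected value (assumed behaviour). */
static int wait_setup_model(int uval, int expected)
{
	return (uval != expected) ? -EWOULDBLOCK : 0;
}

/* Retry test on the common exit path: every -EAGAIN retries, including
 * the one that comes from setup. Returns the number of passes made,
 * capped at max_passes so the model terminates. */
static int passes_with_retry_on_exit(int uval, int expected, int max_passes)
{
	int passes = 0;
	int ret;

retry:
	if (passes == max_passes)
		return passes;	/* cap hit: uncapped code would keep looping */
	passes++;
	ret = wait_setup_model(uval, expected);
	if (ret)
		goto out;
	/* ... the wait and the wakeup handling would sit here ... */
out:
	if (ret == -EAGAIN)
		goto retry;
	return passes;
}

/* Retry test placed right after the wakeup check: only a spurious
 * wakeup retries, a setup failure goes straight to the exit path. */
static int passes_with_retry_after_wakeup(int uval, int expected,
					  int spurious_wakeups, int max_passes)
{
	int passes = 0;
	int ret;

retry:
	if (passes == max_passes)
		return passes;
	passes++;
	ret = wait_setup_model(uval, expected);
	if (ret)
		goto out;
	/* Model of the early-wakeup check: consume one spurious wakeup. */
	if (spurious_wakeups > 0) {
		spurious_wakeups--;
		goto retry;
	}
out:
	return passes;
}
```

With a mismatching value, the first variant runs until the cap, while the
second returns after a single pass and still retries once per spurious wakeup.
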

If this is fine, I'll send a separate patch for mainline. Mainline currently
seems to be missing the earlier patch referenced above as well.

Signed-off-by: Dinakar Guniguntala <dino@xxxxxxxxxx>

	-Dinakar

---
 kernel/futex.c |   84 +++++++++++++++++++++------------------------------------
 1 file changed, 32 insertions(+), 52 deletions(-)

Index: linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
===================================================================
--- linux-2.6.31.4-rt14-lbf-f1.orig/kernel/futex.c
+++ linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
@@ -2048,54 +2048,6 @@ pi_faulted:
 }
 
 /**
- * handle_early_requeue_pi_wakeup() - Detect early wakeup on the initial futex
- * @hb:		the hash_bucket futex_q was original enqueued on
- * @q:		the futex_q woken while waiting to be requeued
- * @key2:	the futex_key of the requeue target futex
- * @timeout:	the timeout associated with the wait (NULL if none)
- *
- * Detect if the task was woken on the initial futex as opposed to the requeue
- * target futex.  If so, determine if it was a timeout or a signal that caused
- * the wakeup and return the appropriate error code to the caller.  Must be
- * called with the hb lock held.
- *
- * Returns
- *  0 - no early wakeup detected
- * <0 - -ETIMEDOUT or -ERESTARTNOINTR
- */
-static inline
-int handle_early_requeue_pi_wakeup(struct futex_hash_bucket *hb,
-				   struct futex_q *q, union futex_key *key2,
-				   struct hrtimer_sleeper *timeout)
-{
-	int ret = 0;
-
-	/*
-	 * With the hb lock held, we avoid races while we process the wakeup.
-	 * We only need to hold hb (and not hb2) to ensure atomicity as the
-	 * wakeup code can't change q.key from uaddr to uaddr2 if we hold hb.
-	 * It can't be requeued from uaddr2 to something else since we don't
-	 * support a PI aware source futex for requeue.
-	 */
-	if (!match_futex(&q->key, key2)) {
-		WARN_ON(q->lock_ptr && (&hb->lock != q->lock_ptr));
-		/*
-		 * We were woken prior to requeue by a timeout or a signal.
-		 * Unqueue the futex_q and determine which it was.
-		 */
-		plist_del(&q->list, &q->list.plist);
-
-		/* Handle spurious wakeups gracefully */
-		ret = -EAGAIN;
-		if (timeout && !timeout->task)
-			ret = -ETIMEDOUT;
-		else if (signal_pending(current))
-			ret = -ERESTARTNOINTR;
-	}
-	return ret;
-}
-
-/**
  * futex_wait_requeue_pi() - Wait on uaddr and take uaddr2
  * @uaddr:	the futex we initialyl wait on (non-pi)
  * @fshared:	whether the futexes are shared (1) or not (0).  They must be
@@ -2186,8 +2138,39 @@ retry:
 	futex_wait_queue_me(hb, &q, to);
 
 	spin_lock(&hb->lock);
-	ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
+	/*
+	 * Detect if the task was woken on the initial futex as opposed to the requeue
+	 * target futex.  If so, determine if it was a timeout or a signal that caused
+	 * the wakeup and return the appropriate error code to the caller.  Must be
+	 * called with the hb lock held.
+	 * With the hb lock held, we avoid races while we process the wakeup.
+	 * We only need to hold hb (and not hb2) to ensure atomicity as the
+	 * wakeup code can't change q.key from uaddr to uaddr2 if we hold hb.
+	 * It can't be requeued from uaddr2 to something else since we don't
+	 * support a PI aware source futex for requeue.
+	 */
+	if (!match_futex(&q.key, &key2)) {
+		WARN_ON(q.lock_ptr && (&hb->lock != q.lock_ptr));
+		/*
+		 * We were woken prior to requeue by a timeout or a signal.
+		 * Unqueue the futex_q and determine which it was.
+		 */
+		plist_del(&q.list, &q.list.plist);
+
+		/* Handle spurious wakeups gracefully */
+		ret = -EAGAIN;
+		if (to && !to->task)
+			ret = -ETIMEDOUT;
+		else if (signal_pending(current))
+			ret = -ERESTARTNOINTR;
+	}
 	spin_unlock(&hb->lock);
+	if (ret == -EAGAIN) {
+		/* Retry on spurious wakeup */
+		put_futex_key(fshared, &q.key);
+		put_futex_key(fshared, &key2);
+		goto retry;
+	}
 	if (ret)
 		goto out_put_keys;
 
@@ -2264,9 +2247,6 @@ out_put_keys:
 out_key2:
 	put_futex_key(fshared, &key2);
 
-	/* Spurious wakeup ? */
-	if (ret == -EAGAIN)
-		goto retry;
 out:
 	if (to) {
 		hrtimer_cancel(&to->timer);