Re: [PATCH RT] kernel/futex: don't deboost too early

Steven Rostedt <rostedt@xxxxxxxxxxx> · Fri, 30 Sep 2016 12:00:38 -0400

On Fri, 30 Sep 2016 10:39:14 +0200
Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:

> The sequence:
>  T1 holds futex
>  T2 blocks on futex and boosts T1
>  T1 unlocks futex and holds hb->lock
>  T1 unlocks rt mutex, so T1 has no more pi waiters
>  T3 blocks on hb->lock and adds itself to the pi waiters list of T1
>  T1 unlocks hb->lock and deboosts itself
>  T4 preempts T1 so the wakeup of T2 gets delayed
> 
> As a workaround I attempt here to unlock the hb->lock without a deboost
> and perform the deboost after the wake up of the waiter.
> 
> Cc: stable-rt@xxxxxxxxxxxxxxx
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> ---
>  include/linux/spinlock.h    |  6 +++++
>  include/linux/spinlock_rt.h |  2 ++
>  kernel/futex.c              |  2 +-
>  kernel/locking/rtmutex.c    | 53 +++++++++++++++++++++++++++++++++++++++------
>  4 files changed, 55 insertions(+), 8 deletions(-)
> 

This looks awfully complex. Would something as simple as this work?

What harm can come from holding hb->lock until after the wakeups on RT?
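
To make the ordering question concrete, here is a rough userspace toy
model, not kernel code. It assumes T3 is already blocked on hb->lock and
boosting T1, and that dropping hb->lock removes that boost. The struct,
the function names and the priority values are all hypothetical; the only
point is to show at which priority T1 reaches the wakeup of T2 in each
ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical priority values; higher means more urgent in this toy model. */
enum { PRIO_NORMAL = 10, PRIO_BOOSTED = 90 };

/* Toy stand-in for a task; this is not the kernel's struct task_struct. */
struct toy_task {
	int normal_prio;	/* priority without any PI boost */
	int boost_prio;		/* priority donated by the top waiter */
	bool boosted;		/* true while a waiter blocks on a lock we hold */
};

static int effective_prio(const struct toy_task *t)
{
	return t->boosted ? t->boost_prio : t->normal_prio;
}

/*
 * Return the priority T1 runs at when it reaches the wakeup of T2.
 * Assumption: T3 is blocked on hb->lock and boosts T1 until T1 drops it.
 */
static int prio_at_wakeup(bool unlock_before_wake)
{
	struct toy_task t1 = { PRIO_NORMAL, PRIO_BOOSTED, true };

	/* T1 starts out boosted by T3 in both orderings. */
	assert(effective_prio(&t1) == PRIO_BOOSTED);

	if (unlock_before_wake)
		t1.boosted = false;	/* dropping hb->lock removes T3's boost */

	/* The wake_up_q() step would run here, at this priority. */
	return effective_prio(&t1);
}
```

In this model prio_at_wakeup(true) yields PRIO_NORMAL, which is the window
in which T4 can preempt T1 and delay the wakeup of T2, while
prio_at_wakeup(false) yields PRIO_BOOSTED. It says nothing about what
else holding hb->lock across the wakeups might cost.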

-- Steve

diff --git a/kernel/futex.c b/kernel/futex.c
index 2d572ed..bb900bd 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1347,9 +1347,14 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this,
 	 * deboost first (and lose our higher priority), then the task might get
 	 * scheduled away before the wake up can take place.
 	 */
+#ifndef CONFIG_PREEMPT_RT_FULL
 	spin_unlock(&hb->lock);
+#endif
 	wake_up_q(&wake_q);
 	wake_up_q_sleeper(&wake_sleeper_q);
+#ifdef CONFIG_PREEMPT_RT_FULL
+	spin_unlock(&hb->lock);
+#endif
 	if (deboost)
 		rt_mutex_adjust_prio(current);
 