On Tue, May 18, 2021 at 03:20:24AM +0200, Ben Hutchings wrote: > The recent merges of futex changes from 4.9-stable into the 4.9-rt > tree effectively reverted: Hi Ben! > * The deletion of calls to rt_mutex_futex_unlock() from futex_lock_pi() > and futex_wait_requeue_pi() by commit b960d9ae7f76 "futex: Handle > faults correctly for PI futexes". > > * The deletion of uninitialized_var() by commit 48ab8e8e4059 "futex: > Simplify fixup_pi_state_owner()". > > * Commit c59b46c53fa1 "rtmutex: Handle non enqueued waiters gracefully". > > Restore those changes. > > Also resolve some other cosmetic differences from the 4.9-stable > version of futex.c and rtmutex_common.h due to slightly different > backports. Thanks for reviewing the patches and devoting time to fix what you found. While merging the second set of futex changes from v4.9, the two code changes from futex_lock_pi() and futex_wait_requeue_pi() were among the code that I was inspecting because the kernel was not surviving 30 minutes of pi_stress. I will take your patch to a spin but, yes, your changes make sense. Luis > Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx> > --- > I started by interactively rebasing v4.9.268-rt179-rebase onto > v4.9.268 and completely dropping futex and rtmutex commits > corresponding to commits that were already backported into the latter. > That rebased branch can be found as v4.9-rt-rebase in: > > https://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-stable-rt.git > > This patch contains the differences from v4.9.268-rt179 to that new > branch. I think all the code changes make sense, but would appreciate > careful review. > > Ben. > > kernel/futex.c | 39 ++++++++++++--------------------- > kernel/locking/rtmutex.c | 3 +-- > kernel/locking/rtmutex_common.h | 1 - > 3 files changed, 15 insertions(+), 28 deletions(-) > > diff --git a/kernel/futex.c b/kernel/futex.c > index 93f2fb5b21b2..7679831ed809 100644 > --- a/kernel/futex.c > +++ b/kernel/futex.c > @@ -2465,9 +2465,9 @@ static void unqueue_me_pi(struct futex_q *q) > static int __fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q, > struct task_struct *argowner) > { > - u32 uval, uninitialized_var(curval), newval, newtid; > struct futex_pi_state *pi_state = q->pi_state; > struct task_struct *oldowner, *newowner; > + u32 uval, curval, newval, newtid; > int err = 0; > > oldowner = pi_state->owner; > @@ -3005,9 +3005,10 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, > * and BUG when futex_unlock_pi() interleaves with this. > * > * Therefore acquire wait_lock while holding hb->lock, but drop the > - * latter before calling rt_mutex_start_proxy_lock(). This still fully > - * serializes against futex_unlock_pi() as that does the exact same > - * lock handoff sequence. > + * latter before calling __rt_mutex_start_proxy_lock(). This > + * interleaves with futex_unlock_pi() -- which does a similar lock > + * handoff -- such that the latter can observe the futex_q::pi_state > + * before __rt_mutex_start_proxy_lock() is done. > */ > raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock); > /* > @@ -3019,6 +3020,11 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, > migrate_disable(); > > spin_unlock(q.lock_ptr); > + /* > + * __rt_mutex_start_proxy_lock() unconditionally enqueues the @rt_waiter > + * such that futex_unlock_pi() is guaranteed to observe the waiter when > + * it sees the futex_q::pi_state. > + */ > ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current); > raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock); > migrate_enable(); > @@ -3037,10 +3043,10 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, > cleanup: > spin_lock(q.lock_ptr); > /* > - * If we failed to acquire the lock (signal/timeout), we must > + * If we failed to acquire the lock (deadlock/signal/timeout), we must > * first acquire the hb->lock before removing the lock from the > - * rt_mutex waitqueue, such that we can keep the hb and rt_mutex > - * wait lists consistent. > + * rt_mutex waitqueue, such that we can keep the hb and rt_mutex wait > + * lists consistent. > * > * In particular; it is important that futex_unlock_pi() can not > * observe this inconsistency. > @@ -3061,13 +3067,6 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, > if (res) > ret = (res < 0) ? res : 0; > > - /* > - * If fixup_owner() faulted and was unable to handle the fault, unlock > - * it and return the fault to userspace. > - */ > - if (ret && (rt_mutex_owner(&q.pi_state->pi_mutex) == current)) > - rt_mutex_futex_unlock(&q.pi_state->pi_mutex); > - > /* Unqueue and drop the lock */ > unqueue_me_pi(&q); > > @@ -3170,7 +3169,7 @@ static int futex_unlock_pi(u32 __user *uaddr, unsigned int flags) > migrate_disable(); > spin_unlock(&hb->lock); > > - /* Drops pi_state->pi_mutex.wait_lock */ > + /* drops pi_state->pi_mutex.wait_lock */ > ret = wake_futex_pi(uaddr, uval, pi_state); > > migrate_enable(); > @@ -3460,8 +3459,6 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, > spin_lock(&hb2->lock); > BUG_ON(&hb2->lock != q.lock_ptr); > ret = fixup_pi_state_owner(uaddr2, &q, current); > - if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current) > - rt_mutex_futex_unlock(&q.pi_state->pi_mutex); > /* > * Drop the reference to the pi state which > * the requeue_pi() code acquired for us. > @@ -3504,14 +3501,6 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, > if (res) > ret = (res < 0) ? res : 0; > > - /* > - * If fixup_pi_state_owner() faulted and was unable to handle > - * the fault, unlock the rt_mutex and return the fault to > - * userspace. > - */ > - if (ret && rt_mutex_owner(pi_mutex) == current) > - rt_mutex_futex_unlock(pi_mutex); > - > /* Unqueue and drop the lock. */ > unqueue_me_pi(&q); > } > diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c > index 9816892558b8..a7f971a60191 100644 > --- a/kernel/locking/rtmutex.c > +++ b/kernel/locking/rtmutex.c > @@ -2397,7 +2397,7 @@ int rt_mutex_start_proxy_lock(struct rt_mutex *lock, > > raw_spin_lock_irq(&lock->wait_lock); > ret = __rt_mutex_start_proxy_lock(lock, waiter, task); > - if (unlikely(ret)) > + if (ret && rt_mutex_has_waiters(lock)) > remove_waiter(lock, waiter); > raw_spin_unlock_irq(&lock->wait_lock); > > @@ -2526,7 +2526,6 @@ bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock, > remove_waiter(lock, waiter); > cleanup = true; > } > - > /* > * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might > * have to fix that up. > diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h > index 750bad6849e2..98debc11953f 100644 > --- a/kernel/locking/rtmutex_common.h > +++ b/kernel/locking/rtmutex_common.h > @@ -119,7 +119,6 @@ extern int rt_mutex_wait_proxy_lock(struct rt_mutex *lock, > struct rt_mutex_waiter *waiter); > extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock, > struct rt_mutex_waiter *waiter); > - > extern int rt_mutex_futex_trylock(struct rt_mutex *l); > extern int __rt_mutex_futex_trylock(struct rt_mutex *l); > ---end quoted text---
Attachment:
signature.asc
Description: PGP signature