On Tue, May 18, 2021 at 09:44:12AM -0300, Luis Claudio R. Goncalves wrote: > On Tue, May 18, 2021 at 03:20:24AM +0200, Ben Hutchings wrote: > > The recent merges of futex changes from 4.9-stable into the 4.9-rt > > tree effectively reverted: > > Hi Ben! > > > * The deletion of calls to rt_mutex_futex_unlock() from futex_lock_pi() > > and futex_wait_requeue_pi() by commit b960d9ae7f76 "futex: Handle > > faults correctly for PI futexes". > > > > * The deletion of uninitialized_var() by commit 48ab8e8e4059 "futex: > > Simplify fixup_pi_state_owner()". > > > > * Commit c59b46c53fa1 "rtmutex: Handle non enqueued waiters gracefully". > > > > Restore those changes. > > > > Also resolve some other cosmetic differences from the 4.9-stable > > version of futex.c and rtmutex_common.h due to slightly different > > backports. > > Thanks for reviewing the patches and devoting time to fix what you found. > > While merging the second set of futex changes from v4.9, the two code changes > from futex_lock_pi() and futex_wait_requeue_pi() were among the code that > I was inspecting because the kernel was not surviving 30 minutes of pi_stress. > I will take your patch to a spin but, yes, your changes make sense. Ben, just to confirm, you patch makes sense to me and passes my tests with flying colors. Anybody else has comments? Tomorrow morning I will send an RFC to stable-rt and linux-rt users with your patch and a ptrace fix I would like to backport to v4.9-rt. I intend to backport commit 7103517ca97f ("ptrace: fix ptrace_unfreeze_traced() race with rt-lock") there too, as I am seeing the ptrace_check_attach backtraces when running stress_ng. Thanks again! Luis > > Luis > > > Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx> > > --- > > I started by interactively rebasing v4.9.268-rt179-rebase onto > > v4.9.268 and completely dropping futex and rtmutex commits > > corresponding to commits that were already backported into the latter. > > That rebased branch can be found as v4.9-rt-rebase in: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-stable-rt.git > > > > This patch contains the differences from v4.9.268-rt179 to that new > > branch. I think all the code changes make sense, but would appreciate > > careful review. > > > > Ben. > > > > kernel/futex.c | 39 ++++++++++++--------------------- > > kernel/locking/rtmutex.c | 3 +-- > > kernel/locking/rtmutex_common.h | 1 - > > 3 files changed, 15 insertions(+), 28 deletions(-) > > > > diff --git a/kernel/futex.c b/kernel/futex.c > > index 93f2fb5b21b2..7679831ed809 100644 > > --- a/kernel/futex.c > > +++ b/kernel/futex.c > > @@ -2465,9 +2465,9 @@ static void unqueue_me_pi(struct futex_q *q) > > static int __fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q, > > struct task_struct *argowner) > > { > > - u32 uval, uninitialized_var(curval), newval, newtid; > > struct futex_pi_state *pi_state = q->pi_state; > > struct task_struct *oldowner, *newowner; > > + u32 uval, curval, newval, newtid; > > int err = 0; > > > > oldowner = pi_state->owner; > > @@ -3005,9 +3005,10 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, > > * and BUG when futex_unlock_pi() interleaves with this. > > * > > * Therefore acquire wait_lock while holding hb->lock, but drop the > > - * latter before calling rt_mutex_start_proxy_lock(). This still fully > > - * serializes against futex_unlock_pi() as that does the exact same > > - * lock handoff sequence. > > + * latter before calling __rt_mutex_start_proxy_lock(). This > > + * interleaves with futex_unlock_pi() -- which does a similar lock > > + * handoff -- such that the latter can observe the futex_q::pi_state > > + * before __rt_mutex_start_proxy_lock() is done. > > */ > > raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock); > > /* > > @@ -3019,6 +3020,11 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, > > migrate_disable(); > > > > spin_unlock(q.lock_ptr); > > + /* > > + * __rt_mutex_start_proxy_lock() unconditionally enqueues the @rt_waiter > > + * such that futex_unlock_pi() is guaranteed to observe the waiter when > > + * it sees the futex_q::pi_state. > > + */ > > ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current); > > raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock); > > migrate_enable(); > > @@ -3037,10 +3043,10 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, > > cleanup: > > spin_lock(q.lock_ptr); > > /* > > - * If we failed to acquire the lock (signal/timeout), we must > > + * If we failed to acquire the lock (deadlock/signal/timeout), we must > > * first acquire the hb->lock before removing the lock from the > > - * rt_mutex waitqueue, such that we can keep the hb and rt_mutex > > - * wait lists consistent. > > + * rt_mutex waitqueue, such that we can keep the hb and rt_mutex wait > > + * lists consistent. > > * > > * In particular; it is important that futex_unlock_pi() can not > > * observe this inconsistency. > > @@ -3061,13 +3067,6 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, > > if (res) > > ret = (res < 0) ? res : 0; > > > > - /* > > - * If fixup_owner() faulted and was unable to handle the fault, unlock > > - * it and return the fault to userspace. > > - */ > > - if (ret && (rt_mutex_owner(&q.pi_state->pi_mutex) == current)) > > - rt_mutex_futex_unlock(&q.pi_state->pi_mutex); > > - > > /* Unqueue and drop the lock */ > > unqueue_me_pi(&q); > > > > @@ -3170,7 +3169,7 @@ static int futex_unlock_pi(u32 __user *uaddr, unsigned int flags) > > migrate_disable(); > > spin_unlock(&hb->lock); > > > > - /* Drops pi_state->pi_mutex.wait_lock */ > > + /* drops pi_state->pi_mutex.wait_lock */ > > ret = wake_futex_pi(uaddr, uval, pi_state); > > > > migrate_enable(); > > @@ -3460,8 +3459,6 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, > > spin_lock(&hb2->lock); > > BUG_ON(&hb2->lock != q.lock_ptr); > > ret = fixup_pi_state_owner(uaddr2, &q, current); > > - if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current) > > - rt_mutex_futex_unlock(&q.pi_state->pi_mutex); > > /* > > * Drop the reference to the pi state which > > * the requeue_pi() code acquired for us. > > @@ -3504,14 +3501,6 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, > > if (res) > > ret = (res < 0) ? res : 0; > > > > - /* > > - * If fixup_pi_state_owner() faulted and was unable to handle > > - * the fault, unlock the rt_mutex and return the fault to > > - * userspace. > > - */ > > - if (ret && rt_mutex_owner(pi_mutex) == current) > > - rt_mutex_futex_unlock(pi_mutex); > > - > > /* Unqueue and drop the lock. */ > > unqueue_me_pi(&q); > > } > > diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c > > index 9816892558b8..a7f971a60191 100644 > > --- a/kernel/locking/rtmutex.c > > +++ b/kernel/locking/rtmutex.c > > @@ -2397,7 +2397,7 @@ int rt_mutex_start_proxy_lock(struct rt_mutex *lock, > > > > raw_spin_lock_irq(&lock->wait_lock); > > ret = __rt_mutex_start_proxy_lock(lock, waiter, task); > > - if (unlikely(ret)) > > + if (ret && rt_mutex_has_waiters(lock)) > > remove_waiter(lock, waiter); > > raw_spin_unlock_irq(&lock->wait_lock); > > > > @@ -2526,7 +2526,6 @@ bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock, > > remove_waiter(lock, waiter); > > cleanup = true; > > } > > - > > /* > > * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might > > * have to fix that up. > > diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h > > index 750bad6849e2..98debc11953f 100644 > > --- a/kernel/locking/rtmutex_common.h > > +++ b/kernel/locking/rtmutex_common.h > > @@ -119,7 +119,6 @@ extern int rt_mutex_wait_proxy_lock(struct rt_mutex *lock, > > struct rt_mutex_waiter *waiter); > > extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock, > > struct rt_mutex_waiter *waiter); > > - > > extern int rt_mutex_futex_trylock(struct rt_mutex *l); > > extern int __rt_mutex_futex_trylock(struct rt_mutex *l); > > > > > ---end quoted text--- ---end quoted text---
Attachment:
signature.asc
Description: PGP signature