[PATCH] locking/rtmutex: Fix ww_mutex deadlock check

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Wed, 1 Sep 2021 11:44:11 +0200

On Wed, Sep 01, 2021 at 10:09:43AM +0200, Peter Zijlstra wrote:
> On Tue, Aug 31, 2021 at 11:21:52AM +0300, Dan Carpenter wrote:
> > Hello Peter Zijlstra,
> 
> Hi Dan :-)
> 
> > This is a semi-automatic email about new static checker warnings.
> > 
> > The patch a055fcc132d4: "locking/rtmutex: Return success on deadlock
> > for ww_mutex waiters" from Aug 26, 2021, leads to the following
> > Smatch complaint:
> > 
> >     kernel/locking/rtmutex.c:756 rt_mutex_adjust_prio_chain()
> >     error: we previously assumed 'orig_waiter' could be null (see line 644)
> > 
> > kernel/locking/rtmutex.c
> >    643		 */
> >    644		if (orig_waiter && !rt_mutex_owner(orig_lock))
> >                     ^^^^^^^^^^^
> > A lot of this code assumes "orig_waiter" can be NULL.
> > 
> 
> >    735		/*
> >    736		 * [6] check_exit_conditions_2() protected by task->pi_lock and
> >    737		 * lock->wait_lock.
> >    738		 *
> >    739		 * Deadlock detection. If the lock is the same as the original
> >    740		 * lock which caused us to walk the lock chain or if the
> >    741		 * current lock is owned by the task which initiated the chain
> >    742		 * walk, we detected a deadlock.
> >    743		 */
> >    744		if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
> >                     ^^^^^^^^^^^^^^^^^
> > This might mean it's a false positive, but Smatch isn't clever enough to
> > figure it out.  And I'm stupid too!  Plus lazy...  and ugly.
> > 
> >    745			ret = -EDEADLK;
> >    746	
> >    747			/*
> >    748			 * When the deadlock is due to ww_mutex; also see above. Don't
> >    749			 * report the deadlock and instead let the ww_mutex wound/die
> >    750			 * logic pick which of the contending threads gets -EDEADLK.
> >    751			 *
> >    752			 * NOTE: assumes the cycle only contains a single ww_class; any
> >    753			 * other configuration and we fail to report; also, see
> >    754			 * lockdep.
> >    755			 */
> >    756			if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
> >                                                              ^^^^^^^^^^^^^^^^^^^
> > Unchecked dereference.
> 
> 
> This is difficult... and I'm glad you flagged it. The normal de-boost
> path is through rt_mutex_adjust_prio() and that has: .orig_lock == NULL
> && .orig_waiter == NULL. And as such it would never trigger the above
> case.
> 
> However, there is remove_waiter() which is called on rt_mutex_lock()'s
> failure paths and that doesn't have .orig_lock == NULL, and as such
> *could* conceivably trigger this.
> 
> Let me figure out what the right thing to do is.
> 
> Thanks!

I think something like this ought to do.

---
Subject: locking/rtmutex: Fix ww_mutex deadlock check

Dan reported that rt_mutex_adjust_prio_chain() can be called with
.orig_waiter == NULL however commit a055fcc132d4 ("locking/rtmutex:
Return success on deadlock for ww_mutex waiters") unconditionally
dereferences it.

Since both call-sites that have .orig_waiter == NULL don't care for the
return value, simply disable the deadlock squash by adding the NULL
check.

Notably, both callers use the deadlock condition as a termination
condition for the iteration; once detected, we're sure (de)boosting is
done. Arguably [3] would be a more natural termination point, but I'm
not sure adding a third deadlock detection state would improve the code.

Fixes: a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")
Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
 kernel/locking/rtmutex.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 8eabdc79602b..6bb116c559b4 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -753,7 +753,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
 		 * other configuration and we fail to report; also, see
 		 * lockdep.
 		 */
-		if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
+		if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter && orig_waiter->ww_ctx)
 			ret = 0;
 
 		raw_spin_unlock(&lock->wait_lock);