The stable cherry pick of commit 3d5c9340d1949733eb37616abd15db36aef9a57c
("rtmutex: Handle deadlock detection smarter") essentially makes the
detect_deadlock flag a no-op, as it says:

    Even in the case when deadlock detection is not requested by the
    caller, we can detect deadlocks. Right now the code stops the lock
    chain walk and keeps the waiter enqueued, even on itself. Silly not to
    yell when such a scenario is detected and to keep the waiter enqueued.

    Return -EDEADLK unconditionally and handle it at the call sites.

So, as part of that change, we see this:

  @@ -453,7 +453,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
  	 * which is wrong, as the other waiter is not in a deadlock
  	 * situation.
  	 */
  -	if (detect_deadlock && owner == task)
  +	if (owner == task)
  		return -EDEADLK;

However, as part of the -rt baseline patches, there exists this change
within rt-mutex-add-sleeping-spinlocks-support.patch:

	ret = task_blocks_on_rt_mutex(lock, &waiter, self, 0);
	BUG_ON(ret);

Note that the zero in the call to task_blocks_on_rt_mutex is the value
of detect_deadlock; off, but now ignored, and so we get ret = -EDEADLK
which triggers the BUG_ON().

Per the quoted commit above, we handle EDEADLK at the call site, by
not triggering the BUG_ON for it, and instead it will fall through to
the existing for(;;) { ... debug_rt_mutex_print_deadlock() ...} code
immediately below.

Signed-off-by: Paul Gortmaker <paul.gortmaker@xxxxxxxxxxxxx>
---

Notes:
 -this patch is against 3.10-rt, but the code for all recent -rt that
  include the recent linux-stable rtmutex changes should have the same
  issue.
  [The 3.14-rt has a trivial path change where the kernel/rtmutex.c of
  v3.10 becomes kernel/locking/rtmutex.c but aside from that it applies
  to 3.14 too]

 -I'd got a report of this BUG_ON triggering on a v3.4-rt based kernel;
  that kernel was using my integration of the tglx rtmutex stable
  changes into 3.4-rt as described here:

	https://lkml.org/lkml/2014/9/23/944

  but the related code in rostedt's 3.10.53-rt56 (in linux-stable-rt)
  and in tglx's 3.14.12-rt9 patch queue is AFAICT identical. So I have
  to conclude that anything using the stable rtmutex changes can
  inadvertently suffer the same BUG trigger.

 -this change gets us back to the pre-rtmutex stable commit behaviour,
  but I suspect that smarter people than me can advise on a way to
  achieve the same end result. So I'll wait before adding anything to
  the linux-stable-rt branches I'd put here at:

	https://git.kernel.org/cgit/linux/kernel/git/paulg/linux-stable-rt.git

 kernel/rtmutex.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
index 5f17f55c562d..70edaaee60dc 100644
--- a/kernel/rtmutex.c
+++ b/kernel/rtmutex.c
@@ -887,7 +887,7 @@ static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock)
 	pi_unlock(&self->pi_lock);
 
 	ret = task_blocks_on_rt_mutex(lock, &waiter, self, 0);
-	BUG_ON(ret);
+	BUG_ON(ret && ret != -EDEADLK);
 
 	for (;;) {
 		/* Try to acquire the lock again. */
-- 
2.1.0
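
For anyone who wants to poke at the call-site logic outside the kernel,
here is a minimal user-space sketch of the interaction described in the
changelog. It is not kernel code: struct task, struct lock,
blocks_on_lock(), old_check_fires(), new_check_fires() and demo() are
hypothetical stand-ins invented for illustration, and only the
"owner == task" self-deadlock check is modelled.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task { int id; };

/* Hypothetical stand-in for struct rt_mutex; only the owner is modelled. */
struct lock { struct task *owner; };

/*
 * Models only the self-deadlock check after the stable change: the
 * detect_deadlock argument is still accepted but no longer consulted.
 */
static int blocks_on_lock(struct lock *l, struct task *t, int detect_deadlock)
{
	(void)detect_deadlock;
	if (l->owner == t)
		return -EDEADLK;
	return 0;
}

/* Condition of the old call site, BUG_ON(ret): any non-zero ret fires. */
static int old_check_fires(int ret)
{
	return ret != 0;
}

/* Condition of the patched call site: -EDEADLK is let through. */
static int new_check_fires(int ret)
{
	return ret && ret != -EDEADLK;
}

/* Task tries to block on a lock it already owns, with detection "off". */
static int demo(void)
{
	struct task self = { 1 };
	struct lock l = { &self };
	int ret = blocks_on_lock(&l, &self, 0);

	assert(ret == -EDEADLK);	/* returned although the flag is 0 */
	assert(old_check_fires(ret));	/* the old BUG_ON(ret) would trigger */
	assert(!new_check_fires(ret));	/* the patched condition does not */
	return 0;
}
```

Calling demo() from a main() walks the scenario once; any other non-zero
error code still trips the patched condition, which is the intent of
keeping the BUG_ON rather than dropping it.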