Mike,

On Sun, Aug 01 2021 at 17:14, Mike Galbraith wrote:
> On Sun, 2021-08-01 at 05:36 +0200, Mike Galbraith wrote:
>> On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
>> > >
>> > > First symptom is KDE/Plasma's task manager going comatose. Notice soon
>> >
>> > KDE/Plasma points at the new fangled rtmutex based ww_mutex from
>> > Peter.
>>
>> Seems not. When booting KVM box with nomodeset, there's exactly one
>> early boot ww_mutex lock/unlock, ancient history at the failure point.
>
> As you've probably already surmised given it isn't the ww_mutex bits,
> it's the wake_q bits. Apply the below, 5.14-rt ceases to fail. Take
> perfectly healthy 5.13-rt, apply those bits, and it instantly begins
> failing as 5.14-rt had been.

Now staring at it makes it pretty obvious. When I picked up Peter's
patch I thought about it briefly and then ignored my doubts :(

>  /* RT mutex specific wake_q wrappers */
> -static __always_inline void rt_mutex_wake_q_add(struct rt_wake_q_head *wqh,
> +static __always_inline void rt_mutex_wake_q_add(struct rt_mutex_wake_q_head *wqh,
>                                                  struct rt_mutex_waiter *w)
>  {
>         if (IS_ENABLED(CONFIG_PREEMPT_RT) && w->wake_state != TASK_NORMAL) {
> -               wake_q_add(&wqh->rt_head, w->task);
> +               get_task_struct(w->task);
> +               wqh->rtlock_task = w->task;

This is the key. With the original asymmetric version the wakeup for
wake_state != TASK_NORMAL stores the task unconditionally in
wqh->rtlock_task instead of going through wake_q_add().

With the symmetric wake_q_add() we end up with the following situation:

Some code, e.g. futex, does:

    wake_q_add(..., task)

which links the task, i.e. task->wake_q.next is !NULL. Ergo the
wake_q_add() in the rtmutex code bails out and the rtlock wakeup is
lost. The same happens the other way round: if the rtmutex side queues
first, then the second - the regular wakeup - will not be queued.

There are two ways to fix that:

  1) Go back to my original version

  2) Add another wake_q head to task_struct

#2 is overkill IMO simply because the rtlock wait is not subject to
multiple wakeups.

Thanks a lot Mike for tracking this down!

Thanks,

        tglx
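
P.S.: For completeness, a minimal sketch of what the asymmetric variant
(#1) looks like, assuming the rt_mutex_wake_q_head layout implied by the
quoted diff. Names (rt_mutex_wake_q_head, rtlock_task, rt_mutex_wake_up_q)
follow that context or are purely illustrative and may differ from what
ends up in the tree:

    /*
     * Sketch only: a dedicated slot for the single rtlock waiter, so the
     * rtlock wakeup never shares task->wake_q with regular wakeups
     * (futex, signals, ...) and neither side can make the other bail
     * out of wake_q_add().
     */
    struct rt_mutex_wake_q_head {
            struct wake_q_head      head;           /* regular TASK_NORMAL wakeups */
            struct task_struct      *rtlock_task;   /* single TASK_RTLOCK_WAIT waiter */
    };

    static __always_inline void rt_mutex_wake_q_add(struct rt_mutex_wake_q_head *wqh,
                                                    struct rt_mutex_waiter *w)
    {
            if (IS_ENABLED(CONFIG_PREEMPT_RT) && w->wake_state != TASK_NORMAL) {
                    /*
                     * Store the task unconditionally. An rtlock wait is
                     * never subject to multiple wakeups, so one slot is
                     * sufficient.
                     */
                    get_task_struct(w->task);
                    wqh->rtlock_task = w->task;
            } else {
                    wake_q_add(&wqh->head, w->task);
            }
    }

    static __always_inline void rt_mutex_wake_up_q(struct rt_mutex_wake_q_head *wqh)
    {
            if (IS_ENABLED(CONFIG_PREEMPT_RT) && wqh->rtlock_task) {
                    /* Wake the rtlock waiter and drop the reference taken above */
                    wake_up_state(wqh->rtlock_task, TASK_RTLOCK_WAIT);
                    put_task_struct(wqh->rtlock_task);
                    wqh->rtlock_task = NULL;
            }

            /* Regular wakeups go through the normal wake_q machinery */
            wake_up_q(&wqh->head);
    }

The point being that the rtlock waiter never touches task->wake_q, so a
concurrent regular wakeup can queue the task independently and neither
wakeup can get lost.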