On Thu, Dec 12, 2024 at 7:35 AM Sebastian Sewior <bigeasy@xxxxxxxxxxxxx> wrote:
>
> On 2024-12-12 16:21:28 [+0100], Michal Hocko wrote:
> > On Thu 12-12-24 16:07:44, Sebastian Sewior wrote:
> > > But since I see in_nmi(). You can't trylock from NMI on RT. The trylock
> > > part is easy but unlock might need to acquire rt_mutex_base::wait_lock
> > > and worst case is to wake a waiter via wake_up_process().
> >
> > Ohh, I didn't realize that. So try_lock would only be safe on
> > raw_spin_lock right?
>
> If NMI is one of the possible calling contexts, yes.

Looks like in NMI neither trylock nor unlock is safe.

pcp_spin_trylock() calls __rt_spin_trylock() on RT,
which can deadlock inside rt_mutex_slowtrylock().
This part has a potential workaround like:

@@ -102,8 +102,11 @@ static __always_inline int __rt_spin_trylock(spinlock_t *lock)
 {
 	int ret = 1;

-	if (unlikely(!rt_mutex_cmpxchg_acquire(&lock->lock, NULL, current)))
+	if (unlikely(!rt_mutex_cmpxchg_acquire(&lock->lock, NULL, current))) {
+		if (in_nmi())
+			return 0;
 		ret = rt_mutex_slowtrylock(&lock->lock);
+	}

but when there are waiters, the cmpxchg in this part fails:

	if (unlikely(!rt_mutex_cmpxchg_release(&lock->lock, current, NULL)))
		rt_mutex_slowunlock(&lock->lock);

and that triggers the slow unlock, which is impossible to do from NMI.
We can punt it to irq_work with IRQ_WORK_HARD_IRQ to make sure
it runs as soon as the NMI finishes.
Since it runs in hard irq context, debug_rt_mutex_unlock(lock) shouldn't complain.
Will current stay the same?
Other ideas?
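
To make the control flow easier to discuss, here is a rough userspace model of the idea. It is not kernel code and not a proposed patch: every toy_* name, the HAS_WAITERS bit, the counters and the unlock_deferred flag are made up for illustration. The flag only stands in for queueing an irq_work, and the slow trylock is stubbed to always fail. It just shows the two fast paths, the bail-out from trylock in NMI, and the unlock being deferred until after the NMI.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the waiters bit kept in the owner word. */
#define HAS_WAITERS ((uintptr_t)1)

struct toy_lock {
	atomic_uintptr_t owner;	/* 0 = unlocked, else owner id (| HAS_WAITERS) */
	bool unlock_deferred;	/* models a queued hard-irq irq_work */
};

bool toy_in_nmi;		/* models in_nmi() */
int slow_trylock_calls;		/* how often the slow trylock path ran */
int slow_unlock_calls;		/* how often the slow unlock path ran */

/* Slow paths would take wait_lock and may wake a waiter: never from NMI. */
bool toy_slow_trylock(struct toy_lock *l, uintptr_t me)
{
	(void)l;
	(void)me;
	assert(!toy_in_nmi);
	slow_trylock_calls++;
	return false;		/* toy model: treat the lock as contended */
}

void toy_slow_unlock(struct toy_lock *l)
{
	assert(!toy_in_nmi);
	slow_unlock_calls++;
	atomic_store_explicit(&l->owner, 0, memory_order_release);
}

bool toy_trylock(struct toy_lock *l, uintptr_t me)
{
	uintptr_t expected = 0;

	/* Fast path: unlocked -> owned by me. */
	if (atomic_compare_exchange_strong_explicit(&l->owner, &expected, me,
			memory_order_acquire, memory_order_relaxed))
		return true;
	if (toy_in_nmi)
		return false;	/* bail out instead of entering the slow path */
	return toy_slow_trylock(l, me);
}

void toy_unlock(struct toy_lock *l, uintptr_t me)
{
	uintptr_t expected = me;

	/* Fast path only succeeds while the waiters bit is clear. */
	if (atomic_compare_exchange_strong_explicit(&l->owner, &expected, 0,
			memory_order_release, memory_order_relaxed))
		return;
	if (toy_in_nmi) {
		l->unlock_deferred = true;	/* stands in for queueing irq_work */
		return;
	}
	toy_slow_unlock(l);
}

/* Models the irq_work callback that runs once the NMI has returned. */
void toy_run_deferred(struct toy_lock *l)
{
	if (l->unlock_deferred) {
		l->unlock_deferred = false;
		toy_slow_unlock(l);
	}
}
```

In this model the lock stays owned between toy_unlock() in NMI and toy_run_deferred(), so anything that tries to take it in that window sees it as contended. Whether that window is acceptable in the real code is part of the open question above.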