On Mon, Feb 20 2023 at 12:42, Sebastian Andrzej Siewior wrote:
> On 2023-02-20 12:04:56 [+0100], To Thomas Gleixner wrote:
>> The ->pi_blocked_on field is set by __rwbase_read_lock() before
>> schedule() is invoked while blocking on the sleeping lock. By doing this
>> we avoid __blk_flush_plug() and as such may deadlock because we are
>> going to sleep and made I/O progress earlier which is not globally
>> visible but might be (s/might be/is/ in the deadlock case) expected by
>> the owner of the lock.

Fair enough.

> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -1700,6 +1700,13 @@ static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
>  	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
>  		return 0;
>  
> +	if (state != TASK_RTLOCK_WAIT) {
> +		/*
> +		 * If we are going to sleep and we have plugged IO queued,
> +		 * make sure to submit it to avoid deadlocks.
> +		 */
> +		blk_flush_plug(tsk->plug, true);

This still leaves the problem vs. io_wq_worker_sleeping() and its
running() counterpart after schedule().

Aside of that, for CONFIG_DEBUG_RT_MUTEXES=y builds it flushes on every
lock operation whether the lock is contended or not.

Grmbl.