On 9/1/20 2:41 AM, Sebastian Andrzej Siewior wrote:
> During a context switch the scheduler invokes wq_worker_sleeping() with
> disabled preemption. Disabling preemption is needed because it protects
> access to `worker->sleeping'. As an optimisation it avoids invoking
> schedule() within the schedule path as part of a possible wake up (thus
> preempt_enable_no_resched() afterwards).
>
> The io-wq has been added to the mix in the same section with disabled
> preemption. This breaks on PREEMPT_RT because io_wq_worker_sleeping()
> acquires a spinlock_t. Also, within schedule() the spinlock_t must be
> acquired after tsk_is_pi_blocked(), otherwise the task will block on the
> sleeping lock again while scheduling out.
>
> While playing with `io_uring-bench' I didn't notice a significant
> latency spike after converting io_wqe::lock to a raw_spinlock_t. The
> latency was more or less the same.
>
> In order to keep the spinlock_t, it would have to be moved after the
> tsk_is_pi_blocked() check, which would introduce a branch instruction
> into the hot path.
>
> The lock is used to maintain the `work_list' and wakes at most one task.
> Should io_wqe_cancel_pending_work() cause latency spikes while
> searching for a specific item, then it would need to drop the lock
> during iterations.
>
> revert_creds() is also invoked under the lock. According to debug,
> cred::non_rcu is 0; otherwise it would have to be moved outside of the
> locked section because put_cred_rcu()->free_uid() acquires a sleeping
> lock.
>
> Convert io_wqe::lock to a raw_spinlock_t.

Thanks, I've applied this for 5.10.

-- 
Jens Axboe