On Mon, Apr 20, 2020 at 07:40:19PM +0200, Uladzislau Rezki wrote:
> On Mon, Apr 20, 2020 at 10:21:26AM -0700, Paul E. McKenney wrote:
[...]
> > > > > > > > > <snip>
> > > > > > > > > /**
> > > > > > > > >  * queue_work_on - queue work on specific cpu
> > > > > > > > >  * @cpu: CPU number to execute work on
> > > > > > > > >  * @wq: workqueue to use
> > > > > > > > >  * @work: work to queue
> > > > > > > > >  *
> > > > > > > > >  * We queue the work to a specific CPU, the caller must ensure it
> > > > > > > > >  * can't go away.
> > > > > > > > >  *
> > > > > > > > >  * Return: %false if @work was already on a queue, %true otherwise.
> > > > > > > > >  */
> > > > > > > > > <snip>
> > > > > > > > >
> > > > > > > > > It says, as I see it, that we should ensure the CPU cannot go
> > > > > > > > > away. So, if we drop the lock, we should do something like:
> > > > > > > > >
> > > > > > > > > get_online_cpus();
> > > > > > > > > check that the CPU is online;
> > > > > > > > > queue_work_on();
> > > > > > > > > put_online_cpus();
> > > > > > > > >
> > > > > > > > > but I suspect we do not want to do that :)
> > > > > > > >
> > > > > > > > Indeed, it might impose a few restrictions and a bit of overhead
> > > > > > > > that might not be welcome at some point in the future. ;-)
> > > > > > > >
> > > > > > > > On top of this there are potential load-balancing concerns. By
> > > > > > > > specifying the CPU, you are limiting the workqueue's and the
> > > > > > > > scheduler's ability to adjust to any sudden changes in load.
> > > > > > > > Maybe not enough to matter in most cases, but it might be an
> > > > > > > > issue if there is a sudden flood of kfree_rcu() invocations.
> > > > > > > >
> > > > > > > Agree. Let's keep it as it is now :)
> > > > > >
> > > > > > I am not sure which "as it is now" you are referring to, but I
> > > > > > suspect that the -rt guys prefer two short interrupts-disabled
> > > > > > regions to one longer interrupts-disabled region.
> > > > >
> > > > > I mean running schedule_delayed_work() under the spinlock.
> > > >
> > > > Which is an interrupt-disabled spinlock, correct?
> > >
> > > To do it while holding the lock: currently it is a spinlock, but it
> > > is going to be (if you agree :)) a raw one, which keeps IRQs
> > > disabled. I saw Joel send out patches.
> >
> > Then please move the schedule_delayed_work() and friends out from
> > under the spinlock. Unless Sebastian has some reason why extending
> > an interrupts-disabled critical section (and thus degrading real-time
> > latency) is somehow OK in this case.
>
> Paul, if we move it outside of the lock, we may introduce unneeded
> migration issues, plus it can increase the memory footprint (I have not
> tested this). I have described it in more detail earlier in this mail
> thread. I do not think that waking up the work is an issue for RT from
> a latency point of view. But let's ask Sebastian to confirm.

I was also a bit concerned about migration. If we moved it outside of
the lock, then even on !PREEMPT_RT we could be migrated before the work
is scheduled. We would then lose the benefit of executing the work on
the same CPU where it was queued. There is also no migrate_disable() on
!PREEMPT_RT, as of when I recently checked :-\ (PeterZ mentioned that
migrate_disable() is hard to achieve on !PREEMPT_RT).

> Sebastian, do you think that placing a work on the current CPU is an
> issue? If we do it under a raw spinlock?

Yes, I am also curious whether calling schedule_delayed_work() can cause
long delays at all. Considering that the workqueue code uses raw
spinlocks, as Mike mentioned, and that it is called from IRQ-disabled
sections in many places, I was under the impression that this code
should not be causing such issues. Let us definitely double-check and
discuss it more to be sure.

thanks,

 - Joel