On Mon, Apr 20, 2020 at 07:40:19PM +0200, Uladzislau Rezki wrote:
> On Mon, Apr 20, 2020 at 10:21:26AM -0700, Paul E. McKenney wrote:
[...]
> > > > > > > > > <snip>
> > > > > > > > > /**
> > > > > > > > >  * queue_work_on - queue work on specific cpu
> > > > > > > > >  * @cpu: CPU number to execute work on
> > > > > > > > >  * @wq: workqueue to use
> > > > > > > > >  * @work: work to queue
> > > > > > > > >  *
> > > > > > > > >  * We queue the work to a specific CPU, the caller must ensure it
> > > > > > > > >  * can't go away.
> > > > > > > > >  *
> > > > > > > > >  * Return: %false if @work was already on a queue, %true otherwise.
> > > > > > > > >  */
> > > > > > > > > <snip>
> > > > > > > > >
> > > > > > > > > It says, as I see it, that we should ensure the CPU cannot go
> > > > > > > > > away. So, if we drop the lock, we should do something like:
> > > > > > > > >
> > > > > > > > > get_online_cpus();
> > > > > > > > > check that the CPU is online;
> > > > > > > > > queue_work_on();
> > > > > > > > > put_online_cpus();
> > > > > > > > >
> > > > > > > > > but I suspect we do not want to do that :)
> > > > > > > >
> > > > > > > > Indeed, it might impose a few restrictions and a bit of overhead
> > > > > > > > that might not be welcome at some point in the future. ;-)
> > > > > > > >
> > > > > > > > On top of this there are potential load-balancing concerns. By
> > > > > > > > specifying the CPU, you are limiting the workqueue's and the
> > > > > > > > scheduler's ability to adjust to any sudden changes in load.
> > > > > > > > Maybe not enough to matter in most cases, but it might be an
> > > > > > > > issue if there is a sudden flood of kfree_rcu() invocations.
> > > > > > > >
> > > > > > > Agree. Let's keep it as it is now :)
> > > > > >
> > > > > > I am not sure which "as it is now" you are referring to, but I
> > > > > > suspect that the -rt guys prefer two short interrupts-disabled
> > > > > > regions to one longer interrupts-disabled region.
> > > > >
> > > > > I mean running schedule_delayed_work() under the spinlock.
> > > >
> > > > Which is an interrupt-disabled spinlock, correct?
> > >
> > > To do it while holding the lock: currently it is a spinlock, but it
> > > is going to be (if you agree :)) a raw one, which keeps IRQs
> > > disabled. I saw Joel send out patches.
> >
> > Then please move the schedule_delayed_work() and friends out from
> > under the spinlock. Unless Sebastian has some reason why extending
> > an interrupts-disabled critical section (and thus degrading real-time
> > latency) is somehow OK in this case.
>
> Paul, if we move it outside of the lock, we may introduce unneeded
> migration issues, plus it can increase the memory footprint (I have not
> tested this). I have described it in more detail earlier in this mail
> thread. I do not think that waking up the work is an issue for RT from
> a latency point of view. But let's ask Sebastian to confirm.

I was also a bit concerned about migration. If we moved it outside of
the lock, then even on !PREEMPT_RT we could be migrated before the work
is scheduled. We would then lose the benefit of executing the work on
the same CPU where it was queued. There is also no migrate_disable() on
!PREEMPT_RT, as of when I recently checked :-\ (PeterZ mentioned that
migrate_disable() is hard to achieve on !PREEMPT_RT).

> Sebastian, do you think that placing a work on the current CPU is an
> issue? If we do it under a raw spinlock?

Yes, I am also curious whether calling schedule_delayed_work() can cause
long delays at all. Considering that the workqueue code uses raw
spinlocks, as Mike mentioned, and that it is called from IRQ-disabled
sections in many places, I was under the impression that this code
should not be causing such issues. Let us definitely double-check and
discuss it more to be sure.

thanks,

 - Joel