On Wed, Feb 17, 2021 at 07:01:59PM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-02-17 07:54:47 [-0800], Paul E. McKenney wrote:
> > > I thought boosting is accomplished by acquiring a rt_mutex in a
> > > rcu_read_lock() section. Do you have some code to point me to, to see
> > > how a timer is involved here? Or is it the timer saying that *now*
> > > boosting is needed?
> >
> > Yes, this last, which is in the grace-period kthread code, for example,
> > in rcu_gp_fqs_loop().
> >
> > > If your hrtimer is a "normal" hrtimer then it will be served by
> > > ksoftirqd, too. You would additionally need one of the
> > > HRTIMER_MODE_*_HARD to make it work.
> >
> > Good to know. Anything I should worry about for this mode?
>
> Well. It is always hardirq. No spinlock_t, etc. within that callback.
> If you intend to wake a thread, that thread needs an elevated priority,
> otherwise it won't be scheduled (assuming there is an RT task running
> which would otherwise block ksoftirqd).

Good to know, thank you!  I believe that all the needed locks are already
raw spinlocks, but the actual kernel code always takes precedence over
one's beliefs.

> Ah. One nice thing is that you can move the RCU threads to a
> housekeeping CPU - away from the CPU(s) running the RT tasks. Would this
> scenario still be affected (if ksoftirqd would be blocked)?

At this point, I am going to say that it is the sysadm's job to place the
rcuo kthreads, and if they are placed poorly, life is hard.

This means that I need to create a couple of additional polling RCU
grace-period functions for rcutorture's priority-boosting use, but I
probably should have done that a long time ago.  It is simpler to just
call a polling API occasionally than to handle all the corner cases of
keeping an RCU callback queued.

> Oh. One thing I forgot to mention: the timer_list timer is nice in terms
> of moving the timeout forward (the timer did not fire, the condition is
> still true, and you simply move the timeout forward).
> An hrtimer, on the other hand, needs to be removed, forwarded, and added
> back to the "timer tree". This is considered more expensive, especially
> if the timer does not fire.

There are some timers that are used to cause a wakeup to happen from a
clean environment, but maybe these can instead use irq_work.

> > Also, the current test expects callbacks to be invoked, which involves a
> > number of additional kthreads and timers, for example, in nocb_gp_wait().
> > I suppose I could instead look at grace-period sequence numbers, but I
> > believe that real-life use cases needing RCU priority boosting also need
> > the callbacks to be invoked reasonably quickly (as in within hundreds
> > of milliseconds up through very small numbers of seconds).
>
> A busy/overloaded KVM host could lead to delays by not scheduling the
> guest for a while.

That it can!  Aravinda Prasad prototyped a mechanism for hinting to the
hypervisor in such cases, but I don't know that this ever saw the light
of day.

> My understanding of the need for RCU boosting is to get a task,
> preempted (by an RT task) within an RCU read-side critical section, back
> on the CPU to at least close that section, so that it becomes possible
> to run RCU callbacks and free memory.
> The 10 seconds without RCU callbacks shouldn't be bad unless the OOM
> killer gets nervous (or we see memory-allocation failures).
> Also, running thousands of accumulated callbacks isn't good either.

Sounds good, thank you!

							Thanx, Paul

> > Thoughts?
> >
> > 							Thanx, Paul
> Sebastian
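P.S.  For my own notes, here is the rough shape of the hrtimer setup I
think you are describing: one of the _HARD modes, a callback that runs in
hard-irq context and therefore takes no spinlock_t, and a wakeup of a
kthread that is already running at RT priority.  Untested sketch with
made-up names (boost_timer, boost_kthread_task), not existing RCU code:

	#include <linux/hrtimer.h>
	#include <linux/ktime.h>
	#include <linux/sched.h>

	static struct hrtimer boost_timer;
	static struct task_struct *boost_kthread_task; /* Must be RT priority. */

	/* Hard-irq context: raw_spinlock_t only, no spinlock_t. */
	static enum hrtimer_restart boost_timer_fn(struct hrtimer *t)
	{
		wake_up_process(boost_kthread_task);
		return HRTIMER_NORESTART;
	}

	static void boost_timer_arm(unsigned long delay_ms)
	{
		hrtimer_init(&boost_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
		boost_timer.function = boost_timer_fn;
		hrtimer_start(&boost_timer, ms_to_ktime(delay_ms),
			      HRTIMER_MODE_REL_HARD);
	}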
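And here is roughly how I picture rcutorture checking for boosting
success by polling rather than by queuing its own callback.  The
start_poll_synchronize_rcu() and poll_state_synchronize_rcu() names are
the ones I have in mind for the functions I still need to write, so
please treat both the names and the signatures as provisional:

	#include <linux/delay.h>
	#include <linux/jiffies.h>
	#include <linux/rcupdate.h>
	#include <linux/sched.h>

	/* Did a grace period complete within the given number of jiffies? */
	static bool boost_gp_completed_within(unsigned long timeout_jiffies)
	{
		unsigned long deadline = jiffies + timeout_jiffies;
		unsigned long gp_state = start_poll_synchronize_rcu();

		while (!poll_state_synchronize_rcu(gp_state)) {
			if (time_after(jiffies, deadline))
				return false; /* Too slow: boosting presumably failed. */
			schedule_timeout_interruptible(1);
		}
		return true; /* Grace period completed despite the RT load. */
	}

The nice thing is that there is no callback to keep queued, hence no
dependence on where the sysadm placed the rcuo kthreads.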
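Finally, an equally untested sketch of the irq_work alternative for those
timers whose only job is to provide a clean environment for a wakeup
(again, the gp_wakeup_* names are made up for illustration):

	#include <linux/irq_work.h>
	#include <linux/sched.h>

	static struct task_struct *gp_kthread_task;
	static struct irq_work gp_wakeup_work;

	/* Runs in hard-irq context shortly after irq_work_queue(). */
	static void gp_wakeup_func(struct irq_work *work)
	{
		wake_up_process(gp_kthread_task);
	}

	static void gp_wakeup_init(void)
	{
		init_irq_work(&gp_wakeup_work, gp_wakeup_func);
	}

	/* Call instead of arming a timer just to get a clean wakeup context. */
	static void gp_request_wakeup(void)
	{
		irq_work_queue(&gp_wakeup_work);
	}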