On 2024-07-05 10:39:25 [-0700], Paul E. McKenney wrote:
> As a workaround, the following commit in -rcu that is slated for
> the upcoming merge window addresses a similar case involving KVM and
> nohz_full:
>
> 68d124b09999 ("rcu: Add rcutree.nohz_full_patience_delay to reduce
> nohz_full OS jitter")
>
> The KVM guys found that setting rcutree.nohz_full_patience_delay to 1000
> (AKA one second) made things work better for them. Does this help your
> use case?

My problem is that I have a task stuck in percpu_down_write()/
__wait_rcu_gp() and I think this is because the RCU machinery is stuck
and there is no grace period. I have seen an rcuc/ thread with a pending
wakeup, but it won't be scheduled because its priority is lower than
that of the thread currently on the CPU, and that thread uses 100% of
the CPU. I *think* this explains it, because the rcuc thread is what
moves the grace period forward.

Looking at the patch, there would be a delay of up to 5 secs, which
would mean that if the task consumes 100% of the CPU then it doesn't
change a thing.

Thank you Paul for the pointers.

> This is again a workaround. Clearly, it would be better if we could
> eliminate that second rcuc wakeup. I tried something similar some time
> back, and there was a problem with it. I will see if I can reconstitute
> the corresponding brain cells.

Is my assumption correct that the rcuc thread has to run in order to
push the grace period forward, and that otherwise the whole thing is
stuck?

> But in the meantime, one advantage of the workaround is that in the
> common case, it would reduce the number of rcuc wakeups to zero, rather
> than to just one.
>
> Thoughts?

I *think* that if what I just wrote is correct, I will either have to
raise the priority of rcuc/ or make the thread that consumes 100% of the
CPU lose its RT priority. Then, with the limited number of wakeups, it
should be doable.

PS: I do remember the RCU-task thread we had. I did have an idea but I
need to check if it is feasible first. So I did not forget, just slow…

> Thanx, Paul

Sebastian