On 2024-07-05 18:15:25 [-0700], Paul E. McKenney wrote:
> > Looking at the patch, there would be a delay up to 5 secs which would
>
> I would have said "up to 1 sec", so what am I missing?

The patch description said that there is a 5 sec upper limit. Yes, the
default is 1 sec.

> > mean if the task consumes 100% of the CPU then it doesn't change a
> > thing.
>
> As long as RCU's grace-period kthread gets some CPU and as long as
> the CPU-bound task executes often in userspace, that task's CPU's rcuc
> kthread need never run. The grace-period kthread would see that the CPU
> has been in an extended quiescent state, and would report that quiescent
> state on that CPU's behalf.

Okay. So the 100% usage is the problem indeed.

> > Thank you Paul for the pointers.
> >
> > > This is again a workaround. Clearly, it would be better if we could
> > > eliminate that second rcuc wakeup. I tried something similar some time
> > > back, and there was a problem with it. I will see if I can reconstitute
> > > the corresponding brain cells.
> >
> > Is my assumption correct, in order to push the grace period forward,
> > otherwise the whole thing is stuck?
>
> Again, if the CPU running the CPU-bound task executes in nohz_full
> userspace context, that CPU's rcuc kthread need never run.
>
> Of course, if you tried the patch and it didn't help, that is another
> story. Hardware facts beat human theories, now as always.

Of course.

> > > But in the meantime, one advantage of the workaround is that in the
> > > common case, it would reduce the number of rcuc wakeups to zero, rather
> > > than to just one.
> > >
> > > Thoughts?
> >
> > I *think* if what I just wrote is correct, I will either have to raise
> > the priority of rcuc/ or make the thread that consumes 100% of the CPU
> > lose its RT priority. Then with the limited number of wakeups it should
> > be doable.
>
> You can: (1) Raise the rcuc kthread's priority, as you say, (2) Ensure
> that the CPU-bound task runs frequently (or even always) in nohz_full
> usermode context, or (3) #2 and also apply the patch, which would in
> addition prevent the wakeups.
>
> I think. After all, I could easily be missing something here.

Let me backport and see what happens in the end. Thank you.

> > PS: I do remember the RCU-task thread we had. I did have an idea but I
> > need to check if this is feasible first. So I did not forget, just slow…
>
> I must confess that I have been wondering about how much tracing goes
> on within real-time systems running in production...

I know little about this. However, it is used during testing of production
systems to see what is going on, because of its low overhead.

> Thanx, Paul

Sebastian
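
For reference, option (1) above could be tried from userspace roughly as
follows. This is only a sketch, not something taken from the thread: the
helper names rcuc_pids() and raise_rcuc_priority() are hypothetical, the
priority value in the usage comment is arbitrary, and it assumes the
kernel actually runs per-CPU rcuc/N kthreads and that the caller is
root. The rcutree.kthread_prio= boot parameter is the other usual way to
set this.

```python
import os


def rcuc_pids(proc_root="/proc"):
    """Return {pid: comm} for tasks whose name starts with 'rcuc/'.

    Hypothetical helper: it scans <proc_root>/<pid>/comm, which is where
    kernel threads such as rcuc/0 expose their names.
    """
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as 'self' or 'sys'
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # task exited while we were scanning
        if comm.startswith("rcuc/"):
            found[int(entry)] = comm
    return found


def raise_rcuc_priority(prio, proc_root="/proc"):
    """Switch every rcuc kthread to SCHED_FIFO at 'prio' (needs root)."""
    for pid, comm in sorted(rcuc_pids(proc_root).items()):
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(prio))
        print(f"{comm} (pid {pid}) -> SCHED_FIFO {prio}")


# Example (assumption: the CPU-bound RT task runs below priority 50):
# raise_rcuc_priority(50)
```

Whether this is enough depends on the rcuc priority ending up above that
of the CPU-bound task; options (2) and (3) avoid the need for rcuc to
run at all.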