On Wed, May 11, 2022 at 11:17:59PM -0400, Joel Fernandes wrote:
> On Wed, May 11, 2022 at 11:04 PM Joel Fernandes (Google)
> <joel@xxxxxxxxxxxxxxxxx> wrote:
> >
> > Hello!
> > Please find the proof of concept version of call_rcu_lazy() attached. This
> > gives a lot of savings when the CPUs are relatively idle. Huge thanks to
> > Rushikesh Kadam from Intel for investigating it with me.
> >
> > Some numbers below:
> >
> > Following are power savings we see on top of RCU_NOCB_CPU on an Intel platform.
> > The observation is that due to a 'trickle down' effect of RCU callbacks, the
> > system is very lightly loaded but constantly running few RCU callbacks very
> > often. This confuses the power management hardware that the system is active,
> > when it is in fact idle.
> >
> > For example, when ChromeOS screen is off and user is not doing anything on the
> > system, we can see big power savings.
> > Before:
> > Pk%pc10 = 72.13
> > PkgWatt = 0.58
> > CorWatt = 0.04
> >
> > After:
> > Pk%pc10 = 81.28
> > PkgWatt = 0.41
> > CorWatt = 0.03
> >
> > Further, when ChromeOS screen is ON but system is idle or lightly loaded, we
> > can see that the display pipeline is constantly doing RCU callback queuing due
> > to open/close of file descriptors associated with graphics buffers. This is
> > attributed to the file_free_rcu() path which this patch series also touches.
> >
> > This patch series adds a simple but effective, and lockless implementation of
> > RCU callback batching. On memory pressure, timeout or queue growing too big, we
> > initiate a flush of one or more per-CPU lists.
> >
> > Similar results can be achieved by increasing jiffies_till_first_fqs, however
> > that also has the effect of slowing down RCU. Especially I saw huge slow down
> > of function graph tracer when increasing that.
> >
> > One drawback of this series is, if another frequent RCU callback creeps up in
> > the future, that's not lazy, then that will again hurt the power. However, I
> > believe identifying and fixing those is a more reasonable approach than slowing
> > RCU down for the whole system.
> >
> > NOTE: Add debug patch is added in the series toggle /proc/sys/kernel/rcu_lazy
> > at runtime to turn it on or off globally. It is default to on. Further, please
> > use the sysctls in lazy.c for further tuning of parameters that effect the
> > flushing.
> >
> > Disclaimer 1: Don't boot your personal system on it yet anticipating power
> > savings, as TREE07 still causes RCU stalls and I am looking more into that, but
> > I believe this series should be good for general testing.

Sometimes OOM conditions result in stalls.

> > Disclaimer 2: I have intentionally not CC'd other subsystem maintainers (like
> > net, fs) to keep noise low and will CC them in the future after 1 or 2 rounds
> > of review and agreements.

We will of course need them to look at the call_rcu_lazy() conversions
at some point, but in the meantime, experimentation is fine. I looked
at a few, but quickly decided to defer to the people with a better
understanding of the code.

> I did forget to add Disclaimer 3, that this breaks rcu_barrier() and
> support for that definitely needs work.

Good to know. ;-)

With this in place, can the system survive a userspace close(open())
loop, or does that result in OOM? (I am not worried about battery
lifetime while close(open()) is running, just OOM resistance.)

Does waiting for the shrinker to kick in suffice, or should the
system pressure be taken into account? As in the "total" numbers
from /proc/pressure/memory.

Again, it is very good to see this series!

							Thanx, Paul
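
For anyone wanting to try the close(open()) experiment mentioned above, the
following is a minimal userspace sketch, not part of the series. It assumes a
kernel with PSI enabled, so that /proc/pressure/memory exists and has "some"
and "full" lines ending in a total= field (cumulative stall time in
microseconds). The helper names churn_fds(), psi_parse_total() and
psi_memory_total() are made up for illustration, and the loop is bounded so
that it cannot run away.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: run up to `iterations` open()/close() cycles on
 * `path`. Each close() of the last reference queues an RCU callback in the
 * kernel (the file_free_rcu() path). Returns the number of cycles that
 * succeeded; stops early if open() fails.
 */
long churn_fds(const char *path, long iterations)
{
	long ok = 0;

	for (long i = 0; i < iterations; i++) {
		int fd = open(path, O_RDONLY);

		if (fd < 0)
			break;
		close(fd);
		ok++;
	}
	return ok;
}

/*
 * Hypothetical helper: extract the "total=" field from one PSI line, e.g.
 * "some avg10=0.00 avg60=0.00 avg300=0.00 total=12345".
 * Returns 0 on success, -1 if the field is missing or malformed.
 */
int psi_parse_total(const char *line, unsigned long long *total)
{
	const char *p = strstr(line, "total=");

	if (!p)
		return -1;
	return sscanf(p, "total=%llu", total) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: read the total for `kind` ("some" or "full") from
 * /proc/pressure/memory. Returns -1 if PSI is unavailable or the line is
 * not found.
 */
int psi_memory_total(const char *kind, unsigned long long *total)
{
	char line[256];
	size_t klen = strlen(kind);
	FILE *f = fopen("/proc/pressure/memory", "r");
	int ret = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, kind, klen) == 0 && line[klen] == ' ') {
			ret = psi_parse_total(line, total);
			break;
		}
	}
	fclose(f);
	return ret;
}
```

One possible use is to sample psi_memory_total("some", ...) before and after
a long churn_fds("/dev/null", N) run and compare the difference with and
without lazy callbacks enabled. Whether that difference is a good enough
signal for triggering a flush is exactly the open question above.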