On 7/14/2022 4:51 PM, Paul E. McKenney wrote:
> On Wed, Jul 13, 2022 at 09:32:32PM +0000, Joel Fernandes (Google) wrote:
>> Hello!
>>
>> Please find the next improved version of call_rcu_lazy() attached. The main
>> difference between the previous versions is that:
>> - In v2 rcu_barrier is fixed to not hang (I found this to be due to a missing
>>   GP thread wakeup), now I am limiting this wake up only to rcu_barrier() as
>>   requested by Paul.
>> - Fixed checkpatch and build robot issues.
>> - Some more changes to 'lazy' parameter passing and consolidation of segcblist
>>   functions.
>> - more testing via rcutorture and rcuscale.
>
> Thank you! What I am going to do is to pull these into an experimental
> not-for-mainline branch and run the usual set of rcutorture tests.
> I will then take a look at the patches.

Thanks, that sounds great.

>
>> Note that these tests were run on v2 patches, I am expecting similar power
>> improvements however I've not yet tested power.
>>
>> Following are power savings we saw on top of RCU_NOCB_CPU on an Intel platform
>> in v2. The observation is that due to a 'trickle down' effect of RCU
>> callbacks, the system is very lightly loaded but constantly running few RCU
>> callbacks very often. This confuses the power management hardware that the
>> system is active, when it is in fact idle.
>>
>> For example, when ChromeOS screen is off and user is not doing anything on the
>> system, we can see big power savings.
>> Before:
>> Pk%pc10 = 72.13
>> PkgWatt = 0.58
>> CorWatt = 0.04
>>
>> After:
>> Pk%pc10 = 81.28
>> PkgWatt = 0.41
>> CorWatt = 0.03
>
> When you update these numbers, please explain what they all are and
> evaluate them in the cover letter (or in the relevant patch's commit log).
> For final submission, please also include some estimate of the variance.
> For example, CorWatt might be essentially the same both before and after,
> as in 0.035 and 0.034, or there might be a large difference, as in 0.044
> and 0.025. The 81.28 might be constant in all four digits (ha!), or it
> might vary between (say) 80 and 83. And so on.

Sure thanks for the suggestions and will do.

>
> Based on our earlier emails, my guess is that Pk%pc10 is the percent of
> time that the system is in a low-power state (bigger is better), PkgWatt
> is power consumed by the CPU chip (smaller is better), and CorWatt is
> power consumed by the CPU core (again, smaller is better).

Yes that's correct.

>> Further, when ChromeOS screen is ON but system is idle or lightly loaded, we
>> can see that the display pipeline is constantly doing RCU callback queuing due
>> to open/close of file descriptors associated with graphics buffers. This is
>> attributed to the file_free_rcu() path which this patch series also touches.
>>
>> On memory pressure, timeout or queue growing too big, we initiate a flush of of
>> the bypass lists holding the lazy CBs.
>>
>> Similar results can be achieved by increasing jiffies_till_first_fqs, however
>> that also has the effect of slowing down RCU. Especially I saw huge slow down
>
> In the final submission, please quantify "huge slow down". ;-)

Sure will do. IIRC it was something like 30 second to stop function graph
tracer versus the usual 2-3 seconds.

>> of function graph tracer when increasing that. That may be possible to fix via
>> rcu_expedited=1 boot parameter, however call_rcu_lazy() provides another option
>> over slowing down ALL call_rcu() globally. Further using jiffies_till_first_fqs
>> approach will still cause a wake up of the main RCU GP kthread, with this work
>> we delay even those wakeups.
>>
>> One drawback of this series is, if another frequent RCU callback creeps up in
>> the future, that's not lazy, then that will again hurt the power. However, I
>> believe identifying and fixing those is a more reasonable approach than slowing
>> RCU down for the whole system.
>
> Like I said earlier, you are the official call_rcu_lazy() whack-a-mole
> developer. ;-)

Haha..

Thanks,

 - Joel
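
One possible way to produce the variance estimate requested above is to
repeat each measurement several times and report the mean together with the
sample standard deviation and the observed range. The sketch below is only
an illustration: the helper name summarize() is hypothetical, and the
readings in the usage section are placeholders built from the hypothetical
80-83 range mentioned in the quoted text, not measurements.

```python
import statistics


def summarize(samples):
    """Summarize repeated readings of one metric (e.g. Pk%pc10 or PkgWatt).

    Returns a dict with the mean, sample standard deviation, minimum and
    maximum. The name and return shape are assumptions for illustration.
    """
    if not samples:
        raise ValueError("need at least one sample")
    return {
        "mean": statistics.mean(samples),
        # statistics.stdev() needs two or more samples; use 0.0 for one.
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "min": min(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Placeholder readings only, NOT measured data.
    pc10_after = [80.0, 81.28, 83.0]
    s = summarize(pc10_after)
    print(f"Pk%pc10: mean={s['mean']:.2f} stdev={s['stdev']:.2f} "
          f"range=[{s['min']:.2f}, {s['max']:.2f}]")
```

Reporting the range next to the standard deviation makes it easy to see
whether the before and after runs overlap at all.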