On 2/6/2024 12:51 PM, Andrea Righi wrote:
> - stress-ng --matrix seems quite unpredictable to be used a benchmarks
>   in this scenario (the bogo-ops/s are very susceptible to any kind of
>   interference, so even if in the long runs NO_HZ_FULL still seems to
>   provide some benefits looking at the average, we also need to
>   consider that there might be a significant error in the measurements,
>   standard deviation was pretty high)
>

Ack on the bogo-ops disclaimers, which are also mentioned in the
stress-ng docs. Agreed that a better metric for perf would be helpful.

I am assuming you also have RCU_NOCB enabled for this test?

> - fio doing short writes (in page cache) seems to perform like 2x
>   better in terms of iops with nohz_full, respect to the other cases
>   and it performs 2x slower with large IO writes (not sure why... need
>   to investigate more)

This is interesting. It could be worth counting how many kernel
entries/exits occur for large IO vs small IO. I'd imagine that for large
IO we have fewer syscalls and hence lower entry/exit overhead. But if
there are more interrupts for whatever reason with large IO, then that
also implies more kernel entries/exits. As Frederic was saying,
NOHZ_FULL has higher overhead on kernel entry/exit.

>
> - with lazy RCU enabled hrtimer_interrupt() takes like 2x more to
>   return, respect to the other cases (is this expected?)

It depends on which hrtimer_interrupt() instance. There must be several
in the trace due to unrelated timers. Are you saying this is the worst
case, or is it always 2x more?

We do queue a timer for Lazy RCU to flush the RCU work, but it is set to
10 seconds and should be canceled most of the time (it's just a guard
rail). It is possible that there is lock contention on ->nocb_gp_lock
which is causing the timer handler execution to be slow. We have several
trace_rcu_nocb* trace points, including for the timer. Perhaps you could
enable those and we dig deeper?
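
One possible way to switch those events on is through tracefs. Below is
a minimal Python sketch, assuming tracefs is mounted at
/sys/kernel/tracing, that the events sit under events/rcu/, and that it
runs as root. The helper name enable_rcu_nocb_events() and the
"rcu_nocb*" glob are only illustrative; which events actually match
depends on the kernel, so check the events/rcu/ directory first.

```python
import glob
import os

# Assumption: default tracefs mount point; older setups may use
# /sys/kernel/debug/tracing instead.
TRACEFS = "/sys/kernel/tracing"


def enable_rcu_nocb_events(tracefs=TRACEFS, pattern="rcu_nocb*"):
    """Write "1" to the enable file of every matching rcu trace event.

    Returns the sorted list of event names that were enabled.
    The function name and default pattern are hypothetical.
    """
    enabled = []
    event_glob = os.path.join(tracefs, "events", "rcu", pattern, "enable")
    for enable_file in sorted(glob.glob(event_glob)):
        with open(enable_file, "w") as f:
            f.write("1")
        # The event name is the directory that holds the enable file.
        enabled.append(os.path.basename(os.path.dirname(enable_file)))
    return enabled
```

After calling enable_rcu_nocb_events() as root, the events should show
up in the trace file (or trace_pipe) under the same tracefs mount while
the workload runs.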

Further, it would be interesting to see, via say function tracing,
whether the slowdown is limited to the hrtimer_interrupt() instances
that also result in a call to do_nocb_deferred_wakeup_timer(). That
would confirm that it is the lazy timer that is slow for you.

The actual number of callbacks should not specifically cause
hrtimer_interrupt() to take too long to run, AFAICS. But RCU's lazy
feature does increase the number of timer interrupts.

Further still, whether this is a problem depends on how long
hrtimer_interrupt() actually takes with lazy RCU, IMO. Some numbers with
units would be nice.

thanks,

 - Joel
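
As an illustration of the kind of numbers meant above, here is a small
Python sketch that summarizes hrtimer_interrupt() durations in
microseconds and compares two configurations. It assumes the durations
have already been extracted from the trace by other means. The function
names are hypothetical, and no values here are measurements.

```python
import statistics


def summarize_durations_us(durations_us):
    """Return count, mean, sample stddev and max, all in microseconds."""
    if not durations_us:
        raise ValueError("need at least one duration")
    return {
        "count": len(durations_us),
        "mean_us": statistics.fmean(durations_us),
        # Sample standard deviation needs at least two data points.
        "stdev_us": statistics.stdev(durations_us) if len(durations_us) > 1 else 0.0,
        "max_us": max(durations_us),
    }


def slowdown(baseline, lazy):
    """Ratio of lazy-RCU figures to baseline figures (mean and worst case)."""
    return {
        "mean_ratio": lazy["mean_us"] / baseline["mean_us"],
        "max_ratio": lazy["max_us"] / baseline["max_us"],
    }
```

Reporting both the mean ratio and the worst-case ratio would help tell
apart "always 2x" from "2x only in the worst case".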