Re: Observation on NOHZ_FULL

Uladzislau Rezki <urezki@xxxxxxxxx> · Thu, 8 Feb 2024 15:51:06 +0100



On Thu, Feb 08, 2024 at 01:53:58PM +0100, Uladzislau Rezki wrote:
> On Thu, Feb 08, 2024 at 07:55:37AM +0100, Andrea Righi wrote:
> > On Thu, Feb 08, 2024 at 12:54:58AM -0500, Joel Fernandes wrote:
> > ...
> > > >> Slightly related, but one of the things we are wondering also is how
> > > >> much of the overhead for your nohz-full and lazy-RCU test (on top of
> > > >> baseline - that is just CONFIG_HZ=1000 without nohz-full or nocbs) is
> > > >> because of just using NOCB. Uladzislau mentioned he might run a test
> > > >> for comparing along those lines as well.
> > > > 
> > > > Just to clarify, "lazy rcu on" results are just with rcu_nocb=all and
> > > > lazy RCUs enabled (and HZ=1000), so without nohz_full.
> > > > 
> > > > If I enable only nohz_full=all (without rcu_nocb) I see something like
> > > > this:
> > > 
> > > Ok. I did want to mention nohz_full implies rcu_nocb on the same CPUs as well.
> > > 
> > > It's also mentioned in the boot param docs on the last line of the description:
> > > 
> > >         nohz_full=      [KNL,BOOT,SMP,ISOL]
> > >                         The argument is a cpu list, as described above.
> > >                         In kernels built with CONFIG_NO_HZ_FULL=y, set
> > >                         the specified list of CPUs whose tick will be stopped
> > >                         whenever possible. The boot CPU will be forced outside
> > >                         the range to maintain the timekeeping.  Any CPUs
> > >                         in this list will have their RCU callbacks offloaded,
> > >                         just as if they had also been called out in the
> > >                         rcu_nocbs= boot parameter.
> > 
> > Ah I didn't realize that, it definitely makes sense, thanks for
> > clarifying it.
> > 
> > Then basically in the results that I posted the difference is
> > "nohz_full=all+rcu_nocb=all" vs "rcu_nocb=all+lazy_RCU=on".
> > 
> So, you are saying that the hrtimer_interrupt() handler takes more time in
> the lazy + nocb + rcu_nocb=all case and that it is faster with
> nohz_full + rcu_nocb=all? Could you please clarify this? I will try to
> measure from my side!
> 
> I have done some basic research about hrtimer_interrupt() latency on my
> HW with the latest Linux kernel. I compared the cases below:
> 
> case a: 1000HZ + lazy + nocb_all_cpus
> case b: 1000HZ + nocb_all_cpus
> 
> I used "ftrace" to measure time (in microseconds). Steps:
> 
> echo 0 > tracing_on
> echo function_graph > current_tracer
> echo funcgraph-proc > trace_options
> echo funcgraph-abstime > trace_options
> echo hrtimer_interrupt > set_ftrace_filter
> 
> fio --rw=write --bs=1M --size=1G --numjobs=8 --name=worker --time_based --runtime=50&
> 
> echo 1 > tracing_on; sleep 10; echo 0 > tracing_on
> 
> The data is based on a 10-second collection (columns: duration in
> microseconds, number of occurrences):
> 
> <case a>
>      6  2102 ############################################################
>      8  2079 ############################################################
>     10  1464 ##########################################
>      7   897 ##########################
>      9   625 ##################
>     12   490 ##############
>     13   479 ##############
>     11   289 #########
>      5   249 ########
>     14   124 ####
>     15    72 ###
>     16    41 ##
>     17    24 #
>      4    22 #
>     18    12 #
>     22     2 #
>     19     1 #
> <case a>
> 
> <case b>
>      9  1658 ############################################################
>     13  1308 ################################################
>     12  1224 #############################################
>     10   972 ####################################
>      8   703 ##########################
>     14   595 ######################
>     15   571 #####################
>     11   525 ###################
>     17   350 #############
>     16   235 #########
>      7   214 ########
>      4    73 ###
>      5    68 ###
>      6    54 ##
>     20     9 #
>     18     9 #
>     19     6 #
>     33     1 #
>      3     1 #
>     28     1 #
>     27     1 #
>     25     1 #
>     22     1 #
>     21     1 #
> <case b>
> 
> I do not see a difference; there is only noise of 1-3 microseconds.
> 
Let me have a further look at what we use for lazy in terms of hrtimers, though.
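
For anyone who wants to repeat the measurement quoted above, here is a
minimal sketch of how the raw function_graph output could be turned into
a per-microsecond histogram. It is not the script that produced the
numbers in this thread. The assumed line format (shown in the comment),
the truncation to whole microseconds, and the helper names
parse_durations() and histogram() are illustrative assumptions.

```python
import math
import re
import sys
from collections import Counter

# Assumed line shape with funcgraph-abstime and funcgraph-proc enabled:
#   100.000001 |   3)  fio-1234  |   6.123 us    |  hrtimer_interrupt();
# A marker such as "+" or "!" may precede the duration; the regex skips it.
DURATION_RE = re.compile(
    r"([0-9]+(?:\.[0-9]+)?)\s+us\s+\|\s+hrtimer_interrupt\(\)"
)


def parse_durations(lines):
    """Yield hrtimer_interrupt() durations truncated to whole microseconds."""
    for line in lines:
        match = DURATION_RE.search(line)
        if match:
            yield int(float(match.group(1)))


def histogram(durations, width=60):
    """Return (usec, count, bar) rows, most frequent duration first."""
    counts = Counter(durations)
    if not counts:
        return []
    peak = max(counts.values())
    rows = []
    for usec, count in counts.most_common():
        # Round up so that every non-empty bucket gets at least one mark.
        bar = "#" * math.ceil(count * width / peak)
        rows.append((usec, count, bar))
    return rows


def main(path):
    with open(path) as trace:
        for usec, count, bar in histogram(parse_durations(trace)):
            print(f"{usec:6d} {count:5d} {bar}")


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

The input would be a copy of the "trace" file saved after tracing_on is
set back to 0, passed as the only argument to the script.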

--
Uladzislau Rezki