Re: Observation on NOHZ_FULL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Feb 11, 2024 at 07:22:55PM -0500, Joel Fernandes wrote:
> 
> 
> On 2/8/2024 9:51 AM, Uladzislau Rezki wrote:
> > On Thu, Feb 08, 2024 at 01:53:58PM +0100, Uladzislau Rezki wrote:
> >> On Thu, Feb 08, 2024 at 07:55:37AM +0100, Andrea Righi wrote:
> >>> On Thu, Feb 08, 2024 at 12:54:58AM -0500, Joel Fernandes wrote:
> >>> ...
> >>>>>> Slightly related, but one of the things we are wondering also is how
> >>>>>> much of the overhead for your nohz-full and lazy-RCU test (on top of
> >>>>>> baseline - that is just CONFIG_HZ=1000 without nohz-full or nocbs) is
> >>>>>> because of just using NOCB. Uladsizlau mentioned he might run a test
> >>>>>> for comparing along those lines as well.
> >>>>>
> >>>>> Just to clarify, "lazy rcu on" results are just with rcu_nocb=all and
> >>>>> lazy RCUs enabled (and HZ=1000), so without nohz_full.
> >>>>>
> >>>>> If I enable only nohz_full=all (without rcu_nocb) I see something like
> >>>>> this:
> >>>>
> >>>> Ok. I did want to mention nohz_full implies rcu_nocb on the same CPUs as well.
> >>>>
> >>>> Its also mentioned in the boot param docs on the last line of the description:
> >>>>
> >>>>         nohz_full=      [KNL,BOOT,SMP,ISOL]
> >>>>                         The argument is a cpu list, as described above.
> >>>>                         In kernels built with CONFIG_NO_HZ_FULL=y, set
> >>>>                         the specified list of CPUs whose tick will be stopped
> >>>>                         whenever possible. The boot CPU will be forced outside
> >>>>                         the range to maintain the timekeeping.  Any CPUs
> >>>>                         in this list will have their RCU callbacks offloaded,
> >>>>                         just as if they had also been called out in the
> >>>>                         rcu_nocbs= boot parameter.
> >>>
> >>> Ah I didn't realize that, it definitely makes sense, thanks for
> >>> clarifying it.
> >>>
> >>> Then basically in the results that I posted the difference is
> >>> "nohz_full=all+rcu_nocb=all" vs "rcu_nocb=all+lazy_RCU=on".
> >>>
> >> So, you say that a hrtimer_interrupt() handler takes more time in case
> >> of lazy + nocb + rcu_nocb=all and for nohz_full + rcu_nocb=all it faster?
> >> Could you please clarify this? I will try to measure from my side!
> >>
> >> I have done some basic research about hrtimer_interrupt() latency on my
> >> HW with latest Linux kernel. I have compared below cases:
> >>
> >> case a: 1000HZ + lazy + nocb_all_cpus
> >> case b: 1000HZ + nocb_all_cpus
> >>
> >> I used "ftrace" to measure time(in microseconds). Steps:
> >>
> >> echo 0 > tracing_on
> >> echo function_graph > current_tracer
> >> echo funcgraph-proc > trace_options
> >> echo funcgraph-abstime > trace_options
> >> echo hrtimer_interrupt > set_ftrace_filter
> >>
> >> fio --rw=write --bs=1M --size=1G --numjobs=8 --name=worker --time_based --runtime=50&
> >>
> >> echo 1 > tracing_on; sleep 10; echo 0 > tracing_on
> >>
> >> data is based on 10 seconds collection:
> >>
> >> <case a>
> >>      6  2102 ############################################################
> >>      8  2079 ############################################################
> >>     10  1464 ##########################################
> >>      7   897 ##########################
> 
> So first column is microseconds and second one is count?
> 
> >>      9   625 ##################
> >>     12   490 ##############
> >>     13   479 ##############
> >>     11   289 #########
> >>      5   249 ########
> >>     14   124 ####
> >>     15    72 ###
> >>     16    41 ##
> >>     17    24 #
> >>      4    22 #
> >>     18    12 #
> >>     22     2 #
> >>     19     1 #
> >> <case a>
> >>
> >> <case b>
> >>      9  1658 ############################################################
> >>     13  1308 ################################################
> >>     12  1224 #############################################
> 
> Assuming that, it does seem the "best" case is off by 3 microseconds (9 vs 6),
> still would not warrant being regarded a bug and possibly just in the noise.
> 
> >>     10   972 ####################################
> >>      8   703 ##########################
> >>     14   595 ######################
> >>     15   571 #####################
> >>     11   525 ###################
> >>     17   350 #############
> >>     16   235 #########
> >>      7   214 ########
> >>      4    73 ###
> >>      5    68 ###
> >>      6    54 ##
> >>     20     9 #
> >>     18     9 #
> >>     19     6 #
> >>     33     1 #
> >>      3     1 #
> >>     28     1 #
> >>     27     1 #
> >>     25     1 #
> >>     22     1 #
> >>     21     1 #
> >> <case b>
> >>
> >> I do not see the difference, there is a nose of 1/2/3 microseconds diff.
> >>
> > Let me further have a look at what we use for lazy in terms on hrtimer though.
> 
> Thanks for tracing it. Yeah it would be nice to count how many counts of
> do_nocb_deferred_wakeup() does the fio test trigger. If it is few, then maybe
> the problem with hrtimer_interrupt() is something else.
> 

urezki@pc638:~$ fio --rw=write --bs=1M --size=1G --numjobs=8 --name=worker --time_based --runtime=1000 > /dev/null 2>&1&
[1] 697
urezki@pc638:~$ sudo cat /sys/kernel/debug/tracing/trace_pipe  | grep Timer
          <idle>-0       [005] ..s1.   175.762689: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.   186.002386: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.   186.002391: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.   228.497061: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [001] ..s1.   242.831956: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-705     [004] ..s..   272.009997: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.   282.253447: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   347.786758: rcu_nocb_wake: rcu_preempt 4 Timer
            cron-714     [004] ..s1.   362.429287: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] .Ns1.   377.993814: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.   393.353371: rcu_nocb_wake: rcu_preempt 0 Timer
    kworker/6:1H-172     [006] ..s1.   410.760933: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   427.148456: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   438.220173: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] .Ns1.   451.719831: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   472.199389: rcu_nocb_wake: rcu_preempt 4 Timer
           <...>-717     [000] ..s..   482.434074: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   484.969012: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   495.750151: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-702     [000] ..s1.   516.439595: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   578.212713: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   595.583863: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   597.634693: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.   617.598664: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   655.996890: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   660.606691: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   680.059867: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   688.227533: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   698.491126: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-703     [000] ..s1.   705.925823: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   721.018256: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   732.793797: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   747.129268: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   766.075583: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-701     [004] ..s2.   767.047547: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   782.456032: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   797.303567: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   813.686996: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.   828.534430: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-700     [004] ..s2.   850.536698: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   860.789367: rcu_nocb_wake: rcu_preempt 4 Timer
     ksoftirqd/6-63      [006] ..s..   888.948515: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-699     [000] ..s..   905.854912: rcu_nocb_wake: rcu_preempt 0 Timer
    kworker/6:1H-172     [006] ..s1.   911.987781: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.   924.275337: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   938.208899: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/0:1H-100     [000] ..s1.   938.236053: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   948.338615: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-702     [000] ..s2.   958.216274: rcu_nocb_wake: rcu_preempt 0 Timer
           <...>-728     [004] ..s..   962.432149: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.   978.545645: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [003] ..s1.  1002.608935: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1002.614904: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1012.848592: rcu_nocb_wake: rcu_preempt 0 Timer
    kworker/6:1H-172     [006] ..s1.  1021.552408: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.  1042.543682: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [001] ..s1.  1068.654892: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [002] ..s1.  1080.430541: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [000] ..s1.  1098.349992: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [002] ..s1.  1115.757464: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.  1170.539822: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1178.191592: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [000] ..s1.  1188.459281: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1196.139077: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1216.762430: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1238.374788: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.  1241.194246: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [003] ..s1.  1251.433991: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1261.673634: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1261.673637: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [007] ..s1.  1271.913347: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.  1308.263711: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1308.263738: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.  1308.270707: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1326.183705: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1326.183709: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.  1326.193714: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.  1336.423403: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [003] ..s1.  1336.423404: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [000] ..s1.  1368.678480: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [007] ..s1.  1398.373573: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.  1428.580662: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1428.580666: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [007] ..s1.  1468.515358: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1488.482412: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [000] ..s1.  1548.385082: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.  1548.385083: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [003] ..s1.  1572.448367: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1572.448367: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.  1572.453972: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.  1582.688060: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1582.688062: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1592.927895: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.  1608.287294: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1608.799312: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [002] ..s1.  1621.086909: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.  1631.326609: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [003] .Ns1.  1653.854052: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [001] ..s1.  1668.189531: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [007] ..s1.  1678.429219: rcu_nocb_wake: rcu_preempt 4 Timer
<snip>

1000 seconds runtime of "fio" test triggers the do_nocb_deferred_wakeup_timer()
around 100 times. Most of them are handled from the soft-irq context. I have
looked at the perf figures vs hrtimer_interrupt(). I do not see any obvious
performance issues with it.

8xCPUs KVM system;
1000HZ timer;
lazy + all offloading.

--
Uladzislau Rezki




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux