On 2/8/2024 9:51 AM, Uladzislau Rezki wrote:
> On Thu, Feb 08, 2024 at 01:53:58PM +0100, Uladzislau Rezki wrote:
>> On Thu, Feb 08, 2024 at 07:55:37AM +0100, Andrea Righi wrote:
>>> On Thu, Feb 08, 2024 at 12:54:58AM -0500, Joel Fernandes wrote:
>>> ...
>>>>>> Slightly related, but one of the things we are wondering also is how
>>>>>> much of the overhead for your nohz-full and lazy-RCU test (on top of
>>>>>> baseline - that is just CONFIG_HZ=1000 without nohz-full or nocbs) is
>>>>>> because of just using NOCB. Uladzislau mentioned he might run a test
>>>>>> for comparing along those lines as well.
>>>>>
>>>>> Just to clarify, "lazy rcu on" results are just with rcu_nocb=all and
>>>>> lazy RCUs enabled (and HZ=1000), so without nohz_full.
>>>>>
>>>>> If I enable only nohz_full=all (without rcu_nocb) I see something like
>>>>> this:
>>>>
>>>> Ok. I did want to mention nohz_full implies rcu_nocb on the same CPUs as well.
>>>>
>>>> It's also mentioned in the boot param docs on the last line of the description:
>>>>
>>>> nohz_full= [KNL,BOOT,SMP,ISOL]
>>>>         The argument is a cpu list, as described above.
>>>>         In kernels built with CONFIG_NO_HZ_FULL=y, set
>>>>         the specified list of CPUs whose tick will be stopped
>>>>         whenever possible. The boot CPU will be forced outside
>>>>         the range to maintain the timekeeping. Any CPUs
>>>>         in this list will have their RCU callbacks offloaded,
>>>>         just as if they had also been called out in the
>>>>         rcu_nocbs= boot parameter.
>>>
>>> Ah I didn't realize that, it definitely makes sense, thanks for
>>> clarifying it.
>>>
>>> Then basically in the results that I posted the difference is
>>> "nohz_full=all+rcu_nocb=all" vs "rcu_nocb=all+lazy_RCU=on".
>>>
>> So, you say that a hrtimer_interrupt() handler takes more time in case
>> of lazy + nocb + rcu_nocb=all and for nohz_full + rcu_nocb=all it is faster?
>> Could you please clarify this? I will try to measure from my side!
>>
>> I have done some basic research about hrtimer_interrupt() latency on my
>> HW with the latest Linux kernel. I have compared the cases below:
>>
>> case a: 1000HZ + lazy + nocb_all_cpus
>> case b: 1000HZ + nocb_all_cpus
>>
>> I used "ftrace" to measure time (in microseconds). Steps:
>>
>> echo 0 > tracing_on
>> echo function_graph > current_tracer
>> echo funcgraph-proc > trace_options
>> echo funcgraph-abstime > trace_options
>> echo hrtimer_interrupt > set_ftrace_filter
>>
>> fio --rw=write --bs=1M --size=1G --numjobs=8 --name=worker --time_based --runtime=50&
>>
>> echo 1 > tracing_on; sleep 10; echo 0 > tracing_on
>>
>> data is based on 10 seconds collection:
>>
>> <case a>
>> 6 2102 ############################################################
>> 8 2079 ############################################################
>> 10 1464 ##########################################
>> 7 897 ##########################

So first column is microseconds and second one is count?

>> 9 625 ##################
>> 12 490 ##############
>> 13 479 ##############
>> 11 289 #########
>> 5 249 ########
>> 14 124 ####
>> 15 72 ###
>> 16 41 ##
>> 17 24 #
>> 4 22 #
>> 18 12 #
>> 22 2 #
>> 19 1 #
>> <case a>
>>
>> <case b>
>> 9 1658 ############################################################
>> 13 1308 ################################################
>> 12 1224 #############################################

Assuming that, it does seem the "best" case is off by 3 microseconds (9 vs 6).
That still would not warrant being regarded as a bug and is possibly just in
the noise.

>> 10 972 ####################################
>> 8 703 ##########################
>> 14 595 ######################
>> 15 571 #####################
>> 11 525 ###################
>> 17 350 #############
>> 16 235 #########
>> 7 214 ########
>> 4 73 ###
>> 5 68 ###
>> 6 54 ##
>> 20 9 #
>> 18 9 #
>> 19 6 #
>> 33 1 #
>> 3 1 #
>> 28 1 #
>> 27 1 #
>> 25 1 #
>> 22 1 #
>> 21 1 #
>> <case b>
>>
>> I do not see the difference, there is noise of 1/2/3 microseconds diff.
>>
> Let me further have a look at what we use for lazy in terms of hrtimer though.

Thanks for tracing it. Yeah, it would be nice to count how many calls to
do_nocb_deferred_wakeup() the fio test triggers. If there are only a few, then
maybe the problem with hrtimer_interrupt() is something else.

 - Joel
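
For anyone who wants to compare the two quoted histograms numerically rather than by eye, here is a minimal Python sketch. It assumes the first column is the hrtimer_interrupt() duration in microseconds and the second is the number of samples, which is the interpretation asked about above and not confirmed in this message. The summarize() helper and the dictionary names are made up for illustration; the numbers are copied from the quoted data.

```python
from typing import Dict

# duration in microseconds -> number of samples, copied from the quoted
# histograms. Assumption: column 1 is microseconds, column 2 is a count.
CASE_A: Dict[int, int] = {  # 1000HZ + lazy + nocb_all_cpus
    6: 2102, 8: 2079, 10: 1464, 7: 897, 9: 625, 12: 490, 13: 479,
    11: 289, 5: 249, 14: 124, 15: 72, 16: 41, 17: 24, 4: 22,
    18: 12, 22: 2, 19: 1,
}
CASE_B: Dict[int, int] = {  # 1000HZ + nocb_all_cpus
    9: 1658, 13: 1308, 12: 1224, 10: 972, 8: 703, 14: 595, 15: 571,
    11: 525, 17: 350, 16: 235, 7: 214, 4: 73, 5: 68, 6: 54,
    20: 9, 18: 9, 19: 6, 33: 1, 3: 1, 28: 1, 27: 1, 25: 1,
    22: 1, 21: 1,
}


def summarize(hist: Dict[int, int]) -> Dict[str, float]:
    """Return sample count, weighted mean, median bin, mode bin and max bin."""
    total = sum(hist.values())
    mean = sum(us * count for us, count in hist.items()) / total
    mode = max(hist, key=lambda us: hist[us])  # most frequent duration

    # Median bin: smallest duration whose cumulative count reaches half
    # of all samples.
    running = 0
    median = max(hist)
    for us in sorted(hist):
        running += hist[us]
        if 2 * running >= total:
            median = us
            break

    return {
        "samples": total,
        "mean_us": mean,
        "median_us": median,
        "mode_us": mode,
        "max_us": max(hist),
    }


if __name__ == "__main__":
    for name, hist in (("case a", CASE_A), ("case b", CASE_B)):
        stats = summarize(hist)
        print(name, {k: round(v, 2) for k, v in stats.items()})
```

The "mode_us" value is the most frequent bin, which is what the "best" case comparison (9 vs 6) above refers to. The sketch only summarizes the numbers as posted and says nothing about where a difference between the two configurations would come from.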