Re: Observation on NOHZ_FULL

On Wed, Feb 7, 2024 at 11:31 AM Andrea Righi <andrea.righi@xxxxxxxxxxxxx> wrote:

> > The actual number of callbacks should not be causing specifically the
> > hrtimer_interrupt() to take too long to run, AFAICS. But RCU's lazy feature does
> > increase the number of timer interrupts.
> >
> > Further still, it depends on how much hrtimer_interrupt() takes with lazy RCU to
> > call it a problem IMO. Some numbers with units will be nice.
>
> This is what I see (this is a single run, but the other runs are
> similar), unit is nanosec, with lazy RCU enabled hrtimer_interrupt()
> takes around 4K-16K ns, with lazy RCU off most of the times it takes
> 2K-4K ns:
>
>  - lazy rcu off:
>
> [1K, 2K)         88307 |@@@@@@@@@@@@                                            |
> [2K, 4K)        380695 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    |
> [4K, 8K)           194 |                                                        |
>
>  - lazy rcu on:
>
> [2K, 4K)          3094 |                                                    |
> [4K, 8K)        265763 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [8K, 16K)       182341 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                 |
> [16K, 32K)        3422 |                                                    |
>
> Again, I'm not sure if this is really a problem or not, or if it is even
> a relevant metric for the overall performance, I was just curious to
> understand why it is different.

This is an interesting find. The number of timer interrupt executions
looks roughly the same in both histograms, so it might not be missed
cancellations or the like, and the cause is not clear to me yet. But it
is worth debugging and we'll try to reproduce your results.
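
As a quick sanity check on the "roughly the same" reading, the bucket
counts quoted above can be summed per configuration. A minimal Python
sketch follows; the variable names are made up for illustration, and
only the counts come from the histograms (a single run, so treat the
result as indicative only):

```python
# Bucket counts copied from the two bpftrace histograms quoted above.
# Bucket boundaries are hrtimer_interrupt() durations in nanoseconds.
lazy_off = {"[1K, 2K)": 88307, "[2K, 4K)": 380695, "[4K, 8K)": 194}
lazy_on = {
    "[2K, 4K)": 3094,
    "[4K, 8K)": 265763,
    "[8K, 16K)": 182341,
    "[16K, 32K)": 3422,
}

# Total number of hrtimer_interrupt() executions seen in each run.
total_off = sum(lazy_off.values())
total_on = sum(lazy_on.values())

# Relative difference in call count, using the lazy-off run as reference.
rel_diff = (total_off - total_on) / total_off

print(f"lazy off: {total_off} calls")
print(f"lazy on:  {total_on} calls")
print(f"relative difference: {rel_diff:.1%}")
```

That gives 469196 calls with lazy RCU off and 454620 with it on, about
3% apart, while the per-call duration moves up by roughly one bucket.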

Some more theories from our internal RCU discussion:
- Could it be another user of RCU (call_rcu) from an unrelated hrtimer
interrupt callback that is causing a "flush" of lazy callbacks?
- What does the distribution look like for
do_nocb_deferred_wakeup_timer()? It will probably have to be made
non-static for bpftrace to pick it up. (If you could try that real
quick, it would be appreciated!)

Slightly related: one thing we are also wondering is how much of the
overhead in your nohz-full and lazy-RCU test (on top of the baseline,
that is, just CONFIG_HZ=1000 without nohz-full or nocbs) comes from
simply using NOCB. Uladzislau mentioned he might run a test comparing
along those lines as well.

Frederic, looking forward to any thoughts from you as well on this
behavior. Thanks,

 - Joel




