Re: Observation on NOHZ_FULL

On Mon, Jan 29, 2024 at 05:53:50PM -0500, Joel Fernandes wrote:
> 
> 
> On 1/29/2024 5:43 PM, Frederic Weisbecker wrote:
> > > On Mon, Jan 29, 2024 at 05:20:23PM -0500, Joel Fernandes wrote:
> >>> If I am not missing something, NO_HZ_FULL will disable the timer tick if
> >>> there is only one task on the CPU, so that the running task benefits from
> >>> not being interrupted and thus gets more CPU time.
> >>
> >> Yes, that's right. I believe it is well known that HPC-type workloads benefit
> >> from NO_HZ_FULL; however, it has led me to want to try it out for constrained
> >> systems as well, where CPU cycles are at a premium, especially if the
> >> improvement is anything like what the report suggests (give or take the
> >> concerns/questions Paul raised).
> > 
> > I won't be able to suggest anything about that Bogomips calculation, but I
> > must add something about HPC.
> > 
> > I have long believed that HPC would benefit from nohz_full, but I have actually
> > never heard of any such user. The currently known users of nohz_full are
> > workloads that don't use the kernel once the application is launched and run
> > their own stack of, for example, networking, talking directly to the device
> > from userspace, using DPDK for instance. These use cases have extremely low
> > latency expectations (a single interrupt can make you lose).
> > 
> > HPC looks different to me, making use of syscalls and the kernel for I/O.
> > Nohz_full may remove timer IRQs, but it adds a performance penalty on kernel
> > entry, probably making it unsuitable there. But I might be wrong.
> > 
> 
> Thanks for the insights!
> 
> The kernel entry/exit overhead bit is an interesting point, as is tracking
> state for RCU observation purposes. I wonder how much the overhead is in
> practical cases and am curious to measure it (especially the trade-offs
> between that and the tick). Added a note to my list ;)
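
If you want a quick number for the entry/exit part, a minimal userspace sketch
along these lines should be enough: pin a raw-syscall loop to a nohz_full CPU,
then to a housekeeping CPU, and compare the per-call cost (the CPU argument and
iteration count below are arbitrary, purely for illustration):

/*
 * Rough sketch: time raw syscall entry/exit on the CPU given as argv[1].
 * Run it once on a nohz_full CPU and once on a housekeeping CPU and
 * compare the printed per-call cost.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(int argc, char **argv)
{
	int cpu = argc > 1 ? atoi(argv[1]) : 0;
	long iters = 10 * 1000 * 1000;
	struct timespec t0, t1;
	cpu_set_t set;

	/* Pin to the CPU under test so every syscall enters on that CPU. */
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set)) {
		perror("sched_setaffinity");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < iters; i++)
		syscall(SYS_getpid);	/* raw syscall, bypasses any libc caching */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
	printf("CPU %d: %.1f ns per getpid() syscall\n", cpu, ns / iters);
	return 0;
}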

And here is a summary of the added overhead with nohz_full:

* Use of RCU_NOCB
* Kernel entry/exit overhead (context_tracking/rcu/vtime)
* Unbound work reaffined (less evenly distributed across CPUs): workqueues, timers, kthreads...
* A remote tick run from CPU 0 at 1 Hz (one day I would love to remove that,
  but it requires too much scheduler knowledge)

The difference should be significantly measurable, even compared to RCU_NOCB
alone.
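
If in doubt about which CPUs are actually in the nohz_full set on a given boot,
a trivial check like this is enough (the sysfs files below should be present on
reasonably recent kernels built with CONFIG_NO_HZ_FULL; the program just prints
whatever the kernel reports and falls back gracefully if a file is missing):

/*
 * Print the nohz_full and isolated CPU lists as the kernel sees them,
 * plus the boot command line so the nohz_full=/rcu_nocbs= parameters
 * can be eyeballed.
 */
#include <stdio.h>

static void dump(const char *path)
{
	char buf[256];
	FILE *f = fopen(path, "r");

	if (!f) {
		printf("%s: <not available>\n", path);
		return;
	}
	if (fgets(buf, sizeof(buf), f))
		printf("%s: %s", path, buf);
	fclose(f);
}

int main(void)
{
	dump("/sys/devices/system/cpu/nohz_full");
	dump("/sys/devices/system/cpu/isolated");
	dump("/proc/cmdline");
	return 0;
}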

Thanks.

> 
> thanks,
> 
> - Joel
> 
> 
> 
> >>
> >> Thanks,
> >>
> >>  - Joel
> >>



