Re: unexpected result with rcu_nocbs option

On Thu, 2024-08-01 at 10:48 -0700, Paul E. McKenney wrote:
> > 
> > yes I do.
> > 
> > $ ps -eo pid,cpuid,comm | grep rcu
> >       4     0 kworker/R-rcu_gp
> >       8     0 kworker/0:0-rcu_gp
> >      14     0 rcu_tasks_rude_kthread
> >      15     0 rcu_tasks_trace_kthread
> >      17     3 rcu_sched
> >      18     3 rcuog/0
> >      19     0 rcuos/0
> >      20     0 rcu_exp_par_gp_kthread_worker/0
> >      21     3 rcu_exp_gp_kthread_worker
> >      31     3 rcuos/1
> >      38     3 rcuog/2
> >      39     3 rcuos/2
> >      46     0 rcuos/3
> 
> This looks like you had either nohz_full=0-3 or rcu_nocbs=0-3, given
> that you have rcuos kthreads for all four of your CPUs.  Or perhaps
> some other setting that implied one or the other of these.

the exact setting is:
isolcpus=0,1,2 nohz_full=1,2 rcu_nocbs=1,2 rcutree.rcu_nocb_gp_stride=4
irqaffinity=3

maybe you can quickly confirm this, but from reading rcu/tree_nocb.h I
have been under the impression that nohz_full=1,2 implies
rcu_nocbs=1,2, so I could remove the rcu_nocbs parameter and it would
not change anything. (are there other kthread implications coming
along with isolcpus?)
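
for what it is worth, here is roughly the logic I see in
rcu_init_nohz() in kernel/rcu/tree_nocb.h in a recent tree (simplified
by me, so take it with a grain of salt): the nohz_full mask gets OR-ed
into rcu_nocb_mask, which matches my impression that nohz_full=1,2
implies rcu_nocbs=1,2:

void __init rcu_init_nohz(void)
{
	const struct cpumask *cpumask = NULL;

#if defined(CONFIG_NO_HZ_FULL)
	/* nohz_full CPUs are picked up here... */
	if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask))
		cpumask = tick_nohz_full_mask;
#endif

	if (cpumask) {
		/* allocation of rcu_nocb_mask elided */
		/* ...and merged into the mask built from rcu_nocbs= */
		cpumask_or(rcu_nocb_mask, rcu_nocb_mask, cpumask);
		rcu_state.nocb_is_setup = true;
	}
	/* ... rest elided ... */
}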

I want to have control over what runs on cpu0, so it is enumerated in
isolcpus.

It is, however, not nohz_full, as I mentioned an email or two ago.
cpu0 is where the networking I/O is done. net/core is such a big RCU
user, on top of the NIC driver interrupts (which I am trying hard to
eliminate with napi busy polling), that I figured making cpu0
nohz_full efficiently was a lost battle before it even started.
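
for anyone following along, the per-socket knob I use for busy polling
is essentially the one below; the 50 usec budget is purely an
illustrative value, not a recommendation:

#include <sys/socket.h>

/* opt this socket in to busy polling; the value is in usecs */
static int enable_busy_poll(int fd)
{
	int usecs = 50;	/* illustrative value only */

	return setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL,
			  &usecs, sizeof(usecs));
}

(there is also the global net.core.busy_poll/busy_read sysctl pair if
you want it system-wide.)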

maybe something somewhere is not used to seeing isolcpus and nohz_full
with different values and does something unexpected as a result...

> 
> > > > > the absence of rcuog/1 is causing rcu_irq_work_resched()
> > > > > to raise an interrupt every 2-3 seconds on cpu1.
> > > 
> > > Did you build with CONFIG_RCU_LAZY=y?
> > 
> > no. I was not even aware that it existed. I left the default
> > setting alone!
> 
> Worth a try, as this is what it is designed for.

Ok, I will, but from reading about what it does, I was under the
impression that its main objective is to trade away low latency for
reduced power consumption...

I am not seeing how batching callbacks would help eliminate the
interrupts that I am seeing. At best, they would be less frequent...

and most importantly, I am jumping through all these hoops to reduce my
system latency by a few microseconds... power consumption is my last
worry...

the only way I would consider it is if trading higher RCU latency
improves my program's latency...

All options are considered... I will take a look at this suggestion.
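
if I do give it a try, my understanding of the Kconfig (please correct
me if I am wrong) is that it is just the following, and that it only
affects CPUs whose callbacks are already offloaded:

# RCU_LAZY depends on callback offloading
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_LAZY=y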
