On Thu, Aug 01, 2024 at 10:52:15AM -0400, Olivier Langlois wrote:
> the initial parsing is fine...
>
> Aug 01 14:05:51 aws-dublin kernel: rcu: Hierarchical RCU
> implementation.
> Aug 01 14:05:51 aws-dublin kernel: rcu: RCU restricting CPUs
> from NR_CPUS=128 to nr_cpu_ids=4.
> Aug 01 14:05:51 aws-dublin kernel: Rude variant of Tasks RCU
> enabled.
> Aug 01 14:05:51 aws-dublin kernel: Tracing variant of Tasks RCU
> enabled.
> Aug 01 14:05:51 aws-dublin kernel: rcu: RCU calculated value of
> scheduler-enlistment delay is 10 jiffies.
> Aug 01 14:05:51 aws-dublin kernel: rcu: Adjusting geometry for
> rcu_fanout_leaf=16, nr_cpu_ids=4
> Aug 01 14:05:51 aws-dublin kernel: RCU Tasks Rude: Setting shift to 2
> and lim to 1 rcu_task_cb_adjust=1.
> Aug 01 14:05:51 aws-dublin kernel: RCU Tasks Trace: Setting shift to 2
> and lim to 1 rcu_task_cb_adjust=1.
> Aug 01 14:05:51 aws-dublin kernel: NR_IRQS: 8448, nr_irqs: 456,
> preallocated irqs: 16
> Aug 01 14:05:51 aws-dublin kernel: NO_HZ: Full dynticks CPUs: 1-2.
> Aug 01 14:05:51 aws-dublin kernel: rcu: Offload RCU callbacks
> from CPUs: 1-2.
> Aug 01 14:05:51 aws-dublin kernel: rcu: srcu_init: Setting srcu_struct
> sizes based on contention.
>
> On Thu, 2024-08-01 at 10:28 -0400, Olivier Langlois wrote:
> > this is with kernel 6.10.2
> >
> > I have these options set on the boot command line:
> > isolcpus=0,1,2 nohz_full=1,2 rcu_nocbs=1,2
> >
> > $ ps -eo pid,cpuid,comm | grep rcuog
> > 18 3 rcuog/0
> > 38 0 rcuog/2
> >
> > I do not understand why a rcuog task is spawn for cpu0.
> > I would have expected to have one for cpu1.

These handle grace periods, each for a group of CPUs. You should have
one rcuog kthread for each group of roughly sqrt(nr_cpu_ids) that
contains at least one offloaded CPU, in your case, sqrt(4), which is 2.
You could use the rcutree.rcu_nocb_gp_stride kernel boot parameter to
override this default, for example, you might want
rcutree.rcu_nocb_gp_stride=4 in your case.
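
To make the grouping above concrete, here is a rough model in Python. It is a sketch and not kernel code. It assumes that the default stride is nr_cpu_ids divided by its integer square root and that each rcuog kthread is named after the first CPU of its group; both are assumptions consistent with the "roughly sqrt(nr_cpu_ids)" rule and with the ps output quoted above. The function name rcuog_groups() is made up for illustration.

```python
from math import isqrt


def rcuog_groups(nr_cpu_ids, offloaded, stride=None):
    """Map each rcuog leader CPU to the offloaded CPUs in its group."""
    if stride is None:
        # Assumed default: nr_cpu_ids divided by its integer square root.
        stride = nr_cpu_ids // isqrt(nr_cpu_ids)
    groups = {}
    for cpu in sorted(offloaded):
        # Assumed naming: the group is led by the first CPU in its stride.
        leader = (cpu // stride) * stride
        groups.setdefault(leader, []).append(cpu)
    return groups


# rcu_nocbs=1,2 on a 4-CPU system, default stride
print(rcuog_groups(4, {1, 2}))            # {0: [1], 2: [2]}
# same system with rcutree.rcu_nocb_gp_stride=4
print(rcuog_groups(4, {1, 2}, stride=4))  # {0: [1, 2]}
```

Under these assumptions, CPU 1 falls into the group led by CPU 0 and CPU 2 leads its own group, which would explain seeing rcuog/0 and rcuog/2 but no rcuog/1. With a stride of 4, both offloaded CPUs would share a single rcuog/0.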

> > I do have a
> > 31 3 rcuos/1
> >
> > I am not familiar enough with rcu to know what rcuos is for.

This is the kthread that invokes the callbacks for CPU 1, assuming you
have a non-preemptible kernel (otherwise rcuop/1 for historical reasons
that seemed like a good idea at the time). Do you also have an rcuos/2?
(See the help text for CONFIG_RCU_NOCB_CPU.)

> > the absence of rcuog/1 is causing rcu_irq_work_resched() to raise
> > an interrupt every 2-3 seconds on cpu1.

Did you build with CONFIG_RCU_LAZY=y?

Did you use something like taskset to confine the rcuog and rcuos
kthreads to CPUs 0 and 3 (you seem to have 4 CPUs)?

Might that interrupt be due to a call_rcu() on CPU 1? If so, can the
work causing that call_rcu() be placed on some other CPU?

> > I am currently reading rcu/tree_nocb.h to try to make sense of what I
> > am seeing but I am pinging the rcu list just in case what I am seeing
> > would be immediately obvious to one of you...

Others might have more suggestions...

							Thanx, Paul