On Thu, May 05, 2022 at 12:16:41PM +0200, Uladzislau Rezki (Sony) wrote:
> Introduce a RCU_NOCB_CPU_CB_BOOST kernel option. So a user can
> decide if an offloading has to be done in a high-prio context or
> not. Please note an option depends on RCU_NOCB_CPU and RCU_BOOST
> parameters and by default it is off.
>
> This patch splits the boosting preempted RCU readers and those
> kthreads which directly responsible for driving expedited grace
> periods forward with enabling/disabling the offloading from/to
> SCHED_FIFO/SCHED_OTHER contexts.
>
> The main reason of such split is, for example on Android there
> are some workloads which require fast expedited grace period to
> be done whereas offloading in RT context can lead to starvation
> and hogging a CPU for a long time what is not acceptable for
> latency sensitive environment. For instance:
>
> <snip>
> <...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
> <snip>
>
> invoking 34 619 callbacks will take time thus making other CFS
> tasks waiting in run-queue to be starved due to such behaviour.
>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>

All good points!

Some questions and comments below.

Adding Sebastian on CC for his perspective.

							Thanx, Paul

> ---
>  kernel/rcu/Kconfig     | 14 ++++++++++++++
>  kernel/rcu/tree.c      |  5 ++++-
>  kernel/rcu/tree_nocb.h |  3 ++-
>  3 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> index 27aab870ae4c..074630b94902 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -275,6 +275,20 @@ config RCU_NOCB_CPU_DEFAULT_ALL
>  	  Say Y here if you want offload all CPUs by default on boot.
>  	  Say N here if you are unsure.
>
> +config RCU_NOCB_CPU_CB_BOOST
> +	bool "Perform offloading from real-time kthread"
> +	depends on RCU_NOCB_CPU && RCU_BOOST
> +	default n

I understand that you need this to default to "n" on your systems.
However, other groups already using callback offloading should not see
a sudden change.  I don't see an Android-specific defconfig file, but
perhaps something in drivers/android/Kconfig?

One easy way to make this work would be to invert the sense of this
Kconfig option ("RCU_NOCB_CB_NO_BOOST"?), continue having it default to
"n", but then select it somewhere in drivers/android/Kconfig.  But I
would not be surprised if there is a better way.

> +	help
> +	  Use this option to offload callbacks from the SCHED_FIFO context
> +	  to make the process faster. As a side effect of this approach is
> +	  a latency especially for the SCHED_OTHER tasks which will not be
> +	  able to preempt an offloading kthread. That latency depends on a
> +	  number of callbacks to be invoked.
> +
> +	  Say Y here if you want to set RT priority for offloading kthreads.
> +	  Say N here if you are unsure.
> +
>  config TASKS_TRACE_RCU_READ_MB
>  	bool "Tasks Trace RCU readers use memory barriers in user and idle"
>  	depends on RCU_EXPERT && TASKS_TRACE_RCU
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9dc4c4e82db6..d769a15bc0e3 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -154,7 +154,10 @@ static void sync_sched_exp_online_cleanup(int cpu);
>  static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
>  static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
>
> -/* rcuc/rcub/rcuop kthread realtime priority */
> +/*
> + * rcuc/rcub/rcuop kthread realtime priority. The former
> + * depends on if CONFIG_RCU_NOCB_CPU_CB_BOOST is set.

Aren't the rcuo[ps] kthreads controlled by the RCU_NOCB_CPU_CB_BOOST
Kconfig option?  (As opposed to the "former", which is "rcuc".)

> + */
>  static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
>  module_param(kthread_prio, int, 0444);
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 60cc92cc6655..a2823be9b1d0 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1315,8 +1315,9 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
>  	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
>  		goto end;
>
> -	if (kthread_prio)
> +	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST))

Don't we need both non-zero kthread_prio and the proper setting of the
new Kconfig option before we run it at SCHED_FIFO?  Yes, we could rely
on sched_setscheduler_nocheck() erroring out in that case, but that
sounds like an accident waiting to happen.

>  		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +
>  	WRITE_ONCE(rdp->nocb_cb_kthread, t);
>  	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
>  	return;
> --
> 2.30.2
>
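
To make the last comment above concrete, here is a rough userspace sketch
of the combined check, not a tested kernel change.  The helper name
nocb_cb_wants_fifo() and its cb_boost_enabled parameter are hypothetical:
the parameter stands in for IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST), and
kthread_prio mirrors the module parameter in kernel/rcu/tree.c.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * cb_boost_enabled models IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST).
 * kthread_prio models the kthread_prio module parameter.
 *
 * The callback-offload kthread is moved to SCHED_FIFO only when the
 * Kconfig option is set *and* a non-zero priority was requested, so
 * the code never depends on sched_setscheduler_nocheck() failing.
 */
static bool nocb_cb_wants_fifo(bool cb_boost_enabled, int kthread_prio)
{
	return cb_boost_enabled && kthread_prio != 0;
}
```

In rcu_spawn_cpu_nocb_kthread() this would presumably reduce to testing
both IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) and kthread_prio before the
sched_setscheduler_nocheck() call.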