Re: [PATCH 2/2] sched: Move task_mm_cid_work to RCU callback

Gabriele Monaco <gmonaco@xxxxxxxxxx> · Mon, 02 Dec 2024 16:21:28 +0100

> I've used the same task work pattern as NUMA here. What makes it
> OK for NUMA and not for mm_cid ?
> 

I didn't investigate the behaviour with the NUMA work, but my rough
guess is that this wouldn't even be visible in an isolated environment
(i.e. no migrations).
Also it doesn't seem to scale linearly with the number of cores.

Your approach (or the NUMA's) isn't wrong, in my opinion, it just
doesn't necessarily require to run in that context.

In an environment with isolated CPUs, we want the lowest latency
possible, that kind of work before switching to userspace imposes a
latency that could simply be elsewhere, even on another core since we
are doing remote accesses.
We are talking about 35us on a rather big system, not many applications
are sensitive to that kind of latency.

> I wonder why we'd want to piggy-back on call_rcu here when
> this has nothing to do with RCU. There is likely a characteristic
> of the call_rcu worker threads that we want to import into
> task_tick_mm_cid(), or change task_work.c to add a new flag
> that says the work can be dispatched to any CPU.
> 

Alright, taking the RCU path was probably a bit lazy, another thought I
had was to run it in a workqueue, perhaps tied to the mm rather than to
the task struct itself. I'm also not entirely sure running it with the
scheduler tick is the best approach, since it doesn't seem quite
predictable, but I didn't really get the full requirements, so a
discussion on this can surely help.

> >   void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
> >   {
> > - struct callback_head *work = &curr->cid_work;
> > + struct rcu_head *rhp = &curr->rcu;
> 
> Why is it OK to re-use the task struct rcu field ? Where else is it
> used, and is there a risk of being inserted twice ?
> 

The same approach is used in
https://elixir.bootlin.com/linux/v6.12/source/include/linux/sched/task.h#L169
also there it was probably chosen for its simplicity and it isn't the
absolute best approach.
There may be a risk of messing things up, again this was the lazy path
and probably a more robust approach would work better.

Thanks for your comments.
Gabriele