> I've used the same task work pattern as NUMA here. What makes it
> OK for NUMA and not for mm_cid ?
>

I didn't investigate the behaviour with the NUMA work, but my rough
guess is that this wouldn't even be visible in an isolated environment
(i.e. no migrations). Also it doesn't seem to scale linearly with the
number of cores.

Your approach (or the NUMA one) isn't wrong, in my opinion; the work
just doesn't necessarily need to run in that context. In an environment
with isolated CPUs we want the lowest latency possible, and doing that
kind of work before switching to userspace adds latency that could
simply be paid elsewhere, even on another core, since we are doing
remote accesses anyway.
We are talking about 35us on a rather big system; not many applications
are sensitive to that kind of latency.

> I wonder why we'd want to piggy-back on call_rcu here when
> this has nothing to do with RCU. There is likely a characteristic
> of the call_rcu worker threads that we want to import into
> task_tick_mm_cid(), or change task_work.c to add a new flag
> that says the work can be dispatched to any CPU.
>

Alright, taking the RCU path was probably a bit lazy. Another thought I
had was to run it in a workqueue, perhaps tied to the mm rather than to
the task struct itself.
I'm also not entirely sure that running it from the scheduler tick is
the best approach, since it doesn't seem quite predictable, but I didn't
really get the full requirements, so a discussion on this can surely
help.

> >  void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
> >  {
> > -	struct callback_head *work = &curr->cid_work;
> > +	struct rcu_head *rhp = &curr->rcu;
>
> Why is it OK to re-use the task struct rcu field ? Where else is it
> used, and is there a risk of being inserted twice ?
>

The same approach is used in
https://elixir.bootlin.com/linux/v6.12/source/include/linux/sched/task.h#L169
There too it was probably chosen for its simplicity, and it isn't the
absolute best approach.
There may be a risk of messing things up; again, this was the lazy path,
and a more robust approach would probably work better.

Thanks for your comments.
Gabriele