Re: [PATCH v11 2/3] sched: Move task_mm_cid_work to mm work_struct

Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> · Mon, 10 Mar 2025 11:50:23 -0400

On 2025-03-10 10:46, Gabriele Monaco wrote:
On Thu, 2025-02-27 at 16:33 +0100, Gabriele Monaco wrote:
Currently, the task_mm_cid_work function is called in a task work
triggered by a scheduler tick to frequently compact the mm_cids of each
process. This can delay the execution of the corresponding thread for
the entire duration of the function, negatively affecting the response
time of real-time tasks. In practice, we observe task_mm_cid_work
increasing latency by 30-35us on a 128-core system; this order of
magnitude is meaningful under PREEMPT_RT.

Run the task_mm_cid_work in a new work_struct connected to the
mm_struct rather than in the task context before returning to
userspace.

This work_struct is initialised with the mm and disabled before freeing
it. The queuing of the work happens while returning to userspace in
__rseq_handle_notify_resume, maintaining the checks to avoid running
more frequently than MM_CID_SCAN_DELAY.
To make sure this happens predictably also on long running tasks, we
trigger a call to __rseq_handle_notify_resume also from the scheduler
tick if the runtime exceeded a 100ms threshold.
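
For readers following the thread, the two conditions described above
(rate-limited queuing and the tick-side runtime threshold) can be
modelled in plain userspace C. This is only a sketch under stated
assumptions, not the code from the patch: struct mm_model,
maybe_queue_scan() and tick_should_notify() are hypothetical names, the
millisecond clock is passed in by the caller, and the value used for
MM_CID_SCAN_DELAY_MS is assumed for illustration. Only the 100ms
runtime threshold and the name MM_CID_SCAN_DELAY come from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, for illustration only. */
#define MM_CID_SCAN_DELAY_MS 100
/* Threshold named in the commit message. */
#define RUNTIME_THRESHOLD_MS 100

/* Hypothetical stand-in for the per-mm state; not the kernel's mm_struct. */
struct mm_model {
	uint64_t next_scan_ms;	/* earliest time the next scan may be queued */
	unsigned int queued;	/* how many times the work was queued */
};

/*
 * Models the check done when returning to userspace: queue the work
 * only if at least MM_CID_SCAN_DELAY_MS passed since the last queuing.
 * Returns true if the work was queued.
 */
static bool maybe_queue_scan(struct mm_model *mm, uint64_t now_ms)
{
	if (now_ms < mm->next_scan_ms)
		return false;
	mm->next_scan_ms = now_ms + MM_CID_SCAN_DELAY_MS;
	mm->queued++;	/* stands in for queuing the work_struct */
	return true;
}

/*
 * Models the scheduler-tick side: a long-running task that has not
 * returned to userspace is nudged once its runtime exceeds the
 * threshold.
 */
static bool tick_should_notify(uint64_t runtime_ms)
{
	return runtime_ms > RUNTIME_THRESHOLD_MS;
}
```

In this model, calling maybe_queue_scan() at 0ms, 50ms and 100ms queues
the work twice, since the middle call falls inside the delay window.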
[...]

Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid")
Signed-off-by: Gabriele Monaco <gmonaco@xxxxxxxxxx>

Is this patch missing anything?

I refactored a bit to have it build in configurations without RSEQ
and/or MM_CID (which was failing in v10).

Found a small nit. Please fix and resend with my reviewed-by, and
that version will be ready for inclusion.

Thanks!

Mathieu

Thanks,
Gabriele

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com