On Mon, 2025-03-10 at 11:50 -0400, Mathieu Desnoyers wrote:
> On 2025-03-10 10:46, Gabriele Monaco wrote:
> > On Thu, 2025-02-27 at 16:33 +0100, Gabriele Monaco wrote:
> > > Currently, the task_mm_cid_work function is called in a task work
> > > triggered by a scheduler tick to frequently compact the mm_cids of
> > > each process. This can delay the execution of the corresponding
> > > thread for the entire duration of the function, negatively
> > > affecting the response in case of real time tasks. In practice, we
> > > observe task_mm_cid_work increasing the latency of 30-35us on a
> > > 128 cores system, this order of magnitude is meaningful under
> > > PREEMPT_RT.
> > >
> > > Run the task_mm_cid_work in a new work_struct connected to the
> > > mm_struct rather than in the task context before returning to
> > > userspace.
> > >
> > > This work_struct is initialised with the mm and disabled before
> > > freeing it. The queuing of the work happens while returning to
> > > userspace in __rseq_handle_notify_resume, maintaining the checks
> > > to avoid running more frequently than MM_CID_SCAN_DELAY.
> > > To make sure this happens predictably also on long running tasks,
> > > we trigger a call to __rseq_handle_notify_resume also from the
> > > scheduler tick if the runtime exceeded a 100ms threshold.
> > > [...]
> > >
> > > Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid")
> > > Signed-off-by: Gabriele Monaco <gmonaco@xxxxxxxxxx>
> >
> > Is this patch missing anything?
> >
> > I refactored a bit to have it build in configurations without RSEQ
> > and/or MM_CID (which was failing v10)
>
> Found a small nit. Please fix and resend with my reviewed-by, and
> that version will be ready for inclusion.

Perfect, thank you! I'm changing that jiffies thing, testing a bit and
sending it.

Gabriele