On Fri, 2024-12-13 at 09:14 -0500, Mathieu Desnoyers wrote: > On 2024-12-13 04:54, Gabriele Monaco wrote: > > Currently, the task_mm_cid_work function is called in a task work > > triggered by a scheduler tick. This can delay the execution of the > > task > > for the entire duration of the function, negatively affecting the > > response of real time tasks. > > > > This patch runs the task_mm_cid_work in a new delayed work > > connected to > > the mm_struct rather than in the task context before returning to > > userspace. > > > > This delayed work is initialised while allocating the mm and > > disabled > > before freeing it, its execution is no longer triggered by > > scheduler > > ticks but run periodically based on the defined MM_CID_SCAN_DELAY. > > > > The main advantage of this change is that the function can be > > offloaded > > to a different CPU and even preempted by RT tasks. > > > > Moreover, this new behaviour could be more predictable in some > > situations since the delayed work is always scheduled with the same > > periodicity for each mm. > > This last paragraph could be clarified. AFAIR, the problem with > the preexisting approach based on the scheduler tick is with a mm > consisting of a set of periodic threads, where none happen to run > while the scheduler tick is running. > > This would skip mm_cid compaction. So it's not a bug per se, because > the mm_cid allocation will just be slightly less compact than it > should > be in that case. > > The underlying question here is whether eventual convergence of > mm_cid > towards 0 when the number of threads or the allowed CPU mask are > reduced > in a mm should be guaranteed or only best effort. > > If best effort, then this corner-case is not worthy of a "Fix" tag. > Otherwise, we should identify which commit it fixes and introduce a > "Fix" tag. > I will definitely make it clearer, but I'm also not sure if the patch is actually a fix for that. I wanted to mention it rather as a nice consequence of the change. The main purpose for us is that it solves latency issues in isolated environments.