On Mon, Sep 12, 2022 at 03:38:23PM +0100, Aaron Tomlin wrote:
> On Fri 2022-09-09 16:35 -0300, Marcelo Tosatti wrote:
> Hi Frederic,
>
> Sorry about that. How about the following:
>
>  - Note: CPU X is part of 'tick_nohz_full_mask'
>
>     1. CPU Y migrated running task A to CPU X that
>        was in an idle state i.e. waiting for an IRQ;
>        marked the current task on CPU X to need/or
>        require a reschedule i.e., set TIF_NEED_RESCHED
>        and invoked a reschedule IPI to CPU X
>        (see sched_move_task())
>
>     2. CPU X acknowledged the reschedule IPI. Generic
>        idle loop code noticed the TIF_NEED_RESCHED flag
>        against the idle task and attempts to exit the
>        loop and calls the main scheduler function i.e.
>        __schedule().
>
>        Since the idle tick was previously stopped no
>        scheduling-clock tick would occur.
>        So, no deferred timers would be handled
>
>     3. Post transition to kernel execution Task A
>        running on CPU X, indirectly released a few pages
>        (e.g. see __free_one_page()); CPU X's
>        'vm_stat_diff[NR_FREE_PAGES]' was updated and zone
>        specific 'vm_stat[]' update was deferred as per the
>        CPU-specific stat threshold
>
>     4. Task A does invoke exit(2) and the kernel does
>        remove the task from the run-queue; the idle task
>        was selected to execute next since there are no
>        other runnable tasks assigned to the given CPU
>        (see pick_next_task() and pick_next_task_idle())
>
>     5. On return to the idle loop since the idle tick
>        was already stopped and can remain so (see [1]
>        below) e.g. no pending soft IRQs, no attempt is
>        made to zero and fold CPU X's vmstat counters
>        since reprogramming of the scheduling-clock tick
>        is not required/or needed (see [2])

Much better, thanks.

Please split the patch in two: one that fixes the idle path and
another one that fixes the return-to-user path. The first one is
definitely a fix. The second one is rather a feature; it is definitely
wanted as well, but I need to think it through further.

>
> Kind regards,
>
> --
> Aaron Tomlin