On Mon, 2025-01-20 at 19:50 +0200, Nadav Amit wrote:
> [ Thanks for your patience. ]
>
> > On 20 Jan 2025, at 19:11, Rik van Riel <riel@xxxxxxxxxxx> wrote:
> >
> > On Mon, 2025-01-20 at 19:09 +0200, Nadav Amit wrote:
> >
> > This is the page reclaim code.
> >
> > The process that has those other pages mapped might be
> > running on other CPUs simultaneously with the page
> > reclaim code.
> >
> > Even if we were invalidating one of our own pages this
> > way, there could be other threads in the same process,
> > running while we are in the page reclaim code.
> >
>
> Of course, but there is nothing new here. Let me see where we lose
> each other by first stating the goal, what you propose, and what I
> suggest.
>
> We are issuing invlpgb and we need to ensure tlbsync on the same core
> that initiated invlpgb before arch_tlbbatch_flush() finishes. That’s
> all that matters for our discussion (correct me if I miss something).
>
> You solved it by disabling migration and running tlbsync at the end.
>
> I suggest *not* to disable migration, to keep running tlbsync at the
> arch_tlbbatch_flush() as you do, and if a context switch happens after
> arch_tlbbatch_add_pending() and before arch_tlbbatch_flush(), to run
> tlbsync during the context switch.
>
How would you keep track of CPUs where the tlbsync has NOT happened
before arch_tlbbatch_flush()?

That part seems to be missing still.

-- 
All Rights Reversed.
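P.S. For concreteness, here is a rough user-space sketch of the bookkeeping
Nadav seems to be proposing, as I read it: a per-CPU "TLBSYNC still pending"
flag that is set when INVLPGB is issued, and cleared by a TLBSYNC either in
arch_tlbbatch_flush() or, if the task is switched out first, in the context
switch path. All the names below are made up purely for illustration; this
is not the actual patch code, just a toy model of the idea.

	/*
	 * Toy model of deferred TLBSYNC bookkeeping. Every name here is
	 * hypothetical; none of this is kernel code.
	 */
	#include <stdbool.h>
	#include <stdio.h>

	#define NR_CPUS 4

	/* Per-CPU: "this core issued INVLPGB and has not done TLBSYNC yet". */
	static bool tlbsync_pending[NR_CPUS];

	static void do_tlbsync(int cpu)
	{
		/* Stand-in for the real TLBSYNC instruction. */
		printf("cpu%d: TLBSYNC\n", cpu);
		tlbsync_pending[cpu] = false;
	}

	/* A broadcast invalidation is queued for the batch on this CPU. */
	static void batch_add_pending(int cpu)
	{
		printf("cpu%d: INVLPGB\n", cpu);
		tlbsync_pending[cpu] = true;
	}

	/* Context switch away from the task that queued the flush. */
	static void context_switch_out(int cpu)
	{
		if (tlbsync_pending[cpu])
			do_tlbsync(cpu);	/* same core that ran INVLPGB */
	}

	/* End of the batched flush, on whatever CPU the task runs on now. */
	static void batch_flush(int cpu)
	{
		if (tlbsync_pending[cpu])
			do_tlbsync(cpu);
	}

	int main(void)
	{
		/* Task issues INVLPGB on cpu0, migrates, finishes on cpu1. */
		batch_add_pending(0);
		context_switch_out(0);	/* cpu0 does its own TLBSYNC here */
		batch_flush(1);		/* nothing left pending on cpu1 */
		return 0;
	}

In this toy version the pending flag is purely per-CPU state, cleared on the
CPU that issued the INVLPGB before the task can run anywhere else, so the
flush path only ever looks at the current CPU. Whether something like that
single flag is actually sufficient is exactly the open question above.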