On Mon, 2025-01-20 at 19:50 +0200, Nadav Amit wrote:
> [ Thanks for your patience. ]
>
> > On 20 Jan 2025, at 19:11, Rik van Riel <riel@xxxxxxxxxxx> wrote:
> >
> > On Mon, 2025-01-20 at 19:09 +0200, Nadav Amit wrote:
> >
> > This is the page reclaim code.
> >
> > The process that has those other pages mapped might be
> > running on other CPUs simultaneously with the page
> > reclaim code.
> >
> > Even if we were invalidating one of our own pages this
> > way, there could be other threads in the same process,
> > running while we are in the page reclaim code.
> >
>
> Of course, but there is nothing new here. Let me see where we lose
> each other by first stating the goal, what you propose, and what I
> suggest.
>
> We are issuing invlpgb and we need to ensure tlbsync on the same core
> that initiated invlpgb before arch_tlbbatch_flush() finishes. That’s
> all that matters for our discussion (correct me if I miss something).
>
> You solved it by disabling migration and running tlbsync at the end.
>
> I suggest *not* to disable migration, to keep running tlbsync at the
> arch_tlbbatch_flush() as you do, and if a context switch happens after
> arch_tlbbatch_add_pending() and before arch_tlbbatch_flush(), to run
> tlbsync during the context switch.
>
How would you keep track of CPUs where the tlbsync has NOT happened
before arch_tlbbatch_flush()?

That part seems to be missing still.

-- 
All Rights Reversed.
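P.S. For concreteness, here is a rough user-space sketch of the bookkeeping
Nadav seems to be proposing, as I read it: a per-CPU "TLBSYNC still pending"
flag that is set when INVLPGB is issued, and cleared by a TLBSYNC either in
arch_tlbbatch_flush() or, if the task is switched out first, in the context
switch path. All the names below are made up purely for illustration; this
is not the actual patch code, just a toy model of the idea.

	/*
	 * Toy model of deferred TLBSYNC bookkeeping. Every name here is
	 * hypothetical; none of this is kernel code.
	 */
	#include <stdbool.h>
	#include <stdio.h>

	#define NR_CPUS 4

	/* Per-CPU: "this core issued INVLPGB and has not done TLBSYNC yet". */
	static bool tlbsync_pending[NR_CPUS];

	static void do_tlbsync(int cpu)
	{
		/* Stand-in for the real TLBSYNC instruction. */
		printf("cpu%d: TLBSYNC\n", cpu);
		tlbsync_pending[cpu] = false;
	}

	/* A broadcast invalidation is queued for the batch on this CPU. */
	static void batch_add_pending(int cpu)
	{
		printf("cpu%d: INVLPGB\n", cpu);
		tlbsync_pending[cpu] = true;
	}

	/* Context switch away from the task that queued the flush. */
	static void context_switch_out(int cpu)
	{
		if (tlbsync_pending[cpu])
			do_tlbsync(cpu);	/* same core that ran INVLPGB */
	}

	/* End of the batched flush, on whatever CPU the task runs on now. */
	static void batch_flush(int cpu)
	{
		if (tlbsync_pending[cpu])
			do_tlbsync(cpu);
	}

	int main(void)
	{
		/* Task issues INVLPGB on cpu0, migrates, finishes on cpu1. */
		batch_add_pending(0);
		context_switch_out(0);	/* cpu0 does its own TLBSYNC here */
		batch_flush(1);		/* nothing left pending on cpu1 */
		return 0;
	}

In this toy version the pending flag is purely per-CPU state, cleared on the
CPU that issued the INVLPGB before the task can run anywhere else, so the
flush path only ever looks at the current CPU. Whether something like that
single flag is actually sufficient is exactly the open question above.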