On Mon, 2025-01-06 at 14:04 +0100, Jann Horn wrote:
> On Sat, Jan 4, 2025 at 3:55 AM Rik van Riel <riel@xxxxxxxxxxx> wrote:
> >
> > Then the only change needed to switch_mm_irqs_off
> > would be to move the LOADED_MM_SWITCHING line to
> > before choose_new_asid, to fully close the window.
> >
> > Am I overlooking anything here?
>
> I think that might require having a full memory barrier in
> switch_mm_irqs_off to ensure that the write of LOADED_MM_SWITCHING
> can't be reordered after reads in choose_new_asid(). Which wouldn't
> be very nice; we probably should avoid adding heavy barriers to the
> task switch path...
>
> Hmm, but I think luckily the cpumask_set_cpu() already implies a
> relaxed RMW atomic, which I think on X86 is actually the same as a
> sequentially consistent atomic, so as long as you put the
> LOADED_MM_SWITCHING line before that, it might do the job? Maybe with
> an smp_mb__after_atomic() and/or an explainer comment.
> (smp_mb__after_atomic() is a no-op on x86, so maybe just a comment is
> the right way. Documentation/memory-barriers.txt says
> smp_mb__after_atomic() can be used together with atomic RMW bitop
> functions.)

That no-op smp_mb__after_atomic() might be the way to go,
since we do not actually use the mm_cpumask with INVLPGB,
and we could conceivably skip updates to the bitmask
for tasks using broadcast TLB flushing.
(A rough sketch of that ordering is appended below.)

> > I'll add the READ_ONCE.
> >
> > Will the race still exist if we wait on
> > LOADED_MM_SWITCHING as proposed above?
>
> I think so, since between reading the loaded_mm and reading the
> loaded_mm_asid, the remote CPU might go through an entire task
> switch. Like:
>
> 1. We read the loaded_mm, and see that the remote CPU is currently
> running in our mm_struct.
> 2. The remote CPU does a task switch to another process with a
> different mm_struct.
> 3. We read the loaded_mm_asid, and see an ASID that does not match
> our broadcast ASID (because the loaded ASID is not for our
> mm_struct).

A false positive, where we do not clear the asid_transition
field and will check again in the future, should be harmless,
though.

The worry is false negatives, where we fail to detect an
out-of-sync CPU, yet still clear the asid_transition field.
(A sketch of that check is appended below as well.)

--
All Rights Reversed.
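
For illustration, the ordering discussed above could look roughly like
the fragment below. This is only a sketch, not the actual patch: the
surrounding lines are paraphrased from the existing mm-switch path in
arch/x86/mm/tlb.c, the comments are illustrative, and next, cpu,
next_tlb_gen, new_asid and need_flush are assumed to be the local
variables already present in switch_mm_irqs_off().

	/*
	 * Sketch: publish the "switching" state before picking the new
	 * ASID, so a concurrent asid_transition check can wait for this
	 * CPU instead of sampling a half-updated mm/ASID pair.
	 */
	this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);

	/*
	 * cpumask_set_cpu() is an atomic RMW, which on x86 acts as a
	 * full barrier, so the store above cannot be reordered after
	 * the reads in choose_new_asid().  smp_mb__after_atomic() is a
	 * no-op on x86 and mainly documents that ordering requirement.
	 */
	if (next != &init_mm)
		cpumask_set_cpu(cpu, mm_cpumask(next));
	smp_mb__after_atomic();

	next_tlb_gen = atomic64_read(&next->context.tlb_gen);

	choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);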
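
And, in the same spirit, a rough sketch of the remote-side check that
the false positive / false negative discussion refers to. The function
name clear_asid_transition_if_done() is made up for illustration,
broadcast_asid stands for this mm's broadcast ASID, and the
asid_transition field is assumed to sit in mm->context; the real code
in the series may be structured differently.

/* Hypothetical helper, for illustration only; not the series' code. */
static void clear_asid_transition_if_done(struct mm_struct *mm, u16 broadcast_asid)
{
	int cpu;

	/*
	 * Only clear asid_transition once no CPU in mm_cpumask is still
	 * running this mm with a non-broadcast ASID.  Bailing out early
	 * (a false positive) just means we check again on a later flush;
	 * clearing the flag while such a CPU still exists (a false
	 * negative) would be the real bug.
	 */
	for_each_cpu(cpu, mm_cpumask(mm)) {
		struct mm_struct *loaded_mm;

		/* Wait out a CPU that is mid switch_mm_irqs_off(). */
		do {
			loaded_mm = READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu));
		} while (loaded_mm == LOADED_MM_SWITCHING);

		if (loaded_mm != mm)
			continue;

		/*
		 * The CPU may have context switched between the two
		 * reads; a mismatched ASID here is then the harmless
		 * false positive from the scenario above, and
		 * asid_transition stays set.
		 */
		if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm_asid, cpu)) != broadcast_asid)
			return;
	}

	/* No straggler found; the switch to the broadcast ASID is complete. */
	WRITE_ONCE(mm->context.asid_transition, false);
}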