Re: [PATCH 7/8] membarrier: Remove arm (32) support for SYNC_CORE

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Thu, 17 Jun 2021 17:13:17 +0200

On Thu, Jun 17, 2021 at 05:01:53PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 17, 2021 at 07:00:26AM -0700, Andy Lutomirski wrote:
> > On Thu, Jun 17, 2021, at 6:51 AM, Mark Rutland wrote:
> 
> > > It's not clear to me what "the right thing" would mean specifically, and
> > > on architectures with userspace cache maintenance JITs can usually do
> > > the most optimal maintenance, and only need help for the context
> > > synchronization.
> > > 
> > 
> > This I simply don't believe -- I doubt that any sane architecture
> > really works like this.  I wrote an email about it to Intel that
> > apparently generated internal discussion but no results.  Consider:
> > 
> > mmap(some shared library, some previously unmapped address);
> > 
> > this does no heavyweight synchronization, at least on x86.  There is
> > no "serializing" instruction in the fast path, and it *works* despite
> > anything the SDM may or may not say.
> 
> I'm confused; why do you think that is relevant?
> 
> The only way to get into a memory address space is CR3 write, which is
> serializing and will flush everything. Since there wasn't anything
> mapped, nothing could be 'cached' from that location.
> 
> So that has to work...

Ooh, you mean mmap where there was something mmap'ed before. Not virgin
space so to say.

But in that case, the unmap() would've caused a TLB invalidate, which on
x86 is done with IPIs, and the IRET returning from each IPI is
serializing.

Other architectures include I/D cache flushes in their TLB
invalidations -- but as noted elsewhere in the thread, that might not be
sufficient on its own.

But yes, I think TLBI has to imply flushing micro-arch instruction
related buffers for any of that to work.
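
For reference, the explicit userspace path this thread is about (a JIT
asking for context synchronization instead of relying on whatever the
unmap/TLBI happens to imply) looks roughly like the following. This is a
minimal sketch, assuming Linux 4.16+ uapi headers; sys_membarrier() and
jit_sync_core() are names made up for illustration and are not part of
the patch series.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: glibc does not export a membarrier() function. */
static int sys_membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * Hypothetical JIT helper: call after new code has been written (and any
 * arch-specific cache maintenance has been done), before other threads
 * of the process may execute it.
 *
 * Returns 0 on success, -1 with errno set on failure (for example when
 * the kernel or architecture does not support SYNC_CORE).
 */
static int jit_sync_core(void)
{
	/* Not thread-safe; registering more than once should be harmless. */
	static int registered;

	if (!registered) {
		if (sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0))
			return -1;
		registered = 1;
	}

	/* Ask for a core-serializing event on every thread of this process. */
	return sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}
```

A caller has to treat failure as "no SYNC_CORE here" and fall back to
something else, which is what arm (32) userspace would see with this
patch applied.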