Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Tue, 12 Jun 2018 18:10:26 -0700

On Tue, Jun 12, 2018 at 5:12 PM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
> >
> > And in _theory_, maybe you could have just used "invlpg" with a
> > targeted address instead. In fact, I think a single invlpg invalidates
> > _all_ caches for the associated MM, but don't quote me on that.

Confirmed. The SDM says

 "INVLPG also invalidates all entries in all paging-structure caches
  associated with the current PCID, regardless of the linear addresses
  to which they correspond"

so if x86 wants to do this "separate invalidation for page directory
entries", then it would want to

 (a) remove the __tlb_adjust_range() operation entirely from
pud_free_tlb() and friends

 (b) instead just have a single field for "invalidate_tlb_caches",
which could be a boolean, or could just be one of the addresses

and then the logic would be that IFF no other tlb invalidate is done
due to an actual page range, then we look at that
invalidate_tlb_caches field, and do a single INVLPG instead.
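
Something like the following user-space sketch shows the decision I mean. It is not kernel code: "struct gather_sketch", the helper names and the "cache_addr" field are all made up for illustration, and it assumes that any real range flush on x86 already takes care of the paging-structure caches (per the SDM text quoted above).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mmu_gather; not the kernel's layout. */
struct gather_sketch {
	unsigned long start;		/* lowest address of a gathered page */
	unsigned long end;		/* one past the highest gathered address */
	bool invalidate_tlb_caches;	/* a page-table page was freed */
	unsigned long cache_addr;	/* some address that table covered */
};

enum flush_action { FLUSH_NONE, FLUSH_RANGE, FLUSH_SINGLE_INVLPG };

static void gather_init(struct gather_sketch *tlb)
{
	tlb->start = ~0UL;	/* empty range: start > end */
	tlb->end = 0;
	tlb->invalidate_tlb_caches = false;
	tlb->cache_addr = 0;
}

/* Models the range tracking done for actual pages. */
static void gather_page(struct gather_sketch *tlb,
			unsigned long addr, unsigned long size)
{
	assert(size > 0);
	if (addr < tlb->start)
		tlb->start = addr;
	if (addr + size > tlb->end)
		tlb->end = addr + size;
}

/* Models pud_free_tlb() and friends under (a) + (b): no range adjust. */
static void gather_free_table(struct gather_sketch *tlb, unsigned long addr)
{
	tlb->invalidate_tlb_caches = true;
	tlb->cache_addr = addr;
}

/* Decide what to do at flush time. */
static enum flush_action gather_flush(const struct gather_sketch *tlb)
{
	/* Assumption: a range flush also drops the paging-structure caches. */
	if (tlb->end > tlb->start)
		return FLUSH_RANGE;
	/* Only page-table pages were freed: one INVLPG is enough. */
	if (tlb->invalidate_tlb_caches)
		return FLUSH_SINGLE_INVLPG;
	return FLUSH_NONE;
}
```

So a gather that only freed a page directory ends up with a single INVLPG at "cache_addr", and as soon as any real page is gathered the normal range flush takes over.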

I still am not sure if this would actually make a difference in
practice, but I guess it does mean that x86 could at least participate
in some kind of scheme where we have architecture-specific actions for
those page directory entries.

And we could make the default behavior - if no architecture-specific
tlb page directory invalidation function exists - be the current
"__tlb_adjust_range()" case. So the default would be to not change
behavior, and architectures could opt in to something like this.
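
The opt-in could follow the usual "#ifndef" override pattern. A rough user-space sketch, where "arch_tlb_table_invalidate", "struct range_sketch" and the helpers are hypothetical names and not existing kernel interfaces:

```c
#include <assert.h>

/* Hypothetical stand-in for the range tracked in struct mmu_gather. */
struct range_sketch {
	unsigned long start;
	unsigned long end;
};

/* Models today's behavior: widen the flush range to cover the table. */
static void adjust_range(struct range_sketch *tlb,
			 unsigned long addr, unsigned long size)
{
	assert(size > 0);
	if (addr < tlb->start)
		tlb->start = addr;
	if (addr + size > tlb->end)
		tlb->end = addr + size;
}

/*
 * An architecture that opts in would define arch_tlb_table_invalidate
 * before this point. Without a definition we fall back to the range
 * adjustment, so behavior does not change.
 */
#ifndef arch_tlb_table_invalidate
#define arch_tlb_table_invalidate(tlb, addr, size) \
	adjust_range(tlb, addr, size)
#endif

/* Models pud_free_tlb() and friends calling the hook. */
static void free_table(struct range_sketch *tlb,
		       unsigned long addr, unsigned long size)
{
	arch_tlb_table_invalidate(tlb, addr, size);
}
```

With no override defined, freeing a table still extends the range exactly as it does now.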

            Linus