Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

Will Deacon <will.deacon@xxxxxxx> · Fri, 24 Aug 2018 16:49:53 +0100

Hi Peter,

On Fri, Aug 24, 2018 at 03:13:32PM +0200, Peter Zijlstra wrote:
> On Fri, Aug 24, 2018 at 10:35:56AM +0200, Peter Zijlstra wrote:
> 
> > Anyway, its sorted now; although I'd like to write me a fairly big
> > comment in asm-generic/tlb.h about things, before I forget again.
> 
> How's something like so? There's a little page_size thingy in this;
> mostly because I couldn't be arsed to split it for now.
> 
> Will has opinions on the page_size thing; I'll let him explain.

They're not especially strong opinions; it's just that I don't think the
page size is necessarily the right thing to track, and I'd rather remove it
altogether.

In the patches I've hacked up (I'll post shortly as an RFC), I track the
levels of page-table instead, so you can relate the mmu_gather explicitly
to the page-table structure rather than having to infer it from the page
size. For example, if an architecture could put down huge mappings at the
pte level (e.g. using a contiguous hint in the pte like we have on arm64),
then actually you want to know about the level rather than the size. You can
also track the levels using only 4 bits in the gather structure.
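
To make the "4 bits" point concrete, here is a rough userspace sketch of
level tracking. It is not the RFC code: struct gather_sketch, enum pt_level
and the helper names are all made up for illustration, and it assumes four
page-table levels with one bit each.

```c
#include <assert.h>

/* Hypothetical page-table levels, smallest granule first. */
enum pt_level { PT_PTE = 0, PT_PMD, PT_PUD, PT_P4D, PT_NR_LEVELS };

/* Illustrative stand-in for the gather state, not the real struct mmu_gather. */
struct gather_sketch {
	unsigned int cleared_levels : 4;	/* one bit per level we cleared entries at */
};

static void gather_init(struct gather_sketch *g)
{
	g->cleared_levels = 0;
}

/* Record that an entry was cleared at the given page-table level. */
static void gather_note_level(struct gather_sketch *g, enum pt_level lvl)
{
	assert(lvl >= PT_PTE && lvl < PT_NR_LEVELS);
	g->cleared_levels |= 1u << lvl;
}

/* Smallest level seen so far, or -1 if nothing has been cleared yet. */
static int gather_smallest_level(const struct gather_sketch *g)
{
	for (int lvl = PT_PTE; lvl < PT_NR_LEVELS; lvl++)
		if (g->cleared_levels & (1u << lvl))
			return lvl;
	return -1;
}
```

The point of recording the level rather than a size is that a contiguous-hint
mapping cleared at the pte level would still set the PT_PTE bit here, whatever
its effective size.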

Finally, both approaches have a funny corner case when a VMA contains a
mixture of granule sizes. With the "page size has changed so flush
synchronously" approach, you can theoretically end up with a lot of flushes,
where you'd have been better off just invalidating the whole mm. If you track
the levels instead and postpone the flush, using the smallest level you saw as
the invalidation granule, then you're likely to hit whatever threshold you
have and nuke the mm.
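
A toy model of that corner case, purely as a sketch: the struct, the shift
constants and the threshold value below are assumptions picked for
illustration, not taken from any architecture's real code. It counts flushes
under the "flush on size change" policy, and decides range vs. full-mm
invalidation under the "defer and use the smallest granule" policy.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT		12	/* assumed 4K base pages */
#define SKETCH_PMD_SHIFT		21	/* assumed 2M block mappings */
#define SKETCH_FULL_FLUSH_THRESHOLD	512UL	/* hypothetical cut-off, in granules */

/* One unmapped region: start address and log2 of its granule size. */
struct unmap_op {
	unsigned long addr;
	unsigned int shift;
};

enum flush_kind { FLUSH_NONE, FLUSH_RANGE, FLUSH_FULL_MM };

/* Policy A: flush synchronously whenever the granule changes, plus one final flush. */
static unsigned int flushes_on_size_change(const struct unmap_op *ops, size_t n)
{
	unsigned int flushes = 0;

	assert(ops != NULL || n == 0);
	for (size_t i = 1; i < n; i++)
		if (ops[i].shift != ops[i - 1].shift)
			flushes++;
	return n ? flushes + 1 : 0;
}

/* Policy B: defer, then invalidate the whole range at the smallest granule seen. */
static enum flush_kind deferred_flush_kind(const struct unmap_op *ops, size_t n)
{
	unsigned long start, end;
	unsigned int min_shift;

	assert(ops != NULL || n == 0);
	if (n == 0)
		return FLUSH_NONE;

	start = ops[0].addr;
	end = ops[0].addr + (1UL << ops[0].shift);
	min_shift = ops[0].shift;
	for (size_t i = 1; i < n; i++) {
		unsigned long op_end = ops[i].addr + (1UL << ops[i].shift);

		if (ops[i].addr < start)
			start = ops[i].addr;
		if (op_end > end)
			end = op_end;
		if (ops[i].shift < min_shift)
			min_shift = ops[i].shift;
	}

	/* Too many per-granule invalidations: give up and nuke the mm. */
	if (((end - start) >> min_shift) > SKETCH_FULL_FLUSH_THRESHOLD)
		return FLUSH_FULL_MM;
	return FLUSH_RANGE;
}
```

With a handful of alternating 2M and 4K regions, policy A flushes once per
region, whereas policy B sees a multi-megabyte range at 4K granularity, trips
the threshold and falls back to a single full-mm invalidation.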

Will