Re: [PATCH 06/12] x86/mm: use INVLPGB for kernel TLB flushes

Dave Hansen <dave.hansen@xxxxxxxxx> · Fri, 10 Jan 2025 08:29:01 -0800

On 1/10/25 08:08, Rik van Riel wrote:
> On Fri, 2025-01-10 at 07:14 -0800, Dave Hansen wrote:
...
>> I think the key thing we need to decide is whether to treat a single
>> INVLPGB(stride=8) more like a single INVLPGB or eight INVLPGBs.
>> Measuring a bunch of invalidation looks should tell us that.
> 
> Would I be wrong to assume that the CPUs have
> some optimizations built in to efficiently
> execute an invalidation for "everything in a
> PCID"?

There's only a few bits in the actual TLBs to store the PCID (or VPID),
roughly 3 on Intel. Then there's another structure to map between the
architectural PCID and the 3 bits of actual hardware alias.

That's what I know. The rest is pure speculation:

All you have to do in theory is zap the one entry in the PCID=>HW
mapping structure to invalidate a whole PCID. You don't need to run
through the TLB itself to invalidate it.

You need to do something else to make sure that the now-unused 3-bit
hardware identifier gets reused at _some_ point, but there may be other
tricks for that.

> The "global invalidate" we send does not
> zap everything in the TLB, but only the
> translations for a single PCID.
> 
> I suppose we should measure these things
> at some point (after I do the other
> cleanups?), because the CPUs may well have
> made a bunch of optimizations that we
> don't know about.

IIRC, the "big" invalidation modes are pretty cheap to execute. Most of
the cost comes from the TLB refill, not the flush itself.

But there's no substitute for actually measuring it. There's some wonky
stuff out there. The last time Andy L. went and looked at it, there were
oddities like INVPCID's "Individual-address invalidation" and INVLPG
having surprisingly different performance.