Re: [PATCH 16/23] sched: Use lightweight hazard pointers to grab lazy mms

Nadav Amit <nadav.amit@xxxxxxxxx> · Sun, 9 Jan 2022 00:49:06 -0800

> On Jan 8, 2022, at 10:48 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> 
> On Sat, Jan 8, 2022 at 9:56 PM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>> 
>> Just wondering: In a world of ASID/PCID - does the “lazy TLB” really
>> have a worthy advantage?
>> 
>> Considering the fact that with PTI anyhow address spaces are switched
>> all the time, can’t we just get rid of it?
> 

[snip]

> 
> Or maybe it's only worth it on platforms where it's free (UP, possibly
> other situations - like if you have IPI and it's "free").

On UP it might be free, but on x86+IPIs there is a tradeoff.

When you decide which CPUs to send the IPI to, in the common flow
(no page tables freed) you check whether each CPU is in “lazy TLB”
mode in order to filter out the lazy ones. In the past this flag
shared a cacheline with other frequently-dirtied data, so the
cacheline often bounced from cache to cache. Worse, the test used
an indirect branch, so it was expensive with Spectre v2
mitigations. I fixed that some time ago, so things are better and
today the is_lazy cacheline should bounce less between caches,
but there is still a cost in maintaining and checking both cpumask
and then is_lazy for each CPU in cpumask.
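
To make the cost concrete, here is a rough userspace sketch of that
filtering step. It is not the actual kernel code: flush_targets(),
struct cpu_tlb_state, the 8-CPU limit and the bitmask representation
are all made up for illustration, and I am assuming the freed-tables
flow skips the filter.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8  /* hypothetical small system */

/* Hypothetical per-CPU state; the kernel keeps this in per-CPU data. */
struct cpu_tlb_state {
    bool is_lazy;  /* CPU runs a kernel thread on a borrowed mm */
};

/*
 * Given the CPUs that may cache the mm (one bit per CPU), return the
 * subset that should get a flush IPI. Every candidate costs a read of
 * another CPU's is_lazy, which is the per-CPU check discussed above.
 */
static uint32_t flush_targets(uint32_t mm_cpumask,
                              const struct cpu_tlb_state state[NR_CPUS],
                              bool freed_tables)
{
    uint32_t targets = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!(mm_cpumask & (UINT32_C(1) << cpu)))
            continue;
        /* Assumption: when page tables were freed, lazy CPUs are not skipped. */
        if (freed_tables || !state[cpu].is_lazy)
            targets |= UINT32_C(1) << cpu;
    }
    return targets;
}
```

With CPUs 0-3 in the mask and CPUs 1 and 3 lazy, only CPUs 0 and 2 are
targeted unless page tables were freed.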

It is possible, for instance, to get rid of is_lazy, keep the CPU
in mm_cpumask when switching to a kernel thread, and then if/when
an IPI is received remove it from the cpumask to avoid further
unnecessary TLB shootdown IPIs.
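
Roughly, and only as a single-threaded model of the idea (struct mm,
struct cpu, handle_flush_ipi() and shootdown() are hypothetical names,
and a real version would need atomic cpumask updates and proper
ordering against context switches):

```c
#include <stdint.h>

#define NR_CPUS 8  /* hypothetical small system */

struct mm {
    uint32_t cpumask;  /* CPUs that may hold TLB entries for this mm */
};

struct cpu {
    int id;
    const struct mm *user_mm;    /* mm of the running user task, NULL in a kernel thread */
    unsigned int ipis_received;  /* counts modeled IPIs, for illustration */
};

/*
 * Modeled IPI handler. A CPU that no longer runs a user task of this mm
 * drops itself from the mask, so it pays for at most one unnecessary IPI.
 */
static void handle_flush_ipi(struct cpu *cpu, struct mm *mm)
{
    cpu->ipis_received++;
    /* A real handler would flush the local TLB here. */
    if (cpu->user_mm != mm)
        mm->cpumask &= ~(UINT32_C(1) << cpu->id);
}

/* The sender consults only the cpumask; there is no is_lazy to check. */
static void shootdown(struct mm *mm, struct cpu cpus[NR_CPUS])
{
    uint32_t targets = mm->cpumask;  /* snapshot: handlers modify the mask */

    for (int i = 0; i < NR_CPUS; i++)
        if (targets & (UINT32_C(1) << i))
            handle_flush_ipi(&cpus[i], mm);
}
```

The sender side becomes a single mask read, at the price of one
wasted IPI per CPU that went to a kernel thread.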

I do not know whether it is a pure win, but there is a tradeoff.