> On Jan 8, 2022, at 10:48 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sat, Jan 8, 2022 at 9:56 PM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>>
>> Just wondering: In a world of ASID/PCID - does the “lazy TLB” really
>> have a worthy advantage?
>>
>> Considering the fact that with PTI anyhow address spaces are switched
>> all the time, can’t we just get rid of it?
>
> [snip]
>
> Or maybe it's only worth it on platforms where it's free (UP, possibly
> other situations - like if you have IPI and it's "free").

On UP it might be free, but on x86+IPIs there is a tradeoff. When you
decide which CPUs you want to send the IPI to, in the common flow (no
tables freed) you check whether they use “lazy TLB” or not in order to
filter out the lazy ones.

In the past this was on a cacheline with other frequently-dirtied data, so
the cacheline often bounced from cache to cache. Worse, the test used an
indirect branch, so it was expensive with Spectre v2 mitigations. I fixed it
some time ago, so things are better and today the cacheline of is_lazy
should bounce less between caches, but there is still a cost in maintaining
and checking both cpumask and then is_lazy for each CPU in cpumask.

It is possible, for instance, to get rid of is_lazy, keep the CPU on
mm_cpumask when switching to a kernel thread, and then if/when an IPI is
received remove it from cpumask to avoid further unnecessary TLB shootdown
IPIs.

I do not know whether it is a pure win, but there is a tradeoff.
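To make the two options above a bit more concrete, here is a rough
user-space sketch, not kernel code. Everything in it is an assumption made
for illustration: struct mm_sketch, the plain-word cpumask, the global
is_lazy[] array and all function names are hypothetical, and it ignores the
atomics, memory ordering and actual TLB flushing that the real code would
need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8 /* hypothetical: small enough for a one-word mask */

/* Hypothetical stand-in for an mm: one bit per CPU that may cache its PTEs. */
struct mm_sketch {
	uint32_t cpumask;
};

/* Hypothetical per-CPU lazy flag (one entry per CPU). */
static bool is_lazy[NR_CPUS];

/*
 * Option A: the sender filters. For every CPU set in cpumask it also reads
 * that CPU's is_lazy flag and skips the lazy ones.
 */
static uint32_t ipi_targets_filter_lazy(const struct mm_sketch *mm)
{
	uint32_t targets = 0;

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mm->cpumask & (1u << cpu)))
			continue;
		if (is_lazy[cpu]) /* extra remote read per candidate CPU */
			continue;
		targets |= 1u << cpu;
	}
	return targets;
}

/* Option B: no is_lazy at all, the sender trusts cpumask as-is. */
static uint32_t ipi_targets_mask_only(const struct mm_sketch *mm)
{
	return mm->cpumask;
}

/*
 * Option B, receiver side: a CPU that is only running a kernel thread
 * removes itself from cpumask on the first IPI, so it is not sent
 * further shootdown IPIs for this mm.
 */
static void handle_flush_ipi(struct mm_sketch *mm, unsigned int cpu,
			     bool running_kthread)
{
	assert(cpu < NR_CPUS);
	if (running_kthread)
		mm->cpumask &= ~(1u << cpu);
	/* A real handler would flush (or switch away from the mm) here. */
}
```

In this sketch option A pays on every shootdown (one is_lazy read per CPU
in cpumask), while option B pays one possibly unnecessary IPI per lazy CPU
and nothing afterwards. Which one wins on real hardware is exactly the open
question.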