TLB flushes on fixmap changes

Nadav Amit <nadav.amit@xxxxxxxxx> · Fri, 24 Aug 2018 11:35:57 -0700

at 11:04 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Fri, Aug 24, 2018 at 10:26:50AM -0700, Nadav Amit wrote:
>> at 1:47 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> 
>>> On Thu, Aug 23, 2018 at 02:39:59PM +0100, Will Deacon wrote:
>>>> The only problem with this approach is that we've lost track of the granule
>>>> size by the point we get to the tlb_flush(), so we can't adjust the stride of
>>>> the TLB invalidations for huge mappings, which actually works nicely in the
>>>> synchronous case (e.g. we perform a single invalidation for a 2MB mapping,
>>>> rather than iterating over it at a 4k granule).
>>>> 
>>>> One thing we could do is switch to synchronous mode if we detect a change in
>>>> granule (i.e. treat it like a batch failure).
>>> 
>>> We could use tlb_start_vma() to track that, I think. Shouldn't be too
>>> hard.
>> 
>> Somewhat unrelated, but since the TLB has your attention, I'd like to raise
>> something that has bothered me for some time. clear_fixmap(), which is used
>> in various places (e.g., text_poke()), ends up doing only a local TLB
>> flush (in __set_pte_vaddr()).
>> 
>> Is that sufficient?
> 
> Urgh.. weren't the fixmaps per cpu? Bah, I remember looking at this
> during PTI, but I seem to have forgotten everything again.

[ Changed the title. Sorry for hijacking the thread. ]

The call chain is:

native_set_fixmap()->set_pte_vaddr()->pgd_offset_k()

and pgd_offset_k() uses init_mm, so the fixmap mappings do not seem to be
per-CPU.

In addition, the __flush_tlb_one_kernel() in text_poke() seems redundant
(since set_fixmap() should do it as well).

If you also think the current behavior is inappropriate, I can take a stab
at fixing it by adding a shootdown. However, if text_poke() can be called
with interrupts disabled, the fix would be annoying, since the shootdown
relies on IPIs.
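
To make the concern concrete, here is a toy userspace model. It is not kernel
code: every name in it (toy_access(), toy_clear_fixmap_local(),
toy_clear_fixmap_shootdown(), NR_CPUS = 4, the one-entry "TLB") is
hypothetical. It only assumes what is said above, namely that all CPUs walk
one shared page table and that clearing the fixmap flushes the local TLB
only. Whether a remote CPU can really hold such a translation in the kernel
is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */

/*
 * One PTE for a single fixmap slot, shared by all CPUs (as with init_mm).
 * 0 means "not present".
 */
static unsigned long shared_pte;

/* A one-entry per-CPU "TLB" caching the translation of that slot. */
struct toy_tlb {
	bool valid;
	unsigned long pte;
};
static struct toy_tlb tlb[NR_CPUS];

/* Access the slot on @cpu: use the cached entry, else walk the table. */
static unsigned long toy_access(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (tlb[cpu].valid)
		return tlb[cpu].pte;
	if (!shared_pte)
		return 0;	/* not-present entries are not cached */
	tlb[cpu].pte = shared_pte;
	tlb[cpu].valid = true;
	return tlb[cpu].pte;
}

static void toy_flush_local(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	tlb[cpu].valid = false;
}

/* Install a mapping; only the calling CPU's TLB entry is flushed. */
static void toy_set_fixmap(int cpu, unsigned long pte)
{
	shared_pte = pte;
	toy_flush_local(cpu);
}

/* Models the current behavior: clear the PTE, flush locally only. */
static void toy_clear_fixmap_local(int cpu)
{
	shared_pte = 0;
	toy_flush_local(cpu);
}

/* Models the proposed fix: clear the PTE, then flush on every CPU. */
static void toy_clear_fixmap_shootdown(void)
{
	int cpu;

	shared_pte = 0;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		toy_flush_local(cpu);
}
```

In this model, if CPU 1 touches the slot while it is mapped and CPU 0 then
calls toy_clear_fixmap_local(0), a later toy_access(1) still returns the old
PTE. With toy_clear_fixmap_shootdown() it returns 0. The model says nothing
about the cost of the IPIs or the interrupts-disabled case.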