Re: Excessive TLB flush ranges

On Wed, May 17 2023 at 14:15, Uladzislau Rezki wrote:
> On Wed, May 17, 2023 at 01:58:44PM +0200, Thomas Gleixner wrote:
>> Keeping executable mappings around until some other flush happens is
>> obviously neither a brilliant idea nor correct.
>> 
> It avoids blocking a caller on vfree() by deferring the freeing into
> a workqueue context. At least I got the feeling that "your task" that
> does vfree() blocks for an unacceptable time. That can happen only if
> it performs VM_FLUSH_RESET_PERMS freeing (other freeing is deferred):
>
> <snip>
> if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
> 	vm_reset_perms(vm);
> <snip>
>
> in this case the vfree() can take some time instead of returning to
> the caller asap. Is that your issue? I am not talking about the TLB
> flushing taking time; in this case holding the mutex can also take
> time.

This is absolutely not the problem at all. This comes via do_exit(),
and I already explained here:

 https://lore.kernel.org/all/871qjg8wqe.ffs@tglx

what made us look into this, and I'm happy to quote myself for your
convenience:

 "The scenario which made us look is that CPU0 is housekeeping and CPU1 is
  isolated for RT.

  Now CPU0 does that flush nonsense and the RT workload on CPU1 suffers
  because the compute time is suddenly factor 2-3 larger, IOW, it misses
  the deadline. That means a one off event is already a problem."

So it does not matter at all how long the operations on CPU0 take. The
only thing which matters is how much these operations affect the
workload on CPU1.

That made me look into this coalescing code. I understand why you want
to batch and coalesce and rather do a rare full TLB flush than send
gazillions of IPIs.
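
To make concrete what I mean by coalescing, here is a rough sketch --
not the actual mm/vmalloc.c code, the names are made up: every lazily
freed area just grows one span, and a single flush then covers the
union, however sparse the freed areas actually are.

<snip>
struct flush_span {
	unsigned long start;
	unsigned long end;
};

/* Merge a freed area into the single span that gets flushed later. */
static void span_add_area(struct flush_span *span,
			  unsigned long va_start, unsigned long va_end)
{
	span->start = min(span->start, va_start);
	span->end   = max(span->end, va_end);
}

/* ... and at purge time one call covers the whole union: */
flush_tlb_kernel_range(span.start, span.end);
<snip>

Two tiny areas which sit gigabytes apart end up as one huge range, and
the architecture can only walk that page by page or give up and flush
everything.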

But that creates a policy in the core code which leaves the
architecture no decision about whether full or single flushes are
worth it. That's what I was worried about, not the question of whether
that free takes 1ms or 10us. That's a completely different debate.
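
Just to illustrate where I think that decision belongs, a hypothetical
sketch -- this interface does not exist, all names are made up: hand
the architecture the individual ranges and let it pick, based on its
own cost model, between single flushes and a full flush.

<snip>
struct kernel_flush_range {
	unsigned long start;
	unsigned long end;
	struct list_head list;
};

/* Hypothetical arch hook, for illustration only. */
void arch_flush_tlb_kernel_ranges(struct list_head *ranges)
{
	struct kernel_flush_range *r;
	unsigned int nr = 0;

	list_for_each_entry(r, ranges, list)
		nr++;

	/* The cutoff is the architecture's business, not core code's. */
	if (nr > ARCH_NR_RANGE_FLUSH_MAX) {
		flush_tlb_all();
		return;
	}

	list_for_each_entry(r, ranges, list)
		flush_tlb_kernel_range(r->start, r->end);
}
<snip>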

Whether that list-based flush turns out to be the better solution or
not still has to be decided by deeper analysis.

Thanks,

        tglx



