Re: [PATCH v3 4/4] arm64: support batched/deferred tlb shootdown during page reclamation

Barry Song <21cnbao@xxxxxxxxx> · Wed, 21 Sep 2022 13:50:24 +1200

On Tue, Sep 20, 2022 at 8:45 PM Anshuman Khandual
<anshuman.khandual@xxxxxxx> wrote:
>
>
>
> On 9/20/22 09:09, Barry Song wrote:
> > On Tue, Sep 20, 2022 at 3:00 PM Anshuman Khandual
> > <anshuman.khandual@xxxxxxx> wrote:
> >>
> >>
> >> On 8/22/22 13:51, Yicong Yang wrote:
> >>> +static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
> >>> +{
> >>> +     return true;
> >>> +}
> >>
> >> This needs to be conditional on systems, where there will be performance
> >> improvements, and should not just be enabled all the time on all systems.
> >> num_online_cpus() > X, which does not hold any cpu hotplug lock would be
> >> a good metric ?
> >
> > for a small system, i don't see how this patch will help, e.g. cpus <= 4;
> > so we can actually disable tlb-batch on small systems.
>
> Do not subscribe ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH based on NR_CPUS ?
> That might not help much as the default value is 256 for NR_CPUS.
>
> OR
>
> arch_tlbbatch_should_defer() checks on
>
> 1. online cpus                  (don't enable batched TLB if <= X)
> 2. ARM64_WORKAROUND_REPEAT_TLBI (don't enable batched TLB)
>
> > just need to check if we will have any race condition since hotplug will
> > make the condition true and false dynamically.
>
> If should_defer_flush() evaluates to false, then ptep_clear_flush()
> clears and flushes the entry right away. This should not race with other
> queued up TLBI requests, which will be flushed separately. Wondering how
> there can be a race here !

Right. How about something like the below?

static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
{
	/* With only a few CPUs online, TLB shootdown is cheap; don't defer. */
	if (num_online_cpus() <= 4)
		return false;

#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
	/* Don't batch on CPUs that need the repeat-TLBI workaround. */
	if (unlikely(this_cpu_has_cap(ARM64_WORKAROUND_REPEAT_TLBI)))
		return false;
#endif

	return true;
}
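
On the hotplug concern, here is a rough userspace model (not kernel code)
of why a condition that flips at runtime should be harmless. The struct,
the model_* names and the counters are all made up for illustration; only
the <= 4 threshold and the erratum check come from the proposal above.
The decision is taken per unmap, so entries queued before the flip are
still completed by the later batch flush, and entries unmapped after it
are flushed right away.

```c
#include <stdbool.h>

/* Hypothetical model state; none of these fields exist in the kernel. */
struct tlb_model {
	unsigned int online_cpus;	/* stands in for num_online_cpus() */
	bool repeat_tlbi_erratum;	/* stands in for ARM64_WORKAROUND_REPEAT_TLBI */
	unsigned int pending;		/* invalidations queued for a later batch flush */
	unsigned int immediate;		/* entries cleared and flushed right away */
	unsigned int batch_syncs;	/* number of batch flushes that did work */
};

/* Mirrors the proposed policy: no deferral on small or affected systems. */
static bool model_should_defer(const struct tlb_model *m)
{
	if (m->online_cpus <= 4)
		return false;
	if (m->repeat_tlbi_erratum)
		return false;
	return true;
}

/* One unmap: either queue the invalidation or flush it immediately. */
static void model_unmap_one(struct tlb_model *m)
{
	if (model_should_defer(m))
		m->pending++;
	else
		m->immediate++;
}

/* Batch flush: completes whatever was queued, whatever the policy says now. */
static void model_flush_batch(struct tlb_model *m)
{
	if (m->pending) {
		m->pending = 0;
		m->batch_syncs++;
	}
}
```

In this model, dropping online_cpus from 8 to 4 between two unmaps leaves
the already queued entries pending until model_flush_batch() runs, while
the next unmap takes the immediate path.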

Thanks
Barry