On Tue, Sep 20, 2022 at 8:45 PM Anshuman Khandual
<anshuman.khandual@xxxxxxx> wrote:
>
> On 9/20/22 09:09, Barry Song wrote:
> > On Tue, Sep 20, 2022 at 3:00 PM Anshuman Khandual
> > <anshuman.khandual@xxxxxxx> wrote:
> >>
> >>
> >> On 8/22/22 13:51, Yicong Yang wrote:
> >>> +static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
> >>> +{
> >>> +	return true;
> >>> +}
> >>
> >> This needs to be conditional on systems, where there will be performance
> >> improvements, and should not just be enabled all the time on all systems.
> >> num_online_cpus() > X, which does not hold any cpu hotplug lock would be
> >> a good metric ?
> >
> > for a small system, i don't see how this patch will help, e.g. cpus <= 4;
> > so we can actually disable tlb-batch on small systems.
>
> Do not subscribe ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH based on NR_CPUS ?
> That might not help much as the default value is 256 for NR_CPUS.
>
> OR
>
> arch_tlbbatch_should_defer() checks on
>
> 1. online cpus (dont enable batched TLB if <= X)
> 2. ARM64_WORKAROUND_REPEAT_TLBI (dont enable batched TLB)
>
> > just need to check if we will have any race condition since hotplug will
> > make the condition true and false dynamically.
>
> If should_defer_flush() evaluate to be false, then ptep_clear_flush()
> clears and flushes the entry right away. This should not race with other
> queued up TLBI requests, which will be flushed separately. Wondering how
> there can be a race here !

Right. How about something like the below?

static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
{
	/* On a system with very few CPUs, TLB shootdown is cheap */
	if (num_online_cpus() <= 4)
		return false;

#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
	if (unlikely(this_cpu_has_cap(ARM64_WORKAROUND_REPEAT_TLBI)))
		return false;
#endif

	return true;
}

Thanks
Barry
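
For illustration, the two conditions discussed in this thread can be exercised
outside the kernel with a small userspace sketch. Everything in it is an
assumption made for the example: struct sys_state, should_defer_tlb_flush()
and TLB_BATCH_MIN_CPUS are hypothetical names, the struct fields merely stand
in for num_online_cpus() and this_cpu_has_cap(ARM64_WORKAROUND_REPEAT_TLBI),
and the threshold of 4 is only the value floated here, not a measured cutoff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold: the thread suggests 4 but does not settle on it. */
#define TLB_BATCH_MIN_CPUS 4u

/* Hypothetical userspace stand-in for the state the kernel helper would read. */
struct sys_state {
	unsigned int online_cpus;    /* stands in for num_online_cpus() */
	bool repeat_tlbi_workaround; /* stands in for the REPEAT_TLBI capability */
};

/* Return true when deferring (batching) the TLB flush looks worthwhile. */
static bool should_defer_tlb_flush(const struct sys_state *s)
{
	/* A running system always has at least one online CPU. */
	assert(s->online_cpus > 0);

	/* With very few CPUs, an immediate TLB shootdown is cheap. */
	if (s->online_cpus <= TLB_BATCH_MIN_CPUS)
		return false;

	/* Do not batch on systems that need the repeated-TLBI workaround. */
	if (s->repeat_tlbi_workaround)
		return false;

	return true;
}
```

In this sketch a 4-CPU system, and any system with the workaround flag set,
takes the immediate-flush path; only a larger system without the workaround
defers. Because the CPU count is sampled on each call, a hotplug event simply
changes the answer for later calls, which matches the reasoning above that a
non-deferred flush does not race with flushes that were already queued.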