On Tue, Oct 8, 2024 at 9:52 AM Mathieu Desnoyers
<mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>
> Replace lazy active mm existence tracking with hazard pointers. This
> removes the following implementations and their associated config
> options:
>
> - MMU_LAZY_TLB_REFCOUNT
> - MMU_LAZY_TLB_SHOOTDOWN
> - This removes the call_rcu delayed mm drop for RT.
>
> It leverages the fact that each CPU only ever has at most one single
> lazy active mm. This makes it a very good fit for a hazard pointer
> domain implemented with one hazard pointer slot per CPU.
>
> -static void cleanup_lazy_tlbs(struct mm_struct *mm)
> +static void remove_lazy_mm_hp(int cpu, struct hazptr_slot *slot, void *addr)
> {
> -	if (!IS_ENABLED(CONFIG_MMU_LAZY_TLB_SHOOTDOWN)) {
> -		/*
> -		 * In this case, lazy tlb mms are refcounted and would not reach
> -		 * __mmdrop until all CPUs have switched away and mmdrop()ed.
> -		 */
> -		return;
> -	}
> +	smp_call_function_single(cpu, do_shoot_lazy_tlb, addr, 1);
> +	smp_call_function_single(cpu, do_check_lazy_tlb, addr, 1);
> +}
>
> +static void cleanup_lazy_tlbs(struct mm_struct *mm)
> +{

[...]

> -	on_each_cpu_mask(mm_cpumask(mm), do_shoot_lazy_tlb, (void *)mm, 1);
> -	if (IS_ENABLED(CONFIG_DEBUG_VM_SHOOT_LAZIES))
> -		on_each_cpu(do_check_lazy_tlb, (void *)mm, 1);
> +	hazptr_scan(&hazptr_domain_sched_lazy_mm, mm, remove_lazy_mm_hp);

Hey Mathieu,

Take my comments with a grain of salt, as I am digging into active_mm
again after a while.

This seems to me like a strange hazard pointer callback use case,
because the "scan" here retires the object immediately even though
readers still hold a "reference". The callback forces every other
reader holding a reference to switch away right now, whether they like
it or not, rather than waiting until _they_ release the reference.
There is no precedent for that in RCU, for instance, where a callback
never runs before the readers have finished.
That might work for active_mm, but it sounds like a fringe use case to
me; it might be better to just force CONFIG_MMU_LAZY_TLB_SHOOTDOWN=y
for everyone and use on_each_cpu() instead. That would give the same
code simplification as this patch without requiring hazard pointers,
AFAICS. Or maybe start with that, and _then_ convert to HP if it makes
sense to. These are just some thoughts, and I am OK with whatever the
reviewers' consensus is.

And if I understand correctly, for this use case we are not even
grabbing a "true reference" to the mm_struct object, because direct
access to mm_struct should require a proper mmgrab(), not a
lazy_tlb-flavored one? Correct me if I'm wrong, though.

Also, isn't it the case that on x86 this patch will now send more
IPIs, whereas previously the global refcount did not require them,
since the last kthread switching out would no longer access the old
mm? Might it be worth checking the performance of fork/exit and
whether it scales?

thanks,

 - Joel