On Wed, Dec 11, 2024 at 06:31:31PM +0800, Qi Zheng wrote: > Now, it is also possible not to use the mmu gather mechanism to > free PTE pages: > > pte_free_tlb > --> ___pte_free_tlb > --> pagetable_pte_dtor > --> ptlock_free > paravirt_tlb_remove_table > --> free PTE page > > pte_free > --> pagetable_pte_dtor > --> ptlock_free > pagetable_free > > If we want to move the freeing of the ptlock into __tlb_remove_table(), > we may need to define a pagetable_pte_dtor_no_ptlock() and let > ___pte_free_tlb() to call it. > > And ALLOC_SPLIT_PTLOCKS only be enabled when the spinlock debug function > is turned on, so I finally choose to only change ptlock itself without > changing the general path. PREEMPT_RT will also have it on, and it feels a bit odd to have two disparate RCU frees for something so very closely related. We could push the whole of pagetable_*_dtor into tlb_remove_table(), except that seems to cause trouble for p4d, which is not conforming. Argh, all this is a giant trainwreck and needs some serious cleanup, and I feel your patch is just making it worse. Notably: - s390 pud isn't calling the existing pagetable_pud_[cd]tor() - none of the p4d things have pagetable_p4d_[cd]tor() (x86,arm64,s390,riscv) and they have inconsistent accounting - while much of the _ctor calls are in generic code, many of the _dtor calls are in arch code for hysterial raisins, this could easily be fixed - if we fix ptlock_free() to handle NULL, then all the _dtor() functions can use it, and we can observe they're all identical and can be folded after all that cleanup, you can move the _dtor from *_free_tlb() into tlb_remove_table() -- which for the above case, would then have it called from __tlb_remove_table_free(). Thomas, you were looking for technical debt -- there's a ton of that around here :/