On 03/01/2025 10:35, Qi Zheng wrote:
> On 2025/1/3 17:13, Qi Zheng wrote:
>> On 2025/1/3 16:02, Kevin Brodsky wrote:
>>> On 03/01/2025 04:48, Qi Zheng wrote:
>>>> [...]
>>>>
>>>> In __tlb_batch_free_encoded_pages(), we can indeed detect PageTable()
>>>> and call pagetable_dtor() to dtor the page table pages.
>>>> But __tlb_batch_free_encoded_pages() is also used to free normal pages
>>>> (not page table pages), so I don't want to add overhead there.
>>>
>>> Interesting, can a tlb batch refer to pages that are not PTPs then?
>>
>> Yes, you can see the caller of __tlb_remove_folio_pages() or
>> tlb_remove_page_size().

I had a brief look but clearly not a good enough one! I hadn't realised
that "table" in tlb_remove_table() means PTP, while "page" in
tlb_remove_page() can mean any page, and it's making more sense now.

[...]

>>
>> For arm, the call to pagetable_dtor() is indeed missed in the
>> non-MMU_GATHER_RCU_TABLE_FREE case. This needs to be fixed. But we
>> can't fix this by adding pagetable_dtor() to tlb_remove_table(),
>> because some architectures call tlb_remove_table() but don't support
>> page table statistics, like sparc.

When I investigated this for my own series, I found that the only case
where ctor/dtor are not called for page-sized page tables is 32-bit
sparc (see table at the end of [1]). However only 64-bit sparc makes use
of tlb_remove_table() (at PTE level, where ctor/dtor are already
called). So really calling pagetable_dtor() from tlb_remove_table() in
the non-MMU_GATHER_TABLE_FREE case seems to be the obvious thing to do.

Once this is done, we should be able to replace all those confusing
calls to tlb_remove_page() on PTPs with tlb_remove_table() and remove
the explicit call to pagetable_dtor(). AIUI this is essentially what
Peter suggested on v3 [2].

[1]
https://lore.kernel.org/linux-mm/20241219164425.2277022-1-kevin.brodsky@xxxxxxx/
[2]
https://lore.kernel.org/linux-mm/20250103111457.GC22934@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

[...]

> Or can we just not let tlb_remove_table() fall back to
> tlb_remove_page()? Like the following:
>
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index a59205863f431..354ffaa4bd120 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -195,8 +195,6 @@
>   *     various ptep_get_and_clear() functions.
>   */
>
> -#ifdef CONFIG_MMU_GATHER_TABLE_FREE
> -
>  struct mmu_table_batch {
>  #ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
>         struct rcu_head         rcu;
> @@ -219,16 +217,6 @@ static inline void __tlb_remove_table(void *table)
>
>  extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
>
> -#else /* !CONFIG_MMU_GATHER_HAVE_TABLE_FREE */
> -
> -/*
> - * Without MMU_GATHER_TABLE_FREE the architecture is assumed to have page based
> - * page directories and we can use the normal page batching to free them.
> - */
> -#define tlb_remove_table(tlb, page) tlb_remove_page((tlb), (page))

We still need a different implementation of tlb_remove_table() in this
case. We could define it inline here:

static inline void tlb_remove_table(struct mmu_gather *tlb, void *table)
{
    struct page *page = table;

    pagetable_dtor(page_ptdesc(page));
    tlb_remove_page(tlb, page);
}

- Kevin
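
For illustration only, the idea can be modelled in user space. This is a minimal sketch, not kernel code: struct page, struct mmu_gather, MAX_BATCH, the is_table and dtor_calls fields and mock_pagetable_dtor() are all hypothetical stand-ins, and only the function names tlb_remove_page() and tlb_remove_table() are borrowed from the discussion. It shows the destructor running in the table-specific wrapper, so the generic page path stays free of any page-table check.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BATCH 8	/* hypothetical batch size, for the mock only */

/* Mock stand-in for struct page; NOT the kernel definition. */
struct page {
	int is_table;	/* hypothetical flag: page currently backs a page table */
	int dtor_calls;	/* counts how often the mock destructor ran */
};

/* Mock stand-in for struct mmu_gather: just a fixed-size batch. */
struct mmu_gather {
	struct page *pages[MAX_BATCH];
	size_t nr;
};

/* Mock of pagetable_dtor(): drops the page-table state and counts the call. */
static void mock_pagetable_dtor(struct page *page)
{
	page->is_table = 0;
	page->dtor_calls++;
}

/* Generic path: queues any page and never inspects whether it is a table. */
static void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
{
	assert(tlb->nr < MAX_BATCH);
	tlb->pages[tlb->nr++] = page;
}

/* Table path: run the destructor first, then reuse the generic batching. */
static void tlb_remove_table(struct mmu_gather *tlb, void *table)
{
	struct page *page = table;

	mock_pagetable_dtor(page);
	tlb_remove_page(tlb, page);
}
```

In this model a caller that frees a page table only needs tlb_remove_table(); a caller that frees an ordinary page keeps using tlb_remove_page() and pays nothing extra.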