On 12/14/23 3:00 PM, Matthew Wilcox wrote:
> On Thu, Dec 14, 2023 at 02:27:17PM -0800, Jianfeng Wang wrote:
>> When unmapping VMA pages, pages will be gathered in batch and released by
>> tlb_finish_mmu() if CONFIG_MMU_GATHER_NO_GATHER is not set. The function
>> tlb_finish_mmu() is responsible for calling free_pages_and_swap_cache(),
>> which calls lru_add_drain() to drain cached pages in folio_batch before
>> releasing the gathered pages. Thus, it is redundant to call lru_add_drain()
>> before gathering pages if CONFIG_MMU_GATHER_NO_GATHER is not set.
>>
>> Remove lru_add_drain() prior to gathering and unmapping pages in
>> exit_mmap() and unmap_region() if CONFIG_MMU_GATHER_NO_GATHER is not set.
>>
>> Note that the page unmapping process in the oom_killer (e.g., in
>> __oom_reap_task_mm()) also uses tlb_finish_mmu() and does not have a
>> redundant lru_add_drain(). So, this commit makes the code more consistent.
>
> Shouldn't we put this in __tlb_gather_mmu() which already has the
> CONFIG_MMU_GATHER_NO_GATHER ifdefs? That would presumably help with, e.g.,
> zap_page_range_single() too.
>

After looking at the different use cases of tlb_gather_mmu(), I feel it is
questionable to move lru_add_drain() into __tlb_gather_mmu(). There are two
use cases of tlb_gather_mmu(): one unmaps and releases pages (e.g., the two
cases in mmap.c); the other updates page table entries and flushes the TLB
without releasing any pages (e.g., together with mprotect_fixup()). For the
latter use case, it is reasonable not to call lru_add_drain() prior to or
within tlb_gather_mmu().

Of course, we could update tlb_gather_mmu()'s API to take this into account.
For example, we could add tlb_gather_mmu_for_release() for the first case
and keep tlb_gather_mmu() for the latter (see the rough sketch at the end of
this mail). I'd like to have your opinion on this. Thanks!

>> Signed-off-by: Jianfeng Wang <jianfeng.w.wang@xxxxxxxxxx>
>> ---
>>  mm/mmap.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 1971bfffcc03..da0308eef435 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2330,7 +2330,10 @@ static void unmap_region(struct mm_struct *mm, struct ma_state *mas,
>>  	struct mmu_gather tlb;
>>  	unsigned long mt_start = mas->index;
>>
>> +	/* Defer lru_add_drain() to tlb_finish_mmu() for the ifndef case. */
>> +#ifdef CONFIG_MMU_GATHER_NO_GATHER
>>  	lru_add_drain();
>> +#endif
>>  	tlb_gather_mmu(&tlb, mm);
>>  	update_hiwater_rss(mm);
>>  	unmap_vmas(&tlb, mas, vma, start, end, tree_end, mm_wr_locked);
>> @@ -3300,7 +3303,10 @@ void exit_mmap(struct mm_struct *mm)
>>  		return;
>>  	}
>>
>> +	/* Defer lru_add_drain() to tlb_finish_mmu() for the ifndef case. */
>> +#ifdef CONFIG_MMU_GATHER_NO_GATHER
>>  	lru_add_drain();
>> +#endif
>>  	flush_cache_mm(mm);
>>  	tlb_gather_mmu_fullmm(&tlb, mm);
>>  	/* update_hiwater_rss(mm) here? but nobody should be looking */
>> --
>> 2.42.1
>>
>>
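
For concreteness, here is a minimal, untested sketch of what the split API
could look like. The name tlb_gather_mmu_for_release() and its exact
placement are only illustrative assumptions, not an actual implementation:

/*
 * Hypothetical helper, sketched for discussion only: a variant of
 * tlb_gather_mmu() for callers that will actually free the unmapped
 * pages, so the drain logic lives in one place instead of being
 * open-coded behind an #ifdef at every such call site.
 */
void tlb_gather_mmu_for_release(struct mmu_gather *tlb, struct mm_struct *mm)
{
#ifdef CONFIG_MMU_GATHER_NO_GATHER
	/*
	 * Without batching, pages are freed one at a time as they are
	 * unmapped, so tlb_finish_mmu() never reaches
	 * free_pages_and_swap_cache() and its internal lru_add_drain().
	 * Drain the per-CPU folio batches up front instead.
	 */
	lru_add_drain();
#endif
	tlb_gather_mmu(tlb, mm);
}

Page-freeing callers such as unmap_region() (and, given a corresponding
fullmm variant, exit_mmap()) would then switch to this helper and drop the
open-coded #ifdef, while callers like mprotect_fixup() that only update page
table entries would keep using plain tlb_gather_mmu().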