Ryan Roberts <ryan.roberts@xxxxxxx> writes: > Somehow I managed to reply only to the linux-arm-kernel list on first attempt so > resending: > > On 07/07/2023 09:21, Huang, Ying wrote: >> Ryan Roberts <ryan.roberts@xxxxxxx> writes: >> >>> With the introduction of large folios for anonymous memory, we would >>> like to be able to split them when they have unmapped subpages, in order >>> to free those unused pages under memory pressure. So remove the >>> artificial requirement that the large folio needed to be at least >>> PMD-sized. >>> >>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx> >>> Reviewed-by: Yu Zhao <yuzhao@xxxxxxxxxx> >>> Reviewed-by: Yin Fengwei <fengwei.yin@xxxxxxxxx> >>> --- >>> mm/rmap.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/mm/rmap.c b/mm/rmap.c >>> index 82ef5ba363d1..bbcb2308a1c5 100644 >>> --- a/mm/rmap.c >>> +++ b/mm/rmap.c >>> @@ -1474,7 +1474,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, >>> * page of the folio is unmapped and at least one page >>> * is still mapped. >>> */ >>> - if (folio_test_pmd_mappable(folio) && folio_test_anon(folio)) >>> + if (folio_test_large(folio) && folio_test_anon(folio)) >>> if (!compound || nr < nr_pmdmapped) >>> deferred_split_folio(folio); >>> } >> >> One possible issue is that even for large folios mapped only in one >> process, in zap_pte_range(), we will always call deferred_split_folio() >> unnecessarily before freeing a large folio. > > Hi Huang, thanks for reviewing! > > I have a patch that solves this problem by determining a range of ptes covered > by a single folio and doing a "batch zap". This prevents the need to add the > folio to the deferred split queue, only to remove it again shortly afterwards. > This reduces lock contention and I can measure a performance improvement for the > kernel compilation benchmark. See [1]. > > However, I decided to remove it from this patch set on Yu Zhao's advice. We are > aiming for the minimal patch set to start with and wanted to focus people on > that. I intend to submit it separately later on. > > [1] https://lore.kernel.org/linux-mm/20230626171430.3167004-8-ryan.roberts@xxxxxxx/ Thanks for your information! "batch zap" can solve the problem. And, I agree with Matthew's comments to fix the large folios interaction issues before merging the patches to allocate large folios as in the following email. https://lore.kernel.org/linux-mm/ZKVdUDuwNWDUCWc5@xxxxxxxxxxxxxxxxxxxx/ If so, we don't need to introduce the above problem or a large patchset. Best Regards, Huang, Ying