On Wed, Aug 16, 2023 at 07:30:35AM +0800, Yin Fengwei wrote: > > > On 8/15/23 21:25, Daniel Gomez wrote: > > Hi Yin, > > On Tue, Aug 08, 2023 at 10:09:17AM +0800, Yin Fengwei wrote: > >> Commit 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a > >> folio") replaced the page_mapcount() with folio_mapcount() to check > >> whether the folio is shared by other mapping. > >> > >> It's not correct for large folios. folio_mapcount() returns the total > >> mapcount of large folio which is not suitable to detect whether the folio > >> is shared. > >> > >> Use folio_estimated_sharers() which returns a estimated number of shares. > >> That means it's not 100% correct. It should be OK for madvise case here. > > > > I'm trying to understand why it should be ok for madvise this change, so > > I hope it's okay to ask you few questions. > > > > folio_mapcount() calculates the total maps for all the subpages of a > > folio. However, the folio_estimated_sharers does it only for the first > > subpage making it not true for large folios. Then, wouldn't this change > > drop support for large folios? > I saw David explained this very well in another mail. > > > > > Seems like folio_entire_mapcount() is not accurate either because of it > > does not inclue PTE-mapped sub-pages which I think we need here. Hence, > > the folio_mapcount(). Could this be something missing in the test side? > > > > I tried to replicate the setup with CONFIG_TRANSPARENT_HUGEPAGE but > > seems like I'm not able to do it: > > > > ./cow > > # [INFO] detected THP size: 2048 KiB > > # [INFO] detected hugetlb size: 2048 KiB > > # [INFO] detected hugetlb size: 1048576 KiB > > # [INFO] huge zeropage is enabled > > TAP version 13 > > 1..166 > > # [INFO] Anonymous memory tests in private mappings > > # [RUN] Basic COW after fork() ... with base page > > not ok 1 MADV_NOHUGEPAGE failed > > # [RUN] Basic COW after fork() ... with swapped out base page > > not ok 2 MADV_NOHUGEPAGE failed > > # [RUN] Basic COW after fork() ... with THP > > not ok 3 MADV_HUGEPAGE failed > > # [RUN] Basic COW after fork() ... with swapped-out THP > > not ok 4 MADV_HUGEPAGE failed > > # [RUN] Basic COW after fork() ... with PTE-mapped THP > > not ok 5 MADV_HUGEPAGE failed > > # [RUN] Basic COW after fork() ... with swapped-out, PTE-mapped THP > > not ok 6 MADV_HUGEPAGE failed > > ... > Can you post the MADV_PAGEOUT and PTE-mapped THP related testing result? > And I suppose swap need be enabled also for the testing. You may find a dump of the logs in the link below with system information. Let me know if you find something wrong in my setup or if you need something else. Besides CONFIG_TRANSPARENT_HUGEPAGE, CONFIG_SWAP is also enabled in the kernel. https://gitlab.com/-/snippets/2584135 Also, strace reports ENOSYS for MADV_*: madvise(0x7f2912465000, 4096, MADV_NOHUGEPAGE) = -1 ENOSYS (Function not implemented) madvise(0x7f2912000000, 2097152, MADV_HUGEPAGE) = -1 ENOSYS (Function not implemented) > > > Regards > Yin, Fengwei > > > > > > > Daniel > >> > >> User-visible effects is that the THP is skipped when user call madvise. > >> But the correct behavior is THP should be split and processed then. > >> > >> NOTE: this change is a temporary fix to reduce the user-visible effects > >> before the long term fix from David is ready. > >> > >> Fixes: 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a folio") > >> Cc: stable@xxxxxxxxxxxxxxx > >> Signed-off-by: Yin Fengwei <fengwei.yin@xxxxxxxxx> > >> Reviewed-by: Yu Zhao <yuzhao@xxxxxxxxxx> > >> Reviewed-by: Ryan Roberts <ryan.roberts@xxxxxxx> > >> --- > >> mm/madvise.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/mm/madvise.c b/mm/madvise.c > >> index 49af35e2d99a..4dded5d27e7e 100644 > >> --- a/mm/madvise.c > >> +++ b/mm/madvise.c > >> @@ -683,7 +683,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr, > >> if (folio_test_large(folio)) { > >> int err; > >> > >> - if (folio_mapcount(folio) != 1) > >> + if (folio_estimated_sharers(folio) != 1) > >> break; > >> if (!folio_trylock(folio)) > >> break; > >> -- > >> 2.39.2 > >>