On 24 Jul 2023, at 7:59, Ryan Roberts wrote:

> On 14/07/2023 17:04, Ryan Roberts wrote:
>> Hi All,
>>
>> This is v3 of a series to implement variable order, large folios for anonymous
>> memory. (currently called "FLEXIBLE_THP") The objective of this is to improve
>> performance by allocating larger chunks of memory during anonymous page faults.
>> See [1] and [2] for background.
>
> A question for anyone that can help; I'm preparing v4 and as part of that am
> running the mm selftests, now that I've fixed them up to run reliably for
> arm64. This is showing 2 regressions vs the v6.5-rc3 baseline:
>
> 1) khugepaged test fails here:
> # Run test: collapse_max_ptes_none (khugepaged:anon)
> # Maybe collapse with max_ptes_none exceeded.... Fail
> # Unexpected huge page
>
> 2) split_huge_page_test fails with:
> # Still AnonHugePages not split
>
> I *think* (but haven't yet verified) that (1) is due to khugepaged ignoring
> non-order-0 folios when looking for candidates to collapse. Now that we have
> large anon folios, the memory allocated by the test is in large folios and
> therefore does not get collapsed. We understand this issue, and I believe
> DavidH's new scheme for determining exclusive vs shared should give us the tools
> to solve this.
>
> But (2) is weird. If I run this test on its own immediately after booting, it
> passes. If I then run the khugepaged test, then re-run this test, it fails.
>
> The test is allocating 4 hugepages, then requesting they are split using the
> debugfs interface. Then the test looks at /proc/self/smaps to check that
> AnonHugePages is back to 0.
>
> In both the passing and failing cases, the kernel thinks that it has
> successfully split the pages; the debug logs in split_huge_pages_pid() confirm
> this. In the failing case, I wonder if somehow khugepaged could be immediately
> re-collapsing the pages before user space can observe the split? Perhaps the
> failed khugepaged test has left khugepaged in an "awake" state and it
> immediately pounces?

This is more likely to be a stats issue. Have you checked smaps to see whether
AnonHugePages is 0 kB, by placing a getchar() before the exit(EXIT_FAILURE)?
split_huge_page_test checks that stat to make sure the split indeed happened.

--
Best Regards,
Yan, Zi
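
For anyone wanting to do the smaps check from outside the test while it sits at
the getchar(), here is a minimal sketch in Python. The helper names
(anon_huge_kb, read_smaps) are hypothetical and not part of the selftests; the
only thing assumed about the kernel is that smaps reports one
"AnonHugePages: N kB" line per mapping.

```python
import re

# Matches the per-mapping line as it appears in smaps,
# e.g. "AnonHugePages:      2048 kB".
_ANON_HUGE_RE = re.compile(r"^AnonHugePages:\s+(\d+)\s+kB")


def anon_huge_kb(smaps_text):
    """Return the sum of all AnonHugePages values (in kB) in smaps text."""
    total = 0
    for line in smaps_text.splitlines():
        match = _ANON_HUGE_RE.match(line)
        if match:
            total += int(match.group(1))
    return total


def read_smaps(pid="self"):
    """Read /proc/<pid>/smaps; pid is the paused test's PID (hypothetical usage)."""
    with open(f"/proc/{pid}/smaps") as f:
        return f.read()
```

Calling anon_huge_kb(read_smaps(pid)) with the PID of the paused test gives the
total across all mappings. If that total is 0 while the test still reports
"Still AnonHugePages not split", the value the test read was likely stale or
misparsed; if it is non-zero, the pages really are huge-mapped at that point.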