Re: [PATCH] selftests/mm: Introduce a test program to assess swap entry allocation for thp_swapout

"Huang, Ying" <ying.huang@xxxxxxxxx> · Fri, 21 Jun 2024 17:22:49 +0800

Barry Song <21cnbao@xxxxxxxxx> writes:

> On Fri, Jun 21, 2024 at 7:25 PM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
>>
>> On 20/06/2024 12:34, David Hildenbrand wrote:
>> > On 20.06.24 11:04, Ryan Roberts wrote:
>> >> On 20/06/2024 01:26, Barry Song wrote:
>> >>> From: Barry Song <v-songbaohua@xxxxxxxx>
>> >>>
>> >>> Both Ryan and Chris have been utilizing the small test program to aid
>> >>> in debugging and identifying issues with swap entry allocation. While
>> >>> a real or intricate workload might be more suitable for assessing the
>> >>> correctness and effectiveness of the swap allocation policy, a small
>> >>> test program presents a simpler means of understanding the problem and
>> >>> initially verifying the improvements being made.
>> >>>
>> >>> Let's endeavor to integrate it into the self-test suite. Although it
>> >>> presently only accommodates 64KB and 4KB, I'm optimistic that we can
>> >>> expand its capabilities to support multiple sizes and simulate more
>> >>> complex systems in the future as required.
>> >>
>> >> I'll try to summarize the thread with Huang Ying by suggesting this test program
>> >> is "necessary but not sufficient" to exhaustively test the mTHP swap-out path.
>> >> I've certainly found it useful and think it would be a valuable addition to the
>> >> tree.
>> >>
>> >> That said, I'm not convinced it is a selftest; IMO a selftest should provide a
>> >> clear pass/fail result against some criteria and must be able to be run
>> >> automatically by (e.g.) a CI system.
>> >
>> > Likely we should then consider moving other such performance-related thingies
>> > out of the selftests?
>>
>> Yes, that would get my vote. But of the 4 tests you mentioned that use
>> clock_gettime(), it looks like transhuge-stress is the only one that doesn't
>> have a pass/fail result, so is probably the only candidate for moving.
>>
>> The others either use the time as a timeout and determine failure if the
>> action didn't occur within the timeout (e.g. ksm_tests.c) or use it to add some
>> supplemental performance information to an otherwise functionality-oriented test.
>
> Thank you very much, Ryan. I think you've found a better home for this
> tool. I will send v2, relocating it to tools/mm and adding a function to
> swap in either the whole mTHPs or a portion of mTHPs by "-a" (aligned
> swapin).
>
> So basically, we will have
>
> 1. Use MADV_PAGEOUT for rapid swap-out, putting the swap allocation code under
> high exercise in a short time.
>
> 2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in freeing
> memory, as well as for munmap, app exits, or OOM killer scenarios. This ensures
> new mTHP is always generated, released or swapped out, similar to the behavior
> on a PC or Android phone where many applications are frequently started and
> terminated.

Calling MADV_DONTNEED on 64KB of memory and then memset()-ing it simulates
exactly the large folio swap-in, which hasn't been merged upstream.  I
don't think it's a good idea to rely on this kind of trick.
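
For reference, the pattern in question can be sketched roughly as below.
This is only an illustration, not the code from the patch: the helper
names map_chunks() and drop_and_refault() and the CHUNK_SIZE macro are
made up here.  Whether the refault is actually served by a 64KB folio
depends on the mTHP sysfs settings and on the alignment of the mapping,
neither of which this sketch controls.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* 64KB, the mTHP size discussed in this thread (assumed for illustration). */
#define CHUNK_SIZE (64UL * 1024)

/*
 * Hypothetical helper: map n chunks of private anonymous memory.
 * mmap() only guarantees page alignment, not 64KB alignment.
 * Returns NULL on failure.
 */
static void *map_chunks(size_t n)
{
	void *p = mmap(NULL, n * CHUNK_SIZE, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: drop the pages (or swap entries) backing one
 * chunk, then write to all of it again.  After MADV_DONTNEED the old
 * contents are gone, so the writes fault in fresh zero-filled memory
 * rather than reading anything back from swap.
 * Returns 0 on success, -1 if madvise() fails.
 */
static int drop_and_refault(void *chunk)
{
	if (madvise(chunk, CHUNK_SIZE, MADV_DONTNEED) != 0)
		return -1;

	memset(chunk, 0x5a, CHUNK_SIZE);
	return 0;
}
```

Note that no data is read from the swap device here; the old swap
entries are simply freed and new memory is allocated on the write
fault.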

> 3. Swap in with or without the "-a" option to observe how fragments due to
> swap-in and the incoming swap-in of large folios will impact swap-out
> fallback.

It's good to create fragmentation with swap-in, which is more practical
and future-proof.  And I believe that we can reduce the large folio
swap-out fallback rate without the large folio swap-in trick.

> And many thanks to Chris for the suggestion on improving it within
> selftest, though I prefer to place it in tools/mm.

--
Best Regards,
Huang, Ying