Re: [PATCH] selftests/mm: Introduce a test program to assess swap entry allocation for thp_swapout

Barry Song <21cnbao@xxxxxxxxx> · Fri, 21 Jun 2024 19:20:43 +0800

On Fri, Jun 21, 2024 at 4:50 PM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> On Fri, Jun 21, 2024 at 12:47 AM Barry Song <21cnbao@xxxxxxxxx> wrote:
> >
> > On Fri, Jun 21, 2024 at 7:25 PM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> > >
> > > On 20/06/2024 12:34, David Hildenbrand wrote:
> > > > On 20.06.24 11:04, Ryan Roberts wrote:
> > > >> On 20/06/2024 01:26, Barry Song wrote:
> > > >>> From: Barry Song <v-songbaohua@xxxxxxxx>
> > > >>>
> > > >>> Both Ryan and Chris have been utilizing the small test program to aid
> > > >>> in debugging and identifying issues with swap entry allocation. While
> > > >>> a real or intricate workload might be more suitable for assessing the
> > > >>> correctness and effectiveness of the swap allocation policy, a small
> > > >>> test program presents a simpler means of understanding the problem and
> > > >>> initially verifying the improvements being made.
> > > >>>
> > > >>> Let's endeavor to integrate it into the self-test suite. Although it
> > > >>> presently only accommodates 64KB and 4KB, I'm optimistic that we can
> > > >>> expand its capabilities to support multiple sizes and simulate more
> > > >>> complex systems in the future as required.
> > > >>
> > > >> I'll try to summarize the thread with Huang Ying by suggesting this test program
> > > >> is "necessary but not sufficient" to exhaustively test the mTHP swap-out path.
> > > >> I've certainly found it useful and think it would be a valuable addition to the
> > > >> tree.
> > > >>
> > > >> That said, I'm not convinced it is a selftest; IMO a selftest should provide a
> > > >> clear pass/fail result against some criteria and must be able to be run
> > > >> automatically by (e.g.) a CI system.
> > > >
> > > > Likely we should then consider moving other such performance-related thingies
> > > > out of the selftests?
> > >
> > > Yes, that would get my vote. But of the 4 tests you mentioned that use
> > > clock_gettime(), it looks like transhuge-stress is the only one that doesn't
> > > have a pass/fail result, so is probably the only candidate for moving.
> > >
> > > The others either use the times as a timeout and determine failure if the
> > > action didn't occur within the timeout (e.g. ksm_tests.c) or use it to add some
> > > supplemental performance information to an otherwise functionality-oriented test.
> >
> > Thank you very much, Ryan. I think you've found a better home for this
> > tool. I will send v2, relocating it to tools/mm and adding a function to
> > swap in either whole mTHPs or a portion of mTHPs via "-a" (aligned swapin).
> >
> > So basically, we will have
> >
> > 1. Use MADV_PAGEOUT for rapid swap-out, exercising the swap allocation code
> > heavily in a short time.
> >
> > 2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in freeing
> > memory, as well as for munmap, app exits, or OOM killer scenarios. This ensures
> > new mTHP is always generated, released or swapped out, similar to the behavior
> > on a PC or Android phone where many applications are frequently started and
> > terminated.
>
> Will this cover the case where the ratio of order-0 to order-4 swap
> requests changes during LMK and the swapfile is almost full?
>
> If not, please add that :-)

Due to 2, we ensure a certain proportion of mTHP. Similarly, because of 3, we
maintain a certain proportion of small folios: we don't support large folio
swap-in, so any swap-in immediately results in small folios. Therefore, with
both 2 and 3, we automatically get a system containing both mTHP and small
folios. Additionally, 1 provides the ability to continuously swap them out.
If we set the same sizes for 2 and 3, we'll achieve a 1:1 ratio of large
folios to small folios. How about starting with a 1:1 ratio?
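
To make the interplay of 1, 2 and 3 concrete, here is a rough sketch of one
iteration with two equally sized regions (hence the 1:1 ratio). It is only an
illustration, not the v2 code: alloc_aligned(), swap_in() and swap_cycle() are
made-up names, MTHP_SIZE assumes 64KB mTHP is enabled on the system, and the
fallback value 21 for MADV_PAGEOUT is the one from the Linux uapi headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21			/* Linux >= 5.4, value from the uapi headers */
#endif

#define MTHP_SIZE (64UL * 1024)		/* assumption: 64KB mTHP is enabled */

/* Map an anonymous region of len bytes whose start is MTHP_SIZE aligned. */
static char *alloc_aligned(size_t len)
{
	size_t span = len + MTHP_SIZE;
	char *raw, *aligned, *end;

	assert(len > 0 && len % MTHP_SIZE == 0);
	raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;

	aligned = (char *)(((uintptr_t)raw + MTHP_SIZE - 1) & ~(MTHP_SIZE - 1));
	end = aligned + len;
	if (aligned > raw)			/* trim the unaligned head */
		munmap(raw, (size_t)(aligned - raw));
	if (raw + span > end)			/* trim the unused tail */
		munmap(end, (size_t)(raw + span - end));

	/* Ask for THP; the result is ignored because it is only a hint. */
	(void)madvise(aligned, len, MADV_HUGEPAGE);
	return aligned;
}

/*
 * 3: read the region back. whole != 0 touches every base page, otherwise
 * only the first page of each 64KB block, leaving the rest in swap.
 * Returns the number of pages touched.
 */
static size_t swap_in(char *buf, size_t len, int whole)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t step = whole ? page : MTHP_SIZE;
	size_t off, touched = 0;

	for (off = 0; off < len; off += step) {
		(void)*(volatile char *)(buf + off);
		touched++;
	}
	return touched;
}

/* One iteration; returns 0, or -1 if any madvise() call failed. */
static int swap_cycle(char *large, char *small, size_t len, int whole)
{
	int rc = 0;

	/* 2: drop the old pages, then refault so new folios are allocated. */
	if (madvise(large, len, MADV_DONTNEED))
		rc = -1;
	memset(large, 0x5a, len);

	/* 3: swapped-in pages of this region come back as small folios. */
	swap_in(small, len, whole);

	/* 1: push both regions out to swap again. */
	if (madvise(large, len, MADV_PAGEOUT))
		rc = -1;
	if (madvise(small, len, MADV_PAGEOUT))
		rc = -1;
	return rc;
}
```

A caller would allocate both regions with alloc_aligned(), memset() the
"small" one once, and then call swap_cycle() in a loop. Whether "large" really
gets 64KB folios depends on the sysfs mTHP settings, which the sketch does not
check.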

To meet the requirement that the swapfile is almost full, I can increase the
memory so that the total size is quite close to the zRAM size. This way, we
give the small folios a chance to perform a slow scan and observe the impact.
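
One possible way to confirm that swap really is "almost full" before looking
at the fallback numbers is to ask sysinfo(2). This is only a sketch under my
own assumptions: swap_used_pct() and current_swap_used_pct() are made-up
helper names, and sysinfo() reports all swap devices together, so it matches
the zRAM device only if that is the sole swap device.

```c
#include <assert.h>
#include <sys/sysinfo.h>

/* Percentage of swap in use, given total and free in the same unit. */
static int swap_used_pct(unsigned long long total, unsigned long long free_)
{
	assert(free_ <= total);
	if (total == 0)
		return 0;		/* no swap configured */
	return (int)((total - free_) * 100 / total);
}

/* Returns 0..100, or -1 if sysinfo() fails or reports odd values. */
static int current_swap_used_pct(void)
{
	struct sysinfo si;

	if (sysinfo(&si) != 0 || si.freeswap > si.totalswap)
		return -1;
	/* totalswap and freeswap share mem_unit, so the ratio needs no scaling. */
	return swap_used_pct(si.totalswap, si.freeswap);
}
```

The test loop could then keep growing the working set until
current_swap_used_pct() reaches some threshold chosen by the user.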

>
> > 3. Swap in with or without the "-a" option to observe how fragments due to
> > swap-in and the incoming swap-in of large folios will impact swap-out
> > fallback.
> >
> > And many thanks to Chris for the suggestion on improving it within selftest,
> > though I prefer to place it in tools/mm.
>
> I am perfectly fine with that. Looking forward to your V2.
>
> Chris

Thanks
Barry