Re: [PATCH v5 0/9] mm: swap: mTHP swap allocator base on swap cluster order

Chris Li <chrisl@xxxxxxxxxx> writes:

> On Wed, Aug 7, 2024 at 12:59 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>>
>> Hi, Chris,
>>
>> Chris Li <chrisl@xxxxxxxxxx> writes:
>>
>> > This is the short-term solution "swap cluster order" listed
>> > on slide 8 of my "Swap Abstraction" discussion at the recent
>> > LSF/MM conference.
>> >
>> > When commit 845982eb264bc "mm: swap: allow storage of all mTHP
>> > orders" was introduced, it only allocated the mTHP swap entries
>> > from the new empty cluster list.  It has a fragmentation issue
>> > reported by Barry.
>> >
>> > https://lore.kernel.org/all/CAGsJ_4zAcJkuW016Cfi6wicRr8N9X+GJJhgMQdSMp+Ah+NSgNQ@xxxxxxxxxxxxxx/
>> >
>> > The reason is that all the empty clusters have been exhausted while
>> > there are plenty of free swap entries in the cluster that are
>> > not 100% free.
>> >
>> > Remember the swap allocation order in the cluster.
>> > Keep track of the per order non full cluster list for later allocation.
>> >
>> > This series gives the swap SSD allocation a new separate code path
>> > from the HDD allocation. The new allocator uses cluster lists only
>> > and no longer scans swap_map[] globally without the lock.
>>
>> This sounds good.  Can we use SSD allocation method for HDD too?
>> We may not need a swap entry allocator optimized for HDD.
>
> Yes, that is the plan as well. That way we can completely get rid of
> the old scan_swap_map_slots() code.

Good!

> However, considering the size of the series, let's focus on the
> cluster allocation path first, get it tested and reviewed.

OK.

> For HDD optimization, mostly just the new block allocation portion
> needs a separate code path from the new cluster allocator so that it
> does not do the per-CPU allocation.  Allocating from the non-free list
> doesn't need to change either.

I suggest not considering HDD optimization at all.  Just use the SSD
algorithm to simplify.

>>
>> Hi, Hugh,
>>
>> What do you think about this?
>>
>> > This streamlines the swap allocation for SSD. The code matches the
>> > execution flow much better.
>> >
>> > User impact: For users that allocate and free mixed-order mTHP swap
>> > entries, it greatly improves the success rate of the mTHP swap
>> > allocation after the initial phase.
>> >
>> > It also performs faster when the swapfile is close to full, because the
>> > allocator can get the non full cluster from a list rather than scanning
>> > a lot of swap_map entries.
>>
>> Do you have some test results to prove this?  Or which test below can
>> prove this?
>
> The two zram tests are already proving this. The system time
> improvement is about 2% on my low CPU count machine.
> Kairui has a higher core count machine and the difference is higher
> there. The theory is that a higher CPU count means higher contention.

I will interpret this as the performance being better in theory.  But
there are almost no measurable results so far.

> The 2% system time number does not sound like much. But consider these
> two factors:
> 1) swap allocator only takes a small percentage of the overall workload.
> 2) The new allocator does more work.
> The old allocator has a time tick budget. It will abort and fail to
> find an entry when it runs out of time budget, even though there are
> still some free entries on the swapfile.

What is the time tick budget you mentioned?

> The new allocator can get to the last few free swap entries if they
> are available. If not, the new swap allocator will work harder on
> swap cache reclaim.
>
> From the swap cache reclaim aspect, it is very hard to optimize the
> swap cache reclaim in the old allocation path because the scan
> position is randomized.
> The full list and frag list are both designed to help reduce repeated
> reclaim attempts on the swap cache.

[snip]

--
Best Regards,
Huang, Ying




