Re: [RFC PATCH v1 0/5] Alternative mTHP swap allocator improvements

On 19/06/2024 08:19, Huang, Ying wrote:
> Hi, Ryan,
> 
> Ryan Roberts <ryan.roberts@xxxxxxx> writes:
> 
>> Hi All,
>>
>> Chris has been doing great work at [1] to clean up my mess in the mTHP swap
>> entry allocator.
> 
> I don't think the original behavior is a mess.  It's just
> the first step in the right direction.  It's straightforward and
> obviously correct.  Then, we can optimize it step by step with data to
> justify the increased complexity.

OK, perhaps I was over-egging it by calling it a "mess". What you're describing
was my initial opinion too, but I saw Andrew complaining that we shouldn't be
merging a feature if it doesn't work. This series fixes the problem in a minimal
way - if you ignore the last patch, which really is just a performance
optimization and could be dropped.

If we can ultimately get Chris's series to 0% fallback like this one, and
everyone is happy with the current state for v6.10, then agreed - let's
concentrate on Chris's series for v6.11.

Thanks,
Ryan

> 
>> But Barry posted a test program and results at [2] showing that
>> even with Chris's changes, there are still some fallbacks (around 5% - 25% in
>> some cases). I was interested in why that might be and ended up putting this PoC
>> patch set together to try to get a better understanding. This series ends up
>> achieving 0% fallback, even with small folios ("-s") enabled. I haven't done
>> much testing beyond that (yet) but thought it was worth posting on the strength
>> of that result alone.
>>
>> At a high level this works in a similar way to Chris's series; it marks a
>> cluster as being for a particular order and if a new cluster cannot be allocated
>> then it scans through the existing non-full clusters. But it does it by scanning
>> through the clusters rather than assembling them into a list. Cluster flags are
>> used to mark clusters that have been scanned and are known not to have enough
>> contiguous space, so the efficiency should be similar in practice.
>>
>> Because it's not based around a linked list, there is less churn and I'm
>> wondering if this is perhaps easier to review and potentially even get into
>> v6.10-rcX to fix up what's already there, rather than having to wait until v6.11
>> for Chris's series? I know Chris has a larger roadmap of improvements, so at
>> best I see this as a tactical fix that will ultimately be superseded by Chris's
>> work.
> 
> I don't think we need any mTHP swap entry allocation optimization to go
> into v6.10-rcX.  There's no functionality or performance regression.
> Per my understanding, we merge optimizations when they're ready.
> 
> Hi, Andrew,
> 
> Please correct me if you don't agree.
> 
> [snip]
> 
> --
> Best Regards,
> Huang, Ying
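
For anyone following along, the scanning scheme described in the quoted cover
letter can be sketched roughly as below. This is a toy userspace model in C,
not code from the series: the toy_* names, the cluster and slot counts, and the
per-order "current cluster" index are all assumptions made for illustration.
It only shows the idea of dedicating a cluster to one order, taking a new
cluster when the current one is exhausted, and otherwise scanning same-order
clusters while skipping those already flagged as having no suitable space.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SLOTS_PER_CLUSTER 16   /* toy size (assumption) */
#define NR_CLUSTERS       4    /* toy size (assumption) */
#define TOY_MAX_ORDER     4    /* 1 << 4 == SLOTS_PER_CLUSTER */
#define ORDER_NONE        (-1) /* cluster not yet dedicated to any order */

struct toy_cluster {
    int  order;                   /* order this cluster serves, or ORDER_NONE */
    bool no_space;                /* flag: scanned, no free aligned run found */
    bool used[SLOTS_PER_CLUSTER]; /* per-slot in-use map */
};

struct toy_swap {
    struct toy_cluster clusters[NR_CLUSTERS];
    int cur[TOY_MAX_ORDER + 1];   /* cluster being filled per order, or -1 */
};

static void toy_swap_init(struct toy_swap *s)
{
    memset(s, 0, sizeof(*s));
    for (int i = 0; i < NR_CLUSTERS; i++)
        s->clusters[i].order = ORDER_NONE;
    for (int o = 0; o <= TOY_MAX_ORDER; o++)
        s->cur[o] = -1;
}

/* Claim a free, naturally aligned run of (1 << order) slots in cluster idx.
 * Returns the global slot index, or -1 after flagging the cluster. */
static int toy_claim(struct toy_swap *s, int idx, int order)
{
    struct toy_cluster *c = &s->clusters[idx];
    int nr = 1 << order;

    for (int base = 0; base + nr <= SLOTS_PER_CLUSTER; base += nr) {
        bool free_run = true;
        for (int i = 0; i < nr; i++)
            if (c->used[base + i])
                free_run = false;
        if (!free_run)
            continue;
        for (int i = 0; i < nr; i++)
            c->used[base + i] = true;
        s->cur[order] = idx;
        return idx * SLOTS_PER_CLUSTER + base;
    }
    c->no_space = true;           /* remember the failed scan */
    return -1;
}

/* Allocate (1 << order) contiguous slots; -1 means the caller must fall back. */
static int toy_alloc(struct toy_swap *s, int order)
{
    int i, slot;

    assert(order >= 0 && order <= TOY_MAX_ORDER);

    /* 1. Keep filling the cluster currently used for this order. */
    i = s->cur[order];
    if (i >= 0 && !s->clusters[i].no_space) {
        slot = toy_claim(s, i, order);
        if (slot >= 0)
            return slot;
    }
    /* 2. Otherwise dedicate a new (unassigned) cluster to this order. */
    for (i = 0; i < NR_CLUSTERS; i++) {
        if (s->clusters[i].order == ORDER_NONE) {
            s->clusters[i].order = order;
            return toy_claim(s, i, order);
        }
    }
    /* 3. No new cluster: scan same-order clusters, skipping flagged ones. */
    for (i = 0; i < NR_CLUSTERS; i++) {
        if (s->clusters[i].order != order || s->clusters[i].no_space)
            continue;
        slot = toy_claim(s, i, order);
        if (slot >= 0)
            return slot;
    }
    return -1;
}

/* Free a run; space may exist again, so clear the "scanned" flag. */
static void toy_free(struct toy_swap *s, int slot, int order)
{
    struct toy_cluster *c = &s->clusters[slot / SLOTS_PER_CLUSTER];
    int base = slot % SLOTS_PER_CLUSTER;

    for (int i = 0; i < (1 << order); i++)
        c->used[base + i] = false;
    c->no_space = false;
}
```

In this model a return of -1 from toy_alloc() stands in for the fallback case
discussed in the thread. The flag is what keeps the scan cheap: a cluster that
has already been found to lack a contiguous run is skipped until something in
it is freed.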




