Re: [PATCH] mm, proc: collect percpu free pages into the free pages

"Huang, Ying" <ying.huang@xxxxxxxxx> · Wed, 11 Sep 2024 13:37:21 +0800

mawupeng <mawupeng1@xxxxxxxxxx> writes:

> On 2024/9/4 15:28, Michal Hocko wrote:
>> On Wed 04-09-24 14:49:20, mawupeng wrote:
>>>
>>>
>>> On 2024/9/3 16:09, Michal Hocko wrote:
>>>> On Tue 03-09-24 09:50:48, mawupeng wrote:
>>>>>> Draining the remote PCP may not be that expensive now, after commit
>>>>>> 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock").  No
>>>>>> IPI is needed to drain the remote PCP.
>>>>>
>>>>> This looks really great. We can think of a way to drain the pcp before
>>>>> going to the slowpath, before swapping.
>>>>
>>>> We currently drain after first unsuccessful direct reclaim run. Is that
>>>> insufficient? 
>>>
>>> The reason I said draining the pcp is insufficient or expensive is based
>>> on your comment[1] :-). Since IPIs are no longer required after commit
>>> 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock"), this
>>> could be much better.
>>>
>>> [1]: https://lore.kernel.org/linux-mm/ZWRYZmulV0B-Jv3k@tiehlicka/
>> 
>> There are other reasons I mentioned in that reply which play a role
>> as well.
>> 
>>>> Should we do a less aggressive draining sooner? Ideally
>>>> restricted to cpus on the same NUMA node maybe? Do you have any specific
>>>> workloads that would benefit from this?
>>>
>>> Currently the problem is the amount of memory in the pcp, which can grow
>>> to 4.6% (24644M) of the total 512G memory.
>> 
>> Why is that a problem? 
>
> MemAvailable
>               An estimate of how much memory is available for starting new
>               applications, without swapping. Calculated from MemFree,
>               SReclaimable, the size of the file LRU lists, and the low
>               watermarks in each zone.
>
> The PCP memory is essentially available memory and will be reclaimed before OOM.
> In essence, it is not fundamentally different from reclaiming file pages, as both
> are reclaimed within __alloc_pages_direct_reclaim. So why shouldn't it be
> included in MemAvailable, to avoid confusion?
>
> __alloc_pages_direct_reclaim
>   __perform_reclaim
>   if (!page && !drained)
>     drain_all_pages(NULL);
>
>
>> Just because some tools are miscalculating memory
>> pressure because they are based on MemAvailable? Or does this lead to
>> performance regressions on the kernel side? In other words, would the
>> same workload behave better if the amount of pcp-cache was reduced
>> without any userspace intervention?

Back to the original PCP cache issue.  I want to check whether PCP
auto-tuning works properly on your system.  If it does, the total number
of PCP pages should be less than the sum of the zones' low watermarks.
Can you verify that first?
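
For instance, the two numbers could be compared from /proc/zoneinfo
with something like the sketch below.  It is only a rough illustration
and rests on assumptions: it expects the usual zoneinfo layout, where
each zone has a "low" line under "pages free" and each per-CPU pageset
has a "count:" line, and the function name is made up for this example.

```python
def pcp_vs_low_watermark(zoneinfo_text):
    """Return (total_pcp_pages, total_low_watermark_pages).

    Assumes the usual /proc/zoneinfo layout (an assumption, not checked):
      - zone watermark lines look like "low   <pages>" (no colon)
      - per-CPU pageset lines look like "count: <pages>"
    Both values are in pages.
    """
    pcp_pages = 0
    low_wmark_pages = 0
    for line in zoneinfo_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip headers and lines such as "pages free 1000"
        key, value = fields
        if key == "count:":
            # Pages currently held on one CPU's PCP list for this zone.
            pcp_pages += int(value)
        elif key == "low":
            # Low watermark of this zone.
            low_wmark_pages += int(value)
    return pcp_pages, low_wmark_pages


def read_proc_zoneinfo(path="/proc/zoneinfo"):
    """Read the live zoneinfo file (Linux only)."""
    with open(path) as f:
        return f.read()
```

Calling pcp_vs_low_watermark(read_proc_zoneinfo()) on the affected
machine gives both sums; if the first is not smaller than the second,
that would suggest auto-tuning is not behaving as expected there.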

--
Best Regards,
Huang, Ying