mawupeng <mawupeng1@xxxxxxxxxx> writes:

> On 2024/9/4 15:28, Michal Hocko wrote:
>> On Wed 04-09-24 14:49:20, mawupeng wrote:
>>>
>>> On 2024/9/3 16:09, Michal Hocko wrote:
>>>> On Tue 03-09-24 09:50:48, mawupeng wrote:
>>>>>> Draining remote PCP may not be that expensive now after commit 4b23a68f9536
>>>>>> ("mm/page_alloc: protect PCP lists with a spinlock").  No IPI is needed
>>>>>> to drain the remote PCP.
>>>>>
>>>>> This looks really great, we can think of a way to drop PCP before going
>>>>> to the slowpath, before swap.
>>>>
>>>> We currently drain after the first unsuccessful direct reclaim run. Is
>>>> that insufficient?
>>>
>>> The reason I said the drain of PCP is insufficient or expensive is based
>>> on your comment[1] :-). Since IPIs have not been required since commit
>>> 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock"), this
>>> could be much better.
>>>
>>> [1]: https://lore.kernel.org/linux-mm/ZWRYZmulV0B-Jv3k@tiehlicka/
>>
>> There are other reasons I have mentioned in that reply which play a role
>> as well.
>>
>>>> Should we do a less aggressive draining sooner? Ideally
>>>> restricted to cpus on the same NUMA node maybe? Do you have any specific
>>>> workloads that would benefit from this?
>>>
>>> Currently the problem is the amount of PCP memory, which can increase to
>>> 4.6% (24644M) of the total 512G memory.
>>
>> Why is that a problem?
>
> MemAvailable
>               An estimate of how much memory is available for starting new
>               applications, without swapping. Calculated from MemFree,
>               SReclaimable, the size of the file LRU lists, and the low
>               watermarks in each zone.
>
> The PCP memory is essentially available memory and will be reclaimed before OOM.
> In essence, it is not fundamentally different from reclaiming file pages, as both
> are reclaimed within __alloc_pages_direct_reclaim. Therefore, why shouldn't it be
> included in MemAvailable to avoid confusion?
>
> __alloc_pages_direct_reclaim
>   __perform_reclaim
>   if (!page && !drained)
>     drain_all_pages(NULL);
>
>
>> Just because some tools are miscalculating memory
>> pressure because they are based on MemAvailable? Or does this lead to
>> performance regressions on the kernel side? In other words would the
>> same workload behave better if the amount of pcp-cache was reduced
>> without any userspace intervention?

Back to the original PCP cache issue.  I want to check whether PCP
auto-tuning works properly on your system.  If so, the total number of
PCP pages should be less than the sum of the low watermarks of the zones.
Can you verify that first?

--
Best Regards,
Huang, Ying
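
One possible way to run the check requested above from userspace is
sketched here. It is not part of the original message. It assumes the
usual /proc/zoneinfo layout, where each zone lists a "low" watermark in
pages and a "pagesets" section with one "count:" line per CPU. The
function names parse_zoneinfo() and report() are made up for this
illustration.

```python
import os
import re


def parse_zoneinfo(text):
    """Return (total_pcp_pages, total_low_watermark_pages).

    Assumes each zone block starts with a "Node N, zone NAME" line,
    lists "low <pages>" before its "pagesets" section, and reports one
    "count: <pages>" line per CPU inside that section.
    """
    pcp_pages = 0
    low_pages = 0
    in_pagesets = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Node"):
            # A new zone (or per-node stats) block begins.
            in_pagesets = False
        elif stripped == "pagesets":
            in_pagesets = True
        elif in_pagesets:
            m = re.match(r"count:\s+(\d+)$", stripped)
            if m:
                pcp_pages += int(m.group(1))
        else:
            m = re.match(r"low\s+(\d+)$", stripped)
            if m:
                low_pages += int(m.group(1))
    return pcp_pages, low_pages


def report(path="/proc/zoneinfo"):
    """Print both totals and whether PCP pages stay below the low watermarks."""
    with open(path) as f:
        pcp, low = parse_zoneinfo(f.read())
    page_kib = os.sysconf("SC_PAGE_SIZE") // 1024
    print(f"PCP pages:          {pcp} ({pcp * page_kib} KiB)")
    print(f"sum(low watermark): {low} ({low * page_kib} KiB)")
    print("PCP < sum(low):", pcp < low)
```

Calling report() on the affected machine prints both totals. The PCP
counts change constantly, so the result is a snapshot and is best
sampled while the workload in question is running.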