Re: [PATCH -V2] mm: fix draining PCP of remote zone

On Sat,  7 Oct 2023 14:23:56 +0800 Huang Ying <ying.huang@xxxxxxxxx> wrote:

> If there is no memory allocation/freeing in the PCP (Per-CPU Pageset)
> of a remote zone (a zone in a remote NUMA node) for some time (3
> seconds for now), the pages in the PCP of the remote zone will be
> drained to avoid wasting memory.
> 
> This behavior was introduced in commit 4ae7c03943fc ("[PATCH]
> Periodically drain non local pagesets") and commit
> 4037d452202e ("Move remote node draining out of slab allocators").
> 
> However, after commit 7cc36bbddde5 ("vmstat: on-demand vmstat workers
> V8"), the vmstat updater worker, which is used to drain the PCP of
> remote zones, may not be re-queued while we are waiting for the
> timeout (pcp->expire != 0) if there are no vmstat changes on this CPU,
> for example, when the CPU goes idle or runs user-space-only workloads.
> As a result, pages of a remote zone may be kept in the PCP of this CPU
> for a long time, and page reclaim of the remote zone may be
> triggered prematurely.  This isn't a severe problem in practice,
> because the PCP of the remote zone will be drained if some memory is
> allocated/freed again on this CPU.  Also, the PCP will eventually be
> drained during direct reclaim if necessary.
> 
> Still, the problem deserves a fix: guarantee that the vmstat updater
> worker is always re-queued while we are waiting for the timeout.  In
> effect, this restores the behavior from before commit 7cc36bbddde5.
> 
> We can reproduce the bug by allocating/freeing pages from a remote
> zone and then going idle, as follows.  The patch fixes it.
> 
> - Run some workloads, using `numactl` to bind the CPU to node 0 and
>   memory to node 1, so that the PCP of the CPU on node 0 for the zone
>   on node 1 is filled.
> 
> - After the workloads finish, idle for 60s.
> 
> - Check /proc/zoneinfo.
> 
> With the original kernel, the number of pages in the PCP of the CPU on
> node 0 for the zone on node 1 is non-zero after idling.  With the
> patched kernel, it becomes 0 after idling.  That is, we avoid keeping
> pages in the remote PCP during idle.
> 

Thanks, I updated the changelog in place and queued this for mm-stable.
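
For anyone repeating the reproduction steps quoted above, here is a
minimal sketch (not part of the patch) of pulling the per-CPU pageset
counts out of /proc/zoneinfo.  The workload would be started with
something like `numactl --cpunodebind=0 --membind=1 <workload>`.  The
function names and the SAMPLE text below are made up for illustration,
and the parser assumes the usual "Node N, zone NAME" / "cpu: N" /
"count: N" layout, which may differ between kernel versions.

```python
import re

# Hypothetical excerpt in the shape of /proc/zoneinfo (values invented).
SAMPLE = """\
Node 0, zone   Normal
  pages free     100000
  pagesets
    cpu: 0
              count: 5
              high:  378
              batch: 63
Node 1, zone   Normal
  pages free     200000
  pagesets
    cpu: 0
              count: 37
              high:  378
              batch: 63
    cpu: 1
              count: 0
              high:  378
              batch: 63
"""


def pcp_counts(zoneinfo_text):
    """Return {(node, zone, cpu): count} parsed from zoneinfo-style text."""
    counts = {}
    node = zone = cpu = None
    for line in zoneinfo_text.splitlines():
        m = re.match(r"Node\s+(\d+),\s+zone\s+(\S+)", line)
        if m:
            # New zone section: forget the CPU from the previous section.
            node, zone, cpu = int(m.group(1)), m.group(2), None
            continue
        m = re.match(r"\s+cpu:\s+(\d+)", line)
        if m:
            cpu = int(m.group(1))
            continue
        m = re.match(r"\s+count:\s+(\d+)", line)
        if m and node is not None and cpu is not None:
            counts[(node, zone, cpu)] = int(m.group(1))
    return counts


def nonzero_pcp(counts, node):
    """Keep only the entries for zones on `node` that still hold pages."""
    return {k: v for k, v in counts.items() if k[0] == node and v > 0}
```

On a live system, pass `open("/proc/zoneinfo").read()` instead of SAMPLE
and call `nonzero_pcp(counts, 1)` after the 60s idle, looking at the CPUs
that belong to node 0.  The CPU-to-node mapping is not in /proc/zoneinfo
and has to come from elsewhere (for example `numactl --hardware`).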



