Re: [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable

Wei Xu <weixugc@xxxxxxxxxx> · Mon, 14 Oct 2024 16:41:03 -0700

On Mon, Oct 14, 2024 at 4:25 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, 14 Oct 2024 22:12:11 +0000 Wei Xu <weixugc@xxxxxxxxxx> wrote:
>
> > lru_gen_shrink_node() unconditionally clears kswapd_failures, which
> > can prevent kswapd from sleeping and cause 100% kswapd cpu usage even
> > when kswapd repeatedly fails to make progress in reclaim.
> >
> > Only clear kswap_failures in lru_gen_shrink_node() if reclaim makes
> > some progress, similar to shrink_node().
>
> That sounds bad.  What triggers this?  Can you suggest why it has just
> bee discovered, after 1.5 years?  And should the fix be backported into
> -stable kernels?
>

I happened to run into this problem in one of my tests recently. It
requires a combination of several conditions: The allocator needs to
allocate a right amount of pages such that it can wake up kswapd
without itself being OOM killed; there is no memory for kswapd to
reclaim (My test disables swap and cleans page cache first); no other
process frees enough memory at the same time.

I think the fix is a good candidate for stable kernels.