Re: [PATCH] mm/vmscan: don't scan adjust too much if current is not kswapd

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Thu, 15 Sep 2022 00:02:14 +0100

On Wed, Sep 14, 2022 at 03:51:42PM -0700, Andrew Morton wrote:
> On Wed, 14 Sep 2022 10:33:18 +0800 Hongchen Zhang <zhanghongchen@xxxxxxxxxxx> wrote:
> 
> > When a process takes a page fault and there is not enough free
> > memory, it will do direct reclaim. At the same time, it is holding
> > mmap_lock. So in the multi-threaded case, it should exit from the
> > page fault ASAP.
> > When reclaiming memory, we do scan adjust between the anon and file
> > LRUs, which may take too much time and trigger a hung task for other
> > threads. So a process which is not kswapd should do only a little
> > scan adjust.
> 
> Well, that's a pretty nasty bug.  Before diving into a possible fix,
> can you please tell us more about how this happens?  What sort of
> machine, what sort of workload?  Can you suggest why others are not
> experiencing this?

One thing I'd like to know is whether the page fault is for an anonymous or
file-backed page.  We already drop the mmap_lock for doing file I/O
(or we should ...) and maybe we also need to drop the mmap_lock for
doing direct reclaim?