Re: [PATCH v3] mm, memcg: fix the stupid OOM killer when shrinking memcg hard limit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 28, 2019 at 12:28 PM Qian Cai <cai@xxxxxx> wrote:
>
>
>
> > On Nov 27, 2019, at 11:17 PM, Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> >
> > When there are no more processes in a memcg (e.g., due to OOM
> > group), we can still have file pages in the page cache.
> >
> > If these pages are protected by memory.min, they can't be reclaimed.
> > Especially if there won't be another process in this memcg and the memcg
> > is kept online, we do want to drop these pages from the page cache.
> >
> > By dropping these page caches we can avoid reclaimers (e.g., kswapd or
> > direct) to scan and reclaim pages from all memcgs in the system -
> > because the reclaimers will try to fairly reclaim pages from all memcgs
> > in the system when under memory pressure.
> >
> > By setting the hard limit of such a memcg to 0, we allow to drop the
> > page cache of such memcgs. Unfortunately, this may invoke the OOM killer
> > and generate a lot of output. The OOM output is not expected by an admin
> > who wants to drop these caches and knows that there are no processes in
> > this memcg anymore.
> >
> > Therefore, if a memcg is not populated, we should not invoke the OOM
> > killer - there is nothing to kill. The next time a new process is
> > started in the memcg and the "max" is still below usage, the OOM killer
> > will be invoked and the new process will be killed.
> >
> > [ Above commit log is contributed by David ]
> >
> > What's worse about this issue is that when there're killable tasks and the
> > OOM killer killed the last task, and what will happen then ? As nr_reclaims
> > is already 0 and drained is alreay true, the OOM killer will try to kill
> > nothing (because he knows he has killed the last task), what's a stupid
> > behavior.
> >
> > Someone may worry that the admins may not going to see that the memcg was
> > OOM due to the limit change. But this is not a issue, because the admins
> > changes the limit and then the admins must check the result of his change
> > - by checking memory.{max, current, stat} he can get all he wants.
> >
> > Cc: David Hildenbrand <david@xxxxxxxxxx>
> > Nacked-by: Michal Hocko <mhocko@xxxxxxxx>
> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
>
> Surely too big a turkey to swallow ? — unprofessional wording and carrying a NACK
> from one of the maintainers.
>

I'm sorry if there're any unprofessional wording.
I just want to make the kernel more stable.

Thanks
Yafang





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux