[PATCH 0/1] mm/oom_kill: system enters a state something like hang when running stress-ng

When we run stress-ng on Ubuntu Core (UC), the system enters a state
that looks like a hang. We found that any test case that can trigger an
OOM condition on UC (such as stress-ng-bigheap or stress-ng-brk) is
very likely to leave the system in this hang-like state. The issue was
discussed here:
https://github.com/ColinIanKing/stress-ng/pull/270

The root cause is that Ubuntu Core is built on squashfs, and
out_of_memory() does not invoke the OOM killer for an allocation made
without __GFP_FS.

On the squashfs side:
memalloc_nofs_save()
  squashfs_readahead()--->...--->alloc_page() /* alloc without __GFP_FS */
memalloc_nofs_restore()
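
To illustrate why the allocation above ends up without __GFP_FS, here
is a minimal userspace model of the NOFS scope. It is only a sketch:
the MODEL_* names, the bit values and model_task_flags are made up for
this example and are not the kernel's definitions. The logic follows
the general shape of the scope API (save sets a per-task flag, restore
puts back the previous state, and the flag strips the FS bit from any
allocation made inside the scope).

```c
/* Illustrative bit values only; NOT the kernel's definitions. */
#define MODEL_GFP_IO            0x1u
#define MODEL_GFP_FS            0x2u
#define MODEL_GFP_KERNEL        (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_PF_MEMALLOC_NOFS  0x1u

/* Hypothetical stand-in for the per-task flags word (current->flags). */
static unsigned int model_task_flags;

/* Enter a NOFS scope; returns the previous NOFS state so scopes can nest. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope by restoring the state returned by model_nofs_save(). */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* Mask an allocation request the way a NOFS scope would: drop the FS bit. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}
```

In this model, a request for MODEL_GFP_KERNEL made between
model_nofs_save() and model_nofs_restore() loses MODEL_GFP_FS, even
though the caller asked for it.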

On the OOM side:
out_of_memory()
  if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
          return true;
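
The effect of this check can be sketched in a small userspace model.
This is an assumption-laden illustration, not kernel code: the MODEL_*
bit values, struct model_oom_control and the victims_killed counter are
invented here, and victim selection is reduced to incrementing that
counter.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's definitions. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS    MODEL_GFP_IO

/* Hypothetical, heavily reduced stand-in for struct oom_control. */
struct model_oom_control {
	unsigned int gfp_mask;
	bool memcg_oom;      /* stands in for is_memcg_oom(oc) */
	int victims_killed;  /* counts how often a victim would be chosen */
};

static bool model_out_of_memory(struct model_oom_control *oc)
{
	/*
	 * Same condition as the check quoted above: a non-memcg request
	 * without the FS bit reports success but never picks a victim.
	 */
	if (oc->gfp_mask && !(oc->gfp_mask & MODEL_GFP_FS) && !oc->memcg_oom)
		return true;

	/* Simplification: "select and kill a process" is just a counter. */
	oc->victims_killed++;
	return true;
}
```

With gfp_mask set to MODEL_GFP_NOFS the function returns true and
victims_killed stays at 0, so a caller that keeps retrying never sees
memory being freed by the OOM killer. With MODEL_GFP_KERNEL, or with
memcg_oom set, a victim is chosen.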

This check has been in out_of_memory() for a long time. Removing these
2 lines fixes the issue, but I don't know whether that is acceptable.
As far as I understand, removing them should cause no problem: if an
allocation without __GFP_FS triggers the OOM killer, the killer selects
a process to kill, just as it does for an allocation with __GFP_FS.

Because this check has existed for so long, changing it is risky.
Instead, this patch uses a kthread to trigger the OOM killer with
__GFP_FS when needed, which I think is a safer way to resolve the
issue. Please help review it.
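
The patch itself is not reproduced in this cover letter, so the
following is only a guess at the general shape of such a hand-off, as
a single-threaded userspace model. Every name here (the MODEL_* bits,
model_oom_pending, model_oom_helper_run_once) is hypothetical, and a
real kthread would sleep and be woken rather than be called directly.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's definitions. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS    MODEL_GFP_IO

/* Hypothetical request flag: set by a NOFS caller, consumed by the helper. */
static bool model_oom_pending;
/* Counts how often a victim would have been selected. */
static int model_victims_killed;

static bool model_out_of_memory(unsigned int gfp_mask)
{
	if (gfp_mask && !(gfp_mask & MODEL_GFP_FS)) {
		/* Assumed behavior: defer to the helper instead of doing nothing. */
		model_oom_pending = true;
		return true;
	}

	/* Simplification: "select and kill a process" is just a counter. */
	model_victims_killed++;
	return true;
}

/* One iteration of the hypothetical helper thread's loop. */
static void model_oom_helper_run_once(void)
{
	if (!model_oom_pending)
		return;

	model_oom_pending = false;
	/* The helper is not in a NOFS scope, so it may pass the FS bit. */
	model_out_of_memory(MODEL_GFP_KERNEL);
}
```

In this model the existing early return stays in place for the NOFS
caller, and the victim is selected from the helper's context on its
next iteration.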

Thanks.

Hui Wang (1):
  mm/oom_kill: trigger the oom killer if oom occurs without __GFP_FS

 mm/oom_kill.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

-- 
2.34.1




