[RFC PATCH 0/2] mm: fix OOMs for binding workloads to movable zone only node

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

This patchset tries to report a problem and get suggestion/review
for the RFC fix patches.

We recently got a OOM report, that when user try to bind a docker(container)
instance to a memory node which only has movable zones, and OOM killing
still can't solve the page allocation failure.

The callstack was:

	[ 1387.877565] runc:[2:INIT] invoked oom-killer: gfp_mask=0x500cc2(GFP_HIGHUSER|__GFP_ACCOUNT), order=0, oom_score_adj=0
	[ 1387.877568] CPU: 8 PID: 8291 Comm: runc:[2:INIT] Tainted: G        W I E     5.8.2-0.g71b519a-default #1 openSUSE Tumbleweed (unreleased)
	[ 1387.877569] Hardware name: Dell Inc. PowerEdge R640/0PHYDR, BIOS 2.6.4 04/09/2020
	[ 1387.877570] Call Trace:
	[ 1387.877579]  dump_stack+0x6b/0x88
	[ 1387.877584]  dump_header+0x4a/0x1e2
	[ 1387.877586]  oom_kill_process.cold+0xb/0x10
	[ 1387.877588]  out_of_memory.part.0+0xaf/0x230
	[ 1387.877591]  out_of_memory+0x3d/0x80
	[ 1387.877595]  __alloc_pages_slowpath.constprop.0+0x954/0xa20
	[ 1387.877599]  __alloc_pages_nodemask+0x2d3/0x300
	[ 1387.877602]  pipe_write+0x322/0x590
	[ 1387.877607]  new_sync_write+0x196/0x1b0
	[ 1387.877609]  vfs_write+0x1c3/0x1f0
	[ 1387.877611]  ksys_write+0xa7/0xe0
	[ 1387.877617]  do_syscall_64+0x52/0xd0
	[ 1387.877621]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

The meminfo log only shows the movable only node, which has plenty
of free memory. And in our reproducing with 1/2 patch, the normal
node (has DMA/DMA32/Normal) also has lot of free memory when OOM
happens. 

If we hack to make this (GFP_HIGHUSER|__GFP_ACCOUNT) request get
a page, and following full docker run (like installing and running
'stress-ng' stress test) will see more allocation failures due to
different kinds of request(gfp_masks). And the 2/2 patch will detect
such cases that the allowed target nodes only have movable zones
and loose the binding check, otherwise it will trigger OOM while
the OOM won't do any help, as the problem is not lack of free memory.

Feng Tang (2):
  mm, oom: dump meminfo for all memory nodes
  mm, page_alloc: loose the node binding check to avoid helpless oom
    killing

 mm/oom_kill.c   |  2 +-
 mm/page_alloc.c | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

-- 
2.7.4





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux