Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve

Michal Hocko <mhocko@xxxxxxxxxx> · Mon, 24 Jun 2019 16:01:20 +0200

On Mon 24-06-19 21:11:55, zhong jiang wrote:
> [  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
> [  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
> [  653.452827] Mem-Info:
> [  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
> [  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
> [  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
> [  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
> [  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
> [  653.466390]  free:1086 free_pcp:85 free_cma:0
> [  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [  654.177121] lowmem_reserve[]: 0 0 0
> [  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
> [  654.601093] 10132 total pagecache pages
> [  654.606655] 63232 pages RAM
[...]
> >> When a process is created, the kernel stack needs a higher-order allocation of contiguous memory.
> >> Due to fragmentation, we fail to allocate that memory. And with free memory this low,
> >> compaction hardly makes progress. Hence, it is easy to reproduce the oom.
> > How do you get such a large fragmentation that you cannot allocate
> > order-1 pages and compaction is not making any progress?
> From the above oom report, we can see that there are no order-2 pages. It is hard to allocate the kernel stack when
> creating a process. And we can easily reproduce the situation when running some userspace programs.
> 
> But it rarely triggers the oom when highatomic has not been introduced. We tested that on kernel 3.10.

I do not really see how highatomic reserves could make any difference.
We do drain them before OOM killer is invoked. The above oom report
confirms that there is indeed no order-3+ free page to be used.
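
To make that reading of the report concrete, here is a minimal sketch that
parses the quoted "Normal:" buddy line and checks which orders still have
free blocks. The helper names (parse_buddy_line, can_satisfy) are
hypothetical, and the 4kB base page size is an assumption taken from the
smallest bucket in the report. It ignores watermarks and migratetypes, so
it only answers "does a free block of this order exist at all".

```python
import re

# Buddy free-list line copied from the quoted OOM report (Normal zone).
BUDDY_LINE = (
    "Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB "
    "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB"
)

PAGE_KB = 4  # assumption: 4kB base pages, matching the order-0 bucket


def parse_buddy_line(line):
    """Return {order: free_block_count} for one zone's free-list line."""
    free = {}
    # Matches "<count>*<size>kB"; the trailing "= 4368kB" has no '*'.
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        free[order] = int(count)
    return free


def can_satisfy(free, order):
    """True if any free block of at least the requested order exists."""
    return any(n > 0 for o, n in free.items() if o >= order)


free = parse_buddy_line(BUDDY_LINE)
total_kb = sum(n * (PAGE_KB << o) for o, n in free.items())

print("total free kB:", total_kb)
print("order-2 available:", can_satisfy(free, 2))
print("order-3 available:", can_satisfy(free, 3))
```

With 4kB pages an order-3 request is 2^3 * 4kB = 32kB of contiguous
memory, and the largest non-empty bucket in the report is 16kB (order 2).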

It is hard to tell whether compaction has done all it could but there
have been many changes in this area since 4.4 so I would be really curious
about the current upstream kernel behavior. I would also note that
relying on order-3 allocation is far from optimal. I am not sure what
exactly copy_process.part.2+0xe4 refers to but if this is really a stack
allocation then I would consider such a large stack really dangerous for
a small system.
-- 
Michal Hocko
SUSE Labs