Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve

Michal Hocko <mhocko@xxxxxxxxxx> · Tue, 25 Jun 2019 12:36:50 +0200



On Tue 25-06-19 10:52:17, zhong jiang wrote:
> On 2019/6/25 1:54, Michal Hocko wrote:
> > On Tue 25-06-19 00:47:11, zhong jiang wrote:
> >> On 2019/6/24 22:01, Michal Hocko wrote:
> >>> On Mon 24-06-19 21:11:55, zhong jiang wrote:
> >>>> [  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
> >>>> [  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
> >>>> [  653.452827] Mem-Info:
> >>>> [  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
> >>>> [  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
> >>>> [  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
> >>>> [  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
> >>>> [  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
> >>>> [  653.466390]  free:1086 free_pcp:85 free_cma:0
> >>>> [  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> >>>> [  654.177121] lowmem_reserve[]: 0 0 0
> >>>> [  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
> >>>> [  654.601093] 10132 total pagecache pages
> >>>> [  654.606655] 63232 pages RAM
> >>> [...]
> >>>>>> When a process is created, its kernel stack needs a higher-order allocation of contiguous memory.
> >>>>>> Due to fragmentation, that allocation fails. And with memory this low, compaction
> >>>>>> hardly makes progress. Hence, the oom is easy to reproduce.
> >>>>> How do you get such a large fragmentation that you cannot allocate
> >>>>> order-1 pages and compaction is not making any progress?
> >>>> From the above oom report, we can see that there are no order-2 pages. It is hard to allocate a kernel stack when
> >>>> creating a process. And we can easily reproduce the situation when running some userspace programs.
> >>>>
> >>>> But the oom is rarely triggered on a kernel that has not introduced highatomic. We tested that on kernel 3.10.
> >>> I do not really see how highatomic reserves could make any difference.
> >>> We do drain them before OOM killer is invoked. The above oom report
> >>> confirms that there is indeed no order-3+ free page to be used.
> >> I mean that every order with migrate_highatomic is always zero, it can be true that
> > Yes, highatomic is meant to be used for higher order allocations which
> > already do have access to memory reserves. E.g. via __GFP_ATOMIC.
> If the current kernel does not use __GFP_ATOMIC to allocate memory, highatomic will have no higher-order pages available.
> And we have an order-3 kernel stack allocation requirement in the system.
> 
> There is no memory reserve for us to use in an emergency situation, which is different from migrate_reserve.
> Maybe we can change the reserve behaviour so that it does not only reserve higher orders for GFP_ATOMIC.

Let me repeat. This is unlikely to help for something like the fork code
path, which can be triggered by userspace: no matter how much you
reserve, it can get depleted easily. Your real problem is that this
particular path requires an order-3 allocation.
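
For reference, the per-order free list in the quoted report can be checked
mechanically. The following is only a rough sketch, not kernel code: the
helper names (parse_buddy_line, total_free_kb, can_satisfy) are made up for
illustration, and it assumes 4 kB base pages, as the "752*4kB" entry implies.
It only looks at the raw free lists and ignores watermarks and migrate types.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, so order-0 == 4 kB


def parse_buddy_line(line):
    """Return {order: free_block_count} from a 'Normal: N*SkB ...' line."""
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        # Block size is PAGE_KB << order, so recover the order from the size.
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        counts[order] = int(count)
    return counts


def total_free_kb(counts):
    """Sum the free memory over all orders, in kB."""
    return sum(n * (PAGE_KB << order) for order, n in counts.items())


def can_satisfy(counts, order):
    """True if a free block of at least the requested order exists."""
    return any(n > 0 for o, n in counts.items() if o >= order)


# Free-list line copied from the oom report quoted above.
line = ("Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB "
        "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB")
counts = parse_buddy_line(line)

print(total_free_kb(counts))   # 4368, matches the total in the log line
print(can_satisfy(counts, 2))  # True: 21 free 16kB blocks
print(can_satisfy(counts, 3))  # False: nothing at 32kB or above
```

So the order=3 request in the report has no free block of sufficient size to
draw from, however the smaller orders are accounted.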
-- 
Michal Hocko
SUSE Labs