Re: [PATCH 3/5] oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
> 
> When oom_reaper manages to unmap all the eligible vmas there shouldn't
> be much of the freable memory held by the oom victim left anymore so it
> makes sense to clear the TIF_MEMDIE flag for the victim and allow the
> OOM killer to select another task.
> 
> The lack of TIF_MEMDIE also means that the victim cannot access memory
> reserves anymore but that shouldn't be a problem because it would get
> the access again if it needs to allocate and hits the OOM killer again
> due to the fatal_signal_pending resp. PF_EXITING check. We can safely
> hide the task from the OOM killer because it is clearly not a good
> candidate anymore as everyhing reclaimable has been torn down already.
> 
> This patch will allow to cap the time an OOM victim can keep TIF_MEMDIE
> and thus hold off further global OOM killer actions granted the oom
> reaper is able to take mmap_sem for the associated mm struct. This is
> not guaranteed now but further steps should make sure that mmap_sem
> for write should be blocked killable which will help to reduce such a
> lock contention. This is not done by this patch.

Did you decide what to do if the OOM reaper is unable to take mmap_sem
for the associated mm struct? I can not only disable the OOM reaper but
also disable further global OOM killer actions at the same time because
the OOM reaper does not clear TIF_MEMDIE. I think we need to take action
regardless of whether the OOM reaper was able to reap the associated mm
struct.

The reproducer program and kernel patch is shown at
http://lkml.kernel.org/r/201602252025.ICH57371.FMOtOOFJFQHVSL@xxxxxxxxxxxxxxxxxxx .

Complete log is at http://I-love.SAKURA.ne.jp/tmp/serial-20160225.txt.xz .
---------- console log ----------
[   59.707294] Killed process 2020 (a.out) total-vm:8272kB, anon-rss:4160kB, file-rss:0kB, shmem-rss:0kB
[   59.714851] oom_reaper: reaped process 2020 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0lB
[   59.739672] Out of memory: Kill process 1562 (a.out) score 1003 or sacrifice child
[   59.746018] Killed process 1562 (a.out) total-vm:8268kB, anon-rss:4024kB, file-rss:0kB, shmem-rss:0kB
[   60.003576] MemAlloc-Info: stalling=880 dying=42 exiting=0 victim=1 oom_count=113/49
(...snipped...)
[   60.757062] oom_reaper: unable to reap pid:1562 (a.out)
(...snipped...)
[   60.758417] 1 lock held by a.out/2269:
[   60.758417]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.759217] 1 lock held by a.out/2371:
[   60.759217]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.759732] 1 lock held by a.out/2600:
[   60.759732]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.760394] 1 lock held by a.out/2673:
[   60.760394]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.760528] 1 lock held by a.out/2689:
[   60.760528]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.760803] 1 lock held by a.out/2723:
[   60.760803]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.760884] 1 lock held by a.out/2731:
[   60.760884]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.761381] 1 lock held by a.out/2781:
[   60.761382]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.761466] 1 lock held by a.out/2791:
[   60.761466]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
[   60.761615] 1 lock held by a.out/2803:
[   60.761615]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8117e6aa>] SyS_mremap+0xaa/0x500
(...snipped...)
[  222.428230] MemAlloc-Info: stalling=876 dying=36 exiting=0 victim=1 oom_count=72229/1744
[  232.434607] MemAlloc-Info: stalling=1082 dying=36 exiting=0 victim=1 oom_count=75777/1855
(...snipped...)
[  232.711169] MemAlloc: kswapd0(49) flags=0xa60840 switches=59 uninterruptible
[  232.716820] kswapd0         D ffff880039f935d0     0    49      2 0x00000000
[  232.722494]  ffff880039f935d0 ffff8800366500c0 ffff880039f8c040 ffff880039f94000
[  232.728190]  ffff8800367eac80 ffff8800367eac98 0000000000000000 ffff880039f93820
[  232.733785]  ffff880039f935e8 ffffffff81672d1a ffff880039f8c040 ffff880039f93650
[  232.739694] Call Trace:
[  232.741797]  [<ffffffff81672d1a>] schedule+0x3a/0x90
[  232.745674]  [<ffffffff81676b16>] rwsem_down_read_failed+0xd6/0x140
[  232.750368]  [<ffffffff8132c324>] call_rwsem_down_read_failed+0x14/0x30
[  232.755333]  [<ffffffff8167645d>] ? down_read+0x3d/0x50
[  232.759397]  [<ffffffffa026286b>] ? xfs_log_commit_cil+0x5b/0x490 [xfs]
[  232.764628]  [<ffffffffa026286b>] xfs_log_commit_cil+0x5b/0x490 [xfs]
[  232.769470]  [<ffffffffa025d1e3>] __xfs_trans_commit+0x123/0x230 [xfs]
[  232.774341]  [<ffffffffa025d57b>] xfs_trans_commit+0xb/0x10 [xfs]
[  232.778929]  [<ffffffffa024e394>] xfs_iomap_write_allocate+0x194/0x380 [xfs]
[  232.784237]  [<ffffffffa023b3ed>] xfs_map_blocks+0x13d/0x150 [xfs]
[  232.788860]  [<ffffffffa023c2ab>] xfs_do_writepage+0x15b/0x520 [xfs]
[  232.793646]  [<ffffffffa023c6a6>] xfs_vm_writepage+0x36/0x70 [xfs]
[  232.798328]  [<ffffffff8115773f>] pageout.isra.43+0x18f/0x240
[  232.802677]  [<ffffffff811590a3>] shrink_page_list+0x803/0xae0
[  232.807247]  [<ffffffff81159ae7>] shrink_inactive_list+0x207/0x550
[  232.811912]  [<ffffffff8115a7d6>] shrink_zone_memcg+0x5b6/0x780
[  232.816585]  [<ffffffff811b38ed>] ? mem_cgroup_iter+0x15d/0x7c0
[  232.820923]  [<ffffffff8115aa74>] shrink_zone+0xd4/0x2f0
[  232.824963]  [<ffffffff8115b8fe>] kswapd+0x41e/0x800
[  232.828916]  [<ffffffff8115b4e0>] ? mem_cgroup_shrink_node_zone+0xb0/0xb0
[  232.833835]  [<ffffffff810923ee>] kthread+0xee/0x110
[  232.837628]  [<ffffffff81678572>] ret_from_fork+0x22/0x50
[  232.841640]  [<ffffffff81092300>] ? kthread_create_on_node+0x230/0x230
(...snipped...)
[  247.278695] MemAlloc: a.out(2269) flags=0x400040 switches=2115 seq=2 gfp=0x26084c0(GFP_KERNEL|__GFP_REPEAT|__GFP_ZERO|__GFP_NOTRACK) order=0 delay=181584 uninterruptible
[  247.289757] a.out           D ffff880027aafa48     0  2269   1229 0x00000080
[  247.295134]  ffff880027aafa48 ffff880027a38040 ffff880027aa8100 ffff880027ab0000
[  247.301091]  ffff880027aafa80 ffff88003d610240 00000000ffff2f67 000000000000004c
[  247.306879]  ffff880027aafa60 ffffffff81672d1a ffff88003d610240 ffff880027aafb08
[  247.312590] Call Trace:
[  247.314749]  [<ffffffff81672d1a>] schedule+0x3a/0x90
[  247.319758]  [<ffffffff8167717e>] schedule_timeout+0x11e/0x1c0
[  247.324427]  [<ffffffff810bd056>] ? mark_held_locks+0x66/0x90
[  247.329085]  [<ffffffff810e1270>] ? init_timer_key+0x40/0x40
[  247.333385]  [<ffffffff810e8197>] ? ktime_get+0xa7/0x130
[  247.337469]  [<ffffffff816720c1>] io_schedule_timeout+0xa1/0x110
[  247.342046]  [<ffffffff811650ed>] congestion_wait+0x7d/0xd0
[  247.346570]  [<ffffffff810b73e0>] ? wait_woken+0x80/0x80
[  247.350714]  [<ffffffff8114e584>] __alloc_pages_nodemask+0xd74/0xed0
[  247.355918]  [<ffffffff81198026>] alloc_pages_current+0x96/0x1b0
[  247.361015]  [<ffffffff81062162>] pte_alloc_one+0x12/0x60
[  247.365132]  [<ffffffff811740e9>] __pte_alloc+0x19/0x110
[  247.369162]  [<ffffffff8117e20a>] move_page_tables+0x5da/0x700
[  247.373560]  [<ffffffff8117b00d>] ? copy_vma+0x20d/0x260
[  247.377647]  [<ffffffff8117e41b>] move_vma+0xeb/0x2d0
[  247.381577]  [<ffffffff8117eac8>] SyS_mremap+0x4c8/0x500
[  247.385727]  [<ffffffff8100365d>] do_syscall_64+0x5d/0x180
[  247.389869]  [<ffffffff816783ff>] entry_SYSCALL64_slow_path+0x25/0x25
(...snipped...)
[  275.592260] MemAlloc: a.out(1562) flags=0x400040 switches=17 uninterruptible dying victim
[  275.598299] a.out           D ffff88002e3c7d68     0  1562   1229 0x00100084
[  275.603752]  ffff88002e3c7d68 ffffffff818f9500 ffff88002e3c00c0 ffff88002e3c8000
[  275.609544]  ffff88003d1f74c8 0000000000000246 ffff88002e3c00c0 00000000ffffffff
[  275.615527]  ffff88002e3c7d80 ffffffff81672d1a ffff88003d1f74c0 ffff88002e3c7d90
[  275.621526] Call Trace:
[  275.623707]  [<ffffffff81672d1a>] schedule+0x3a/0x90
[  275.627629]  [<ffffffff81672fa3>] schedule_preempt_disabled+0x13/0x20
[  275.632511]  [<ffffffff81674d3e>] mutex_lock_nested+0x16e/0x430
[  275.636989]  [<ffffffff811d37c3>] ? lock_rename+0xd3/0x100
[  275.641192]  [<ffffffff811d37c3>] lock_rename+0xd3/0x100
[  275.645337]  [<ffffffff811d774d>] SyS_renameat2+0x1ed/0x530
[  275.649788]  [<ffffffff811d7ab9>] SyS_rename+0x19/0x20
[  275.653731]  [<ffffffff8100365d>] do_syscall_64+0x5d/0x180
[  275.657938]  [<ffffffff816783ff>] entry_SYSCALL64_slow_path+0x25/0x25
(...snipped...)
[  385.466224] MemAlloc-Info: stalling=1081 dying=36 exiting=0 victim=1 oom_count=143237/3490
[  385.472491] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
[  385.477667]       Not tainted 4.5.0-rc5-next-20160224+ #75
---------- console log ----------

While number of out_of_memory() calls is increasing over time ( 75777 -> 143237 ),
kswapd being stuck ( http://lkml.kernel.org/r/20160211225929.GU14668@dastard )
might be involved to this problem.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]