On Tue 20-09-16 09:07:43, Michal Hocko wrote:
> [CCing Tetsuo again - please make sure you CC everybody who did respond
> in earlier versions of the patch]

now for real

>
> I am sorry to insist here but this doesn't address the previous review
> feedback. Let me try to show you what I would find much better. I do not
> insist on this precise wording of course but I do insist on mentioning
> the current state and making clear why __GFP_NORETRY is really ok.
>
> On Tue 20-09-16 13:50:13, zhongjiang wrote:
> > From: zhong jiang <zhongjiang@xxxxxxxxxx>
> >
> > I hit the following issue when run a OOM case of the LTP and
> > ksm enable.
>
> "
> I hit the following hung task when running an OOM LTP test case with a
> 4.1 kernel.
> "
>
> >
> > Call trace:
> > [<ffffffc000086a88>] __switch_to+0x74/0x8c
> > [<ffffffc000a1bae0>] __schedule+0x23c/0x7bc
> > [<ffffffc000a1c09c>] schedule+0x3c/0x94
> > [<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350
> > [<ffffffc000a1e32c>] down_write+0x64/0x80
> > [<ffffffc00021f794>] __ksm_exit+0x90/0x19c
> > [<ffffffc0000be650>] mmput+0x118/0x11c
> > [<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74
> > [<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4
> > [<ffffffc0000d0f34>] get_signal+0x444/0x5e0
> > [<ffffffc000089fcc>] do_signal+0x1d8/0x450
> > [<ffffffc00008a35c>] do_notify_resume+0x70/0x78
> >
> > it will leads to a hung task because the exiting task cannot get the
> > mmap sem for write. but the root cause is that the ksmd holds it for
> > read while allocateing memory which just takes ages to complete.
> > and ksmd will loop in the following path.
>
> "
> The oom victim cannot terminate because it needs to take mmap_sem for
> write while the lock is held by ksmd for read which loops in the page
> allocator
>
> ksm_do_scan
>     scan_get_next_rmap_item
>         down_read
>         get_next_rmap_item
>             alloc_rmap_item   #ksmd will loop permanently.
>
> There is no way forward because the oom victim cannot release any
> memory in a 4.1 based kernel. Since 4.6 we have the oom reaper which
> would solve this problem because it would release the memory
> asynchronously. Nevertheless we can relax alloc_rmap_item requirements
> and use __GFP_NORETRY because the allocation failure is acceptable as
> ksm_do_scan would just retry later after the lock got dropped.
>
> Such a patch would also be easy to backport to older stable kernels
> which do not have oom_reaper.
>
> While we are at it add __GFP_NOWARN as the admin doesn't have to be
> alarmed by the allocation failure.
> "
>
> > CC: <stable@xxxxxxxxxxxxxxx>
> > Suggested-by: Hugh Dickins <hughd@xxxxxxxxxx>
> > Suggested-by: Michal Hocko <mhocko@xxxxxxx>
> > Signed-off-by: zhong jiang <zhongjiang@xxxxxxxxxx>
> > ---
> >  mm/ksm.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/ksm.c b/mm/ksm.c
> > index 73d43ba..5048083 100644
> > --- a/mm/ksm.c
> > +++ b/mm/ksm.c
> > @@ -283,7 +283,8 @@ static inline struct rmap_item *alloc_rmap_item(void)
> >  {
> >  	struct rmap_item *rmap_item;
> >
> > -	rmap_item = kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL);
> > +	rmap_item = kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL |
> > +						__GFP_NORETRY | __GFP_NOWARN);
> >  	if (rmap_item)
> >  		ksm_rmap_items++;
> >  	return rmap_item;
> > --
> > 1.8.3.1
>
> --
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs
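To illustrate why a failing allocation is acceptable here, the following is a minimal userspace sketch of the pattern, not kernel code. It only models the effect of GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN: the allocation gives up instead of looping, and the scan pass drops the read lock so that a writer (the exiting oom victim) can make progress. The names struct item, budget, readers, alloc_item_noretry() and scan_once() are hypothetical and do not exist in mm/ksm.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rmap_item; not the kernel layout. */
struct item {
	unsigned long address;
};

/* Hypothetical model of memory pressure: allocations that may still succeed. */
static int budget;

/* Hypothetical model of mmap_sem: number of readers currently holding it. */
static int readers;

/*
 * Fallible allocation: returns NULL immediately once the budget is used up
 * instead of looping until memory appears. This models the effect of
 * __GFP_NORETRY, not the page allocator itself.
 */
static struct item *alloc_item_noretry(void)
{
	assert(readers > 0);	/* in this model the caller holds the lock for read */

	if (budget <= 0)
		return NULL;
	budget--;
	return calloc(1, sizeof(struct item));
}

/*
 * One scan pass: take the lock for read, try to allocate, and drop the lock
 * on both the success and the failure path. Returns true if an item was
 * allocated; on false the caller is expected to simply try again later.
 */
static bool scan_once(void)
{
	struct item *it;

	readers++;			/* models down_read() */
	it = alloc_item_noretry();
	readers--;			/* models up_read() */

	if (!it)
		return false;
	free(it);
	return true;
}
```

In this model a failed pass leaves readers at zero, which is the point of the change: the lock is no longer pinned by a reader stuck in the allocator, and a later pass can succeed once memory has been released.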