On Tue, Jan 02, 2018 at 04:11:17PM -0800, James Bottomley wrote: > On Wed, 2018-01-03 at 08:56 +0900, Minchan Kim wrote: > > On Tue, Jan 02, 2018 at 02:42:21PM -0800, James Bottomley wrote: > > > > > > On Tue, 2018-01-02 at 13:22 -0800, Andrew Morton wrote: > > > > > > > > On Fri, 29 Dec 2017 09:55:07 +0900 Minchan Kim <minchan@xxxxxxxxx > > > > g> > > > > wrote: > > > > > > > > > > > > > > > > > > > James reported a bug of swap paging-in for his testing and > > > > > found it > > > > > at rc5, soon to be -rc5. > > > > > > > > > > Although we can fix the specific problem at the moment, it may > > > > > have other lurkig bugs so want to have one more cycle in -next > > > > > before merging. > > > > > > > > > > This patchset reverts 23c47d2ada9f, 08fa93021d80, 8e31f339295f > > > > > completely > > > > > but 79b5f08fa34e partially because the swp_swap_info function > > > > > that > > > > > 79b5f08fa34e introduced is used by [1]. > > > > > > > > Gets a significant reject in do_swap_page(). Could you please > > > > take a > > > > look, redo against current mainline? > > > > > > > > Or not. We had a bug and James fixed it. That's what -rc is > > > > for. Why not fix the thing and proceed? > > > > > > My main worry was lack of testing at -rc5, since the bug could > > > essentially be excited by pushing pages out to swap and then trying > > > to > > > access them again ... plus since one serious bug was discovered it > > > wouldn't be unusual for there to be others. However, because of > > > the > > > IPT stuff, I think Linus is going to take 4.15 over a couple of > > > extra > > > -rc releases, so this is less of a problem. > > > > Then, Here is right fix patch against current mainline. > > > > > > From 012bdb0774744455ab7aa8abd74c8b9ca1cdc009 Mon Sep 17 00:00:00 > > 2001 > > From: Minchan Kim <minchan@xxxxxxxxxx> > > Date: Wed, 3 Jan 2018 08:25:15 +0900 > > Subject: [PATCH] mm: release locked page in do_swap_page > > > > James reported a bug of swap paging-in for his testing. It is that > > do_swap_page doesn't release locked page so system hang-up happens > > by deadlock of PG_locked. > > > > It was introduced by [1] because I missed swap cache hit places to > > update swapcache variable to work well with other logics against > > swapcache in do_swap_page. > > > > This patch fixes it. > > > > [1] 0bcac06f27d7, mm, swap: skip swapcache for swapin of synchronous > > device > > > > Link: http://lkml.kernel.org/r/<1514407817.4169.4.camel@HansenPartner > > ship.com>; > > Cc: Hugh Dickins <hughd@xxxxxxxxxx> > > Cc: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx> > > Cc: Huang Ying <ying.huang@xxxxxxxxx> > > Debugged-by: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> > > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx> > > --- > > mm/memory.c | 10 ++++++++-- > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/mm/memory.c b/mm/memory.c > > index ca5674cbaff2..793004608332 100644 > > --- a/mm/memory.c > > +++ b/mm/memory.c > > @@ -2857,8 +2857,11 @@ int do_swap_page(struct vm_fault *vmf) > > int ret = 0; > > bool vma_readahead = swap_use_vma_readahead(); > > > > - if (vma_readahead) > > + if (vma_readahead) { > > page = swap_readahead_detect(vmf, &swap_ra); > > + swapcache = page; > > + } > > + > > if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf- > > >orig_pte)) { > > if (page) > > put_page(page); > > @@ -2889,9 +2892,12 @@ int do_swap_page(struct vm_fault *vmf) > > > > > > delayacct_set_flag(DELAYACCT_PF_SWAPIN); > > - if (!page) > > + if (!page) { > > page = lookup_swap_cache(entry, vma_readahead ? vma > > : NULL, > > vmf->address); > > + swapcache = page; > > + } > > + > > I've got to say I prefer my version. The problem with the above is Unfortunately, it was not right fix because with synchronous device, we skip swapcache but your patch set the swapcache variable unconditionally. So, it doesn't work by below logic, IOW, it goes with out_page jump. /* * Make sure try_to_free_swap or reuse_swap_page or swapoff did not * release the swapcache from under us. The page pin, and pte_same * test below, are not enough to exclude that. Even if it is still * swapcache, we need to check that the page's swap has not changed. */ if (unlikely((!PageSwapCache(page) || page_private(page) != entry.val)) && swapcache) goto out_page; > that if something else gets added to this path and forgets to set > swapcache = page you'll get the locked pages problem back. > > Instead of setting swapcache to NULL at the top, don't set it until it > matters, which is just before the second if (!page). It doesn't matter > before this because you're using it as a signal for the synchronous I/O > path, so why have a whole section of code where you invite people to > get it wrong for no benefit. > Yub, it doesn't looks neat. But please see below patch I am planning to send to Andrew for current mmotm. It looks better because in the head of do_swap_page, we set swapcache unconditionally and others logic work well due to patchset vma-based readahead cleanup. If you don't against, I want to go my patch into current mainline and Andrew can apply below patch into current mmots. Thanks. >From f6ac10636d68c6a5011cc2359031bffa056becd2 Mon Sep 17 00:00:00 2001 From: Minchan Kim <minchan@xxxxxxxxxx> Date: Wed, 3 Jan 2018 09:08:25 +0900 Subject: [PATCH] mm for mmots: release locked page in do_swap_page James reported a bug of swap paging-in for his testing. It is that do_swap_page doesn't release locked page so system hang-up happens by deadlock of PG_locked. It was introduced by [1] because I missed swap cache hit places to update swapcache variable to work well with other logics against swapcache. This patch fixes it. [1] 0bcac06f27d7, mm, swap: skip swapcache for swapin of synchronous device Link: http://lkml.kernel.org/r/<1514407817.4169.4.camel@xxxxxxxxxxxxxxxxxxxxx> Cc: Hugh Dickins <hughd@xxxxxxxxxx> Cc: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx> Cc: Huang Ying <ying.huang@xxxxxxxxx> Debugged-by: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx> --- mm/memory.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 36dd3a66aa5a..59be576b024d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2869,7 +2869,7 @@ EXPORT_SYMBOL(unmap_mapping_range); int do_swap_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; - struct page *page = NULL, *swapcache = NULL; + struct page *page = NULL, *swapcache; struct mem_cgroup *memcg; swp_entry_t entry; pte_t pte; @@ -2904,7 +2904,7 @@ int do_swap_page(struct vm_fault *vmf) delayacct_set_flag(DELAYACCT_PF_SWAPIN); - page = lookup_swap_cache(entry, vma, vmf->address); + swapcache = page = lookup_swap_cache(entry, vma, vmf->address); if (!page) { struct swap_info_struct *si = swp_swap_info(entry); -- 2.7.4 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>