On Wed, 17 Jan 2018, Andrew Morton wrote:
> On Wed, 17 Jan 2018 14:33:21 -0800 (PST) Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > On Thu, 28 Dec 2017, James Bottomley wrote:
> > > On Thu, 2017-12-28 at 09:41 -0800, James Bottomley wrote:
> > > > I'd guess that since they're both in io_schedule, the problem is that
> > > > the io_scheduler is taking far too long servicing the requests due to
> > > > some priority issue you've introduced.
> > >
> > > OK, so after some analysis, that turned out to be incorrect. The
> > > problem seems to be that we're exiting do_swap_page() with locked pages
> > > that have been read in from swap.
> > >
> > > Your changelogs are entirely unclear on why you changed the swapcache
> > > setting logic in this patch:
> > >
> > > commit 0bcac06f27d7528591c27ac2b093ccd71c5d0168
> > > Author: Minchan Kim <minchan@xxxxxxxxxx>
> > > Date:   Wed Nov 15 17:33:07 2017 -0800
> > >
> > >     mm, swap: skip swapcache for swapin of synchronous device
> > >
> > > But I think you're using swapcache == NULL as a signal the page came
> > > from a synchronous device. In which case the bug is that you've
> > > forgotten we may already have picked up a page in
> > > swap_readahead_detect() which you're wrongly keeping swapcache == NULL
> > > for and the fix is this (it works on my system, although I'm still
> > > getting an unaccountable shutdown delay).
> > >
> > > I still think we should revert this series, because this may not be the
> > > only bug lurking in the code, so it should go through a lot more
> > > rigorous testing than it has.
> >
> > Andrew, neither the fix below (works for me, though I have seen other
> > swap funniness, most probably unrelated), nor the reversion preferred
> > by James and Minchan (later in this linux-mm thread), was in 4.15-rc8:
> > the sands of time are running out...
>
> Yup. I'm actually planning on sending in this one. OK by you?

Thanks, yes, that looks equivalent to what I've been running with.

>
>
> From: Minchan Kim <minchan@xxxxxxxxxx>
> Subject: mm/memory.c: release locked page in do_swap_page()
>
> James reported a bug in swap paging-in from his testing. It is that
> do_swap_page doesn't release locked page so system hang-up happens due to
> a deadlock on PG_locked.
>
> It was introduced by 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of
> synchronous device") because I missed swap cache hit places to update
> swapcache variable to work well with other logics against swapcache in
> do_swap_page.
>
> This patch fixes it.
>
> Debugged by James Bottomley.
>
> Link: http://lkml.kernel.org/r/<1514407817.4169.4.camel@xxxxxxxxxxxxxxxxxxxxx>
> Link: http://lkml.kernel.org/r/20180102235606.GA19438@bbox
> Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
> Reported-by: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
> Cc: Hugh Dickins <hughd@xxxxxxxxxx>

Acked-by: Hugh Dickins <hughd@xxxxxxxxxx>

> Cc: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> Cc: Huang Ying <ying.huang@xxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
>
>  mm/memory.c |   10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff -puN mm/memory.c~mm-release-locked-page-in-do_swap_page mm/memory.c
> --- a/mm/memory.c~mm-release-locked-page-in-do_swap_page
> +++ a/mm/memory.c
> @@ -2857,8 +2857,11 @@ int do_swap_page(struct vm_fault *vmf)
>  	int ret = 0;
>  	bool vma_readahead = swap_use_vma_readahead();
>
> -	if (vma_readahead)
> +	if (vma_readahead) {
>  		page = swap_readahead_detect(vmf, &swap_ra);
> +		swapcache = page;
> +	}
> +
>  	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) {
>  		if (page)
>  			put_page(page);
> @@ -2889,9 +2892,12 @@ int do_swap_page(struct vm_fault *vmf)
>
>
>  	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> -	if (!page)
> +	if (!page) {
>  		page = lookup_swap_cache(entry, vma_readahead ? vma : NULL,
>  					 vmf->address);
> +		swapcache = page;
> +	}
> +
>  	if (!page) {
>  		struct swap_info_struct *si = swp_swap_info(entry);
>
> _
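
For anyone following the thread without mm/memory.c open, here is a rough userspace model of the failure mode as James describes it: swapcache == NULL is read as "this page bypassed the swap cache", so a swap cache hit that never updates swapcache can leave the page locked. Everything in it is hypothetical (struct toy_page, toy_swapin_fault() and the track_hit flag are made up for illustration), and the cleanup rule is an assumed simplification, not the actual logic in do_swap_page().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the lock bit matters here. */
struct toy_page {
	bool locked;
};

/*
 * Hypothetical model of a swap-in fault that hits the swap cache.
 *
 * page      - the page found by the swap cache lookup
 * track_hit - true models the patched code (swapcache = page on a hit),
 *             false models the regression (swapcache left NULL)
 *
 * Returns true if the page is unlocked when the fault path returns.
 */
static bool toy_swapin_fault(struct toy_page *page, bool track_hit)
{
	struct toy_page *swapcache = NULL;

	assert(page != NULL);

	/* Swap cache hit: the patch records it in swapcache. */
	if (track_hit)
		swapcache = page;

	/* The fault path locks the page while it works on it. */
	page->locked = true;

	/*
	 * Assumed simplification: release is gated on swapcache, and
	 * NULL is taken to mean "page skipped the swap cache".
	 */
	if (swapcache)
		page->locked = false;

	return !page->locked;
}
```

Calling toy_swapin_fault(&page, false) leaves page.locked set, which in this model stands in for the stuck PG_locked that the next faulter would wait on; with track_hit true the page comes back unlocked.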