On Wed, 17 Jan 2018 14:33:21 -0800 (PST) Hugh Dickins <hughd@xxxxxxxxxx> wrote:

> On Thu, 28 Dec 2017, James Bottomley wrote:
> > On Thu, 2017-12-28 at 09:41 -0800, James Bottomley wrote:
> > > I'd guess that since they're both in io_schedule, the problem is that
> > > the io_scheduler is taking far too long servicing the requests due to
> > > some priority issue you've introduced.
> >
> > OK, so after some analysis, that turned out to be incorrect.  The
> > problem seems to be that we're exiting do_swap_page() with locked pages
> > that have been read in from swap.
> >
> > Your changelogs are entirely unclear on why you changed the swapcache
> > setting logic in this patch:
> >
> > commit 0bcac06f27d7528591c27ac2b093ccd71c5d0168
> > Author: Minchan Kim <minchan@xxxxxxxxxx>
> > Date:   Wed Nov 15 17:33:07 2017 -0800
> >
> >     mm, swap: skip swapcache for swapin of synchronous device
> >
> > But I think you're using swapcache == NULL as a signal the page came
> > from a synchronous device.  In which case the bug is that you've
> > forgotten we may already have picked up a page in
> > swap_readahead_detect() which you're wrongly keeping swapcache == NULL
> > for and the fix is this (it works on my system, although I'm still
> > getting an unaccountable shutdown delay).
> >
> > I still think we should revert this series, because this may not be the
> > only bug lurking in the code, so it should go through a lot more
> > rigorous testing than it has.
>
> Andrew, neither the fix below (works for me, though I have seen other
> swap funniness, most probably unrelated), nor the reversion preferred
> by James and Minchan (later in this linux-mm thread), was in 4.15-rc8:
> the sands of time are running out...

Yup.  I'm actually planning on sending in this one.  OK by you?


From: Minchan Kim <minchan@xxxxxxxxxx>
Subject: mm/memory.c: release locked page in do_swap_page()

James reported a bug in swap paging-in from his testing.
The bug is that do_swap_page() doesn't release a locked page, so the system
hangs due to a deadlock on PG_locked.  It was introduced by 0bcac06f27d7
("mm, swap: skip swapcache for swapin of synchronous device") because I
missed the swap cache hit sites, which must update the swapcache variable
for the other swapcache logic in do_swap_page() to work.

This patch fixes it.

Debugged by James Bottomley.

Link: http://lkml.kernel.org/r/<1514407817.4169.4.camel@xxxxxxxxxxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/20180102235606.GA19438@bbox
Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
Reported-by: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
Cc: Huang Ying <ying.huang@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memory.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff -puN mm/memory.c~mm-release-locked-page-in-do_swap_page mm/memory.c
--- a/mm/memory.c~mm-release-locked-page-in-do_swap_page
+++ a/mm/memory.c
@@ -2857,8 +2857,11 @@ int do_swap_page(struct vm_fault *vmf)
 	int ret = 0;
 	bool vma_readahead = swap_use_vma_readahead();
 
-	if (vma_readahead)
+	if (vma_readahead) {
 		page = swap_readahead_detect(vmf, &swap_ra);
+		swapcache = page;
+	}
+
 	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) {
 		if (page)
 			put_page(page);
@@ -2889,9 +2892,12 @@ int do_swap_page(struct vm_fault *vmf)
 
 
 	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
-	if (!page)
+	if (!page) {
 		page = lookup_swap_cache(entry, vma_readahead ? vma : NULL,
 					 vmf->address);
+		swapcache = page;
+	}
+
 	if (!page) {
 		struct swap_info_struct *si = swp_swap_info(entry);
 
_
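
For anyone skimming the thread, the invariant the patch restores can be
sketched in user space.  This is only a rough model, not kernel code:
struct page_model, lookup_model(), swapin_model() and
treated_as_swapcache() are hypothetical names made up for illustration,
and the "fixed" flag merely stands in for the two added
"swapcache = page;" assignments.  It assumes, as James suggests above,
that later code reads swapcache == NULL as "this page did not come from
the swap cache".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct page_model {
	int id;
};

/* What the model hands to the later "swapcache" checks. */
struct swapin_result {
	struct page_model *page;
	struct page_model *swapcache;
};

/* Hypothetical lookup: returns the cached page on a hit, NULL on a miss. */
static struct page_model *lookup_model(struct page_model *cached)
{
	return cached;
}

/*
 * Model of the lookup step.  With fixed == true, swapcache is updated
 * at the cache-hit site, as the patch does.  With fixed == false it
 * stays NULL even on a hit, which is the reported bug.
 */
static struct swapin_result swapin_model(struct page_model *cached, bool fixed)
{
	/* Stand-in for a page read in without going through the swap cache. */
	static struct page_model fresh = { .id = -1 };
	struct page_model *page;
	struct page_model *swapcache = NULL;

	page = lookup_model(cached);
	if (fixed)
		swapcache = page;	/* the assignment the patch adds */

	if (!page)
		page = &fresh;		/* miss: swapcache stays NULL */

	return (struct swapin_result){ .page = page, .swapcache = swapcache };
}

/* Assumed reading of the later logic: NULL means "not from swap cache". */
static bool treated_as_swapcache(struct swapin_result r)
{
	return r.swapcache != NULL;
}

/* Small self-check covering hit and miss with and without the fix. */
static void model_self_check(void)
{
	struct page_model cached = { .id = 1 };

	/* Hit with the fix: the page is recognised as a swap cache page. */
	assert(treated_as_swapcache(swapin_model(&cached, true)));
	/* Hit without the fix: the page is misclassified. */
	assert(!treated_as_swapcache(swapin_model(&cached, false)));
	/* Miss: swapcache is NULL either way. */
	assert(!treated_as_swapcache(swapin_model(NULL, true)));
	assert(!treated_as_swapcache(swapin_model(NULL, false)));
}
```

Calling model_self_check() from a small main() exercises all four cases.
The only case where the two variants differ is the cache hit, which is
where the patch adds its assignments.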