On Wed 06-11-19 14:45:43, Robert Stupp wrote:
> On Wed, 2019-11-06 at 13:03 +0100, Jan Kara wrote:
> > On Tue 05-11-19 13:22:11, Johannes Weiner wrote:
> > > What I don't quite understand yet is why the fault path doesn't make
> > > progress eventually. We must drop the mmap_sem without changing the
> > > state in any way. How can we keep looping on the same page?
> >
> > That may be a slight suboptimality with Josef's patches. If the page
> > is marked as PageReadahead, we always drop mmap_sem if we can and start
> > readahead without checking whether that makes sense or not in
> > do_async_mmap_readahead(). OTOH page_cache_async_readahead() then
> > clears PageReadahead, so the only way I can see we could loop like this
> > is when file->ra->ra_pages is 0. Not sure if that's what's happening
> > though. We'd need to find which of the paths in filemap_fault() calls
> > maybe_unlock_mmap_for_io() to tell more.
>
> Yes, ra_pages==0

BTW, the attached patch should work around your problem as well. But that's
just a performance optimization that happens to paper over your problem. The
real fix is the proper handling of fault retry, as you did it.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
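For illustration, here is a stand-alone toy model of the loop. This is not
the actual kernel code: ra_state, toy_page, async_readahead() and
fault_needs_retry() are made-up stand-ins for file_ra_state, the page flag,
page_cache_async_readahead() and the filemap_fault() retry path. It only
demonstrates the mechanism described above: with ra_pages == 0 the
readahead call bails out before it would clear PG_readahead, so every
retried fault sees the flag set again and drops mmap_sem once more.

#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for file_ra_state and PG_readahead */
struct ra_state { unsigned long ra_pages; };
struct toy_page { bool readahead; };

/* Paraphrase of page_cache_async_readahead() */
static void async_readahead(struct ra_state *ra, struct toy_page *page)
{
	if (!ra->ra_pages)
		return;			/* returns before clearing the flag */
	page->readahead = false;	/* ClearPageReadahead() */
	/* ...would submit readahead IO here... */
}

/* Paraphrase of the filemap_fault() async readahead path */
static bool fault_needs_retry(struct ra_state *ra, struct toy_page *page)
{
	if (page->readahead) {
		/* maybe_unlock_mmap_for_io(): drop mmap_sem, retry fault */
		async_readahead(ra, page);
		return true;
	}
	return false;			/* page gets mapped, fault completes */
}

int main(void)
{
	struct ra_state ra = { .ra_pages = 0 };	/* readahead disabled */
	struct toy_page page = { .readahead = true };
	int tries = 0;

	while (fault_needs_retry(&ra, &page) && tries < 5)
		tries++;
	printf("still retrying after %d attempts, PG_readahead=%d\n",
	       tries, page.readahead);
	return 0;
}

With the patch below applied, the equivalent of the retry check tests
ra_pages before dropping mmap_sem, so the fault completes on the first try.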
From e12c861e687364ff5a891f0ae90283b384d74197 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@xxxxxxx>
Date: Wed, 6 Nov 2019 15:30:26 +0100
Subject: [PATCH] mm: Don't bother dropping mmap_sem for zero size readahead

When handling a page fault, we drop mmap_sem to start async readahead so
that we don't block on IO submission with mmap_sem held. However there's
no point to drop mmap_sem in case readahead is disabled. Handle that case
to avoid pointless dropping of mmap_sem and retrying the fault.

Signed-off-by: Jan Kara <jack@xxxxxxx>
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 1146fcfa3215..3d39c437b07e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2458,7 +2458,7 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
 	pgoff_t offset = vmf->pgoff;
 
 	/* If we don't want any read-ahead, don't bother */
-	if (vmf->vma->vm_flags & VM_RAND_READ)
+	if (vmf->vma->vm_flags & VM_RAND_READ || !ra->ra_pages)
 		return fpin;
 	if (ra->mmap_miss > 0)
 		ra->mmap_miss--;
-- 
2.16.4