On Thu, Jun 18, 2020 at 5:03 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Thu, Jun 18, 2020 at 02:46:03PM +0200, Andreas Gruenbacher wrote:
> > On Wed, Jun 17, 2020 at 4:22 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > On Wed, Jun 17, 2020 at 02:57:14AM +0200, Andreas Grünbacher wrote:
> > > > Right, the approach from the following thread might fix this:
> > > >
> > > > https://lore.kernel.org/linux-fsdevel/20191122235324.17245-1-agruenba@xxxxxxxxxx/T/#t
> > >
> > > In general, I think this is a sound approach.
> > >
> > > Specifically, I think FAULT_FLAG_CACHED can go away.  map_pages()
> > > will bring in the pages which are in the page cache, so when we get to
> > > gfs2_fault(), we know there's a reason to acquire the glock.
> >
> > We'd still be grabbing a glock while holding a dependent page lock.
> > Another process could be holding the glock and could try to grab the
> > same page lock (i.e., a concurrent writer), leading to the same kind
> > of deadlock.
>
> What I'm saying is that gfs2_fault should just be:
>
> +static vm_fault_t gfs2_fault(struct vm_fault *vmf)
> +{
> +	struct inode *inode = file_inode(vmf->vma->vm_file);
> +	struct gfs2_inode *ip = GFS2_I(inode);
> +	struct gfs2_holder gh;
> +	vm_fault_t ret;
> +	int err;
> +
> +	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
> +	err = gfs2_glock_nq(&gh);
> +	if (err) {
> +		ret = block_page_mkwrite_return(err);
> +		goto out_uninit;
> +	}
> +	ret = filemap_fault(vmf);
> +	gfs2_glock_dq(&gh);
> +out_uninit:
> +	gfs2_holder_uninit(&gh);
> +	return ret;
> +}
>
> because by the time gfs2_fault() is called, map_pages() has already been
> called and has failed to insert the necessary page, so we should just
> acquire the glock now instead of trying again to look for the page in
> the page cache.

Okay, that's great.

Thanks,
Andreas