Re: [RFC] Bypass filesystems for reading cached pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 02, 2020 at 05:16:43PM +0200, Andreas Gruenbacher wrote:
> On Wed, Jun 24, 2020 at 2:35 PM Andreas Gruenbacher <agruenba@xxxxxxxxxx> wrote:
> > On Mon, Jun 22, 2020 at 8:13 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > On Mon, Jun 22, 2020 at 04:35:05PM +0200, Andreas Gruenbacher wrote:
> > > > I'm fine with not moving that functionality into the VFS. The problem
> > > > I have in gfs2 is that taking glocks is really expensive. Part of that
> > > > overhead is accidental, but we definitely won't be able to fix it in
> > > > the short term. So something like the IOCB_CACHED flag that prevents
> > > > generic_file_read_iter from issuing readahead I/O would save the day
> > > > for us. Does that idea stand a chance?
> > >
> > > For the short-term fix, is switching to a trylock in gfs2_readahead()
> > > acceptable?
> >
> > Well, it's the only thing we can do for now, right?
> 
> It turns out that gfs2 can still deadlock with a trylock in
> gfs2_readahead, just differently: in this instance, gfs2_glock_nq will
> call inode_dio_wait. When there is pending direct I/O, we'll end up
> waiting for iomap_dio_complete, which will call
> invalidate_inode_pages2_range, which will try to lock the pages
> already locked for gfs2_readahead.

That seems like a bug in trylock.  If I trylock something I'm not
expecting it to wait; i'm expecting it to fail because it would have to wait.
ie something like this:

diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index c84887769b5a..97ca8f5ed22b 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -470,7 +470,10 @@ static int inode_go_lock(struct gfs2_holder *gh)
                        return error;
        }
 
-       if (gh->gh_state != LM_ST_DEFERRED)
+       if (gh->gh_flags & LM_FLAG_TRY) {
+               if (atomic_read(&ip->i_inode.i_dio_count))
+                       return -EWOULDBLOCK;
+       } else if (gh->gh_state != LM_ST_DEFERRED)
                inode_dio_wait(&ip->i_inode);
 
        if ((ip->i_diskflags & GFS2_DIF_TRUNC_IN_PROG) &&

... but probably not exactly that because I didn't try to figure out the
calling conventions or whether I should set some state in the gfs2_holder.

> This late in the 5.8 release cycle, I'd like to propose converting
> gfs2 back to use mpage_readpages. This requires reinstating
> mpage_readpages, but it's otherwise relatively trivial.
> We can then introduce an IOCB_CACHED or equivalent flag, fix the
> locking order in gfs2, convert gfs2 to mpage_readahead, and finally
> remove mage_readpages in 5.9.

I would rather not do that.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux