Re: [PATCH] ceph: request Fw caps before updating the mtime in ceph_write_iter

On Fri, 2021-08-20 at 13:16 +0800, Yan, Zheng wrote:
> On Wed, Aug 11, 2021 at 7:24 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > The current code will update the mtime and then try to get caps to
> > handle the write. If we end up having to request caps from the MDS, then
> > the mtime in the cap grant will clobber the updated mtime and it'll be
> > lost.
> > 
> > This is most noticeable when two clients are alternately writing to the
> > same file. Fw caps are continually being granted and revoked, and the
> > mtime ends up stuck because the updated mtimes are always being
> > overwritten with the old one.
> > 
> > Fix this by changing the order of operations in ceph_write_iter. Get the
> > caps much earlier, and only update the times afterward. Also, make sure
> > we check the NEARFULL conditions before making any changes to the inode.
> > 
> > URL: https://tracker.ceph.com/issues/46574
> > Reported-by: Jozef Kováč <kovac@xxxxxxxxxxxxxxx>
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> >  fs/ceph/file.c | 34 +++++++++++++++++-----------------
> >  1 file changed, 17 insertions(+), 17 deletions(-)
> > 
> > diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> > index f55ca2c4c7de..5867acfc6a51 100644
> > --- a/fs/ceph/file.c
> > +++ b/fs/ceph/file.c
> > @@ -1722,22 +1722,6 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >                 goto out;
> >         }
> > 
> > -       err = file_remove_privs(file);
> > -       if (err)
> > -               goto out;
> > -
> > -       err = file_update_time(file);
> > -       if (err)
> > -               goto out;
> > -
> > -       inode_inc_iversion_raw(inode);
> > -
> > -       if (ci->i_inline_version != CEPH_INLINE_NONE) {
> > -               err = ceph_uninline_data(file, NULL);
> > -               if (err < 0)
> > -                       goto out;
> > -       }
> > -
> >         down_read(&osdc->lock);
> >         map_flags = osdc->osdmap->flags;
> >         pool_flags = ceph_pg_pool_flags(osdc->osdmap, ci->i_layout.pool_id);
> > @@ -1748,6 +1732,12 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >                 goto out;
> >         }
> > 
> > +       if (ci->i_inline_version != CEPH_INLINE_NONE) {
> > +               err = ceph_uninline_data(file, NULL);
> > +               if (err < 0)
> > +                       goto out;
> > +       }
> > +
> >         dout("aio_write %p %llx.%llx %llu~%zd getting caps. i_size %llu\n",
> >              inode, ceph_vinop(inode), pos, count, i_size_read(inode));
> >         if (fi->fmode & CEPH_FILE_MODE_LAZY)
> > @@ -1759,6 +1749,16 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >         if (err < 0)
> >                 goto out;
> > 
> > +       err = file_remove_privs(file);
> > +       if (err)
> > +               goto out_caps;
> 
> This may send a setattr request to the MDS. Holding caps here may cause a deadlock.
> 

Thanks, Zheng -- good point. I guess we can move this call to before the
cap acquisition. I'll test that out and send a v3.
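
For anyone following along, here is a rough user-space model of why the
ordering matters. It is only a sketch: every name in it (model_inode,
model_get_caps, model_update_time, model_write_old, model_write_new) is
hypothetical and not a kernel API, and it assumes a cap grant from the MDS
simply overwrites the client's mtime, as the patch description says.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in fs/ceph. */
struct model_inode {
	long mtime;      /* client's current view of the mtime */
	long mds_mtime;  /* mtime the MDS would carry in a cap grant */
	bool have_fw;    /* true if Fw caps are already held */
};

/*
 * Model of cap acquisition. If caps must be requested from the MDS,
 * the grant carries the MDS's (older) mtime and clobbers the local one.
 */
static void model_get_caps(struct model_inode *in)
{
	if (!in->have_fw) {
		in->mtime = in->mds_mtime;
		in->have_fw = true;
	}
}

/* Stand-in for the effect of file_update_time(): stamp the mtime. */
static void model_update_time(struct model_inode *in, long now)
{
	in->mtime = now;
}

/* Old order: update the mtime first, then get caps. */
static long model_write_old(struct model_inode *in, long now)
{
	model_update_time(in, now);
	model_get_caps(in);          /* grant overwrites the new mtime */
	return in->mtime;
}

/*
 * Reordered: anything that may talk to the MDS (such as the
 * file_remove_privs() call discussed above) would run before this
 * point, then caps are taken, and only then is the mtime updated.
 */
static long model_write_new(struct model_inode *in, long now)
{
	model_get_caps(in);
	model_update_time(in, now);  /* update survives; caps already held */
	return in->mtime;
}
```

Starting from an inode with mtime 100 and no Fw caps, a write at time 200
leaves the mtime at 100 with model_write_old() and at 200 with
model_write_new(). The model says nothing about the deadlock itself; it only
shows the clobbering that the reordering is meant to avoid.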

> > +
> > +       err = file_update_time(file);
> > +       if (err)
> > +               goto out_caps;
> > +
> > +       inode_inc_iversion_raw(inode);
> > +
> >         dout("aio_write %p %llx.%llx %llu~%zd got cap refs on %s\n",
> >              inode, ceph_vinop(inode), pos, count, ceph_cap_string(got));
> > 
> > @@ -1822,7 +1822,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >                 if (ceph_quota_is_max_bytes_approaching(inode, iocb->ki_pos))
> >                         ceph_check_caps(ci, 0, NULL);
> >         }
> > -
> > +out_caps:
> >         dout("aio_write %p %llx.%llx %llu~%u  dropping cap refs on %s\n",
> >              inode, ceph_vinop(inode), pos, (unsigned)count,
> >              ceph_cap_string(got));
> > --
> > 2.31.1
> > 

-- 
Jeff Layton <jlayton@xxxxxxxxxx>



