Re: [PATCH] ceph: fix error handling in ceph_sync_write

On Thu, 2022-08-25 at 10:32 +0200, Ilya Dryomov wrote:
> On Wed, Aug 24, 2022 at 10:53 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > ceph_sync_write has assumed that a zero result in req->r_result means
> > success. Testing with a recent cluster, however, shows the OSD
> > returning the non-zero length written here. I'm not sure whether or
> > when this changed, but fix the code to accept either result.
> > 
> > Assume that a negative result means an error and that anything else
> > indicates success. If we're given a short length, return a short write.
> > 
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> >  fs/ceph/file.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> > index 86265713a743..c0b2c8968be9 100644
> > --- a/fs/ceph/file.c
> > +++ b/fs/ceph/file.c
> > @@ -1632,11 +1632,19 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
> >                                           req->r_end_latency, len, ret);
> >  out:
> >                 ceph_osdc_put_request(req);
> > -               if (ret != 0) {
> > +               if (ret < 0) {
> >                         ceph_set_error_write(ci);
> >                         break;
> >                 }
> > 
> > +               /*
> > +                * FIXME: it's unclear whether all OSD versions return the
> > +                * length written on a write. For now, assume that a 0 return
> > +                * means that everything got written.
> > +                */
> > +               if (ret && ret < len)
> > +                       len = ret;
> > +
> >                 ceph_clear_error_write(ci);
> >                 pos += len;
> >                 written += len;
> > --
> > 2.37.2
> > 
> 
> Hi Jeff,
> 
> AFAIK OSDs aren't allowed to return any kind of length on a write
> and there is no such thing as a short write.  This definitely needs
> deeper investigation.
> 
> What is the cluster version you are testing against?
> 

That's what I had thought too, but I wasn't sure:

    [ceph: root@quad1 /]# ceph --version
    ceph version 17.0.0-14400-gf61b38dc (f61b38dc82e94f14e7a0a5f6a5888c0c78fafa6c) quincy (dev)

I'll also see if I can confirm that this is coming from the OSD and not
from some other layer.
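
Here's roughly what I have in mind for confirming that (a rough sketch,
not tested; the dout message is made up for illustration). It just dumps
the raw r_result right after the wait in ceph_sync_write, before we
translate it into ret:

	ret = ceph_osdc_wait_request(&fsc->client->osdc, req);
	/* hypothetical debug print: show the raw OSD reply code */
	dout("sync_write: raw r_result=%d len=%zu\n", req->r_result, len);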
-- 
Jeff Layton <jlayton@xxxxxxxxxx>
