Re: [PATCH] ceph: fix error handling in ceph_sync_write

On Thu, Aug 25, 2022 at 10:32:56AM +0200, Ilya Dryomov wrote:
> On Wed, Aug 24, 2022 at 10:53 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >
> > ceph_sync_write has assumed that a zero result in req->r_result means
> > success. Testing with a recent cluster however shows the OSD returning
> > a non-zero length written here. I'm not sure whether and when this
> > changed, but fix the code to accept either result.
> >
> > Assume a negative result means error, and anything else is a success. If
> > we're given a short length, then return a short write.
> >
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> >  fs/ceph/file.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> > index 86265713a743..c0b2c8968be9 100644
> > --- a/fs/ceph/file.c
> > +++ b/fs/ceph/file.c
> > @@ -1632,11 +1632,19 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
> >                                           req->r_end_latency, len, ret);
> >  out:
> >                 ceph_osdc_put_request(req);
> > -               if (ret != 0) {
> > +               if (ret < 0) {
> >                         ceph_set_error_write(ci);
> >                         break;
> >                 }
> >
> > +               /*
> > +                * FIXME: it's unclear whether all OSD versions return the
> > +                * length written on a write. For now, assume that a 0 return
> > +                * means that everything got written.
> > +                */
> > +               if (ret && ret < len)
> > +                       len = ret;
> > +
> >                 ceph_clear_error_write(ci);
> >                 pos += len;
> >                 written += len;
> > --
> > 2.37.2
> >
> 
> Hi Jeff,
> 
> AFAIK OSDs aren't allowed to return any kind of length on a write
> and there is no such thing as a short write.  This definitely needs
> deeper investigation.
> 
> What is the cluster version you are testing against?

OK, I'm seeing 'ret' set to the write length only when encryption is
enabled (i.e. with the test_dummy_encryption mount option).  So maybe the
right fix is something like:

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 16dcade66923..5119d87d61fb 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1889,6 +1889,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 				ceph_release_page_vector(pages, num_pages);
 				break;
 			}
+			ret = 0;
 		}
 
 		req = ceph_osdc_new_request(osdc, &ci->i_layout,
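
To illustrate the pattern I suspect, here is a minimal stand-alone sketch
(a user-space model, not the kernel code).  All names in it are
hypothetical, and it assumes that the encryption helper returns a byte
count on success and that the request status is only stored into 'ret'
when 'ret' is still zero.  Neither assumption has been verified against
the actual code path.

```c
/* Hypothetical stand-in for an encryption helper that returns the
 * number of bytes processed on success (assumption). */
static int encrypt_pages_model(int len)
{
	return len;
}

/* Hypothetical stand-in for the OSD write: 0 on success, negative
 * errno-style value on failure. */
static int osd_write_model(int fail)
{
	return fail ? -5 : 0;
}

/* Models one iteration of the write loop and returns the value of
 * 'ret' that the final "if (ret != 0)" check would see. */
int write_once(int encrypted, int len, int reset_ret, int osd_fail)
{
	int ret = 0;

	if (encrypted) {
		ret = encrypt_pages_model(len);
		if (ret < 0)
			return ret;
		if (reset_ret)
			ret = 0;	/* the one-line fix proposed above */
	}

	/* Assumption: the status is only taken if 'ret' is still 0,
	 * so a stale positive value survives to the final check. */
	if (!ret)
		ret = osd_write_model(osd_fail);

	return ret;
}
```

In this model write_once(1, 4096, 0, 0) returns 4096, which a
"ret != 0" check would treat as an error, while write_once(1, 4096, 1, 0)
returns 0.  Without encryption the result is 0 either way.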

Cheers,
--
Luís


