Re: [PATCH v2 1/1] NFSD: Simplify READ_PLUS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2022-09-07 at 20:51 +0000, Chuck Lever III wrote:
> 
> 
> > On Sep 7, 2022, at 4:37 PM, Anna Schumaker <anna@xxxxxxxxxx> wrote:
> > 
> > On Wed, Sep 7, 2022 at 4:29 PM Chuck Lever III
> > <chuck.lever@xxxxxxxxxx> wrote:
> > > 
> > > Be sure to Cc: Jeff on these. Thanks!
> > > 
> > > 
> > > > On Sep 7, 2022, at 3:52 PM, Anna Schumaker <anna@xxxxxxxxxx>
> > > > wrote:
> > > > 
> > > > From: Anna Schumaker <Anna.Schumaker@xxxxxxxxxx>
> > > > 
> > > > Chuck had suggested reverting READ_PLUS so it returns a single
> > > > DATA
> > > > segment covering the requested read range. This prepares the
> > > > server for
> > > > a future "sparse read" function so support can easily be added
> > > > without
> > > > needing to rip out the old READ_PLUS code at the same time.
> > > > 
> > > > Signed-off-by: Anna Schumaker <Anna.Schumaker@xxxxxxxxxx>
> > > > ---
> > > > fs/nfsd/nfs4xdr.c | 139 +++++++++++----------------------------
> > > > -------
> > > > 1 file changed, 32 insertions(+), 107 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> > > > index 1e9690a061ec..bcc8c385faf2 100644
> > > > --- a/fs/nfsd/nfs4xdr.c
> > > > +++ b/fs/nfsd/nfs4xdr.c
> > > > @@ -4731,79 +4731,37 @@ nfsd4_encode_offload_status(struct
> > > > nfsd4_compoundres *resp, __be32 nfserr,
> > > > 
> > > > static __be32
> > > > nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
> > > > -                         struct nfsd4_read *read,
> > > > -                         unsigned long *maxcount, u32 *eof,
> > > > -                         loff_t *pos)
> > > > +                         struct nfsd4_read *read)
> > > > {
> > > > -     struct xdr_stream *xdr = resp->xdr;
> > > > +     bool splice_ok = test_bit(RQ_SPLICE_OK, &resp->rqstp-
> > > > >rq_flags);
> > > >      struct file *file = read->rd_nf->nf_file;
> > > > -     int starting_len = xdr->buf->len;
> > > > -     loff_t hole_pos;
> > > > -     __be32 nfserr;
> > > > -     __be32 *p, tmp;
> > > > -     __be64 tmp64;
> > > > -
> > > > -     hole_pos = pos ? *pos : vfs_llseek(file, read->rd_offset,
> > > > SEEK_HOLE);
> > > > -     if (hole_pos > read->rd_offset)
> > > > -             *maxcount = min_t(unsigned long, *maxcount,
> > > > hole_pos - read->rd_offset);
> > > > -     *maxcount = min_t(unsigned long, *maxcount, (xdr->buf-
> > > > >buflen - xdr->buf->len));
> > > > +     struct xdr_stream *xdr = resp->xdr;
> > > > +     unsigned long maxcount;
> > > > +     __be32 nfserr, *p;
> > > > 
> > > >      /* Content type, offset, byte count */
> > > >      p = xdr_reserve_space(xdr, 4 + 8 + 4);
> > > >      if (!p)
> > > > -             return nfserr_resource;
> > > > +             return nfserr_io;
> > > 
> > > Wouldn't nfserr_rep_too_big be a more appropriate status for
> > > running
> > > off the end of the send buffer? I'm not 100% sure, but I would
> > > expect
> > > that exhausting send buffer space would imply the reply has grown
> > > too
> > > large.
> > 
> > I can switch it to that, no problem.
> 
> I would like to hear opinions from protocol experts before we go
> with that choice.

I'd agree that NFS4ERR_REP_TOO_BIG is correct if you're not able to
even return a short read. However if you can return a short read, then
that's better than returning an error.

It looks to me as if this function can bit hit in both cases, so
perhaps some care is in order.

> 
> > > > +     if (resp->xdr->buf->page_len && splice_ok) {
> > > > +             WARN_ON_ONCE(splice_ok);
> > > > +             return nfserr_io;
> > > > +     }
> > > 
> > > I wish I understood why this test was needed. It seems to have
> > > been
> > > copied and pasted from historic code into nfsd4_encode_read(),
> > > and
> > > there have been recent mechanical changes to it, but there's no
> > > comment explaining it there...
> > 
> > Yeah, I saw this was in the read code and assumed it was an
> > important
> > check so I added it here too.
> > > 
> > > In any event, this seems to be checking for a server software
> > > bug,
> > > so maybe this should return nfserr_serverfault. Oddly that status
> > > code isn't defined yet.
> > 
> > Do you want me to add that code and return it in this patch?
> 
> Sure. Make that a predecessor patch and fix up the code in
> nfsd4_encode_read() before using it for READ_PLUS in this patch.
> 
> Suggested patch description:
> 
> RESOURCE is the wrong status code here because
> 
> a) This encoder is shared between NFSv4.0 and NFSv4.1+, and
>    NFSv4.1+ doesn't support RESOURCE, and

Is it? I thought it was READ_PLUS only. That's NFSv4.2 or higher.

If the encoder is to be shared with both READ and READ_PLUS, then
perhaps it might be wiser to choose a name other than
nfsd4_encode_read_plus_data()?

> b) That status code might cause the client to retry, in which
>    case it will get the same failure because this check seems
>    to be looking for a server-side coding mistake.
> 
> 
> > > Do you have some performance results for v2?
> > 
> > Not yet, I have it running now so hopefully I'll have something
> > ready
> > by tomorrow morning.
> 
> Great, thanks!
> 
> 
> > Anna
> > > 
> > > 
> > > > -     read->rd_vlen = xdr_reserve_space_vec(xdr, resp->rqstp-
> > > > >rq_vec, *maxcount);
> > > > -     if (read->rd_vlen < 0)
> > > > -             return nfserr_resource;
> > > > +     maxcount = min_t(unsigned long, read->rd_length,
> > > > +                      (xdr->buf->buflen - xdr->buf->len));
> > > > 
> > > > -     nfserr = nfsd_readv(resp->rqstp, read->rd_fhp, file,
> > > > read->rd_offset,
> > > > -                         resp->rqstp->rq_vec, read->rd_vlen,
> > > > maxcount, eof);
> > > > +     if (file->f_op->splice_read && splice_ok)
> > > > +             nfserr = nfsd4_encode_splice_read(resp, read,
> > > > file, maxcount);
> > > > +     else
> > > > +             nfserr = nfsd4_encode_readv(resp, read, file,
> > > > maxcount);
> > > >      if (nfserr)
> > > >              return nfserr;
> > > > -     xdr_truncate_encode(xdr, starting_len + 16 +
> > > > xdr_align_size(*maxcount));
> > > > 
> > > > -     tmp = htonl(NFS4_CONTENT_DATA);
> > > > -     write_bytes_to_xdr_buf(xdr->buf, starting_len,     
> > > > &tmp,   4);
> > > > -     tmp64 = cpu_to_be64(read->rd_offset);
> > > > -     write_bytes_to_xdr_buf(xdr->buf, starting_len + 4, 
> > > > &tmp64, 8);
> > > > -     tmp = htonl(*maxcount);
> > > > -     write_bytes_to_xdr_buf(xdr->buf, starting_len + 12,
> > > > &tmp,   4);
> > > > -
> > > > -     tmp = xdr_zero;
> > > > -     write_bytes_to_xdr_buf(xdr->buf, starting_len + 16 +
> > > > *maxcount, &tmp,
> > > > -                            xdr_pad_size(*maxcount));
> > > > -     return nfs_ok;
> > > > -}
> > > > -
> > > > -static __be32
> > > > -nfsd4_encode_read_plus_hole(struct nfsd4_compoundres *resp,
> > > > -                         struct nfsd4_read *read,
> > > > -                         unsigned long *maxcount, u32 *eof)
> > > > -{
> > > > -     struct file *file = read->rd_nf->nf_file;
> > > > -     loff_t data_pos = vfs_llseek(file, read->rd_offset,
> > > > SEEK_DATA);
> > > > -     loff_t f_size = i_size_read(file_inode(file));
> > > > -     unsigned long count;
> > > > -     __be32 *p;
> > > > -
> > > > -     if (data_pos == -ENXIO)
> > > > -             data_pos = f_size;
> > > > -     else if (data_pos <= read->rd_offset || (data_pos <
> > > > f_size && data_pos % PAGE_SIZE))
> > > > -             return nfsd4_encode_read_plus_data(resp, read,
> > > > maxcount, eof, &f_size);
> > > > -     count = data_pos - read->rd_offset;
> > > > -
> > > > -     /* Content type, offset, byte count */
> > > > -     p = xdr_reserve_space(resp->xdr, 4 + 8 + 8);
> > > > -     if (!p)
> > > > -             return nfserr_resource;
> > > > -
> > > > -     *p++ = htonl(NFS4_CONTENT_HOLE);
> > > > +     *p++ = cpu_to_be32(NFS4_CONTENT_DATA);
> > > >      p = xdr_encode_hyper(p, read->rd_offset);
> > > > -     p = xdr_encode_hyper(p, count);
> > > > +     *p = cpu_to_be32(read->rd_length);
> > > > 
> > > > -     *eof = (read->rd_offset + count) >= f_size;
> > > > -     *maxcount = min_t(unsigned long, count, *maxcount);
> > > >      return nfs_ok;
> > > > }
> > > > 
> > > > @@ -4811,69 +4769,36 @@ static __be32
> > > > nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32
> > > > nfserr,
> > > >                     struct nfsd4_read *read)
> > > > {
> > > > -     unsigned long maxcount, count;
> > > > +     struct file *file = read->rd_nf->nf_file;
> > > >      struct xdr_stream *xdr = resp->xdr;
> > > > -     struct file *file;
> > > >      int starting_len = xdr->buf->len;
> > > > -     int last_segment = xdr->buf->len;
> > > > -     int segments = 0;
> > > > -     __be32 *p, tmp;
> > > > -     bool is_data;
> > > > -     loff_t pos;
> > > > -     u32 eof;
> > > > +     u32 segments = 0;
> > > > +     __be32 *p;
> > > > 
> > > >      if (nfserr)
> > > >              return nfserr;
> > > > -     file = read->rd_nf->nf_file;
> > > > 
> > > >      /* eof flag, segment count */
> > > >      p = xdr_reserve_space(xdr, 4 + 4);
> > > >      if (!p)
> > > > -             return nfserr_resource;
> > > > +             return nfserr_io;
> > > >      xdr_commit_encode(xdr);
> > > > 
> > > > -     maxcount = min_t(unsigned long, read->rd_length,
> > > > -                      (xdr->buf->buflen - xdr->buf->len));
> > > > -     count    = maxcount;
> > > > -
> > > > -     eof = read->rd_offset >= i_size_read(file_inode(file));
> > > > -     if (eof)
> > > > +     read->rd_eof = read->rd_offset >=
> > > > i_size_read(file_inode(file));
> > > > +     if (read->rd_eof)
> > > >              goto out;
> > > > 
> > > > -     pos = vfs_llseek(file, read->rd_offset, SEEK_HOLE);
> > > > -     is_data = pos > read->rd_offset;
> > > > -
> > > > -     while (count > 0 && !eof) {
> > > > -             maxcount = count;
> > > > -             if (is_data)
> > > > -                     nfserr =
> > > > nfsd4_encode_read_plus_data(resp, read, &maxcount, &eof,
> > > > -                                             segments == 0 ?
> > > > &pos : NULL);
> > > > -             else
> > > > -                     nfserr =
> > > > nfsd4_encode_read_plus_hole(resp, read, &maxcount, &eof);
> > > > -             if (nfserr)
> > > > -                     goto out;
> > > > -             count -= maxcount;
> > > > -             read->rd_offset += maxcount;
> > > > -             is_data = !is_data;
> > > > -             last_segment = xdr->buf->len;
> > > > -             segments++;
> > > > -     }
> > > > -
> > > > -out:
> > > > -     if (nfserr && segments == 0)
> > > > +     nfserr = nfsd4_encode_read_plus_data(resp, read);
> > > > +     if (nfserr) {
> > > >              xdr_truncate_encode(xdr, starting_len);
> > > > -     else {
> > > > -             if (nfserr) {
> > > > -                     xdr_truncate_encode(xdr, last_segment);
> > > > -                     nfserr = nfs_ok;
> > > > -                     eof = 0;
> > > > -             }
> > > > -             tmp = htonl(eof);
> > > > -             write_bytes_to_xdr_buf(xdr->buf,
> > > > starting_len,     &tmp, 4);
> > > > -             tmp = htonl(segments);
> > > > -             write_bytes_to_xdr_buf(xdr->buf, starting_len +
> > > > 4, &tmp, 4);
> > > > +             return nfserr;
> > > >      }
> > > > 
> > > > +     segments++;
> > > > +
> > > > +out:
> > > > +     p = xdr_encode_bool(p, read->rd_eof);
> > > > +     *p = cpu_to_be32(segments);
> > > >      return nfserr;
> > > > }
> > > > 
> > > > --
> > > > 2.37.2
> > > > 
> > > 
> > > --
> > > Chuck Lever
> 
> --
> Chuck Lever
> 
> 
> 

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux