Re: [PATCH v2 1/1] NFSD: Simplify READ_PLUS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2022-09-08 at 15:36 +0000, Chuck Lever III wrote:
> 
> 
> > On Sep 7, 2022, at 6:33 PM, Trond Myklebust
> > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > 
> > On Wed, 2022-09-07 at 20:51 +0000, Chuck Lever III wrote:
> > > 
> > > 
> > > > On Sep 7, 2022, at 4:37 PM, Anna Schumaker <anna@xxxxxxxxxx>
> > > > wrote:
> > > > 
> > > > On Wed, Sep 7, 2022 at 4:29 PM Chuck Lever III
> > > > <chuck.lever@xxxxxxxxxx> wrote:
> > > > > 
> > > > > Be sure to Cc: Jeff on these. Thanks!
> > > > > 
> > > > > 
> > > > > > On Sep 7, 2022, at 3:52 PM, Anna Schumaker
> > > > > > <anna@xxxxxxxxxx>
> > > > > > wrote:
> > > > > > 
> > > > > > From: Anna Schumaker <Anna.Schumaker@xxxxxxxxxx>
> > > > > > 
> > > > > > Chuck had suggested reverting READ_PLUS so it returns a
> > > > > > single
> > > > > > DATA
> > > > > > segment covering the requested read range. This prepares
> > > > > > the
> > > > > > server for
> > > > > > a future "sparse read" function so support can easily be
> > > > > > added
> > > > > > without
> > > > > > needing to rip out the old READ_PLUS code at the same time.
> > > > > > 
> > > > > > Signed-off-by: Anna Schumaker <Anna.Schumaker@xxxxxxxxxx>
> > > > > > ---
> > > > > > fs/nfsd/nfs4xdr.c | 139 +++++++++++------------------------
> > > > > > ----
> > > > > > -------
> > > > > > 1 file changed, 32 insertions(+), 107 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> > > > > > index 1e9690a061ec..bcc8c385faf2 100644
> > > > > > --- a/fs/nfsd/nfs4xdr.c
> > > > > > +++ b/fs/nfsd/nfs4xdr.c
> > > > > > @@ -4731,79 +4731,37 @@ nfsd4_encode_offload_status(struct
> > > > > > nfsd4_compoundres *resp, __be32 nfserr,
> > > > > > 
> > > > > > static __be32
> > > > > > nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
> > > > > > -                         struct nfsd4_read *read,
> > > > > > -                         unsigned long *maxcount, u32
> > > > > > *eof,
> > > > > > -                         loff_t *pos)
> > > > > > +                         struct nfsd4_read *read)
> > > > > > {
> > > > > > -     struct xdr_stream *xdr = resp->xdr;
> > > > > > +     bool splice_ok = test_bit(RQ_SPLICE_OK, &resp->rqstp-
> > > > > > > rq_flags);
> > > > > >      struct file *file = read->rd_nf->nf_file;
> > > > > > -     int starting_len = xdr->buf->len;
> > > > > > -     loff_t hole_pos;
> > > > > > -     __be32 nfserr;
> > > > > > -     __be32 *p, tmp;
> > > > > > -     __be64 tmp64;
> > > > > > -
> > > > > > -     hole_pos = pos ? *pos : vfs_llseek(file, read-
> > > > > > >rd_offset,
> > > > > > SEEK_HOLE);
> > > > > > -     if (hole_pos > read->rd_offset)
> > > > > > -             *maxcount = min_t(unsigned long, *maxcount,
> > > > > > hole_pos - read->rd_offset);
> > > > > > -     *maxcount = min_t(unsigned long, *maxcount, (xdr-
> > > > > > >buf-
> > > > > > > buflen - xdr->buf->len));
> > > > > > +     struct xdr_stream *xdr = resp->xdr;
> > > > > > +     unsigned long maxcount;
> > > > > > +     __be32 nfserr, *p;
> > > > > > 
> > > > > >      /* Content type, offset, byte count */
> > > > > >      p = xdr_reserve_space(xdr, 4 + 8 + 4);
> > > > > >      if (!p)
> > > > > > -             return nfserr_resource;
> > > > > > +             return nfserr_io;
> > > > > 
> > > > > Wouldn't nfserr_rep_too_big be a more appropriate status for
> > > > > running
> > > > > off the end of the send buffer? I'm not 100% sure, but I
> > > > > would
> > > > > expect
> > > > > that exhausting send buffer space would imply the reply has
> > > > > grown
> > > > > too
> > > > > large.
> > > > 
> > > > I can switch it to that, no problem.
> > > 
> > > I would like to hear opinions from protocol experts before we go
> > > with that choice.
> > 
> > I'd agree that NFS4ERR_REP_TOO_BIG is correct if you're not able to
> > even return a short read. However if you can return a short read,
> > then
> > that's better than returning an error.
> 
> Many thanks for reviewing!
> 
> In general, I might want to convert all NFSv4 encoders to return
> either RESOURCE or REP_TOO_BIG when xdr_reserve_space() fails. I
> can post a patch or two that makes that attempt so the special
> cases can be sorted on the mailing list.
> 
> 
> > It looks to me as if this function can bit hit in both cases, so
> > perhaps some care is in order.
> 
> Intriguing idea.
> 
> For READ, if xdr_reserve_space() fails, that's all she wrote.
> I think all these calls happen before the data payload is encoded.
> 
> For READ_PLUS, if xdr_reserve_space() fails, it might be possible
> to use xdr_truncate_encode() or simply start over with a limit on
> the number of segments to be encoded. We're not really there yet,
> as currently we want to return just a single segment at this
> point.
> 
> I feel the question is whether it's practical or a frequent
> enough occurrence to bother with the special cases. Encoding a
> READ_PLUS response is already challenging.

What I'm saying is that you are more likely to hit this issue when you
have a reply with something like "<data><hole><data>"...

If the second "<data>" chunk hits the above xdr_reserve_space() limit
(i.e. the first call to nfsd4_encode_read_plus_data() succeeds as does
the call to nfsd4_encode_read_plus_hole()), then you just want to
return a short read of the form "<data><hole>" instead of returning an
error on the whole READ_PLUS operation.


-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux