Re: [PATCH] NFS: Retry a zero-length short read

Benjamin Coddington <bcodding@xxxxxxxxxx> · Wed, 16 Mar 2016 15:46:28 -0400 (EDT)

> On Wed, Mar 16, 2016 at 01:36:48PM -0400, Benjamin Coddington wrote:
> > On Wed, 16 Mar 2016, J. Bruce Fields wrote:
> >
> > > On Wed, Mar 16, 2016 at 12:22:22PM -0400, Trond Myklebust wrote:
> > > > On Wed, Mar 16, 2016 at 11:20 AM, Benjamin Coddington
> > > > <bcodding@xxxxxxxxxx> wrote:
> > > > > On Wed, 16 Mar 2016, Benjamin Coddington wrote:
> > > > >
> > > > >> On Wed, 16 Mar 2016, Trond Myklebust wrote:
> > > > >>
> > > > >> > On Wed, Mar 16, 2016 at 10:22 AM, Benjamin Coddington
> > > > >> > <bcodding@xxxxxxxxxx> wrote:
> > > > >> > > On Wed, 16 Mar 2016, Trond Myklebust wrote:
> > > > >> > >
> > > > >> > >> On Wed, Mar 16, 2016 at 5:17 AM, Benjamin Coddington
> > > > >> > >> <bcodding@xxxxxxxxxx> wrote:
> > > > >> > >> >
> > > > >> > >> > A zero-length short read without eof should be retried rather than sending
> > > > >> > >> > an error to the application.
> > > > >> > >>
> > > > >> > >>
> > > > >> > >> In what situation would returning a 0 length read not be a bug? If the
> > > > >> > >> server intended that we back off and retry, it has the alternative of
> > > > >> > >> sending a JUKEBOX/DELAY error.
> > > > >> > >
> > > > >> > > If the server completes a local read but then another writer comes in and
> > > > >> > > appends to the file before the server checks if it needs to set EOF, then
> > > > >> > > the response might be 0 length without EOF set.
> > > > >> >
> > > > >> > Why isn't that EOF check done atomically with the read itself? This
> > > > >> > still sounds like a server bug to me.
> > > > >>
> > > > >> I don't know -- I would guess because doing that atomically is harder than
> > > > >> not, and I don't know where the RFCs say that a zero length response without
> > > > >> eof is to be treated as an error or condition to be avoided.
> > > > >>
> > > > >> I'll look into that, and respond here.
> > > > >
> > > > > Indeed, it seems that it is more convenient for the linux server to send a
> > > > > zero-length response without eof when the file grows.  It would probably be
> > > > > more helpful if the server handled that case, but I think that 7530 states
> > > > > that it doesn't have to handle that case.
> > > >
> > > > Here is what RFC5661 and RFC7530 say.
> > > >
> > > >    If the READ ended at the end-of-file (formally, in a correctly formed
> > > >    READ request, if offset + count is equal to the size of the file), or
> > > >    the READ request extends beyond the size of the file (if offset +
> > > >    count is greater than the size of the file), eof is returned as TRUE;
> > > >    otherwise, it is FALSE.  A successful READ of an empty file will
> > > >    always return eof as TRUE.
> > > >
> > > > Here is what RFC1813 says:
> > > >
> > > >       eof
> > > >          If the read ended at the end-of-file (formally, in a
> > > >          correctly formed READ request, if READ3args.offset plus
> > > >          READ3resok.count is equal to the size of the file), eof
> > > >          is returned as TRUE; otherwise it is FALSE. A
> > > >          successful READ of an empty file will always return eof
> > > >          as TRUE.
> > > >
> > > > Where does it say that the eof determination is allowed to be
> > > > non-atomic? Unlike structures such as change_info4, there isn't an
> > > > "atomic" flag to allow the server to communicate to the client that it
> > > > cannot rely on the eof flag. Since the determination is part of the
> > > > same READ operation, you can't point to the "COMPOUNDS are not atomic"
> > > > either.
> > >
> > > I wonder why READ has eof at all, instead of adopting read()'s
> > > convention that 0 means eof?
> > >
> > > In any case, our server should be able to count on that convention
> > > internally, so if getting a real "eof" out of the filesystem looks
> > > daunting, why not for now just check for that case in
> > > nfsd4_encode_readv()?:
> >
> > It's probably not really that daunting.  Couldn't we check to see if inode's
> > size has increased, and if so do the read again?
>
> Does the client really have trouble with short reads in general, or only
> with zero-length short reads?

Just the zero-length short read.

> > The concept of READ and eof atomicity should also include a concept of a
> > boundary or barrier.  The operation is atomic within what bound - the
> > server's response processing or the underlying filesystem?
>
> Ideally the read data and eof both represent the same state of the file
> with no file-modifying operations intervening between the two.
>
> I could come up with artificial examples that would allow clients to
> detect that we don't meet this standard.  I can't think of one that
> would occur in practice (may just be my lack of imagination).

So, sounds like fixing this is a good idea on the server. I hope Trond will
let us know if he still feels that the client ought not to be changed since
it seems an easy enough fix to avoid a similar problem on another server.
Perhaps there's a downside I'm not seeing on the client.  Or maybe the
convention of read() returning 0 meaning eof is global enough to cause it to
be acceptible behavior -- we really should treat a zero-length read response
without eof as an error.  My lack of experience is showing..  :)

Ben
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html