Re: [PATCH - v2] NFSv4: handle EINVAL from EXCHANGE_ID better.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2018-03-21 at 05:44 +1100, NeilBrown wrote:
> On Tue, Mar 20 2018, Trond Myklebust wrote:
> 
> > On Tue, 2018-03-20 at 11:32 +1100, NeilBrown wrote:
> > > On Fri, Mar 16 2018, Trond Myklebust wrote:
> > > 
> > > > On Fri, 2018-03-16 at 10:31 +0100, Mkrtchyan, Tigran wrote:
> > > > > Hi Neil,
> > > > > 
> > > > > according to rfc5661, NFS4ERR_INVAL is returned by the server
> > > > > if
> > > > > it
> > > > > thinks that client sends an invalid request (e.g. points to a
> > > > > client
> > > > > bug)
> > > > > or server misinterpret it (broken server).
> > > > > 
> > > > > With your change instead of failing the mount, client will
> > > > > silently
> > > > > go for
> > > > > v4.0, even v4.1 mount was requested and produce undesirable
> > > > > behavior,
> > > > > e.g.
> > > > > proxy-io instead of pnfs. I fill prefer fail-fast instead of
> > > > > long
> > > > > debug
> > > > > sessions.
> > > > > 
> > > > > On the other hand, I understand, that it's not always
> > > > > possible to
> > > > > fix
> > > > > server
> > > > > or clients in production environment and time-to-time
> > > > > workarounds
> > > > > are
> > > > > necessary.
> > > > > 
> > > > > 
> > > > 
> > > > I'd tend to agree with Tigran. Hiding server bugs, should not
> > > > be a
> > > > priority and particularly not in this case, where the
> > > > workaround is
> > > > simple: either turn off version negotiation altogether, or edit
> > > > /etc/nfsmount.conf to negotiate a different set of versions.
> > > 
> > > Yes, it could be worked-around in nfsmount.conf, but manual
> > > configuration should be seen as an optimization or a last
> > > resort.   If
> > > we can make things work without configuration, that provides the
> > > best
> > > experience.
> > > In this case, the kernel has strong evidence that the server
> > > isn't responding as expected, but it gives an unhelpful error
> > > message.
> > 
> > Server do not spontaneously break their ability to process NFSv4
> > operations and so this is not an issue that we need to worry about
> > in
> > ordinary operation. It should only ever be an issue when
> > 
> > 1) An insufficiently tested and broken upgrade is applied to an
> > existing server, in which case the main workaround should be to
> > revert
> > the upgrade until it can be fixed.
> > 2) A completely new broken server is introduced to the system.
> 
> 3) An upgrade to the client defaults to trying 4.2 first, then 4.1
> and
>    only then 4.0.  Previous client defaulted to 4.0.

I'm sorry, but this still does not sounds like a good case for "fix the
kernel client". The kernel has no opinion on which NFS version you
should try first in a mount attempt: that decision is made in
userspace.
If you upgraded the nfs-utils to something that now tries 4.2 first,
then that is a user space policy issue.

> > 
> > > At the very least, nfs4_discover_server_trunking() should not
> > > treat
> > > -NFS4ERR_INVAL as unexpect (because there is code in
> > > nfs4_check_cl_exchange_flags which explicitly generates it).
> > > If it just let this error through, instead of translating it to
> > > EIO,
> > > then the problem would go away.
> > 
> > The code in nfs4_check_cl_exchange_flags is there to check for
> > explicit NFSv4.1 protocol violations. Is it broken?
> 
> It is broken in that it reports -EINVAL when no arguments were
> invalid.
> This gets translated to -EIO when there was no IO error.

The purpose of that function is to ensure that the server is
advertising itself as a bona fide NFSv4.1 or newer server, and to
ensure that it is not replying to our EXCHANGE_ID request with some
flag or service that we did not expect and that might cause us to
break.

IOW: it is there to check protocol compliance. As far as I can tell, it
is doing that, and doing so correctly.


> Thanks,
> NeilBrown
> 
> 
> > 
> > > > 
> > > > What we might want to do, is make it easier to allow the user
> > > > to
> > > > detect
> > > > that this is indeed a server bug and is not a problem with the
> > > > arguments supplied to the "mount" utility. Perhaps we might
> > > > have
> > > > the
> > > > kernel log something in the syslogs?
> > > 
> > > Yes, logging a message might be useful.  Most of the messages
> > > logged
> > > about bad servers are currently going through dprintk(), so they
> > > won't
> > > often be seen.  Is that what we want??  Don't know...
> > > 
> > > Anyway, you point that it "is not a problem with the arguments"
> > > is
> > > stop-on.  If the client gets EINVAL from the server, then it
> > > shouldn't
> > > blindly report that back to the user as EINVAL means "Invalid
> > > argument" and the argements given to the server are probably not
> > > the
> > > argument given by the user.
> > > 
> > > Following that line of reasoning, I think
> > > nfs4_check_cl_exchange_flag()
> > > should *not* return -NFS4ERR_INVAL, and _nfs4_proc_exchange_id()
> > > shouldn't pass NFS4ERR_INVAL through unchanged.
> > > 
> > > So I propose the following version.
> > > 
> > > Thanks,
> > > NeilBrown
> > > 
> > > ------------------------8<---------------------------
> > > From: NeilBrown <neilb@xxxxxxxx>
> > > Date: Tue, 20 Mar 2018 11:31:33 +1100
> > > Subject: [PATCH] NFSv4: handle EINVAL from EXCHANGE_ID better.
> > > 
> > > nfs4_proc_exchange_id() can return -EINVAL if the server
> > > reported NFS4INVAL (which I have seen in a packet trace),
> > > or nfs4_check_cl_exchange_flags() exchange flags detects
> > > a problem.
> > > Each of these mean that NFSv4.1 and later cannot be used,
> > > but they should not prevent fallback to NFSv4.0.  Currently
> > > they do.
> > > 
> > > Currently this EINVAL error is returned by
> > > nfs4_proc_exchange_id() to nfs41_discover_server_trunking(),
> > > and thence to nfs4_discover_server_trunking().
> > > nfs4_discover_server_trunking() doesn't understand EINVAL,
> > > so converts it to EIO which causes mount.nfs to think
> > > something is horribly wrong and to give up.
> > > 
> > > EINVAL is never a sensible error code here.  It means "Invalid
> > > argument", but is being used when the problem is "Invalid
> > > response
> > > from the server".  If we change these two circumstances to report
> > > EPROTONOSUPPORT to the caller (which seems a reasonable
> > > assessment
> > > when the server gives confusing responses), and if we enhance
> > > nfs4_discover_server_trunking() to treat -EPROTONOSUPPORT as an
> > > expected error to pass through, then the error reported to user-
> > > space
> > > will be more representative of the actual fault.
> > > 
> > > A failure to negotiate a client ID clearly shows that NFSv4.1
> > > cannot
> > > be supported, but isn't as general a failure as EIO.
> > > 
> > > Signed-off-by: NeilBrown <neilb@xxxxxxxx>
> > > ---
> > >  fs/nfs/nfs4proc.c  | 18 +++++++++++++++---
> > >  fs/nfs/nfs4state.c |  1 +
> > >  2 files changed, 16 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index 47f3c273245e..97757f646f13 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -7364,7 +7364,8 @@ static int nfs4_check_cl_exchange_flags(u32
> > > flags)
> > >  		goto out_inval;
> > >  	return NFS_OK;
> > >  out_inval:
> > > -	return -NFS4ERR_INVAL;
> > > +	dprintk("NFS: server returns invalid flags for
> > > EXCHANGE_ID\n");
> > > +	return -EPROTONOSUPPORT;
> > >  }
> > >  
> > >  static bool
> > > @@ -7741,8 +7742,19 @@ static int _nfs4_proc_exchange_id(struct
> > > nfs_client *clp, struct rpc_cred *cred,
> > >  	int status;
> > >  
> > >  	task = nfs4_run_exchange_id(clp, cred, sp4_how, NULL);
> > > -	if (IS_ERR(task))
> > > -		return PTR_ERR(task);
> > > +	if (IS_ERR(task)) {
> > > +		status = PTR_ERR(task);
> > > +		if (status == -NFS4ERR_INVAL) {
> > > +			/* If the server think we did something
> > > invalid, it is certainly
> > > +			 * not the fault of our caller, so it
> > > would
> > > wrong to report
> > > +			 * this error back up.  So in that case
> > > simply acknowledge that
> > > +			 * we don't seem able to support this
> > > protocol.
> > > +			 */
> > > +			dprintk("NFS: server return
> > > NFS4ERR_INVAL to
> > > EXCHANGE_ID\n");
> > > +			status = -EPROTONOSUPPORT;
> > > +		}
> > > +		return status;
> > > +	}
> > >  
> > >  	argp = task->tk_msg.rpc_argp;
> > >  	resp = task->tk_msg.rpc_resp;
> > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > > index 91a4d4eeb235..273c032089c4 100644
> > > --- a/fs/nfs/nfs4state.c
> > > +++ b/fs/nfs/nfs4state.c
> > > @@ -2219,6 +2219,7 @@ int nfs4_discover_server_trunking(struct
> > > nfs_client *clp,
> > >  		clnt = clp->cl_rpcclient;
> > >  		goto again;
> > >  
> > > +	case -EPROTONOSUPPORT:
> > >  	case -NFS4ERR_MINOR_VERS_MISMATCH:
> > >  		status = -EPROTONOSUPPORT;
> > >  		break;
> > 
> > -- 
> > Trond Myklebust
> > Linux NFS client maintainer, PrimaryData
> > trond.myklebust@xxxxxxxxxxxxxxx
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx

Attachment: signature.asc
Description: This is a digitally signed message part


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux