On Mon, 2012-06-25 at 17:32 -0400, Chuck Lever wrote:
> On Jun 25, 2012, at 5:20 PM, Myklebust, Trond wrote:
>
> > On Mon, 2012-06-25 at 16:47 -0400, Chuck Lever wrote:
> >> Commit 2a6ee6aa "NFSv4: Clean up the error handling for
> >> nfs4_reclaim_lease" May 25, 2012 appears to have changed the error
> >> value returned by nfs4_reclaim_lease() if rpc.gssd fails to create
> >> a machine credential (either due to a local error or a problem on
> >> the server).
> >
> > It is _not_ an error to be mounting without machine credentials.
>
> Correct, and the client shouldn't oops in this case either.
>
> >> In this case, nfs4_proc_setclientid() returns -EACCES. Before
> >> 2a6ee6aa, nfs4_reclaim_lease() converted this to -EAGAIN. Now it
> >> returns zero. The state manager assumes all is well, nfs_client
> >> initialization then proceeds in process context until it oopses while
> >> trying to set up the mount's nfs_server.
> >
> > The PURGE_STATE and/or LEASE_EXPIRED flags should still be set if we
> > exit via nfs4_handle_reclaim_lease_error. The zero error return is there
> > in order to ensure that the state manager thread loops and retries
> > instead of just aborting.
>
> My test case is attempting to mount a server with "sec=krb5" explicitly
> set. The server was not configured correctly. Why should the client
> retry in this case? There's nothing it can do to correct this situation.
>
> (Not to mention the fact that the gssd error messages on the client are
> incredibly obtuse and unhelpful).
>
> > This patch makes it impossible to run without machine creds because you
> > circumvent the nfs4_clear_machine_cred()+retry case.
>
> Perhaps that's incorrect, but it's what the code used to do before
> 2a6ee6aa, as near as I can tell.

Reverting to a broken previous state isn't really all that useful.

Can you please send me a copy of the Oops that you are seeing?

> > Why does setting up the nfs_server depend on a successful setclientid
> > call anyway in the case of minor version 0?
>
> Should this also be the case during trunking discovery?

I don't see how trunking discovery enters into it. The nfs_server setup
should not Oops in case of a state initialisation failure.

However if the user sets the '-omigration' flag, then we should have
minor version 0 mount wait for trunking discovery to complete, just like
minor version 1...

> The server will return CLID_INUSE, possibly. Either we fail the mount,
> or we allow it and split the lease.

CLID_INUSE needs to be handled carefully. In most cases we'll want to
retry a couple of times, and then error out. However, if the user has
not specified a 'sec=' mount option, then we should try to negotiate the
right security mechanism as part of the retries.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com