Re: long delay when mounting due to SETCLIENTID AUTH_GSS attempts

On Fri, 3 May 2013 18:33:54 +0000
"Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:

> On Fri, 2013-05-03 at 14:24 -0400, Jeff Layton wrote:
> > On Fri, 3 May 2013 13:56:13 -0400
> > Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> > 
> > > 
> > > On May 3, 2013, at 1:25 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > 
> > > > I've noticed that on a 3.10-pre kernel, mounting an NFSv4
> > > > filesystem now takes ~15s to complete.
> > > > 
> > > > Here's a little rpcdebug output:
> > > > 
> > > > [ 3056.385078] svc: server ffff8800368fc000 waiting for data (to = 9223372036854775807)
> > > > [ 3056.392056] RPC:       new task initialized, procpid 2471
> > > > [ 3056.392758] RPC:       allocated task ffff88010cd90100
> > > > [ 3056.393303] RPC:    42 __rpc_execute flags=0x1280
> > > > [ 3056.393630] RPC:    42 call_start nfs4 proc SETCLIENTID (sync)
> > > > [ 3056.394056] RPC:    42 call_reserve (status 0)
> > > > [ 3056.394368] RPC:    42 reserved req ffff8801019f9600 xid 21ad6c40
> > > > [ 3056.394783] RPC:       wake_up_first(ffff88010a989990 "xprt_sending")
> > > > [ 3056.395252] RPC:    42 call_reserveresult (status 0)
> > > > [ 3056.395595] RPC:    42 call_refresh (status 0)
> > > > [ 3056.395901] RPC:       gss_create_cred for uid 0, flavor 390004
> > > > [ 3056.396361] RPC:       gss_create_upcall for uid 0
> > > > [ 3071.396134] RPC: AUTH_GSS upcall timed out.
> > > > Please check user daemon is running.
> > > > [ 3071.397374] RPC:       gss_create_upcall for uid 0 result -13
> > > > [ 3071.398192] RPC:    42 call_refreshresult (status -13)
> > > > [ 3071.398873] RPC:    42 call_refreshresult: refresh creds failed with error -13
> > > > [ 3071.399881] RPC:    42 return 0, status -13
> > > > 
> > > > The problem is that we're now trying to upcall for GSS creds to do the
> > > > SETCLIENTID call, but this host isn't running rpc.gssd. Not running
> > > > rpc.gssd is pretty common for people not using kerberized NFS. I think
> > > > we'll see a lot of complaints about this.
> > > > 
> > > > Is this expected?
> > > 
> > > Yes.
> > > 
> > > There are operations like SETCLIENTID and GETATTR(fs_locations) which should always use an integrity-checking security flavor, even if particular mount points use sec=sys.
> > > 
> > > There are cases where GSS is not available, and we fall back to using AUTH_SYS.  That should happen as quickly as possible, I agree.
> > > 
> > > > If so, what's the proposed remedy?
> > > > Simply have everyone run rpc.gssd even if they're not using kerberized NFS?
> > > 
> > > 
> > > That's one possibility.  Or we could shorten the upcall timeout.  Or, add a mechanism by which rpc.gssd can provide a positive indication to the kernel that it is running.
> > > 
> > > It doesn't seem like an intractable problem.
> > > 
> > 
> > Nope, it's not intractable at all...
> > 
> > Currently, the gssd upcall uses the RPC_PIPE_WAIT_FOR_OPEN flag to
> > allow you to queue upcalls to be processed when the daemon isn't up
> > yet. When the daemon starts, it processes that queue. The caller gives
> > up after 15s (which is what's happening here), and the upcall
> > eventually gets scraped out of the queue after 30s.
> > 
> > We could stop using that flag on this rpc_pipe and simply require that
> > the daemon be up and running before attempting any sort of AUTH_GSS
> > rpc. That might be a little less friendly in the face of boot-time
> > ordering problems, but it should presumably make this problem go away.
> 
> You probably don't want to do that... The main reason for the
> RPC_PIPE_WAIT_FOR_OPEN is that even if the gssd daemon is running, it
> takes it a moment or two to notice that a new client directory has been
> created, and that there is a new 'krb' pipe to attach to.
> 

Ok yeah, good point...

Shortening the timeout will also suck -- that'll just reduce the pain
somewhat but will still be a performance regression. It looks like even
specifying '-o sec=sys' doesn't disable this behavior. Should it?
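
To put rough numbers on that, here's a toy userspace model of a single queued upcall. It is not kernel code: the function and field names are made up for illustration, and it assumes the upcall succeeds as soon as the daemon attaches to the pipe. The 15s default is the caller timeout seen in the log above.

```python
from dataclasses import dataclass
from typing import Optional

CALLER_TIMEOUT_S = 15.0  # caller gives up on the upcall after this long


@dataclass
class UpcallResult:
    flavor: str     # flavor that ends up being used for SETCLIENTID
    delay_s: float  # time the mount spent waiting on the upcall


def setclientid_wait(daemon_attach_s: Optional[float],
                     caller_timeout_s: float = CALLER_TIMEOUT_S) -> UpcallResult:
    """Model one GSS upcall queued with wait-for-open semantics.

    daemon_attach_s is when rpc.gssd opens the pipe, measured from the
    moment the upcall is queued; None means the daemon never shows up.
    """
    if daemon_attach_s is not None and daemon_attach_s <= caller_timeout_s:
        # Assumption: the upcall succeeds once the daemon is attached.
        return UpcallResult("AUTH_GSS", daemon_attach_s)
    # Nobody picked up the upcall in time: fall back to AUTH_SYS.
    return UpcallResult("AUTH_SYS", caller_timeout_s)


if __name__ == "__main__":
    print(setclientid_wait(None))                        # no rpc.gssd, current timeout
    print(setclientid_wait(None, caller_timeout_s=2.0))  # no rpc.gssd, shorter timeout
    print(setclientid_wait(0.5))                         # rpc.gssd attaches after 0.5s
```

With no daemon, the mount eats the whole timeout whatever value we pick, so a shorter timeout only shrinks the stall. It also narrows the window the daemon has to notice a new pipe.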

Instead of using AUTH_GSS for SETCLIENTID by default, would it make
sense to add a switch (module parm?) that turns it on, so that it's
opt-in rather than the default?
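
Roughly the logic I have in mind, sketched as a userspace model rather than a patch. The switch name `gss_for_setclientid` and the function are hypothetical, and the "off" behavior (just use the mount's own flavor) is an assumption. The krb5i choice matches flavor 390004 in the log above.

```python
from typing import List


def setclientid_flavors(mount_sec: str,
                        gss_for_setclientid: bool = False) -> List[str]:
    """Return the flavors to try for SETCLIENTID, in order.

    gss_for_setclientid stands in for the proposed opt-in switch
    (hypothetical name; it could be a module parameter).
    """
    if gss_for_setclientid:
        # Opted in: try an integrity-checking flavor first, then fall
        # back to AUTH_SYS if GSS is not available.
        return ["krb5i", "sys"]
    # Default (assumed): no GSS upcall at all, use the mount's flavor.
    return [mount_sec]


if __name__ == "__main__":
    print(setclientid_flavors("sys"))
    print(setclientid_flavors("sys", gss_for_setclientid=True))
```

With the switch off, a sec=sys mount never queues an upcall, so hosts without rpc.gssd wouldn't see the delay at all.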

-- 
Jeff Layton <jlayton@xxxxxxxxxx>