On May 3, 2013, at 3:17 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:

> On Fri, 3 May 2013 14:48:59 -0400
> Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
>>
>> On May 3, 2013, at 2:44 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>
>>> On Fri, 3 May 2013 18:33:54 +0000
>>> "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:
>>>
>>>> On Fri, 2013-05-03 at 14:24 -0400, Jeff Layton wrote:
>>>>> On Fri, 3 May 2013 13:56:13 -0400
>>>>> Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>>>>>
>>>>>>
>>>>>> On May 3, 2013, at 1:25 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>>>>>
>>>>>>> I've noticed that when running a 3.10-pre kernel that if I try to mount
>>>>>>> up a NFSv4 filesystem that it now takes ~15s for the mount to complete.
>>>>>>>
>>>>>>> Here's a little rpcdebug output:
>>>>>>>
>>>>>>> [ 3056.385078] svc: server ffff8800368fc000 waiting for data (to = 9223372036854775807)
>>>>>>> [ 3056.392056] RPC: new task initialized, procpid 2471
>>>>>>> [ 3056.392758] RPC: allocated task ffff88010cd90100
>>>>>>> [ 3056.393303] RPC: 42 __rpc_execute flags=0x1280
>>>>>>> [ 3056.393630] RPC: 42 call_start nfs4 proc SETCLIENTID (sync)
>>>>>>> [ 3056.394056] RPC: 42 call_reserve (status 0)
>>>>>>> [ 3056.394368] RPC: 42 reserved req ffff8801019f9600 xid 21ad6c40
>>>>>>> [ 3056.394783] RPC: wake_up_first(ffff88010a989990 "xprt_sending")
>>>>>>> [ 3056.395252] RPC: 42 call_reserveresult (status 0)
>>>>>>> [ 3056.395595] RPC: 42 call_refresh (status 0)
>>>>>>> [ 3056.395901] RPC: gss_create_cred for uid 0, flavor 390004
>>>>>>> [ 3056.396361] RPC: gss_create_upcall for uid 0
>>>>>>> [ 3071.396134] RPC: AUTH_GSS upcall timed out.
>>>>>>> Please check user daemon is running.
>>>>>>> [ 3071.397374] RPC: gss_create_upcall for uid 0 result -13
>>>>>>> [ 3071.398192] RPC: 42 call_refreshresult (status -13)
>>>>>>> [ 3071.398873] RPC: 42 call_refreshresult: refresh creds failed with error -13
>>>>>>> [ 3071.399881] RPC: 42 return 0, status -13
>>>>>>>
>>>>>>> The problem is that we're now trying to upcall for GSS creds to do the
>>>>>>> SETCLIENTID call, but this host isn't running rpc.gssd. Not running
>>>>>>> rpc.gssd is pretty common for people not using kerberized NFS. I think
>>>>>>> we'll see a lot of complaints about this.
>>>>>>>
>>>>>>> Is this expected?
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>> There are operations like SETCLIENTID and GETATTR(fs_locations) which should always use an integrity-checking security flavor, even if particular mount points use sec=sys.
>>>>>>
>>>>>> There are cases where GSS is not available, and we fall back to using AUTH_SYS. That should happen as quickly as possible, I agree.
>>>>>>
>>>>>>> If so, what's the proposed remedy?
>>>>>>> Simply have everyone run rpc.gssd even if they're not using kerberized NFS?
>>>>>>
>>>>>>
>>>>>> That's one possibility. Or we could shorten the upcall timeout. Or, add a mechanism by which rpc.gssd can provide a positive indication to the kernel that it is running.
>>>>>>
>>>>>> It doesn't seem like an intractable problem.
>>>>>>
>>>>>
>>>>> Nope, it's not intractable at all...
>>>>>
>>>>> Currently, the gssd upcall uses the RPC_PIPE_WAIT_FOR_OPEN flag to
>>>>> allow you to queue upcalls to be processed when the daemon isn't up
>>>>> yet. When the daemon starts, it processes that queue. The caller gives
>>>>> up after 15s (which is what's happening here), and the upcall
>>>>> eventually gets scraped out of the queue after 30s.
>>>>>
>>>>> We could stop using that flag on this rpc_pipe and simply require that
>>>>> the daemon be up and running before attempting any sort of AUTH_GSS
>>>>> rpc. That might be a little less friendly in the face of boot-time
>>>>> ordering problems, but it should presumably make this problem go away.
>>>>
>>>> You probably don't want to do that... The main reason for the
>>>> RPC_PIPE_WAIT_FOR_OPEN is that even if the gssd daemon is running, it
>>>> takes it a moment or two to notice that a new client directory has been
>>>> created, and that there is a new 'krb' pipe to attach to.
>>>>
>>>
>>> Ok yeah, good point...
>>>
>>> Shortening the timeout will also suck -- that'll just reduce the pain
>>> somewhat but will still be a performance regression. It looks like even
>>> specifying '-o sec=sys' doesn't disable this behavior. Should it?
>>
>> Nope.
>>
>> We should always use krb5i if a GSS context can be established with our machine cred. As I said before, SETCLIENTID and GETATTR(fs_locations) really should use an integrity-protecting security flavor no matter what flavor is in effect on the mount points themselves.
>>
>>> Instead of using AUTH_GSS for SETCLIENTID by default, would it make
>>> sense to add a switch (module parm?) that turns it on so that it can be
>>> an opt-in thing rather than doing this by default?
>>
>> Why add another tunable when we really should just fix the delay?
>>
>
> Because just shortening the delay will still leave you with a delay.
> Less people might notice and complain if it's shorter, but it'll still
> be there. It'll be particularly annoying with autofs...
>
> You also run the risk of hitting the problem Trond mentioned if you
> shorten it too much (timing out the upcall before gssd's duty cycle has
> a chance to get to it).

So what about taking one of the other approaches I mentioned?

>
>> Besides, if gssd is running and no keytab exists, then the fallback to AUTH_SYS should be fast. Is that not an effective workaround until we address the delay problem?
>>
>
> Yep, no problem if gssd is running. I'm concerned about the common case
> where it isn't. The expectation in the past has always been that if you
> weren't running kerberized NFS that you didn't need to run gssd. That
> has now changed and if you don't want to suffer a delay when mounting
> (however short it eventually is) then you need to run it.

Why are you assuming this is a permanent change?

> Might it make sense to introduce this change more gradually? Somehow
> warn people who aren't running gssd that they ought to start turning it
> on before we do this by default?

I don't expect this issue to last for release after release. A moment ago you agreed that this shouldn't be intractable, so I fail to see the need to start wiring up long-term workarounds. Can't we just agree on a fix, and then get that into 3.10 as a regression fix?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
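
For illustration only, the "positive indication" option raised in the thread can be modelled in user space. The sketch below is a toy Python model, not kernel code and not a proposed patch: `UpcallPipe`, `daemon_present` and `daemon_loop` are hypothetical names invented for this example, and the only values taken from the thread are the 15s caller timeout and the -13 (EACCES) result seen in the rpcdebug output. The idea it shows is that an upcall fails immediately when the daemon has never announced itself, and only queues and waits when it has.

```python
import errno
import queue
import threading

UPCALL_TIMEOUT_S = 15.0  # caller-side wait mentioned in the thread


class UpcallPipe:
    """Toy model of an upcall pipe (hypothetical, not the kernel's rpc_pipe)."""

    def __init__(self):
        # Hypothetical "gssd is running" indication set by the daemon.
        self.daemon_present = threading.Event()
        self.pending = queue.Queue()

    def upcall(self, uid, timeout=UPCALL_TIMEOUT_S):
        # Fail fast: nothing is queued if the daemon never announced itself.
        if not self.daemon_present.is_set():
            return -errno.EACCES
        reply = queue.Queue(maxsize=1)
        self.pending.put((uid, reply))
        try:
            # The daemon is believed to be up, so waiting is worthwhile.
            return reply.get(timeout=timeout)
        except queue.Empty:
            # Same -13 result the log in the thread shows after a timeout.
            return -errno.EACCES


def daemon_loop(pipe, stop):
    """Stand-in for the user daemon: announce presence, then serve upcalls."""
    pipe.daemon_present.set()
    while not stop.is_set():
        try:
            uid, reply = pipe.pending.get(timeout=0.05)
        except queue.Empty:
            continue
        reply.put(0)  # pretend a context was established for this uid
```

In this model a caller on a host without the daemon gets -13 straight away and can fall back to AUTH_SYS, while a caller on a host where the daemon is merely slow to pick up a new pipe still benefits from the queue-and-wait behaviour that Trond describes.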