On Fri, 2013-01-18 at 20:27 -0500, Chuck Lever wrote: > On Jan 18, 2013, at 8:01 PM, "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote: > > > On Fri, 2013-01-18 at 19:44 -0500, Trond Myklebust wrote: > >> On Fri, 2013-01-18 at 18:14 -0500, Chuck Lever wrote: > >>> On Jan 18, 2013, at 5:59 PM, "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote: > >>> > >>>> On Fri, 2013-01-18 at 17:03 -0500, an unknown sender wrote: > >>>>> On Fri, 2013-01-18 at 16:33 -0500, Chuck Lever wrote: > >>>>>> On Jan 18, 2013, at 4:28 PM, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote: > >>>>>> > >>>>>>> Any chance the STALE_CLIENTID case needs a 'break'? > >>>>>> > >>>>>> I don't think so. LEASE_CONFIRM is set, and we want to wake the state renewal thread. > >>>>>> > >>>>>>> > >>>>>>> Twice I've seen kernel crashes after the nfs40_walk_client_list > >>>>>>> failed (though code comments say it should never fail). > >>>>>> > >>>>>> nfs40_walk_client_list() is looking for an nfs_client that is supposed to already be in the nfs_client list. If the search fails, that's a bug. > >>>>>> > >>>>>> Eyeball the contents of your nfs_client list. You should find an appropriate nfs_client in there, and then figure out why the search doesn't find it. > >>>>> > >>>>> You have considered the fact that the call to > >>>>> nfs4_proc_setclientid_confirm can potentially return > >>>>> NFS4ERR_STALE_CLIENTID if the server rebooted while the client was > >>>>> walking the list? > >>>> > >>>> In fact, as far as I can see, the correct behaviour in > >>>> nfs40_discover_server_trunking() should be to re-issue the setclientid > >>>> call, and then walk the list again if nfs40_walk_client_list() returns > >>>> NFS4ERR_STALE_CLIENTID. > >>> > >>> When I wrote the server trunking detection logic, I think we hadn't clearly decided what needed to be done in the STALE_CLIENTID case. > >>> > >>>> Something like the attached patch: > >>> > >>> A couple of comments: > >>> > >>> o nfs_get_client() already sticks the new client on the tail of the nfs_client list > >> > >> OK. That allows us to get rid of 5-6 lines of code. > >> > >>> o We don't want to get stuck in a loop here. Should the "do {}" loop in nfs40_discover_server_trunking() be bounded by a retry count? > >> > >> Not a number of retries, but possibly a timeout value. > >> > >>> However, I haven't heard Ben say "oh, yes, my server had rebooted." I'd like some confirmation that the match failed for an explainable and expected reason. > >> > >> The spec allows the server to discard unconfirmed clientids, so our code > >> should take that into account. > >> > >> > >> BTW: I'm going insane trying to figure out how the refcounting in that > >> code is supposed to work. The assumption that we can somehow safely > >> return a non-referenced pointer from nfs4_discover_server_trunking and > >> then bump the reference count in nfs4_init_client() drives me nuts... > > > > Attached is v2 of the patch that addresses all of the above. > > At first glance, it looks plausible. Well done. Perhaps the reference counting fix should be split into a separate patch. > > This: > > "If walking the list in nfs40_walk_client_list fails, then the most > likely explanation is that the server rebooted." > > I think in Ben's case the server may be dropping the unconfirmed client ID because it is so busy and the client is probably dealing with a thousand or more mount points, which causes trunking detection to take a lot of time. I'd still like to see real evidence of this, but I can't think of any easy way for Ben to provide it. > > The "No match found." comment is good. > > Then: > > "Ensure that we catch that case by testing our new nfs_client last." > > "testing the new nfs_client last" is the way it's supposed to work -- that's why nfs_get_client() uses list_add_tail() now to insert the new client. Adding the new client to the client list is supposed to guarantee that walk_client_list always finds an appropriate nfs_client on the list. > > So you're not actually changing the order of the tests. What you're really changing is the recovery behavior in the "our new clientid is already stale" case. The patch description is probably inaccurate in this regard. > > Do we need a similar change for STALE_CLIENTID in the NFSv4.1 path? > > Finally, because you are replacing that bogus switch statement in nfs40_discover_server_trunking(), you are fixing Ben's crash. His GPF back trace should be included in your patch description if you are sending this to stable. The Oops is just a symptom. The underlying problem here is the client side spec violation. See attachments for the v3 patches... -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@xxxxxxxxxx www.netapp.com
From 79e6b226aa8da2fe37974c4d0915ddc78b6cdd11 Mon Sep 17 00:00:00 2001 From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Date: Fri, 18 Jan 2013 22:41:53 -0500 Subject: [PATCH v3 1/3] NFSv4: Fix NFSv4 reference counting for trunked sessions The reference counting in nfs4_init_client assumes wongly that it is safe for nfs4_discover_server_trunking() to return a pointer to a nfs_client prior to bumping the reference count. Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Cc: Chuck Lever <chuck.lever@xxxxxxxxxx> Cc: Ben Greear <greearb@xxxxxxxxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx [>=3.7] --- fs/nfs/nfs4client.c | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c index acc3472..65a290a 100644 --- a/fs/nfs/nfs4client.c +++ b/fs/nfs/nfs4client.c @@ -236,11 +236,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp, error = nfs4_discover_server_trunking(clp, &old); if (error < 0) goto error; + nfs_put_client(clp); if (clp != old) { clp->cl_preserve_clid = true; - nfs_put_client(clp); clp = old; - atomic_inc(&clp->cl_count); } return clp; @@ -306,7 +305,7 @@ int nfs40_walk_client_list(struct nfs_client *new, .clientid = new->cl_clientid, .confirm = new->cl_confirm, }; - int status; + int status = -NFS4ERR_STALE_CLIENTID; spin_lock(&nn->nfs_client_lock); list_for_each_entry_safe(pos, n, &nn->nfs_client_list, cl_share_link) { @@ -332,28 +331,28 @@ int nfs40_walk_client_list(struct nfs_client *new, if (prev) nfs_put_client(prev); + prev = pos; status = nfs4_proc_setclientid_confirm(pos, &clid, cred); - if (status == 0) { + switch (status) { + case -NFS4ERR_STALE_CLIENTID: + break; + case 0: nfs4_swap_callback_idents(pos, new); - nfs_put_client(pos); + prev = NULL; *result = pos; dprintk("NFS: <-- %s using nfs_client = %p ({%d})\n", __func__, pos, atomic_read(&pos->cl_count)); - return 0; - } - if (status != -NFS4ERR_STALE_CLIENTID) { - nfs_put_client(pos); - dprintk("NFS: <-- %s status = %d, no result\n", - __func__, status); - return status; + default: + goto out; } spin_lock(&nn->nfs_client_lock); - prev = pos; } + spin_unlock(&nn->nfs_client_lock); +out: /* * No matching nfs_client found. This should be impossible, * because the new nfs_client has already been added to @@ -363,9 +362,8 @@ int nfs40_walk_client_list(struct nfs_client *new, */ if (prev) nfs_put_client(prev); - spin_unlock(&nn->nfs_client_lock); - pr_err("NFS: %s Error: no matching nfs_client found\n", __func__); - return -NFS4ERR_STALE_CLIENTID; + dprintk("NFS: <-- %s status = %d\n", __func__, status); + return status; } #ifdef CONFIG_NFS_V4_1 @@ -473,6 +471,7 @@ int nfs41_walk_client_list(struct nfs_client *new, if (!nfs4_match_serverowners(pos, new)) continue; + atomic_inc(&pos->cl_count); spin_unlock(&nn->nfs_client_lock); dprintk("NFS: <-- %s using nfs_client = %p ({%d})\n", __func__, pos, atomic_read(&pos->cl_count)); -- 1.8.1
From 0c66c02db14b5dbeec5206dafcb45f38f9d4b43c Mon Sep 17 00:00:00 2001 From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Date: Fri, 18 Jan 2013 22:56:23 -0500 Subject: [PATCH v3 2/3] NFSv4: Fix NFSv4 trunking discovery If walking the list in nfs4[01]_walk_client_list fails, then the most likely explanation is that the server dropped the clientid before we actually managed to confirm it. As long as our nfs_client is the very last one in the list to be tested, the caller can be assured that this is the case when the final return value is NFS4ERR_STALE_CLIENTID. Reported-by: Ben Greear <greearb@xxxxxxxxxxxxxxx> Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Cc: Chuck Lever <chuck.lever@xxxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx [>=3.7] --- fs/nfs/nfs4client.c | 26 +++++++------------------- fs/nfs/nfs4state.c | 8 ++------ 2 files changed, 9 insertions(+), 25 deletions(-) diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c index 65a290a..2f21f17 100644 --- a/fs/nfs/nfs4client.c +++ b/fs/nfs/nfs4client.c @@ -352,14 +352,8 @@ int nfs40_walk_client_list(struct nfs_client *new, } spin_unlock(&nn->nfs_client_lock); + /* No match found. The server lost our clientid */ out: - /* - * No matching nfs_client found. This should be impossible, - * because the new nfs_client has already been added to - * nfs_client_list by nfs_get_client(). - * - * Don't BUG(), since the caller is holding a mutex. - */ if (prev) nfs_put_client(prev); dprintk("NFS: <-- %s status = %d\n", __func__, status); @@ -430,7 +424,7 @@ int nfs41_walk_client_list(struct nfs_client *new, { struct nfs_net *nn = net_generic(new->cl_net, nfs_net_id); struct nfs_client *pos, *n, *prev = NULL; - int error; + int status = -NFS4ERR_STALE_CLIENTID; spin_lock(&nn->nfs_client_lock); list_for_each_entry_safe(pos, n, &nn->nfs_client_list, cl_share_link) { @@ -446,8 +440,8 @@ int nfs41_walk_client_list(struct nfs_client *new, nfs_put_client(prev); prev = pos; - error = nfs_wait_client_init_complete(pos); - if (error < 0) { + status = nfs_wait_client_init_complete(pos); + if (status < 0) { nfs_put_client(pos); spin_lock(&nn->nfs_client_lock); continue; @@ -480,16 +474,10 @@ int nfs41_walk_client_list(struct nfs_client *new, return 0; } - /* - * No matching nfs_client found. This should be impossible, - * because the new nfs_client has already been added to - * nfs_client_list by nfs_get_client(). - * - * Don't BUG(), since the caller is holding a mutex. - */ + /* No matching nfs_client found. */ spin_unlock(&nn->nfs_client_lock); - pr_err("NFS: %s Error: no matching nfs_client found\n", __func__); - return -NFS4ERR_STALE_CLIENTID; + dprintk("NFS: <-- %s status = %d\n", __func__, status); + return status; } #endif /* CONFIG_NFS_V4_1 */ diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 9448c57..f72561ca 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -136,16 +136,11 @@ int nfs40_discover_server_trunking(struct nfs_client *clp, clp->cl_confirm = clid.confirm; status = nfs40_walk_client_list(clp, result, cred); - switch (status) { - case -NFS4ERR_STALE_CLIENTID: - set_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state); - case 0: + if (status == 0) { /* Sustain the lease, even if it's empty. If the clientid4 * goes stale it's of no use for trunking discovery. */ nfs4_schedule_state_renewal(*result); - break; } - out: return status; } @@ -1863,6 +1858,7 @@ again: case -ETIMEDOUT: case -EAGAIN: ssleep(1); + case -NFS4ERR_STALE_CLIENTID: dprintk("NFS: %s after status %d, retrying\n", __func__, status); goto again; -- 1.8.1
From 8102b6146af9a071457e371da4e9516156694880 Mon Sep 17 00:00:00 2001 From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Date: Fri, 18 Jan 2013 23:01:43 -0500 Subject: [PATCH v3 3/3] NFSv4.1: Ensure that nfs41_walk_client_list() does start lease recovery We do need to start the lease recovery thread prior to waiting for the client initialisation to complete in NFSv4.1. Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Cc: Chuck Lever <chuck.lever@xxxxxxxxxx> Cc: Ben Greear <greearb@xxxxxxxxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx [>=3.7] --- fs/nfs/nfs4client.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c index 2f21f17..2e9779b 100644 --- a/fs/nfs/nfs4client.c +++ b/fs/nfs/nfs4client.c @@ -440,14 +440,17 @@ int nfs41_walk_client_list(struct nfs_client *new, nfs_put_client(prev); prev = pos; + nfs4_schedule_lease_recovery(pos); status = nfs_wait_client_init_complete(pos); if (status < 0) { nfs_put_client(pos); spin_lock(&nn->nfs_client_lock); continue; } - + status = pos->cl_cons_state; spin_lock(&nn->nfs_client_lock); + if (status < 0) + continue; } if (pos->rpc_ops != new->rpc_ops) -- 1.8.1