[PATCH v2 3/4] pNFS: Handle NFS4ERR_RECALLCONFLICT correctly in LAYOUTGET

Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> · Tue, 19 Jul 2016 16:33:56 -0400

Instead of giving up altogether and falling back to doing I/O
through the MDS, which may make the situation worse, wait for
2 lease periods for the callback to resolve itself, and then
try destroying the existing layout.

Only if this was an attempt at getting a first layout, do we
give up altogether, as the server is clearly crazy.

Fixes: 183d9e7b112aa ("pnfs: rework LAYOUTGET retry handling")
Cc: stable@xxxxxxxxxxxxxxx # 4.7
Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
 fs/nfs/pnfs.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index c50d4ebab5c5..7d992362ff04 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1505,7 +1505,7 @@ pnfs_update_layout(struct inode *ino,
 	struct pnfs_layout_segment *lseg = NULL;
 	nfs4_stateid stateid;
 	long timeout = 0;
-	unsigned long giveup = jiffies + rpc_get_timeout(server->client);
+	unsigned long giveup = jiffies + (clp->cl_lease_time << 1);
 	bool first;
 
 	if (!pnfs_enabled_sb(NFS_SERVER(ino))) {
@@ -1649,9 +1649,18 @@ lookup_again:
 	if (IS_ERR(lseg)) {
 		switch(PTR_ERR(lseg)) {
 		case -EBUSY:
-		case -ERECALLCONFLICT:
 			if (time_after(jiffies, giveup))
 				lseg = NULL;
+			break;
+		case -ERECALLCONFLICT:
+			/* Huh? We hold no layouts, how is there a recall? */
+			if (first) {
+				lseg = NULL;
+				break;
+			}
+			/* Destroy the existing layout and start over */
+			if (time_after(jiffies, giveup))
+				pnfs_destroy_layout(NFS_I(ino));
 			/* Fallthrough */
 		case -EAGAIN:
 			break;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html