On Mon, 2013-11-25 at 20:29 +0000, Adamson, Andy wrote:
> On Nov 25, 2013, at 3:20 PM, "Myklebust, Trond" <Trond.Myklebust@netapp.com>
> wrote:
>
> >
> > On Nov 25, 2013, at 15:10, Adamson, Andy <William.Adamson@netapp.com> wrote:
> >
> >>
> >> On Nov 25, 2013, at 2:53 PM, "Myklebust, Trond" <Trond.Myklebust@netapp.com>
> >> wrote:
> >>
> >>>
> >>> On Nov 25, 2013, at 14:27, Adamson, Andy <William.Adamson@netapp.com> wrote:
> >>>
> >>>>
> >>>> On Nov 25, 2013, at 1:33 PM, "Myklebust, Trond" <Trond.Myklebust@netapp.com>
> >>>> wrote:
> >>>>
> >>>>>
> >>>>> On Nov 25, 2013, at 13:13, Myklebust, Trond <Trond.Myklebust@netapp.com> wrote:
> >>>>>
> >>>>>>
> >>>>>> On Nov 25, 2013, at 12:57, <andros@netapp.com> <andros@netapp.com> wrote:
> >>>>>>
> >>>>>>> From: Andy Adamson <andros@netapp.com>
> >>>>>>>
> >>>>>>> The state manager is recovering expired state and recovery OPENs are being
> >>>>>>> processed. If kswapd is pruning inodes at the same time, a deadlock can occur
> >>>>>>> when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the
> >>>>>>> resultant layoutreturn gets an error that the state manager is to handle,
> >>>>>>> causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq.
> >>>>>>>
> >>>>>>> At the same time an open is waiting for the inode deletion to complete in
> >>>>>>> __wait_on_freeing_inode.
> >>>>>>>
> >>>>>>> If the open is either the open called by the state manager, or an open from
> >>>>>>> the same open owner that is holding the NFSv4.0 sequence id which causes the
> >>>>>>> OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue,
> >>>>>>> then the state is deadlocked with kswapd.
> >>>>>>>
> >>>>>>> Do not handle LAYOUTRETURN errors when called from nfs4_evict_inode.
> >>>>>>
> >>>>>> Why are we waiting for recovery in LAYOUTRETURN at all? Layouts are automatically lost when the server reboots or when the lease is otherwise lost.
> >>>>>>
> >>>>>> IOW: Is there any reason why we need to special-case nfs4_evict_inode? Shouldn’t we just bail out on error in _all_ cases?
> >>>>>
> >>>>> BTW: Is it possible that we might have a similar problem with delegreturn? That too can be called from nfs4_evict_inode…
> >>>>
> >>>> Yes, good point. kswapd could be waiting for a delegation to return which has an error along with the same scenario with sys_open and the state manager running.
> >>>>
> >>>> With delegreturn, we most definitely want to limit 'no error handling' to the evict inode case.
> >>>
> >>> Ah… I forgot that the delegreturn in nfs4_evict_inode is asynchronous and doesn’t wait for completion, so it shouldn’t be a problem here.
> >>
> >> Except we just changed that to fix a different state manager hang:
> >>
> >> commit 4a82fd7c4e78a1b7a224f9ae8bb7e1fd95f670e0
> >> Author: Andy Adamson <andros@netapp.com>
> >> Date: Fri Nov 15 16:36:16 2013 -0500
> >>
> >> NFSv4 wait on recovery for async session errors
> >
> >
> > Right, but that won’t prevent nfs4_evict_inode from completing,
>
> Ah - I was thinking of the synchronous handlers call to nfs4_wait_clnt_recover - so yes, no problem
>
> -->Andy
>
> >
> > and hence the OPEN that is waiting in nfs_fhget() can also complete, and so there is no deadlock with the state manager thread.

How about something like the attached...

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index f01e2aa53210..e040359983ce 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7599,7 +7599,14 @@ static void nfs4_layoutreturn_done(struct rpc_task *task, void *calldata)
 		return;
 
 	server = NFS_SERVER(lrp->args.inode);
-	if (nfs4_async_handle_error(task, server, NULL) == -EAGAIN) {
+	switch (task->tk_status) {
+	default:
+		task->tk_status = 0;
+	case 0:
+		break;
+	case -NFS4ERR_DELAY:
+		if (nfs4_async_handle_error(task, server, NULL) != -EAGAIN)
+			break;
 		rpc_restart_call_prepare(task);
 		return;
 	}
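
For readers who want to trace the control flow of the proposed switch outside the kernel, here is a minimal userspace sketch. It is not kernel code: struct lr_task, its can_retry field, model_async_handle_error() and model_layoutreturn_done() are hypothetical stand-ins, and the numeric error values are copied from the NFSv4 protocol definitions only for illustration. The intent is to show that only a DELAY error leads to a restart, while any other error is cleared so the caller never waits on recovery.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Protocol error numbers, repeated here so the sketch is self-contained. */
#define MODEL_NFS4ERR_DELAY      10008
#define MODEL_NFS4ERR_BADSESSION 10052

/* Hypothetical stand-in for the one rpc_task field the switch inspects. */
struct lr_task {
	int  tk_status;  /* 0 on success, negative NFSv4 error otherwise */
	bool can_retry;  /* assumption: whether the error handler asks for a retry */
};

enum lr_action { LR_DONE, LR_RESTART };

/*
 * Hypothetical model of the async error handler: for a DELAY error it
 * clears the status and returns -EAGAIN when a retry is wanted.
 */
static int model_async_handle_error(struct lr_task *task)
{
	if (task->tk_status == -MODEL_NFS4ERR_DELAY && task->can_retry) {
		task->tk_status = 0;
		return -EAGAIN;
	}
	return 0;
}

/* Mirrors the shape of the switch in the patch above. */
static enum lr_action model_layoutreturn_done(struct lr_task *task)
{
	assert(task != NULL);

	switch (task->tk_status) {
	default:
		/* Any other error: drop it, do not wait for state recovery. */
		task->tk_status = 0;
		/* fall through */
	case 0:
		break;
	case -MODEL_NFS4ERR_DELAY:
		if (model_async_handle_error(task) != -EAGAIN)
			break;
		/* In the kernel this is where the RPC call would be restarted. */
		return LR_RESTART;
	}
	return LR_DONE;
}
```

Feeding the model a task with tk_status set to -MODEL_NFS4ERR_BADSESSION returns LR_DONE with the status cleared, which corresponds to the case where evict_inode is no longer blocked behind the state manager.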