On Nov 25, 2013, at 13:17, Adamson, Andy <William.Adamson@xxxxxxxxxx> wrote:

>
> On Nov 25, 2013, at 1:13 PM, "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx>
> wrote:
>
>>
>> On Nov 25, 2013, at 12:57, <andros@xxxxxxxxxx> <andros@xxxxxxxxxx> wrote:
>>
>>> From: Andy Adamson <andros@xxxxxxxxxx>
>>>
>>> The state manager is recovering expired state and recovery OPENs are being
>>> processed. If kswapd is pruning inodes at the same time, a deadlock can occur
>>> when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the
>>> resultant layoutreturn gets an error that the state manager is to handle,
>>> causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq.
>>>
>>> At the same time an open is waiting for the inode deletion to complete in
>>> __wait_on_freeing_inode.
>>>
>>> If the open is either the open called by the state manager, or an open from
>>> the same open owner that is holding the NFSv4.0 sequence id which causes the
>>> OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue,
>>> then the state is deadlocked with kswapd.
>>>
>>> Do not handle LAYOUTRETURN errors when called from nfs4_evict_inode.
>>
>> Why are we waiting for recovery in LAYOUTRETURN at all? Layouts are
>> automatically lost when the server reboots or when the lease is otherwise lost.
>>
>> IOW: Is there any reason why we need to special-case nfs4_evict_inode?
>> Shouldn’t we just bail out on error in _all_ cases?
>
> Yeah, I was thinking about this as well - perhaps recovering from
> session-level errors or grace/delay errors would be useful for the block client.

NFS4ERR_DELAY, probably, yes. NFS4ERR_GRACE, no… That’s a reboot situation.

As for session level errors, I’d say that complicates things too much, since
several of those can basically end up masking an NFS4ERR_STALE_CLIENTID error.

Either way, all the layout types (including blocks) should be able to continue
on even if we miss a layout return or two.
The server has to be coded to expect a forgetful client.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
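
For illustration only, the error policy suggested in the reply above could be sketched roughly as follows. This is not the kernel's code and not the patch under discussion: `enum lr_action` and `layoutreturn_action()` are hypothetical names invented for this sketch, and the only facts taken from elsewhere are the numeric NFSv4 status values, which follow RFC 5661. The idea is that only NFS4ERR_DELAY is treated as worth retrying, while every other error (including NFS4ERR_GRACE and the session-level errors) causes the client to forget the layout rather than wait for the state manager.

```c
/* NFSv4 status codes used in this sketch (values as assigned in RFC 5661). */
enum {
	NFS4_OK                = 0,
	NFS4ERR_DELAY          = 10008,
	NFS4ERR_GRACE          = 10013,
	NFS4ERR_STALE_CLIENTID = 10022,
	NFS4ERR_BADSESSION     = 10052,
};

/* Hypothetical outcome type for this sketch; not a kernel definition. */
enum lr_action {
	LR_DONE,	/* server accepted the LAYOUTRETURN */
	LR_RETRY,	/* transient condition: back off and resend */
	LR_FORGET,	/* drop the layout locally, never wait on recovery */
};

/*
 * Decide what to do with the status of a LAYOUTRETURN reply.
 * Assumption: the caller never blocks on state recovery, so the
 * eviction path cannot end up waiting on the state manager.
 */
static enum lr_action layoutreturn_action(int status)
{
	switch (status) {
	case NFS4_OK:
		return LR_DONE;
	case NFS4ERR_DELAY:
		/* The server is merely busy; a later resend may succeed. */
		return LR_RETRY;
	default:
		/*
		 * NFS4ERR_GRACE means the server rebooted, and session-level
		 * errors may mask NFS4ERR_STALE_CLIENTID. In both cases the
		 * layout is already lost, so the client just forgets it.
		 */
		return LR_FORGET;
	}
}
```

With such a policy the same rule applies to every caller, so nfs4_evict_inode would need no special case.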