On Nov 25, 2013, at 3:29 PM, "Adamson, Andy" <William.Adamson@xxxxxxxxxx> wrote:

>
> On Nov 25, 2013, at 3:20 PM, "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx>
> wrote:
>
>>
>> On Nov 25, 2013, at 15:10, Adamson, Andy <William.Adamson@xxxxxxxxxx> wrote:
>>
>>>
>>> On Nov 25, 2013, at 2:53 PM, "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx>
>>> wrote:
>>>
>>>>
>>>> On Nov 25, 2013, at 14:27, Adamson, Andy <William.Adamson@xxxxxxxxxx> wrote:
>>>>
>>>>>
>>>>> On Nov 25, 2013, at 1:33 PM, "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Nov 25, 2013, at 13:13, Myklebust, Trond <Trond.Myklebust@xxxxxxxxxx> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Nov 25, 2013, at 12:57, <andros@xxxxxxxxxx> <andros@xxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> From: Andy Adamson <andros@xxxxxxxxxx>
>>>>>>>>
>>>>>>>> The state manager is recovering expired state and recovery OPENs are being
>>>>>>>> processed. If kswapd is pruning inodes at the same time, a deadlock can occur
>>>>>>>> when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the
>>>>>>>> resultant layoutreturn gets an error that the state manager is to handle,
>>>>>>>> causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq.
>>>>>>>>
>>>>>>>> At the same time an open is waiting for the inode deletion to complete in
>>>>>>>> __wait_on_freeing_inode.
>>>>>>>>
>>>>>>>> If the open is either the open called by the state manager, or an open from
>>>>>>>> the same open owner that is holding the NFSv4.0 sequence id which causes the
>>>>>>>> OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue,
>>>>>>>> then the state is deadlocked with kswapd.
>>>>>>>>
>>>>>>>> Do not handle LAYOUTRETURN errors when called from nfs4_evict_inode.
>>>>>>>
>>>>>>> Why are we waiting for recovery in LAYOUTRETURN at all? Layouts are automatically lost when the server reboots or when the lease is otherwise lost.
>>>>>>>
>>>>>>> IOW: Is there any reason why we need to special-case nfs4_evict_inode? Shouldn’t we just bail out on error in _all_ cases?
>>>>>>
>>>>>> BTW: Is it possible that we might have a similar problem with delegreturn? That too can be called from nfs4_evict_inode…
>>>>>
>>>>> Yes, good point. kswapd could be waiting for a delegation to return which has an error along with the same scenario with sys_open and the state manager running.
>>>>>
>>>>> With delegreturn, we most definitely want to limit 'no error handling' to the evict inode case.
>>>>
>>>> Ah… I forgot that the delegreturn in nfs4_evict_inode is asynchronous and doesn’t wait for completion, so it shouldn’t be a problem here.
>>>
>>> Except we just changed that to fix a different state manager hang:
>>>
>>> commit 4a82fd7c4e78a1b7a224f9ae8bb7e1fd95f670e0
>>> Author: Andy Adamson <andros@xxxxxxxxxx>
>>> Date: Fri Nov 15 16:36:16 2013 -0500
>>>
>>> NFSv4 wait on recovery for async session errors
>>
>> Right, but that won’t prevent nfs4_evict_inode from completing,
>
> Ah - I was thinking of the synchronous handler's call to nfs4_wait_clnt_recover - so yes, no problem

In fact, this issue is NOT an upstream issue! RHEL6.5-pre has nfs4_proc_layoutreturn as a SYNC rpc call, and _that_ is the bug that is fixed upstream. Really sorry for the confusion. I'll backport a solution for RHEL6.5

-->Andy

>
> -->Andy
>
>> and hence the OPEN that is waiting in nfs_fhget() can also complete, and so there is no deadlock with the state manager thread.
>
>>
>> Cheers
>> Trond
>> --
>> Trond Myklebust
>> Linux NFS client maintainer
>>
>> NetApp
>> Trond.Myklebust@xxxxxxxxxx
>> www.netapp.com
>>
>