On Thu, 2012-10-25 at 14:02 -0400, Weston Andros Adamson wrote:
> Calling nfs_kill_super from an RPC task callback would result in a deadlock
> where nfs_free_server (via rpc_shutdown_client) tries to kill all
> RPC tasks associated with that connection - including itself!
>
> Instead of calling nfs_kill_super directly, queue a job on the nfsiod
> workqueue.
>
> Signed-off-by: Weston Andros Adamson <dros@xxxxxxxxxx>
> ---
>
> This fixes the current incarnation of the lockup I've been tracking down for
> some time now. I still have to go back and see why the reproducer changed
> behavior a few weeks ago - tasks used to get stuck in rpc_prepare_task, but
> now (before this patch) are stuck in rpc_exit.
>
> The reproducer works against a server with write delegations:
>
>   ./nfsometer.py -m v4 server:/path dd_100m_100k
>
> which is basically:
>  - mount
>  - dd if=/dev/zero of=./dd_file.100m_100k bs=102400 count=1024
>  - umount
>  - break if /proc/fs/nfsfs/servers still has entry after 5 seconds (in this
>    case it NEVER goes away)
>
> There are clearly other ways to trigger this deadlock, like a v4.1 CLOSE - the
> done handler calls nfs_sb_deactivate...
>
> I've tested this approach with 10 runs X 3 nfs versions X 5 workloads
> (dd_100m_100k, dd_100m_1k, python, kernel, cthon), so I'm pretty confident
> it's correct.
>
> One question for the list: should nfs_free_server *always* be scheduled on
> the nfsiod workqueue? It's called in error paths in several locations.
> After looking at them, I don't think my approach would break anything, but
> some might have style objections.
>

This doesn't add up. There should be nothing calling nfs_sb_deactive()
from an rpc_call_done() callback. If so, then that would be the bug.

All calls to things like rpc_put_task(), put_nfs_open_context(), dput(),
or nfs_sb_deactive() should occur in the rpc_call_release() callback if
they can't be done in a process context.
In both those cases, the rpc_task will be invisible to rpc_killall_tasks
and rpc_shutdown_client.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
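
For illustration only, the rule in the reply can be modeled in user space.
This is a rough sketch, not kernel code: every name prefixed with model_ is
hypothetical, and the ordering (done callback runs while the task is still
linked to its client, release callback runs after it has been unlinked) is
an assumption taken from the statement above that the task is invisible to
rpc_killall_tasks and rpc_shutdown_client by the time release runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_TASKS 8

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_task;

struct model_call_ops {
    void (*call_done)(struct model_task *task);    /* stands in for the done callback */
    void (*call_release)(struct model_task *task); /* stands in for the release callback */
};

struct model_client {
    struct model_task *tasks[MODEL_MAX_TASKS]; /* what a kill-all sweep would walk */
    size_t ntasks;
};

struct model_task {
    struct model_client *clnt;
    const struct model_call_ops *ops;
    bool done_ran, release_ran;
    bool visible_in_done, visible_in_release;
};

/* Would a kill-all sweep over the client find this task? */
bool model_task_visible(const struct model_client *clnt,
                        const struct model_task *task)
{
    for (size_t i = 0; i < clnt->ntasks; i++)
        if (clnt->tasks[i] == task)
            return true;
    return false;
}

/* Remove the task from the client's list, keeping the list compact. */
void model_unlink_task(struct model_client *clnt, struct model_task *task)
{
    for (size_t i = 0; i < clnt->ntasks; i++) {
        if (clnt->tasks[i] == task) {
            clnt->tasks[i] = clnt->tasks[--clnt->ntasks];
            return;
        }
    }
}

/* Assumed lifecycle: link, run done, unlink, then run release. */
void model_run_task(struct model_task *task)
{
    struct model_client *clnt = task->clnt;

    assert(clnt->ntasks < MODEL_MAX_TASKS);
    clnt->tasks[clnt->ntasks++] = task;

    task->ops->call_done(task);     /* task is still on the client's list */
    model_unlink_task(clnt, task);
    task->ops->call_release(task);  /* task is no longer on the list */
}

/* Callbacks that only record whether a sweep would have seen the task. */
void model_done(struct model_task *task)
{
    task->done_ran = true;
    task->visible_in_done = model_task_visible(task->clnt, task);
}

void model_release(struct model_task *task)
{
    task->release_ran = true;
    task->visible_in_release = model_task_visible(task->clnt, task);
}

const struct model_call_ops model_ops = {
    .call_done = model_done,
    .call_release = model_release,
};
```

Running model_run_task() on a task that uses model_ops records the task as
visible during the done callback and not visible during the release
callback. In this model, a teardown that sweeps the client's task list
would find the calling task only if it were started from the done callback,
which is the self-wait described in the patch description.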