Re: [PATCH] NFS: avoid deadlock in nfs_kill_super

On Thu, 2012-10-25 at 14:02 -0400, Weston Andros Adamson wrote:
> Calling nfs_kill_super from an RPC task callback would result in a deadlock
> where nfs_free_server (via rpc_shutdown_client) tries to kill all
> RPC tasks associated with that connection - including itself!
> 
> Instead of calling nfs_kill_super directly, queue a job on the nfsiod
> workqueue.
> 
> Signed-off-by: Weston Andros Adamson <dros@xxxxxxxxxx>
> ---
> 
> This fixes the current incarnation of the lockup I've been tracking down for
> some time now.  I still have to go back and see why the reproducer changed
> behavior a few weeks ago - tasks used to get stuck in rpc_prepare_task, but
> now (before this patch) are stuck in rpc_exit.
> 
> The reproducer works against a server with write delegations:
> 
> ./nfsometer.py -m v4 server:/path dd_100m_100k
> 
> which is basically:
>  - mount
>  - dd if=/dev/zero of=./dd_file.100m_100k bs=102400 count=1024
>  - umount
>  - break if /proc/fs/nfsfs/servers still has entry after 5 seconds (in this
>    case it NEVER goes away)
> 
> There are clearly other ways to trigger this deadlock, like a v4.1 CLOSE - the
> done handler calls nfs_sb_deactive...
> 
> I've tested this approach with 10 runs x 3 NFS versions x 5 workloads
> (dd_100m_100k, dd_100m_1k, python, kernel, cthon), so I'm pretty confident
> it's correct.
> 
> One question for the list: should nfs_free_server *always* be scheduled on
> the nfsiod workqueue? It's called in error paths in several locations.
> After looking at them, I don't think my approach would break anything, but 
> some might have style objections.
> 

This doesn't add up. Nothing should be calling nfs_sb_deactive() from an
rpc_call_done() callback. If something does, then that is the bug.

All calls to things like rpc_put_task(), put_nfs_open_context(), dput(),
or nfs_sb_deactive() should occur in the rpc_call_release() callback if
they can't be done in a process context. In both cases (process context
or rpc_call_release()), the rpc_task will be invisible to
rpc_killall_tasks and rpc_shutdown_client.
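
To illustrate the ordering argument, here is a toy userspace model in C.
It is not kernel code: every toy_* name is hypothetical, and the only
thing it assumes is what is stated above, namely that the task is still
on the client's task list while the done callback runs and is already
off it by the time the release callback runs. A shutdown that waits for
the list to drain would therefore wait on the caller itself if issued
from the done callback, but not if issued from release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TASKS 8

struct toy_task;

/* Hypothetical stand-in for a pair of per-call callbacks. */
struct toy_call_ops {
	void (*call_done)(struct toy_task *); /* task still on client list */
	void (*release)(struct toy_task *);   /* task already unlinked */
};

/* Hypothetical stand-in for a client that tracks its live tasks. */
struct toy_client {
	struct toy_task *tasks[TOY_MAX_TASKS];
	size_t ntasks;
};

struct toy_task {
	struct toy_client *clnt;
	const struct toy_call_ops *ops;
	bool blocked_in_done;    /* would a shutdown from call_done wait? */
	bool blocked_in_release; /* would a shutdown from release wait? */
};

static void toy_client_add(struct toy_client *c, struct toy_task *t)
{
	assert(c->ntasks < TOY_MAX_TASKS);
	c->tasks[c->ntasks++] = t;
}

static void toy_client_remove(struct toy_client *c, struct toy_task *t)
{
	for (size_t i = 0; i < c->ntasks; i++) {
		if (c->tasks[i] == t) {
			/* Fill the hole with the last entry. */
			c->tasks[i] = c->tasks[--c->ntasks];
			return;
		}
	}
}

/* A shutdown that drains the list must wait while any task is listed. */
static bool toy_shutdown_would_block(const struct toy_client *c)
{
	return c->ntasks > 0;
}

/* Probe callbacks: record what a shutdown issued right here would see. */
static void toy_probe_done(struct toy_task *t)
{
	t->blocked_in_done = toy_shutdown_would_block(t->clnt);
}

static void toy_probe_release(struct toy_task *t)
{
	t->blocked_in_release = toy_shutdown_would_block(t->clnt);
}

static const struct toy_call_ops toy_probe_ops = {
	.call_done = toy_probe_done,
	.release   = toy_probe_release,
};

/* Assumed lifecycle: link, call_done, unlink, release. */
static void toy_run_task(struct toy_task *t)
{
	toy_client_add(t->clnt, t);
	if (t->ops->call_done)
		t->ops->call_done(t);
	toy_client_remove(t->clnt, t);
	if (t->ops->release)
		t->ops->release(t);
}
```

To try it, zero-initialize a struct toy_client, point a struct toy_task
at it and at toy_probe_ops, and call toy_run_task() from a main().
Afterwards blocked_in_done is set and blocked_in_release is clear, which
is the difference between the two callbacks that matters here.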

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com


