Re: nfsd deadlock, 2.6.36-rc3

Neil Brown <neilb@xxxxxxx> · Thu, 2 Sep 2010 06:55:51 +1000

On Wed, 1 Sep 2010 12:54:01 -0400
"J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Wed, Sep 01, 2010 at 09:39:55AM -0600, Tim Gardner wrote:
> > I've been pursuing a simple reproducer for an NFS lockup that shows
> > up under stress. There is a bunch of info (some of it extraneous) in
> > http://bugs.launchpad.net/bugs/561210. I can reproduce it by writing to
> > loopback-mounted NFS exports:
> > 
> > /etc/fstab: 127.0.0.1:/srv /mnt/srv nfs rw 0 2
> > /etc/exports: /srv 127.0.0.1(rw,insecure,no_subtree_check)
> > 
> > See the attached scripts test_master.sh and test_client.sh. I simply
> > repeat './test_master.sh wait' until nfsd locks up, typically within
> > 1-3 cycles, e.g.,
> 
> Without looking at the dmesg and scripts carefully to confirm, one
> possible explanation is a deadlock when the server can't allocate memory
> required to service client requests, memory which the client itself
> needs to free by writing back dirty pages, but can't because the server
> isn't processing its writes.

Having looked closely, I'd say it is almost certainly this issue.
nfsd thread 1266 is in zone_reclaim, waiting for a page to be written out so
that the memory can be reused.
The other nfsd threads are blocked on a mutex held by 1266.
The dd processes are waiting for pages to be written to the server.

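The particular page that 1266 is waiting on is almost certainly a page of an
NFS file. Writing it out needs an nfsd thread to service the write, and every
nfsd thread is stuck behind 1266, so you have a cyclic deadlock.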
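
To make the cycle explicit, here is a rough sketch of the wait-for
relationships as I read them from the trace, with a small depth-first search
that looks for a loop. The node names are just illustrative labels (not kernel
identifiers), and the edge from the dirty page back to the nfsd threads is the
"almost certainly" assumption above, not something the trace proves.

```python
# Wait-for graph: each key is blocked until one of its listed values makes progress.
# All names are hypothetical labels for the actors described in this thread.
waits_for = {
    "nfsd-1266": ["dirty-nfs-page"],          # in zone_reclaim, waiting for writeback
    "dirty-nfs-page": ["nfs-client-write"],   # assumed: the page belongs to an NFS file
    "nfs-client-write": ["nfsd-1266", "other-nfsd"],  # needs an nfsd thread to service it
    "other-nfsd": ["nfsd-1266"],              # blocked on the mutex held by 1266
    "dd": ["nfs-client-write"],               # waiting for its pages to reach the server
}


def find_cycle(graph):
    """Return one cycle as a list of nodes (first == last), or None if acyclic."""
    path = []        # nodes on the current DFS path, in order
    finished = set()  # nodes fully explored without finding a cycle

    def dfs(node):
        if node in finished:
            return None
        if node in path:
            # Found a back edge: slice out the loop and close it.
            return path[path.index(node):] + [node]
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        path.pop()
        finished.add(node)
        return None

    for start in graph:
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    cycle = find_cycle(waits_for)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Drop the "dirty-nfs-page" edge (i.e. assume the page belongs to a local file)
and the search finds no loop, which is why the same load does not wedge a
server whose clients are on other machines.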

> 
> For that reason we just don't support loopback mounts--they're OK for
> light testing, but it would be difficult to make them completely robust
> under load.

I wonder whether we could use 'containers' to partition available memory
between 'nfsd threads' and 'everything else'?  Probably not worth the effort.

NeilBrown
