Re: NFS client hang on attempt to do async blocking posix lock enqueue

On Fri, 8 Feb 2008, J. Bruce Fields wrote:

> On Fri, Feb 08, 2008 at 07:15:02AM -0500, Jeff Layton wrote:
> > On Thu, 7 Feb 2008 18:26:18 -0500
> > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> > 
> > > On Sun, Jan 20, 2008 at 09:58:59AM -0500, Oleg Drokin wrote:
> > > > Hello!
> > > >
> > > > On Jan 18, 2008, at 6:07 PM, J. Bruce Fields wrote:
> > > >
> > > >> On Thu, Nov 29, 2007 at 02:41:57PM -0800, Marc Eshel wrote:
> > > > >>> The problem seems to be that the client and server are on the
> > > > >>> same machine. This test works fine, with or without an underlying
> > > > >>> fs that supports locking, when the client and the server are on
> > > > >>> different machines. Like you said, the server is trying to send the
> > > > >>> grant message to the client, but for some reason it fails when the
> > > > >>> client is on the same machine.
> > > >> That *shouldn't* make a difference, so we need to take another look at
> > > >> this--Oleg, this problem is still unfixed, right?
> > > >
> > > > > Yes, I just pulled your latest nfs tree and I can still reproduce
> > > > > the problem.
> > > 
> > > OK, we have finally reproduced this problem here, and David's working on
> > > > debugging.  It does indeed seem to only be reproducible with client and
> > > server on the same machine.  Thanks for the report....
> > > 
> > > --b.
> > 
> > It might be worth testing this both with and without the patchset I
> > posted to linux-nfs recently to take care of the lockd hang. If
> > lockd is stuck trying to rpc_ping itself then it probably would hang
> > like this, wouldn't it?
> 
> Of course!  Yes, that fits.
> 
> --b.

	right on, jeff, good catch and thanks for directing my attention 
to your patches.

	i applied them on top of 2.6.23.1 and tested them on a cluster 
exporting GFS2 over NFS, using oleg's reproducer code.  your patches fix 
that lockd hang.

	in a bit more detail, oleg's reproducer basically gets a 
whole-file read lock, tests the lock, upgrades to a whole-file exclusive 
lock, tests the lock, then unlocks.  the problem was that when getting 
that exclusive lock things would hang.  this only happened when the client 
and server were on the same machine, and i could reproduce it with NFS 
exporting GFS2 but not NFS exporting EXT3.
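
	for readers without the reproducer at hand, the lock sequence 
described above can be sketched roughly as follows.  this is not oleg's 
actual code, only a minimal illustration under some assumptions: the 
names run_lock_sequence and whole_file_lock are made up for this sketch, 
the locks are taken with blocking F_SETLKW, and "tests the lock" is 
assumed to mean an F_GETLK query.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue a whole-file POSIX lock request.
 * cmd is F_SETLKW (blocking set) or F_GETLK (query); type is
 * F_RDLCK, F_WRLCK or F_UNLCK.  If out is non-NULL, the struct
 * flock as left by the kernel is copied back to the caller.
 */
static int whole_file_lock(int fd, int cmd, short type, struct flock *out)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */

    int rc = fcntl(fd, cmd, &fl);
    if (out)
        *out = fl;
    return rc;
}

/*
 * Hypothetical driver: read lock, test, upgrade to exclusive, test,
 * unlock.  Returns 0 if every step succeeded, -1 otherwise.
 */
int run_lock_sequence(const char *path)
{
    struct flock probe;
    int rc = -1;
    int fd = open(path, O_RDWR | O_CREAT, 0644);

    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* 1. whole-file read lock */
    if (whole_file_lock(fd, F_SETLKW, F_RDLCK, NULL) < 0) {
        perror("read lock");
        goto out;
    }
    /* 2. test: a process's own locks never conflict with its own
     *    query, so with no other holder l_type comes back F_UNLCK */
    if (whole_file_lock(fd, F_GETLK, F_WRLCK, &probe) < 0) {
        perror("test after read lock");
        goto out;
    }
    /* 3. upgrade to a whole-file exclusive lock; this is the step
     *    that was reported to hang */
    if (whole_file_lock(fd, F_SETLKW, F_WRLCK, NULL) < 0) {
        perror("exclusive lock");
        goto out;
    }
    /* 4. test again */
    if (whole_file_lock(fd, F_GETLK, F_WRLCK, &probe) < 0) {
        perror("test after exclusive lock");
        goto out;
    }
    /* 5. unlock */
    if (whole_file_lock(fd, F_SETLKW, F_UNLCK, NULL) < 0) {
        perror("unlock");
        goto out;
    }
    rc = 0;
out:
    close(fd);
    return rc;
}
```

	calling run_lock_sequence() on a file under an NFS mount whose 
server is the local machine would exercise the same path.  on a local 
filesystem it should simply return 0.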


	thanks,

	d.