On Fri, 8 Feb 2008, J. Bruce Fields wrote:

> On Fri, Feb 08, 2008 at 07:15:02AM -0500, Jeff Layton wrote:
> > On Thu, 7 Feb 2008 18:26:18 -0500
> > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> >
> > > On Sun, Jan 20, 2008 at 09:58:59AM -0500, Oleg Drokin wrote:
> > > > Hello!
> > > >
> > > > On Jan 18, 2008, at 6:07 PM, J. Bruce Fields wrote:
> > > >
> > > >> On Thu, Nov 29, 2007 at 02:41:57PM -0800, Marc Eshel wrote:
> > > >>> The problem seems to be with the fact that the client and server are on
> > > >>> the same machine. This test work fine with or without an underlaying fs
> > > >>> that supports locking when the client and the server are on a different
> > > >>> machines. Like you said the server is trying to send the grant message to
> > > >>> the client but for some reason it fails when the client is on the same
> > > >>> machine.
> > > >> That *shouldn't* make a difference, so we need to take another look at
> > > >> this--Oleg, this problem is still unfixed, right?
> > > >
> > > > Yes, I just pulled your latest nfs tree and I still can reproduce the
> > > > problem.
> > >
> > > OK, we have finally reproduced this problem here, and David's working on
> > > debugging. It does indeed seem to only be reproduceable with client and
> > > server on the same machine. Thanks for the report....
> > >
> > > --b.
> >
> > It might be worth testing this both with and without the patchset I
> > posted to linux-nfs recently to take care of the lockd hang. If
> > lockd is stuck trying to rpc_ping itself then it probably would hang
> > like this, wouldn't it?
>
> Of course! Yes, that fits.
>
> --b.

right on, jeff, good catch and thanks for directing my attention to your
patches. i applied them on top of 2.6.23.1 and tested them on a cluster
exporting GFS2 over NFS, using oleg's reproducer code. your patches fix
that lockd hang.
in a bit more detail, oleg's reproducer basically gets a whole-file read
lock, tests the lock, upgrades to a whole-file exclusive lock, tests the
lock, then unlocks. the problem was that the upgrade to the exclusive
lock would hang. this only happened when the client and server were on
the same machine, and i could reproduce it with NFS exporting GFS2 but
not with NFS exporting EXT3.

thanks,
d
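The lock sequence described above can be sketched roughly as follows. This is not oleg's reproducer, only a minimal illustration of the same steps using POSIX byte-range locks through Python's `fcntl.lockf`. The function name `lock_cycle` and the step labels are made up for this example, and the two "test the lock" steps are left out because the message does not say how the reproducer performs them.

```python
import fcntl


def lock_cycle(path):
    """Take a whole-file read lock, upgrade it to exclusive, then unlock.

    Hypothetical helper for illustration only. Returns the list of steps
    that completed, so a hang shows up as a call that never returns.
    """
    steps = []
    # Open read-write: a shared lock needs read access and an
    # exclusive lock needs write access on the descriptor.
    with open(path, "r+b") as f:
        # With the default len=0 and start=0, the lock covers the whole file.
        fcntl.lockf(f, fcntl.LOCK_SH)  # blocking whole-file read lock
        steps.append("read-locked")

        # Upgrade in place to an exclusive lock. This is the step that
        # reportedly hung when client and server shared a machine.
        fcntl.lockf(f, fcntl.LOCK_EX)
        steps.append("write-locked")

        fcntl.lockf(f, fcntl.LOCK_UN)  # release the lock
        steps.append("unlocked")
    return steps
```

To exercise the case discussed in this thread, the path would need to point at a file on an NFS mount whose server is the local machine. On a local filesystem the call should simply return all three steps.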