Re: NFS hang when writing to loopback file from VMWare ESX (kernel 2.6.30)

Chuck Lever <chuck.lever@xxxxxxxxxx> · Mon, 10 May 2010 13:36:25 -0400

On 05/10/2010 05:20 AM, Beast in Black wrote:
Greetings.

Every so often, when I'm writing via NFS to a loopback-mounted file, I
find that about 10-15 nfsd threads (out of a total of 64) go into D
state, along with the loop file, and never recover.
My setup is as follows:

1. sparse file is created via dd and attached to a
/dev/loopX device (where 0 <= X <= 100)
2. sparse file is mke2fs'd and mounted on mount point "/volumes/localvol"
3. "/volumes/localvol" is exported with options
*(rw,no_root_squash,no_subtree_check,async,insecure,nohide,no_wdelay).
4. /volumes/localvol is set as a network datastore (NFS mount) in ESX
5. Virtual machine files for an ESX VM are copied into the NFS mount on ESX
6. Virtual machine is powered on and I do some activity in it...write files etc.

At this point, the VM is running fine in ESX. After a while, however,
I notice that the VM freezes and that ESX reports the NFS mounted
datastore as unreachable. When I check the NFS server machine, I find
that 10-15 NFS threads are in D state, along with the associated
loopback-mounted file. The D states are never recovered from, and the
only way out is to reboot the NFS server machine.

I have also tried specifying the export as "sync" instead of
"async" (and removing no_wdelay), but I still see the same behavior.

The NFS server is running the vanilla 2.6.30 kernel on Ubuntu 8.10.
The NFS exports are all NFSv3.

Does anyone have an idea of why this may be occurring? I would be glad
to provide any additional info required.

There may be a deadlock due to memory pressure on the server. You might
get some information by running "echo t | sudo tee /proc/sysrq-trigger"
when the server gets into the hung state, then looking in your syslog
for the stack traces of the blocked tasks.
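
If the full task dump is too noisy, a narrower view is to list only the
tasks in D state and read their kernel stacks. What follows is a rough
Python sketch under stated assumptions, not something taken from the
reporter's setup: the function names are my own, it assumes a Linux
/proc, and as far as I know /proc/<pid>/stack is only present when the
kernel was built with CONFIG_STACKTRACE and is only readable by root.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def d_state_tasks(proc_root="/proc"):
    """List (pid, comm) for every task in uninterruptible sleep (state D)."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited mid-scan, or the line was malformed
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)


def kernel_stack(pid, proc_root="/proc"):
    """Return the kernel stack text for pid, or None if it cannot be read."""
    # Assumption: the file exists only with CONFIG_STACKTRACE and root access.
    try:
        with open(os.path.join(proc_root, str(pid), "stack")) as f:
            return f.read()
    except OSError:
        return None


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
        stack = kernel_stack(pid)
        if stack:
            print(stack)
```

Run it as root on the NFS server once the nfsd threads are stuck. If the
stacks of the nfsd and loop threads all end in memory allocation or
writeback paths, that would be consistent with the deadlock suspected
above.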

--
chuck[dot]lever[at]oracle[dot]com