Re: Linux NFSv4 client uses returned delegation in subsequent READ resulting in hang (BAD_STATEID)

On Jun 30, 2012, at 9:53 PM, Charles 'Boyo wrote:

> Hello.
> 
> I have repeatedly had Linux NFS clients hang while trying to access
> files on a NFSv4 mount (Solaris).
> Investigations revealed that this client is using a delegation that it
> has already returned, resulting in the BAD_STATEID error.
> Unfortunately, it then proceeds to hammer the server with these
> "doomed" requests, resulting in the client-side unresponsiveness and
> constant network traffic.
> 
> A sample trace can be found at http://pastebin.centos.org/39046
> As shown, the READ in frame 10 (line 112) follows the DELEGRETURN in
> frame 9 which results in the error. This READ was then repeated
> infinitely until either the server or client was restarted.
> Disabling delegations on the server-side caused the problem to cease.
> So what is wrong with delegations on the client-side?

Usually we see this behavior because of a race between an OPEN with delegation and a delegation recall.  In this case, however, the client is actively returning a READ delegation, then proceeding to use it anyway.  I don't see the server's recall callback, though, and there are other indications that this trace is not complete.  So it's hard to be 100% confident.

As far as I know, the EL6.2 client does not have support for recovering from a single bad stateid, which is why it is looping.  That support is available in mainline kernels 3.0 and later.

However, it seems to me that it is a bug for the client to continue using a delegation that it has returned.

You have already found one work-around: disable delegations on the NFS server.  Or you could mount with NFSv3.  Or, if feasible, your application could be modified to use fcntl() locking.
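
For the fcntl() option, a minimal sketch of reading a file while holding a shared POSIX lock might look like the following.  It is written in Python for brevity; the helper name read_with_lock is hypothetical, and this only illustrates the locking call.  It is not a tested fix for the delegation problem on this client.

```python
import fcntl


def read_with_lock(path, size=4096):
    """Read up to `size` bytes from `path` while holding a shared POSIX lock.

    `read_with_lock` is a hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        # lockf() is implemented with fcntl() record locks; LOCK_SH asks for
        # a shared (read) lock over the whole file and blocks until granted.
        fcntl.lockf(f, fcntl.LOCK_SH)
        try:
            return f.read(size)
        finally:
            # Release the lock explicitly before the file is closed.
            fcntl.lockf(f, fcntl.LOCK_UN)
```

An application would call read_with_lock() on a path under the NFS mount point in place of a plain open-and-read.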


> I am using the latest nfs-utils packages and my mount options are as
> shown below:
> 
> # cat /etc/redhat-release
> CentOS release 6.2 (Final)
> 
> # uname -r
> 2.6.32-220.4.1.el6.x86_64
> 
> # rpm -qa '*nfs*'
> nfs-utils-lib-1.1.5-4.el6.x86_64
> nfs-utils-1.2.3-15.el6_2.1.x86_64
> 
> # grep nfs4 /proc/mounts
> 10.51.1.6:/SharedFolder/ /var/LocalMountPoint nfs4 rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.51.1.34,minorversion=0,local_lock=none,addr=10.51.1.6 0 0
> 
> Regards,
> 
> Charles
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

