Re: clients fail to reclaim locks after server reboot or manual sm-notify

Hello Pavel,

What kernel version is Debian using? I haven't been able to reproduce the problem on 3.0 (but I'm on Arch Linux, so there might be other differences).

- Bryan

On Mon 14 Nov 2011 12:11:56 PM EST, Pavel wrote:
> Hi! I'm trying to set up an NFS server (specifically an active/active (A/A)
> NFS cluster) and am having issues with locking and reboot notifications.
> These are the tests I have done:
>
> 1. The simplest test uses a single NFS server machine (Debian Squeeze)
> running nfs-kernel-server (nfs-utils 1.2.2-4) and a single client machine (same
> OS) that mounts a share with the “-o 'vers=3'” option. From the client I lock a
> file on the share using 'testlk -w <filename>' (testlk from
> nfsutils/tools/locktest) so that a corresponding file appears in
> /var/lib/nfs/sm/ on the server. Then I reboot the server, and this is what I
> get in the client logs:
>
> lockd: request from 127.0.0.1, port=1007
> lockd: SM_NOTIFY     called
> lockd: host nfs-server1 (192.168.0.101) rebooted, cnt 2
> lockd: get host nfs-server1
> lockd: get host nfs-server1
> lockd: release host nfs-server1
> lockd: reclaiming locks for host nfs-server1
> lockd: rebind host nfs-server1
> lockd: call procedure 2 on nfs-server1
> lockd: nlm_bind_host nfs-server1 (192.168.0.101)
> lockd: rpc_call returned error 13
> lockd: failed to reclaim lock for pid 1555 (errno -13, status 0)
> NLM: done reclaiming locks for host nfs-server1
> lockd: release host nfs-server1
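
For anyone reading these logs later: the number lockd prints after "errno" is a negated Linux errno value, so -13 here is EACCES. Below is a minimal sketch of decoding such a line. The helper name and the small errno table are illustrative assumptions (the table uses Linux numbering and covers only the two values that appear in this thread); none of it comes from nfs-utils or the kernel.

```python
import re

# Assumption: Linux errno numbering; only the values seen in this thread.
LINUX_ERRNO = {
    13: "EACCES (Permission denied)",
    37: "ENOLCK (No locks available)",
}

# Matches the client-side lockd message quoted above.
RECLAIM_RE = re.compile(
    r"failed to reclaim lock for pid (\d+) \(errno (-?\d+), status (\d+)\)"
)


def parse_reclaim_failure(line):
    """Return (pid, errno_text, nlm_status), or None if the line does not match."""
    m = RECLAIM_RE.search(line)
    if m is None:
        return None
    pid, err, status = (int(g) for g in m.groups())
    # lockd logs the negated errno, so take the absolute value before lookup.
    text = LINUX_ERRNO.get(abs(err), "errno %d" % abs(err))
    return pid, text, status


print(parse_reclaim_failure(
    "lockd: failed to reclaim lock for pid 1555 (errno -13, status 0)"
))
```
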
>
> 2. As I'm building a cluster, I'll need to notify clients when an NFS resource
> migrates (since it is an A/A cluster, nfs-kernel-server is always running on
> all nodes and shares migrate using the exportfs resource agent). However,
> manually calling sm-notify ('sm-notify -f -v <virtual IP of share>') from
> either the share's initial node or the backup node results in the following
> (client logs):
>
> lockd: request from 127.0.0.1, port=637
> lockd: SM_NOTIFY     called
> lockd: host B (192.168.0.110) rebooted, cnt 2
> lockd: get host B
> lockd: get host B
> lockd: release host B
> lockd: reclaiming locks for host B
> lockd: rebind host B
> lockd: call procedure 2 on B
> lockd: nlm_bind_host B (192.168.0.110)
> lockd: server in grace period
> lockd: spurious grace period reject?!
> lockd: failed to reclaim lock for pid 2508 (errno -37, status 4)
> NLM: done reclaiming locks for host B
> lockd: release host B
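
The numbers in this second log can be read against the NLM protocol: procedure 2 is the LOCK call, and status 4 is the "denied, grace period" reply, which matches the "server in grace period" line. Here is a small lookup sketch, assuming only the procedure and status values common to the NLM protocol versions; the dictionary and function names are made up for illustration.

```python
# NLM procedure numbers (first six, shared across NLM versions).
NLM_PROCEDURES = {
    0: "NULL",
    1: "TEST",
    2: "LOCK",
    3: "CANCEL",
    4: "UNLOCK",
    5: "GRANTED",
}

# NLM reply status codes 0-4.
NLM_STATUS = {
    0: "granted",
    1: "denied",
    2: "denied_nolocks",
    3: "blocked",
    4: "denied_grace_period",
}


def describe(procedure, status):
    """Render a procedure/status pair from a lockd log as readable text."""
    proc_name = NLM_PROCEDURES.get(procedure, "unknown(%d)" % procedure)
    status_name = NLM_STATUS.get(status, "unknown(%d)" % status)
    return "%s -> %s" % (proc_name, status_name)


# Values taken from the log above: "call procedure 2" and "status 4".
print(describe(2, 4))
```
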
>
> even though the grace period is intended for lock reclamation. BTW, after such
> an invocation, no files corresponding to the notified clients appear in
> /var/lib/nfs/sm/ on the server for about 10 minutes if I try locking from any
> of these notified clients, even though the locking itself works. Locking from
> other clients generates files for them instantly.
>
> As for the rest: simple concurrent lock tests from a couple of clients work
> fine, and the server frees the locks of rebooted clients.
>
> I'm new to NFS and may be missing something obvious, but I've already spent
> several days googling around and can't seem to find a solution.
> Any help or guidance is highly appreciated. Thanks!
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
