On Wed, Feb 6, 2013 at 9:19 AM, Niels de Vos <ndevos@xxxxxxxxxx> wrote:
> Oh, OK. Looking at the code in xlators/nfs/server/src/nlm4.c.... Looks
> like it's probably just using the same statd as the kernel server--the
> one installed as a part of nfs-utils, which by default puts its state in
> /var/lib/nfs/statd/.
>
> So if you want failover to work, then the contents of
> /var/lib/nfs/statd/ has to be made available to the server that takes
> over somehow.

This statd data and the implementation of the NLM protocol is not
something I am very familiar with. But Rajesh (on CC) explained a little
about it and informed me that the current NLM implementation indeed does
not support transparent fail-over yet.
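
For concreteness, a minimal sketch of the "make the statd contents
available" step, assuming shared or replicated storage between the two
servers. The paths and helper below are assumptions for illustration,
not something gluster ships:

    import os
    import shutil

    # Hypothetical sketch: rpc.statd keeps one record per client that
    # holds a lock under /var/lib/nfs/statd/sm (plus sm.bak for hosts
    # still to be notified). Mirroring those records to storage the
    # takeover server can read is the minimum it needs in order to
    # know which clients to notify after a failover.
    STATD_DIR = "/var/lib/nfs/statd"
    SHARED_DIR = "/shared/nfs/statd"  # assumed shared/replicated mount

    def mirror_statd_state():
        for sub in ("sm", "sm.bak"):
            src = os.path.join(STATD_DIR, sub)
            dst = os.path.join(SHARED_DIR, sub)
            if os.path.isdir(src):
                # dirs_exist_ok requires Python 3.8+
                shutil.copytree(src, dst, dirs_exist_ok=True)

    if __name__ == "__main__":
        mirror_statd_state()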
The NLM implementation in gluster is stateless for all practical purposes
(all locks are translated to lk() FOPs on the bricks). However, we
currently just depend on the stock RHEL rpc.statd, which is not
clustered. If that rpc.statd were replaced with a "clustered" statd,
Gluster's NLM should "just work" even across failovers, by making a
failover appear to clients as a server reboot and kicking off NLM's lock
recovery. That may not be ideal or efficient, but it should be
functional.
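
A rough sketch of that last step, driving sm-notify (from nfs-utils)
against the shared state directory from the sketch above. The floating
hostname and paths are assumptions, not a tested recipe:

    import subprocess

    # Hypothetical sketch: once the floating IP lands on the takeover
    # node, make the failover look like a server reboot. sm-notify
    # sends SM_NOTIFY to every client recorded in the statd state
    # directory, which prompts the clients to reclaim their NLM locks.
    FLOATING_NAME = "nfs.example.com"   # assumed name clients mounted
    SHARED_DIR = "/shared/nfs/statd"    # same shared state as above

    def notify_clients_after_failover():
        subprocess.run(
            ["sm-notify",
             "-f",                 # resend even if already notified
             "-v", FLOATING_NAME,  # notify as the floating server name
             "-P", SHARED_DIR],    # use the shared state directory
            check=True,
        )

    if __name__ == "__main__":
        notify_clients_after_failover()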
Avati