Re: [PATCH 18/19] lockd: Update NSM state from SM_MON replies

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Tue, 28 Apr 2009 12:25:24 -0400

On Thu, Apr 23, 2009 at 07:33:33PM -0400, Chuck Lever wrote:
> When rpc.statd starts up in user space at boot time, it attempts to
> write the latest NSM local state number into
> /proc/sys/fs/nfs/nsm_local_state.
> 
> If lockd.ko isn't loaded yet (as is the case in most configurations),
> that file doesn't exist, thus the kernel's NSM state remains set to
> its initial value of zero during lockd operation.
> 
> This is a problem because rpc.statd and lockd use the NSM state number
> to prevent repeated lock recovery on rebooted hosts.  If lockd sends
> a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state
> number is received, there is no way for lockd or rpc.statd to
> distinguish that stale SM_NOTIFY from an actual reboot.  Thus lock
> recovery could be performed after the rebooted host has already
> started reclaiming locks, and those locks will be lost.
> 
> We could change /etc/init.d/nfslock so it always modprobes lockd.ko
> before starting rpc.statd.  However, if lockd.ko is ever unloaded
> and reloaded, we are back at square one, since the NSM state is not
> preserved across an unload/reload cycle.  This may happen frequently
> on clients that use automounter.  A period of NFS inactivity causes
> lockd.ko to be unloaded, and the kernel loses its NSM state setting.

Aie.  Can we also fix the automounter or some other part of the
userspace configuration?

> Instead, let's use the fact that rpc.statd plants the local system's
> NSM state in every SM_MON (and SM_UNMON) reply.  lockd performs a
> synchronous SM_MON upcall to the local rpc.statd _before_ sending its
> first NLM request to a new remote.  This would permit rpc.statd to
> provide the current NSM state to lockd, even after lockd.ko had been
> unloaded and reloaded.
> 
> Note that NLMPROC_LOCK arguments are constructed before the
> nsm_monitor() call, so we have to rearrange argument construction very
> slightly to make this all work out.
> 
> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> ---
> 
>  fs/lockd/clntproc.c |    2 +-
>  fs/lockd/mon.c      |    6 +++++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> index dd79570..f55b900 100644
> --- a/fs/lockd/clntproc.c
> +++ b/fs/lockd/clntproc.c
> @@ -126,7 +126,6 @@ static void nlmclnt_setlockargs(struct nlm_rqst *req, struct file_lock *fl)
>  	struct nlm_lock	*lock = &argp->lock;
>  
>  	nlmclnt_next_cookie(&argp->cookie);
> -	argp->state   = nsm_local_state;
>  	memcpy(&lock->fh, NFS_FH(fl->fl_file->f_path.dentry->d_inode), sizeof(struct nfs_fh));
>  	lock->caller  = utsname()->nodename;
>  	lock->oh.data = req->a_owner;
> @@ -519,6 +518,7 @@ nlmclnt_lock(struct nlm_rqst *req, struct file_lock *fl)
>  
>  	if (nsm_monitor(host) < 0)
>  		goto out;
> +	req->a_args.state = nsm_local_state;

Hm.  It looks like a_args.state is never used, except in ifdef'd-out
code in nlm4svc_proc_lock() and nlmsvc_proc_lock() ifdef'd out.
Something's wrong there.  (Not your fault; but needs looking into.)

>  
>  	fl->fl_flags |= FL_ACCESS;
>  	status = do_vfs_lock(fl);
> diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
> index 6d5d4a4..5017d50 100644
> --- a/fs/lockd/mon.c
> +++ b/fs/lockd/mon.c
> @@ -188,8 +188,12 @@ int nsm_monitor(const struct nlm_host *host)
>  		status = -EIO;
>  	if (status < 0)
>  		printk(KERN_NOTICE "lockd: cannot monitor %s\n", nsm->sm_name);
> -	else
> +	else {
>  		nsm->sm_monitored = 1;
> +		nsm_local_state = res.state;
> +		dprintk("lockd: nsm_monitor: NSM state is now %d\n",
> +				nsm_local_state);

Could we make that a dprintk in the case where this changes nsm_local
state from something other than zero (nsm_lock_state && nsm_local_state
!= res.state)?

(Just to make sure no statd is returning inconsistent nsm_local_stats
here.)

> +	}
>  	return status;
>  }

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html