Re: [RFC] After server stop nfslock service, client still can get lock success


On Nov 17, 2009, at 4:47 AM, Mi Jinlong wrote:

While testing NLM, I found a bug: after the server stops the nfslock service, the client can still acquire a lock successfully.

Test process:

 Step 1: The client opens an NFS file.
 Step 2: The client acquires a lock with fcntl.
 Step 3: The client releases the lock with fcntl.
 Step 4: The server stops its nfslock service.
 Step 5: The client tries to acquire the lock again with fcntl.

At Step 5, the client should fail to get the lock, but it succeeds.

Reason:
 When the server stops the nfslock service, the client's host struct is not unmonitored on the server. When the client locks again, its host struct is reused without being monitored again. As a result, at Step 5 the client can still acquire the lock.

Effectively, the client is still monitored, since it is still in statd's monitored list. Shutting down statd does not remove it from the monitor list. If the local host reboots, sm-notify will still send the remote an SM_NOTIFY request, which is correct.

Additionally, new clients attempting to lock files when statd is down will fail, which is correct if statd is not available.

Conversely, if a monitored remote reboots, there is no way to notify the local lockd of the reboot, since statd normally relays the SM_NOTIFY to lockd, but isn't running. That might be a problem.

However, shutting down statd during normal operation is not a normal or supported thing to do.

Question:
 1. Should the server unmonitor the client's host struct when it
    stops the nfslock service?

 2. Should rpc.statd tell the kernel its status (on start and stop)
    by sending an SM_NOTIFY?

There are a number of other coordination issues around statd start-up and shutdown. The server's grace period, for instance, is not synchronized with sending reboot notifications. So, we do recognize this is a general problem.

In this case, however, I would expect indeterminate behavior if statd is shut down during normal operation, and that's exactly what we get. I'm not sure it's even reasonable to support this use case. Why would someone shut down statd and expect reliable NFSv2/v3 locking behavior? In other words, with due respect, what problem would we solve by fixing this, other than making your test case work?

Out of curiosity, what happens if you try this on a Solaris server?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




