On Dec 16, 2009, at 5:27 AM, Mi Jinlong wrote:
Chuck Lever:
On Dec 15, 2009, at 5:02 AM, Mi Jinlong wrote:
Hi,
When testing NLM on the latest kernel (2.6.32), I found a bug.
When a client holds locks and the server restarts its nfslock service,
the server's statd will not stay in sync with lockd.
If the server restarts nfslock twice or more, the client's locks will be lost.
Test process:
Step 1: the client opens an NFS file.
Step 2: the client takes a lock with fcntl.
Step 3: the server restarts its nfslock service.
I'll assume here that you mean the equivalent of "service nfslock
restart". This restarts statd and possibly runs sm-notify, but it has
no effect on lockd.
Yes, I used "service nfslock restart".
It does affect lockd too: when the service stops, lockd receives a
KILL signal. Lockd then releases all of the clients' locks, enters the
grace period, and waits for the clients to reclaim their locks.
Again, this test seems artificial to me. Is there a real-world use case
where someone would deliberately restart statd while an NFS server is
serving files? I pose this question because I've worked on statd only
for a year or so, and I am quite likely ignorant of all the ways it can
be deployed.
^/^, but someone may well restart nfslock while an NFS server is
serving files. It is bound to happen eventually.
After step 3, the server's lockd records that the client holds locks,
but statd's /var/lib/nfs/statd/sm/ directory is empty. That means statd
and lockd are out of sync. If the server restarts nfslock again, the
client's locks will be lost.
The primary reason:
At step 3, when the client's lock reclaim request is sent to the server,
the client's host (the host struct) is reused but is not re-monitored by
the server's lockd. From that point on, statd and lockd are out of sync.
The kernel squashes SM_MON upcalls for hosts that it already believes
are monitored. This is a scalability feature.
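To illustrate how this squashing interacts with a statd restart, here is a minimal Python sketch. The class and attribute names are hypothetical (the real logic is kernel C in lockd and a user-space statd); it only models the "already monitored" flag described above:

```python
# Toy model of lockd squashing SM_MON upcalls for hosts it already
# believes are monitored.  All names here are hypothetical.

class Statd:
    """User-space statd: its monitor list lives in /var/lib/nfs/statd/sm/."""
    def __init__(self):
        self.sm_dir = set()          # file names under sm/

    def sm_mon(self, client):        # SM_MON request handler
        self.sm_dir.add(client)

    def restart(self):               # restart retires the on-disk list
        self.sm_dir = set()

class Lockd:
    """Kernel lockd: keeps a per-host 'monitored' flag."""
    def __init__(self, statd):
        self.statd = statd
        self.monitored = {}          # host -> flag

    def monitor(self, client):
        # Squash the upcall if the host already looks monitored --
        # this is the scalability feature.
        if self.monitored.get(client):
            return                   # squashed: no SM_MON sent
        self.statd.sm_mon(client)
        self.monitored[client] = True

statd = Statd()
lockd = Lockd(statd)

lockd.monitor("client")              # first lock: SM_MON reaches statd
assert "client" in statd.sm_dir

statd.restart()                      # nfslock restart empties sm/
lockd.monitor("client")              # reclaim reuses the host struct:
                                     # the upcall is squashed
assert "client" not in statd.sm_dir  # lockd and statd now disagree
```

The final assertion is the desync being reported: lockd believes the client is monitored, while statd's sm/ directory is empty.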
When statd starts, it moves files from /var/lib/nfs/statd/sm/ to
/var/lib/nfs/statd/sm.bak/.
Well, it's really sm-notify that does this. sm-notify is run by
rpc.statd when it starts up.
However, sm-notify should only retire the monitor list the first time
it is run after a reboot. Simply restarting statd should not change
the on-disk monitor list in the slightest. If it does, there's some
kind of problem with the way sm-notify's pid file is managed, or
perhaps with the nfslock script.
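The intended retire-once-per-boot behavior can be sketched in Python as follows. The boolean guard is an assumption standing in for sm-notify's pid-file check, and the force flag models a forced notification run; real sm-notify is a C program:

```python
# Toy model of sm-notify retiring the monitor list only once per boot.
# The pid-file check is modeled as a plain boolean (an assumption).

class SmNotify:
    def __init__(self):
        self.already_ran = False     # stands in for the pid file check

    def run(self, sm, sm_bak, force=False):
        if self.already_ran and not force:
            return                   # later run since boot: do nothing
        sm_bak.update(sm)            # retire sm/ into sm.bak/
        sm.clear()
        self.already_ran = True

sm, sm_bak = {"client"}, set()
notify = SmNotify()

notify.run(sm, sm_bak)               # first run after "boot" retires list
assert sm == set() and sm_bak == {"client"}

sm.add("client")                     # statd re-monitors the client
notify.run(sm, sm_bak)               # plain restart: guard stops the wipe
assert "client" in sm

notify.run(sm, sm_bak, force=True)   # a forced run, however,
assert sm == set()                   # wipes the list again
```

If the nfslock script forces a notification run on every restart, the guard never gets a chance to protect the list, which would match the behavior being reported.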
If lockd doesn't send an SM_MON to statd,
statd will not monitor the clients that were being monitored before
statd restarted.
Question:
In my opinion, if lockd is allowed to reuse the client's host, it
should send an SM_MON to statd when it does so. If that is not allowed,
the client's host should be destroyed immediately.
What should lockd do? Reuse? Destroy? Or some other action?
I don't immediately see why lockd should change its behavior. Perhaps
statd/sm-notify were incorrect to delete the monitor list when you
restarted the nfslock service?
Sorry, maybe I did not express myself clearly.
I mean that lockd reuses the host struct that was created before statd
restarted. The monitor list seems to have been deleted when nfslock
restarted.
lockd does not touch any user space files; the on-disk monitor list is
managed by statd and sm-notify. A remote peer rebooting does not
clear the "monitored" flag for that peer in the local kernel's lockd,
so it won't send another SM_MON request.
Now, it may be the case that "service nfslock start" uses a command
line option that forces a fresh sm-notify run, and that is what is
wiping the on-disk monitor list. That would be the bug in this case
-- sm-notify can and should be allowed to make its own determination
of whether the monitor list gets retired. Notification should not
normally be forced by command line options in the nfslock script.
Can you show exactly how statd's state (i.e. its on-disk monitor list
in /var/lib/nfs/statd/sm) changed across the restart? Did sm-notify run
when you restarted statd? If so, why didn't the sm-notify pid file stop
it?
The statd and lockd state on the server across the nfslock restart:

  lockd                   statd       | event
  ------------------------------------+----------------------------------
  host (monitored = 1)    /sm/client  | client gets locks successfully
  (locks)                             |   at first
                                      |
  host (monitored = 1)    /sm/client  | nfslock stop (lockd releases
  (no locks)                          |   the client's locks)
                                      |
  host (monitored = 1)    /sm/        | nfslock start (client reclaims
  (locks)                             |   its locks, but statd does not
                                      |   monitor it)

  note: host (monitored = 1) means the client's host struct is created
        and is marked as monitored.
        (locks) / (no locks) means the host struct holds locks, or not.
        /sm/client means there is a file for the client under the
        /var/lib/nfs/statd/sm directory.
        /sm/ means /var/lib/nfs/statd/sm is empty!
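To make the failure mode concrete, the sequence above can be played end to end in a toy Python model (all names hypothetical). It assumes that locks survive a restart only when sm-notify tells the client to reclaim, which is why the second restart silently loses them:

```python
# Toy end-to-end model of the reported sequence: restart nfslock twice
# and the client's lock is silently lost.  All names are hypothetical.

def restart_nfslock(server_locks, statd_sm):
    """Stop: lockd drops all locks; sm-notify retires statd's monitor
    list and notifies the peers on it.  Start: notified peers reclaim."""
    notified = set(statd_sm)         # sm-notify tells these peers
    statd_sm.clear()                 # the on-disk list is retired
    reclaimed = {c for c in server_locks if c in notified}
    server_locks.clear()
    server_locks.update(reclaimed)   # only notified clients reclaim
    # Bug under discussion: lockd reuses the host struct (monitored = 1),
    # so no SM_MON is sent and statd_sm stays empty after the reclaim.

locks = {"client"}                   # client holds a lock
sm = {"client"}                      # statd monitors the client

restart_nfslock(locks, sm)           # first restart: the lock survives...
assert locks == {"client"} and sm == set()   # ...but statd is out of sync

restart_nfslock(locks, sm)           # second restart: nobody is notified
assert locks == set()                # the client's lock is lost
```

The model matches the table: after the first restart the host is still marked monitored while sm/ is empty, so the second restart has no one to notify and the grace period passes without a reclaim.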
thanks,
Mi Jinlong
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html