On May 19, 2009, at 6:39 PM, Neil Brown wrote:
On Tuesday May 19, chuck.lever@xxxxxxxxxx wrote:
Hi Neil-
As part of IPv6 support for NFS, I've been looking at rpc.statd and
sm-notify. IPv6 support touches so many parts of both, and the current
open-coded RPC request schedulers in both can't support netids without
major revision or replacement. So I've decided to write a replacement
instead of grafting IPv6 support onto the current implementation.
For many reasons I'm thinking of merging sm-notify and rpc.statd back
together. The two were split only a few years ago, and it seems to me
that it was done to support SuSE's in-kernel statd, which has since
been effectively abandoned.
Having the two separated has ushered in a host of minor
complications. Packaging and init-scripts are more complicated. Both
executables have separate knowledge about /var/lib/nfs/{sm,sm.bak}.
There are two separate man pages that share a lot of the same content.
So, what do you think about folding sm-notify back into rpc.statd?
Steve suggested there may have been a customer issue that drove the
separation. Do you have any recollection of the issues?
For the rest of the list: are there strong dependencies outside RH and
SuSE distributions that would require a separate sm-notify
executable? Any other issues?
While the separation of sm-notify was presumably driven by the SuSE
in-kernel statd, that wasn't the reason that I copied the idea in
nfs-utils.
sm-notify and statd really have two very different tasks.
sm-notify :
- is a 'client' for the "SM" protocol.
- must be run at boot time, and after that is not needed.
statd :
- is a 'server' for the "SM" protocol.
- only needs to be running when either nfsd is running or an
nfs mount which supports locks is active
Thus I feel they are conceptually quite distinct.
There are details that make it not such a clean conceptual break:
o Who manages the NSM state number? sm-notify sends it out to
remote peers, and statd returns it in SM_MON and SM_UNMON replies.
There has to be some coordination of how the state number is
updated. If sm-notify runs separately (for example, with the
"--force" option) and updates the state number, how does statd know
there's a new state number? If lockd isn't loaded and running when
sm-notify runs, how is the kernel going to get the right NSM state
number?
o statd still has client duties: it has to post NLM callbacks to
the local lockd. Sending notifications to remote peers is not so
different from that, conceptually. One could argue, therefore, that
we should split that piece out of statd as well, but that would mean
we fork/exec every time we get an unauthenticated SM_NOTIFY request
from a monitored peer. That exposes a DoS vulnerability.
o statd has to wait while sm-notify copies the monitor list. It
really shouldn't accept SM_MON requests while the notification list is
being created. But if it waits for long, it will appear that the NSM
service has died. So there is some non-trivial synchronization
between the two, and that appears to be split between statd and
sm-notify today (and that synchronization requirement isn't
documented in any way).
o statd has to fire up sm-notify when it receives SM_SIMU_CRASH.
Today our lockd doesn't send that, but it could in the future. So,
sm-notify is not strictly an "only-at-reboot" kind of affair.
o sm-notify tries to do a sync(2) to make sure that the file system
state is made permanent after an NSM state update. Bruce has
suggested doing the sync only after the first SM_MON (to reduce
overhead during system boot), but that moves the sync(2) far away from
the logic that updates the state number. That exposes us to NSM state
number walk-back if the system crashes at the wrong time. It's
arguable how much of a problem that is.
o It is better to send notifications when lockd is up. For
clients, at least, lockd comes up only after the first NFS mount, and
in automounter scenarios, that may not be for some time after a
reboot. Servers may not start nfslock until they do "service nfslock
start; service nfs start" at some point possibly long after reboot.
So should clients be notified right when the server peer starts up, or
after the server peer has fired up its NFSD and lockd service?
o Those who package statd/sm-notify have to understand how these
operate. The people who create system init-scripts are generally not
NFS experts, thus they must have local knowledge about statd and
sm-notify in order to get this all correct. It would be more
fool-proof if we hard-coded the start-up behavior, and took it out of
the hands of the init-scripts folks, whom we do not control. How do we
document the operational dependencies in a way that makes it very hard
for non-NFS folks to set this up incorrectly? One way is to build it
all in a single program.
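
To make the state-number and sync(2) concerns above concrete, here is a
minimal sketch in Python of bumping the state number and making it
durable in the same place. It assumes the usual NSM convention that an
odd state means "up" and an even state means "down", and it assumes the
state lives in a small file holding one native integer. The function
names, the file layout, and the use of fsync(2) on the file and its
directory (rather than a whole-system sync(2)) are illustrative
assumptions, not a description of what sm-notify or statd does today.

```python
import os
import struct
import tempfile


def next_nsm_state(current: int) -> int:
    """Return the smallest odd number greater than `current`.

    Assumes the NSM convention: odd means "up", even means "down".
    """
    return current + 2 if current % 2 == 1 else current + 1


def bump_state_file(path: str) -> int:
    """Advance the state number stored at `path` and make it durable.

    Hypothetical helper: the file is assumed to hold one native int.
    A missing or short file is treated as state 0.
    """
    try:
        with open(path, "rb") as f:
            data = f.read(4)
        current = struct.unpack("=i", data)[0] if len(data) == 4 else 0
    except FileNotFoundError:
        current = 0

    new_state = next_nsm_state(current)
    directory = os.path.dirname(os.path.abspath(path))

    # Write to a temporary file first so a crash never leaves a
    # half-written state file behind.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(struct.pack("=i", new_state))
            f.flush()
            os.fsync(f.fileno())      # data is on disk before the rename
        os.replace(tmp, path)         # atomic switch to the new state
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

    # Flush the directory entry too, so the rename survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return new_state
```

Keeping the flush right next to the update is the point: if the two
live in one function, there is no window in which a crash can walk the
state number back, and only one program needs to know the rule.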
It is probably true that they could share a slab of code, and putting
that code in a common .c file would make a lot of sense.
Yes, I've started doing that to try to understand what code can be
shared.
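
As one illustration of what such a shared file might hold, here is a
rough Python sketch of the step where the monitor list is moved from
the "sm" directory to "sm.bak" before notifications go out. The helper
name, the lock file, and the use of flock(2) are assumptions made only
for this example; the idea is that whichever program records new
SM_MON registrations would take the same lock, so the snapshot and new
registrations cannot interleave.

```python
import fcntl
import os


def snapshot_monitor_list(sm_dir: str, bak_dir: str, lock_path: str) -> list:
    """Move every monitor record from sm_dir to bak_dir under a lock.

    Hypothetical helper. Returns the names of the records moved,
    which is the list of peers that should be notified.
    """
    os.makedirs(bak_dir, exist_ok=True)
    moved = []
    with open(lock_path, "w") as lock:
        # Exclusive lock: held only for the duration of the renames,
        # and released when the lock file is closed.
        fcntl.flock(lock, fcntl.LOCK_EX)
        for name in sorted(os.listdir(sm_dir)):
            # rename(2) within one file system is atomic per record.
            os.rename(os.path.join(sm_dir, name),
                      os.path.join(bak_dir, name))
            moved.append(name)
    return moved
```

Because the lock is held only while the records are renamed, not while
notifications are sent, the wait imposed on new SM_MON requests stays
short.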
I am not strongly against re-uniting them. However, before doing that,
I think it would be a good idea to collect a list of the problems that
would be solved by unifying them, and then ask the question: is
unifying them the only or best solution to these problems?
Agreed. See above.
If there are one or more strong reasons to keep these separate, I can
go down that road. But I think the practical matters of making NSM
work in multiple Linux distributions, each with its own packaging
and init-script mechanisms and requirements, suggest we'd be better
off making it simple to get this right.
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com