Re: RFC: merging sm-notify and rpc.statd

On May 19, 2009, at 6:39 PM, Neil Brown wrote:
On Tuesday May 19, chuck.lever@xxxxxxxxxx wrote:
Hi Neil-

As part of IPv6 support for NFS, I've been looking at rpc.statd and
sm-notify.  IPv6 support touches so many parts of both, and the current
open-coded RPC request schedulers in both can't support netids without
major revision or replacement. So I've decided to write a replacement
instead of grafting in support for IPv6 to the current implementation.

For many reasons I'm thinking of merging sm-notify and rpc.statd back
together. The two were split only a few years ago, and it seems to me
that it was done to support SuSE's in-kernel statd, which has since
been effectively abandoned.

Having the two separated has ushered in a host of minor
complications. Packaging and init-scripts are more complicated. Both
executables have separate knowledge about /var/lib/nfs/{sm,sm.bak}.
There are two separate man pages that share a lot of the same content.

So, what do you think about folding sm-notify back into rpc.statd?
Steve suggested there may have been a customer issue that drove the
separation.  Do you have any recollection of the issues?

For the rest of the list: are there strong dependencies outside RH and
SuSE distributions that would require a separate sm-notify
executable?  Any other issues?

While the separation of sm-notify was presumably driven by the suse
in-kernel statd, that wasn't the reason that I copied the idea in
nfs-utils.

sm-notify and statd really have two very different tasks.

sm-notify :
  - is a 'client' for the "SM" protocol.
  - must be run at boot time, and after that is not needed.

statd :
  - is a 'server' for the "SM" protocol.
  - only needs to be running when either nfsd is running or an
    nfs mount which supports locks is active

Thus I feel they are conceptually quite distinct.

There are details that make it not such a clean conceptual break:

o Who manages the NSM state number? sm-notify sends it out to remote peers, and statd returns it in SM_MON and SM_UNMON replies. There has to be some co-ordination of how the state number is updated. If sm-notify runs separately (for example, with the "--force" option) and updates the state number, how does statd know there's a new state number? If lockd isn't loaded and running when sm-notify runs, how is the kernel going to get the right NSM state number?

o statd still has client duties: it has to post NLM callbacks to the local lockd. Sending notifications to remote peers is not so different from that, conceptually. One could argue, therefore, that we should split that piece out of statd as well, but that would mean we fork/exec every time we get an unauthenticated SM_NOTIFY request from a monitored peer. That exposes a DoS vulnerability.

o statd has to wait while sm-notify copies the monitor list. It really shouldn't accept SM_MON requests while the notification list is created. But if it waits for long, it will appear that the NSM service has died. So there is some non-trivial synchronization between the two, and that appears to be split between statd and sm-notify today (and that synchronization requirement isn't documented in any way).

o statd has to fire up sm-notify when it receives SM_SIMU_CRASH. Today our lockd doesn't send that, but it could in the future. So, sm-notify is not strictly an "only-at-reboot" kind of affair.

o sm-notify tries to do a sync(2) to make sure that the file system state is made permanent after an NSM state update. Bruce has suggested doing the sync only after the first SM_MON (to reduce overhead during system boot), but that moves the sync(2) far away from the logic that updates the state number. That exposes us to NSM state number walk-back if the system crashes at the wrong time. It's arguable how much of a problem that is.

o It is better to send notifications when lockd is up. For clients, at least, lockd comes up only after the first NFS mount, and in automounter scenarios, that may not be for some time after a reboot. Servers may not start nfslock until they do "service nfslock start; service nfs start" at some point possibly long after reboot. So should clients be notified right when the server peer starts up, or after the server peer has fired up its NFSD and lockd service?

o Those who package statd/sm-notify have to understand how these operate. The people who create system init-scripts are generally not NFS experts, thus they must have local knowledge about statd and sm-notify in order to get this all correct. It would be more fool-proof if we hard-coded the start-up behavior, and took it out of the hands of the init-scripts folks, whom we do not control. How do we document the operational dependencies in a way that makes it very hard for non-NFS folks to set this up incorrectly? One way is to build it all in a single program.
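
To make the state-number points in the list above a little more concrete, here is a minimal sketch in Python. It is not nfs-utils code. It assumes the usual NSM convention that an odd state number means "up" and an even one means "down", and it assumes, purely for illustration, that the state lives in a file holding one native 32-bit integer. The function names (next_up_state, read_state, write_state_durably, bump_state) are hypothetical. Note that it uses fsync(2) on the file and its directory rather than the sync(2) call discussed above, so it shows one possible way to keep the durability step next to the update logic, not what sm-notify does today.

```python
import os
import struct
import tempfile


def next_up_state(state: int) -> int:
    """Return the smallest odd number greater than `state`.

    Assumed NSM convention: odd means the host is up, even means down.
    """
    return state + 1 if state % 2 == 0 else state + 2


def read_state(path: str) -> int:
    """Read the stored state number; treat a missing or short file as 0."""
    try:
        with open(path, "rb") as f:
            data = f.read(4)
    except FileNotFoundError:
        return 0
    if len(data) != 4:
        return 0
    return struct.unpack("=i", data)[0]


def write_state_durably(path: str, state: int) -> None:
    """Replace the state file atomically and flush it to stable storage."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(struct.pack("=i", state))
            f.flush()
            os.fsync(f.fileno())      # file contents reach the disk first
        os.replace(tmp, path)         # then the new file becomes visible
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)              # persist the rename itself
    finally:
        os.close(dir_fd)


def bump_state(path: str) -> int:
    """Advance the state number and return it only once it is durable."""
    new_state = next_up_state(read_state(path))
    write_state_durably(path, new_state)
    return new_state
```

With this ordering, a caller would send notifications (or hand the number to the kernel) only after bump_state() returns, so a crash at any point leaves either the old or the new number on disk, never a walked-back one. Whether one program or two should own this step is exactly the open question in this thread.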

It is probably true that they could share a slab of code, and putting
that code in a common .c file would make a lot of sense.

Yes, I've started doing that to try to understand what code can be shared.

I am not strongly against re-uniting them.  However before doing that,
I think it would be a good idea to collect a list of the problems that
would be solved by unifying them, and then asking the question: is
unifying them the only or best solution to these problems?

Agreed.  See above.

If there are one or more strong reasons to keep these separate, I can go down that road. But I think the practical matters of making NSM work in multiple Linux distributions, each with their own packaging and init-script mechanisms and requirements, suggest we'd be better off making it simple to get this right.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com