Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity

> On Mar 17, 2022, at 10:00 PM, NeilBrown <neilb@xxxxxxx> wrote:
> 
> On Thu, 17 Mar 2022, Chuck Lever III wrote:
>> Howdy Neil-
> 
> G'day
> 
>>>> The last sentence is made ambiguous by the use of passive voice.
>>>> 
>>>> Suggest: "When hostname uniqueness cannot be guaranteed, the client
>>>> administrator must provide extra identity information."
>>> 
>>> Why must the client administrator do this?  Why can't some automated
>>> tool do this?  Or some container-building environment.
>>> That's an advantage of the passive voice, you don't need to assign
>>> responsibility for the verb.
>> 
>> My point is that in order to provide the needed information,
>> elevated privilege is required. The current sentence reads as
>> if J. Random User could be interrupted at some point and asked
>> for help.
>> 
>> In other words, the documentation should state that this is
>> an administrative task. Here I'm not advocating for a specific
>> mechanism to actually perform that task.
> 
> ???  This whole man page is primarily about mount options, particularly
> as they appear in /etc/fstab.  These are not available to the non-admin.
> Why would anyone think this section is any different?

Because the nfs_client_id4 uniquifier is not a mount option and
isn't mentioned anywhere else, it won't be familiar to many
readers. As you and I know, most people are not careful readers.

Do note that nfs(5) is really just an extension of mount(8).
The sections you pointed to earlier (eg, DATA AND METADATA
COHERENCE) are there to provide context explaining how to use
NFS mount options. The patch you have proposed is for an API
and protocol element that have nothing to do with NFS mount
options. That by itself disqualifies a proposed addition to
nfs(5).

I suggest instead constructing an independent man page that
is attached to the /etc file that contains the client ID
uniquifier. Something akin to machine-id(5) ?


>>>> I have a problem with basing our default uniqueness guarantee on
>>>> hostnames "most of the time" hoping it will all work out. There
>>>> are simply too many common cases where hostname stability can't be
>>>> relied upon. Our sustaining teams will happily tell us this hope
>>>> hasn't so far been borne out.
>>> 
>>> Maybe it has not been borne out because there is no documented
>>> requirement for it that we can point people to.
>>> Clearly containers that use NFS are not currently all configured well to do
>>> this.  Some change is needed.  Maybe adding a unique host name is the
>>> easiest change ... or maybe not.
>> 
>> You seem to be documenting the client's current behavior.
>> The tone of the documentation is that this behavior is fine
>> and works for most people.
> 
> It certainly works for a lot of people.  Many people are using NFSv4
> quite effectively.  I'm sure there are people who are having problems
> too, but let's not fall for the squeaky wheel fallacy.

For some folks it fails silently and/or requires round trips
with their distributor's call center. I would like not to
discount their experience.


>> It's the second part that I disagree with. Oracle Linux has
>> bug reports documenting that this behavior is a problem, and
>> I'm sure Red Hat does too. The current behavior is broken. It
>> is this brokenness that we are trying to resolve.
> 
> The current behaviour of NFS is NOT broken.  Maybe is it not adequately
> robust against certain configuration choices.  Certainly we should make
> it as robust as we reasonably can.  But let's not overstate the problem.

Years of bug reports suggest I'm not overstating anything.

The plan, for a while now, has been to supplement the use of
the hostname to address this very situation. You are now
suggesting there is nothing to address, which I find difficult
to swallow.


>> So let me make a stronger statement: we should not
>> document that broken behavior in nfs(5). Instead, we should
>> fix that behavior, and then document the golden brown and
>> delicious behavior. Updating nfs(5) first is putting
>> Descartes in front of de horse.
>> 
>> 
>>> Surely NFS is not the *only* service that uses the host name.
>>> Encouraging the use of unique host names might benefit others.
>> 
>> Unless you have specific use cases that might benefit from
>> ensuring hostname uniqueness, I would beg that you stay
>> focused on the immediate issue of how the Linux client
>> constructs its nfs_client_id4 strings.
>> 
>> 
>>> The practical reality is that a great many NFS client installations do
>>> currently depend on unique host names - after all, it actually works.
>>> Is it really so unreasonable to try to encourage the exceptions to fit
>>> the common pattern better?
>> 
>> Yes it is unreasonable.
>> 
>> NFS servers typically have a fixed DNS presence. They have
>> to because clients mount by hostname.
>> 
>> NFS clients, on the other hand, are not under that constraint.
>> The only time I can think of where a client has to have a
>> fixed hostname is if a krb5 host principal is involved.
>> 
>> In so many other cases, eg. mobile computing or elastic
>> services, the client hostname is mutable. I don't think
>> it's fair to put another constraint on host naming here,
>> especially one with implications of service denial or
>> data corruption (see below).
>> 
>> 
>>>> Maybe I'm just stating this to understand the purpose of this
>>>> patch, but it could also be used as an "Intended audience"
>>>> disclaimer in this new section.
>>> 
>>> OK, so the "purpose of this patch" relates in part to a comment you made
>>> earlier, which I include here:
>>> 
>>>> Since it is just a line or two of code, it might be of little
>>>> harm just to go with separate implementations for now and stop
>>>> talking about it. If it sucks, we can fix the suckage.
>>>> 
>>>> Who volunteers to implement this mechanism in mount.nfs ?
>>> 
>>> I don't think this is the best next step.  I think we need to get some
>>> container system developer to contribute here.  So far we only have
>>> second hand anecdotes about problems.  I think the most concrete is from
>>> Ben suggesting that in at least one container system, using
>>> /etc/machine-id is a good idea.
>>> 
>>> I don't think we can change nfs-utils (whether mount.nfs or mount.conf
>>> or some other way) to set identity from /etc/machine-id for everyone.
>>> So we need at least for that container system to request that change.
>>> 
>>> How would they like to do that?
>>> 
>>> I suggest that we explain the problem to representatives of the various
>>> container communities that we have contact with (Well...  "you", more
>>> than "we" as I don't have contacts).
>> 
>> I'm all for involving one or more container experts. But IMO
>> it's not appropriate to update our man page to do that. Let's
>> update nfs(5) when we are done with this effort.
> 
> Don't let perfect be the enemy of good.
> We were making no progress with "fixing" nfs.  Documenting "how it works
> today" should never be a bad thing.

To be clear, I don't have a problem with documenting the current
behavior /somewhere else/. I do have a problem with documenting it
in nfs(5) as a situation that is fine, given its known shortcomings
and the fact that it will be updated in short order.


> Obviously we can (and must) update
> the documentation when we update the behaviour.
> 
> But if some concrete behavioural changes can be agreed and implemented
> through this discussion, I'm happy for the documentation to land only
> after those changes.
> 
>>>>> +.IP \- 2
>>>>> +NFS-root (diskless) clients, where the DHCP server (or equivalent) does
>>>>> +not provide a unique host name.
>>>> 
>>>> Suggest this addition:
>>>> 
>>>> .IP \- 2
>>>> 
>>>> Dynamically-assigned hostnames, where the hostname can be changed
>>>> after a client reboot or while the client is booted, or if a client
>>>> repeatedly connects to multiple networks (for example if it is moved
>>>> from home to an office every day).
>>> 
>>> This is a different kettle of fish.  The hostname is *always* included
>>> in the identifier.  If it isn't stable, then the identifier isn't
>>> stable.
>>> 
>>> I saw in the history that when you introduced the module parameter it
>>> replaced the hostname.  This caused problems in containers (which had
>>> different host names) so Trond changed it so the module parameter
>>> supplemented the hostname.
>>> 
>>> If hostnames are really so poorly behaved I can see there might be a
>>> case to suppress the hostname, but we don't have that option in current
>>> kernels.  Should we add it?
>> 
>> I claim that it has become problematic to use the hostname in the
>> nfs_client_id4 string.
> 
> In that case, we should fix it - make it possible to exclude the
> hostname from the nfs_client_id4 string.  You make a convincing case.
> Have you thoughts on how we should implement that?

This functionality has been implemented for some time using either
sysfs or a module parameter. Those APIs supplement the hostname
with whatever string is provided. I don't think we need to
exclude the hostname from the nfs_client_id4 -- in fact some folks
might prefer keeping the hostname in there as an eye-catcher. It's
simply that the hostname by itself does not provide enough
uniqueness.
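
For instance (a sketch only: the sysfs path is the one mentioned
later in this thread, and nfs.nfs4_unique_id is, as I understand
it, the module parameter name in current kernels):

  # Supplement the hostname for this net namespace via sysfs:
  uuidgen > /sys/fs/nfs/client/identifier

  # Or persistently, via the nfs module parameter:
  echo "options nfs nfs4_unique_id=$(uuidgen)" \
      > /etc/modprobe.d/nfs-identifier.conf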

The plan for some time now has been to construct user space mechanisms
that use the sysfs/module parameter APIs to always plug in a uniquifier.
That relieves the dependency on hostname uniqueness wherever those
mechanisms are in use.

So in other words, today the default is to use the hostname; using
the random uniquifier is an exception. The plan is to make the random
uniquifier the default, and fall back on the hostname if for some
reason the uniquifier initialization mechanism did not work.


>>> The hostname is copied at boot by NFS, and
>>> if it is included in the /sys/fs/nfs/client/identifier (which would be
>>> pointless, but not harmful) it has again been copied.
>>> 
>>> If it is different on subsequent boots, then that is a big problem and
>>> not one that we can currently fix.
>> 
>> Yes, we can fix it: don't use the client's hostname but
>> instead use a separate persistent uniquifier, as has been
>> proposed.
>> 
>> 
>>> ....except that a non-persistent client identifier isn't an enormous
>>> problem, just a possible cause of delays.
>> 
>> I disagree, it's a significant issue.
>> 
>> - If locks are lost, that is a potential source of data corruption.
>> 
>> - If a lease is stolen, that is a denial of service.
>> 
>> Our customers take this very seriously.
> 
> Of course, as they should.  Data integrity is paramount.  A
> non-persistent client identifier doesn't put that at risk - not in and
> of itself.
> 
> If a client's identifier changed during the lifetime of one instance of
> the client, then that would allow locks to be lost.  That does NOT
> happen just because you happen to change the host name.  The hostname is
> copied at first use.
> It *could* happen if you changed the module parameter or sysfs identity
> after the first mount, but I hope we can agree that is not a justifiable
> action.
> 
> A lease can only be "stolen" by a non-unique identifier, not simply by
> non-persistent identifiers.  But maybe this needs a caveat.

In this thread, I refer mostly to issues caused by
nfs_client_id4 non-uniqueness.

This is indeed the class of misbehavior that is significant
to our customer base. Multiple clients might use
"localhost.localdomain" simply because that's the way the
imaging template is built. Or when an image is copied to
create a new guest, the hostname is not changed. Those are
but two examples. In many cases, client administrators
are simply not in control of their hostnames.

In cloud deployments, AUTH_SYS is the norm because managing a
large Kerberos realm is generally onerous. Thus AUTH_SYS plus
a hostname-uniquified nfs_client_id4 is by far the common
case, though it is the most risky one.


> If a set of clients are each given host names from time to time which
> are, at any moment in time, unique, but are able to "migrate" from one
> client to another, then it would be possible for two clients to both
> have performed their first NFS mount when they have some common
> hostname X.  The "first" was given hostname X at boot time, it mounted
> something.  The hostname was subsequently changed to Y and some other
> host booted and got X and then mounted from the same server.  This
> would be seriously problematic.  I class this as "non-unique" hostnames,
> not as non-persistent-identifier.
> 
>> The NFS client's
>> out-of-the-shrink-wrap default behavior/configuration should be
>> conservative enough to prevent these issues. Customers store
>> mission critical data via NFS. Most customers expect NFS to work
>> reliably without a lot of configuration fuss.
> 
> I've been working on the assumption that it is not possible to provide
> ideal zero-config behaviour "out-of-the-shrink-wrap".  You have hinted
> (or more) a few times that this is your goal.  Certainly a worthy goal if
> possible.  Is it possible?
> 
> I contend that if there is no common standard for how containers (and
> network namespaces in particular) are used, then it is simply not
> possible to provide perfect out-of-the-box behaviour.  There *must* be
> some local configuration that we cannot enforce through the kernel or
> through nfs-utils.  We can offer, but we cannot enforce.  So we must
> document.
> 
> The very best that we could do would be to provide a random component to
> the identifier unless we had a high level of confidence that a unique
> identifier had been provided some other way.  I don't know how to get
> that high level of confidence in a way that doesn't break working
> configurations.
> Ben suggested defaulting 'identity' to a random string for any network
> namespace other than init.  I don't think that is cautious enough.
> Maybe if we did it when the network namespace is not init, but the UTS
> namespace is init.  But that feels like a hack and is probably brittle.
> 
> Can you suggest *any* way to improve the "out-of-shrink-wrap" behaviour
> significantly?

Well, it sounds like we agree that making the random uniquifier
the default is a good step forward. Because this has been
contentious so far, I think we should strive for something that
is best-effort but clearly a step up. The fallback can use the
hostname. Over time the remaining gaps can be closed.

Here are some suggestions that might make it simpler to implement.

1. Ben's tool manufactures the uniquifier if the file doesn't
   already exist. That seems somewhat racy. Instead, why not
   make installation utilities responsible for creating the
   uniquifier? We need some guarantee that when a VM is cloned,
   the uniquifier is replaced, for instance; that's well
   outside nfs-utils' sphere of influence.

   Document the requirements (a la machine-id(5)) then point
   the distributors and Docker folks at that. I think that is
   your plan, right? I've done the same with at least one of
   Oracle's virtualization products, while waiting for a more
   general upstream solution.

   Then, either each mount.nfs invocation or some part of
   system start-up checks for the uniquifier file and pushes
   the uniquifier into the local net namespace; a rough sketch
   appears after this list. (Doing this only once at boot has
   its appeal.) If the uniquifier file does not exist, then the
   NFS client continues to use a hostname uniquifier. Over time
   we find and address the fallback cases.


2. The udev rule mechanism that Trond proposed attempted to
   address both init_ns and subsequent namespaces the same way.
   Maybe it's time to examine the assumptions there to help
   us make more progress.

   Use independent mechanisms for the init_ns and for subsequent
   net namespaces. Perhaps Ben already suggested this. Looking
   back over weeks of this conversation, these two use cases
   seem fundamentally different from each other. The init_ns
   has to handle NFSROOT and can use the boot command line or
   the module parameter to deal with PXE booting and so on. The
   Docker case can use whatever works better for them.


3. We don't yet have a way to guarantee that the uniquifier is
   in place before the first NFS mount is initiated. Talking
   with someone who has deep systemd expertise might help. It
   might also help at least in the non-container case if the
   uniquifier is provided on the kernel command line, the same
   way that root= is specified.


4. An alternative for the init_ns case might be to add a
   mechanism to initramfs to set the client's uniquifier.
   On my clients where containers are not in use, I set the
   uniquifier using the module parameter; the module load
   config file needs to be added to initramfs before it
   takes effect.
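
To illustrate items 1 and 4, here is a rough boot-time sketch
(the uniquifier file name is hypothetical; the sysfs path is the
one discussed in this thread):

  #!/bin/sh
  # Push a pre-provisioned uniquifier into the kernel. If the
  # file was never created, do nothing; the client then silently
  # falls back to the hostname-based identifier.
  ID_FILE=/etc/nfsv4-uniquifier          # hypothetical location
  SYSFS=/sys/fs/nfs/client/identifier
  if [ -r "$ID_FILE" ] && [ -w "$SYSFS" ]; then
      cat "$ID_FILE" > "$SYSFS"
  fi

For NFSROOT, the equivalent might be setting the module parameter
on the kernel command line (nfs.nfs4_unique_id=<uuid>), the same
way root= is specified.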


>>>> If we want to create a good uniquifier here, then combine the
>>>> hostname, netns identity, and/or the host's machine-id and then
>>>> hash that blob with a known strong digest algorithm like
>>>> SHA-256. A man page must not recommend the use of deprecated or
>>>> insecure obfuscation mechanisms.
>>> 
>>> I didn't realize the hash that uuidgen uses was deprecated.  Is there
>>> some better way to provide an app-specific obfuscation of a string from
>>> the command line?
>>> 
>>> Maybe
>>>   echo nfs-id:`cat /etc/machine-id`| sha256sum
>>> 
>>> ??
>> 
>> Something like that, yes. But the scriptlet needs to also
>> involve the netns identity somehow.
> 
> Hmmm..  the impression I got from Ben was that the container system
> ensured that /etc/machine-id was different in different containers.  So
> there would be no need to add anything.  Of course I should make that
> explicit in the documentation.
> 
> It would be nice if we could always use "ip netns identify", but that
> doesn't seem to be generally supported.

If containers provide unique machine-ids, a digest of the
machine-id is fine with me.

Note that many implementations don't tolerate a large
nfs_client_id4 string, so keeping the digest small might be
necessary. Using blake2 might be a better choice.
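
For example (a sketch only; b2sum is the coreutils BLAKE2 tool,
truncated here to 128 bits to keep the string short -- the
"nfs-id:" prefix and the digest length are illustrative):

  echo "nfs-id:$(cat /etc/machine-id)" | b2sum -l 128 | cut -d' ' -f1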

--
Chuck Lever






