Re: [PATCH] Documentation: Add an explanation of NFSv4 client identifiers

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Thu, 14 Apr 2022 15:25:57 +0000

> On Apr 14, 2022, at 12:13 AM, NeilBrown <neilb@xxxxxxx> wrote:
> 
> On Wed, 13 Apr 2022, Chuck Lever III wrote:
>> 
>>> On Apr 12, 2022, at 3:06 AM, NeilBrown <neilb@xxxxxxx> wrote:
>>> 
>>> On Tue, 12 Apr 2022, Chuck Lever wrote:
>>>> +
>>>> +If a client's "co_ownerid" string or principal are not stable,
>>>> +state recovery after a server or client reboot is not guaranteed.
>>>> +If a client unexpectedly restarts but presents a different
>>>> +"co_ownerid" string or principal to the server, the server orphans
>>>> +the client's previous open and lock state. This blocks access to
>>>> +locked files until the server removes the orphaned state.
>>>> +
>>>> +If the server restarts and a client presents a changed "co_ownerid"
>>>> +string or principal to the server, the server will not allow the
>>>> +client to reclaim its open and lock state, and may give those locks
>>>> +to other clients in the mean time. This is referred to as "lock
>>>> +stealing".
>>> 
>>> This is not a possible scenario with Linux NFS client.  The client
>>> assembles the string once from various sources, then uses it
>>> consistently at least until unmount or reboot.  Is it worth mentioning?
>> 
>> Neil, thanks for the eyes-on. I've integrated the other suggestions
>> in your reply. However there are some corner cases here that I'd
>> like to consider before proceeding.
>> 
>> Generally, preserving the cl_owner_id string is good defense against
>> lock stealing. Looks like the Linux NFS client didn't do that before
>> ceb3a16c070c ("NFSv4: Cache the NFSv4/v4.1 client owner_id in the
>> struct nfs_client").
>> 
>> If a server filesystem is migrated to a server that the client hasn't
>> contacted before, and the client's uniquifier or hostname has changed
>> since the client established its lease with the first server, there
>> is the possibility of lock stealing during transparent state migration.
>> 
>> I'm also not certain how the Linux NFS client preserves the principal
>> that was used when a lease is first established. It's going to use
>> Kerberos if possible, but what if the kernel's cred cache expires and
>> the keytab has been altered in the meantime? I haven't walked through
>> that code carefully enough to understand whether there is still a
>> vulnerability.
>> 
> 
> I don't think id stability relates to lock stealing.
> 
> - global uniqueness guards against stealing
> - stability guards against misplacing locks during migration ("stolen"
> - seems an inappropriate term as no entity an be blamed for stealing).

An entity can be blamed: the client didn't adequately protect it's
locks during a migration or reboot recovery. That permits other
clients to lock those files instead.

The issue I was most concerned about describing here was principal
changes, not co_ownerid changes. But if either one changes, there
is going to be a potential problem.

> Certainly stability of both the identity and the credential are
> important.  If that stability is not provided then that is a kernel bug.
> As you say, ceb3a16c070c fixed a bug in this area.
> I don't think this document is an appropriate place to warn against
> kernel bugs - that doesn't fit with the purpose given in the intro.
> 
> I don't know know if the credential can change either - I hope not.
> IF the credential can actually change, that would be a kernel bug.
> IF the same credential cannot be renewed, that is a separate problem,
> but should be treated like any other credential that cannot be renewed.
> 
> I won't argue strongly against this text - I just don't see how it is
> appropriate and think it could be confusing.

Well, it seems like this document is perhaps warning against something
that is not likely but still a risk. The warning doesn't seem to cause
more harm -- rather it seems like keeping the principal and co_ownerid
string stable is perhaps a little too restrictive, but that isn't a
constraint that introduces more risk to data, IMO.

I think we can agree on the initial text for this document, and admit
that it might need some editing as we go along.

--
Chuck Lever