> On Apr 14, 2022, at 12:13 AM, NeilBrown <neilb@xxxxxxx> wrote: > > On Wed, 13 Apr 2022, Chuck Lever III wrote: >> >>> On Apr 12, 2022, at 3:06 AM, NeilBrown <neilb@xxxxxxx> wrote: >>> >>> On Tue, 12 Apr 2022, Chuck Lever wrote: >>>> + >>>> +If a client's "co_ownerid" string or principal are not stable, >>>> +state recovery after a server or client reboot is not guaranteed. >>>> +If a client unexpectedly restarts but presents a different >>>> +"co_ownerid" string or principal to the server, the server orphans >>>> +the client's previous open and lock state. This blocks access to >>>> +locked files until the server removes the orphaned state. >>>> + >>>> +If the server restarts and a client presents a changed "co_ownerid" >>>> +string or principal to the server, the server will not allow the >>>> +client to reclaim its open and lock state, and may give those locks >>>> +to other clients in the mean time. This is referred to as "lock >>>> +stealing". >>> >>> This is not a possible scenario with Linux NFS client. The client >>> assembles the string once from various sources, then uses it >>> consistently at least until unmount or reboot. Is it worth mentioning? >> >> Neil, thanks for the eyes-on. I've integrated the other suggestions >> in your reply. However there are some corner cases here that I'd >> like to consider before proceeding. >> >> Generally, preserving the cl_owner_id string is good defense against >> lock stealing. Looks like the Linux NFS client didn't do that before >> ceb3a16c070c ("NFSv4: Cache the NFSv4/v4.1 client owner_id in the >> struct nfs_client"). >> >> If a server filesystem is migrated to a server that the client hasn't >> contacted before, and the client's uniquifier or hostname has changed >> since the client established its lease with the first server, there >> is the possibility of lock stealing during transparent state migration. >> >> I'm also not certain how the Linux NFS client preserves the principal >> that was used when a lease is first established. It's going to use >> Kerberos if possible, but what if the kernel's cred cache expires and >> the keytab has been altered in the meantime? I haven't walked through >> that code carefully enough to understand whether there is still a >> vulnerability. >> > > I don't think id stability relates to lock stealing. > > - global uniqueness guards against stealing > - stability guards against misplacing locks during migration ("stolen" > - seems an inappropriate term as no entity an be blamed for stealing). An entity can be blamed: the client didn't adequately protect it's locks during a migration or reboot recovery. That permits other clients to lock those files instead. The issue I was most concerned about describing here was principal changes, not co_ownerid changes. But if either one changes, there is going to be a potential problem. > Certainly stability of both the identity and the credential are > important. If that stability is not provided then that is a kernel bug. > As you say, ceb3a16c070c fixed a bug in this area. > I don't think this document is an appropriate place to warn against > kernel bugs - that doesn't fit with the purpose given in the intro. > > I don't know know if the credential can change either - I hope not. > IF the credential can actually change, that would be a kernel bug. > IF the same credential cannot be renewed, that is a separate problem, > but should be treated like any other credential that cannot be renewed. > > I won't argue strongly against this text - I just don't see how it is > appropriate and think it could be confusing. Well, it seems like this document is perhaps warning against something that is not likely but still a risk. The warning doesn't seem to cause more harm -- rather it seems like keeping the principal and co_ownerid string stable is perhaps a little too restrictive, but that isn't a constraint that introduces more risk to data, IMO. I think we can agree on the initial text for this document, and admit that it might need some editing as we go along. -- Chuck Lever