On Thu, 2022-07-07 at 11:13 +1000, NeilBrown wrote: > > In NFSv4.1 when we EXCHANGE_ID to talk to a new server - possibly a > PNFS > Data Server that we haven't talked to before - we by default send an > implementation id. This is created from several fields obtained from > utsname(). > utsname() depends on current->nsproxy, and will crash if that is > NULL. > > When a process exits it calls, among other things, > > exit_task_namespaces(tsk); > exit_task_work(tsk); > > exit_task_namespaces() will set ->nsproxy to NULL > exit_task_work() will run delayed work items, including fput() on all > files that were still open when the process exited. This will cause > any > pending writes to be flushed for NFS. > > So if a process writes to a file on a PNFS server, exits, and the MDS > tells the client to send the data to a DS which it hasn't established > a > connection with before, then it will crash in encode_exchange_id(). > > That order of calls in do_exit() is deliberate so we cannot swap them > - see > Commit: 8aac62706ada ("move exit_task_namespaces() outside of > exit_notify()") > > The options that I can see are: > 1/ generate the implementation-id string at mount time and keep it > around much like we do for cl_owner_id > 2/ Check current->nsproxy in encode_exchange_id() and skip the > implementation id if ->nsproxy is not available. > Note that there is no risk for a race with testing ->nxproxy. > > Doesn't anyone have a strong opinion of which is best. I'm inclined > to > go with '2', but mostly because it is less coding. > I'd be fine with ignoring the implementation id in that case. The protocol doesn't require it, so servers are expecte to be able to deal with that case. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx