On Thu, 07 Jul 2022, Trond Myklebust wrote: > On Thu, 2022-07-07 at 11:13 +1000, NeilBrown wrote: > > > > In NFSv4.1 when we EXCHANGE_ID to talk to a new server - possibly a > > PNFS > > Data Server that we haven't talked to before - we by default send an > > implementation id. This is created from several fields obtained from > > utsname(). > > utsname() depends on current->nsproxy, and will crash if that is > > NULL. > > > > When a process exits it calls, among other things, > > > > exit_task_namespaces(tsk); > > exit_task_work(tsk); > > > > exit_task_namespaces() will set ->nsproxy to NULL > > exit_task_work() will run delayed work items, including fput() on all > > files that were still open when the process exited. This will cause > > any > > pending writes to be flushed for NFS. > > > > So if a process writes to a file on a PNFS server, exits, and the MDS > > tells the client to send the data to a DS which it hasn't established > > a > > connection with before, then it will crash in encode_exchange_id(). > > > > That order of calls in do_exit() is deliberate so we cannot swap them > > - see > > Commit: 8aac62706ada ("move exit_task_namespaces() outside of > > exit_notify()") > > > > The options that I can see are: > > 1/ generate the implementation-id string at mount time and keep it > > around much like we do for cl_owner_id > > 2/ Check current->nsproxy in encode_exchange_id() and skip the > > implementation id if ->nsproxy is not available. > > Note that there is no risk for a race with testing ->nxproxy. > > > > Doesn't anyone have a strong opinion of which is best. I'm inclined > > to > > go with '2', but mostly because it is less coding. > > > > I'd be fine with ignoring the implementation id in that case. The > protocol doesn't require it, so servers are expecte to be able to deal > with that case. Thanks for the quick reply. I agree with you. However it turns out that this isn't a problem in mainline. When you close a file the ->flush() happens earlier in do_exit() and the name spaces are still around. However if you have a file mapped and have dirty pages, then mainline doesn't force a flush on final unmap, or exit of the process that had it mapped. As I explained in https://lore.kernel.org/all/150304037195.30218.15740287358704674869.stgit@noble/ I think this is wrong, so I have an fsync() call in nfs_file_release(). This *is* run after nsproxy has been cleared. So mainline doesn't need this fix, but I do. Thanks, NeilBrown