On Mon, Aug 21, 2023 at 05:31:48PM +0200, Miklos Szeredi wrote:

(Apologies for the delay, I have been away without cell signal for some
time.)

> > I think the idea is that they're saving snapshots of their own threads
> > to the fs for debugging purposes.
>
> This seems a fairly special situation. Have they (whoever they may
> be) thought about fixing this in their server?

Sorry, "we" here is some internal team that works for my employer
Netflix. We can't use imap clients on our corporate e-mails, whee.

> > Whether this is a sane thing to do or not, it doesn't seem like it
> > should deadlock pid ns destruction.
>
> True. So the suggested solution is to allow wait_event_killable() to
> return if a terminal signal is pending in the exiting state and only
> in that case turn the flush into a background request? That would
> still allow for regressions like the one reported, but that would be
> much less likely to happen in real life. Okay, I said this for the
> original solution as well, so this may turn out to be wrong as well.

I wonder if there's room here for a completion that doesn't use the
wait primitives. Something like an atomic + queuing in task_work()
would both fix this bug and not exhibit this regression, IIUC.

> Anyway, I'd prefer if this was fixed in the server code, as it looks
> fairly special and adding complexity to the kernel for this case might
> not be justifiable. But I'm also open to suggestions on fixing this
> in the kernel in a not too complex manner.

I don't think this is specific to the server-accessing-its-own-file
case. My reproducer uses that because I didn't quite understand the bug
fully at the time. I believe that *any* task that is killed with an
inflight fuse request will exhibit this. We have seen this fairly
rarely on another fuse fs we use throughout the fleet:

https://github.com/lxc/lxcfs

and it doesn't really do anything strange, and is mounted from the
host's pid ns.

Tycho