On 6/28/23 4:55?PM, Kent Overstreet wrote: >> But it's not aio (or io_uring or whatever), it's simply the fact that >> doing an fput() from an exiting task (for example) will end up being >> done async. And hence waiting for task exits is NOT enough to ensure >> that all file references have been released. >> >> Since there are a variety of other reasons why a mount may be pinned and >> fail to umount, perhaps it's worth considering that changing this >> behavior won't buy us that much. Especially since it's been around for >> more than 10 years: > > Because it seems that before io_uring the race was quite a bit harder to > hit - I only started seeing it when things started switching over to > io_uring. generic/388 used to pass reliably for me (pre backpointers), > now it doesn't. I literally just pasted a script that hits it in one second with aio. So maybe generic/388 doesn't hit it as easily, but it's surely TRIVIAL to hit with aio. As demonstrated. The io_uring is not hard to bring into parity on that front, here's one I posted earlier today for 6.5: https://lore.kernel.org/io-uring/20230628170953.952923-4-axboe@xxxxxxxxx/ Doesn't change the fact that you can easily hit this with io_uring or aio, and probably more things too (didn't look any further). Is it a realistic thing outside of funky tests? Probably not really, or at least if those guys hit it they'd probably have the work-around hack in place in their script already. But the fact is that it's been around for a decade. It's somehow a lot easier to hit with bcachefs than XFS, which may just be because the former has a bunch of workers and this may be deferring the delayed fput work more. Just hand waving. >> then we'd probably want to move that deferred fput list to the >> task_struct and ensure that it gets run if the task exits rather than >> have a global deferred list. Currently we have: >> >> >> 1) If kthread or in interrupt >> 1a) add to global fput list >> 2) task_work_add if not. If that fails, goto 1a. >> >> which would then become: >> >> 1) If kthread or in interrupt >> 1a) add to global fput list >> 2) task_work_add if not. If that fails, we know task is existing. add to >> per-task defer list to be run at a convenient time before task has >> exited. > > no, it becomes: > if we're running in a user task, or if we're doing an operation on > behalf of a user task, add to the user task's deferred list: otherwise > add to global deferred list. And how would the "on behalf of a user task" work in terms of being in_interrupt()? -- Jens Axboe