On 6/28/23 7:00?PM, Dave Chinner wrote: > On Wed, Jun 28, 2023 at 07:50:18PM -0400, Kent Overstreet wrote: >> On Wed, Jun 28, 2023 at 05:14:09PM -0600, Jens Axboe wrote: >>> On 6/28/23 4:55?PM, Kent Overstreet wrote: >>>>> But it's not aio (or io_uring or whatever), it's simply the fact that >>>>> doing an fput() from an exiting task (for example) will end up being >>>>> done async. And hence waiting for task exits is NOT enough to ensure >>>>> that all file references have been released. >>>>> >>>>> Since there are a variety of other reasons why a mount may be pinned and >>>>> fail to umount, perhaps it's worth considering that changing this >>>>> behavior won't buy us that much. Especially since it's been around for >>>>> more than 10 years: >>>> >>>> Because it seems that before io_uring the race was quite a bit harder to >>>> hit - I only started seeing it when things started switching over to >>>> io_uring. generic/388 used to pass reliably for me (pre backpointers), >>>> now it doesn't. >>> >>> I literally just pasted a script that hits it in one second with aio. So >>> maybe generic/388 doesn't hit it as easily, but it's surely TRIVIAL to >>> hit with aio. As demonstrated. The io_uring is not hard to bring into >>> parity on that front, here's one I posted earlier today for 6.5: >>> >>> https://lore.kernel.org/io-uring/20230628170953.952923-4-axboe@xxxxxxxxx/ >>> >>> Doesn't change the fact that you can easily hit this with io_uring or >>> aio, and probably more things too (didn't look any further). Is it a >>> realistic thing outside of funky tests? Probably not really, or at least >>> if those guys hit it they'd probably have the work-around hack in place >>> in their script already. >>> >>> But the fact is that it's been around for a decade. It's somehow a lot >>> easier to hit with bcachefs than XFS, which may just be because the >>> former has a bunch of workers and this may be deferring the delayed fput >>> work more. Just hand waving. >> >> Not sure what you're arguing here...? >> >> We've had a long standing bug, it's recently become much easier to hit >> (for multiple reasons); we seem to be in agreement on all that. All I'm >> saying is that the existence of that bug previously is not reason to fix >> it now. > > I agree with Kent here - the kernel bug needs to be fixed > regardless of how long it has been around. Blaming the messenger > (userspace, fstests, etc) and saying it should work around a > spurious, unpredictable, undesirable and user-undebuggable kernel > behaviour is not an acceptible solution here... Not sure why you both are putting words in my mouth, I've merely been arguing pros and cons and the impact of this. I even linked the io_uring addition for ensuring that side will work better once the deferred fput is sorted out. I didn't like the idea of fixing this through umount, and even outlined how it could be fixed properly by ensuring we flush per-task deferred puts on task exit. Do I think it's a big issue? Not at all, because a) nobody has reported it until now, and b) it's kind of a stupid case. If we can fix it with minimal impact, should we? Yep. Particularly as the assumptions stated in the original commit I referenced were not even valid back then. -- Jens Axboe