On Thu, Jun 29, 2023 at 01:18:11PM +0200, Christian Brauner wrote: > On Wed, Jun 28, 2023 at 07:33:18PM -0600, Jens Axboe wrote: > > On 6/28/23 7:00?PM, Dave Chinner wrote: > > > On Wed, Jun 28, 2023 at 07:50:18PM -0400, Kent Overstreet wrote: > > >> On Wed, Jun 28, 2023 at 05:14:09PM -0600, Jens Axboe wrote: > > >>> On 6/28/23 4:55?PM, Kent Overstreet wrote: > > >>>>> But it's not aio (or io_uring or whatever), it's simply the fact that > > >>>>> doing an fput() from an exiting task (for example) will end up being > > >>>>> done async. And hence waiting for task exits is NOT enough to ensure > > >>>>> that all file references have been released. > > >>>>> > > >>>>> Since there are a variety of other reasons why a mount may be pinned and > > >>>>> fail to umount, perhaps it's worth considering that changing this > > >>>>> behavior won't buy us that much. Especially since it's been around for > > >>>>> more than 10 years: > > >>>> > > >>>> Because it seems that before io_uring the race was quite a bit harder to > > >>>> hit - I only started seeing it when things started switching over to > > >>>> io_uring. generic/388 used to pass reliably for me (pre backpointers), > > >>>> now it doesn't. > > >>> > > >>> I literally just pasted a script that hits it in one second with aio. So > > >>> maybe generic/388 doesn't hit it as easily, but it's surely TRIVIAL to > > >>> hit with aio. As demonstrated. The io_uring is not hard to bring into > > >>> parity on that front, here's one I posted earlier today for 6.5: > > >>> > > >>> https://lore.kernel.org/io-uring/20230628170953.952923-4-axboe@xxxxxxxxx/ > > >>> > > >>> Doesn't change the fact that you can easily hit this with io_uring or > > >>> aio, and probably more things too (didn't look any further). Is it a > > >>> realistic thing outside of funky tests? Probably not really, or at least > > >>> if those guys hit it they'd probably have the work-around hack in place > > >>> in their script already. > > >>> > > >>> But the fact is that it's been around for a decade. It's somehow a lot > > >>> easier to hit with bcachefs than XFS, which may just be because the > > >>> former has a bunch of workers and this may be deferring the delayed fput > > >>> work more. Just hand waving. > > >> > > >> Not sure what you're arguing here...? > > >> > > >> We've had a long standing bug, it's recently become much easier to hit > > >> (for multiple reasons); we seem to be in agreement on all that. All I'm > > >> saying is that the existence of that bug previously is not reason to fix > > >> it now. > > > > > > I agree with Kent here - the kernel bug needs to be fixed > > > regardless of how long it has been around. Blaming the messenger > > > (userspace, fstests, etc) and saying it should work around a > > > spurious, unpredictable, undesirable and user-undebuggable kernel > > > behaviour is not an acceptible solution here... > > > > Not sure why you both are putting words in my mouth, I've merely been > > arguing pros and cons and the impact of this. I even linked the io_uring > > addition for ensuring that side will work better once the deferred fput > > is sorted out. I didn't like the idea of fixing this through umount, and > > even outlined how it could be fixed properly by ensuring we flush > > per-task deferred puts on task exit. > > > > Do I think it's a big issue? Not at all, because a) nobody has reported > > it until now, and b) it's kind of a stupid case. If we can fix it with > > Agreed. yeah, the rest of this email that I snipped is _severely_ confused about what is going on here. Look, the main thing I want to say is - I'm not at all impressed by this continual evasiveness from you and Jens. It's a bug, it needs to be fixed. We are engineers. It is our literal job to do the hard work and solve the hard problems, and leave behind a system more robust and more reliable for the people who come after us to use. Not to kick the can down the line and leave lurking landmines in the form of "oh you just have to work around this like x..."