on 2023/08/21 19:16, Filipe Manana wrote: > On Mon, Aug 21, 2023 at 6:39 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote: >> >> On Sat, Aug 19, 2023 at 05:25:35PM +0100, Filipe Manana wrote: >>> On Fri, Aug 18, 2023 at 1:43 PM Zorro Lang <zlang@xxxxxxxxxx> wrote: >>>> On Wed, Aug 16, 2023 at 02:58:10PM +0000, Yang Xu (Fujitsu) wrote: >>>>> I think the url that Jens mention should be this[1] when he reviewed >>>>> Stefan V7 patch for "io-uring/xfs: support async buffered writes". >>>>> >>>>> [1]https://lkml.kernel.org/linux-fsdevel/ca60a7dc-b16d-d8ce-f56c-547b449da8c9@xxxxxxxxx/ >>>> >>>> Hi Filipe, >>>> >>>> Does above explanation make sense to you? >>> >>> Not completely. >>> >>> It justifies that the test's assumptions are not necessarily correct, >>> that I understood and it's reasonable. >>> >>> However I don't see, neither in that thread nor in this patch's >>> changelog, why the test started to fail (on xfs, it still passes on >>> btrfs and ext4) after adding support for async buffered IO writes to >>> xfs and iomap - as all the writes done by the test are using direct >>> IO. >> >> We changed how timestamps are updated so that they are aware of >> IOCB_NOWAIT. If the IOCB_NOWIAT DIO write now needs to update the >> inode timestamps, it will return -EAGAIN instead of doing >> potentially blocking operations that require IO to complete (i.e. >> taking a transaction reservation). >> >> Hence the first time we go to do a DIO read an inode, it's going to >> do an atime update, which now occurrs from an IOCB_NOWAIT context >> and we return -EAGAIN.... >> >> Yes, we added non-blocking timestamp updates as part of the async >> buffered write support, but this was a general XFS IO path change of >> behaviour to address a potential blocking point in *all* IOCB_NOWAIT >> reads and writes, buffered or direct. > > Ok, now that's the kind of explanation I would expect in the changelog. > >> >>> At least the changelog should point to that thread, and not the one it >>> currently refers to because it doesn't provide any useful information. >>> For completeness it should also justify why the async buffered write >>> support broke the test, as it points out the test fails after those 2 >>> commits for buffered write support, yet there are no buffered writes >>> performed by the test. >> >> async buffered writes didn't break the test. The addition of >> nonblocking timestamp updates in XFS is what causes the test to now >> fail. > > Ok, so the changelog is very misleading, as it points to commits that, > as far as I can see at least, > have nothing to do with the changes that make timestamp updates aware > of NOWAIT semantics. > > So it should be the following commit to be referred (and possibly others): > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fa3cedf16abc82d19b943e3289c82e685419d5 > >> >> Indeed, we may change this XFS behaviour again some day - if we can >> guarantee that we can get a transaciton reservation without blocking >> then we -could- allow the timestamp update to run in IOCB_NOWAIT >> context. Doing this would then mean the test might randomly fail, >> depending on whether the timestamp update can be done without >> blocking or not.... >> >> Put simply, the test is not validating that RWF_NOWAIT is behaving >> correctly - it just was a simple operation that kinda exercised >> RWF_NOWAIT semantics when we had no other way to test this code. It >> has outlived it's original purpose, so it should be removed... > > Thanks for the detailed explanation. > > A simpler version of this, or perhaps a lore link to your reply should > be added to the changelog, > because the current one is more like "remove this test because someone > else told to do so" without > any relevant details. Sorry, I just saw part of the generic/471 history and discussion last week, then wrote the misleading changelog. But now , it is better than do nothing. At least, we figured out why xfs fail reason and why remove this test. So I guess I only need to add the following word as commit message is enough "The test is not validating that RWF_NOWAIT is behaving correctly - it just was a simple operation that kinda exercised RWF_NOWAIT semantics when we had no other way to test this code. It has outlived it's original purpose, so it should be removed... " Is it right? Best Regards Yang Xu > >> >> -Dave. >> -- >> Dave Chinner >> david@xxxxxxxxxxxxx