On Mon, Jul 08 2024, Theodore Ts'o wrote: > On Wed, May 29, 2024 at 10:20:29AM +0100, Luis Henriques (SUSE) wrote: >> When a full journal commit is on-going, any fast commit has to be enqueued >> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing >> is done only once, i.e. if an inode is already queued in a previous fast >> commit entry it won't be enqueued again. However, if a full commit starts >> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to >> be done into FC_Q_STAGING. And this is not being done in function >> ext4_fc_track_template(). >> >> This patch fixes the issue by re-enqueuing an inode into the STAGING queue >> during the fast commit clean-up callback if it has a tid (i_sync_tid) >> greater than the one being handled. The STAGING queue will then be spliced >> back into MAIN. >> >> This bug was found using fstest generic/047. This test creates several 32k >> bytes files, sync'ing each of them after it's creation, and then shutting >> down the filesystem. Some data may be loss in this operation; for example a >> file may have it's size truncated to zero. >> >> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@xxxxxxxxx> > > This patch is causing a regression for the test generic/472 > generic/496 generic/643 if fast_commit is enabled. So using the > ext4/adv or ext4/fast_commit configuration, e.g: > > % kvm-xfstests -c ext4/fast_commit generic/472 generic/496 generic/643 > > For all of these test, the failures seem to involve the swapon command > erroring out: > > --- tests/generic/496.out 2024-06-13 18:57:39.000000000 -0400 > +++ /results/ext4/results-fast_commit/generic/496.out.bad 2024-07-08 23:46:39.720 > @@ -1,3 +1,4 @@ > QA output created by 496 > fallocate swap > mixed swap > +swapon: Invalid argument > ... > > but it's unclear why this patch would affect swapon. OK, that's... embarrassing. I should have caught these failures :-( > I've never been able to see generic/047 failure in any of my ext4/dev > testing, nor in any of my daily fs-next CI testing. So for that > reason, I'm going to drop this patch from my tree. There's nothing special about my test environment. I can reproduce the generic/047 failure (although not 100% of the times) by running it manually in a virtme-ng test environment, using MKFS_OPTIONS="-O fast_commit". Here's what I see when running it: FSTYP -- ext4 PLATFORM -- Linux/x86_64 virtme-ng 6.10.0-rc7+ #269 SMP PREEMPT_DYNAMIC Tue Jul 9 14:24:22 WEST 2024 MKFS_OPTIONS -- -F -O fast_commit /dev/vdb1 MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdb1 /tmp/mnt/scratch generic/047 162s ... - output mismatch (see [...]/testing/xfstests-dev/results//generic/047.out.bad) --- tests/generic/047.out 2021-01-11 12:08:14.972458324 +0000 +++ [...]/testing/xfstests-dev/results//generic/047.out.bad 2024-07-09 14:28:36.626435948 +0100 @@ -1 +1,2 @@ QA output created by 047 +file /tmp/mnt/scratch/944 has incorrect size - fsync failed ... (Run 'diff -u [...]/testing/xfstests-dev/tests/generic/047.out [...]/testing/xfstests-dev/results//generic/047.out.bad' to see the entire diff) Ran: generic/047 Failures: generic/047 Failed 1 of 1 tests > The second patch in this series appears to be independent at least > from a logical perspective --- although a minor change is needed to > resolve a merge conflict after dropping this change. > > Luis, Harshad, could you look in this failure and then resubmit once > it's been fixed? Thanks!! Also, Luis, can you give more details > about the generic/047 failure that you had seen? Is it one of those > flaky tests where you need to run the test dozens or hundreds of time > to see the failure? So, I've done some quick tests, but I'll need some more time to dig into it. And this is what I _think_ it's happening: When activating a swap file, the kernel forces an fsync, calling ext4_sync_file() which will then call ext4_fc_commit() and, eventually, the ext4_fc_cleanup(). With this patch an inode may be re-enqueued into the STAGING queue and then spliced back into MAIN; and that's exactly what I see happening. Later, still on the swap activation path, ext4_set_iomap() will be called and will do this: if (ext4_inode_datasync_dirty(inode) || offset + length > i_size_read(inode)) iomap->flags |= IOMAP_F_DIRTY; 'ext4_inode_datasync_dirty()' will be true because '->i_fc_list' is not empty. And that's why the swapoff will fail. Anyway, I'll try to figure out what's missing here (or what's wrong with my patch). Cheers, -- Luís