On Mon, Dec 06, 2021 at 07:37:31PM +0530, Chandan Babu R wrote: > On 06 Dec 2021 at 11:17, Chandan Babu R wrote: > > On 02 Dec 2021 at 21:15, Josef Bacik wrote: > >> On Thu, Dec 02, 2021 at 02:45:24PM +0000, Filipe Manana wrote: > >>> On Wed, Dec 01, 2021 at 01:18:52PM -0500, Josef Bacik wrote: > >>> > We've been seeing transient errors with any test that uses a dm device > >>> > for the entirety of the time that we've been running nightly xfstests > >>> > >>> I have been having it on my tests vms since ever as well. > >>> It's really annoying, but fortunatelly it doesn't happen too often. > >>> > >> > >> Yeah before this change we'd fail ~6 tests on every configruation on every > >> overnight run. With this change we fail 0. It's rare, but with our continual > >> testing it happens sooooooo much. > >> > >>> > runs. This turns out to be because sometimes we get EBUSY while trying > >>> > to create our new dm device. Generally this is because the test comes > >>> > right after another test that messes with the dm device, and thus we > >>> > still have udev messing around with the device when DM tries to O_EXCL > >>> > the block device. > >>> > > >>> > Add a UDEV_SETTLE_PROG before creating the device to make sure we can > >>> > create our new dm device without getting this transient error. > >>> > >>> I suspect this might only make it seem the problem goes away but does not > >>> really fix it. > >>> > >>> I say that for 2 reasons: > >>> > >>> 1) All tests that use dm end up calling _dmsetup_remove(), like through > >>> _log_writes_remove() or _cleanup_flakey() for example. Normally those > >>> are called in the _cleanup() function, which ensures it's done even if > >>> the test fails for some reason. > >>> > >>> So I don't understand why we need that UDEV_SETTLE_PROG at _dmsetup_create(). > >>> > >>> And I've seen the ebusy failure happen even when the previous tests did > >>> not use any dm device; > >>> > >>> 2) Some tests fail after creating the dm device and using it. For example > >>> btrfs/206 often fails when it tries to fsck the filesystem: > >>> > >>> btrfs/206 3s ... [failed, exit status 1]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad) > >>> --- tests/btrfs/206.out 2020-10-16 23:13:46.554152652 +0100 > >>> +++ /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad 2021-12-01 21:09:46.317632589 +0000 > >>> @@ -3,3 +3,5 @@ > >>> XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > >>> wrote 8192/8192 bytes at offset 0 > >>> XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > >>> +_check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent > >>> +(see /home/fdmanana/git/hub/xfstests/results//btrfs/206.full for details) > >>> ... > >>> > >>> (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/206.out /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad' to see the entire diff) > >>> > >>> In the .full file I got: > >>> > >>> (...) > >>> replaying 1239@11201: sector 2173408, size 16384, flags 0x10(METADATA) > >>> replaying 1240@11234: sector 0, size 0, flags 0x1(FLUSH) > >>> replaying 1241@11235: sector 128, size 4096, flags 0x12(FUA|METADATA) > >>> _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent > >>> *** fsck.btrfs output *** > >>> ERROR: cannot open device '/dev/sdc': Device or resource busy > >>> ERROR: cannot open file system > >>> Opening filesystem to check... > >>> *** end fsck.btrfs output > >>> *** mount output *** > >>> > >>> The ebusy failure is not when the test starts, but when somewhere in the middle > >>> of the replay loop when it calls fsck, or when it ends and the fstests framework > >>> calls fsck. > >>> > >>> I've seen that with btrfs/172 too, which also uses dm logwrites in a similar way. > >>> > >>> So to me this suggests 2 things: > >>> > >>> 1) Calling UDEV_SETTLE_PROG at _dmsetup_create() doesn't solve that problem with > >>> btrfs/206 (and other tests) - the problem is fsck failing to open the scratch > >>> device after it called _log_writes_remove() -> _dmsetup_remove(), and not a > >>> failure to create the dm device; > >>> > >>> 2) The problem is likely something missing at _dmsetup_remove(). Perhaps add > >>> another UDEV_SETTLE_PROG there: > >>> > >>> diff --git a/common/rc b/common/rc > >>> index 8e351f17..22b34677 100644 > >>> --- a/common/rc > >>> +++ b/common/rc > >>> @@ -4563,6 +4563,7 @@ _dmsetup_remove() > >>> $UDEV_SETTLE_PROG >/dev/null 2>&1 > >>> $DMSETUP_PROG remove "$@" >>$seqres.full 2>&1 > >>> $DMSETUP_PROG mknodes >/dev/null 2>&1 > >>> + $UDEV_SETTLE_PROG >/dev/null 2>&1 > >>> } > >>> > >>> _dmsetup_create() > >>> > >>> I can't say if that change to _dmsetup_remove() is correct, or what it's > >>> needed, as I really haven't spent time trying to figure out why the issue > >>> happens. > >>> > >> > >> I actually tried a few iterations before I settled on this one, but I was only > >> trying to reproduce the EBUSY when creating the flakey device, I hadn't seen it > >> with fsck. So I originally started with your change, but it didn't fix the > >> problem. Then I did both, UDEV_SETTLE at the end of remove and at the beginning > >> of create and the problem went away, and then I removed the one from remove and > >> the problem still was gone. > >> > >> But since I've made this change I also have been investigating another problem > >> where we'll get EBUSY at mkfs time when we use SCRATCH_DEV_POOL. I have a test > >> patch in our staging branch to make sure it actuall fixes it, but I have to add > >> this to the start of _scratch_pool_mkfs as well. > >> > >> It turns out that udev is doing this thing where it writes to > >> /sys/block/whatever/uevent to make sure that a KOBJ_CHANGE event gets sent out > >> for the device. > >> > >> This is racing with the test doing a mount. So the mount gets O_EXCL, which > >> means the uevent doesn't get emitted until umount. This would explain what > >> you're seeing, we umount, we get the KOBJ_CHANGE event once the O_EXCL is > >> dropped, udev runs, and then fsck gets an EBUSY. > > > > I started debugging this issue late last week. Among several tests which > > failed, xfs/033 was failing once in a while because the umount syscall > > returned -EBUSY. Debugging this further led me to the following, > > > > 1. Mounting an fs causes udev to invoke xfs_spaceman (via udisksd => xfs_info). > > This causes the per-cpu counter at mount->mnt_pcp->mnt_count to increase by > > 1. > > 2. Unmount at this stage causes the umount syscall to fail since the refcount > > on struct mount is greater than 2. > > > > I changed my test disks from the regular /dev/sd* devices to loop devices. I > > then added loop devices to be ignored by udev with the following change in > > /lib/udev/rules.d/60-persistent-storage.rules, > > > > KERNEL!="sr*|loop*", IMPORT{builtin}="blkid" > > > > This led to xfs/033 execute fine for 100 times without failure. > > > However, other tests which use device mapper devices are still failing > > arbitrarily. > > This failure occurs because 60-persistent-storage-dm.rules has the following, > > # Add inotify watch to track changes on this device. > # Using the watch rule is not optimal - it generates a lot of spurious > # and useless events whenever the device opened for read-write is closed. > # The best would be to generete the event directly in the tool changing > # relevant information so only relevant events will be processed > # (like creating a filesystem, changing filesystem label etc.). > # > # But let's use this until we have something better... > LABEL="dm_watch" > OPTIONS+="watch" > > Hence any change to the device will generate a uevent causing udev's internal > blkid to be invoked. This ends up forking and executing xfs_spaceman. > First of all excellent work digging into this Chandan, thanks so much! <snip> > > > > To be honest, I don't understand udev well enough. But as pointed above, a > > simple invocation of mount is causing xfs_spaceman to be executed (indirectly) > > by udevd. Hence, may be executing UDEV_SETTLE_PROG is probably not > > sufficient. > > Yeah so the question is what do we do here to make xfstests and udev co-exist nicely? I imagine if you added UDEV_SETTLE_PROG around every operation we do (for example before umount in your test) we would stop hitting these problems. But this clearly sucks. Should we instead come up with a couple of udev rules and put them here in fstests with the disclaimer "hey if you see problems, install these udev rules" and call it good enough? I see there's UDEV_DISABLE_PERSISTENT_STORAGE_RULES_FLAG and DM_NOSCAN, I don't know how to write udev rules, but is there a way we could write a script to use the local.config to generate a udev rule to be copied into place based on the users device settings in local.config that would keep udev from touching the devices we're using? Thanks, Josef