From: Dave Chinner <dchinner@xxxxxxxxxx>

When there is load on the system, newly created DM devices don't
seem to be created consistently. When a new device is created, it is
supposed to be created as /dev/dm-X, and then a udev rule creates
the symlink from /dev/mapper/<dev name> to /dev/dm-X.

Unfortunately, in a lot of the tests that use dynamically created dm
devices (dmerror, dmflakey), the devices are not being created with
this device node structure. This results in getting the wrong short
device name for the block device, and hence we can't find the
filesystem sysfs attribute directory for the filesystem on that
block device.

For example, with added debug to check what device name was being
passed around and resolved:

generic/489       - output mismatch (see /mnt/xfs/runner-10/results/xfs/generic/489.out.bad)
    --- tests/generic/489.out   2022-12-21 15:53:25.503043574 +1100
    +++ /mnt/xfs/runner-10/results/xfs/generic/489.out.bad      2024-10-24 10:27:29.767196340 +1100
    @@ -1,4 +1,10 @@
     QA output created by 489
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/fail_at_unmount: No such file or directory
    +dev: /dev/mapper/flakey-test.489
    +resolved dev: /dev/mapper/flakey-test.489
    +brw-rw----. 1 root disk 251, 5 Oct 24 10:27 /dev/mapper/flakey-test.489
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/metadata/EIO/max_retries: No such file or directory
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/metadata/EIO/retry_timeout_seconds: No such file or directory
    ...
    (Run 'diff -u /home/dave/src/xfstests-dev/tests/generic/489.out /mnt/xfs/runner-10/results/xfs/generic/489.out.bad'  to see the entire diff)

Here we see that the block device node is actually at
/dev/mapper/flakey-test.489, not a link to a /dev/dm-X device node.
This implies that the udev rule to create the /dev/dm-X node and the
symlink to it at /dev/mapper/flakey-test.489 has not run, and
something else created the device node.

That looks like a bug in _dmsetup_create().
It creates the new DM device, then runs 'dmsetup mknodes', then
waits for udev to settle. This means the mknodes command - which
makes sure the dm device nodes exist - is racing with udev to create
the device nodes. They don't use the same rules to create nodes, so
we end up with this broken situation.

'dmsetup mknodes' is considered legacy functionality, intended for
systems that have no udev capability. For systems that have udev
enabled (i.e. all modern distros), mknodes should not be run because
it creates a different device node structure to what udev creates
and can race with udev as we see here.

Fix it by removing the 'dmsetup mknodes' call, as it is not needed
to create the device node layout the rest of the system is expecting
to see.

Additionally, _dmsetup_remove() calls 'dmsetup mknodes' and that can
also race with udev and cause issues. Hence we need to remove that
call from the remove operation as well.

Further, 'dmsetup remove' is also subject to races with udev, which
result in the device remove failing. This problem is documented in
the dmsetup man page, which suggests the use of the "--retry"
option. With this option, dmsetup will retry several times over a
few seconds before failing the removal.

This reduces the remove failure rate substantially, but it can still
occasionally fail when the system is under heavy load and udev
processing is very slow. This is fixable, but requires fstests udev
infrastructure changes as it requires udevadm functionality that is
relatively new. Hence that will be done as a separate fix.
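As a rough sketch of the kind of debug check used above, the two
shell helpers below report how a device path is laid out and what
short name it resolves to. The names dm_node_kind() and
dm_short_name() are hypothetical and are not part of fstests; they
only assume a GNU-style 'readlink -f'.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not part of fstests.

# Report how a device path is laid out:
#   symlink - a link, e.g. the udev layout /dev/mapper/<name> -> ../dm-X
#   node    - a block device node sitting directly at the path
#   other   - something exists there, but it is neither of the above
#   missing - nothing exists at the path
dm_node_kind()
{
	local path="$1"

	if [ -L "$path" ]; then
		echo symlink
	elif [ -b "$path" ]; then
		echo node
	elif [ -e "$path" ]; then
		echo other
	else
		echo missing
	fi
}

# Resolve a device path to a short name by following symlinks. Under
# the udev layout this yields dm-X; if the path is a real node rather
# than a link, it just yields the last component of the path itself.
dm_short_name()
{
	basename "$(readlink -f "$1")"
}
```

For the failing run above, where ls shows a block device node
directly at /dev/mapper/flakey-test.489, dm_node_kind would be
expected to print "node" and dm_short_name "flakey-test.489", which
is the name the missing /sys/fs/xfs/ paths were built from.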

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 common/rc | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/common/rc b/common/rc
index 391370fd5..a601e2c80 100644
--- a/common/rc
+++ b/common/rc
@@ -5162,8 +5162,8 @@ _require_label_get_max()
 _dmsetup_remove()
 {
 	$UDEV_SETTLE_PROG >/dev/null 2>&1
-	$DMSETUP_PROG remove "$@" >>$seqres.full 2>&1
-	$DMSETUP_PROG mknodes >/dev/null 2>&1
+	$DMSETUP_PROG remove --retry "$@" >>$seqres.full 2>&1
+	$UDEV_SETTLE_PROG >/dev/null 2>&1
 }
 
 _dmsetup_create()
@@ -5174,7 +5174,6 @@ _dmsetup_create()
 	# device open won't also fail.
 	$UDEV_SETTLE_PROG >/dev/null 2>&1
 	$DMSETUP_PROG create "$@" >>$seqres.full 2>&1 || return 1
-	$DMSETUP_PROG mknodes >/dev/null 2>&1
 	$UDEV_SETTLE_PROG >/dev/null 2>&1
 }
 
-- 
2.45.2